Senior Site Reliability Engineer (SRE) // Atlanta, GA Job at Cohesive Technologies, Atlanta, GA

Job Description

Cohesive Technologies is a global IT Services & Solutions company providing IT Staffing Services and Application Development Services necessary for technology leaders to deliver business value. We help our people and clients succeed by leveraging our expertise, deep industry and market knowledge, proprietary assessment tools and techniques, and project delivery methodologies. Through relationships with thousands of specialized professionals, we bring an unparalleled ability to match talent with opportunities by assessing, recruiting, developing and engaging the best and brightest people for our clients. We combine broad geographic presence, world-class solutions and a tailored, consultative approach to help our people and clients achieve higher performance and outstanding results.

Position Title: Senior Site Reliability Engineer (SRE) - Mission-Critical SaaS Cloud Products

Location: Atlanta, GA (Day 1 Onsite)

Note: 14+ years of experience required.

Key Responsibilities:

Reliability and Performance Management:

  • Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products.
  • Develop and implement SLOs, SLIs, and SLAs to measure and improve service reliability.
  • Continuously optimize system performance and resource utilization across multiple cloud platforms.
  • Fine-tune and optimize application performance by analyzing code, traces, and database queries.

Incident Management and Troubleshooting:

  • Lead incident response efforts, effectively troubleshooting complex issues to minimize downtime and impact.
  • Reduce Mean Time to Recover (MTTR) through proactive monitoring, automated alerting, and efficient problem-solving techniques.
  • Conduct thorough Root Cause Analysis (RCA) for all major incidents and implement preventive measures.

Observability and Monitoring:

  • Design and implement end-to-end observability solutions across our distributed systems.
  • Develop and maintain comprehensive monitoring strategies using tools such as the ELK Stack, Prometheus, and Grafana.
  • Create and optimize product status dashboards to provide real-time visibility into system health and performance.

Automation and Infrastructure as Code (IaC):

  • Implement Infrastructure as Code practices using tools like Terraform.
  • Develop and maintain automated deployment pipelines and CI/CD workflows.
  • Create self-healing systems and automate routine operational tasks to reduce manual intervention.

Cloud-Agnostic Architecture:

  • Design and implement cloud-agnostic solutions that can operate efficiently across multiple cloud providers.
  • Develop expertise in event-driven architectures and related technologies (e.g., Apache Kafka/Event Hubs, Redis, MongoDB Atlas, IoT Hub).
  • Implement and manage containerized applications using Kubernetes across different cloud environments.

Continuous Improvement:

  • Regularly review and refine operational practices to enhance efficiency and reliability.
  • Stay updated with the latest industry trends and technologies in SRE, cloud computing, and DevOps.
  • Contribute to the development of internal tools and frameworks to support SRE practices.

Requirements:

  • Strong knowledge of cloud platforms, specifically Azure and its associated services.
  • Expertise in observability tools (ELK Stack, Dynatrace, Prometheus).
  • Expertise in containerization technologies such as Docker and Kubernetes.
  • Understanding of event-driven architecture and database technologies (MongoDB Atlas, Azure SQL, PostgreSQL).
  • Proficiency in IaC and CI/CD tools such as Terraform and GitHub Actions.
  • Proficiency in one or more programming languages: Python, .NET, or Java.
  • Strong understanding of networking concepts, load balancing, and security practices.

Cohesive Technologies is an equal access/equal opportunity employer and does not discriminate on the basis of age, color, disability, marital status, national origin, race, religion, sex, sexual orientation, veteran status or any other classification prescribed by applicable law.
