tsworks

5.0

based on 1 Review

Site Reliability Engineer (SRE)

3-8 years

Bangalore / Bengaluru

3 vacancies

posted 19d ago

Job Role Insights

Key skills for the job

DevOps, AWS, Java, SRE, Kubernetes, Angular

Job Description

Role & responsibilities

Architect, design, and maintain highly available, scalable, and resilient infrastructure to support business-critical applications.
Lead the implementation and management of Infrastructure as Code (IaC) using AWS CDK, ensuring infrastructure is automated, repeatable, and secure.
Develop and optimize automation for deployments, configuration management, and infrastructure provisioning across cloud (AWS) and container orchestration platforms (Kubernetes, EKS, ECS).
Enhance and maintain CI/CD pipelines, ensuring smooth and automated application and infrastructure deployments.
Design and implement monitoring and observability solutions using tools such as Datadog, Prometheus, and Grafana to proactively identify and resolve performance bottlenecks and failures.
Collaborate with development teams to ensure infrastructure aligns with application requirements and follows best practices for performance, security, and cost efficiency.
Lead incident response and root cause analysis efforts, ensuring high levels of service availability and quick resolution of infrastructure issues.
Continuously improve infrastructure performance, scalability, and reliability through best practices, automation, and innovation.
Mentor and coach junior engineers, sharing knowledge, best practices, and expertise in site reliability engineering.
Stay up to date with trends and advancements in cloud computing, containerization, and DevOps methodologies to drive improvements in our technology stack.

Preferred candidate profile

3 to 10+ years of experience in Site Reliability Engineering, DevOps, or a related field.
Expertise in cloud computing, particularly AWS, with deep knowledge of infrastructure design and best practices.
Experience with multi-cloud environments, including Azure and GCP, is highly desirable.
Proficiency with AWS CDK is essential, with additional experience in Terraform and Ansible considered a strong advantage.
Strong experience with Kubernetes and container orchestration platforms (EKS, ECS), including deploying, scaling, and managing workloads.
Extensive experience with CI/CD tools and practices, with hands-on expertise in automating infrastructure (EKS, ALB, NLB, Route 53, WAF, Network components) and application deployments.
Advanced scripting and programming skills (Python, Bash, or similar) for automation and infrastructure management.
In-depth knowledge of monitoring, logging, and observability tools (Datadog, Prometheus, Grafana, ELK, etc.).
Knowledge of Content Delivery Networks (CDNs) for optimizing application performance and scalability is preferred.
Strong troubleshooting and problem-solving skills, with a proactive approach to incident management and root cause analysis.
Strong application knowledge, including building and deploying Java Spring Boot and Angular applications.
Experience in setting up unit tests and code quality tools, such as SonarQube, to ensure robust application development.
Proven ability to work independently and lead initiatives while collaborating with cross-functional teams.
Excellent communication and leadership skills, with experience mentoring junior engineers and driving technical excellence.

Employment Type: Full Time, Permanent
