Travash Software Solutions

3.8

based on 8 Reviews

Site Reliability Engineer - AWS Cloud Services (6-8 yrs)

Travash Software Solutions Private Limited

6-8 years

Posted 2 months ago

Job Role Insights

Flexible timing

Key skills for the job

Python, AWS Cloud Services, Incident Management, Site Reliability Engineering, IT Infrastructure

Job Description

Role : Site Reliability Engineer

Location : Hyderabad

Job Type : Full-Time

Experience Level : Senior (5-8 years)

About the Role :

We are looking for a seasoned Senior Site Reliability Engineer (SRE) with 5-8 years of experience in cloud infrastructure and reliability engineering. In this role, you will contribute to building and managing highly reliable, scalable, and secure systems while collaborating across teams to embed reliability practices into the development lifecycle.

Key Responsibilities :

1. Infrastructure Design and Deployment :

- Design and implement scalable, reliable, and fault-tolerant cloud architectures using AWS services (e.g., EC2, S3, Lambda).

- Support automation of infrastructure provisioning and management for streamlined deployments.

2. Monitoring and Observability :

- Implement real-time monitoring to track application and infrastructure health.

- Ensure observability practices are in place to identify and resolve issues efficiently.

3. Incident Management :

- Actively participate in incident response, ensuring minimal service disruption and fast recovery.

- Perform post-incident analysis to identify root causes and recommend preventive measures.

4. Security and Compliance :

- Apply security best practices to cloud infrastructure to protect data and applications.

- Ensure compliance with relevant standards and frameworks (e.g., SOC 2, ISO 27001).

5. Collaboration and Training :

- Collaborate with development, operations, and other teams to ensure reliability is prioritized.

- Share knowledge and mentor peers on best practices in reliability engineering.

6. Performance Optimization :

- Analyze and optimize system performance to improve efficiency and reduce latency.

- Conduct capacity planning to prepare infrastructure for future growth and demand.

7. Disaster Recovery and Backup :

- Contribute to the development and maintenance of disaster recovery plans.

- Implement backup solutions to safeguard critical data and maintain business continuity.

Qualifications :

- Experience: 5-8 years of experience in Site Reliability Engineering, Cloud Engineering, or a similar role.

Technical Skills :

- Proficient with AWS services (e.g., EC2, S3, Lambda) and cloud architecture.

- Hands-on experience with monitoring tools like CloudWatch, Grafana, or Datadog.

- Strong skills in scripting and automation (e.g., Python, Bash, Terraform).

- Problem-Solving : Strong troubleshooting and root cause analysis abilities.

- Collaboration : Ability to work effectively in cross-functional teams.

- Security Awareness : Knowledge of security best practices and compliance standards.

What We Offer :

- Opportunities for career growth and ongoing learning.

- A collaborative and innovative work environment.