Site Reliability Engineer Manager

7-12 years

Bangalore / Bengaluru

1 vacancy

Factspan

posted 1hr ago

Job Description

Position: Site Reliability Engineering Manager
Bengaluru, Karnataka


Role Overview
We are looking for an experienced Site Reliability Engineering (SRE) Manager with 8+ years of experience to lead a team of highly skilled SREs in managing, automating, and optimizing our cloud infrastructure on Google Cloud Platform (GCP). The SRE Manager will be responsible for ensuring the reliability, availability, and performance of critical services while driving automation and operational excellence.

As an SRE Manager, you will work closely with development, infrastructure, and security teams to implement scalable, resilient, and high-performance solutions. This role is ideal for someone passionate about reliability engineering, cloud automation, and observability.


Key Responsibilities


Leadership & Team Management

  • Lead, mentor, and grow a team of Site Reliability Engineers, fostering a culture of innovation, collaboration, and continuous learning.
  • Define and drive SRE best practices, focusing on reliability, automation, monitoring, and incident response.
  • Collaborate with development, DevOps, and security teams to align infrastructure and application reliability with business objectives.
  • Own SRE roadmap and strategy, ensuring alignment with organizational goals and industry best practices.

Reliability & Performance

  • Ensure the uptime, availability, and performance of critical applications hosted on GCP.
  • Implement SLOs (Service Level Objectives), SLIs (Service Level Indicators), and SLAs (Service Level Agreements) to measure system reliability.
  • Conduct root cause analysis (RCA) for production incidents and drive post-mortems to improve system resilience.

Automation & CI/CD

  • Automate infrastructure management using Infrastructure-as-Code (IaC) tools such as Terraform or Pulumi.
  • Improve CI/CD pipelines using GitOps methodologies to enable faster, more reliable deployments.
  • Champion self-healing architectures to minimize manual intervention.

Observability & Incident Management


  • Implement and enhance monitoring, logging, and alerting using tools like Prometheus, Grafana, Stackdriver (Cloud Monitoring), and OpenTelemetry.
  • Develop on-call rotations, runbooks, and incident management processes to minimize downtime and improve MTTR (Mean Time to Resolution).
  • Use AI/ML-based anomaly detection for proactive monitoring.

Security & Compliance


  • Ensure security best practices for IAM, networking, and data encryption within GCP.
  • Conduct security audits and work with compliance teams to ensure adherence to SOC2, ISO 27001, HIPAA, or other regulatory frameworks.
  • Implement zero-trust security models and automated compliance policies.

Cost Optimization & Capacity Planning


  • Optimize cloud costs using GCP cost management tools, rightsizing, and auto-scaling.
  • Implement capacity planning strategies to balance cost and performance.
  • Work with finance teams to forecast infrastructure costs and optimize spend.

Required Skills & Qualifications


Technical Skills

  • Strong expertise in Google Cloud Platform (GCP) services such as GKE, Cloud Run, Cloud Functions, Cloud SQL, BigQuery, and Cloud Spanner.
  • Hands-on experience with Terraform, Pulumi, or Cloud Deployment Manager for Infrastructure-as-Code (IaC).
  • Experience with CI/CD tools like GitHub Actions, ArgoCD, Spinnaker, or Jenkins.
  • Strong knowledge of Kubernetes (GKE) and container orchestration.
  • Experience with SRE principles such as error budgets, chaos engineering, and observability.
  • Strong scripting and automation skills in Python.
  • Experience with monitoring and observability tools (Stackdriver, Datadog, Prometheus, Grafana, New Relic).

Leadership & Soft Skills

  • Proven experience managing and mentoring SRE teams.
  • Strong problem-solving skills with the ability to troubleshoot complex production issues.
  • Ability to work in a fast-paced, DevOps-oriented environment.
  • Strong communication and stakeholder management skills.
  • Experience collaborating with cross-functional teams, including engineering, security, and product teams.


Preferred Qualifications

  • Google Cloud Professional Cloud Architect or Professional Cloud DevOps Engineer certification.
  • Experience with multi-cloud or hybrid cloud environments.
  • Hands-on experience with serverless computing and event-driven architectures.
  • Prior experience in high-traffic, distributed systems.

If you are passionate about leveraging technology to drive business innovation, possess excellent problem-solving skills, and thrive in a dynamic environment, we encourage you to apply for this exciting opportunity.



Employment Type: Full Time, Permanent

What Factspan employees are saying about work life

based on 116 employees
88% Flexible timing
94% Monday to Friday
80% No travel
87% Day Shift

Factspan Benefits

Submitted by Company
Work From Home
Cafeteria
Team Outings
Education Assistance
International Relocation
Health Insurance
Submitted by Employees
Health Insurance
Work From Home
Cafeteria
Team Outings
Free Food
Job Training +6 more

Factspan Bangalore / Bengaluru Office Location

Bengaluru/Bangalore, Karnataka Office
2nd Floor, South Block, Vaishnavi Tech Park, SY No. 16/1, Bellandur Gate, Sarjapura Main Road, Ambalipura, Bengaluru/Bangalore, Karnataka 560102
