19 InOpTra Digital Jobs

Senior Site Reliability Engineer (SRE)-Grafana

10-18 years

₹ 20 - 32.5L/yr

Bangalore / Bengaluru

1 vacancy

InOpTra Digital

posted 9d ago

Job Role Insights

Flexible timing

Job Description

We are looking for a skilled Senior Site Reliability Engineer (SRE) with deep expertise in Prometheus, Grafana, and Kubernetes to join our remote team. In this role, you will manage and optimize the infrastructure supporting a large-scale hardware monitoring project, ensuring high availability, reliability, and scalability across thousands of hardware servers.


Key Responsibilities:

  • Monitoring and Observability: Design, implement, and maintain comprehensive monitoring systems using Prometheus and Grafana to track and visualize metrics from thousands of hardware servers.
  • Kubernetes Orchestration: Deploy, manage, and optimize applications on Kubernetes clusters, ensuring optimal performance and scalability.
  • Automation and Scripting: Develop and implement automation for routine tasks, including alerting, system monitoring, and response mechanisms.
  • Incident Management: Troubleshoot, diagnose, and resolve infrastructure incidents, ensuring the uptime and reliability of services.
  • Performance Tuning: Optimize system performance, ensuring efficient data storage, querying, and alerting in Prometheus and Grafana environments.
  • CI/CD Integration: Collaborate with development teams to integrate monitoring into the CI/CD pipeline and ensure smooth deployments.
  • Capacity Planning: Perform capacity analysis and ensure that systems are appropriately scaled to handle increasing load.
  • Post-Deployment Support: Support the monitoring solution once it is implemented, including troubleshooting incidents.

Required Skills:

  • Grafana: Advanced experience in setting up Grafana dashboards for real-time monitoring and alerting.
  • Prometheus: Proficient in configuring, tuning, and managing Prometheus for large-scale environments.
  • Kubernetes: Strong hands-on experience with managing Kubernetes clusters, deployments, and container orchestration.
  • Scripting: Proficiency in scripting languages such as Python or Bash to automate tasks.
  • Alerting & Incident Management: Experience setting up advanced alerting and incident management processes.
  • Infrastructure as Code (IaC): Experience with tools like Helm.
  • CI/CD Pipelines: Knowledge of CI/CD tools and automation frameworks for seamless deployment.

Preferred Skills:

  • Familiarity with external storage for Prometheus (e.g., Mimir) as a high-scale storage backend.
  • Experience with any cloud platform (e.g., AWS, GCP, Azure) for deploying infrastructure.
  • Knowledge of microservices architecture and REST APIs.

Qualifications:

  • 6+ years of hands-on experience as an SRE, DevOps Engineer, or in a similar role managing complex infrastructure systems.
  • 2+ years of hands-on experience implementing Grafana dashboards and integrating alerts with various tools.
  • Strong understanding of DevOps practices and infrastructure automation.
  • Proven experience in large-scale monitoring systems and high-availability environments.
  • Excellent troubleshooting, analytical, and problem-solving skills.


Employment Type: Full Time, Permanent

What InOpTra Digital employees are saying about work life

based on 29 employees
88%
96%
66%
100%
Flexible timing
Monday to Friday
No travel
Day Shift

InOpTra Digital Benefits

Work From Home
Free Transport
Child care
Gymnasium
Cafeteria
Free Food +6 more
