Site Reliability Engineer - Prometheus/Grafana (4-7 yrs)

Exploro Solutions

posted 4d ago

Job Description

Job Role : Site Reliability Engineer

Years of Experience : 4 to 7 years

Key Responsibilities :

Payment Monitoring and Alert Triage :

- Monitor payment-flow-based alerts across multiple applications in rotational 24x7 shifts and identify issues proactively.

- Triage alerts by analysing trends in the affected dimensions of the payment flow, correlate them with other services' metrics, logs, and traces to find the root cause, and document the triage.

- Ensure timely escalation and closure of reported issues, working with the engineering teams of the payment services.

Observability Development :

- Design and implement alerting frameworks using tools like Datadog, Grafana, Kibana, Splunk, and Prometheus.

- Set up custom dashboards and streamline alerting to reduce noise while ensuring critical issues are addressed.

- Drive the adoption of SLO-based alerting, burn rate metrics, and anomaly detection techniques.

Incident Management :

- Lead incident management efforts from identification to resolution.

- Conduct post-incident reviews and implement preventive measures to avoid recurring issues.

- Maintain detailed documentation and performance reports on incident trends and team efficiency.

Automation and Optimization :

- Automate repetitive processes using programming languages like Python or Java.

- Develop and refine scripts to manage and fine-tune alerts.

- Collaborate with engineering teams to implement solutions that reduce manual effort and operational toil.

Required Skills and Qualifications :

- Proven expertise in SRE observability concepts and monitoring architecture design.

- Extensive experience with alerting frameworks like Prometheus, Grafana, Kibana, Splunk, and Datadog.

- Hands-on experience with alert noise reduction and advanced alerting techniques such as anomaly detection and burn rate alerting.

- Strong proficiency in incident management, including analysis, root cause identification, and preventive measures.

- Familiarity with payment monitoring systems and operational requirements.

- Proficiency in automation tools and scripting languages like Python or Java.

- Excellent collaboration and communication skills to interact with cross-functional teams.

- Flexibility to work in rotational 24x7 shifts from the office.

Notice Period : Immediate to 20 days


Functional Areas: Software/Testing/Networking

What Exploro Solutions employees are saying about work life

Day Shift: 100% (based on 1 employee)

Exploro Solutions Benefits

Free Transport
Child care
Gymnasium
Cafeteria
Work From Home
Free Food

Compare Exploro Solutions with

- Randstad: 3.7
- Team Lease: 3.9
- Innovsource Services: 3.9
- ManpowerGroup: 3.8
- Aarvi Encon: 3.9
- eTeam: 3.2
- IMPACT Infotech: 3.4
- Teamware Solutions: 4.2
- CIEL HR: 3.9
- First Advantage: 3.8
- Careernet: 3.7
- LanceSoft: 3.1
- Kutumbh Care: 3.9
- Experis IT: 3.0
- PeopleStrong: 3.4
- Progressive Infovision: 4.1
- Talentpro: 3.9
- Pyramid IT Consulting: 3.0
- Virtual Employee: 3.4
- ABC Consultants: 3.9

Similar Jobs for you

- Site Reliability Engineer at Zensar Technologies, 6-8 Yrs, ₹ 18-24 LPA
- Site Reliability Engineer at Apple INC, 4-6 Yrs, Not Disclosed
- Site Reliability Engineer at NexionPro, 6-14 Yrs, ₹ 13-35 LPA
- Site Reliability Engineer at Truelancer.com, 5-7 Yrs, ₹ 15-21 LPA
- Site Reliability Engineer at Fork Technologies, 7-9 Yrs, ₹ 20-22 LPA
- Senior Site Reliability Engineer at CIRRUSLABS PRIVATE LIMITED, 5-12 Yrs, ₹ 20-32 LPA
- Site Reliability Engineer at Whitefield Careers, 7-10 Yrs, ₹ 20-25 LPA
- Site Reliability Engineer at IT Firm, 5-8 Yrs, ₹ 28-45 LPA
- Site Reliability Engineer at Signzy Technologies, 3-5 Yrs, ₹ 12-18 LPA
- Cloud Operations Engineer at Rohini IT Consulting LLP, 5-12 Yrs, ₹ 15-30 LPA
- Generative AI Engineer - Python (5-10 yrs), 10d ago, via hirist.com
- Data Engineer - ETL (6-10 yrs), 11d ago, via hirist.com
