Reflections Info Systems - Site Reliability Engineer - CI/CD Pipeline (3-6 yrs)

3-6 years

Reflections Info Systems

posted 3d ago

Job Description

Introduction :

As a Site Reliability Engineer (SRE), you will be responsible for improving the overall reliability of applications by ensuring their availability, performance, and scalability. You should be able to gather technical requirements from the DevOps team and operational requirements from the Application Support team. Because the SRE role is at the heart of solving production problems, you should take a holistic approach to troubleshooting, delve deeply into technical details, and acquire the domain knowledge needed to troubleshoot and recover from outages effectively. You will also monitor applications in production and build alerts as required.

Working Hours : 05:30 AM to 1:30 PM IST (GMT+5:30)

Responsibilities include :

- Work closely with the application support team.

- Monitor critical applications and services to minimize downtime and ensure their availability.

- Collaborate with DevOps teams to maintain and monitor CI/CD pipelines.

- Deploy new versions to production environments.

- Work with project teams to ensure the reliability and maintainability of new and modified releases.

- Provide input to risk management practices that will anticipate reliability-related incidents that could adversely impact operations.

- Document processes and monitor application performance metrics.

- Continuously improve proactive monitoring, alert configuration, and incident response processes to increase reliability and reduce Mean Time to Recovery (MTTR).

- Optimize performance and cost efficiency through continuous monitoring, trend analysis, and fine-tuning.

- Monitor for abnormal usage that could impact cost or performance, and take corrective action.

- Proactively implement preventive measures to improve system reliability.

- Maintain runbooks, Standard Operating Procedures (SOPs), diagrams, and documentation for swift incident response.

- Conduct post-incident reviews to improve reliability and contribute to the development of resilience strategies.

- Meet the Service Level Indicator (SLI) targets that are set to achieve reliability objectives.
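
For candidates unfamiliar with the two metrics named above, here is a minimal Python sketch of how MTTR and an availability SLI are commonly computed. The function names, the sample incident timestamps, and the request counts are illustrative assumptions and are not taken from this posting or from any Reflections Info Systems tooling.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average time from detection to recovery.

    `incidents` is a list of (detected_at, recovered_at) datetime pairs.
    """
    if not incidents:
        return timedelta(0)
    total = sum((end - start for start, end in incidents), timedelta(0))
    return total / len(incidents)


def availability_sli(good_requests, total_requests):
    """Fraction of requests served successfully (a common availability SLI)."""
    if total_requests == 0:
        return 1.0  # no traffic: treat the service as fully available
    return good_requests / total_requests


# Hypothetical sample data: one 30-minute and one 90-minute outage.
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),
    (datetime(2024, 1, 5, 14, 0), datetime(2024, 1, 5, 15, 30)),
]

mttr = mean_time_to_recovery(incidents)
sli = availability_sli(good_requests=999, total_requests=1000)
print(f"MTTR: {mttr}, availability SLI: {sli:.3%}")
```

In practice the incident timestamps would come from the team's incident tracker and the request counts from its monitoring tool; the arithmetic stays the same.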

Certifications :

- Azure Solutions Architect Expert (Microsoft)

- AWS Certified Solutions Architect (AWS)

- Open Group Certified Enterprise Architect (TOGAF)

- PMP or PRINCE2 certification in Project Management

Primary Skills :

Monitoring and Analysis :

- Continuously monitor CDC dashboards to track service performance and analyze reports.

- Oversee production and DevOps infrastructure dashboards, ensuring system stability and identifying potential issues.

- Observe alerts from New Relic and escalate them to the respective teams as needed.

- Identify duplicated New Relic alerts and optimize alert configurations to reduce noise and improve efficiency.

- Track daily alerts in production to enhance alert optimization strategies.

- Maintain and update a list of dashboards monitored, including details such as widgets, metrics, and threshold values.

- Create and manage dashboards for validating and monitoring CPU optimizations for Rapid and CDC services.

- Perform sanity checks on Container Memory Utilization, Missing Pods, Container Restarts, Container CPU Utilization, Active Pods, Node Resource Consumption, and Pod Network Status to ensure system health.
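
As an illustration of the duplicate-alert task listed above, here is a minimal Python sketch that groups alert records by a chosen set of fields and reports the groups that fire more than once. The record layout (`condition`, `entity`) and the sample values are hypothetical; this is not New Relic's export format or API, only a sketch of the grouping idea.

```python
from collections import Counter


def find_duplicate_alerts(alerts, key_fields=("condition", "entity")):
    """Return {key: count} for alert keys that occur more than once.

    `alerts` is a list of dicts; `key_fields` names the fields that
    together identify "the same" alert (an assumption of this sketch).
    """
    counts = Counter(tuple(alert[field] for field in key_fields) for alert in alerts)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample records, e.g. parsed from an exported alert report.
alerts = [
    {"condition": "High CPU", "entity": "cdc-service", "policy": "prod-a"},
    {"condition": "High CPU", "entity": "cdc-service", "policy": "prod-b"},
    {"condition": "Container restarts", "entity": "rapid-service", "policy": "prod-a"},
]

duplicates = find_duplicate_alerts(alerts)
for key, count in duplicates.items():
    print(f"{key} fired from {count} alert definitions")
```

Here the same condition on the same entity is raised by two policies, which is the kind of overlap that alert-noise reduction usually targets first.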

Release and Deployment Management :

- Coordinate and execute weekly production releases, ensuring services are deployed with optimized CPU values.

- Update central repositories with the latest service configurations and CPU requests.

- Perform post-deployment sanity checks to validate service stability after production releases.

- Redeploy CDC services with optimized CPU values, ensuring system performance improvements.

- Monitor new CPU optimizations for Rapid and CDC services, tracking performance improvements and resource utilization.

Incident Management and RCA Documentation :

- Conduct incident analysis, identifying root causes and documenting findings for continuous improvement.

- Maintain detailed Root Cause Analysis (RCA) documentation to track incidents and resolutions.

- Provide reports on incident trends, helping improve response times and preventive measures.

Collaboration and Communication :

- Participate in daily sync-ups and internal meetings to discuss ongoing tasks, challenges, and improvements.

- Sync up with the NOC team to align on monitoring strategies and escalations.

- Collaborate with the Database (DB) team for performance tuning and issue resolution.

- Conduct knowledge transfer (KT) sessions on Rapid Resource Optimization and related best practices.

Optimization and Continuous Improvement :

- Track CPU optimization efforts, ensuring proper resource allocation and utilization for Rapid and CDC services.

- Analyze performance data to refine resource allocation strategies and improve system efficiency.

- Identify and implement best practices for reducing alert noise and optimizing monitoring configurations.

Secondary Skills :

Technical Knowledge :

- Fluent in key AWS services (EBS, S3, compute, storage, RDS, etc.).

- Expertise in Kubernetes or any Container Orchestration System.

- Knowledge of Infrastructure as Code.

- Linux system administration knowledge.

- Knowledge of RDBMS and Document databases.

- Knowledge of monitoring tools, including AWS CloudWatch and New Relic.

- Additional certification in Microsoft, Linux, Cisco, AWS or similar technologies is a plus.

Behavioral competencies :

- Communication

- Customer Centricity

- Business & Market Acumen

- Psychological Safety

- Empathy

- Growth Mindset & Learning Agility

- Ethical and Vigilant

- Digital Mindset

- Operational Excellence

- Teamwork

- Analytical thinking


Functional Areas: Software/Testing/Networking

Reflections Info Systems Benefits

Work From Home
Soft Skill Training
Cafeteria
Health Insurance
Job Training
Education Assistance
