UKG

3.1

based on 587 Reviews

79 UKG Jobs

Sr Principal Site Reliability Engineer

Kronos Solutions India Pvt. Ltd.

10-14 years

Pune

posted 19hr ago

Job Role Insights

Flexible timing

Key skills for the job

Python AWS Java C++ Azure DevOps Javascript

Job Description

Site Reliability Engineers at UKG are team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden, and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

Site Reliability Engineers must have a passion for learning and evolving with current technology trends. They strive to innovate and are relentless in their pursuit of a flawless customer experience. They have an automate-everything mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability.

Primary/Essential Duties and Key Responsibilities:

Proficiency in Splunk/ELK and Datadog.
Experience with observability tools such as Prometheus/InfluxDB and Grafana.
Strong knowledge of at least one scripting language, such as Python, Bash, PowerShell, or another relevant language.
Design, develop, and maintain observability tools and infrastructure.
Collaborate with other teams to ensure observability best practices are followed.
Develop and maintain dashboards and alerts for monitoring system health.
Troubleshoot and resolve issues related to observability tools and infrastructure.
Engage in and improve the lifecycle of services from conception to EOL, including system design consulting and capacity planning.
Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks.
Support services, product, and engineering teams by providing common tooling and frameworks to deliver increased availability and improved incident response.
Improve system performance, application delivery, and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis.
Collaborate closely with engineering professionals within the organization to deliver reliable services.
Identify and eliminate operational toil by treating operational challenges as software engineering problems.
Actively participate in incident response, including on-call responsibilities.
Partner with stakeholders to influence and help drive the best possible technical and business outcomes.
Guide junior team members and serve as a champion for Site Reliability Engineering.

Engineering degree, or a degree in a related technical discipline, and 10+ years of experience in SRE.
Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java).
Knowledge of cloud-based applications and containerization technologies.
Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing.
Ability to analyze the technology and engineering practices currently used within the company and to develop steps and processes to improve and expand upon them.
Working experience with industry-standard tools such as Terraform and Ansible.

(Experience, Education, Certification, License and Training)
Must have hands-on experience working within Engineering or Cloud.
Experience with public cloud platforms (e.g., GCP, AWS, Azure).
Experience in configuration and maintenance of applications & systems infrastructure.
Experience with distributed system design and architecture.
Experience building and managing CI/CD pipelines.

Employment Type: Full Time, Permanent

What people at UKG are saying

2.0

Rating based on 1 Principal Site Reliability Engineer review

Anonymous · Engineering - Software & QA in Noida

Likes

Cab service, quarterly coupons, and food service.

Dislikes

Too many firings on a frequent basis, due to which everyone is feeling insecure. No one shares any type of information. It's very difficult for new employees to gain knowledge from long-standing employees, who don't want to share anything.