Senior Site Reliability Engineer - GPU Cloud

5-8 years

Bangalore / Bengaluru

1 vacancy

Nvidia

posted 10hr ago

Job Role Insights

Flexible timing

Job Description

NVIDIA has been a pioneer in Accelerated Computing and has been paving the way with innovations in Generative AI, Large Language Models (LLMs), Autonomous Vehicles, Robotics, High-Performance Computing (HPC), Gaming/Visualization, and Edge/Data Center/Cloud Computing. NVIDIA provides automakers, research institutions, cloud providers, large companies, and start-ups the power and flexibility to develop and deploy breakthrough artificial intelligence systems.

We are a fast-paced, dynamic, and dedicated Site Reliability Engineering (SRE) team working at the forefront of cloud and on-prem infrastructure management for High-Performance and Distributed Computing. Working closely with the development teams, we provide hosted solutions for our internal and external customers. Are you passionate about infrastructure, and do you enjoy working on and resolving intricate, multi-faceted issues? Are you eager to have your hands on the engines of the next generation of cloud services? Do you get a buzz from identifying and eliminating toil, and from designing and coding innovative solutions that address the needs of a whole organization? If so, read on and give us a shout.

What you'll be doing:

The NVIDIA GPU cloud is a hosted platform for internal R&D teams and external AI/ML stack customers. This SRE team is accountable for the setup, management, reliability, and availability of this infrastructure, which spans thousands of GPU nodes.

As a senior SRE, you are responsible for:

  • Providing scalable and robust service-oriented infrastructure automation, monitoring, and analytics solutions for NVIDIA's on-prem and cloud-based GPU infrastructure.

  • Owning the whole life cycle of new tools and services, from requirements gathering to design documentation, validation, and deployment.

  • Providing customer support on a rotation basis.

What we need to see:

  • Minimum of 8 years of experience in automating and handling large-scale distributed system software deployments in on-prem/cloud environments.

  • Proficiency in any language - Go/Python/Perl/C++/Java/C.

  • Strong command of Terraform, Kubernetes, and cloud infrastructure administration.

  • Excellent debugging and troubleshooting skills.

  • Ability to design simple and reliable systems that can work without much support.

  • Outstanding teammate who can collaborate and influence in a multifaceted environment.

  • Excellent interpersonal and written communication skills.

  • M.Sc. or B.E. in Computer Science or a related technical field involving coding (e.g., physics or mathematics).

Ways to stand out from the crowd:

  • Ability to decompose complex requirements into simple tasks and reuse available solutions to implement most of those.

  • Proven record of maintaining platform SLAs through accurate resolutions.

  • Unit testing and benchmarking are an integral part of your code.

  • Ability to reason and choose the best possible algorithm to meet scaling and availability challenges.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


Employment Type: Full Time, Permanent

People are getting interviews at Nvidia through (based on 55 Nvidia interviews):

Campus Placement: 36%
Job Portal: 25%
Company Website: 11%
Referral: 7%
Walk-in: 4%
Recruitment Consultant: 4%
Other sources: 13%

High Confidence: the data is based on a large number of responses received from candidates.

What people at Nvidia are saying

Senior Site Reliability Engineer salary at Nvidia

reported by 3 employees with 7-9 years exp.
₹28.8 L/yr - ₹97 L/yr
100% more than the average Senior Site Reliability Engineer Salary in India

What Nvidia employees are saying about work life

Based on 512 employees:

Flexible timing: 66%
Monday to Friday: 96%
No travel: 85%
Day Shift: 79%

Nvidia Benefits

Free Transport
Free Food
Cafeteria
Health Insurance
Work From Home
Job Training +6 more

Compare Nvidia with

Qualcomm: 3.8
Intel: 4.3
Advanced Micro Devices: 3.8
Micron Technology: 3.7
Texas Instruments: 4.1
Broadcom: 3.3
Applied Materials: 3.9
Analog Devices: 4.1
NXP Semiconductors: 3.8
Sterlite Technologies: 3.8
Indus Towers: 3.8
Nokia Networks: 4.3
Cisco: 4.2
Lumen Technologies: 4.0
Redington: 4.0
Colt Technology Services: 4.4
RadiSys: 4.1
Vindhya Telelinks: 4.1
Juniper Networks: 4.2
ITI: 3.7

Similar Jobs for you

Site Reliability Engineer at NVIDIA

Bangalore / Bengaluru

4-7 Yrs

₹ 17-21 LPA

Senior Site Reliability Engineer at Flexera Software

Bangalore / Bengaluru

4-8 Yrs

₹ 13-18 LPA

Architect at NVIDIA

Bangalore / Bengaluru

7-11 Yrs

₹ 19-23 LPA

Senior Architect at NVIDIA

Bangalore / Bengaluru

7-11 Yrs

₹ 20-27.5 LPA

Senior Site Reliability Engineer at Equifax Credit Information Services Private Limited

Pune, Thiruvananthapuram

6-14 Yrs

₹ 8-16 LPA

Senior Site Reliability Engineer at Barracuda Networks

Bangalore / Bengaluru

4-10 Yrs

₹ 25-30 LPA

Staff at Synopsys (India) Private Limited

Noida, Hyderabad / Secunderabad + 1

3-8 Yrs

₹ 16-21 LPA

Site Reliability Engineer at Equifax Credit Information Services Private Limited

Pune, Thiruvananthapuram

4-8 Yrs

₹ 4-15 LPA

Verification Engineer at NVIDIA

Bangalore / Bengaluru

0-5 Yrs

₹ 18-20 LPA

Nvidia Bangalore / Bengaluru Office Locations

Bengaluru Office
NVIDIA Graphics PVT LTD, C-1 "Jacaranda", Wing-A Manyata Embassy Business Park, Outer Ring Road Bengaluru
Karnataka 560045
Bengaluru Office
Nvidia Graphics Pvt Ltd, C1, Nagavara Bengaluru
Karnataka 560045

Senior Site Reliability Engineer - GPU Cloud

5-8 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com

Senior Python Software Engineer, Security

4-8 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com

Senior Site Reliability Engineer

7-9 Yrs

Hyderabad / Secunderabad, Pune, Gurgaon / Gurugram +1 more

3d ago·via naukri.com

System Software Engineer, Conversational AI

1-5 Yrs

Pune, Bangalore / Bengaluru

3d ago·via naukri.com

Senior Responsible AI Engineer

5-7 Yrs

Pune

3d ago·via naukri.com

Senior Software Engineer

4-7 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com

Senior System Software Engineer

4-12 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com

Verification Engineer, CPU Performance Analysis

4-8 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com

Senior Software QA Automation Engineer

5-7 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com

Senior System Validation Engineer

4-7 Yrs

Bangalore / Bengaluru

3d ago·via naukri.com