
2 GitLab Site Reliability Engineer Jobs

Site Reliability Engineer, Datastores

0-5 years

Remote

1 vacancy

GitLab

posted 1mon ago

Job Description

Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other GitLab production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople that apply sound engineering principles, operational discipline, and mature automation to our environments and the GitLab codebase. We specialize in systems, whether it be networking, the Linux kernel, or some more specific interest in scaling, algorithms, or distributed systems.
The Database Reliability Team's mission is to build, run and own the entire lifecycle of the PostgreSQL database engine for GitLab.com. The team is focused on owning the reliability, scalability, performance and security of the database engine and its supporting services. The team should seek to build its services on top of Reliability::Foundations services and cloud vendor managed products, where appropriate, to reduce complexity, improve efficiency and deliver new capabilities more quickly.
GitLab.com is a unique site and it brings unique challenges: it's the biggest GitLab instance in existence. In fact, it's one of the largest single-tenancy open-source SaaS sites on the internet. The experience of our team feeds back into other engineering groups within the company, as well as to GitLab customers running self-managed installations.

Responsibilities
  • Automating every operational task is a core requirement for this role. For example, package updates, configuration changes across all environments, creating tools for automatic provisioning of user facing services, etc.
  • Responding to platform emergencies, alerts, and escalations from Customer Support.
  • Ensure systems exist to manage software life-cycles (e.g. Operating Systems) with a minimum of manual effort.
  • Develop a fully automated multi-environment observability stack based on the existing SaaS system, and extend it to predict capacity needs based on the usage patterns.
  • Plan for new service roll-outs, expansion and capacity management of existing services, and work with users to optimise their resource consumption.
As an SRE you will:
  • Work on database reliability and performance aspects for GitLab.com from within the SRE team as well as work on shipping solutions with the product.
  • Analyze solutions and implement best practices for our main PostgreSQL database cluster and its components.
  • Work on observability of relevant database metrics and make sure we reach our database objectives.
  • Work with peer SREs to roll out changes to our production environment and help mitigate database-related production incidents.
  • Provide on-call support on rotation with the team.
  • Provide database expertise to engineering teams (for example through reviews of database migrations, queries and performance optimizations).
  • Work on automation of database infrastructure and help engineering succeed by providing self-service tools.
  • Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible.
  • Plan the growth of GitLab's database infrastructure.
  • Design, build and maintain core database infrastructure pieces that allow GitLab to scale to support hundreds of thousands of concurrent users.
  • Support and debug database production issues across services and levels of the stack.
  • Make monitoring and alerting fire on symptoms rather than on outages.
  • Document every action so your learnings turn into repeatable actions and then into automation.
You may be a fit for this role if you:
  • Have strong engineering experience deploying, managing and scaling PostgreSQL in large and dynamic production SaaS environments
  • Possess an in-depth understanding of PostgreSQL internals, including architecture, storage, indexing, and query optimization
  • Have solid experience operating PostgreSQL databases in a containerized environment using Kubernetes and modern operators from CloudNativePG, Crunchy Data or Zalando
  • Have solid understanding of Kubernetes architecture and experience with Kubernetes clusters in production
  • Have strong experience with infrastructure automation and configuration management (Chef, Ansible, Puppet, Terraform)
  • Have experience with CI/CD pipelines and infrastructure as code (IaC) practices
  • Have solid experience with monitoring and logging tools for database and container orchestration environments (e.g., Prometheus, Grafana, ELK stack)
  • Share our values, and work in accordance with those values
  • Have excellent written and verbal English communication skills, with an urge to collaborate and communicate asynchronously
  • Have an urge to document all the things so you don't need to learn the same thing twice, and an urge for delivering quickly and iterating fast
  • Have a proactive, go-for-it attitude. When you see something broken, you can't help but fix it
  • Bonus: Strong programming skills as a (former) backend engineer, preferably with Ruby and/or Go.
Projects you could work on:
  • Cells
  • Review, analyze and implement solutions regarding database administration (e.g., backups, performance tuning)
  • Work with Ansible, Terraform, Chef and other tools to build mature automation (for example, automatic setup of new replicas, or testing and monitoring of backups).
  • Implement self-service tools for our engineers using GitLab ChatOps.
  • Provide technical assistance and support to other teams on database and database-related application design methodologies, system resources, application tuning.
  • Review database related changes from engineering teams (e.g., database migrations).
  • Recommend query and schema changes to optimize the performance of database queries.
  • Jump on a production incident to mitigate database-related issues on GitLab.com.
  • Participate actively in the infrastructure design and scalability considerations focusing on data storage aspects.
  • Make sure we know how to take the next step to scale the database.

Employment Type: Full Time, Permanent

People are getting interviews at GitLab through (based on 4 GitLab interviews):

Company Website: 25%
Job Portal: 25%
Other sources: 50%

Moderate Confidence: the data is based on a sufficient number of responses received from the candidates.

Site Reliability Engineer salary at GitLab

reported by 1 employee with 2 years exp.
₹57 L/yr - ₹63 L/yr
340% more than the average Site Reliability Engineer Salary in India

What GitLab employees are saying about work life

based on 7 employees
Flexible timing: 100%
Monday to Friday: 100%
No travel: 49%

GitLab Benefits

Free Transport
Child care
Gymnasium
Cafeteria
Work From Home
Free Food +6 more

Site Reliability Engineer, Datastores

0-5 Yrs

Remote

1mon ago·via naukri.com

Intermediate Site Reliability Engineer

6-11 Yrs

Kolkata, Mumbai, New Delhi +4 more

1mon ago·via naukri.com