13 PubMatic Jobs

Site Reliability Engineer

4-5 years

Pune

1 vacancy

PubMatic

posted 6hr ago

Job Description

Position Description

As a Site Reliability Engineer, you will be responsible for the Activate and Production Infrastructure. Your essential duties are to ensure the seamless operation and optimal performance of large-scale distributed software applications. The role revolves around maintaining a robust, high-performing environment, contributing to the reliability of our services, and building solutions that support 24/7 availability. By applying your technical expertise, you help maintain a seamless experience for our users while upholding high standards of operational excellence. Your specific responsibilities include:

Role and Responsibilities:

  1. Monitoring and Alerting
    • Review existing monitoring tools and systems, and set up new ones as needed, to track system performance and key metrics.
  2. Incident Management
    • Monitor alerts and logs to promptly identify incidents or anomalies.
    • Prioritize incidents based on severity and potential impact on stability and reliability.
    • Engage in effective incident resolution, applying necessary fixes and mitigations to restore normal operations.
  3. On-Call Responsibilities
    • Organize on-call schedules to ensure 24/7 coverage for incident response.
    • Respond to alerts, troubleshoot issues, and coordinate with NOC and Engineering teams for incident resolution.
    • Conduct post-incident reviews to identify root causes, learn from incidents, and implement preventive measures.
  4. Automation and Tooling
    • Review pre-existing and build new automation scripts and tools as needed to streamline repetitive tasks, enhance efficiency, and reduce manual errors.
    • Regularly update and maintain tools used for monitoring, deployment, and incident management to align with evolving needs.
  5. Performance Optimization
    • Analyze application performance using profiling and monitoring tools to identify bottlenecks and areas for improvement.
    • Work on optimizations, infrastructure upgrades, and architectural improvements to enhance system performance and efficiency.
  6. Capacity Planning and Scaling
    • Monitor resource utilization and trends to predict capacity needs and plan for scaling.
    • Scale resources, such as servers and databases, based on usage patterns and anticipated growth to maintain performance and reliability. Also, automate the entire sizing process.
  7. Disaster Recovery and Redundancy
    • Develop and maintain disaster recovery plans and procedures to ensure business continuity in case of failures or disasters.
    • Implement redundancy and failover strategies to minimize downtime and maintain service availability during failures.
  8. Knowledge Sharing and Documentation
    • Create and maintain comprehensive documentation for configurations, procedures, incidents, and best practices.
    • Foster a culture of knowledge sharing within the team, conducting regular knowledge-sharing sessions and training programs.
  9. Feedback Loop and Continuous Improvement
    • Collect feedback from incidents, post-mortems, and NOC/Dev team interactions to identify areas for improvement.
    • Iterate on processes, tools, and systems based on feedback and lessons learned to drive continuous improvement.
  10. Collaboration and Communication
    • Collaborate closely with Engineering and DC/NOC teams to align goals and priorities.
    • Ensure open and transparent communication within the team and with stakeholders, providing regular updates on incidents, progress, and initiatives.
Requirements:
  • Bachelor's degree in computer science or a related discipline
  • 3+ years of total experience in software application/product support
  • Ability to program in languages like Go and in scripting languages like Shell or Python
  • Prior experience in technical engineering is good to have
  • A proactive approach to identifying problems, performance bottlenecks, and areas for improvement
  • Must know networking, database (MySQL), and Linux system concepts, as well as debugging and analyzing core dumps
  • Hands-on experience with monitoring and observability tools like Grafana, Nagios, Influx, ELK, etc.
  • Familiarity with tools like Docker and Grafana, and with incident management systems like Zenduty
  • Excellent communication and collaboration skills, with the ability to work effectively across teams.
  • A self-motivated, positive mindset when examining incidents


Employment Type: Full Time, Permanent


What people at PubMatic are saying

Site Reliability Engineer salary at PubMatic

reported by 1 employee with 4 years exp.
₹17.1 L/yr - ₹21.9 L/yr
39% more than the average Site Reliability Engineer Salary in India

What PubMatic employees are saying about work life

based on 113 employees
Flexible timing: 92%
Monday to Friday: 88%
No travel: 69%
Day Shift: 67%

PubMatic Benefits

Submitted by Company
Paid time off
Wellness
Employee stock purchase program
Comprehensive benefit plans
Submitted by Employees
Work From Home
Free Food
Team Outings
Health Insurance
Cafeteria
Job Training

Compare PubMatic with

InMobi: 3.5
Komli Media: 4.0
Adcolony: 5.0
Affle: 3.1
Amagi Media Labs: 3.3
Vizury: 5.0
Smaato: 2.6
Madhouse: 5.0
Deloitte Digital: 3.9
Thomson Reuters: 4.1
Duck Creek Technologies: 4.4
CodeClouds: 4.5
FinThrive: 3.7
Grey Orange: 3.2
Mobileum: 3.3
SirionLabs: 3.8
AgreeYa Solutions: 3.3
OnProcess Technology: 3.8
NextGen Healthcare: 3.6
NortonLifeLock: 4.0

Similar Jobs for you

Site Reliability Engineer at Apple India Pvt Ltd

Bangalore / Bengaluru

5-10 Yrs

₹ 20-22 LPA

Reliability Engineering Manager at WebEx Communications India (P) Ltd.

Bangalore / Bengaluru

6-10 Yrs

₹ 11-15 LPA

DevOps Site Reliability Engineer at UST

Thiruvananthapuram

5-7 Yrs

₹ 12-16 LPA

Site Reliability Engineer at Bright Vision Technologies

Mumbai, Navi Mumbai

2-6 Yrs

₹ 12-16 LPA

Senior Site Reliability Engineer at GreyOrange

Gurgaon / Gurugram

5-10 Yrs

₹ 22.5-37.5 LPA

Site Reliability Engineer at Medallia, Inc.

Pune

6-8 Yrs

₹ 12-16 LPA

Site Reliability Engineer at UPS Pvt. Ltd.

Chennai

4-9 Yrs

₹ 17-22 LPA

Site Reliability Engineer at PhonePe

Bangalore / Bengaluru

1-6 Yrs

₹ 16-17 LPA

Site Reliability Engineer at Siemens Limited

Bangalore / Bengaluru

4-6 Yrs

₹ 20-27.5 LPA

Site Reliability Engineer at SOURCERIGHT TECHNOLOGIES (INDIA) PRIVATE LIMITED

Ahmedabad

5-10 Yrs

₹ 15-20 LPA

PubMatic Pune Office Location

Pune Office
PubMatic India Pvt. Ltd., 6th Floor, Amar Paradigm, Near D-mart, Baner Road, Pune, Maharashtra 411045

Site Reliability Engineer

4-5 Yrs

Pune

16hr ago·via naukri.com

Senior Software Engineer - Java

3-5 Yrs

Pune

16hr ago·via naukri.com

System Administrator

3-5 Yrs

Pune

5d ago·via naukri.com

Principal Software Engineer- Java

6-10 Yrs

Pune

8d ago·via naukri.com

UI Developer

3-5 Yrs

Pune

8d ago·via naukri.com

Business Intelligence Associate

2-5 Yrs

Pune

8d ago·via naukri.com

Director, Engineering

8-14 Yrs

Pune

30d ago·via naukri.com

Director, Engineering

8-14 Yrs

Pune

1mon ago·via naukri.com

Senior/Principal Software Development Engineer in Test

6-10 Yrs

Pune

1mon ago·via naukri.com

Customer Success Operations Manager US

3-7 Yrs

Gurgaon / Gurugram

3mon ago·via naukri.com