Smarsh - Senior Site Reliability Engineer II - Incident Management (8-10 yrs)
Smarsh
posted 25d ago
Flexible timing
Smarsh is the leader in communications compliance, archiving, and analytics.
We provide compliance across the broadest set of communications channels with insights on what's being captured. Smarsh customers manage over 500 million daily conversations across 80 channels and growing.
Customers include the top 10 U.S., top 8 European, top 5 Canadian, and top 3 Asian banks. The Smarsh advantage is that customers stay ahead of compliance and uncover patterns and relationships hidden within their data.
At Smarsh, we've been helping our customers manage new forms of communication since 1998.
We work closely with regulators including the SEC, FINRA, IIROC, and the PRA and FCA, and with our customers, to ensure that they understand the capabilities of today's technology and that our platform meets their most stringent requirements.
Our products include Connected Capture, Connected Archive, Web Archive & Business Solutions.
About the team:
Are you an SRE with excellent Observability, Containerization and Orchestration skills? As a Site Reliability Engineer (SRE) in the Smarsh SaaS Operations team, you'll be part of a team that measures and improves production performance and reliability through sustainable engineering practices for our suite of applications.
Toil will be your number one enemy and observability your closest friend, and your mission will be to drive operational burden as close to zero as you can.
Responsibilities:
- Attend and actively participate in team ceremonies (stand-ups, retros, and planning meetings).
- Occasionally run these meetings.
- Respond to incidents coordinated by SRE and Incident Response teams.
- Act as an Incident Commander during incidents.
- Help define technology choices, best practices, and processes for the team.
- Develop and maintain documentation standards for the team.
- Develop tools and libraries for broader use by SaaS Operations and Engineering teams.
- Enable engineering teams to discover and understand problems more quickly.
- Work closely with Engineering and peer SRE teams to design and use Smarsh coding standards and best practices.
- Work with product architects to suggest architectural changes and design platform component roadmaps.
- Coordinate with other senior leaders to help set process and direction for the platform as a whole.
- Develop novel DevOps tools and systems that aren't used anywhere else.
- Demonstrate technical leadership to groups inside and outside the company.
- Assist engineering teams in deep troubleshooting and application code review to find opportunities to improve performance and scalability.
- Collaborate with the team during US hours and provide weekend support if needed.
- Act as a subject matter expert (SME) for the majority of platform components.
- Adopt and embrace the qualities of an SRE as defined in our team charter.
- Help set those qualities for the rest of the team.
- Mentor and train junior members of the organization.
- Design training curriculum for the Ops organization as a whole.
Desired skills & experience:
- A minimum of 8-10 years of industry experience.
- Master's in CS or equivalent desired.
- Cloud infrastructure, identity management, and networking experience (GCP).
- Experience managing Elasticsearch and Hadoop infrastructure.
- Experience managing MySQL databases.
- Experience with Ansible and Terraform.
- Experience with builds and packaging in a Linux and Java environment strongly preferred.
- Broad range of programming/scripting experience (e.g., Python, Bash, Go).
- Strong background in managing code with Git.
- Experience managing continuous integration systems (Jenkins).
- Experience with automated configuration management and deployment tools.
- Background working in a multi-platform environment (Linux, Windows).
- Experience with containerization (Docker, Kubernetes, etc.).
- Experience with Datadog.
Additional Skills:
- Exceptional analytical and problem-solving skills.
- Expert-level administration and/or programming skills in relevant languages.
- Strong communication and collaboration skills.
- Deep understanding of modern software architecture.
- Deep domain knowledge of the industry, platform, and existing processes.
- Fault-tolerant design & maintenance.
- Knowledge and understanding of modern software programming/engineering.
- Experience across the product delivery lifecycle, from requirement refinement through operations.
Why Smarsh?
- Smarsh hires lifelong learners with a passion for innovating with purpose, humility and humor.
- Collaboration is at the heart of everything we do.
- We work closely with the most popular communications platforms and the world's leading cloud infrastructure platforms.
- We use the latest in AI/ML technology to help our customers break new ground at scale.
- We are a global organization that values diversity, and we believe that providing opportunities for everyone to be their authentic self is key to our success.
- Smarsh leadership, culture, and commitment to developing our people have all garnered Comparably.com Best Places to Work Awards.
- Come join us and find out what the best work of your career looks like.
Functional Areas: Software/Testing/Networking