We are looking for a Senior Site Reliability Engineer who can work independently while providing mentorship and guidance to colleagues across multiple areas of technology.
You will be responsible for the reliability, scalability, security and performance of critical production services, from design to implementation, across multiple technical disciplines.
Reporting to the SRE Manager, you will be part of a team supporting 24x7 uptime of mission-critical services and infrastructure.
What we are looking for
A Sr. SRE has all the skills, knowledge, and dimension progression of an SRE and contributes at a higher level.
Respected as a technical leader, provides mentorship to many others across team boundaries.
A Sr. SRE can work independently while providing mentorship and guidance to the company in multiple areas of technology.
Sets the bar in all forms of communication and collaboration.
Contributes significant software architecture, strategy, design and code to the current (or future) version (or multiple concurrently developed versions) of a component, subsystem, system, application, or underlying technologies.
Evaluates new technologies and software products to determine the feasibility and desirability of incorporating their capabilities within the company's software systems and applications.
Works with third-party vendors to develop software and/or to integrate their software into the company's products.
Essential Skills and Experience
Typically, 5 to 10 years of experience as an SRE, DevOps or Software Engineer.
Expert knowledge of Kubernetes at scale and containerization technologies in general.
Expert with IaC tools such as Ansible and Terraform.
Good programming ability in an OOP language and ability to write scripts.
Expert, up-to-date knowledge of security best practices across all disciplines.
Expert with AWS and/or GCP.
Expert with metrics/monitoring and alerting using Prometheus, Grafana or similar.
Strong knowledge of networking fundamentals.
Commitment to standardisation and documentation.
Strong Linux experience.
Software delivery automation, CI/CD & SDLC.
Static and Dynamic Application Security Testing.
Knowledge of SRE principles (SLI, SLO, SLA, toil, uptime, observability).