Position Summary
The Site Reliability Engineer is a hands-on technical role requiring deep, modern technical experience. You will be responsible for supporting consulting on, and enablement of, SRE practices. You will also contribute to the automation, build and operations of our modern private and public cloud platforms. You will design platforms and oversee their implementation, and will mainly be responsible for their ongoing operations to make them reliable. You will follow Site Reliability Engineering principles and will be encouraged to contribute your own best practices and ideas to our ways of working. You will also be responsible for coaching customer delivery teams in SRE practices and principles. Reporting to the Head of Cloud Native Operations, you will be an experienced thought leader, comfortable engaging senior managers and technologists. You will engage with clients, display technical leadership, and guide the creation of efficient and complex products/solutions.
Key Responsibilities
Technical & Architectural Leadership
Contribute to the technical delivery of projects, ensuring a high quality of work that adheres to best practices, brings innovative approaches and meets client expectations. Project types include, but are not limited to, solution architecture, proofs of concept (PoCs), MVPs, SRE Kickstart projects and platform observability projects.
Contribute to thought leadership across the Cloud Native domain with an expert understanding of Open Source (e.g., Kubernetes / CNCF) and partner technologies.
Contribute to the creation of innovative new services that help clients accelerate their modernization journeys.
Work on the design of new, next-generation solutions based on modern Cloud Native architectural principles.
Collaborate with global peers from partner ecosystems on joint technical projects. This partner ecosystem includes Google, Microsoft, AWS, IBM, Red Hat, Intel, Cisco and Dell / VMware, among others.
Service Delivery
Provide a hands-on technical contribution, combining solution and technical architecture with hands-on cloud native platform engineering.
Use modern platform and site reliability engineering practices.
Ensure the reliable and efficient service operations of cloud native workloads.
Ensure effective communication and presentation when coaching different delivery teams.
Provide client-facing influence and guidance, engaging in consultative client discussions and performing a Trusted Advisor role.
Provide effective support to Sales and Delivery teams.
Support sales pursuits and enable revenue growth.
Define the modernization strategy for client platforms and associated IT practices, create solution architecture and provide oversight of the client journey.
Innovation & Initiative
Always maintain hands-on technical credibility, stay ahead of the industry, and be prepared to show and lead the way forward for others.
Engage in technical innovation and support our position as an industry leader.
Actively contribute to sponsorship of leading industry bodies such as the CNCF and Linux Foundation.
Contribute to thought leadership by writing whitepapers and blogs and by speaking at industry events.
Be a trusted, knowledgeable internal innovator driving success across our global workforce.
Client Relationships
Advise on best practices related to platform engineering and cloud native operations, run client briefings and workshops, and engage technical leaders in a strategic dialogue.
Develop and maintain strong relationships with client stakeholders.
Perform a Trusted Advisor role.
Contribute to technical projects with a strong focus on technical excellence and on-time delivery.
Mandatory Skills & Experience
Real industry experience of applying Site Reliability Engineering practices to stabilize services and increase their reliability.
Experience with open-source and enterprise Kubernetes platforms such as Red Hat OpenShift, VMware Tanzu and Kubernetes.
Experience in setting up infrastructure for current technologies such as Kubernetes, serverless, containers and microservices.
Experience in scripting / programming to automate deployments and testing, including work with tools like Terraform and Ansible, and with scripting and configuration languages such as Python, Bash and YAML.
Consulting and enablement experience in Site Reliability Engineering best practices.
Experience with open-source and enterprise CI/CD tool sets such as Argo CD and Jenkins (others like Jenkins X, CircleCI, Tekton, Travis and Concourse are an advantage).
Experience with the GitHub/DevOps lifecycle.
Experience with observability solutions (Prometheus, EFK stacks, ELK stacks, Grafana, Dynatrace, AppDynamics, etc.).
Familiarity with cloud native networking, including the use of DNS, load balancers, VPNs, routing/switching, WAFs and reverse proxies.
Significant experience with microservices-based, container-based or similar modern approaches to applications and workloads.
Exemplary verbal and written communication skills (English). Able to interact and influence at the highest level, you will be a confident presenter and speaker, able to command the respect of your audience.
Desired Skills & Experience
Bachelor-level technical degree or equivalent experience; Computer Science or Engineering background preferred; Master's degree desired.
Hands-on Site Reliability Engineer and consultant who understands Kubernetes to a good level (CKA preferred) and understands site reliability concepts.
The ideal candidate will already be working within a System Integrator, Consulting or Enterprise organisation, with 8+ years of experience in a technical role within the Cloud domain.
Deep understanding of core practices including SRE, Agile, Scrum, XP and Domain-Driven Design.
Familiarity with the CNCF open-source community.
You enjoy working in a fast-paced environment using the latest technologies, love Labs' dynamic and high-energy atmosphere, and want to build your career with an industry leader.