Spectral Consultants
Senior Site Reliability Engineer - AWS Cloud (3-5 yrs)
posted 16d ago
Flexible timing
What You'll Do:
As a Senior Cloud Site Reliability Engineer, you will work with a team of operations engineers and software developers to analyze, maintain and nurture our Cloud solutions/products in support of the company's ever-growing clientele.
As a technical expert, you will work closely with various teams to ensure the stability of the environment by:
- Analyzing the current state, designing appropriate solutions and working with the team to implement them.
- Coordinating emergency responses, performing root cause analysis, and identifying and implementing solutions to prevent recurrences
- Working with the team to identify ways to increase MTBF and lower MTTR for the environment
- Reviewing the entire application stack and executing initiatives to reduce failures, defects and overall performance issues
- Identifying and working with the team to implement more efficient system procedures
- Maintaining environment monitoring systems to provide the best visibility into the state of the deployed products/solutions.
- Performing root cause analysis on incoming infrastructure alerts and working with teams to resolve them
- Maintaining performance analysis tools, identifying any adverse changes to performance and working with the teams to resolve them
- Researching industry trends and technologies, and promoting adoption of best-in-class tools and technologies
- Taking the initiative to advance the quality, performance or scalability of our Cloud solutions by influencing the architecture or design of our products
- Designing, developing and executing automated tests to validate solutions and environments
- Troubleshooting issues across the entire stack: infrastructure, software, application and network
What You'll Bring:
- 3+ years' experience working as a Site Reliability Engineer or an equivalent position
- 2+ years' experience with AWS cloud technologies; at least one AWS certification (Solutions Architect / DevOps Engineer) is required
- 1+ years' experience functioning as a senior member in an infrastructure/software team
- Hands-on experience with AWS services like EC2, RDS, EMR, CloudFront, ELB, API Gateway, CodeBuild, AWS Config, Systems Manager, Service Catalog, Lambda, etc.
- Full-stack IT experience with *nix, Windows, network/firewall concepts, source control (Bitbucket), build/dependency management, and continuous integration systems (TeamCity, Jenkins)
- Expertise in at least one scripting language, Python preferred
- Firm understanding of application reliability, performance tuning and scalability
- Exposure to the big data technology stack (Spark, Hadoop, Scala, etc.) is preferred
- Solid knowledge of infrastructure and cloud-native services along with network technologies
- Solid understanding of RDBMS and cloud database engines such as PostgreSQL, MySQL, etc.
- Firm understanding of clusters, load balancers and CDNs
- Experience in fault-tolerant system design.
- Familiarity with Splunk data analysis, Datadog or similar tools is a plus
- A Bachelor's degree (Master's preferred) in a related technical field
- Excellent analytical, troubleshooting and communication skills
- Strong verbal, written and team presentation skills; fluency in English is required
- This role requires healthy doses of initiative and the ability to remain flexible and responsive in a very dynamic environment
- Ability to quickly learn new platforms, languages, tools, and techniques as needed to meet project requirements
Functional Areas: Software/Testing/Networking