A leading global financial services firm is seeking a skilled Resiliency Engineer to join their dynamic team.
Key Responsibilities:
DR Automation: Develop and implement automated solutions for infrastructure components to streamline failover processes and reduce recovery time objectives (RTOs).
Recovery Plan Development: Create, maintain, and test comprehensive recovery plans for critical applications and systems.
DR Testing and Validation: Conduct regular DR drills and tests to validate recovery procedures and identify areas for improvement.
Infrastructure Automation: Utilize automation tools (e.g., Ansible) to automate infrastructure tasks and enhance resiliency.
Collaboration: Work closely with infrastructure, application, and business teams to ensure alignment with DR and business continuity planning (BCP) strategies.
Continuous Improvement: Identify opportunities to improve DR processes, reduce recovery times, and enhance overall system resilience.
Qualifications and Experience:
A bachelor's degree in engineering, computer science, or a similar discipline
5+ years of experience in infrastructure engineering and automation
3+ years of experience with cloud computing (AWS, Azure)
Strong proficiency in scripting languages (Python, PowerShell)
Familiarity with configuration management tools (Ansible, Puppet, Chef)
Knowledge of ITIL frameworks and best practices
Excellent problem-solving and troubleshooting skills