Head - Platform Resilience - Retail Banking Technology (18-30 yrs)
Aarch Solutions
posted 13hr ago
Fixed timing
Key skills for the job
- The Head of Platform Resilience for Global Retail Banking Technology is responsible for designing, leading, and executing the resilience strategy for our global retail banking platforms.
- This role will ensure that all customer-facing platforms and critical applications are reliable, scalable, secure, and capable of withstanding and recovering from disruptions.
- This leader will collaborate across engineering, infrastructure, risk, and business teams to drive operational resilience, uphold high availability standards, and implement innovative approaches for continuous service delivery.
Qualifications:
What you will need to succeed in the role:
- 10+ years of experience in resilience engineering, site reliability engineering, infrastructure management, or a related field within large-scale technology environments.
- Proven track record in managing resilience for complex, customer-facing applications, ideally within the banking or financial services industry.
- Strong understanding of platform engineering, high availability architectures, disaster recovery planning, and risk management.
- Deep experience in cloud (AWS, Azure, GCP) and hybrid environments, with knowledge of resilience engineering practices for cloud infrastructure.
- Excellent incident management skills and experience in root cause analysis, postmortem processes, and operational risk management.
- Ability to lead and influence global teams.
- Experience leading and directing executive and non-executive work groups and effecting change through people.
- Experience managing operational functions and directing process reengineering and efficiency exercises.
- Strong competencies in people leadership, teamwork, information gathering and analysis, judgment and decision making, and communication, with the ability to influence global teams.
- Respectful of different cultures, working with colleagues from across all 5 regions (North America, LATAM, Middle East, Asia Pacific and Europe).
- Consultancy approach and skillset with the ability to identify and articulate complex problems and solutions.
- Strong understanding of operational effectiveness and strong delivery drive.
What you'll do:
- Develop and execute the platform resilience strategy, aligning with the bank's broader business continuity and risk management frameworks.
- Define and champion best practices for resilience, recovery, and fault tolerance within the global retail banking technology ecosystem.
- Influence and drive cultural change around resilience by fostering proactive thinking around risk, disaster recovery, and high availability.
- Drive adoption of cloud-native, microservices, and containerization practices that support modular and resilient system design.
- Exceptional leadership and team management skills, with experience leading cross-functional teams and influencing at executive levels.
- Excellent communication skills, capable of translating complex technical details into business language for stakeholders at all levels.
- Oversee the design, implementation, and management of systems for resilience, failover, and disaster recovery for mission-critical banking applications.
- Deliver on Group Resiliency metrics and KPIs such as Mean Time to Recover, Number of Incidents, and Customer Outage.
- Monitor system reliability, availability, and performance, and drive initiatives for continuous improvement.
- Lead incident response and postmortem processes, and drive learnings to improve systems and processes.
- Drive proactive resilience measures, including chaos engineering, load testing, automated recovery and tabletop exercises, to simulate real-world disruptions.
- Partner with engineering, infrastructure, cybersecurity, and business teams to embed resilience into the development lifecycle, from initial design through to deployment and operation.
- Communicate effectively with senior leadership and key stakeholders on platform resilience status, risks, and improvements. Influence technology and business stakeholders to prioritize resilience as a critical component of product and feature development.
- Stay updated on industry trends, technologies, and methodologies related to resilience engineering, disaster recovery, and risk management.
- Lead initiatives to incorporate advanced technologies, such as AI/ML for anomaly detection and automated recovery, to enhance resilience capabilities.
- Ensure technology goals are identified, communicated, documented, agreed and delivered in the most cost-effective manner possible through to the completion of the successful pilot deployment.
- Proactively manage risks (including cyber security), implement controls, and test control effectiveness.
- Ensure control monitoring is conducted and reported to risk owners.
- Support audit and independent programme assessments as required.
Functional Areas: Other