Cortex Consulting
Observability Architect - Azure Platform (8-18 yrs)
posted 22hr ago
Fixed timing
Position Summary :
We are seeking an experienced Observability Architect to define, implement, and maintain an end-to-end monitoring, alerting, and auditing strategy for our Azure-based cloud banking platform. This role involves close collaboration with Infrastructure, Security, Application Development, and Compliance teams to ensure that all components, from microservices to external SaaS integrations, are instrumented and monitored effectively.
You will guide teams on application-level changes necessary for robust observability, such as transaction IDs, correlation IDs, error/alert classification, and more.
About the Platform :
1. Azure Cloud : Primary hosting environment, leveraging services like AKS, Azure SQL, Azure Key Vault, Azure Event Hubs, Azure Cosmos DB, Azure Monitor, Azure Log Analytics, etc.
2. Banking Applications : React Native, React.js, and Node.js microservices in a containerized architecture.
3. Integration Layer : IBM DataPower, IBM ACE, IBM MQ handling complex financial integrations.
4. SaaS Providers : Auth0, Finxact, Savana, Finzly, Fiserv, Alloy, Box, Envestnet, Debit card acquirer/issuer, PaymentUS, etc.
5. APIs : Hundreds of APIs coordinating across multiple internal and external services.
6. Data Sensitivity : Handling PII and financial data (SSN, EIN, DL, financial information) in strict compliance with regulatory standards.
Key Responsibilities :
1. Observability Architecture & Strategy :
- Define a holistic observability framework incorporating metrics, logs, traces, and events across distributed microservices and integration layers.
- Recommend application-level changes (e.g., transaction IDs, correlation IDs, structured logging, error/alert classification) to enable robust end-to-end monitoring.
2. Implementation & Tooling :
- Deploy monitoring solutions using Azure Monitor, Azure Log Analytics, Application Insights, and other third-party or open-source tools.
- Configure dashboards, alerts, and real-time anomaly detection to facilitate efficient troubleshooting and proactive issue detection.
3. Application Instrumentation & Impact :
- Work with development teams to embed observability best practices into the codebase, ensuring minimal instrumentation overhead.
- Champion ID standards (transaction IDs, correlation IDs) and structured logging formats for consistent tracing across services.
- Develop guidelines for error and alert classification to differentiate critical system failures from minor or user-level issues.
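For illustration only, the kind of ID standard and structured logging format described above might look like the following minimal Python sketch. It is not the platform's actual implementation: the logger name, the `correlation_id` context variable, the `error_class` field, and the `handle_request` helper are all hypothetical names chosen for the example.

```python
import contextvars
import io
import json
import logging
import uuid

# Hypothetical context variable carrying the correlation ID of the current request.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object so log queries can filter on fields."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
            # Illustrative classification field, e.g. "system" vs "user".
            "error_class": getattr(record, "error_class", None),
        })


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("payments-sketch")  # assumed name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, incoming_id=None):
    """Reuse the caller's ID when present so a trace spans services; otherwise mint one."""
    cid = incoming_id or str(uuid.uuid4())
    token = correlation_id.set(cid)
    try:
        logger.info("transfer accepted")
        # A user-level issue is tagged so alerting can treat it differently
        # from a critical system failure.
        logger.warning("invalid account number", extra={"error_class": "user"})
    finally:
        correlation_id.reset(token)
    return cid
```

Calling `handle_request(build_logger(io.StringIO()), "abc-123")` would emit two JSON lines that share the same `correlation_id`, with the second carrying `"error_class": "user"`.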
4. Data Management & Real-Time Processing :
- Address data volume and noise challenges by optimizing data pipelines, including real-time ingestion, filtering, and storage.
- Define retention policies and cost-optimization strategies for logs and metrics at scale.
5. Compliance & Auditing :
- Collaborate with Security and Compliance teams to meet banking regulations (FFIEC, PCI-DSS, SOC 2, etc.), ensuring all observability data is securely stored and auditable.
- Establish and maintain detailed audit trails of all critical banking transactions and data access.
6. Performance & Reliability :
- Conduct capacity planning and performance tuning to ensure systems can scale while remaining highly available.
- Implement SRE practices for resilience and reliability, including automated rollbacks and failover strategies.
7. Collaboration & Knowledge Sharing :
- Create runbooks, best practices, and documentation for cross-functional teams to effectively utilize observability tools.
- Lead post-incident reviews and blameless postmortems, driving iterative improvements to monitoring and alerting.
8. Stakeholder Communication :
- Present key metrics, observations, and recommendations to leadership, translating technical data into actionable insights.
- Serve as a liaison with third-party providers (IBM, SaaS vendors) to ensure consistent data formats and observability coverage.
Required Qualifications :
Education & Experience :
- Bachelor's degree in Computer Science, Information Systems, or a related field (Master's preferred).
- 8+ years of experience in IT operations, DevOps, or Cloud Architecture with a focus on observability and monitoring.
Technical Skills :
- Deep expertise in Azure services (Azure Monitor, Log Analytics, Application Insights, Key Vault, etc.) and Kubernetes (AKS).
- Proficiency in distributed tracing, logging, and metrics (e.g., OpenTelemetry, Prometheus, Grafana, ELK/Splunk).
- Familiarity with middleware products such as IBM DataPower, ACE, and MQ, or similar integration products.
- Familiarity with React.js, React Native, Node.js, or similar technologies.
- Skilled in scripting/automation (PowerShell, Python, etc.).
- Demonstrable experience implementing application-level instrumentation (transaction IDs, correlation IDs, structured logging) in microservices.
Banking & Compliance Knowledge :
- Familiarity with financial regulations (FFIEC, PCI-DSS, SOC 2) and designing solutions that maintain compliance.
- Experience with PII handling, encryption, and security best practices.
Preferred Qualifications :
- Certifications : Microsoft Azure Solutions Architect Expert, Azure DevOps Engineer Expert, or related.
Functional Areas: Other