Lead Production Monitoring & Technical Support Engineer (10-14 yrs)
Employee Forums
posted 17hr ago
Fixed timing
Responsibilities :
Production Monitoring and Uptime Management :
- Manage 24/7 monitoring of payment systems, ensuring high availability with a target of 99.99% uptime.
- Design and maintain dashboards to track transaction metrics, system health, and workflow performance across various payment methods.
- Proactively identify and resolve performance bottlenecks, anomalies, and transaction failures.
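For context, the 99.99% uptime target above leaves only a small downtime budget. A minimal sketch of that arithmetic, assuming a 30-day month and a 365-day year (the function name `downtime_budget_minutes` is illustrative and not part of any tooling named in this posting):

```python
# Downtime allowed by an availability target (illustrative arithmetic only).
# Assumption: availability is measured over wall-clock time in the period.

def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Return the minutes of downtime permitted over `period_days`."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    for label, days in [("30-day month", 30), ("365-day year", 365)]:
        minutes = downtime_budget_minutes(99.99, days)
        print(f"{label}: {minutes:.2f} minutes of allowed downtime")
```

At 99.99%, this works out to roughly 4.3 minutes per 30-day month and about 52.6 minutes per year.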
Technical Support and Incident Management :
- Act as the escalation point for complex technical issues impacting payment transactions or customer experience.
- Lead incident response efforts, conduct root cause analysis (RCA), and implement corrective actions to prevent recurrence.
- Ensure adherence to SLAs and maintain detailed logs for reporting and compliance.
Change Management and Impact Mitigation :
- Participate in change management processes for application and system components, ensuring minimal disruption to service flows.
- Evaluate the potential impact of changes and help devise enhanced monitoring workflows/metrics to detect and mitigate issues.
- Correlate past incidents with current changes to proactively improve monitoring and risk management practices.
Data Analysis and Reporting :
- Analyze transaction patterns, operational metrics, and system logs to proactively detect potential failures.
- Implement monitoring metrics using ELK Stack (Elasticsearch, Logstash, Kibana) to identify trends, anomalies, and failure points.
- Prepare periodic reports on system performance, incident trends, and improvement initiatives for senior management.
Team Leadership and Collaboration :
- Lead, mentor, and motivate the production support team, fostering a culture of accountability, innovation, and operational excellence.
- Collaborate with cross-functional teams, including development, QA, and external partners, to resolve technical issues and optimize workflows.
Experience :
- 10+ years of experience in production monitoring, technical support, or technical operations for payment systems (banks, ecommerce, insurance, fintech, etc.).
- At least 3 years in a leadership role, with direct responsibility for 24x7 high-availability monitoring and troubleshooting.
Skills :
Payment Gateway Workflows :
- In-depth understanding of payment workflows across multiple payment methods, including cards, UPI, and netbanking, as well as reconciliation.
User Experience and Application Orchestration :
- Strong understanding of customer journeys and application workflows to identify potential friction points in the payment process.
System Correlation and Issue Diagnosis :
- Ability to correlate issues quickly to determine whether they are internal (application, network, database, load balancer, etc.) or external (payment provider, bank systems, etc.).
- Familiarity with system architecture breakdowns to isolate and address failures in specific components, such as API gateways, application servers, or database instances.
Monitoring and Optimization :
- Design and maintain real-time dashboards for transaction monitoring across multiple payment methods.
Incident Management and Troubleshooting :
- Expertise in performing root cause analysis (RCA) for production incidents and implementing preventive measures.
- Ability to troubleshoot and isolate issues and coordinate resolutions in distributed systems involving microservices, cloud-based architectures, and external integrations.
Collaboration and Leadership :
- Exceptional communication and cross-functional collaboration skills for working with developers, QA teams, third-party providers, and merchants.
- Demonstrated ability to lead and mentor technical teams, ensuring alignment with organizational goals.
Data Analysis and Reporting :
- Proficient in analyzing transaction data to identify patterns, anomalies, and opportunities for improvement.
- Ability to design real-time monitoring dashboards and implement proactive metrics to detect failures.
Additional Technical Skills :
- Knowledge of payment aggregator APIs, payment SDKs, and integration methods.
- Understanding of networking fundamentals, load balancing, and cloud infrastructure.
- Database query and optimization skills for systems like Oracle, MySQL, PostgreSQL, and NoSQL databases.
Soft skills :
- Exceptional problem-solving and analytical abilities.
- Strong communication skills for cross-functional collaboration and stakeholder management.
- Proven leadership experience in managing and mentoring technical teams.
Qualifications :
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Certified Payment Security Practitioner (CPSP) or equivalent certifications.
- ITIL Certification or experience with IT service management best practices.
- Prior experience in large-scale payment aggregation or fintech environments.
Location : Mumbai.
Functional Areas: Other