10-14 years
Jaipur
1 vacancy
DevOps Engineer - Kubernetes & OpenShift Specialist
Trigyn Technologies
posted 22hr ago
Flexible timing
We are seeking a highly skilled DevOps Engineer with deep expertise in cloud architecture, containerization, and CI/CD automation to join our team. The ideal candidate will have experience managing and maintaining Kubernetes clusters, implementing CI/CD processes for large-scale environments, and optimizing system performance for dynamic environments. The candidate will be responsible for ensuring the high availability, scalability, and security of our containerized applications while integrating modern ops technologies for monitoring and incident management.
Key Responsibilities:
- Linux & Scripting: Expertise in Red Hat, Ubuntu, and Bash/Python scripting for automation.
- Kubernetes & OpenShift: Deploy, manage, and optimize multi-tenant clusters for scalability and reliability.
- CI/CD & DevOps: Design secure CI/CD pipelines using Jenkins, ArgoCD, GitLab, Docker, and Kubernetes.
- SCM & Build Tools: Strong understanding of branching models, peer reviews, Gradle, and Maven.
- Caching & Streaming: Manage Redis for caching and Kafka for scalable message streaming.
- Networking & Load Balancing: Optimize traffic using Apache, Nginx, and HAProxy.
- Microservices & Observability: Deploy and monitor microservices with Prometheus, ELK, and Grafana.
- Infrastructure Automation: Implement GitOps, Terraform, Ansible, Helm charts, and Kubernetes Operators.
- Troubleshooting & Performance: Optimize container orchestration, networking, and load balancers.
- Disaster Recovery: Design DR strategies, automate failover, conduct DR drills, and ensure business continuity.
- Support & Collaboration: Work with data center teams and OpenShift OEM support for seamless operations.
Day-to-Day Responsibilities:
Daily tasks will involve a mix of infrastructure management, automation, troubleshooting, and collaboration with multiple teams.
- Check Alerts & System Health:
- Review dashboards in Prometheus, Grafana, and ELK (Elastic Stack) to ensure Kubernetes/OpenShift clusters, Redis, Kafka, and CI/CD pipelines are healthy.
- Check for failed deployments, pod crashes, or performance anomalies in Kubernetes/OpenShift environments.
- Review logs from web servers (Apache/Nginx), load balancers, and containers for potential issues.
- Coordinate with data center teams to ensure smooth operations of on-prem Kubernetes/OpenShift clusters.
- Escalate any critical infrastructure issues to the OpenShift OEM support team for resolution.
- Sync with Dev, QA, and Security teams to discuss ongoing tasks, blockers, and priorities.
- Collaborate on upcoming deployments, infrastructure changes, and security updates.
- Work on Jenkins, ArgoCD, Tekton, or GitLab CI/CD to optimize build, test, and deployment pipelines.
- Fix pipeline failures, optimize build times, and integrate new security scans like Trivy and SonarQube.
- Automate Infrastructure & Kubernetes Deployments:
- Write or update Helm charts, Kubernetes manifests, and OpenShift Operators.
- Improve Infrastructure as Code (IaC) using Terraform or Ansible to automate cloud and on-prem deployments.
- Work on auto-scaling, high availability, and failover mechanisms for Kubernetes/OpenShift workloads.
- Investigate networking issues, pod failures, slow response times, and load balancer misconfigurations.
- Debug containerized applications running in OpenShift/Kubernetes, checking logs and resource usage.
- Optimize Kafka topics, Redis caching strategies, and database connections for better performance.
- Apply security patches and updates for Kubernetes, OpenShift, and infrastructure components.
- Monitor container security vulnerabilities and ensure compliance with internal security policies.
- Implement and update RBAC (Role-Based Access Control) policies for secure cluster access.
- Discuss new application deployments, microservices scaling strategies, and API gateway configurations.
- Help developers troubleshoot issues related to Kubernetes, networking, or build failures.
- Design improvements for logging, monitoring, and observability using tools such as Loki and ELK.
- Propose performance tuning strategies for Kubernetes nodes, storage, and compute resources.
- Work on hybrid cloud strategies, ensuring Kubernetes workloads run efficiently on-prem and in the cloud.
- Document new deployments, troubleshooting steps, and best practices for future reference.
- Share insights on lessons learned from recent incidents and outages.
- Review upcoming deployments, maintenance windows, and potential challenges.
- Review resource consumption (CPU, memory, storage) and optimize scaling rules.
- Perform regular security audits, certificate renewals, and Kubernetes policy reviews.
- Ensure OpenShift/Kubernetes clusters follow security best practices (RBAC, Network Policies, Pod Security Standards).
- Ensure image security and vulnerability scanning are enforced across pipelines.
- Design, implement, and maintain a Disaster Recovery (DR) strategy for Kubernetes/OpenShift environments.
- Conduct regular DR drills to ensure system reliability and business continuity.
- Automate failover and recovery processes, ensuring minimal downtime during unexpected failures or planned maintenance.
- Work closely with data center teams to ensure smooth DR operations, including network and storage replication.
- Validate database consistency, transaction integrity, and application availability during DR exercises.
Employment Type: Full Time, Permanent