DevOps Engineer - Cloud Infrastructure (3-4 yrs)
TrueFan
posted 18hr ago
Flexible timing
Company Description :
TrueFan uses proprietary AI technology to connect fans and celebrities and is now focused on revolutionising customer-business interactions with AI-powered personalized video solutions. Our platform enables brands to create unique, engaging video experiences that drive customer loyalty and deeper connections.
Job Description :
Role : DevOps Engineer
Company Overview :
We are a cutting-edge AI company focused on developing advanced lip-syncing technology using deep neural networks.
Our solutions enable seamless synchronisation of speech with facial movements in videos, creating hyper-realistic content for various industries such as entertainment, marketing, and more.
Key Responsibilities :
- Design, implement, and maintain scalable and automated pipelines for deploying deep neural network models.
- Monitor and manage production models, ensuring high availability, low latency, and smooth performance.
- Automate workflows for data preprocessing (face alignment, feature extraction, audio analysis), model retraining, and video generation.
- Implement logging, tracking, and monitoring systems to ensure data integrity and visibility into the model lifecycle.
Infrastructure Management :
- Build and manage cloud-based infrastructure (AWS, GCP, or Azure) for efficient model training, deployment, and data storage.
- Collaborate with DevOps to manage containerization (Docker, Kubernetes) and ensure robust CI/CD pipelines using GitHub and Jenkins for model delivery.
- Monitor resource usage for GPU/CPU-intensive tasks like video processing, model inference, and training using Prometheus, Grafana, Alertmanager, and the ELK stack.
Collaboration :
- Work closely with ML engineers to integrate models into production pipelines.
- Provide tools and frameworks for rapid experimentation and model versioning.
Required Skills :
- Basic Python
- Strong experience with cloud platforms (AWS, GCP, Azure) and cloud-based machine learning services.
- Expert knowledge of containerization technologies (Docker, Kubernetes) and infrastructure-as-code (Terraform, CloudFormation).
- Understanding of deploying both synchronous and asynchronous APIs using Flask, Django, Celery, Redis, RabbitMQ, and Kafka.
- Experience deploying and scaling AI/ML in production.
- Familiarity with deep learning frameworks (TensorFlow, PyTorch).
- Familiarity with video processing tools like FFmpeg and Dlib for handling dynamic frame data.
- Basic understanding of ML models
Preferred Qualifications :
- Experience in image and video-based deep learning tasks.
- Familiarity with media streaming and video processing pipelines for real-time generation.
- Experience with real-time inference and deploying models in latency-sensitive environments.
- Strong problem-solving skills with a focus on optimising machine learning model infrastructure for scalability and performance.
Functional Areas: Software/Testing/Networking