StockX - Senior MLOps Engineer (8-11 yrs)
StockX
posted 17hr ago
Flexible timing
Why you'll love this role :
StockX is an established global startup headquartered in the USA, with development offices in Bangalore, India.
We are seeking a versatile and skilled MLOps Engineer with expertise in DevOps, CloudOps (preferably AWS Cloud), and foundational knowledge of Data Engineering and Software Engineering.
The ideal candidate should have 8+ years of relevant experience, a solid understanding of deploying, managing, and scaling machine learning pipelines in production, and experience in backend software development and API design.
What you'll do :
- Design, build, and maintain end-to-end machine learning pipelines, including model training, validation, deployment, monitoring, and updating.
- Implement CI/CD pipelines tailored for machine learning workflows.
- Enhance ML Pipeline Efficiency : Directly improve the robustness, scalability, and performance of our TensorFlow and Kubeflow pipelines.
- Focus on optimizing model training and serving processes, minimizing downtime, and automating routine tasks to increase operational efficiency.
- Drive Platform Scaling & Innovation : Take a leadership role in expanding our ML platform's capacity to manage larger data volumes and more complex models.
- Research and integrate cutting-edge technologies, develop scalable architectures, and elevate system performance and efficiency through continuous enhancements.
- Establish MLOps Excellence : Design and implement robust MLOps frameworks that streamline the integration, continuous deployment, and monitoring of ML models.
- Set up comprehensive CI/CD pipelines, automate testing, and create monitoring tools to proactively track model performance and detect issues.
- Foster Cross-Functional Collaboration : Partner with data scientists, software engineers, and product teams to transform business requirements into scalable and dependable machine learning solutions.
- Bridge the gap between model development and deployment, ensuring models are production-ready and align with performance standards.
- Overcome Production Challenges : Proactively monitor, troubleshoot, and resolve issues affecting model performance, data pipeline integrity, and system efficiency.
- Identify root causes and implement strategic solutions to ensure the ongoing stability and performance of our ML infrastructure.
- Streamline cloud infrastructure (AWS preferred), automate deployments with IaC tools, and ensure scalable, reliable, and high-performing ML workflows.
- Deploy ML models with containerization and orchestration, and establish monitoring systems for performance, data drift, and system health.
- Develop scalable backend solutions, integrate ML models via APIs, and follow clean code practices; frontend expertise is optional.
- Collaborate across teams to align ML solutions with business goals and document processes for reproducibility and excellence.
- Support data engineering by integrating pipelines, optimizing storage, and preprocessing datasets for ML workloads.
About you :
Educational and Technical Foundation :
- Bachelor's degree in Computer Science or a related technical field or equivalent practical experience.
- You should have solid experience in maintaining and scaling machine learning pipelines using TensorFlow and Kubeflow.
- Knowledge of ML Engineering, including model architecture design and hyperparameter tuning.
- Advanced MLOps Proficiency : At least 3 years of experience in ML Engineering, with expertise in deploying models and managing ML workflows, and familiarity with MLflow, TFX, or Airflow.
- Strategic Problem Solver with Collaborative Spirit : You excel at solving complex problems at scale and have a proven ability to work effectively within collaborative, fast-paced, cross-functional teams.
- You're adept at communicating technical concepts across various stakeholder groups, ensuring alignment and understanding.
- Innovative Tech Enthusiast with Cloud Expertise : Your proactive approach drives you to continually seek improvements in ML development and deployment processes.
- You have a strong knowledge of cloud platforms, particularly AWS, and experience with containerization tools like Docker and Kubernetes.
- Hands-on experience with AWS Cloud (EC2, S3, RDS, Lambda, SageMaker, etc.).
- Strong programming skills in Python, with experience in frameworks like TensorFlow, PyTorch, and scikit-learn; LLM / LangChain / agent experience is a plus.
Nice to have skills :
- Understanding of embeddings / vector databases and feature stores.
- Knowledge of monitoring tools like Prometheus, Grafana, or CloudWatch.
- Experience with other cloud platforms (GCP, Azure) is a plus.
- Understanding of model explainability, fairness, and ethical considerations in AI/ML.
- Experience with version control tools like Git and ML model versioning tools like MLflow or DVC.
Functional Areas: Other