Our client is looking for people to help build ML platforms for its Fortune 500 customers. You will be a key member of the client's GenAI delivery organization, leading a team of client engineers across different skill sets.
Required skills
10+ years of professional experience building applications using cloud services. Prior experience building machine learning platforms using cloud services.
Cloud expertise: Deep knowledge of cloud platforms like AWS, Google Cloud Platform, or Azure, including their machine learning and data services (Azure preferred).
DevOps skills: Experience with CI/CD pipelines, infrastructure as code, and containerization technologies like Docker and Kubernetes.
Machine learning knowledge: Understanding of ML workflows, model training, and deployment processes.
Data engineering: Familiarity with data pipelines, ETL processes, and data storage solutions.
Software engineering: Strong programming skills, particularly in languages commonly used in ML like Python.
System design: Ability to architect scalable, reliable systems that integrate various services.
Automation: Expertise in automating workflows and processes across the ML lifecycle.
Security and compliance: Knowledge of best practices for securing ML pipelines and ensuring regulatory compliance.
Monitoring and logging: Experience setting up monitoring and logging for ML systems.
Collaboration: Ability to work with data scientists, software engineers, and other stakeholders.
Roles and responsibilities
Evaluate and select appropriate cloud services for each stage of the ML lifecycle
Design and implement the overall architecture of the MLOps platform
Set up automated pipelines for data preparation, model training, and deployment
Implement version control for code, data, and models
Ensure the platform is scalable, secure, and compliant with relevant regulations
Provide tools and interfaces for data scientists to easily leverage the platform
Continuously optimize the platform for performance and cost-efficiency
This role is crucial in bridging the gap between data science and operations, enabling organizations to efficiently develop, deploy, and maintain machine learning models at scale.