As a Senior Lead Machine Learning Engineer on the Document Platforms and AI team, you will play a critical role in building the next generation of data extraction tools, working on cutting-edge ML-powered products and capabilities that power natural language understanding, information retrieval, and data sourcing solutions for the Enterprise Data Organization and our clients. This is an exciting opportunity to shape the future of Document Platforms and AI and see your work make a real difference, all while having fun in a collaborative and engaging environment. You'll spearhead the development and deployment of production-ready AI products and pipelines, leading by example and mentoring a talented team. This role demands a deep understanding of machine learning principles, hands-on experience with relevant technologies, and the ability to inspire and guide others. You'll be at the forefront of a rapidly evolving field, learning and growing alongside some of the brightest minds in the industry.
The Impact:
- The Document Platforms and AI team has already delivered breakthrough products and significant business value over the last 3 years.
- In this role you will develop our next generation of products while enhancing existing ones, with the aim of solving high-impact business problems.
What's in it for you:
- Be a part of a global company and build solutions at enterprise scale
- Collaborate with a highly skilled and technically strong team
- Contribute to solving high-complexity, high-impact problems
Responsibilities:
- Build production-ready data acquisition and transformation pipelines from ideation to deployment.
- Be a hands-on problem solver and developer, helping to extend and manage the data platforms.
- Apply best practices in data modeling and building ETL pipelines (streaming and batch) using cloud-native solutions.
- Technical leadership: Drive the technical vision and architecture for the extraction project, making key decisions about model selection, infrastructure, and deployment strategies.
- Model development: Design, develop, and evaluate state-of-the-art machine learning models for information extraction, leveraging techniques from NLP, computer vision (if applicable), and other relevant domains.
- Data preprocessing and feature engineering: Develop robust pipelines for data cleaning, preprocessing, and feature engineering to prepare data for model training.
- Model training and evaluation: Train, tune, and evaluate machine learning models, ensuring high accuracy, efficiency, and scalability.
- Deployment and monitoring: Deploy and maintain machine learning models in a production environment, monitoring their performance and ensuring their reliability.
- Research and innovation: Stay up to date with the latest advancements in machine learning and NLP, and explore new techniques and technologies to improve the extraction process.
- Collaboration: Work closely with product managers, data scientists, and other engineers to understand project requirements and deliver effective solutions.
- Code quality and best practices: Ensure high code quality and adherence to best practices for software development.
- Communication: Effectively communicate technical concepts and project updates to both technical and non-technical audiences.
What We're Looking For:
- 8-10 years of professional software work experience, with a strong focus on Machine Learning, Natural Language Processing (NLP) for information extraction, and MLOps
- Expertise in Python and related NLP libraries (e.g., spaCy, NLTK, Hugging Face Transformers)
- Experience with Apache Spark or other distributed computing frameworks for large-scale data processing.
- AWS/GCP Cloud expertise, particularly in deploying and scaling ML pipelines for NLP tasks.
- Solid understanding of the Machine Learning model lifecycle, including data preprocessing, feature engineering, model training, evaluation, deployment, and monitoring, specifically for information extraction models.
- Experience with CI/CD pipelines for ML models, including automated testing and deployment.
- Docker & Kubernetes experience for containerization and orchestration.
- OOP design patterns, Test-Driven Development, and enterprise system design
- SQL (any variant, bonus if this is a big data variant)
- Linux OS (e.g., bash toolset and other utilities)
- Version control system experience with Git, GitHub, or Azure DevOps.
- Excellent problem-solving, code review, and debugging skills
- Software craftsmanship, adherence to Agile principles, and taking pride in writing good code
- Techniques for communicating change to non-technical people
Nice to have:
- Core Java 17+, preferably Java 21+, and associated toolchain
- Apache Avro
- Apache Kafka
- Other JVM-based languages, e.g., Kotlin, Scala
Employment Type: Full Time, Permanent