As a member of the Data Transformation - Cognitive Engineering team, you will build ML-powered products and capabilities for natural language understanding, data extraction, information retrieval, and data sourcing solutions for S&P Global Market Intelligence and our clients. You will spearhead development of production-ready AI products and pipelines while leading by example in a highly engaging work environment. You will work in a (truly) global team and be encouraged to take thoughtful risks and show self-initiative.
What's in it for you:
- Be a part of a global company and build solutions at enterprise scale
- Lead a highly skilled and technically strong team
- Contribute to solving high complexity, high impact problems
- Build production ready pipelines from ideation to deployment
Responsibilities:
- Design, develop, and deploy ML-powered products and pipelines
- Lead a team of Senior and Junior data scientists in delivering large scale projects
Play a central role in all stages of the data science project life cycle, including:
- Identification of suitable data science project opportunities
- Partnering with business leaders, domain experts, and end-users to gain business understanding, data understanding, and collect requirements
- Evaluation/interpretation of results and presentation to business leaders
- Performing exploratory data analysis, proof-of-concept modelling, and model benchmarking, and setting up model validation experiments
- Training large models both for experimentation and production
- Develop production-ready pipelines for enterprise-scale projects
- Perform code reviews & optimization for your projects and team
- Spearhead deployment and model scaling strategies
- Stakeholder management and representing the team in front of our leadership
- Leading and mentoring by example including project scrums
Technical Requirements:
- Expert proficiency in Python (NumPy, pandas, spaCy, scikit-learn, PyTorch/TF2, Hugging Face, etc.)
- Excellent knowledge of the ML and deep learning domain
- Excellent understanding of statistics and relevant mathematics
- Adept at reading and implementing the latest research
- Solid exposure to Information Retrieval, Web scraping and Data Extraction at scale
- Experience with state-of-the-art models for NLP tasks such as topic modelling, Q&A, summarization, phrase extraction, custom NER, table extraction, and OCR, as well as GNNs
- Exposure to the following technologies: R Shiny/Dash/Streamlit, SQL, Docker, Airflow, Redis, Celery, Flask/Django/FastAPI, PySpark, Scrapy
- Open to learning new technologies and programming languages as required
What we're looking for:
- 7-10+ years of relevant experience in the data science domain
- Prior work to show on GitHub, Kaggle, Stack Overflow, etc.
- A Master's or PhD from a recognized institute in a relevant specialization
Employment Type: Full Time, Permanent