- Ability to design and implement workflows for Linear and Logistic Regression and Ensemble Models (Random Forest, Boosting) using R/Python.
- Demonstrable competency in Probability and Statistics, with the ability to apply concepts such as data distributions, hypothesis testing and other statistical tests.
- Must have experience in handling outliers, denoising data and accounting for the impact of pandemic-like disruptions.
- Should be able to perform exploratory data analysis (EDA) on raw data and feature engineering wherever applicable.
- Demonstrable competency in Data Visualisation using the Python/R Data Science Stack.
- Should be able to leverage cloud platforms for training and deploying large-scale solutions.
- Should be able to train and evaluate ML models using various machine learning and deep learning algorithms.
- Should be able to retrain models and maintain model accuracy in deployment.
- Should be able to package and deploy large-scale models on on-premises systems using multiple approaches, including Docker.
- Should be able to take complete ownership of the assigned project.
- Experience working in Agile environments.
- Well versed with JIRA or an equivalent project-tracking tool.
Competencies/Skills
- Knowledge of cloud platforms (AWS, Azure and GCP).
- Exposure to NoSQL databases (MongoDB, Cassandra, Cosmos DB, HBase).
- Forecasting experience with products such as SAP, Oracle, Power BI and Qlik.
- Proficiency in Excel (Power Pivot, Power Query, Macros, Charts).
- Experience with large data sets and distributed computing (Hive/Hadoop/Spark).
- Transfer learning using state-of-the-art models across domains such as vision, NLP and speech.
- Integration with external services and cloud APIs.
- Working with data annotation approaches and tools for text, images and videos.
Employment Type: Full Time, Permanent