Data Cleaning, Preprocessing & Exploration: Prepare data for analysis, ensuring quality, consistency, and completeness by handling missing values, outliers, and transforming data. Explore and analyze large and complex datasets to identify patterns, trends, and anomalies.
Machine Learning Model Development: Build, train, and deploy machine learning models on the Databricks platform, leveraging tools such as MLflow for experiment tracking and applying techniques like regression, classification, clustering, and time series analysis.
Model Evaluation & Deployment: Develop and select features to improve model performance, leveraging Databricks' distributed computing capabilities for efficient processing. Familiarity with CI/CD tools (e.g., Jenkins, GitLab) for automating the deployment and testing of data pipelines.
Collaboration: Collaborate with data engineers, analysts, and business stakeholders to understand business requirements and translate them into data-driven solutions.
Data Visualization and Reporting: Create visualizations and dashboards in Databricks, Power BI, and other tools to communicate insights to technical and non-technical stakeholders.
Continuous Learning: Stay up to date with the latest developments in data science, machine learning, and industry best practices to continually enhance skills and processes.
Key Skills:
Knowledge of statistical analysis techniques, hypothesis testing, and machine learning.
Familiarity with NLP, time series analysis, and computer vision or A/B testing.
Databricks and Apache Spark: Proficiency with Databricks, Spark DataFrames, and MLlib.
Programming: Proficiency in Python and its data science libraries (TensorFlow, pandas, scikit-learn, PySpark, NumPy), with experience writing scalable code for large datasets.
SQL: Strong SQL skills for data extraction, manipulation, and analysis.
Familiarity with MLflow for experiment tracking, model versioning, and reproducibility.
Familiarity with cloud data storage and processing tools (e.g., Azure Data Lake, AWS S3).
Educational Qualification:
Education: Bachelor's degree in Statistics, Mathematics, Computer Science, or a related field.
Experience: 3+ years of experience in a data science or analytical role.
Experience:
As a skilled Data Scientist, you will leverage your analytical skills to uncover insights from complex datasets, build predictive models, and inform data-driven decisions across the organization. You'll work closely with cross-functional teams, including business, engineering, and product, to apply advanced statistical methods, machine learning, and domain knowledge to solve real-world problems.