ICICI Securities Data Scientist Interview Questions and Answers

Updated 24 Jul 2024

6 Interview questions

A Data Scientist was asked 11mo ago
Q. What are the functions of the pandas library, such as get_dummies()?
Ans. 

The get_dummies() function in the pandas library converts categorical variables into dummy/indicator variables.

  • get_dummies() function creates dummy variables for categorical columns in a DataFrame.

  • It converts categorical variables into numerical representation for machine learning models.

  • Example: df = pd.get_dummies(df, columns=['column_name'])
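The one-line example above can be expanded into a small runnable sketch. The DataFrame and its column names (`city`, `income`) are hypothetical and only serve to show what the encoded output looks like; `dtype=int` and `drop_first=True` are standard `get_dummies` parameters.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({"city": ["Mumbai", "Pune", "Mumbai"], "income": [10, 12, 9]})

# One indicator column per category; dtype=int yields 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["city"], dtype=int)

# drop_first=True drops one level per variable, which avoids perfectly
# collinear dummy columns in linear and logistic models.
encoded_k_minus_1 = pd.get_dummies(df, columns=["city"], drop_first=True, dtype=int)

print(encoded)
print(encoded_k_minus_1)
```

Dropping one level is usually preferred for regression-style models, while tree-based models generally work with the full set of indicator columns.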

A Data Scientist was asked 11mo ago
Q. Explain a Data Science project you have worked on.
Ans. 

Developed a predictive model to forecast customer churn for a telecommunications company.

  • Identified key features such as customer tenure, monthly charges, and service usage

  • Collected and cleaned data from customer databases

  • Built a machine learning model using logistic regression or random forest algorithms

  • Evaluated model performance using metrics like accuracy, precision, and recall

  • Provided actionable insights to r...

A Data Scientist was asked
Q. Have you worked on customer segmentation?
Ans. 

Yes, I have worked on customer segmentation.

  • I have used clustering algorithms like K-means and hierarchical clustering to segment customers based on their behavior and demographics.

  • I have also used decision trees and random forests to identify the most important features for segmentation.

  • I have experience with both supervised and unsupervised learning techniques for customer segmentation.

  • I have worked on projects ...

A Data Scientist was asked 11mo ago
Q. Types of Machine learning models
Ans. 

Types of machine learning models include supervised learning, unsupervised learning, and reinforcement learning.

  • Supervised learning: Models learn from labeled data, making predictions based on past examples (e.g. linear regression, support vector machines)

  • Unsupervised learning: Models find patterns in unlabeled data, such as clusters of similar data points or lower-dimensional structure (e.g. k-means clustering, PCA)

  • Reinforcement learning: Mode...

A Data Scientist was asked 11mo ago
Q. Types of Error in Statistics
Ans. 

Types of errors in statistics include sampling error, measurement error, and non-sampling error.

  • Sampling error occurs when the sample does not represent the population accurately.

  • Measurement error is caused by inaccuracies in data collection or measurement instruments.

  • Non-sampling error includes errors in data processing, analysis, and interpretation.

  • Examples: Sampling error - selecting a biased sample, Measuremen...

A Data Scientist was asked
Q. Machine learning algorithms
Ans. 

Machine learning algorithms are used to analyze data and make predictions or decisions without being explicitly programmed.

  • Machine learning algorithms can be categorized into supervised, unsupervised, and reinforcement learning.

  • Examples of machine learning algorithms include linear regression, decision trees, support vector machines, and neural networks.

  • These algorithms learn from data to improve their performance...

ICICI Securities Data Scientist Interview Experiences

3 interviews found

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Selected

I applied via Campus Placement and was interviewed before Jul 2023. There was 1 interview round.

Round 1 - Technical 

(4 Questions)

  • Q1. Explain any Data Science project
  • Ans. 

    Developed a predictive model to forecast customer churn for a telecommunications company.

    • Identified key features such as customer tenure, monthly charges, and service usage

    • Collected and cleaned data from customer databases

    • Built a machine learning model using logistic regression or random forest algorithms

    • Evaluated model performance using metrics like accuracy, precision, and recall

    • Provided actionable insights to reduce...

  • Answered by AI
  • Q2. Types of Error in Statistics
  • Ans. 

    Types of errors in statistics include sampling error, measurement error, and non-sampling error.

    • Sampling error occurs when the sample does not represent the population accurately.

    • Measurement error is caused by inaccuracies in data collection or measurement instruments.

    • Non-sampling error includes errors in data processing, analysis, and interpretation.

    • Examples: Sampling error - selecting a biased sample, Measurement err...

  • Answered by AI
  • Q3. Types of Machine learning models
  • Ans. 

    Types of machine learning models include supervised learning, unsupervised learning, and reinforcement learning.

    • Supervised learning: Models learn from labeled data, making predictions based on past examples (e.g. linear regression, support vector machines)

    • Unsupervised learning: Models find patterns in unlabeled data, such as clusters of similar data points or lower-dimensional structure (e.g. k-means clustering, PCA)

    • Reinforcement learning: Models le...

  • Answered by AI
  • Q4. Functions of pandas library, such as get_dummies()
  • Ans. 

    The get_dummies() function in the pandas library converts categorical variables into dummy/indicator variables.

    • get_dummies() function creates dummy variables for categorical columns in a DataFrame.

    • It converts categorical variables into numerical representation for machine learning models.

    • Example: df = pd.get_dummies(df, columns=['column_name'])

  • Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - On-campus interview. Be confident and be good at explaining your project.

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: 4-6 weeks
Result: Selected

I applied via Recruitment Consultant and was interviewed before May 2023. There were 2 interview rounds.

Round 1 - Technical 

(2 Questions)

  • Q1. Machine learning algorithms
  • Ans. 

    Machine learning algorithms are used to analyze data and make predictions or decisions without being explicitly programmed.

    • Machine learning algorithms can be categorized into supervised, unsupervised, and reinforcement learning.

    • Examples of machine learning algorithms include linear regression, decision trees, support vector machines, and neural networks.

    • These algorithms learn from data to improve their performance over...

  • Answered by AI
  • Q2. SQL basics and basic knowledge of security products
Round 2 - HR 

(1 Question)

  • Q1. Very lengthy process and delayed a lot

I applied via Naukri.com and was interviewed in Jan 2021. There were 3 interview rounds.

Interview Questionnaire 

3 Questions

  • Q1. Questions were mainly based on past role and work experience
  • Q2. Have you worked on customer segmentation?
  • Ans. 

    Yes, I have worked on customer segmentation.

    • I have used clustering algorithms like K-means and hierarchical clustering to segment customers based on their behavior and demographics.

    • I have also used decision trees and random forests to identify the most important features for segmentation.

    • I have experience with both supervised and unsupervised learning techniques for customer segmentation.

    • I have worked on projects where...

  • Answered by AI
  • Q3. In depth interview on tools worked upon

Interview Preparation Tips

Interview preparation tips for other job seekers - Give genuine answers. If you have not worked on any of the tools, feel free to say so. That won't affect your interview.

Interview questions from similar companies

I applied via Recruitment Consultant and was interviewed before Aug 2021. There was 1 interview round.

Round 1 - Technical 

(1 Question)

  • Q1. Difference between CNN and MLP
  • Ans. 

    CNN is used for image recognition while MLP is used for general classification tasks.

    • CNN uses convolutional layers to extract features from images while MLP uses fully connected layers.

    • CNN is better suited for tasks that require spatial understanding like object detection while MLP is better for tabular data.

    • CNN has fewer parameters than MLP due to weight sharing in convolutional layers.

    • CNN can handle input of varying ...

  • Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - Brush up on basic statistics. Also prepare at least 2 or 3 ML algorithms for the interview.

I applied via Walk-in and was interviewed in Mar 2020. There was 1 interview round.

Interview Questionnaire 

10 Questions

  • Q1. What is R-squared, and how does it differ from adjusted R-squared?
  • Ans. 

    R square is a statistical measure that represents the proportion of the variance in the dependent variable explained by the independent variables.

    • R square is a value between 0 and 1, where 0 indicates that the independent variables do not explain any of the variance in the dependent variable, and 1 indicates that they explain all of it.

    • It is used to evaluate the goodness of fit of a regression model.

    • Adjusted R square t...

  • Answered by AI
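The difference between the two measures is easiest to see numerically. The following is a minimal sketch using the standard definitions R² = 1 - SS_res/SS_tot and adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1); the data values and function names are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Proportion of variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalise R-squared for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical observations and fitted values.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.8, 5.3, 6.9, 9.4, 10.6]

r2 = r_squared(y_true, y_pred)
adj_one = adjusted_r_squared(r2, len(y_true), n_predictors=1)
adj_three = adjusted_r_squared(r2, len(y_true), n_predictors=3)
print(r2, adj_one, adj_three)
```

For the same fit, the adjusted value falls as the assumed number of predictors grows, which is why it is preferred when comparing models with different numbers of variables.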
  • Q2. Explain what you understand by the terms WOE and IV. What is their importance? What are the advantages and disadvantages?
  • Ans. 

    WOE (Weight of Evidence) and IV (Information Value) are metrics used for feature selection and assessing predictive power in models.

    • WOE transforms categorical variables into continuous variables, making them more suitable for modeling.

    • IV quantifies the predictive power of a feature by measuring the separation between the good and bad outcomes.

    • For example, if a feature has an IV of 0.3, it indicates strong predictive po...

  • Answered by AI
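A rough sketch of how WOE and IV can be computed for one categorical feature is shown below. It assumes the common convention WOE = ln(% of goods / % of bads) per level and IV = sum of (% goods - % bads) × WOE; some texts use the opposite sign. The function name and toy data are hypothetical.

```python
import math
from collections import Counter


def woe_iv(categories, is_bad):
    """Return ({level: WOE}, IV) for one categorical feature.

    Assumes every level contains at least one good and one bad record;
    real data usually needs smoothing or binning to guarantee that.
    """
    good = Counter(c for c, b in zip(categories, is_bad) if not b)
    bad = Counter(c for c, b in zip(categories, is_bad) if b)
    total_good, total_bad = sum(good.values()), sum(bad.values())
    woe, iv = {}, 0.0
    for level in sorted(set(categories)):
        pct_good = good[level] / total_good
        pct_bad = bad[level] / total_bad
        woe[level] = math.log(pct_good / pct_bad)
        iv += (pct_good - pct_bad) * woe[level]
    return woe, iv


# Hypothetical feature: level A is mostly good, level B is mostly bad.
categories = ["A"] * 4 + ["B"] * 4
is_bad = [0, 0, 0, 1, 0, 1, 1, 1]
woe, iv = woe_iv(categories, is_bad)
print(woe, iv)
```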
  • Q3. What are variable reducing techniques
  • Ans. 

    Variable reducing techniques are methods used to identify and select the most relevant variables in a dataset.

    • Variable reducing techniques help in reducing the number of variables in a dataset.

    • These techniques aim to identify the most important variables that contribute significantly to the outcome.

    • Some common variable reducing techniques include feature selection, dimensionality reduction, and correlation analysis.

    • Fea...

  • Answered by AI
  • Q4. Which test is used in logistic regression to check the significance of the variable
  • Ans. 

    The Wald test is used in logistic regression to check the significance of the variable.

    • The Wald test calculates the ratio of the estimated coefficient to its standard error.

    • The square of this ratio asymptotically follows a chi-square distribution with one degree of freedom.

    • A small p-value indicates that the variable is significant.

    • For example, in Python, the statsmodels library provides the Wald test in the summary of a logistic regression model.

  • Answered by AI
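To make the Wald test concrete, here is a sketch that fits a logistic regression with Newton's method in NumPy and derives the Wald z-statistics by hand. It is not how statsmodels is implemented; the function name, the simulated data, and the true coefficients (0.5 and 1.5) are assumptions chosen for illustration.

```python
import math
import numpy as np


def fit_logistic_wald(x, y, n_iter=25):
    """Fit logistic regression by Newton's method and return Wald statistics."""
    X = np.column_stack([np.ones(len(x)), x])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hessian = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    # Standard errors come from the inverse information matrix at the estimate.
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    hessian = X.T @ (X * (p * (1 - p))[:, None])
    std_err = np.sqrt(np.diag(np.linalg.inv(hessian)))
    z = beta / std_err  # Wald z-statistic: coefficient / standard error
    # Two-sided p-value under the standard normal approximation.
    p_values = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    return beta, std_err, z, p_values


# Simulated data with a genuinely informative predictor (assumed coefficients).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
true_prob = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x)))
y = (rng.random(200) < true_prob).astype(float)

beta, std_err, z, p_values = fit_logistic_wald(x, y)
print(beta, z, p_values)
```

The z-statistic and its two-sided p-value correspond to the columns that regression summaries typically report for each coefficient.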
  • Q5. How to check multicollinearity in Logistic regression
  • Ans. 

    Multicollinearity in logistic regression can be checked using correlation matrix and variance inflation factor (VIF).

    • Calculate the correlation matrix of the independent variables and check for high correlation coefficients.

    • Calculate the VIF for each independent variable and check for values greater than 5 or 10.

    • Consider removing one of the highly correlated variables or variables with high VIF to address multicollinear...

  • Answered by AI
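The VIF check described above can be sketched with plain NumPy: regress each predictor on the others and compute 1 / (1 - R²). The helper name `vif` and the simulated predictors are hypothetical; libraries such as statsmodels offer an equivalent function.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        # Regress column j on all other columns plus an intercept.
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1 / (1 - r2))
    return factors


# Simulated predictors: x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

In this setup the two nearly duplicate columns get very large VIFs, while the independent column stays close to 1.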
  • Q6. Difference between bagging and boosting
  • Ans. 

    Bagging and boosting are ensemble methods used in machine learning to improve model performance.

    • Bagging involves training multiple models on different subsets of the training data and then combining their predictions through averaging or voting.

    • Boosting involves iteratively training models on the same dataset, with each subsequent model focusing on the samples that were misclassified by the previous model.

    • Bagging reduc...

  • Answered by AI
  • Q7. Explain the logistic regression process
  • Ans. 

    Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.

    • It is a type of regression analysis used for predicting the outcome of a categorical dependent variable based on one or more predictor variables.

    • It uses a logistic function to model the probability of the dependent variable taking a particular value.

    • It is commo...

  • Answered by AI
  • Q8. Explain Gini coefficient
  • Ans. 

    Gini coefficient measures the inequality among values of a frequency distribution.

    • Gini coefficient ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality.

    • It is commonly used to measure income inequality in a population.

    • A Gini coefficient of 0.4 or higher is considered to be a high level of inequality.

    • Gini coefficient can be calculated using the Lorenz curve, which plots the cumulati...

  • Answered by AI
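As a small worked example of the inequality measure described above, the sketch below computes the Gini coefficient from sorted values using the rank-weighted formula G = 2·Σ(i·x_i) / (n·Σx_i) - (n + 1)/n, which is equivalent to the Lorenz-curve definition for a finite sample. The function name and inputs are illustrative.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum over the sorted values, ranks starting at 1.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


equal = gini([5, 5, 5, 5])        # everyone holds the same amount
concentrated = gini([0, 0, 0, 10])  # one member holds everything
print(equal, concentrated)
```

For a finite sample of size n the maximum is (n - 1)/n rather than exactly 1, which is why the fully concentrated case above gives 0.75.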
  • Q9. Difference between chair and cart (most likely a mis-transcription of CHAID and CART, two decision-tree algorithms)
  • Ans. 

    A chair is a piece of furniture used for sitting, while a cart is a vehicle used for transporting goods.

    • A chair typically has a backrest and armrests, while a cart does not.

    • A chair is designed for one person to sit on, while a cart can carry multiple items or people.

    • A chair is usually stationary, while a cart is mobile and can be pushed or pulled.

    • A chair is commonly found in homes, offices, and public spaces, while a c...

  • Answered by AI
  • Q10. How to check outliers in a variable, what treatment should you use to remove such outliers
  • Ans. 

    Outliers can be detected using statistical methods like box plots, z-score, and IQR. Treatment can be removal or transformation.

    • Use box plots to visualize outliers

    • Calculate z-scores and remove data points whose absolute z-score is greater than 3

    • Calculate the IQR and remove data points below Q1 - 1.5*IQR or above Q3 + 1.5*IQR

    • Transform data using log or square root to reduce the impact of outliers

  • Answered by AI
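The IQR and z-score rules can be sketched with the standard library alone. The function names, thresholds as defaults, and the sample data are illustrative assumptions; `statistics.quantiles` uses its default (exclusive) method here.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]


# Hypothetical variable with one obvious extreme value.
data = [10, 11, 12, 13] * 5 + [95]
print(iqr_outliers(data), zscore_outliers(data))
```

The z-score rule depends on the mean and standard deviation, which the outliers themselves inflate, so on small samples the IQR rule tends to be the more reliable of the two.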

Interview Preparation Tips

Interview preparation tips for other job seekers - Explain the concept properly, if not able to explain properly then take a pause and try again with some examples. Be confident.

I applied via Approached by Company and was interviewed before Sep 2021. There were 3 interview rounds.

Round 1 - Resume Shortlist 
Pro Tip by AmbitionBox:
Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds, make sure to leave the best impression.
Round 2 - Technical 

(1 Question)

  • Q1. Projects and Data Science concepts
Round 3 - Technical 

(1 Question)

  • Q1. Python and coding skills

Interview Preparation Tips

Interview preparation tips for other job seekers - Be thorough with concepts - ML, stats, NLP

Interview experience: 4 (Good)
Difficulty level: Hard
Process Duration: Less than 2 weeks
Result: Selected

I applied via Campus Placement and was interviewed before Jul 2023. There were 3 interview rounds.

Round 1 - Aptitude Test 

Medium-difficulty general aptitude and technical questions (Big Data, Python, etc.)

Round 2 - Technical 

(1 Question)

  • Q1. ML algorithms (SVM, random forest, bagging, boosting, ridge, etc.)
Round 3 - Technical 

(1 Question)

  • Q1. Deep equations and understanding of DL and ML algorithms
  • Ans. 

    Understanding deep equations and algorithms in DL and ML is crucial for a data scientist.

    • Deep learning involves complex neural network architectures like CNNs and RNNs.

    • Machine learning algorithms include decision trees, SVM, k-means clustering, etc.

    • Understanding the math behind algorithms helps in optimizing model performance.

    • Equations like gradient descent, backpropagation, and loss functions are key concepts.

    • Practica...

  • Answered by AI

Interview experience: 5 (Excellent)
Difficulty level: -
Process Duration: -
Result: -

Round 1 - Aptitude Test 

Many MCQs, similar to the CAT exam.

Round 2 - Case Study 

ML case study, e.g. loan default prediction.

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I appeared for an interview in Apr 2025, where I was asked the following questions.

  • Q1. Tell me about yourself?
  • Q2. What are the GenAI industry trends? ChatGPT vs DeepSeek high level discussion.
  • Ans. 

    GenAI trends focus on advancements in AI models, with ChatGPT and DeepSeek showcasing different applications and capabilities.

    • Increased adoption of conversational AI in customer service, exemplified by ChatGPT's integration into various platforms.

    • DeepSeek develops open-weight large language models, including reasoning-focused models, offered as a lower-cost alternative to proprietary systems such as ChatGPT.

    • Emergence of hybrid models combining generativ...

  • Answered by AI
  • Q3. ML question: Random Forest vs XGBoost
  • Ans. 

    Random Forest is an ensemble method using bagging, while XGBoost uses boosting for improved accuracy and speed.

    • Random Forest builds multiple decision trees and merges them for better accuracy.

    • XGBoost optimizes the model by sequentially adding trees that correct errors of previous ones.

    • Random Forest is less prone to overfitting compared to individual decision trees.

    • XGBoost includes regularization techniques to prevent o...

  • Answered by AI
  • Q4. ML question: Classification vs Regression
  • Ans. 

    Classification predicts categories, while regression predicts continuous values in machine learning tasks.

    • Classification: Assigns labels to data points (e.g., spam vs. not spam).

    • Regression: Predicts numerical values (e.g., house prices based on features).

    • Classification algorithms include logistic regression, decision trees, and SVM.

    • Regression algorithms include linear regression, polynomial regression, and regression t...

  • Answered by AI
  • Q5. ML question: How does one handle data imbalance in a dataset?
  • Ans. 

    Data imbalance can skew model performance; various techniques can help mitigate its effects.

    • Resampling techniques: Use oversampling (e.g., SMOTE) or undersampling to balance classes.

    • Use different evaluation metrics: Focus on precision, recall, or F1-score instead of accuracy.

    • Implement cost-sensitive learning: Assign higher misclassification costs to minority class errors.

    • Utilize ensemble methods: Techniques like Random...

  • Answered by AI
  • Q6. ML question: When to use accuracy, precision, recall? How to identify the classification model's performance using above three? F score, ROC area under the curve.
  • Q7. ML question: What are the X and Y axis of ROC curve?
  • Ans. 

    The ROC curve plots true positive rate against false positive rate to evaluate classifier performance.

    • X-axis: False Positive Rate (FPR) - the ratio of negative instances incorrectly classified as positive.

    • Y-axis: True Positive Rate (TPR) - the ratio of positive instances correctly classified as positive.

    • Example: A model with a TPR of 0.9 and FPR of 0.1 indicates high sensitivity but some false alarms.

    • The ROC curve help...

  • Answered by AI
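The axes can be made concrete by computing ROC points directly: for each score threshold, count true and false positives and convert them to rates. This is a minimal sketch with hypothetical labels and scores; the area under the curve is approximated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) pairs, one per distinct threshold, starting at (0, 0)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))  # (x-axis, y-axis)
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical labels (1 = positive) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
area = auc(points)
print(points, area)
```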
  • Q8. SQL questions: Employee and department table; 1. Salary of employees with salary > avg salary 2. Salary of employees with salary > avg salary of department 3. Salary of top 3 employees with salary > avg sa...
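The first two SQL tasks can be tried out with Python's built-in sqlite3 module. The table layout, column names, and rows below are assumptions made for illustration, since the interviewer's actual schema is not given; the truncated third task is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data.
conn.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL, dept_id INTEGER);
INSERT INTO department VALUES (1, 'Research'), (2, 'Sales');
INSERT INTO employee VALUES
  (1, 'Asha', 90, 1), (2, 'Bala', 70, 1),
  (3, 'Chen', 50, 2), (4, 'Dev', 30, 2);
""")

# 1. Employees earning more than the overall average salary.
above_overall = conn.execute("""
SELECT name, salary FROM employee
WHERE salary > (SELECT AVG(salary) FROM employee)
ORDER BY salary DESC
""").fetchall()

# 2. Employees earning more than their own department's average
#    (correlated subquery keyed on dept_id).
above_dept = conn.execute("""
SELECT e.name, d.name, e.salary
FROM employee e JOIN department d ON d.id = e.dept_id
WHERE e.salary > (SELECT AVG(salary) FROM employee WHERE dept_id = e.dept_id)
ORDER BY e.salary DESC
""").fetchall()

print(above_overall)
print(above_dept)
```

A window function such as `AVG(salary) OVER (PARTITION BY dept_id)` is a common alternative to the correlated subquery in the second query.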
  • Q9. Any other questions for us?
  • Ans. 

    I'm curious about the team's data science projects and how they align with the company's goals and vision.

    • What are the current data science projects the team is working on?

    • How does the data science team collaborate with other departments?

    • Can you share examples of how data-driven decisions have impacted the company?

    • What tools and technologies does the team primarily use?

    • How does the company support continuous learning a...

  • Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - Focus on ML, SQL, and Gen AI trends.

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: -
Result: No response

I applied via Job Portal and was interviewed in Nov 2023. There was 1 interview round.

Round 1 - One-on-one 

(5 Questions)

  • Q1. What is gradient descent?
  • Ans. 

    Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent.

    • Gradient descent is used to find the minimum of a function by taking steps proportional to the negative of the gradient at the current point.

    • It is commonly used in machine learning to optimize the parameters of a model by minimizing the loss function.

    • There are different variants of gradie...

  • Answered by AI
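The update rule described above, x ← x - learning_rate × gradient, fits in a few lines. This sketch minimises the illustrative function f(x) = (x - 3)², whose gradient is 2(x - 3); the function name, learning rate, and step count are arbitrary choices for the example.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient from the starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)
```

With this learning rate the distance to the minimum shrinks by a constant factor at every step; a rate that is too large would make the iterates overshoot and diverge instead.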
  • Q2. What is LSTM, and what are the gates in it?
  • Ans. 

    LSTM (Long Short-Term Memory) is a type of recurrent neural network designed to handle long-term dependencies.

    • LSTM has three gates: input gate, forget gate, and output gate.

    • Input gate controls the flow of information into the cell state.

    • Forget gate decides what information to discard from the cell state.

    • Output gate determines the output based on the cell state.

  • Answered by AI
  • Q3. They gave me a link to a dataset and asked me to apply operations on it, e.g. value_counts, checking null values, filling missing values with the mean, etc.
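The operations mentioned in this question might look like the following in pandas. The dataset itself is not given in the source, so the DataFrame and its columns (`segment`, `balance`) are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in for the dataset used in the interview.
df = pd.DataFrame({
    "segment": ["retail", "hni", "retail", None],
    "balance": [100.0, None, 300.0, 200.0],
})

# Frequency of each category (missing values are excluded by default).
counts = df["segment"].value_counts()

# Number of missing values per column.
nulls = df.isnull().sum()

# Fill missing numeric values with the column mean.
df["balance"] = df["balance"].fillna(df["balance"].mean())

print(counts, nulls, df, sep="\n\n")
```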
  • Q4. What is t-test? What is Mean, Median and Mode and where to use these?
  • Ans. 

    T-test is a statistical test used to determine if there is a significant difference between the means of two groups.

    • Mean is the average of a set of numbers, median is the middle value when the numbers are ordered, and mode is the most frequently occurring value.

    • Mean is sensitive to outliers, median is robust to outliers, and mode is useful for categorical data.

    • T-test is used to compare means of two groups, mean is used...

  • Answered by AI
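A short sketch of these ideas using only the standard library: the first part shows how a single outlier moves the mean but not the median or mode, and the second computes a two-sample t-statistic in Welch's form (unequal variances assumed). All data values and the function name are made up for illustration, and the p-value lookup is omitted.

```python
import math
import statistics

# One extreme value pulls the mean far from the typical observation.
values = [1, 2, 2, 3, 100]
mean = statistics.fmean(values)
median = statistics.median(values)
mode = statistics.mode(values)


def welch_t(a, b):
    """Two-sample t-statistic without assuming equal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / std_err


# Hypothetical measurements from two groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t_stat = welch_t(group_a, group_b)
print(mean, median, mode, t_stat)
```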
  • Q5. What is Random Forest?
  • Ans. 

    Random Forest is an ensemble learning method used for classification and regression tasks.

    • Random Forest is a collection of decision trees that are trained on random subsets of the data.

    • Each tree in the forest makes a prediction, and the final prediction is the average (regression) or majority vote (classification) of all trees.

    • Random Forest helps reduce overfitting and improve accuracy compared to a single decision tre...

  • Answered by AI

Interview Preparation Tips

Topics to prepare for Motilal Oswal Financial Services Data Scientist interview:
  • Machine Learning
  • Statistics
  • Pandas

ICICI Securities Interview FAQs

How many rounds are there in ICICI Securities Data Scientist interview?
ICICI Securities interview process usually has 1-2 rounds. The most common rounds in the ICICI Securities interview process are Technical and HR.
What are the top questions asked in ICICI Securities Data Scientist interview?

Some of the top questions asked at the ICICI Securities Data Scientist interview -

  1. Have you worked on customer segmentati...read more
  2. Functions of pandas library, such as get_dummie...read more
  3. Explain any Data Science proj...read more

Overall Interview Experience Rating

4/5

based on 2 interview experiences

Difficulty level

Moderate 100%

Duration

Less than 2 weeks 50%
4-6 weeks 50%
ICICI Securities Data Scientist Salary
based on 20 salaries
₹10.1 L/yr - ₹25 L/yr
44% more than the average Data Scientist Salary in India

ICICI Securities Data Scientist Reviews and Ratings

based on 3 reviews

3.9/5

Rating in categories

  • Skill development: 2.7
  • Work-life balance: 4.9
  • Salary: 3.6
  • Job security: 4.5
  • Company culture: 3.0
  • Promotions: 3.2
  • Work satisfaction: 3.2
