CIMB bank

3.9

based on 10 Reviews

CIMB bank Data Scientist Interview Questions and Answers

Updated 17 Nov 2024

CIMB bank Data Scientist Interview Experiences

2 interviews found

Data Scientist Interview Questions & Answers

Anonymous

posted on 28 Oct 2024

Interview experience

Excellent

Round 1 - Aptitude Test

Model Training based on the dataset and problem statement provided.

Round 2 - Technical

(2 Questions)

Q1. Asked about ROC AUC

Ans.

ROC AUC measures the performance of a binary classifier, indicating its ability to distinguish between classes.

ROC (Receiver Operating Characteristic) curve plots True Positive Rate vs. False Positive Rate.
AUC (Area Under the Curve) quantifies the overall ability of the model to discriminate between positive and negative classes.
An AUC of 0.5 indicates no discrimination (random guessing), while an AUC of 1.0 indicates ...

Answered by AI
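As a rough illustration of the definition above, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counting half. The function name `roc_auc` and the toy labels and scores are assumptions made for this sketch; they are not part of the interview answer.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise version is O(n_pos * n_neg) and is meant for illustration only.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): one positive is ranked below one negative.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

A score that ranks every positive above every negative gives 1.0, and identical scores for all examples give 0.5, matching the interpretation in the answer.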

Q2. What is the difference between supervised and unsupervised learning?

Ans.

Supervised learning uses labeled data to train the model, while unsupervised learning uses unlabeled data.

Supervised learning requires a target variable to predict, while unsupervised learning does not.
In supervised learning, the model learns from labeled examples, while in unsupervised learning, the model finds patterns in data.
Examples of supervised learning include regression and classification tasks, while clusteri...

Answered by AI

Data Scientist Interview Questions & Answers

Anonymous

posted on 17 Nov 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I was approached by the company and was interviewed in May 2024. There was 1 interview round.

Round 1 - One-on-one

(2 Questions)

Q1. XGBoost in depth

Ans.

XGBoost is a popular machine learning algorithm known for its speed and performance in handling large datasets.

XGBoost stands for eXtreme Gradient Boosting, which is an optimized implementation of gradient boosting.
It is widely used in Kaggle competitions and other machine learning tasks due to its high accuracy and efficiency.
XGBoost uses a technique called boosting, where multiple weak learners are combined to create...

Answered by AI
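The boosting idea mentioned above can be sketched in a few lines. This is plain gradient boosting with squared-error loss and one-split decision stumps on a single feature; it is not XGBoost itself, which adds regularization, second-order gradients, and many engineering optimizations. The names `fit_stump` and `gradient_boost` and the toy data are assumptions for illustration.

```python
def fit_stump(x, residuals):
    """Find the single threshold split that minimizes squared error.

    Assumes x contains at least two distinct values.
    """
    best = None
    distinct = sorted(set(x))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Fit weak learners sequentially, each on the current residuals."""
    base = sum(y) / len(y)          # start from the mean prediction
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


# Toy step-shaped data (hypothetical).
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 1, 5, 5, 5]
model = gradient_boost(xs, ys)
print(model(1), model(6))
```

Each round corrects a fraction of the remaining error, which is why the learning rate and the number of rounds are usually tuned together.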

Q2. Type I and Type II errors

Interview questions from similar companies

Data Scientist Interview Questions & Answers

HSBC Group

Anonymous

posted on 11 Aug 2022

I applied via Recruitment Consultant and was interviewed before Aug 2021. There was 1 interview round.

Round 1 - Technical

(1 Question)

Q1. Difference between CNN and MLP

Ans.

CNNs are typically used for images and other grid-structured data, while MLPs are general-purpose networks often applied to tabular data.

A CNN uses convolutional layers to extract local features from images, while an MLP uses fully connected layers.
CNNs are better suited to tasks that require spatial understanding, like object detection, while MLPs are better suited to tabular data.
For comparable input sizes, a convolutional layer has far fewer parameters than a fully connected layer because its weights are shared across positions.
CNN can handle input of varying ...

Answered by AI
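The weight-sharing point can be checked with simple arithmetic. The sketch below compares a 3x3 convolution with a fully connected layer producing the same number of output values on a 28x28 single-channel image; the layer sizes and the helper names `dense_params` and `conv2d_params` are assumptions chosen for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(kernel_size, channels_in, channels_out):
    """Weights plus biases of a 2D convolution; independent of image size."""
    return kernel_size * kernel_size * channels_in * channels_out + channels_out


# Hypothetical setup: 28x28 grayscale input mapped to 8 feature maps of 28x28.
n_in = 28 * 28 * 1
n_out = 28 * 28 * 8
print("dense:", dense_params(n_in, n_out))
print("conv :", conv2d_params(3, 1, 8))
```

The convolution needs 80 parameters regardless of image size, while the dense layer needs several million for the same output shape.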

Interview Preparation Tips

Interview preparation tips for other job seekers - Brush up on basic statistics. Also prepare at least 2 or 3 ML algorithms for the interview.

Data Scientist Interview Questions & Answers

Citicorp

Anonymous

posted on 10 Sep 2020

I applied via Walk-in and was interviewed in Mar 2020. There was 1 interview round.

Interview Questionnaire

10 Questions

Q1. What is R-squared, and how does it differ from adjusted R-squared?

Ans.

R-squared is a statistical measure that represents the proportion of the variance in the dependent variable explained by the independent variables.

For a least-squares model with an intercept, R-squared lies between 0 and 1, where 0 indicates that the independent variables do not explain any of the variance in the dependent variable, and 1 indicates that they explain all of it.
It is used to evaluate the goodness of fit of a regression model.
Adjusted R-squared t...

Answered by AI
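Both quantities follow from standard formulas: R-squared is 1 - SS_res / SS_tot, and adjusted R-squared rescales the unexplained share by (n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The function names and the toy numbers below are assumptions for illustration.

```python
def r_squared(y_true, y_pred):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalizes R-squared for the number of predictors used."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical observations and fitted values from a one-predictor model.
y = [1, 2, 3, 4]
y_hat = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y, y_hat)
print(r2, adjusted_r_squared(r2, n_obs=4, n_predictors=1))
```

Adding a predictor never lowers R-squared on the training data, but it lowers adjusted R-squared unless the fit improves enough to offset the penalty.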

Q2. Explain what you understand by the terms WOE and IV. What is their importance? What are their advantages and disadvantages?

Ans.

WOE (Weight of Evidence) and IV (Information Value) are metrics used for feature selection and assessing predictive power in models.

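WOE encodes each category or bin of a variable as a continuous value (the log ratio of the share of good outcomes to the share of bad outcomes in that bin), making the variable more suitable for modeling.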
IV quantifies the predictive power of a feature by measuring the separation between the good and bad outcomes.
For example, if a feature has an IV of 0.3, it indicates strong predictive po...

Answered by AI
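A minimal sketch of the calculation, assuming the common convention WOE = ln(share of goods / share of bads) per bin and IV = sum over bins of (share of goods - share of bads) * WOE. The bin names and counts are made up, and the function `woe_iv` is a hypothetical helper, not a library call.

```python
import math


def woe_iv(bins):
    """Compute WOE per bin and the total IV.

    bins maps a bin label to (n_good, n_bad). Assumes every bin has at
    least one good and one bad; real data usually needs smoothing or
    bin merging to avoid zero counts.
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe = {}
    iv = 0.0
    for label, (good, bad) in bins.items():
        dist_good = good / total_good
        dist_bad = bad / total_bad
        woe[label] = math.log(dist_good / dist_bad)
        iv += (dist_good - dist_bad) * woe[label]
    return woe, iv


# Hypothetical counts for a two-bin feature.
woe, iv = woe_iv({"low_utilisation": (80, 20), "high_utilisation": (20, 80)})
print(woe, iv)
```

Some texts define WOE with the ratio inverted; the sign of each WOE value flips, but the IV is unchanged.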

Q3. What are variable reduction techniques?

Ans.

Variable reduction techniques are methods used to identify and select the most relevant variables in a dataset.

Variable reduction techniques help in reducing the number of variables in a dataset.
These techniques aim to identify the most important variables that contribute significantly to the outcome.
Some common variable reduction techniques include feature selection, dimensionality reduction, and correlation analysis.
Fea...

Answered by AI

Q4. Which test is used in logistic regression to check the significance of the variable

Ans.

The Wald test is used in logistic regression to check the significance of a variable.

The Wald statistic is the ratio of the estimated coefficient to its standard error.
Under the null hypothesis this ratio is approximately standard normal, and its square follows a chi-square distribution with one degree of freedom.
A small p-value indicates that the variable is significant.
For example, in Python, the statsmodels library reports the Wald z-statistic and its p-value for each coefficient in the summary of a logistic regression model.

Answered by AI
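The calculation itself is short. The sketch below takes a coefficient and its standard error (both hypothetical numbers here, as if read from a fitted model) and returns the z-statistic, the equivalent chi-square statistic, and the two-sided p-value from the normal distribution. The function name `wald_test` is an assumption.

```python
import math


def wald_test(coef, std_err):
    """Return (z, chi_square, p_value) for H0: coefficient = 0."""
    z = coef / std_err
    chi_square = z * z                      # chi-square with 1 degree of freedom
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, chi_square, p_value


# Hypothetical estimate: coefficient 0.85 with standard error 0.30.
z, chi2, p = wald_test(0.85, 0.30)
print(round(z, 3), round(chi2, 3), round(p, 4))
```

The z form and the chi-square form give the same p-value, so either may be reported.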

Q5. How to check multicollinearity in Logistic regression

Ans.

Multicollinearity in logistic regression can be checked using correlation matrix and variance inflation factor (VIF).

Calculate the correlation matrix of the independent variables and check for high correlation coefficients.
Calculate the VIF for each independent variable and check for values greater than 5 or 10.
Consider removing one of the highly correlated variables or variables with high VIF to address multicollinear...

Answered by AI
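For intuition, the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. With only two predictors, R_j^2 reduces to the squared Pearson correlation between them, which the sketch below uses. The two-predictor restriction, the function names, and the toy data are assumptions; with more predictors a full regression per variable is needed.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model.

    Raises ZeroDivisionError when the predictors are perfectly collinear.
    """
    r = pearson_r(x1, x2)
    return 1 / (1 - r * r)


# Hypothetical predictors: the second pair is almost a copy of the first.
print(vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1]))  # uncorrelated
print(vif_two_predictors([1, 2, 3, 4], [1, 2, 3, 5]))    # highly correlated
```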

Q6. Difference between bagging and boosting

Ans.

Bagging and boosting are ensemble methods used in machine learning to improve model performance.

Bagging involves training multiple models on different subsets of the training data and then combining their predictions through averaging or voting.
Boosting involves iteratively training models on the same dataset, with each subsequent model focusing on the samples that were misclassified by the previous model.
Bagging reduc...

Answered by AI

Q7. Explain the logistic regression process

Ans.

Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.

It is a type of regression analysis used for predicting the outcome of a categorical dependent variable based on one or more predictor variables.
It uses a logistic function to model the probability of the dependent variable taking a particular value.
It is commo...

Answered by AI
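The process described above can be sketched for a single predictor: a logistic (sigmoid) function turns a linear score into a probability, and the weights are fitted by gradient descent on the log-loss. The learning rate, epoch count, function names, and toy data are assumptions for illustration; libraries normally use more robust optimizers.

```python
import math


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, epochs=2000):
    """Fit weight w and intercept b by batch gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for xi, yi in zip(x, y):
            error = sigmoid(w * xi + b) - yi   # derivative of log-loss w.r.t. the score
            grad_w += error * xi
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict_proba(xi, w, b):
    """Probability that the outcome is 1 for input xi."""
    return sigmoid(w * xi + b)


# Hypothetical data: larger x makes the positive class more likely.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
w, b = fit_logistic(xs, ys)
print(predict_proba(0, w, b), predict_proba(5, w, b))
```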

Q8. Explain Gini coefficient

Ans.

Gini coefficient measures the inequality among values of a frequency distribution.

Gini coefficient ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality.
It is commonly used to measure income inequality in a population.
A Gini coefficient of 0.4 or higher is considered to be a high level of inequality.
Gini coefficient can be calculated using the Lorenz curve, which plots the cumulati...

Answered by AI
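A small sketch of the Lorenz-curve calculation for a list of non-negative values, using the closed form G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n over values sorted in ascending order. The function name and sample values are assumptions for illustration.

```python
def gini_coefficient(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum; ranks start at 1 for the smallest value.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


print(gini_coefficient([5, 5, 5, 5]))   # everyone holds the same amount
print(gini_coefficient([0, 0, 0, 1]))   # one holder has everything
```

For a finite sample of size n the maximum is (n - 1) / n rather than exactly 1. In credit-risk modelling the term often refers instead to the related model-discrimination measure 2 * AUC - 1.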

Q9. Difference between CHAID and CART

Ans.

CHAID and CART are both decision-tree algorithms; they differ in how they choose and shape their splits.

CHAID (Chi-squared Automatic Interaction Detection) selects splits using chi-square tests, while CART (Classification and Regression Trees) uses Gini impurity for classification and variance reduction for regression.
CHAID can produce multiway splits, while CART always produces binary splits.
CHAID stops growing when no remaining split is statistically significant, while CART typically grows a large tree and then prunes it.
CHAID bins continuous predictors into categories before splitting, while CART splits continuous predictors directly on thresholds.

Answered by AI

Q10. How do you check for outliers in a variable, and what treatment should you use to handle them?

Ans.

Outliers can be detected using statistical methods like box plots, z-scores, and the IQR rule. Treatment can be removal or transformation.

Use box plots to visualize outliers.
Calculate z-scores and remove data points whose absolute z-score is greater than 3.
Calculate the IQR and remove data points below Q1 - 1.5*IQR or above Q3 + 1.5*IQR.
Transform data using log or square root to reduce the impact of outliers.

Answered by AI
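The two numeric rules can be sketched with the standard library. The quartile method (the default of `statistics.quantiles`), the function names, and the sample data are assumptions; different quartile conventions give slightly different fences.

```python
import statistics


def iqr_outliers(data, k=1.5):
    """Points below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


def zscore_outliers(data, threshold=3.0):
    """Points whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return [x for x in data if abs(x - mean) / sd > threshold]


# Hypothetical sample with one extreme value.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 100]
print(iqr_outliers(sample))
print(zscore_outliers(sample))
print(zscore_outliers(sample, threshold=2.5))
```

In a sample this small the extreme value inflates the standard deviation so much that its z-score stays below 3, which is one reason the IQR rule is often preferred for small or heavily skewed data.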

Interview Preparation Tips

Interview preparation tips for other job seekers - Explain each concept properly; if you cannot, pause and try again with some examples. Be confident.

Data Scientist Interview Questions & Answers

HSBC Group

Anonymous

posted on 13 Sep 2022

I was approached by the company and was interviewed before Sep 2021. There were 3 interview rounds.

Round 1 - Resume Shortlist

Pro Tip by AmbitionBox:

Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds, so make sure to leave the best impression.

Round 2 - Technical

(1 Question)

Q1. Projects and Data Science concepts

Round 3 - Technical

(1 Question)

Q1. Python and coding skills

Interview Preparation Tips

Interview preparation tips for other job seekers - Be thorough with concepts - ML, stats, NLP

Data Scientist Interview Questions & Answers

HSBC Group

DEVANG RATHOD

posted on 25 Aug 2024

Interview experience

Average

Round 1 - Aptitude Test

(1 Question)

Q1. Central Limit Theorem

Ans.

Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases.

The Central Limit Theorem is essential in statistics as it allows us to make inferences about a population based on a sample.
It states that regardless of the shape of the population distribution, the sampling distribution of the sample mean will be approximately normally distribut...

Answered by AI
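A quick simulation can make the theorem concrete. The sketch draws many samples from a strongly skewed exponential distribution (mean 1, standard deviation 1) and looks at the distribution of their means, which should centre near 1 with spread close to 1 / sqrt(n). The sample sizes, seed, and function name are assumptions chosen for illustration.

```python
import random
import statistics


def sample_means(sample_size, n_samples, seed=0):
    """Means of repeated samples drawn from an Exponential(1) distribution."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(sample_size=50, n_samples=2000)
print("mean of sample means:", statistics.fmean(means))
print("std of sample means :", statistics.stdev(means))
print("theoretical std     :", 1 / 50 ** 0.5)
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the underlying data are far from normal.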

Data Scientist Interview Questions & Answers

HSBC Group

Anonymous

posted on 3 Dec 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Referral and was interviewed in Nov 2024. There were 2 interview rounds.

Round 1 - Technical

(4 Questions)

Q1. Types of Chunking in data preparation in RAG

Q2. How Embedding works in Vector Databases

Ans.

Embeddings in vector databases represent data points as dense vectors for efficient similarity search and retrieval.

Embeddings convert categorical data into continuous vector space, enabling mathematical operations.
For example, words can be represented as vectors in Word2Vec, capturing semantic relationships.
Vector databases store these embeddings, allowing for fast nearest neighbor searches.
Applications include recomm...

Answered by AI
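The similarity search described above can be sketched with cosine similarity over a tiny in-memory index. The three-dimensional vectors and their labels are invented for illustration, and the brute-force scan stands in for the approximate nearest-neighbour indexes that production vector databases generally rely on.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, index, k=1):
    """Return the k labels whose vectors are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda label: cosine_similarity(query, index[label]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
index = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.90, 0.15],
    "apple": [0.10, 0.20, 0.95],
}
print(nearest([0.88, 0.82, 0.10], index, k=2))
```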

Q3. Explain ARIMA model

Ans.

ARIMA is a statistical model used for forecasting time series data by capturing trend and autocorrelation; seasonality requires the seasonal extension, SARIMA.

ARIMA stands for AutoRegressive Integrated Moving Average.
It combines three components: AR (AutoRegressive), I (Integrated), and MA (Moving Average).
AR component uses past values to predict future values.
I component involves differencing the data to make it stationary.
MA component models the error of the prediction as...

Answered by AI

Q4. How can we decide to choose Linear Regression for a business problem

Ans.

Linear Regression is chosen for its simplicity, interpretability, and effectiveness in modeling linear relationships in data.

Linear relationship: Use when the relationship between independent and dependent variables is linear, e.g., predicting sales based on advertising spend.
Continuous outcome: Suitable for predicting continuous outcomes, like house prices based on features like size and location.
Interpretability: Pro...

Answered by AI

Round 2 - Technical

(4 Questions)

Q1. What is a token, and what is its limit for open-source LLMs?

Ans.

Tokens are units of text processed by LLMs, with limits varying by model, affecting input/output length.

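A token can be as short as one character or as long as one word; common words such as 'cat' are usually a single token, while rarer strings such as 'chatGPT' are typically split into several sub-word tokens.
Token limits for open-source LLMs vary widely by architecture, from 512 tokens for BERT-style models to 4096 for Llama 2, with newer models supporting far longer contexts.
For example, GPT-2 has a limit of 1024 tokens, while the original GPT-3 (which is not open source) handles up to 2048 tokens.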
Exceeding tok...

Answered by AI

Q2. Difference between a regression problem and a time series problem

Ans.

Regression predicts continuous outcomes; time series analyzes data points over time for trends and patterns.

Regression focuses on relationships between variables (e.g., predicting house prices based on features).
Time series analyzes data collected at regular intervals (e.g., stock prices over time).
Regression can be used for static datasets, while time series requires temporal ordering.
In regression, predictors can be ...

Answered by AI

Q3. Advantage of LSTM over RNN

Ans.

LSTMs effectively handle long-term dependencies, overcoming RNNs' vanishing gradient problem.

LSTMs use memory cells to store information over long sequences, unlike RNNs which forget earlier data.
They employ gates (input, output, forget) to control the flow of information, enhancing learning.
LSTMs are better suited for tasks like language modeling and time series prediction where context is crucial.
For example, in sent...

Answered by AI
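The gating mechanism can be illustrated with a single LSTM step on scalar inputs. This is a toy sketch of the standard cell equations with hand-picked weights; the parameter layout and the function name `lstm_step` are assumptions, and real implementations use vectors and learned weight matrices.

```python
import math


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step.

    params maps each gate name to (w_x, w_h, bias).
    Returns the new hidden state h and cell state c.
    """
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))        # how much old memory to keep
    write = sigmoid(pre_activation("input"))          # how much new content to write
    expose = sigmoid(pre_activation("output"))        # how much memory to expose
    candidate = math.tanh(pre_activation("candidate"))
    c = forget * c_prev + write * candidate
    h = expose * math.tanh(c)
    return h, c


# Hypothetical weights: forget gate wide open, input gate almost shut,
# so the cell state is carried forward nearly unchanged.
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = lstm_step(x=0.5, h_prev=0.1, c_prev=1.0, params=params)
print(h, c)
```

Because the cell state is updated additively through the forget gate instead of being squashed at every step, information and gradients can persist across many time steps.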

Q4. Performance Metrics for Logistic Regression

Data Scientist Interview Questions & Answers

Bajaj Finserv

Anonymous

posted on 17 May 2022

I appeared for an interview in May 2022.

Round 1 - Assignment

Round duration - 60 Minutes
Round difficulty - Easy

Round 2 - Coding Test

(1 Question)

Round duration - 60 Minutes
Round difficulty - Easy

There were 10 MCQs ranging from aptitude and programming to the basics of Data Science.
For the coding question, only the optimized solution was accepted.

Q1.

Special Sum of Array Problem Statement

Given an array 'arr' containing single-digit integers, your task is to calculate the total sum of all its elements. However, the resulting sum must also be a single-...

Ans.

Calculate the total sum of array elements until a single-digit number is obtained by repeatedly summing digits.

Iterate through the array and calculate the sum of all elements.
If the sum is a single-digit number, return it. Otherwise, repeat the process of summing digits until a single-digit number is obtained.
Return the final single-digit sum.

Answered by AI
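The steps above can be sketched as follows, assuming the array holds non-negative integers (the problem statement is truncated, so details such as input format are assumptions). The function name `special_sum` is hypothetical.

```python
def special_sum(arr):
    """Sum the elements, then keep summing digits until one digit remains."""
    total = sum(arr)
    while total >= 10:
        total = sum(int(digit) for digit in str(total))
    return total


print(special_sum([9, 9, 9]))      # 27 -> 2 + 7
print(special_sum([5, 5, 5, 5]))   # 20 -> 2 + 0
```

For positive totals the same result is given by `1 + (total - 1) % 9`, which avoids the loop entirely and may be the kind of optimized solution the round expected.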

Round 3 - Video Call

(1 Question)

Round duration - 45 minutes
Round difficulty - Easy

The interview happened in the evening. It was an online video call.
The interviewer was very cooperative. I would say it was rather a discussion session between us.

Q1.

Clone a Linked List with Random Pointers

Given a linked list where each node contains two pointers: one pointing to the next node and another random pointer that can point to any node within the list (or ...

Ans.

Create a deep copy of a linked list with random pointers.

Iterate through the original linked list and create a new node for each node in the list.
Store the mapping of original nodes to new nodes in a hashmap to handle random pointers.
Update the random pointers of new nodes based on the mapping stored in the hashmap.
Return the head of the copied linked list.

Answered by AI
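The hashmap approach described above can be sketched in two passes: one to create a copy of every node, one to wire up `next` and `random` through the mapping. The `Node` class and the function name `clone_list` are assumptions, since the original problem's node definition is truncated.

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None
        self.random = None


def clone_list(head):
    """Deep-copy a linked list whose nodes carry an extra random pointer."""
    if head is None:
        return None

    # Pass 1: create a copy of each node and remember original -> copy.
    copies = {}
    node = head
    while node is not None:
        copies[node] = Node(node.val)
        node = node.next

    # Pass 2: set pointers on the copies via the mapping.
    # dict.get returns None for a missing key, which covers null pointers.
    node = head
    while node is not None:
        copy = copies[node]
        copy.next = copies.get(node.next)
        copy.random = copies.get(node.random)
        node = node.next

    return copies[head]


# Small usage example: a -> b -> c with a.random = c and b.random = a.
a, b, c = Node(1), Node(2), Node(3)
a.next, b.next = b, c
a.random, b.random = c, a
copy_head = clone_list(a)
print(copy_head.val, copy_head.random.val, copy_head.next.random.val)
```

This runs in O(n) time with O(n) extra space; an interleaving technique can reduce the extra space to O(1) at the cost of temporarily modifying the original list.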

Round 4 - HR

Round duration - 10 Minutes
Round difficulty - Easy

It was a telephone call, held late at night.

Interview Preparation Tips

Professional and academic background
I completed Computer Science Engineering from Vellore Institute of Technology. I applied for the job as Data Scientist in Pune.

Eligibility criteria
Above 8 CGPA. Only CSE, IT, ECE, EEE branches were allowed.

Bajaj Finserv Ltd. interview preparation:
Topics to prepare for the interview - Data Structures and Algorithms, OOPs, DBMS, Data Science Fundamentals, Personal Projects
Time required to prepare for the interview - 6-8 months

Interview preparation tips for other job seekers

Tip 1 : Start your preparation early. Start from the very basics before directly moving onto DSA. Get a grasp of the basics in each topic. Practice different varieties of questions from each topic. I would recommend at least 200 questions of DSA.
Tip 2 : Revise your projects before you attend any interview. This is extremely important. You must be able to clearly explain your project along with your role in the project in layman terms to the interviewer.
Tip 3 : Grind hard to achieve your goals but don't take much stress. There's a long way to go.

Application resume tips for other job seekers

Tip 1 : Never put false information or a friend's project on your resume.
Tip 2 : Make a 1 page resume, arranged so that the interviewer sees the things you want them to see in the very first scan.

Final outcome of the interview: Selected

Data Scientist Interview Questions & Answers

Wells Fargo

Anonymous

posted on 6 Oct 2021

Interview Questionnaire

3 Questions

Q1. Mainly resume based. In detail from the project.

Q2. Softmax vs sigmoid

Ans.

Softmax and sigmoid are both activation functions used in neural networks.

Softmax is used for multi-class classification problems, while sigmoid is used for binary classification problems.
Softmax outputs a probability distribution over the classes, while sigmoid outputs a probability for a single class.
Softmax ensures that the sum of the probabilities of all classes is 1, while sigmoid does not.
Softmax is more sensitiv...

Answered by AI
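The relationship between the two functions can be checked numerically. The sketch below implements both and shows that softmax outputs sum to 1, and that softmax over the two logits [z, 0] gives the same first probability as sigmoid(z). The logits are arbitrary illustrative values.

```python
import math


def sigmoid(z):
    """Maps a single logit to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))


def softmax(logits):
    """Maps a list of logits to probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# Two-class softmax with logits [z, 0] reduces to sigmoid(z).
z = 1.3
print(softmax([z, 0.0])[0], sigmoid(z))
```

Applying sigmoid independently to several logits gives probabilities that need not sum to 1, which is why it suits multi-label problems while softmax suits mutually exclusive classes.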

Q3. Logistic regression (multiclass)

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare the projects mentioned in your resume very well

Data Scientist Interview Questions & Answers

Wells Fargo

Anonymous

posted on 20 Aug 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via LinkedIn and was interviewed in Jul 2024. There were 3 interview rounds.

Round 1 - Assignment

Assignment on credit risk

Round 2 - Technical

(1 Question)

Q1. Hyperparameter tuning

Round 3 - Technical

(1 Question)

Q1. Case study for problem solving

CIMB bank Interview FAQs

How many rounds are there in CIMB bank Data Scientist interview?

CIMB bank interview process usually has 1-2 rounds. The most common rounds in the CIMB bank interview process are Aptitude Test, Technical and One-on-one Round.

What are the top questions asked in CIMB bank Data Scientist interview?

Some of the top questions asked at the CIMB bank Data Scientist interview -

what is difference between supervised and unsupervised learn...read more
asked about Roc ...read more
Xgboost in-de...read more

Interview Questions for Popular Designations

4.5/5

based on 2 interview experiences

Difficulty level

Moderate 100%

Duration

Less than 2 weeks 100%

HSBC Group Data Scientist Interview Questions

3.9

• 6 Interviews

Bajaj Finserv Data Scientist Interview Questions

4.0

• 5 Interviews

Wells Fargo Data Scientist Interview Questions

3.8

• 4 Interviews

American Express Data Scientist Interview Questions

4.1

• 4 Interviews

Citicorp Data Scientist Interview Questions

3.7

• 3 Interviews

ICICI Securities Data Scientist Interview Questions

3.9

• 3 Interviews

JPMorgan Chase & Co. Data Scientist Interview Questions

3.9

• 2 Interviews

Motilal Oswal Financial Services Data Scientist Interview Questions

3.6

• 2 Interviews

Morningstar Data Scientist Interview Questions

3.9

• 2 Interviews

Xyz Company Data Scientist Interview Questions

3.8

• 1 Interview

CIMB bank Data Scientist Salary

based on 10 salaries

₹9 L/yr - ₹25 L/yr

At par with the average Data Scientist Salary in India

CIMB bank Salaries in India

Data Scientist (10 salaries): ₹9 L/yr - ₹25 L/yr
Director (7 salaries): ₹60 L/yr - ₹101 L/yr
Senior Data Scientist (5 salaries): ₹18 L/yr - ₹24 L/yr
Senior Software Engineer (5 salaries): ₹14 L/yr - ₹30 L/yr
Senior SAS Consultant (4 salaries): ₹18.8 L/yr - ₹20 L/yr