HSBC Group Data Scientist Interview Questions and Answers for Freshers

Updated 15 Dec 2024

HSBC Group Data Scientist Interview Experiences for Freshers

2 interviews found

Interview experience: 4 (Good)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical 

(1 Question)

  • Q1. Asked about ML algorithms

Interview experience: 3 (Average)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via Referral and was interviewed before May 2023. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. Self Intro and projects discussion
  • Q2. Feature selection methods
  • Ans. 

    Feature selection methods help in selecting the most relevant features for building predictive models.

    • Feature selection methods aim to reduce the number of input variables to only those that are most relevant.

    • Common methods include filter methods, wrapper methods, and embedded methods.

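    • Examples include Recursive Feature Elimination (RFE, a wrapper method) and Lasso regression (an embedded method). Principal Component Analysis (PCA) is often mentioned alongside them, but it is a dimensionality reduction (feature extraction) technique: it builds new components rather than selecting original features.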

  • Answered by AI
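A minimal sketch of a filter method, assuming absolute Pearson correlation with the target is the relevance score. The function names `pearson` and `select_top_k` and the toy data are hypothetical and chosen only for illustration; wrapper and embedded methods would need a model in the loop.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)

def select_top_k(features, target, k):
    """Filter method: keep the k features most correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data (assumed): one informative column, one noisy column, one constant.
features = {
    "income": [10, 20, 30, 40, 50],
    "noise": [3, 1, 4, 1, 5],
    "constant": [7, 7, 7, 7, 7],
}
target = [1, 2, 3, 4, 5]
selected = select_top_k(features, target, k=1)
```

A filter like this scores each feature on its own, so it is cheap but blind to interactions between features.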

Data Scientist Interview Questions Asked at Other Companies for Fresher

asked in C5i
Q1. 4. What is the difference between Linear Regression and Logistic ... read more
asked in Accenture
Q2. Why do we use machine learning? Machine learning is used for analysis th ... read more
asked in Turing
Q3. What is the neighbourhood in which superhosts have the biggest med ... read more
asked in ION Group
Q4. DBMS question - What are joins and what are their types?
asked in ION Group
Q5. Pattern based - Three memory chips, each of 1GB. You have to stor ... read more

Interview questions from similar companies

I applied via Walk-in and was interviewed in Mar 2020. There was 1 interview round.

Interview Questionnaire 

10 Questions

  • Q1. What is R square, and how is it different from adjusted R square?
  • Ans. 

    R square is a statistical measure that represents the proportion of the variance in the dependent variable explained by the independent variables.

    • For a least-squares fit with an intercept, R square is a value between 0 and 1, where 0 indicates that the independent variables do not explain any of the variance in the dependent variable, and 1 indicates that they explain all of it.

    • It is used to evaluate the goodness of fit of a regression model.

    • Adjusted R square t...

  • Answered by AI
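A short sketch of both quantities, using the standard definitions R² = 1 − SS_res / SS_tot and adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1), where n is the number of observations and p the number of predictors. The function names and the toy numbers are assumptions for illustration.

```python
def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalise R squared for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

y_true = [3, 5, 7, 9, 11]
y_pred = [2.8, 5.1, 7.2, 8.9, 11.0]

r2 = r_squared(y_true, y_pred)
adj_one_predictor = adjusted_r_squared(r2, n_samples=5, n_predictors=1)
adj_three_predictors = adjusted_r_squared(r2, n_samples=5, n_predictors=3)
```

For the same R², the adjusted value drops as predictors are added, which is why it is preferred when comparing models of different sizes.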
  • Q2. Explain what you understand by the terms WOE (Weight of Evidence) and IV (Information Value). What is their importance? What are their advantages and disadvantages?
  • Q3. What are variable reducing techniques
  • Ans. 

    Variable reducing techniques are methods used to identify and select the most relevant variables in a dataset.

    • Variable reducing techniques help in reducing the number of variables in a dataset.

    • These techniques aim to identify the most important variables that contribute significantly to the outcome.

    • Some common variable reducing techniques include feature selection, dimensionality reduction, and correlation analysis.

    • Fea...

  • Answered by AI
  • Q4. Which test is used in logistic regression to check the significance of the variable
  • Ans. 

    The Wald test is used in logistic regression to check the significance of the variable.

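    • The Wald statistic is the ratio of the estimated coefficient to its standard error.

    • Under the null hypothesis that the coefficient is zero, this ratio is approximately standard normal, so its square follows a chi-square distribution with one degree of freedom.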

    • A small p-value indicates that the variable is significant.

    • For example, in Python, the statsmodels library provides the Wald test in the summary of a logistic regression model.

  • Answered by AI
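A minimal sketch of the arithmetic behind a single-coefficient Wald test, assuming the coefficient and its standard error have already been estimated elsewhere. The function name `wald_test` and the input values are hypothetical; the p-value uses the fact that the two-sided normal tail equals the chi-square(1) upper tail.

```python
import math

def wald_test(coef, std_err):
    """Return (z, chi-square statistic, two-sided p-value) for H0: coef = 0."""
    z = coef / std_err
    chi2 = z * z
    # Two-sided standard normal tail probability, computed with erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, chi2, p_value

# Assumed estimates for one predictor of a fitted logistic regression.
z, chi2, p_value = wald_test(coef=0.84, std_err=0.30)
is_significant = p_value < 0.05
```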
  • Q5. How to check multicollinearity in Logistic regression
  • Ans. 

    Multicollinearity in logistic regression can be checked using correlation matrix and variance inflation factor (VIF).

    • Calculate the correlation matrix of the independent variables and check for high correlation coefficients.

    • Calculate the VIF for each independent variable and check for values above a rule-of-thumb threshold, commonly 5 or 10.

    • Consider removing one of the highly correlated variables or variables with high VIF to address multicollinear...

  • Answered by AI
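A small sketch of the VIF calculation, relying on the identity that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. It assumes no predictor is constant; the helper names and the two toy columns are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; assumes neither column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors: VIF is infinite")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """Map each predictor name to its variance inflation factor."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}

# Two toy predictors with correlation 0.8 (assumed data).
vifs = vif({"age": [1, 2, 3, 4], "tenure": [1, 3, 2, 4]})
```

With only two predictors the result reduces to 1 / (1 − r²), which is a convenient sanity check.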
  • Q6. Difference between bagging and boosting
  • Ans. 

    Bagging and boosting are ensemble methods used in machine learning to improve model performance.

    • Bagging involves training multiple models on different subsets of the training data and then combining their predictions through averaging or voting.

    • Boosting involves iteratively training models on the same dataset, with each subsequent model focusing on the samples that were misclassified by the previous model.

    • Bagging reduc...

  • Answered by AI
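The contrast can be sketched with one-split regression trees (stumps) as the base model. This is an illustration under stated assumptions, not a library implementation: bagging here averages stumps fitted to bootstrap samples, and boosting is shown in its gradient-boosting form (each stump fits the current residuals) rather than the reweighting of misclassified samples described above. All names and the toy data are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit the single-threshold regression stump with the lowest squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    if best is None:  # every x is identical, so predict the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high

def bagging(xs, ys, n_models=25, seed=0):
    """Train independent stumps on bootstrap samples and average them."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)

def boosting(xs, ys, n_models=25, learning_rate=0.5):
    """Train stumps sequentially, each one on the residuals left so far."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    models = []
    for _ in range(n_models):
        stump = fit_stump(xs, residuals)
        models.append(stump)
        residuals = [r - learning_rate * stump(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(m(x) for m in models)

xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5  # a step function
bagged = bagging(xs, ys)
boosted = boosting(xs, ys)
```

The bagged models never see each other, so they could be trained in parallel; the boosted models depend on the residuals of their predecessors and must be trained in order.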
  • Q7. Explain the logistic regression process
  • Ans. 

    Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.

    • It is a type of regression analysis used for predicting the outcome of a categorical dependent variable based on one or more predictor variables.

    • It uses a logistic function to model the probability of the dependent variable taking a particular value.

    • It is commo...

  • Answered by AI
  • Q8. Explain Gini coefficient
  • Ans. 

    Gini coefficient measures the inequality among values of a frequency distribution.

    • Gini coefficient ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality.

    • It is commonly used to measure income inequality in a population.

    • A Gini coefficient of 0.4 or higher is considered to be a high level of inequality.

    • Gini coefficient can be calculated using the Lorenz curve, which plots the cumulati...

  • Answered by AI
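A minimal sketch of the Gini coefficient for a list of non-negative values, using the closed form for sorted data that is equivalent to measuring the area under the Lorenz curve: G = 2·Σ(i·x_i) / (n·Σx_i) − (n + 1)/n, with i counting from 1. The function name and sample incomes are assumptions.

```python
def gini_coefficient(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    rank_weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * rank_weighted / (n * total) - (n + 1) / n

equal_incomes = gini_coefficient([5, 5, 5, 5])
one_person_has_all = gini_coefficient([0, 0, 0, 1])
```

For a finite sample the maximum is (n − 1)/n rather than exactly 1, so four people with one holding everything gives 0.75.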
  • Q9. Difference between CHAID and CART
  • Ans. 

    CHAID and CART are both decision tree algorithms; they differ mainly in how they choose and shape splits.

    • CHAID (Chi-squared Automatic Interaction Detection) selects splits using chi-square tests for a categorical target, while CART (Classification and Regression Trees) uses Gini impurity for classification and variance reduction for regression.

    • CHAID can produce multiway splits, while CART always produces binary splits.

    • CHAID stops growing when no split is statistically significant, while CART typically grows a large tree and then prunes it back.

  • Answered by AI
  • Q10. How to check outliers in a variable, what treatment should you use to remove such outliers
  • Ans. 

    Outliers can be detected using statistical methods like box plots, z-score, and IQR. Treatment can be removal or transformation.

    • Use box plots to visualize outliers

    • Calculate the z-score and remove data points whose absolute z-score is greater than 3

    • Calculate the IQR and remove data points below Q1 - 1.5*IQR or above Q3 + 1.5*IQR

    • Transform data using log or square root to reduce the impact of outliers

  • Answered by AI
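The two detection rules can be sketched with the standard library alone. This assumes the "inclusive" quartile convention of `statistics.quantiles` and the population standard deviation for z-scores; other conventions shift the fences slightly. Function names and data are hypothetical.

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

def zscore_outliers(data, threshold=3.0):
    """Values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    if sd == 0:
        return []
    return [x for x in data if abs(x - mean) / sd > threshold]

small_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
flagged_by_iqr = iqr_outliers(small_sample)
flagged_by_z = zscore_outliers(small_sample)
```

On this ten-point sample the IQR rule flags 100 but the z-score rule does not, because a single extreme value inflates the standard deviation; the z-score rule is more dependable on larger samples.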

Interview Preparation Tips

Interview preparation tips for other job seekers - Explain the concept properly, if not able to explain properly then take a pause and try again with some examples. Be confident.

Interview experience: 1 (Bad)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - One-on-one 

(1 Question)

  • Q1. How to efficiently extract the digits before the decimal point from a long list of decimal numbers
  • Ans. 

    Use string manipulation to efficiently extract numbers before the decimal point from a list of decimal numbers.

    • Split each decimal number by the decimal point and extract the number before it

    • Use regular expressions to match and extract numbers before the decimal point

    • Iterate through the list and extract numbers using string manipulation functions

  • Answered by AI
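A brief sketch of two ways to do this, assuming the input is either a list of floats or a list of decimal strings. `math.trunc` is a numeric alternative to the string approach described above, and the function names are hypothetical.

```python
import math

def integer_parts(values):
    """Truncate each float toward zero, keeping the part before the point."""
    return [math.trunc(v) for v in values]

def integer_parts_from_strings(texts):
    """Split each decimal string at the first '.' and keep the left side."""
    return [t.partition(".")[0] for t in texts]

from_floats = integer_parts([3.14, -2.7, 10.0, 0.99])
from_strings = integer_parts_from_strings(["12.50", "-0.75", "7"])
```

Truncation differs from `math.floor` for negative numbers: -2.7 truncates to -2 but floors to -3. The string version preserves the sign text, so "-0.75" yields "-0".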

Interview experience: 5 (Excellent)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. How do you define model Gini?
  • Ans. 

    Model Gini measures how well a classification model ranks positives above negatives; it is derived from the ROC curve as Gini = 2 × AUC − 1.

    • Model Gini is calculated as twice the area between the ROC curve and the diagonal line (random model).

    • It ranges from 0 (no better than random) to 1 (perfect separation), with higher values indicating better model performance; a negative value means the model ranks worse than random.

    • A Gini coefficient of 0, equivalent to an AUC of 0.5, indicates a model that is no better than random guessing.

    • Commonly used in credit

  • Answered by AI
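A minimal sketch of model Gini computed as 2 × AUC − 1, where AUC is taken as the probability that a randomly chosen positive scores higher than a randomly chosen negative (ties count as half). The pairwise loop is quadratic and meant only for illustration; function names and data are assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def model_gini(labels, scores):
    """Gini = 2 * AUC - 1, so 0 is random and 1 is perfect ranking."""
    return 2 * auc(labels, scores) - 1

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
gini = model_gini(labels, scores)
```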
  • Q2. How do you train an XGBoost model?
  • Ans. 

    XGBoost model is trained by specifying parameters, splitting data into training and validation sets, fitting the model, and tuning hyperparameters.

    • Specify parameters for XGBoost model such as learning rate, max depth, and number of trees

    • Split data into training and validation sets using train_test_split function

    • Fit the XGBoost model on training data using fit method

    • Tune hyperparameters using techniques like grid search

  • Answered by AI

Interview Questionnaire 

3 Questions

  • Q1. Mainly resume based. In detail from the project.
  • Q2. Softmax vs sigmoid
  • Ans. 

    Softmax and sigmoid are both activation functions used in neural networks.

    • Softmax is used for multi-class classification problems, while sigmoid is used for binary classification problems.

    • Softmax outputs a probability distribution over the classes, while sigmoid outputs a probability for a single class.

    • Softmax ensures that the sum of the probabilities of all classes is 1, while sigmoid does not.

    • Softmax is more sensitiv...

  • Answered by AI
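Both functions fit in a few lines of standard-library Python. This sketch uses numerically stable forms (subtracting the maximum logit in softmax, branching on the sign in sigmoid); the logits are made-up values.

```python
import math

def sigmoid(z):
    """Map a real score to a probability in (0, 1) without overflow."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)

def softmax(logits):
    """Map a list of scores to probabilities that sum to 1."""
    m = max(logits)  # shifting by the max leaves the result unchanged
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

class_probs = softmax([2.0, 1.0, 0.1])
binary_prob = sigmoid(1.3)
two_class_prob = softmax([1.3, 0.0])[0]
```

With two classes and one logit fixed at 0, softmax reduces to the sigmoid, which is why `two_class_prob` matches `binary_prob`.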
  • Q3. Logistic regression (multiclass)

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare the projects mentioned in your resume very well

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via LinkedIn and was interviewed in Jul 2024. There were 3 interview rounds.

Round 1 - Assignment 

Assignment on credit risk

Round 2 - Technical 

(1 Question)

  • Q1. Hyperparameter tuning

Round 3 - Technical

(1 Question)

  • Q1. Case study for problem solving

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: -
Result: No response

I applied via Job Portal and was interviewed in Nov 2023. There was 1 interview round.

Round 1 - One-on-one 

(5 Questions)

  • Q1. What is gradient descent?
  • Ans. 

    Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent.

    • Gradient descent is used to find the minimum of a function by taking steps proportional to the negative of the gradient at the current point.

    • It is commonly used in machine learning to optimize the parameters of a model by minimizing the loss function.

    • There are different variants of gradie...

  • Answered by AI
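The update rule x ← x − learning_rate × gradient can be sketched on a one-dimensional function. The objective f(x) = (x − 3)², the learning rate, and the step count are arbitrary choices for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(n_steps):
        x -= learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Each step here shrinks the distance to the minimum by a factor of 0.8; a learning rate above 1.0 would make this particular objective diverge.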
  • Q2. What is LSTM?, and what are the gates in it?
  • Ans. 

    LSTM (Long Short-Term Memory) is a type of recurrent neural network designed to handle long-term dependencies.

    • LSTM has three gates: input gate, forget gate, and output gate.

    • Input gate controls the flow of information into the cell state.

    • Forget gate decides what information to discard from the cell state.

    • Output gate determines the output based on the cell state.

  • Answered by AI
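One time step of an LSTM cell can be sketched with scalar state to show where each gate acts. This assumes the standard formulation (a tanh candidate value alongside the three sigmoid gates); the `params` layout and all weights are hypothetical.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps a gate name to (w_x, w_h, bias)."""
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))       # what to discard from the cell
    input_gate = sigmoid(pre_activation("input"))    # how much new information to admit
    output = sigmoid(pre_activation("output"))       # how much of the cell to expose
    candidate = math.tanh(pre_activation("candidate"))

    c = forget * c_prev + input_gate * candidate     # updated cell state
    h = output * math.tanh(c)                        # updated hidden state
    return h, c

# All-zero weights make every gate 0.5 and the candidate 0 (assumed values).
zero_params = {gate: (0.0, 0.0, 0.0)
               for gate in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```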
  • Q3. They gave me a link to dataset and started saying the operations to apply on that. E.g, value_counts, null_values, fill the values with mean,etc.
  • Q4. What is t-test? What is Mean, Median and Mode and where to use these?
  • Ans. 

    T-test is a statistical test used to determine if there is a significant difference between the means of two groups.

    • Mean is the average of a set of numbers, median is the middle value when the numbers are ordered, and mode is the most frequently occurring value.

    • Mean is sensitive to outliers, median is robust to outliers, and mode is useful for categorical data.

    • T-test is used to compare means of two groups, mean is used...

  • Answered by AI
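A short sketch of these quantities with the `statistics` module. The t statistic shown is Welch's version, which does not assume equal variances; that choice, the function name, and the sample data are assumptions, and converting the statistic to a p-value would need a t-distribution, which is left out here.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / std_err

salaries = [30, 32, 35, 35, 400]  # one extreme value
mean_salary = statistics.fmean(salaries)
median_salary = statistics.median(salaries)
mode_salary = statistics.mode(salaries)

t_stat = welch_t([5, 6, 7], [1, 2, 3])
```

The single extreme salary pulls the mean to 106.4 while the median and mode both stay at 35, which is the usual argument for reporting the median on skewed data.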
  • Q5. What is RANDOM FOREST ?
  • Ans. 

    Random Forest is an ensemble learning method used for classification and regression tasks.

    • Random Forest is a collection of decision trees that are trained on random subsets of the data.

    • Each tree in the forest makes a prediction, and the final prediction is the average (regression) or majority vote (classification) of all trees.

    • Random Forest helps reduce overfitting and improve accuracy compared to a single decision tre...

  • Answered by AI

Interview Preparation Tips

Topics to prepare for Motilal Oswal Financial Services Data Scientist interview:
  • Machine Learning
  • Statistics
  • Pandas

Interview experience: 4 (Good)
Difficulty level: Easy
Process Duration: 2-4 weeks
Result: Selected

I applied via IIM Jobs and was interviewed before Jun 2023. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. SQL basic questions
  • Q2. Python - pandas, numpy based questions

Interview Preparation Tips

Interview preparation tips for other job seekers - Great place to work

Interview experience: 4 (Good)
Difficulty level: Hard
Process Duration: Less than 2 weeks
Result: Selected

I applied via Campus Placement and was interviewed before Jul 2023. There were 3 interview rounds.

Round 1 - Aptitude Test 

Medium-difficulty general aptitude questions and technical questions (Big Data, Python, etc.)

Round 2 - Technical 

(1 Question)

  • Q1. ML algorithms (SVM, random forest, bagging, boosting, ridge, etc.)

Round 3 - Technical

(1 Question)

  • Q1. Deep equations and understanding of DL and ML algorithms
  • Ans. 

    Understanding deep equations and algorithms in DL and ML is crucial for a data scientist.

    • Deep learning involves complex neural network architectures like CNNs and RNNs.

    • Machine learning algorithms include decision trees, SVM, k-means clustering, etc.

    • Understanding the math behind algorithms helps in optimizing model performance.

    • Equations like gradient descent, backpropagation, and loss functions are key concepts.

    • Practica...

  • Answered by AI

HSBC Group Interview FAQs

How many rounds are there in HSBC Group Data Scientist interview for freshers?
HSBC Group interview process for freshers usually has 1 round. The most common round in the HSBC Group interview process for freshers is Technical.
How to prepare for HSBC Group Data Scientist interview for freshers?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at HSBC Group. The most common topics and skills that interviewers at HSBC Group expect are Clinical SAS Programming, Data Analysis, Data Domain, Data Quality and Data Science.
What are the top questions asked in HSBC Group Data Scientist interview for freshers?

Some of the top questions asked at the HSBC Group Data Scientist interview for freshers -

  1. Feature selection meth...read more
  2. How can we decide to choose Linear Regression for a business prob...read more
  3. What is a token and its limit for Open Source L...read more

HSBC Group Data Scientist Interview Process for Freshers

based on 2 interviews

Interview experience: 3.5 (Good)

HSBC Group Data Scientist Salary
based on 94 salaries
₹7.6 L/yr - ₹32.4 L/yr
22% more than the average Data Scientist Salary in India

HSBC Group Data Scientist Reviews and Ratings

based on 11 reviews

3.8/5

Rating in categories

  • Skill development: 3.7
  • Work-life balance: 4.2
  • Salary: 3.6
  • Job security: 4.2
  • Company culture: 3.9
  • Promotions: 3.4
  • Work satisfaction: 3.8
