
Info Edge Data Scientist Interview Questions and Answers

Updated 29 May 2025

11 Interview questions

A Data Scientist was asked 3w ago

Q. What are the assumptions underlying Linear Regression?

Ans.

Linear regression relies on several key assumptions for valid results, including linearity, independence, and homoscedasticity.

Linearity: The relationship between the independent and dependent variables should be linear. For example, predicting weight based on height.
Independence: Observations should be independent of each other. For instance, data collected from different individuals should not influence each oth...

A Data Scientist was asked 5mo ago

Q. What is dropout in neural networks?

Ans.

Dropout is a regularization technique used in neural networks to prevent overfitting by randomly setting some neuron outputs to zero during training.

Dropout is a regularization technique used in neural networks to prevent overfitting.
During training, a fraction of neurons are randomly selected and their outputs are set to zero.
This helps prevent complex co-adaptations in neurons and improves generalization.
Dropout...

A Data Scientist was asked 5mo ago

Q. Explain precision, recall, and related metrics.

Ans.

Precision and recall are metrics used to evaluate the performance of classification models.

Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.
Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class.
F1 score is the harmonic mean of precision and recall, where the best value is 1 and the worst is 0.
Preci...

A Data Scientist was asked 8mo ago

Q. What is a random forest?

Ans.

Random forest is an ensemble learning method used for classification and regression tasks.

Random forest is a collection of decision trees that are trained on random subsets of the data.
Each tree in the random forest independently predicts the target variable, and the final prediction is made by averaging the predictions of all trees (regression) or taking a majority vote (classification).
Random forest is robust to overfitting and noisy data, and it can handle large dat...

A Data Scientist was asked 8mo ago

Q. What is XGBoost?

Ans.

XGBoost is an optimized distributed gradient boosting library designed for efficient and accurate large-scale machine learning.

XGBoost stands for eXtreme Gradient Boosting.
It is a popular machine learning algorithm known for its speed and performance.
XGBoost is used for regression, classification, ranking, and user-defined prediction problems.
It is based on the gradient boosting framework and uses decision trees a...

A Data Scientist was asked 8mo ago

Q. What is Linear Regression?

Ans.

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables.

Linear regression is used to predict the value of a dependent variable based on the value of one or more independent variables.
It assumes a linear relationship between the independent and dependent variables.
The goal of linear regression is to find the best-fitting line that m...

A Data Scientist was asked 8mo ago

Q. Describe LSTM and GRU.

Ans.

LSTM and GRU are types of recurrent neural networks used for processing sequential data.

LSTM (Long Short-Term Memory) networks are capable of learning long-term dependencies in data.
GRU (Gated Recurrent Unit) networks are simpler than LSTM and have fewer parameters.
LSTM has three gates (input, output, forget) while GRU has two gates (update, reset).
LSTM is better at capturing long-term dependencies but is more com...


A Data Scientist was asked 8mo ago

Q. Define Hypothesis Testing.

Ans.

Hypothesis testing is a statistical method used to make inferences about a population based on sample data.

Hypothesis testing involves formulating a null hypothesis and an alternative hypothesis.
It aims to determine if there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
Common methods of hypothesis testing include t-tests, chi-square tests, and ANOVA.
The p-value is used to...

A Data Scientist was asked 8mo ago

Q. Explain all of decision tree and random forest?

Ans.

Decision tree is a tree-like model of decisions and their possible consequences, while random forest is an ensemble learning method that builds multiple decision trees and merges them together.

Decision tree is a flowchart-like structure where each internal node represents a decision based on an attribute, each branch represents the outcome of the decision, and each leaf node represents a class label.
Random forest ...

A Data Scientist was asked

Q. 1) Explain correlation and covariance. 2) How does logistic regression differ from linear regression?

Ans.

Correlation measures the strength and direction of a linear relationship between two variables, while covariance measures the extent to which two variables change together.

Correlation ranges from -1 to 1, where 1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.
Covariance can be positive, negative, or zero. A positive covariance indicates that...

Info Edge Data Scientist Interview Experiences

9 interviews found

Data Scientist Interview Questions & Answers

Anonymous

posted on 29 May 2025

Interview experience

Poor

Difficulty level

Hard

Process Duration

Less than 2 weeks

Result

Not Selected

I appeared for an interview in Apr 2025, where I was asked the following questions.

Q1. Concepts of Dropout during Test Time?

Ans.

Dropout is a regularization technique used during training, but typically not applied during test time to ensure full model performance.

Dropout randomly sets a fraction of input units to zero during training to prevent overfitting.
During test time, dropout is usually turned off, allowing the full network to make predictions.
This ensures that all neurons contribute to the output, providing a more accurate representation...
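The train/test distinction described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `dropout`, its parameters, and the choice of the "inverted" variant (scaling the surviving units by 1/(1 - p) during training so that nothing needs rescaling at test time) are assumptions, not something stated in the answer.

```python
import random

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout on a list of floats (illustrative helper, not a library API)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Test time: dropout is off, so every unit contributes unchanged.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    # Training: drop each unit with probability p and scale the survivors
    # by 1/keep so the expected activation matches the test-time value.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Calling `dropout(x, training=False)` returns the input unchanged, which is the "turned off at test time" behaviour.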

Answered by AI


Q2. How Random Forest Solves the Problem of Overfitting?

Ans.

Random Forest mitigates overfitting by averaging multiple decision trees, enhancing generalization and robustness.

Ensemble Learning: Combines predictions from multiple trees to reduce variance.
Bootstrap Aggregating: Each tree is trained on a random subset of data, promoting diversity.
Feature Randomness: Randomly selects features for splitting, preventing dominance of any single feature.
Example: In a dataset with noise,...
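The three ingredients listed above (bootstrap sampling, feature randomness, and aggregation) can be sketched without any library. The helper names below are hypothetical and this is not a full random forest: tree training is left out, and only the sampling and voting steps are shown.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at a split."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Aggregate per-tree class predictions; regression would average instead."""
    return Counter(predictions).most_common(1)[0][0]
```

Each tree would be fitted on its own `bootstrap_sample`, restricted at each split to a `random_feature_subset`, and the forest's output for a classification task would be the `majority_vote` over the trees.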

Answered by AI


Q3. Suppose you have a severe class-imbalance problem. Which model will you choose, and how will you report it?

Ans.

For high imbalance problems, choose models like Random Forest or XGBoost, and use techniques like SMOTE for better performance.

Use ensemble methods like Random Forest or Gradient Boosting (e.g., XGBoost) for better handling of imbalanced data.
Consider using resampling techniques such as SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset.
Evaluate models using metrics like F1-score, precision, reca...
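To see why accuracy alone is a poor metric to report on imbalanced data, here is a small sketch in plain Python. The helper names are hypothetical. The inverse-frequency class weights shown are a swapped-in alternative to SMOTE (reweighting instead of resampling), included only as an assumption about what one might try.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)

def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    return {c: len(y) / (len(counts) * n) for c, n in counts.items()}
```

With 990 negatives and 10 positives, a model that always predicts the majority class scores 0.99 accuracy but 0.0 recall on the minority class, which is why F1, precision, and recall are the numbers worth reporting.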

Answered by AI


Q4. Deep Concepts of Transformers?

Ans.

Transformers are advanced neural network architectures that excel in processing sequential data, particularly in NLP tasks.

Self-Attention Mechanism: Allows the model to weigh the importance of different words in a sentence, e.g., in 'The cat sat on the mat', 'cat' and 'sat' are closely related.
Positional Encoding: Since transformers don't have a built-in sense of order, positional encodings are added to input embedding...

Answered by AI


Q5. What are the Query, Key, and Value in self-attention?

Ans.

Query, Key, and Value are components of self-attention mechanisms in neural networks, enabling context-aware representations.

In self-attention, each input element is transformed into three vectors: Query, Key, and Value.
The Query vector represents the current element's focus, while the Key vector represents the context of other elements.
The attention score is computed by taking the dot product of the Query and Key vect...

Answered by AI


Q6. How does self-attention work?

Ans.

Self-attention allows models to weigh the importance of different words in a sequence when processing them, enhancing context understanding.

Self-attention computes a weighted representation of input sequences, focusing on relevant parts.
It uses three vectors: Query (Q), Key (K), and Value (V) to determine attention scores.
For each word, the model calculates how much attention to pay to every other word in the sequence.
...
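The Query/Key/Value computation described in the two answers above can be sketched as scaled dot-product attention in plain Python. Assumptions: Q, K, and V are passed in directly as lists of row vectors (in a real transformer they come from learned linear projections of the input), the scores are scaled by the square root of the key dimension, and the function names are illustrative.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention. Q, K: n x d_k; V: n x d_v.

    Returns (outputs, weights), one row per query.
    """
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Score this query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        # The output is the attention-weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights
```

Each row of `weights` says how much one position attends to every other position in the sequence.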

Answered by AI


Q7. Real-time problems based on multi-head attention and its use cases?

Ans.

Multi-head attention enhances model performance by focusing on different parts of input data simultaneously.

Improves natural language processing tasks like translation and summarization.
Used in image processing for tasks like object detection and segmentation.
Facilitates recommendation systems by analyzing user preferences from multiple perspectives.
Enhances speech recognition by focusing on different phonetic features...

Answered by AI


Q8. Degrees of freedom and the Chi-Square distribution?

Ans.

Degrees of freedom in Chi-Square distribution indicate the number of independent values in a statistical calculation.

Degrees of freedom (df) = number of categories - 1 in a Chi-Square goodness-of-fit test; a test of independence on an r × c table uses (r - 1)(c - 1).
Example: For a test with 5 categories, df = 5 - 1 = 4.
In goodness-of-fit tests, df helps determine the critical value for hypothesis testing.
Higher degrees of freedom lead to a more accurate approximation of the Chi-Square distributio...
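As a small worked example of the goodness-of-fit case, the sketch below computes the Chi-Square statistic and its degrees of freedom in plain Python. The function name and the sample counts are made up for illustration, and df = k - 1 assumes no parameters were estimated from the data.

```python
def chi_square_gof(observed, expected):
    """Return (statistic, df) for a goodness-of-fit test over k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum of (observed - expected)^2 / expected over all categories.
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df

# Five categories with an expected count of 20 each (hypothetical data).
stat, df = chi_square_gof([18, 22, 20, 25, 15], [20, 20, 20, 20, 20])
```

Here the five categories give df = 4, matching the example in the answer; the statistic would then be compared against the critical value for 4 degrees of freedom.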

Answered by AI


Q9. What is the mathematical framework behind Linear Regression? Real Time problem case.

Ans.

Linear Regression uses statistical methods to model the relationship between variables, predicting outcomes based on input features.

Linear regression assumes a linear relationship between the dependent and independent variables.
The model can be represented as: Y = β0 + β1X1 + β2X2 + ... + βnXn + ε, where Y is the dependent variable.
The coefficients (β) are estimated using the least squares method, minimizing the sum of...
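For the single-feature case, the least squares estimates have a closed form, sketched below in plain Python. The function name `fit_simple_ols` is hypothetical, and the sketch covers only one predictor (Y = β0 + β1X + ε); the multi-feature model in the answer would need matrix algebra.

```python
def fit_simple_ols(x, y):
    """Least squares fit of y = beta0 + beta1 * x for one feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    # Intercept: forces the fitted line through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1
```

These two values are exactly the ones that minimise the sum of squared residuals for the data passed in.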

Answered by AI


Q10. How are LLN and CLT related?

Ans.

LLN and CLT are statistical theorems that describe the behavior of sample averages as sample size increases.

LLN (Law of Large Numbers) states that as the sample size increases, the sample mean converges to the population mean.
CLT (Central Limit Theorem) states that the distribution of the sample mean approaches a normal distribution as sample size increases, regardless of the population's distribution, provided the population has finite variance.
Both theorems are...
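A quick simulation can make the two statements concrete. The sketch below draws repeated samples from Uniform(0, 1), which is an arbitrary choice of a non-normal population; the helper name, the seed, and the sample sizes are all assumptions made for illustration.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Means of `trials` independent samples of size n from Uniform(0, 1)."""
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]

rng = random.Random(0)
for n in (1, 10, 100):
    means = sample_means(n, 2000, rng)
    # LLN: the sample means cluster around the population mean 0.5.
    # CLT: their spread shrinks roughly like 1/sqrt(n) and their
    # histogram looks increasingly bell-shaped.
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

The LLN is about where the sample mean ends up; the CLT is about the shape and width of its distribution on the way there.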

Answered by AI


Q11. Difference between Covariance and correlation?

Ans.

Covariance measures the directional relationship between two variables, while correlation quantifies the strength and direction of that relationship.

Covariance can take any value between -∞ and +∞, while correlation ranges from -1 to +1.
Positive covariance indicates that two variables move in the same direction, while negative covariance indicates they move in opposite directions.
Correlation standardizes covariance, ma...
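The point that correlation is a standardized covariance can be checked directly. The following plain-Python sketch uses the sample (n - 1) definition of covariance; the function names are illustrative.

```python
import math

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))
```

Rescaling one variable (for example, metres to centimetres) multiplies the covariance by the same factor but leaves the correlation unchanged, which is why correlation is the one that can be compared across datasets.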

Answered by AI


Q12. What are the assumptions underlying Linear Regression?

Ans.

Linear regression relies on several key assumptions for valid results, including linearity, independence, and homoscedasticity.

Linearity: The relationship between the independent and dependent variables should be linear. For example, predicting weight based on height.
Independence: Observations should be independent of each other. For instance, data collected from different individuals should not influence each other.
Ho...

Answered by AI


Interview Preparation Tips

Interview preparation tips for other job seekers - Hard interview - Straight Forward Answers only (No drama)

Data Scientist Interview Questions & Answers

Anonymous

posted on 24 Oct 2024

Interview experience

Average

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Campus Placement and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - One-on-one

(2 Questions)

Q1. Explain decision trees and random forests in full.

Ans.

Decision tree is a tree-like model of decisions and their possible consequences, while random forest is an ensemble learning method that builds multiple decision trees and merges them together.

Decision tree is a flowchart-like structure where each internal node represents a decision based on an attribute, each branch represents the outcome of the decision, and each leaf node represents a class label.
Random forest is a ...

Answered by AI


Q2. Some basic probability questions


Interview Preparation Tips

Interview preparation tips for other job seekers - Keep statistics strong


Data Scientist Interview Questions & Answers

VISHAL

posted on 8 Oct 2024

Interview experience

Good

Difficulty level

Process Duration

Result

Round 1 - Aptitude Test

45-minute test with 30 questions.

Round 2 - One-on-one

(3 Questions)

Q1. What is linear regression?

Ans.

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables.

Linear regression is used to predict the value of a dependent variable based on the value of one or more independent variables.
It assumes a linear relationship between the independent and dependent variables.
The goal of linear regression is to find the best-fitting line that minimi...

Answered by AI


Q2. What is a random forest?

Ans.

Random forest is an ensemble learning method used for classification and regression tasks.

Random forest is a collection of decision trees that are trained on random subsets of the data.
Each tree in the random forest independently predicts the target variable, and the final prediction is made by averaging the predictions of all trees (regression) or taking a majority vote (classification).
Random forest is robust to overfitting and noisy data, and it can handle large datasets...

Answered by AI


Q3. What is XGBoost?

Ans.

XGBoost is an optimized distributed gradient boosting library designed for efficient and accurate large-scale machine learning.

XGBoost stands for eXtreme Gradient Boosting.
It is a popular machine learning algorithm known for its speed and performance.
XGBoost is used for regression, classification, ranking, and user-defined prediction problems.
It is based on the gradient boosting framework and uses decision trees as bas...

Answered by AI


Interview Preparation Tips

Interview preparation tips for other job seekers - Thanks


Data Scientist Interview Questions & Answers

Sahil Burde

posted on 6 Oct 2024

Interview experience

Good

Difficulty level

Process Duration

Result

Round 1 - Technical

(2 Questions)

Q1. Describe LSTM and GRU

Ans.

LSTM and GRU are types of recurrent neural networks used for processing sequential data.

LSTM (Long Short-Term Memory) networks are capable of learning long-term dependencies in data.
GRU (Gated Recurrent Unit) networks are simpler than LSTM and have fewer parameters.
LSTM has three gates (input, output, forget) while GRU has two gates (update, reset).
LSTM is better at capturing long-term dependencies but is more complex,...
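The claim that a GRU has fewer parameters than an LSTM follows from the gate counts. The sketch below counts trainable parameters for one layer under a textbook formulation, assuming one weight block per gate plus one for the candidate state, and a single bias vector per block. Some frameworks store extra bias vectors, so their reported counts can differ; the function names are hypothetical.

```python
def lstm_param_count(input_size, hidden_size):
    """Four weight blocks: input, forget, and output gates plus the cell candidate."""
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return 4 * per_block

def gru_param_count(input_size, hidden_size):
    """Three weight blocks: update and reset gates plus the candidate state."""
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return 3 * per_block
```

For the same input and hidden sizes, the GRU count is three quarters of the LSTM count under these assumptions.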

Answered by AI


Q2. Define Hypothesis Testing

Ans.

Hypothesis testing is a statistical method used to make inferences about a population based on sample data.

Hypothesis testing involves formulating a null hypothesis and an alternative hypothesis.
It aims to determine if there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
Common methods of hypothesis testing include t-tests, chi-square tests, and ANOVA.
The p-value is used to dete...
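As a minimal sketch of the reject-or-not decision, the code below runs a two-sided one-sample z-test using only the standard library. A z-test is swapped in here because it needs nothing beyond the normal distribution; the answer names t-tests, chi-square tests, and ANOVA, which follow the same null-versus-alternative logic. The function name, the known-sigma assumption, and the numbers in the usage are illustrative.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    # Standardize the observed mean under the null hypothesis.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Probability of a result at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
reject_h0 = p < 0.05  # 0.05 is a conventional, not mandatory, significance level
```

The null hypothesis is rejected only when the p-value falls below the significance level chosen before looking at the data.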

Answered by AI



Data Scientist Interview Questions & Answers

Anonymous

posted on 27 Dec 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Job Portal and was interviewed in Jun 2024. There were 2 interview rounds.

Round 1 - Aptitude Test

MCQ-based test on data science concepts.

Round 2 - One-on-one

(2 Questions)

Q1. Explain precision, recall, etc.

Ans.

Precision and recall are metrics used to evaluate the performance of classification models.

Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.
Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class.
F1 score is the harmonic mean of precision and recall, where the best value is 1 and the worst is 0.
Precision ...
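These definitions translate directly into counts of true positives, false positives, and false negatives. The sketch below is plain Python with a hypothetical function name; F1 is computed as the harmonic mean of precision and recall, and ratios with a zero denominator are returned as 0.0 by convention.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean, which is low if either component is low.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, with 2 true positives, 1 false positive, and 1 false negative, all three metrics come out to 2/3.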

Answered by AI


Q2. What is dropout in neural networks

Ans.

Dropout is a regularization technique used in neural networks to prevent overfitting by randomly setting some neuron outputs to zero during training.

Dropout is a regularization technique used in neural networks to prevent overfitting.
During training, a fraction of neurons are randomly selected and their outputs are set to zero.
This helps prevent complex co-adaptations in neurons and improves generalization.
Dropout is t...

Answered by AI


Data Scientist Interview Questions & Answers

Anonymous

posted on 20 Mar 2024

Interview experience

Average

Difficulty level

Process Duration

Result

Round 1 - Technical

(1 Question)

Q1. Basic ML/DL and statistics questions


Data Scientist Interview Questions & Answers

Anonymous

posted on 9 Oct 2023

Interview experience

Good

Difficulty level

Process Duration

Result

Round 1 - Technical

(1 Question)

Q1. Probability, deep learning basics, machine learning, and simple Python programming questions.

Ans. It will be multiple-choice questions. Duration: 40 minutes.

Answered Anonymously


Round 2 - interview

(1 Question)

Q1. Why this company, work related to your project, and some technical questions on deep learning.


Data Scientist Interview Questions & Answers

Anonymous

posted on 10 May 2025

Interview experience

Poor

Difficulty level

Hard

Process Duration

2-4 weeks

Result

Not Selected

I appeared for an interview before May 2024, where I was asked the following questions.

Q1. Attention Mechanism


Q2. Working of Decision Trees

Ans.

Decision Trees are a supervised learning algorithm used for classification and regression tasks, visualizing decisions as tree-like structures.

Decision Trees split data into subsets based on feature values, creating branches for each decision point.
Each node represents a feature, and each branch represents a decision rule, leading to a leaf node with the final prediction.
They use metrics like Gini impurity or entropy t...
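The two split metrics named above can be written out in a few lines of plain Python. The function names are illustrative, and `weighted_impurity` shows only how a single candidate split would be scored, not a full tree-building procedure.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def weighted_impurity(left, right, impurity=gini):
    """Impurity of a candidate split, weighted by the size of each child node."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)
```

A tree-building algorithm would evaluate `weighted_impurity` for each candidate split and keep the one with the lowest value, that is, the split that leaves the purest child nodes.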

Answered by AI


Q3. Natural language processing


Data Scientist Interview Questions & Answers

Anonymous

posted on 23 May 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

2-4 weeks

Result

Selected

I applied via Campus Placement and was interviewed before May 2023. There were 2 interview rounds.

Round 1 - Aptitude Test

It lasted 45 minutes. Questions were asked on Python, ML, deep learning, and maths.

Round 2 - Technical

(1 Question)

Q1. 1) Explain correlation and covariance. 2) How does logistic regression differ from linear regression?

Ans.

Correlation measures the strength and direction of a linear relationship between two variables, while covariance measures the extent to which two variables change together.

Correlation ranges from -1 to 1, where 1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.
Covariance can be positive, negative, or zero. A positive covariance indicates that as o...

Answered by AI



Interview questions from similar companies

Data Scientist Interview Questions & Answers

Amazon

Anonymous

posted on 26 Sep 2022

I applied via Approached by Company and was interviewed before Sep 2021. There were 3 interview rounds.

Round 1 - Resume Shortlist


Round 2 - Aptitude Test

Explain dynamic programming with memoization

Round 3 - HR

(2 Questions)

Q1. Where are you from, and why are you joining the company


Q2. Why are you joining the company


Interview Preparation Tips

Interview preparation tips for other job seekers - First, they will ask about the breadth of your ML skills and the depth going forward

Info Edge Interview FAQs

How many rounds are there in Info Edge Data Scientist interview?

Info Edge interview process usually has 1-2 rounds. The most common rounds in the Info Edge interview process are Technical, Aptitude Test and One-on-one Round.

How to prepare for Info Edge Data Scientist interview?

Go through your CV in detail and study all the technologies mentioned in it. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Info Edge. The most common topics and skills that interviewers at Info Edge expect are Deep Learning, Machine Learning, Artificial Intelligence, and Natural Language Processing (NLP).

What are the top questions asked in Info Edge Data Scientist interview?

Some of the top questions asked at the Info Edge Data Scientist interview -

What is the mathematical framework behind Linear Regression? Real Time problem ...read more
Suppose you have very high imbalance problem, how and which Model will you choo...read more
How Random Forest Solves the Problem of Overfitti...read more


3.8/5

based on 13 interview experiences

Difficulty level

Moderate 60%

Hard 40%

Duration

Less than 2 weeks 60%

2-4 weeks 40%


Info Edge Data Scientist Salary

based on 84 salaries

₹19.9 L/yr - ₹34.1 L/yr

62% more than the average Data Scientist Salary in India


Info Edge Salaries in India

Designation	Salaries reported	Salary range
Senior Executive	758	₹3.4 L/yr - ₹8 L/yr
Sales Executive	658	₹6 L/yr - ₹15 L/yr
Assistant Manager	615	₹4.4 L/yr - ₹9.5 L/yr
Associate Senior Executive	602	₹2.8 L/yr - ₹6.2 L/yr
Senior Software Engineer	369	₹14 L/yr - ₹25 L/yr