C5i

3.8

based on 426 Reviews

C5i Data Scientist Interview Questions and Answers

Updated 17 Dec 2024

9 Interview questions

A Data Scientist was asked 12mo ago

Q. What is TF-IDF in NLP?

Ans.

TF-IDF is a numerical statistic that reflects the importance of a word in a document relative to a collection of documents.

TF-IDF stands for Term Frequency-Inverse Document Frequency
It is used in Natural Language Processing (NLP) to determine the importance of a word in a document
TF-IDF is calculated by multiplying the term frequency (TF) by the inverse document frequency (IDF)
It helps in identifying the most impo...
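The TF × IDF product described above can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization, tf = count / document length, and idf = log(N / df); libraries often use smoothed variants, and the function name `tf_idf` and the sample documents are made up for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document (unsmoothed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical toy corpus, not from the interview.
docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
```

A word that appears in every document ("the") gets a score of zero, while a word unique to one document ("dog") scores higher than one shared by two documents ("cat").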

A Data Scientist was asked

Q. Why Machine Learning?

Ans.

Machine learning enables computers to learn from data and make predictions or decisions without being explicitly programmed.

Machine learning can automate and optimize complex processes
It can help identify patterns and insights in large datasets
It can improve accuracy and efficiency in decision-making
Examples include image recognition, natural language processing, and predictive analytics
It can also be used for ano...

A Data Scientist was asked

Q. Explain the KNN Algorithm.

Ans.

KNN is a non-parametric algorithm used for classification and regression tasks.

KNN stands for K-Nearest Neighbors.
It works by finding the K closest data points to a given test point.
The class or value of the test point is then determined by the majority class or average value of the K neighbors.
KNN can be used for both classification and regression tasks.
It is a simple and easy-to-understand algorithm, but can be ...
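The classification case described above might look like the following sketch. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy points are illustrative only.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]
prediction = knn_predict(train_points, train_labels, (2, 2), k=3)
```

For regression, the vote would be replaced by the average of the neighbors' values. Every prediction scans the whole training set, which is why KNN becomes expensive on large datasets.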

A Data Scientist was asked

Q. Why did you choose the Data Science field?

Ans.

I chose the Data Science field because of its potential to solve complex problems and make a positive impact on society.

Fascination with data and its potential to drive insights
Desire to solve complex problems and make a positive impact on society
Opportunity to work with cutting-edge technology and tools
Ability to work in a variety of industries and domains
Examples: Predictive maintenance in manufacturing, fraud detec...

A Data Scientist was asked

Q. Can we use a confusion matrix in Linear Regression?

Ans.

No, a confusion matrix is not used in Linear Regression.

Confusion matrix is used to evaluate classification models.
Linear Regression is a regression model, not a classification model.
Evaluation metrics for Linear Regression include R-squared, Mean Squared Error, etc.
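The two regression metrics named above can be computed directly from predictions. This is a small sketch with made-up values; the function names are illustrative.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical actual and predicted values.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]
mse = mean_squared_error(y_true, y_pred)  # 0.1875
r2 = r_squared(y_true, y_pred)            # 0.9625
```

Both metrics work on continuous errors, whereas a confusion matrix needs discrete predicted classes to count.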

A Data Scientist was asked

Q. Explain the Confusion Matrix.

Ans.

A confusion matrix is a table used to evaluate the performance of a classification model.

For binary classification it is a 2x2 matrix that shows the number of true positives, false positives, true negatives, and false negatives; with N classes it is an NxN matrix.
It helps in calculating various metrics like accuracy, precision, recall, and F1 score.
It is useful in identifying the strengths and weaknesses of a model and improving its performance.
Example: In a binary classificat...
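For the binary case, the four counts and the metrics derived from them can be sketched as follows. The labels below are invented for illustration, and `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

# Hypothetical actual and predicted labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)  # 3, 1, 3, 1

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```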

A Data Scientist was asked

Q. What is the difference between Linear Regression and Logistic Regression?

Ans.

Linear Regression is used for predicting continuous numerical values, while Logistic Regression is used for predicting binary categorical values.

Linear Regression predicts a continuous output, while Logistic Regression predicts a binary output.
Linear Regression uses a linear equation to model the relationship between the independent and dependent variables, while Logistic Regression uses a logistic function.
Linear...
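The difference in output between the two models can be seen in a short sketch. Both start from the same linear combination; the logistic model passes it through the sigmoid function so the result lies between 0 and 1. The coefficient values here are arbitrary assumptions.

```python
import math

def linear_output(x, intercept, slope):
    """Linear regression prediction: an unbounded continuous value."""
    return intercept + slope * x

def logistic_output(x, intercept, slope):
    """Logistic regression prediction: a probability between 0 and 1."""
    z = intercept + slope * x
    return 1 / (1 + math.exp(-z))

continuous_value = linear_output(2, intercept=1, slope=3)  # 7
probability = logistic_output(0, intercept=0, slope=1)     # 0.5
# A class label is obtained by thresholding the probability, commonly at 0.5.
predicted_class = 1 if logistic_output(4, intercept=0, slope=1) >= 0.5 else 0
```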

A Data Scientist was asked

Q. What is your understanding of Linear Regression?

Ans.

Linear Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables.

It assumes a linear relationship between the dependent and independent variables.
The equation of a simple linear regression is Y = a + bX + e, where Y is the dependent variable, X is the independent variable, a is the intercept, b is the slope, and e is the error term.
Multiple ...
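For the simple case Y = a + bX + e, the intercept and slope can be estimated with ordinary least squares. The sketch below uses the closed-form formulas b = Sxy / Sxx and a = mean(Y) - b * mean(X) on made-up data; the function name is illustrative.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (a, b), the least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x, so the error term is zero.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_simple_linear_regression(xs, ys)  # a = 1.0, b = 2.0
```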

A Data Scientist was asked

Q. Explain Random Forest and Decision Tree.

Ans.

Random Forest is an ensemble learning method that builds multiple decision trees and combines their outputs to improve accuracy.

Random Forest is a type of supervised learning algorithm used for classification and regression tasks.
It creates multiple decision trees and combines their outputs to make a final prediction.
Each decision tree is built using a random subset of features and data points to reduce overfittin...
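The ideas above (bootstrap samples, random feature choice, combined votes) can be illustrated with a deliberately simplified sketch. It is not a full random forest: each "tree" here is a depth-1 stump on one randomly chosen feature, which stands in for the per-split feature subsampling that real implementations use. All names and the toy data are assumptions made for the example.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature):
    """Fit a depth-1 tree on one feature: pick the split with the fewest errors."""
    values = sorted(set(row[feature] for row in rows))
    # Fallback when no split is possible: always predict the majority label.
    best = (len(labels) + 1, None, majority(labels), majority(labels))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    if threshold is None:
        return left_label
    return left_label if row[feature] <= threshold else right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw row indices with replacement.
        indices = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in indices]
        sample_labels = [labels[i] for i in indices]
        # Each stump sees one randomly chosen feature.
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest

def predict_forest(forest, row):
    """Final prediction is the majority vote over all trees."""
    return majority([predict_stump(stump, row) for stump in forest])

# Hypothetical toy data: class 0 has small feature values, class 1 large ones.
rows = [(1.0, 1.2), (1.5, 2.0), (2.0, 1.0), (1.2, 1.5),
        (8.0, 8.2), (8.5, 9.0), (9.0, 8.0), (8.2, 8.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
```

A single decision tree would be trained once on all rows and features; the forest trades that single view for many noisier ones whose errors tend to cancel in the vote.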

C5i Data Scientist Interview Experiences

4 interviews found

Data Scientist Interview Questions & Answers

Kushal Kulkarni

posted on 18 Jun 2024

Interview experience: Excellent
Difficulty level: Easy
Process Duration: Less than 2 weeks
Result: Not Selected

I appeared for an interview in May 2024.

Round 1 - Assignment

Questions based on ML, Python, and data visualization

Round 2 - Technical

(2 Questions)

Q1. What is TF-IDF in NLP?

Ans.

TF-IDF is a numerical statistic that reflects the importance of a word in a document relative to a collection of documents.

TF-IDF stands for Term Frequency-Inverse Document Frequency
It is used in Natural Language Processing (NLP) to determine the importance of a word in a document
TF-IDF is calculated by multiplying the term frequency (TF) by the inverse document frequency (IDF)
It helps in identifying the most important...

Answered by AI

Q2. Python coding questions based on lists

Interview Preparation Tips

Interview preparation tips for other job seekers - Practice Python

Data Scientist Interview Questions & Answers

Anonymous

posted on 20 Jun 2024

Interview experience: Excellent

Round 1 - Assignment

ML, DL, Python, NLP, data visualization

Round 2 - Technical

(1 Question)

Q1. Explain TF-IDF in NLP

Ans.

TF-IDF is a numerical statistic that reflects the importance of a word in a document relative to a collection of documents.

TF-IDF stands for Term Frequency-Inverse Document Frequency.
It is used in Natural Language Processing (NLP) to determine the importance of a word in a document.
TF-IDF is calculated by multiplying the term frequency (TF) of a word by the inverse document frequency (IDF) of the word.
It helps in ident...

Answered by AI

Data Scientist Interview Questions & Answers

Anonymous

posted on 17 Dec 2024

Interview experience: Excellent
Difficulty level: Easy
Process Duration: Less than 2 weeks
Result: Selected

I applied via Naukri.com and was interviewed before Dec 2023. There were 3 interview rounds.

Round 1 - Coding Test

Test of basic Python data structures, including lists, tuples, and dictionaries, as well as loops and conditional statements.

Round 2 - Case Study

Framework and requirements for chatbot implementation.

Round 3 - HR

(1 Question)

Q1. Salary discussion

Data Scientist Interview Questions & Answers

rustam garg

posted on 4 May 2019

I applied via Recruitment Consultant and was interviewed in Dec 2018. There were 3 interview rounds.

Interview Questionnaire

11 Questions

Q1. Why Machine Learning?

Q2. Why did you choose the Data Science field?

Ans.

I chose the Data Science field because of its potential to solve complex problems and make a positive impact on society.

Fascination with data and its potential to drive insights
Desire to solve complex problems and make a positive impact on society
Opportunity to work with cutting-edge technology and tools
Ability to work in a variety of industries and domains
Examples: Predictive maintenance in manufacturing, fraud detection ...

Answered by AI

Q3. What about Linear Regression? (Theory Part)

Ans.

Linear Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables.

It assumes a linear relationship between the dependent and independent variables.
The equation of a simple linear regression is Y = a + bX + e, where Y is the dependent variable, X is the independent variable, a is the intercept, b is the slope, and e is the error term.
Multiple linea...

Answered by AI

Q4. What is the difference between Linear Regression and Logistic Regression?

Ans.

Linear Regression is used for predicting continuous numerical values, while Logistic Regression is used for predicting binary categorical values.

Linear Regression predicts a continuous output, while Logistic Regression predicts a binary output.
Linear Regression uses a linear equation to model the relationship between the independent and dependent variables, while Logistic Regression uses a logistic function.
Linear Regr...

Answered by AI

Q5. Explain the Confusion Matrix.

Ans.

A confusion matrix is a table used to evaluate the performance of a classification model.

For binary classification it is a 2x2 matrix that shows the number of true positives, false positives, true negatives, and false negatives; with N classes it is an NxN matrix.
It helps in calculating various metrics like accuracy, precision, recall, and F1 score.
It is useful in identifying the strengths and weaknesses of a model and improving its performance.
Example: In a binary classification p...

Answered by AI

Q6. Can we use a confusion matrix in Linear Regression?

Ans.

No, a confusion matrix is not used in Linear Regression.

Confusion matrix is used to evaluate classification models.
Linear Regression is a regression model, not a classification model.
Evaluation metrics for Linear Regression include R-squared, Mean Squared Error, etc.

Answered by AI

Q7. Explain the KNN Algorithm.

Ans.

KNN is a non-parametric algorithm used for classification and regression tasks.

KNN stands for K-Nearest Neighbors.
It works by finding the K closest data points to a given test point.
The class or value of the test point is then determined by the majority class or average value of the K neighbors.
KNN can be used for both classification and regression tasks.
It is a simple and easy-to-understand algorithm, but can be compu...

Answered by AI

Q8. Explain Random Forest and Decision Tree.

Ans.

Random Forest is an ensemble learning method that builds multiple decision trees and combines their outputs to improve accuracy.

Random Forest is a type of supervised learning algorithm used for classification and regression tasks.
It creates multiple decision trees and combines their outputs to make a final prediction.
Each decision tree is built using a random subset of features and data points to reduce overfitting.
Ran...

Answered by AI

Q9. One tricky mathematical question.

Q10. What projects have you done?

Ans.

I have worked on various projects involving data analysis, machine learning, and predictive modeling.

Developed a predictive model to forecast customer churn for a telecommunications company.
Built a recommendation system using collaborative filtering for an e-commerce platform.
Performed sentiment analysis on social media data to understand customer opinions and preferences.
Implemented a fraud detection system using anom...

Answered by AI

Q11. I didn't get shortlisted for the 2nd round.

Interview Preparation Tips

General Tips: Anyone who wants to enter the data science field should be genuinely interested in the field, not just the money. They should be good at Statistics, Probability, and the theory behind ML algorithms.
They will ask about the projects you have mentioned in your resume, and all the questions will come from that part.
Skills: Communication, Body Language, Problem Solving, Analytical Skills
Duration: 1-4 weeks

Interview questions from similar companies

Data Scientist Interview Questions & Answers

Mu Sigma

Sayantan Ghosh

posted on 24 May 2017

Interview Questionnaire

1 Question

Q1. What do your friends think of you?

Ans.

My friends think of me as reliable, supportive, and always up for a good time.

Reliable - always there when they need help or support
Supportive - willing to listen and offer advice
Fun-loving - enjoys socializing and trying new things

Answered by AI

Interview Preparation Tips

Round: Resume Shortlist
Experience: After the resume shortlist we had an aptitude round.
Tips: Answer according to your own judgement. Don't try to be too precise.

Round: HR Interview
Experience: I said they think I am a workaholic as I prefer to complete my work before chilling with them.

College Name: NIT Durgapur

Data Scientist Interview Questions & Answers

Mu Sigma

Archit Ojha

posted on 3 Oct 2015

Interview Preparation Tips

Round: HR Interview
Experience: Interview at 11 pm. Stressed environment, close to stress interview.
SELECTION PROCEDURE:
1. Online Test
2. GD
3. PI (HR)
GD TOPICS:
Topic 1: How can the education system benefit from interdisciplinary methods?
Topic 2: Interconnected problems in the field of movie making.
INTERVIEW EXPERIENCE:
So you can speak German? Describe MS Dhoni in German. (They opened Google Translate to cross-check the words they wanted translated into both German and Spanish.)
Your profile speaks of an inclination towards software skills. Why do you want to join an analytics company?
Give two reasons why you are sitting here interviewing for the post of data scientist rather than applying for a software engineer role, when this CV speaks highly of computer science.
What is the Finite Element Method? Explain.
How relevant is your work in Computer Vision?
Break down the tagline of Audi and translate accordingly.
What is "Technik für Mobel"?
What are your current projects? Answer: Microsoft Xbox Kinect, gesture recognition.
Counter question: But at Mu Sigma you'd be doing far simpler stuff?
Counter suggestion: Why don't you go for MS?

Tips: Keep your cool during counter questions. Prepare your profile and CV well. The rest comes down to your hard work, your personal talents, and the skills you have picked up over the internet.

Skills: Ability To Cope Up With Stress, Spanish, German, Finite Element Modeling - FEM, Foreign Language
College Name: NIT Raipur
Funny Moments: Another HR enters in the midst of my interview and asks with bewildered amazement : What language is he speaking?
The other HR, "German".

Data Scientist Interview Questions & Answers

Clarivate

Anonymous

posted on 11 Mar 2022

I applied via Approached by Company and was interviewed in Feb 2022. There were 2 interview rounds.

Round 1 - HR

(5 Questions)

Q1. What are your salary expectations?

Q2. What is your family background?

Q3. Where do you see yourself in 5 years?

Q4. What are your strengths and weaknesses?

Q5. Tell me about yourself.

Round 2 - Assignment

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare for the Critical Behavioral Questions

Data Scientist Interview Questions & Answers

Clarivate

Anonymous

posted on 23 Oct 2024

Interview experience: Good
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: Selected

I applied via LinkedIn and was interviewed before Oct 2023. There were 2 interview rounds.

Round 1 - Coding Test

Graph based question, acyclic graph

Round 2 - Technical

(4 Questions)

Q1. Probability question

Q2. ML design interview

Q3. Algorithm coding

Q4. NLP based design question

Data Scientist Interview Questions & Answers

Acuity Knowledge Partners

Anonymous

posted on 11 Sep 2024

Interview experience: Average
Difficulty level: Moderate
Process Duration: Less than 2 weeks

I applied via Company Website and was interviewed in Aug 2024. There was 1 interview round.

Round 1 - One-on-one

(2 Questions)

Q1. Explain the RAG pipeline?

Ans.

A RAG (Retrieval-Augmented Generation) pipeline retrieves documents relevant to a query and passes them to a language model as context for generating the answer.

RAG stands for Retrieval-Augmented Generation
Source documents are split into chunks, embedded, and stored in a searchable index
At query time, the chunks most similar to the query are retrieved
The retrieved chunks are added to the prompt so the model can ground its answer in them

Answered by AI

Q2. Explain the confusion matrix.

Ans.

The confusion matrix is used to evaluate the performance of a classification model by comparing predicted values with actual values.

A confusion matrix is a table that describes the performance of a classification model.
For binary classification it consists of four counts: True Positive, True Negative, False Positive, and False Negative.
These metrics are used to calculate other evaluation metrics like accuracy, precision, recall, and F1 s...

Answered by AI

Data Scientist Interview Questions & Answers

Nielsen

Anonymous

posted on 6 Apr 2024

Interview experience: Good
Difficulty level: Moderate
Process Duration: 6-8 weeks
Result: Selected

I applied via Naukri.com and was interviewed in Oct 2023. There were 4 interview rounds.

Round 1 - Coding Test

SQL, Python, statistics MCQs, and an aptitude test. These were medium-level questions.

Round 2 - Technical

(3 Questions)

Q1. SQL and Python, time complexity

Q2. Make 2 lists a=[1,2,3,4] b=[9,8,5,5,2,3,3,4,1,1,10,9,2,3,4,10,10,9,7,7,8] Write a program to remove duplicate of b and keep only those elements of b which are not present in a, and the final list should ...

Ans.

Remove duplicates from list b, keep elements not in list a, and sort in ascending order.

Create a set from list b to remove duplicates
Use list comprehension to keep elements not in list a
Sort the final list in ascending order
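The three steps above can be written as a short Python snippet using the lists from the question. The ascending sort follows the answer's reading of the truncated question, so treat that part as an assumption.

```python
a = [1, 2, 3, 4]
b = [9, 8, 5, 5, 2, 3, 3, 4, 1, 1, 10, 9, 2, 3, 4, 10, 10, 9, 7, 7, 8]

# A set gives O(1) membership checks, which matters for the
# time-complexity follow-up questions mentioned in this round.
exclude = set(a)

# set(b) removes duplicates; the generator drops values present in a;
# sorted() returns the survivors in ascending order.
result = sorted(x for x in set(b) if x not in exclude)
print(result)  # [5, 7, 8, 9, 10]
```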

Answered by AI

Q3. SQL question: remove duplicates from a table tab1.

Ans.

Use the DISTINCT keyword in SQL to return rows without duplicates; note that this filters the query result and does not modify the table itself.

Use the SELECT DISTINCT statement to retrieve unique rows from the table.
Identify the columns that should be used to determine uniqueness.
Example: SELECT DISTINCT column1, column2 FROM tab1;
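The query can be tried out with Python's built-in sqlite3 module. The sketch below also shows one way to actually delete the duplicate rows, since SELECT DISTINCT only affects the query result. The sample rows are invented, and the `rowid` approach is specific to SQLite; other databases typically use a window function such as ROW_NUMBER() instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (column1 TEXT, column2 INTEGER)")
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?)",
    [("a", 1), ("a", 1), ("b", 2), ("b", 2), ("c", 3)],  # hypothetical rows
)

# SELECT DISTINCT returns unique rows but leaves the table unchanged.
distinct_rows = conn.execute(
    "SELECT DISTINCT column1, column2 FROM tab1 ORDER BY column1"
).fetchall()
rows_before = conn.execute("SELECT COUNT(*) FROM tab1").fetchone()[0]

# To delete duplicates, keep only the lowest rowid in each group.
conn.execute(
    """
    DELETE FROM tab1
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM tab1 GROUP BY column1, column2
    )
    """
)
rows_after = conn.execute("SELECT COUNT(*) FROM tab1").fetchone()[0]
```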

Answered by AI

Round 3 - Case Study

Given 2 case studies on data science and asked about different possibilities to improve the models.

How to work with an imbalanced dataset.
How to remove null values; what is feature engineering.
What is PCA?
How does XGBoost work?

Round 4 - Project discussion

(1 Question)

Q1. What was your last project? Tell me about it in detail. There were different technical questions related to my project.

Interview Preparation Tips

Interview preparation tips for other job seekers - Be confident and practice SQL and Python, mainly pandas and NumPy. You should have good knowledge of time complexity.

All the metrics for evaluating a model.
Linear regression, logistic regression, random forest, decision tree, AdaBoost, gradient boosting, and XGBoost in detail.

Recall, precision, ROC curve, AUC, F1 score, MSE, MAE, R2, adjusted R2 score.

Is it possible for the R2 score to be negative?
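On the last point: yes, R2 can be negative when a model's predictions are worse than simply predicting the mean of the actual values. A small sketch with made-up numbers, using the usual definition R2 = 1 - SS_res / SS_tot:

```python
def r_squared(y_true, y_pred):
    """R2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical predictions that move in the opposite direction to the truth.
y_true = [1, 2, 3]
y_pred = [3, 2, 1]
r2 = r_squared(y_true, y_pred)  # 1 - 8/2 = -3.0
```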

C5i Interview FAQs

How many rounds are there in C5i Data Scientist interview?

C5i interview process usually has 2-3 rounds. The most common rounds in the C5i interview process are Technical, Assignment and Coding Test.

How to prepare for C5i Data Scientist interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at C5i. The most common topics and skills that interviewers at C5i expect are Python, Data Science, Machine Learning, SQL and Statistical Modeling.

What are the top questions asked in C5i Data Scientist interview?

Some of the top questions asked at the C5i Data Scientist interview -

4. What is the difference between Linear Regression and Logistic Regressi...read more
2. Why did you choose Data Science Fie...read more
3. What about Linear Regression? (Theory Pa...read more

5/5 based on 3 interview experiences
Difficulty level: Easy 100%
Duration: Less than 2 weeks 100%

Mu Sigma Interview Questions	2.6 • 240 Interviews
IKS Health Interview Questions	3.6 • 240 Interviews
Crisil Interview Questions	3.6 • 202 Interviews
Acuity Knowledge Partners Interview Questions	3.3 • 160 Interviews
Indegene Interview Questions	3.4 • 158 Interviews
Nielsen Interview Questions	3.7 • 133 Interviews
Baroda Global Shared Services Interview Questions	4.0 • 105 Interviews
Netscribes Interview Questions	2.7 • 89 Interviews
Clarivate Interview Questions	3.7 • 86 Interviews
Athenahealth Technology Interview Questions	4.1 • 84 Interviews

C5i Data Scientist Salary

based on 52 salaries

₹4 L/yr - ₹14 L/yr

38% less than the average Data Scientist Salary in India

C5i Salaries in India

Analyst 330 salaries	₹5 L/yr - ₹10.3 L/yr
Senior Analyst 220 salaries	₹7.5 L/yr - ₹12 L/yr
Data Analyst 181 salaries	₹2.5 L/yr - ₹9 L/yr
Associate Manager 135 salaries	₹7.8 L/yr - ₹25 L/yr
Junior Analyst 122 salaries	₹2.9 L/yr - ₹7 L/yr