C5i Data Scientist Interview Questions and Answers

Updated 17 Dec 2024

9 Interview questions

A Data Scientist was asked 12mo ago
Q. What is TF-IDF in NLP?
Ans. 

TF-IDF is a numerical statistic that reflects the importance of a word in a document relative to a collection of documents.

  • TF-IDF stands for Term Frequency-Inverse Document Frequency

  • It is used in Natural Language Processing (NLP) to determine the importance of a word in a document

  • TF-IDF is calculated by multiplying the term frequency (TF) by the inverse document frequency (IDF)

  • It helps in identifying the most impo...
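The multiplication described above can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization, tf = count / document length, and idf = ln(N / df); the function name `tf_idf` is hypothetical, and libraries typically use smoothed variants of these formulas.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumes tf = count / document length and idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
# "the" appears in every document, so its weight is 0;
# "sat" appears in only one document, so it scores highest in the first one.
print(scores[0])
```

A term that occurs in every document gets an IDF of ln(1) = 0, which is why common words drop out of the ranking.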

A Data Scientist was asked
Q. Why Machine Learning?
Ans. 

Machine learning enables computers to learn from data and make predictions or decisions without being explicitly programmed.

  • Machine learning can automate and optimize complex processes

  • It can help identify patterns and insights in large datasets

  • It can improve accuracy and efficiency in decision-making

  • Examples include image recognition, natural language processing, and predictive analytics

  • It can also be used for ano...

A Data Scientist was asked
Q. Explain the KNN Algorithm.
Ans. 

KNN is a non-parametric algorithm used for classification and regression tasks.

  • KNN stands for K-Nearest Neighbors.

  • It works by finding the K closest data points to a given test point.

  • The class or value of the test point is then determined by the majority class or average value of the K neighbors.

  • KNN can be used for both classification and regression tasks.

  • It is a simple and easy-to-understand algorithm, but can be ...
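The voting step described above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and a plain majority vote; the function name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority class among the k neighbours (for regression, average the values instead).
    return Counter(nearest_labels).most_common(1)[0][0]

points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (1.2, 1.4)))  # a
print(knn_predict(points, labels, (8.7, 8.2)))  # b
```

Every prediction scans the full training set, which is the main cost of this brute-force approach on large datasets.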

A Data Scientist was asked
Q. Why did you choose the Data Science field?
Ans. 

I chose the Data Science field because of its potential to solve complex problems and make a positive impact on society.

  • Fascination with data and its potential to drive insights

  • Desire to solve complex problems and make a positive impact on society

  • Opportunity to work with cutting-edge technology and tools

  • Ability to work in a variety of industries and domains

  • Examples: Predictive maintenance in manufacturing, fraud detec...

A Data Scientist was asked
Q. Can we use a confusion matrix in Linear Regression?
Ans. 

No, a confusion matrix is not used in Linear Regression.

  • Confusion matrix is used to evaluate classification models.

  • Linear Regression is a regression model, not a classification model.

  • Evaluation metrics for Linear Regression include R-squared, Mean Squared Error, etc.
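The two regression metrics named above can be computed directly from their definitions. This is a minimal sketch with made-up numbers; the function names are hypothetical. It also shows that R-squared becomes negative when predictions are worse than always predicting the mean.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]
good_pred = [2.5, 5.5, 7.0, 9.0]   # close to the actual values
bad_pred = [9.0, 7.0, 5.0, 3.0]    # worse than predicting the mean

print(mean_squared_error(y_true, good_pred), r_squared(y_true, good_pred))
print(mean_squared_error(y_true, bad_pred), r_squared(y_true, bad_pred))  # 20.0 -3.0
```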

A Data Scientist was asked
Q. Explain the Confusion Matrix.
Ans. 

A confusion matrix is a table used to evaluate the performance of a classification model.

  • For binary classification it is a 2x2 matrix that shows the number of true positives, false positives, true negatives, and false negatives; with N classes it is N x N.

  • It helps in calculating various metrics like accuracy, precision, recall, and F1 score.

  • It is useful in identifying the strengths and weaknesses of a model and improving its performance.

  • Example: In a binary classificat...
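The four cells and the metrics derived from them can be sketched for the binary case as follows. The labels are made up for illustration, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def summary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)  # (3, 1, 5, 1)
print(summary_metrics(*counts))
```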

A Data Scientist was asked
Q. What is the difference between Linear Regression and Logistic Regression?
Ans. 

Linear Regression is used for predicting continuous numerical values, while Logistic Regression is used for predicting categorical outcomes, most commonly binary ones.

  • Linear Regression predicts a continuous output, while Logistic Regression predicts the probability of a class, which is then thresholded into a label.

  • Linear Regression uses a linear equation to model the relationship between the independent and dependent variables, while Logistic Regression uses a logistic function.

  • Linear...
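The contrast between the linear equation and the logistic function can be sketched with a single feature. The coefficients here are made up rather than fitted, and the function names are hypothetical.

```python
import math

def linear_output(x, intercept, slope):
    """Linear regression: the prediction is the linear combination itself."""
    return intercept + slope * x

def logistic_probability(x, intercept, slope):
    """Logistic regression: the same linear combination passed through a sigmoid."""
    z = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-z))

def logistic_class(x, intercept, slope, threshold=0.5):
    """Turn the predicted probability into a 0/1 label."""
    return int(logistic_probability(x, intercept, slope) >= threshold)

print(linear_output(2.0, 1.0, 3.0))          # 7.0, an unbounded real value
print(logistic_probability(2.0, -1.0, 1.0))  # a probability between 0 and 1
print(logistic_class(2.0, -1.0, 1.0))        # 1
```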

A Data Scientist was asked
Q. What is your understanding of Linear Regression?
Ans. 

Linear Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables.

  • It assumes a linear relationship between the dependent and independent variables.

  • The equation of a simple linear regression is Y = a + bX + e, where Y is the dependent variable, X is the independent variable, a is the intercept, b is the slope, and e is the error term.

  • Multiple ...
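For the simple case Y = a + bX + e, the intercept and slope can be estimated with the ordinary least squares closed form. This is a minimal sketch on made-up data; the function name is hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
a, b = fit_simple_linear_regression(xs, ys)
print(f"intercept={a:.2f}, slope={b:.2f}")  # intercept=1.09, slope=1.97
```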

A Data Scientist was asked
Q. Explain Random Forest and Decision Tree.
Ans. 

Random Forest is an ensemble learning method that builds multiple decision trees and combines their outputs to improve accuracy.

  • Random Forest is a type of supervised learning algorithm used for classification and regression tasks.

  • It creates multiple decision trees and combines their outputs to make a final prediction.

  • Each decision tree is built using a random subset of features and data points to reduce overfittin...
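The idea of bootstrap samples, random features, and a combined vote can be sketched with very small trees. This is a deliberate simplification and not a full random forest: each tree here is a one-split stump on a single randomly chosen feature, whereas real implementations grow deeper trees and resample features at every split. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature):
    """Fit a depth-1 tree on one feature: choose the threshold with the fewest errors."""
    values = sorted({row[feature] for row in rows})
    best = None  # (errors, threshold, left_label, right_label)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # only one distinct value: always predict the majority class
        label = majority(labels)
        return (feature, float("inf"), label, label)
    return (feature, best[1], best[2], best[3])

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        feature = rng.randrange(len(rows[0]))  # random feature for this tree
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Combine the trees by majority vote."""
    votes = [left if row[feature] <= threshold else right
             for feature, threshold, left, right in forest]
    return majority(votes)

# Two well-separated clusters: class 0 near the origin, class 1 near (10, 10).
rows = [(i / 20, (i * 7 % 20) / 20) for i in range(20)]
rows += [(10 + x, 10 + y) for x, y in rows]
labels = [0] * 20 + [1] * 20
forest = fit_forest(rows, labels)
print(predict(forest, (0.5, 0.5)), predict(forest, (10.5, 10.5)))
```

Because each tree sees a different resample and feature, individual trees can disagree; the vote averages out their individual errors.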

C5i Data Scientist Interview Experiences

4 interviews found

Data Scientist Interview Questions & Answers

Kushal Kulkarni

posted on 18 Jun 2024

Interview experience
5
Excellent
Difficulty level
Easy
Process Duration
Less than 2 weeks
Result
Not Selected

I appeared for an interview in May 2024.

Round 1 - Assignment 

Questions based on ML, Python, and data visualization

Round 2 - Technical 

(2 Questions)

  • Q1. What is TF-IDF in NLP?
  • Ans. 

    TF-IDF is a numerical statistic that reflects the importance of a word in a document relative to a collection of documents.

    • TF-IDF stands for Term Frequency-Inverse Document Frequency

    • It is used in Natural Language Processing (NLP) to determine the importance of a word in a document

    • TF-IDF is calculated by multiplying the term frequency (TF) by the inverse document frequency (IDF)

    • It helps in identifying the most important...

  • Answered by AI
  • Q2. Python coding questions based on list

Interview Preparation Tips

Interview preparation tips for other job seekers - Practice Python
Interview experience
5
Excellent
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Assignment 

ML, DL, Python, NLP, Data Visualization

Round 2 - Technical 

(1 Question)

  • Q1. Explain TF-IDF in NLP
  • Ans. 

    TF-IDF is a numerical statistic that reflects the importance of a word in a document relative to a collection of documents.

    • TF-IDF stands for Term Frequency-Inverse Document Frequency.

    • It is used in Natural Language Processing (NLP) to determine the importance of a word in a document.

    • TF-IDF is calculated by multiplying the term frequency (TF) of a word by the inverse document frequency (IDF) of the word.

    • It helps in ident...

  • Answered by AI
Interview experience
5
Excellent
Difficulty level
Easy
Process Duration
Less than 2 weeks
Result
Selected

I applied via Naukri.com and was interviewed before Dec 2023. There were 3 interview rounds.

Round 1 - Coding Test 

A test of basic Python data structures, including lists, tuples, and dictionaries, as well as loops and conditional statements.

Round 2 - Case Study 

Framework and requirements for chatbot implementation.

Round 3 - HR 

(1 Question)

  • Q1. Salary discussion

Data Scientist Interview Questions & Answers

rustam garg

posted on 4 May 2019

I applied via Recruitment Consultant and was interviewed in Dec 2018. There were 3 interview rounds.

Interview Questionnaire 

11 Questions

  • Q1. Why Machine Learning?
  • Q2. Why did you choose the Data Science field?
  • Ans. 

    I chose the Data Science field because of its potential to solve complex problems and make a positive impact on society.

    • Fascination with data and its potential to drive insights

    • Desire to solve complex problems and make a positive impact on society

    • Opportunity to work with cutting-edge technology and tools

    • Ability to work in a variety of industries and domains

    • Examples: Predictive maintenance in manufacturing, fraud detection ...

  • Answered by AI
  • Q3. What about Linear Regression? (Theory Part)
  • Ans. 

    Linear Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables.

    • It assumes a linear relationship between the dependent and independent variables.

    • The equation of a simple linear regression is Y = a + bX + e, where Y is the dependent variable, X is the independent variable, a is the intercept, b is the slope, and e is the error term.

    • Multiple linea...

  • Answered by AI
  • Q4. What is the difference between Linear Regression and Logistic Regression?
  • Ans. 

    Linear Regression is used for predicting continuous numerical values, while Logistic Regression is used for predicting categorical outcomes, most commonly binary ones.

    • Linear Regression predicts a continuous output, while Logistic Regression predicts the probability of a class, which is then thresholded into a label.

    • Linear Regression uses a linear equation to model the relationship between the independent and dependent variables, while Logistic Regression uses a logistic function.

    • Linear Regr...

  • Answered by AI
  • Q5. Explain the Confusion Matrix.
  • Ans. 

    A confusion matrix is a table used to evaluate the performance of a classification model.

    • For binary classification it is a 2x2 matrix that shows the number of true positives, false positives, true negatives, and false negatives; with N classes it is N x N.

    • It helps in calculating various metrics like accuracy, precision, recall, and F1 score.

    • It is useful in identifying the strengths and weaknesses of a model and improving its performance.

    • Example: In a binary classification p...

  • Answered by AI
  • Q6. Can we use a confusion matrix in Linear Regression?
  • Ans. 

    No, a confusion matrix is not used in Linear Regression.

    • Confusion matrix is used to evaluate classification models.

    • Linear Regression is a regression model, not a classification model.

    • Evaluation metrics for Linear Regression include R-squared, Mean Squared Error, etc.

  • Answered by AI
  • Q7. Explain the KNN Algorithm.
  • Ans. 

    KNN is a non-parametric algorithm used for classification and regression tasks.

    • KNN stands for K-Nearest Neighbors.

    • It works by finding the K closest data points to a given test point.

    • The class or value of the test point is then determined by the majority class or average value of the K neighbors.

    • KNN can be used for both classification and regression tasks.

    • It is a simple and easy-to-understand algorithm, but can be compu...

  • Answered by AI
  • Q8. Explain Random Forest and Decision Tree.
  • Ans. 

    Random Forest is an ensemble learning method that builds multiple decision trees and combines their outputs to improve accuracy.

    • Random Forest is a type of supervised learning algorithm used for classification and regression tasks.

    • It creates multiple decision trees and combines their outputs to make a final prediction.

    • Each decision tree is built using a random subset of features and data points to reduce overfitting.

    • Ran...

  • Answered by AI
  • Q9. One tricky mathematical question!
  • Q10. What are the projects you have done?
  • Ans. 

    I have worked on various projects involving data analysis, machine learning, and predictive modeling.

    • Developed a predictive model to forecast customer churn for a telecommunications company.

    • Built a recommendation system using collaborative filtering for an e-commerce platform.

    • Performed sentiment analysis on social media data to understand customer opinions and preferences.

    • Implemented a fraud detection system using anom...

  • Answered by AI
  • Q11. I didn't get shortlisted for 2nd Round.

Interview Preparation Tips

General Tips: Anyone who wants to enter the data science field should be genuinely interested in the field, not just the money. They should be strong in statistics, probability, and the theory behind ML algorithms.
They will ask about the projects mentioned in your resume, and all the questions will come from that part.
Skills: Communication, Body Language, Problem Solving, Analytical Skills
Duration: 1-4 weeks

Interview questions from similar companies

Interview Questionnaire 

1 Question

  • Q1. What do your friends think of you?
  • Ans. 

    My friends think of me as reliable, supportive, and always up for a good time.

    • Reliable - always there when they need help or support

    • Supportive - willing to listen and offer advice

    • Fun-loving - enjoys socializing and trying new things

  • Answered by AI

Interview Preparation Tips

Round: Resume Shortlist
Experience: After the resume shortlist we had an aptitude round.
Tips: Answer according to your own judgement. Don't try to be too precise.

Round: HR Interview
Experience: I said they think I am a workaholic as I prefer to complete my work before chilling with them.

College Name: NIT Durgapur

Interview Preparation Tips

Round: HR Interview
Experience: Interview at 11 pm. Stressed environment, close to stress interview.
SELECTION PROCEDURE:
1.Online Test
2. GD
3. PI(HR)
GD TOPICS :
Topic 1 : How can education system benefit from interdisciplinary methods.
Topic 2 : Interconnected problems in the field of movie making.
INTERVIEW EXPERIENCE:
  • So you can speak German? Describe MS Dhoni in German. (They opened Google Translate to cross-check the words they wanted translated, in both German and Spanish.)
  • Your profile shows an inclination towards software skills. Why do you want to join an analytics company?
  • Give two reasons why you are interviewing for the post of data scientist rather than applying for a software engineer role, when your CV speaks highly of computer science.
  • What is the Finite Element Method? Explain.
  • How relevant is your work in Computer Vision?
  • Break down the tagline of Audi and translate it accordingly.
  • What is "Technik für Mobel"?
  • What are your current projects? Answer: Microsoft Xbox Kinect, Gesture Recognition.
  • Counter question: But at Musimga you'd be doing far simpler stuff?
  • Counter suggestion: Why don't you go for MS?


Tips: Keep your cool during counter questions. Prepare your profile and CV well. Rest all is your hard work and groomed personal talents and acquired skills you learnt over the internet.

Skills: Ability To Cope Up With Stress, Spanish, German, Finite Element Modeling - FEM, Foreign Language
College Name: NIT Raipur
Funny Moments: Another HR enters in the midst of my interview and asks with bewildered amazement : What language is he speaking?
The other HR, "German".

I applied via Approached by Company and was interviewed in Feb 2022. There were 2 interview rounds.

Round 1 - HR 

(5 Questions)

  • Q1. What are your salary expectations?
  • Q2. What is your family background?
  • Q3. Where do you see yourself in 5 years?
  • Q4. What are your strengths and weaknesses?
  • Q5. Tell me about yourself.
Round 2 - Assignment 

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare for the Critical Behavioral Questions
Interview experience
4
Good
Difficulty level
Moderate
Process Duration
2-4 weeks
Result
Selected

I applied via LinkedIn and was interviewed before Oct 2023. There were 2 interview rounds.

Round 1 - Coding Test 

Graph based question, acyclic graph

Round 2 - Technical 

(4 Questions)

  • Q1. Probability question
  • Q2. ML design interview
  • Q3. Algorithm coding
  • Q4. NLP based design question
Interview experience
3
Average
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
-

I applied via Company Website and was interviewed in Aug 2024. There was 1 interview round.

Round 1 - One-on-one 

(2 Questions)

  • Q1. Explain the RAG pipeline?
  • Ans. 

    In a data science interview, RAG most likely refers to Retrieval-Augmented Generation: a pipeline that retrieves relevant documents and passes them to a language model as context for its answer.

    • RAG stands for Retrieval-Augmented Generation

    • Source documents are split into chunks, embedded, and stored in a searchable index

    • At query time the question is embedded and the most similar chunks are retrieved

    • The retrieved chunks are added to the prompt so that the language model can ground its answer in them

  • Answered by AI
  • Q2. Explain Confusion metrics
  • Ans. 

    Confusion-matrix metrics evaluate the performance of a classification model by comparing predicted values with actual values.

    • Confusion matrix is a table that describes the performance of a classification model.

    • It consists of four counts: True Positive, True Negative, False Positive, and False Negative.

    • These counts are used to calculate other evaluation metrics like accuracy, precision, recall, and F1 s...

  • Answered by AI
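For Q1, if RAG is read as Retrieval-Augmented Generation (its usual meaning in a language-model context), the retrieve-then-prompt flow can be sketched as below. This is a toy illustration under stated assumptions: `embed` is a bag-of-words stand-in for a learned embedding model, the documents are made up, and the final call to a language model is left out. All function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a learned embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, top_k=2):
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, documents, top_k=2):
    """Augment the question with retrieved context; an LLM would receive this prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "TF-IDF weights terms by frequency and rarity",
    "KNN classifies a point by its nearest neighbours",
    "A confusion matrix counts correct and incorrect predictions",
]
query = "what does a confusion matrix count"
print(build_prompt(query, docs, top_k=1))
```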

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
6-8 weeks
Result
Selected

I applied via Naukri.com and was interviewed in Oct 2023. There were 4 interview rounds.

Round 1 - Coding Test 

SQL, Python, statistics MCQs, and an aptitude test. These were medium-level questions.

Round 2 - Technical 

(3 Questions)

  • Q1. SQL and Python, time complexity
  • Q2. Given two lists, a=[1,2,3,4] and b=[9,8,5,5,2,3,3,4,1,1,10,9,2,3,4,10,10,9,7,7,8], write a program that removes the duplicates from b and keeps only those elements of b that are not present in a, and the final list should ...
  • Ans. 

    Remove duplicates from list b, keep elements not in list a, and sort in ascending order.

    • Create a set from list b to remove duplicates

    • Use list comprehension to keep elements not in list a

    • Sort the final list in ascending order

  • Answered by AI
  • Q3. SQL question: remove duplicates from a table tab1
  • Ans. 

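    SELECT DISTINCT returns unique rows in a query result; it does not delete duplicate rows from the stored table.

    • Use the SELECT DISTINCT statement to retrieve unique rows from the table.

    • Identify the columns that should be used to determine uniqueness.

    • Example: SELECT DISTINCT column1, column2 FROM tab1;

    • To remove the duplicates permanently, delete all but one row per group of those columns, or rebuild the table from the SELECT DISTINCT result.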

  • Answered by AI
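Q2 and Q3 can both be sketched in Python. For Q2, ascending order is assumed because the answer above says so (the question text itself is truncated). For Q3, the sketch uses SQLite through the standard `sqlite3` module and assumes a hypothetical layout `tab1(column1, column2)`; the `rowid` trick is SQLite-specific, and other databases would typically use ROW_NUMBER() or a similar approach. Function names are hypothetical.

```python
import sqlite3

def unique_not_in(a, b):
    """Distinct elements of b that are absent from a, in ascending order."""
    return sorted(set(b) - set(a))

def delete_duplicate_rows(conn):
    """Delete duplicate rows from tab1, keeping the lowest rowid in each group.

    Assumes a hypothetical SQLite table tab1(column1, column2).
    """
    conn.execute(
        "DELETE FROM tab1 WHERE rowid NOT IN "
        "(SELECT MIN(rowid) FROM tab1 GROUP BY column1, column2)"
    )

a = [1, 2, 3, 4]
b = [9, 8, 5, 5, 2, 3, 3, 4, 1, 1, 10, 9, 2, 3, 4, 10, 10, 9, 7, 7, 8]
print(unique_not_in(a, b))  # [5, 7, 8, 9, 10]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (column1 TEXT, column2 INTEGER)")
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?)",
    [("x", 1), ("x", 1), ("y", 2), ("x", 1), ("y", 3)],
)
delete_duplicate_rows(conn)
remaining = conn.execute(
    "SELECT column1, column2 FROM tab1 ORDER BY column1, column2"
).fetchall()
print(remaining)  # [('x', 1), ('y', 2), ('y', 3)]
```

Using sets makes the list step linear in the input sizes before the final sort, which is relevant given the time-complexity focus of this round.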
Round 3 - Case Study 

Two data science case studies were given, with questions on different ways to improve the models.

How to work with an imbalanced dataset.
How to remove null values, and what feature engineering is.
What PCA is.
How XGBoost works.

Round 4 - Project discussion 

(1 Question)

  • Q1. Describe your last project in detail. There were various technical questions related to my project.

Interview Preparation Tips

Interview preparation tips for other job seekers - Be confident and practice SQL and Python, mainly pandas and NumPy. You should have good knowledge of time complexity.

Know all the metrics for evaluating a model.
Study linear regression, logistic regression, random forest, decision tree, AdaBoost, gradient boosting, and XGBoost in detail.

Recall, precision, ROC curve, AUC, F1 score, MSE, MAE, R², adjusted R².

Can the R² score be negative?

C5i Interview FAQs

How many rounds are there in C5i Data Scientist interview?
C5i interview process usually has 2-3 rounds. The most common rounds in the C5i interview process are Technical, Assignment and Coding Test.
How to prepare for C5i Data Scientist interview?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at C5i. The most common topics and skills that interviewers at C5i expect are Python, Data Science, Machine Learning, SQL and Statistical Modeling.

Overall Interview Experience Rating

5/5

based on 3 interview experiences

Difficulty level

Easy 100%

Duration

Less than 2 weeks 100%

C5i Data Scientist Salary
based on 52 salaries
₹4 L/yr - ₹14 L/yr
38% less than the average Data Scientist Salary in India

C5i Data Scientist Reviews and Ratings

based on 8 reviews

3.9/5

Rating in categories

Skill development: 3.9
Work-life balance: 3.4
Salary: 3.1
Job security: 4.1
Company culture: 3.5
Promotions: 3.2
Work satisfaction: 3.4

C5i salaries by role

Analyst (330 salaries): ₹5 L/yr - ₹10.3 L/yr
Senior Analyst (220 salaries): ₹7.5 L/yr - ₹12 L/yr
Data Analyst (181 salaries): ₹2.5 L/yr - ₹9 L/yr
Associate Manager (135 salaries): ₹7.8 L/yr - ₹25 L/yr
Junior Analyst (122 salaries): ₹2.9 L/yr - ₹7 L/yr