IndusInd Bank
10+ TVS Sundram Fasteners Interview Questions and Answers
Q1. What is the difference between logistic and linear regression?
Logistic regression is used for binary classification, while linear regression is used for predicting continuous values.
Logistic regression is a classification algorithm, while linear regression is a regression algorithm.
Logistic regression uses a logistic function to model the probability of the binary outcome.
Linear regression uses a linear function to model the relationship between the independent and dependent variables.
Logistic regression predicts discrete outcomes (e.g. ...
Q2. How is a random forest different from a decision tree?
Random forest is an ensemble learning method that uses multiple decision trees to improve prediction accuracy.
Random forest builds multiple decision trees and combines their predictions to reduce overfitting.
Decision trees are prone to overfitting and can be unstable, while random forest is more robust.
Depending on the implementation, random forest can also cope with missing values and categorical variables more robustly than a single decision tree.
Example: Random forest can be used for predicting customer churn in a telecom compa...
Q3. What is the formula of logistic regression?
The formula of logistic regression is a mathematical equation used to model the relationship between a binary dependent variable and one or more independent variables.
The formula is: log(odds) = log(p / (1 - p)) = β0 + β1x1 + β2x2 + ... + βnxn, where p is the probability of the positive outcome.
The probability p is transformed using the logit function to obtain the log-odds.
The independent variables are multiplied by their respective coefficients (β) and summed up with the intercept (β0).
The resulting value is then transformed back to ...
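The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a fitted model: the coefficient values and the helper names `log_odds` and `sigmoid` are assumptions chosen for the example. It applies the inverse of the logit (the sigmoid) to map log-odds to a probability.

```python
import math

def log_odds(x, beta0, betas):
    """Linear predictor: beta0 + beta1*x1 + ... + betan*xn."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))

def sigmoid(z):
    """Inverse of the logit: maps log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only for illustration
beta0, betas = -1.0, [0.5, 0.25]

z = log_odds([2.0, 0.0], beta0, betas)  # -1.0 + 0.5*2.0 + 0.25*0.0 = 0.0
p = sigmoid(z)                          # log-odds of 0 correspond to p = 0.5
print(z, p)
```

Applying the logit to `p`, that is `math.log(p / (1 - p))`, recovers `z`, which is the sense in which the two forms of the formula are equivalent.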
Q4. How do you measure the accuracy of a model?
Model performance can be evaluated with tools such as the confusion matrix, the ROC curve, and the precision-recall curve.
Confusion matrix shows true positives, true negatives, false positives, and false negatives.
ROC curve plots true positive rate against false positive rate.
Precision-recall curve plots precision against recall.
Other metrics include accuracy, F1 score, and AUC-ROC.
Cross-validation can also be used to evaluate model performance.
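As a rough sketch of how the confusion-matrix counts feed the other metrics, the snippet below computes accuracy, precision, recall, and F1 by hand. The labels and predictions are made-up data, and `confusion_counts` is a hypothetical helper written for this example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

# Made-up labels and predictions, for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn, accuracy, precision, recall, f1)
```

On this toy data the counts are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.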
Q5. What are specificity and sensitivity?
Specificity and sensitivity are statistical measures used to evaluate the performance of a binary classification model.
Specificity (true negative rate) measures the proportion of actual negatives that the model correctly identifies: TN / (TN + FP).
Sensitivity (also known as recall or true positive rate) measures the proportion of actual positives that the model correctly identifies: TP / (TP + FN).
Both measures are commonly used in medical diagnostics to assess the accuracy of tests or models.
Specificity and sensitivity are often used ...
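A small sketch can show how both measures depend on the classification threshold. The scores and labels below are made-up, and `sensitivity_specificity` is a hypothetical helper, so treat this as an illustration rather than a reference implementation.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Classify score >= threshold as positive, then return (sensitivity, specificity)."""
    tp = tn = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels and model scores, for illustration only
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

at_half = sensitivity_specificity(y_true, scores, 0.5)      # (0.75, 0.75)
at_quarter = sensitivity_specificity(y_true, scores, 0.25)  # (1.0, 0.5)
print(at_half, at_quarter)
```

Lowering the threshold from 0.5 to 0.25 catches every positive (sensitivity rises to 1.0) at the cost of more false alarms (specificity falls to 0.5).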
Q6. What is the AUC-ROC curve?
The ROC curve is a graphical representation of the performance of a classification model, and the AUC is the area under that curve, which summarizes it as a single number.
AUC-ROC stands for Area Under the Receiver Operating Characteristic curve.
It is used to evaluate the performance of binary classification models.
The curve plots the true positive rate (sensitivity) against the false positive rate (1-specificity) at various classification thresholds.
AUC-ROC ranges from 0 to 1, with a higher value indicating better model performance.
An AUC-ROC of 0.5 repres...
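One way to compute the area without drawing the curve is the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of positive-negative pairs in which the positive example receives the higher score, counting ties as half. The sketch below uses that formulation on made-up data; `auc_roc` is a hypothetical helper name, and the quadratic loop is only suitable for small inputs.

```python
def auc_roc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and model scores, for illustration only
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = auc_roc(y_true, scores)  # 14 of 16 pairs are ordered correctly -> 0.875
print(auc)
```

A model that ranks every positive above every negative scores 1.0 under this definition, and one that reverses the order scores 0.0.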
Q7. What is a t-test?
A t-test is a statistical test used to determine if there is a significant difference between the means of two groups.
It compares the means of two groups and assesses if the difference is statistically significant.
It is commonly used in hypothesis testing and comparing the effectiveness of different treatments or interventions.
There are different types of t-tests, such as independent samples t-test and paired samples t-test.
The t-test calculates a t-value and p-value, where the...
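As an illustration of the independent samples case, the sketch below computes the t-value using Welch's variant, which does not assume equal variances (the source does not say which variant it has in mind, so this choice is an assumption). The two groups are made-up data and `welch_t` is a hypothetical helper. It returns the t-value and the approximate degrees of freedom; turning these into a p-value requires the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up measurements for two independent groups
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.5, 4.0, 4.8, 4.6]

t_value, dof = welch_t(group_a, group_b)
print(t_value, dof)
```

If SciPy is available, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` performs the same test and also reports the p-value.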
Q8. What is linear regression?
Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables.
It assumes a linear relationship between the dependent and independent variables.
It is used to predict the value of the dependent variable based on the values of the independent variable(s).
It can be used for both simple and multiple regression analysis.
Example: predicting the price of a house based on its size or predicting the salary of an employee based on their years of experience.
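The house-price example can be sketched with the closed-form least-squares solution for a single independent variable. The sizes and prices below are made-up numbers that lie exactly on a line, and `fit_simple_linear` is a hypothetical helper name.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up house sizes (square feet) and prices (thousands), for illustration only
sizes = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]

intercept, slope = fit_simple_linear(sizes, prices)
predicted = intercept + slope * 1800  # predicted price for an 1800 sq ft house
print(intercept, slope, predicted)
```

Because the toy data is perfectly linear, the fitted slope is 0.2 (thousand per square foot) with an intercept of zero; real data would leave residuals around the line.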
Q9. What is a random forest?
A random forest is an ensemble learning method that combines multiple decision trees to make predictions.
Random forest is a supervised learning algorithm.
It can be used for both classification and regression tasks.
It creates multiple decision trees and combines their predictions to make a final prediction.
Each decision tree is trained on a bootstrap sample of the training data and considers a random subset of the features at each split.
Random forest reduces overfitting and improves accuracy compared to a single decision tree...
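The ingredients listed above (random rows, random features, combined predictions) can be shown in a toy sketch. This is a deliberate simplification, not a full random forest: each "tree" here is a one-split stump on a single randomly chosen feature, whereas a real implementation grows deeper trees. All function names and the data are assumptions made for the example.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """Pick the threshold on one feature that minimises training errors."""
    best = None
    for thr in sorted({row[feature] for row in X}):
        for left, right in ((0, 1), (1, 0)):  # label on each side of the split
            errors = sum((left if row[feature] <= thr else right) != label
                         for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, feature, thr, left, right)
    return best[1:]

def predict_stump(stump, row):
    feature, thr, left, right = stump
    return left if row[feature] <= thr else right

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample of rows
        feature = rng.randrange(len(X[0]))        # random feature for this stump
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Made-up, well-separated two-feature data
X = [[1, 2], [2, 1], [2, 3], [3, 2], [7, 8], [8, 7], [8, 9], [9, 8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
print(predict_forest(forest, [1, 1]), predict_forest(forest, [9, 9]))
```

An individual stump can be poor if its bootstrap sample happens to be unrepresentative, but the majority vote averages such cases out, which is the intuition behind the robustness claim.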
Q10. What is logistic regression?
Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.
It is used to predict the probability of a binary outcome (0 or 1).
It is a type of regression analysis that uses a logistic function to model the relationship between the dependent and independent variables.
It is commonly used in machine learning and data analysis for classification problems.
Example: predicting whethe...
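To make the idea of predicting a probability concrete, here is a minimal sketch that fits a one-variable logistic regression by batch gradient descent on the log-loss. The data, learning rate, and epoch count are arbitrary assumptions for illustration; in practice a library optimizer would normally be used instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss w.r.t. the log-odds
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Made-up data: larger x makes the outcome 1 more likely, with some overlap
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * 1)   # predicted probability of outcome 1 at x = 1
p_high = sigmoid(b0 + b1 * 8)  # predicted probability of outcome 1 at x = 8
print(b0, b1, p_low, p_high)
```

Thresholding the predicted probability (commonly at 0.5) turns the model into a classifier.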
Q11. What is a z-test?
A z-test is a statistical test used to determine whether a sample mean differs significantly from a known population mean, or whether two population means differ from each other.
It is used when the sample size is large and the population standard deviation is known.
The test compares the sample mean to the population mean using the z-score formula.
The z-score is calculated as the difference between the sample mean and the population mean, divided by the standard error (the population standard deviation divided by the square root of the sample size).
If the calculated z-score falls within the critical region, the n...
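A one-sample z-test along these lines can be sketched with the standard library alone. The numbers are made-up and `one_sample_z` is a hypothetical helper; the two-sided p-value comes from the standard normal distribution.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return the z-score and two-sided p-value for a one-sample z-test."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Made-up example: 100 observations averaging 52 against a population mean of 50 (sd 10)
z, p = one_sample_z(sample_mean=52.0, pop_mean=50.0, pop_sd=10.0, n=100)
print(z, p)
```

Here the standard error is 1.0, so z is 2.0 and the two-sided p-value is roughly 0.046, just under the conventional 0.05 level.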