10+ Nanobi Data & Analytics Interview Questions and Answers
Q1. What is the difference between the Least Squares Method and Maximum Likelihood?
Least Squares Method and Maximum Likelihood are both used to estimate parameters, but differ in their approach.
Least Squares Method minimizes the sum of squared errors between the observed and predicted values.
Maximum Likelihood estimates the parameters that maximize the likelihood of observing the given data.
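Least Squares Method itself needs no distributional assumption to produce estimates; normality of the errors matters only for exact confidence intervals and hypothesis tests.
Maximum Likelihood requires a specified probability distribution for the data. When the errors are independent and normally distributed with constant variance, it gives the same coefficient estimates as Least Squares.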
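As a rough illustration of how the two methods relate, the sketch below fits a one-variable line by least squares and then evaluates a Gaussian negative log-likelihood at that fit. The data, the function names (`fit_ols`, `gaussian_nll`) and the fixed noise scale `sigma` are assumptions made for this example. If the errors are taken to be Gaussian, the least squares fit should also maximize the likelihood, so moving away from it should not lower the negative log-likelihood.

```python
import math

def fit_ols(xs, ys):
    """Closed-form least squares fit for y = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1

def gaussian_nll(b0, b1, xs, ys, sigma=1.0):
    """Negative log-likelihood assuming i.i.d. N(0, sigma^2) errors."""
    n = len(xs)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + sse / (2 * sigma ** 2)

# Illustrative data, made up for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(xs, ys)
nll_at_ols = gaussian_nll(b0, b1, xs, ys)
nll_perturbed = gaussian_nll(b0 + 0.1, b1 - 0.05, xs, ys)

print(f"OLS fit: b0={b0:.3f}, b1={b1:.3f}")
print(f"NLL at OLS fit: {nll_at_ols:.4f}")
print(f"NLL at perturbed fit: {nll_perturbed:.4f}")
```

With a non-Gaussian error model (for example Laplace errors), the likelihood maximizer would generally differ from the least squares fit.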
Q2. How do you improve the performance of Linear Regression?
To improve the performance of Linear Regression, you can consider feature engineering, regularization, and handling outliers.
Perform feature engineering to create new features that capture important information.
Apply regularization techniques like L1 or L2 regularization to prevent overfitting.
Handle outliers by either removing them or using robust regression techniques.
Check for multicollinearity among the independent variables and consider removing highly correlated variables.
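One possible way to check for multicollinearity is the variance inflation factor (VIF). The sketch below covers only the two-predictor case, where VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between the predictors. The data and the helper names (`pearson_r`, `vif_two_predictors`) are hypothetical, and the often-quoted cutoffs of 5 or 10 are rules of thumb rather than fixed thresholds.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    a_mean = sum(a) / n
    b_mean = sum(b) / n
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Illustrative predictors, made up for this example.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]  # roughly 2 * x1, so highly collinear
x3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]   # only weakly related to x1

vif_collinear = vif_two_predictors(x1, x2)
vif_independent = vif_two_predictors(x1, x3)

print(f"VIF for (x1, x2): {vif_collinear:.1f}")
print(f"VIF for (x1, x3): {vif_independent:.3f}")
```

With more than two predictors, each VIF comes from regressing one predictor on all the others, which this sketch does not attempt.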
Q3. What metrics do you use to evaluate classification models?
Common metrics for evaluating classification models include:
Accuracy
Precision
Recall
F1 Score
ROC Curve
Confusion Matrix
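To make the relationship between these metrics concrete, here is a minimal sketch that derives accuracy, precision, recall and F1 from the four confusion-matrix counts of a binary classifier. The labels and the function name `classification_metrics` are made up for this example, and returning 0.0 when a denominator is zero is one convention among several.

```python
def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Illustrative labels, made up for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The ROC curve is not covered here, since it requires predicted scores evaluated at several thresholds rather than a single set of hard labels.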
Q4. How do you handle overfitting and underfitting in Decision Trees?
Overfitting in decision trees can be handled by pruning, reducing tree depth, increasing dataset size, and using ensemble methods.
Prune the tree to remove unnecessary branches
Reduce tree depth to prevent overfitting
Increase dataset size to improve model generalization
Use ensemble methods like Random Forest to reduce overfitting
Underfitting can be handled by increasing tree depth, adding more features, and reducing regularization.
Regularization can be used to prevent overfitting.
Q5. What are the metrics used to evaluate Linear Regression?
Common metrics for evaluating Linear Regression include:
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
R-squared (R²)
Adjusted R-squared (Adj R²)
Mean Absolute Error (MAE)
Residual Sum of Squares (RSS)
Akaike Information Criterion (AIC)
Bayesian Information Criterion (BIC)
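Several of these metrics can be computed directly from the residuals. The sketch below is one way to do it, assuming `n_features` is the number of predictors in the model (excluding the intercept). The data and the function name `regression_metrics` are illustrative, and AIC and BIC are left out because they also depend on the assumed likelihood.

```python
import math

def regression_metrics(y_true, y_pred, n_features):
    """Compute common regression metrics from actual and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    rss = sum(r ** 2 for r in residuals)       # residual sum of squares
    mse = rss / n
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n

    y_mean = sum(y_true) / n
    tss = sum((t - y_mean) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - rss / tss
    # Adjusted R^2 penalizes additional predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

    return {"rss": rss, "mse": mse, "rmse": rmse,
            "mae": mae, "r2": r2, "adj_r2": adj_r2}

# Illustrative values, made up for this example.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 8.0, 11.0]

reg = regression_metrics(y_true, y_pred, n_features=1)
print(reg)
```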
Q6. How do you handle Overfitting in Linear Regression?
Overfitting in Linear Regression can be handled by using regularization techniques.
Regularization techniques like Ridge regression and Lasso regression can help in reducing overfitting.
Cross-validation can be used to find the optimal regularization parameter.
Feature selection and dimensionality reduction techniques can also help in reducing overfitting.
Collecting more data can help in reducing overfitting by providing a more representative sample.
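The shrinkage effect of Ridge regression can be seen in the simplest case of one predictor. For centered data with an unpenalized intercept, the ridge slope is Sxy / (Sxx + lambda), so a larger penalty pulls the coefficient toward zero. The data and the function name `ridge_slope` below are assumptions for illustration. In practice the penalty would be chosen by cross-validation, which this sketch omits.

```python
def ridge_slope(xs, ys, lam):
    """Ridge slope for a single centered predictor: Sxy / (Sxx + lam)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / (sxx + lam)

# Illustrative data, made up for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# lam = 0.0 reproduces the ordinary least squares slope.
slopes = {lam: ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0)}
for lam, slope in slopes.items():
    print(f"lambda={lam:>5}: slope={slope:.3f}")
```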
Q7. What are the assumptions in Linear Regression?
The main assumptions in Linear Regression are:
Linear relationship between independent and dependent variables
Homoscedasticity (constant variance) of residuals
Independence of residuals
Normal distribution of residuals
No multicollinearity among independent variables
Q8. What is the formula for Logistic Regression?
Logistic Regression formula is used to model the probability of a certain event occurring.
The formula is: P(Y=1) = e^(b0 + b1*X1 + b2*X2 + ... + bn*Xn) / (1 + e^(b0 + b1*X1 + b2*X2 + ... + bn*Xn))
Y is the dependent variable and X1, X2, ..., Xn are the independent variables
b0, b1, b2, ..., bn are the coefficients that need to be estimated
The formula is used to predict the probability of a binary outcome, such as whether a customer will buy a product or not
The formula is derived...
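The probability formula above can be evaluated as in the following sketch. The coefficient values and feature vector are hypothetical, chosen only to show the arithmetic, and `sigmoid` is written in a numerically stable form that avoids overflow for large negative inputs.

```python
import math

def sigmoid(z):
    """Logistic function e^z / (1 + e^z), computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(intercept, coeffs, features):
    """P(Y=1) for one observation: sigmoid(b0 + b1*X1 + ... + bn*Xn)."""
    z = intercept + sum(b * x for b, x in zip(coeffs, features))
    return sigmoid(z)

# Hypothetical coefficients and features, not estimated from real data.
b0 = -1.0
b = [0.8, -0.5]
x = [2.0, 1.0]

p = predict_proba(b0, b, x)   # linear predictor z = -1.0 + 1.6 - 0.5 = 0.1
print(f"P(Y=1) = {p:.4f}")
```

A predicted class is usually obtained by comparing the probability to a threshold such as 0.5, though the appropriate threshold depends on the cost of each kind of error.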
Q9. What are Type I and Type II errors?
Type I error is rejecting a true null hypothesis, while Type II error is failing to reject a false null hypothesis.
Type I error is also known as a false positive
Type II error is also known as a false negative
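The probability of a Type I error equals the significance level (alpha), so a higher significance level makes Type I errors more likely.
Lowering the significance level reduces Type I errors but, for a fixed sample size, increases the probability of a Type II error (beta) and so reduces the power of the test (1 - beta).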
Examples: Type I error - Convicting an innocent person, Type II error - Failing to convict a guilty person
Type I error is more serious in medical ...
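The link between the significance level and the Type I error rate can be checked by simulation. The sketch below repeatedly draws samples for which the null hypothesis (mean = 0) is true, runs a two-sided one-sample z-test with known standard deviation 1, and counts how often the null is wrongly rejected. The function name `type_i_error_rate`, the sample size, the trial count and the seed are all assumptions chosen for this example.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n_trials=2000, sample_size=30, seed=0):
    """Estimate how often a true null hypothesis is rejected at level alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(n_trials):
        # The null hypothesis is true here: data come from N(0, 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        z = (sum(sample) / sample_size) * sample_size ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # false positive
    return rejections / n_trials

rate_01 = type_i_error_rate(0.01)
rate_05 = type_i_error_rate(0.05)
rate_10 = type_i_error_rate(0.10)
print(f"alpha=0.01 -> {rate_01:.3f}")
print(f"alpha=0.05 -> {rate_05:.3f}")
print(f"alpha=0.10 -> {rate_10:.3f}")
```

Each estimated rate should land close to its significance level, with some Monte Carlo noise.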
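Q10. What are the Cost Function and the Error Function?
The error (loss) function measures the difference between the predicted and actual value for a single example. The cost function is the average of that error over the entire dataset. Usage varies between authors, and the two terms are often used interchangeably.
Error function is computed per example and measures how far one prediction is from its actual value.
Cost function aggregates the error over the entire dataset, typically as an average.
Cost function is used to evaluate the performance of a machine learning model.
It is also the quantity minimized when optimizing the parameters of the model.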
Examples of cost functions are mean squared error, mean absolute error, and cross-entropy.
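The distinction between a per-example value and its dataset-wide average can be sketched with binary cross-entropy. The example follows one common convention in which the per-example quantity is called the loss and its average the cost; terminology varies between authors. The function names, the clipping constant `eps` and the data are assumptions for illustration.

```python
import math

def log_loss_single(y, p, eps=1e-12):
    """Binary cross-entropy for one example with label y and probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def average_cost(y_true, probs):
    """Average of the per-example losses over the whole dataset."""
    losses = [log_loss_single(y, p) for y, p in zip(y_true, probs)]
    return sum(losses) / len(losses)

# Illustrative labels and predicted probabilities, made up for this example.
y_true = [1, 0, 1]
probs = [0.9, 0.2, 0.6]

per_example = [log_loss_single(y, p) for y, p in zip(y_true, probs)]
cost = average_cost(y_true, probs)
print("per-example losses:", [round(v, 4) for v in per_example])
print(f"average cost: {cost:.4f}")
```

The least confident correct prediction (0.6 for a positive label) contributes the largest individual loss, and the average is what an optimizer would minimize.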
Q11. What are overfitting and underfitting?
Overfitting and underfitting are two common problems in machine learning models.
Overfitting occurs when a model is too complex and fits the training data too closely, resulting in poor performance on new data.
Underfitting occurs when a model is too simple and cannot capture the underlying patterns in the data, resulting in poor performance on both training and new data.
Overfitting can be prevented by using regularization techniques or reducing the complexity of the model.