Marigold Capital Corporation Interview Questions and Answers
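Q1. How do you choose the value of K in K-means? If there are any techniques, name them and explain.
K has to be fixed before K-means runs, and a poor choice either merges distinct groups or splits natural ones, so it is usually picked with one of the heuristics below.
Elbow method: Plotting the within-cluster sum of squared distances (inertia) against K and selecting the K where the curve bends like an elbow, i.e. where adding more clusters stops reducing it substantially.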
Silhouette method: Calculating the average silhouette score for different K values and choosing the one with the highest score.
Gap statistic method: Comparing the within-cluster dispersion to a reference null distribution to find the optimal K value.
Cross-validation: Splitting...
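As a rough illustration of the elbow method, the sketch below runs a tiny one-dimensional K-means on made-up data and records the inertia for K = 1 to 5. The helper names (`kmeans_1d`, `inertia`), the toy data, and the evenly spaced initialisation are assumptions chosen for reproducibility, not a reference implementation.

```python
# Minimal 1-D K-means (Lloyd's algorithm), used only to illustrate the elbow method.
def kmeans_1d(points, k, iterations=50):
    pts = sorted(points)
    # Assumption: deterministic start, centroids spread evenly over the sorted data.
    centroids = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if the cluster is empty).
        new_centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def inertia(points, centroids):
    # Sum of squared distances from each point to its nearest centroid.
    return sum(min((p - c) ** 2 for c in centroids) for p in points)


# Toy data with three visible groups (around 1, 5 and 9).
data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
inertias = {k: inertia(data, kmeans_1d(data, k)) for k in range(1, 6)}

for k, value in inertias.items():
    print(k, round(value, 3))
```

For this toy data the inertia falls sharply up to K = 3 and barely changes afterwards, which is the bend the elbow method looks for.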
Q2. Formulas for Precision, Recall, Accuracy, F1 Score.
All four metrics are computed from the confusion-matrix counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
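The formulas above translate directly into code. The following is a small sketch; the function name `classification_metrics` and the example counts are made up for illustration, and returning 0.0 when a denominator is zero is one common convention rather than the only one.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, accuracy and F1 from confusion-matrix counts."""
    # Assumption: return 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


# Hypothetical counts: 40 TP, 45 TN, 10 FP, 5 FN.
metrics = classification_metrics(tp=40, tn=45, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts precision is 0.800, recall is about 0.889, accuracy is 0.850 and F1 is about 0.842.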
Q3. Overfitting and underfitting: conditions and examples
Overfitting and underfitting are common issues in machine learning where the model either learns the noise in the training data or fails to capture the underlying patterns.
Overfitting occurs when a model learns the training data too well, including noise and outliers, leading to poor generalization on new data.
Underfitting happens when a model is too simple to capture the underlying patterns in the data, resulting in high bias, low variance, and poor performance on both training and new data.
Examples of overfitting include ...
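To make the two conditions concrete, here is a small sketch on made-up data that compares three models: a constant predictor (too simple, underfits), a 1-nearest-neighbour predictor that memorises the training points (overfits), and a least-squares line. The data, the choice of models, and all names are assumptions picked for illustration.

```python
def mse(model, xs, ys):
    # Mean squared error of a model (a function of x) on the given data.
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up training data: roughly y = 2x plus noise. Test data: noise-free y = 2x.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0.3, 1.8, 4.4, 5.7, 8.3, 9.9]
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [1.0, 3.0, 5.0, 7.0, 9.0]

# Underfitting: always predict the training mean, ignoring x entirely.
mean_y = sum(train_y) / len(train_y)
def underfit(x):
    return mean_y

# Overfitting: return the label of the closest training point (memorises the noise).
def overfit(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

# Reasonable fit: ordinary least-squares line.
mean_x = sum(train_x) / len(train_x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)) / sum(
    (x - mean_x) ** 2 for x in train_x
)
intercept = mean_y - slope * mean_x
def linear(x):
    return slope * x + intercept

for name, model in [("underfit", underfit), ("overfit", overfit), ("linear", linear)]:
    print(name, round(mse(model, train_x, train_y), 3), round(mse(model, test_x, test_y), 3))
```

The memorising model has zero training error but a clearly worse test error than the line, while the constant model is poor on both sets.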
Q4. What are regularization techniques?
Regularization techniques are methods used to prevent overfitting in machine learning models by adding a penalty term to the loss function.
Regularization techniques help in reducing the complexity of the model by penalizing large coefficients.
Common regularization techniques include L1 regularization (Lasso), L2 regularization (Ridge), and Elastic Net regularization.
Regularization helps in improving the generalization of the model by preventing it from fitting noise in the tr...
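The effect of the penalty term is easiest to see in the simplest possible case: a single coefficient w with no intercept. The sketch below assumes the losses sum((y - w*x)^2) + penalty * w^2 for L2 (Ridge) and sum((y - w*x)^2) + penalty * |w| for L1 (Lasso); the function names and data are hypothetical.

```python
def ridge_slope(xs, ys, penalty=0.0):
    # Closed-form minimiser of sum((y - w*x)^2) + penalty * w^2.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def lasso_slope(xs, ys, penalty=0.0):
    # Closed-form minimiser of sum((y - w*x)^2) + penalty * |w| (soft thresholding).
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - penalty / 2, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


# Made-up data, roughly y = 2x.
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

ols = ridge_slope(xs, ys, penalty=0.0)      # no regularization
ridge = ridge_slope(xs, ys, penalty=10.0)   # shrinks the coefficient towards zero
lasso = lasso_slope(xs, ys, penalty=10.0)   # also shrinks it
lasso_strong = lasso_slope(xs, ys, penalty=200.0)  # large L1 penalty sets it to exactly zero

print(ols, ridge, lasso, lasso_strong)
```

Both penalties pull the coefficient towards zero; only the L1 penalty drives it to exactly zero, which is why Lasso is often used for feature selection.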
Q5. What are population and sample?
Population refers to the entire group of individuals or items that we are interested in studying, while a sample is a subset of the population.
The population is the larger group that we want to draw conclusions about.
The sample is a smaller group selected from the population to represent it.
Population parameters are characteristics of the entire group, while sample statistics are characteristics of the sample.
Example: Population could be all students in a school, while a sample could...
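The difference between a population parameter and a sample statistic can be shown with a short simulation. The numbers below are synthetic (hypothetical heights drawn from a normal distribution with a fixed seed), so treat it as a sketch rather than real data.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: heights (cm) of 10,000 students.
population = [rng.gauss(165, 10) for _ in range(10_000)]

# Simple random sample of 100 students, drawn without replacement.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)  # parameter (usually unknown in practice)
sample_mean = statistics.mean(sample)          # statistic (what we actually observe)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean will generally be close to, but not equal to, the population mean; that gap is the sampling error that inferential statistics tries to quantify.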
Q6. What is hypothesis testing?
Hypothesis testing is a statistical method used to make inferences about a population based on sample data.
It involves formulating a hypothesis about a population parameter, collecting data, and using a statistical test to decide whether the data provide enough evidence to reject it; if they do not, we fail to reject the hypothesis, which is not the same as proving it.
There are two types of hypotheses: null hypothesis (H0) and alternative hypothesis (H1).
Common statistical tests for hypothesis testing include t-tests, ANOVA, chi-square tests, and regression analysis...
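As one concrete case, a one-sample t-test can be worked through by hand. The sketch below uses made-up measurements and tests H0: mean = 5.0 against H1: mean != 5.0. The critical value 2.262 is the standard two-sided t-table value for alpha = 0.05 with 9 degrees of freedom; the function name and the data are assumptions for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    # t = (sample mean - hypothesised mean) / (sample std dev / sqrt(n))
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return (mean - mu0) / (sd / math.sqrt(n))


# Made-up measurements; H0: population mean = 5.0, H1: population mean != 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.5, 5.2, 5.7, 5.4, 5.5]
mu0 = 5.0

t_stat = one_sample_t(sample, mu0)

# Two-sided critical value from a t-table: alpha = 0.05, df = n - 1 = 9.
T_CRITICAL = 2.262
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, reject H0: {reject_null}")
```

Here the statistic is about 4.536, which exceeds the critical value, so H0 is rejected at the 5% level for this hypothetical sample.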