Deloitte
10+ Bling Brands Interview Questions and Answers
Q1. Are all the decision trees in a random forest the same?
No, decision trees in a random forest are different due to the use of bootstrapping and feature randomization.
Decision trees in a random forest are trained on different subsets of the data through bootstrapping.
Each decision tree in a random forest also considers only a random subset of features at each split.
The final prediction in a random forest is made by aggregating the predictions of all individual decision trees.
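The two sources of randomness and the final aggregation can be sketched in a few lines. This is a minimal illustration, not a full random forest: the helper names (`bootstrap_sample`, `random_feature_subset`, `majority_vote`) are hypothetical, and no actual trees are trained.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (a bootstrap sample)."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Aggregate per-tree class predictions by majority vote."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
rows = list(range(10))  # stand-in for 10 training rows

# Each "tree" gets its own bootstrap sample and its own feature subsets.
samples = [bootstrap_sample(rows, rng) for _ in range(3)]
features = [random_feature_subset(n_features=8, k=3, rng=rng) for _ in range(3)]

# Classification forests typically vote; regression forests average instead.
vote = majority_vote(["spam", "ham", "spam"])
```

Because every tree sees a different sample and different candidate features, the trees end up with different splits, which is what makes averaging them useful.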
Q2. How will you handle a class-imbalanced dataset to increase the F1 score?
Handling a class-imbalanced dataset involves techniques such as resampling, trying different algorithms, adjusting class weights, and using ensemble methods.
Use resampling techniques like oversampling the minority class or undersampling the majority class.
Try algorithms that tend to cope better with class imbalance, such as Random Forest or XGBoost, and compare them on the F1 score rather than on accuracy.
Adjust class weights in the model to give more importance to the minority class.
Utilize ensemble methods like bagging or b...
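Two of the techniques above, class weights and random oversampling, can be sketched with the standard library alone. The function names are hypothetical; the weight formula `n_samples / (n_classes * class_count)` is the common "balanced" heuristic, assumed here for illustration.

```python
import random
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def random_oversample(rows, labels, rng):
    """Duplicate random minority rows until every class matches the largest one."""
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, cnt in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(target - cnt):
            out_rows.append(rng.choice(pool))
            out_labels.append(c)
    return out_rows, out_labels

labels = [0] * 8 + [1] * 2   # 8 majority rows, 2 minority rows
rows = list(range(10))       # stand-in feature rows

weights = balanced_class_weights(labels)
new_rows, new_labels = random_oversample(rows, labels, random.Random(0))
```

Resampling should be applied to the training split only; oversampling before the train/test split leaks duplicated rows into the evaluation data and inflates the measured F1 score.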
Q3. What are bias, overfitting, and underfitting?
Bias is the error due to overly simplistic assumptions in the learning algorithm. Overfitting is when a model is too complex and fits the training data too closely, leading to poor generalization. Underfitting is when a model is too simple to capture the underlying structure of the data.
High bias shows up as high error on both training and test data, because the model's assumptions are too simple.
Overfitting happens when a model is too complex and captures noise in the trainin...
Q4. What do you understand by deep learning neural networks?
Deep learning neural networks are a type of artificial neural network with multiple layers, used for complex pattern recognition.
Deep learning neural networks consist of multiple layers of interconnected nodes, allowing for more complex patterns to be learned.
They are capable of automatically learning features from data, eliminating the need for manual feature engineering.
Examples include Convolutional Neural Networks (CNNs) for image recognition and Recurrent Neural Networks...
Q5. What is a neural network? Explain backpropagation. Explain the difference between CNN and RNN. (Live coding questions were also asked.)
A neural network is a computational model inspired by the way the human brain works, used for machine learning tasks.
A neural network is a set of connected layers of nodes that learns underlying relationships in data, loosely mimicking the way the human brain operates.
Backpropagation is a technique used to train neural networks by updating the weights of the network to minimize the difference between the predicted output and the actual output.
CNN (Convoluti...
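Backpropagation is easiest to see on a single neuron. The sketch below assumes a sigmoid activation and a squared-error loss (both chosen for illustration) and applies the chain rule by hand to get the gradients used in the weight update.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, y, lr):
    """One forward pass, one backward pass, one gradient step."""
    # Forward pass
    z = w * x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2

    # Backward pass: chain rule, dloss/dw = dloss/da * da/dz * dz/dw
    dloss_da = a - y
    da_dz = a * (1.0 - a)
    dloss_dz = dloss_da * da_dz
    grad_w = dloss_dz * x
    grad_b = dloss_dz

    # Move the parameters against the gradient
    return w - lr * grad_w, b - lr * grad_b, loss

w, b = 0.0, 0.0
losses = []
for _ in range(200):
    w, b, loss = train_step(w, b, x=1.0, y=1.0, lr=0.5)
    losses.append(loss)
```

In a multi-layer network the same chain rule is applied layer by layer from the output back to the input, reusing each layer's gradient to compute the one before it.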
Q6. What is the difference between precision and recall
Precision is the ratio of correctly predicted positive observations to all predicted positives, while recall is the ratio of correctly predicted positive observations to all observations that are actually positive.
Precision focuses on the accuracy of positive predictions, while recall focuses on the proportion of actual positives that were correctly identified.
Precision = TP / (TP + FP), Recall = TP / (TP + FN)
High precision means that when the model predicts a positive result...
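As a worked example of the two formulas, the sketch below computes precision, recall, and their harmonic mean (F1) from raw counts. The function name and the sample counts are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 8 true positives, 2 false positives, 8 false negatives
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
# p = 8 / 10 = 0.8, r = 8 / 16 = 0.5
```

Here the model is right 80% of the time when it predicts positive, but it finds only half of the actual positives, so F1 lands between the two at roughly 0.615.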
Q7. What is ReLU, and what are activation functions?
ReLU is an activation function used in neural networks to introduce non-linearity.
ReLU stands for Rectified Linear Unit.
It is a simple function that returns the input if it is positive, and 0 otherwise.
It is commonly used in deep learning models due to its simplicity and effectiveness.
Other activation functions include sigmoid, tanh, and softmax.
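The activation functions named above are short enough to write out directly. This is a scalar, standard-library sketch for illustration; real frameworks apply the same formulas element-wise to tensors.

```python
import math

def relu(x):
    """Return x if it is positive, otherwise 0."""
    return max(0.0, x)

def sigmoid(x):
    """Squash x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squash x into the range (-1, 1)."""
    return math.tanh(x)

def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
```

ReLU, sigmoid, and tanh act on one value at a time, while softmax acts on a whole vector of scores, which is why it is usually reserved for the output layer of a classifier.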
Q8. What is your expected CTC?
I am looking for a competitive salary based on industry standards and my experience.
Research industry standards for Data Scientist salaries
Consider my level of experience and skills when determining salary expectations
Be open to negotiation based on the overall compensation package offered
Q9. What are stemming and lemmatization?
Stemming and lemmatization are techniques used in natural language processing to reduce words to their base or root form.
Stemming is a process of reducing words to their base form by removing suffixes.
Lemmatization is a process of reducing words to their base form by considering the context and part of speech.
Stemming is faster but may not always produce a valid word, while lemmatization is slower but produces valid words.
Example of stemming: 'running' -> 'run', 'jumps' -> 'j...
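The contrast can be illustrated with a toy example. Both functions below are hypothetical stand-ins written for this sketch: `naive_stem` strips a few suffixes by rule, and `toy_lemmatize` looks words up in a tiny hand-made dictionary keyed by part of speech. Real stemmers and lemmatizers use far larger rule sets and vocabularies.

```python
def naive_stem(word):
    """Strip a common suffix by rule, with no dictionary lookup."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run"
            if (suffix in ("ing", "ed") and len(stem) >= 2
                    and stem[-1] == stem[-2] and stem[-1] not in "lsz"):
                stem = stem[:-1]
            return stem
    return word

# Hypothetical mini-dictionary: (word, part of speech) -> lemma
LEMMA_TABLE = {
    ("running", "VERB"): "run",
    ("studies", "NOUN"): "study",
    ("studies", "VERB"): "study",
    ("better", "ADJ"): "good",
}

def toy_lemmatize(word, pos):
    """Look up the dictionary form; fall back to the word itself."""
    return LEMMA_TABLE.get((word, pos), word)

stemmed = naive_stem("studies")              # rule-based, not a real word
lemma = toy_lemmatize("studies", "NOUN")     # dictionary form
```

The rule-based stemmer turns "studies" into "studi", which is not a valid word, while the lookup returns "study" and can even map "better" to "good", at the cost of needing a dictionary and a part-of-speech tag.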
Q10. How to measure multicollinearity
Multicollinearity can be measured using a correlation matrix, the variance inflation factor (VIF), or the eigenvalues of the correlation matrix.
Calculate the correlation matrix to identify highly correlated variables.
Use the variance inflation factor (VIF) to quantify the extent of multicollinearity.
Check for eigenvalues of the correlation matrix that are close to zero (equivalently, a large condition number), which indicate multicollinearity.
Consider using dimensionality reduction techniques like principal component analysis (PCA) to address multicollinearity.
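For predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. With only two predictors, R_j^2 reduces to the squared Pearson correlation, which makes a dependency-free sketch possible. The function names and the sample data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def vif_two_predictors(x1, x2):
    """VIF for the two-predictor case, where R^2 equals r^2."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

r = pearson_r(x1, x2)            # 0.8 for this data
vif = vif_two_predictors(x1, x2) # 1 / (1 - 0.64), about 2.78
```

A VIF of 1 means no linear overlap with the other predictors. Cutoffs such as 5 or 10 are common rules of thumb rather than strict thresholds.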
Q11. What is homoscedasticity
Homoscedasticity refers to the assumption that the variance of errors is constant across all levels of the independent variable.
Homoscedasticity is a key assumption in linear regression analysis.
It indicates that the residuals (errors) have constant variance.
If the residuals exhibit a pattern where the spread of points increases or decreases as the predicted values increase, it violates the assumption of homoscedasticity.
This violation can lead to biased and inefficient estim...
Q12. Explain the bias-variance trade-off
The bias-variance trade-off is the balance between underfitting and overfitting in machine learning models.
Bias refers to the error introduced by approximating a real-world problem with an overly simple model, leading to underfitting.
Variance refers to the model's sensitivity to fluctuations in the training data, leading to overfitting.
Finding the right balance between bias and variance is crucial for creating a model that generalizes well to unseen data.
Regularization techniques like Lasso and Ridge regres...
Q13. What are optimizers
Optimizers are algorithms used to adjust the parameters of a model to minimize the error between predicted and actual values.
Optimizers are used in machine learning to minimize a loss function, which in turn improves how well the model fits the data.
They work by adjusting the weights and biases of a model during training.
Common optimizers include Gradient Descent, Adam, and RMSprop.
The choice of optimizer depends on the type of problem and the characteristics of the data.
Optimizers can help models converge faster and avoid ge...
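Plain gradient descent, the simplest of the optimizers listed above, fits in a few lines. The sketch below minimizes the toy loss f(w) = (w - 3)^2, chosen for illustration because its minimum at w = 3 is known in advance.

```python
def gradient_descent(grad, w0, lr, steps):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def grad_quadratic(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

w_final = gradient_descent(grad_quadratic, w0=0.0, lr=0.1, steps=100)
```

Each step shrinks the distance to the minimum by a constant factor here, so `w_final` ends up very close to 3. Optimizers such as Adam and RMSprop keep the same loop but adapt the step size per parameter using running statistics of past gradients.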
Q14. Explain the Poisson distribution
The Poisson distribution is a probability distribution that expresses the likelihood of a given number of events occurring in a fixed interval of time or space.
Describes the number of events that occur in a fixed interval of time or space
Events are independent of each other
Average rate of occurrence is constant
Examples: number of emails received in an hour, number of customers arriving at a store in a day
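Under these assumptions, the probability of seeing exactly k events when the average rate is lam is P(k) = e^(-lam) * lam^k / k!. A minimal sketch using the emails example, with an assumed average of 4 emails per hour:

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Assumed rate for illustration: 4 emails per hour on average
p_two_emails = poisson_pmf(2, 4.0)   # 8 * e^-4, about 0.147
p_at_most_two = sum(poisson_pmf(k, 4.0) for k in range(3))
```

A useful property to mention in an interview is that the mean and the variance of a Poisson distribution are both equal to lam.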
Q15. Explain pipeline flow
Pipeline flow is the process of moving data through a series of interconnected stages or steps in a systematic manner.
Pipeline flow involves the sequential movement of data from one stage to another, with each stage performing a specific task or transformation.
It helps in automating and streamlining the data processing process, making it more efficient and scalable.
Examples of pipeline flow include data preprocessing, feature engineering, model training, and model evaluation ...
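The idea of sequential stages can be sketched as a list of named functions, each consuming the previous stage's output. The stage names and functions below are hypothetical and deliberately trivial; a real pipeline would swap in preprocessing, training, and evaluation steps.

```python
def run_pipeline(data, stages):
    """Pass data through each (name, function) stage in order."""
    for _name, stage in stages:
        data = stage(data)
    return data

def drop_missing(rows):
    """Cleaning stage: remove missing values."""
    return [r for r in rows if r is not None]

def scale_to_unit(rows):
    """Transformation stage: divide by the maximum value."""
    hi = max(rows)
    return [r / hi for r in rows]

def mean(rows):
    """Final stage: reduce the rows to a single summary number."""
    return sum(rows) / len(rows)

stages = [("clean", drop_missing), ("scale", scale_to_unit), ("summarize", mean)]
result = run_pipeline([2, None, 4, 8], stages)
```

Keeping the stages in one ordered list makes the flow easy to rerun on new data and guarantees that the same transformations are applied in the same order every time.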