10+ INFAC Interview Questions and Answers
Q1. How can you tune the hyperparameters of the XGBoost, Random Forest, and SVM algorithms?
Hyperparameters of XGBoost, Random Forest, and SVM can be tuned using techniques like grid search, random search, and Bayesian optimization.
For XGBoost, important hyperparameters to tune include learning rate, maximum depth, and number of estimators.
For Random Forest, important hyperparameters to tune include number of trees, maximum depth, and minimum samples split.
For SVM, important hyperparameters to tune include kernel type, regularization parameter, and gamma value.
Grid search exhaustively evaluates every combination in the specified parameter grid, while random search samples a fixed number of combinations.
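A minimal sketch of grid search with scikit-learn, using synthetic data and an illustrative Random Forest grid (the parameter values here are assumptions, not tuned recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data stands in for the real problem.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Exhaustively try every combination in the grid with 3-fold CV.
param_grid = {
    "n_estimators": [50, 100],    # number of trees
    "max_depth": [3, None],       # maximum tree depth
    "min_samples_split": [2, 5],  # minimum samples required to split a node
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

The same `GridSearchCV` wrapper works unchanged for SVM (grid over `kernel`, `C`, `gamma`) or XGBoost (grid over `learning_rate`, `max_depth`, `n_estimators`).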
Hyperparameters of XGBoost can be tuned using techniques like grid search, random search, and Bayesian optimization.
Use grid search to exhaustively search through a specified parameter grid
Utilize random search to randomly sample hyperparameters from a specified distribution
Apply Bayesian optimization to sequentially choose hyperparameters based on the outcomes of previous iterations
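The random-search idea can be sketched in plain Python: sample each hyperparameter from its own distribution and keep the best-scoring draw. The `evaluate` function here is a hypothetical stand-in for cross-validated model accuracy:

```python
import random

random.seed(0)

def evaluate(params):
    # Stand-in for cross-validated accuracy; a real run would train the
    # model with these hyperparameters and score it on held-out data.
    return 1.0 - abs(params["learning_rate"] - 0.1) - 0.01 * abs(params["max_depth"] - 5)

best_score, best_params = float("-inf"), None
for _ in range(50):
    # Sample each hyperparameter from a specified distribution.
    params = {
        "learning_rate": random.uniform(0.01, 0.3),
        "max_depth": random.randint(2, 10),
    }
    score = evaluate(params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params)
```

Bayesian optimization replaces the uniform sampling step with a model (often a Gaussian process) that proposes the next candidate based on all previous scores.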
Q3. What do the hyperparameters in the above-mentioned algorithms actually mean?
Hyperparameters are settings that control the behavior of machine learning algorithms.
Hyperparameters are set before training the model.
They control the learning process and affect the model's performance.
Examples include learning rate, regularization strength, and number of hidden layers.
Optimizing hyperparameters is important for achieving better model accuracy.
Hyperparameters in XGBoost algorithm control the behavior of the model during training.
Hyperparameters include parameters like learning rate, max depth, number of trees, etc.
They are set before the training process and can greatly impact the model's performance.
Example: 'learning_rate': 0.1, 'max_depth': 5, 'n_estimators': 100
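Those three parameters can be seen in action with scikit-learn's `GradientBoostingClassifier`, used here as a stand-in for XGBoost (the two expose the same `learning_rate`, `max_depth`, and `n_estimators` names):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# learning_rate: step size per boosting round; max_depth: tree depth;
# n_estimators: number of boosting rounds (trees).
model = GradientBoostingClassifier(
    learning_rate=0.1, max_depth=5, n_estimators=100, random_state=0
)
model.fit(X, y)
print(round(model.score(X, y), 3))
```

Lowering `learning_rate` typically requires raising `n_estimators` to reach the same fit, which is why the two are usually tuned together.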
Ridge and LASSO regression are both regularization techniques used in linear regression to prevent overfitting by adding penalty terms to the cost function.
Ridge regression adds a penalty term equivalent to the square of the magnitude of coefficients (L2 regularization).
LASSO regression adds a penalty term equivalent to the absolute value of the magnitude of coefficients (L1 regularization).
Ridge regression tends to shrink the coefficients towards zero but does not set them exactly to zero, whereas LASSO can eliminate coefficients entirely.
Q6. How to fit a time series model? State all the steps you would follow.
Steps to fit a time series model
Identify the time series pattern (trend, seasonality, stationarity)
Choose a suitable model
Split data into training and testing sets
Fit the model to the training data
Evaluate model performance on testing data
Refine the model if necessary
Forecast future values using the model
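The steps above can be sketched end to end with a simple AR(1) model fitted by least squares in NumPy; the simulated series and the 150/50 split are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: y[t] = 0.8 * y[t-1] + noise.
n = 200
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.normal(scale=0.5)

# Split into training and testing sets.
train, test = y[:150], y[150:]

# Fit the AR(1) coefficient by least squares on the training data.
phi = np.dot(train[:-1], train[1:]) / np.dot(train[:-1], train[:-1])

# Evaluate one-step-ahead forecasts on the test set.
pred = phi * test[:-1]
mse = np.mean((test[1:] - pred) ** 2)
print(round(phi, 3), round(mse, 3))
```

In practice a library such as statsmodels would handle model selection (ARIMA orders, seasonal terms) and diagnostics; this sketch only shows the fit/split/evaluate/forecast loop.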
Outlier values are data points that significantly differ from the rest of the data, potentially affecting the analysis.
Outliers can be identified using statistical methods like Z-score or IQR.
Treatment options include removing outliers, transforming the data, or using robust statistical methods.
Example: In a dataset of salaries, a value much higher or lower than the rest may be considered an outlier.
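The salary example can be made concrete with the IQR rule: flag anything beyond 1.5 IQRs outside the quartiles. The numbers below are made up for illustration:

```python
import numpy as np

salaries = np.array([45, 50, 52, 55, 58, 60, 62, 65, 300])  # 300 looks suspicious

q1, q3 = np.percentile(salaries, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Any value outside [lower, upper] is flagged as an outlier.
outliers = salaries[(salaries < lower) | (salaries > upper)]
print(outliers)
```

A z-score test (|z| > 3) is the common alternative, though it assumes roughly normal data, while the IQR rule does not.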
Q8. Difference between Ridge and LASSO and their geometric interpretation.
Ridge and LASSO are regularization techniques used in linear regression to prevent overfitting.
Ridge adds a penalty term to the sum of squared errors, which shrinks the coefficients towards zero but doesn't set them exactly to zero.
LASSO adds a penalty term to the absolute value of the coefficients, which can set some of them exactly to zero.
The geometric interpretation of Ridge is a circular (L2-ball) constraint on the coefficients, which shrinks them towards the origin; LASSO's L1 constraint is a diamond whose corners lie on the coordinate axes, which is why LASSO solutions often land exactly on an axis with some coefficients equal to zero.
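The shrink-versus-zero behavior is easy to see with scikit-learn on synthetic data where only the first two of five features matter (the `alpha` values are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# y depends on the first two features only; the other three are noise.
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# Ridge shrinks every coefficient a little; LASSO drives the
# irrelevant ones to exactly zero.
print(np.round(ridge.coef_, 3))
print(np.round(lasso.coef_, 3))
```

This is also why LASSO doubles as a feature-selection method while Ridge does not.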
Q9. What are RNNs and CNNs, and what is the difference between the two?
RNN and CNN are neural network architectures used for different types of data.
RNN is used for sequential data like time series, text, speech, etc.
CNN is used for grid-like data like images, videos, etc.
RNN has feedback connections while CNN has convolutional layers.
RNNs naturally handle variable-length input, while CNNs typically expect fixed-size input.
Both can be used for classification, regression, and generation tasks.
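The structural difference above can be sketched in a few lines of NumPy: the RNN carries a hidden state forward through time (feedback), while the 1D convolution applies the same kernel to local windows with no memory. Shapes and weights here are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 3))  # sequence of 6 timesteps, 3 features each

# RNN: the hidden state h feeds back into the next step.
W, U = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))
h = np.zeros(4)
for x_t in x:
    h = np.tanh(W @ x_t + U @ h)  # h depends on the previous h

# CNN (1D): the same kernel slides over local windows, no feedback.
kernel = rng.normal(size=(2, 3))  # window covering 2 timesteps
conv = np.array([np.sum(kernel * x[t:t + 2]) for t in range(len(x) - 1)])

print(h.shape, conv.shape)
```

The loop is why RNNs accept any sequence length, and the shared sliding kernel is what gives CNNs their translation-invariant feature detection on grids.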
Q10. How do you reduce model inference latency?
To reduce model inference latency, optimize model architecture, use efficient algorithms, batch processing, and deploy on high-performance hardware.
Optimize model architecture by reducing complexity and removing unnecessary layers
Use efficient algorithms like XGBoost or LightGBM for faster predictions
Implement batch processing to make predictions in bulk rather than one at a time
Deploy the model on high-performance hardware like GPUs or TPUs
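The batching point can be demonstrated with a toy "model" (a single weight matrix, an assumption for illustration): predicting one request at a time pays Python-loop overhead per call, while one batched matrix multiply amortizes it:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))         # stand-in for a model's weights
inputs = rng.normal(size=(1000, 64))  # 1000 incoming requests

# One at a time: per-prediction overhead.
start = time.perf_counter()
single = np.array([W @ x for x in inputs])
t_single = time.perf_counter() - start

# Batched: one matrix multiply for all requests.
start = time.perf_counter()
batched = inputs @ W.T
t_batched = time.perf_counter() - start

print(f"single={t_single:.4f}s batched={t_batched:.4f}s")
```

The same amortization argument is what makes GPU/TPU deployment effective: those devices are built for exactly this kind of large batched matrix arithmetic.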
Q11. What are the different SQL joins, and how do they differ?
SQL joins are used to combine rows from two or more tables based on a related column between them.
INNER JOIN: Returns rows when there is at least one match in both tables.
LEFT JOIN: Returns all rows from the left table and the matched rows from the right table.
RIGHT JOIN: Returns all rows from the right table and the matched rows from the left table.
FULL JOIN: Returns all rows from both tables, with NULLs where there is no match.
SELF JOIN: Joins a table with itself.
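A runnable sketch of the INNER/LEFT distinction using Python's built-in sqlite3 module; the tables and rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept_id INTEGER)")
cur.execute("CREATE TABLE departments (id INTEGER, dept TEXT)")
cur.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                [(1, "Asha", 10), (2, "Ben", 20), (3, "Chen", None)])
cur.executemany("INSERT INTO departments VALUES (?, ?)", [(10, "Data"), (30, "HR")])

# INNER JOIN: only rows with a match in both tables.
inner = cur.execute("""SELECT e.name, d.dept FROM employees e
                       INNER JOIN departments d ON e.dept_id = d.id
                       ORDER BY e.id""").fetchall()

# LEFT JOIN: every employee, with NULL where no department matches.
left = cur.execute("""SELECT e.name, d.dept FROM employees e
                      LEFT JOIN departments d ON e.dept_id = d.id
                      ORDER BY e.id""").fetchall()

print(inner)  # [('Asha', 'Data')]
print(left)   # [('Asha', 'Data'), ('Ben', None), ('Chen', None)]
```

RIGHT and FULL joins follow the same pattern from the other side; note that SQLite only added native RIGHT/FULL JOIN support in version 3.39.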
Q12. 3 LC mediums in 30 minutes
LC mediums are medium-difficulty coding problems on the LeetCode platform.
Solving 3 LC mediums in 30 minutes requires good problem-solving skills and efficient coding techniques.
Examples of LC mediums include 'Longest Substring Without Repeating Characters' and 'Container With Most Water'.
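For reference, the first example mentioned above, 'Longest Substring Without Repeating Characters', has a standard sliding-window solution along these lines:

```python
def length_of_longest_substring(s: str) -> int:
    """Sliding window: expand right; jump start past any repeated character."""
    last_seen = {}   # character -> index of its most recent occurrence
    best = start = 0
    for i, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1  # move window past the previous occurrence
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best

print(length_of_longest_substring("abcabcbb"))  # 3 ("abc")
```

It runs in O(n) time with a single pass, which is the kind of efficiency needed to fit three mediums into 30 minutes.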