I applied via LinkedIn and was interviewed in Mar 2024. There was 1 interview round.
Using R as a calculator to compute values for a Data Scientist interview question.
Use R's console to input mathematical expressions and compute values.
Make sure to follow the order of operations (PEMDAS) when entering expressions.
Use functions like 'sqrt()' for square roots and 'exp()' for the exponential function e^x; use the '^' operator for general exponentiation.
Remember to assign variables using the '<-' operator before using them in calculations.
Compute statistics of a given time value in R
Use the lubridate package to work with time data in R
Calculate summary statistics like mean, median, min, max, and standard deviation
Convert the time value to a time object before performing calculations
Using R to create two matrices and perform matrix multiplication.
Create two matrices using the matrix() function in R.
Use the %*% operator for matrix multiplication (the plain * operator multiplies element by element).
Ensure the dimensions of the matrices are compatible for multiplication.
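The compatibility rule can be illustrated outside R as well. The following is a minimal sketch in plain Python rather than R; the helper name matmul and the example matrices are made up for illustration. It multiplies a 2 x 3 matrix by a 3 x 2 matrix and rejects shapes where the column count of the left matrix differs from the row count of the right one.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # Compatibility rule: columns of the left matrix must equal rows of the right.
    if cols_a != rows_b:
        raise ValueError(
            f"incompatible shapes: {rows_a}x{cols_a} and {rows_b}x{cols_b}"
        )
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


# Example matrices (made-up values).
A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]           # 3 x 2

product = matmul(A, B)   # 2 x 2 result
```

Multiplying A by itself would raise a ValueError here, because a 2 x 3 matrix cannot be multiplied by another 2 x 3 matrix.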
The kNN classifier is run on the iris data to make predictions based on nearest neighbors.
kNN classifier is a type of supervised machine learning algorithm that can be used for classification tasks.
The output will be the predicted class labels for the iris data based on the nearest neighbors.
Interpreting the output involves understanding how the algorithm has classified the data points.
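To make the "nearest neighbors" idea concrete, here is a minimal kNN sketch in plain Python. It does not load the real iris data: the six training points are made-up (petal length, petal width) pairs chosen only to resemble two iris species, and the function name knn_predict is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up (petal length, petal width) values, NOT rows from the real iris data.
train_points = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
                (4.5, 1.5), (4.1, 1.3), (4.7, 1.4)]
train_labels = ["setosa", "setosa", "setosa",
                "versicolor", "versicolor", "versicolor"]

small_flower = knn_predict(train_points, train_labels, (1.6, 0.3))
large_flower = knn_predict(train_points, train_labels, (4.4, 1.4))
```

Each prediction is simply the most common label among the three closest training points, which is what the predicted class labels in the output represent.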
I applied via Walk-in and was interviewed in Mar 2021. There were 3 interview rounds.
I applied via LinkedIn and was interviewed in Sep 2020. There were 5 interview rounds.
VIF stands for Variance Inflation Factor, a measure of multicollinearity in regression analysis.
VIF is used to detect the presence of multicollinearity in regression analysis.
It measures how much the variance of the estimated regression coefficient is increased due to multicollinearity.
A VIF value of 1 indicates no multicollinearity, while a value greater than 1 suggests increasing levels of multicollinearity.
A commonl...
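As a rough illustration, the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The sketch below assumes the simplest case of exactly two predictors, where R_j^2 equals the squared Pearson correlation between them. The function names and the data are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up predictor values.
x1 = [1, 2, 3, 4, 5]
x2_uncorrelated = [2, 1, 3, 1, 2]          # zero correlation with x1
x2_collinear = [1.1, 2.0, 2.9, 4.2, 5.0]   # almost a copy of x1

vif_low = vif_two_predictors(x1, x2_uncorrelated)
vif_high = vif_two_predictors(x1, x2_collinear)
```

With more than two predictors, each R_j^2 would come from a full auxiliary regression instead of a single correlation.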
I applied via Walk-in and was interviewed in Mar 2020. There was 1 interview round.
R-squared (R²) is a statistical measure that represents the proportion of the variance in the dependent variable explained by the independent variables.
For a linear regression with an intercept, R-squared lies between 0 and 1, where 0 indicates that the independent variables do not explain any of the variance in the dependent variable, and 1 indicates that they explain all of it.
It is used to evaluate the goodness of fit of a regression model.
Adjusted R square t...
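The definition can be written out as 1 - SS_res / SS_tot. The sketch below computes that, along with the standard adjusted R-squared formula 1 - (1 - R^2)(n - 1) / (n - p - 1) for n observations and p predictors. The function names and the data are made up for illustration.

```python
def r_squared(actual, predicted):
    """Proportion of variance in actual explained by predicted."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Made-up observations and model predictions.
actual = [3, 5, 7, 9]
predicted = [2.8, 5.1, 7.2, 8.9]

r2 = r_squared(actual, predicted)
adj_r2 = adjusted_r_squared(r2, n_obs=len(actual), n_predictors=1)
```

Predicting the mean of the observations for every point gives an R-squared of 0, and a perfect fit gives 1.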
Variable reducing techniques are methods used to identify and select the most relevant variables in a dataset.
Variable reducing techniques help in reducing the number of variables in a dataset.
These techniques aim to identify the most important variables that contribute significantly to the outcome.
Some common variable reducing techniques include feature selection, dimensionality reduction, and correlation analysis.
Fea...
The Wald test is used in logistic regression to check the significance of the variable.
The Wald test calculates the ratio of the estimated coefficient to its standard error.
Under the null hypothesis that the coefficient is zero, this ratio is approximately standard normal, so its square follows a chi-square distribution with one degree of freedom.
A small p-value indicates that the variable is significant.
For example, in Python, the statsmodels library reports the Wald z-statistic and its p-value for each coefficient in the summary of a logistic regression model.
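The arithmetic behind the test is short enough to sketch by hand. The example below assumes a coefficient and standard error are already available from a fitted model; the function name wald_test and the input values are hypothetical. It uses the normal approximation to get a two-sided p-value.

```python
import math


def wald_test(coefficient, standard_error):
    """Return the Wald z-statistic, its chi-square form, and a two-sided p-value."""
    z = coefficient / standard_error
    chi_square = z ** 2  # chi-square with 1 degree of freedom under H0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, chi_square, p_value


# Hypothetical estimate and standard error.
z, chi_square, p_value = wald_test(coefficient=0.98, standard_error=0.5)
```

Here z is about 1.96, so the p-value lands very close to 0.05.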
Multicollinearity in logistic regression can be checked using correlation matrix and variance inflation factor (VIF).
Calculate the correlation matrix of the independent variables and check for high correlation coefficients.
Calculate the VIF for each independent variable and check for values greater than 5 or 10.
Consider removing one of the highly correlated variables or variables with high VIF to address multicollinear...
Bagging and boosting are ensemble methods used in machine learning to improve model performance.
Bagging involves training multiple models on different subsets of the training data and then combining their predictions through averaging or voting.
Boosting involves iteratively training models on the same dataset, with each subsequent model focusing on the samples that were misclassified by the previous model.
Bagging reduc...
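The bagging half of this comparison can be sketched in a few lines; boosting is not shown. The example below is an illustration under simple assumptions: a single numeric feature, binary labels, and a threshold "stump" as the base model. All function names and data are made up. Each stump is trained on a bootstrap sample and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Pick the threshold t so that 'x >= t means class 1' makes the fewest errors."""
    best_threshold, best_errors = None, None
    for threshold in sorted(set(xs)):
        errors = sum((x >= threshold) != bool(y) for x, y in zip(xs, ys))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(xs) indices with replacement.
        idx = [rng.randrange(len(xs)) for _ in xs]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def bagging_predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Made-up training data: small x values are class 0, large ones class 1.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

model = bagging_fit(xs, ys)
low_prediction = bagging_predict(model, 0.5)
high_prediction = bagging_predict(model, 9.5)
```

An odd number of models avoids tied votes in the binary case.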
Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.
It is a type of regression analysis used for predicting the outcome of a categorical dependent variable based on one or more predictor variables.
It uses a logistic function to model the probability of the dependent variable taking a particular value.
It is commo...
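The logistic function mentioned above maps any real number to a probability between 0 and 1. A minimal sketch follows, with a hypothetical intercept and coefficient rather than values from a fitted model.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(intercept, coefficients, features):
    """P(y = 1 | features) under a logistic regression model."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return sigmoid(linear)


# Hypothetical model: intercept -1.0 and one coefficient of 0.5.
probability = predict_probability(-1.0, [0.5], [2.0])
```

The linear part here is -1.0 + 0.5 * 2.0 = 0, and the logistic function maps 0 to a probability of 0.5.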
Gini coefficient measures the inequality among values of a frequency distribution.
Gini coefficient ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality.
It is commonly used to measure income inequality in a population.
A Gini coefficient of 0.4 or higher is considered to be a high level of inequality.
Gini coefficient can be calculated using the Lorenz curve, which plots the cumulati...
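One common way to compute the coefficient directly from data, equivalent to the Lorenz-curve construction, uses the sorted values and their ranks. The sketch below assumes non-negative values with a positive total; the function name and the sample data are made up.

```python
def gini_coefficient(values):
    """Gini coefficient of non-negative values using the sorted-rank formula."""
    sorted_values = sorted(values)
    n = len(sorted_values)
    total = sum(sorted_values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Ranks run from 1 (smallest value) to n (largest value).
    weighted_sum = sum(
        rank * value for rank, value in enumerate(sorted_values, start=1)
    )
    return 2 * weighted_sum / (n * total) - (n + 1) / n


# Made-up income samples.
equal_incomes = gini_coefficient([5, 5, 5, 5])
one_person_has_all = gini_coefficient([0, 0, 0, 1])
```

For a finite sample of size n the maximum is (n - 1) / n, so four people with one holding everything gives 0.75 rather than exactly 1.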
A chair is a piece of furniture used for sitting, while a cart is a vehicle used for transporting goods.
A chair typically has a backrest and armrests, while a cart does not.
A chair is designed for one person to sit on, while a cart can carry multiple items or people.
A chair is usually stationary, while a cart is mobile and can be pushed or pulled.
A chair is commonly found in homes, offices, and public spaces, while a c...
Outliers can be detected using statistical methods like box plots, z-score, and IQR. Treatment can be removal or transformation.
Use box plots to visualize outliers
Calculate z-scores and remove data points whose absolute z-score is greater than 3
Calculate the IQR and remove data points below Q1 - 1.5*IQR or above Q3 + 1.5*IQR
Transform data using log or square root to reduce the impact of outliers
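The z-score and IQR rules above can be sketched with Python's standard statistics module. The function names and the data are made up for illustration, and the thresholds 3 and 1.5 are the conventional defaults from the list above.

```python
import statistics


def zscore_outliers(data, threshold=3.0):
    """Values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population standard deviation
    return [x for x in data if abs(x - mean) / sd > threshold]


def iqr_outliers(data, k=1.5):
    """Values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


# Made-up sample: nineteen values near 10 and one extreme value.
data = [9, 10, 11] * 6 + [10, 100]

flagged_by_zscore = zscore_outliers(data)
flagged_by_iqr = iqr_outliers(data)
```

In very small samples the z-score rule can fail to flag anything, because a single extreme point inflates the standard deviation; the IQR rule is less sensitive to that.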
I applied via Campus Placement and was interviewed in Dec 2020. There were 3 interview rounds.
Accuracy is the closeness of a measured value to the true value. Precision is the consistency of repeated measurements.
Accuracy measures how close a measurement is to the true value
Precision measures the consistency of repeated measurements
Accuracy can be affected by systematic errors
Precision can be affected by random errors
Accuracy and precision are both important in scientific measurements
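A small numerical sketch can separate the two ideas: accuracy as the distance of the average measurement from the true value, and precision as the spread of repeated measurements. The function names and the readings are made up for illustration.

```python
import statistics


def accuracy_error(measurements, true_value):
    """Distance between the average measurement and the true value."""
    return abs(statistics.fmean(measurements) - true_value)


def precision_spread(measurements):
    """Sample standard deviation of repeated measurements."""
    return statistics.stdev(measurements)


true_value = 10.0
# Made-up readings: tightly grouped but shifted (a systematic error).
precise_but_biased = [12.1, 12.0, 11.9, 12.0]
# Made-up readings: centered on the true value but scattered (random error).
accurate_but_noisy = [8.0, 12.0, 9.0, 11.0]

biased_error = accuracy_error(precise_but_biased, true_value)
noisy_error = accuracy_error(accurate_but_noisy, true_value)
biased_spread = precision_spread(precise_but_biased)
noisy_spread = precision_spread(accurate_but_noisy)
```

The first set has the smaller spread but the larger error, and the second set is the reverse.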
I applied via Referral and was interviewed in Mar 2021. There were 4 interview rounds.
Data science is the field of extracting insights and knowledge from data using various techniques and tools.
Data science involves collecting, cleaning, and analyzing data to extract insights.
It uses various techniques such as machine learning, statistical modeling, and data visualization.
Data science is used in various fields such as finance, healthcare, and marketing.
Examples of data science applications include fraud...
Python and R are programming languages commonly used in data science and statistical analysis.
Python is a general-purpose language with a large community and many libraries for data manipulation and machine learning.
R is a language specifically designed for statistical computing and graphics, with a wide range of packages for data analysis and visualization.
Both languages are popular choices for data scientists and hav...
I applied via Approached by Company and was interviewed before Sep 2021. There were 3 interview rounds.
Explain dynamic programming with memoization
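One possible way to answer this is with the Fibonacci numbers, a standard textbook example and not necessarily the one expected in the interview. Memoization stores the result of each subproblem the first time it is solved, so the recursion does linear rather than exponential work. The function below is an illustrative sketch.

```python
def fib(n, memo=None):
    """n-th Fibonacci number, caching each subproblem in memo."""
    if memo is None:
        memo = {}
    if n < 2:
        return n
    if n not in memo:
        # Solve each subproblem once and store the result for reuse.
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]


tenth = fib(10)
fiftieth = fib(50)
```

Without the memo dictionary, fib(50) would recompute the same subproblems billions of times; with it, each value from 2 to 50 is computed once.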
Some of the top questions asked at the 360DigiTMG Data Scientist interview -