Deloitte
Genpact Interview Questions and Answers
Q1. How do you measure accuracy of document classification?
Accuracy of document classification can be measured using metrics like precision, recall, F1 score, and confusion matrix.
Precision measures the proportion of true positives among all predicted positives.
Recall measures the proportion of true positives among all actual positives.
F1 score is the harmonic mean of precision and recall.
Confusion matrix shows the number of true positives, true negatives, false positives, and false negatives.
Accuracy can also be measured using metri...read more
Q2. What is MLE in logistic regression?
MLE is a method used to estimate the parameters of a logistic regression model.
MLE stands for Maximum Likelihood Estimation
It is used to estimate the parameters of a logistic regression model
The goal is to find the values of the parameters that maximize the likelihood of observing the data
The likelihood function is the product of the probabilities of observing each data point given the model parameters
The optimization problem is solved using numerical methods such as gradient...read more
Q3. Similarity score of document and word
Similarity score measures the degree of similarity between two documents or words.
Similarity score can be calculated using various techniques such as cosine similarity, Jaccard similarity, and Euclidean distance.
Cosine similarity measures the cosine of the angle between two vectors, while Jaccard similarity measures the intersection over union of two sets.
Euclidean distance measures the distance between two points in a multidimensional space.
Similarity score can be useful in ...read more
Q4. Assumptions of logistic regression
Assumptions of logistic regression
Dependent variable is binary
Linear relationship between independent and log odds
No multicollinearity among independent variables
No outliers or influential observations
Large sample size
Q5. Explain logistic regression
Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.
It is used for classification problems where the dependent variable is binary
It estimates the probability of an event occurring based on the input variables
It uses a sigmoid function to map the input values to a probability score
It is a linear model that uses maximum likelihood estimation to find the coefficients
It ca...read more
Q6. Coding in Python using explode
Explode function in Python is used to split a string into a list of strings based on a specified delimiter.
Use the explode function from the pandas library to split a string column into multiple rows in a DataFrame.
Specify the delimiter parameter to define how the string should be split.
For example, df['column'].str.split('delimiter').explode() will split the strings in 'column' based on 'delimiter' and create a new row for each element.
Q7. MLOps vs DevOps
MLOps focuses on machine learning model deployment and management, while DevOps focuses on software development and IT operations.
MLOps is specifically tailored for machine learning models, ensuring they are deployed, monitored, and managed effectively.
DevOps is a broader practice that focuses on collaboration between development and operations teams to automate and streamline the software development process.
MLOps incorporates DevOps principles but adds specific consideratio...read more
More about working at Deloitte
Interview Process at Genpact
Top Senior Data Scientist Interview Questions from Similar Companies
Reviews
Interviews
Salaries
Users/Month