Decision Scientist

10+ Decision Scientist Interview Questions and Answers for Freshers

Updated 26 Oct 2024

Q1. How do you stay up to date with new analytical tools and techniques?

Ans.

I stay up to date with new analytical tools and techniques by attending workshops, online courses, reading research papers, and participating in industry conferences.

Attend workshops and training sessions on new tools and techniques
Take online courses and certifications to learn about the latest advancements
Read research papers and articles to stay informed about cutting-edge methods
Participate in industry conferences and networking events to exchange knowledge and ideas

Q2. When is a z-test used and when is a t-test used?

Ans.

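A z-test is used when the population standard deviation is known, typically with a large sample. A t-test is used when the population standard deviation is unknown and has to be estimated from the sample, which matters most when the sample is small.

A z-test is used for hypothesis testing when the population standard deviation is known; n > 30 is a common rule of thumb for a "large" sample.
A t-test is used when the population standard deviation is unknown and is estimated from the sample, particularly for small samples (n < 30). For large samples the t distribution is close to the normal distribution, so the two tests give nearly the same result.
Z test is used for comparing means of two populations when the population standard deviation is know...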
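
The difference between the two statistics can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the sample values, the hypothesised mean of 50, and the "known" sigma of 1.5 are made-up assumptions, not figures from the answer above.

```python
import math
from statistics import NormalDist, mean, stdev

def z_test(sample, mu0, sigma):
    """One-sample z-test: the population standard deviation `sigma` is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

def t_statistic(sample, mu0):
    """One-sample t statistic: sigma is unknown, so the sample standard deviation is used."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return t, n - 1  # statistic and degrees of freedom

# Hypothetical measurements, tested against a hypothesised mean of 50.
sample = [52.1, 49.8, 51.5, 50.9, 53.2, 48.7, 51.1, 52.4]
z, p = z_test(sample, mu0=50, sigma=1.5)
t, dof = t_statistic(sample, mu0=50)
print(f"z = {z:.3f}, p = {p:.4f}")
print(f"t = {t:.3f}, degrees of freedom = {dof}")
```

A p-value for the t statistic needs the Student t distribution with n - 1 degrees of freedom, which the Python standard library does not provide, so the sketch stops at the statistic.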

Q3. What is the chi-square test and when is it used?

Ans.

The chi-square test is a statistical test used to determine if there is a significant association between two categorical variables.

The chi-square test compares the observed frequencies in a contingency table with the frequencies expected if the two variables were independent.
It is commonly used in research to determine whether two categorical variables are related.
For example, it can be used to test if there is a significant difference in the distribution of a disease between two groups.
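
The comparison of observed and expected frequencies can be sketched as follows. The 2x2 counts are hypothetical, and the function computes the plain Pearson statistic without any continuity correction.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof

# Hypothetical counts: rows = group A / group B, columns = disease / no disease.
observed = [[30, 70],
            [15, 85]]
stat, dof = chi_square_statistic(observed)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

With 1 degree of freedom the 5% critical value is about 3.841, so a statistic above that would lead to rejecting independence at that level.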

Q4. Clustering project explanation and clustering metrics used.

Ans.

Utilized K-means clustering to group customers based on purchasing behavior. Evaluated clusters using silhouette score and inertia.

Used K-means clustering algorithm to group customers into segments
Evaluated the quality of clusters using silhouette score and inertia
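Silhouette score measures how similar an object is to its own cluster compared to other clusters; it ranges from -1 to 1, and higher is better
Inertia is the sum of squared distances from each point to its cluster centroid; lower values mean tighter clusters, but inertia always decreases as the number of clusters grows
Example: Clustered customers based on demographics and pur...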
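
Both metrics can be computed by hand on a small example. The sketch below assumes Euclidean distance, at least two clusters, and a made-up set of six 2-D points with labels already assigned; it is an illustration of the definitions, not the code used in the project described above.

```python
import math

def centroid(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def inertia(points, labels):
    """Sum of squared distances from each point to its own cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return sum(math.dist(point, centroid(members)) ** 2
               for members in clusters.values() for point in members)

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (assumes at least two clusters)."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Collect distances from p to every other point, grouped by cluster.
        dists = {}
        for j, (q, label) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(label, []).append(math.dist(p, q))
        if own not in dists:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(dists[own]) / len(dists[own])            # mean intra-cluster distance
        b = min(sum(d) / len(d)                          # mean distance to nearest other cluster
                for label, d in dists.items() if label != own)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
labels = [0, 0, 0, 1, 1, 1]
print(f"inertia = {inertia(points, labels):.3f}")
print(f"silhouette = {silhouette_score(points, labels):.3f}")
```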

Q5. Write SQL to find the month-wise cumulative sum of a value.

Ans.

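Use a SQL window function to calculate the month-wise cumulative sum.

Use the SUM() function with an OVER() clause to calculate the cumulative sum.
For a running total across months, aggregate the data by month first (GROUP BY month) and take the running total over the monthly totals. Add PARTITION BY month only if the running total should restart within each month.
Order the window by month or date (ORDER BY inside OVER()) to ensure the cumulative sum accumulates in the correct sequence.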
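
One way to try this out is with Python's built-in sqlite3 module. The table name `sales`, its columns, and the rows are hypothetical, and the sketch assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-05", 100), ("2024-01-20", 50),
     ("2024-02-11", 70), ("2024-03-02", 30), ("2024-03-28", 20)],
)

# Inner query: one total per month. Outer query: running total over those months.
query = """
SELECT month,
       monthly_total,
       SUM(monthly_total) OVER (ORDER BY month) AS cumulative_total
FROM (
    SELECT strftime('%Y-%m', sale_date) AS month,
           SUM(amount) AS monthly_total
    FROM sales
    GROUP BY strftime('%Y-%m', sale_date)
)
ORDER BY month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The date-to-month expression is SQLite-specific; other databases use their own functions for truncating a date to its month, while the SUM(...) OVER (ORDER BY ...) part is standard SQL.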

Q6. Past project explanation on predictive modeling

Ans.

Developed a predictive model to forecast customer churn using machine learning algorithms.

Collected and cleaned customer data from various sources
Performed feature engineering to create relevant predictors
Built and trained machine learning models such as logistic regression and random forest
Evaluated model performance using metrics like accuracy, precision, and recall
Implemented the model in a production environment for real-time predictions

Q7. What is binomial distribution?

Ans.

Binomial distribution is a probability distribution that describes the number of successes in a fixed number of independent trials.

Describes the number of successes in a fixed number of independent trials
Each trial has only two possible outcomes (success or failure)
The trials are independent and the probability of success is constant
Examples: Coin toss (success = heads), Pass/fail exams, Yes/no surveys
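
The probability of exactly k successes in n trials is C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch of that formula, using a fair coin tossed 10 times as the assumed example:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5  # assumed example: 10 tosses of a fair coin
print(binomial_pmf(3, n, p))  # 120 / 1024 = 0.1171875

# The probabilities over k = 0..n sum to 1, and the mean is n * p.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
expected_successes = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(total, expected_successes)
```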

Q8. What is central limit theorem?

Ans.

The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases.

Central Limit Theorem is a fundamental concept in statistics.
It states that, for independent samples from a population with finite variance, the sampling distribution of the sample mean is approximately normal when the sample size is large enough, regardless of the shape of the population distribution.
As the sample size increases, the sampling distribution of the sample mean gets closer to normal, and its standard deviation shrinks as sigma / sqrt(n).
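
A quick simulation makes the theorem concrete. The sketch below assumes a strongly skewed population, Exponential(1), which has mean 1 and standard deviation 1; the sample size of 50 and the 2000 repetitions are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, repetitions):
    """Means of `repetitions` independent samples of size n from Exponential(1)."""
    return [mean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(repetitions)]

means = sample_means(n=50, repetitions=2000)

# The sample means should centre on the population mean (1.0) with a spread
# close to sigma / sqrt(n) = 1 / sqrt(50), about 0.141.
print(f"mean of sample means  = {mean(means):.3f}")
print(f"stdev of sample means = {stdev(means):.3f}")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the underlying population is skewed.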

Q9. What is K-means clustering?

Ans.

K-means clustering is a popular unsupervised machine learning algorithm used for clustering data points into groups based on similarity.

Divides data points into K clusters based on similarity
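Minimizes the sum of squared distances between each point and the centroid of its cluster
Requires specifying the number of clusters (K) beforehand
Iteratively assigns data points to the nearest cluster centroid, then recomputes each centroid as the mean of its assigned points, until the assignments stop changing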
Commonly used in customer segmentation, image compression, and anomaly detection
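
The assign-and-update loop can be sketched in plain Python. This is a bare-bones version under stated assumptions: Euclidean distance, centroids initialised by picking K random data points, and a made-up set of six 2-D points. Library implementations add better initialisation and multiple restarts.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster's centroid in place
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Which group gets label 0 depends on the random initialisation, so only the grouping itself is meaningful.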

Q10. What is precision-recall AUC?

Ans.

Precision-Recall AUC is the area under the precision-recall curve, a metric used to evaluate the performance of classification models, particularly on imbalanced datasets.

Precision-Recall AUC focuses on the trade-off between precision and recall for different threshold values.
It is particularly useful when dealing with imbalanced datasets where the positive class is rare.
A higher Precision-Recall AUC indicates better model performance in terms of precision and recall.
It is often used in conjunction with the ROC AUC me...
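
One common way to summarise the precision-recall curve is average precision, which weights the precision at each threshold by the increase in recall. The sketch below uses that summary as a stand-in for the area under the curve; the labels and scores are made up, and tied scores are assumed not to occur.

```python
def precision_recall_points(y_true, scores):
    """(precision, recall) after each prediction, going from highest score to lowest."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(y_true)
    tp = fp = 0
    points = []
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_positives))
    return points

def average_precision(y_true, scores):
    """Sum of precision values weighted by the step in recall at each threshold."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall in precision_recall_points(y_true, scores):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Hypothetical imbalanced example: 3 positives among 8 items.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
ap = average_precision(y_true, scores)
print(f"average precision = {ap:.4f}")
```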

Q11. What is cross-validation?

Ans.

Cross validation is a technique used to evaluate the performance of a machine learning model by testing it on multiple subsets of the data.

It involves dividing the data into multiple subsets or folds.
In k-fold cross-validation, the model is trained on k - 1 folds and tested on the one fold that was held out.
This process is repeated so that each fold serves as the test set once, and the results are averaged to get a final performance metric.
It helps to detect overfitting and gives a more reliable estimate of the model's performance on unseen data than a single train/test split.
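
The fold-by-fold procedure can be sketched without any libraries. To keep it short, the "model" here is an assumed baseline that always predicts the mean of the training targets, and the target values are made up; in practice the data is usually shuffled before it is split.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(y, k):
    """Average held-out MSE of a baseline that predicts the training mean."""
    fold_errors = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        prediction = sum(train) / len(train)  # "fit" on the other k - 1 folds
        mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k, fold_errors

# Hypothetical target values.
y = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0, 7.0, 9.0]
average_mse, fold_errors = cross_validate(y, k=5)
print(f"per-fold MSE: {[round(e, 3) for e in fold_errors]}")
print(f"average MSE:  {average_mse:.3f}")
```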

Q12. What is a confusion matrix?

Ans.

A confusion matrix is a table used to evaluate the performance of a classification model.

It shows the number of true positives, true negatives, false positives, and false negatives.
It helps in calculating various evaluation metrics like accuracy, precision, recall, and F1 score.
It is useful in comparing the performance of different models.
Example: A confusion matrix for a binary classification problem can be represented as follows: | | Predicted Positive | Predicted Negative ...
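
The four counts and the metrics derived from them can be sketched as follows. The true and predicted labels are made up, with 1 as the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)

print("                 Predicted Positive  Predicted Negative")
print(f"Actual Positive  {tp:>18}  {fn:>18}")
print(f"Actual Negative  {fp:>18}  {tn:>18}")

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```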

Q13. Types of Joins in SQL

Ans.

Types of joins in SQL include inner join, left join, right join, and full outer join.

Inner join: Returns only the rows that have a match in both tables
Left join: Returns all rows from the left table and the matched rows from the right table, with NULLs where there is no match
Right join: Returns all rows from the right table and the matched rows from the left table, with NULLs where there is no match
Full outer join: Returns all rows from both tables, matched where possible and filled with NULLs where there is no match
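
The four join types differ only in which keys they keep. The sketch below mimics them in plain Python rather than SQL, under the simplifying assumption that each key appears at most once per table; the `customers` and `orders` tables are hypothetical, and `None` stands in for NULL.

```python
customers = {1: "Asha", 2: "Ben", 3: "Chen"}      # left table: customer id -> name
orders = {2: "laptop", 3: "phone", 4: "tablet"}   # right table: customer id -> item

def join(left, right, how):
    """Return (key, left_value, right_value) rows for the requested join type."""
    if how == "inner":
        keys = left.keys() & right.keys()   # keys present in both tables
    elif how == "left":
        keys = left.keys()                  # every key from the left table
    elif how == "right":
        keys = right.keys()                 # every key from the right table
    elif how == "full":
        keys = left.keys() | right.keys()   # every key from either table
    else:
        raise ValueError(f"unknown join type: {how}")
    return [(k, left.get(k), right.get(k)) for k in sorted(keys)]

for how in ("inner", "left", "right", "full"):
    print(how, join(customers, orders, how))
```

Real SQL joins also handle duplicate keys, producing one output row per matching pair, which this sketch leaves out.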