EXL Service
I appeared for an interview in Apr 2025, where I was asked the following questions.
Data scientist with a strong background in statistics, machine learning, and data visualization, passionate about solving complex problems.
Educational Background: Master's degree in Data Science from XYZ University.
Technical Skills: Proficient in Python, R, SQL, and machine learning libraries like TensorFlow and Scikit-learn.
Professional Experience: Worked at ABC Corp, where I developed predictive models that improved ...
LangChain enhances RAG projects by streamlining data retrieval and processing for improved AI model performance.
Facilitates integration of various data sources, such as APIs and databases, for seamless information retrieval.
Enables efficient document processing and indexing, allowing for quick access to relevant data.
Supports the creation of custom pipelines for data transformation, enhancing the quality of input for A...
LangGraph is a library from the LangChain ecosystem for building stateful, multi-step LLM agents and workflows as graphs.
Each node is a processing step (for example, an LLM call or a tool call), and edges define how control and shared state flow between nodes.
Unlike a linear chain, a graph can contain branches and cycles, which suits agent loops that decide, act, and re-evaluate.
Example: Using LangGraph to build a chatbot that can answer customer queries in real-time.
The library integrates with machine learning...
I have utilized various models like CNNs, transfer learning, and segmentation techniques for diverse computer vision projects.
Convolutional Neural Networks (CNNs) for image classification tasks, e.g., classifying images of animals.
Transfer learning with pre-trained models like VGG16 and ResNet for fine-tuning on specific datasets, such as facial recognition.
Object detection using YOLO (You Only Look Once) for real-time...
Deployment of ML/DL models involves integrating them into production systems for real-time predictions and decision-making.
Model Serving: Use frameworks like TensorFlow Serving or TorchServe to expose models as APIs.
Containerization: Deploy models using Docker to ensure consistency across environments.
Monitoring: Implement tools like Prometheus or Grafana to track model performance and data drift.
Scaling: Use cloud ser...
I'm seeking new challenges to grow my skills and contribute to innovative projects in a dynamic environment.
Desire for professional growth: I'm eager to expand my expertise in machine learning and data analysis.
Interest in innovative projects: I want to work on cutting-edge technologies, such as AI and big data solutions.
Cultural fit: I'm looking for a collaborative environment that values creativity and teamwork.
Caree...
Fine-tuning adjusts pre-trained models to improve performance on specific tasks or datasets.
Enhances model accuracy by adapting to specific data distributions.
Can reduce the risk of overfitting on small datasets compared with training from scratch, because the model starts from pre-trained weights.
Saves time and resources compared to training from scratch.
Example: Fine-tuning a language model for sentiment analysis on product reviews.
Allows for transfer learning, where knowledge from one d...
LSTM (Long Short-Term Memory) is a type of recurrent neural network designed to learn long-term dependencies in sequential data.
LSTMs are used in natural language processing for tasks like language translation and sentiment analysis.
They can remember information for long periods, making them suitable for time series forecasting.
LSTMs mitigate the vanishing gradient problem common in traditional RNNs.
An example applicat...
Normal LSTM processes data in one direction, while bi-directional LSTM processes data in both forward and backward directions.
Normal LSTM reads input sequences from start to end, capturing temporal dependencies in one direction.
Bi-directional LSTM consists of two LSTMs: one processes the input sequence forward, and the other processes it backward.
This dual processing allows bi-directional LSTMs to capture context from ...
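The forward/backward idea can be illustrated with a toy sketch. It uses a plain tanh recurrent cell as a stand-in for a real LSTM cell, and the names `run_rnn`, `run_bidirectional`, `w_in`, and `w_rec` are hypothetical, chosen only for this illustration.

```python
import math

def run_rnn(sequence, w_in=0.5, w_rec=0.5):
    """Run a toy tanh recurrent cell (not a real LSTM) over the sequence."""
    h, states = 0.0, []
    for x in sequence:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def run_bidirectional(sequence):
    """Pair a forward pass with a backward pass, aligned per time step."""
    forward = run_rnn(sequence)
    # Process the reversed sequence, then flip the states back into place.
    backward = run_rnn(sequence[::-1])[::-1]
    return list(zip(forward, backward))

states = run_bidirectional([0.0, 0.0, 1.0])
print(states[0])  # forward state has seen nothing yet; backward state has seen the future input
```

At the first time step the forward state is still zero, while the backward state already carries information from the final input, which is the extra context a bi-directional model provides.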
Train a sentence classification model using labeled data, feature extraction, and machine learning algorithms.
Collect a labeled dataset of sentences with corresponding categories (e.g., positive, negative, neutral).
Preprocess the text data: tokenize, remove stop words, and apply stemming or lemmatization.
Convert sentences into numerical features using techniques like TF-IDF or word embeddings (e.g., Word2Vec, GloVe).
Sp...
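The pipeline above can be sketched with the standard library alone. This is a minimal illustration, not the answer's exact method: it swaps in bag-of-words counts with a multinomial Naive Bayes classifier instead of TF-IDF or embeddings, skips stemming, and uses a tiny hypothetical stop-word list and training set.

```python
import math
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "was", "this", "it", "and"}

def tokenize(sentence):
    """Lowercase, strip basic punctuation, and drop stop words."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return [w for w in words if w and w not in STOP_WORDS]

def train_naive_bayes(sentences, labels):
    """Count words per class; these counts are the whole model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for sentence, label in zip(sentences, labels):
        tokens = tokenize(sentence)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict(model, sentence):
    """Return the class with the highest log-probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(sentence):
            # Laplace (add-one) smoothing avoids log(0) for unseen words.
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(
    ["I love this product", "Great quality and fast delivery",
     "Terrible service", "I hate the delay"],
    ["positive", "positive", "negative", "negative"],
)
print(predict(model, "great product"))
```

In practice a held-out test split and a library implementation would replace this hand-rolled version.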
Create a dictionary from a list where keys are unique numbers and values are their counts.
Use Python's built-in collections module, specifically Counter, to simplify counting occurrences.
Example: For the list [1, 2, 2, 3], the output should be {1: 1, 2: 2, 3: 1}.
Alternatively, use a loop to iterate through the list and build the dictionary manually.
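Both approaches can be sketched as follows; the function names `count_with_counter` and `count_with_loop` are hypothetical.

```python
from collections import Counter

def count_with_counter(numbers):
    """Count occurrences using collections.Counter."""
    return dict(Counter(numbers))

def count_with_loop(numbers):
    """Build the same dictionary manually with a plain loop."""
    counts = {}
    for n in numbers:
        counts[n] = counts.get(n, 0) + 1
    return counts

print(count_with_counter([1, 2, 2, 3]))  # {1: 1, 2: 2, 3: 1}
```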
Imbalanced datasets can skew model performance; various techniques can help mitigate this issue.
Resampling techniques: Use oversampling (e.g., SMOTE) or undersampling to balance classes.
Use different evaluation metrics: Focus on precision, recall, and F1-score instead of accuracy.
Implement cost-sensitive learning: Assign higher misclassification costs to minority class instances.
Try ensemble methods: Techniques like Ra...
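As a rough illustration of resampling, the sketch below performs plain random oversampling (duplicating minority rows), which is simpler than SMOTE and is not a substitute for it. The function name `random_oversample` and the `seed` parameter are assumptions made for this example.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes match the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]
new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))
```

Resampling should be applied to the training split only, so that duplicated rows do not leak into validation or test data.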
I'm seeking a competitive salary, flexible location, and a notice period of two weeks.
Salary expectations: Based on market research, I expect a salary in the range of $90,000 to $120,000, depending on the role and responsibilities.
Preferred location: I am open to remote work but would prefer a hybrid model with occasional office visits in New York or San Francisco.
Notice period: I am currently employed and would need t...
Implement a sorting algorithm in Python to sort a list of numbers without using the built-in sort functions.
Use the Bubble Sort algorithm: repeatedly swap adjacent elements if they are in the wrong order.
Example: For the list [5, 2, 9, 1], after one pass it becomes [2, 5, 1, 9].
Consider using the Selection Sort algorithm: find the minimum element and swap it with the first unsorted element.
Example: For the list [64, 25, 12, ...
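A minimal sketch of both algorithms, avoiding `sorted()` and `list.sort()`; each function works on a copy so the input list is left unchanged.

```python
def bubble_sort(values):
    """Repeatedly swap adjacent out-of-order elements until no swaps remain."""
    items = list(values)
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # After pass i, the last i elements are already in place.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break  # Already sorted; stop early.
    return items

def selection_sort(values):
    """Move the minimum of the unsorted suffix to the front, one position at a time."""
    items = list(values)
    for i in range(len(items) - 1):
        min_idx = i
        for j in range(i + 1, len(items)):
            if items[j] < items[min_idx]:
                min_idx = j
        items[i], items[min_idx] = items[min_idx], items[i]
    return items

print(bubble_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```

Both run in O(n^2) time in the worst case, which is usually the follow-up question in an interview.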
I applied via Company Website and was interviewed in Jun 2024. There were 2 interview rounds.
Basic aptitude, tech aptitude
Overfitting occurs when a model learns noise instead of the underlying pattern, leading to poor generalization on new data.
Overfitting happens when a model is too complex, capturing noise in the training data.
Example: A polynomial regression model with a high degree may fit training data perfectly but perform poorly on test data.
Signs of overfitting include a large gap between training and validation accuracy.
Technique...
Handling missing values is crucial for accurate data analysis and model performance.
1. Remove missing values: If the dataset is large, consider dropping rows or columns with missing values. Example: df.dropna() in pandas.
2. Imputation: Replace missing values with statistical measures like mean, median, or mode. Example: df.fillna(df.mean()) for mean imputation.
3. Predictive modeling: Use algorithms to predict and fill ...
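The first two strategies can be sketched without pandas. The sketch assumes rows are dictionaries and missing values are represented as `None`; `drop_missing` and `impute_mean` are hypothetical helper names that mirror what `df.dropna()` and `df.fillna(df.mean())` do for a single column.

```python
from statistics import mean

def drop_missing(rows):
    """Keep only rows that have no missing (None) values."""
    return [row for row in rows if None not in row.values()]

def impute_mean(rows, column):
    """Replace missing values in one column with the mean of the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = mean(observed)
    return [{**row, column: fill if row[column] is None else row[column]}
            for row in rows]

rows = [{"age": 20}, {"age": None}, {"age": 40}]
print(drop_missing(rows))        # [{'age': 20}, {'age': 40}]
print(impute_mean(rows, "age"))  # the missing age becomes 30
```

To avoid leakage, the fill value should be computed on the training data and then reused for validation and test data.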
I appeared for an interview in Nov 2024.
I applied via Naukri.com and was interviewed in Feb 2024. There was 1 interview round.
Handling imbalanced datasets involves techniques like resampling, using different algorithms, and adjusting class weights.
Use resampling techniques like oversampling the minority class or undersampling the majority class.
Utilize algorithms that tend to cope better with imbalanced data, such as tree ensembles (Random Forest, XGBoost); models like SVM usually need class weighting to handle imbalance well.
Adjust class weights in the model to give more importance to the minority class.
Use techniques like SMO...
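One common way to set class weights is the "balanced" heuristic, which weights each class by `n_samples / (n_classes * class_count)`; this is the same formula scikit-learn uses for `class_weight="balanced"`. The helper name `balanced_class_weights` below is hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

labels = [0] * 90 + [1] * 10
print(balanced_class_weights(labels))  # the minority class gets weight 5.0
```

These weights are typically passed to the model's loss so that errors on the minority class cost more.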
Statistics-based questions
Basic case study on any topic
I appeared for an interview in Nov 2022.
Bagging and boosting are ensemble learning techniques used to improve model performance.
Bagging involves training multiple models on different subsets of the training data and combining their predictions through averaging or voting.
Boosting involves iteratively training models on the same data, with each subsequent model focusing on the errors of the previous model.
Bagging reduces overfitting and variance, while boosti...
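The bagging half can be illustrated with a small sketch: one-dimensional decision stumps trained on bootstrap samples and combined by majority vote. The stump rule (predict 1 when `x > t`), the function names, and the toy data are all assumptions made for this illustration; boosting is not shown.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Pick the threshold t with the fewest errors for the rule: predict 1 if x > t."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((1 if x > t else 0) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def predict_bagging(thresholds, x):
    """Combine the stumps' predictions by majority vote."""
    votes = Counter(1 if x > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]

xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = fit_bagging(xs, ys)
print(predict_bagging(ensemble, 0), predict_bagging(ensemble, 10))
```

Because each stump sees a different bootstrap sample, their thresholds differ slightly, and voting averages out that variance.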
One way to measure model effectiveness without using confusion matrix metrics is by using area under the receiver operating characteristic curve (AUC-ROC).
Calculate the AUC-ROC score to evaluate the model's ability to distinguish between positive and negative classes.
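AUC-ROC considers the entire range of classification thresholds. It does not depend directly on the class ratio, but it can look optimistic on heavily imbalanced data, where precision-recall AUC is often more informative.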
Higher AUC-ROC score indicates better model performa...
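AUC-ROC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition; the function name `roc_auc` is hypothetical, and the pairwise loop is quadratic, so it is meant for illustration rather than large datasets.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # Ties count as half a correct ranking.
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation.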
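"Blue score" is not a standard metric in regression analysis.
The interviewer may have meant BLUE (Best Linear Unbiased Estimator): by the Gauss-Markov theorem, the ordinary least squares estimator is BLUE when the errors have zero mean, constant variance, and are uncorrelated.
It is also possible that the interviewer meant another metric, such as R-squared or mean squared error, or the BLEU score used to evaluate machine translation.
Without further context, it is difficult to provide a more specific answer.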
I was approached by the company and was interviewed before Sep 2023. There was 1 interview round.
Linear regression is used for continuous variables, while logistic regression is used for binary classification.
Linear regression is used to predict continuous values, such as predicting house prices based on square footage.
Logistic regression is used for binary classification, such as predicting whether an email is spam or not.
Linear regression assumes a linear relationship between the independent and dependent variab...
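The difference in outputs can be shown with a small sketch: both models compute the same linear score, but logistic regression passes it through a sigmoid to obtain a probability between 0 and 1. The function names and the fixed `weight` and `bias` values are illustrative assumptions; no fitting is performed here.

```python
import math

def linear_predict(x, weight, bias):
    """Linear regression output: an unbounded continuous value."""
    return weight * x + bias

def logistic_predict(x, weight, bias):
    """Logistic regression output: the linear score squashed into (0, 1)."""
    z = weight * x + bias
    return 1.0 / (1.0 + math.exp(-z))

print(linear_predict(2, weight=3, bias=1))    # 7
print(logistic_predict(0, weight=1, bias=0))  # 0.5
```

A class label is then obtained by thresholding the probability, commonly at 0.5.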
Cross entropy is a general term for loss functions used in classification tasks, while binary cross entropy is specifically used for binary classification tasks.
Cross entropy is a measure of the difference between two probability distributions, often used in multi-class classification tasks.
Binary cross entropy is a specific form of cross entropy used for binary classification tasks, where the output is either 0 or 1.
C...
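The relationship can be checked numerically: binary cross entropy is the general formula applied to a two-class distribution. The sketch below uses hypothetical function names and clips probabilities with a small `eps` (an assumption) to avoid taking the log of zero.

```python
import math

def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """General cross entropy between a true and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_dist, pred_dist))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Cross entropy for one binary label, given the predicted P(y = 1)."""
    p = min(max(p_pred, eps), 1 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A positive example predicted with probability 0.8:
general = cross_entropy([0, 1], [0.2, 0.8])
binary = binary_cross_entropy(1, 0.8)
print(math.isclose(general, binary))  # True
```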
I applied via Naukri.com and was interviewed before Jun 2023. There were 2 interview rounds.
I applied via Recruitment Consultant and was interviewed in Jul 2021. There were 3 interview rounds.
Predicting insurance claims using machine learning algorithms.
Fraud detection in insurance claims
Risk assessment for insurance policies
Pricing optimization for insurance products
Customer segmentation for targeted marketing
Predictive maintenance for insurance assets
Senior Executive (4.5k salaries): ₹2.5 L/yr - ₹7.5 L/yr
Assistant Manager (3.3k salaries): ₹6 L/yr - ₹14 L/yr
Senior Associate (2.6k salaries): ₹2.3 L/yr - ₹7 L/yr
Lead Assistant Manager (2k salaries): ₹10.6 L/yr - ₹18.1 L/yr
Executive (1.9k salaries): ₹1.7 L/yr - ₹5 L/yr