Genpact
Interview Questions and Answers
Q1. How do you extract the most discussed topics on Twitter?
Use the Twitter API to collect tweets, then apply text analysis to identify the most discussed topics.
Access the Twitter API to retrieve tweets
Perform text analysis using NLP techniques such as TF-IDF (keyword weighting) or LDA (topic modeling)
Identify keywords or hashtags with highest frequency to determine top discussed topics
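The frequency-counting step can be sketched as follows. This is a minimal illustration that assumes the tweets have already been retrieved; `SAMPLE_TWEETS` and `top_hashtags` are hypothetical names, and no Twitter API call is made. It counts hashtags only, which is simpler than TF-IDF or LDA.

```python
import re
from collections import Counter

# Hypothetical sample tweets; in practice these would come from the Twitter API.
SAMPLE_TWEETS = [
    "Loving the new #Python release! #coding",
    "#python tips for data science #DataScience",
    "Big match tonight #football",
    "Learning #Python and #datascience together",
]

HASHTAG_PATTERN = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=3):
    """Return the n most frequent hashtags (case-insensitive) with their counts."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase so that #Python and #python count as the same topic.
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(tweet))
    return counts.most_common(n)


top_topics = top_hashtags(SAMPLE_TWEETS, n=2)
print(top_topics)
```

For free-text topics rather than hashtags, the same counting idea would be replaced by TF-IDF weights or an LDA topic model.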
Q2. Which models do you use for sentiment analysis or summarisation?
I use models such as LSTMs, BERT, and encoder-decoder Transformers for sentiment analysis and summarization.
LSTM (Long Short-Term Memory) for sequence classification tasks like sentiment analysis
BERT (Bidirectional Encoder Representations from Transformers), itself a Transformer encoder, for contextual word embeddings
Encoder-decoder Transformer models for attention-based sequence-to-sequence tasks like summarization
Q3. Screen-share coding: manipulate a DataFrame using pandas
A live coding exercise over screen share in which you manipulate DataFrames with pandas.
Use pandas library in Python for data manipulation
Share screen to demonstrate coding techniques
Use functions like merge, groupby, and apply for data manipulation
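A short sketch of the three functions named above. The `orders` and `customers` tables are made-up example data, not the data used in the interview.

```python
import pandas as pd

# Hypothetical example data.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, 30],
    "amount": [100.0, 50.0, 200.0, 75.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 30],
    "region": ["North", "South", "North"],
})

# merge: attach each order's region via the shared customer_id key.
merged = orders.merge(customers, on="customer_id", how="left")

# groupby: total order amount per region.
region_totals = merged.groupby("region")["amount"].sum()

# apply: derive a label for each row from its amount.
merged["size"] = merged["amount"].apply(lambda x: "large" if x >= 100 else "small")

print(region_totals)
print(merged)
```

For a simple threshold like this, a vectorized comparison is usually faster than `apply`; `apply` is shown here only because the answer mentions it.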
Q4. What are supervised and unsupervised learning?
Supervised learning uses labeled data to train a model, while unsupervised learning uses unlabeled data.
Supervised learning requires labeled data for training
Unsupervised learning does not require labeled data
Examples of supervised learning include classification and regression
Examples of unsupervised learning include clustering and dimensionality reduction
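The contrast can be illustrated with a small standard-library sketch on made-up one-dimensional data. The supervised part is a nearest-centroid classifier that needs the labels; the unsupervised part is a two-cluster k-means that never sees them. All names and values here are hypothetical, and a real project would more likely use a library such as scikit-learn.

```python
from statistics import mean

# Hypothetical 1-D measurements and their labels.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels = ["small", "small", "small", "large", "large", "large"]


def fit_centroids(xs, ys):
    """Supervised: use the labels to compute one centroid per class."""
    classes = sorted(set(ys))
    return {c: mean([x for x, y in zip(xs, ys) if y == c]) for c in classes}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: abs(centroids[c] - x))


def two_means(xs, iterations=10):
    """Unsupervised: 2-cluster k-means on 1-D data; no labels are used."""
    centers = [min(xs), max(xs)]  # simple deterministic initialisation
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Keep the old center if a cluster happens to be empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


centroids = fit_centroids(values, labels)   # learns from labeled data
centers, clusters = two_means(values)       # finds structure without labels
print(predict(centroids, 4.0))
print(clusters)
```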
Q5. Explain the NLP project life cycle for sentiment analysis in detail
The NLP project life cycle for sentiment analysis involves data collection, preprocessing, model training, evaluation, and deployment.
Data collection: Gather text data from various sources like social media, reviews, or surveys.
Data preprocessing: Clean and preprocess the text data by removing stopwords, punctuation, and special characters.
Model training: Use machine learning or deep learning algorithms to train a sentiment analysis model on the preprocessed data.
Evaluation: Eval...
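As an illustration of the preprocessing stage described above, here is a minimal standard-library sketch. The `STOPWORDS` set is a tiny hypothetical list and `preprocess` is an illustrative helper; production pipelines usually rely on a fuller stopword list and a proper tokenizer.

```python
# A tiny illustrative stopword list; real projects typically use a fuller one.
STOPWORDS = {"the", "is", "a", "an", "and", "this", "it", "was", "to", "of"}


def preprocess(text):
    """Lowercase, strip punctuation and special characters, drop stopwords."""
    lowered = text.lower()
    # Replace every character that is not alphanumeric or whitespace with a space.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in lowered)
    return [token for token in cleaned.split() if token not in STOPWORDS]


tokens = preprocess("This movie was GREAT!!! The plot is a bit slow, though :(")
print(tokens)
```

Note that stripping punctuation also discards emoticons such as ":(", which can carry sentiment; whether to keep them is a design choice for the specific project.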
Q6. Explain the p-value and how it differs from probability
The p-value is the probability, computed under the assumption that the null hypothesis is true, of observing a result at least as extreme as the one actually observed; probability in general is the likelihood of any event occurring.
The p-value is used in hypothesis testing to judge the statistical significance of results
Probability is a measure of the likelihood of an event occurring
The p-value ranges from 0 to 1, with lower values indicating stronger evidence against the null hypothesis
Probability also ranges from 0 to 1, with 0 indicating impossibility and 1 ind...
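A worked example may make the distinction concrete. The sketch below assumes a hypothetical experiment, 9 heads in 10 coin flips, and a null hypothesis that the coin is fair. It computes an ordinary probability (exactly 9 heads) and a one-sided p-value (9 or more heads, given the null).

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_sided_p_value(observed_heads, n_flips, p_null=0.5):
    """P(heads >= observed_heads), assuming the null hypothesis is true."""
    return sum(
        binomial_pmf(k, n_flips, p_null)
        for k in range(observed_heads, n_flips + 1)
    )


# Ordinary probability: chance of exactly 9 heads in 10 fair flips (10/1024).
prob_exactly_9 = binomial_pmf(9, 10)

# p-value: chance of 9 or more heads in 10 flips if the coin is fair (11/1024).
p_value = one_sided_p_value(9, 10)

print(prob_exactly_9, p_value)
```

The p-value is itself a probability, but a conditional one about the data given the null hypothesis; it is not the probability that the null hypothesis is true.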
Q7. What is hypothesis testing?
Hypothesis testing is a statistical method used to make inferences about a population based on sample data.
It involves formulating a hypothesis about a population parameter, collecting data, and using a statistical test to decide whether the data provide enough evidence to reject the null hypothesis.
There are two types of hypotheses: null hypothesis (H0) and alternative hypothesis (H1).
Common statistical tests for hypothesis testing include t-tests, chi-square tests, and ANOVA.
Example: Testing if ...
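The procedure can be sketched end to end with an exact permutation test on the difference in means. This is not one of the tests listed above; it is swapped in here because it needs only the standard library. The two groups are made-up measurements, and the null hypothesis is that group membership has no effect on the values.

```python
from itertools import combinations
from statistics import mean

# Hypothetical measurements for two groups.
group_a = [12.1, 11.8, 12.5, 12.9]
group_b = [10.2, 10.9, 11.1, 10.5]


def permutation_p_value(a, b):
    """Two-sided exact permutation test on the absolute difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling the pooled values into two groups.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        new_a = [pooled[i] for i in chosen]
        new_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = abs(mean(new_a) - mean(new_b))
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


p_value = permutation_p_value(group_a, group_b)
alpha = 0.05  # conventional significance level, chosen before looking at the data
reject_null = p_value < alpha
print(p_value, reject_null)
```

Because every value in `group_a` exceeds every value in `group_b`, only 2 of the 70 possible relabelings are as extreme as the observed split, so the p-value is 2/70 and the null hypothesis is rejected at the 5% level.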
Q8. What is a null hypothesis?
The null hypothesis is a statement that there is no effect, difference, or relationship between the variables being studied.
Null hypothesis is typically denoted as H0 in statistical hypothesis testing.
It is the default assumption that there is no effect or relationship.
The alternative hypothesis (Ha, also written H1) is the complement of the null hypothesis: it states that an effect or relationship does exist.
For example, in a study testing a new drug, the null hypothesis would be that the drug has no effect on patients.
The null hypothesis is tested a...