JPMorgan Chase & Co.
I was asked Python, SQL, and coding questions.
Case study: how would you estimate the total footfall at an airport?
I applied via Campus Placement
Developed machine learning models to predict customer churn and optimize marketing campaigns.
Built predictive models using Python and scikit-learn
Utilized SQL to extract and manipulate data for analysis
Collaborated with cross-functional teams to implement data-driven solutions
I was approached by the company and was interviewed before Sep 2021. There were 3 interview rounds.
I applied via Walk-in and was interviewed in Mar 2020. There was 1 interview round.
R-squared is a statistical measure that represents the proportion of the variance in the dependent variable explained by the independent variables.
For a least-squares fit with an intercept, R-squared lies between 0 and 1, where 0 indicates that the independent variables explain none of the variance in the dependent variable, and 1 indicates that they explain all of it.
It is used to evaluate the goodness of fit of a regression model.
Adjusted R square t...
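As a minimal sketch of the definitions above, R-squared can be computed from the residual and total sums of squares, and adjusted R-squared from the standard textbook correction for the number of predictors. The function names and the sample values are hypothetical and not part of the original answer.

```python
def r_squared(y_true, y_pred):
    """Proportion of variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical observed values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.8, 5.1, 7.2, 8.9, 11.0]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_samples=len(y_true), n_predictors=1)
print(round(r2, 4), round(adj_r2, 4))
```

Adjusted R-squared is never larger than R-squared, which is why it is often preferred when comparing models with different numbers of predictors.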
Variable reducing techniques are methods used to identify and select the most relevant variables in a dataset.
Variable reducing techniques help in reducing the number of variables in a dataset.
These techniques aim to identify the most important variables that contribute significantly to the outcome.
Some common variable reducing techniques include feature selection, dimensionality reduction, and correlation analysis.
Fea...
The Wald test is used in logistic regression to check whether a variable's coefficient is significantly different from zero.
The Wald statistic is the ratio of the estimated coefficient to its standard error.
Under the null hypothesis this ratio is approximately standard normal, so its square follows a chi-square distribution with one degree of freedom.
A small p-value indicates that the variable is significant.
For example, in Python, the summary of a statsmodels logistic regression model reports a z-statistic and p-value for each coefficient, which correspond to the Wald test.
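The calculation itself is short enough to sketch by hand. The snippet below assumes a coefficient and standard error are already available from a fitted model; the numbers used are hypothetical, and `wald_test` is an illustrative helper, not a library function.

```python
import math


def wald_test(coef, std_err):
    """Return the Wald z-statistic, its chi-square form, and the two-sided p-value."""
    z = coef / std_err
    chi2 = z ** 2  # chi-square with 1 degree of freedom under the null
    # Two-sided p-value from the standard normal; identical to the
    # chi-square(1) upper-tail probability of z squared.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, chi2, p_value


# Hypothetical coefficient and standard error from a fitted model.
z, chi2, p = wald_test(coef=0.85, std_err=0.30)
print(round(z, 3), round(chi2, 3), round(p, 4))
```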
Multicollinearity in logistic regression can be checked using a correlation matrix and the variance inflation factor (VIF).
Calculate the correlation matrix of the independent variables and check for high correlation coefficients.
Calculate the VIF for each independent variable and check for values above a rule-of-thumb threshold, commonly 5 or 10.
Consider removing one of the highly correlated variables or variables with high VIF to address multicollinear...
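One way to sketch the VIF check without any external library is to use the fact that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The feature names and values below are hypothetical, and the helper functions are illustrative only; in practice a library routine would normally be used.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per feature: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical predictors: "spend" is almost a copy of "income".
features = {
    "income": [30, 45, 60, 75, 90],
    "spend": [31, 44, 62, 74, 91],
    "age": [52, 23, 41, 35, 29],
}
vifs = vif(features)
flagged = [name for name, value in vifs.items() if value > 10]
print(vifs, flagged)
```

Here the two nearly identical predictors are flagged, while `age` stays below the usual thresholds.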
Bagging and boosting are ensemble methods used in machine learning to improve model performance.
Bagging involves training multiple models on different bootstrap samples (random subsets drawn with replacement) of the training data and then combining their predictions through averaging or voting.
Boosting involves iteratively training models on the same dataset, with each subsequent model focusing on the samples that were misclassified by the previous model.
Bagging reduc...
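The "focus on misclassified samples" step in boosting can be illustrated with one reweighting round in the style of AdaBoost. This is a sketch under assumptions: labels are +1/-1, the weak learner's predictions are given rather than trained, and the weighted error is strictly between 0 and 1. The function name and data are hypothetical.

```python
import math


def adaboost_reweight(weights, y_true, y_pred):
    """One AdaBoost-style round: raise weights of misclassified samples."""
    # Weighted error of the current weak learner (assumed 0 < err < 1).
    err = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
    alpha = 0.5 * math.log((1.0 - err) / err)  # the learner's vote strength
    updated = [
        w * math.exp(alpha if y != p else -alpha)
        for w, y, p in zip(weights, y_true, y_pred)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]  # renormalize to sum to 1


y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]  # only the last sample is misclassified
weights = [0.2] * 5          # start from uniform weights

alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
print(round(alpha, 4), [round(w, 3) for w in new_weights])
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.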
Logistic regression is a statistical method used to analyze and model the relationship between a binary dependent variable and one or more independent variables.
It is a type of regression analysis used for predicting the outcome of a categorical dependent variable based on one or more predictor variables.
It uses a logistic function to model the probability of the dependent variable taking a particular value.
It is commo...
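A minimal sketch of how the logistic function turns a linear score into a probability is shown below. The weights, bias, and feature values are hypothetical, not fitted to any data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for a two-feature model.
weights = [0.8, -1.2]
bias = -0.5

p = predict_proba([2.0, 0.5], weights, bias)
label = int(p >= 0.5)  # classify with a 0.5 probability cutoff
print(round(p, 4), label)
```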
Gini coefficient measures the inequality among values of a frequency distribution.
Gini coefficient ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality.
It is commonly used to measure income inequality in a population.
A Gini coefficient of 0.4 or higher is considered to be a high level of inequality.
Gini coefficient can be calculated using the Lorenz curve, which plots the cumulati...
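The Lorenz-curve definition reduces to a short formula on sorted values. The sketch below assumes non-negative values with a positive total; the function name and sample incomes are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum; equivalent to the area-based Lorenz curve definition.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


equal_incomes = [5, 5, 5, 5]
skewed_incomes = [0, 0, 0, 10]

print(gini(equal_incomes), gini(skewed_incomes))
```

With a finite sample of size n, the largest value this formula can return is (n - 1) / n, which approaches 1 as the sample grows.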
A chair is a piece of furniture used for sitting, while a cart is a vehicle used for transporting goods.
A chair typically has a backrest and armrests, while a cart does not.
A chair is designed for one person to sit on, while a cart can carry multiple items or people.
A chair is usually stationary, while a cart is mobile and can be pushed or pulled.
A chair is commonly found in homes, offices, and public spaces, while a c...
Outliers can be detected using statistical methods like box plots, z-score, and IQR. Treatment can be removal or transformation.
Use box plots to visualize outliers
Calculate the z-score and remove data points whose absolute z-score is greater than 3
Calculate the IQR and remove data points below Q1 - 1.5*IQR or above Q3 + 1.5*IQR
Transform data using log or square root to reduce the impact of outliers
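The detection rules above can be sketched with the standard library alone. The data is hypothetical, and the helper names are illustrative. Note that the z-score rule needs a reasonably large sample, because in small samples a single extreme value inflates the standard deviation enough to hide itself.

```python
import math
import statistics


def zscore_outliers(data, threshold=3.0):
    """Points whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) / sd > threshold]


def iqr_outliers(data, k=1.5):
    """Points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]


# Hypothetical measurements with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 10, 13, 11, 12, 95]

print(zscore_outliers(data), iqr_outliers(data))

# Alternative treatment: compress the scale instead of dropping points.
log_transformed = [math.log1p(x) for x in data]
```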
I applied via Company Website and was interviewed before Aug 2023. There were 2 interview rounds.
BERT and the Transformer are used in natural language processing for tasks like text classification and language generation.
BERT (Bidirectional Encoder Representations from Transformers) is a Transformer-based model developed by Google for NLP tasks.
The Transformer is a deep learning model architecture that uses self-attention mechanisms to process sequential data like text.
Both BERT and the Transformer have been widely use...
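The self-attention mechanism mentioned above can be sketched as scaled dot-product attention. This is a simplified illustration: a real Transformer layer first applies learned query, key, and value projections and uses multiple heads, all of which are omitted here. The token vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)  # how much this token attends to each token
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional embeddings for three tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
print(weights[0])
```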
NLP preprocessing techniques involve cleaning and preparing text data for analysis.
Tokenization: breaking text into words or sentences
Stopword removal: removing common words that do not add meaning
Lemmatization: reducing words to their base form
Normalization: converting text to lowercase
Removing special characters and punctuation
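Most of these steps can be sketched with the standard library. The stopword list below is a tiny hypothetical sample, and lemmatization is deliberately left out because it needs a dictionary-backed tool such as NLTK or spaCy.

```python
import re

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in"}


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stopwords."""
    text = text.lower()                       # normalization
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # remove special characters
    tokens = text.split()                     # word tokenization
    return [t for t in tokens if t not in STOPWORDS]


print(preprocess("The cats are running, quickly!"))
```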
posted on 7 May 2024
I applied via Job Portal and was interviewed in Nov 2023. There was 1 interview round.
Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent.
Gradient descent is used to find the minimum of a function by taking steps proportional to the negative of the gradient at the current point.
It is commonly used in machine learning to optimize the parameters of a model by minimizing the loss function.
There are different variants of gradie...
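A minimal sketch of the update rule, applied to the hypothetical one-dimensional function f(x) = (x - 3)^2, whose minimum is at x = 3. The learning rate and step count are arbitrary choices for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient from the starting point."""
    x = start
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    """Gradient of f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, start=0.0)
print(x_min)
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step, so 100 steps are more than enough to converge.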
LSTM (Long Short-Term Memory) is a type of recurrent neural network designed to handle long-term dependencies.
LSTM has three gates: input gate, forget gate, and output gate.
Input gate controls the flow of information into the cell state.
Forget gate decides what information to discard from the cell state.
Output gate determines the output based on the cell state.
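The three gates can be sketched as a single LSTM step on scalar inputs. Real LSTM layers use weight matrices and vectors; the scalar form and all parameter values here are hypothetical and chosen only to show how the gates combine.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step; params maps a gate name to (w_x, w_h, bias)."""
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))        # how much new information to write
    f = sigmoid(pre_activation("forget"))       # how much old cell state to keep
    o = sigmoid(pre_activation("output"))       # how much cell state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate cell content
    c = f * c_prev + i * g                      # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c


# Hypothetical, untrained parameters.
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "candidate": (0.9, 0.3, 0.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```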
T-test is a statistical test used to determine if there is a significant difference between the means of two groups.
Mean is the average of a set of numbers, median is the middle value when the numbers are ordered, and mode is the most frequently occurring value.
Mean is sensitive to outliers, median is robust to outliers, and mode is useful for categorical data.
T-test is used to compare means of two groups, mean is used...
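The difference between these measures can be sketched with Python's `statistics` module, along with a hand-computed Welch t-statistic for two groups. The data is hypothetical. Turning the statistic into a p-value needs the t distribution, which a library such as SciPy provides through `scipy.stats.ttest_ind`.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Central tendency: one extreme value moves the mean but not the median or mode.
values = [2, 3, 3, 4, 100]
print(statistics.mean(values), statistics.median(values), statistics.mode(values))

# Two hypothetical groups with clearly different means.
group_a = [5.1, 4.9, 5.0, 5.2, 4.8]
group_b = [5.9, 6.1, 6.0, 6.2, 5.8]
t_stat = welch_t(group_a, group_b)
print(round(t_stat, 3))
```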
Random Forest is an ensemble learning method used for classification and regression tasks.
Random Forest is a collection of decision trees, each trained on a bootstrap sample of the data and restricted to a random subset of features at each split.
Each tree in the forest makes a prediction, and the final prediction is the average (regression) or majority vote (classification) of all trees.
Random Forest helps reduce overfitting and improve accuracy compared to a single decision tre...
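The train-on-random-subsets and majority-vote idea can be sketched with a toy ensemble. This is not a full Random Forest: it bags one-split decision stumps on a single feature and omits per-split feature subsampling. All names and data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Pick the threshold t minimizing errors for the rule 'predict 1 if x >= t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x >= t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def predict(thresholds, x):
    """Majority vote across all stumps."""
    votes = sum(x >= t for t in thresholds)
    return int(votes * 2 > len(thresholds))


# Hypothetical 1-D data: class 1 whenever x >= 5.
xs = list(range(10))
ys = [0] * 5 + [1] * 5

model = fit_bagged_stumps(xs, ys)
print(predict(model, 0), predict(model, 9))
```

Individual stumps disagree on the exact threshold because each sees a different bootstrap sample, and the vote averages out that variation.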
posted on 20 Jun 2024
I applied via IIM Jobs and was interviewed before Jun 2023. There was 1 interview round.
I appeared for an interview before Apr 2023.
I applied via Campus Placement and was interviewed before Jul 2023. There were 3 interview rounds.
Medium-difficulty general aptitude questions and technical questions (Big Data, Python, etc.).
Understanding deep equations and algorithms in DL and ML is crucial for a data scientist.
Deep learning involves complex neural network architectures like CNNs and RNNs.
Machine learning algorithms include decision trees, SVM, k-means clustering, etc.
Understanding the math behind algorithms helps in optimizing model performance.
Equations like gradient descent, backpropagation, and loss functions are key concepts.
Practica...
I applied via Naukri.com and was interviewed in May 2023. There were 2 interview rounds.
Use a regression algorithm like linear regression or decision tree regression.
Consider using linear regression if the relationship between variables is linear.
Decision tree regression can handle non-linear relationships between variables.
Evaluate the performance of different algorithms using cross-validation.
Consider the interpretability of the model when choosing an algorithm.
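The comparison described above can be sketched with a small k-fold cross-validation loop. This is a simplified stand-in: the tree model is a depth-1 regression stump rather than a full decision tree, the folds are interleaved instead of shuffled, and the data is a hypothetical, perfectly linear relationship.

```python
def fit_linear(xs, ys):
    """Simple least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_stump(xs, ys):
    """Depth-1 regression tree: one split, predict the mean of each side."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm


def cross_val_mse(fit, xs, ys, k=4):
    """Mean squared error over k interleaved folds."""
    errors = []
    for fold in range(k):
        pairs = list(enumerate(zip(xs, ys)))
        train = [(x, y) for i, (x, y) in pairs if i % k != fold]
        test = [(x, y) for i, (x, y) in pairs if i % k == fold]
        model = fit([x for x, _ in train], [y for _, y in train])
        errors.extend((model(x) - y) ** 2 for x, y in test)
    return sum(errors) / len(errors)


# Hypothetical data with an exactly linear relationship.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 for x in xs]

linear_mse = cross_val_mse(fit_linear, xs, ys)
stump_mse = cross_val_mse(fit_stump, xs, ys)
print(linear_mse, stump_mse)
```

On linear data the linear model wins by a wide margin; on data with thresholds or interactions the ranking could flip, which is the reason to compare with cross-validation rather than assume.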
Associate | 10.3k salaries | ₹10 L/yr - ₹43.1 L/yr
Team Lead | 5.6k salaries | ₹5.6 L/yr - ₹16.5 L/yr
Vice President | 4.1k salaries | ₹20 L/yr - ₹70 L/yr
Analyst | 2.6k salaries | ₹6.5 L/yr - ₹25 L/yr
Software Engineer | 2.6k salaries | ₹10.7 L/yr - ₹35 L/yr