IBM
Data analytics in business involves analyzing data to uncover trends, patterns, and insights that inform decisions and drive business growth.
Data analytics helps businesses make data-driven decisions by analyzing large sets of data.
It involves using statistical techniques and algorithms to uncover insights and trends.
Businesses can use data analytics to optimize operations, improve marketing strategies, and enhance customer experience.
posted on 13 Nov 2024
I applied via LinkedIn and was interviewed in Oct 2024. There was 1 interview round.
Time series is a sequence of data points collected at regular time intervals, used to analyze trends and patterns over time.
Time series data is ordered chronologically
Commonly used in forecasting future values based on past patterns
Examples include stock prices, weather data, and sales figures
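As a rough illustration of working with time-series data, here is a minimal pandas sketch; the monthly sales figures are made up for the example.

```python
import pandas as pd

# Hypothetical monthly sales figures indexed by date (regular time intervals)
sales = pd.Series(
    [120, 135, 150, 160, 172, 190],
    index=pd.date_range("2024-01-01", periods=6, freq="MS"),
)

# A 3-month rolling mean smooths short-term noise and exposes the trend
print(sales.rolling(window=3).mean())
```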
Handle missing values using techniques like imputation, deletion, or modeling.
Use imputation techniques like mean, median, mode for numerical data
For categorical data, use mode or create a new category for missing values
Consider using advanced techniques like KNN imputation or predictive modeling
Delete rows or columns with high percentage of missing values if appropriate
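A minimal pandas sketch of these options, using a made-up DataFrame with one numerical and one categorical column:

```python
import pandas as pd

df = pd.DataFrame({
    "age": [25, None, 40, 35],                 # numerical column with a missing value
    "city": ["Pune", "Delhi", None, "Pune"],   # categorical column with a missing value
})

# Numerical: impute with the median; categorical: impute with the mode
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna(df["city"].mode()[0])

# Alternatively, drop rows that are mostly missing
# df = df.dropna(thresh=int(0.5 * len(df.columns)))
print(df)
```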
posted on 22 Jul 2024
Case study - price optimization problem
Structured data is organized and easily searchable, while unstructured data lacks a predefined format.
Structured data is organized into rows and columns, like a database.
Unstructured data includes text documents, images, videos, and social media posts.
Structured data is easier to analyze and query, while unstructured data requires more advanced techniques like natural language processing.
Examples of structured data include relational database tables and spreadsheets.
I was interviewed in Oct 2024.
Transfer learning involves using pre-trained models on a different task, while fine-tuning involves further training a pre-trained model on a specific task.
Transfer learning uses knowledge gained from one task to improve learning on a different task.
Fine-tuning involves adjusting the parameters of a pre-trained model to better fit a specific task.
Transfer learning is faster and requires less data compared to training a model from scratch.
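A hedged sketch of how transfer learning plus fine-tuning might look, assuming torchvision (>= 0.13) is available and a hypothetical 10-class target task:

```python
import torch.nn as nn
from torchvision import models

# Transfer learning: load a model pre-trained on ImageNet and reuse its features
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone so only the new head is trained at first
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer for the hypothetical 10-class target task
model.fc = nn.Linear(model.fc.in_features, 10)

# Fine-tuning: later, unfreeze some layers and continue training with a small
# learning rate so the pre-trained weights shift only slightly
for param in model.layer4.parameters():
    param.requires_grad = True
```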
I applied via Naukri.com and was interviewed in Sep 2024. There were 2 interview rounds.
Find Nth-largest element in an array
Sort the array in descending order
Return the element at index N-1
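A minimal Python sketch of the sorting approach described above, plus a heap-based alternative:

```python
import heapq

def nth_largest_by_sort(arr, n):
    # Sort in descending order and take the element at index n-1
    return sorted(arr, reverse=True)[n - 1]

def nth_largest_by_heap(arr, n):
    # heapq.nlargest avoids fully sorting the array
    return heapq.nlargest(n, arr)[-1]

print(nth_largest_by_sort([7, 2, 9, 4, 9], 3))  # 7
print(nth_largest_by_heap([7, 2, 9, 4, 9], 3))  # 7
```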
I applied via Naukri.com and was interviewed in Jul 2024. There were 2 interview rounds.
I am a data scientist with a background in statistics and machine learning, passionate about solving complex problems using data-driven approaches.
Background in statistics and machine learning
Experience in solving complex problems using data-driven approaches
Passionate about leveraging data to drive insights and decision-making
Developed a predictive model for customer churn in a telecom company.
Collected and cleaned customer data including usage patterns and demographics.
Used machine learning algorithms such as logistic regression and random forest to build the model.
Evaluated model performance using metrics like accuracy, precision, and recall.
Deployed the model into the company's CRM system for real-time predictions.
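A simplified scikit-learn sketch of how such a churn model might be built; the file name, column names, and data are hypothetical:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical cleaned customer data with usage and demographic features
df = pd.read_csv("churn_data.csv")      # assumed file
X = df.drop(columns=["churned"])        # assumed target column name
y = df["churned"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A random forest is one of the algorithms mentioned above; logistic regression works similarly
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
pred = model.predict(X_test)

print("accuracy:", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred))
print("recall:", recall_score(y_test, pred))
```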
I applied via Naukri.com and was interviewed in Jul 2024. There was 1 interview round.
Context window in LLMs refers to the number of surrounding words considered when predicting the next word in a sequence.
Context window helps LLMs capture dependencies between words in a sentence.
A larger context window allows the model to consider more context but may lead to increased computational complexity.
For example, in a context window of 2, the model considers 2 words before and 2 words after the target word for prediction.
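A toy Python sketch of what a context window of size 2 means for a list of tokens:

```python
tokens = ["the", "cat", "sat", "on", "the", "mat"]
window = 2

# For each target word, collect up to `window` words on each side as its context
for i, target in enumerate(tokens):
    context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
    print(target, "->", context)
```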
top_k parameter is used to specify the number of top elements to be returned in a result set.
top_k parameter is commonly used in machine learning algorithms to limit the number of predictions or recommendations.
For example, in recommendation systems, setting top_k=5 will return the top 5 recommended items for a user.
In natural language processing tasks, top_k can be used to limit the number of possible next words in a generated sequence.
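A rough numpy sketch of top-k selection over next-word scores; the vocabulary and scores are made up:

```python
import numpy as np

vocab = np.array(["cat", "dog", "car", "tree", "book"])
scores = np.array([2.1, 1.7, 0.3, 0.9, 1.2])   # hypothetical model scores
top_k = 3

# Keep only the k highest-scoring candidates
top_idx = np.argsort(scores)[::-1][:top_k]
print(list(zip(vocab[top_idx], scores[top_idx])))
# [('cat', 2.1), ('dog', 1.7), ('book', 1.2)]
```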
Regex patterns in Python are sequences of characters that define a search pattern.
Regex patterns are used for pattern matching and searching in strings.
They are created using the 're' module in Python.
Examples of regex patterns include searching for email addresses, phone numbers, or specific words in a text.
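A short example using Python's re module to pull email addresses out of a piece of text:

```python
import re

text = "Contact us at support@example.com or sales@example.org for details."

# A simple (not fully RFC-compliant) email pattern
pattern = r"[\w.+-]+@[\w-]+\.[\w.]+"
print(re.findall(pattern, text))
# ['support@example.com', 'sales@example.org']
```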
Iterators are objects that allow iteration over a sequence of elements. Tuples are immutable sequences of elements.
Iterators are used to loop through elements in a collection, like lists or dictionaries
Tuples are similar to lists but are immutable, meaning their elements cannot be changed
Example of iterating: for item in my_list: print(item)
Example of tuple: my_tuple = (1, 2, 3)
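A small sketch distinguishing an explicit iterator (via iter/next) from a tuple:

```python
my_list = [1, 2, 3]

# An iterator is an object you can call next() on until it is exhausted
it = iter(my_list)
print(next(it))  # 1
print(next(it))  # 2

# A tuple is an immutable sequence: reading works, assignment does not
my_tuple = (1, 2, 3)
print(my_tuple[0])   # 1
try:
    my_tuple[0] = 99
except TypeError as e:
    print("tuples are immutable:", e)
```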
Yes, I have experience working with REST APIs in various projects.
Developed RESTful APIs using Python Flask framework
Consumed REST APIs in data analysis projects using requests library
Used Postman for testing and debugging REST APIs
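A minimal requests sketch of consuming a REST API; the URL is a placeholder, not a real endpoint:

```python
import requests

# Hypothetical endpoint returning JSON
response = requests.get("https://api.example.com/customers", timeout=10)
response.raise_for_status()

data = response.json()
print(len(data), "records fetched")
```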
I applied via Job Portal and was interviewed in Apr 2024. There was 1 interview round.
XGBoost is a powerful machine learning algorithm known for its speed and performance in handling large datasets.
XGBoost stands for eXtreme Gradient Boosting, which is an implementation of gradient boosting machines.
It is widely used in machine learning competitions and is known for its speed and performance.
XGBoost uses a technique called boosting, where multiple weak learners are combined to create a strong learner.
It...
The XGBoost algorithm uses a greedy approach to determine splits, based on the gain each candidate split provides.
XGBoost calculates the information gain for each feature to determine the best split.
The feature with the highest information gain is chosen for the split.
This process is repeated recursively for each node in the tree.
Features can be split based on numerical values or categories.
Example: if a feature like 'age' has the highest information gain at a node, the tree splits on that feature first.
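A small hedged sketch, assuming a recent xgboost package and scikit-learn are installed, that trains a model and reads back gain-based importances:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Gradient boosting: each new tree corrects the errors of the previous ones
model = XGBClassifier(n_estimators=100, max_depth=3, eval_metric="logloss")
model.fit(X_train, y_train)

print("test accuracy:", model.score(X_test, y_test))
# Gain-based importances reflect how much each feature's splits reduced the loss
print(model.get_booster().get_score(importance_type="gain"))
```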
Yes, I have experience working on cloud platforms such as AWS and Google Cloud.
Experience with AWS services like S3, EC2, and Redshift
Familiarity with Google Cloud services like BigQuery and Compute Engine
Utilized cloud platforms for data storage, processing, and analysis
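For instance, pulling a dataset from S3 with boto3 might look like the sketch below; the bucket and key names are placeholders, and configured AWS credentials are assumed:

```python
import boto3

s3 = boto3.client("s3")

# Download a hypothetical dataset from an S3 bucket for local analysis
s3.download_file("my-data-bucket", "raw/customers.csv", "customers.csv")
```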
Entropy is a measure of randomness or uncertainty in a dataset, while information gain is the reduction in entropy after splitting a dataset based on a feature.
Entropy is used in decision tree algorithms to determine the best feature to split on.
Information gain measures the effectiveness of a feature in classifying the data.
Higher information gain indicates that a feature is more useful for splitting the data.
Entropy is highest when the classes are evenly mixed and zero when a node contains only one class.
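A short numpy sketch of the underlying calculation: the entropy of a label array and the information gain of a binary split:

```python
import numpy as np

def entropy(labels):
    # H = -sum(p * log2(p)) over the class probabilities
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    # Reduction in entropy after splitting the parent node into two children
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted

parent = np.array([0, 0, 0, 1, 1, 1])
left, right = np.array([0, 0, 0]), np.array([1, 1, 1])
print(entropy(parent))                        # 1.0 (maximally mixed)
print(information_gain(parent, left, right))  # 1.0 (perfect split)
```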
Hypothesis testing is a statistical method used to make inferences about a population based on sample data.
Hypothesis testing involves formulating a null hypothesis and an alternative hypothesis.
The null hypothesis is assumed to be true until there is enough evidence to reject it.
Statistical tests are used to determine the likelihood of observing the data if the null hypothesis is true.
The p-value is used to determine whether the result is statistically significant; a small p-value (commonly below 0.05) leads to rejecting the null hypothesis.
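A minimal scipy sketch of a two-sample t-test on made-up data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=50, scale=5, size=100)  # hypothetical control group
group_b = rng.normal(loc=52, scale=5, size=100)  # hypothetical treatment group

# Null hypothesis: the two groups have the same mean
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Reject the null hypothesis if p is below the chosen significance level (e.g. 0.05)
```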
Precision and recall are metrics used in evaluating the performance of classification models.
Precision measures the accuracy of positive predictions, while recall measures the ability of the model to find all positive instances.
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Precision is important when false positives are costly, while recall is important when false negatives are costly.
For example, in a spam email detector, high precision matters because marking legitimate mail as spam (a false positive) is costly.
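A small scikit-learn sketch computing both metrics on made-up predictions:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Precision = TP / (TP + FP); Recall = TP / (TP + FN)
print("precision:", precision_score(y_true, y_pred))  # 0.75
print("recall:", recall_score(y_true, y_pred))        # 0.75
```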
Data imbalance refers to unequal distribution of classes in a dataset, where one class has significantly more samples than others.
Data imbalance can lead to biased models that favor the majority class.
It can result in poor performance for minority classes, as the model may struggle to accurately predict them.
Techniques like oversampling, undersampling, and using different evaluation metrics can help address data imbalance.
SMOTE stands for Synthetic Minority Over-sampling Technique, used to balance imbalanced datasets by generating synthetic samples.
SMOTE is commonly used in machine learning to address class imbalance by creating synthetic samples of the minority class.
It works by generating new instances of the minority class by interpolating between existing instances.
SMOTE is particularly useful in scenarios where the minority class is severely underrepresented.
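A hedged sketch using the imbalanced-learn package (assumed to be installed) to apply SMOTE to a synthetic imbalanced dataset:

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Synthetic dataset where the minority class is roughly 10% of the samples
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print("before:", Counter(y))

# SMOTE interpolates between existing minority samples to create synthetic ones
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after:", Counter(y_res))  # classes balanced
```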
I was interviewed in Aug 2024.
IBM salaries by role (as reported on AmbitionBox):
Application Developer | 11.7k salaries | ₹5.4 L/yr - ₹24 L/yr
Software Engineer | 5.5k salaries | ₹5.4 L/yr - ₹22.2 L/yr
Advisory System Analyst | 5.2k salaries | ₹9.4 L/yr - ₹26 L/yr
Senior Software Engineer | 4.8k salaries | ₹8 L/yr - ₹30 L/yr
Senior Systems Engineer | 4.6k salaries | ₹5.7 L/yr - ₹20.4 L/yr