Senior Data Analyst

200+ Senior Data Analyst Interview Questions and Answers

Updated 12 Jul 2025
2d ago

Q. What is the difference between the Least Squares Method and Maximum Likelihood Estimation?

Ans.

Least Squares Method and Maximum Likelihood are both used to estimate parameters, but differ in their approach.

  • Least Squares Method minimizes the sum of squared errors between the observed and predicted values.

  • Maximum Likelihood estimates the parameters that maximize the likelihood of observing the given data.

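  • Least Squares Method needs no distributional assumption to compute the estimates; assumptions such as independent, constant-variance, normally distributed errors matter for inference (standard errors, tests, confidence intervals).

  • Maximum Likelihood requires a fully specified probability model for the data. When the errors are assumed independent and normal with constant variance, the maximum likelihood estimates of the regression coefficients coincide with the least squares estimates.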

  • L...
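The link between the two methods can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a simple linear model y = a + b*x with independent normal errors; the data points and the helper names (`ols_fit`, `sum_squared_errors`, `gaussian_log_likelihood`) are invented for the example and do not come from the original answer.

```python
import math

# Hypothetical toy data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def ols_fit(xs, ys):
    """Closed-form least squares estimates for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(a, b, xs, ys):
    """The quantity that least squares minimizes."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def gaussian_log_likelihood(a, b, sigma, xs, ys):
    """Log-likelihood assuming independent N(0, sigma^2) errors."""
    n = len(xs)
    sse = sum_squared_errors(a, b, xs, ys)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)


a_hat, b_hat = ols_fit(xs, ys)

# For any fixed sigma, a larger SSE always means a lower Gaussian likelihood,
# so the least squares fit is also the maximum likelihood fit for (a, b).
best = gaussian_log_likelihood(a_hat, b_hat, 1.0, xs, ys)
nudged = gaussian_log_likelihood(a_hat + 0.1, b_hat - 0.05, 1.0, xs, ys)
print(a_hat, b_hat, best > nudged)
```

With a different error distribution (for example Laplace errors), the likelihood would no longer be a function of the squared errors, and the two estimates would generally differ.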

Asked in Proftware

1d ago

Q. Imagine you are presented with a complex dataset from a multinational company with millions of records. The dataset is unstructured and lacks clear variables. How would you approach the data analysis process to...

Ans.

To analyze a complex dataset, start by understanding the data, cleaning and structuring it, performing exploratory data analysis, applying statistical methods, and creating visualizations for insights.

  • Understand the business objectives and goals to align the analysis with the company's growth strategy.

  • Clean and structure the dataset by identifying and handling missing values, outliers, and inconsistencies.

  • Perform exploratory data analysis to understand the distribution, relations...

Senior Data Analyst Interview Questions and Answers for Freshers

1d ago

Q. How do you improve the performance of Linear Regression?

Ans.

To improve the performance of Linear Regression, you can consider feature engineering, regularization, and handling outliers.

  • Perform feature engineering to create new features that capture important information.

  • Apply regularization techniques like L1 or L2 regularization to prevent overfitting.

  • Handle outliers by either removing them or using robust regression techniques.

  • Check for multicollinearity among the independent variables and consider removing highly correlated variabl...

Asked in Chubb

2d ago

Q. Given a table 'matches' with columns 'team1', 'team2', and 'winner', where each row represents a match between two teams and the winner, how would you determine the number of matches won and lost by each team?

Ans.

Reshape the table so that each team appears once per match it played, then count wins and losses per team using the winner column.

  • List every team appearance by stacking team1 and team2 into a single column (for example with UNION ALL).

  • Compare each appearance with the winner column: a match counts as a win when the team equals the winner, otherwise as a loss.

  • Group by team and sum the wins and losses.

  • Example: Team A won 2 matches and lost 1 match.

  • Example: Team B won 1 match and lost 2 matches.
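One way to write such a query is sketched below, run through Python's built-in `sqlite3` module so it can be tried without a database server. The three sample rows are invented to reproduce the Team A and Team B figures above, and the sketch assumes every match has a winner (no draws or NULL values in `winner`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (team1 TEXT, team2 TEXT, winner TEXT)")
# Hypothetical sample data: A beats B twice, B beats A once.
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [("A", "B", "A"), ("B", "A", "A"), ("A", "B", "B")],
)

query = """
SELECT team,
       SUM(CASE WHEN team = winner THEN 1 ELSE 0 END)  AS won,
       SUM(CASE WHEN team <> winner THEN 1 ELSE 0 END) AS lost
FROM (
    -- One row per team per match played
    SELECT team1 AS team, winner FROM matches
    UNION ALL
    SELECT team2 AS team, winner FROM matches
) AS appearances
GROUP BY team
ORDER BY team
"""

results = conn.execute(query).fetchall()
for team, won, lost in results:
    print(team, won, lost)
```

UNION ALL (rather than UNION) matters here: UNION would collapse duplicate rows and undercount teams that meet more than once.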


Asked in NielsenIQ

3d ago

Q. Have you used Power BI? Which types of visualizations have you used in it?

Ans.

Yes, I have used Power BI for various types of visualization including bar charts, line charts, pie charts, and maps.

  • I have experience creating bar charts to visualize sales data over time.

  • I have used line charts to show trends in customer engagement metrics.

  • I have utilized pie charts to display market share data.

  • I have incorporated maps to visualize geographic distribution of sales.

1d ago

Q. How do you handle overfitting and underfitting in Decision Trees?

Ans.

Overfitting in decision trees can be handled by pruning, reducing tree depth, increasing dataset size, and using ensemble methods.

  • Prune the tree to remove unnecessary branches

  • Reduce tree depth to prevent overfitting

  • Increase dataset size to improve model generalization

  • Use ensemble methods like Random Forest to reduce overfitting

  • Underfitting can be handled by increasing tree depth, adding more features, and reducing regularization

  • Regularization can be used to prevent overfittin...

1d ago

Q. What metrics do you use to evaluate classification models?

Ans.

Metrics used to evaluate classification models

  • Accuracy

  • Precision

  • Recall

  • F1 Score

  • ROC Curve

  • Confusion Matrix
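Most of these metrics follow directly from the four confusion matrix counts. The sketch below assumes a binary problem with labels 0 and 1; the function name `classification_metrics` and the sample labels are made up for illustration. The ROC curve is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute confusion matrix counts and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "confusion_matrix": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```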

4d ago

Q. What metrics are used to evaluate Linear Regression?

Ans.

Metrics used to evaluate Linear Regression

  • Mean Squared Error (MSE)

  • Root Mean Squared Error (RMSE)

  • R-squared (R²)

  • Adjusted R-squared (Adj R²)

  • Mean Absolute Error (MAE)

  • Residual Sum of Squares (RSS)

  • Akaike Information Criterion (AIC)

  • Bayesian Information Criterion (BIC)
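The error-based metrics in this list can be computed from the residuals alone. Below is a minimal sketch in plain Python; `regression_metrics` and its `n_predictors` parameter are names chosen for this example, and the sample values are invented. AIC and BIC are omitted because they depend on the likelihood of a specific fitted model.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors=1):
    """Return common regression metrics from actual and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    rss = sum(r ** 2 for r in residuals)          # residual sum of squares
    mse = rss / n                                 # mean squared error
    rmse = math.sqrt(mse)                         # root mean squared error
    mae = sum(abs(r) for r in residuals) / n      # mean absolute error

    mean_y = sum(y_true) / n
    tss = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - rss / tss
    # Adjusted R-squared penalizes additional predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

    return {"rss": rss, "mse": mse, "rmse": rmse, "mae": mae,
            "r2": r2, "adj_r2": adj_r2}


# Hypothetical actual and predicted values.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 8.0])
print(metrics)
```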


Asked in Decathlon

4d ago

Q. A coin is tossed twice. What is the probability of getting heads both times? What if the coin is biased?

Ans.

The probability of getting two heads in two tosses of a fair coin is 1/4. For a biased coin with P(heads) = p, it is p * p, assuming the tosses are independent.

  • The probability of getting both heads with a fair coin is 1/4 (1/2 * 1/2).

  • If the coin is biased, each toss lands heads with probability p, so both tosses land heads with probability p * p.

  • For example, if the coin is biased towards heads with a probability of 0.6, the probability of getting both heads would be 0.6 * 0.6 = 0.36.
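The same arithmetic can be checked by enumerating every outcome. This sketch uses exact fractions to avoid floating-point rounding and assumes independent tosses; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product


def prob_all_heads(p_heads, tosses=2):
    """Probability that every one of `tosses` independent tosses is heads."""
    return p_heads ** tosses


def outcome_probabilities(p_heads, tosses=2):
    """Map each H/T sequence to its probability."""
    table = {}
    for outcome in product("HT", repeat=tosses):
        prob = Fraction(1)
        for side in outcome:
            prob *= p_heads if side == "H" else 1 - p_heads
        table["".join(outcome)] = prob
    return table


fair = prob_all_heads(Fraction(1, 2))    # fair coin
biased = prob_all_heads(Fraction(3, 5))  # coin with P(heads) = 0.6
table = outcome_probabilities(Fraction(3, 5))
print(fair, biased, table)
```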

Asked in Proftware

4d ago

Q. Describe a time when you encountered a complex data analysis problem and how you successfully navigated it, highlighting the specific methodologies or tools you utilized to derive meaningful insights.

Ans.

Encountered a complex data analysis problem and successfully navigated through it

  • Encountered a data set with missing values and outliers

  • Utilized data cleaning techniques such as imputation and outlier detection

  • Applied statistical analysis and machine learning algorithms to identify patterns and trends

  • Visualized the data using tools like Tableau for better understanding

  • Collaborated with domain experts to gain insights and validate findings

1d ago

Q. How do you handle overfitting in Linear Regression?

Ans.

Overfitting in Linear Regression can be handled by using regularization techniques.

  • Regularization techniques like Ridge regression and Lasso regression can help in reducing overfitting.

  • Cross-validation can be used to find the optimal regularization parameter.

  • Feature selection and dimensionality reduction techniques can also help in reducing overfitting.

  • Collecting more data can help in reducing overfitting by providing a more representative sample.
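The shrinkage effect of Ridge regularization is easiest to see in the simplest possible case. The sketch below assumes a single feature and no intercept, where the Ridge estimate has a one-line closed form; `ridge_slope` and the data are invented for illustration, and in practice the penalty `lam` would be chosen by cross-validation as noted above.

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate for y = b * x (single feature, no intercept).

    Minimizes sum((y - b*x)^2) + lam * b^2, which gives
    b = sum(x*y) / (sum(x^2) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unpenalized = ridge_slope(xs, ys, 0.0)   # lam = 0 is ordinary least squares
penalized = ridge_slope(xs, ys, 14.0)    # a larger penalty shrinks the slope
print(unpenalized, penalized)
```

A larger `lam` pulls the coefficient toward zero, trading a little bias for lower variance.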

Asked in Ipsos

3d ago

Q. What are the assumptions of Linear Regression?

Ans.

Assumptions in Linear Regression

  • Linear relationship between independent and dependent variables

  • Homoscedasticity (constant variance) of residuals

  • Independence of residuals

  • Normal distribution of residuals

  • No multicollinearity among independent variables

4d ago

Q. What is the formula for Logistic Regression?

Ans.

Logistic Regression formula is used to model the probability of a certain event occurring.

  • The formula is: P(Y=1) = e^(b0 + b1*X1 + b2*X2 + ... + bn*Xn) / (1 + e^(b0 + b1*X1 + b2*X2 + ... + bn*Xn))

  • Y is the dependent variable and X1, X2, ..., Xn are the independent variables

  • b0, b1, b2, ..., bn are the coefficients that need to be estimated

  • The formula is used to predict the probability of a binary outcome, such as whether a customer will buy a product or not

  • The formula is derive...
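As a small illustration, the formula can be evaluated directly. The function name `predict_probability` and the coefficient values are assumptions made for this sketch; the code uses the algebraically equivalent form 1 / (1 + e^(-z)) and branches on the sign of z to avoid overflow for large inputs.

```python
import math


def predict_probability(intercept, coefficients, features):
    """P(Y=1) for a logistic regression with given b0 and b1..bn."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    if z >= 0:
        # e^z / (1 + e^z) rewritten as 1 / (1 + e^-z)
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients: b0 = -1.0, b1 = 0.5, b2 = 2.0
p_low = predict_probability(-1.0, [0.5, 2.0], [0.0, 0.0])
p_high = predict_probability(-1.0, [0.5, 2.0], [2.0, 1.0])
p_mid = predict_probability(0.0, [0.0], [0.0])
print(p_low, p_mid, p_high)
```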

Asked in Decathlon

3d ago

Q. What is a view in SQL, and what is the difference between a view and a temp table?

Ans.

Views are virtual tables that display data from one or more tables, while temp tables are temporary tables that store data temporarily.

  • Views are virtual tables created by a query, while temp tables are physical tables created in the database.

  • Views do not store data themselves, but display data from underlying tables, while temp tables store data temporarily for a session or transaction.

  • Views can be used for security purposes by restricting access to certain columns or rows, w...
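The "virtual versus stored" distinction can be demonstrated with a short experiment. This sketch uses SQLite through Python's `sqlite3` module; the `orders` table and object names are hypothetical, and temp table syntax and scoping differ somewhat between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 50.0), (2, 150.0)])

# A view stores only the query; a temp table stores a copy of the rows.
conn.execute(
    "CREATE VIEW big_orders_view AS SELECT * FROM orders WHERE amount > 100"
)
conn.execute(
    "CREATE TEMP TABLE big_orders_tmp AS SELECT * FROM orders WHERE amount > 100"
)

# Add a new qualifying row after both objects exist.
conn.execute("INSERT INTO orders VALUES (3, 300.0)")

view_count = conn.execute("SELECT COUNT(*) FROM big_orders_view").fetchone()[0]
tmp_count = conn.execute("SELECT COUNT(*) FROM big_orders_tmp").fetchone()[0]
print(view_count, tmp_count)
```

The view reflects the new row because it re-runs its query against the base table, while the temp table keeps the snapshot taken when it was created.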

Asked in BeeHyv

6d ago

Q. In SQL, how do you calculate the rolling sum of sales?

Ans.

Use a SQL window function: SUM() with an OVER() clause whose window frame defines which rows feed each sum.

  • Use the SUM() function with the OVER() clause and an ORDER BY to compute sums over an ordered window.

  • Example (running total from the first row): SELECT date, sales, SUM(sales) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rolling_sum FROM sales_data;

  • For a true rolling window, bound the frame, e.g., ROWS BETWEEN 6 PRECEDING AND CURRENT ROW sums the current row and the six before it (a 7-day window only if there is exactly one row per day).

  • Ensure your data is ordered correctly to get accurate rolling sums.
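The difference between an unbounded frame and a bounded one is sketched below with SQLite via Python's `sqlite3` module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions), one row per date, and an invented `sales_data` table with four rows; a 3-row window is used so the effect is visible on a small sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # needs SQLite >= 3.25 for window functions
conn.execute("CREATE TABLE sales_data (date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [
        ("2025-01-01", 10),
        ("2025-01-02", 20),
        ("2025-01-03", 30),
        ("2025-01-04", 40),
    ],
)

query = """
SELECT date,
       sales,
       SUM(sales) OVER (
           ORDER BY date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total,
       SUM(sales) OVER (
           ORDER BY date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS rolling_3_row_sum
FROM sales_data
ORDER BY date
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If dates can be missing or repeated, a row-based frame no longer corresponds to a fixed number of days, and the data would need to be aggregated per day first.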

Asked in Chubb

5d ago

Q. How would you extract names from email addresses in SQL?

Ans.

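Use SQL string functions such as SUBSTRING and CHARINDEX (SQL Server; other dialects offer equivalents such as INSTR or POSITION) to take the part of the email before the '@'.

  • Use CHARINDEX to find the position of the '@' symbol in the email address.

  • Use SUBSTRING to extract the characters before the '@' symbol as the name, e.g., SUBSTRING(email, 1, CHARINDEX('@', email) - 1), where email is the column name assumed here.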

  • Consider handling cases where there are multiple names or special characters in the email address.
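A runnable version of this idea is sketched below using SQLite through Python's `sqlite3` module, where `instr` and `substr` play the roles of CHARINDEX and SUBSTRING. The `users` table, its `email` column, and the sample addresses are hypothetical; rows without an '@' return NULL rather than a wrong value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("john.doe@example.com",), ("asha@example.org",), ("no-at-sign",)],
)

query = """
SELECT email,
       CASE
           WHEN instr(email, '@') > 0
               THEN substr(email, 1, instr(email, '@') - 1)
           ELSE NULL  -- malformed address: no '@' present
       END AS name
FROM users
ORDER BY rowid
"""

rows = conn.execute(query).fetchall()
for email, name in rows:
    print(email, name)
```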

Asked in NielsenIQ

1d ago

Q. Do you know about panel data? What is it?

Ans.

Panel data is a type of longitudinal data that involves observations on multiple subjects over multiple time periods.

  • Panel data is also known as longitudinal data or cross-sectional time series data.

  • It allows for the analysis of both individual and time effects.

  • Examples include tracking the same group of individuals over time to study changes in their behavior or characteristics.

  • Panel data is commonly used in economics, sociology, and political science research.

Asked in NielsenIQ

4d ago

Q. Do you know about scan data? What is it?

Ans.

Scan data refers to the information collected from scanning barcodes or QR codes, typically used in retail to track sales and inventory.

  • Scan data is collected by scanning barcodes or QR codes on products.

  • It is commonly used in retail to track sales, inventory levels, and pricing.

  • Scan data can provide valuable insights into consumer behavior and preferences.

  • Examples of scan data systems include point-of-sale (POS) systems and inventory management software.

6d ago

Q. How can we share a Power BI report with an external user?

Ans.

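To share a Power BI report with an external user, share it directly with their email address as a guest, or, for public non-sensitive content only, use the Publish to web feature.

  • Share the report directly by granting access to the external user's email address; the user is typically invited as a guest (Microsoft Entra B2B) and must sign in to view it

  • Guest viewers typically also need an appropriate Power BI license unless the content is hosted in Premium capacity

  • Publish to web generates a public link or embed code that anyone on the internet can open without signing in

  • Use Publish to web only when the report contains no sensitive data, because the published report is not protected by report security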

Q. What is the maximum amount of data you've dealt with?

Ans.

I have dealt with terabytes of data in my previous role as a Data Analyst.

  • Managed and analyzed terabytes of data from various sources

  • Utilized big data tools such as Hadoop and Spark to process large datasets

  • Performed complex data analysis and visualization on massive datasets

Asked in Decathlon

5d ago

Q. What types of bar charts are available in Tableau? What is a stacked bar chart?

Ans.

Types of bar charts in Tableau include standard bar, stacked bar, and side-by-side bar.

  • Standard bar chart displays individual bars for each category

  • Stacked bar chart shows the total value broken down into sub-categories

  • Side-by-side bar chart compares multiple measures across categories

  • Example: Stacked bar chart can be used to show sales by region, with each region broken down by product category

Asked in Proftware

6d ago

Q. Given a large dataset with millions of rows and multiple variables, describe the steps and techniques you would use to identify meaningful patterns, correlations, and insights to drive strategic decision-making...

Ans.

Utilize data visualization, statistical analysis, and machine learning techniques to identify patterns and correlations in large datasets for strategic decision making.

  • Perform exploratory data analysis to understand the structure and relationships within the dataset

  • Utilize data visualization techniques such as scatter plots, histograms, and heatmaps to identify patterns and correlations

  • Conduct statistical analysis including correlation analysis, regression analysis, and hypot...

3d ago

Q. How many types of connections are available in Power BI?

Ans.

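Power BI offers several data connectivity modes; the four most commonly cited are:

  • Import: data is loaded into and cached in the Power BI model

  • DirectQuery: queries are sent to the data source when the report is used

  • Live Connection: the report connects to an existing model, such as one hosted in Analysis Services or the Power BI service

  • Composite (mixed) model: Import and DirectQuery tables combined in one model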

Asked in NielsenIQ

6d ago

Q. What do you know about Nielsen's business?

Ans.

Nielsen is a global measurement and data analytics company that provides insights into consumer behavior.

  • Nielsen is known for its TV ratings system, which measures viewership for television programs.

  • They also provide data on consumer purchasing habits and trends for various industries.

  • Nielsen operates in over 100 countries and has a wide range of services including market research, audience measurement, and advertising effectiveness.

  • The company was founded in 1923 by Arthur C...

Asked in ESG Book

4d ago

Q. What is the impact of ESG on an investor's decision-making? Is it positive or negative?

Ans.

ESG factors can have a significant impact on an investor's decision making, influencing both financial performance and sustainability.

  • ESG factors can help investors assess the long-term sustainability and risk profile of a company.

  • Investors increasingly consider ESG factors as a way to mitigate risks and identify opportunities for long-term value creation.

  • ESG integration can lead to better financial performance and resilience in the face of environmental, social, and governan...

Asked in Infosys

3d ago

Q. A calculated column that is added to a table during data loading and the values are computed row by row. It's stored in the data model and can be used in visuals and other calculations. A calculated measure that...

Ans.

Calculated columns are row-wise computed and stored, while measures are dynamic calculations based on filter context at query time.

  • Calculated Columns: Created during data loading, computed for each row, and stored in the data model. Example: A column calculating 'Total Sales' as 'Quantity * Price'.

  • Measures: Dynamic calculations performed at query time, based on the current filter context. Example: A measure calculating 'Total Revenue' using SUM(Sales[Revenue]).

  • Storage: Calcul...

2d ago

Q. What are Type I and Type II errors?

Ans.

Type I error is rejecting a true null hypothesis, while Type II error is failing to reject a false null hypothesis.

  • Type I error is also known as a false positive

  • Type II error is also known as a false negative

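  • The probability of a Type I error equals the significance level (alpha) chosen for the test, so a larger alpha makes false positives more likely

  • The probability of a Type II error is beta (power = 1 - beta); for a fixed sample size, lowering alpha raises beta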

  • Examples: Type I error - Convicting an innocent person, Type II error - Failing to convict a guilty person

  • Type I error is more serious in medical ...

4d ago

Q. What is a cost function and an error function?

Ans.

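In the common convention, a loss (error) function measures the difference between the predicted and actual value for a single example, and a cost function aggregates that loss, typically as an average, over the whole dataset. In practice the terms are often used interchangeably.

  • The loss (error) function scores one prediction against its actual value.

  • The cost function is usually the average of the loss over the entire training dataset.

  • The cost function is used to evaluate how well a machine learning model fits the data.

  • Training optimizes the model's parameters by minimizing the cost function.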

  • Examples of cost functions are mean squared error, mean absolute error, and cross-entropy...
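The relationship between a per-example error and its average over a dataset can be sketched with binary cross-entropy. The function names and sample values below are invented for illustration, and the sketch assumes predicted probabilities strictly between 0 and 1.

```python
import math


def log_loss_per_example(y_true, p_pred):
    """Binary cross-entropy for one example (y_true is 0 or 1)."""
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))


def mean_log_loss(labels, probs):
    """Average of the per-example values over the whole dataset."""
    losses = [log_loss_per_example(y, p) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)


# Hypothetical labels and predicted probabilities of class 1.
labels = [1, 0, 1]
probs = [0.9, 0.2, 0.5]

per_example = [log_loss_per_example(y, p) for y, p in zip(labels, probs)]
average = mean_log_loss(labels, probs)
print(per_example, average)
```

Confident correct predictions (0.9 for a true 1) contribute little, while an uncertain prediction (0.5) contributes ln 2, and the single averaged number is what an optimizer would minimize.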

Asked in Genpact

6d ago

Q. How many types of filters are available in Power BI?

Ans.

Power BI has three main filter scopes in the Filters pane (drillthrough filters also exist).

  • Visual level filters

  • Page level filters

  • Report level filters

Asked in Ganit Inc

3d ago

Q. What are tokens? What if token is not present in model's vocabulary? (These questions were asked because I mentioned NLP project in my resume.)

Ans.

Tokens are individual units of text processed in NLP; unknown tokens can lead to challenges in model performance.

  • Tokens are the smallest units of text, such as words or subwords, used in Natural Language Processing (NLP).

  • For example, the sentence 'I love data analysis' can be tokenized into ['I', 'love', 'data', 'analysis'].

  • If a token is not present in the model's vocabulary, it is often replaced with a special token like '<UNK>' (unknown).

  • This can lead to loss of information...
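The unknown-token fallback can be illustrated with a toy vocabulary. This is a simplified sketch using whitespace tokenization; the helper names are made up, and real NLP models often use subword tokenizers that split unseen words into known pieces instead of mapping them to '<UNK>'.

```python
UNK = "<UNK>"


def tokenize(text):
    """Split text on whitespace (a deliberately simple tokenizer)."""
    return text.split()


def build_vocab(corpus):
    """Assign an integer id to each token seen in the corpus."""
    vocab = {UNK: 0}  # reserve id 0 for unknown tokens
    for sentence in corpus:
        for token in tokenize(sentence):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(text, vocab):
    """Map tokens to ids, falling back to the <UNK> id for unseen tokens."""
    return [vocab.get(token, vocab[UNK]) for token in tokenize(text)]


vocab = build_vocab(["I love data analysis"])
ids = encode("I love data science", vocab)  # 'science' is not in the vocab
print(vocab, ids)
```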

Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2025 Info Edge (India) Ltd.
