Data Analyst
1000+ Data Analyst Interview Questions and Answers
Q51. What is the effect of stock split on market capitalisation of a company?
A stock split increases the number of shares outstanding and decreases the price per share, but does not affect the market capitalisation.
Stock split does not affect the total value of the company
Market capitalisation remains the same after a stock split
Stock split increases the number of shares outstanding and decreases the price per share
For example, if a company has 1 million shares outstanding and the stock splits 2-for-1, the company will have 2 million shares outstandin...read more
Q52. Can we add extra row and pivot table in excel? If yes than how?
Yes, we can add extra row and pivot table in excel.
To add a row, select the row below where you want to add the new row and right-click, then select 'Insert'.
To create a pivot table, select the data range and go to 'Insert' tab, click on 'PivotTable' and follow the prompts.
To add a new pivot table to an existing one, select a cell in the existing pivot table and go to 'Analyze' tab, click on 'PivotTable' and follow the prompts.
Q53. Create the pivot table, sort the data in ascending order
Create a pivot table and sort the data in ascending order.
To create a pivot table, select the data range and go to the 'Insert' tab in Excel.
Choose 'PivotTable' and select the location for the pivot table.
Drag the desired fields to the 'Rows' and 'Values' areas.
To sort the data in ascending order, click on the drop-down arrow next to the field name in the pivot table and select 'Sort A to Z'.
Q54. Given a list of dictionary, find the dictionary which has the count of key highest among all the dictionaries. eg: [{a:5},{b:2}.......] Now here 5 is the highest key value so it should be printed.
Find the dictionary with the highest count of keys in a list of dictionaries.
Iterate through the list of dictionaries and keep track of the dictionary with the highest count of keys.
Use a loop to count the keys in each dictionary and compare it with the current highest count.
Return the dictionary with the highest count of keys.
We were given 2 SQL queries to write in 30 mins.The problems were based on joins.We were given two tables, Employees and Department and were asked to fetch data as instructed.The level of the probl...read more
Q56. What is aadhar card (because i am in aadhhar process)
Aadhaar card is a unique identification card issued by the Indian government to residents of India.
Aadhaar card is a 12-digit unique identification number issued by the Unique Identification Authority of India (UIDAI).
It serves as proof of identity and address for Indian residents.
Aadhaar card contains biometric and demographic information of the cardholder, including fingerprints and iris scans.
It is used for various purposes such as opening bank accounts, getting a SIM card...read more
Share interview questions and help millions of jobseekers 🌟
Q57. What is permutation and combination and how is it used in data science?
Permutation and combination are mathematical concepts used to count the number of possible outcomes in a given scenario.
Permutation is the arrangement of objects in a specific order while combination is the selection of objects without considering the order.
Permutation formula: nPr = n!/(n-r)! where n is the total number of objects and r is the number of objects selected.
Combination formula: nCr = n!/r!(n-r)! where n is the total number of objects and r is the number of objec...read more
Q58. Difference between CSV file and Excel file
CSV files are plain text files that store tabular data, while Excel files are binary files that can contain multiple sheets and complex formatting.
CSV files are simpler and more lightweight compared to Excel files.
CSV files can be easily opened and edited using a text editor, while Excel files require specific software like Microsoft Excel.
CSV files do not support formulas, macros, or formatting options like colors and fonts, while Excel files do.
CSV files have a smaller file...read more
Data Analyst Jobs
Q59. Explain reconciliation assignment that you have done in first step.
Reconciliation assignment involves comparing and matching data from different sources.
Identify sources of data to be reconciled
Compare data from different sources
Match data based on common identifiers
Resolve discrepancies and errors
Ensure accuracy and completeness of reconciled data
Q60. How much amount of data you Handel till now.
I have handled large amounts of data in my previous roles.
I have experience handling terabytes of data in my previous role as a data analyst at XYZ company.
I have worked with data from various sources such as databases, spreadsheets, and APIs.
I have also used tools like SQL, Python, and Excel to manipulate and analyze data.
I am comfortable working with both structured and unstructured data.
I have experience cleaning and transforming data to make it usable for analysis.
Consider a database of authors and books they wrote along with copies each book sold. Find author with highest book sale.
Q62. Whom are you going to query when you have TP discrepant data
I would query the relevant stakeholders involved in the data collection process.
Identify the stakeholders involved in the data collection process
Reach out to them to understand the data collection process and identify any potential issues
Collaborate with them to resolve the TP discrepant data
A marketing campaign is run, how will you decide metrics to be tracked
The market hare for ABC company is declining, how would you suggest to increase their market share, what tactics the company can use.
A business case study was asked.
What all factors are required for a good advertisement and by what factors will you analyze whether the advertisement is doing well or not.
Q66. Suppose we need to find the sales of Rosogolla sweets being sold in Kolkata. What approach would you take?
I would use a combination of primary and secondary research to find the sales of Rosogolla sweets in Kolkata.
Conduct surveys and interviews with local sweet shop owners and customers
Analyze sales data from major sweet shops in Kolkata
Research online for articles and reports on the popularity of Rosogolla in Kolkata
Compare sales data with other popular sweets in Kolkata
Consider factors such as seasonality and festivals that may affect sales
Q67. Guesstimate the number of people travelling by local metro in mumbai
The number of people travelling by local metro in Mumbai is estimated to be in millions.
Mumbai has a population of over 20 million people.
The local metro is a popular mode of transportation in Mumbai.
During peak hours, the metro trains are usually crowded.
The metro network in Mumbai is extensive, covering various parts of the city.
The number of daily commuters on the metro can be estimated based on population density and transportation preferences.
Q68. What is row number, rank, dense rank? explain partition by, indexing joins, query optimization techniques
Explanation of row number, rank, dense rank, partition by, indexing joins, and query optimization techniques.
Row number is a function that assigns a unique number to each row in a result set.
Rank is a function that assigns a rank to each row based on the values in a specified column.
Dense rank is a function that assigns a rank to each row based on the values in a specified column, but with no gaps in the ranking.
Partition by is a clause that divides a result set into partitio...read more
Q69. Filters Transfer data from 1 sheet to other Join , inner, outer, self, left right Along with the syntax Colene function in SQL Show top value show values between a date range Sumifs countifs
Answering questions related to data filtering, transferring, joining, and SQL functions for a Data Analyst role.
To transfer data from one sheet to another, you can use functions like VLOOKUP or INDEX/MATCH in Excel.
Different types of joins in SQL include INNER JOIN, OUTER JOIN, LEFT JOIN, RIGHT JOIN, and SELF JOIN.
Syntax for INNER JOIN: SELECT columns FROM table1 INNER JOIN table2 ON table1.column = table2.column;
Syntax for SUMIFS in Excel: =SUMIFS(sum_range, criteria_range1,...read more
Q70. Excel :- Difference between vlookup and Index(Match)
VLOOKUP searches for a value in the first column of a table and returns a corresponding value in the same row. INDEX/MATCH uses two functions to look up a value based on its position in a table.
VLOOKUP only searches for values in the first column of a table, while INDEX/MATCH can search for values in any column.
INDEX/MATCH is more flexible and efficient for large datasets.
VLOOKUP is easier to use for simple lookups with small datasets.
INDEX/MATCH can handle errors more effect...read more
Q71. what is the difference between a list and a tuple and what is list comprehension?
A list is mutable and can be modified, while a tuple is immutable and cannot be changed. List comprehension is a concise way to create lists.
List: Mutable, can be modified. Tuple: Immutable, cannot be changed.
List: Defined with square brackets []. Tuple: Defined with parentheses ().
List comprehension: Concise way to create lists by iterating over an iterable and applying an expression.
Example of list comprehension: squares = [x**2 for x in range(10)]
Q72. What is the difference between 'Extract Data' and 'Live Connection' in Tableau?
Extract Data saves a snapshot of data in Tableau workbook, while Live Connection directly connects to data source.
Extract Data creates a static copy of data in Tableau workbook, while Live Connection directly queries data source in real-time.
Extract Data is useful for working offline or with small datasets, while Live Connection is ideal for large datasets or when data is frequently updated.
Extract Data can improve performance as it reduces the need to query the data source r...read more
Q73. What is the difference between Tree Map and Heatmap in Tableau?
Tree Map visualizes hierarchical data using nested rectangles, while Heatmap displays data values using color gradients.
Tree Map displays data hierarchically with nested rectangles, where the size and color represent different measures.
Heatmap visualizes data values using color gradients, with darker colors indicating higher values.
Tree Map is useful for showing hierarchical data structures, while Heatmap is effective for comparing values across categories.
Example: Tree Map c...read more
Q74. how to do you detect outliers in the data and how do you treat them
Outliers can be detected using statistical methods and treated by either removing them or replacing them with a more appropriate value.
Use box plots, scatter plots, or histograms to visually identify outliers
Calculate the z-score or interquartile range (IQR) to identify outliers statistically
Remove outliers if they are due to data entry errors or measurement errors
Replace outliers with a more appropriate value if they are valid data points but extreme
Consider the impact of ou...read more
Q75. What do you know about R1RCM and the role you applied for?
R1RCM is a healthcare revenue cycle management company. The role is for a data analyst position.
R1RCM specializes in providing revenue cycle management services to healthcare providers.
The company uses advanced technology and analytics to optimize revenue cycle performance.
As a data analyst, the role involves analyzing and interpreting healthcare data to identify trends and insights.
The data analyst will work closely with other teams to develop strategies for improving revenu...read more
Q76. What is the use if store procedure ?
Stored procedures are precompiled SQL statements that can be reused and executed multiple times.
Stored procedures improve performance by reducing network traffic and improving security.
They can be used to encapsulate business logic and provide a consistent interface to the database.
Stored procedures can also be used to simplify complex queries and transactions.
Examples include procedures for inserting, updating, and deleting data, as well as generating reports and performing ...read more
Q77. Brexit vote and how it would effect the indian economy?
The Brexit vote could have both positive and negative effects on the Indian economy.
Positive effects: Increased trade opportunities with the UK, potential for attracting foreign investments from companies relocating from the UK.
Negative effects: Uncertainty in global markets leading to volatility in exchange rates, potential decline in exports to the UK.
Example: Indian IT companies may face challenges due to stricter immigration policies in the UK.
Example: Indian textile expo...read more
Q78. Compare two arrays in python and print if both of them are same or not?
Compare two arrays in python and print if both of them are same or not.
Use the '==' operator to compare the arrays.
If the arrays have the same elements in the same order, they are considered the same.
If the arrays have different elements or different order, they are considered different.
Print 'Same' if the arrays are the same, otherwise print 'Different'.
A quite Complex Joins Problem.
Q80. Do you have any experience working in cloud environments?
Yes, I have experience working in cloud environments.
I have worked with AWS, Azure, and Google Cloud Platform.
I have experience with cloud-based data storage and processing.
I have used cloud-based tools for data visualization and analysis.
I am familiar with cloud security and compliance measures.
Q81. Use lookup with the product based given data and find the needed data
Using lookup with product-based data to find the needed data.
Use a lookup function like VLOOKUP or INDEX/MATCH to search for the needed data
Identify the key or unique identifier to match the data
Specify the range or table where the data is located
Retrieve the desired data based on the lookup value
Q82. Do you have experience in converting python scripts to R?
Yes, I have experience in converting Python scripts to R.
I have converted several Python scripts to R for data analysis projects.
I am proficient in both Python and R programming languages.
I can provide examples of projects where I have successfully converted Python scripts to R.
Q83. Reliance fresh wants to open a store in north-eastern part of India. Should it or should it not?
Yes, Reliance Fresh should open a store in the north-eastern part of India.
The north-eastern part of India has a growing population and increasing consumer demand.
Reliance Fresh can tap into the untapped market potential in the region.
Opening a store in the north-eastern part of India can help Reliance Fresh expand its presence and increase its market share.
Reliance Fresh can leverage its existing supply chain and distribution network to efficiently serve customers in the reg...read more
Q84. extract the details of all employees whose salary is more than 5000000 and write a query.
Query to extract details of employees with salary more than 5000000.
Use SELECT statement to retrieve employee details
Add WHERE clause to filter employees with salary greater than 5000000
Example: SELECT * FROM employees WHERE salary > 5000000
Q85. Can aggregate functions be used with HAVING clause? With WHERE clause? What is the difference between WHERE and HAVING clause? Can HAVING be used without GROUP BY?
Yes, aggregate functions can be used with HAVING and WHERE clauses. WHERE filters rows before grouping, HAVING filters groups after grouping.
Aggregate functions can be used with HAVING clause to filter groups based on the result of the aggregate function.
Aggregate functions can also be used with WHERE clause to filter rows before grouping.
WHERE clause filters rows before grouping, while HAVING clause filters groups after grouping.
HAVING clause requires a GROUP BY clause, but ...read more
Q86. If we have 3 numerical columns, which chart is a better representation these column?
A scatter plot is a better representation for 3 numerical columns.
Use a scatter plot to show the relationship between the numerical columns.
Scatter plots are effective for visualizing correlations and patterns in data.
Each point on the plot represents a data point with values from all 3 columns.
Q87. How your work will increase our revenue? By sharing the data we can improve our services.
Sharing data insights can help improve services and identify revenue opportunities.
Analyzing customer behavior can help identify areas for improvement
Identifying trends in sales data can help optimize pricing strategies
Using data to personalize marketing efforts can increase customer engagement
Predictive modeling can help identify potential revenue opportunities
Tracking key performance indicators can help optimize business operations
Q88. What are KPIs, for the given data what are possible useful KPIs.
KPIs are Key Performance Indicators used to measure the performance of a business or organization. Possible KPIs for data analysis include revenue, customer acquisition cost, conversion rate, and customer retention rate.
Revenue: Total income generated from sales
Customer acquisition cost: Cost incurred to acquire a new customer
Conversion rate: Percentage of website visitors who take a desired action
Customer retention rate: Percentage of customers who continue to use a product ...read more
Write a query to select the second highest salary details (ID, Name, Salary) from Employees table.
Q90. Describe the steps for Universe design from scratch?
Universe design involves identifying data sources, defining business requirements, creating a schema, and building the universe.
Identify data sources and gather requirements
Create a schema and map data to it
Build the universe using a tool like SAP BusinessObjects
Test and validate the universe
Publish the universe for end-users to access
Q91. How many cups of tea is consumed in Delhi for a week?
It is impossible to accurately determine the number of cups of tea consumed in Delhi for a week without proper data.
There is no official data available on the exact number of cups of tea consumed in Delhi for a week.
The number of cups of tea consumed in Delhi for a week varies depending on various factors such as weather, season, and individual preferences.
It is estimated that Delhi consumes a significant amount of tea due to its cultural and historical significance.
Factors s...read more
Q92. There is table which holds details of matches played between the countries in first 2 columns and who won the match in 3rd column. write a query to get no. of matches played by each team and no. of matches won ...
read moreQuery to get number of matches played and won by each team from a table of match details.
Use GROUP BY clause to group the data by team name.
Use COUNT() function to get the number of matches played by each team.
Use SUM() function with a condition to get the number of matches won by each team.
What is an Index? Explain its different types.
What is Data Integrity?
Q94. Can regression be used to solve Classification Problems?
No, regression is used for predicting continuous values, while classification is used for predicting categorical values.
Regression predicts continuous values, while classification predicts categorical values
Regression models include linear regression, polynomial regression, etc.
Classification models include logistic regression, decision trees, etc.
Q95. 7 gold bars connected with chain. how to give the gold bar to worker for each day, with minimal number if cuts.
Cut the chain twice to give one gold bar each day.
Cut the chain between the first and second gold bar to give one bar each day.
On the last day, cut the chain between the sixth and seventh gold bar to give the last bar.
Total number of cuts required is 2.
Asked on company values and honesty?
Can you handle pressure?
Q97. Why the offline data is so important for any company?
Offline data is important for companies as it provides insights into customer behavior and preferences.
Offline data can help companies understand customer behavior and preferences
It can be used to identify trends and patterns in customer data
Offline data can also be used to improve customer experience and personalize marketing efforts
Examples of offline data include in-store purchases, customer service interactions, and surveys
Offline data can be combined with online data to ...read more
Q98. What is Regression and Classification in Machine Learning??
Regression predicts continuous values while classification predicts categorical values.
Regression is used to predict a continuous output variable based on one or more input variables.
Classification is used to predict a categorical output variable based on one or more input variables.
Regression algorithms include linear regression, polynomial regression, and logistic regression.
Classification algorithms include decision trees, random forests, and support vector machines.
Exampl...read more
Q99. Tell us data science project where you developed a script for forecastings, KNN, Data science models with practical examples.
Developed a script for forecasting using KNN and data science models.
Developed a script using Python to forecast sales for a retail company using KNN algorithm.
Built a data science model to predict customer churn for a telecom company using logistic regression.
Used time series analysis to forecast stock prices for a financial services company.
Developed a script to predict the likelihood of a customer defaulting on a loan using decision trees.
Implemented a recommendation syste...read more
Q100. Compare 2 scorce files record count and print if their are equally or not
Compare record count of 2 score files and print if they are equal or not.
Read both score files
Count the number of records in each file
Compare the record counts
Print if they are equal or not
Interview Questions of Similar Designations
Top Interview Questions for Data Analyst Related Skills
Interview experiences of popular companies
Calculate your in-hand salary
Confused about how your in-hand salary is calculated? Enter your annual salary (CTC) and get your in-hand salary
Reviews
Interviews
Salaries
Users/Month