Data Analyst


200+ Data Analyst Interview Questions and Answers for Freshers

Updated 11 Mar 2025

Q101. Difference between preference and equity shareholders

Ans.

Preference shareholders have fixed dividends and priority over equity shareholders in case of liquidation, while equity shareholders have voting rights and residual claim on assets.

  • Preference shareholders receive fixed dividends before equity shareholders.

  • Preference shareholders have priority over equity shareholders in case of liquidation.

  • Equity shareholders have voting rights in the company.

  • Equity shareholders have a residual claim on assets after all other obligations are ...read more

Q102. Swap L1 and L5 in a list

Ans.

Swap L1 and L5 in a list

  • Create a temporary variable to store the value of L1

  • Assign the value of L5 to L1

  • Assign the value of the temporary variable to L5
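The steps above can be sketched in Python as follows. This is a minimal illustration that assumes L1 and L5 mean the first and fifth elements of the list (indices 0 and 4); the function name swap_first_and_fifth and the sample values are hypothetical.

```python
def swap_first_and_fifth(items):
    """Swap the 1st and 5th elements in place using a temporary variable."""
    if len(items) < 5:
        raise ValueError("list needs at least five elements")
    temp = items[0]        # store the value of the first element
    items[0] = items[4]    # assign the fifth element to the first position
    items[4] = temp        # assign the stored value to the fifth position
    return items


values = [10, 20, 30, 40, 50]
swap_first_and_fifth(values)
print(values)  # [50, 20, 30, 40, 10]
```

In idiomatic Python the temporary variable is usually replaced by tuple unpacking: items[0], items[4] = items[4], items[0].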

Q103. ML algorithms that you have prior knowledge about

Ans.

I have prior knowledge about various machine learning algorithms.

  • Linear Regression

  • Logistic Regression

  • Decision Trees

  • Random Forests

  • Support Vector Machines

  • Naive Bayes

  • K-Nearest Neighbors

  • Gradient Boosting

  • Neural Networks

Q104. What do you know about Data extraction?

Ans.

Data extraction is the process of retrieving data from various sources and transforming it into a usable format.

  • Data extraction involves identifying relevant data sources

  • Data is then extracted using various tools and techniques

  • The extracted data is transformed into a usable format for analysis

  • Common data extraction tools include SQL, ETL, and web scraping

  • Data extraction is a crucial step in the data analysis process

Q105. What is web development? What do you know about my company?

Ans.

Web development is the process of creating websites and web applications using programming languages, frameworks, and tools.

  • Web development involves front-end (client-side) and back-end (server-side) development.

  • Front-end development focuses on the user interface and user experience, using languages like HTML, CSS, and JavaScript.

  • Back-end development involves server-side programming and database management, using languages like PHP, Python, and SQL.

  • Web development also includ...read more

Q106. Give example of a data science project you worked on

Ans.

Developed a predictive model to forecast customer churn for a telecommunications company

  • Collected and cleaned customer data including demographics, usage patterns, and customer service interactions

  • Performed exploratory data analysis to identify key factors influencing customer churn

  • Built a machine learning model using logistic regression to predict likelihood of customer churn

  • Evaluated model performance using metrics such as accuracy, precision, recall, and ROC curve

  • Provided ...read more

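Q107. What are list, tuple, set, and dict comprehensions?

Ans.

List, set, and dict comprehensions are concise ways to create these data structures in Python; there is no tuple comprehension.

  • List comprehension: [x for x in range(10)]

  • Tuple comprehension does not exist in Python: the same syntax inside parentheses creates a generator expression, which can be passed to tuple(), e.g. tuple(x for x in range(10)).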

  • Set comprehension: {x for x in range(10)}

  • Dict comprehension: {x: x**2 for x in range(10)}
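A short runnable sketch of the forms listed above; the variable names are illustrative only. It also shows that parentheses produce a generator rather than a tuple.

```python
squares_list = [x * x for x in range(5)]        # list comprehension
remainders = {x % 3 for x in range(10)}         # set comprehension (duplicates collapse)
square_map = {x: x * x for x in range(5)}       # dict comprehension
lazy_squares = (x * x for x in range(5))        # generator expression, not a tuple
squares_tuple = tuple(x * x for x in range(5))  # build a tuple from a generator

print(squares_list)                 # [0, 1, 4, 9, 16]
print(remainders)                   # {0, 1, 2}
print(square_map)                   # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
print(type(lazy_squares).__name__)  # generator
print(squares_tuple)                # (0, 1, 4, 9, 16)
```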

Q108. What are shares and the kinds of shares?

Ans.

Shares represent ownership in a company and can be classified into different types such as common shares and preferred shares.

  • Shares represent ownership in a company

  • Common shares give voting rights to shareholders

  • Preferred shares have priority in receiving dividends

  • Other types include treasury shares and bonus shares

Q109. Difference between union and union all in SQL

Ans.

Union and Union All both combine the results of two or more SELECT statements; Union removes duplicate rows, while Union All keeps every row, including duplicates.

  • Union removes duplicate rows, Union All includes all rows

  • Union is usually slower because it has to remove duplicates, Union All is faster

  • Both Union and Union All require the same number of columns, with compatible data types, in each SELECT statement
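The difference in duplicate handling can be seen in a small sketch. It runs the SQL through Python's built-in sqlite3 module with an in-memory database; the table names and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_customers (name TEXT);
    CREATE TABLE store_customers (name TEXT);
    INSERT INTO online_customers VALUES ('Asha'), ('Ravi');
    INSERT INTO store_customers VALUES ('Ravi'), ('Meena');
""")

# UNION: 'Ravi' appears in both tables but is returned once.
union_rows = conn.execute(
    "SELECT name FROM online_customers "
    "UNION SELECT name FROM store_customers ORDER BY name"
).fetchall()

# UNION ALL: every row from both tables is kept.
union_all_rows = conn.execute(
    "SELECT name FROM online_customers "
    "UNION ALL SELECT name FROM store_customers ORDER BY name"
).fetchall()

print(union_rows)      # [('Asha',), ('Meena',), ('Ravi',)]
print(union_all_rows)  # [('Asha',), ('Meena',), ('Ravi',), ('Ravi',)]
```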

Q110. Explain different kinds of joins with example

Ans.

Different kinds of joins in SQL are inner join, left join, right join, and full outer join.

  • Inner join: Returns rows when there is a match in both tables.

  • Left join: Returns all rows from the left table and the matched rows from the right table.

  • Right join: Returns all rows from the right table and the matched rows from the left table.

  • Full outer join: Returns all rows from both tables, matching them where possible and filling the columns of the unmatched side with NULL.
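As an example, the sketch below runs an inner join and a left join through Python's built-in sqlite3 module. The employees and departments tables are hypothetical. Right and full outer joins are left out because older SQLite versions do not support them; a right join gives the same rows as a left join with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Finance');
    INSERT INTO employees VALUES (1, 'Asha', 1), (2, 'Ravi', NULL);
""")

# Inner join: only employees that have a matching department.
inner_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees AS e
    INNER JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Left join: every employee, with NULL (None) where no department matches.
left_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees AS e
    LEFT JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(inner_rows)  # [('Asha', 'Sales')]
print(left_rows)   # [('Asha', 'Sales'), ('Ravi', None)]
```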

Q111. Explain the difference between DBMS and RDBMS?

Ans.

DBMS is a software system that manages databases, while RDBMS is a type of DBMS that stores data in a structured format using tables.

  • DBMS stands for Database Management System, which is a software system that allows users to interact with a database.

  • RDBMS stands for Relational Database Management System, which is a type of DBMS that stores data in a structured format using tables with relationships between them.

  • RDBMS enforces ACID properties (Atomicity, Consistency, Isolation...read more

Q112. different operations using NumPy and pandas

Ans.

NumPy and pandas are powerful libraries for data analysis and manipulation. They offer various operations for different tasks.

  • NumPy is used for numerical operations on arrays and matrices

  • Pandas is used for data manipulation and analysis

  • NumPy offers functions for mathematical operations like addition, subtraction, multiplication, etc.

  • Pandas offers functions for data cleaning, filtering, merging, and grouping

  • NumPy arrays are homogeneous while pandas data frames are heterogeneou...read more

Q113. what is primary key and foreign key?

Ans.

Primary key uniquely identifies each record in a table, while foreign key establishes a link between two tables.

  • Primary key ensures each record is unique

  • Foreign key establishes a relationship between tables

  • Primary key can be a single column or a combination of columns

  • Foreign key references the primary key of another table
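A small sketch of both constraints, using Python's built-in sqlite3 module. The customers and orders tables and the helper try_insert are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is an SQLite-specific detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,   -- uniquely identifies each customer
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Asha');
    INSERT INTO orders VALUES (100, 1, 250.0);
""")


def try_insert(sql):
    """Run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same primary key value twice: violates uniqueness.
duplicate_key = try_insert("INSERT INTO customers VALUES (1, 'Ravi')")
# Order pointing at a customer that does not exist: violates the foreign key.
orphan_order = try_insert("INSERT INTO orders VALUES (101, 99, 80.0)")

print(duplicate_key)
print(orphan_order)
```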

Q114. How will you merge multiple files in Python?

Ans.

Merge multiple files in Python using pandas.concat() or pd.merge() functions.

  • Use pandas.concat() function to merge multiple files vertically (row-wise).

  • Use pd.merge() function to merge multiple files horizontally (column-wise) based on a common column.

  • Ensure that the files have compatible column names and data types before merging.

  • Handle any missing or duplicate values appropriately during the merging process.

  • Consider using parameters like 'axis', 'join', 'on', 'how', and 'su...read more
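The two approaches can be sketched as below, assuming pandas is installed. The CSV contents are hypothetical and held in memory with io.StringIO so the example runs without files on disk; with real files, pass the file paths to pd.read_csv instead.

```python
import io

import pandas as pd

# In-memory stand-ins for CSV files on disk (hypothetical data).
jan_csv = io.StringIO("order_id,customer_id,amount\n1,10,250\n2,11,90\n")
feb_csv = io.StringIO("order_id,customer_id,amount\n3,10,120\n")
customers_csv = io.StringIO("customer_id,name\n10,Asha\n11,Ravi\n")

# Row-wise: stack files that share the same columns.
orders = pd.concat([pd.read_csv(f) for f in (jan_csv, feb_csv)], ignore_index=True)

# Column-wise: bring in columns from another file through a common key.
customers = pd.read_csv(customers_csv)
merged = pd.merge(orders, customers, on="customer_id", how="left")

print(merged)
```

Using how="left" keeps every order even if its customer is missing from the second file; how="inner" would drop such rows.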

Q115. What projects did you do during engineering?

Ans.

During my engineering, I worked on projects related to data analysis, machine learning, and software development.

  • Developed a predictive model for stock price forecasting using machine learning algorithms

  • Created a data visualization dashboard for analyzing customer behavior patterns

  • Implemented a sentiment analysis tool for social media data using natural language processing techniques

Q116. What is money and what are the types of money?

Ans.

Money is a medium of exchange that is widely accepted in transactions and represents value.

  • Money is a form of currency used to facilitate trade and commerce.

  • It can be in the form of physical objects like coins and banknotes, or digital representations like electronic money.

  • Money serves as a store of value, unit of account, and a medium of exchange.

  • Types of money include fiat money, commodity money, and representative money.

  • Fiat money is government-issued currency that is not ...read more

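Q117. What is SCADA? What are a transformer and a motor?

Ans.

SCADA stands for Supervisory Control and Data Acquisition, a system used to monitor and control industrial processes. A transformer is a static device that transfers electrical energy between circuits, usually changing the voltage level, while a motor converts electrical energy into mechanical motion; they are separate machines, and a motor is not used to drive a transformer.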

  • SCADA is a system used to remotely monitor and control industrial processes

  • It collects real-time data from sensors and equipment in the field

  • SCADA systems are commonly used in industries such as power plants, water treatment facilities, and manufacturing plants

  • A transformer motor is an electric ...read more

Q118. MySQL code using window functions

Ans.

MySQL window functions (available from MySQL 8.0) perform calculations across a set of rows related to the current row, without collapsing those rows into one.

  • Window functions are used to perform calculations across rows in a result set

  • They are used with the OVER() clause

  • Examples include ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE()

  • They can be used to calculate running totals, moving averages, and more
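A running total is a typical use. The sketch below is an assumption-laden stand-in: it runs the query in SQLite (3.25 or later) through Python's sqlite3 module rather than in MySQL, and the daily_sales table is hypothetical. The SELECT statement uses the standard OVER() syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_sales (sale_day INTEGER, amount INTEGER);
    INSERT INTO daily_sales VALUES (1, 100), (2, 50), (3, 75);
""")

# SUM() with OVER (ORDER BY ...) adds up all rows up to the current one,
# while still returning one output row per input row.
rows = conn.execute("""
    SELECT sale_day,
           amount,
           SUM(amount) OVER (ORDER BY sale_day) AS running_total
    FROM daily_sales
    ORDER BY sale_day
""").fetchall()

print(rows)  # [(1, 100, 100), (2, 50, 150), (3, 75, 225)]
```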

Q119. What are the phases of a clinical trial?

Ans.

The phases of a clinical trial are crucial stages in testing the safety and effectiveness of a new treatment or intervention.

  • Phase 1: Small group of healthy volunteers to test safety and dosage.

  • Phase 2: Larger group to further evaluate safety and effectiveness.

  • Phase 3: Large group to confirm effectiveness, monitor side effects, and compare to existing treatments.

  • Phase 4: Post-marketing surveillance after treatment is approved and on the market.

Q120. Comfort with different types of content

Ans.

Being comfortable with different types of content is essential for a data analyst to effectively analyze and interpret data.

  • Understanding and analyzing structured data such as numerical data in spreadsheets

  • Analyzing unstructured data like text documents, social media posts, and emails

  • Working with multimedia content like images and videos

  • Ability to interpret and analyze data from various sources and formats

  • Experience with different data visualization techniques to present finding...read more

Q121. Difference between truncate and delete commands

Ans.

Truncate and delete are SQL commands used to remove data from a table, but they differ in their functionality.

  • Truncate is a DDL command that removes all rows from a table, but keeps the structure intact.

  • Delete is a DML command that removes specific rows from a table based on a condition.

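  • Truncate is usually faster than delete because it does not log each removed row individually.

  • Delete can be rolled back within a transaction; whether truncate can be rolled back depends on the database (it can inside a transaction in SQL Server and PostgreSQL, but it commits implicitly in MySQL and Oracle).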

  • Truncate resets the identity seed of the table, while d...read more

Q122. What is the lifecycle of data

Ans.

The lifecycle of data refers to the stages of data from its creation to its disposal.

  • Data creation

  • Data storage

  • Data processing and analysis

  • Data sharing and dissemination

  • Data archiving and disposal

Q123. Various machine learning algorithm and application

Ans.

Machine learning algorithms are used in various applications such as image recognition, natural language processing, and predictive analytics.

  • Supervised learning algorithms: linear regression, logistic regression, decision trees, random forests, support vector machines

  • Unsupervised learning algorithms: k-means clustering, hierarchical clustering, principal component analysis

  • Deep learning algorithms: convolutional neural networks, recurrent neural networks

  • Applications: image re...read more

Q124. What are joins and its types?

Ans.

Joins are used to combine rows from two or more tables based on a related column between them.

  • Types of joins include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.

  • INNER JOIN returns rows when there is at least one match in both tables.

  • LEFT JOIN returns all rows from the left table and the matched rows from the right table.

  • RIGHT JOIN returns all rows from the right table and the matched rows from the left table.

  • FULL JOIN returns rows when there is a match in one of the tabl...read more

Q125. What is an inactive relationship in Power BI

Ans.

An inactive relationship in Power BI is a relationship that exists in the data model but is not used by default to filter data, because only one active relationship path is allowed between two tables.

  • Inactive relationships typically arise when two tables are related through more than one column, for example a date table linked to both an order date and a ship date.

  • An inactive relationship has no effect on visuals or measures unless it is explicitly activated.

  • To use one in a calculation, activate it inside a measure with the DAX function USERELATIONSHIP.

  • Inactive relationships are indicated by a dashed line in the ...read more

Q126. why data analysis and data science

Ans.

Data analysis and data science are crucial for extracting valuable insights from large datasets to drive informed decision-making.

  • Data analysis and data science help in uncovering patterns and trends within data.

  • They enable businesses to make data-driven decisions for improved efficiency and effectiveness.

  • These fields also play a vital role in predictive analytics and forecasting.

  • Examples include using machine learning algorithms to predict customer behavior or analyzing heal...read more

Q127. What are the key features of Python programming?

Ans.

Key features of Python programming include simplicity, readability, versatility, and extensive libraries.

  • Python is known for its simple and readable syntax, making it easy to learn and understand.

  • It is a versatile language that can be used for various purposes such as web development, data analysis, and artificial intelligence.

  • Python has a vast collection of libraries and frameworks that provide ready-to-use tools for different tasks, such as NumPy for scientific computing an...read more

Q128. Explain classes and objects

Ans.

Classes are blueprints for creating objects in object-oriented programming.

  • Classes define the properties and behaviors of objects

  • Objects are instances of classes

  • Classes can inherit properties and behaviors from other classes

  • Classes can have constructors to initialize objects

  • Classes can have methods to perform actions
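The points above can be illustrated with a short Python sketch; the Employee and Manager classes and their attributes are hypothetical.

```python
class Employee:
    """Blueprint describing what every employee object stores and can do."""

    def __init__(self, name, monthly_salary):   # constructor initialises the object
        self.name = name
        self.monthly_salary = monthly_salary

    def annual_salary(self):                    # method shared by all instances
        return self.monthly_salary * 12


class Manager(Employee):                        # inherits from Employee
    def __init__(self, name, monthly_salary, team_size):
        super().__init__(name, monthly_salary)  # reuse the parent constructor
        self.team_size = team_size


analyst = Employee("Asha", 50000)               # an object is an instance of a class
lead = Manager("Ravi", 80000, team_size=4)

print(analyst.annual_salary())     # 600000
print(lead.annual_salary())        # 960000 (method inherited from Employee)
print(isinstance(lead, Employee))  # True
```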

Q129. Explain Moving Average

Ans.

Moving Average is a statistical technique used to analyze data points by creating a series of averages of different subsets of the full data set.

  • Moving Average helps in smoothing out fluctuations in data to identify trends over time.

  • It is calculated by taking the average of a specific number of data points within a defined window.

  • For example, a 3-day moving average for stock prices would be the average of the stock prices over the past 3 days.

  • Moving Average is commonly used i...read more
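A simple moving average can be sketched in plain Python as follows. The function name moving_average and the sample prices are illustrative; this is the unweighted form, with one average per full window.

```python
def moving_average(values, window):
    """Return the simple moving average of values for the given window size."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window   # average of the last `window` points
        for i in range(window - 1, len(values))
    ]


prices = [10, 20, 30, 40, 50]
print(moving_average(prices, 3))  # [20.0, 30.0, 40.0]
```

The first value, 20.0, is the average of 10, 20 and 30; each later value drops the oldest price and adds the next one.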

Q130. Tell me about the types of joins in SQL.

Ans.

Types of joins in SQL include inner join, left join, right join, and full outer join.

  • Inner join: Returns rows when there is a match in both tables.

  • Left join: Returns all rows from the left table and the matched rows from the right table.

  • Right join: Returns all rows from the right table and the matched rows from the left table.

  • Full outer join: Returns all rows from both tables, matching them where possible and filling the columns of the unmatched side with NULL.

Q131. Write a SQL query, what is vlookup etc

Ans.

A SQL query is a command used to retrieve data from a database. VLOOKUP is an Excel function used to search for a value in a table.

  • SQL query retrieves data from a database using SELECT, FROM, WHERE clauses

  • VLOOKUP searches for a value in a table and returns a corresponding value from a specified column

  • Example SQL query: SELECT * FROM table_name WHERE column_name = 'value'

  • Example VLOOKUP formula: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])

Q132. What is RLS in Power BI

Ans.

RLS in Power BI stands for Row-Level Security, which allows users to restrict access to certain rows of data based on their role or permissions.

  • RLS helps in controlling access to data at the row level

  • It allows users to define security roles and rules to restrict data access

  • Users can create filters based on roles to limit data visibility

  • For example, a manager can only see data related to their department using RLS

Q133. Difference between rank, dense rank and row number

Ans.

Rank gives tied rows the same rank and leaves gaps after ties, Dense Rank gives tied rows the same rank without gaps, and Row Number assigns a unique sequential number to each row.

  • Rank function assigns the same rank to rows with equal values in the ordering column and skips the ranks that follow a tie.

  • Dense Rank function also assigns the same rank to tied rows, but the next rank is consecutive, without any gaps.

  • Row Number function assigns a unique number to each row in the result set.

  • Example: If we have scores of 90, 85, 85, 70, then Rank would be 1, 2, 2, 4; Dense Rank would be 1, 2, 2, 3; Row...read more
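The same scores can be checked with a small sketch that runs the three functions in SQLite (3.25 or later) through Python's sqlite3 module. The scores table and the student_id tie-breaker for ROW_NUMBER() are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (student_id INTEGER PRIMARY KEY, score INTEGER);
    INSERT INTO scores VALUES (1, 90), (2, 85), (3, 85), (4, 70);
""")

rows = conn.execute("""
    SELECT score,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY score DESC, student_id) AS row_num
    FROM scores
    ORDER BY score DESC, student_id
""").fetchall()

for row in rows:
    print(row)
# (90, 1, 1, 1)
# (85, 2, 2, 2)
# (85, 2, 2, 3)
# (70, 4, 3, 4)
```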

Q134. Write a program to find if a number is prime

Ans.

Program to check if a number is prime

  • Iterate from 2 to square root of the number

  • Check if the number is divisible by any number in the range

  • If divisible by any number, it is not prime

  • If not divisible by any number, it is prime
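A minimal Python sketch of these steps. It adds one assumption the list does not state: numbers below 2 are treated as not prime.

```python
import math


def is_prime(n):
    """Return True if n is prime, checking divisors up to the square root of n."""
    if n < 2:
        return False                      # 0, 1 and negatives are not prime
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False                  # found a divisor, so n is not prime
    return True                           # no divisor found


print([n for n in range(20) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```

Checking only up to the square root is enough because any factor larger than the square root pairs with one smaller than it.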

Q135. What is artificial intelligence

Ans.

Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems.

  • AI involves machines performing tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation.

  • Examples of AI include virtual assistants like Siri and Alexa, self-driving cars, recommendation systems like Netflix's algorithm, and facial recognition technology.

  • AI can be categorized into nar...read more

Q136. How to merge 2 csv files

Ans.

To merge two CSV files, you can use software like Microsoft Excel or programming languages like Python.

  • Open both CSV files in a software like Microsoft Excel.

  • Copy the data from one CSV file and paste it into the other CSV file.

  • Save the merged CSV file with a new name.

  • Alternatively, you can use programming languages like Python to merge CSV files by reading both files, combining the data, and writing to a new file.
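The Python route can be sketched with only the standard library. The sketch assumes both files share the same header row; the function name merge_csv and the sample data are hypothetical, and io.StringIO stands in for files on disk.

```python
import csv
import io


def merge_csv(first, second, output):
    """Append the rows of two CSV files that share the same header."""
    reader_a = csv.DictReader(first)
    reader_b = csv.DictReader(second)
    if reader_a.fieldnames != reader_b.fieldnames:
        raise ValueError("CSV files must have the same columns")
    writer = csv.DictWriter(output, fieldnames=reader_a.fieldnames, lineterminator="\n")
    writer.writeheader()                 # write the shared header once
    for reader in (reader_a, reader_b):
        writer.writerows(reader)         # copy data rows from each file in turn


file_a = io.StringIO("id,city\n1,Chennai\n2,Hyderabad\n")
file_b = io.StringIO("id,city\n3,Pune\n")
merged = io.StringIO()
merge_csv(file_a, file_b, merged)
print(merged.getvalue())
```

With real files, open each one with open(path, newline="") and pass the file objects in the same way.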

Q137. What is Python Generators

Ans.

Python generators are functions that return an iterator and generate values on the fly.

  • Generators allow us to generate a sequence of values on the fly, without having to store them in memory.

  • They are defined using the 'yield' keyword instead of 'return'.

  • Generators can be used to iterate over large datasets, or to generate an infinite sequence of values.

  • Example: def my_generator(): yield 1; yield 2; yield 3;

  • Example: for i in my_generator(): print(i)
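A slightly fuller sketch of the same idea, including an infinite generator; the function names countdown and naturals are illustrative.

```python
from itertools import islice


def countdown(start):
    """Yield start, start - 1, ..., 1, producing one value at a time."""
    while start > 0:
        yield start
        start -= 1


def naturals():
    """Yield 1, 2, 3, ... without end; values are produced only when requested."""
    n = 1
    while True:
        yield n
        n += 1


print(list(countdown(3)))           # [3, 2, 1]
print(list(islice(naturals(), 5)))  # [1, 2, 3, 4, 5]
```

islice takes only the first five values, so the infinite generator never has to store or compute the rest.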

Q138. function to create a transaction bin column

Ans.

Create a function to generate a transaction bin column based on transaction amounts.

  • Create bins based on transaction amounts (e.g. $0-$100, $101-$200, etc.)

  • Use pandas cut() function in Python to create bins

  • Assign bin labels to the transactions based on the bin ranges
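One possible version of such a function, assuming pandas is installed. The column name amount, the bin edges, the labels, and the function name add_transaction_bin are all hypothetical choices for the example.

```python
import pandas as pd


def add_transaction_bin(df, amount_col="amount"):
    """Return a copy of df with a 'transaction_bin' column derived from amount_col."""
    edges = [0, 100, 200, float("inf")]           # hypothetical bin boundaries
    labels = ["$0-$100", "$101-$200", "$201+"]    # one label per interval
    result = df.copy()
    result["transaction_bin"] = pd.cut(
        result[amount_col],
        bins=edges,
        labels=labels,
        include_lowest=True,   # make the first bin include an amount of exactly 0
    )
    return result


transactions = pd.DataFrame({"amount": [0, 45, 100, 150, 999]})
binned = add_transaction_bin(transactions)
print(binned)
```

By default pd.cut treats each bin as closed on the right, so an amount of exactly 100 falls in the first bin. The labels assume whole-number amounts.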

Q139. Import vs Direct Query vs Live connection

Ans.

Import vs Direct Query vs Live connection

  • Import: Data is imported into the tool for analysis, suitable for small datasets or when real-time data is not required

  • Direct Query: Data is queried directly from the source in real-time, suitable for large datasets or when up-to-date data is needed

  • Live connection: Data is connected live to the source, allowing for real-time updates and analysis without storing data locally

Q140. How do you match our values?

Ans.

I match your values through my commitment to integrity, teamwork, and continuous improvement.

  • I prioritize honesty and transparency in all data analysis processes.

  • I value collaboration and actively seek input from team members to enhance project outcomes.

  • I am dedicated to ongoing learning and skill development to stay current in the field.

  • I strive for excellence in all tasks and take pride in delivering high-quality work.

  • I am passionate about making a positive impact through d...read more

Q141. difference between adsl and blob storage

Ans.

ADSL is a type of internet connection technology, while Blob storage is a type of cloud storage service.

  • ADSL (Asymmetric Digital Subscriber Line) is a type of broadband connection that uses existing telephone lines to transmit data.

  • Blob storage is a cloud object storage service; Azure Blob Storage is Microsoft's offering, comparable to Amazon S3 and Google Cloud Storage.

  • ADSL typically has slower upload speeds compared to download speeds, while Blob storage is designed for storing large amounts of unstructured da...read more

Q142. What is capital Market.

Ans.

Capital market is a financial market where long-term securities such as stocks, bonds, and other investments are traded.

  • Capital market is a platform for companies to raise funds for long-term investments.

  • It includes both primary and secondary markets.

  • Investors can buy and sell securities in the capital market.

  • Examples of capital market include stock exchanges, bond markets, and derivatives markets.

Q143. Why data analytics?

Ans.

Data analytics allows for uncovering valuable insights from data to drive informed decision-making and improve business performance.

  • Data analytics helps in identifying trends and patterns within data sets

  • It enables businesses to make data-driven decisions for better outcomes

  • Data analytics can optimize processes and improve efficiency

  • It helps in predicting future trends and behaviors based on historical data

  • Data analytics is essential for measuring the success of strategies an...read more

Q144. What are SQL and Power BI?

Ans.

SQL is a query language used for managing and querying relational databases. Power BI is a business analytics tool for visualizing and analyzing data.

  • SQL stands for Structured Query Language and is used to communicate with databases.

  • SQL can be used to retrieve, update, and manipulate data in databases.

  • Power BI is a data visualization tool that allows users to create interactive reports and dashboards.

  • Power BI can connect to various data sources, clean and transform data, and creat...read more

Q145. What are aggregate functions?

Ans.

Aggregate functions are functions in databases that perform a calculation on a set of values and return a single value.

  • Aggregate functions are used to perform operations on a group of rows and return a single result.

  • Common aggregate functions include SUM, AVG, COUNT, MIN, and MAX.

  • For example, SUM function calculates the total of a column, AVG calculates the average, COUNT counts the number of rows, MIN finds the minimum value, and MAX finds the maximum value.

Q146. Difference between Group by and Having

Ans.

Group by is used to group rows that have the same values into summary rows, while Having is used to filter groups based on a specified condition.

  • Group by is used with aggregate functions to group rows based on one or more columns.

  • Having is used to filter groups based on a specified condition after the group by operation.

  • Group by is used before the Having clause in a SQL query.
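A small sketch of the two clauses working together, run through Python's built-in sqlite3 module; the orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('Asha', 100), ('Asha', 300), ('Ravi', 50), ('Meena', 500);
""")

# GROUP BY builds one summary row per customer;
# HAVING then keeps only the groups whose total exceeds 350.
rows = conn.execute("""
    SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 350
    ORDER BY customer
""").fetchall()

print(rows)  # [('Asha', 2, 400), ('Meena', 1, 500)]
```

Ravi's group (total 50) is removed by HAVING. A WHERE clause, by contrast, would filter individual rows before grouping.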

Q147. Different types of table joins in SQL

Ans.

Different types of table joins in SQL include inner join, left join, right join, and full outer join.

  • Inner join: Returns rows when there is a match in both tables.

  • Left join: Returns all rows from the left table and the matched rows from the right table.

  • Right join: Returns all rows from the right table and the matched rows from the left table.

  • Full outer join: Returns all rows from both tables, matching them where possible and filling the columns of the unmatched side with NULL.

Q148. Name different international stock exchanges

Ans.

Some of the major international stock exchanges include NYSE, NASDAQ, LSE, TSX, and HKEX.

  • NYSE - New York Stock Exchange

  • NASDAQ - National Association of Securities Dealers Automated Quotations

  • LSE - London Stock Exchange

  • TSX - Toronto Stock Exchange

  • HKEX - Hong Kong Stock Exchange

Q149. Why Datascience?

Ans.

Data science is the key to unlocking insights and making data-driven decisions.

  • Data science helps to extract meaningful insights from data

  • It enables businesses to make data-driven decisions

  • It involves various techniques such as machine learning, data mining, and statistical analysis

  • Data science is used in various industries such as healthcare, finance, and e-commerce

Q150. Short term and long term goal

Ans.

Short term goal is to enhance data analysis skills, long term goal is to become a data science expert.

  • Short term goal: Improve proficiency in SQL, Python, and data visualization tools

  • Long term goal: Obtain advanced certifications in machine learning and AI

  • Short term goal: Complete online courses on statistical analysis and data cleaning

  • Long term goal: Lead data science projects and mentor junior analysts
