Data Analyst
200+ Data Analyst Interview Questions and Answers for Freshers
Q101. Difference between preference and equity shareholders
Preference shareholders have fixed dividends and priority over equity shareholders in case of liquidation, while equity shareholders have voting rights and residual claim on assets.
Preference shareholders receive fixed dividends before equity shareholders.
Preference shareholders have priority over equity shareholders in case of liquidation.
Equity shareholders have voting rights in the company.
Equity shareholders have a residual claim on assets after all other obligations are ...read more
Q102. Swap L1 and L5 in a list
Swap L1 and L5 in a list
Create a temporary variable to store the value of L1
Assign the value of L5 to L1
Assign the value of the temporary variable to L5
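The three steps above can be sketched in Python as follows. This is a minimal sketch that assumes "L1" and "L5" mean the first and fifth elements of the list (0-based indices 0 and 4); the function name `swap_positions` is illustrative.

```python
def swap_positions(items, i, j):
    """Swap items[i] and items[j] in place using a temporary variable."""
    temp = items[i]        # store the first value
    items[i] = items[j]    # overwrite it with the second value
    items[j] = temp        # put the stored value into the second slot
    return items


values = [10, 20, 30, 40, 50]
swap_positions(values, 0, 4)   # first and fifth elements
print(values)                  # [50, 20, 30, 40, 10]
```

In everyday Python the same swap is usually written without a temporary variable: `items[i], items[j] = items[j], items[i]`.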
Q103. ML algorithms that you have prior knowledge about
I have prior knowledge about various machine learning algorithms.
Linear Regression
Logistic Regression
Decision Trees
Random Forests
Support Vector Machines
Naive Bayes
K-Nearest Neighbors
Gradient Boosting
Neural Networks
Q104. What do you know about Data extraction?
Data extraction is the process of retrieving data from various sources and transforming it into a usable format.
Data extraction involves identifying relevant data sources
Data is then extracted using various tools and techniques
The extracted data is transformed into a usable format for analysis
Common data extraction methods include SQL queries, ETL tools, and web scraping
Data extraction is a crucial step in the data analysis process
Q105. What is web development? What do you know about my company?
Web development is the process of creating websites and web applications using programming languages, frameworks, and tools.
Web development involves front-end (client-side) and back-end (server-side) development.
Front-end development focuses on the user interface and user experience, using languages like HTML, CSS, and JavaScript.
Back-end development involves server-side programming and database management, using languages like PHP, Python, and SQL.
Web development also includ...read more
Q106. Give example of a data science project you worked on
Developed a predictive model to forecast customer churn for a telecommunications company
Collected and cleaned customer data including demographics, usage patterns, and customer service interactions
Performed exploratory data analysis to identify key factors influencing customer churn
Built a machine learning model using logistic regression to predict likelihood of customer churn
Evaluated model performance using metrics such as accuracy, precision, recall, and ROC curve
Provided ...read more
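Q107. What are list, tuple, set, and dict comprehensions?
List, set, and dict comprehensions are concise ways to create these data structures in Python; tuples have no comprehension syntax of their own.
List comprehension: [x for x in range(10)]
Tuple comprehension does not exist in Python: (x for x in range(10)) is a generator expression, so a tuple is built with tuple(x for x in range(10)).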
Set comprehension: {x for x in range(10)}
Dict comprehension: {x: x**2 for x in range(10)}
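A short runnable sketch of the forms listed above, including what the parenthesized form actually produces. The variable names are illustrative only.

```python
squares_list = [x * x for x in range(5)]         # list comprehension
remainders = {x % 3 for x in range(10)}          # set comprehension (duplicates collapse)
square_map = {x: x ** 2 for x in range(5)}       # dict comprehension
lazy = (x * x for x in range(5))                 # generator expression, not a tuple
squares_tuple = tuple(x * x for x in range(5))   # tuple built from a generator expression

print(squares_list)    # [0, 1, 4, 9, 16]
print(squares_tuple)   # (0, 1, 4, 9, 16)
```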
Q108. What are shares and the kinds of shares?
Shares represent ownership in a company and can be classified into different types such as common shares and preferred shares.
Shares represent ownership in a company
Common shares give voting rights to shareholders
Preferred shares have priority in receiving dividends
Other types include treasury shares and bonus shares
Q109. Difference between union and union all in SQL
UNION and UNION ALL both combine the results of two or more SELECT statements; UNION removes duplicate rows, while UNION ALL keeps every row.
UNION removes duplicate rows, UNION ALL includes all rows, including duplicates
UNION is usually slower because it has to detect and remove duplicates; UNION ALL skips that step
Both require the same number of columns, with compatible data types, in each SELECT statement
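The difference in duplicate handling can be seen with a small in-memory SQLite database driven from Python. The table and column names (`online_orders`, `store_orders`, `customer`) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_orders (customer TEXT);
    CREATE TABLE store_orders  (customer TEXT);
    INSERT INTO online_orders VALUES ('Asha'), ('Ben');
    INSERT INTO store_orders  VALUES ('Ben'), ('Chen');
""")

# UNION: 'Ben' appears in both tables but is returned once
union_rows = conn.execute(
    "SELECT customer FROM online_orders "
    "UNION SELECT customer FROM store_orders ORDER BY customer"
).fetchall()

# UNION ALL: every row from both queries is kept
union_all_rows = conn.execute(
    "SELECT customer FROM online_orders "
    "UNION ALL SELECT customer FROM store_orders ORDER BY customer"
).fetchall()

print(union_rows)       # [('Asha',), ('Ben',), ('Chen',)]
print(union_all_rows)   # [('Asha',), ('Ben',), ('Ben',), ('Chen',)]
```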
Q110. Explain different kinds of joins with example
Different kinds of joins in SQL are inner join, left join, right join, and full outer join.
Inner join: Returns rows when there is a match in both tables.
Left join: Returns all rows from the left table and the matched rows from the right table.
Right join: Returns all rows from the right table and the matched rows from the left table.
Full outer join: Returns all rows from both tables, matching them where possible and filling the unmatched side with NULLs.
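Since the question asks for an example, here is a minimal sketch using an in-memory SQLite database from Python. The `employees` and `departments` tables are hypothetical. Only INNER JOIN and LEFT JOIN are run, because RIGHT JOIN and FULL OUTER JOIN require SQLite 3.39 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER, dept_name TEXT);
    CREATE TABLE employees   (emp_name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Finance');
    INSERT INTO employees   VALUES ('Asha', 1), ('Ben', 3);
""")

# INNER JOIN: only employees whose dept_id exists in departments
inner_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    INNER JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_name
""").fetchall()

# LEFT JOIN: every employee; dept_name is NULL (None) when nothing matches
left_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_name
""").fetchall()

print(inner_rows)   # [('Asha', 'Sales')]
print(left_rows)    # [('Asha', 'Sales'), ('Ben', None)]
```

A RIGHT JOIN gives the same result as a LEFT JOIN with the two tables swapped, so here it would keep the 'Finance' department even though no employee belongs to it.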
Q111. Explain the difference between DBMS and RDBMS?
DBMS is a software system that manages databases, while RDBMS is a type of DBMS that stores data in a structured format using tables.
DBMS stands for Database Management System, which is a software system that allows users to interact with a database.
RDBMS stands for Relational Database Management System, which is a type of DBMS that stores data in a structured format using tables with relationships between them.
RDBMS enforces ACID properties (Atomicity, Consistency, Isolation...read more
Q112. Different operations using NumPy and pandas
NumPy and pandas are powerful libraries for data analysis and manipulation. They offer various operations for different tasks.
NumPy is used for numerical operations on arrays and matrices
Pandas is used for data manipulation and analysis
NumPy offers functions for mathematical operations like addition, subtraction, multiplication, etc.
Pandas offers functions for data cleaning, filtering, merging, and grouping
NumPy arrays are homogeneous while pandas data frames are heterogeneou...read more
Q113. what is primary key and foreign key?
Primary key uniquely identifies each record in a table, while foreign key establishes a link between two tables.
Primary key ensures each record is unique
Foreign key establishes a relationship between tables
Primary key can be a single column or a combination of columns
Foreign key references the primary key of another table
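The following sketch shows both constraints being enforced, using SQLite from Python. The `customers` and `orders` tables and the helper `try_insert` are illustrative assumptions. Note that SQLite only enforces foreign keys when `PRAGMA foreign_keys` is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    INSERT INTO customers VALUES (1, 'Asha');
    INSERT INTO orders VALUES (100, 1);
""")


def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_pk = try_insert("INSERT INTO customers VALUES (1, 'Ben')")  # key 1 already exists
orphan_fk = try_insert("INSERT INTO orders VALUES (101, 99)")         # no customer 99
valid_fk = try_insert("INSERT INTO orders VALUES (102, 1)")           # customer 1 exists

print(duplicate_pk, orphan_fk, valid_fk)   # rejected rejected ok
```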
Q114. How will you merge multiple files in Python?
Merge multiple files in Python using pandas.concat() or pd.merge() functions.
Use pandas.concat() function to merge multiple files vertically (row-wise).
Use pd.merge() function to merge multiple files horizontally (column-wise) based on a common column.
Ensure that the files have compatible column names and data types before merging.
Handle any missing or duplicate values appropriately during the merging process.
Consider using parameters like 'axis', 'join', 'on', 'how', and 'su...read more
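A minimal sketch of both approaches, assuming pandas is installed. In-memory strings stand in for real CSV files, and the column names (`order_id`, `amount`, `customer`) are invented for the example.

```python
import io

import pandas as pd

# Stand-ins for files on disk; with real files, pass the paths to pd.read_csv
jan_csv = io.StringIO("order_id,amount\n1,100\n2,250\n")
feb_csv = io.StringIO("order_id,amount\n3,75\n")
customers_csv = io.StringIO("order_id,customer\n1,Asha\n3,Chen\n")

# Row-wise: stack files that share the same columns
monthly = [pd.read_csv(f) for f in (jan_csv, feb_csv)]
all_orders = pd.concat(monthly, ignore_index=True)

# Column-wise: add columns from another file by matching on a common key
enriched = all_orders.merge(pd.read_csv(customers_csv), on="order_id", how="left")

print(all_orders)
print(enriched)
```

With `how="left"` every order is kept, and orders without a matching customer row get a missing value in the `customer` column.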
Q115. What projects done during engineering?
During my engineering, I worked on projects related to data analysis, machine learning, and software development.
Developed a predictive model for stock price forecasting using machine learning algorithms
Created a data visualization dashboard for analyzing customer behavior patterns
Implemented a sentiment analysis tool for social media data using natural language processing techniques
Q116. What is money, and what are the types of money?
Money is a medium of exchange that is widely accepted in transactions and represents value.
Money is a form of currency used to facilitate trade and commerce.
It can be in the form of physical objects like coins and banknotes, or digital representations like electronic money.
Money serves as a store of value, unit of account, and a medium of exchange.
Types of money include fiat money, commodity money, and representative money.
Fiat money is government-issued currency that is not ...read more
Q117. What is SCADA, what is transformer motor
SCADA stands for Supervisory Control and Data Acquisition, used to monitor and control industrial processes. A transformer motor is a type of electric motor used to drive transformers.
SCADA is a system used to remotely monitor and control industrial processes
It collects real-time data from sensors and equipment in the field
SCADA systems are commonly used in industries such as power plants, water treatment facilities, and manufacturing plants
A transformer motor is an electric ...read more
Q118. MySQL code using window functions
MySQL window functions (available from MySQL 8.0) perform calculations across a set of rows related to the current row, without collapsing those rows into one.
Window functions are used to perform calculations across rows in a result set
They are used with the OVER() clause
Examples include ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE()
They can be used to calculate running totals, moving averages, and more
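As an illustration of a running total with the OVER() clause, the sketch below runs the query against SQLite from Python (window functions need SQLite 3.25 or later). The same SELECT statement should also work in MySQL 8.0 and later. The `daily_sales` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_sales (day INTEGER, amount INTEGER);
    INSERT INTO daily_sales VALUES (1, 100), (2, 50), (3, 150);
""")

# SUM(...) OVER (ORDER BY day) adds up all rows up to and including the current day
running = conn.execute("""
    SELECT day,
           amount,
           SUM(amount) OVER (ORDER BY day) AS running_total
    FROM daily_sales
    ORDER BY day
""").fetchall()

print(running)   # [(1, 100, 100), (2, 50, 150), (3, 150, 300)]
```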
Q119. What are the phases of a clinical trial?
The phases of a clinical trial are crucial stages in testing the safety and effectiveness of a new treatment or intervention.
Phase 1: Small group of healthy volunteers to test safety and dosage.
Phase 2: Larger group to further evaluate safety and effectiveness.
Phase 3: Large group to confirm effectiveness, monitor side effects, and compare to existing treatments.
Phase 4: Post-marketing surveillance after treatment is approved and on the market.
Q120. Comfort with different types of content
Being comfortable with different types of content is essential for a data analyst to analyze and interpret data effectively.
Understanding and analyzing structured data such as numerical data in spreadsheets
Analyzing unstructured data like text documents, social media posts, and emails
Working with multimedia content like images and videos
Ability to interpret and analyze data from various sources and formats
Experience with different data visualization techniques to present finding...read more
Q121. Difference between truncate and delete commands
Truncate and delete are SQL commands used to remove data from a table, but they differ in their functionality.
Truncate is a DDL command that removes all rows from a table, but keeps the structure intact.
Delete is a DML command that removes specific rows from a table based on a condition.
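Truncate is usually faster than delete because it deallocates whole data pages and logs far less than row-by-row deletion.
Delete can be rolled back inside a transaction; whether truncate can be rolled back depends on the database (possible in SQL Server and PostgreSQL, not in MySQL or Oracle, where it commits implicitly).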
Truncate resets the identity seed of the table, while d...read more
Q122. What is the lifecycle of data
The lifecycle of data refers to the stages of data from its creation to its disposal.
Data creation
Data storage
Data processing and analysis
Data sharing and dissemination
Data archiving and disposal
Q123. Various machine learning algorithm and application
Machine learning algorithms are used in various applications such as image recognition, natural language processing, and predictive analytics.
Supervised learning algorithms: linear regression, logistic regression, decision trees, random forests, support vector machines
Unsupervised learning algorithms: k-means clustering, hierarchical clustering, principal component analysis
Deep learning algorithms: convolutional neural networks, recurrent neural networks
Applications: image re...read more
Q124. What are joins and its types?
Joins are used to combine rows from two or more tables based on a related column between them.
Types of joins include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
INNER JOIN returns rows when there is at least one match in both tables.
LEFT JOIN returns all rows from the left table and the matched rows from the right table.
RIGHT JOIN returns all rows from the right table and the matched rows from the left table.
FULL JOIN returns rows when there is a match in one of the tabl...read more
Q125. What is an inactive relationship in PBI
An inactive relationship in Power BI is a relationship that exists in the data model but is not used by default when filters propagate between tables.
Inactive relationships arise because only one active filter path is allowed between two tables; any additional relationship between them is marked inactive.
An inactive relationship does not affect visuals or measures unless a calculation activates it, for example with the DAX function USERELATIONSHIP.
An inactive relationship can be activated or deleted from the model view; a common use is a date table related to both an order date and a ship date column.
Inactive relationships are indicated by a dashed line in the ...read more
Q126. Why data analysis and data science?
Data analysis and data science are crucial for extracting valuable insights from large datasets to drive informed decision-making.
Data analysis and data science help in uncovering patterns and trends within data.
They enable businesses to make data-driven decisions for improved efficiency and effectiveness.
These fields also play a vital role in predictive analytics and forecasting.
Examples include using machine learning algorithms to predict customer behavior or analyzing heal...read more
Q127. What are the key features of Python programming?
Key features of Python programming include simplicity, readability, versatility, and extensive libraries.
Python is known for its simple and readable syntax, making it easy to learn and understand.
It is a versatile language that can be used for various purposes such as web development, data analysis, and artificial intelligence.
Python has a vast collection of libraries and frameworks that provide ready-to-use tools for different tasks, such as NumPy for scientific computing an...read more
Q128. Explain classes and objects
Classes are blueprints for creating objects in object-oriented programming.
Classes define the properties and behaviors of objects
Objects are instances of classes
Classes can inherit properties and behaviors from other classes
Classes can have constructors to initialize objects
Classes can have methods to perform actions
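The points above can be illustrated with a small Python sketch. The `Employee` and `Manager` classes are invented for the example.

```python
class Employee:
    """Blueprint: every Employee object has a name and a salary."""

    def __init__(self, name, salary):   # constructor initializes the object
        self.name = name
        self.salary = salary

    def describe(self):                 # method: behavior shared by all instances
        return f"{self.name} earns {self.salary}"


class Manager(Employee):
    """Inherits attributes and methods from Employee and adds a team size."""

    def __init__(self, name, salary, team_size):
        super().__init__(name, salary)  # reuse the parent constructor
        self.team_size = team_size

    def describe(self):                 # override, building on the parent method
        return f"{super().describe()} and manages {self.team_size} people"


analyst = Employee("Asha", 50000)       # objects are instances of classes
lead = Manager("Ben", 80000, 4)

print(analyst.describe())   # Asha earns 50000
print(lead.describe())      # Ben earns 80000 and manages 4 people
```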
Q129. Explain Moving Average
Moving Average is a statistical technique used to analyze data points by creating a series of averages of different subsets of the full data set.
Moving Average helps in smoothing out fluctuations in data to identify trends over time.
It is calculated by taking the average of a specific number of data points within a defined window.
For example, a 3-day moving average for stock prices would be the average of the stock prices over the past 3 days.
Moving Average is commonly used i...read more
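A simple moving average can be sketched in plain Python as below. The function name and the sample prices are illustrative; the function returns one average per full window, so the first `window - 1` points produce no output.

```python
def moving_average(values, window):
    """Return the simple moving average for each full window of `window` points."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


prices = [10, 12, 14, 16, 18]
print(moving_average(prices, 3))   # [12.0, 14.0, 16.0]
```

With pandas, the equivalent is typically `series.rolling(window).mean()`, which keeps the leading positions as missing values instead of dropping them.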
Q130. Tell me about the types of join sql.
Types of join SQL include inner join, left join, right join, and full outer join.
Inner join: Returns rows when there is a match in both tables.
Left join: Returns all rows from the left table and the matched rows from the right table.
Right join: Returns all rows from the right table and the matched rows from the left table.
Full outer join: Returns all rows from both tables, matching them where possible and filling the unmatched side with NULLs.
Q131. Write a SQL query; what is VLOOKUP, etc.
A SQL query is a command used to retrieve data from a database. VLOOKUP is an Excel function used to search for a value in a table.
SQL query retrieves data from a database using SELECT, FROM, WHERE clauses
VLOOKUP searches for a value in a table and returns a corresponding value from a specified column
Example SQL query: SELECT * FROM table_name WHERE column_name = 'value'
Example VLOOKUP formula: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Q132. What is RLS in Power BI
RLS in Power BI stands for Row-Level Security, which allows users to restrict access to certain rows of data based on their role or permissions.
RLS helps in controlling access to data at the row level
It allows users to define security roles and rules to restrict data access
Users can create filters based on roles to limit data visibility
For example, a manager can only see data related to their department using RLS
Q133. Difference between rank, dense rank, and row number
Rank gives tied rows the same rank and leaves gaps after ties, Dense Rank gives tied rows the same rank without gaps, and Row Number assigns a unique sequential number to each row.
Rank function ranks rows by the specified column values; ties share a rank and the following rank is skipped.
Dense Rank function assigns consecutive ranks to rows without any gaps.
Row Number function assigns a unique number to each row in the result set.
Example: If we have scores of 90, 85, 85, 70, then Rank would be 1, 2, 2, 4; Dense Rank would be 1, 2, 2, 3; Row...read more
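The scores 90, 85, 85, 70 from the example can be checked with SQLite window functions from Python (SQLite 3.25 or later). The `scores` table and its `student_id` column are assumptions added so that ties are numbered in a fixed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (student_id INTEGER, score INTEGER);
    INSERT INTO scores VALUES (1, 90), (2, 85), (3, 85), (4, 70);
""")

rows = conn.execute("""
    SELECT score,
           RANK()       OVER (ORDER BY score DESC)             AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)             AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY score DESC, student_id) AS row_num
    FROM scores
    ORDER BY score DESC, student_id
""").fetchall()

for row in rows:
    print(row)
# (90, 1, 1, 1)
# (85, 2, 2, 2)
# (85, 2, 2, 3)
# (70, 4, 3, 4)
```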
Q134. Write a program to find if a number is prime
Program to check if a number is prime
Iterate from 2 to square root of the number
Check if the number is divisible by any number in the range
If divisible by any number, it is not prime
If not divisible by any number, it is prime
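The steps above translate into the following Python sketch. It uses `math.isqrt` (Python 3.8+) for the integer square root and treats numbers below 2 as not prime.

```python
import math


def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:   # found a divisor, so n is not prime
            return False
    return True                # no divisor found in the range


print(is_prime(17))   # True
print(is_prime(15))   # False
```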
Q135. What is artificial intelligence
Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems.
AI involves machines performing tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation.
Examples of AI include virtual assistants like Siri and Alexa, self-driving cars, recommendation systems like Netflix's algorithm, and facial recognition technology.
AI can be categorized into nar...read more
Q136. How to merge 2 csv files
To merge two CSV files, you can use software like Microsoft Excel or programming languages like Python.
Open both CSV files in a software like Microsoft Excel.
Copy the data from one CSV file and paste it into the other CSV file.
Save the merged CSV file with a new name.
Alternatively, you can use programming languages like Python to merge CSV files by reading both files, combining the data, and writing to a new file.
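The Python route can be sketched with the standard `csv` module. This sketch assumes both files share the same header and appends the rows of the second file to the first; the function name `merge_csv` and the sample data are illustrative.

```python
import csv
import io


def merge_csv(first, second, output):
    """Append the rows of two CSV files that share the same header."""
    reader_a = csv.reader(first)
    reader_b = csv.reader(second)
    header_a = next(reader_a)
    header_b = next(reader_b)
    if header_a != header_b:
        raise ValueError("CSV headers do not match")
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow(header_a)    # write the header once
    writer.writerows(reader_a)   # then all rows from the first file
    writer.writerows(reader_b)   # then all rows from the second file


# In-memory stand-ins for real files opened with open(path, newline="")
file_a = io.StringIO("id,name\n1,Asha\n")
file_b = io.StringIO("id,name\n2,Ben\n")
merged = io.StringIO()
merge_csv(file_a, file_b, merged)
print(merged.getvalue())
```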
Q137. What are Python generators?
Python generators are functions that return an iterator and generate values on the fly.
Generators allow us to generate a sequence of values on the fly, without having to store them in memory.
They are defined using the 'yield' keyword instead of 'return'.
Generators can be used to iterate over large datasets, or to generate an infinite sequence of values.
Example: def my_generator(): yield 1; yield 2; yield 3;
Example: for i in my_generator(): print(i)
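To illustrate the infinite-sequence case mentioned above, here is a small sketch. The generator `even_numbers` is an illustrative name, and `itertools.islice` is used to take only the first few values.

```python
from itertools import islice


def even_numbers():
    """Yield 0, 2, 4, ... forever; each value is produced only when requested."""
    n = 0
    while True:
        yield n
        n += 2


first_five = list(islice(even_numbers(), 5))
print(first_five)   # [0, 2, 4, 6, 8]
```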
Q138. Function to create a transaction bin column
Create a function to generate a transaction bin column based on transaction amounts.
Create bins based on transaction amounts (e.g. $0-$100, $101-$200, etc.)
Use pandas cut() function in Python to create bins
Assign bin labels to the transactions based on the bin ranges
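A minimal sketch with `pandas.cut`, assuming pandas is installed. The bin edges, the labels, the column name `amount`, and the function name `add_transaction_bin` are all assumptions chosen to match the example ranges above.

```python
import pandas as pd


def add_transaction_bin(df, amount_col="amount"):
    """Return a copy of df with a 'transaction_bin' column derived from amount_col."""
    edges = [0, 100, 200, 300]                       # hypothetical bin boundaries
    labels = ["$0-$100", "$101-$200", "$201-$300"]   # one label per interval
    result = df.copy()
    result["transaction_bin"] = pd.cut(
        result[amount_col], bins=edges, labels=labels, include_lowest=True
    )
    return result


transactions = pd.DataFrame({"amount": [50, 100, 150, 250]})
binned = add_transaction_bin(transactions)
print(binned)
```

By default each bin includes its right edge, so an amount of exactly 100 lands in the first bin; amounts outside the edges get a missing value.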
Q139. Import vs Direct Query vs Live connection
Import vs Direct Query vs Live connection
Import: Data is copied into the tool's model and queried from memory; fast, but only as fresh as the last refresh, so it suits small to medium datasets where real-time data is not required
Direct Query: No data is stored in the model; queries are sent to the source when a report is used, which suits very large datasets or cases where up-to-date data is needed
Live connection: The report connects to an existing model, such as an Analysis Services model or a published Power BI dataset, without storing data locally; modelling stays in that source
Q140. How do you match our values?
I match your values through my commitment to integrity, teamwork, and continuous improvement.
I prioritize honesty and transparency in all data analysis processes.
I value collaboration and actively seek input from team members to enhance project outcomes.
I am dedicated to ongoing learning and skill development to stay current in the field.
I strive for excellence in all tasks and take pride in delivering high-quality work.
I am passionate about making a positive impact through d...read more
Q141. Difference between ADSL and Blob storage
ADSL is a type of internet connection technology, while Blob storage is a type of cloud storage service.
ADSL (Asymmetric Digital Subscriber Line) is a type of broadband connection that uses existing telephone lines to transmit data.
Blob storage is a cloud object storage service for unstructured data; Azure Blob Storage is the best-known example, with Amazon S3 and Google Cloud Storage as comparable services.
ADSL typically has slower upload speeds compared to download speeds, while Blob storage is designed for storing large amounts of unstructured da...read more
Q142. What is a capital market?
A capital market is a financial market where long-term securities such as stocks, bonds, and other investments are traded.
Capital market is a platform for companies to raise funds for long-term investments.
It includes both primary and secondary markets.
Investors can buy and sell securities in the capital market.
Examples of capital market include stock exchanges, bond markets, and derivatives markets.
Q143. Why data analytics?
Data analytics allows for uncovering valuable insights from data to drive informed decision-making and improve business performance.
Data analytics helps in identifying trends and patterns within data sets
It enables businesses to make data-driven decisions for better outcomes
Data analytics can optimize processes and improve efficiency
It helps in predicting future trends and behaviors based on historical data
Data analytics is essential for measuring the success of strategies an...read more
Q144. What are SQL and Power BI?
SQL is a query language used for managing and querying relational databases. Power BI is a business analytics tool for visualizing and analyzing data.
SQL stands for Structured Query Language and is used to communicate with databases.
SQL can be used to retrieve, update, and manipulate data in databases.
Power BI is a data visualization tool that allows users to create interactive reports and dashboards.
Power BI can connect to various data sources, clean and transform data, and creat...read more
Q145. What are aggregate functions?
Aggregate functions are functions in databases that perform a calculation on a set of values and return a single value.
Aggregate functions are used to perform operations on a group of rows and return a single result.
Common aggregate functions include SUM, AVG, COUNT, MIN, and MAX.
For example, SUM function calculates the total of a column, AVG calculates the average, COUNT counts the number of rows, MIN finds the minimum value, and MAX finds the maximum value.
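The five functions can be tried on a tiny SQLite table from Python. The `sales` table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('North', 100), ('North', 300), ('South', 200);
""")

# Each aggregate collapses the three rows into a single value
total, average, row_count, smallest, largest = conn.execute(
    "SELECT SUM(amount), AVG(amount), COUNT(*), MIN(amount), MAX(amount) FROM sales"
).fetchone()

print(total, average, row_count, smallest, largest)   # 600 200.0 3 100 300
```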
Q146. Difference between Group by and Having
Group by is used to group rows that have the same values into summary rows, while Having is used to filter groups based on a specified condition.
Group by is used with aggregate functions to group rows based on one or more columns.
Having is used to filter groups based on a specified condition after the group by operation.
Group by is used before the Having clause in a SQL query.
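A short sketch of the two clauses working together, run against SQLite from Python. The `sales` table and the threshold of 200 are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('North', 100), ('North', 300), ('South', 200), ('East', 50);
""")

# GROUP BY: one summary row per region
by_region = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()

# HAVING: keep only the groups whose total meets the condition
large_regions = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    HAVING SUM(amount) >= 200
    ORDER BY region
""").fetchall()

print(by_region)       # [('East', 50), ('North', 400), ('South', 200)]
print(large_regions)   # [('North', 400), ('South', 200)]
```

WHERE filters individual rows before grouping, while HAVING filters the groups after the aggregates have been computed.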
Q147. Different type of table joining in sql
Different types of table joins in SQL include inner join, left join, right join, and full outer join.
Inner join: Returns rows when there is a match in both tables.
Left join: Returns all rows from the left table and the matched rows from the right table.
Right join: Returns all rows from the right table and the matched rows from the left table.
Full outer join: Returns all rows from both tables, matching them where possible and filling the unmatched side with NULLs.
Q148. Name different international stock exchanges
Some of the major international stock exchanges include NYSE, NASDAQ, LSE, TSX, and HKEX.
NYSE - New York Stock Exchange
NASDAQ - National Association of Securities Dealers Automated Quotations
LSE - London Stock Exchange
TSX - Toronto Stock Exchange
HKEX - Hong Kong Stock Exchange
Q149. Why data science?
Data science is the key to unlocking insights and making data-driven decisions.
Data science helps to extract meaningful insights from data
It enables businesses to make data-driven decisions
It involves various techniques such as machine learning, data mining, and statistical analysis
Data science is used in various industries such as healthcare, finance, and e-commerce
Q150. Short term and long term goal
Short term goal is to enhance data analysis skills, long term goal is to become a data science expert.
Short term goal: Improve proficiency in SQL, Python, and data visualization tools
Long term goal: Obtain advanced certifications in machine learning and AI
Short term goal: Complete online courses on statistical analysis and data cleaning
Long term goal: Lead data science projects and mentor junior analysts