Data Analyst Intern
100+ Data Analyst Intern Interview Questions and Answers

Asked in Oyo Rooms

Q. Water Jug Problem Statement
You have two water jugs with capacities X and Y liters respectively, both initially empty. You also have an infinite water supply. The goal is to determine if it is possible to measu...read more
The Water Jug Problem involves determining if a target measurement can be achieved using two jugs of different capacities and specific operations.
Start by filling one jug and transferring water between the jugs to reach the target measurement.
Consider all possible combinations of filling, emptying, and transferring water between the jugs.
Keep track of the states of both jugs and the amount of water in each jug during the operations.
If the target measurement is reached, return...read more
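The state-tracking approach described above can be sketched as a breadth-first search over (jug X, jug Y) states. The problem statement is truncated, so this sketch assumes the goal is an integer target that counts as reached when it sits in either jug or in both jugs combined; the function name `can_measure` is hypothetical.

```python
from collections import deque


def can_measure(x: int, y: int, target: int) -> bool:
    """Breadth-first search over (jug_x, jug_y) states."""
    if target < 0 or target > x + y:
        return False
    start = (0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        # Assumed success condition: target in either jug or in both combined.
        if a == target or b == target or a + b == target:
            return True
        pour_ab = min(a, y - b)  # amount that fits when pouring X into Y
        pour_ba = min(b, x - a)  # amount that fits when pouring Y into X
        next_states = [
            (x, b), (a, y),              # fill one jug
            (0, b), (a, 0),              # empty one jug
            (a - pour_ab, b + pour_ab),  # pour X -> Y
            (a + pour_ba, b - pour_ba),  # pour Y -> X
        ]
        for state in next_states:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False
```

Because each jug holds a whole number of liters, there are at most (X + 1) * (Y + 1) states, so the search always terminates.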

Asked in Intuit

Q. Insertion Sort in a Linked List
Given a singly linked list with 'N' nodes containing integer values, your task is to sort the list using insertion sort and output the sorted list.
Insertion Sort is an algorithm...read more
Implement insertion sort on a singly linked list to sort the elements in-place.
Iterate through the linked list and for each node, find its correct position in the sorted part of the list
Adjust pointers to insert the node in the correct position
Repeat this process until all nodes are sorted
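A minimal sketch of the steps above, assuming a simple `Node` class with `value` and `next` fields (the class and the helper names `from_values` and `to_values` are hypothetical, added only to make the example runnable):

```python
from typing import Iterable, List, Optional


class Node:
    def __init__(self, value: int, next: "Optional[Node]" = None):
        self.value = value
        self.next = next


def insertion_sort_list(head: Optional[Node]) -> Optional[Node]:
    dummy = Node(0)  # sentinel placed before the sorted part
    current = head
    while current is not None:
        following = current.next  # remember the rest of the unsorted list
        # Walk the sorted part to the last node whose value is <= current's.
        prev = dummy
        while prev.next is not None and prev.next.value <= current.value:
            prev = prev.next
        # Splice current in right after prev.
        current.next = prev.next
        prev.next = current
        current = following
    return dummy.next


def from_values(values: Iterable[int]) -> Optional[Node]:
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head


def to_values(head: Optional[Node]) -> List[int]:
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

The sort relinks existing nodes instead of allocating new ones, and the `<=` comparison keeps equal values in their original order.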

Data Analyst Intern Interview Questions and Answers for Freshers

Asked in Cleantech Solar

In pandas, loc is label-based indexing while iloc is integer-position-based indexing. Dashboards are visual tools for data analysis.
loc selects rows and columns by label; a label slice includes both endpoints
iloc selects rows and columns by zero-based integer position; a position slice excludes the end position
Dashboards are visual representations of data for easy analysis and decision-making
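A small illustration of the loc and iloc difference, using a made-up DataFrame (the column names, index labels, and values are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Pune", "Delhi", "Mumbai"], "sales": [120, 95, 143]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "sales"]    # row label "b", column label "sales"
by_position = df.iloc[1, 1]        # second row, second column
label_slice = df.loc["a":"b"]      # label slice: includes both "a" and "b"
position_slice = df.iloc[0:1]      # position slice: excludes position 1
```

Both single-cell lookups return the same value here, while the two slices differ in length because only the label slice includes its end point.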

Asked in Cleantech Solar

Different types of keys in a database include primary key, foreign key, unique key, and composite key.
Primary key: uniquely identifies each record in a table, must be unique and not null.
Foreign key: establishes a link between two tables, ensures referential integrity.
Unique key: ensures that all values in a column are unique.
Composite key: combination of two or more columns to uniquely identify a record.

Asked in Uber

Q. You have a 5L jar and a 3L jar. How can you measure out exactly 4L of water?
Use the 5L and 3L jars to measure exactly 4L through a series of filling and pouring steps.
Fill the 5L jar completely.
Pour from the 5L jar into the 3L jar until the 3L jar is full (leaving 2L in the 5L jar).
Empty the 3L jar.
Pour the remaining 2L from the 5L jar into the 3L jar.
Fill the 5L jar again completely.
Pour from the 5L jar into the 3L jar until the 3L jar is full (which already has 2L).
This will leave exactly 4L in the 5L jar.
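The sequence above can be checked with a short simulation. The helper `pour` is a hypothetical name; it moves as much water as fits from one jar to the other.

```python
def pour(src: int, dst: int, dst_capacity: int) -> tuple:
    """Pour from src into dst until src is empty or dst is full."""
    moved = min(src, dst_capacity - dst)
    return src - moved, dst + moved


five, three = 0, 0
five = 5                              # fill the 5L jar
five, three = pour(five, three, 3)    # 5L jar: 2, 3L jar: 3
three = 0                             # empty the 3L jar
five, three = pour(five, three, 3)    # 5L jar: 0, 3L jar: 2
five = 5                              # fill the 5L jar again
five, three = pour(five, three, 3)    # 5L jar: 4, 3L jar: 3
```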

Asked in ANAROCK Property Consultants

Q. What do you mean by MTD? How do you create it?
MTD stands for Month-to-Date. It refers to the period from the beginning of the current month up to the present date.
MTD is a common term used in financial and business reporting to track performance within a specific month.
To create MTD, you would sum up the data from the beginning of the month up to the current date.
For example, if you are calculating MTD sales for January 2022 on January 15th, you would sum up all sales data from January 1st to January 15th.
MTD is often us...read more
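As a rough sketch of the calculation described above, assuming records are (date, amount) pairs; the function name `month_to_date_total` and the sample figures are made up for illustration:

```python
from datetime import date


def month_to_date_total(records, as_of: date) -> float:
    """Sum amounts dated from the 1st of as_of's month through as_of, inclusive."""
    month_start = as_of.replace(day=1)
    return sum(amount for day, amount in records if month_start <= day <= as_of)


# Made-up sales records: (date, amount)
sales = [
    (date(2021, 12, 31), 50.0),
    (date(2022, 1, 1), 100.0),
    (date(2022, 1, 15), 200.0),
    (date(2022, 1, 16), 75.0),
]
mtd_sales = month_to_date_total(sales, date(2022, 1, 15))
```

Only the January 1st and January 15th records fall inside the window; the December record and the January 16th record are excluded.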

Asked in WNS

Q. You have 3 jars each with labels: one labeled 'Apples', one labeled 'Oranges', and one labeled 'Apples and Oranges'. However, all the jars are labeled incorrectly. You can pick one fruit from each jar. How can...read more
Pick one fruit from the jar labeled 'Apples and Oranges'. Because every label is wrong, that jar holds only one kind of fruit, and the other two jars follow by elimination.
Pick a fruit from the jar labeled 'Apples and Oranges'; the fruit you draw tells you what the whole jar contains
If it is an apple, that jar is Apples. The jar labeled 'Oranges' can't be Oranges and can't be Apples, so it must be Apples and Oranges
The remaining jar, labeled 'Apples', must then be Oranges (if the first fruit is an orange, swap the roles of apples and oranges)

Asked in Rebel Foods

Q. What is your understanding of data, and how important is it in an organization?
Data is information collected and stored for analysis and decision-making purposes in an organization.
Data is raw facts and figures that need to be processed to provide meaningful information.
It is crucial for organizations to make informed decisions, identify trends, and improve performance.
Examples of data in an organization include sales figures, customer demographics, and website traffic.
Data can be structured (in databases) or unstructured (like text documents or social ...read more

Asked in ANAROCK Property Consultants

Q. Write a query to find the department-wise highest salary in an organization.
Group the employees by department and take the maximum salary in each group.
Use GROUP BY clause to group data by department
Use MAX() function to find highest salary in each department
Join the tables if necessary to get department information
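One way to try the query is with Python's built-in sqlite3 module. The table names, column names, and rows below are assumptions made up for the example; the real schema was not given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (name TEXT, department_id INTEGER, salary INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Analytics');
    INSERT INTO employees VALUES
        ('Asha', 1, 50000), ('Ravi', 1, 65000),
        ('Meera', 2, 72000), ('Karan', 2, 58000);
    """
)

# GROUP BY collapses each department to one row; MAX picks its top salary.
rows = conn.execute(
    """
    SELECT d.name, MAX(e.salary) AS highest_salary
    FROM employees AS e
    JOIN departments AS d ON d.id = e.department_id
    GROUP BY d.name
    ORDER BY d.name
    """
).fetchall()
conn.close()
```

With this sample data, `rows` holds one (department, highest salary) pair per department.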

Asked in ANAROCK Property Consultants

Q. What is Data Modelling? Types of Data Models
Data modeling is the process of creating a visual representation of data structures and relationships.
Data modeling involves defining the structure of data, its storage, and how it will be accessed and manipulated.
Types of data models include conceptual, logical, and physical models.
Conceptual models focus on high-level business concepts and relationships.
Logical models define the structure of the data without considering how it will be implemented in a database system.
Physic...read more

Asked in Dhruv Research

Q. Create a dataset containing names, phone numbers, and other details. Extract the sum of all the digits of the phone number into a different column using Python for data manipulation.
Create a dataset with names and phone numbers, then extract the sum of digits from the phone numbers.
Use pandas to create a DataFrame for the dataset.
Define a function to calculate the sum of digits in a phone number.
Apply the function to the phone number column to create a new column with the sums.
Example function: def sum_of_digits(phone): return sum(int(digit) for digit in phone if digit.isdigit())
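Put together, the steps might look like the following sketch. The names and phone numbers are made up, and `str(phone)` is added as a precaution in case the column was read as integers.

```python
import pandas as pd


def sum_of_digits(phone) -> int:
    """Add up the digits in a phone number, ignoring spaces, dashes, and '+'."""
    return sum(int(ch) for ch in str(phone) if ch.isdigit())


df = pd.DataFrame(
    {
        "name": ["Asha", "Ravi"],
        "phone": ["+91 98765-43210", "011 2345 6789"],
    }
)
# New column holding the digit sum for each phone number.
df["digit_sum"] = df["phone"].apply(sum_of_digits)
```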

Asked in Trell

Q. Write a Python code to check if the given sentence is a palindrome or not.
Python code to check if a sentence is palindrome or not
Remove all spaces and convert to lowercase
Reverse the string and compare with original
If both are same, then it is a palindrome
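A minimal sketch of these steps. As an assumption beyond the answer above, it also drops punctuation (anything that is not a letter or digit), since sentences usually contain commas and full stops.

```python
def is_palindrome(sentence: str) -> bool:
    # Keep letters and digits only, lowercased; this removes spaces and,
    # as an extra assumption, punctuation as well.
    cleaned = [ch.lower() for ch in sentence if ch.isalnum()]
    # Compare the cleaned sequence with its reverse.
    return cleaned == cleaned[::-1]
```

For example, `is_palindrome("A man a plan a canal Panama")` returns `True`, while `is_palindrome("Data analyst")` returns `False`.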

Asked in Crisil

Q. What is an RDBMS? What are the objects in a database? What is the difference between OLAP and OLTP? What is a view? What is an index? What are functions and stored procedures? What are constraints? What are foreign keys?
RDBMS is a relational database management system. Objects in a database include tables, views, indexes, functions, stored procedures, constraints, and foreign keys. OLAP is for data analysis while OLTP is for transaction processing.
RDBMS stands for Relational Database Management System
Objects in a database include tables, views, indexes, functions, stored procedures, constraints, and foreign keys
OLAP (Online Analytical Processing) is used for data analysis and reporting
OLTP (...read more

Asked in Ernst & Young

Q. How do you create separate lines in standard output using C++?
Write a newline to std::cout, using either std::endl or the '\n' character, so that each piece of output appears on its own line.
Separate lines are used to display different outputs or messages in a clear and organized manner.
Use std::endl or '\n' to move to the next line; std::endl also flushes the output buffer, so '\n' is usually cheaper when no flush is needed.
For example, cout << "Hello" << endl; will display Hello on one line and move to the next line for the next output.

Asked in Bright Money

Q. How did you perform data analysis on your project?
I start by defining the problem, collecting relevant data, cleaning and organizing the data, performing analysis using statistical methods and tools, and finally interpreting and presenting the results.
Define the problem statement and objectives of the analysis
Collect relevant data from various sources
Clean and organize the data to ensure accuracy and consistency
Perform analysis using statistical methods and tools such as Excel, Python, or R
Interpret the results and present f...read more

Asked in Wogle Tech

Q. What is the process for writing a Python program to analyze a CSV file and store the analyzed data in an Excel file?
Analyze CSV data using Python and export results to an Excel file for further insights.
Import necessary libraries: Use pandas for data manipulation and openpyxl for Excel file handling.
Read the CSV file: Use pandas' read_csv() function to load the data into a DataFrame.
Data cleaning: Handle missing values and outliers using methods like dropna() or fillna().
Data analysis: Perform operations like groupby(), aggregation, or statistical analysis to derive insights.
Export to Exce...read more
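The workflow above could be sketched as follows. The column names `region` and `amount`, the function names, and the inline CSV text are assumptions standing in for a real file; writing `.xlsx` through pandas needs the openpyxl package installed, so the export call is left commented out.

```python
import io

import pandas as pd


def summarize_csv(csv_source) -> pd.DataFrame:
    """Read a CSV, drop rows with a missing amount, and total amount per region."""
    df = pd.read_csv(csv_source)
    df = df.dropna(subset=["amount"])
    return df.groupby("region", as_index=False)["amount"].sum()


def export_summary(summary: pd.DataFrame, path: str) -> None:
    """Write the summary to an .xlsx file (pandas uses openpyxl for this)."""
    summary.to_excel(path, index=False, sheet_name="summary")


# Made-up CSV content standing in for a real file.
sample_csv = io.StringIO("region,amount\nNorth,10\nSouth,5\nNorth,\nSouth,7\n")
summary = summarize_csv(sample_csv)
# export_summary(summary, "summary.xlsx")  # uncomment to write the Excel file
```

Keeping the analysis and the export in separate functions makes the analysis easy to test without touching the file system.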

Asked in Capgemini

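Q. List all languages in SQL and explain them.
SQL commands fall into sub-languages (DDL, DML, DQL, DCL, and TCL); the standard language and its common vendor dialects are listed here with brief explanations.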
SQL (Structured Query Language) is a standard language for managing relational databases
T-SQL (Transact-SQL) is a proprietary extension of SQL used by Microsoft SQL Server
PL/SQL (Procedural Language/Structured Query Language) is Oracle Corporation's proprietary extension of SQL
MySQL is an open-source relational database management system that uses SQL
PostgreSQL is an open-source object-relational database management system that...read more

Asked in RBHU Analytics

Q. I will be adding you to the developing team. What tech stacks are you familiar with?
I am familiar with various tech stacks including Python, SQL, and JavaScript for data analysis and development tasks.
Python: Proficient in using libraries like Pandas and NumPy for data manipulation and analysis.
SQL: Experienced in writing complex queries for data extraction and reporting in databases like MySQL and PostgreSQL.
JavaScript: Familiar with frameworks like React for building interactive web applications.
Data Visualization: Skilled in using tools like Tableau and M...read more

Asked in Ernst & Young

Q. How does a doubly linked list work? What is the difference between a linked list and a doubly linked list?
A doubly linked list is a data structure where each node contains a reference to both the previous and the next node.
In a singly linked list, each node contains a reference to the next node only, while in a doubly linked list, each node contains references to both the previous and next nodes.
Doubly linked lists allow traversal in both directions, and a node can be removed in constant time given a reference to it, because its predecessor is reachable without scanning from the head.
Example: In a doubly linked list, a node might have pointe...read more
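A compact sketch of the structure, with hypothetical class names `DNode` and `DoublyLinkedList`. It shows how the `prev` reference makes removal and backward traversal straightforward.

```python
class DNode:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value) -> DNode:
        node = DNode(value)
        if self.tail is None:          # empty list
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def remove(self, node: DNode) -> None:
        """Unlink a node in O(1) using its prev/next references."""
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        node.prev = node.next = None

    def forward(self):
        node, out = self.head, []
        while node is not None:
            out.append(node.value)
            node = node.next
        return out

    def backward(self):
        node, out = self.tail, []
        while node is not None:
            out.append(node.value)
            node = node.prev
        return out
```

In a singly linked list, `remove` would first have to walk from the head to find the predecessor, and `backward` would not be possible without reversing the list or using extra storage.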

Asked in Bhavna Corp.

Q. What is a HashMap in Java?
A HashMap is a data structure that stores key-value pairs and allows fast retrieval of values based on keys.
A HashMap uses hashing to store and retrieve values based on keys
It allows one null key and any number of null values
It is not synchronized, so it is not thread-safe without external synchronization
Example: HashMap<String, Integer> map = new HashMap<>();
map.put("apple", 1); int value = map.get("apple");

Asked in Capgemini

Q. What is the difference between a super key and a foreign key?
A super key is a set of attributes that uniquely identifies a record, while a foreign key is a reference to a primary key in another table.
A super key is a combination of one or more attributes that uniquely identifies a record in a table.
A foreign key is a field in a table that refers to the primary key of another table.
A super key can include extra attributes that are not needed for uniqueness; a minimal super key is called a candidate key.
A foreign key establishes a relationship between two tables.
Example: In a database of st...read more

Asked in Tps Infrastructure

Q. What is the difference between a 4-stroke and a 2-stroke engine?
A 4-stroke engine completes one power cycle in four piston strokes (two crankshaft revolutions), while a 2-stroke engine completes it in two strokes (one revolution).
4-stroke engines are generally more fuel-efficient and produce less pollution than 2-stroke engines.
2-stroke engines are simpler and lighter than 4-stroke engines.
4-stroke engines have separate intake, compression, power, and exhaust strokes, while 2-stroke engines combine intake and compression, and power and exhaust strokes.
Examples of 4-stroke engines include those found in cars, while example...read more

Asked in Blinkit

Q. Estimate the number of paper cups used in one day in an office.
Approximately 500 paper cups may be used in a day in an average office.
Consider the number of employees in the office
Think about the average number of hot beverage drinkers
Factor in the number of meetings and events held in the office
Take into account the availability of reusable cups
Estimate based on personal experience or observation

Asked in Macquarie Group

Q. Describe your dataset. What difficulties did you face while preparing your model?
The dataset consists of customer purchase history and demographic information. Difficulties faced include data cleaning and missing values.
Dataset includes customer ID, purchase amount, purchase date, age, gender, and location.
Difficulties faced include handling missing values in the age and location columns.
Data cleaning involved removing duplicates and outliers to ensure accurate analysis.
Normalization and standardization were applied to prepare the data for modeling.

Asked in Technocolabs

Q. Explain how to define an outlier using a boxplot analysis.
Outliers in a boxplot are defined as data points that fall below Q1 - 1.5*IQR or above Q3 + 1.5*IQR.
Calculate the interquartile range (IQR) by subtracting Q1 from Q3.
Identify the lower bound as Q1 - 1.5*IQR and the upper bound as Q3 + 1.5*IQR.
Any data points below the lower bound or above the upper bound are considered outliers.
For example, if Q1 = 10 and Q3 = 20, then IQR = 20 - 10 = 10, so the lower bound = 10 - 1.5*10 = -5 and the upper bound = 20 + 1.5*10 = 35.
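The rule can be sketched with Python's standard library. Quartile conventions differ between tools, so the choice of `method="inclusive"` here is an assumption, and the function name `iqr_outliers` and the sample data are made up.

```python
import statistics


def iqr_outliers(values):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 * IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


lower, upper, outliers = iqr_outliers([10, 12, 14, 15, 16, 18, 20, 100])
```

For this sample, only the value 100 falls outside the bounds and is flagged as an outlier.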

Asked in ANAROCK Property Consultants

Q. What are joins? Types of joins.
Joins are used to combine rows from two or more tables based on a related column between them.
Types of joins include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
INNER JOIN returns rows when there is at least one match in both tables.
LEFT JOIN returns all rows from the left table and the matched rows from the right table.
RIGHT JOIN returns all rows from the right table and the matched rows from the left table.
FULL JOIN returns rows when there is a match in one of the tabl...read more
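To see the difference between INNER JOIN and LEFT JOIN in practice, here is a small sketch using Python's built-in sqlite3 module. The tables and rows are made up for the example. RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
    INSERT INTO orders VALUES (10, 1, 250), (11, 1, 100), (12, 2, 75);
    """
)

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute(
    """
    SELECT c.name, o.total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
    """
).fetchall()

# LEFT JOIN: every customer, with NULL (None) where no order matches.
left_rows = conn.execute(
    """
    SELECT c.name, o.total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
    """
).fetchall()
conn.close()
```

The customer with no orders is missing from `inner_rows` but appears in `left_rows` with `None` as the order total.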

Asked in Sales Point

Q. What is Power BI, why do we use Power BI, and what is Power Query in Power BI?
Power BI is a business analytics tool used to visualize and analyze data. Power Query is a data transformation and shaping tool in Power BI.
Power BI is a powerful business intelligence tool developed by Microsoft.
It allows users to connect to various data sources, transform and clean the data, and create interactive visualizations and reports.
Power BI enables data analysts to gain insights and make data-driven decisions.
Power Query is a data transformation and shaping tool wi...read more

Asked in Labmentix

Q. How do you handle missing or inconsistent data during data cleaning?
I handle missing data by identifying, analyzing, and applying appropriate techniques to ensure data integrity and usability.
Identify missing values using methods like .isnull() in pandas.
Analyze the pattern of missing data to determine if it's random or systematic.
Use imputation techniques, such as filling missing values with the mean or median.
Consider dropping rows or columns with excessive missing data if they don't significantly impact analysis.
For inconsistent data, stan...read more
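A short pandas sketch of these steps on a made-up DataFrame. The column names and values are assumptions, and the text clean-up for inconsistent entries (trimming whitespace and unifying case) is one possible approach, not necessarily the one the truncated answer describes.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [25.0, None, 31.0, 40.0],
        "city": [" pune", "Delhi", "DELHI ", None],
    }
)

missing_counts = df.isnull().sum()                  # missing values per column
df["age"] = df["age"].fillna(df["age"].median())    # impute with the median
df["city"] = df["city"].str.strip().str.title()     # normalize spacing and case
df = df.dropna(subset=["city"])                     # drop rows still missing a city
```

Whether to impute or drop depends on how much data is missing and why, so the choices here are illustrative only.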

Asked in Capgemini

Q. Explain memory management and HashMap in Java.
Memory management and hash map are important concepts in Java programming.
Memory management is the process of allocating and deallocating memory in a program.
Java uses automatic memory management through garbage collection.
Hash map is a data structure that stores key-value pairs and uses hashing to retrieve values efficiently.
Java's HashMap class implements the Map interface and provides constant-time performance on average for get and put, assuming the hash function spreads keys evenly across buckets.
It is important to properly manage memory ...read more

Asked in Uber

Q. What are SQL joins and window functions?
SQL joins are used to combine rows from two or more tables based on a related column between them. Window functions perform calculations across a set of table rows that are related to the current row.
SQL joins are used to retrieve data from multiple tables based on a related column between them (e.g. INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN).
Window functions are used to perform calculations on a set of rows related to the current row (e.g. ROW_NUMBER(), RANK(), LAG(), LEA...read more
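A brief sketch of two window functions, run through Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer, which is when window functions were added; the table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (rep TEXT, month INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('Asha', 1, 100), ('Asha', 2, 150),
        ('Ravi', 1, 120), ('Ravi', 2, 90);
    """
)

# RANK orders reps within each month; LAG fetches each rep's previous month.
rows = conn.execute(
    """
    SELECT rep, month, amount,
           RANK() OVER (PARTITION BY month ORDER BY amount DESC) AS rank_in_month,
           LAG(amount) OVER (PARTITION BY rep ORDER BY month) AS previous_amount
    FROM sales
    ORDER BY rep, month
    """
).fetchall()
conn.close()
```

Unlike GROUP BY, the window functions keep every input row and add the computed columns alongside it.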