Python is a versatile programming language favored for data analysis due to its libraries and ease of use.
Pandas: A powerful library for data manipulation and analysis, allowing for easy handling of structured data. Example: Using DataFrames for data cleaning.
NumPy: Essential for numerical computations, providing support for arrays and matrices. Example: Performing mathematical operations on large datasets efficiently.
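A minimal sketch of both libraries in use for a cleaning-plus-computation task; the column names and values are made up for illustration.

```python
import pandas as pd
import numpy as np

# Pandas: build a small DataFrame and do basic cleaning
df = pd.DataFrame({
    "customer": ["Alice", "Bob", "Bob", "Cara"],
    "amount": [120.0, None, 80.0, 95.5],
})
df = df.drop_duplicates(subset="customer")                # remove duplicate customers
df["amount"] = df["amount"].fillna(df["amount"].mean())   # impute missing values

# NumPy: vectorised math over a large array, no Python loops
arr = np.arange(1_000_000, dtype=np.float64)
scaled = arr * 0.5 + 10
print(df, scaled.mean())
```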
Data analysis faces challenges like data quality, integration, interpretation, and scalability, impacting insights and decision-making.
Data Quality: Inaccurate or incomplete data can lead to misleading conclusions. For example, missing values in a dataset can skew results.
Data Integration: Combining data from multiple sources can be difficult due to differing formats or structures. For instance, merging sales data...
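A hedged illustration of the integration challenge: two hypothetical sources whose date and region formats differ must be standardised before a merge lines up correctly.

```python
import pandas as pd

# Two sources with differing formats: dates as strings vs datetimes,
# region codes in different cases
sales = pd.DataFrame({"date": ["2024-01-05", "2024-01-06"],
                      "region": ["north", "SOUTH"], "revenue": [100, 250]})
targets = pd.DataFrame({"date": pd.to_datetime(["2024-01-05", "2024-01-06"]),
                        "region": ["North", "South"], "target": [120, 200]})

# Standardise both sides before merging, otherwise the keys do not line up
# and the join produces missing matches
sales["date"] = pd.to_datetime(sales["date"])
sales["region"] = sales["region"].str.title()
targets["region"] = targets["region"].str.title()

merged = sales.merge(targets, on=["date", "region"], how="left")
print(merged)
```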
Data analysis is the process of inspecting, cleansing, transforming, and modeling data to discover useful information and support decision-making.
Data Collection: Gathering data from various sources, such as surveys, databases, or web scraping. For example, collecting customer feedback from online forms.
Data Cleaning: Removing inaccuracies and inconsistencies in the data to ensure quality. For instance, correcting...
The data analysis process involves collecting, cleaning, analyzing, and interpreting data to derive insights and inform decisions.
Data Collection: Gathering relevant data from various sources, such as surveys, databases, or APIs. For example, collecting sales data from a CRM system.
Data Cleaning: Removing inaccuracies and inconsistencies in the data to ensure quality. This may involve handling missing values or co...
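A small, illustrative pandas pipeline covering collection (here an in-memory stand-in for a CRM export), cleaning, and a summary step; all column names are assumptions.

```python
import pandas as pd

# Collection: read raw data (an in-memory stand-in for a CRM export)
raw = pd.DataFrame({"order_id": [1, 2, 2, 3],
                    "amount": [250.0, None, None, 410.0],
                    "segment": ["SMB", "Enterprise", "Enterprise", None]})

# Cleaning: drop duplicate orders, drop or impute missing values
clean = raw.drop_duplicates(subset="order_id").dropna(subset=["segment"]).copy()
clean["amount"] = clean["amount"].fillna(clean["amount"].median())

# Analysis and interpretation: summarise by segment to inform a decision
summary = clean.groupby("segment")["amount"].agg(["count", "mean"])
print(summary)
```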
An outlier is a data point that differs significantly from other observations, potentially indicating variability or error.
Definition: Outliers are extreme values that lie outside the overall pattern of distribution in a dataset.
Causes: They can result from measurement errors, data entry mistakes, or genuine variability in the data.
Impact: Outliers can skew statistical analyses, affecting the mean, standard deviation, and other summary statistics.
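One common way to flag outliers is Tukey's IQR rule; the sketch below uses made-up values and also shows how a single extreme point shifts the mean.

```python
import numpy as np

values = np.array([10, 12, 11, 13, 12, 11, 95])    # 95 is an obvious outlier

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr       # Tukey's IQR fences

outliers = values[(values < lower) | (values > upper)]
print("outliers:", outliers)
print("mean with outlier:", values.mean())
print("mean without outlier:", values[(values >= lower) & (values <= upper)].mean())
```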
Data analysts require a mix of technical and analytical skills to interpret data and provide actionable insights.
Statistical Analysis: Proficiency in statistical methods to analyze data trends and patterns, such as regression analysis or hypothesis testing.
Data Visualization: Skills in tools like Tableau or Power BI to create visual representations of data, making it easier to communicate findings.
Programming Skills: Knowledge of languages such as Python or SQL to clean, query, and analyse data.
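As one example of the statistical-analysis skill, a simple linear regression fitted with NumPy on invented ad-spend and sales figures:

```python
import numpy as np

# Simple linear regression: does ad spend explain sales? (made-up data)
ad_spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sales = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

slope, intercept = np.polyfit(ad_spend, sales, deg=1)   # least-squares fit
r = np.corrcoef(ad_spend, sales)[0, 1]                  # correlation coefficient

print(f"sales ~ {slope:.2f} * spend + {intercept:.2f}, r = {r:.3f}")
```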
Normal distribution is a probability distribution that is symmetric about the mean, depicting data that clusters around the mean.
Bell-Shaped Curve: The graph of a normal distribution is bell-shaped, indicating that most observations cluster around the central peak.
Mean, Median, Mode: In a normal distribution, the mean, median, and mode are all equal and located at the center of the distribution.
68-95-99.7 Rule: Approximately 68% of observations fall within one standard deviation of the mean, 95% within two, and 99.7% within three.
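A quick empirical check of the 68-95-99.7 rule using simulated normal data (NumPy's random generator, arbitrary mean and standard deviation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=10, size=100_000)   # mean 50, std 10

for k in (1, 2, 3):
    share = np.mean(np.abs(x - 50) <= k * 10)    # fraction within k std devs
    print(f"within {k} std: {share:.3f}")
# Prints roughly 0.683, 0.954, 0.997
```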
A hash table is a data structure that stores key-value pairs for efficient data retrieval using a hash function.
Key-Value Storage: Hash tables store data in pairs, where each key is unique and maps to a specific value, e.g., {'name': 'Alice', 'age': 30}.
Hash Function: A hash function converts keys into hash codes, which determine the index in the array where the value is stored.
Collision Resolution: When two keys hash to the same index, techniques such as chaining or open addressing are used to store both entries.
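For illustration, a minimal hash table with separate chaining; Python's built-in dict already does this job, so this class exists only to show the mechanics.

```python
class HashTable:
    """Minimal hash table with separate chaining for collisions."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)        # hash function -> bucket index

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                            # overwrite an existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))                 # chain on collision

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

table = HashTable()
table.put("name", "Alice")
table.put("age", 30)
print(table.get("name"), table.get("age"))
```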
Data visualization transforms complex data into visual formats, making it easier to understand patterns, trends, and insights.
Graphical Representation: Data visualization uses charts, graphs, and maps to represent data visually, such as bar charts for sales data.
Pattern Recognition: Visual formats help identify trends and patterns quickly, like spotting seasonal sales spikes in line graphs.
Interactive Dashboards: Tools like Tableau and Power BI let users filter and drill into visualizations to explore the data themselves.
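A small matplotlib sketch, with invented monthly sales, showing a bar chart and a line chart side by side:

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 160, 150, 210, 205]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.bar(months, sales)                      # bar chart: compare months
ax1.set_title("Monthly sales")
ax2.plot(months, sales, marker="o")         # line chart: spot the trend or spike
ax2.set_title("Sales trend")
plt.tight_layout()
plt.show()
```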
LOD (Level of Detail) expressions are used in data analysis tools such as Tableau to manage and manipulate data efficiently.
LOD stands for Level of Detail, allowing calculations to be performed at a granularity different from the view.
Example: Using Lod to calculate average sales per region while displaying data at the city level.
Common Lod commands include FIXED, INCLUDE, and EXCLUDE.
FIXED: Calculates values using specified dimensions regardless of the view.
INCLUDE: Adds dimensions to the view's level of detail for the calculation, while EXCLUDE removes them (see the sketch below).
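Tableau LOD expressions have no direct Python equivalent, but a groupby-transform in pandas gives a rough analogue of a FIXED calculation; the data below is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "city":   ["Boston", "NYC", "LA", "Seattle"],
    "sales":  [100, 150, 200, 120],
})

# Rough pandas analogue of a FIXED LOD: average sales per region,
# repeated on every city-level row regardless of the displayed detail
df["avg_sales_region"] = df.groupby("region")["sales"].transform("mean")
print(df)
```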
I appeared for an interview in Apr 2025, where I was asked the following questions.
Data analysis tools help in collecting, processing, and visualizing data to derive insights and support decision-making.
Excel: Widely used for data manipulation and basic analysis, offering functions, pivot tables, and charting capabilities.
Python: A programming language with libraries like Pandas and NumPy for data manipulation, and Matplotlib and Seaborn for visualization.
R: A statistical programming language ideal for statistical analysis, modeling, and visualization.
I applied via Walk-in and was interviewed in Nov 2024. There were 2 interview rounds.
Online aptitude test covering maths, statistics, and data analysis.
Methods and best practices for cleaning up data include removing duplicates, handling missing values, standardizing formats, and validating data accuracy.
Remove duplicates by identifying and deleting identical records.
Handle missing values by imputing with mean, median, or mode.
Standardize formats by converting data into a consistent structure.
Validate data accuracy by cross-referencing with external sources or using data validation rules (see the sketch below).
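A short pandas sketch applying these steps to an invented table (the email and amount columns are assumptions):

```python
import pandas as pd

df = pd.DataFrame({
    "email":  ["a@x.com", "A@X.COM", None, "b@y.com"],
    "amount": ["1,200", "1,200", "950", "abc"],
})

df["email"] = df["email"].str.lower()                 # standardise format
df = df.drop_duplicates(subset="email")               # remove duplicates after normalising
df["email"] = df["email"].fillna("unknown")           # handle missing values
df["amount"] = pd.to_numeric(df["amount"].str.replace(",", ""), errors="coerce")
df["amount"] = df["amount"].fillna(df["amount"].median())   # impute with the median
valid = df["email"].str.contains("@", na=False)       # simple validation rule
print(df[valid])
```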
Best practices for optimizing SQL code
Use indexes on columns frequently used in WHERE clauses
Avoid using SELECT * and only retrieve necessary columns
Optimize joins by using INNER JOIN instead of OUTER JOIN when possible
Avoid using subqueries and instead use JOINs or CTEs
Regularly analyze query performance using EXPLAIN or query execution plans
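To make the indexing advice concrete, a small SQLite-based sketch (chosen only because it ships with Python) comparing the query plan before and after adding an index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, amount REAL)")
conn.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                 [(i % 100, i * 1.5) for i in range(10_000)])

# Without an index, the WHERE clause forces a full table scan
query = "EXPLAIN QUERY PLAN SELECT amount FROM orders WHERE customer_id = 42"
print(conn.execute(query).fetchall())   # plan shows a SCAN of orders

# Add an index on the filtered column and check the plan again
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
print(conn.execute(query).fetchall())   # plan now searches via the index
```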
I appeared for an interview in Feb 2025, where I was asked the following questions.
SQL statement execution involves parsing, optimizing, and executing queries to retrieve or manipulate data in a database.
1. Parsing: SQL statements are parsed to check for syntax errors. Example: 'SELECT * FROM table' is valid, while 'SELEC * FROM table' is not.
2. Optimization: The SQL engine optimizes the query for performance. Example: Choosing the best index to speed up 'SELECT' queries.
3. Execution: The optimized query plan is executed against the data and the results are returned. Example: the rows matching a WHERE clause are fetched and sent back to the client.
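The three stages can be observed with SQLite from Python (used here purely for illustration): a syntax error fails at parsing, EXPLAIN QUERY PLAN shows the chosen plan, and a normal execute runs the plan and returns rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'Alice')")

# 1. Parsing: a syntax error is rejected before anything runs
try:
    conn.execute("SELEC * FROM t")
except sqlite3.OperationalError as e:
    print("parse error:", e)

# 2. Optimization: inspect the plan the engine chose
print(conn.execute("EXPLAIN QUERY PLAN SELECT * FROM t WHERE id = 1").fetchall())

# 3. Execution: the optimized plan runs and rows are returned
print(conn.execute("SELECT * FROM t WHERE id = 1").fetchall())
```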
I applied via Referral and was interviewed in Jun 2024. There were 3 interview rounds.
Basic level aptitude questions that are easy to answer.
A general group discussion.
I applied via Referral and was interviewed in Jul 2024. There were 3 interview rounds.
Basic aptitude and basic probability questions.
I applied via Walk-in and was interviewed in Feb 2024. There were 3 interview rounds.
It's about 50 questions on general aptitude.
The interviewer will focus on your resume to explore your skills, experiences, and projects in detail.
Highlight key projects: Discuss a data analysis project where you improved decision-making for a business.
Technical skills: Mention tools like SQL, Python, or Tableau that you used in your previous roles.
Problem-solving examples: Share a situation where your analysis led to significant cost savings or efficiency improvements.
I applied via Naukri.com
Key points
The correct order of statistical measures, from the simplest to the most complex, is as follows:
1. Mode
2. Median
3. Mean
4. Standard deviation
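Python's standard statistics module computes all four measures; the data below is arbitrary.

```python
import statistics

data = [4, 8, 8, 5, 3, 8, 6]

print("mode:", statistics.mode(data))        # most frequent value
print("median:", statistics.median(data))    # middle value when sorted
print("mean:", statistics.mean(data))        # arithmetic average
print("stdev:", statistics.stdev(data))      # sample spread around the mean
```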
The duration of the Infosys Data Analyst interview process can vary, but it typically takes less than 2 weeks to complete (based on 28 interview experiences).