Walmart
10+ Interview Questions and Answers
Q1. What are the different approaches you use for data cleaning?
Different approaches for data cleaning include removing duplicates, handling missing values, correcting inconsistent data, and standardizing formats.
Remove duplicates
Handle missing values
Correct inconsistent data
Standardize formats
Use statistical methods to identify outliers
Check for data accuracy and completeness
Normalize data
Transform data types
Apply data validation rules
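A few of the steps above can be sketched in plain Python. This is a minimal illustration, not a full pipeline: the records, the field names (`id`, `city`, `age`), and the choice of median imputation are all assumptions made for the example.

```python
from statistics import median

# Hypothetical raw records: one exact duplicate, one missing age,
# and inconsistent city formatting.
raw_rows = [
    {"id": 1, "city": " new york ", "age": "34"},
    {"id": 1, "city": " new york ", "age": "34"},
    {"id": 2, "city": "CHICAGO", "age": None},
    {"id": 3, "city": "Chicago", "age": "41"},
]


def clean(rows):
    # 1. Remove exact duplicates while preserving the original order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Transform data types: age arrives as text, so convert it to int.
    for row in unique:
        row["age"] = int(row["age"]) if row["age"] is not None else None

    # 3. Handle missing values: fill gaps with the median of the known ages
    #    (one possible strategy; dropping the row is another).
    fill = median(r["age"] for r in unique if r["age"] is not None)
    for row in unique:
        if row["age"] is None:
            row["age"] = fill

    # 4. Standardize formats: trim whitespace and use consistent casing.
    for row in unique:
        row["city"] = row["city"].strip().title()
    return unique


cleaned = clean(raw_rows)
for row in cleaned:
    print(row)
```

In practice a library such as pandas is often used for the same steps; the standard-library version is shown only to keep the logic visible.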
Q2. How do you get data in your organization?
Data is collected from various sources including databases, APIs, and user input.
We have access to multiple databases where we can extract relevant data
We use APIs to gather data from external sources such as social media platforms
Users can input data through forms or surveys
We also collect data through web scraping techniques
Q3. Sequence of execution of SQL clauses (SELECT, WHERE, FROM, HAVING, ORDER BY, etc.)
A query is written in the order Select-From-Where-Group By-Having-Order By, but it is logically executed in the order From-Where-Group By-Having-Select-Order By.
From: specify the table(s) to retrieve data from
Where: filter individual rows based on conditions
Group By: group the remaining rows by one or more columns
Having: filter the groups based on conditions
Select: choose the columns and expressions to return
Order By: sort the result by one or more columns
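The logical order can be traced on a small query. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `sales` table; the numbered SQL comments mark the order in which each clause is logically evaluated, which differs from the order in which the clauses are written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 50), ("East", 200), ("West", 300), ("West", 400), ("North", 20)],
)

query = """
SELECT region, SUM(amount) AS total   -- 5. compute the output columns
FROM sales                            -- 1. read the source rows
WHERE amount >= 50                    -- 2. drop individual rows (North's 20 goes)
GROUP BY region                       -- 3. form one group per region
HAVING SUM(amount) > 300              -- 4. drop whole groups (East totals 250)
ORDER BY total DESC                   -- 6. sort what is left
"""
result = conn.execute(query).fetchall()
print(result)  # [('West', 700)]
```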
Q4. Write code to describe a database and the columns of a particular table
The exact command depends on the database system.
Use a SELECT statement against the system catalog to retrieve column names and data types
Use the DESC (DESCRIBE) command in MySQL to get the table structure
Query INFORMATION_SCHEMA.COLUMNS to get detailed information about columns
Use SHOW CREATE TABLE in MySQL to get the table creation statement
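The commands above are specific to each database system. As a runnable illustration, the sketch below uses SQLite through Python's `sqlite3` module, where `PRAGMA table_info` and the `sqlite_master` catalog play roughly the roles of DESC and SHOW CREATE TABLE. The `employees` table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# List the tables in the database from SQLite's catalog table.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employees)")
]

# The stored CREATE statement, similar in spirit to SHOW CREATE TABLE.
create_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'employees'"
).fetchone()[0]

print(tables)
print(columns)
print(create_sql)
```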
Q5. Define the Excel functions SUM, SUMIF, COUNT, COUNTA, and COUNTBLANK
Excel functions are pre-built formulas that perform calculations or manipulate data in a spreadsheet.
SUM: adds up a range of numbers
SUMIF: adds up the cells in a range that meet a specified condition
COUNT: counts the number of cells in a range that contain numbers
COUNTA: counts the number of cells in a range that are not empty
COUNTBLANK: counts the number of empty cells in a range
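The difference between the three counting functions is easiest to see on one range. The sketch below imitates the behaviour in Python rather than Excel; the `cells` list is a hypothetical column in which `None` stands for an empty cell and strings stand for text cells.

```python
# Hypothetical column: numbers, text, and empty cells (None).
cells = [10, 25, None, "n/a", 5.5, None, "x"]


def is_number(value):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


total = sum(v for v in cells if is_number(v))                   # like SUM(range)
total_over_8 = sum(v for v in cells if is_number(v) and v > 8)  # like SUMIF(range, ">8")
count = sum(1 for v in cells if is_number(v))                   # like COUNT(range)
count_a = sum(1 for v in cells if v is not None)                # like COUNTA(range)
count_blank = sum(1 for v in cells if v is None)                # like COUNTBLANK(range)

print(total, total_over_8, count, count_a, count_blank)  # 40.5 35 3 5 2
```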
Q6. Difference between CSV file and Excel file
CSV files are plain text files that store a single table of data, while Excel workbooks (.xlsx, .xls) are structured files that can contain multiple sheets, formulas, and complex formatting.
CSV files are simpler and more lightweight than Excel files.
CSV files can be easily opened and edited using a text editor, while Excel files require spreadsheet software such as Microsoft Excel.
CSV files do not support formulas, macros, or formatting options like colors and fonts, while Excel files do.
CSV files have a smaller file...
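Because a CSV file is only text, it carries no data types: every value is read back as a string and must be converted by the reader. A minimal sketch with Python's `csv` module, using made-up product data held in memory instead of a file:

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from open("file.csv").
csv_text = "product,price,in_stock\nWidget,9.99,TRUE\nGadget,12.50,FALSE\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every field is a plain string, including numbers and booleans.
print(rows[0])

# Types have to be restored explicitly by whoever reads the file.
prices = [float(row["price"]) for row in rows]
print(prices)
```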
Q7. How much data have you handled so far?
I have handled large amounts of data in my previous roles.
I have experience handling terabytes of data in my previous role as a data analyst at XYZ company.
I have worked with data from various sources such as databases, spreadsheets, and APIs.
I have also used tools like SQL, Python, and Excel to manipulate and analyze data.
I am comfortable working with both structured and unstructured data.
I have experience cleaning and transforming data to make it usable for analysis.
Q8. Difference between having and where clause
WHERE filters individual rows before grouping. HAVING is used with GROUP BY to filter the groups after grouping.
HAVING is normally used together with a GROUP BY clause, while WHERE can be used in SELECT, UPDATE, and DELETE statements.
HAVING is used to filter on the results of aggregate functions, while WHERE is used to filter individual rows and cannot reference aggregates.
HAVING is typically used with aggregate functions while WHERE can be us...
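The same threshold gives different answers depending on which clause applies it. The sketch below runs both versions against a hypothetical `orders` table using Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", 30), ("Ann", 90), ("Bob", 120), ("Cara", 40), ("Cara", 45)],
)

# WHERE: discard rows under 50 first, then total what remains per customer.
where_result = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "WHERE amount >= 50 GROUP BY customer ORDER BY customer"
).fetchall()

# HAVING: total every row per customer, then discard groups totalling under 50.
having_result = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer HAVING SUM(amount) >= 50 ORDER BY customer"
).fetchall()

print(where_result)   # [('Ann', 90), ('Bob', 120)]
print(having_result)  # [('Ann', 120), ('Bob', 120), ('Cara', 85)]
```

Cara disappears from the first result because none of her individual orders reaches 50, but stays in the second because her total does.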
Q9. What is the use of a stored procedure?
Stored procedures are named, typically precompiled sets of SQL statements stored in the database that can be reused and executed multiple times.
Stored procedures can improve performance by reducing network traffic, and can improve security by limiting direct access to the underlying tables.
They can be used to encapsulate business logic and provide a consistent interface to the database.
Stored procedures can also be used to simplify complex queries and transactions.
Examples include procedures for inserting, updating, and deleting data, as well as generating reports and performing ...
Q10. What is trigger in SQL?
A trigger in SQL is a set of instructions that automatically executes in response to a specific event or action.
Triggers can be used to enforce business rules, audit changes, or replicate data.
They can be defined to execute before or after an INSERT, UPDATE, or DELETE statement.
Triggers can also be nested, meaning one trigger can execute another trigger.
Examples of triggers include sending an email notification when a new record is inserted, or updating a summary table when a...
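An audit trigger is one of the simpler cases to demonstrate. The sketch below uses SQLite syntax through Python's `sqlite3` module; the `products` and `price_audit` tables and the trigger name are hypothetical, and trigger syntax differs somewhat between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE price_audit (product_id INTEGER, old_price REAL, new_price REAL);

    -- Fires automatically after any UPDATE that touches products.price.
    CREATE TRIGGER log_price_change
    AFTER UPDATE OF price ON products
    BEGIN
        INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
    END;
    """
)

conn.execute("INSERT INTO products VALUES (1, 10.0)")
# No audit code is called here; the trigger records the change on its own.
conn.execute("UPDATE products SET price = 12.5 WHERE id = 1")

audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
print(audit_rows)  # [(1, 10.0, 12.5)]
```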
Q11. Write code to find duplicate Records.
Code to find duplicate records
Identify the key columns to check for duplicates
Use GROUP BY on the key columns with HAVING COUNT(*) > 1 to list the values that occur more than once
Consider using window functions like ROW_NUMBER() partitioned by the key columns to identify and remove the extra copies
Use programming languages like SQL, Python, or R to write the code
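A minimal sketch of the GROUP BY and HAVING approach, run against SQLite from Python. The `customers` table and its rows are made up, and `name` plus `email` are assumed to be the key columns. For the removal step it uses SQLite's built-in `rowid` with MIN() in place of ROW_NUMBER(), which is a SQLite-specific shortcut, not a portable technique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Ann", "ann@example.com"),
        ("Bob", "bob@example.com"),
        ("Ann", "ann@example.com"),
        ("Ann", "ann@example.com"),
        ("Cara", "cara@example.com"),
    ],
)

# Find key combinations that appear more than once.
duplicates = conn.execute(
    """
    SELECT name, email, COUNT(*) AS copies
    FROM customers
    GROUP BY name, email
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)  # [('Ann', 'ann@example.com', 3)]

# Keep the first stored row of each group and delete the other copies.
conn.execute(
    """
    DELETE FROM customers
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM customers GROUP BY name, email)
    """
)
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(remaining)  # 3
```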
Q12. Limitations of Vlookup
VLOOKUP limitations include searching only the leftmost column, no case-sensitive matching, and inability to handle multiple matches.
VLOOKUP only searches for values in the leftmost column of the table array, so it cannot return values from columns to the left of the lookup column
It is not case sensitive, and an exact match does not tolerate spelling errors
It only returns the first match and cannot handle multiple matches
It can be slow and inefficient for large datasets
Q13. Limitations of Structured Query Language
Limitations of SQL and SQL databases include difficulty scaling out, exposure to injection attacks when queries are built unsafely, and a poor fit for unstructured data.
SQL is not well suited to handling unstructured data like images, videos, and audio files.
It can be difficult to scale SQL databases across many servers to handle very large amounts of data.
Applications that build SQL statements from unchecked user input are vulnerable to security threats like SQL injection attacks.
SQL is not always the best choice for real-time data processing or complex analytics.
SQL can be limited in its ability to ...
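The SQL injection risk comes from how the query text is assembled, and parameterized queries are the usual mitigation. A minimal sketch with Python's `sqlite3` module; the `users` table and the input string are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [("alice", "admin"), ("bob", "viewer")]
)

# Input crafted so that the quote characters change the meaning of the query.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so the OR clause becomes
# part of the statement and every row matches.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the ? placeholder sends the input as a value, never as SQL text,
# so it is compared literally and matches nothing.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
```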
Q14. VLOOKUP vs INDEX-MATCH
VLOOKUP and INDEX-MATCH are both used in Excel for lookup and retrieval of data.
VLOOKUP is simpler to write but less flexible: it can only look to the right of the lookup column and depends on a fixed column number.
INDEX-MATCH is more versatile and can look up in any direction, but requires more effort to set up.
VLOOKUP is well suited to simple left-to-right lookups, while INDEX-MATCH is better for more complex lookups; the speed difference between the two is usually small.
INDEX-MATCH is also more error-res...
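The structural difference between the two can be imitated outside Excel. The Python sketch below is only an analogy with made-up data: `vlookup` and `index_match` are hypothetical helper functions that mimic exact-match behaviour, not Excel APIs.

```python
# Hypothetical table with columns: id, name, dept
table = [
    [101, "Ann", "Sales"],
    [102, "Bob", "Finance"],
    [103, "Cara", "Sales"],
]


def vlookup(value, rows, col_index):
    """Exact-match VLOOKUP: search the first column only and return the
    cell at col_index (1-based) from the first matching row."""
    for row in rows:
        if row[0] == value:
            return row[col_index - 1]
    return None  # Excel would show #N/A


def index_match(value, lookup_col, return_col):
    """INDEX(return_col, MATCH(value, lookup_col, 0)): the lookup column and
    the return column are independent, so either can be on the left."""
    for position, cell in enumerate(lookup_col):
        if cell == value:
            return return_col[position]
    return None


ids = [row[0] for row in table]
names = [row[1] for row in table]
depts = [row[2] for row in table]

print(vlookup(102, table, 3))            # Finance
print(index_match("Cara", names, ids))   # 103, a lookup to the left
print(index_match("Sales", depts, names))  # Ann, only the first match
```

Both helpers stop at the first hit, which mirrors the first-match limitation the two Excel approaches share.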
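Q15. Running totals within a year using DAX
A running total within a year (year-to-date) can be built with CALCULATE and FILTER, or with the time-intelligence function TOTALYTD.
Use the DAX function CALCULATE to evaluate the sum under a modified date filter
Use the FILTER function to keep only dates in the current year up to the latest date in context
Example: CALCULATE(SUM(Sales[Amount]), FILTER(ALL('Date'[Date]), 'Date'[Date] <= MAX('Date'[Date]) && YEAR('Date'[Date]) = YEAR(MAX('Date'[Date]))))
Alternative: TOTALYTD(SUM(Sales[Amount]), 'Date'[Date])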
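For readers who want to check what a running total within a year should produce, the same calculation can be sketched in Python. This is not DAX and does not model filter context; the sales figures and the helper name `running_total_by_year` are made up for illustration.

```python
from datetime import date
from itertools import groupby

# Hypothetical (date, amount) sales rows spanning two years.
sales = [
    (date(2023, 11, 5), 100),
    (date(2023, 12, 20), 50),
    (date(2024, 1, 10), 30),
    (date(2024, 2, 15), 70),
]


def running_total_by_year(rows):
    """Return (date, cumulative amount) pairs, restarting the total each year."""
    result = []
    ordered = sorted(rows)  # chronological order
    for _, year_rows in groupby(ordered, key=lambda row: row[0].year):
        total = 0
        for day, amount in year_rows:
            total += amount
            result.append((day, total))
    return result


for day, total in running_total_by_year(sales):
    print(day, total)
```

The total reaches 150 at the end of 2023 and restarts at 30 in January 2024, which is the behaviour a year-to-date measure is expected to show.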
Q16. P value explanation
P value is a statistical measure that helps determine the significance of results in hypothesis testing.
P value is the probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true.
A small P value (typically ≤ 0.05) indicates evidence against the null hypothesis, leading to its rejection at that significance level.
A large P value (> 0.05) means the data do not provide sufficient evidence against the null hypothesis, so we fail to reject it; this does not prove the null hypothesis is true.
P value is used in hypothesis testing...
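A small worked example makes the definition concrete. Assume a coin is flipped 10 times and lands heads 9 times, and the null hypothesis is that the coin is fair. The sketch below computes an exact two-sided P value by adding up the probability of every outcome at least as far from the expected 5 heads as the observed one. The scenario and the function name `binomial_two_sided_p` are assumptions for illustration.

```python
from math import comb


def binomial_two_sided_p(heads, flips):
    """Exact two-sided P value for a fair-coin null hypothesis (p = 0.5)."""
    expected = flips / 2
    observed_distance = abs(heads - expected)
    # Count the equally likely sequences whose head count is at least as
    # extreme as the observed one, in either direction.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_distance
    )
    return extreme / 2 ** flips


p = binomial_two_sided_p(9, 10)
print(p)  # 0.021484375, i.e. 22 of the 1024 possible sequences
```

Since 0.021 is below 0.05, the fair-coin hypothesis would be rejected at the 5% level for this data.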