ETL Tester

100+ ETL Tester Interview Questions and Answers

Updated 25 Nov 2024

Popular Companies

search-icon

Q1. If we have 200 staging tables, 40 dimensions tables and 20 facts table, How will you compare it with target systems

Ans.

The number of staging, dimension and fact tables in source and target systems need to be compared.

  • Compare the number of staging, dimension and fact tables in source and target systems.

  • Check if the table names and column names are consistent in both systems.

  • Verify if the data types and data values are matching in both systems.

  • Ensure that the ETL process is properly mapping the data from source to target systems.

  • Perform data profiling to identify any discrepancies between the s...read more

Q2. 2.Difference between Union and union all,drop and Truncate,star schema and snowflake schema,Dimension table and fact table.

Ans.

Union combines the result sets of two or more SELECT statements, while Union All combines all rows from two or more SELECT statements.

  • Union removes duplicate rows, while Union All does not.

  • Union requires the number and order of columns in all SELECT statements to be the same, while Union All does not have this requirement.

  • Example: SELECT column1 FROM table1 UNION SELECT column1 FROM table2;

  • Example: SELECT column1 FROM table1 UNION ALL SELECT column1 FROM table2;

ETL Tester Interview Questions and Answers for Freshers

illustration image

Q3. What is the difference between Test Plan and Test strategy

Ans.

Test plan is a detailed document that outlines the testing approach and activities. Test strategy is a high-level document that outlines the testing objectives and methods.

  • Test plan is more detailed than test strategy

  • Test plan includes specific test cases and test scenarios

  • Test strategy outlines the overall testing approach and objectives

  • Test strategy is created before the test plan

  • Test strategy is reviewed and updated regularly

Q4. Which type of Validation you will do at Landing and staging area.

Ans.

At landing and staging area, I will perform data validation to ensure accuracy and completeness of data.

  • Validate data against source system

  • Check for missing or duplicate data

  • Verify data types and formats

  • Ensure data integrity and consistency

  • Perform data profiling and data quality checks

Are these interview questions helpful?

Q5. How do you get data integrity between source and destination, how you perform loading technique.

Ans.

Data integrity is ensured through ETL testing and loading techniques.

  • Perform ETL testing to ensure data accuracy and completeness.

  • Use data profiling to identify data quality issues.

  • Perform data mapping to ensure correct data transfer.

  • Use checksums to verify data integrity.

  • Perform data validation to ensure data consistency.

  • Use error handling and logging to track and resolve issues.

  • Use incremental loading to improve performance and reduce errors.

Q6. What is requirement traceability matrix.. compatibility Testing

Ans.

Requirement Traceability Matrix is a document that links requirements to test cases. Compatibility testing checks if software works on different platforms.

  • Requirement Traceability Matrix (RTM) is used to ensure that all requirements are tested and covered in test cases.

  • RTM helps in tracking the progress of testing and ensures that all requirements are met.

  • Compatibility testing is done to ensure that software works on different platforms, browsers, and devices.

  • Compatibility te...read more

Share interview questions and help millions of jobseekers 🌟

man-with-laptop

Q7. what is self Join,types of joins.what is CDC and how we will use it in ETL testing

Ans.

Self join is joining a table with itself. Types of joins are inner, outer, left and right. CDC is change data capture used for tracking data changes.

  • Self join is used when we need to join a table with itself to retrieve data.

  • Types of joins are inner, outer, left and right join.

  • CDC is used to track data changes in the source system and apply those changes to the target system.

  • CDC can be used in ETL testing to verify that the data is being correctly captured and transformed.

Q8. Explain STLC .. difference between water fall and Agile methodologies

Ans.

STLC is Software Testing Life Cycle which includes planning, designing, executing and reporting. Waterfall is a linear approach while Agile is iterative.

  • STLC is a process of testing software from planning to reporting

  • Waterfall is a linear approach where each phase is completed before moving to the next

  • Agile is an iterative approach where testing is done in short cycles called sprints

  • In Waterfall, testing is done at the end of the development cycle while in Agile, testing is d...read more

ETL Tester Jobs

Cognizant Hiring For ETL tester 4-9 years
Cognizant
3.8
Chennai
ETL Tester 6-9 years
Cognizant
3.8
Kolkata
Etl Tester 8-11 years
Wipro
3.7
Hyderabad / Secunderabad

Q9. What documents will you produce durimg and sfter testing phase

Ans.

I will produce test plan, test cases, test scripts, defect reports, and test summary report.

  • Test plan outlining the testing approach and scope

  • Test cases with steps, expected results, and actual results

  • Test scripts for automation

  • Defect reports with steps to reproduce and severity

  • Test summary report with overall results and recommendations

Q10. Difference Between union and union all,having and where, rank and dense rank(),minus and intersection

Ans.

Explaining the differences between union and union all, having and where, rank and dense rank(), minus and intersection.

  • Union combines the result sets of two or more SELECT statements, while Union All returns all rows including duplicates.

  • Having is used to filter the results of an aggregate function in a GROUP BY clause, while Where is used to filter rows before grouping.

  • Rank assigns a unique rank to each row within a result set, while Dense Rank assigns the same rank to rows...read more

Q11. 1.Find last 5 records. 2. find unique records.

Ans.

To find the last 5 records, use the ORDER BY clause with a descending order and limit the result to 5. To find unique records, use the DISTINCT keyword.

  • To find the last 5 records, use the ORDER BY clause with a descending order and limit the result to 5.

  • Example: SELECT * FROM table_name ORDER BY column_name DESC LIMIT 5

  • To find unique records, use the DISTINCT keyword.

  • Example: SELECT DISTINCT column_name FROM table_name

Q12. 1.Difference between unique key and primary key.

Ans.

Unique key allows null values while primary key does not.

  • Primary key is a unique identifier for a record in a table.

  • Unique key allows null values but primary key does not.

  • A table can have only one primary key but multiple unique keys.

  • Example: Employee ID can be a primary key while email can be a unique key.

Q13. How do you identify latest record in SCD

Ans.

To identify latest record in SCD, check the effective end date column.

  • Look for the record with the latest effective end date

  • Effective end date column should have the maximum date value

  • If there are multiple records with the same effective end date, choose the one with the latest modified date

Q14. SQL-Display horizontal ORACLE into Vertical ORACLE (Pivot function)

Ans.

To display horizontal Oracle into vertical Oracle, we can use the PIVOT function in SQL.

  • The PIVOT function is used to transform rows into columns.

  • It requires an aggregate function to be specified.

  • The PIVOT function can be used with the SELECT statement.

  • The PIVOT function can also be used with dynamic SQL.

  • Example: SELECT * FROM table_name PIVOT (SUM(column_name) FOR pivot_column IN (value1, value2, value3));

Q15. How do you compare files in Unix

Ans.

Files can be compared in Unix using the diff command.

  • Use the 'diff' command followed by the two file names to compare them.

  • The output will show the differences between the two files.

  • Use the '-y' option to display the differences side by side.

  • Use the '-q' option to only show if the files are different or not.

  • Use the '-r' option to compare files in directories recursively.

Q16. What are the components of test cases

Ans.

Components of test cases include test case ID, test case description, test steps, expected results, actual results, and status.

  • Test case ID: unique identifier for the test case

  • Test case description: brief description of the test case

  • Test steps: detailed steps to execute the test case

  • Expected results: expected outcome of the test case

  • Actual results: actual outcome of the test case

  • Status: pass/fail status of the test case

Q17. What risks do you see in Test completion

Ans.

Incomplete testing can lead to undetected defects and potential risks in production.

  • Undetected defects can cause system failures and downtime.

  • Incomplete testing can lead to security vulnerabilities.

  • Lack of testing can result in poor user experience and customer dissatisfaction.

  • Incomplete testing can lead to increased maintenance costs and longer development cycles.

Q18. What are the prerequisites for etl testing?

Ans.

Prerequisites for ETL testing include understanding of data warehousing concepts, SQL, and ETL tools.

  • Understanding of data warehousing concepts

  • Proficiency in SQL

  • Familiarity with ETL tools such as Informatica, Talend, or SSIS

  • Knowledge of data mapping and transformation

  • Ability to write test cases and execute them

  • Experience in data validation and reconciliation

  • Understanding of data quality and data profiling

  • Knowledge of source and target systems

  • Ability to troubleshoot issues and...read more

Q19. SQL-Find a last day of the previous month

Ans.

SQL query to find the last day of the previous month.

  • Use the DATEADD function to subtract one day from the first day of the current month

  • Use the DAY function to get the day of the month

  • Subtract the day of the month from the date to get the last day of the previous month

Q20. How to find nth highest from a table

Ans.

To find the nth highest from a table, use the SQL query with the ORDER BY and LIMIT clauses.

  • Write a SQL query to select the column you want to find the nth highest from.

  • Use the ORDER BY clause to sort the column in descending order.

  • Use the LIMIT clause to specify the number of rows to return, where n-1 is the value for LIMIT.

  • The nth highest value will be the first row returned by the query.

Q21. SQL- Fetch last 5 records from the table

Ans.

To fetch last 5 records from a table in SQL

  • Use SELECT statement to retrieve data from the table

  • Use ORDER BY clause to sort the data in descending order based on a column

  • Use LIMIT clause to limit the number of rows returned to 5

Q22. How do you map requirement to test cases

Ans.

Requirements are mapped to test cases by analyzing the requirements and creating test cases that cover all the aspects of the requirement.

  • Analyze the requirement document thoroughly

  • Identify the key functionalities and features

  • Create test cases that cover all the aspects of the requirement

  • Ensure that the test cases are traceable to the requirement

  • Review and validate the test cases with stakeholders

Q23. what is fact table,dimenssion table,star schema,snowflake schema???

Ans.

Fact table, dimension table, star schema, and snowflake schema are concepts in data warehousing.

  • Fact table contains the measures or metrics of the data warehouse.

  • Dimension table contains the attributes or dimensions of the data warehouse.

  • Star schema is a type of schema where a fact table is connected to multiple dimension tables.

  • Snowflake schema is a type of schema where dimension tables are normalized into multiple related tables.

  • These concepts are used to design and organiz...read more

Q24. What are dimensions? And their types

Ans.

Dimensions are attributes or characteristics of data that can be used for analysis and reporting.

  • Dimensions are used in data warehousing and business intelligence to categorize and organize data.

  • Types of dimensions include time, geography, product, customer, and organization.

  • Dimensions can be hierarchical, with subcategories and levels of detail.

  • Dimensions are often used in conjunction with measures, which are the numerical values being analyzed.

Q25. SQL Query to find such records which never ever contains flag 0 in critical column p1 1-1-21 1 p1 2-3-21 0 p2 3-2-21 0 p3 1-5-21 1 p3 1-5-21 1 p4 14-21 1

Ans.

The SQL query to find records that never contain flag 0 in the critical column.

  • Use the NOT EXISTS operator to check if there are no records with flag 0 for each distinct value in the critical column.

  • Write a subquery to select distinct values from the critical column.

  • Join the main table with the subquery using the critical column.

  • Filter the result by checking if there are no records with flag 0 for each distinct value in the critical column.

Q26. What is Low Priority , Medium Priority & High priority defects?

Ans.

Low, Medium, and High priority defects are categorized based on their impact on the system and urgency of fixing.

  • Low priority defects are minor issues that do not significantly impact the system's functionality.

  • Medium priority defects are issues that have a noticeable impact on the system but are not critical.

  • High priority defects are critical issues that severely impact the system's functionality and need immediate attention.

  • Examples: Low priority - Spelling errors, Medium p...read more

Q27. Difference between Star schema and snowflake schema.

Ans.

Star schema has a single fact table and denormalized dimension tables, while snowflake schema has normalized dimension tables.

  • Star schema is simpler and easier to understand, but can lead to data redundancy.

  • Snowflake schema is more complex but saves storage space and allows for more flexibility in querying.

  • Star schema is better suited for smaller datasets, while snowflake schema is better for larger datasets.

  • Example of star schema: sales fact table with denormalized customer ...read more

Q28. Difference between white box and black box testing.

Ans.

White box testing is testing the internal structure of the application while black box testing is testing the functionality without knowledge of the internal structure.

  • White box testing is also known as clear box testing or structural testing.

  • It requires knowledge of the internal workings of the application.

  • Examples include unit testing and integration testing.

  • Black box testing is also known as functional testing.

  • It does not require knowledge of the internal workings of the a...read more

Q29. How to do validation between two integers Source 12345 2122434 54534535 5345353455 target 0000012345 0002122434 0054534535 5345353455

Ans.

To validate between two integers, compare each corresponding digit and check if they are equal.

  • Convert both integers to strings

  • Pad the source integer with leading zeros to match the length of the target integer

  • Compare each digit of the source and target integers

  • If any digit is not equal, the validation fails

Q30. Suppose we have following column what will be output for inner join col1 col2 1 1 1 null null null 1 1 1 blank blank blank

Ans.

The output for inner join will be the rows where both col1 and col2 have non-null values.

  • Inner join returns only the matching rows from both tables.

  • In this case, the output will be the rows with non-null values in both col1 and col2.

  • Rows with null or blank values in either col1 or col2 will be excluded from the output.

Q31. Explain ETL architecture?

Ans.

ETL architecture refers to the design and structure of the ETL process.

  • ETL architecture involves three main components: extraction, transformation, and loading.

  • Data is extracted from various sources, transformed to fit the target system, and loaded into the target database.

  • ETL architecture can be implemented using different tools and technologies, such as ETL software, data integration platforms, and cloud-based solutions.

  • The architecture should be designed to ensure data acc...read more

Q32. What is data repository?

Ans.

A data repository is a centralized location where data is stored, managed, and maintained.

  • It is used to store and manage data in a structured manner

  • It can be a database, data warehouse, or data lake

  • It allows for easy access and retrieval of data

  • Examples include Hadoop Distributed File System (HDFS), Amazon S3, and Oracle Database

Q33. What is the difference between union and union all?

Ans.

Union combines and removes duplicates, while union all combines without removing duplicates.

  • Union removes duplicates from the result set, while union all does not.

  • Union is slower than union all because it has to perform an additional step to remove duplicates.

  • Union all is used when duplicates are desired in the result set.

  • Example: SELECT column1 FROM table1 UNION SELECT column1 FROM table2;

  • Example: SELECT column1 FROM table1 UNION ALL SELECT column1 FROM table2;

Q34. What is data mart?

Ans.

A data mart is a subset of a larger data warehouse that is designed to serve a specific business unit or department.

  • Contains a subset of data from a larger data warehouse

  • Designed to serve a specific business unit or department

  • Provides a more focused view of data for analysis and reporting

  • Can be created using a top-down or bottom-up approach

  • Examples include sales data mart, marketing data mart, finance data mart

Q35. What is the Difference Between Drop and Truncate Command ?

Ans.

Drop command deletes the table structure along with data, while truncate command deletes only the data.

  • Drop command removes the table structure and all its data, while truncate command only removes the data but keeps the table structure intact.

  • Drop command is a DDL (Data Definition Language) command, while truncate command is a DML (Data Manipulation Language) command.

  • Drop command is slower than truncate command as it involves deleting the table structure.

  • Example: DROP TABLE ...read more

Q36. Write an SQL query to fetch 5th ,10th,15th .. records from a table?

Ans.

Use OFFSET and FETCH in SQL to retrieve specific records from a table.

  • Use OFFSET to skip the first n-1 records

  • Use FETCH to limit the number of records to be retrieved

  • Example: SELECT * FROM table_name OFFSET 4 ROWS FETCH NEXT 5 ROWS ONLY

Q37. What is the difference between set & join?

Ans.

Set and join are operations used in database management systems.

  • Set is used to perform mathematical set operations like union, intersection, and difference on two or more sets.

  • Join is used to combine rows from two or more tables based on a related column between them.

  • Set operations do not require a common column, while join operations do.

  • Set operations return a result set with unique values, while join operations return a result set with combined values.

  • Examples: UNION, INTER...read more

Q38. diff b/w where clause and having clause? diff b/w severity and priority

Ans.

Difference between where and having clause and severity and priority

  • Where clause is used to filter rows before grouping while having clause is used to filter groups after grouping

  • Severity is the degree of impact of a defect on the system while priority is the order in which defects should be fixed

  • Where clause is used with SELECT, UPDATE and DELETE statements while having clause is used with GROUP BY clause

  • Severity is usually assigned by the tester while priority is assigned b...read more

Q39. What is a iterative model

Ans.

Iterative model is a software development model where the development process is divided into smaller iterations.

  • Iterative model involves repeating the development process in small cycles.

  • Each cycle involves planning, designing, building, testing and evaluating the software.

  • Feedback from each cycle is used to improve the software in the next cycle.

  • Iterative model is useful when requirements are not clear or keep changing.

  • Examples of iterative models include Agile and Scrum.

Q40. 1 2 3 . . . 100 we want 100 at first row & rest below it

Ans.

The question is asking to rearrange the numbers so that 100 is at the first row and the rest are below it.

  • Create an array of strings to store the numbers

  • Assign the value '100' to the first element of the array

  • Use a loop to assign the numbers 1 to 99 to the remaining elements of the array

Q41. Explain schema you used in your project

Ans.

The schema used in my project was a star schema.

  • Star schema is a type of data warehouse schema where a central fact table is connected to multiple dimension tables.

  • The fact table contains the measurements or metrics of the business process, while the dimension tables provide context and descriptive attributes.

  • This schema is commonly used in data warehousing and business intelligence applications.

  • Example: In a sales analysis project, the fact table could contain sales transact...read more

Q42. 1. Fetch last record of the column. 2. Dense rank vs rank 3. Fetch end of 5 character from the string of the column

Ans.

The interview questions are related to SQL and data manipulation.

  • To fetch the last record of a column, you can use the ORDER BY clause with DESC keyword and LIMIT 1.

  • Dense rank and rank are both ranking functions in SQL, but dense rank assigns consecutive rank numbers to rows, while rank leaves gaps in the ranking sequence.

  • To fetch the end of 5 characters from a string column, you can use the RIGHT function in SQL.

Q43. write query last hour hour transaction details through net banking r card

Ans.

Query for last hour transaction details through net banking or card

  • Use the transaction table to filter transactions by timestamp

  • Join with the net banking or card table to get additional details

  • Filter for transactions within the last hour using a WHERE clause

  • Order the results by timestamp to get the most recent transactions first

Q44. What is fact?

Ans.

A fact is a piece of information that is known to be true or proven through evidence.

  • Facts are objective and can be verified through research or observation.

  • Facts are not opinions or beliefs.

  • Examples of facts include the boiling point of water, the population of a city, and historical events that have been documented.

  • Facts can be used to support arguments or conclusions.

  • Facts can change over time as new information is discovered or theories are revised.

Q45. Find the nth row. Find the top 50% records, string function

Ans.

Use SQL query to find the nth row and top 50% records using string function

  • To find the nth row, use LIMIT and OFFSET in SQL query

  • To find top 50% records, use PERCENTILE_CONT function in SQL

  • String functions like SUBSTRING can be used to manipulate string data

Q46. what is Star Schema and Snow flake schema ?

Ans.

Star Schema is a type of database schema where a central fact table is connected to multiple dimension tables. Snowflake Schema is a normalized form of Star Schema where dimension tables are further normalized into sub-dimension tables.

  • Star Schema has a central fact table connected to multiple dimension tables

  • Snowflake Schema is a normalized form of Star Schema with sub-dimension tables

  • Star Schema is denormalized for better query performance

  • Snowflake Schema is normalized for ...read more

Q47. SQL query to find second highest salary.

Ans.

SQL query to find second highest salary.

  • Use ORDER BY clause in descending order to sort the salaries

  • Use LIMIT 1,1 to select the second highest salary

  • Example: SELECT salary FROM employees ORDER BY salary DESC LIMIT 1,1

Q48. How you identify duplicates and how to find latest duplicate

Ans.

Identifying duplicates and finding the latest duplicate in ETL testing

  • Identify duplicates by comparing key fields in the dataset

  • To find the latest duplicate, look for the record with the highest timestamp or version number

  • Use SQL queries or ETL tools to filter out duplicates and identify the latest one

Q49. What is the difference between star schema and snowflake schme

Ans.

Star schema has a single fact table connected to multiple dimension tables directly, while snowflake schema has normalized dimension tables connected to each other.

  • Star schema has denormalized dimension tables, making it easier to query but potentially less efficient for storage.

  • Snowflake schema has normalized dimension tables, reducing redundancy but potentially making queries more complex.

  • Star schema is more suitable for data warehousing applications with simpler queries, w...read more

Q50. Group by clause and how it work with a query

Ans.

Group by clause is used in SQL to group rows that have the same values into summary rows.

  • Group by clause is used with aggregate functions like SUM, COUNT, AVG, etc.

  • It is used to group rows based on one or more columns specified in the query.

  • Group by clause must be used after the WHERE clause and before the ORDER BY clause.

  • Example: SELECT department, COUNT(*) FROM employees GROUP BY department;

1
2
3
Next
Interview Tips & Stories
Ace your next interview with expert advice and inspiring stories

Interview experiences of popular companies

3.7
 • 10k Interviews
3.9
 • 7.8k Interviews
3.7
 • 7.3k Interviews
3.8
 • 5.4k Interviews
3.8
 • 4.6k Interviews
3.6
 • 3.6k Interviews
3.6
 • 2.3k Interviews
3.8
 • 513 Interviews
3.7
 • 507 Interviews
View all

Calculate your in-hand salary

Confused about how your in-hand salary is calculated? Enter your annual salary (CTC) and get your in-hand salary

ETL Tester Interview Questions
Share an Interview
Stay ahead in your career. Get AmbitionBox app
qr-code
Helping over 1 Crore job seekers every month in choosing their right fit company
65 L+

Reviews

4 L+

Interviews

4 Cr+

Salaries

1 Cr+

Users/Month

Contribute to help millions
Get AmbitionBox app

Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2024 Info Edge (India) Ltd.

Follow us
  • Youtube
  • Instagram
  • LinkedIn
  • Facebook
  • Twitter