ETL Tester

filter-iconFilter interviews by

100+ ETL Tester Interview Questions and Answers

Updated 4 Mar 2025

Q51. What is the difference between star schema and snowflake schme

Ans.

Star schema has a single fact table connected to multiple dimension tables directly, while snowflake schema has normalized dimension tables connected to each other.

  • Star schema has denormalized dimension tables, making it easier to query but potentially less efficient for storage.

  • Snowflake schema has normalized dimension tables, reducing redundancy but potentially making queries more complex.

  • Star schema is more suitable for data warehousing applications with simpler queries, w...read more

Q52. Group by clause and how it work with a query

Ans.

Group by clause is used in SQL to group rows that have the same values into summary rows.

  • Group by clause is used with aggregate functions like SUM, COUNT, AVG, etc.

  • It is used to group rows based on one or more columns specified in the query.

  • Group by clause must be used after the WHERE clause and before the ORDER BY clause.

  • Example: SELECT department, COUNT(*) FROM employees GROUP BY department;

Q53. Explain different types of data validation and checks performed in ETL testing

Ans.

Different types of data validation and checks in ETL testing include schema validation, data completeness checks, data transformation checks, and data quality checks.

  • Schema validation ensures that the data conforms to the expected structure and format.

  • Data completeness checks verify that all expected data is present and accounted for.

  • Data transformation checks ensure that data is accurately transformed according to business rules.

  • Data quality checks assess the accuracy, consi...read more

Q54. Explain Defects life cycle

Ans.

Defects life cycle is a process that describes the stages of a defect from discovery to resolution.

  • Defect is identified and reported

  • Defect is assigned to a developer

  • Developer fixes the defect

  • Defect is retested by the tester

  • If the defect is still present, it is reopened

  • If the defect is fixed, it is closed

Are these interview questions helpful?

Q55. What is difference between delete and drop

Ans.

Delete removes specific records from a table; drop removes the entire table structure and its data.

  • DELETE is a DML command used to remove specific rows from a table based on a condition.

  • Example: DELETE FROM Employees WHERE EmployeeID = 1; removes the employee with ID 1.

  • DROP is a DDL command that removes the entire table structure and all its data permanently.

  • Example: DROP TABLE Employees; deletes the entire Employees table.

  • DELETE can be rolled back if used within a transactio...read more

Q56. Types of test data creation

Ans.

Test data creation types include manual, automated, random, boundary, and negative testing.

  • Manual testing involves creating data by hand

  • Automated testing uses tools to generate data

  • Random testing involves creating data randomly

  • Boundary testing involves testing data at the limits of its range

  • Negative testing involves testing invalid or unexpected data

Share interview questions and help millions of jobseekers 🌟

man-with-laptop

Q57. What is the flow of recent project

Ans.

The recent project flow involved data extraction, transformation, and loading using ETL tools.

  • The project started with identifying the data sources and requirements.

  • Then, the data was extracted from various sources using ETL tools like Informatica, Talend, etc.

  • The extracted data was transformed as per the business rules and requirements.

  • Finally, the transformed data was loaded into the target system.

  • The project also involved testing the ETL process and ensuring data accuracy ...read more

Q58. what is difference between union and union all

Ans.

Union combines the result sets of two or more SELECT statements, while Union All combines all rows from two or more SELECT statements.

  • Union removes duplicate rows from the result set, while Union All does not.

  • Union requires that the number and order of columns in all SELECT statements be the same, while Union All does not have this requirement.

  • Union is slower than Union All because it performs a distinct operation, while Union All simply concatenates the result sets.

  • Example: ...read more

ETL Tester Jobs

Etl Tester - PAN INDIA 4-9 years
Infosys
3.6
₹ 5 L/yr - ₹ 13 L/yr
Hyderabad / Secunderabad
ETL Tester_ Pan India 3-8 years
Infosys
3.6
Hyderabad / Secunderabad
Etl Tester_Infosys 3-8 years
Infosys
3.6
Hyderabad / Secunderabad

Q59. Query to fetch data greater than average, min and max salary.

Ans.

Query to fetch data greater than average, min and max salary.

  • Calculate average, min, and max salary first

  • Use a SELECT statement with WHERE clause to filter data greater than average, min, and max salary

Q60. Query to fetch only numbers from alphanumeric data from a table.

Ans.

Use regular expressions to extract numbers from alphanumeric data in a table.

  • Use the REGEXP_REPLACE function in SQL to replace all non-numeric characters with an empty string.

  • For example, SELECT REGEXP_REPLACE(column_name, '[^0-9]', '') FROM table_name;

  • This will return only the numbers from the alphanumeric data in the specified column.

Q61. How would you test a full load vs incremental load

Ans.

To test a full load vs incremental load, compare the results of loading all data at once vs loading only new or updated data.

  • Create test cases to verify the accuracy of data loaded during a full load.

  • Create test cases to verify that only new or updated data is loaded during an incremental load.

  • Compare the results of the full load and incremental load to ensure consistency and accuracy.

  • Verify that the data integrity is maintained during both types of loads.

  • Use tools like SQL q...read more

Q62. What type of data validation you performed.

Ans.

I performed various types of data validation including format validation, completeness validation, and referential integrity validation.

  • Format validation: Ensuring that the data is in the correct format as per the defined specifications. For example, validating that a date field is in the format 'YYYY-MM-DD'.

  • Completeness validation: Checking if all the required fields are present and populated in the data. For example, ensuring that a customer record has a name, address, and ...read more

Q63. what is mapping design????

Ans.

Mapping design is the process of creating a mapping between source and target data structures.

  • It involves identifying the source and target data structures

  • Defining the transformation rules to convert the source data to target data

  • Creating a mapping document to document the mapping rules

  • Examples include ETL mapping design for data integration projects

Q64. what is difference between join and union?

Ans.

Join is used to combine rows from two or more tables based on a related column, while union is used to combine the result sets of two or more SELECT statements.

  • Join is used to retrieve data from multiple tables based on a related column

  • Union is used to combine the result sets of two or more SELECT statements

  • Join can be used with different types like INNER JOIN, LEFT JOIN, RIGHT JOIN, etc.

  • Union removes duplicate rows from the combined result set

Q65. How to find duplicate values in python?

Ans.

Use a dictionary to find duplicate values in an array of strings in Python.

  • Create an empty dictionary to store the count of each string in the array.

  • Iterate through the array and for each string, check if it exists in the dictionary. If it does, increment the count. If not, add it to the dictionary with a count of 1.

  • After iterating through the array, check the dictionary for any strings with a count greater than 1. These are the duplicate values.

Q66. What is fact table?

Ans.

Fact table is a table in a data warehouse that stores quantitative data about a business process.

  • Contains foreign keys to dimension tables

  • Stores numerical data such as sales, revenue, etc.

  • Used for analysis and reporting

  • Can have multiple fact tables in a data warehouse

Q67. how do you delete duplicates from each department

Ans.

Use SQL query with GROUP BY and HAVING clause to delete duplicates from each department

  • Use GROUP BY clause to group the records by department

  • Use HAVING clause to filter out the duplicates within each department

  • Write a DELETE statement to remove the duplicate records

Q68. what are the types of scd

Ans.

Types of SCD include Type 1, Type 2, Type 3, and Type 4.

  • Type 1 - Overwrite: Old record is replaced with new data.

  • Type 2 - Add new row: New record is added with a new surrogate key.

  • Type 3 - Update: New column is added to track changes.

  • Type 4 - Hybrid: Combination of Type 1 and Type 2.

Q69. What is Fact and Dimension Tables

Ans.

Fact tables contain quantitative data while dimension tables contain descriptive attributes.

  • Fact tables store measurable, quantitative data such as sales revenue or quantity sold

  • Dimension tables store descriptive attributes related to the data in the fact table, such as product name or customer details

  • Fact tables are typically normalized while dimension tables are denormalized for easier querying

  • Example: In a sales database, the fact table would contain sales data while dimen...read more

Q70. Explain challenges you faced in etl testing?

Ans.

Challenges in ETL testing include data quality issues, complex transformations, and handling large volumes of data.

  • Data quality issues such as missing or incorrect data can impact testing results

  • Complex transformations can be difficult to validate and may lead to errors in the ETL process

  • Handling large volumes of data can slow down testing processes and require efficient testing strategies

Q71. domain name from an email ?

Ans.

A domain name from an email refers to the part after the @ symbol in an email address.

  • The domain name is typically the name of the organization or service provider associated with the email address.

  • For example, in the email address example@gmail.com, 'gmail.com' is the domain name.

Q72. SCD Types , Star Schema Vs SnowFlake Schema

Ans.

SCD types refer to slowly changing dimensions in data warehousing. Star schema is denormalized while snowflake schema is normalized.

  • SCD types refer to how dimensions change over time in a data warehouse

  • Type 1: Overwrite existing data with new data

  • Type 2: Keep track of historical data with new records

  • Type 3: Keep track of some historical data with limited history

  • Star schema is denormalized with a central fact table connected to dimension tables

  • Snowflake schema is normalized wi...read more

Q73. What is ETL and its architecture

Q74. Explain defect life cycle.

Ans.

Defect life cycle is the process of identifying, reporting, prioritizing, fixing, and verifying defects in software.

  • Defect is identified by testers during testing

  • Defect is reported to development team

  • Development team prioritizes and fixes the defect

  • Fixed defect is verified by testers

  • If defect is not fixed, it goes back to development team

  • If defect is fixed, it is closed

Q75. What is ETL explain,stages Of etl

Ans.

ETL stands for Extract, Transform, Load. It is a process of extracting data from various sources, transforming it into a consistent format, and loading it into a target database.

  • Extract: Data is extracted from different sources such as databases, files, APIs, etc.

  • Transform: Data is cleaned, validated, and transformed into a consistent format.

  • Load: Transformed data is loaded into a target database or data warehouse.

  • Stages of ETL: Extraction, Transformation, Loading.

Q76. Difference between snowflake schema and star schema

Ans.

Snowflake schema is a normalized form of star schema with additional dimension tables.

  • Snowflake schema is a data modeling technique used in data warehousing.

  • In snowflake schema, dimensions are normalized into multiple related tables.

  • Snowflake schema reduces redundancy and improves data integrity.

  • Star schema is a denormalized form of snowflake schema with a single dimension table.

  • In star schema, dimensions are not normalized and are directly linked to the fact table.

  • Star schem...read more

Q77. rank dense_rank() ,row_numebr with example?

Ans.

dense_rank() assigns a unique rank to each distinct row, row_number() assigns a unique sequential integer to each row.

  • dense_rank() is used to assign a unique rank to each distinct row based on the specified order.

  • row_number() is used to assign a unique sequential integer to each row in the result set.

  • Example: dense_rank() - SELECT dense_rank() OVER (ORDER BY salary) AS rank FROM employees;

  • Example: row_number() - SELECT row_number() OVER (ORDER BY hire_date) AS row_num FROM em...read more

Q78. What is entry criteria & exit criteria

Ans.

Entry criteria and exit criteria are conditions that must be met before and after testing, respectively.

  • Entry criteria are the conditions that must be fulfilled before testing can begin

  • Exit criteria are the conditions that must be fulfilled before testing can be considered complete

  • Entry criteria can include having test data ready, test environment set up, and test cases prepared

  • Exit criteria can include all test cases executed, all defects fixed, and test reports generated

Q79. What is Scalar function with example.

Ans.

Scalar function is a function that returns a single value based on input parameters.

  • Scalar functions are used in SQL to perform calculations on input values and return a single result.

  • They can be used in SELECT statements, WHERE clauses, and other SQL queries.

  • An example of a scalar function is the ABS() function, which returns the absolute value of a number.

Q80. How to find duplicates in SQL?

Ans.

Use the GROUP BY clause with COUNT() function to find duplicates in SQL.

  • Use GROUP BY clause with COUNT() function to group the records by the columns you want to check for duplicates

  • Filter the results by using HAVING clause to only show records with count greater than 1

  • Example: SELECT column1, column2, COUNT(*) FROM table_name GROUP BY column1, column2 HAVING COUNT(*) > 1;

Q81. What is dwh and Data Mart

Ans.

DWH (Data Warehouse) is a centralized repository for integrated data, while Data Mart is a subset focused on specific business areas.

  • DWH stores large volumes of historical data from multiple sources for analysis.

  • Data Mart is designed for specific departments, like sales or finance.

  • Example of DWH: A retail company's data warehouse containing sales, inventory, and customer data.

  • Example of Data Mart: A sales data mart that focuses only on sales transactions and customer demograp...read more

Q82. What is Slowly Changing Dimensions?

Ans.

Slowly Changing Dimensions (SCD) are techniques used in data warehousing to track changes to dimension data over time.

  • SCD Type 1: Overwrites old data with new data, losing historical information.

  • SCD Type 2: Creates new records for each change, preserving historical data.

  • SCD Type 3: Creates additional columns to track changes, allowing limited history.

  • SCD Type 4: Uses separate tables to store historical and current data.

  • Example: In a customer dimension, SCD Type 1 would update...read more

Q83. Etl tester test cases in various stages

Ans.

ETL tester tests in various stages to ensure data accuracy and completeness.

  • Test extraction process to ensure correct data is pulled from source systems

  • Test transformation process to ensure data is correctly transformed and formatted

  • Test loading process to ensure data is loaded into target systems without errors

  • Test data reconciliation to ensure data is complete and accurate in target systems

  • Test error handling and logging to ensure errors are properly handled and logged

Q84. What are constraints?

Ans.

Constraints are rules or conditions that must be followed in order to ensure data integrity and accuracy.

  • Constraints define the limits or restrictions on data values in a database table

  • Examples of constraints include primary key, foreign key, unique key, and check constraints

  • Constraints help maintain data quality and consistency in a database

Q85. Query to get 4th highest row based on a column

Ans.

Use SQL query with ORDER BY and LIMIT to get 4th highest row based on a column

  • Use ORDER BY clause to sort the column in descending order

  • Use LIMIT 1 OFFSET 3 to get the 4th highest row

  • Example: SELECT * FROM table_name ORDER BY column_name DESC LIMIT 1 OFFSET 3

Q86. Find the optimized approach for a given code

Ans.

Identify bottlenecks and inefficiencies in the code to optimize performance

  • Analyze the code to identify any redundant or unnecessary operations

  • Optimize data processing by using efficient algorithms and data structures

  • Consider parallel processing or multi-threading for performance improvement

  • Use profiling tools to identify performance bottlenecks and optimize them

  • Implement caching mechanisms to reduce redundant computations

Q87. Expected CTC in present company

Ans.

I prefer not to disclose my current salary as it is confidential.

  • It is not appropriate to disclose current salary as it is confidential information.

  • I am open to discussing a fair and competitive salary for the position.

  • My focus is on finding a challenging role that aligns with my skills and experience.

  • I am willing to negotiate based on the job responsibilities and market standards.

Q88. Flexible to work any environment

Ans.

Yes

  • I am flexible to work in any environment, be it a team or individual work

  • I can adapt to different work cultures and work styles

  • I am open to learning new technologies and tools

  • I can work in different time zones and shifts if required

Q89. Try to give real time example

Ans.

As an ETL Tester, a real-time example would be testing the data migration from a legacy system to a new system.

  • Testing the accuracy of data migration from one system to another

  • Ensuring data integrity during the migration process

  • Validating the completeness of data after migration

  • Verifying the consistency of data across systems

  • Identifying and resolving any data quality issues

  • Testing the performance of the ETL process

Q90. How to read a parquet file

Ans.

To read a parquet file, use a library like Apache Parquet or PyArrow to load the file and access the data.

  • Use a library like Apache Parquet or PyArrow to read the parquet file

  • Load the parquet file using the library's functions

  • Access the data within the parquet file for analysis or processing

Q91. Type of scd.

Ans.

SCD stands for Slowly Changing Dimensions. There are three types of SCD: Type 1, Type 2, and Type 3.

  • Type 1: Overwrites old data with new data.

  • Type 2: Creates a new record for new data and keeps the old record for historical data.

  • Type 3: Creates a new column for new data and keeps the old column for historical data.

Q92. Explain cross join with an example

Ans.

Cross join combines each row from one table with each row from another table.

  • Cross join produces a Cartesian product of the two tables involved.

  • It does not require any common columns between the two tables.

  • It can result in a large number of rows if the tables are big.

  • Example: SELECT * FROM table1 CROSS JOIN table2;

Q93. Explain inner join with an example

Ans.

Inner join combines rows from two tables based on a related column between them.

  • Inner join returns only the rows that have matching values in both tables

  • It is used to retrieve data that exists in both tables

  • Example: SELECT Orders.OrderID, Customers.CustomerName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;

Q94. Explain SCD types

Ans.

SCD types are used to track changes in data over time in a data warehouse.

  • SCD stands for Slowly Changing Dimensions.

  • There are three types of SCD: Type 1, Type 2, and Type 3.

  • Type 1 overwrites old data with new data.

  • Type 2 creates a new record for each change and keeps a history of changes.

  • Type 3 keeps both old and new data in the same record.

  • SCD types are important for maintaining data integrity and accuracy in a data warehouse.

Q95. What is primary key?

Ans.

Primary key is a unique identifier for each record in a database table.

  • Primary key ensures each record in a table is unique

  • It can be a single column or a combination of columns

  • Primary key values cannot be NULL

Q96. Query to fetch nth highest salary

Ans.

Use SQL query with ORDER BY and LIMIT to fetch nth highest salary

  • Use ORDER BY salary DESC to sort salaries in descending order

  • Use LIMIT n-1,1 to fetch the nth highest salary

  • Replace n with the desired position of the salary

Q97. what is data aquisation

Ans.

Data acquisition is the process of collecting, gathering, and capturing data from various sources.

  • Data acquisition involves retrieving data from different sources such as databases, files, APIs, or sensors.

  • It is an essential step in the ETL (Extract, Transform, Load) process.

  • Data acquisition can be manual or automated, depending on the source and the system being used.

  • Examples of data acquisition include extracting customer information from a CRM system, importing data from a...read more

Q98. what is data reconsilation

Ans.

Data reconciliation is the process of comparing and verifying data between two or more sources to ensure accuracy and consistency.

  • Data reconciliation involves comparing data from different sources, such as databases or files.

  • It aims to identify any discrepancies or inconsistencies in the data.

  • Reconciliation can be done by comparing data values, counts, or summaries.

  • Examples include comparing sales data between a point-of-sale system and an inventory database.

  • Data reconciliati...read more

Q99. what is scrum ceremony

Ans.

Scrum ceremonies are regular meetings held in Agile development to facilitate collaboration, planning, and review.

  • Scrum ceremonies are time-boxed events that occur at specific intervals during a sprint.

  • The ceremonies include Sprint Planning, Daily Stand-up, Sprint Review, and Sprint Retrospective.

  • Sprint Planning involves determining the work to be done in the upcoming sprint.

  • Daily Stand-up is a brief meeting where team members discuss progress, challenges, and plans for the d...read more

Q100. SQL query to find second Max salary.

Ans.

Use SQL query with subquery to find second highest salary.

  • Use ORDER BY and LIMIT to get the second highest salary.

  • Use a subquery to exclude the highest salary from the result set.

  • Example: SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees);

Previous
1
2
3
Next
Interview Tips & Stories
Ace your next interview with expert advice and inspiring stories

Interview experiences of popular companies

3.7
 • 10.5k Interviews
3.8
 • 8.2k Interviews
3.6
 • 7.6k Interviews
3.7
 • 5.6k Interviews
3.7
 • 4.8k Interviews
3.5
 • 3.8k Interviews
3.8
 • 2.9k Interviews
3.7
 • 538 Interviews
3.7
 • 514 Interviews
View all

Calculate your in-hand salary

Confused about how your in-hand salary is calculated? Enter your annual salary (CTC) and get your in-hand salary

Recently Viewed
SALARIES
Nyati Group
SALARIES
Nyati Group
JOBS
Smartworks
No Jobs
SALARIES
Nyati Group
LIST OF COMPANIES
Nyati Group
Overview
SALARIES
Nyati Group
INTERVIEWS
SMEC
No Interviews
INTERVIEWS
Dumadu Games
No Interviews
INTERVIEWS
Fandango
No Interviews
INTERVIEWS
SMEC
No Interviews
ETL Tester Interview Questions
Share an Interview
Stay ahead in your career. Get AmbitionBox app
qr-code
Helping over 1 Crore job seekers every month in choosing their right fit company
65 L+

Reviews

4 L+

Interviews

4 Cr+

Salaries

1 Cr+

Users/Month

Contribute to help millions

Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2024 Info Edge (India) Ltd.

Follow us
  • Youtube
  • Instagram
  • LinkedIn
  • Facebook
  • Twitter