ETL Tester
100+ ETL Tester Interview Questions and Answers

Asked in SKF

Q. What is the difference between a star schema and a snowflake schema?
Star schema has a single fact table and denormalized dimension tables, while snowflake schema has normalized dimension tables.
Star schema is simpler and easier to understand, but can lead to data redundancy.
Snowflake schema is more complex but saves storage space and allows for more flexibility in querying.
Star schema is better suited for smaller datasets, while snowflake schema is better for larger datasets.
Example of star schema: sales fact table with denormalized customer ...read more

Asked in Hexaware Technologies

Q. Suppose we have a table with a column named 'country' containing the values UK, USA, India, Germany, and Spain. How would you reorder the table to have India in the first row, followed by the remaining countrie...
read moreReorder a table to place 'India' at the top, followed by other countries.
Use SQL to reorder the results: SELECT country FROM table ORDER BY CASE WHEN country = 'India' THEN 0 ELSE 1 END;
This SQL query prioritizes 'India' by assigning it the lowest sort value.
The rest of the countries will follow in their natural alphabetical order.

Asked in Cognizant

Q. Explain ETL architecture.
ETL architecture refers to the design and structure of the ETL process.
ETL architecture involves three main components: extraction, transformation, and loading.
Data is extracted from various sources, transformed to fit the target system, and loaded into the target database.
ETL architecture can be implemented using different tools and technologies, such as ETL software, data integration platforms, and cloud-based solutions.
The architecture should be designed to ensure data acc...read more

Asked in Cognizant

Q. What is a data repository?
A data repository is a centralized location where data is stored, managed, and maintained.
It is used to store and manage data in a structured manner
It can be a database, data warehouse, or data lake
It allows for easy access and retrieval of data
Examples include Hadoop Distributed File System (HDFS), Amazon S3, and Oracle Database

Asked in Wipro

Q. What is Primary key and foreign key.guve me example of same in your current project
Primary keys uniquely identify records in a table, while foreign keys establish relationships between tables.
A primary key is a unique identifier for a record in a database table.
Example: In a 'Customers' table, 'CustomerID' can be a primary key.
A foreign key is a field in one table that links to the primary key of another table.
Example: In an 'Orders' table, 'CustomerID' can be a foreign key referencing 'Customers'.
Primary keys ensure data integrity, while foreign keys maint...read more

Asked in Infosys

Q. Write an SQL query to fetch the 5th, 10th, 15th, ... records from a table.
Use OFFSET and FETCH in SQL to retrieve specific records from a table.
Use OFFSET to skip the first n-1 records
Use FETCH to limit the number of records to be retrieved
Example: SELECT * FROM table_name OFFSET 4 ROWS FETCH NEXT 5 ROWS ONLY
ETL Tester Jobs




Asked in Wipro

Q. What is the difference between DROP and TRUNCATE commands?
Drop command deletes the table structure along with data, while truncate command deletes only the data.
Drop command removes the table structure and all its data, while truncate command only removes the data but keeps the table structure intact.
Drop command is a DDL (Data Definition Language) command, while truncate command is a DML (Data Manipulation Language) command.
Drop command is slower than truncate command as it involves deleting the table structure.
Example: DROP TABLE ...read more

Asked in Qualitest

Q. diff b/w where clause and having clause? diff b/w severity and priority
Difference between where and having clause and severity and priority
Where clause is used to filter rows before grouping while having clause is used to filter groups after grouping
Severity is the degree of impact of a defect on the system while priority is the order in which defects should be fixed
Where clause is used with SELECT, UPDATE and DELETE statements while having clause is used with GROUP BY clause
Severity is usually assigned by the tester while priority is assigned b...read more
Share interview questions and help millions of jobseekers 🌟

Asked in Hexaware Technologies

Q. What is the difference between set and join operations?
Set and join are operations used in database management systems.
Set is used to perform mathematical set operations like union, intersection, and difference on two or more sets.
Join is used to combine rows from two or more tables based on a related column between them.
Set operations do not require a common column, while join operations do.
Set operations return a result set with unique values, while join operations return a result set with combined values.
Examples: UNION, INTER...read more

Asked in Zensar Technologies

Q. What is a data mart?
A data mart is a subset of a larger data warehouse that is designed to serve a specific business unit or department.
Contains a subset of data from a larger data warehouse
Designed to serve a specific business unit or department
Provides a more focused view of data for analysis and reporting
Can be created using a top-down or bottom-up approach
Examples include sales data mart, marketing data mart, finance data mart

Asked in Wipro

Q. What are the different types of joins, and can you explain a query for each?
Joins are SQL operations that combine rows from two or more tables based on related columns.
INNER JOIN: Returns records with matching values in both tables. Example: SELECT * FROM A INNER JOIN B ON A.id = B.id;
LEFT JOIN: Returns all records from the left table and matched records from the right table. Example: SELECT * FROM A LEFT JOIN B ON A.id = B.id;
RIGHT JOIN: Returns all records from the right table and matched records from the left table. Example: SELECT * FROM A RIGHT ...read more

Asked in EPAM Systems

Q. What is an iterative model?
Iterative model is a software development model where the development process is divided into smaller iterations.
Iterative model involves repeating the development process in small cycles.
Each cycle involves planning, designing, building, testing and evaluating the software.
Feedback from each cycle is used to improve the software in the next cycle.
Iterative model is useful when requirements are not clear or keep changing.
Examples of iterative models include Agile and Scrum.

Asked in TCS

Q. How you identify duplicates and how to find latest duplicate
Identifying duplicates and finding the latest duplicate in ETL testing
Identify duplicates by comparing key fields in the dataset
To find the latest duplicate, look for the record with the highest timestamp or version number
Use SQL queries or ETL tools to filter out duplicates and identify the latest one

Asked in LTIMindtree

Q. Find the nth row. Find the top 50% records, string function
Use SQL query to find the nth row and top 50% records using string function
To find the nth row, use LIMIT and OFFSET in SQL query
To find top 50% records, use PERCENTILE_CONT function in SQL
String functions like SUBSTRING can be used to manipulate string data
Asked in Data Unveil

Q. 1. Fetch last record of the column. 2. Dense rank vs rank 3. Fetch end of 5 character from the string of the column
The interview questions are related to SQL and data manipulation.
To fetch the last record of a column, you can use the ORDER BY clause with DESC keyword and LIMIT 1.
Dense rank and rank are both ranking functions in SQL, but dense rank assigns consecutive rank numbers to rows, while rank leaves gaps in the ranking sequence.
To fetch the end of 5 characters from a string column, you can use the RIGHT function in SQL.

Asked in Cognizant

Q. Explain the schema you used in your project.
The schema used in my project was a star schema.
Star schema is a type of data warehouse schema where a central fact table is connected to multiple dimension tables.
The fact table contains the measurements or metrics of the business process, while the dimension tables provide context and descriptive attributes.
This schema is commonly used in data warehousing and business intelligence applications.
Example: In a sales analysis project, the fact table could contain sales transact...read more

Asked in Wipro

Q. what is Star Schema and Snow flake schema ?
Star Schema is a type of database schema where a central fact table is connected to multiple dimension tables. Snowflake Schema is a normalized form of Star Schema where dimension tables are further normalized into sub-dimension tables.
Star Schema has a central fact table connected to multiple dimension tables
Snowflake Schema is a normalized form of Star Schema with sub-dimension tables
Star Schema is denormalized for better query performance
Snowflake Schema is normalized for ...read more

Asked in Qualitest

Q. Write a query to retrieve transaction details from the last hour made through net banking or card.
Query for last hour transaction details through net banking or card
Use the transaction table to filter transactions by timestamp
Join with the net banking or card table to get additional details
Filter for transactions within the last hour using a WHERE clause
Order the results by timestamp to get the most recent transactions first

Asked in Cognizant

Q. What is a fact table?
A fact is a piece of information that is known to be true or proven through evidence.
Facts are objective and can be verified through research or observation.
Facts are not opinions or beliefs.
Examples of facts include the boiling point of water, the population of a city, and historical events that have been documented.
Facts can be used to support arguments or conclusions.
Facts can change over time as new information is discovered or theories are revised.

Asked in American Broadcasting Company

Q. What is the difference between star schema and snowflake schema?
Star schema has a single fact table connected to multiple dimension tables directly, while snowflake schema has normalized dimension tables connected to each other.
Star schema has denormalized dimension tables, making it easier to query but potentially less efficient for storage.
Snowflake schema has normalized dimension tables, reducing redundancy but potentially making queries more complex.
Star schema is more suitable for data warehousing applications with simpler queries, w...read more

Asked in Infosys

Q. Explain the GROUP BY clause and how it works within a query.
Group by clause is used in SQL to group rows that have the same values into summary rows.
Group by clause is used with aggregate functions like SUM, COUNT, AVG, etc.
It is used to group rows based on one or more columns specified in the query.
Group by clause must be used after the WHERE clause and before the ORDER BY clause.
Example: SELECT department, COUNT(*) FROM employees GROUP BY department;

Asked in HCLTech

Q. Explain the different types of data validation and checks performed in ETL testing.
Different types of data validation and checks in ETL testing include schema validation, data completeness checks, data transformation checks, and data quality checks.
Schema validation ensures that the data conforms to the expected structure and format.
Data completeness checks verify that all expected data is present and accounted for.
Data transformation checks ensure that data is accurately transformed according to business rules.
Data quality checks assess the accuracy, consi...read more

Asked in Oracle

Q. Write an SQL query to find the second highest salary.
SQL query to find second highest salary.
Use ORDER BY clause in descending order to sort the salaries
Use LIMIT 1,1 to select the second highest salary
Example: SELECT salary FROM employees ORDER BY salary DESC LIMIT 1,1
Asked in Kinelabs

Q. What is the difference between DELETE and DROP statements?
Delete removes specific records from a table; drop removes the entire table structure and its data.
DELETE is a DML command used to remove specific rows from a table based on a condition.
Example: DELETE FROM Employees WHERE EmployeeID = 1; removes the employee with ID 1.
DROP is a DDL command that removes the entire table structure and all its data permanently.
Example: DROP TABLE Employees; deletes the entire Employees table.
DELETE can be rolled back if used within a transactio...read more

Asked in Qualitest

Q. Explain the defect life cycle.
Defects life cycle is a process that describes the stages of a defect from discovery to resolution.
Defect is identified and reported
Defect is assigned to a developer
Developer fixes the defect
Defect is retested by the tester
If the defect is still present, it is reopened
If the defect is fixed, it is closed

Asked in Cognizant

Q. Types of test data creation
Test data creation types include manual, automated, random, boundary, and negative testing.
Manual testing involves creating data by hand
Automated testing uses tools to generate data
Random testing involves creating data randomly
Boundary testing involves testing data at the limits of its range
Negative testing involves testing invalid or unexpected data

Asked in Cognizant

Q. How would you test a full load versus an incremental load?
To test a full load vs incremental load, compare the results of loading all data at once vs loading only new or updated data.
Create test cases to verify the accuracy of data loaded during a full load.
Create test cases to verify that only new or updated data is loaded during an incremental load.
Compare the results of the full load and incremental load to ensure consistency and accuracy.
Verify that the data integrity is maintained during both types of loads.
Use tools like SQL q...read more
Asked in Data Unveil

Q. What type of data validation have you performed?
I performed various types of data validation including format validation, completeness validation, and referential integrity validation.
Format validation: Ensuring that the data is in the correct format as per the defined specifications. For example, validating that a date field is in the format 'YYYY-MM-DD'.
Completeness validation: Checking if all the required fields are present and populated in the data. For example, ensuring that a customer record has a name, address, and ...read more

Asked in Persistent Systems

Q. What was the flow of your most recent project?
The recent project flow involved data extraction, transformation, and loading using ETL tools.
The project started with identifying the data sources and requirements.
Then, the data was extracted from various sources using ETL tools like Informatica, Talend, etc.
The extracted data was transformed as per the business rules and requirements.
Finally, the transformed data was loaded into the target system.
The project also involved testing the ETL process and ensuring data accuracy ...read more

Asked in Horizon Architects

Q. How do you delete duplicates from each department?
Use SQL query with GROUP BY and HAVING clause to delete duplicates from each department
Use GROUP BY clause to group the records by department
Use HAVING clause to filter out the duplicates within each department
Write a DELETE statement to remove the duplicate records
Interview Questions of Similar Designations
Interview Experiences of Popular Companies





Top Interview Questions for ETL Tester Related Skills

Calculate your in-hand salary
Confused about how your in-hand salary is calculated? Enter your annual salary (CTC) and get your in-hand salary


Reviews
Interviews
Salaries
Users

