Tech Mahindra
WMS Industries Interview Questions and Answers
Q1. Which of these two queries is faster: select * from table or select * from table limit 100?
select * from table limit 100 is usually faster
'select * from table' retrieves every row, which is slow when the table is large
'select * from table limit 100' lets the engine stop after returning 100 rows, so less data is read and transferred
In BigQuery, LIMIT reduces the rows returned but not the bytes scanned (and billed) on a non-clustered table
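The difference in rows returned can be sketched with Python's built-in sqlite3 module. This is only an illustration on a small in-memory table; the table name events and its columns are made up for the example, and SQLite stands in for whatever database the question has in mind.

```python
import sqlite3

# In-memory SQLite database standing in for a real table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"row-{i}",) for i in range(10_000)],
)

# Without LIMIT, every row is fetched into memory.
all_rows = conn.execute("SELECT * FROM events").fetchall()

# With LIMIT, the engine can stop after producing 100 rows.
limited_rows = conn.execute("SELECT * FROM events LIMIT 100").fetchall()

print(len(all_rows))      # 10000
print(len(limited_rows))  # 100
conn.close()
```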
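Q2. GCP storage class types
Cloud Storage offers four storage classes that trade storage cost against access cost and minimum storage duration.
Standard Storage: for frequently accessed data, with no minimum storage duration
Nearline Storage: for data accessed about once a month or less (30-day minimum storage duration)
Coldline Storage: for data accessed at most once a quarter (90-day minimum storage duration)
Archive Storage: for long-term retention of data accessed less than once a year (365-day minimum storage duration)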
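As a rough rule of thumb, the choice between classes can be written as a small function. The helper suggest_storage_class below is hypothetical, not part of any Google library; it assumes the expected gap between accesses is the only input and uses the 30, 90 and 365 day minimum storage durations as thresholds.

```python
# Hypothetical helper, not a Google Cloud API. The thresholds mirror the
# minimum storage durations of Nearline (30), Coldline (90) and Archive (365).
def suggest_storage_class(days_between_accesses: int) -> str:
    if days_between_accesses < 30:
        return "STANDARD"
    if days_between_accesses < 90:
        return "NEARLINE"
    if days_between_accesses < 365:
        return "COLDLINE"
    return "ARCHIVE"


print(suggest_storage_class(1))    # STANDARD
print(suggest_storage_class(45))   # NEARLINE
print(suggest_storage_class(400))  # ARCHIVE
```

A real decision would also weigh retrieval fees and early deletion charges, which this sketch ignores.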
Q3. SQL optimisation techniques
SQL optimization techniques focus on improving query performance by reducing execution time and resource usage.
Use indexes to speed up data retrieval
Avoid using SELECT * and instead specify only the columns needed
Optimize joins by using appropriate join types and conditions
Limit the use of subqueries and instead use JOINs where possible
Use EXPLAIN to analyze query execution plans and identify bottlenecks
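The effect of an index and of inspecting the plan can be sketched with sqlite3, whose EXPLAIN QUERY PLAN plays the role of EXPLAIN here. The users table, its columns and the index name idx_users_email are assumptions made up for the example; other databases print their plans in different formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

# Select only the column that is needed instead of SELECT *.
query = "SELECT name FROM users WHERE email = ?"
params = ("user42@example.com",)


def plan(sql, args):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query, params)  # full table scan, no usable index yet
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query, params)   # lookup through idx_users_email

result = conn.execute(query, params).fetchone()
print(plan_before)
print(plan_after)
print(result)
conn.close()
```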
Q4. Explain SCD and MERGE in BigQuery
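SCD stands for Slowly Changing Dimension, and MERGE is a SQL statement that inserts, updates, or deletes rows in a BigQuery table.
SCD is a technique for tracking changes to dimension data over time in a data warehouse, for example by overwriting old values (Type 1) or by keeping a history of versions (Type 2)
MERGE in BigQuery combines insert, update, and delete operations in a single statement, which makes it a common way to apply SCD changes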
Example: MERGE INTO target_table USING source_table ON condition WHEN MATCHED THEN UPDATE SET col1 = value1 WHEN NOT MATCHED THEN INSERT (col1, col2) VALUES (value1, value2)
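The logic that a MERGE would carry out for a Type 2 dimension can be sketched in plain Python. This is an in-memory illustration, not BigQuery code; the function apply_scd2 and the columns id, city, valid_from, valid_to and is_current are hypothetical names chosen for the example.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Return a new dimension list after applying SCD Type 2 changes.

    dimension: list of dicts with keys id, city, valid_from, valid_to, is_current
    incoming:  dict mapping id -> latest city from the source
    """
    result = [dict(row) for row in dimension]  # copy so the input is not mutated
    current = {row["id"]: row for row in result if row["is_current"]}

    for key, city in incoming.items():
        existing = current.get(key)
        if existing is not None and existing["city"] == city:
            continue  # unchanged row, nothing to do
        if existing is not None:
            # Matched and changed: close the old version.
            existing["valid_to"] = today
            existing["is_current"] = False
        # Not matched, or changed: insert a new current version.
        result.append(
            {"id": key, "city": city, "valid_from": today,
             "valid_to": None, "is_current": True}
        )
    return result


dim = [{"id": 1, "city": "Pune", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
updated = apply_scd2(dim, {1: "Mumbai", 2: "Delhi"}, date(2024, 1, 1))
for row in updated:
    print(row)
```

After the call, the Pune row is closed with a valid_to date, and new current rows exist for Mumbai (id 1) and Delhi (id 2).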
Q5. Dataflow function to split a sentence
A sentence can be split into words in Dataflow with an Apache Beam transform.
Use a splitting transform, such as FlatMap with a split function or Regex.split, to break the sentence into words
Apply a ParDo function to process each word individually
Use regular expressions to handle punctuation and special characters
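The splitting step itself can be sketched as a plain Python function. The name split_sentence and the word pattern are assumptions for illustration; in a pipeline written with the Beam Python SDK, a function like this could be passed to beam.FlatMap so that each word becomes its own element.

```python
import re
from typing import List

# Words are runs of letters, digits and apostrophes; punctuation is dropped.
WORD_PATTERN = re.compile(r"[A-Za-z0-9']+")


def split_sentence(sentence: str) -> List[str]:
    """Split a sentence into words, ignoring punctuation and special characters."""
    return WORD_PATTERN.findall(sentence)


words = split_sentence("Hello, world! It's a test.")
print(words)  # ['Hello', 'world', "It's", 'a', 'test']
```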
Q6. Architecture of BigQuery
BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data.
BigQuery separates storage from compute and stores data in a columnar format, so a query reads only the columns it references.
It supports standard SQL for querying data.
BigQuery allows for real-time data streaming for analysis.
It integrates with various data sources like Google Cloud Storage, Google Sheets, etc.
BigQuery provides automatic scaling and high availability.
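Why columnar storage helps analytical queries can be shown with a toy comparison. This is a simplified sketch in plain Python, not how BigQuery is implemented; the sample records and field names are made up.

```python
# Row-oriented layout: every record carries all of its fields.
rows = [
    {"user": "a", "country": "IN", "amount": 10},
    {"user": "b", "country": "US", "amount": 25},
    {"user": "c", "country": "IN", "amount": 5},
]

# Column-oriented layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate such as SUM(amount) touches only the "amount" column.
total = sum(columns["amount"])
values_read_columnar = len(columns["amount"])

# A row store would have to read every field of every record.
values_read_row = sum(len(row) for row in rows)

print(total)                 # 40
print(values_read_columnar)  # 3
print(values_read_row)       # 9
```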