Tech Mahindra
10+ Interview Questions and Answers
Q1. how to remove duplicate rows from bigquery? find the month of a given date in bigquery.
Q2. what operator is used in composer to move data from gcs to bq
Q3. write a code for this - input = [1,2,3,4] output = [1,4,9,16]
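One possible answer, sketched in Python on the assumption that the expected transformation is squaring each element. The function name square_all is hypothetical and not part of the original question.

```python
def square_all(numbers):
    """Return a new list with each input number squared."""
    return [n * n for n in numbers]


result = square_all([1, 2, 3, 4])
print(result)  # [1, 4, 9, 16]
```

A list comprehension keeps this to one line; list(map(lambda n: n * n, numbers)) would be an equivalent alternative.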
Q4. Architecture of BigQuery. Query optimization techniques in BigQuery.
Q5. Transformations in PySpark: rank, dense rank
rank and dense_rank are window functions in PySpark that assign a rank to each row based on the ordering of a specific column.
rank gives tied rows the same rank and leaves gaps after the ties (for example 1, 2, 2, 4).
dense_rank also gives tied rows the same rank but leaves no gaps between ranks (for example 1, 2, 2, 3).
Both are applied with .over() on a Window specification that names the column to order by.
Example: df.select('name', 'score', rank().over(Window.orderBy('score')).a...
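To make the difference in tie handling concrete, here is a minimal plain-Python sketch of the two ranking rules. It does not use PySpark. The helper names rank_values and dense_rank_values and the sample scores are hypothetical, chosen only to mirror what rank() and dense_rank() would produce over a window ordered by score.

```python
def rank_values(values):
    """Ties share a rank; the next rank skips ahead (gaps appear)."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        # Remember only the first position at which each value appears.
        first_position.setdefault(value, position)
    return [first_position[v] for v in values]


def dense_rank_values(values):
    """Ties share a rank; ranks stay consecutive (no gaps)."""
    position = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    return [position[v] for v in values]


scores = [50, 70, 70, 90]
print(rank_values(scores))        # [1, 2, 2, 4]
print(dense_rank_values(scores))  # [1, 2, 2, 3]
```

The two outputs differ only after the tie at 70: rank jumps to 4, while dense rank continues with 3.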
Q6. Difference between Bigtable and BigQuery.
Q7. RDD vs DataFrame vs Dataset in PySpark
Q8. What is cloud? What is PySpark?
Cloud is a network of remote servers hosted on the internet to store, manage, and process data.
Cloud computing allows users to access data and applications from any device with an internet connection.
It provides scalability, flexibility, and cost-effectiveness for businesses.
Examples of cloud services include Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform.
PySpark is the Python API for Apache Spark, a distributed data processing engine.
Q9. How does BigQuery work?
BigQuery is a fully managed, serverless, cloud-based data warehouse from Google Cloud for running fast SQL queries on large datasets.
Because it is fully managed and serverless, users do not have to provision or maintain infrastructure.
BigQuery can handle petabytes of data and supports near-real-time analytics through its streaming capabilities.
It supports standard SQL queries and integrates with other Goo...
Q10. Dataflow vs Dataproc.
Q11. Difference between DELETE and TRUNCATE
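DELETE removes the rows that match a WHERE condition (or every row if no condition is given), while TRUNCATE always removes all rows from a table.
DELETE is a DML command, while TRUNCATE is a DDL command.
DELETE can be rolled back inside a transaction. Whether TRUNCATE can be rolled back depends on the database: it cannot in Oracle, but it can inside a transaction in PostgreSQL and SQL Server.
DELETE fires row-level delete triggers for each row removed, while TRUNCATE does not fire them.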
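The rollback behaviour of DELETE can be tried out with Python's built-in sqlite3 module. This is only a sketch of the DELETE side: SQLite has no TRUNCATE statement, so the comparison itself is not reproduced here. The scores table and its rows are hypothetical sample data.

```python
import sqlite3

# In-memory database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 10), ("b", 20), ("c", 30)],
)
conn.commit()

# DELETE with a WHERE clause removes only the matching rows.
conn.execute("DELETE FROM scores WHERE score < 25")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]

# The delete has not been committed yet, so it can be rolled back.
conn.rollback()
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]

print(rows_after_delete, rows_after_rollback)  # 1 3
```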