Tech Mahindra
I was interviewed in Dec 2024.
I applied via Recruitment Consultant and was interviewed in Nov 2024. There was 1 interview round.
Bigtable is a NoSQL database for low-latency, high-throughput workloads, while BigQuery is a fully managed data warehouse for running SQL queries.
Bigtable is a wide-column NoSQL database designed for low-latency reads and writes at high throughput, while BigQuery is a fully managed, serverless data warehouse for analytical SQL queries.
Bigtable is used for storing large amounts of semi-structured data, while BigQuery is used for analyzing structured data using SQL queries.
...
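To return rows without duplicates in BigQuery, use the DISTINCT keyword. To find the month of a given date, use the EXTRACT function.
To remove duplicate rows from a query result, use SELECT DISTINCT * FROM table_name; this does not change the stored table, so to delete the duplicates write the result back, for example with CREATE OR REPLACE TABLE table_name AS SELECT DISTINCT * FROM table_name;
To find the month of a given date, use SELECT EXTRACT(MONTH FROM date_column) AS month_number FROM table_name; the result is an integer from 1 to 12, not a month name.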
Make sure to replace 'table_name' and 'date_column' with the appropriate values in your query.
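The substitution step can be sketched as small Python helpers that render the queries for a given table and column. This is a minimal sketch, not an official client API: the helper names, `my_dataset.orders`, and `order_date` are hypothetical, and the alias `month_number` is chosen here because EXTRACT(MONTH ...) returns an integer.

```python
def dedup_query(table_name: str) -> str:
    """Return a query that selects each distinct row of the table once."""
    return f"SELECT DISTINCT * FROM `{table_name}`"


def dedup_in_place_query(table_name: str) -> str:
    """Return a statement that overwrites the table with its distinct rows."""
    return f"CREATE OR REPLACE TABLE `{table_name}` AS {dedup_query(table_name)}"


def month_query(table_name: str, date_column: str) -> str:
    """Return a query that extracts the month number from a date column."""
    return (
        f"SELECT EXTRACT(MONTH FROM {date_column}) AS month_number "
        f"FROM `{table_name}`"
    )


# Hypothetical table and column names, used only for illustration.
print(dedup_in_place_query("my_dataset.orders"))
print(month_query("my_dataset.orders", "order_date"))
```

The helpers interpolate identifiers directly into the SQL text, so they are only suitable for trusted, hard-coded names. For a partitioned table, the overwrite statement may also need the original partitioning clause repeated.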
The operator used in Composer to move data from GCS to BigQuery is GCSToBigQueryOperator.
GCSToBigQueryOperator is part of the Google provider package for Apache Airflow, which is the underlying technology of Composer.
This operator loads files from Google Cloud Storage (GCS) into a BigQuery table.
You specify the source with the bucket and source_objects parameters and the destination with the destination_project_dataset_table parameter.
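A minimal sketch of such a load task, using Airflow's GCSToBigQueryOperator from the Google provider package. The bucket, object path, table name, and task id are placeholders, and the operator is imported inside the function only so that the sketch can be read and imported without Airflow installed; a real DAG file would normally import it at the top.

```python
# Hypothetical values: bucket, object path, and table name are placeholders.
LOAD_ARGS = {
    "task_id": "gcs_to_bq_load",
    "bucket": "example-bucket",
    "source_objects": ["exports/sales.csv"],
    "destination_project_dataset_table": "example-project.example_dataset.sales",
    "source_format": "CSV",
    "skip_leading_rows": 1,  # skip the CSV header row
    "autodetect": True,  # let BigQuery infer the schema
    "write_disposition": "WRITE_TRUNCATE",  # overwrite the table on each run
}


def build_load_task(dag):
    """Create the GCS-to-BigQuery load task and attach it to the given DAG."""
    from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
        GCSToBigQueryOperator,
    )

    return GCSToBigQueryOperator(dag=dag, **LOAD_ARGS)
```

Keeping the arguments in one dictionary makes the source and destination settings easy to review separately from the DAG wiring.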
Code to square each element in the input array.
Iterate through the input array and square each element.
Store the squared values in a new array to get the desired output.
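The two steps above can be sketched in Python as follows; the function name `square_elements` is chosen here for illustration, since the original question does not name one.

```python
def square_elements(values):
    """Return a new list holding the square of each element of the input."""
    return [v * v for v in values]


print(square_elements([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The list comprehension builds a new list, so the input array is left unchanged.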
Dataflow is a fully managed stream and batch processing service, while Dataproc is a managed Apache Spark and Hadoop service.
Dataflow is a serverless data processing service that automatically scales to handle your data, while Dataproc is a managed Spark and Hadoop service that requires you to provision and manage clusters.
Dataflow is designed for both batch and stream processing, allowing you to process data in real-t...
BigQuery architecture includes storage, execution, and optimization components for efficient query processing.
BigQuery stores data in Capacitor, its columnar storage format, on the Colossus distributed file system.
Query execution is distributed across many workers by the Dremel engine for parallel processing.
Query optimization techniques include partitioning tables, clustering tables, and reusing cached query results.
When a query filters on the partitioning column, a partitioned table lets BigQuery skip the partitions it does not need, which reduces the data scanned.
...
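The effect of partitioning on scanned data can be illustrated with a toy model in plain Python. This is only a conceptual sketch, not how BigQuery is implemented: the table contents, the `total_amount` function, and the date range are all made up for the example.

```python
from datetime import date

# Toy table: each partition key (a date) maps to the rows stored under it.
TABLE = {
    date(2024, 11, 1): [{"user": "a", "amount": 10}, {"user": "b", "amount": 5}],
    date(2024, 11, 2): [{"user": "a", "amount": 7}],
    date(2024, 11, 3): [{"user": "c", "amount": 3}, {"user": "b", "amount": 8}],
}


def total_amount(table, start, end):
    """Sum `amount` over partitions dated within [start, end].

    Returns (total, partitions_scanned) so the pruning effect is visible.
    """
    total = 0
    scanned = 0
    for day, rows in table.items():
        if not (start <= day <= end):
            continue  # partition pruned: its rows are never read
        scanned += 1
        total += sum(row["amount"] for row in rows)
    return total, scanned


total, scanned = total_amount(TABLE, date(2024, 11, 2), date(2024, 11, 3))
print(total, scanned)  # 18 2
```

Only two of the three partitions are read because the filter is on the partition key; a filter on any other field would have to read every partition.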
RDD vs DataFrame vs Dataset in PySpark
RDD (Resilient Distributed Dataset) is the basic abstraction in Spark, representing a distributed collection of objects
DataFrame is a distributed collection of data organized into named columns, similar to a table in a relational database
Dataset is a distributed collection of data that uses custom classes for compile-time type safety; the typed Dataset API is available in Scala and Java only, so PySpark exposes RDDs and DataFrames but not Datasets
Dataframes and Data...
I applied via Naukri.com and was interviewed in Jun 2024. There was 1 interview round.
I applied via Naukri.com and was interviewed in Jun 2024. There were 2 interview rounds.
Questions related to Database and ETL Process
I am impressed by your innovative projects and collaborative work culture.
I admire your organization's commitment to cutting-edge technology and data-driven solutions.
I am excited about the opportunity to work with a team of talented professionals who share my passion for data engineering.
Your organization's reputation for fostering a collaborative and inclusive work environment is appealing to me.
I was interviewed in Apr 2024.
PySpark coding questions and SQL
BigQuery is a fully managed, serverless data warehouse by Google Cloud for analyzing large datasets using SQL queries.
BigQuery is a cloud-based data warehouse that allows for fast SQL queries on large datasets.
It is fully managed and serverless, meaning users do not have to worry about infrastructure management.
BigQuery can handle petabytes of data and allows for real-time analytics with its streaming capabilities.
It s...
Basic aptitude and reasoning online test
The duration of the Tech Mahindra Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete.
Software Engineer (26.4k salaries) | ₹2 L/yr - ₹9.2 L/yr |
Senior Software Engineer (21.4k salaries) | ₹5.5 L/yr - ₹23 L/yr |
Technical Lead (11.7k salaries) | ₹9.5 L/yr - ₹38 L/yr |
Associate Software Engineer (5.5k salaries) | ₹1.8 L/yr - ₹8.2 L/yr |
Team Lead (5k salaries) | ₹5.2 L/yr - ₹17 L/yr |
Infosys
Cognizant
Accenture
Wipro