Tech Mahindra
I applied via Naukri.com and was interviewed in Nov 2024. There were 2 interview rounds.
Developed a data pipeline to ingest, process, and analyze customer feedback data for a retail company.
Used Google Cloud Platform services like BigQuery, Dataflow, and Pub/Sub for data processing.
Implemented data cleansing and transformation techniques to ensure data quality.
Created visualizations and dashboards using tools like Data Studio for stakeholders to easily interpret the data.
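GCP Cloud Storage offers four storage classes that trade lower storage cost against higher access cost.
Standard Storage: for frequently accessed data, with no minimum storage duration
Nearline Storage: for data accessed about once a month or less, with a 30-day minimum storage duration
Coldline Storage: for data accessed about once a quarter or less, with a 90-day minimum storage duration
Archive Storage: for long-term retention of data accessed less than once a year, with a 365-day minimum storage duration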
SQL optimization techniques focus on improving query performance by reducing execution time and resource usage.
Use indexes to speed up data retrieval
Avoid using SELECT * and instead specify only the columns needed
Optimize joins by using appropriate join types and conditions
Limit the use of subqueries and instead use JOINs where possible
Use EXPLAIN to analyze query execution plans and identify bottlenecks
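A minimal sketch of the EXPLAIN and indexing points above, using Python's built-in sqlite3 module as a stand-in database. The orders table, its columns, and the index name are hypothetical, and the plan text is SQLite-specific; other engines report plans differently.

```python
import sqlite3

# In-memory SQLite database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

# Select only the column that is needed instead of SELECT *.
query = "SELECT amount FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # no index yet: SQLite scans the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # with the index: SQLite searches via the index

print(plan_before)
print(plan_after)
```

Comparing the two plans shows the query moving from a full table scan to an index search once the filter column is indexed.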
I applied via LinkedIn and was interviewed before Nov 2021. There were 3 interview rounds.
I applied via Naukri.com and was interviewed before Nov 2021. There were 2 interview rounds.
Google Cloud BigQuery is a fully-managed, serverless data warehouse that uses a distributed architecture for processing and analyzing large datasets.
BigQuery stores data in a columnar format called Capacitor on Colossus, Google's distributed file system.
It uses a distributed query engine called Dremel for executing SQL-like queries on large datasets.
BigQuery separates storage and compute, allowing users to scale compute resources ind...
List and tuple are both used to store collections of data, but they have some differences.
Lists are mutable while tuples are immutable
Lists use square brackets [] while tuples use parentheses ()
Lists are typically used for collections of homogeneous data while tuples are used for heterogeneous data
Lists have more built-in methods than tuples
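The differences above can be checked directly in Python. This is a small illustrative sketch; the variable names are arbitrary.

```python
# Lists are mutable: elements can be added or replaced in place.
items = [1, 2, 3]
items.append(4)
items[0] = 10

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2, 3)
try:
    point[0] = 10
    tuple_mutated = True
except TypeError:
    tuple_mutated = False

# Public (non-underscore) methods on each type.
list_methods = {m for m in dir(list) if not m.startswith("_")}
tuple_methods = {m for m in dir(tuple) if not m.startswith("_")}

print(items)                 # the list was changed in place
print(tuple_mutated)         # the tuple was not
print(sorted(tuple_methods))
```

Because a tuple cannot change, a tuple of hashable elements can also serve as a dictionary key, which a list cannot.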
Window functions in BigQuery perform calculations across a set of table rows related to the current row.
Unlike GROUP BY aggregation, they return a value for every input row instead of collapsing rows into one
They are used with the OVER() clause in SQL queries
Common window functions include ROW_NUMBER(), RANK(), and NTILE()
They can be used to calculate moving averages, cumulative sums, and more
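A hedged sketch of the OVER() clause, run against SQLite through Python's sqlite3 module rather than BigQuery (window functions need SQLite 3.25 or later). The sales table and its data are hypothetical; the same ROW_NUMBER() and SUM() OVER syntax is also valid in BigQuery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", 1, 100), ("east", 2, 50), ("east", 3, 70),
        ("west", 1, 30), ("west", 2, 90),
    ],
)

# ROW_NUMBER ranks rows by amount within each region;
# SUM ... ORDER BY day produces a cumulative sum within each region.
rows = conn.execute(
    """
    SELECT region, day, amount,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS position_in_region,
           SUM(amount)  OVER (PARTITION BY region ORDER BY day)         AS running_total
    FROM sales
    ORDER BY region, day
    """
).fetchall()

for row in rows:
    print(row)
```

Every input row appears in the output with its own rank and running total, which is the key difference from a GROUP BY aggregate.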
Types of NoSQL databases in GCP include Firestore, Bigtable, and Datastore.
Firestore is a flexible, scalable database for mobile, web, and server development.
Bigtable is a high-performance NoSQL database service for large analytical and operational workloads.
Datastore is a highly scalable NoSQL document database for web and mobile applications, now offered as Firestore in Datastore mode.
Write code to find the most frequently purchased product for each customer
Iterate through each customer's purchases
Keep track of the count of each product for each customer
Find the product with the maximum count for each customer
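One possible way to code the steps above in plain Python. The input shape (a list of customer/product pairs) and the function name are assumptions, since the original question does not specify them.

```python
from collections import Counter, defaultdict


def top_product_per_customer(purchases):
    """Map each customer to a (product, count) pair for their most-bought product."""
    counts = defaultdict(Counter)
    # Iterate through the purchases and count each product per customer.
    for customer, product in purchases:
        counts[customer][product] += 1
    # Pick the product with the maximum count for each customer.
    return {
        customer: counter.most_common(1)[0]
        for customer, counter in counts.items()
    }


# Hypothetical sample data.
purchases = [
    ("alice", "pen"), ("alice", "book"), ("alice", "pen"),
    ("bob", "bag"),
]
result = top_product_per_customer(purchases)
print(result)
```

When two products tie, `most_common` returns the one that was seen first, so a different tie-breaking rule would need an explicit sort key.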
Creating a DataFrame in Python with pandas
Use the pandas library to create a dataframe
Provide data in the form of a dictionary or list of lists
Specify column names if needed
I applied via Company Website and was interviewed before Oct 2023. There were 3 interview rounds.
The round required writing code for given scenarios
Reverse a given number
Convert the number to a string
Reverse the string
Convert the reversed string back to a number
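A short sketch of the string-based approach described above. Handling of negative numbers is an added assumption, since the question does not say how they should be treated.

```python
def reverse_number(n: int) -> int:
    """Reverse the decimal digits of n by going through its string form."""
    sign = -1 if n < 0 else 1          # assumption: keep the sign, reverse the digits
    reversed_digits = str(abs(n))[::-1]  # convert to string and reverse it
    return sign * int(reversed_digits)   # convert back to a number


print(reverse_number(1234))
print(reverse_number(1200))
```

Converting back with `int` drops leading zeros, so trailing zeros in the input are lost in the reversed result.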
Recursion function to calculate factorial of a number
Define a function that takes an integer as input
Base case: if input is 0, return 1
Recursive case: return input multiplied by factorial of input-1
Example: factorial(5) = 5 * factorial(4) = 5 * 4 * factorial(3) = ... = 5 * 4 * 3 * 2 * 1 = 120
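The base case and recursive case above translate to the following sketch. The check for negative input is an addition not mentioned in the original answer.

```python
def factorial(n: int) -> int:
    """Compute n! recursively."""
    if n < 0:
        # Added guard: without it, negative input would recurse until the stack overflows.
        raise ValueError("factorial is not defined for negative numbers")
    if n == 0:
        return 1                   # base case
    return n * factorial(n - 1)    # recursive case


print(factorial(5))
```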
I chose TCS for its reputation, global presence, diverse opportunities, and focus on innovation.
TCS is a renowned company with a strong reputation in the IT industry
TCS has a global presence with offices in multiple countries, providing opportunities for international exposure
TCS offers diverse opportunities for career growth and development in various domains
TCS is known for its focus on innovation and cutting-edge te...
I am currently on a 30-day notice period.
My notice period is 30 days.
I am required to work for 30 more days before leaving my current position.
The GCP services used in our project include BigQuery, Dataflow, Pub/Sub, and Cloud Storage.
BigQuery for data warehousing and analytics
Dataflow for batch and real-time (streaming) data processing
Pub/Sub for messaging and event ingestion
Cloud Storage for storing data and files
Cloud Functions are event-driven functions that run in response to cloud events.
Serverless functions that automatically scale based on demand
Can be triggered by events from various cloud services
Supports multiple programming languages like Node.js, Python, etc.
To trigger a job every hour in Airflow, set a cron expression as the DAG's schedule interval
Define a DAG (Directed Acyclic Graph) in Airflow
Set the schedule_interval parameter to '0 * * * *' to run the job at minute 0 of every hour (in Airflow 2.4 and later the parameter is named schedule)
Alternatively, use the '@hourly' preset, which is equivalent to '0 * * * *'
Use Python's slicing feature to display a string in reverse order.
Use string slicing with a step of -1 to reverse the string.
Example: 'hello'[::-1] will output 'olleh'.
Pub/Sub is a messaging service that allows communication between independent applications.
Pub/Sub is used for real-time messaging and event-driven systems.
It is commonly used for data ingestion, streaming analytics, and event-driven architectures.
Examples of Pub/Sub services include Google Cloud Pub/Sub, Apache Kafka, and Amazon SNS/SQS.
I applied via Walk-in and was interviewed in Mar 2022. There was 1 interview round.
Use SQL to find keys present in table A but not in table B (old copy of A).
Use a LEFT JOIN to combine tables A and B based on the key column
Filter the results where the key column in table B is NULL
This will give you the keys present in table A but not in table B
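The LEFT JOIN approach above can be tried with Python's built-in sqlite3 module. The table names a and b and the key column id are placeholders for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER PRIMARY KEY);  -- current table
    CREATE TABLE b (id INTEGER PRIMARY KEY);  -- old copy of a
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES (1), (2);
    """
)

# Rows of a with no match in b come back with b.id set to NULL.
new_keys = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.id
        FROM a
        LEFT JOIN b ON a.id = b.id
        WHERE b.id IS NULL
        ORDER BY a.id
        """
    )
]
print(new_keys)
```

The same result can be written with NOT EXISTS or EXCEPT; the LEFT JOIN form is shown because it is the one the answer describes.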
SQL query to retrieve the 4th highest salary from a salary table using various methods.
Use the 'DISTINCT' keyword to avoid duplicate salaries.
Utilize 'ORDER BY' to sort salaries in descending order.
Use 'LIMIT' with 'OFFSET' to skip the first three highest salaries.
Example SQL: 'SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 3;'
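To check the query above against sample data, here is a sketch using Python's sqlite3 module. The employees rows are made up, and the duplicate top salary is included on purpose to show why DISTINCT matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("a", 900), ("b", 900), ("c", 800), ("d", 700), ("e", 600), ("f", 500)],
)

# Skip the three highest distinct salaries, then take the next one.
row = conn.execute(
    "SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 3"
).fetchone()
fourth_highest = row[0] if row else None  # None when fewer than 4 distinct salaries

print(fourth_highest)
```

Without DISTINCT, the duplicated 900 would occupy two positions and the query would return 700 instead of the fourth distinct salary.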
Use GCP Dataflow to transfer files between GCS buckets
Create a Dataflow pipeline using Apache Beam to read from source bucket and write to destination bucket
Use Beam's built-in Cloud Storage file I/O (gs:// paths) to read and write files in the Dataflow pipeline
Set up appropriate permissions for Dataflow service account to access both buckets
Explaining project flow and ownership to a new IT joiner involves outlining roles, responsibilities, and collaboration.
1. Project Initiation: Discuss how projects start with requirements gathering and stakeholder meetings.
2. Role Assignment: Explain how tasks are assigned based on team members' strengths and expertise.
3. Collaboration Tools: Introduce tools like JIRA or Trello for tracking progress and ownership.
4. Reg...
Implemented a data pipeline using Airflow for ETL processes, enhancing workflow management and scheduling.
Airflow's DAG (Directed Acyclic Graph) structure allows for clear visualization of task dependencies.
It supports dynamic pipeline generation, enabling flexibility in defining workflows based on external parameters.
Airflow has a rich user interface for monitoring and managing workflows, making it easier to troublesh...
Cloud Composer is another orchestration tool in GCP
Cloud Composer is a fully managed workflow orchestration service built on Apache Airflow
It allows you to author, schedule, and monitor workflows that span across GCP services
Cloud Composer provides a rich set of features like DAGs, plugins, and monitoring capabilities
It integrates seamlessly with other GCP services like BigQuery, Dataflow, and Dataproc
I applied via LinkedIn and was interviewed in Oct 2024. There were 2 interview rounds.
I have experience working on projects involving data processing, transformation, and analysis using GCP services like BigQuery, Dataflow, and Dataproc.
Utilized BigQuery for storing and querying large datasets
Implemented data pipelines using Dataflow for real-time data processing
Utilized Dataproc for running Apache Spark and Hadoop clusters for data processing
Worked on data ingestion and transformation using Cloud Stora...