I applied via Naukri.com and was interviewed in Jun 2021. There were 4 interview rounds.
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
I applied via AmbitionBox and was interviewed in Nov 2024. There were 4 interview rounds.
I utilize tools such as Excel, Python, SQL, and Tableau for data analysis.
Excel for basic data manipulation and visualization
Python for advanced data analysis and machine learning
SQL for querying databases
Tableau for creating interactive visualizations
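As a small illustration of the SQL step mentioned above, here is a sketch using Python's built-in sqlite3 module; the table name, columns, and data are invented for the example:

```python
import sqlite3

# Hypothetical sales data; the schema is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0)],
)

# SQL for querying: total sales per region, a typical analysis step.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('North', 170.0), ('South', 80.0)]
```

The same aggregation could then be handed to Excel or Tableau for visualization.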
A coding test focused on data analysis problems.
A written paper of coding and logical reasoning questions.
posted on 28 Aug 2024
I have experience working on projects involving data pipeline development, ETL processes, and data warehousing.
Developed ETL processes to extract, transform, and load data from various sources into a data warehouse
Built data pipelines to automate the flow of data between systems and ensure data quality and consistency
Optimized database performance and implemented data modeling best practices
Worked on real-time data pro...
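The extract-transform-load flow described above can be sketched in plain Python; the records, the validation rule, and the in-memory "warehouse" are assumptions for illustration, not the actual pipeline:

```python
# Minimal ETL sketch: extract raw records, clean and validate them,
# then load the survivors into a target store keyed by id.

def extract():
    # Extract: pull raw records from a source (here, an in-memory list).
    return [
        {"id": 1, "value": " 10 "},
        {"id": 2, "value": "x"},   # malformed on purpose
        {"id": 3, "value": "7"},
    ]

def transform(records):
    # Transform: parse values and drop rows that fail validation,
    # enforcing the data-quality step mentioned above.
    cleaned = []
    for r in records:
        try:
            cleaned.append({"id": r["id"], "value": int(r["value"].strip())})
        except ValueError:
            continue  # reject malformed rows
    return cleaned

def load(records, warehouse):
    # Load: upsert into the target keyed by id, keeping loads idempotent.
    for r in records:
        warehouse[r["id"]] = r["value"]

warehouse = {}
load(transform(extract()), warehouse)
print(warehouse)  # {1: 10, 3: 7}
```

In a real pipeline the same three stages would read from source systems and write to a warehouse table instead of a dict.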
posted on 25 Sep 2024
I applied via Walk-in and was interviewed in Aug 2024. There were 5 interview rounds.
Maths, grammar, and communication.
Why do you like this job opportunity?
I applied via LinkedIn and was interviewed in Jul 2024. There were 2 interview rounds.
It was a pair programming round where we needed to work through a couple of Spark scenarios along with the interviewer. You are given boilerplate code with some functionality to be filled in, and you are assessed on writing clean, extensible code and test cases.
I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.
Incremental load in pyspark refers to loading only new or updated data into a dataset without reloading the entire dataset.
Use Delta Lake's MERGE (upsert) operation in PySpark to apply only new or changed records; the 'mergeSchema' write option handles schema evolution during the load.
Utilize the 'partitionBy' function to optimize incremental loads by partitioning the data based on specific columns.
Implement a logic to identify new or updated records based on timestamps or uni...
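The timestamp-based logic above can be sketched in pure Python (not actual PySpark; the record layout and the watermark value are assumptions for illustration):

```python
# Incremental load: keep only records newer than the last processed
# timestamp (the "watermark"), instead of reloading the entire dataset.
records = [
    {"id": 1, "updated_at": "2024-10-01"},
    {"id": 2, "updated_at": "2024-10-05"},
    {"id": 3, "updated_at": "2024-10-09"},
]
last_watermark = "2024-10-03"  # assumed checkpoint from the previous run

# ISO date strings compare correctly as plain strings.
incremental = [r for r in records if r["updated_at"] > last_watermark]

# Advance the watermark so the next run skips what was just loaded.
new_watermark = max(r["updated_at"] for r in incremental)
print([r["id"] for r in incremental], new_watermark)  # [2, 3] 2024-10-09
```

In PySpark the filter would be a `WHERE updated_at > last_watermark` predicate pushed down to the source.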
Distribution in Spark refers to how data is divided across different nodes in a cluster for parallel processing.
Data is partitioned across multiple nodes in a cluster to enable parallel processing
Distribution can be controlled using partitioning techniques like hash partitioning or range partitioning
Ensures efficient utilization of resources and faster processing times
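Hash partitioning, one of the techniques named above, can be sketched in plain Python; the partition count and keys are invented, and Spark performs this routing internally across cluster nodes:

```python
import hashlib

# Hash partitioning: route each record to a partition by hashing its key,
# so records with the same key always land on the same partition.
NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    # Python's built-in hash() is salted per process, so use a stable
    # digest to get deterministic placement across runs.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

partitions = {i: [] for i in range(NUM_PARTITIONS)}
for key in ["user_a", "user_b", "user_a", "user_c"]:
    partitions[partition_for(key)].append(key)
```

Co-locating equal keys this way is what lets Spark run joins and aggregations per partition in parallel.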
AWS Glue can process petabytes of data per hour
AWS Glue can process petabytes of data per hour, depending on the configuration and resources allocated
It is designed to scale horizontally to handle large volumes of data efficiently
AWS Glue can be used for ETL (Extract, Transform, Load) processes on massive datasets
Distribution in Spark refers to how data is divided across different nodes in a cluster for parallel processing.
Distribution in Spark determines how data is partitioned across different nodes in a cluster
It helps in achieving parallel processing by distributing the workload
Examples of distribution methods in Spark include hash partitioning and range partitioning
AWS Glue can process petabytes of data per hour.
AWS Glue can process petabytes of data per hour, making it suitable for large-scale data processing tasks.
It can handle various types of data sources, including structured and semi-structured data.
AWS Glue offers serverless ETL (Extract, Transform, Load) capabilities, allowing for scalable and cost-effective data processing.
It integrates seamlessly with other AWS services...
Spark is a fast and general-purpose cluster computing system, while PySpark is the Python API for Spark.
Spark is a distributed computing system that provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
PySpark is the Python API for Spark that allows developers to write Spark applications using Python.
Spark and PySpark are commonly used for big data processing, machine...
| Role | Salaries reported | Range |
|---|---|---|
| Senior Engineer | 884 | ₹6.2 L/yr - ₹22.9 L/yr |
| Senior Software Engineer | 562 | ₹6.8 L/yr - ₹25.9 L/yr |
| Software Engineer | 259 | ₹3.5 L/yr - ₹14 L/yr |
| Technical Specialist | 207 | ₹10 L/yr - ₹38.5 L/yr |
| Software Development Engineer | 188 | ₹4 L/yr - ₹12 L/yr |