Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads.
Delta Lake also provides scalable metadata handling and unifies streaming and batch data processing.
It stores data in Parquet format and uses a transaction log to keep track of all the changes made to the data.
Delta Lake's architecture includes a storage layer (Parquet data files), a transaction log, and a metadata layer.
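The log-plus-Parquet design is easy to see in code. A minimal sketch, assuming a Spark session with the delta-spark package installed; the /tmp/delta/events path is a placeholder:

```python
from pyspark.sql import SparkSession

# Standard delta-spark session configuration.
spark = (
    SparkSession.builder.appName("delta-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Writing creates Parquet data files plus a _delta_log directory of JSON
# commit files -- the transaction log that records every change to the table.
spark.range(5).write.format("delta").mode("overwrite").save("/tmp/delta/events")
spark.read.format("delta").load("/tmp/delta/events").show()
```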
Activities in ADF refer to the tasks or operations that can be performed in Azure Data Factory.
Activities can include data movement, data transformation, data processing, and data orchestration.
Examples of activities in ADF are Copy Data activity, Execute Pipeline activity, Lookup activity, and Web activity.
Activities can be chained together in pipelines to create end-to-end data workflows.
Each activity in ADF has properties that define its name, type, and behavior within the pipeline.
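To make the chaining concrete, here is a sketch of the JSON document ADF uses to author a pipeline, written out as a Python dict. All pipeline, activity, and dataset names below are placeholders, not a definitive definition:

```python
# Sketch of an ADF pipeline: a Copy activity followed by a Web activity
# that runs only when the copy succeeds. Names are hypothetical.
pipeline = {
    "name": "DailyIngestPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopySalesData",
                "type": "Copy",
                "inputs": [{"referenceName": "BlobSalesDataset",
                            "type": "DatasetReference"}],
                "outputs": [{"referenceName": "SqlSalesDataset",
                             "type": "DatasetReference"}],
                "typeProperties": {"source": {"type": "BlobSource"},
                                   "sink": {"type": "SqlSink"}},
            },
            {
                "name": "NotifyOnSuccess",
                "type": "WebActivity",
                # dependsOn is what chains activities into an end-to-end workflow.
                "dependsOn": [{"activity": "CopySalesData",
                               "dependencyConditions": ["Succeeded"]}],
                "typeProperties": {"url": "https://example.com/notify",
                                   "method": "POST"},
            },
        ]
    },
}
```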
Mounting process in Databricks allows users to access external data sources within the Databricks environment.
Mounting allows users to access external data sources like Azure Blob Storage, AWS S3, etc.
Users can mount a storage account to a Databricks File System (DBFS) path using the Databricks UI or CLI.
Mounted data can be accessed like regular DBFS paths in Databricks notebooks and jobs.
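A minimal sketch of mounting Azure Blob Storage from a Databricks notebook (where `dbutils` and `spark` are predefined); the account, container, and secret-scope names are placeholders:

```python
# Mount a Blob Storage container at a DBFS path (placeholder names).
dbutils.fs.mount(
    source="wasbs://mycontainer@mystorageaccount.blob.core.windows.net",
    mount_point="/mnt/mydata",
    extra_configs={
        "fs.azure.account.key.mystorageaccount.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-key")
    },
)

# Once mounted, the path behaves like any other DBFS path.
df = spark.read.csv("/mnt/mydata/input.csv", header=True)
```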
Blob storage is for unstructured data, while Data Lake is for structured and unstructured data with metadata.
Blob storage is optimized for storing large amounts of unstructured data like images, videos, and backups.
Data Lake is designed to store structured and unstructured data with additional metadata for easier organization and analysis.
Blob storage is typically used for simple object-storage needs, while Data Lake is used for large-scale analytics and data-processing workloads.
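The difference also shows up in how Spark addresses the two services: Blob Storage uses the wasbs:// scheme, while ADLS Gen2 (with hierarchical namespace) uses abfss://. A sketch, assuming `spark` is a session already configured with credentials for both placeholder accounts:

```python
# Placeholder account and container names throughout.
blob_df = spark.read.parquet(
    "wasbs://logs@mystorageaccount.blob.core.windows.net/raw/"   # Blob Storage
)
adls_df = spark.read.parquet(
    "abfss://analytics@mydatalake.dfs.core.windows.net/curated/"  # ADLS Gen2
)
```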
I applied via Approached by Company and was interviewed in Dec 2024. There were 2 interview rounds.
I applied via Naukri.com and was interviewed in Sep 2024. There were 4 interview rounds.
Basic aptitude questions
Data structure and algorithms
I applied via Naukri.com and was interviewed in Mar 2024. There were 3 interview rounds.
Error handling in PySpark involves using try-except blocks and logging to handle exceptions and errors.
Use try-except blocks to catch and handle exceptions in PySpark code.
Utilize logging to record errors and exceptions for debugging purposes.
Consider the .option('mode', 'PERMISSIVE') reader option to handle corrupt records during data loading.
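A minimal sketch combining the three patterns above; the /data/events.json input path is hypothetical:

```python
import logging

from pyspark.sql import SparkSession

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl")

spark = SparkSession.builder.appName("error-handling-demo").getOrCreate()

try:
    # PERMISSIVE mode keeps malformed rows instead of failing the whole read;
    # malformed source lines are captured in the column named below.
    df = (
        spark.read
        .option("mode", "PERMISSIVE")
        .option("columnNameOfCorruptRecord", "_corrupt_record")
        .json("/data/events.json")
    )
    logger.info("Loaded %d rows", df.count())  # actions surface exceptions here
except Exception as exc:
    logger.error("Job failed: %s", exc)
    raise
```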
I applied via LinkedIn and was interviewed in Mar 2024. There were 2 interview rounds.
Coding questions on SQL, Python, and Spark
Implement a function to pair elements of an array based on a given sum.
Iterate through the array once; for each element, check whether its complement (the target sum minus the element) has already been seen.
Use a hash set to store elements already visited, so each lookup is O(1) and duplicate pairs are avoided.
Return an array of arrays containing the pairs that sum up to the given value.
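A minimal sketch of the hash-set approach described above:

```python
def find_pairs(nums, target):
    """Return the unique pairs of values in nums that add up to target."""
    seen = set()   # values visited so far
    pairs = set()  # (low, high) tuples, so duplicate pairs collapse
    for n in nums:
        complement = target - n
        if complement in seen:
            pairs.add((min(n, complement), max(n, complement)))
        seen.add(n)
    return [list(p) for p in pairs]

print(find_pairs([1, 5, 7, -1, 5], 6))  # [[1, 5], [-1, 7]] in some order
```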
I applied via Approached by Company and was interviewed in Aug 2023. There were 2 interview rounds.
7 SQL questions of easy to medium difficulty, 2 Python questions, and 1 PySpark question
I applied via Referral and was interviewed before Nov 2022. There were 3 interview rounds.
SQL coding and SSIS technical skills
I applied via Approached by Company and was interviewed before Apr 2022. There were 4 interview rounds.
Role | Salaries reported | Salary range
Senior Analyst | 925 | ₹2 L/yr - ₹6.2 L/yr
Analyst | 795 | ₹1.5 L/yr - ₹5.7 L/yr
Operations Analyst | 317 | ₹2 L/yr - ₹5 L/yr
Process Analyst | 222 | ₹1.2 L/yr - ₹4.5 L/yr
Specialist | 191 | ₹3.1 L/yr - ₹7 L/yr
Wells Fargo
American Express
UBS
State Street Corporation