Wavicle Data Solutions
I applied via LinkedIn and was interviewed before May 2023. There were 2 interview rounds.
I applied via Naukri.com and was interviewed in Apr 2021. There were 4 interview rounds.
I applied via Naukri.com and was interviewed in Jul 2020. There were 3 interview rounds.
I appeared for an interview in Nov 2024, where I was asked the following questions.
Apache Spark is a distributed computing framework for big data processing; its optimizations include in-memory caching, lazy evaluation, and the Catalyst query optimizer.
Spark's architecture consists of a driver program, cluster manager, and worker nodes.
The driver program coordinates the execution of tasks and maintains the SparkContext.
Cluster managers like YARN or Mesos allocate resources across the cluster.
Worker nodes execute tasks and store data in memory for f...
Databricks is a unified data analytics platform whose components include the Databricks Workspace, the Databricks Runtime, and Databricks Delta (now Delta Lake).
Databricks Workspace: Collaborative environment for data science and engineering teams.
Databricks Runtime: The software that runs on Databricks clusters, including an optimized build of Apache Spark.
Databricks Delta (now Delta Lake): A storage layer that adds ACID transactions and schema enforcement to data lakes.
To read a JSON file, use a programming language's built-in functions or libraries to parse the file and extract the data.
Use a programming language like Python, Java, or JavaScript to read the JSON file.
Import libraries like json in Python or json-simple in Java to parse the JSON data.
Use functions like json.load() in Python to load the JSON file and convert it into a dictionary or object.
Access the data in the JSON fi...
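The steps above can be sketched in Python with the standard-library json module. This is a minimal illustration, not a definitive approach: the file name employees.json and its fields are hypothetical, and the example writes its own sample file so that it runs standalone.

```python
import json
import os
import tempfile

# Hypothetical sample data; the file name and field names are assumptions.
sample = {
    "employees": [
        {"name": "Asha", "salary": 50000},
        {"name": "Ravi", "salary": 65000},
    ]
}


def read_json(path):
    """Parse a JSON file and return the resulting Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Write the sample to a temporary file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
json_path = os.path.join(tmp_dir, "employees.json")
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(sample, f)

# json.load() turns the top-level JSON object into a dict.
data = read_json(json_path)

# Access nested values with ordinary dict and list operations.
names = [employee["name"] for employee in data["employees"]]
print(names)
```

For a distributed dataset, Spark's own JSON reader would be the more likely choice; the snippet above only covers the single-file, single-machine case.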
To find the second highest salary in SQL, use the MAX function with a subquery, or ORDER BY with the LIMIT clause.
Use a subquery to find the highest salary first, then take the MAX of the salaries that are lower than it.
Alternatively, sort the salaries in descending order and use LIMIT with an OFFSET of 1 to select the second highest salary directly.
Handle ties for the highest salary, for example by selecting DISTINCT salaries, so that a repeated top value is not returned as the second highest.
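Both approaches can be tried against an in-memory SQLite database from Python. This is a sketch under assumptions: the employees table, its columns, and the sample rows are hypothetical, and the LIMIT ... OFFSET syntax shown works in SQLite, MySQL, and PostgreSQL but not in every SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
# Hypothetical rows; note the tie for the highest salary.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("A", 90000), ("B", 90000), ("C", 75000), ("D", 60000)],
)

# Approach 1: MAX over the salaries strictly below the overall maximum.
max_subquery = """
    SELECT MAX(salary) FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
"""
second_via_max = conn.execute(max_subquery).fetchone()[0]

# Approach 2: DISTINCT removes ties, then skip the top value with OFFSET 1.
limit_query = """
    SELECT DISTINCT salary FROM employees
    ORDER BY salary DESC
    LIMIT 1 OFFSET 1
"""
row = conn.execute(limit_query).fetchone()
second_via_limit = row[0] if row else None  # None when only one distinct salary exists

print(second_via_max, second_via_limit)
```

Without DISTINCT, the second query would return 90000 for this data, because the tied top salary occupies the first two rows.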
Spark cluster configuration involves setting up memory, cores, and other parameters for optimal performance.
Specify the number of executors and executor memory
Set the number of cores per executor
Adjust the driver memory based on the application requirements
Configure shuffle partitions for efficient data processing
Enable dynamic allocation for better resource utilization
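The settings listed above map onto standard Spark configuration properties. The sketch below collects them in a dictionary and builds a spark-submit argument list from it; all values, the helper build_submit_args, and the application file job.py are hypothetical, and suitable sizes depend on the cluster and workload.

```python
# Hypothetical values for illustration only.
spark_conf = {
    "spark.executor.instances": "4",            # number of executors
    "spark.executor.memory": "8g",              # memory per executor
    "spark.executor.cores": "4",                # cores per executor
    "spark.driver.memory": "4g",                # driver memory
    "spark.sql.shuffle.partitions": "200",      # partitions used for shuffles
    "spark.dynamicAllocation.enabled": "true",  # scale executors with load
}


def build_submit_args(conf, app_path):
    """Return a spark-submit argument list with one --conf flag per setting."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


# job.py is a placeholder for the application to submit.
submit_args = build_submit_args(spark_conf, "job.py")
print(" ".join(submit_args))
```

Dynamic allocation usually needs further settings, such as an external shuffle service or shuffle tracking, which are left out of this sketch.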
I was approached by the company and was interviewed in Oct 2023. There were 2 interview rounds.
Coding questions of medium difficulty.
Use a command-line tool such as cat in the terminal to concatenate multiple CSV files into a single file.
Navigate to the directory where the CSV files are located
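Run the command 'cat file1.csv file2.csv > combined.csv' to merge file1.csv and file2.csv into a new file named combined.csv. Note that cat copies each file verbatim, so if both files start with a header row, the header of file2.csv also appears in the middle of combined.csv.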
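When every file starts with the same header row, a plain cat repeats that header in the output. A minimal Python sketch that writes the header only once is shown below; it assumes all inputs share the same columns, and the helper merge_csv and the sample file contents are hypothetical.

```python
import csv
import os
import tempfile


def merge_csv(paths, out_path):
    """Concatenate CSV files that share a header, writing the header once."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for index, path in enumerate(paths):
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                header = next(reader, None)  # first row of each file
                if index == 0 and header is not None:
                    writer.writerow(header)  # keep the header from the first file only
                writer.writerows(reader)     # copy the remaining data rows


# Create two small sample files so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
file1 = os.path.join(tmp_dir, "file1.csv")
file2 = os.path.join(tmp_dir, "file2.csv")
combined = os.path.join(tmp_dir, "combined.csv")
with open(file1, "w", newline="", encoding="utf-8") as f:
    f.write("id,name\n1,a\n")
with open(file2, "w", newline="", encoding="utf-8") as f:
    f.write("id,name\n2,b\n")

merge_csv([file1, file2], combined)

with open(combined, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
print(rows)
```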
HDInsight is a fully managed cloud service in Azure that makes it easy to process big data using open-source frameworks such as Apache Hadoop and Spark.
It allows you to create, scale, and monitor Hadoop clusters in Azure.
HDInsight integrates with Azure Data Factory to provide data orchestration and ...
Data copy in Azure can be performed using Azure Data Factory or Azure Storage Explorer.
Use Azure Data Factory to create data pipelines for copying data between various sources and destinations.
Use Azure Storage Explorer to manually copy data between Azure storage accounts.
Utilize Azure Blob Storage for storing the data to be copied.
I expect a collaborative environment that fosters growth, innovation, and the opportunity to work on impactful data projects.
Opportunities for professional development, such as workshops and training sessions.
A culture of collaboration where team members share knowledge and support each other.
Engagement in challenging projects that allow me to apply my skills and learn new technologies, like cloud data solutions.
Clear ...
Interview ratings (difficulty level, duration): based on 4 interview experiences
Ratings in categories: based on 53 reviews
| Role | Salaries reported | Salary range |
| --- | --- | --- |
| Data Engineer | 94 | ₹4.4 L/yr - ₹16.5 L/yr |
| Senior Data Engineer | 31 | ₹8 L/yr - ₹30 L/yr |
| Data Integration Developer | 19 | ₹1.8 L/yr - ₹6.1 L/yr |
| Lead Data Engineer | 15 | ₹20.5 L/yr - ₹38 L/yr |
| Junior Data Analyst | 13 | ₹3.5 L/yr - ₹7 L/yr |
Tekwissen
Damco Solutions
smartData Enterprises
In Time Tec Visionsoft