DataGrokr
Spark-related assignment
Use the sorted() function in Python to sort numbers in a list.
Use the sorted() function with the list of numbers as input.
The sorted() function returns a new sorted list and leaves the original list unchanged.
You can use the reverse parameter in sorted() to sort in descending order.
Example:
numbers = [3, 1, 4, 1, 5, 9, 2]
sorted_numbers = sorted(numbers)
print(sorted_numbers)  # [1, 1, 2, 3, 4, 5, 9]
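For example, to sort the same list in descending order using the reverse parameter:

numbers = [3, 1, 4, 1, 5, 9, 2]
print(sorted(numbers, reverse=True))  # [9, 5, 4, 3, 2, 1, 1]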
I have experience working as a Data Engineer for 5 years in a tech company, where I was responsible for designing and implementing data pipelines, optimizing data storage and retrieval, and collaborating with cross-functional teams.
Designed and implemented data pipelines to extract, transform, and load data from various sources
Optimized data storage and retrieval processes to improve efficiency and performance
Collaborated with cross-functional teams
Continuous learning through online courses, books, and hands-on projects.
Regularly enroll in online courses related to data engineering to stay updated on new technologies and best practices.
Read books and articles on data engineering to deepen understanding of concepts and techniques.
Work on hands-on projects to apply theoretical knowledge and gain practical experience.
Participate in data engineering communities and forums
The company approached me, and I was interviewed in Sep 2023. There were 3 interview rounds.
You get an assignment with a deadline of 4-5 days. It mainly covers SQL queries and a Flask API, and it is evaluated on code quality and execution time.
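For illustration, a minimal Flask endpoint of the kind such an assignment might expect; the route name and sample data below are placeholders, not the actual assignment:

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/numbers")
def numbers():
    # Placeholder payload; a real assignment would run its SQL queries here.
    return jsonify(sorted([3, 1, 4, 1, 5, 9, 2]))

if __name__ == "__main__":
    app.run(debug=True)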
Use recursive CTE to print numbers 1-50 in SQL
Use recursive Common Table Expression (CTE) to generate numbers from 1 to 50
Start with anchor member as 1 and recursively add 1 until reaching 50
Select the generated numbers from the CTE
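A sketch of that recursive CTE, executed here through Python's built-in sqlite3 module purely for demonstration (an in-memory database, nothing persisted):

import sqlite3

# Anchor member starts at 1; the recursive member adds 1 until 50 is reached.
query = """
WITH RECURSIVE numbers(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 50
)
SELECT n FROM numbers;
"""

conn = sqlite3.connect(":memory:")
for (n,) in conn.execute(query):
    print(n)
conn.close()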
I attended an interview for a Data Engineer Intern role in March 2024. The first round was a written test with 30 questions worth 50 marks: 27 MCQs and 3 coding questions.
The coding questions were:
1. Write a program to print the Fibonacci series
2. Program to check if a linked list is palindrome or not
3. Query to find the Nth highest salary
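Hedged Python sketches of the first two questions; the class and function names are my own choices, and the employees table in the final comment is assumed rather than given:

def fibonacci(n):
    """Return the first n Fibonacci numbers."""
    series, a, b = [], 0, 1
    for _ in range(n):
        series.append(a)
        a, b = b, a + b
    return series

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def is_palindrome(head):
    """Check whether a singly linked list reads the same forwards and backwards."""
    values = []
    while head:
        values.append(head.value)
        head = head.next
    return values == values[::-1]

print(fibonacci(7))                                        # [0, 1, 1, 2, 3, 5, 8]
print(is_palindrome(Node(1, Node(2, Node(2, Node(1))))))   # True

# A common pattern for the Nth highest salary (N = 3 shown), assuming an employees table:
# SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 2;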
I applied via Campus Placement
I was assigned a fairly straightforward data analysis task with a one-week deadline. For this task, we were required to learn PySpark SQL.
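As a rough illustration of the kind of PySpark SQL usage involved; the file path, view name, and query below are placeholders, not the actual task:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("analysis-task").getOrCreate()

df = spark.read.csv("data.csv", header=True, inferSchema=True)  # placeholder input
df.createOrReplaceTempView("records")

# Run an SQL query against the registered temporary view.
spark.sql("SELECT COUNT(*) AS row_count FROM records").show()

spark.stop()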
I applied via Internshala and was interviewed before May 2023. There were 2 interview rounds.
I applied via Campus Placement and was interviewed before Feb 2022. There were 4 interview rounds.
Related to Python, SQL, and PySpark.
I applied via Job Portal and was interviewed in Mar 2024. There were 3 interview rounds.
Spark cluster sizing depends on workload, data size, memory requirements, and processing speed.
Consider the size of the data being processed
Take into account the memory requirements of the Spark jobs
Factor in the processing speed needed for the workload
Scale the cluster based on the number of nodes and cores required
Monitor performance and adjust cluster size as needed
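For illustration, a hedged sketch of how such sizing decisions map onto SparkSession settings; the instance, core, and memory values are placeholders, not recommendations:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("cluster-sizing-example")
    .config("spark.executor.instances", "4")  # executor count, driven by data volume
    .config("spark.executor.cores", "4")      # cores per executor, driven by needed parallelism
    .config("spark.executor.memory", "8g")    # memory per executor, driven by job memory needs
    .getOrCreate()
)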
Implement a pipeline based on the given conditions and data requirements
I applied via LinkedIn and was interviewed in Aug 2020. There were 5 interview rounds.
Commands that run on driver and executor in a word count Spark program.
The driver builds the RDD lineage when the input file is read and transformations are declared; nothing executes until an action runs.
Splitting the lines and counting the words run on the executors once an action triggers the job.
Writing the output partitions also happens on the executors; only actions like collect() bring aggregated results back to the driver.
Driver sends tasks to executors and coordinates the overall job.
Executor processes the tasks assigned by the driver.
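A minimal PySpark word-count sketch with comments marking where each step actually executes; the input path is a placeholder:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("word-count").getOrCreate()
sc = spark.sparkContext

# Driver: builds the RDD lineage; transformations are lazy, so no data is read yet.
lines = sc.textFile("input.txt")  # placeholder path
counts = (
    lines.flatMap(lambda line: line.split())  # Executors: split lines into words
         .map(lambda word: (word, 1))         # Executors: emit (word, 1) pairs
         .reduceByKey(lambda a, b: a + b)     # Executors: aggregate counts, with a shuffle
)

# Driver: collect() is an action that triggers the job and pulls results back.
for word, count in counts.collect():
    print(word, count)

spark.stop()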
Designation | Reported salaries | Salary range
Data Engineer | 65 | ₹5 L/yr - ₹15 L/yr
Software Engineer | 40 | ₹5.7 L/yr - ₹15.2 L/yr
Software Developer | 15 | ₹5 L/yr - ₹20 L/yr
Devops Engineer | 15 | ₹3.5 L/yr - ₹12 L/yr
Full Stack Developer | 8 | ₹4.9 L/yr - ₹11 L/yr
Fractal Analytics
Mu Sigma
Tiger Analytics
LatentView Analytics