Amazintech Innovations Data Engineer Interview Questions and Answers for Experienced

Updated 27 Apr 2023

Amazintech Innovations Data Engineer Interview Experiences for Experienced

1 interview found

Data Engineer Interview Questions & Answers

BALBHIM RAMCHANDRA PATIL

posted on 27 Apr 2023

Interview experience: 4 (Good)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Resume Shortlist 
Pro Tip by AmbitionBox:
Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds, so make sure to leave the best impression.
Round 2 - Technical 

(3 Questions)

  • Q1. What is HDFS in Hadoop?
  • Ans. 

    HDFS is a distributed file system designed to store large data sets reliably and fault-tolerantly.

    • HDFS stands for Hadoop Distributed File System

    • It is the primary storage system used by Hadoop applications

    • It is designed to store large files and data sets across multiple machines

    • It provides high throughput access to application data

    • It is fault-tolerant and can handle node failures

    • It uses a master/slave architecture with ...

  • Answered by AI
  • Q2. What is the default replication factor in Hadoop 2.x?
  • Ans. 

    The default replication factor of Hadoop 2.x is 3.

    • Replication factor determines the number of copies of data blocks that are stored across the Hadoop cluster.

    • The default replication factor in Hadoop 2.x is 3, which means that each data block is replicated three times.

    • The replication factor can be configured in the Hadoop configuration files.

    • The replication factor affects the fault tolerance and performance of the Hadoo...

  • Answered by AI
  • Q3. What is the default block size in Hadoop?
  • Ans. 

    The default HDFS block size is 128 MB in Hadoop 2.x and later (it was 64 MB in Hadoop 1.x).

    • Hadoop uses HDFS (Hadoop Distributed File System) to store data in a distributed manner.

    • The default block size of HDFS is 128 MB.

    • This block size can be changed by modifying the dfs.blocksize property in the Hadoop configuration files.

  • Answered by AI
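
The replication factor and block size from Q2 and Q3 combine into simple storage arithmetic. The following is a minimal sketch, assuming the defaults quoted above (128 MB blocks, replication factor 3); the function name `hdfs_footprint` is hypothetical and is not part of any Hadoop API.

```python
MB = 1024 * 1024

def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number of blocks, total block replicas, raw bytes stored).

    Assumes the defaults quoted above; both values are configurable
    (dfs.blocksize and dfs.replication).
    """
    if file_size_bytes <= 0:
        return 0, 0, 0
    # Integer ceiling division: a partially filled last block still counts as a block.
    num_blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    total_replicas = num_blocks * replication
    # The last block only occupies as much disk as it has data,
    # so raw usage is the file size times the replication factor.
    raw_bytes = file_size_bytes * replication
    return num_blocks, total_replicas, raw_bytes

blocks, replicas, raw = hdfs_footprint(1000 * MB)
print(blocks, replicas, raw // MB)
```

For a 1000 MB file this gives 8 blocks, 24 block replicas across the cluster, and 3000 MB of raw storage.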


Interview questions from similar companies

Interview experience: 2 (Poor)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Coding Test 

OOP, DSA, SQL, and networking

Round 2 - Technical 

(2 Questions)

  • Q1. DSA-based Python question
  • Q2. DSA-based Python question on trees
Round 3 - Technical 

(2 Questions)

  • Q1. SQL query questions
  • Q2. SQL query question
Interview experience: 4 (Good)
Difficulty level: Moderate
Process duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. What is the SQL query to calculate the average sales over a period of 7 days?
  • Q2. Explain the Databricks architecture.
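
The post gives no answer for Q1, so the following is only one possible sketch of a 7-day average, not the interviewer's expected solution. It assumes a hypothetical table `daily_sales(sale_date, amount)` with exactly one row per day and no gaps, and it runs the query through Python's built-in `sqlite3` module (window functions need SQLite 3.25 or later).

```python
import sqlite3

# Rolling average over the current row and the six preceding rows.
# With one row per day and no missing days, that is a 7-day window.
ROLLING_AVG_SQL = """
SELECT sale_date,
       AVG(amount) OVER (
           ORDER BY sale_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS avg_7d
FROM daily_sales
ORDER BY sale_date
"""

def rolling_7_day_average(rows):
    """Load (sale_date, amount) rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, used only for this illustration.
        conn.execute(
            "CREATE TABLE daily_sales (sale_date TEXT PRIMARY KEY, amount REAL)"
        )
        conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", rows)
        return conn.execute(ROLLING_AVG_SQL).fetchall()
    finally:
        conn.close()

# Eight consecutive days with amounts 10, 20, ..., 80.
sample = [(f"2024-01-{day:02d}", float(day * 10)) for day in range(1, 9)]
result = rolling_7_day_average(sample)
for sale_date, avg_7d in result:
    print(sale_date, avg_7d)
```

If some days can be missing, a row-based frame no longer corresponds to seven calendar days, and the data would need a calendar table or a date-range frame instead.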
Interview experience: 4 (Good)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. Basic Python, SQL & Spark coding questions.
  • Q2. Nth highest salary and word count in PySpark
  • Ans. 

    To find the nth highest salary in PySpark, rank the salaries with a window function and filter on the desired rank.

    • Use a window ordered by salary descending to assign a rank to each row; row_number gives every row a distinct position, while dense_rank gives tied salaries the same rank, which is usually what "nth highest salary" means

    • Filter the result to keep the rows with the desired rank

    • Example: df.withColumn('rank', F.dense_rank().over(Window.orderBy(F.col('salary').desc()))).filter(F.col('rank') == n).select('salary').distinct()

  • Answered by AI
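
The logic behind both parts of Q2 can be checked without a Spark cluster. The sketch below is a plain-Python stand-in, not PySpark code: `nth_highest_salary` mirrors ranking distinct salaries in descending order (ties share a rank), and `word_count` mirrors the usual split-then-count-per-word pattern. Both function names are hypothetical.

```python
from collections import Counter

def nth_highest_salary(salaries, n):
    """Return the nth highest distinct salary, or None if there are fewer than n."""
    # Deduplicate first so that tied salaries share one rank.
    distinct_desc = sorted(set(salaries), reverse=True)
    if n < 1 or n > len(distinct_desc):
        return None
    return distinct_desc[n - 1]

def word_count(lines):
    """Count case-insensitive, whitespace-separated words across all lines."""
    counts = Counter()
    for line in lines:
        # Split each line into words, then add one per occurrence.
        counts.update(line.lower().split())
    return dict(counts)

print(nth_highest_salary([100, 300, 300, 200], 2))
print(word_count(["spark makes big data simple", "big data big wins"]))
```

In the first call the two 300 entries share rank 1, so the second highest salary is 200.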
Round 2 - Technical 

(1 Question)

  • Q1. Questions on SQL & PySpark.
Round 3 - Behavioral 

(1 Question)

  • Q1. Questions about previous projects and optimization techniques, plus behavioural questions.
Round 4 - HR 

(1 Question)

  • Q1. Salary negotiation part.


Interview experience: 5 (Excellent)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. Explain the architecture of Delta Lake
  • Ans. 

    Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads.

    • Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing.

    • It stores data in Parquet format and uses Apache Spark for processing.

    • Delta Lake ensures data reliability and data quality by providing schema enforcement and data versioning.

    • It supports time tra...

  • Answered by AI
  • Q2. Tuning operations in Databricks
  • Ans. 

    Tuning operations in Databricks involve optimizing the performance and efficiency of data processing tasks.

    • Use cluster configuration settings to allocate resources efficiently

    • Optimize code by minimizing data shuffling and reducing unnecessary operations

    • Leverage Databricks Auto Optimize to automatically tune performance

    • Monitor job performance using Databricks Runtime Metrics and Spark UI

  • Answered by AI
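
The data versioning described in the Delta Lake answer to Q1 rests on a transaction log: each commit is a file of JSON actions stored next to the Parquet data files. The following is a toy model of that idea in plain Python, not the Delta Lake library. It keeps only the `add` and `remove` actions with a `path` field, whereas real log entries carry more fields and more action types; the function name `replay_log` and the file names are hypothetical.

```python
import json

def replay_log(commits, as_of_version=None):
    """Return the set of data files visible after replaying commits in order.

    commits: list of strings, one per version, each holding newline-separated
    JSON actions. as_of_version: stop after this version (None = latest).
    """
    live_files = set()
    for version, commit in enumerate(commits):
        if as_of_version is not None and version > as_of_version:
            break
        for line in commit.splitlines():
            action = json.loads(line)
            if "add" in action:
                live_files.add(action["add"]["path"])
            elif "remove" in action:
                # Removal is logical: the file drops out of the current
                # snapshot but older versions can still reference it.
                live_files.discard(action["remove"]["path"])
    return live_files

commits = [
    '{"add": {"path": "part-0000.parquet"}}',
    '{"add": {"path": "part-0001.parquet"}}',
    '{"remove": {"path": "part-0000.parquet"}}\n'
    '{"add": {"path": "part-0002.parquet"}}',
]

print(sorted(replay_log(commits)))
print(sorted(replay_log(commits, as_of_version=1)))
```

Replaying every commit yields the latest snapshot, while stopping at version 1 reconstructs the table as it was before the rewrite in version 2.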


Interview experience: 5 (Excellent)
Difficulty level: -
Process duration: -
Result: -
Round 1 - One-on-one 

(2 Questions)

  • Q1. Standard questions on Spark, Hadoop, and SQL
  • Q2. One Python coding question
Round 2 - One-on-one 

(2 Questions)

  • Q1. A Python coding question of moderate difficulty, though it felt hard
  • Q2. Questions on my project and moderate-level SQL questions
Interview experience: 4 (Good)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Technical 

(1 Question)

  • Q1. Spark architecture

Data Engineer Interview Questions & Answers

Manoj Chitturi (HCLTech)

posted on 28 May 2024

Interview experience: 3 (Average)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Selected

I was approached by the company and was interviewed in Apr 2024. There were 3 interview rounds.

Round 1 - Technical 

(2 Questions)

  • Q1. Explain Spark RDD and DataFrame (DF)
  • Ans. 

    Spark RDD and DF are two data structures in Apache Spark for processing and analyzing data.

    • RDD (Resilient Distributed Dataset) is a distributed collection of elements that can be operated on in parallel. It is immutable and fault-tolerant.

    • DF (DataFrame) is a distributed collection of data organized into named columns. It provides a more structured and efficient way to work with data compared to RDDs.

    • RDD is low-level an...

  • Answered by AI
  • Q2. Spark Architecture
Round 2 - Case Study 

End to end data engineering architecture

Round 3 - HR 

(2 Questions)

  • Q1. Salary expectation
  • Q2. Cultural fit


Interview experience: 3 (Average)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Technical 

(1 Question)

  • Q1. Mostly on Spark

Data Engineer Interview Questions & Answers

Aarthi Varatharaji (Wipro)

posted on 13 Jun 2024

Interview experience: 5 (Excellent)
Difficulty level: -
Process duration: -
Result: Not Selected
Round 1 - Technical 

(2 Questions)

  • Q1. What is executor memory?
  • Ans. 

    Executor memory is the amount of memory allocated to each executor in a Spark application.

    • Executor memory is specified using the 'spark.executor.memory' configuration property.

    • It determines how much memory each executor can use to process tasks.

    • It is important to properly configure executor memory to avoid out-of-memory errors or inefficient resource utilization.

  • Answered by AI
  • Q2. Difference between RDD and DataFrame
  • Ans. 

    RDD is a low-level abstraction in Spark representing distributed data, while DataFrames are higher-level structured APIs for working with data.

    • An RDD is an immutable distributed collection of objects, while a DataFrame is a distributed collection of data organized into named columns.

    • RDDs are more suitable for unstructured data and low-level transformations, while DataFrames provide a more user-friendly API for structured da...

  • Answered by AI
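
The executor memory answer to Q1 can be made concrete with a little arithmetic. On resource managers such as YARN or Kubernetes, each executor container requests more than `spark.executor.memory`, because a memory overhead is added on top. The sketch below assumes the commonly documented defaults (overhead of 10% of executor memory, with a 384 MiB minimum) and ignores other components such as off-heap or PySpark memory; the function name `container_memory_mib` is hypothetical.

```python
MIN_OVERHEAD_MIB = 384          # assumed default minimum overhead
DEFAULT_OVERHEAD_FACTOR = 0.10  # assumed default overhead fraction

def container_memory_mib(executor_memory_mib, overhead_factor=DEFAULT_OVERHEAD_FACTOR):
    """Estimate the memory one executor container requests, in MiB."""
    # Overhead is a fraction of executor memory, but never below the minimum.
    overhead = max(MIN_OVERHEAD_MIB, int(executor_memory_mib * overhead_factor))
    return executor_memory_mib + overhead

for executor_mib in (2048, 8192):
    print(executor_mib, "->", container_memory_mib(executor_mib))
```

An 8 GiB executor therefore asks for roughly 8.8 GiB from the cluster, which matters when working out how many executors fit on one node.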


Amazintech Innovations Interview FAQs

How many rounds are there in Amazintech Innovations Data Engineer interview for experienced candidates?
The Amazintech Innovations interview process for experienced candidates usually has 2 rounds. The most common rounds are Resume Shortlist and Technical.
What are the top questions asked in Amazintech Innovations Data Engineer interview for experienced candidates?

Some of the top questions asked at the Amazintech Innovations Data Engineer interview for experienced candidates -

  1. what is replication factor of hadoop 2...
  2. what is default block size of had...
  3. what is HDFS for hado...

Software Engineer: ₹3 L/yr - ₹7 L/yr (4 salaries)
Senior Software Developer: ₹5 L/yr - ₹10.8 L/yr (3 salaries)
Senior Software Engineer: ₹5.4 L/yr - ₹7.6 L/yr (3 salaries)
Compare Amazintech Innovations with: TCS (3.7), Accenture (3.9), Wipro (3.7), Cognizant (3.8)
