
CBA Infotech Data Engineer Interview Questions and Answers for Experienced

Updated 26 Feb 2024

CBA Infotech Data Engineer Interview Experiences for Experienced

1 interview found

Data Engineer Interview Questions & Answers

Anonymous

posted on 26 Feb 2024

Interview experience
2
Poor
Difficulty level
Moderate
Process Duration
4-6 weeks
Result
Not Selected

I was interviewed in Jan 2024.

Round 1 - Assignment 

Spark and Scala coding

Round 2 - Technical 

(1 Question)

  • Q1. Difference between List and Tuple
  • Ans. 

    List is mutable, Tuple is immutable in Python.

    • List can be modified after creation, Tuple cannot be modified.

    • List is defined using square brackets [], Tuple is defined using parentheses ().

    • List is used for collections of items that may need to be changed, Tuple is used for fixed collections of items.

    • Example: list_example = [1, 2, 3], tuple_example = (4, 5, 6)

  • Answered by AI
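The points above can be verified directly in Python; a minimal sketch (variable names are illustrative):

```python
# Lists are mutable: elements can be reassigned in place.
list_example = [1, 2, 3]
list_example[0] = 99
print(list_example)  # [99, 2, 3]

# Tuples are immutable: item assignment raises a TypeError.
tuple_example = (4, 5, 6)
try:
    tuple_example[0] = 99
except TypeError as exc:
    print("cannot modify tuple:", exc)

# A practical consequence: tuples are hashable, so they can serve as dict keys.
locations = {(40.7, -74.0): "New York"}
print(locations[(40.7, -74.0)])  # New York
```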

Interview Preparation Tips

Interview preparation tips for other job seekers - Practice Scala coding.

Skills evaluated in this interview

Interview questions from similar companies

Interview experience
2
Poor
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Coding Test 

OOPs, DSA, SQL, and networking

Round 2 - Technical 

(2 Questions)

  • Q1. DSA-based Python question
  • Q2. DSA-based Python question on trees
Round 3 - Technical 

(2 Questions)

  • Q1. SQL query-writing questions
  • Q2. SQL query question
Interview experience
4
Good
Difficulty level
Moderate
Process Duration
-
Result
-
Round 1 - Technical 

(2 Questions)

  • Q1. What is the SQL query to calculate the average sales over a period of 7 days?
  • Q2. Explain the Databricks architecture.
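For Q1, one common approach is a window function with a 7-day frame; a sketch using SQLite (the `sales` table and column names are assumptions, and a real schema may need a calendar join to handle missing days):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(f"2024-01-{d:02d}", 100.0 * d) for d in range(1, 11)],
)

# Rolling 7-day average: the current row plus the 6 preceding rows (by day).
rows = conn.execute("""
    SELECT day,
           AVG(amount) OVER (
               ORDER BY day
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS avg_7d
    FROM sales
    ORDER BY day
""").fetchall()

for day, avg_7d in rows:
    print(day, round(avg_7d, 2))
```

Note that `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW` counts rows, not calendar days, so it only matches a true 7-day window when every day has exactly one row.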
Interview experience
4
Good
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(2 Questions)

  • Q1. Basic Python, SQL & Spark coding questions.
  • Q2. Nth highest salary, word count in PySpark
  • Ans. 

    To find the nth highest salary in pyspark, use the window function with row_number and filter on the desired rank.

    • Use window function with row_number to assign a rank to each salary

    • Filter the result to get the row with the desired rank

    • Example: df.withColumn('rank', F.row_number().over(Window.orderBy(F.col('salary').desc()))).filter(F.col('rank') == n).select('salary')

  • Answered by AI
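The same nth-highest-salary idea can also be expressed in plain SQL; a sketch using SQLite's `DENSE_RANK` (the `employees` table and its contents are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("a", 100), ("b", 300), ("c", 200), ("d", 300)],
)

n = 2  # find the 2nd highest distinct salary

# DENSE_RANK gives duplicate salaries the same rank with no gaps,
# so rank n corresponds to the nth highest *distinct* salary.
row = conn.execute("""
    SELECT salary FROM (
        SELECT salary,
               DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
        FROM employees
    )
    WHERE rnk = ?
    LIMIT 1
""", (n,)).fetchone()

print(row[0])  # 200
```

Swapping `DENSE_RANK` for `ROW_NUMBER` (as in the PySpark answer above) would instead return the nth row after sorting, which differs when salaries tie.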
Round 2 - Technical 

(1 Question)

  • Q1. Questions on SQL & PySpark.
Round 3 - Behavioral 

(1 Question)

  • Q1. Questions about previous projects, optimization techniques, etc.; also behavioural questions
Round 4 - HR 

(1 Question)

  • Q1. Salary negotiation part.

Skills evaluated in this interview

Interview experience
5
Excellent
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(2 Questions)

  • Q1. Explain the architecture of Delta Lake
  • Ans. 

    Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads.

    • Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing.

    • It stores data in Parquet format and uses Apache Spark for processing.

    • Delta Lake ensures data reliability and data quality by providing schema enforcement and data versioning.

    • It supports time travel, allowing queries against earlier versions of the data.

  • Answered by AI
  • Q2. Tuning operations in databricks
  • Ans. 

    Tuning operations in Databricks involves optimizing performance and efficiency of data processing tasks.

    • Use cluster configuration settings to allocate resources efficiently

    • Optimize code by minimizing data shuffling and reducing unnecessary operations

    • Leverage Databricks Auto Optimize to automatically tune performance

    • Monitor job performance using Databricks Runtime Metrics and Spark UI

  • Answered by AI

Skills evaluated in this interview

Interview experience
5
Excellent
Difficulty level
-
Process Duration
-
Result
-
Round 1 - One-on-one 

(2 Questions)

  • Q1. General questions on Spark, Hadoop, and SQL
  • Q2. One Python coding question
Round 2 - One-on-one 

(2 Questions)

  • Q1. The Python coding question felt difficult; it was of moderate difficulty
  • Q2. Questions on the project, plus moderately difficult SQL questions
Interview experience
4
Good
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(1 Question)

  • Q1. Spark architecture

Data Engineer Interview Questions & Answers

Manoj Chitturi (HCLTech)

posted on 28 May 2024

Interview experience
3
Average
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
Selected

I was approached by the company and interviewed in Apr 2024. There were 3 interview rounds.

Round 1 - Technical 

(2 Questions)

  • Q1. Explain Spark RDD and DataFrame
  • Ans. 

    Spark RDD and DF are two data structures in Apache Spark for processing and analyzing data.

    • RDD (Resilient Distributed Dataset) is a distributed collection of elements that can be operated on in parallel. It is immutable and fault-tolerant.

    • DF (DataFrame) is a distributed collection of data organized into named columns. It provides a more structured and efficient way to work with data compared to RDDs.

    • RDD is a low-level API offering fine-grained control, while DataFrames benefit from Spark's Catalyst query optimizer.

  • Answered by AI
  • Q2. Spark Architecture
Round 2 - Case Study 

End-to-end data engineering architecture

Round 3 - HR 

(2 Questions)

  • Q1. Salary expectation
  • Q2. Cultural fitment

Skills evaluated in this interview

Interview experience
3
Average
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(1 Question)

  • Q1. Mostly on Spark

Data Engineer Interview Questions & Answers

Aarthi Varatharaji (Wipro)

posted on 13 Jun 2024

Interview experience
5
Excellent
Difficulty level
-
Process Duration
-
Result
Not Selected
Round 1 - Technical 

(2 Questions)

  • Q1. What is executor memory
  • Ans. 

    Executor memory is the amount of memory allocated to each executor in a Spark application.

    • Executor memory is specified using the 'spark.executor.memory' configuration property.

    • It determines how much memory each executor can use to process tasks.

    • It is important to properly configure executor memory to avoid out-of-memory errors or inefficient resource utilization.

  • Answered by AI
  • Q2. Difference between RDD and DataFrame
  • Ans. 

    RDD is a low-level abstraction in Spark representing distributed data, while DataFrames are higher-level structured APIs for working with data.

    • RDD is an immutable distributed collection of objects, while DataFrames are distributed collection of data organized into named columns.

    • RDDs are more suitable for unstructured data and low-level transformations, while DataFrames provide a more user-friendly API for structured data.

  • Answered by AI
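For the executor-memory question above, the property is usually set at submit time; a sketch (all values and the script name are examples, not recommendations):

```shell
# Allocate 4 GiB of heap per executor, 2 cores each; tune to your workload.
# spark.executor.memoryOverhead covers off-heap memory (defaults to 10% of heap).
spark-submit \
  --master yarn \
  --executor-memory 4g \
  --executor-cores 2 \
  --conf spark.executor.memoryOverhead=512m \
  my_job.py
```

The same value can also be set as `spark.executor.memory` in `spark-defaults.conf` or on the `SparkConf` when the session is built.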

Skills evaluated in this interview

CBA Infotech Interview FAQs

How many rounds are there in CBA Infotech Data Engineer interview for experienced candidates?
CBA Infotech interview process for experienced candidates usually has 2 rounds. The most common rounds in the CBA Infotech interview process for experienced candidates are Assignment and Technical.

