OSI Digital rating: 3.7, based on 131 reviews

OSI Digital Data Engineer Interview Questions and Answers

Updated 28 May 2024

OSI Digital Data Engineer Interview Experiences

1 interview found

Anonymous

posted on 28 May 2024

Interview experience: 3 (Average)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Not Selected

I applied via Approached by Company and was interviewed in Nov 2023. There was 1 interview round.

Round 1 - Technical 

(3 Questions)

  • Q1. Lead- and lag-based questions
  • Q2. Coalesce and repartition in Spark
  • Ans. 

    Coalesce reduces the number of partitions in a DataFrame, while repartition reshuffles the data across a specified number of partitions in Spark.

    • Coalesce is used to reduce the number of partitions in a DataFrame without shuffling the data

    • Repartition is used to increase or decrease the number of partitions in a DataFrame by shuffling the data across the specified number of partitions

    • Coalesce is more efficient than repar...

  • Answered by AI
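
The difference described in the answer above can be sketched with a toy model in plain Python. This is not Spark code: partitions are modeled as lists, and the `coalesce` and `repartition` functions below are hypothetical stand-ins that only imitate the behavior (in PySpark the real calls are `df.coalesce(n)` and `df.repartition(n)`). The partition contents are invented for illustration.

```python
from typing import List

Partitions = List[List[int]]


def coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy coalesce: merge whole partitions into at most n groups, no shuffle."""
    if n >= len(parts):
        # Like Spark's coalesce, this model cannot increase the partition count.
        return [list(p) for p in parts]
    merged: Partitions = [[] for _ in range(n)]
    for i, part in enumerate(parts):
        # Each source partition is appended intact to exactly one target.
        merged[i * n // len(parts)].extend(part)
    return merged


def repartition(parts: Partitions, n: int) -> Partitions:
    """Toy repartition: full shuffle, every record is redistributed round-robin."""
    out: Partitions = [[] for _ in range(n)]
    records = [r for p in parts for r in p]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


# Hypothetical skewed input: one large partition and three small ones.
skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]
coalesced = coalesce(skewed, 2)
repartitioned = repartition(skewed, 2)

print([len(p) for p in coalesced])      # sizes stay uneven
print([len(p) for p in repartitioned])  # sizes are balanced
```

In this model, coalescing is cheap because no record moves individually, but the skew survives; repartitioning pays for moving every record and gets evenly sized partitions in return, and it is the only one of the two that can raise the partition count.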
  • Q3. Some SQL questions
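
For the lead and lag question (Q1), a minimal sketch of the SQL window functions `LAG` and `LEAD` is shown here, run through Python's built-in `sqlite3` module. The `sales` table and its figures are invented for illustration, and the example assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical daily sales figures, invented purely for illustration.
rows = [("2024-01-01", 100), ("2024-01-02", 120), ("2024-01-03", 90)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# LAG looks one row back and LEAD one row ahead within the ordered window.
# Both return NULL (None in Python) when no such row exists.
query = """
SELECT day,
       amount,
       LAG(amount)  OVER (ORDER BY day) AS prev_amount,
       LEAD(amount) OVER (ORDER BY day) AS next_amount,
       amount - LAG(amount) OVER (ORDER BY day) AS change_from_prev
FROM sales
ORDER BY day
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row)
```

A typical interview variation adds `PARTITION BY` inside the `OVER` clause so that the previous and next rows are looked up per group, for example per customer or per product.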

Interview Preparation Tips

Topics to prepare for OSI Digital Data Engineer interview:
  • SQL
  • Spark

Interview questions from similar companies

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Selected

I was interviewed in Oct 2024.

Round 1 - Technical 

(2 Questions)

  • Q1. SQL problem on window functions
  • Q2. SQL coding, such as joins and scenario-based questions
Round 2 - Technical 

(2 Questions)

  • Q1. Design round for an ADF pipeline
  • Ans. 

    Designing an ADF pipeline for data processing

    • Identify data sources and destinations

    • Define data transformations and processing steps

    • Consider scheduling and monitoring requirements

    • Utilize ADF activities like Copy Data, Data Flow, and Databricks

    • Implement error handling and logging mechanisms

  • Answered by AI
  • Q2. Azure Synapse, ADF, and ADB (Azure Databricks)
Round 3 - HR 

(2 Questions)

  • Q1. Expected CTC and current CTC negotiations
  • Ans. 

    Discussing expected and current salary for negotiation purposes.

    • Be honest about your current salary and provide a realistic expectation for your desired salary.

    • Highlight your skills and experience that justify your desired salary.

    • Be open to negotiation and willing to discuss other benefits besides salary.

    • Research industry standards and salary ranges for similar positions to support your negotiation.

    • Focus on the value y...

  • Answered by AI
  • Q2. Relocation, and remote work until the office opens at the Pune location

Interview Preparation Tips

Interview preparation tips for other job seekers - Be prepared for SQL and problem solving

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Not Selected

I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.

Round 1 - One-on-one 

(2 Questions)

  • Q1. Azure scenario-based questions
  • Q2. PySpark coding questions
Round 2 - One-on-one 

(2 Questions)

  • Q1. ADF- and Databricks-related questions
  • Q2. Spark performance problems and scenarios

Interview experience: 3 (Average)
Difficulty level: Not specified
Process duration: Not specified
Result: Not specified

I applied via Campus Placement

Round 1 - Aptitude Test 

Based on SQL, statistics, Python, and cognitive ability

Round 2 - Technical 

(2 Questions)

  • Q1. Based on AI/ML and on the CV
  • Q2. Based on projects
Round 3 - HR 

(2 Questions)

  • Q1. How to handle toxic work culture?
  • Ans. 

    Address toxic work culture by open communication, setting boundaries, seeking support, and considering leaving if necessary.

    • Open communication with colleagues and management about issues

    • Set boundaries to protect your mental and emotional well-being

    • Seek support from HR, a mentor, or a therapist if needed

    • Consider leaving the toxic work environment if the situation does not improve

  • Answered by AI
  • Q2. 5 strengths and weaknesses

Interview Preparation Tips

Interview preparation tips for other job seekers - Be confident in interviews and try to calm your mind!

Interview experience: 5 (Excellent)
Difficulty level: Not specified
Process duration: Not specified
Result: Not specified

Round 1 - Technical 

(2 Questions)

  • Q1. SCD questions. Iceberg questions
  • Q2. Basic Python programming, PySpark architecture.

Interview experience: 4 (Good)
Difficulty level: Not specified
Process duration: Not specified
Result: Not specified

Round 1 - One-on-one 

(1 Question)

  • Q1. Basic Spark questions, Hive-related questions.

Interview Preparation Tips

Interview preparation tips for other job seekers - Good questions were asked, covering SQL, Spark, and Python.

Interview experience: 4 (Good)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: No response

I was interviewed in Aug 2024.

Round 1 - Technical 

(5 Questions)

  • Q1. Questions on PySpark
  • Q2. Questions on SQL
  • Q3. Transformations
  • Q4. Questions on SQL optimizations
  • Q5. Questions about my current project

Interview experience: 5 (Excellent)
Difficulty level: Easy
Process duration: Less than 2 weeks
Result: No response

I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.

Round 1 - One-on-one 

(2 Questions)

  • Q1. Incremental load in PySpark
  • Ans. 

    Incremental load in PySpark refers to loading only new or updated data into a dataset without reloading the entire dataset.

    • Use the Delta Lake table format (format 'delta') with its MERGE operation to upsert new or changed records; the 'mergeSchema' option only handles schema evolution on write.

    • Use 'partitionBy' when writing to partition the data on specific columns, so an incremental load touches only the affected partitions.

    • Implement a logic to identify new or updated records based on timestamps or uni...

  • Answered by AI
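
The timestamp-based idea in the answer above can be sketched in plain Python. This is a minimal illustration of a watermark plus an upsert by key, not PySpark or Delta Lake code; the `incremental_load` function, the `id` and `updated_at` column names, and the sample rows are all assumptions made for the example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def incremental_load(
    target: Dict[int, Record], source: List[Record], last_watermark: str
) -> str:
    """Upsert rows changed after last_watermark into target; return the new watermark."""
    new_watermark = last_watermark
    for row in source:
        # ISO-8601 date strings compare correctly as plain strings.
        if row["updated_at"] > last_watermark:
            target[row["id"]] = row  # insert a new key or overwrite an existing one
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical target table keyed by id, and a source extract to load from.
target = {1: {"id": 1, "name": "a", "updated_at": "2024-01-01"}}
source = [
    {"id": 1, "name": "a2", "updated_at": "2024-01-03"},   # updated row
    {"id": 2, "name": "b", "updated_at": "2024-01-02"},    # new row
    {"id": 3, "name": "old", "updated_at": "2023-12-31"},  # already processed
]

watermark = incremental_load(target, source, "2024-01-01")
print(sorted(target), watermark)
```

Storing the returned watermark between runs is what makes the next run skip everything already loaded. In a Spark job the same two steps would usually be a filter on the timestamp column followed by a merge into the target table.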
  • Q2. Drop duplicates

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process duration: Not specified
Result: Selected

I applied via Campus Placement and was interviewed in Aug 2024. There were 2 interview rounds.

Round 1 - Aptitude Test 

Java and SQL questions

Round 2 - Coding Test 

Simple Java programs to find a factorial and to check for a prime number

Interview experience: 3 (Average)
Difficulty level: Moderate
Process duration: Not specified
Result: No response

I applied via LinkedIn and was interviewed in Jan 2024. There was 1 interview round.

Round 1 - Technical 

(4 Questions)

  • Q1. What is PySpark?
  • Ans. 

    PySpark is the Python API for Apache Spark, a powerful open-source distributed computing system.

    • PySpark is used for processing large datasets in parallel across a cluster of computers.

    • It provides high-level APIs in Python for Spark programming.

    • PySpark integrates with other Python libraries such as pandas and NumPy.

    • Example: Using PySpark to perform data analysis and machine learning tasks on big data sets.

  • Answered by AI
  • Q2. What is PySpark SQL?
  • Ans. 

    PySpark SQL is a module in Apache Spark that provides a SQL interface for working with structured data.

    • PySpark SQL allows users to run SQL queries on Spark DataFrames.

    • It provides a more concise and user-friendly way to interact with data compared to traditional Spark RDDs.

    • Users can leverage the power of SQL for data manipulation and analysis within the Spark ecosystem.

  • Answered by AI
  • Q3. How do you merge two DataFrames with different schemas?
  • Ans. 

    To merge two DataFrames with different schemas, use join operations or data transformation techniques.

    • Use join operations like inner join, outer join, left join, or right join based on the requirement.

    • Perform data transformation to align the schemas before merging.

    • Use tools like Apache Spark, pandas, or SQL to merge DataFrames with different schemas.

  • Answered by AI
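
The "align the schemas before merging" step from the answer above can be sketched in plain Python, with rows modeled as dictionaries. This is an illustration of the idea rather than Spark code: the `union_by_name` helper and the sample rows are hypothetical. In PySpark 3.1 and later, `DataFrame.unionByName(other, allowMissingColumns=True)` covers the same case.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def union_by_name(left: List[Row], right: List[Row]) -> List[Row]:
    """Stack two row sets, filling columns missing on either side with None."""
    # Collect the union of column names, preserving first-seen order.
    columns: List[str] = []
    for row in left + right:
        for col in row:
            if col not in columns:
                columns.append(col)
    # Rebuild every row against the full column list.
    return [{col: row.get(col) for col in columns} for row in left + right]


# Hypothetical inputs: both have "id", but each has one column the other lacks.
customers_a = [{"id": 1, "name": "Asha"}]
customers_b = [{"id": 2, "city": "Pune"}]

merged = union_by_name(customers_a, customers_b)
for row in merged:
    print(row)
```

This stacks rows vertically. When the goal is to combine columns for the same entity instead, a join on a shared key is the appropriate operation.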
  • Q4. What is PySpark streaming?
  • Ans. 

    PySpark streaming is a scalable and fault-tolerant stream processing engine built on top of Apache Spark.

    • PySpark streaming allows for real-time processing of streaming data.

    • It provides high-level APIs in Python for creating streaming applications.

    • PySpark streaming supports various data sources like Kafka, Flume, Kinesis, etc.

    • It enables windowed computations and stateful processing for handling streaming data.

    • Example: C...

  • Answered by AI
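
The windowed, stateful computation mentioned in the answer above can be sketched in plain Python as a tumbling-window count. This is a conceptual model only, not the PySpark streaming API; the `tumbling_window_counts` function and the event data (timestamps in seconds, event types) are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, event type)


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Count events per (window start, event type) over fixed, non-overlapping windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)  # running state
    for ts, key in events:
        # Round the timestamp down to the start of its window.
        window_start = ts - ts % window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical event stream.
events = [(0, "click"), (3, "click"), (9, "view"),
          (10, "click"), (14, "view"), (21, "click")]

counts = tumbling_window_counts(events, 10)
for window_key, n in sorted(counts.items()):
    print(window_key, n)
```

The `counts` dictionary plays the role of the state a streaming engine keeps between batches. A real engine also has to decide when a window is complete and how to treat late events, which this sketch leaves out.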

Interview Preparation Tips

Topics to prepare for Luxoft Data Engineer interview:
  • Pyspark

OSI Digital Interview FAQs

How many rounds are there in the OSI Digital Data Engineer interview?
The OSI Digital interview process usually has 1 round. The most common round in the OSI Digital interview process is Technical.
What are the top questions asked in the OSI Digital Data Engineer interview?

Some of the top questions asked at the OSI Digital Data Engineer interview:

  1. Coalesce and repartition in sp...
  2. Lead and lag based questi...
  3. Some SQL questi...

OSI Digital Data Engineer Interview Process

Based on 1 interview
Interview experience: 3 (Average)

OSI Digital Data Engineer Salary

Based on 4 salaries: ₹3.4 L/yr - ₹7.5 L/yr
51% less than the average Data Engineer salary in India

  • Software Engineer (161 salaries): ₹3.2 L/yr - ₹13 L/yr
  • Senior Software Engineer (154 salaries): ₹6.2 L/yr - ₹23 L/yr
  • Associate Software Engineer (126 salaries): ₹3 L/yr - ₹7.5 L/yr
  • Associate Technical Leader (61 salaries): ₹10 L/yr - ₹23.2 L/yr
  • Technical Lead (54 salaries): ₹12.4 L/yr - ₹25.5 L/yr

Compare OSI Digital with

  • TCS: 3.7
  • Infosys: 3.6
  • Wipro: 3.7
  • HCLTech: 3.5