I applied via Naukri.com and was interviewed in Oct 2023. There were 2 interview rounds.
The test cases for the word count program had to pass.
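The review does not say which language or input format the test used. As a rough sketch only, assuming plain Python and treating a word as a case-insensitive run of letters, digits, or apostrophes (both are assumptions, as is the function name `word_count`), a word count could look like this:

```python
import re
from collections import Counter


def word_count(text: str) -> Counter:
    """Count how often each word occurs in `text`, ignoring case."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


if __name__ == "__main__":
    counts = word_count("The quick fox jumps over the lazy dog. The end.")
    # Print the words from most to least frequent.
    for word, n in counts.most_common():
        print(word, n)
```

Lowercasing before matching keeps "The" and "the" in one bucket; a real test might define words differently, so the regular expression is the part most likely to need changing.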
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
posted on 25 Sep 2024
I applied via Walk-in and was interviewed in Aug 2024. There were 5 interview rounds.
Maths, grammar, and communication
Why do you like this job opportunity?
I applied via Campus Placement and was interviewed in May 2024. There were 3 interview rounds.
Basics of statistics, plus verbal ability
Data structures and algorithms: trees
I applied via LinkedIn and was interviewed in Jan 2024. There was 1 interview round.
PySpark is the Python API for Apache Spark, an open-source distributed computing system.
PySpark is used to process large datasets in parallel across a cluster of machines.
It provides high-level Python APIs for Spark programming.
PySpark interoperates with other Python libraries such as Pandas and NumPy.
Example: using PySpark for data analysis and machine learning on large datasets.
PySpark SQL is the Spark module that provides a SQL interface for working with structured data.
PySpark SQL lets users run SQL queries on Spark DataFrames.
It offers a more concise, higher-level way to work with data than Spark's lower-level RDD API.
Users can apply SQL for data manipulation and analysis within the Spark ecosystem.
To merge two DataFrames with different schemas, use join operations or data transformation techniques.
Use an inner, outer, left, or right join, depending on the requirement.
Transform the data to align the schemas before merging.
Tools such as Apache Spark, Pandas, or SQL can merge DataFrames with different schemas.
PySpark Streaming is a scalable, fault-tolerant stream processing engine built on top of Apache Spark.
PySpark Streaming processes streaming data in near real time.
It provides high-level Python APIs for building streaming applications.
PySpark Streaming supports data sources such as Kafka, Flume, and Kinesis.
It supports windowed computations and stateful processing over streaming data.
Example: C...
| Designation | Salaries reported | Salary range |
| --- | --- | --- |
| Senior Consultant | 689 | ₹11 L/yr - ₹38.9 L/yr |
| Application Developer | 654 | ₹6.8 L/yr - ₹22 L/yr |
| Lead Consultant | 244 | ₹18 L/yr - ₹65 L/yr |
| Consultant | 149 | ₹8 L/yr - ₹21.7 L/yr |
| Business Analyst | 90 | ₹8.4 L/yr - ₹20 L/yr |