How would you handle 100 files of 100 GB each in PySpark? Design an end-to-end pipeline.
AnswerBot
6mo
I would use PySpark to process the 100 files of 100 GB each (about 10 TB in total) in an end-to-end pipeline.
Use PySpark to distribute processing across a cluster of machines
Read files in parallel using SparkContext and SparkSess...
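The points above could be sketched roughly as follows. This is a minimal illustration, not a definitive design: the paths, the `id` column, the CSV input format, and the 128 MiB partition target are all assumptions that do not come from the question. The helper first estimates how many read partitions the input would produce (about 80,000 if each file is 100 GiB and each partition is 128 MiB), and the pipeline function then reads all files through one glob, applies a simple filter, and writes Parquet.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_partitions(num_files, file_size_bytes, target_partition_bytes=128 * MIB):
    """Rough number of read partitions if each one holds about target_partition_bytes."""
    total_bytes = num_files * file_size_bytes
    return math.ceil(total_bytes / target_partition_bytes)


def run_pipeline(input_glob, output_path):
    """Read -> filter -> write. Paths and the 'id' column are hypothetical."""
    # Imported here so the sizing helper above works without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (
        SparkSession.builder
        .appName("hundred-large-files")
        # Cap each read partition at ~128 MiB so large files are split
        # into many tasks that run in parallel across the cluster.
        .config("spark.sql.files.maxPartitionBytes", str(128 * MIB))
        .getOrCreate()
    )

    # One glob lets Spark list all files and read them in parallel.
    # Assumes splittable input (e.g. uncompressed CSV); a gzip file
    # cannot be split and would be read by a single task.
    raw = spark.read.option("header", "true").csv(input_glob)

    # Narrow transformations only (no shuffle): drop rows without an id
    # and tag each row with the load date.
    cleaned = (
        raw.filter(F.col("id").isNotNull())
        .withColumn("ingest_date", F.current_date())
    )

    # Write a columnar format, partitioned by load date, for later queries.
    (
        cleaned.write
        .mode("overwrite")
        .partitionBy("ingest_date")
        .parquet(output_path)
    )

    spark.stop()


if __name__ == "__main__":
    # Sizing only; run_pipeline needs a real cluster and real paths.
    print(estimate_partitions(100, 100 * GIB))
```

The partition estimate is useful for sizing the cluster: with tens of thousands of read tasks, the number of executor cores largely determines how many waves of tasks the read stage needs.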