Photon Interactive
I applied via Naukri.com and was interviewed in Jul 2024. There was 1 interview round.
Processing large datasets in PySpark relies on partitioning the data, caching reused results, and optimizing transformations.
Partitioning data to distribute workload evenly across nodes
Caching intermediate results to avoid recomputation
Optimizing transformations to minimize shuffling and reduce data movement
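The point about minimizing shuffling can be illustrated without a cluster. The sketch below is plain Python, not Spark code: it models partitions as lists and counts how many records would cross the network with and without pre-aggregating inside each partition (the idea behind map-side combining, which Spark's RDD `reduceByKey` performs and `groupByKey` does not). The function names and the sample data are hypothetical.

```python
from collections import Counter


def shuffle_without_combine(partitions):
    """Send every (word, 1) record to the reducer, then count."""
    shuffled = [(word, 1) for part in partitions for word in part]
    totals = Counter()
    for word, one in shuffled:
        totals[word] += one
    return totals, len(shuffled)


def shuffle_with_combine(partitions):
    """Pre-aggregate inside each partition, then send only partial counts."""
    shuffled = []
    for part in partitions:
        shuffled.extend(Counter(part).items())
    totals = Counter()
    for word, partial in shuffled:
        totals[word] += partial
    return totals, len(shuffled)


# Hypothetical input: two partitions of words.
partitions = [["a", "b", "a", "a"], ["b", "b", "c", "a"]]

naive_totals, naive_moved = shuffle_without_combine(partitions)
combined_totals, combined_moved = shuffle_with_combine(partitions)

print(naive_moved, combined_moved)  # 8 records moved vs 5
```

Both versions produce the same totals; only the number of records moved differs, which is why aggregations that combine locally first tend to be cheaper on large data.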
Data governance is implemented through policies, processes, and tools to ensure data quality, security, and compliance.
Establish data governance policies and procedures to define roles, responsibilities, and processes for managing data
Implement data quality controls to ensure accuracy, completeness, and consistency of data
Utilize data security measures such as encryption, access controls, and monitoring to protect sens...
Two data lineage tools are Apache Atlas and Informatica Enterprise Data Catalog.
Apache Atlas is an open source tool for metadata management and governance in Hadoop ecosystems.
Informatica Enterprise Data Catalog provides a comprehensive data discovery and metadata management solution.
posted on 3 Mar 2025
I was interviewed in Feb 2025.
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.
Spark performance problems can arise due to inefficient code, data skew, resource constraints, and improper configuration.
Inefficient code, such as calling collect() on a large dataset, pulls all the data to the driver and slows the job down.
Data skew can cause uneven distribution of data across partitions, impacting processing time.
Resource constraints like insufficient memory or CPU can result in slow Spark jobs.
Improper configuration settings, su...
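The data skew point can be made concrete with a small simulation. This is a plain-Python sketch, not Spark code: it assigns keys to partitions with a stable hash and shows how adding a salt spreads one hot key across partitions. In a real Spark job the salt is usually a random suffix appended to the key, followed by a second aggregation step; here the salt is folded into the partition index so the result is deterministic. `NUM_PARTITIONS`, the function names, and the sample keys are assumptions made for illustration.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_of(key, salt=0):
    """Stable hash partitioner; a non-zero salt shifts the target partition."""
    return (zlib.crc32(key.encode("utf-8")) + salt) % NUM_PARTITIONS


def partition_sizes(keys, salted):
    """Count how many records land in each partition."""
    sizes = Counter()
    for i, key in enumerate(keys):
        salt = i % NUM_PARTITIONS if salted else 0  # round-robin salt
        sizes[partition_of(key, salt)] += 1
    return sizes


# Hypothetical skewed input: one hot key plus a few rare keys.
keys = ["hot"] * 1000 + ["rare_a", "rare_b", "rare_c"] * 4

plain = partition_sizes(keys, salted=False)
salted = partition_sizes(keys, salted=True)

print("largest partition without salt:", max(plain.values()))
print("largest partition with salt:", max(salted.values()))
```

Without the salt, all 1000 records for the hot key sit in one partition, so one task does most of the work. With the salt they are split evenly, 250 per partition.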
I applied via Campus Placement
Based on SQL, statistics, Python, and cognitive skills.
Address a toxic work culture through open communication, setting boundaries, seeking support, and considering leaving if necessary.
Open communication with colleagues and management about issues
Set boundaries to protect your mental and emotional well-being
Seek support from HR, a mentor, or a therapist if needed
Consider leaving the toxic work environment if the situation does not improve
I applied via LinkedIn and was interviewed in Jul 2024. There were 2 interview rounds.
It was a pair programming round where we needed to attempt a couple of Spark scenarios along with the interviewer. You are given boilerplate code with some functionality left to fill in. You are assessed on writing clean, extensible code and test cases.
I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.
Incremental load in PySpark refers to loading only new or updated data into a dataset without reloading the entire dataset.
Use the Delta Lake format ('delta') for incremental loads, typically by merging (upserting) changed records into the target table; the 'mergeSchema' option only handles schema evolution and does not by itself make a load incremental.
Use the DataFrame writer's 'partitionBy' method to optimize incremental loads by partitioning the data on specific columns.
Implement a logic to identify new or updated records based on timestamps or uni...
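The timestamp-based logic described above can be sketched in plain Python. This is not Spark or Delta Lake code: a dict keyed by record id stands in for the target table, and the field names `id` and `updated_at`, the function name `incremental_load`, and the sample rows are all assumptions made for illustration. With Delta Lake the same idea would typically be expressed as a merge on the key column.

```python
from datetime import datetime


def incremental_load(target, source_rows, last_watermark):
    """Upsert rows changed after last_watermark; return the new watermark."""
    new_watermark = last_watermark
    for row in source_rows:
        if row["updated_at"] > last_watermark:
            # Insert a new key or overwrite the older version of the record.
            target[row["id"]] = row
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical target state after a previous load.
target = {1: {"id": 1, "name": "old", "updated_at": datetime(2024, 1, 1)}}

# Hypothetical source extract: one unchanged, one updated, one new record.
source_rows = [
    {"id": 1, "name": "old", "updated_at": datetime(2024, 1, 1)},
    {"id": 1, "name": "new", "updated_at": datetime(2024, 1, 3)},
    {"id": 2, "name": "added", "updated_at": datetime(2024, 1, 2)},
]

watermark = incremental_load(target, source_rows, datetime(2024, 1, 1))
print(sorted(target), watermark)
```

The returned watermark is stored and passed to the next run, so each run only touches records changed since the previous one. The sketch assumes that, within one batch, later rows for the same id carry later timestamps.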
I applied via Campus Placement and was interviewed in Aug 2024. There were 2 interview rounds.
Java and SQL questions
Simple Java programs to find a factorial and check for a prime number
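The interview asked for these in Java; the logic is the same in any language. A minimal sketch in Python, with the function names `factorial` and `is_prime` chosen here for illustration:

```python
def factorial(n):
    """Return n! for a non-negative integer n, computed iteratively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


print(factorial(5))  # 120
print(is_prime(97))  # True
```

Trial division only needs to test divisors up to the square root of n, because any larger factor would pair with a smaller one that has already been checked.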
| Role | Salaries reported | Salary range |
| --- | --- | --- |
| Senior Software Engineer | 973 | ₹6 L/yr - ₹24 L/yr |
| Software Engineer | 489 | ₹3.5 L/yr - ₹13.9 L/yr |
| Technical Lead | 416 | ₹10.5 L/yr - ₹31 L/yr |
| Software Test Engineer | 136 | ₹2.7 L/yr - ₹11.4 L/yr |
| Project Manager | 100 | ₹8 L/yr - ₹24.5 L/yr |