Publicis Sapient
I applied via Naukri.com and was interviewed in Sep 2024. There was 1 interview round.
There were 3 questions.
I was approached by the company and was interviewed in Jun 2023. There were 3 interview rounds.
Spark internal flow involves job submission, DAG creation, task scheduling, and execution.
Job submission: User submits a Spark job to the SparkContext.
DAG creation: SparkContext creates a Directed Acyclic Graph (DAG) of the job.
Task scheduling: DAGScheduler breaks the DAG into stages and tasks, which are scheduled by TaskScheduler.
Task execution: Executors execute the tasks and return results to the driver.
Result fetching: the driver collects the final results from the executors.
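The stage-splitting step above can be sketched as a toy model: a new stage begins at every shuffle (wide) dependency, which mirrors what DAGScheduler does internally. This is an illustration only, not Spark's actual scheduler code.

```python
# Toy model of Spark's DAG-to-stages split: a wide (shuffle) transformation
# ends the current stage, so the chain is cut at each shuffle boundary.
def split_into_stages(transformations):
    """Group a linear chain of (name, is_wide) transformations into stages."""
    stages = [[]]
    for name, is_wide in transformations:
        stages[-1].append(name)
        if is_wide:
            stages.append([])  # shuffle boundary -> start a new stage
    if not stages[-1]:
        stages.pop()  # drop a trailing empty stage
    return stages

job = [("map", False), ("filter", False),
       ("reduceByKey", True), ("map", False)]
print(split_into_stages(job))
# [['map', 'filter', 'reduceByKey'], ['map']]
```

Real scheduling is more involved (stages form a graph, not a chain, and tasks are created per partition), but the shuffle-boundary rule is the core idea.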
The Hive metastore stores metadata about Hive tables, partitions, columns, and storage locations.
The Hive metastore is a central repository that stores metadata information about Hive tables, partitions, columns, and storage locations.
It stores this metadata in a relational database such as MySQL, Derby, or PostgreSQL.
The metadata includes information such as table names, column names, data types, file formats, and storage locations.
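Since the metastore is backed by an ordinary relational database, its contents can be pictured as plain tables. The sqlite sketch below only imitates that idea; the real metastore schema (tables such as TBLS and COLUMNS_V2, names varying by Hive version) is more elaborate, and the table and column names here are simplified stand-ins.

```python
import sqlite3

# Simplified imitation of metastore-style tables: one row per Hive table,
# one row per column, both living in a relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbls (tbl_name TEXT, location TEXT, file_format TEXT)")
conn.execute("CREATE TABLE columns (tbl_name TEXT, col_name TEXT, type_name TEXT)")
conn.execute("INSERT INTO tbls VALUES ('sales', 'hdfs:///warehouse/sales', 'parquet')")
conn.executemany("INSERT INTO columns VALUES ('sales', ?, ?)",
                 [("order_id", "bigint"), ("amount", "double")])

# Roughly what a DESCRIBE of the table resolves against:
rows = conn.execute(
    "SELECT c.col_name, c.type_name, t.location "
    "FROM columns c JOIN tbls t ON c.tbl_name = t.tbl_name "
    "WHERE t.tbl_name = 'sales'").fetchall()
print(rows)
```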
SQL query for simple tables
Use a SELECT statement to retrieve data
Specify the columns you want to select
Use the FROM clause to specify the tables you are querying
Add a WHERE clause to filter the results if needed
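The steps above come together in a minimal end-to-end query, run here against an in-memory sqlite table (the table and data are made up for illustration):

```python
import sqlite3

# SELECT chosen columns, FROM a table, WHERE a filter condition holds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("Asha", "data", 90),
    ("Ravi", "data", 70),
    ("Meena", "hr", 60),
])
rows = conn.execute(
    "SELECT name, salary FROM employees WHERE dept = 'data'").fetchall()
print(rows)  # [('Asha', 90), ('Ravi', 70)]
```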
posted on 23 Dec 2024
I applied via Naukri.com and was interviewed in Jun 2024. There were 3 interview rounds.
Sample data and its transformations
Sample data can be in the form of CSV, JSON, or database tables
Transformations include cleaning, filtering, aggregating, and joining data
Examples: converting date formats, removing duplicates, calculating averages
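The transformations listed above can be shown on a tiny made-up CSV sample using only the standard library: parse, filter out bad rows, remove duplicates, then aggregate an average.

```python
import csv
import io
from statistics import mean

# Sample data with a duplicate row (order 2) and a missing amount (order 3).
raw = """order_id,amount
1,100
2,200
2,200
3,
4,300
"""
rows = list(csv.DictReader(io.StringIO(raw)))
cleaned = [r for r in rows if r["amount"]]               # filter: drop missing values
deduped = {r["order_id"]: r for r in cleaned}.values()   # clean: drop duplicate ids
avg = mean(int(r["amount"]) for r in deduped)            # aggregate: average amount
print(avg)  # 200
```

In practice the same cleaning/filtering/aggregating steps would run in pandas or Spark, but the shape of the pipeline is the same.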
Seeking new challenges and opportunities for growth in a more dynamic environment.
Looking for new challenges and opportunities for growth
Seeking a more dynamic work environment
Interested in expanding skill set and knowledge
Want to work on more innovative projects
posted on 30 May 2024
I applied via LinkedIn and was interviewed in Apr 2024. There was 1 interview round.
Spark architecture is based on a master-slave architecture with a cluster manager to coordinate tasks.
Spark architecture consists of a driver program that communicates with a cluster manager to coordinate tasks.
The cluster manager allocates resources and schedules tasks on worker nodes.
Worker nodes execute the tasks and return results to the driver program.
Spark supports various cluster managers like YARN, Mesos, and its own standalone mode.
There will be 4 stages created in total for the spark job.
Wide transformations trigger a shuffle and create a new stage.
Narrow transformations do not trigger a shuffle and do not create a new stage.
In this case, 3 wide transformations will create 3 new stages and 2 narrow transformations will not create new stages.
Therefore, a total of 4 stages will be created.
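The counting rule above can be checked in one line: in a linear job, every wide (shuffle) transformation closes a stage, so the stage count is the number of wide transformations plus one. Narrow transformations are pipelined into the stage they fall in.

```python
# 3 wide + 2 narrow transformations, as in the question.
transformations = ["wide", "narrow", "wide", "narrow", "wide"]
stages = sum(1 for t in transformations if t == "wide") + 1
print(stages)  # 4
```

This simple formula assumes a linear lineage; branching DAGs and actions can change the exact stage graph.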
SQL statements are executed in a specific order to ensure accurate results.
SQL statements are executed in the following order: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY.
The FROM clause specifies the tables involved in the query.
The WHERE clause filters the rows based on specified conditions.
The GROUP BY clause groups the rows based on specified columns.
The HAVING clause filters the groups based on specified conditions.
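The logical order above explains why HAVING can filter on an aggregate while WHERE cannot: WHERE runs before the rows are grouped. A small sqlite example (with made-up data) shows both filters in one query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (dept TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("data", 100), ("data", 300), ("hr", 50), ("hr", 20), ("ops", 500),
])
rows = conn.execute("""
    SELECT dept, SUM(amount) AS total
    FROM orders
    WHERE amount > 30           -- row filter: runs before grouping
    GROUP BY dept
    HAVING SUM(amount) > 150    -- group filter: runs after aggregation
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('ops', 500), ('data', 400)]
```

The hr row with amount 20 is removed by WHERE before grouping, and the remaining hr group (total 50) is then removed by HAVING.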
Identifying root cause of slow executor compared to others
Check resource utilization of the slow executor (CPU, memory, disk)
Look for any specific tasks or stages that are taking longer on the slow executor
Check for network latency or communication issues affecting the slow executor
Monitor garbage collection and JVM metrics for potential bottlenecks
Consider data skew or unbalanced data distribution causing slow performance
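One of the checks above, data skew, can be spotted from per-partition record counts (for example, as read off the Spark UI). A minimal sketch: flag partitions whose count is far above the average. The 2x threshold is an arbitrary assumption, not a Spark default.

```python
from statistics import mean

def skewed_partitions(counts, factor=2.0):
    """Return indices of partitions whose record count exceeds factor * average."""
    avg = mean(counts)
    return [i for i, c in enumerate(counts) if c > factor * avg]

counts = [1000, 1100, 950, 9000, 1050]  # partition 3 is the hot one
print(skewed_partitions(counts))  # [3]
```

A hot partition like this points at a skewed key; common fixes are salting the key or using adaptive query execution.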
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
Enhanced optimization in AWS Glue improves job performance by automatically adjusting resources based on workload
Enhanced optimization in AWS Glue automatically adjusts resources like DPUs based on workload
It helps improve job performance by optimizing resource allocation
Users can enable enhanced optimization in AWS Glue job settings
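As a sketch of the job-settings point above: Glue's auto scaling (Glue 3.0+) is switched on through a job argument. The argument name `--enable-auto-scaling` reflects my understanding of the Glue docs and should be verified for your Glue version; no API call is made here, the dict just mirrors the shape of a job definition.

```python
# Sketch of Glue job settings with auto scaling enabled.
# NumberOfWorkers acts as an upper bound; Glue scales resources
# down below it based on workload.
glue_job_settings = {
    "GlueVersion": "4.0",
    "WorkerType": "G.1X",
    "NumberOfWorkers": 10,
    "DefaultArguments": {
        "--enable-auto-scaling": "true",  # assumption: flag name per AWS Glue docs
    },
}
print(glue_job_settings["DefaultArguments"]["--enable-auto-scaling"])
```

In practice this dict would be passed to something like boto3's `create_job`, or set in the Glue console's job settings.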
Optimizing querying in Amazon Redshift involves proper table design, distribution keys, sort keys, and query optimization techniques.
Use appropriate distribution keys to evenly distribute data across nodes for parallel processing.
Utilize sort keys to physically order data on disk, reducing the need for sorting during queries.
Avoid using SELECT * and instead specify only the columns needed to reduce data transfer.
Use AN...
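The table-design advice above can be sketched as Redshift DDL. It is kept as a string here because it only runs on a Redshift cluster; the table and column names are invented for illustration. A join key as DISTKEY spreads rows evenly across slices for parallel joins, and a SORTKEY on a common filter column lets the planner skip disk blocks.

```python
# Illustrative Redshift DDL (not executable outside Redshift):
# DISTKEY on the join column, SORTKEY on the usual filter column.
ddl = """
CREATE TABLE sales (
    order_id    BIGINT,
    customer_id BIGINT,
    sold_at     DATE,
    amount      DECIMAL(12,2)
)
DISTKEY (customer_id)
SORTKEY (sold_at);
"""
print("DISTKEY" in ddl and "SORTKEY" in ddl)  # True
```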
posted on 4 Aug 2024
I am a Senior Data Engineer with 5+ years of experience in designing and implementing data pipelines for large-scale projects.
Experienced in ETL processes and data warehousing
Proficient in programming languages like Python, SQL, and Java
Skilled in working with big data technologies such as Hadoop, Spark, and Kafka
Strong understanding of data modeling and database management
Excellent problem-solving and communication skills
Developing a real-time data processing system for analyzing customer behavior on e-commerce platform.
Utilizing Apache Kafka for real-time data streaming
Implementing Spark for data processing and analysis
Creating machine learning models for customer segmentation
Integrating with Elasticsearch for data indexing and search functionality
I appeared for an interview in Dec 2024.
Senior Associate: 2.2k salaries, ₹11.1 L/yr to ₹40 L/yr
Associate Technology L2: 1.5k salaries, ₹6.5 L/yr to ₹20 L/yr
Senior Associate Technology L1: 1.2k salaries, ₹10.3 L/yr to ₹32 L/yr
Senior Software Engineer: 788 salaries, ₹10 L/yr to ₹38 L/yr
Senior Associate 2: 635 salaries, ₹14.1 L/yr to ₹42 L/yr