Cognizant
I applied via Naukri.com and was interviewed in Apr 2023. There were 3 interview rounds.
Speculative execution in Hadoop is a feature that allows the framework to launch duplicate tasks for a job, with the goal of completing the job faster.
Speculative execution is used when a task is taking longer to complete than expected.
Hadoop identifies slow-running tasks and launches duplicate tasks on other nodes.
The first task to complete is used, while the others are killed to avoid duplication of results.
This help...
Lists are mutable ordered collections, tuples are immutable ordered collections, and sets are mutable unordered collections.
Lists are mutable and ordered, allowing for duplicate elements. Example: [1, 2, 3, 3]
Tuples are immutable and ordered, allowing for duplicate elements. Example: (1, 2, 3, 3)
Sets are mutable and unordered, not allowing for duplicate elements. Example: {1, 2, 3}
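The three properties above (mutability, ordering, duplicates) can be checked directly. A minimal sketch; the variable names are only illustrative:

```python
# Lists: mutable, ordered, duplicates allowed
numbers_list = [1, 2, 3, 3]
numbers_list.append(4)            # in-place modification works

# Tuples: immutable, ordered, duplicates allowed
numbers_tuple = (1, 2, 3, 3)
try:
    numbers_tuple[0] = 99         # item assignment is not supported
except TypeError as exc:
    tuple_error = str(exc)

# Sets: mutable, unordered, duplicates are dropped
numbers_set = {1, 2, 3, 3}
numbers_set.add(4)

print(numbers_list)               # [1, 2, 3, 3, 4]
print(numbers_tuple)              # (1, 2, 3, 3)
print(sorted(numbers_set))        # [1, 2, 3, 4]
```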
RANK and DENSE_RANK both give tied rows the same rank; RANK then skips the ranks the ties used up, while DENSE_RANK continues with the next consecutive rank.
RANK assigns the same rank to rows with equal values in the ORDER BY columns
DENSE_RANK also assigns the same rank to tied rows
RANK leaves gaps in the ranking sequence after ties (1, 1, 3), while DENSE_RANK does not (1, 1, 2)
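The difference only shows up when there are ties. A minimal sketch using Python's built-in sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first version with window functions); the `scores` table and its rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80), ("d", 70)],  # hypothetical data with one tie
)

rows = conn.execute(
    """
    SELECT name, score,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()

for row in rows:
    print(row)
# ('a', 90, 1, 1)
# ('b', 90, 1, 1)
# ('c', 80, 3, 2)   <- RANK skips 2, DENSE_RANK does not
# ('d', 70, 4, 3)
```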
To add data to a partitioned Hive table, you can use the INSERT INTO statement with the PARTITION clause.
Use INSERT INTO statement to add data into the table.
Specify the partition column values using the PARTITION clause.
Example: INSERT INTO table_name PARTITION (partition_column=value) VALUES (data);
Write a program (WAP) to add the elements of two lists index-wise. Given A = [1, 2, 3] and B = [4, 5, 7], C should be [5, 7, 10].
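One possible answer, sketched with `zip`. The function name `add_index_wise` and the equal-length check are assumptions; the question does not say how lists of different lengths should be handled.

```python
def add_index_wise(a, b):
    """Return a new list whose i-th element is a[i] + b[i]."""
    if len(a) != len(b):
        # Assumption: treat mismatched lengths as an error rather than truncating
        raise ValueError("lists must have the same length")
    return [x + y for x, y in zip(a, b)]


A = [1, 2, 3]
B = [4, 5, 7]
C = add_index_wise(A, B)
print(C)  # [5, 7, 10]
```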
I applied via Naukri.com and was interviewed in Aug 2024. There were 2 interview rounds.
Different types of joins in SQL with examples
Inner Join: Returns rows when there is a match in both tables
Left Join: Returns all rows from the left table and the matched rows from the right table, with NULLs where there is no match
Right Join: Returns all rows from the right table and the matched rows from the left table, with NULLs where there is no match
Full Outer Join: Returns all rows from both tables, with NULLs on the side that has no match
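The four join types can be illustrated without a database. This is only a conceptual sketch in plain Python, not how a SQL engine implements joins: the sample rows are hypothetical, keys are assumed to be unique in each table, and `None` stands in for NULL.

```python
# Hypothetical sample data: (id, name) and (id, city)
left = [(1, "Asha"), (2, "Ravi"), (3, "Meena")]
right = [(2, "Pune"), (3, "Chennai"), (4, "Delhi")]


def join(left_rows, right_rows, how):
    """Join two lists of (key, value) rows on key; assumes unique keys per table."""
    left_map = dict(left_rows)
    right_map = dict(right_rows)
    if how == "inner":
        keys = [k for k in left_map if k in right_map]
    elif how == "left":
        keys = list(left_map)
    elif how == "right":
        keys = list(right_map)
    elif how == "full":
        keys = list(left_map) + [k for k in right_map if k not in left_map]
    else:
        raise ValueError(f"unknown join type: {how}")
    # .get() returns None for a missing side, mirroring NULL in SQL
    return [(k, left_map.get(k), right_map.get(k)) for k in keys]


print(join(left, right, "inner"))  # [(2, 'Ravi', 'Pune'), (3, 'Meena', 'Chennai')]
print(join(left, right, "left"))   # [(1, 'Asha', None), (2, 'Ravi', 'Pune'), (3, 'Meena', 'Chennai')]
print(join(left, right, "right"))  # [(2, 'Ravi', 'Pune'), (3, 'Meena', 'Chennai'), (4, None, 'Delhi')]
print(join(left, right, "full"))   # [(1, 'Asha', None), (2, 'Ravi', 'Pune'), (3, 'Meena', 'Chennai'), (4, None, 'Delhi')]
```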
Large Spark datasets can be handled by partitioning, caching, optimizing transformations, and tuning resources.
Partitioning data to distribute workload evenly across nodes
Caching frequently accessed data to avoid recomputation
Optimizing transformations to reduce unnecessary processing
Tuning resources like memory allocation and parallelism for optimal performance
Spark configuration settings can be tuned to optimize query performance by adjusting parameters like memory allocation, parallelism, and caching.
Increase executor memory and cores to allow for more parallel processing
Adjust shuffle partitions to optimize data shuffling during joins and aggregations
Enable dynamic allocation to scale resources based on workload demands
Utilize caching to store intermediate results and avo...
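The settings mentioned above correspond to standard Spark properties. The sketch below only builds the `--conf` arguments for `spark-submit`; it does not start Spark. The values are placeholders, not recommendations, and the helper `to_submit_args` is hypothetical.

```python
# Illustrative values only; the right numbers depend on the cluster and workload.
spark_conf = {
    "spark.executor.memory": "8g",              # memory per executor
    "spark.executor.cores": "4",                # concurrent tasks per executor
    "spark.sql.shuffle.partitions": "400",      # partitions used by joins/aggregations
    "spark.dynamicAllocation.enabled": "true",  # scale executors with the workload
}


def to_submit_args(conf):
    """Render a config dict as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(spark_conf)))
```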
To handle data skew and partition imbalance in Spark, strategies include using salting, bucketing, repartitioning, and optimizing join operations.
Use salting to evenly distribute skewed keys across partitions
Implement bucketing to pre-partition data based on a specific column
Repartition data based on a specific key to balance partitions
Optimize join operations by broadcasting small tables or using partitioning strategi
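To see why salting helps, here is a small plain-Python simulation of hash partitioning. It is a simplification, not Spark code: in Spark the salt is usually a random suffix appended to the key, whereas this sketch uses a rotating integer salt added to the partition index so the result is deterministic. The data and the partition count are made up.

```python
import zlib
from collections import Counter
from itertools import count

NUM_PARTITIONS = 8


def base_partition(key):
    # crc32 is stable across runs, unlike the built-in hash() on strings
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


# Hypothetical skewed data: one hot key dominates
keys = ["hot"] * 1000 + ["a"] * 10 + ["b"] * 10 + ["c"] * 10 + ["d"] * 10

# Without salting: every record for a key lands in the same partition
unsalted = Counter(base_partition(k) for k in keys)

# With salting: a rotating per-key salt spreads its records over all partitions
salt_counters = {}
salted = Counter()
for k in keys:
    salt = next(salt_counters.setdefault(k, count())) % NUM_PARTITIONS
    salted[(base_partition(k) + salt) % NUM_PARTITIONS] += 1

print("largest partition without salting:", max(unsalted.values()))
print("largest partition with salting:   ", max(salted.values()))
```

In a real join, the other side has to be expanded with every salt value so that the salted keys still match.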
I applied via Company Website and was interviewed in Sep 2024. There was 1 interview round.
Spark optimization techniques involve partitioning, caching, and tuning resource allocation.
Partitioning data to distribute workload evenly
Caching frequently accessed data to avoid recomputation
Tuning resource allocation for optimal performance
I applied via Company Website and was interviewed in Apr 2024. There were 3 interview rounds.
I want to join Infosys because of its reputation for innovation and growth opportunities.
Infosys is known for its cutting-edge technology solutions and innovative projects.
I am impressed by Infosys' commitment to employee development and career growth.
I believe that joining Infosys will provide me with the opportunity to work on challenging projects and enhance my skills.
I applied via Company Website and was interviewed in Apr 2024. There were 3 interview rounds.
Questions on software and system designs
Helps the employer identify personality traits such as leadership, confidence, and interpersonal and teamwork skills in potential employees
I applied via Job Portal and was interviewed in Sep 2023. There were 3 interview rounds.
The round was easy; just prepare well.
I use a combination of programming languages, tools, and frameworks to analyze and process large datasets.
Utilize programming languages like Python, Java, or Scala for data processing
Leverage tools like Hadoop, Spark, or Kafka for distributed computing
Implement frameworks like MapReduce or Apache Flink for data analysis
Use SQL or NoSQL databases for data storage and retrieval
Implemented a real-time data processing system using Apache Kafka and Spark for analyzing customer behavior.
Developed data pipelines to ingest, process, and analyze large volumes of data
Utilized Apache Kafka for real-time data streaming
Implemented machine learning algorithms for predictive analytics
Optimized data storage and retrieval for faster query performance
Coalesce reduces the number of partitions in a DataFrame without a full shuffle, while repartition can either increase or decrease the number of partitions at the cost of a full shuffle.
Coalesce is a narrow transformation that can only decrease the number of partitions; it merges existing partitions instead of shuffling data.
Repartition is a wide transformation that can either increase or decrease the number of partitions and involves shuffling data across the cluster.
Coalesce is more efficient for reducing ...
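The contrast can be modelled with plain Python lists standing in for partitions. This is a conceptual sketch only, not Spark's implementation: the two functions are hypothetical stand-ins that mimic the behaviour described above.

```python
def coalesce(partitions, n):
    """Merge whole partitions into n groups; no partition is ever split."""
    if n >= len(partitions):
        return [list(p) for p in partitions]  # coalesce cannot increase the count
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i % n].extend(part)
    return merged


def repartition(partitions, n):
    """Redistribute every record round-robin across n new partitions (a full shuffle)."""
    out = [[] for _ in range(n)]
    records = [r for part in partitions for r in part]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


parts = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]
print(coalesce(parts, 2))          # [[1, 2, 3, 4, 5, 6, 8], [7, 9, 10]]
print(repartition(parts, 2))       # [[1, 3, 5, 7, 9], [2, 4, 6, 8, 10]]
print(len(coalesce(parts, 6)))     # 4
print(len(repartition(parts, 6)))  # 6
```

Note that the coalesced partitions can stay uneven because records are never moved individually, while the repartitioned ones come out balanced.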
Questions on RANK vs DENSE_RANK and CTEs
Python data structures
I applied via Naukri.com and was interviewed in Jun 2022. There were 4 interview rounds.
Internal tables are managed by Hive, while external tables are managed by the user.
Internal tables are stored in a Hive-managed warehouse directory, while external tables can be stored anywhere.
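Dropping an internal table deletes both its metadata and its data, while dropping an external table removes only the metadata and leaves the data files in place.
External tables are commonly used to query data that already lives outside the Hive warehouse, such as CSV or JSON files produced by other tools.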
Internal tables are typically used for temporary or ...
I applied via Approached by Company and was interviewed before Oct 2022. There were 2 interview rounds.
I applied via Recruitment Consultant and was interviewed in Jul 2021. There were 3 interview rounds.
Designation | Salaries reported | Salary range
Associate | 72.5k | ₹5.1 L/yr - ₹15.9 L/yr
Programmer Analyst | 54k | ₹2.4 L/yr - ₹8 L/yr
Senior Associate | 48.2k | ₹8.9 L/yr - ₹27 L/yr
Senior Processing Executive | 28.4k | ₹1.8 L/yr - ₹9 L/yr
Technical Lead | 17.5k | ₹5.8 L/yr - ₹24 L/yr