I applied via Recruitment Consultant and was interviewed before Sep 2022. There were 3 interview rounds.
A test was to be completed as per the advised guidelines.
I have worked on various data analysis applications including financial forecasting, customer segmentation, and marketing campaign optimization.
Financial forecasting using time series analysis
Customer segmentation based on demographic and behavioral data
Marketing campaign optimization through A/B testing and predictive modeling
I applied via Company Website and was interviewed in Jun 2024. There was 1 interview round.
Data migration is the process of transferring data from one system to another.
Data migration involves transferring data from one storage system to another.
It can also involve moving data from one format to another.
Data migration is often necessary when upgrading systems or consolidating data.
Examples include migrating data from an old CRM system to a new one, or moving data from on-premises servers to the cloud.
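The migration described above can be sketched as an extract, transform, load, and validate sequence. This is a minimal illustration using two in-memory SQLite databases; the table names (`legacy_customers`, `customers`), the column names, and the sample rows are all hypothetical and not taken from any real system.

```python
import sqlite3


def migrate_customers(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Copy customers from a legacy schema into a new one; return the migrated row count."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, full_name TEXT, email TEXT)"
    )
    # Extract: read every row from the legacy table.
    rows = source.execute(
        "SELECT cust_id, fname, lname, email FROM legacy_customers"
    ).fetchall()
    # Transform: merge the name columns and lower-case the e-mail address.
    transformed = [(cid, f"{fn} {ln}", email.lower()) for cid, fn, ln, email in rows]
    # Load: the connection context manager commits on success and rolls back on error.
    with target:
        target.executemany("INSERT INTO customers VALUES (?, ?, ?)", transformed)
    # Validate: source and target row counts must match.
    migrated = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    if migrated != len(rows):
        raise RuntimeError(f"expected {len(rows)} rows, found {migrated}")
    return migrated


# Hypothetical legacy data, for illustration only.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE legacy_customers (cust_id INTEGER, fname TEXT, lname TEXT, email TEXT)"
)
source.executemany(
    "INSERT INTO legacy_customers VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Rao", "Asha.Rao@Example.com"), (2, "Ben", "Lee", "ben@example.com")],
)
target = sqlite3.connect(":memory:")
migrated_count = migrate_customers(source, target)
```

A real migration would usually add batching, logging, and a rollback plan; the row-count check here is only the simplest form of validation.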
Agile methodologies within data migration involve iterative and incremental processes to efficiently move data from one system to another.
Scrum: Breaks the data migration work into short, time-boxed iterations called sprints, with regular meetings to track progress.
Kanban: Visualizes the data migration workflow on a board, allowing for continuous delivery of data.
Lean: Focuses on minimizing waste and maximizing value during t...
I applied via Walk-in and was interviewed in May 2024. There were 2 interview rounds.
It was good. Candidates who pass the test move on to the interview; those who do not are not selected for the next round.
I cleared my Aptitude test.
I was interviewed in Aug 2021.
Big Data refers to large volumes of data that cannot be processed using traditional methods.
Big Data involves processing and analyzing large volumes of data
It includes structured, unstructured, and semi-structured data
Examples include social media data, sensor data, and financial transactions
I applied via Walk-in and was interviewed in Dec 2024. There were 5 interview rounds.
The task covered statistics: standard deviation, attrition, and the average of values in a given table. It also included an economy graph and a poverty graph, and answers had to be based on them. There were 30 questions and a 60-minute time limit.
I was interviewed in Dec 2024.
Work done across all assignments
I was interviewed in Dec 2024.
I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.
Optimizing SQL queries involves using indexes, avoiding unnecessary joins, and optimizing the query structure.
Use indexes on columns frequently used in WHERE clauses
Avoid using SELECT * and only retrieve necessary columns
Optimize joins by using INNER JOIN instead of OUTER JOIN when possible
Use EXPLAIN to analyze query performance and make necessary adjustments
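The effect of an index on a WHERE clause can be checked with the query planner. The sketch below uses SQLite, where the relevant command is `EXPLAIN QUERY PLAN`; other databases use `EXPLAIN` with different output. The `employees` table, its columns, and the generated rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
    [(f"emp{i}", f"dept{i % 10}", 1000.0 + i) for i in range(1000)],
)

# Only the needed columns are selected, rather than SELECT *.
QUERY = "SELECT name, salary FROM employees WHERE dept = ?"


def query_plan(connection, sql, params):
    """Return the planner's description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY, ("dept3",))  # full table scan
conn.execute("CREATE INDEX idx_employees_dept ON employees (dept)")
plan_after = query_plan(conn, QUERY, ("dept3",))   # index lookup on dept

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, so it is better to look for the scan versus index distinction than to rely on a specific string.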
Performance optimization in Spark involves tuning configurations, optimizing code, and utilizing caching.
Tune Spark configurations such as executor memory, number of executors, and shuffle partitions.
Optimize code by reducing unnecessary shuffles, using efficient transformations, and avoiding unnecessary data movements.
Utilize caching to store intermediate results in memory and avoid recomputation.
Example: In my projec...
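The tuning and caching points above might look like this in PySpark. This is a sketch, not a recommended configuration: the values in `SPARK_TUNING`, the function names, the Parquet path argument, and the `order_date` and `amount` columns are all hypothetical, and suitable settings depend on cluster size and data volume. The `pyspark` imports sit inside the functions so the configuration can be inspected without a Spark installation.

```python
# Hypothetical starting values; tune them against the actual workload.
SPARK_TUNING = {
    "spark.executor.memory": "4g",
    "spark.executor.instances": "4",
    "spark.sql.shuffle.partitions": "64",
}


def build_session(app_name, conf):
    """Create (or reuse) a SparkSession with the given configuration."""
    from pyspark.sql import SparkSession  # requires pyspark to be installed

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def order_stats(spark, path):
    """Compute two aggregates from one cached DataFrame to avoid re-reading the source."""
    from pyspark.sql import functions as F

    # Select only the needed columns, then cache because the data is used twice.
    orders = spark.read.parquet(path).select("order_date", "amount").cache()
    totals = orders.groupBy("order_date").agg(F.sum("amount").alias("total"))
    averages = orders.groupBy("order_date").agg(F.avg("amount").alias("average"))
    return totals, averages
```

Caching only pays off when a DataFrame is reused, as it is here; caching data that is read once just consumes executor memory.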
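SparkContext is the entry point for Spark's low-level RDD API, while SparkSession (introduced in Spark 2.0) is the unified entry point for the DataFrame, Dataset, and Spark SQL APIs.
SparkContext is the entry point for low-level API functionality in Spark.
SparkSession is the entry point for Spark SQL and DataFrame functionality, and it wraps a SparkContext that remains accessible from the session.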
SparkContext is used to create RDDs (Resilient Distributed Datasets) in Spark.
SparkSession provides a unified entry point for reading data from various sources and performing
When a Spark job is submitted, several steps are executed in the backend to process the job.
The job is submitted to the Spark driver program.
The driver program communicates with the cluster manager to request resources.
The cluster manager allocates resources (CPU, memory) to the job.
The driver program creates a DAG (Directed Acyclic Graph) of the job's stages and tasks.
Tasks are then scheduled and executed on worker nodes ...
Calculate the second highest salary using SQL and PySpark
Use a SQL query with ORDER BY salary DESC and LIMIT 1 OFFSET 1 on the distinct salaries to get the second highest salary
In PySpark, use distinct(), orderBy() in descending order, and take(2) to achieve the same result
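Both approaches can be sketched as follows. The SQL version runs against an in-memory SQLite table, and the PySpark version is a function that expects a DataFrame with a `salary` column. The table name, column names, and sample salaries are hypothetical, and the `pyspark` import is kept inside the function so the SQL part runs without Spark installed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("a", 50000), ("b", 70000), ("c", 70000), ("d", 60000)],
)

# DISTINCT keeps the duplicated top salary from being counted as second highest.
SECOND_HIGHEST_SQL = (
    "SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 1"
)
row = conn.execute(SECOND_HIGHEST_SQL).fetchone()
second_highest = row[0] if row else None  # None when fewer than two distinct salaries


def second_highest_pyspark(df):
    """Same logic for a PySpark DataFrame that has a 'salary' column."""
    from pyspark.sql import functions as F  # requires pyspark to be installed

    rows = df.select("salary").distinct().orderBy(F.col("salary").desc()).take(2)
    return rows[1]["salary"] if len(rows) > 1 else None
```

A window function such as DENSE_RANK is a common alternative when the second highest salary is needed per department rather than overall.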
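Spark applications are submitted in one of two deploy modes: client mode and cluster mode. Standalone is a cluster manager, not a deploy mode, and local mode (a single JVM on one machine) is a separate option suited to development and testing.
Client mode: The driver runs on the machine that submits the application, which suits interactive work and debugging.
Cluster mode: The driver runs inside the cluster, which is managed by a cluster manager such as Spark standalone, YARN, or Mesos; this is the usual choice for production workloads.
Client mode can give lower latency for interactive use, provided the submitting machine is close to the cluster.
In client mode the driver runs on the submitting machine and communicates directly with the executors, so results are returned straight to the user.
In cluster mode the driver runs on a node inside the cluster, which keeps driver-to-executor network latency low even when the job is submitted from a remote machine.
Client mode is preferred for interactive sessions such as spark-shell and notebooks; cluster mode is preferred when the submitting machine is far from the worker nodes.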
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
I am a Senior Data Engineer with experience in building scalable data pipelines and optimizing data processing workflows.
Experience in designing and implementing ETL processes using tools like Apache Spark and Airflow
Proficient in working with large datasets and optimizing query performance
Strong background in data modeling and database design
Worked on projects involving real-time data processing and streaming analytics
Decorators in Python are functions that modify the behavior of other functions or methods.
Decorators are defined using the @decorator_name syntax before a function definition.
They can be used to add functionality to existing functions without modifying their code.
Decorators can be used for logging, timing, authentication, and more.
Example: @staticmethod decorator in Python is used to define a static method in a class.
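A user-defined decorator for the timing use case mentioned above might look like this. It is a minimal sketch: the names `timed`, `total`, and the `last_elapsed` attribute are illustrative choices, not part of any library.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call took."""

    @functools.wraps(func)  # keeps the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None  # no call has been made yet
    return wrapper


@timed  # equivalent to: total = timed(total)
def total(values):
    return sum(values)


result = total([1, 2, 3])
print(result, total.last_elapsed)
```

The function body of `total` is unchanged; the timing behavior is added entirely by the wrapper.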
SQL query to group by employee ID and combine first name and last name with a space
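Use the GROUP BY clause to group by employee ID; in most databases the name columns must also appear in GROUP BY (or inside an aggregate) before they can be selected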
Use the CONCAT function to combine first name and last name with a space
Select employee ID, CONCAT(first_name, ' ', last_name) AS full_name
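The query can be tried out against an in-memory SQLite table, as sketched below. SQLite's `||` operator stands in for `CONCAT` here; in MySQL, PostgreSQL, or SQL Server the expression would be `CONCAT(first_name, ' ', last_name)`. The table, columns, and sample rows are hypothetical, and the duplicate row is included only to show what GROUP BY collapses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Asha", "Rao"), (1, "Asha", "Rao"), (2, "Ben", "Lee")],
)

# The name columns are listed in GROUP BY so they can be used in the SELECT list.
FULL_NAME_SQL = """
    SELECT employee_id, first_name || ' ' || last_name AS full_name
    FROM employees
    GROUP BY employee_id, first_name, last_name
    ORDER BY employee_id
"""
full_names = conn.execute(FULL_NAME_SQL).fetchall()
print(full_names)
```

If each employee already has exactly one row, the GROUP BY clause is unnecessary and the SELECT with the concatenation is enough.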
Constructors in Python are special methods used for initializing objects. They are called automatically when a new instance of a class is created.
Constructors are defined using the __init__() method in a class.
They are used to initialize instance variables of a class.
Example:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person1 = Person('Alice', 30)
Indexing in SQL is a technique used to improve the performance of queries by creating a data structure that allows for faster retrieval of data.
Indexes are created on columns in a database table to speed up the retrieval of rows that match a certain condition in a WHERE clause.
Indexes can be created using CREATE INDEX statement in SQL.
Types of indexes include clustered indexes, non-clustered indexes, unique indexes, an...
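Creating a plain index and a unique index can be sketched with SQLite as below. The `users` table, its columns, and the index names are hypothetical. SQLite has no CLUSTERED or NONCLUSTERED keywords (those belong to SQL Server syntax), so only the ordinary and unique variants are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")

# Ordinary index: speeds up lookups such as WHERE city = ?.
conn.execute("CREATE INDEX idx_users_city ON users (city)")
# Unique index: speeds up lookups and also rejects duplicate values.
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")

conn.execute("INSERT INTO users (email, city) VALUES (?, ?)", ("a@example.com", "Pune"))
try:
    conn.execute("INSERT INTO users (email, city) VALUES (?, ?)", ("a@example.com", "Delhi"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the unique index blocked the second row

# PRAGMA index_list reports the indexes on a table; column 1 is the index name.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('users')"))
print(duplicate_rejected, index_names)
```

Indexes speed up reads at the cost of extra work on every insert and update, so they are usually added only to columns that queries actually filter or join on.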
Spark works well with Parquet files due to its columnar storage format, efficient compression, and ability to push down filters.
Parquet is a columnar storage format, so Spark can read only the columns a query needs (column pruning) instead of scanning entire rows.
Parquet files support efficient compression, reducing storage space and improving read performance in Spark.
Spark can push down filters to Parquet files, allowing f...