Altimetrik
I applied via Naukri.com and was interviewed in Jul 2023. There were 9 interview rounds.
A HackerRank coding test was conducted.
Real-time issues analysis
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.
Spark performance problems can arise due to inefficient code, data skew, resource constraints, and improper configuration.
Inefficient code can lead to slow performance, such as using collect() on large datasets.
Data skew can cause uneven distribution of data across partitions, impacting processing time.
Resource constraints like insufficient memory or CPU can result in slow Spark jobs.
Improper configuration settings, such as executor memory or shuffle partition counts, can also degrade performance (a PySpark sketch of these points follows below).
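A minimal PySpark sketch of the points above; the dataset and column names are made up for illustration, and the exact configuration values would depend on the cluster.

    # A minimal sketch, assuming a local SparkSession; data and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("perf-demo").getOrCreate()
    df = spark.range(1_000_000).withColumn("key", F.col("id") % 10)

    # Avoid collect() on large datasets; aggregate on the cluster instead.
    counts = df.groupBy("key").count()

    # Mitigate skew by repartitioning on a well-distributed column,
    # or enable AQE, which handles skewed joins automatically in Spark 3+.
    spark.conf.set("spark.sql.adaptive.enabled", "true")
    balanced = df.repartition(200, "id")

    counts.show()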
I applied via Referral and was interviewed in Nov 2024. There were 2 interview rounds.
I was approached by the company and was interviewed in Jun 2024. There was 1 interview round.
Power Pivot is a data analysis tool in Excel that allows users to create powerful data models, perform calculations, and generate insights.
Power Pivot is an Excel add-in used for data analysis and modeling.
It allows users to import and manipulate large datasets from different sources.
Users can create relationships between tables, perform calculations, and create advanced data visualizations.
Power Pivot is commonly used...
Power Query is a data connection technology that enables you to discover, connect, combine, and refine data across a wide variety of sources.
Power Query is used to import, transform, and combine data from different sources for analysis.
It helps in cleaning and shaping data before loading it into Excel or Power BI.
Power Query can be used to automate data preparation tasks, saving time and effort.
It allows users to easil...
Power Pivot is used for data modeling and analysis, while Power Query is used for data transformation and cleaning.
Power Pivot is used for creating relationships between tables and performing calculations.
Power Query is used for importing, transforming, and cleaning data from various sources.
Power Pivot is more focused on data analysis and modeling, while Power Query is more focused on data preparation.
Both Power Pivot and Power Query are often used together: Power Query prepares the data and Power Pivot models and analyses it.
To retrieve data over 3 months in a dynamic dashboard, use a date range filter and ensure the data source is updated regularly.
Create a date range filter in the dashboard to select a time period of over 3 months
Ensure the data source is updated regularly to include the required data
Use SQL queries or data extraction tools to pull the necessary data for the dashboard
Consider automating the data retrieval process to ensure the dashboard always reflects the latest three months of data (a sketch follows below).
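A minimal Python sketch of the date-range filtering idea; the DataFrame and its 'order_date' and 'amount' columns are hypothetical, and in practice the same filter would usually be pushed into the dashboard's SQL query.

    # A minimal sketch using pandas; data and column names are made up for illustration.
    import pandas as pd
    from datetime import datetime, timedelta

    df = pd.DataFrame({
        "order_date": pd.date_range(end=datetime.today(), periods=200, freq="D"),
        "amount": range(200),
    })

    cutoff = datetime.today() - timedelta(days=90)   # roughly 3 months
    last_3_months = df[df["order_date"] >= cutoff]
    print(last_3_months["amount"].sum())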
A discrete chart is a type of chart that displays data points in a discrete manner, typically using bars or columns.
Discrete charts are used to represent categorical data, where each category is represented by a separate bar or column.
They are commonly used in market research, survey data analysis, and comparison of different categories.
Examples of discrete charts include bar charts, column charts, and stacked bar charts.
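A small matplotlib sketch of a discrete (categorical) chart; the categories and values are invented for illustration.

    # One bar per discrete category; categories and values are hypothetical.
    import matplotlib.pyplot as plt

    categories = ["North", "South", "East", "West"]
    sales = [120, 95, 150, 80]

    plt.bar(categories, sales)
    plt.xlabel("Region")
    plt.ylabel("Sales")
    plt.title("Sales by region (discrete chart)")
    plt.show()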
SQL join is used to combine rows from two or more tables based on a related column between them.
Types of SQL joins include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
INNER JOIN returns rows when there is at least one match in both tables.
LEFT JOIN returns all rows from the left table and the matched rows from the right table.
RIGHT JOIN returns all rows from the right table and the matched rows from the left table. FULL JOIN returns rows when there is a match in either table (see the sqlite3 sketch below).
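A minimal, runnable sketch of join behaviour using Python's built-in sqlite3; the tables and data are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE customers (id INTEGER, name TEXT);
        CREATE TABLE orders (customer_id INTEGER, total REAL);
        INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
        INSERT INTO orders VALUES (1, 250.0);
    """)

    # INNER JOIN: only customers that have at least one matching order.
    print(cur.execute("""
        SELECT c.name, o.total
        FROM customers c INNER JOIN orders o ON c.id = o.customer_id
    """).fetchall())

    # LEFT JOIN: every customer, with NULL totals where no order exists.
    print(cur.execute("""
        SELECT c.name, o.total
        FROM customers c LEFT JOIN orders o ON c.id = o.customer_id
    """).fetchall())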
Stored procedures are precompiled SQL queries stored in a database for reuse.
They can improve performance by reducing network traffic and increasing security
Stored procedures can be used to encapsulate business logic and complex queries
Examples include procedures for updating customer information or calculating sales totals
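A minimal sketch of creating and calling a stored procedure from Python. It assumes a MySQL server and the mysql-connector-python package; the connection details, the 'customers' table, and the procedure name are all hypothetical.

    import mysql.connector

    conn = mysql.connector.connect(host="localhost", user="app",
                                   password="secret", database="sales")
    cur = conn.cursor()

    # Define the procedure once; it is stored and precompiled in the database.
    cur.execute("""
        CREATE PROCEDURE update_customer_email(IN p_id INT, IN p_email VARCHAR(255))
        BEGIN
            UPDATE customers SET email = p_email WHERE customer_id = p_id;
        END
    """)

    # Reuse it from any client with a single call.
    cur.callproc("update_customer_email", (42, "new@example.com"))
    conn.commit()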
CTE stands for Common Table Expressions. It is a temporary result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement.
CTEs are defined using the WITH keyword in SQL.
They help improve readability and maintainability of complex queries.
CTEs can be recursive, allowing for hierarchical data querying.
Examples: recursive CTEs for querying organizational hierarchies, or CTEs for staging data transformations before the final query (see the sketch below).
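A small runnable sketch of a recursive CTE using sqlite3; the employee hierarchy is invented for illustration.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);
        INSERT INTO employees VALUES (1, 'CEO', NULL), (2, 'VP', 1), (3, 'Engineer', 2);
    """)

    # Recursive CTE walking the reporting chain from the CEO down.
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, level) AS (
            SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
            UNION ALL
            SELECT e.id, e.name, c.level + 1
            FROM employees e JOIN chain c ON e.manager_id = c.id
        )
        SELECT name, level FROM chain ORDER BY level
    """).fetchall()
    print(rows)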
Matrix in PowerBi is a visual representation of data in rows and columns, allowing for easy comparison and analysis.
Matrix displays data in a grid format with rows and columns
It allows for easy comparison of data across different categories
Users can drill down into the data to see more detailed information
Matrix can be used to create interactive reports and dashboards
Lambda function is an anonymous function in Python that can have any number of arguments, but can only have one expression.
Used for creating small, throwaway functions without a name
Commonly used with functions like map(), filter(), and reduce()
Can be used to define functions inline without the need to formally define a function using def keyword
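A few small illustrations of lambda with map(), filter(), and sorted():

    nums = [3, 1, 4, 1, 5, 9]

    squares = list(map(lambda x: x * x, nums))        # [9, 1, 16, 1, 25, 81]
    evens = list(filter(lambda x: x % 2 == 0, nums))  # [4]
    by_last_digit = sorted(nums, key=lambda x: x % 10)

    print(squares, evens, by_last_digit)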
To manipulate datasets in Python, steps include loading data, cleaning data, transforming data, and analyzing data using libraries like Pandas.
Load the dataset using Pandas library
Clean the data by handling missing values, removing duplicates, and correcting data types
Transform the data by applying functions, merging datasets, and creating new columns
Analyze the data by performing statistical analysis, visualizations, and aggregations (a pandas sketch follows below).
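A minimal pandas sketch of the load, clean, transform, analyze flow; the file name and column names ('amount', 'region') are hypothetical.

    import pandas as pd

    df = pd.read_csv("sales.csv")                                 # load
    df = df.drop_duplicates()                                     # clean: duplicates
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")   # clean: data types
    df = df.dropna(subset=["amount"])                             # clean: missing values
    df["amount_with_tax"] = df["amount"] * 1.18                   # transform: new column
    print(df.groupby("region")["amount"].sum())                   # analyze: aggregation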
DAX data types are used in Power BI and Excel to define the type of data stored in a column or measure.
DAX data types include Integer, Decimal Number, String, Boolean, Date, Time, DateTime, and Currency.
Data types are important for calculations and formatting in DAX formulas.
For example, using the correct data type for a column can ensure accurate calculations and visualizations.
Inner join returns only the matching rows between two tables, while left join returns all rows from the left table and the matching rows from the right table.
Inner join only includes rows that have matching values in both tables
Left join includes all rows from the left table, even if there are no matching rows in the right table
Example: Inner join - SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id
Example: Left join - SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id
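The same contrast shown with pandas merges; the DataFrames are invented for illustration.

    import pandas as pd

    left = pd.DataFrame({"id": [1, 2, 3], "name": ["A", "B", "C"]})
    right = pd.DataFrame({"id": [1, 3], "score": [10, 30]})

    inner = left.merge(right, on="id", how="inner")      # rows 1 and 3 only
    left_join = left.merge(right, on="id", how="left")   # all of left; NaN score for id 2
    print(inner, left_join, sep="\n")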
posted on 25 Sep 2024
I applied via Walk-in and was interviewed in Aug 2024. There were 5 interview rounds.
Aptitude round covering maths, grammar, and communication.
Why do you like this job opportunity?
posted on 29 Dec 2024
I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.
Incremental load in pyspark refers to loading only new or updated data into a dataset without reloading the entire dataset.
Use the Delta Lake format in PySpark to perform incremental loads, for example with a MERGE (upsert) operation, or the 'mergeSchema' option when appending data with new columns.
Utilize the 'partitionBy' function to optimize incremental loads by partitioning the data based on specific columns.
Implement logic to identify new or updated records based on timestamps or unique keys (see the Delta Lake sketch below).
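A minimal sketch of an incremental (upsert) load with Delta Lake on PySpark; it assumes the delta-spark package is installed, and the paths, table layout, and 'order_id' key are hypothetical.

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    new_data = spark.read.parquet("/landing/orders/latest")   # only the new slice of data

    target = DeltaTable.forPath(spark, "/warehouse/orders")
    (target.alias("t")
           .merge(new_data.alias("s"), "t.order_id = s.order_id")
           .whenMatchedUpdateAll()      # updated records overwrite existing rows
           .whenNotMatchedInsertAll()   # brand-new records are inserted
           .execute())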
I applied via LinkedIn and was interviewed in Jan 2024. There was 1 interview round.
Pyspark is a Python API for Apache Spark, a powerful open-source distributed computing system.
Pyspark is used for processing large datasets in parallel across a cluster of computers.
It provides high-level APIs in Python for Spark programming.
Pyspark allows seamless integration with other Python libraries like Pandas and NumPy.
Example: Using Pyspark to perform data analysis and machine learning tasks on big data sets.
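A minimal PySpark sketch, assuming pyspark is installed; the data is invented for illustration.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("intro").getOrCreate()
    df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])
    df.groupBy("key").agg(F.sum("value").alias("total")).show()

    # Interop with pandas for small results (requires pandas installed).
    pdf = df.toPandas()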
Pyspark SQL is a module in Apache Spark that provides a SQL interface for working with structured data.
Pyspark SQL allows users to run SQL queries on Spark dataframes.
It provides a more concise and user-friendly way to interact with data compared to traditional Spark RDDs.
Users can leverage the power of SQL for data manipulation and analysis within the Spark ecosystem.
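A minimal sketch of Pyspark SQL: register a DataFrame as a temporary view and query it with SQL (the data is invented for illustration).

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("Asha", 30), ("Ravi", 25)], ["name", "age"])
    df.createOrReplaceTempView("people")

    spark.sql("SELECT name FROM people WHERE age > 26").show()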
To merge 2 dataframes of different schema, use join operations or data transformation techniques.
Use join operations like inner join, outer join, left join, or right join based on the requirement.
Perform data transformation to align the schemas before merging.
Use tools like Apache Spark, Pandas, or SQL to merge dataframes with different schemas.
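A minimal PySpark sketch of both approaches; the columns and key are hypothetical. unionByName with allowMissingColumns requires Spark 3.1+.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df1 = spark.createDataFrame([(1, "A")], ["id", "name"])
    df2 = spark.createDataFrame([(2, "IN")], ["id", "country"])

    # Union by column name; missing columns are filled with nulls.
    combined = df1.unionByName(df2, allowMissingColumns=True)
    combined.show()

    # Alternatively, join on a shared key when the rows describe the same entities.
    joined = df1.join(df2, on="id", how="outer")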
Pyspark streaming is a scalable and fault-tolerant stream processing engine built on top of Apache Spark.
Pyspark streaming allows for real-time processing of streaming data.
It provides high-level APIs in Python for creating streaming applications.
Pyspark streaming supports various data sources like Kafka, Flume, Kinesis, etc.
It enables windowed computations and stateful processing for handling streaming data.
Example: consuming events from Kafka and computing windowed aggregates in near real time (a minimal sketch using the built-in rate source follows below).
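A minimal Structured Streaming sketch using the built-in 'rate' source so it runs without external systems; in practice the source would typically be Kafka or Kinesis, which need extra packages.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("stream-demo").getOrCreate()

    # The rate source emits (timestamp, value) rows continuously.
    stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

    # Windowed count over the event timestamp, updated as new data arrives.
    counts = stream.groupBy(F.window("timestamp", "10 seconds")).count()

    query = (counts.writeStream
                   .outputMode("complete")
                   .format("console")
                   .start())
    query.awaitTermination(30)   # run briefly for the demo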
Altimetrik salaries by role:
Role | Salaries reported | Salary range
Senior Software Engineer | 1.2k | ₹9 L/yr - ₹34 L/yr
Staff Engineer | 807 | ₹10.9 L/yr - ₹41 L/yr
Senior Engineer | 611 | ₹9 L/yr - ₹30 L/yr
Software Engineer | 306 | ₹4.8 L/yr - ₹19 L/yr
Senior Staff Engineer | 216 | ₹15 L/yr - ₹43 L/yr