TransOrg Analytics
20+ Aspiro Pharma Interview Questions and Answers
Q1. Assume we have a pan-India retail store with two customer tables in the backend: a customer profile table and a customer transaction table, linked by customer ID. So what will the ...
Use a left join as a computationally efficient way to find customer names from the customer profile and transaction tables.
Use a left join to combine the customer profile and transaction tables on customer ID.
A left join includes all customers from the profile table, even those without transactions.
A correlated subquery may be less efficient, as it may be evaluated once for each row of the outer query.
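The original question is truncated, so the following is only a minimal sketch of the left-join approach described above. The table names, column names, and sample rows are hypothetical, and SQLite (via Python's standard sqlite3 module) stands in for whatever database the store would actually use.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_profile (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_transaction (
        txn_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer_profile VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
    INSERT INTO customer_transaction VALUES
        (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# LEFT JOIN keeps every profile row, including customers with no transactions.
left_join_rows = conn.execute("""
    SELECT p.name,
           COUNT(t.txn_id) AS txn_count,
           COALESCE(SUM(t.amount), 0) AS total
    FROM customer_profile AS p
    LEFT JOIN customer_transaction AS t ON t.customer_id = p.customer_id
    GROUP BY p.customer_id, p.name
    ORDER BY p.customer_id
""").fetchall()

print(left_join_rows)
```

Meera has no transactions but still appears in the result, with a count and total of zero.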
Q2. What is the difference between transformations and actions in PySpark? Give examples.
Transformations in PySpark are lazily evaluated, while actions trigger execution of the transformations.
Transformations are operations that are not executed immediately but create a plan for execution.
Actions are operations that trigger the execution of transformations and return results.
Examples of transformations include map, filter, and reduceByKey.
Examples of actions include collect, count, and saveAsTextFile.
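Lazy evaluation can be illustrated without a Spark cluster. The sketch below is an analogy in plain Python, not PySpark API: Python's built-in map and filter return lazy iterators (playing the role of transformations), and list() forces them to run (playing the role of an action). The names double and calls are made up for the example.

```python
calls = []  # records each time the mapping function actually runs


def double(x):
    calls.append(x)
    return x * 2


numbers = [1, 2, 3, 4]

# "Transformations": these only build a lazy pipeline; nothing is computed yet.
doubled = map(double, numbers)
large = filter(lambda v: v > 4, doubled)
calls_before_action = len(calls)

# "Action": consuming the pipeline forces the work to happen.
result = list(large)
calls_after_action = len(calls)

print(calls_before_action, result, calls_after_action)
```

In PySpark the same split applies to RDD and DataFrame operations: the plan is built by transformations and executed only when an action such as collect or count is called.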
Q3. What is a Common Table Expression (CTE)? How is a CTE different from a stored procedure?
A CTE is a temporary named result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. Unlike a stored procedure, it exists only for the duration of the query that defines it.
CTE stands for Common Table Expression and is defined using the WITH keyword.
CTEs are mainly used for recursive queries, complex joins, and simplifying complex queries.
CTEs are not stored in the database like stored procedures; they exist only for the duration of the query execu...
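A small sketch of a CTE defined with WITH, run through Python's sqlite3 module. The orders table and its rows are hypothetical. The second query shows that the CTE name is not available once the defining statement has finished.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'Asha', 120.0), (2, 'Ravi', 80.0), (3, 'Asha', 60.0);
""")

# customer_totals exists only while this one statement runs.
cte_rows = conn.execute("""
    WITH customer_totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM customer_totals
    WHERE total > 100
    ORDER BY customer
""").fetchall()

# Referencing the CTE from a later statement fails: it was never stored.
try:
    conn.execute("SELECT * FROM customer_totals")
    cte_persisted = True
except sqlite3.OperationalError:
    cte_persisted = False

print(cte_rows, cte_persisted)
```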
Q4. What if you have to find the second-highest transacting member in each city?
Use SQL query with window function to rank members by transaction amount in each city.
Use SQL query with PARTITION BY clause to group members by city
Use ORDER BY clause to rank members by transaction amount
Select the second highest member for each city
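The steps above could be sketched as follows. This assumes "highest transacting" means the largest total transaction amount (it could equally mean transaction count), uses a hypothetical transactions table, and needs an SQLite build with window-function support (3.25 or later). DENSE_RANK is one reasonable choice: it returns every member tied for second place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (member TEXT, city TEXT, amount REAL);
    INSERT INTO transactions VALUES
        ('Asha', 'Delhi', 500), ('Ravi', 'Delhi', 300), ('Asha', 'Delhi', 100),
        ('Meera', 'Delhi', 200),
        ('Kiran', 'Pune', 400), ('Leela', 'Pune', 450);
""")

second_highest = conn.execute("""
    WITH totals AS (
        -- total amount per member within each city
        SELECT city, member, SUM(amount) AS total
        FROM transactions
        GROUP BY city, member
    ),
    ranked AS (
        -- rank members inside each city, highest total first
        SELECT city, member, total,
               DENSE_RANK() OVER (PARTITION BY city ORDER BY total DESC) AS rnk
        FROM totals
    )
    SELECT city, member, total
    FROM ranked
    WHERE rnk = 2
    ORDER BY city
""").fetchall()

print(second_highest)
```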
Q5. What is normalization in SQL? Explain 1NF, 2NF, and 3NF.
Normalization in SQL is the process of organizing data in a database to reduce redundancy and improve data integrity.
1NF (First Normal Form) - Each column in a table must contain atomic values, and there should be no repeating groups.
2NF (Second Normal Form) - Table should be in 1NF and all non-key attributes should be fully functionally dependent on the primary key.
3NF (Third Normal Form) - Table should be in 2NF and there should be no transitive dependencies between non-key attribu...
Q6. Design a business case for using a self join. Condition: do not use a hierarchical use case such as teacher-student, employee-manager, or father-grandfather.
Using self join to analyze customer behavior in an e-commerce platform.
Identifying patterns in customer purchase history
Analyzing customer preferences based on past purchases
Segmenting customers based on their buying behavior
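One non-hierarchical way to make this concrete is a "bought together" analysis: joining an order-items table to itself to count how often two products appear in the same order. This is an illustrative interpretation of the purchase-pattern idea above, not the answer's stated design. The order_items table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (order_id INTEGER, product TEXT);
    INSERT INTO order_items VALUES
        (1, 'bread'), (1, 'butter'), (1, 'jam'),
        (2, 'bread'), (2, 'butter'),
        (3, 'bread'), (3, 'jam');
""")

# Self join: pair each item with the other items in the same order.
# a.product < b.product keeps each pair once and skips self-pairs.
pair_rows = conn.execute("""
    SELECT a.product, b.product, COUNT(*) AS orders_together
    FROM order_items AS a
    JOIN order_items AS b
      ON a.order_id = b.order_id AND a.product < b.product
    GROUP BY a.product, b.product
    ORDER BY orders_together DESC, a.product, b.product
""").fetchall()

print(pair_rows)
```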
Q7. Have you worked on Lambda functions? Explain.
AWS Lambda is a serverless computing service that runs code (Lambda functions) in response to events and automatically manages the computing resources required.
Lambda functions are event-driven and can be triggered by various AWS services such as S3, DynamoDB, API Gateway, etc.
They are written in languages like Python, Node.js, Java, etc.
Lambda functions are scalable and cost-effective as you only pay for the compute time you consume.
They can be used for data processing, real-time file pro...
Q8. What is the difference between ALTER and UPDATE?
Alter is used to modify the structure of a table, while update is used to modify the data in a table.
Alter is used to add, remove, or modify columns in a table.
Update is used to change the values of existing records in a table.
Alter can change the structure of a table, such as adding a new column or changing the data type of a column.
Update is used to modify the data in a table, such as changing the value of a specific column in a specific row.
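A short sketch of the distinction, using Python's sqlite3 module with a hypothetical employees table: ALTER TABLE changes the table's structure, while UPDATE changes values in existing rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employees VALUES (1, 'Asha'), (2, 'Ravi')")

# ALTER: structural change, adds a new column to the table definition.
conn.execute("ALTER TABLE employees ADD COLUMN city TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]

# UPDATE: data change, sets a value in one existing row.
conn.execute("UPDATE employees SET city = 'Pune' WHERE id = 1")
rows = conn.execute("SELECT id, name, city FROM employees ORDER BY id").fetchall()

print(columns)
print(rows)
```

The row that was not updated keeps NULL (None in Python) in the newly added column.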
Q9. What is a list comprehension?
List comprehension is a concise way to create lists in Python by applying an expression to each item in an iterable.
Syntax: [expression for item in iterable]
Can include conditionals: [expression for item in iterable if condition]
Example: squares = [x**2 for x in range(10)]
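As a runnable illustration of both forms, here is the example above alongside a conditional variant and the explicit loop it replaces. The name even_squares is chosen for the example.

```python
# Plain form: one expression per item.
squares = [x**2 for x in range(10)]

# Conditional form: keep only items that pass the test.
even_squares = [x**2 for x in range(10) if x % 2 == 0]

# The equivalent explicit loop, for comparison.
loop_result = []
for x in range(10):
    if x % 2 == 0:
        loop_result.append(x**2)

print(squares)
print(even_squares)
```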
Q10. What is a generator function?
A generator function is a function that can pause and resume its execution, allowing it to yield multiple values over time.
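In JavaScript, generator functions are defined using the 'function*' syntax; in Python, any function whose body contains 'yield' is a generator function.
They use the 'yield' keyword to return values one at a time.
Generators can be iterated over using a 'for...of' loop in JavaScript or a 'for' loop in Python.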
They are useful for generating sequences of values lazily, improving memory efficiency.
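The answer above uses JavaScript terminology. Since the other code questions here are in Python, this is a minimal Python sketch of the same idea, where a function containing yield pauses after each value and resumes on the next request. The countdown function is a made-up example.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, pausing after each value."""
    while n > 0:
        yield n  # execution pauses here until the next value is requested
        n -= 1


gen = countdown(3)
first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # resumes and drains the remaining values

print(first, rest)
```

Because values are produced on demand, a generator never needs to hold the full sequence in memory.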
Q11. What are broadcast variables?
Broadcast Variables are read-only shared variables that are cached on each machine in a cluster for efficient data distribution.
Broadcast Variables are used to efficiently distribute large read-only datasets to all nodes in a Spark cluster.
They are useful for tasks like joining a small lookup table with a large dataset.
Broadcast variables are cached in memory on each machine to avoid unnecessary data shuffling during computation.
Q12. What is the difference between map and flatMap?
Map applies a function to each element in a collection and returns a new collection. Flatmap applies a function that returns a collection to each element and flattens the result.
Map transforms each element in a collection using a function and returns a new collection.
Flatmap applies a function that returns a collection to each element and flattens the result into a single collection.
Map does not flatten nested collections, while flatmap does.
Example: Map - [1, 2, 3].map(x => ...
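The source example is cut off, so the sketch below is a separate illustration in plain Python rather than a completion of it. Python has no built-in flatMap, so itertools.chain.from_iterable is used here to stand in for the flattening step. The sample sentences are made up.

```python
from itertools import chain

sentences = ["hello world", "map vs flatmap"]

# map: one output element per input element, so the result stays nested.
mapped = list(map(str.split, sentences))

# flatMap equivalent: apply the same function, then flatten one level.
flat_mapped = list(chain.from_iterable(map(str.split, sentences)))

print(mapped)
print(flat_mapped)
```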
Q13. What have you done in the AI/ML field?
I have worked on developing machine learning models for predicting customer churn in a telecom company.
Developed machine learning models using Python and scikit-learn
Preprocessed and cleaned large datasets to improve model performance
Implemented various algorithms such as Random Forest and Gradient Boosting
Evaluated model performance using metrics like accuracy, precision, and recall
Q14. What do you know about SQL?
SQL is a programming language used for managing and manipulating databases.
SQL stands for Structured Query Language.
It is used to communicate with databases to perform tasks like querying data, updating data, and creating tables.
Common SQL commands include SELECT, INSERT, UPDATE, DELETE.
SQL is used in various database management systems like MySQL, PostgreSQL, and Oracle.
Knowledge of SQL is essential for data analysis and database management roles.
Q15. Write Python code to create a pandas Series.
Creating a pandas Series in Python.
Import the pandas library: import pandas as pd
Create a Series from a list: data = ['a', 'b', 'c']; series = pd.Series(data)
Create a Series with a custom index from a dict, whose keys become the index: data = {'a': 1, 'b': 2, 'c': 3}; series = pd.Series(data)
Q16. What is normalisation, and what are its types?
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.
Normalization involves breaking down a table into smaller tables and defining relationships between them.
There are different levels of normalization, including first normal form (1NF), second normal form (2NF), and third normal form (3NF).
1NF involves eliminating repeating groups of columns and creating a primary key.
2NF involves ensuring that all non-key attributes are depende...
Q17. What are clustering algorithms?
Clustering algorithms are unsupervised machine learning techniques used to group similar data points together.
Clustering algorithms are used to identify patterns in data by grouping similar data points together.
They are unsupervised machine learning techniques, meaning they do not require labeled data.
Common clustering algorithms include k-means, hierarchical clustering, and DBSCAN.
Clustering can be used for customer segmentation, anomaly detection, and image segmentation, am...
Q18. Write Python code to sort the elements.
Python code to sort the elements of a list of strings.
Use the built-in list.sort() method to sort the list in place in ascending order.
Pass reverse=True to sort in descending order.
Use the key parameter to sort by a specific attribute.
Use the sorted() function to return a new sorted list without modifying the original.
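The points above can be sketched in a few lines. The sample list is made up, and sorting by length is just one possible use of the key parameter.

```python
names = ["pear", "fig", "banana", "kiwi"]

# sorted() returns a new list and leaves the original untouched.
ascending = sorted(names)
descending = sorted(names, reverse=True)

# key= sorts by a derived value; ties keep their original order (stable sort).
by_length = sorted(names, key=len)

# list.sort() sorts in place and returns None.
in_place = list(names)
in_place.sort()

print(ascending)
print(descending)
print(by_length)
```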
Q19. Difference between SQL and MySQL
SQL is a language used to manage relational databases while MySQL is a relational database management system.
SQL is a standard language for managing relational databases while MySQL is an open-source RDBMS.
SQL can be used with different RDBMSs such as Oracle and Microsoft SQL Server, while MySQL is one specific RDBMS that implements SQL.
SQL is used to create, modify, and query datab...
Q20. Explain DSA and state its uses
DSA stands for Data Structures and Algorithms. It is used to store and manipulate data efficiently.
DSA is essential for solving complex problems efficiently in software development.
It helps in organizing and managing data effectively.
Common data structures include arrays, linked lists, stacks, queues, trees, graphs, etc.
Examples of DSA usage include searching algorithms like binary search, sorting algorithms like quicksort, and data structures like hash tables.
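As one concrete instance of the examples listed above, here is a minimal binary search over a sorted list, alongside a Python dict used as a hash table. The sample data and the name stock are made up for illustration.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1


ids = [3, 8, 15, 23, 42]
found_at = binary_search(ids, 23)
missing = binary_search(ids, 10)

# A dict is Python's built-in hash table: average O(1) lookup by key.
stock = {"pen": 12, "notebook": 4}
pen_count = stock["pen"]

print(found_at, missing, pen_count)
```

Binary search halves the search range on every step, so it needs O(log n) comparisons, but only works on data that is already sorted.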