20+ Calfus Interview Questions and Answers
Q1. Discuss the scope of SAM: what is the standard process, and how do you effectively manage it?
SAM scope covers software inventory, license compliance, and usage tracking. The standard process involves discovery, normalization, and optimization. Effective management requires policies, tools, and regular audits.
Scope of SAM includes software inventory, license compliance, and usage tracking.
Standard process involves discovery of all software assets, normalization of data, and optimization of licenses.
Effective management of SAM requires implementing policies, utilizing tools for tracking, and conducting regular audits.
Q2. What is a REST API? If you had to integrate different REST endpoints, how would you do it?
I would use a middleware to handle the integration of different REST APIs.
Identify the endpoints of the different REST APIs
Create a middleware that can handle requests and responses from the different APIs
Map the endpoints of the different APIs to the middleware
Use the middleware to handle requests and responses from the different APIs
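The mapping step above can be sketched as a minimal dispatcher. The endpoint paths and handler functions here are hypothetical stand-ins for real REST calls:

```python
# Minimal middleware sketch: route incoming paths to different backend REST
# handlers through one mapping. Endpoint names and handlers are hypothetical.

def billing_api(payload):
    # Stand-in for a call to a billing service's REST endpoint
    return {"service": "billing", "echo": payload}

def inventory_api(payload):
    # Stand-in for a call to an inventory service's REST endpoint
    return {"service": "inventory", "echo": payload}

# Map middleware endpoints to the backing APIs
ROUTES = {
    "/billing": billing_api,
    "/inventory": inventory_api,
}

def middleware(path, payload):
    # Look up the handler for the requested endpoint and delegate to it
    handler = ROUTES.get(path)
    if handler is None:
        return {"error": "unknown endpoint", "path": path}
    return handler(payload)
```

In a real integration the handlers would issue HTTP requests and normalize the responses, but the routing idea is the same.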
Q3. How do you convert unstructured data to structured data?
Unstructured data can be converted to structured data by using techniques such as data mining, natural language processing, and machine learning.
Identify the relevant data points and attributes
Use data mining techniques to extract patterns and relationships
Apply natural language processing to extract meaning from text
Use machine learning algorithms to classify and categorize data
Transform the data into a structured format such as a database or spreadsheet
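The steps above can be sketched with a simple extraction pass; the input text, field names, and regex are illustrative assumptions, not a general-purpose parser:

```python
import re

# Sketch: extract structured records (vendor, amount) from free-form text.
# The input format and field names are illustrative assumptions.
text = """
Invoice from Acme Corp for $1200.
Invoice from Globex for $800.
"""

pattern = re.compile(r"Invoice from (?P<vendor>[\w ]+?) for \$(?P<amount>\d+)")

# Each match becomes one structured row, ready for a database or spreadsheet
rows = [
    {"vendor": m.group("vendor"), "amount": int(m.group("amount"))}
    for m in pattern.finditer(text)
]
```

Real pipelines would add NLP or ML classification on top, but the end state is the same: rows with named, typed attributes.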
Q4. What is a pick rule? How it is different from putaway rule?
A pick rule determines how items are picked from a warehouse location, while a putaway rule determines where items are stored in the warehouse.
Pick rule dictates the order in which items are picked from a location for an order fulfillment process.
Putaway rule determines where items are stored in the warehouse after being received.
Pick rule focuses on optimizing the picking process for efficiency and accuracy.
Putaway rule focuses on optimizing storage space and organization within the warehouse.
Q5. What are microservices, and where have you implemented them?
Microservices is an architectural approach to building applications as a collection of small, independent services.
Microservices break down a large application into smaller, independent services that can be developed, deployed, and scaled independently.
Each microservice is responsible for a specific business capability and communicates with other microservices through APIs.
Microservices architecture promotes agility, scalability, and resilience.
Examples of companies that use microservices include Netflix and Amazon.
Q6. Write a query for finding third highest salary of the employee
Query to find third highest salary of employee
Use ORDER BY with LIMIT and OFFSET (or a subquery) to select the third highest salary
Join the employee table with salary table to get the salary information
Exclude duplicate salaries using DISTINCT keyword
Q7. What are the types of label printing modes?
Types of label printing modes include direct thermal, thermal transfer, and inkjet.
Direct thermal printing uses heat-sensitive paper to produce images.
Thermal transfer printing uses a ribbon to transfer ink onto the label.
Inkjet printing uses liquid ink to create images on the label.
Q8. Write a lambda expression to find even numbers and arrange them in ascending order using streams
Lambda expression to find and sort even numbers using streams
Use filter() method to find even numbers
Use sorted() method to sort the even numbers in ascending order
Use collect() method to collect the sorted even numbers into a list
Lambda expression: list.stream().filter(n -> n % 2 == 0).sorted().collect(Collectors.toList())
Q9. What is Spark incremental data load?
Spark incremental data load is the process of loading only new or updated data into an existing dataset.
It helps in reducing the processing time and resources required for loading the entire dataset.
It involves identifying the changes in the source data and only loading those changes into the target dataset.
Spark provides various APIs and tools to perform incremental data loading such as Delta Lake and Structured Streaming.
Example: Loading only the new sales data into a sales dataset instead of reloading the full history.
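The watermark idea behind incremental loading can be sketched in plain Python (in Spark itself this would typically be handled by Structured Streaming or Delta Lake, as noted above); the row shapes are illustrative:

```python
# Sketch of incremental load: keep a watermark (last loaded timestamp) and
# append only source rows newer than it to the target dataset.

source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
    {"id": 3, "updated_at": 30},
]
target = [{"id": 1, "updated_at": 10}]   # rows already loaded

# Watermark = newest timestamp already present in the target
watermark = max(row["updated_at"] for row in target)

# Load only the changes, then advance the watermark
new_rows = [row for row in source if row["updated_at"] > watermark]
target.extend(new_rows)
new_watermark = max(row["updated_at"] for row in target)
```

Only two of the three source rows are processed, which is the whole point: work scales with the change set, not the full dataset.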
Q10. How you handle SAM process end to end?
I handle SAM process end to end by ensuring proper software asset management from procurement to retirement.
Developing and implementing SAM policies and procedures
Regularly auditing software licenses and usage
Managing software procurement and vendor relationships
Ensuring compliance with software license agreements
Retiring software licenses and assets properly
Training staff on SAM best practices
Q11. What is the Tungsten project in Spark?
Tungsten is a Spark project that re-engineers the execution engine for memory and CPU efficiency.
Tungsten optimizes Spark's execution engine through explicit (off-heap) memory management and binary data processing.
It improves performance by reducing garbage collection overhead and CPU usage.
Tungsten also includes whole-stage code generation, which compiles query plans into efficient JVM bytecode. (Catalyst, by contrast, is Spark SQL's query optimizer, not a Tungsten storage format.)
Example: Tungsten's binary row format (UnsafeRow) lets Spark operate on serialized data directly, avoiding per-row Java object creation.
Q12. Tell me about Veeva migration activities
Veeva migration activities involve transferring data and configurations from one Veeva instance to another.
Veeva migration typically includes migrating data such as accounts, contacts, activities, and configurations like custom fields, page layouts, and workflows.
Data mapping is crucial in Veeva migration to ensure that data is accurately transferred between systems.
Validation and testing are important steps in Veeva migration to ensure data integrity and system functionality after the migration.
Q13. How to read file name in spark
To read file name in Spark, use the 'input_file_name' function.
Use 'input_file_name' function in Spark to read file name.
This function returns the name of the file being processed.
Example: df.select(input_file_name().alias('filename')).show()
This will show the file name in a new column named 'filename'.
Q14. What are the spark configurations
Spark configurations are settings that determine how Spark applications run on a cluster.
Spark configurations can be set using command-line arguments, properties files, or programmatically in code.
Configurations can control various aspects of Spark, such as memory usage, parallelism, and logging.
Examples of Spark configurations include spark.executor.memory, spark.driver.cores, and spark.eventLog.enabled.
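A spark-defaults.conf fragment using the properties named above (the values are illustrative):

```
spark.executor.memory   4g
spark.driver.cores      2
spark.eventLog.enabled  true
```

The same settings can also be passed as `--conf key=value` pairs on spark-submit or set programmatically on a SparkConf.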
Q15. What is broadcast join
Broadcast join is a type of join operation in distributed computing where a small table is broadcasted to all nodes for joining with a larger table.
Broadcast join is used in distributed computing to optimize join operations.
It involves broadcasting a small table to all nodes in a cluster.
The small table is then joined with a larger table on each node.
This reduces the amount of data that needs to be shuffled across the network.
Broadcast join is useful when the small table can fit in memory on each node.
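The mechanics can be sketched in plain Python: the small table becomes an in-memory lookup map (the "broadcast" copy) and the large table is joined row by row without any shuffle. The table contents are illustrative; in Spark this would be done with the `broadcast()` hint from `pyspark.sql.functions`:

```python
# Concept sketch: the small table is the broadcasted lookup map; each node
# joins its slice of the large table locally, with no network shuffle.

small = {1: "Engineering", 2: "Sales"}          # broadcasted dimension table
large = [
    {"name": "Ana", "dept_id": 1},
    {"name": "Bo",  "dept_id": 2},
    {"name": "Cy",  "dept_id": 1},
]

# Inner join: keep rows whose key exists in the small table
joined = [
    {**row, "dept": small[row["dept_id"]]}
    for row in large
    if row["dept_id"] in small
]
```

Each large-table row needs only a local hash lookup, which is why broadcasting avoids shuffling the big side across the network.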
Q16. What is serialization in spark
Serialization is the process of converting data into a format that can be stored or transmitted.
In Spark, serialization is used to convert data into a format that can be sent over the network or stored in memory.
Spark supports two types of serialization: Java serialization and Kryo serialization.
Kryo serialization is faster and more efficient than Java serialization.
Serialization is important in Spark because it allows data to be transferred between nodes in a cluster.
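The round trip itself can be illustrated with Python's pickle module; this is a general demonstration of serialization, not Spark's Java or Kryo serializers:

```python
import pickle

# General illustration of serialization: an object becomes bytes that can be
# sent over the network or cached, then restored on the other side.
# (Spark uses Java or Kryo serialization for this, not pickle.)
record = {"id": 42, "tags": ["spark", "kryo"]}

wire_bytes = pickle.dumps(record)     # serialize: object -> bytes
restored = pickle.loads(wire_bytes)   # deserialize: bytes -> object
```

Kryo's advantage in Spark is the same round trip with a more compact byte stream and less CPU per object.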
Q17. Write a program to reverse a string
A program to reverse a string
Create an empty string to store the reversed string
Loop through the original string from the end to the beginning
Add each character to the new string
Return the new string
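The steps above, implemented literally, plus the idiomatic Python slice alternative:

```python
# Literal version of the steps: walk the string from the end to the
# beginning, appending each character to a new string.
def reverse_string(s: str) -> str:
    result = ""
    for i in range(len(s) - 1, -1, -1):  # last index down to 0
        result += s[i]
    return result

# Idiomatic alternative: slicing with a negative step
def reverse_string_slice(s: str) -> str:
    return s[::-1]
```

For long strings the slice (or `"".join(reversed(s))`) is preferable, since repeated string concatenation copies the accumulator each time.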
Q18. Difference between profile and permission set
Profile defines the baseline permissions for a user, while permission set grants additional permissions to a user.
Profiles are assigned to users at the time of user creation, while permission sets can be assigned at any time.
Profiles control access to objects, fields, and records, while permission sets grant additional permissions beyond the profile settings.
Profiles are one-size-fits-all, while permission sets can be tailored to specific user needs.
Users can have only one profile, but can be assigned multiple permission sets.
Q19. What is the Catalyst optimizer?
Catalyst is the query optimizer at the core of Spark SQL; it transforms logical query plans into optimized physical plans.
It is implemented in Scala and applies to queries written in Spark SQL or through the DataFrame/Dataset APIs (including PySpark).
It analyzes query plans through rule-based and cost-based optimization phases.
It can improve execution speed and reduce the amount of data read and shuffled.
Examples of Catalyst optimizations include predicate pushdown, constant folding, and column pruning.
Q20. SOLID design principles in Spring Boot
Solid design principles are important in Spring Boot to ensure maintainability and scalability of the application.
Single Responsibility Principle (SRP) - each class should have only one responsibility
Open/Closed Principle (OCP) - classes should be open for extension but closed for modification
Liskov Substitution Principle (LSP) - subclasses should be able to replace their parent classes without affecting the behavior of the program
Interface Segregation Principle (ISP) - clients should not be forced to depend on interfaces they do not use
Dependency Inversion Principle (DIP) - depend on abstractions rather than concrete implementations
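The Open/Closed Principle from the list above can be sketched as follows (shown in Python for brevity rather than Java/Spring; class names are illustrative):

```python
from abc import ABC, abstractmethod

# Open/Closed sketch: adding a new discount type means adding a class,
# not modifying checkout() or any existing class.
class Discount(ABC):
    @abstractmethod
    def apply(self, price: float) -> float: ...

class NoDiscount(Discount):
    def apply(self, price: float) -> float:
        return price

class PercentDiscount(Discount):
    def __init__(self, pct: float):
        self.pct = pct

    def apply(self, price: float) -> float:
        return price * (1 - self.pct)

def checkout(price: float, discount: Discount) -> float:
    # Closed for modification: works unchanged with any future Discount subclass
    return discount.apply(price)
```

In Spring Boot the same shape appears as an interface with multiple `@Component` implementations injected where the abstraction is needed.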
Q21. What are BGP Attributes
BGP attributes are used to make routing decisions in Border Gateway Protocol networks.
BGP attributes are used to influence the best path selection process in BGP.
Examples of BGP attributes include AS Path, Next Hop, Local Preference, and Weight.
AS Path attribute shows the path that the route has taken through different autonomous systems.
Next Hop attribute specifies the next router to reach for a particular destination.
Local Preference attribute is used to influence outbound traffic from an autonomous system; higher values are preferred.
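Two of the tiebreakers above can be sketched as a comparison function; real BGP best-path selection has more steps (weight, origin, MED, and so on), and the routes here are illustrative:

```python
# Sketch of two early BGP best-path tiebreakers: higher Local Preference
# wins first; if Local Preference ties, the shorter AS path wins.

def better_route(a: dict, b: dict) -> dict:
    # Step 1: higher Local Preference preferred
    if a["local_pref"] != b["local_pref"]:
        return a if a["local_pref"] > b["local_pref"] else b
    # Step 2: shorter AS path preferred
    return a if len(a["as_path"]) <= len(b["as_path"]) else b

r1 = {"name": "via_ISP1", "local_pref": 200, "as_path": [65001, 65010]}
r2 = {"name": "via_ISP2", "local_pref": 100, "as_path": [65002]}

# Local Preference 200 beats 100 even though the AS path is longer
best = better_route(r1, r2)
```

This ordering is why Local Preference is the standard knob for steering outbound traffic inside an AS.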
Q22. Data Storage and Best Practices
Data storage best practices include regular backups, encryption, access control, and data retention policies.
Regularly backup data to prevent loss in case of system failure or cyber attacks.
Encrypt sensitive data to protect it from unauthorized access.
Implement access control measures to ensure only authorized personnel can access certain data.
Establish data retention policies to determine how long data should be stored and when it should be deleted.
Consider using cloud storage for scalable, redundant, offsite backups.
Q23. Write a spark submit command
Spark submit command to run a Scala application on a cluster
Include the path to the application jar file
Specify the main class of the application
Provide any necessary arguments or options
Specify the cluster manager and the number of executors
Example: spark-submit --class com.example.Main --master yarn --num-executors 4 /path/to/application.jar arg1 arg2
Q24. VIP and Pool Creation in F5
VIP and Pool Creation in F5 involves configuring virtual servers and pools to manage traffic.
Create a virtual server with a unique IP address and port number
Associate the virtual server with a pool of servers that will handle the incoming traffic
Configure health monitors to check the status of the servers in the pool
Set up load balancing methods such as round robin or least connections
Test the configuration to ensure proper traffic distribution
Q25. Challenges in labeling
Labeling challenges include regulatory compliance, multilingual requirements, and design limitations.
Regulatory compliance can vary by country and industry, requiring extensive research and attention to detail.
Multilingual requirements can add complexity to label design and increase production costs.
Design limitations may arise due to label size, material, or printing capabilities.
Labeling errors can result in legal and financial consequences, making accuracy and quality control essential.