I applied via LinkedIn and was interviewed in Sep 2024. There were 3 interview rounds.
CICD stands for Continuous Integration and Continuous Deployment. It is a software development practice where code changes are automatically built, tested, and deployed.
Automates the process of integrating code changes into a shared repository
Automatically builds and tests the code to ensure it is functional
Automatically deploys the code to production or staging environments
Helps in detecting and fixing integration errors early
To create an image out of a running container, you can use the 'docker commit' command.
Use 'docker commit' command to create an image from a running container
Syntax: docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]
Example: docker commit container_id repository_name:tag
The Dockerfile ADD instruction can fetch a file from a URL and add it to the image, while COPY copies files from the build context into the image.
ADD can fetch files from URLs and add them to the image
COPY copies files from the build context on the host machine into the image
ADD also automatically extracts local compressed archives (e.g. tar files) during the build
COPY is more commonly used, and generally preferred, for copying local files
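The ADD vs COPY distinction above can be sketched in a short Dockerfile fragment. The filenames and the URL are illustrative placeholders, not real artifacts:

```dockerfile
FROM alpine:3.19
# COPY: copies a local file from the build context into the image
COPY app.py /app/app.py
# ADD: can fetch a remote URL (hypothetical URL, for illustration only)
ADD https://example.com/config.json /app/config.json
# ADD: automatically extracts a local tar archive into the target directory
ADD assets.tar.gz /app/assets/
```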
A freestyle project is a type of project in Jenkins that allows users to configure the build process in any way they want. A pipeline is a set of automated steps that define the process for building, testing, and deploying code.
Freestyle project in Jenkins allows users to configure build process manually
Pipeline in Jenkins is a set of automated steps for building, testing, and deploying code
Freestyle projects are simpler to configure through the UI, while pipelines are defined as code and suit complex, multi-stage workflows
Docker Swarm is used for orchestrating multiple Docker containers across multiple hosts, while Docker Compose is used for defining and running multi-container Docker applications.
Docker Swarm is a container orchestration tool that allows you to manage a cluster of Docker hosts.
Docker Compose is a tool for defining and running multi-container Docker applications.
Docker Swarm is used for scaling and managing a cluster of Docker hosts across multiple machines.
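A minimal docker-compose.yml illustrates the "defining and running multi-container applications" point. The service names, ports, and credential below are placeholders for illustration:

```yaml
# docker-compose.yml sketch: a web app plus a database (illustrative names)
services:
  web:
    build: .            # builds from a Dockerfile in the current directory
    ports:
      - "8000:8000"
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder credential, not for real use
```

Running `docker compose up` starts both containers on one host; Swarm would be the tool for spreading the same services across a cluster.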
VPC stands for Virtual Private Cloud, a virtual network dedicated to a single organization's resources in the cloud.
Allows organizations to have control over their virtual network environment
Enables customization of network configuration
Provides security by allowing isolation of resources
Can connect to on-premises data centers or other VPCs using VPN or Direct Connect
CICD pipeline is a process that automates the building, testing, and deployment of software.
Continuous Integration (CI) - code changes are integrated into a shared repository multiple times a day.
Continuous Testing - automated tests are run to ensure code quality.
Continuous Deployment - code changes are automatically deployed to production.
Stages include: build, test, deploy, and monitor.
Tools like Jenkins and GitLab CI/CD are commonly used to implement these stages.
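The build/test/deploy stages above can be sketched as a minimal CI workflow. This uses GitHub Actions syntax as one concrete example; the job name and `make` targets are hypothetical placeholders:

```yaml
# Minimal CI sketch (GitHub Actions syntax; names are illustrative)
name: ci
on: [push]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build        # hypothetical build command
      - name: Test
        run: make test         # hypothetical test command
```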
VPC peering allows connecting two VPCs to communicate using private IP addresses.
VPC peering enables instances in different VPCs to communicate as if they are within the same network.
Traffic between peered VPCs stays within the private IP space and does not traverse the internet.
VPC peering does not involve a gateway, VPN, or direct connection.
Both VPCs must have non-overlapping IP ranges for successful peering.
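The non-overlapping requirement for peering can be checked with Python's standard `ipaddress` module. The CIDR ranges below are made up for illustration:

```python
import ipaddress

# Peering requires the two VPCs' CIDR blocks not to overlap (example ranges)
vpc_a = ipaddress.ip_network("10.0.0.0/16")
vpc_b = ipaddress.ip_network("10.1.0.0/16")
vpc_c = ipaddress.ip_network("10.0.128.0/24")  # falls inside vpc_a's range

print(vpc_a.overlaps(vpc_b))  # False: peering is possible
print(vpc_a.overlaps(vpc_c))  # True: peering would be rejected
```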
VPC is a virtual private cloud that allows you to create isolated networks within the cloud environment. Subnets are subdivisions of a VPC, route tables define how traffic is directed within the VPC, and NAT gateway allows instances in a private subnet to access the internet.
VPC is a virtual private cloud that provides a logically isolated section of the AWS Cloud where you can launch resources.
Subnets are subdivisions of a VPC that partition its IP address range; route tables direct traffic within the VPC, and a NAT gateway gives instances in private subnets outbound internet access.
I applied via Company Website and was interviewed in Jul 2024. There were 2 interview rounds.
Aptitude test along with a coding test containing two SQL questions.
ETL stands for Extract, Transform, Load. It is a process used to extract data from various sources, transform it into a consistent format, and load it into a target database.
ETL involves three main layers: Extraction, Transformation, and Loading.
Extraction: Data is extracted from various sources such as databases, files, APIs, etc.
Transformation: Data is cleaned, validated, and transformed into a consistent format.
Loading: transformed data is loaded into a target database or data warehouse.
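The three ETL layers above can be sketched end to end in a few lines of Python, here using an in-memory SQLite database as the target and hard-coded rows standing in for a source system (all names and values are illustrative):

```python
import sqlite3

# Extract: raw records, hard-coded here to stand in for a source system
raw_rows = [
    {"name": " Alice ", "amount": "120.50"},
    {"name": "bob", "amount": "80"},
]

# Transform: clean and normalize into a consistent format
clean_rows = [
    (row["name"].strip().title(), float(row["amount"]))
    for row in raw_rows
]

# Load: write the cleaned rows into a target table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", clean_rows)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 200.5
```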
Python star pattern 1 12 123 is a pattern printing question. List and Dictionary are data structures in Python.
Python star pattern 1 12 123 can be achieved using nested loops.
Lists are ordered collections of items, accessed by index. Dictionaries are key-value pairs, accessed by key.
Example: List - [1, 2, 3], Dictionary - {'a': 1, 'b': 2, 'c': 3}
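The 1 / 12 / 123 pattern described above comes out directly from nested loops: the outer loop picks the row, the inner loop appends the digits 1 through the row number:

```python
# Print the number pattern 1 / 12 / 123 using nested loops
rows = 3
lines = []
for i in range(1, rows + 1):
    line = ""
    for j in range(1, i + 1):   # inner loop builds digits 1..i
        line += str(j)
    lines.append(line)
    print(line)
# Output:
# 1
# 12
# 123
```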
Yes, I am familiar with Google Cloud Platform (GCP) and BigQuery.
I have experience working with GCP services such as Compute Engine, Cloud Storage, and BigQuery.
I have used BigQuery for analyzing large datasets and running complex queries.
I am familiar with setting up data pipelines and ETL processes using GCP services.
SDLC stands for Software Development Life Cycle and STLC stands for Software Testing Life Cycle.
SDLC is a process used by software development teams to design, develop, and test high-quality software.
STLC is a subset of SDLC focused specifically on the testing phase of the software development process.
SDLC includes phases like planning, analysis, design, implementation, and maintenance.
STLC includes phases like test planning, test case design, test execution, and test closure.
I applied via Naukri.com and was interviewed in Jul 2024. There were 2 interview rounds.
Use SQL query with GROUP BY and HAVING clause to find duplicate entries in a table.
Use GROUP BY clause to group the rows based on the columns that you suspect may have duplicates.
Use COUNT() function to count the number of occurrences of each group.
Use HAVING clause to filter out groups that have more than one occurrence, indicating duplicates.
Example: SELECT column1, column2, COUNT(*) FROM table_name GROUP BY column1, column2 HAVING COUNT(*) > 1;
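The GROUP BY / HAVING approach can be demonstrated end to end with SQLite from Python; the table and sample rows below are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("b", "y"), ("a", "x"), ("b", "z")],
)
# Group on the suspect columns; HAVING keeps only groups seen more than once
dupes = conn.execute(
    """
    SELECT customer, product, COUNT(*) AS cnt
    FROM orders
    GROUP BY customer, product
    HAVING COUNT(*) > 1
    """
).fetchall()
print(dupes)  # [('a', 'x', 3)]
```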
Use subquery to find 3 highest salaries in Employee table without using limit or offset function
Use subquery to select distinct salaries from Employee table
Order the distinct salaries in descending order
Select top 3 salaries from the ordered list
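One common way to express the steps above without LIMIT or OFFSET is a correlated subquery: a salary is in the top 3 if fewer than 3 distinct salaries are larger than it. A runnable SQLite sketch (sample data is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("a", 100), ("b", 300), ("c", 200), ("d", 300), ("e", 50), ("f", 250)],
)
# A salary qualifies if fewer than 3 distinct salaries exceed it
top3 = conn.execute(
    """
    SELECT DISTINCT salary
    FROM Employee e
    WHERE (SELECT COUNT(DISTINCT salary)
           FROM Employee
           WHERE salary > e.salary) < 3
    ORDER BY salary DESC
    """
).fetchall()
print(top3)  # [(300,), (250,), (200,)]
```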
Informatica is a data integration tool used for ETL (Extract, Transform, Load) processes in data engineering.
Informatica is used for extracting data from various sources like databases, flat files, etc.
It can transform the data according to business rules and load it into a target data warehouse or database.
Informatica provides a visual interface for designing ETL workflows and monitoring data integration processes.
Datastage is an ETL tool used for extracting, transforming, and loading data from various sources to a target destination.
Datastage is part of the IBM Information Server suite.
It provides a graphical interface to design and run data integration jobs.
Datastage supports parallel processing for high performance.
It can connect to a variety of data sources such as databases, flat files, and web services.
Datastage jobs can be scheduled, run, and monitored through its management tools.
DataMetica interview questions for popular designations
I applied via Approached by Company and was interviewed in Sep 2024. There was 1 interview round.
I applied via Naukri.com and was interviewed in Aug 2024. There was 1 interview round.
Informatica Powercentre is an on-premise ETL tool, while iics is a cloud-based ETL tool.
Informatica Powercentre is an on-premise ETL tool, meaning it is installed and run on the user's own hardware and infrastructure.
iics (Informatica Intelligent Cloud Services) is a cloud-based ETL tool, allowing users to access and use the tool via the internet.
Informatica Powercentre requires manual upgrades and maintenance, while iics is upgraded and maintained by Informatica in the cloud.
Optimizing session involves tuning session settings, utilizing efficient data loading techniques, and minimizing resource usage.
Tune session settings such as buffer size, commit interval, and block size for optimal performance
Utilize efficient data loading techniques like bulk loading, incremental loading, and parallel processing
Minimize resource usage by optimizing SQL queries and reducing unnecessary transformations
I applied via Naukri.com and was interviewed in Jul 2024. There was 1 interview round.
RBAC in Kubernetes stands for Role-Based Access Control, which is used to control access to resources based on roles assigned to users.
RBAC allows administrators to define roles with specific permissions for accessing resources in a Kubernetes cluster.
Roles can be assigned to users or groups, allowing fine-grained control over who can perform certain actions.
RBAC includes four primary components: Role, RoleBinding, ClusterRole, and ClusterRoleBinding.
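A minimal manifest ties the Role and RoleBinding components together. The namespace, role name, and user name below are hypothetical examples:

```yaml
# Role: grants read-only access to Pods in the "dev" namespace (example names)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding: assigns the Role above to a user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: read-pods
subjects:
- kind: User
  name: jane            # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

ClusterRole and ClusterRoleBinding follow the same shape but apply cluster-wide rather than to a single namespace.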
Google Cloud Platform (GCP) offers various security features to protect data and resources.
Identity and Access Management (IAM) for controlling access to resources
Encryption at rest and in transit to protect data
Network security with Virtual Private Cloud (VPC) and firewall rules
Security Key Enforcement for two-factor authentication
Security Scanner for vulnerability assessment
Cloud Security Command Center for centralized security and risk management
I applied via Naukri.com and was interviewed in Mar 2024. There was 1 interview round.
Facts and dimensions are key concepts in data warehousing. Facts are numerical data that can be measured, while dimensions are descriptive attributes related to the facts.
Facts are quantitative data that can be aggregated, such as sales revenue or quantity sold.
Dimensions are descriptive attributes that provide context to the facts, such as product category, customer name, or date.
Facts are typically stored in fact tables, while dimensions are stored in dimension tables.
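A tiny star-schema sketch makes the fact/dimension split concrete: a fact table of measurable amounts joined to a dimension table of descriptive attributes. Table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension: descriptive attributes
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
-- Fact: measurable, aggregatable values keyed to the dimension
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Toys');
INSERT INTO fact_sales VALUES (1, 10.0), (1, 15.0), (2, 7.5);
""")
# Aggregate the facts, sliced by a dimension attribute
result = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(result)  # [('Books', 25.0), ('Toys', 7.5)]
```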
Data Engineer: 234 salaries | ₹3 L/yr - ₹10.1 L/yr
Engineer 1: 173 salaries | ₹4 L/yr - ₹10.1 L/yr
L2 Engineer: 143 salaries | ₹4.5 L/yr - ₹17.1 L/yr
Senior Engineer: 103 salaries | ₹6.1 L/yr - ₹21 L/yr
Associate Engineer: 95 salaries | ₹2.4 L/yr - ₹6 L/yr
Similar companies: Fractal Analytics, Mu Sigma, LatentView Analytics, Tredence