Altimetrik Data Science Intern Interview Questions and Answers

Updated 3 Aug 2022

Altimetrik Data Science Intern Interview Experiences

1 interview found

I applied via Campus Placement and was interviewed in Aug 2021. There were 6 interview rounds.

Round 1 - Resume Shortlist 
Pro Tip by AmbitionBox:
Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds, so make sure to leave the best impression.
Round 2 - Aptitude Test 

The second round combined aptitude and coding. The aptitude part mostly consists of basic problems, along with some data science questions on bias, statistics, and probability.

Round 3 - Coding Test 

Two coding problems. The ones I got were on the easier side; it didn't take more than 15 minutes to solve both of them.

Round 4 - Technical 

(2 Questions)

  • Q1. Pretty hard technical interview, ranging from the formulae behind algorithms to the underlying math; it touched on nearly all the basic questions you would expect in a data science interview.
  • Q2. What is gradient descent, why does gradient descent follow the tangent (slope) direction, and can you explain and write down its formula?
  • Ans. 

    Gradient descent is an optimization algorithm used to minimize the cost function of a machine learning model.

    • Gradient descent is used to update the parameters of a model to minimize the cost function.

    • It follows the direction of steepest descent, which is the negative gradient of the cost function.

    • The learning rate determines the step size of the algorithm.

    • The formula for gradient descent is: theta = theta - alpha * (1/...

  • Answered by AI
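
The truncated formula in the answer is presumably the standard batch update theta := theta - alpha * (1/m) * sum_i (h(x_i) - y_i) * x_i. A minimal NumPy sketch of that update rule, assuming a linear model and a mean-squared-error cost (the variable names and toy data are illustrative, not from the interview):

    import numpy as np

    def gradient_descent(X, y, alpha=0.01, iterations=1000):
        """Minimize the mean-squared-error cost of a linear model y ~ X @ theta."""
        m, n = X.shape
        theta = np.zeros(n)
        for _ in range(iterations):
            predictions = X @ theta                      # h(x) for every sample
            gradient = (X.T @ (predictions - y)) / m     # (1/m) * sum over samples of error * x
            theta = theta - alpha * gradient             # step against the gradient (steepest descent)
        return theta

    # Toy usage: fit y = 2x
    X = np.array([[1.0], [2.0], [3.0]])
    y = np.array([2.0, 4.0, 6.0])
    print(gradient_descent(X, y, alpha=0.1, iterations=500))   # approximately [2.0]
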
Round 5 - One-on-one 

(2 Questions)

  • Q1. Managerial/technical round; asked some basic-level coding questions and data handling with lists, tuples, sets, and dicts.
  • Q2. Please write a dictionary and try to sort it.
  • Ans. 

    A dictionary sorted in ascending order based on keys.

    • Create a dictionary with key-value pairs

    • Use the sorted() function to sort the dictionary based on keys

    • Convert the sorted dictionary into a list of tuples

    • Use the dict() constructor to create a new dictionary from the sorted list of tuples

  • Answered by AI
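
A short runnable sketch of the steps listed in the answer (the sample dictionary is made up for illustration):

    # Create a dictionary with key-value pairs (sample data for illustration)
    scores = {"banana": 3, "apple": 5, "cherry": 1}

    # sorted() over dict.items() gives a list of (key, value) tuples ordered by key
    sorted_pairs = sorted(scores.items())

    # dict() rebuilds a dictionary from the sorted pairs (insertion order is kept in Python 3.7+)
    sorted_by_key = dict(sorted_pairs)

    # Sorting by value instead, using a key function
    sorted_by_value = dict(sorted(scores.items(), key=lambda kv: kv[1]))

    print(sorted_by_key)     # {'apple': 5, 'banana': 3, 'cherry': 1}
    print(sorted_by_value)   # {'cherry': 1, 'banana': 3, 'apple': 5}
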
Round 6 - HR 

(6 Questions)

  • Q1. What is your family background?
  • Q2. Why should we hire you?
  • Q3. Where do you see yourself in 5 years?
  • Q4. Why are you looking for a change?
  • Q5. What are your strengths and weaknesses?
  • Q6. Tell me about yourself.

Interview Preparation Tips

Interview preparation tips for other job seekers - Go through Andrew Ng's machine learning lectures on YouTube; you could easily pass the interview if you have a good grip on them.

Skills evaluated in this interview

Interview questions from similar companies

I applied via Naukri.com and was interviewed in Jun 2020. There was 1 interview round.

Interview Questionnaire 

2 Questions

  • Q1. There were two rounds of interviews, both technical. They actually touched on every topic by asking questions from it. First round: Python fundamental questions on data structures and operations, scenar...
  • Q2. The second round is mainly a CV-based project discussion. They try to dig into advanced concepts, which can be a bit mathematical. You can be honest and skip a question rather than answering it wrong.

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare well and be honest. Prepare the fundamentals of Python and ML well.
Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: Not Selected

I applied via Naukri.com and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - Technical 

(14 Questions)

  • Q1. How to create a pipeline in ADF?
  • Ans. 

    To create a pipeline in ADF, you can use the Azure Data Factory UI or code-based approach.

    • Use Azure Data Factory UI to visually create and manage pipelines

    • Use code-based approach with JSON to define pipelines and activities

    • Add activities such as data movement, data transformation, and data processing to the pipeline

    • Set up triggers and schedules for the pipeline to run automatically

  • Answered by AI
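
As a rough illustration of the code-based approach mentioned above, the azure-mgmt-datafactory Python SDK can define and deploy a pipeline; the subscription, resource group and factory names are placeholders, and the single Wait activity is just an assumption to keep the sketch minimal:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import PipelineResource, WaitActivity

    # Placeholder identifiers; replace with your own subscription, resource group and factory
    credential = DefaultAzureCredential()
    adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

    # A pipeline is a named list of activities; a single Wait activity keeps the example minimal
    pipeline = PipelineResource(
        activities=[WaitActivity(name="WaitTenSeconds", wait_time_in_seconds=10)]
    )

    adf_client.pipelines.create_or_update(
        "<resource-group>", "<data-factory-name>", "DemoPipeline", pipeline
    )
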
  • Q2. Different types of activities in pipelines
  • Ans. 

    Activities in pipelines include data extraction, transformation, loading, and monitoring.

    • Data extraction: Retrieving data from various sources such as databases, APIs, and files.

    • Data transformation: Cleaning, filtering, and structuring data for analysis.

    • Data loading: Loading processed data into a data warehouse or database.

    • Monitoring: Tracking the performance and health of the pipeline to ensure data quality and reliability.

  • Answered by AI
  • Q3. What is the use of getmetadata?
  • Ans. 

    getmetadata is used to retrieve metadata information about a dataset or data source.

    • getmetadata can provide information about the structure, format, and properties of the data.

    • It can be used to understand the data schema, column names, data types, and any constraints or relationships.

    • This information is helpful for data engineers to properly process, transform, and analyze the data.

    • For example, getmetadata can be used ...

  • Answered by AI
  • Q4. Different types of triggers
  • Ans. 

    Triggers in databases are special stored procedures that are automatically executed when certain events occur.

    • Types of triggers include: DML triggers (for INSERT, UPDATE, DELETE operations), DDL triggers (for CREATE, ALTER, DROP operations), and logon triggers.

    • Triggers can be classified as row-level triggers (executed once for each row affected by the triggering event) or statement-level triggers (executed once for eac...

  • Answered by AI
  • Q5. Difference between normal cluster and job cluster in Databricks
  • Ans. 

    Normal cluster is used for interactive workloads while job cluster is used for batch processing in Databricks.

    • Normal cluster is used for ad-hoc queries and exploratory data analysis.

    • Job cluster is used for running scheduled jobs and batch processing tasks.

    • Normal cluster is terminated after a period of inactivity, while job cluster is terminated after the job completes.

    • Normal cluster is more cost-effective for short-liv...

  • Answered by AI
  • Q6. What is slowly changing dimensions
  • Ans. 

    Slowly changing dimensions refer to data warehouse dimensions that change slowly over time.

    • SCDs are used to track historical changes in data over time.

    • There are three types of SCDs - Type 1, Type 2, and Type 3.

    • Type 1 SCDs overwrite old data with new data, Type 2 creates new records for changes, and Type 3 maintains both old and new data in separate columns.

    • Example: A customer's address changing would be a Type 2 SCD.

    • Ex...

  • Answered by AI
  • Q7. Incremental load
  • Q8. Use of 'with' in Python
  • Ans. 

    Use Python's 'with' statement to ensure proper resource management and exception handling.

    • Use 'with' statement to automatically close files after use

    • Helps in managing resources like database connections

    • Ensures proper cleanup even in case of exceptions

  • Answered by AI
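
A small, self-contained sketch of the points above (the file name and contents are hypothetical):

    # Write a small hypothetical file so the example is self-contained
    with open("data.csv", "w", encoding="utf-8") as f:
        f.write("id,value\n1,10\n2,20\n")

    # 'with' acquires the resource and guarantees it is released, even if the body raises
    with open("data.csv", "r", encoding="utf-8") as f:
        line_count = sum(1 for _ in f)
    print(line_count)   # 3; the file is already closed at this point

    # Roughly what the context manager does behind the scenes:
    f = open("data.csv", "r", encoding="utf-8")
    try:
        line_count = sum(1 for _ in f)
    finally:
        f.close()
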
  • Q9. List vs tuple in Python
  • Ans. 

    List is mutable, tuple is immutable in Python.

    • List can be modified after creation, tuple cannot be modified.

    • List uses square brackets [], tuple uses parentheses ().

    • Lists are used for collections of items that may need to be changed, tuples are used for fixed collections of items.

    • Example: list_example = [1, 2, 3], tuple_example = (4, 5, 6)

  • Answered by AI
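
A quick demonstration of the mutability difference described in the answer:

    list_example = [1, 2, 3]      # square brackets, mutable
    tuple_example = (4, 5, 6)     # parentheses, immutable

    list_example.append(4)        # lists can be modified in place
    print(list_example)           # [1, 2, 3, 4]

    try:
        tuple_example[0] = 99     # tuples cannot be modified after creation
    except TypeError as exc:
        print("tuples are immutable:", exc)

    # Tuples are hashable, so they can serve as dictionary keys; lists cannot
    lookup = {(4, 5, 6): "a fixed coordinate"}
    print(lookup[(4, 5, 6)])
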
  • Q10. Datalake 1 vs Datalake 2
  • Ans. 

    Datalake 1 and Datalake 2 are both storage systems for big data, but they may differ in terms of architecture, scalability, and use cases.

    • Datalake 1 may use a Hadoop-based architecture while Datalake 2 may use a cloud-based architecture like AWS S3 or Azure Data Lake Storage.

    • Datalake 1 may be more suitable for on-premise data storage and processing, while Datalake 2 may offer better scalability and flexibility for clou...

  • Answered by AI
  • Q11. How to read a file in databricks
  • Ans. 

    To read a file in Databricks, you can use the Databricks File System (DBFS) or Spark APIs.

    • Use dbutils.fs.ls('dbfs:/path/to/file') to list files in DBFS

    • Use spark.read.format('csv').load('dbfs:/path/to/file') to read a CSV file

    • Use spark.read.format('parquet').load('dbfs:/path/to/file') to read a Parquet file

  • Answered by AI
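
Putting the bullets above together in one snippet, assuming it runs inside a Databricks notebook where spark and dbutils already exist; the paths and the header/schema options are placeholder assumptions:

    # Assumes this runs in a Databricks notebook, where `spark` and `dbutils` are predefined
    display(dbutils.fs.ls("dbfs:/mnt/raw/"))                 # list files in DBFS (placeholder path)

    csv_df = (spark.read.format("csv")
              .option("header", "true")                      # assumes the file has a header row
              .option("inferSchema", "true")
              .load("dbfs:/mnt/raw/sales.csv"))              # placeholder file

    parquet_df = spark.read.format("parquet").load("dbfs:/mnt/raw/sales.parquet")

    csv_df.show(5)
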
  • Q12. Star vs snowflake schema
  • Ans. 

    Star schema is denormalized with one central fact table surrounded by dimension tables, while snowflake schema is normalized with multiple related dimension tables.

    • Star schema is easier to understand and query due to denormalization.

    • Snowflake schema saves storage space by normalizing data.

    • Star schema is better for data warehousing and OLAP applications.

    • Snowflake schema is better for OLTP systems with complex relationships.

  • Answered by AI
  • Q13. Repartition vs coalesce
  • Ans. 

    repartition increases partitions while coalesce decreases partitions in Spark

    • repartition shuffles data and can be used for increasing partitions for parallelism

    • coalesce reduces partitions without shuffling data, useful for reducing overhead

    • repartition is more expensive than coalesce as it involves data movement

    • example: df.repartition(10) vs df.coalesce(5)

  • Answered by AI
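
A small PySpark sketch contrasting the two operations on a synthetic dataset:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("repartition-vs-coalesce").getOrCreate()

    df = spark.range(1_000_000)                  # synthetic dataset
    print(df.rdd.getNumPartitions())             # default partition count

    wider = df.repartition(10)                   # full shuffle: rows are redistributed evenly into 10 partitions
    narrower = df.coalesce(2)                    # no shuffle: existing partitions are merged down to 2

    print(wider.rdd.getNumPartitions())          # 10
    print(narrower.rdd.getNumPartitions())       # 2
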
  • Q14. Parquet file uses
  • Ans. 

    Parquet file format is a columnar storage format used for efficient data storage and processing.

    • Parquet files store data in a columnar format, which allows for efficient querying and processing of specific columns without reading the entire file.

    • It supports complex nested data structures like arrays and maps.

    • Parquet files are highly compressed, reducing storage space and improving query performance.

    • It is commonly used ...

  • Answered by AI
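
A brief PySpark sketch of writing and reading Parquet, illustrating the columnar access described above (the data and path are illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, "alice", 34.5), (2, "bob", 12.0)], ["id", "name", "amount"])

    # Parquet is written column by column and compressed (snappy by default)
    df.write.mode("overwrite").parquet("/tmp/demo_parquet")

    # Selecting a single column only reads that column's data, not the whole file
    spark.read.parquet("/tmp/demo_parquet").select("amount").show()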

Skills evaluated in this interview

Interview experience: 5 (Excellent)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. How can you improve query performance?
  • Ans. 

    Improving query performance by optimizing indexes, using proper data types, and minimizing data retrieval.

    • Optimize indexes on frequently queried columns

    • Use proper data types to reduce storage space and improve query speed

    • Minimize data retrieval by only selecting necessary columns

    • Avoid using SELECT * in queries

    • Use query execution plans to identify bottlenecks and optimize accordingly

  • Answered by AI
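
As a concrete illustration of the last two bullets, here is a hedged Spark SQL sketch that selects only the needed columns and inspects the execution plan; the orders table and its columns are made up:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    orders = spark.createDataFrame(
        [(1, "C001", 120.0, "2024-02-10"), (2, "C002", 80.0, "2023-11-05")],
        ["order_id", "customer_id", "amount", "order_date"])
    orders.createOrReplaceTempView("orders")

    # Select only the needed columns (no SELECT *) and filter early so less data is scanned
    narrow = spark.sql("""
        SELECT order_id, customer_id, amount
        FROM orders
        WHERE order_date >= '2024-01-01'
    """)

    narrow.explain()   # inspect the execution plan to find scans, shuffles and other bottlenecks
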
  • Q2. What is an SCD Type 2 table?
  • Ans. 

    SCD type2 table is used to track historical changes in data by creating new records for each change.

    • Contains current and historical data

    • New records are created for each change

    • Includes effective start and end dates for each record

    • Requires additional columns like surrogate keys and version numbers

    • Used for slowly changing dimensions in data warehousing

  • Answered by AI
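
A plain-Python sketch of how one change is applied to an SCD Type 2 table, in the spirit of the bullets above; the column names, surrogate keys and dates are illustrative:

    from datetime import date

    # Current SCD Type 2 table: one row per version, with effective dates and a current flag
    customer_dim = [
        {"sk": 1, "customer_id": "C001", "address": "12 Old Street",
         "start_date": date(2022, 1, 1), "end_date": None, "is_current": True},
    ]

    def apply_address_change(rows, customer_id, new_address, change_date):
        """Close the current row and append a new version instead of overwriting history."""
        for row in rows:
            if row["customer_id"] == customer_id and row["is_current"]:
                row["end_date"] = change_date        # expire the old version
                row["is_current"] = False
        rows.append({"sk": max(r["sk"] for r in rows) + 1, "customer_id": customer_id,
                     "address": new_address, "start_date": change_date,
                     "end_date": None, "is_current": True})

    apply_address_change(customer_dim, "C001", "98 New Avenue", date(2024, 6, 1))
    # customer_dim now holds both the historical and the current address for C001
    for row in customer_dim:
        print(row)
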
Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.

Round 1 - One-on-one 

(2 Questions)

  • Q1. Azure scenario-based questions
  • Q2. PySpark coding-based questions
Round 2 - One-on-one 

(2 Questions)

  • Q1. ADF and Databricks related questions
  • Q2. Spark performance problems and scenarios
  • Ans. 

    Spark performance problems can arise due to inefficient code, data skew, resource constraints, and improper configuration.

    • Inefficient code can lead to slow performance, such as using collect() on large datasets.

    • Data skew can cause uneven distribution of data across partitions, impacting processing time.

    • Resource constraints like insufficient memory or CPU can result in slow Spark jobs.

    • Improper configuration settings, su...

  • Answered by AI
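
One common mitigation for the skew and shuffle issues mentioned above is broadcasting the small side of a join. A hedged PySpark sketch with synthetic tables:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.getOrCreate()

    large_df = spark.range(10_000_000).withColumnRenamed("id", "key")    # stand-in for a big fact table
    small_df = spark.createDataFrame([(i, f"label_{i}") for i in range(100)], ["key", "label"])

    # Broadcasting the small side avoids shuffling the large table and sidesteps skewed shuffle partitions
    joined = large_df.join(broadcast(small_df), on="key", how="left")

    joined.explain()   # the plan should show a BroadcastHashJoin instead of a SortMergeJoin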

Skills evaluated in this interview

Interview experience: 3 (Average)
Difficulty level: Moderate
Process Duration: -
Result: No response

I applied via LinkedIn and was interviewed in Jan 2024. There was 1 interview round.

Round 1 - Technical 

(4 Questions)

  • Q1. What is Pyspark?
  • Ans. 

    Pyspark is a Python API for Apache Spark, a powerful open-source distributed computing system.

    • Pyspark is used for processing large datasets in parallel across a cluster of computers.

    • It provides high-level APIs in Python for Spark programming.

    • Pyspark allows seamless integration with other Python libraries like Pandas and NumPy.

    • Example: Using Pyspark to perform data analysis and machine learning tasks on big data sets.

  • Answered by AI
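
A minimal, self-contained PySpark example of the points above (pandas must be installed for the last step):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

    # DataFrames are partitioned across the cluster but manipulated through a high-level Python API
    df = spark.createDataFrame([("alice", 34), ("bob", 45)], ["name", "age"])
    df.filter(df.age > 40).show()

    # Interop with pandas is built in
    print(df.toPandas())

    spark.stop()
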
  • Q2. What is Pyspark SQL?
  • Ans. 

    Pyspark SQL is a module in Apache Spark that provides a SQL interface for working with structured data.

    • Pyspark SQL allows users to run SQL queries on Spark dataframes.

    • It provides a more concise and user-friendly way to interact with data compared to traditional Spark RDDs.

    • Users can leverage the power of SQL for data manipulation and analysis within the Spark ecosystem.

  • Answered by AI
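
A short sketch of running SQL over a DataFrame through a temporary view (the data is illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("alice", 34), ("bob", 45)], ["name", "age"])

    df.createOrReplaceTempView("people")     # expose the DataFrame to the SQL engine
    spark.sql("SELECT name, age FROM people WHERE age > 40").show()
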
  • Q3. How to merge 2 dataframes of different schema?
  • Ans. 

    To merge 2 dataframes of different schema, use join operations or data transformation techniques.

    • Use join operations like inner join, outer join, left join, or right join based on the requirement.

    • Perform data transformation to align the schemas before merging.

    • Use tools like Apache Spark, Pandas, or SQL to merge dataframes with different schemas.

  • Answered by AI
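
One concrete PySpark option, assuming Spark 3.1+ where unionByName supports allowMissingColumns; the columns are illustrative:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    df_a = spark.createDataFrame([(1, "alice")], ["id", "name"])
    df_b = spark.createDataFrame([(2, "IN")], ["id", "country"])

    # Align columns by name; columns missing on either side are filled with nulls
    merged = df_a.unionByName(df_b, allowMissingColumns=True)
    merged.show()   # rows from both frames, with nulls where a column did not exist
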
  • Q4. What is Pyspark streaming?
  • Ans. 

    Pyspark streaming is a scalable and fault-tolerant stream processing engine built on top of Apache Spark.

    • Pyspark streaming allows for real-time processing of streaming data.

    • It provides high-level APIs in Python for creating streaming applications.

    • Pyspark streaming supports various data sources like Kafka, Flume, Kinesis, etc.

    • It enables windowed computations and stateful processing for handling streaming data.

    • Example: C...

  • Answered by AI
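
A tiny Structured Streaming sketch using the built-in rate source for local experimentation; swapping the format and options would connect to Kafka or the other sources mentioned above:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("streaming-demo").getOrCreate()

    # The built-in 'rate' source generates rows continuously, which is handy for local experimentation
    stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

    query = (stream.writeStream
             .format("console")       # print each micro-batch to stdout
             .outputMode("append")
             .start())

    query.awaitTermination(timeout=15)   # run for about 15 seconds, then return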

Interview Preparation Tips

Topics to prepare for Luxoft Data Engineer interview:
  • Pyspark

Skills evaluated in this interview

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: No response

I applied via Company Website and was interviewed in Jan 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. Spark Architecture
  • Q2. Explain the Spark architecture with example
  • Ans. 

    Spark architecture includes driver, cluster manager, and worker nodes for distributed processing.

    • Spark architecture consists of a driver program that manages the execution of tasks on worker nodes.

    • Cluster manager is responsible for allocating resources and scheduling tasks across worker nodes.

    • Worker nodes execute the tasks and store data in memory or disk for processing.

    • Example: In a Spark application, the driver progr...

  • Answered by AI

Skills evaluated in this interview

Data Engineer Interview Questions & Answers

Madhurima Dutta (Luxoft)

posted on 25 Jul 2024

Interview experience: 5 (Excellent)
Difficulty level: Easy
Process Duration: 2-4 weeks
Result: Selected

I applied via a Recruitment Consultant and was interviewed before Jul 2023. There were 2 interview rounds.

Round 1 - Technical 

(4 Questions)

  • Q1. Questions on complex SQL queries
  • Q2. Handling ADF pipelines
  • Ans. 

    Handling ADF pipelines involves designing, building, and monitoring data pipelines in Azure Data Factory.

    • Designing data pipelines using ADF UI or code

    • Building pipelines with activities like copy data, data flow, and custom activities

    • Monitoring pipeline runs and debugging issues

    • Optimizing pipeline performance and scheduling triggers

  • Answered by AI
  • Q3. Schedules and triggers
  • Q4. About project work, the complexities faced, and how issues were handled
Round 2 - HR 

(1 Question)

  • Q1. Salary discussion

Skills evaluated in this interview

Interview experience: 3 (Average)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via Naukri.com and was interviewed in Feb 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. The complete interview was a project discussion
  • Q2. Some questions on ensemble models and statistics
Interview experience: 4 (Good)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. It went well; basics of Hadoop, Spark, and my project
  • Q2. Basics of SQL and joins
Round 2 - Coding Test 

Basics of SQL and joins

Altimetrik Interview FAQs

How many rounds are there in Altimetrik Data Science Intern interview?
The Altimetrik interview process usually has 6 rounds. The most common rounds are Resume Shortlist, Aptitude Test, and Coding Test.
What are the top questions asked in Altimetrik Data Science Intern interview?

Some of the top questions asked at the Altimetrik Data Science Intern interview -

  1. What is gradient descent, why does gradient descent follow tan angles and pleas...read more
  2. Please write a dictionary and try to sort ...read more
  3. Pretty hard technical interview from formulae behind algorithms to math to algo...read more


Altimetrik Data Science Intern Reviews and Ratings

2.0/5 based on 1 review

Rating in categories: Skill development 1.0, Work-life balance 1.0, Salary 4.0, Job security 1.0, Company culture 1.0, Promotions 3.0, Work satisfaction 3.0
Altimetrik salaries (reported ranges)

  • Senior Software Engineer: 1.2k salaries, ₹9.5 L/yr - ₹35 L/yr
  • Staff Engineer: 903 salaries, ₹11.1 L/yr - ₹41 L/yr
  • Senior Engineer: 690 salaries, ₹9 L/yr - ₹30 L/yr
  • Software Engineer: 328 salaries, ₹4.8 L/yr - ₹19 L/yr
  • Staff Software Engineer: 235 salaries, ₹10.4 L/yr - ₹37 L/yr