TechStar Group
Big Data Engineer - Hadoop/Cloudera (6-12 yrs)
Posted 4 months ago
Flexible timing
Responsibilities of the Candidate:
- Be responsible for the design and development of big data solutions.
- Partner with domain experts, product managers, analysts, and data scientists to develop Big Data pipelines in Hadoop
- Be responsible for moving all legacy workloads to a cloud platform
- Work with data scientists to build Client pipelines using heterogeneous sources and provide engineering services for data science applications
- Ensure automation through CI/CD across platforms both in cloud and on-premises
- Define needs around maintainability, testability, performance, security, quality, and usability for the data platform
- Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes
- Convert SAS-based pipelines into languages such as PySpark and Scala to execute on Hadoop and non-Hadoop ecosystems
- Tune Big Data applications on Hadoop and non-Hadoop platforms for optimal performance
- Apply an in-depth understanding of how data analytics collectively integrate within the sub-function as well as coordinate and contribute to the objectives of the entire function.
- Produce a detailed analysis of issues where the best course of action is not evident from the information available, but actions must be recommended/taken.
- Assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients, and assets, by driving compliance with applicable laws, rules, and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct, and business practices, and escalating, managing and reporting control issues with transparency
Requirements:
- 6+ years of total IT experience
- 4+ years of experience with Hadoop (Cloudera)/big data technologies
- Knowledge of the Hadoop ecosystem and Big Data technologies
- Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
- Experience in designing and developing Data Pipelines for Data Ingestion or Transformation using Java, Scala, or Python.
- Experience with Spark programming (PySpark, Scala, or Java)
- Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning is required.
- Proficient in programming in Java or Python with prior Apache Beam/Spark experience a plus.
- System-level understanding: data structures, algorithms, distributed storage & compute
- Can-do attitude toward solving complex business problems; good interpersonal and teamwork skills.
Functional Areas: Software/Testing/Networking