Data Engineer - Spark/Hadoop (3-10 yrs)
Fractal31
posted 5d ago
Flexible timing
Roles and Responsibilities:
Cloud Migration: Manage the migration of legacy data workloads to modern cloud platforms, ensuring data quality, security, and optimal performance.
Data Science Support: Build client data pipelines from diverse sources to support data science applications, collaborating closely with data scientists to address data requirements.
Platform Standards: Establish and uphold standards for data platform maintainability, testability, performance, security, and usability. Drive consistency with reusable components and coding practices.
Code Conversion: Translate SAS-based pipelines into PySpark or Scala to execute on Hadoop and other ecosystems.
Performance Optimization: Enhance the performance of Big Data applications on both Hadoop and non-Hadoop platforms.
IT & Business Integration: Analyze and assess evolving business requirements to recommend and implement system enhancements. Maintain a comprehensive understanding of data analytics as it supports business objectives.
Data Issue Resolution: Conduct detailed analyses of data issues, make actionable recommendations, and implement solutions.
Technical Skills:
Primary Skills: Strong expertise in PySpark and SQL, with a foundational understanding of data warehousing principles. Proficiency in Linux (Unix shell scripting) is advantageous.
Technical Expertise: In-depth understanding of the Hadoop ecosystem and Big Data technologies, including HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, and Solr.
Programming Skills: Proficiency in Spark programming with PySpark, Python, or Scala for effective data processing and pipeline creation.
Development Experience:
- Demonstrated experience in end-to-end development of data pipelines, with a focus on hands-on involvement in construction; production support experience alone is not sufficient.
Data Management:
- Experience managing large data volumes to support data at scale.
- SQL & Database Proficiency: Advanced SQL skills for querying, analyzing, and working with complex, real-world datasets.
- Cloud Platform Familiarity: Experience with AWS or Azure is beneficial, though not essential.
Note: Cloud knowledge should complement core data engineering skills.
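As an illustration of the advanced SQL skills listed above, the sketch below runs a window-function query against an in-memory SQLite table. The `orders` table and its columns are invented for the example; the same query pattern applies in Hive, Impala, or Spark SQL.

```python
# Illustrative "advanced SQL" example: rank rows within groups using a window function.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("east", 100.0), ("east", 50.0), ("west", 250.0)])

# Rank orders by sales within each region, highest sales first.
rows = conn.execute("""
    SELECT region, sales,
           RANK() OVER (PARTITION BY region ORDER BY sales DESC) AS sales_rank
    FROM orders
    ORDER BY region, sales_rank
""").fetchall()
conn.close()
```

Window functions (`RANK`, `ROW_NUMBER`, `SUM ... OVER`) are a common interview and on-the-job staple for this kind of role, since they avoid self-joins when analyzing per-group orderings.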
Functional Areas: Software/Testing/Networking