We are seeking a highly skilled and experienced Data Engineer to join our dynamic team. This India-based role focuses on leveraging Apache Spark and Scala/Python to build sophisticated big data solutions. The ideal candidate will be adept at designing and implementing robust, scalable data processing systems that drive our business insights and decision-making.
Experience range: 8-12 years
Key Responsibilities:
Design and Build Big Data Systems: Develop and maintain scalable and reliable data architectures using Apache Spark and Scala/Python. Design systems that can process large volumes of data efficiently.
Develop Data Pipelines: Construct and maintain ETL processes and data pipelines. Ensure seamless data flow from various sources to our storage and analysis platforms.
Performance Tuning: Monitor and optimize the performance of data processing applications. Implement best practices to enhance efficiency and reduce latency.
Data Quality Assurance: Ensure the accuracy and consistency of data through rigorous testing and validation processes. Develop strategies to handle data anomalies and integrity issues.
Collaboration with Cross-Functional Teams: Work closely with analysts and IT teams to understand data needs and deliver effective solutions. Communicate technical concepts clearly to non-technical stakeholders.
Stay Ahead of Industry Trends: Continuously update technical knowledge and skills, especially in the areas of big data, Spark, Scala, Python, and AI. Evaluate and adopt new tools and technologies to improve data systems.
Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Minimum of 3 years of proven experience in data engineering, specifically with Apache Spark and Scala/Python.
Familiarity with cloud platforms (Azure, AWS, GCP) and their data services is a plus.
Hands-on experience with Databricks, including PySpark and Unity Catalog.
Experience with dataflow and Data Factory pipelines.
Practical experience in handling errors and invalid formats in incoming data.
Experience working with Delta table formats.
Excellent problem-solving skills and ability to optimize data systems for performance and scalability.
Familiarity with generative AI and machine learning concepts, along with experience integrating ML models into big data platforms, is a plus.
Strong communication skills for effective collaboration within and across teams.