Backend: Python, PySpark, SQL, AWS (ECS/EKS/EC2/Redshift/EMR/Glue/S3/IAM), Airflow.
We are seeking a Senior Data Engineer to lead the creation of a robust data ecosystem by transforming raw data from various sources into an integrated data layer. The work uses Airflow, Python, PySpark, and SQL based pipelines for data preparation and processing, with the results loaded into Redshift or Snowflake for business intelligence reporting and dashboarding.
This role requires a hands-on approach to managing the entire data pipeline, from ingestion to visualization, ensuring high-quality and accurate data outputs to support business requirements.
Requirements:
2+ years of experience in data engineering, data analytics, or business intelligence roles.
Strong expertise in Python, PySpark, SQL, AWS (ECS/EKS/EC2/Redshift/EMR/Glue/S3/IAM), Airflow, and Kubernetes.
Understanding of ETL processes and data warehouse concepts.
Strong ability to identify data anomalies, design data validation rules, and perform data cleanup to ensure high-quality data.
Experience in pharma or life sciences data: familiarity with pharmaceutical datasets, including product, patient, or healthcare provider data, is a plus.
Experience in supply chain and manufacturing data is a plus.
Project management and task planning experience to ensure smooth, on-time execution of deliverables.
Strong communication and interpersonal skills to collaborate with both technical and non-technical teams.