Responsibilities
- Design and implement scalable, fault-tolerant data pipelines on AWS using services such as Redshift, RDS, Glue, and EMR.
- Develop and optimize efficient data processing workflows using Python, PySpark, and SQL.
- Build and maintain ETL/ELT processes for structured and unstructured data sources.
- Collaborate with data scientists and analysts to understand data requirements and deliver optimized solutions.
- Ensure data quality, integrity, and security across all stages of the data pipeline.
- Implement best practices for data governance, monitoring, and automation.
- Troubleshoot and resolve data pipeline issues, optimizing for performance and reliability.
- Manage and maintain database schemas and structures.
Mandatory Qualifications
- Bachelor's degree in computer science, engineering, or a related field.
- 6+ years of hands-on experience in data engineering roles.
- Understanding of machine learning and AI concepts.
- Proficiency in Python, PySpark, and SQL for data manipulation and processing.
- Solid understanding of AWS data services.
- Proven experience in data modeling and database design (e.g., SQL, NoSQL).
- Strong problem-solving and analytical skills.
- Familiarity with data governance and data quality practices.
- Management experience.
Preferred Qualifications
- AWS certifications (e.g., AWS Certified Data Analytics).
- Experience with big data technologies (e.g., Apache Spark, Hadoop).
- Familiarity with CI/CD practices and DevOps tools (e.g., Git, Jenkins).
- Knowledge of data streaming and real-time data processing.
Application Instructions
Please click the link below to apply for this position. A new window will open and direct you to our corporate careers page, where you can complete your application. We look forward to hearing from you!