Prama.ai - Python/PySpark Developer - Data Engineering (5-8 yrs)
Prama
posted 8d ago
About the Role:
We are seeking a highly skilled and motivated Python/PySpark Data Engineer to join our growing data engineering team.
In this role, you will play a crucial part in building and maintaining robust and efficient data pipelines that power our data-driven decision making.
You will work closely with data engineers, analysts, and other stakeholders to design, develop, and deploy high-performance data solutions on cloud platforms, primarily AWS.
Responsibilities:
Data Pipeline Development & Maintenance:
- Design, develop, and maintain data pipelines using PySpark on cloud platforms like AWS EMR, AWS Glue, and Databricks.
- Extract, transform, and load (ETL) large datasets from various sources (e.g., databases, APIs, cloud storage) into data warehouses and data lakes.
- Optimize data pipelines for performance, scalability, and cost-effectiveness using techniques like data partitioning, caching, and indexing.
- Implement data quality checks and validation procedures to ensure data accuracy and integrity.
- Troubleshoot and resolve data pipeline issues promptly and effectively.
Python & PySpark Proficiency:
- Write clean, efficient, and well-documented Python code for data processing, transformation, and analysis.
- Leverage advanced PySpark features like DataFrames, SQL, and Spark SQL for data manipulation and aggregation.
- Experience with Spark Streaming and real-time data processing is a plus.
Cloud Technologies:
- Hands-on experience with AWS services such as S3, Redshift, Glue, EMR, and IAM.
- Familiarity with cloud-native data platforms and tools is a plus (e.g., AWS Glue Data Catalog, AWS Athena).
Data Warehousing & ETL/ELT:
- Strong understanding of data warehousing concepts, including dimensional modeling, data marts, and data lakes.
- Experience with ETL/ELT processes and tools (e.g., Airflow, Prefect).
Collaboration & Communication:
- Collaborate effectively with data engineers, data analysts, data scientists, and business stakeholders to understand data requirements and translate them into technical solutions.
- Clearly communicate technical concepts and project progress to both technical and non-technical audiences.
Continuous Learning:
- Stay up-to-date with the latest advancements in data engineering technologies, best practices, and industry trends.
Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, or a related field.
- 3+ years of professional experience in Python development.
- 2+ years of hands-on experience with PySpark and the Spark ecosystem.
- Strong understanding of data structures, algorithms, and object-oriented programming principles.
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL, Oracle).
- Experience with data warehousing concepts, ETL/ELT processes, and data modeling techniques.
- Excellent analytical and problem-solving skills with the ability to identify and resolve complex data issues.
- Strong communication and interpersonal skills with the ability to work effectively in a collaborative team environment.
- Experience with Agile development methodologies is a plus.
Bonus Points:
- Experience with containerization technologies like Docker and Kubernetes.
- Knowledge of machine learning and data science concepts.
- Experience with data visualization tools (e.g., Tableau, Power BI).