1-3 years
Bangalore / Bengaluru
1 vacancy
PySpark Developer (Contract)
Allen Digital
posted 4hr ago
Flexible timing
Allen Digital was created as a strategic partnership between Allen Careers Institute and Bodhi Tree Systems to ensure tech enablement for millions of students. Allen Digital aims to build an EdTech platform that provides students with everything a classroom cannot. We have the backing of some of the best names in media, business, education, and technology. We are a start-up with a reputed team, strong investors, and a legacy.
Job Summary
We are seeking a skilled PySpark Developer to join our data engineering team. The ideal candidate will have strong expertise in Apache Spark, Python programming, and big data technologies. You will be responsible for designing, developing, and maintaining scalable data pipelines and ETL processes to support our data analytics and business intelligence initiatives.
Key Responsibilities
Advanced PySpark Development:
Write Efficient PySpark Code: Develop high-quality, reusable, and scalable PySpark scripts to process large datasets.
Custom Transformations: Design and implement custom transformations and user-defined functions (UDFs) to meet specific data processing requirements.
DataFrame Manipulation: Use PySpark DataFrame APIs to perform complex data manipulations, aggregations, and joins.
Error Handling and Logging: Implement robust error handling and logging mechanisms within PySpark applications to ensure reliability and ease of troubleshooting.
Develop and Maintain Data Pipelines:
Design, build, and optimize robust and scalable data pipelines using PySpark.
Implement ETL processes to ingest, transform, and load data from various sources.
Performance Optimization:
Optimize PySpark jobs for performance and efficiency, including tuning Spark configurations and optimizing resource utilization.
Conduct code reviews and performance assessments to identify and implement improvements in PySpark codebases.
Data Processing and Analysis:
Perform data cleaning, validation, and transformation using PySpark to ensure data quality and consistency.
Collaborate with data analysts and scientists to understand data requirements and deliver PySpark-based solutions.
Integration and Deployment:
Integrate PySpark applications with other data tools and platforms, ensuring seamless data flow across systems.
Participate in the deployment and maintenance of PySpark applications in production environments, ensuring scalability and reliability.
Collaboration and Documentation:
Work closely with cross-functional teams including data engineers, software developers, and business stakeholders to deliver comprehensive data solutions.
Document PySpark code, processes, workflows, and technical specifications to ensure maintainability and knowledge sharing.
Qualifications
Education:
Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.
Experience:
1-3 years of experience in data engineering or a related role.
Proven experience with Apache Spark and PySpark.
Technical Skills:
Proficiency in Python programming, with a strong emphasis on PySpark for big data processing.
Strong understanding of big data technologies and frameworks (e.g., Hadoop, Hive, Kafka).
Experience with SQL and database systems (e.g., MySQL, PostgreSQL, NoSQL databases).
Familiarity with data warehousing concepts and tools.
Tools & Platforms:
Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) is a plus.
Knowledge of version control systems (e.g., Git).
Preferred Skills
Advanced Analytics:
Experience with machine learning libraries and frameworks.
DevOps Practices:
Familiarity with CI/CD pipelines and containerization technologies (e.g., Docker, Kubernetes).
Soft Skills:
Strong problem-solving abilities and attention to detail.
Excellent communication and teamwork skills.
Ability to manage multiple tasks and meet deadlines in a fast-paced environment.
Employment Type: Full Time, Permanent