TQuanta Technologies - PySpark Developer - Python/SQL (6-15 yrs)
posted 6d ago
Flexible timing
Job Description : PySpark Developer
Position Overview :
We are seeking a highly skilled PySpark Developer with 6-15 years of experience to join our team. The ideal candidate will have a strong background in big data technologies, with expertise in PySpark and related tools to design, develop, and maintain scalable data processing solutions. This role involves collaborating with cross-functional teams to drive data-driven decision-making and to improve the efficiency of large-scale data pipelines.
Key Responsibilities :
- Develop and maintain scalable ETL pipelines using PySpark to process large volumes of structured and unstructured data.
- Optimize data pipelines for performance, reliability, and scalability.
- Implement data quality and data validation frameworks to ensure the integrity of processed data.
- Integrate data from diverse sources, including APIs, databases, and flat files, into big data platforms.
- Transform raw data into usable formats for downstream analysis and machine learning applications.
- Collaborate with data engineers, data scientists, and business stakeholders to gather requirements and deliver data solutions tailored to business needs.
- Design and implement solutions aligned with architectural best practices and organizational goals.
- Analyze and optimize PySpark jobs for performance improvements.
- Tune cluster configurations and resource utilization in cloud or on-premises environments.
- Stay updated with emerging big data technologies and provide recommendations for new tools and frameworks.
- Mentor junior team members and conduct code reviews to ensure quality standards.
- Prepare detailed technical documentation for developed solutions.
- Create dashboards and reports for monitoring data pipelines and job execution metrics.
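As an illustration of the data-quality and validation responsibility above, here is a minimal sketch in plain Python (standard library only; all names are hypothetical, and a production version would express these checks as PySpark DataFrame filters so they run distributed across the cluster):

```python
# Minimal sketch of a data-quality validation step. Rules are (name, predicate)
# pairs; records failing any rule are counted and excluded from the valid set.

def validate(records, rules):
    """Apply each (name, predicate) rule to every record and count failures."""
    report = {name: 0 for name, _ in rules}
    valid = []
    for record in records:
        failed = [name for name, predicate in rules if not predicate(record)]
        for name in failed:
            report[name] += 1
        if not failed:
            valid.append(record)
    return valid, report

# Example rules for a toy orders feed (hypothetical field names)
rules = [
    ("non_null_id", lambda r: r.get("order_id") is not None),
    ("positive_amount", lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] > 0),
]

records = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": None, "amount": 10.0},
    {"order_id": 3, "amount": -5.0},
]

valid, report = validate(records, rules)
print(valid)   # [{'order_id': 1, 'amount': 25.0}]
print(report)  # {'non_null_id': 1, 'positive_amount': 1}
```

The same pattern scales to PySpark by turning each predicate into a DataFrame filter and each failure count into an aggregation, which keeps the rule definitions separate from the execution engine.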
Required Skills and Qualifications :
- Proficiency in PySpark and Spark core concepts (RDDs, DataFrames, Datasets).
- Strong programming skills in Python.
- Hands-on experience with big data platforms like Hadoop, Hive, or HDFS.
- Familiarity with databases (SQL and NoSQL).
- Experience with cloud platforms such as AWS, Azure, or Google Cloud for data processing.
- Knowledge of CI/CD pipelines, version control systems (e.g., Git), and containerization tools like Docker.
- Ability to debug and resolve issues in distributed data processing systems.
- Strong analytical and troubleshooting skills for complex systems.
- Strong interpersonal skills for effective communication with technical and non-technical teams.
- Proven ability to work in an Agile development environment.
- Experience with machine learning frameworks and libraries like TensorFlow or Scikit-Learn.
- Knowledge of streaming data frameworks like Kafka or Spark Streaming.
- Exposure to tools like Airflow for workflow orchestration.
- Certification in big data or cloud technologies is a plus.
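To make the Spark core concepts listed above concrete, the sketch below shows the filter/map/reduce transformation shape in dependency-free Python. PySpark is not assumed here; in Spark the same chain would be written against an RDD or DataFrame and evaluated lazily in parallel across partitions, whereas this toy version runs eagerly in a single process:

```python
from functools import reduce

# Toy illustration of an RDD-style transformation chain in plain Python.
# In PySpark this shape would be rdd.filter(...).map(...).reduce(...).

raw_amounts = [120, -30, 45, 0, 310]  # e.g. order totals in cents

positives = filter(lambda x: x > 0, raw_amounts)  # drop invalid rows
with_fee = map(lambda x: x + 25, positives)       # transform each element
total = reduce(lambda a, b: a + b, with_fee)      # aggregate to one value

print(total)  # 550
```

Interviewers often probe exactly this distinction: the logic is ordinary functional programming, while Spark adds lazy evaluation, partitioning, and fault tolerance around it.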
Educational Requirements :
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
What We Offer :
- Collaborative and innovative work environment.
- Attractive compensation and growth opportunities.
Functional Areas: Other