Key Responsibilities:
- Data Processing: Using PySpark to process structured and unstructured data, including filtering, grouping, aggregating, and transforming large datasets (see the sketch after this list).
- Data Engineering: Building and maintaining ETL (Extract, Transform, Load) pipelines that use PySpark to load data from various sources (such as HDFS, S3, or databases) and transform it into a format suitable for analysis.
- Optimization: Tuning Spark jobs for performance, for example by reducing job execution time and minimizing resource usage.
- Cluster Management: Monitoring and managing Spark clusters (often on platforms like Amazon EMR, Databricks, or Hadoop) to ensure efficient processing of large-scale data.
- Collaborating with Data Scientists: Working closely with data scientists to deploy machine learning models on big data, providing the infrastructure to support complex analytics tasks.
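As a rough illustration of the Data Processing and Data Engineering responsibilities, here is a minimal PySpark sketch of an ETL-style job. The bucket paths, column names (status, order_ts, amount, customer_id), and the app name are hypothetical placeholders, not part of any specific project.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start (or reuse) a Spark session.
spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

# Extract: read raw order data from object storage (path is hypothetical).
orders = spark.read.parquet("s3a://example-bucket/raw/orders/")

# Transform: filter out cancelled orders, then aggregate revenue per customer per day.
daily_revenue = (
    orders
    .filter(F.col("status") != "cancelled")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("customer_id", "order_date")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("*").alias("order_count"),
    )
)

# Load: write the aggregated result back out, partitioned by date.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/daily_revenue/"
)
```

The same pattern (read, filter/group/aggregate, write) scales from a local session to a cluster without code changes, which is a large part of why PySpark is used for these pipelines.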
Key Skills:
- Apache Spark: Understanding of the core concepts of Apache Spark (RDDs, DataFrames, Datasets, and Spark SQL); a short sketch follows this list.
- Python: Proficiency in Python, as PySpark applications are written in Python.
- SQL: Knowledge of SQL to interact with databases and perform data manipulation tasks.
- Distributed Computing: Experience with distributed systems and parallel computing, since Spark processes data in parallel across a cluster.
- Cloud Platforms: Experience with cloud platforms like AWS, Azure, or GCP for data storage and computing (using services such as S3, EMR, or Databricks).
- ETL Pipelines: Experience designing and managing ETL pipelines to process and analyze data.
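To make the Apache Spark and SQL skills concrete, here is a minimal sketch contrasting the DataFrame API with Spark SQL on the same toy data; the column names and values are invented for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-sketch").getOrCreate()

# A tiny in-memory DataFrame (columns and rows are made up).
events = spark.createDataFrame(
    [("alice", "click", 3), ("bob", "click", 1), ("alice", "view", 7)],
    ["user", "event_type", "hits"],
)

# DataFrame API: group and aggregate programmatically.
by_user_df = events.groupBy("user").sum("hits")

# Spark SQL: the same aggregation expressed as a query over a temporary view.
events.createOrReplaceTempView("events")
by_user_sql = spark.sql(
    "SELECT user, SUM(hits) AS total_hits FROM events GROUP BY user"
)

by_user_df.show()
by_user_sql.show()
```

Both calls produce the same execution plan under the hood, so choosing between the DataFrame API and Spark SQL is largely a matter of readability and team convention.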
Key Traits of a PySpark Developer:
- Analytical thinking and problem-solving skills.
- Ability to work with large, complex datasets.
- Strong programming and debugging skills, especially in Python.
- Familiarity with big data technologies and distributed computing.
- Knowledge of data engineering and machine learning concepts.