Data Engineer - Python/Spark (8-10 yrs)
SAPBOT Technologies
Posted 20 hours ago
Job Description:
- Extensive experience building data processing pipelines using Apache Spark/Databricks with Python and PySpark.
- Good knowledge of the inner workings of Apache Spark Structured Streaming is a plus.
- Deep understanding of Python and its ecosystem, including the principles and tooling that help write production-grade applications, e.g. PEP 8, MyPy, Pylint, Pytest.
- Good knowledge of data design patterns and methodologies for building a data lake on the Azure cloud stack, e.g. ADLS Gen2.
- Experience creating data structures optimized for storage and various query patterns in Delta Lake, Parquet, and Avro.
- Deep understanding of the SDLC using GitLab/GitHub; knowledge of CI/CD is a plus.
- Working experience in the cloud is good to have (Azure preferred). Knowledge of Kafka or Event Hubs is a plus.
- Experience building data lakehouses using the medallion architecture.
- Knowledge of Data Mesh principles is a plus.
- Ability to debug using tools such as the Spark UI and Ganglia UI; expertise in optimizing Spark jobs.
- Ability to work across structured, semi-structured, and unstructured data, extracting information and identifying linkages across disparate datasets.
- Experience building applications using Polars, pandas, and NumPy is a plus.
- Experience building microservices on Kubernetes is a plus.
- Experience in traditional data warehousing concepts (Kimball Methodology, Star Schema, SCD2).
- Experience with orchestration tools such as Azure Databricks Workflows and Apache Airflow is a plus.
- Strong problem solving and analytical skills.
- Working experience with Agile methodologies (Scrum).
- A proven team player with strong leadership skills who can work collaboratively across business units, teams, and regions.
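To illustrate the SCD Type 2 concept referenced in the warehousing requirement: instead of updating a dimension row in place, the current version is expired and a new version is appended, preserving history. A minimal, Spark-free sketch in plain Python (the `DimRow` fields and `apply_scd2` helper are hypothetical, for illustration only):

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a customer record in an SCD Type 2 dimension."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date]  # None means this version is still open
    is_current: bool


def apply_scd2(dim: list[DimRow], customer_id: int,
               new_city: str, as_of: date) -> list[DimRow]:
    """Expire the current version if the tracked attribute changed,
    then append a new current version. No-op if nothing changed."""
    out: list[DimRow] = []
    changed = False
    for row in dim:
        if row.customer_id == customer_id and row.is_current and row.city != new_city:
            # Close the old version instead of overwriting it.
            out.append(replace(row, valid_to=as_of, is_current=False))
            changed = True
        else:
            out.append(row)
    if changed:
        out.append(DimRow(customer_id, new_city, as_of, None, True))
    return out
```

In Delta Lake the same pattern is typically expressed with `MERGE INTO` rather than row-by-row Python; the sketch only shows the versioning logic itself.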
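On the production-grade Python point (PEP 8, MyPy, Pytest): a minimal sketch of the expected style, using a hypothetical `parse_event_size` helper with full type annotations for MyPy and an assert-based Pytest test:

```python
def parse_event_size(value: str) -> int:
    """Parse a human-readable size such as '10KB' into bytes.

    Fully type-annotated so MyPy can check callers, and small
    enough to cover exhaustively with Pytest.
    """
    # Longer suffixes first, so "KB" is not matched by the bare "B" rule.
    units = {"MB": 1024 ** 2, "KB": 1024, "B": 1}
    text = value.strip().upper()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    raise ValueError(f"unrecognized size: {value!r}")


def test_parse_event_size() -> None:
    # Pytest discovers plain test_* functions; bare asserts, no boilerplate.
    assert parse_event_size("10KB") == 10 * 1024
    assert parse_event_size("2MB") == 2 * 1024 ** 2
    assert parse_event_size("512B") == 512
```
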
Functional Areas: Software/Testing/Networking