Hoonartek
7-10 years
Azure Databricks Lead/Specialist - ETL/PySpark (7-10 yrs)
Posted 1 month ago
Flexible timing
Job Title: Azure Databricks Lead
Job Overview:
As an Azure Databricks Lead/Specialist, you will play a critical role in designing, implementing, and optimizing data solutions using Azure Databricks. Your expertise will contribute to building robust data pipelines, ensuring data quality, and enhancing overall performance. You'll collaborate with cross-functional teams to deliver high-quality solutions aligned with business requirements.
Responsibilities:
1. Design and Develop Data Pipelines:
- Create scalable data processing pipelines using Azure Databricks and PySpark.
- Implement ETL (Extract, Transform, Load) processes to ingest, transform, and load data from various sources.
- Collaborate with data engineers and architects to ensure efficient data movement and transformation.
2. Data Quality Implementation:
- Establish data quality checks and validation rules within Azure Databricks.
- Monitor data quality metrics and address anomalies promptly.
- Work closely with data governance teams to maintain data accuracy and consistency.
3. Unity Catalog Integration:
- Leverage Azure Databricks Unity Catalog to manage metadata, tables, and views.
- Integrate Databricks assets seamlessly with other Azure services.
- Ensure proper documentation and organization of data assets.
4. Delta Lake Expertise:
- Understand and utilize Delta Lake, which provides ACID transactions and time travel capabilities on top of data lakes.
- Implement Delta Lake tables for reliable data storage and versioning.
- Optimize performance by leveraging Delta Lake features.
5. Performance Tuning and Query Optimization:
- Profile and analyze query performance.
- Optimize SQL queries, Spark jobs, and transformations for efficiency.
- Tune resource allocation to achieve optimal execution times.
6. Resource Optimization:
- Manage compute resources effectively within Azure Databricks clusters.
- Scale clusters dynamically based on workload requirements.
- Monitor resource utilization and cost efficiency.
7. Source System Integration:
- Integrate Azure Databricks with various source systems (e.g., databases, data lakes, APIs).
- Ensure seamless data ingestion and synchronization.
- Handle schema evolution and changes in source data.
8. Stored Procedure Conversion in Databricks:
- Convert existing stored procedures (e.g., from SQL Server) into Databricks-compatible code.
- Optimize and enhance stored procedures for better performance within Databricks.
- Apply experience converting SSRS (SQL Server Reporting Services) reports.
Qualifications and Skills:
- Education: Bachelor's degree in Computer Science, Information Technology, or a related field.
- Experience:
- Minimum 5 years of experience architecting and building data platforms on Azure.
- Proficiency in Azure Databricks, PySpark, and SQL.
- Familiarity with Delta Lake concepts.
Certifications (preferred):
- Microsoft Certified: Azure Data Engineer Associate or similar.
- Databricks Certified Associate Developer for Apache Spark.
Functional Areas: Software/Testing/Networking