We are seeking an experienced Databricks Developer / Data Engineer to design, develop, and optimize data pipelines, ETL workflows, and big data solutions using Databricks. The ideal candidate has strong expertise in Apache Spark, PySpark, SQL, and cloud data platforms (Azure, AWS, or GCP). This role involves working with large-scale datasets, data lakes, and data warehouses to drive business intelligence and analytics.
Key Responsibilities:
- Design, build, and optimize ETL and ELT pipelines using Databricks and Apache Spark.
- Use big data processing frameworks (PySpark, Scala, SQL) for data transformation and analytics.
- Implement Delta Lake architecture for data reliability, ACID transactions, and schema evolution.
- Integrate Databricks with cloud services such as Azure Data Lake, AWS S3, GCP BigQuery, and Snowflake.
- Develop and maintain data models, data lakes, and data warehouse solutions.
- Tune Spark performance, job scheduling, and cluster configurations.
- Work with Azure Synapse, AWS Glue, or GCP Dataflow to enable seamless data integration.
- Implement CI/CD automation for data pipelines using Azure DevOps, GitHub Actions, or Jenkins.
- Perform data quality checks, validation, and governance using Databricks Unity Catalog.
- Collaborate with data scientists, analysts, and business teams to support analytics and AI/ML models.

Required Skills & Qualifications:
- 6+ years of experience in data engineering and big data technologies.
- Strong expertise in Databricks, Apache Spark, and PySpark or Scala.
- Hands-on experience with SQL, NoSQL, and structured/unstructured data processing.
- Experience with cloud platforms (Azure, AWS, GCP) and their data services.
- Proficiency in Python, SQL, and Spark optimization.
- Experience with Delta Lake, Lakehouse architecture, and metadata management.
- Strong understanding of ETL/ELT processes, data lakes, and data warehousing concepts.
- Experience with streaming data processing (Kafka, Event Hubs, Kinesis, etc.).
- Knowledge of security best practices, role-based access control (RBAC), and compliance.
- Experience with Agile methodologies and working in cross-functional teams.

Preferred Qualifications:
- Databricks certification (Databricks Certified Data Engineer Associate or Professional).
- Experience with machine learning and AI/ML pipelines on Databricks.
- Hands-on experience with Terraform, CloudFormation, or other Infrastructure as Code (IaC) tools.