TalentBox Labs
Associate Data Engineer - Azure Databricks (1-2 yrs)
posted 4d ago
Job Description:
- Assist in designing, developing, and optimizing end-to-end data migration pipelines from various relational databases (Oracle, SQL Server, MySQL, PostgreSQL, DB2, Teradata, etc.) to cloud-based data lakes and data warehouses (AWS, Azure, GCP, Snowflake, Databricks).
- Collaborate with data engineers, architects, and business stakeholders to understand data structures, dependencies, and migration requirements to ensure seamless and efficient data transfer.
- Develop, test, and deploy ETL/ELT workflows using SQL, Python, and Spark for data extraction, transformation, and loading while ensuring data integrity and accuracy.
- Support large-scale data transformations, aggregations, and enrichment processes to prepare datasets for analytics and reporting.
- Optimize ETL/ELT pipelines for performance, scalability, and cost efficiency by leveraging best practices such as partitioning, indexing, and caching.
- Implement robust data validation and reconciliation checks to detect and fix inconsistencies, anomalies, and missing records before loading into the target system.
- Develop scripts and automation solutions to streamline repetitive tasks, such as schema conversion, data mapping, and incremental loads.
- Monitor and troubleshoot data pipeline failures, identify root causes, and implement corrective actions to ensure seamless data movement with minimal downtime.
- Ensure adherence to data governance policies, compliance standards (GDPR, HIPAA, CCPA), and security best practices, including data encryption, masking, and access controls.
- Work with cloud-based tools and services like AWS Glue, Azure Data Factory, Google Cloud Dataflow, and Databricks Workflows to manage and orchestrate data pipelines efficiently.
- Use workflow orchestration tools such as Apache Airflow or Prefect to schedule and monitor data pipeline execution.
- Conduct detailed data profiling and metadata analysis to assess data quality, identify anomalies, and recommend data standardization approaches.
- Collaborate with DevOps teams to integrate data workflows into CI/CD pipelines, ensuring continuous testing and deployment of data migration scripts.
Functional Areas: Other