This role is responsible for data collection procedures, ensuring that accurate and relevant data is available for reporting, analytics, and machine learning models, and for extracting and analysing data from the primary and secondary databases. The role ensures data integrity and compliance by performing data cleansing and data validation. The role is responsible for tracking the flow of data from its origin to its destination and recording metadata about data assets and transformations. The role performs root-cause analysis and recommends or executes corrective actions when data-related system problems occur.
Responsibilities
Review and evaluate data pipelines and engineering activities for compliance with architecture, security, and quality guidelines and standards; provide tangible feedback to improve data quality and mitigate failure risk.
Develop and implement data governance strategies, policies, and procedures to ensure data quality.
Establish data frameworks and guidelines to ensure accurate classification, ownership, and lineage of data.
Lead all stages of design and development for complex, secure, and performant data solutions and models, including design, analysis, coding, testing, and integration of structured and unstructured data.
Ensure and maintain a high level of data integrity by using tools to monitor and mass-update data changes.
Build KPIs and metrics to measure and monitor data quality.
Represent the data engineering team in all phases of larger and more complex development projects.
Drive innovation and integration of new technologies into projects and activities in the big data space.
Provide guidance and mentoring to less experienced staff members.
Education and Experience Required
Four-year or graduate degree in Computer Science, Information Technology, Software Engineering, Statistics/Mathematics, or another related discipline, or commensurate work experience or demonstrated competence.
Typically has 7-10 years of work experience, preferably in data analytics, data engineering, data modeling, or a related field.
Has experience managing data pipelines, building monitoring solutions, and performing administrative and governance tasks.
Experience in metadata management or building data catalogs is an advantage.
Knowledge & Skills
Databricks
PostgreSQL
Amazon Web Services
Data Analysis
Data Engineering
Data Modeling
Data Pipelines
Extract, Transform, Load (ETL)
Python (Programming Language)
PySpark (Programming Language)
SQL (Programming Language)
Cross-Org Skills
Effective Communication
Results Orientation
Learning Agility
Digital Fluency
Customer Centricity
Impact & Scope
Impacts the function, leads or provides expertise to functional project teams, and may participate in cross-functional initiatives.
Complexity
Works on complex problems where analysis of situations or data requires an in-depth evaluation of multiple factors.