ETL Development & Data Warehousing: Design, build, deploy, and iteratively improve ETL processes and data warehousing solutions, ensuring data integrity, performance, and security.
Collaboration Across Teams: Work closely with data scientists, analysts, and product teams to enable data collection, analysis, and the development of data-driven features.
Data Pipeline Optimization: Develop and maintain data pipelines using tools like Airflow and Airbyte to ensure efficient data flow from multiple data sources, including MySQL, SendGrid, Iterable, and application logs.
Machine Learning Integration: Work alongside data scientists to implement and improve machine learning models, deploying them as services to leverage our growing data volumes effectively.
Technical Leadership: Provide input during planning, story mapping, and other product development activities to help shape features and make informed technical decisions.
DevOps & Reliability: Collaborate with Site Reliability Engineers and the Technology team to align data platform work with the overall technical direction, ensuring efficient integration between the application platform and the data platform.
Our Tech Stack
Core ETL process written in Ruby, with a transition underway towards an S3 and Iceberg data lake.
Data ingestion managed by Airflow and Airbyte.
Data storage using Snowflake, structured following Kimball dimensional modeling principles.
Data transformation handled with dbt.
Infrastructure managed on Kubernetes (EKS) with Terraform.
Additional support for a recommendation engine developed in Python.
Requirements
Proven experience in building and deploying data warehouses and ETL pipelines, with an emphasis on maintainability, performance, and security.
Strong programming skills in Python or other general-purpose programming languages, with the ability to write clean, maintainable, and well-tested code.
Experience working with data platforms and technologies, ideally with direct experience in some of our stack (e.g., Airflow, Snowflake, Kubernetes, Terraform).
Professional experience working in a cloud-based environment with a DevOps culture, including hands-on experience with data ingestion, processing, and storage.
Ability to understand and solve data-related challenges for analysts, data scientists, and other engineers.
Strong written and verbal communication skills, with the ability to engage with stakeholders at various levels of the organization.
Good to Have
Experience working with Ruby or willingness to adapt to new languages.
Knowledge of Kimball dimensional modeling and its application in data warehousing.
Exposure to data lake architecture and the use of Iceberg or similar technologies.
Familiarity with cloud platforms and infrastructure-as-code practices (e.g., Terraform).