We're seeking a hands-on Lead Data Engineer to build and scale our data infrastructure from the ground up. You'll play a critical role in enabling data-driven decisions across the organization, focusing on high-frequency data ingestion and the development of custom pipelines and tools to support our unique needs. This is an opportunity to make a significant impact in a fast-growing startup environment.
Responsibilities:
Design, build, and maintain robust and scalable data pipelines using both open-source and cloud-based technologies.
Develop and optimize ingestion processes for high-frequency data arriving as both small and large files.
Build and maintain a data warehouse/lake solution.
Develop custom tools for data validation, processing, analysis, and automation.
Collaborate with product and engineering teams to understand data needs and deliver solutions.
Implement data quality monitoring and alerting.
Mentor and guide junior data engineers as the team grows.
Stay up-to-date with the latest data engineering trends and technologies.
Qualifications:
Bachelor's degree in Computer Science or a related field.
3+ years of experience in data engineering.
Strong understanding of data engineering principles and best practices.
Proficiency in Python and SQL.
Experience with data pipeline tools (e.g., Apache Airflow, Prefect, dbt).
Experience with cloud platforms (e.g., AWS, GCP, Azure).
Experience with data warehousing/lake solutions (e.g., Snowflake, BigQuery, Databricks).
Experience with message queues (e.g., Kafka, RabbitMQ).
Excellent problem-solving and communication skills.
Experience with high-frequency data ingestion is a plus.
Tools:
Programming Languages: Python, SQL
Data Pipelines: Apache Airflow, Prefect, dbt, Hevo
Cloud Platforms: AWS, GCP, Azure
Data Warehouses/Lakes: Snowflake, BigQuery, Databricks