We are seeking a skilled and motivated Senior Data Engineer to join the Data Acquisition side of our AI team at Speechify. This team is responsible for all aspects of data collection to support our model training operations. In this role, you will design, build, and maintain robust data pipelines and systems to support the organization's data needs. The ideal candidate has a strong understanding of data architecture and ETL processes, and a passion for leveraging data to drive decision-making and innovation.
What You'll Do
Design, develop, and maintain scalable data pipelines and workflows to ingest, transform, and store large datasets.
Collaborate with data scientists, analysts, and software engineers to understand data needs and deliver effective solutions.
Optimize and enhance existing data processes for performance, scalability, and cost-efficiency.
Implement data quality checks, validation, and monitoring to ensure data accuracy and reliability.
Develop and manage data warehouses, databases, and other storage solutions.
Ensure compliance with data governance and security policies.
Stay up to date with emerging technologies and best practices in data engineering, and apply them as appropriate.
An Ideal Candidate Should Have
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Proven experience as a Data Engineer or in a similar role, as well as experience with ETL.
Proficiency in programming languages such as Python, and experience with SQL.
Big data tools: data lakes and Delta lakes
Cloud: bare-metal and hybrid infrastructure
Good to Have
Experience working with media files (transformations)
Torch dataset experience
What We Offer
A fast-growing environment where you can help shape the culture
An entrepreneurial crew that supports risk, intuition, and hustle
A hands-off approach so you can focus and do your best work
The opportunity to make an impact in a transformative industry
A competitive salary, a collegiate atmosphere, and a commitment to building a great asynchronous culture