Collaborate with cross-functional teams across multiple projects and initiatives to orchestrate and deliver cohesive data engineering solutions
Take ownership of the entire data services life cycle, from ingestion through processing and ETL to delivery for reporting
Collaborate with other technical teams to deliver data solutions that meet both business and technical needs
Define technical requirements and implementation details for the underlying data lake, data warehouse, and data marts
Identify, troubleshoot, and resolve issues with production data integrity and performance
As the lead, work across all aspects of data management to ensure that patterns, decisions, and tooling are implemented in accordance with enterprise standards
Conduct data source gap analysis and develop data source/target catalogs and mappings
Develop a thorough understanding of cross-system integration, interactions, and relationships in order to plan for the enterprise's future data requirements
Design, coordinate, and carry out pilots, prototypes, and proofs of concept to validate specific scenarios and provide an implementation roadmap
Recommend and ensure technical qualities for data engineering solutions, such as scalability, security, performance, data recovery, and reliability
Organize workshops to define requirements and design data solutions
Apply enterprise and solution architecture decisions to data architecture frameworks and data models
Maintain all data architecture artifacts and procedures in a central repository
Collaborate with IT teams, software providers, and business owners to forecast and design data architecture
Address business requirements for data collection, aggregation, and interaction across multiple data streams
Job Requirements:
Bachelor's or Master's degree in Engineering or Computer Science (or equivalent experience)
3+ years of relevant experience as a software engineer
Programming experience in Python, Scala, or Java is required
RDBMS and data warehousing experience with strong SQL skills, using Redshift, Snowflake, or similar
In-depth understanding of, and experience with, data and information architecture patterns, as well as implementation approaches for operational data stores, data warehouses, data marts, and data lakes
Expertise in logical/physical data architecture, design, and development
Experience implementing a data lake / big data analytics platform, either cloud-based or on-premises; AWS preferred
Experience working with large volumes of data, including designing, implementing, and supporting highly distributed data applications
Extensive knowledge of development tools for CI/CD, unit and integration testing, automation, and orchestration, such as GitHub, Jenkins, Concourse, Airflow, and Terraform
Experience writing Kafka producers and consumers, or knowledge of AWS Kinesis
Hands-on experience building a distributed data processing platform using Big Data technologies such as Hadoop, Spark, and others
Ability to work independently (hands-on) as well as collaboratively within a team
Excellent analytical and problem-solving skills, particularly in the face of ill-defined issues or conflicting information
Experience with streaming data ingestion, machine learning, and Apache Spark would be advantageous
Excellent verbal, written, and presentation skills across all levels of the organization
Adept at eliciting, gathering, and managing requirements in an Agile delivery environment
Ability to convert ambiguous concepts into concrete ideas