Lead multiple projects and efforts to orchestrate and deliver cohesive data engineering solutions in collaboration with various functional teams
Take ownership of the entire data services life cycle, from data ingestion to data processing and ETL to data delivery for reporting
Collaborate with other technical teams to deliver data solutions that meet both business and technical needs
Define technical requirements and implementation details for the underlying data lake, data warehouse, and data marts
Identify, troubleshoot, and resolve issues with production data integrity and performance
As the lead, oversee all aspects of data management to ensure that patterns, decisions, and tooling are implemented in accordance with enterprise standards
Conduct data source gap analysis and develop data source/target catalogs and mappings
Develop a thorough understanding of cross-system integration, interactions, and relationships in order to create an enterprise view of future data requirements
Design, coordinate, and carry out pilots, prototypes, and proofs of concept to validate specific scenarios and provide an implementation roadmap
Recommend and ensure technical functionality for Data Engineering (e.g., scalability, security, performance, data recovery, and reliability)
Organize workshops to define requirements and design data solutions
Apply enterprise and solution architecture decisions to data architecture frameworks and data models
Keep track of all data architecture artifacts and procedures in a repository
Collaborate with IT teams, software providers, and business owners to forecast and design data architecture
Job Requirements:
Bachelor's/Master's degree in Engineering or Computer Science (or equivalent experience)
3-5+ years of relevant experience as a software engineer
Knowledge of data architectures, data pipelines, real-time processing, streaming, networking, and security is required
Programming experience in Python, Scala, or Java is required
RDBMS and data warehousing experience (strong SQL) with Redshift, Snowflake, or similar
Expertise in logical/physical data architecture and design, with experience implementing a data lake / big data analytics platform, either cloud-based or on-premises (AWS preferred)
Extensive knowledge of development tools for CI/CD, unit and integration testing, automation, and orchestration, such as GitHub, Jenkins, Concourse, Airflow, and Terraform
Experience writing Kafka producers and consumers, or knowledge of AWS Kinesis
Hands-on experience building a distributed data processing platform using Big Data technologies such as Hadoop, Spark, and others
Able to work independently (hands-on) as well as collaboratively in a team
Excellent analytical and problem-solving skills, particularly in the face of ill-defined issues or conflicting information
Experience with streaming data ingestion, machine learning, and Apache Spark would be advantageous
Capable of eliciting, gathering, and managing requirements in an Agile delivery environment
Excellent communication skills (verbal, written, and presentation) at all organizational levels
Ability to convert ambiguous concepts into concrete ideas