Requirement Analysis: Collaborate with stakeholders to understand business requirements and data sources, and define the architecture and design of data pipelines to meet these requirements.
Architecture Design: Design scalable, reliable, and efficient data pipelines to ingest, process, transform, and store large volumes of data from diverse sources, ensuring data quality and consistency.
Technology Selection: Evaluate (through proofs of concept) and recommend appropriate technologies, frameworks, and tools for building and managing data pipelines, considering factors such as performance, scalability, and cost-effectiveness.
Data Processing: Develop and implement data processing logic, including data cleansing, transformation, and aggregation, using technologies such as AWS Glue, AWS Batch, and AWS Lambda (a minimal sketch follows this list).
Data Storage: Design and implement data storage solutions, including data lakes, data warehouses, and NoSQL databases, to store and manage data for analysis and reporting.
Data Integration: Integrate data pipelines with external systems, APIs, and services to automate data workflows and ensure seamless data exchange between systems.
Monitoring and Logging: Implement monitoring and logging solutions for data pipelines using tools like AWS CloudWatch to ensure data pipeline health and performance.
Security and Compliance: Ensure data pipeline architecture complies with security best practices and regulatory requirements, implementing encryption, access controls, and data masking as needed.
Documentation: Create and maintain documentation for data pipeline architecture, design, and implementation, including diagrams, data flow descriptions, and operational procedures.
Collaboration: Collaborate with cross-functional teams, including data engineers, data scientists, and business analysts, to understand their requirements and integrate data pipelines into their workflows.
Continuous Improvement: Stay updated with the latest trends, tools, and technologies in data engineering and data pipeline architecture, and continuously improve data pipeline processes and methodologies.
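For illustration only, the sketch below shows the kind of processing and monitoring step described in the Data Processing and Monitoring items above: a hypothetical AWS Lambda handler that cleans incoming records, writes the result to S3, and publishes a custom CloudWatch metric. The bucket name, metric namespace, and field names are placeholders, not part of this role description.

```python
# Minimal sketch of a Lambda-style cleansing/transformation step.
# Hypothetical bucket, namespace, and field names; not a prescribed implementation.
import json

import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")


def handler(event, context):
    # Keep only records that carry the fields downstream consumers expect.
    records = event.get("records", [])
    cleaned = [
        {"id": r["id"], "amount": float(r["amount"])}
        for r in records
        if r.get("id") and r.get("amount") is not None
    ]

    # Persist the transformed batch to a (placeholder) curated S3 location.
    s3.put_object(
        Bucket="example-curated-bucket",
        Key=f"curated/{context.aws_request_id}.json",
        Body=json.dumps(cleaned).encode("utf-8"),
    )

    # Emit a simple pipeline-health metric to CloudWatch.
    cloudwatch.put_metric_data(
        Namespace="ExamplePipeline",
        MetricData=[
            {"MetricName": "RecordsProcessed", "Value": len(cleaned), "Unit": "Count"}
        ],
    )
    return {"processed": len(cleaned), "dropped": len(records) - len(cleaned)}
```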
Desired Skills and Requirements
Data Engineering: Understanding of data engineering concepts and principles, including data ingestion, processing, transformation, and storage, using tools and technologies such as AWS Glue, AWS Batch, and AWS Lambda.
Data Modeling: Knowledge of data modeling techniques and best practices for designing efficient and scalable data pipelines, including dimensional modeling, star schemas, and snowflake schemas (a minimal sketch follows this list).
Programming Languages: Proficiency in programming languages commonly used in data engineering, such as Python, Java, or Scala, as well as experience with SQL for data querying and manipulation.
Cloud Computing: Proficiency in using AWS cloud services for data pipeline architecture and implementation.
Data Integration: Understanding of data integration techniques and tools for integrating data from various sources, including batch and real-time data integration, and experience with ETL (Extract, Transform, Load) processes.
Database Systems: Knowledge of database systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra), and experience in designing and managing database schemas.
Data Governance: Awareness of data governance principles and practices, including data quality, data lineage, and data privacy, and the ability to ensure compliance with relevant regulations and standards.
Problem-Solving Skills: Excellent problem-solving skills, with the ability to analyze complex data pipeline requirements and design solutions that meet business needs.
Communication and Collaboration: Strong communication and collaboration skills, with the ability to effectively communicate technical concepts to non-technical stakeholders and work effectively in a team environment.
Learning Agility: A commitment to continuous learning and staying updated with the latest trends, tools, and technologies in data engineering and data pipeline architecture.
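As a deliberately simplified illustration of the dimensional-modeling, SQL, and ETL skills listed above, the sketch below builds a tiny star schema in an in-memory SQLite database and loads a single fact row. The table and column names are hypothetical examples, not a mandated design.

```python
# Tiny star schema plus a toy ETL load, using the standard-library sqlite3
# module so the example is self-contained (table/column names are illustrative).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each measurement.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, full_date     TEXT);

    -- The fact table stores measures plus foreign keys to the dimensions.
    CREATE TABLE fact_sales (
        sale_id      INTEGER PRIMARY KEY,
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        amount       REAL
    );
    """
)

# Extract: pretend these rows came from a source system.
raw = [{"customer": "Acme Corp", "date": "2024-01-15", "amount": "125.50"}]

# Transform and Load: resolve dimension keys, cast types, insert the fact row.
for row in raw:
    customer_key = conn.execute(
        "INSERT INTO dim_customer (customer_name) VALUES (?)", (row["customer"],)
    ).lastrowid
    date_key = conn.execute(
        "INSERT INTO dim_date (full_date) VALUES (?)", (row["date"],)
    ).lastrowid
    conn.execute(
        "INSERT INTO fact_sales (customer_key, date_key, amount) VALUES (?, ?, ?)",
        (customer_key, date_key, float(row["amount"])),
    )

print(conn.execute("SELECT SUM(amount) FROM fact_sales").fetchone()[0])  # 125.5
```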
Academic Qualification and Industry Experience
Bachelor's Degree: A bachelor's degree in Computer Science, Software Engineering, or a related field is preferred, although equivalent experience and certifications can also be valuable. Experience with the financial domain is a plus.
At least 7 years of experience in solutions architecture, software development, or another related field.