Data Pipeline Development: Design, develop, and maintain efficient, scalable data pipelines using ETL tools and technologies such as Python, SQL, and Node.js.
Data Modeling: Create and optimize data models to ensure efficient data storage and retrieval.
Data Integration: Integrate data from various sources, including databases, APIs, and cloud-based data sources.
Data Quality Assurance: Implement data quality checks and monitoring to ensure data accuracy and consistency.
Cloud Infrastructure: Manage and maintain data infrastructure on AWS, including data warehouses, data lakes, and data pipelines.
Performance Optimization: Identify and resolve performance bottlenecks in data pipelines and queries.
Collaboration: Work with cross-functional teams to understand data requirements and translate them into technical solutions.
Required Skills:
Proficiency in Python, SQL, and Node.js.
Experience with ETL frameworks and tools.
Solid understanding of relational databases, particularly SQL Server.
Experience with cloud platforms, particularly AWS.
Understanding of data warehousing and data lake concepts.
Strong analytical and problem-solving skills.
Ability to work both independently and as part of a team.
Preferred Skills:
Experience with data visualization tools (e.g., Tableau, Power BI).
Knowledge of NoSQL databases (e.g., MongoDB, DynamoDB).
Experience with data streaming technologies (e.g., Kafka, Kinesis).
Familiarity with machine learning and AI concepts.