Senior Data Engineer - Python/SQL (4-7 yrs)
Talescope Bangalore
posted 10d ago
Key skills for the job
Technical knowledge :
- AWS, Python, SQL, S3, EC2, Glue, Athena, Lambda, DynamoDB, Redshift, Step Functions, CloudFormation, CI/CD pipelines, GitHub, EMR, RDS, AWS Lake Formation, GitLab, Jenkins, and AWS CodePipeline.
Role Summary :
As a Senior Data Engineer with over 5 years of expertise in Python, PySpark, and SQL, you will design, develop, and optimize complex data pipelines, support data modeling, and contribute to the architecture behind big data processing, advanced analytics, and cutting-edge cloud solutions that drive business growth. You will lead the design and implementation of scalable, high-performance data solutions on AWS and mentor junior team members. This role demands a deep understanding of AWS services, big data tools, and complex architectures to support large-scale data processing and advanced analytics.
Key Responsibilities :
- Design and develop robust, scalable data pipelines using AWS services, Python, PySpark, and SQL that integrate seamlessly with the broader data and product ecosystem (see the sketch after this list).
- Lead the migration of legacy data warehouses and data marts to AWS cloud-based data lake and data warehouse solutions.
- Optimize data processing and storage for performance and cost.
- Implement data security and compliance best practices, in collaboration with the IT security team.
- Build flexible and scalable systems to handle the growing demands of real-time analytics and big data processing.
- Work closely with data scientists and analysts to support their data needs and assist in building complex queries and data analysis pipelines.
- Collaborate with cross-functional teams to understand their data needs and translate them into technical requirements.
- Continuously evaluate new technologies and AWS services to enhance data capabilities and performance.
- Create and maintain comprehensive documentation of data pipelines, architectures, and workflows.
- Participate in code reviews and ensure that all solutions are aligned to pre-defined architectural specifications.
- Present findings to executive leadership and recommend data-driven strategies for business growth.
- Communicate effectively with different levels of management to gather use cases/requirements and provide designs that cater to those stakeholders.
- Handle clients in multiple industries at the same time, balancing their unique needs.
- Provide mentoring and guidance to junior data engineers and team members.
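To make the first responsibility above concrete, here is a minimal PySpark sketch of the kind of pipeline described. It is illustrative only: the bucket names, paths, and event fields are hypothetical placeholders, not part of this role's actual stack.

```python
# Minimal PySpark ETL sketch -- illustrative only. Bucket names,
# paths, and event fields are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-daily-pipeline")
    .getOrCreate()
)

# Extract: read raw JSON events landed in S3 (s3:// scheme as used on EMR).
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Transform: filter, type, and aggregate with SQL-style expressions.
daily = (
    raw.filter(F.col("status") == "COMPLETED")
       .withColumn("order_date", F.to_date("created_at"))
       .groupBy("order_date", "region")
       .agg(
           F.sum("amount").alias("revenue"),
           F.countDistinct("customer_id").alias("customers"),
       )
)

# Load: write partitioned Parquet back to the curated zone of the
# data lake, where Athena or Redshift Spectrum can query it.
(daily.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-curated-bucket/orders_daily/"))
```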
Requirements :
- 5+ years of experience in a data engineering role with a strong focus on AWS, Python, PySpark, Hive, and SQL.
- Proven experience in designing and delivering large-scale data warehousing and data processing solutions.
- Demonstrated ability to lead the design and implementation of complex, scalable data pipelines using AWS services such as S3, EC2, EMR, RDS, Redshift, Glue, Lambda, Athena, and AWS Lake Formation.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Deep knowledge of big data technologies and ETL tools, such as Apache Spark, PySpark, Hadoop, Kafka, and Spark Streaming.
- Hands-on experience implementing data architecture patterns, including event-driven pipelines, Lambda architectures, and data lakes.
- Experience with cloud platforms such as AWS, Azure, and GCP.
- Experience incorporating modern tools such as Databricks, Airflow, and Terraform for orchestration and infrastructure as code (see the orchestration sketch after this list).
- Experience implementing continuous integration and delivery pipelines using GitLab, Jenkins, and AWS CodePipeline.
- Ability to ensure data security, governance, and compliance by leveraging tools such as IAM, KMS, and AWS CloudTrail.
- Willingness to mentor junior engineers, fostering a culture of continuous learning and improvement.
- Excellent problem-solving and analytical skills, with a strategic mindset.
- Strong communication and leadership skills, both written and verbal, with the ability to influence stakeholders and collaborate effectively across teams and levels.
- Ability to work independently as well as part of a team in a fast-paced environment.
- Advanced data visualization skills and the ability to present complex data in a clear and concise manner.
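As an illustration of the orchestration requirement above, a minimal Airflow DAG sketch follows. The DAG id, schedule, and task callables are hypothetical, and the Airflow 2.4+ style `schedule` parameter is assumed.

```python
# Minimal Airflow DAG sketch -- illustrative only. DAG id, schedule,
# and task callables are hypothetical (Airflow 2.4+ assumed).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from source systems")


def transform():
    print("run the PySpark/Glue transformation job")


def load():
    print("publish curated tables to the warehouse")


with DAG(
    dag_id="orders_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load.
    extract_task >> transform_task >> load_task
```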
Preferred Skills :
- Experience with Databricks, Snowflake, and machine learning pipelines.
- Exposure to real-time data streaming technologies and architectures.
- Familiarity with containerization and serverless computing (Docker, Kubernetes, AWS Lambda).
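For the serverless item above, a minimal sketch of a common event-driven pattern follows: an AWS Lambda handler wired to S3 ObjectCreated notifications that fans each new object out to SQS for downstream ETL workers. The queue URL and wiring are hypothetical; the event shape follows AWS's standard S3 event notification format.

```python
# Minimal serverless sketch -- illustrative only. Hypothetical
# Lambda handler for S3 ObjectCreated notifications; queue URL
# is a placeholder.
import json
import urllib.parse

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ingest-queue"  # hypothetical


def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Queue the new object for asynchronous downstream processing.
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({"bucket": bucket, "key": key}),
        )
    return {"queued": len(event["Records"])}
```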
Functional Areas: Software/Testing/Networking