Lead Data Engineer - Python/Spark (5-8 yrs)
Fission Labs
Flexible timing
Role : Senior/Lead Data Engineer & AI
About Us :
Headquartered in Sunnyvale, with offices in Dallas & Hyderabad, Fission Labs is a leading software development company, specializing in crafting flexible, agile, and scalable solutions that propel businesses forward. With a comprehensive range of services, including product development, cloud engineering, big data analytics, QA, DevOps consulting, and AI/ML solutions, we empower clients to achieve sustainable digital transformation that aligns seamlessly with their business goals.
Role Overview :
A Senior/Lead Data Engineer is responsible for overseeing the design, development, and management of data infrastructure and pipelines within an organization. This role involves a mix of technical leadership, project management, and collaboration with other teams to ensure the efficient collection, storage, processing, and analysis of large datasets. The Lead Data Engineer typically manages a team of data engineers, architects, and analysts, ensuring that data workflows are scalable, reliable, and aligned with the business's requirements.
Responsibilities :
- Lead the design, development, and maintenance of data pipelines and ETL processes.
- Architect and implement scalable data solutions using Databricks and AWS.
- Optimize data storage and retrieval systems using Rockset, Clickhouse, and CrateDB.
- Develop and maintain data APIs using FastAPI.
- Orchestrate and automate data workflows using Airflow (see the sketch after this list).
- Collaborate with data scientists and analysts to support their data needs.
- Ensure data quality, security, and compliance across all data systems.
- Mentor junior data engineers and promote best practices in data engineering.
- Evaluate and implement new data technologies to improve the data infrastructure.
- Participate in cross-functional projects and provide technical leadership.
- Manage and optimize data storage solutions using AWS S3, implementing best practices for data lakes and data warehouses.
- Implement and manage Databricks Unity Catalog for centralized data governance and access control across the organization.
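As a rough illustration of the workflow-orchestration responsibility above, the sketch below shows a minimal daily Airflow DAG that runs a PySpark aggregation and writes the result back to S3. The DAG id, bucket paths, and transform logic are illustrative assumptions, not details from this posting.

```python
# Minimal sketch, assuming Airflow 2.x and a Spark runtime available on the worker.
# All names and S3 paths below are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_daily_etl(**context):
    """Aggregate raw events into a daily summary table (illustrative logic)."""
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily_events_etl").getOrCreate()
    events = spark.read.parquet("s3://example-bucket/raw/events/")  # assumed input path
    daily = (
        events
        .groupBy(F.to_date("event_ts").alias("event_date"), "event_type")
        .agg(F.count("*").alias("event_count"))
    )
    daily.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_events/")
    spark.stop()


with DAG(
    dag_id="daily_events_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # use schedule_interval on older Airflow 2.x releases
    catchup=False,
) as dag:
    PythonOperator(task_id="run_daily_etl", python_callable=run_daily_etl)
```

In practice the PySpark step would more likely run as a Databricks job triggered from the DAG rather than an in-process SparkSession, but the orchestration shape is the same.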
Qualifications :
- Bachelor's or Master's degree in Computer Science, Engineering, or related field
- 5+ years of experience in data engineering, with at least 3 years in a lead role
- Strong proficiency in Python, PySpark, and SQL
- Extensive experience with Databricks and AWS cloud services
- Hands-on experience with Airflow for workflow orchestration
- Familiarity with FastAPI for building high-performance APIs
- Experience with columnar databases like Rockset, Clickhouse, and CrateDB
- Solid understanding of data modeling, data warehousing, and ETL processes
- Experience with version control systems (e.g., Git) and CI/CD pipelines
- Excellent problem-solving skills and ability to work in a fast-paced environment
- Strong communication skills and ability to work effectively in cross-functional teams
- Knowledge of data governance, security, and compliance best practices
- Proficiency in designing and implementing data lake architectures using AWS S3
- Experience with Databricks Unity Catalog or similar data governance and metadata management tools
Preferred Qualifications :
- Experience with real-time data processing and streaming technologies
- Familiarity with machine learning workflows and MLOps
- Certifications in Databricks or AWS
- Experience implementing data mesh or data fabric architectures
- Knowledge of data lineage and metadata management best practices
Tech Stack :
- Databricks, Python, PySpark, SQL, Airflow, FastAPI, AWS (S3, IAM, ECR, Lambda), Rockset, Clickhouse, CrateDB
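To show how FastAPI might sit in front of one of the columnar stores in this stack, here is a minimal sketch of a data API backed by ClickHouse; the clickhouse-connect client wiring, table name, and endpoint shape are illustrative assumptions.

```python
# Minimal sketch, assuming the clickhouse-connect driver and a curated
# ClickHouse table named curated.daily_events (both hypothetical).
import clickhouse_connect
from fastapi import FastAPI, HTTPException

app = FastAPI(title="example-data-api")
client = clickhouse_connect.get_client(host="localhost")  # assumed ClickHouse endpoint


@app.get("/events/daily/{event_date}")
def daily_event_counts(event_date: str):
    """Return per-type event counts for one day from the curated table."""
    result = client.query(
        "SELECT event_type, count() AS event_count "
        "FROM curated.daily_events "
        "WHERE event_date = {d:Date} "
        "GROUP BY event_type",
        parameters={"d": event_date},
    )
    if not result.result_rows:
        raise HTTPException(status_code=404, detail="no data for that date")
    return [dict(zip(result.column_names, row)) for row in result.result_rows]
```

Served with an ASGI server such as uvicorn, a request to /events/daily/2024-01-01 would return JSON rows; in a real deployment the client would be built from environment configuration and reused across requests.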