The Lead Data Engineer is a skilled professional responsible for designing, developing, and maintaining large-scale, multi-tenant data infrastructure and applications. This individual applies industry best practices to build robust, scalable, and efficient data pipelines and systems using an Agile-based methodology.
Job Description & Responsibilities
Design and manage data platforms, including data storage, data processing, data security, and data quality.
Implement cloud-based data platform solutions that deliver high availability, low latency, high performance, high efficiency, and optimal cost.
Design, develop, and maintain scalable and reliable data pipelines and architectures that can handle large volumes and a wide variety of data from multiple sources (see the pipeline sketch after this list).
Research and evaluate new technologies and tools for data engineering and recommend solutions that can improve performance, efficiency, and scalability.
Skilled in ELT/ETL processes, data modelling, and metadata management, with a good grasp of infrastructure-as-code (IaC) principles, preferably with Databricks/Snowflake and/or Spark on a Lakehouse architecture.
Deep familiarity with modern data architectures for large datasets, including data lakes, data warehouses, and both real-time and batch processing at scale.
Enable and support self-serve data engineering solutions for both the wider data team and engineers.
Proactively ensure the security and integrity of the platform and data feeds, and take part in security and privacy design reviews.
Experience with data governance and data catalog concepts such as data lineage, data quality, data classification, and role-based and attribute-based access control.
Ability to write and review technical documentation, ensuring clarity and accuracy.
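To give a concrete flavor of the pipeline work described above, here is a minimal PySpark sketch of a batch ELT step with a simple data-quality gate. The paths, table, and column names (raw_events, event_id, user_id, event_ts) are hypothetical, and the 5% null threshold is just an illustrative policy, not a prescribed one.

```python
# Minimal sketch of a batch ELT step with a simple data-quality gate.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_events_elt").getOrCreate()

raw = spark.read.parquet("s3://example-bucket/raw_events/")  # hypothetical path

# Quality gate: reject the batch if too many rows lack a user_id.
total = raw.count()
missing = raw.filter(F.col("user_id").isNull()).count()
if total == 0 or missing / total > 0.05:  # illustrative 5% threshold
    raise ValueError(f"Quality gate failed: {missing}/{total} rows missing user_id")

# Deduplicate and partition by event date before publishing to the curated zone.
curated = (
    raw.dropDuplicates(["event_id"])
       .withColumn("event_date", F.to_date("event_ts"))
)
curated.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated_events/"
)
```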
What we are looking for
Degree or equivalent, with 12-15 years of experience.
A trusting, resourceful, and kind team player.
At least 6 years of experience with Databricks and/or Snowflake.
At least 6 years of experience with other data engineering tools and frameworks such as Spark, Hadoop, and Airflow on AWS, GCP, or Azure.
Proficient with Spark, Python, and SQL, with solid software engineering principles and a good grasp of distributed job/query optimization, including execution plans, partitioning, and auto-scaling (see the Spark sketch after this list).
Strong knowledge of database technologies such as Oracle, Microsoft SQL Server, PostgreSQL, and MongoDB.
Skilled at operating in a global environment, with a proactive and self-directed working style.
Experienced in developing and mentoring junior data engineers and in steering projects to align with business goals while pursuing growth opportunities.
Excellent communication skills, with the ability to convey complex technical concepts to diverse stakeholders and foster a collaborative problem-solving environment.
Ensure business needs are met by evaluating the ongoing effectiveness of current plans, programs, and initiatives.
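As an illustration of the query-optimization skills called out above, the following is a minimal PySpark (Spark 3.x) sketch of inspecting a physical execution plan and controlling output partitioning. The paths, table names (events, users), and join key are hypothetical.

```python
# Minimal sketch: inspect an execution plan and control partitioning in PySpark.
# Paths, table names, and the join key are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("plan_inspection").getOrCreate()

events = spark.read.parquet("s3://example-bucket/curated_events/")  # hypothetical
users = spark.read.parquet("s3://example-bucket/users/")            # hypothetical

joined = events.join(users, on="user_id", how="left")

# Print the formatted physical plan: shuffle boundaries and the join
# strategy (e.g., broadcast vs. sort-merge) chosen by the optimizer.
joined.explain(mode="formatted")

# Reduce small-file output by coalescing partitions before the write.
joined.coalesce(64).write.mode("overwrite").parquet(
    "s3://example-bucket/events_enriched/"
)
```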