7Dxperts specializes in using all dimensions of data to tackle challenging questions in the Data, Spatial, and ML space. Customers can engage us for insight projects that prove immediate value, for data-driven solutions that target specific problems, or to build the capability and operating culture needed to keep innovating.
We firmly believe that targeted solutions designed for specific use cases are more powerful than generic ones. At the core of the business, it's about bringing together people who care about customers, have a passion for solving problems, and have expertise in building targeted accelerators and solutions for industry-specific challenges.
Work Location: Bangalore
Experience: 6+ years
Role: Permanent
Overview of Role
We are looking for an experienced Databricks data platform engineer to join our specialist team working on Data Engineering, Data Science, and Geospatial projects and products. You will use advanced data engineering, Databricks, cloud services, infrastructure as code, the Linux stack, data quality tooling, and machine learning to build data platform architectures and solutions that follow our data architecture standards and principles.
To succeed in this data platform engineering position, you should have strong knowledge of cloud services and the Databricks platform, strong analytical skills, and the ability to combine data from different sources and develop data pipelines using current libraries and data platform standards.
If you are detail-oriented, with excellent organizational skills and experience in this field, we'd like to hear from you.
Responsibilities
- 3+ years of experience in Spark, Databricks, Hadoop, and data and ML engineering.
- 3+ years of experience designing architectures using AWS cloud services and Databricks.
- Architect, design, and build a big data platform (data lake / data warehouse / lakehouse) using Databricks services, integrated with the wider AWS cloud services.
- Knowledge of and experience with infrastructure as code and CI/CD pipelines to build and deploy the data platform tech stack and solutions.
- Hands-on Spark experience supporting and developing data engineering (ETL/ELT) and machine learning (ML) solutions using Python, Spark, Scala, or R.
- Solid grasp of distributed-system fundamentals and of optimizing Spark distributed computing.
- Experience setting up batch and streaming data pipelines using Databricks DLT, jobs, and streams.
- Understanding of the concepts and principles of data modelling, databases, and tables; able to produce, maintain, and update relevant data models across multiple subject areas.
- Design, build, and test medium to large-scale data pipelines (ETL/ELT) based on feeds from multiple systems, using a range of storage technologies and/or access methods; implement data quality validation and create repeatable, reusable pipelines.
- Experience designing metadata repositories, understanding the range of metadata tools and technologies, implementing metadata repositories, and working with metadata.
- Understanding of build automation concepts; implement automation pipelines to build, test, and deploy changes to higher environments.
- Define and execute test cases and scripts; understand the role of testing and how it works.
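To give a flavour of the "repeatable, reusable pipelines with data quality validation" responsibility above, here is a minimal, hedged sketch in plain Python (the feed, field names, and validation rule are hypothetical; in a real Databricks pipeline these stages would operate on Spark DataFrames or DLT tables rather than Python lists):

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# A record is just a dict in this sketch.
Record = dict

@dataclass
class PipelineResult:
    valid: list      # rows that passed the data-quality rule
    rejected: list   # rows quarantined for inspection

def run_pipeline(
    extract: Callable[[], Iterable[Record]],
    transform: Callable[[Record], Record],
    validate: Callable[[Record], bool],
) -> PipelineResult:
    """Reusable ETL skeleton: extract records, transform each one,
    then route it to the valid or rejected output based on a
    data-quality validation rule."""
    valid, rejected = [], []
    for record in extract():
        row = transform(record)
        (valid if validate(row) else rejected).append(row)
    return PipelineResult(valid=valid, rejected=rejected)

# Hypothetical feed merging two source rows, one of them malformed.
def extract() -> Iterable[Record]:
    yield {"id": 1, "amount": "120.5"}
    yield {"id": 2, "amount": "n/a"}   # bad value, should be quarantined

def transform(r: Record) -> Record:
    try:
        return {**r, "amount": float(r["amount"])}
    except ValueError:
        return {**r, "amount": None}

def validate(r: Record) -> bool:
    return r["amount"] is not None

result = run_pipeline(extract, transform, validate)
print(len(result.valid), len(result.rejected))  # 1 1
```

The same extract/transform/validate composition can be re-run against different feeds, which is what makes the pipeline repeatable rather than a one-off script.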
Requirements and skills
- Big data technologies: Databricks, Spark, Hadoop, EMR, or Hortonworks.
- Solid hands-on experience with Python, Spark, SQL, Spark SQL, Spark Streaming, Hive, and Presto.
- Experience with Databricks components and APIs such as notebooks, jobs, DLT, interactive and job clusters, SQL warehouses, policies, secrets, DBFS, Hive Metastore, Glue Metastore, Unity Catalog, and MLflow.
- Knowledge of and experience with AWS Lambda, VPC, S3, EC2, API Gateway, IAM users/roles/policies, Cognito, Application Load Balancer, Glue, Redshift, Spectrum, Athena, and Kinesis.
- Experience using source control tools such as Git, Bitbucket, or AWS CodeCommit, and automation tools such as Jenkins, AWS CodeBuild, and CodeDeploy.
- Hands-on experience with Terraform and the Databricks API to automate the infrastructure stack.
- Experience implementing CI/CD and MLOps pipelines using Git, GitHub Actions, or Jenkins.
- Experience delivering project artifacts such as design documents, test cases, traceability matrices, and low-level design documents.
- Build reference architectures, how-tos, and demo applications for customers.
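As an illustration of automating the Databricks stack via its API, here is a minimal sketch that builds (but does not send) a job-creation request against the Databricks Jobs API 2.1 `jobs/create` endpoint; the workspace URL, token, notebook path, and cluster sizing are hypothetical placeholders:

```python
import json
import urllib.request

# Hypothetical workspace URL and token -- a CI/CD pipeline would
# inject real values from its secrets store.
WORKSPACE_URL = "https://example.cloud.databricks.com"
TOKEN = "dapiXXXXXXXX"

# Minimal job spec: one notebook task on a small job cluster.
job_spec = {
    "name": "nightly-etl",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/pipelines/ingest"},
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
}

# Build the authenticated POST request; a deployment step would then
# call urllib.request.urlopen(request) to create the job.
request = urllib.request.Request(
    url=f"{WORKSPACE_URL}/api/2.1/jobs/create",
    data=json.dumps(job_spec).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
print(request.full_url)
```

In practice the same job spec would usually live in Terraform (via the Databricks provider) or a repo-versioned JSON file, so that deployments stay declarative and reviewable rather than hand-issued API calls.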