In today's fast-paced world, businesses need real-time insights more than ever.
Yet delivering those insights poses significant challenges, because it requires combining big data with real-time streaming.
The Golden Gate Data Fabric team, with its extensive experience, is building a next-generation streaming analytics platform that will set a new standard in the industry.
Our mission is to revolutionize how data professionals manage and interact with data sets.
This next-generation platform will let users build data pipelines and data transformations with zero coding.
It will support big data technologies such as Apache Spark Streaming, Flink, Beam, Delta Lake, Iceberg, Hudi, Parquet, and Kafka.
It will enable evolving modern data architectures such as Lakehouse, Data Mesh, and Data Fabric.
It will be a very dynamic and fast-paced environment, with challenges from both the big data world and real-time streaming.
Candidates should be able to learn new technologies independently and quickly.
They should have experience developing services in the cloud (AWS, Azure, OCI, etc.).
Required Qualifications:
- BS or MS in Computer Science or a relevant technical field involving coding, or equivalent practical experience
- 5+ years of overall software development experience
- Strong skills in design, coding, data structures, and algorithms
- Solid experience writing Java code for distributed cloud applications (REST services, Kubernetes, cloud-native, etc.)
- Good knowledge of JavaScript/ReactJS or big data pipelines (Spark, Kafka, etc.)
- Good knowledge of databases (SQL and NoSQL)
- Experience building services in the cloud (AWS, Azure, OCI)
- Self-motivated fast learner with excellent problem-solving skills; a good team player
- Curiosity and willingness to be a full-stack developer across big data technologies, distributed cloud applications, and frontend technologies
- Familiarity with basic OS concepts such as processes, threads, and virtual memory
Preferred Qualifications:
- Experience processing, filtering, and presenting large quantities (millions to billions of rows) of data
- Experience with distributed (multi-tiered) systems, algorithms, and relational databases
- Strong knowledge of Apache Spark, Spark Streaming, Flink, Beam, Delta Lake, Iceberg, Parquet, and Kafka
- Understanding of how to design and implement data pipelines
- Awareness of Data Mesh and Data Fabric concepts
- Knowledge of Infrastructure as Code (IaC) tools, preferably Terraform and Ansible
- Experience working on large-scale, highly distributed services infrastructure