12-15 years
Bangalore / Bengaluru
Data Engineering Architect - Google Cloud Platform (12-15 yrs)
Cygnuspro Professionals
posted 1mon ago
Role : Data Engineering Architect (GCP & Apache Spark).
Job Summary :
We are seeking a highly skilled Data Engineering Architect with deep expertise in Google Cloud Platform (GCP) and Apache Spark to architect, design, and implement scalable, high-performance data lake solutions.
The ideal candidate will have extensive experience in building data ingestion pipelines and managing big data processing using Apache Spark.
Key Requirements :
- Over 12 years of professional experience in data engineering, specializing in implementing large-scale enterprise Data Engineering projects with the latest technologies.
- Over 5 years of hands-on experience with GCP technologies and over 3 years of experience as an architect.
- Design and implement end-to-end data architectures leveraging GCP services (e.g., BigQuery, Cloud Storage, Dataflow, Pub/Sub, Cloud Composer) for large-scale data ingestion and processing.
- Build and optimize large-scale data pipelines using Apache Spark on GCP (via Dataproc or other Spark services).
- Ensure high performance and scalability in Spark-based data processing workloads.
- Lead the integration of SAP S/4HANA data with GCP for real-time and batch data processing.
- Manage data extraction, transformation, and loading (ETL) processes from SAP S/4HANA into cloud storage and data lakes.
- Develop and manage scalable data ingestion pipelines for structured and unstructured data using tools like Cloud Dataflow, Cloud Pub/Sub, and Apache Spark.
- Provide architectural guidance for designing secure, scalable, and efficient data solutions on the Google Cloud Platform, integrating with on-premise/cloud systems like SAP S/4HANA.
- Implement both real-time streaming and batch processing pipelines using Apache Spark, Dataflow, and other GCP services to meet business requirements.
- Implement data governance, access controls, and security best practices to ensure the integrity, confidentiality, and compliance of data across systems.
- Collaborate with business stakeholders, data scientists, and engineering teams to define data requirements, ensuring the architecture aligns with business goals.
- Optimize Apache Spark jobs for performance, scalability, and cost-efficiency, ensuring that the architecture can handle growing data volumes.
- Provide technical leadership to the data engineering team, mentoring junior engineers in data architecture, Apache Spark development, and GCP best practices.
Technical Expertise :
- Expert-level programming proficiency in Python, Java, and Scala.
- Extensive hands-on experience with big data technologies, including Spark, Hadoop, Hive, YARN, MapReduce, Pig, Kafka, and PySpark.
- Proficient in Google Cloud Platform services such as BigQuery, Dataflow, Cloud Storage, Dataproc, Cloud Composer, Pub/Sub, and Cloud Functions.
- Expertise in Apache Spark for both batch and real-time processing, as well as proficiency in Apache Beam, Hadoop, or other big data frameworks.
- Experienced in using Cloud SQL, BigQuery, and Looker Studio (formerly Google Data Studio) for cloud-based data solutions.
- Skilled in orchestration and deployment tools like Cloud Composer, Airflow, and Jenkins for continuous integration and deployment (CI/CD).
- Expertise in designing and developing integration solutions involving Hadoop/HDFS, real-time systems, data warehouses, and analytics solutions.
- Experience with DevOps practices, including version control (Git), CI/CD pipelines, and infrastructure-as-code (e.g., Terraform, Cloud Deployment Manager).
- Strong background in working with relational databases, NoSQL databases, and in-memory databases.
- Experience managing large datasets within Data Lake and Data Fabric architectures.
- Strong knowledge of security best practices, IAM, encryption mechanisms, and compliance frameworks (GDPR, HIPAA) within GCP environments.
- Experience in implementing data governance, data lineage, and data quality frameworks.
- In-depth knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development, and big data solutions.
- Experience leading the end-to-end design, development, deployment, and maintenance of data engineering projects.
- Excellent debugging and problem-solving skills.
- Retail and e-commerce domain knowledge is a plus.
- Positive attitude with strong analytical skills and the ability to guide teams effectively.
Preferred Qualifications :
- GCP certifications such as Professional Data Engineer or Professional Cloud Architect.
- Apache Spark and Python certifications.
- Experience with data visualization tools such as Tableau and Power BI.
Functional Areas: Other