We are seeking a highly skilled Data Engineer with expertise in Java, PySpark, and big data technologies. The ideal candidate will have in-depth knowledge of Apache Spark, Python, and modern Java (Java 8 and above, including lambdas, the Streams API, exception handling, and the Collections framework).

Responsibilities include developing data processing pipelines in PySpark, building Spark jobs for data transformation and aggregation, and optimizing query performance with file formats such as ORC, Parquet, and Avro. The role also involves designing scalable pipelines for batch and real-time analytics, performing data enrichment, and integrating with SQL databases.

Candidates must also have hands-on experience with Spring Core, Spring MVC, Spring Boot, REST APIs, and cloud services such as AWS.