Softpath Technologies - Data Engineer - Spark/Hadoop (4-6 yrs)
Position Title : Data Engineer
Experience : 4+ Years
Education : Bachelor's Degree in Engineering (BE) or Master of Computer Applications (MCA)
Project Type : Contract
Duration : 12+ months
Role Overview :
We are seeking a highly skilled and motivated Data Engineer with 4+ years of experience to join our dynamic team. The ideal candidate will be responsible for monitoring, maintaining, and optimizing our data ingestion pipelines to ensure the efficient flow and processing of data across various systems. This role requires a strong foundation in Big Data concepts, SQL, and data pipeline operations, as well as the ability to collaborate with cross-functional teams in an agile environment. A basic understanding of Data Science principles, BigQuery, and DevOps practices is also essential, and experience in web development for building reports and dashboards is a plus.
As a Data Engineer, you will work closely with the data science, engineering, and DevOps teams to design and implement scalable, efficient data solutions. You will ensure the operational success of data pipelines and play a pivotal role in enabling data-driven decision-making across the organization.
Key Responsibilities :
Data Pipeline Monitoring & Support :
- Monitor and ensure the smooth operation of data ingestion pipelines.
- Troubleshoot and resolve issues related to data flow, data quality, and pipeline failures.
- Conduct regular performance checks and implement improvements to optimize pipeline efficiency.
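For illustration, a minimal Python sketch of the kind of freshness check these monitoring duties imply; the landing directory and alert threshold are hypothetical, and a real deployment would page an on-call channel rather than print:

```python
# Minimal sketch of a pipeline freshness check, assuming ingested files
# land in a local directory; the path and threshold are illustrative only.
import os
import time

LANDING_DIR = "/data/landing"      # hypothetical landing zone
MAX_AGE_SECONDS = 6 * 60 * 60      # alert if the newest file is older than 6h

def newest_file_age(directory: str) -> float:
    """Return the age in seconds of the most recently modified file."""
    mtimes = [
        os.path.getmtime(os.path.join(directory, f))
        for f in os.listdir(directory)
    ]
    if not mtimes:
        raise RuntimeError(f"no files found in {directory}")
    return time.time() - max(mtimes)

if __name__ == "__main__":
    age = newest_file_age(LANDING_DIR)
    if age > MAX_AGE_SECONDS:
        # In practice this would raise an alert, not print.
        print(f"ALERT: ingestion appears stalled ({age / 3600:.1f}h since last file)")
    else:
        print(f"OK: last ingestion {age / 3600:.1f}h ago")
```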
Collaborate with Cross-functional Teams :
- Work in an agile development environment, collaborating with data scientists, data analysts, and DevOps engineers to ensure data pipelines meet business requirements.
- Participate in sprint planning, stand-ups, and retrospectives, ensuring that data requirements are met in a timely and efficient manner.
Data Visualization & Reporting :
- Assist in the development and optimization of reports and dashboards for effective data visualization.
- Work with business analysts and stakeholders to understand data needs and translate them into actionable insights.
- Develop user-friendly dashboards and reports using BI tools or custom web applications.
Big Data & Data Science Understanding :
- Maintain a solid understanding of Big Data technologies and concepts, including distributed computing frameworks like Hadoop, Spark, and cloud-based solutions.
- Apply data science concepts, such as data cleaning, transformation, and model integration, to optimize data pipelines.
- Ensure data is structured in a way that supports future analytical and machine learning workflows.
BigQuery & Cloud Data Solutions :
- Use Google BigQuery for data storage, querying, and optimization of large-scale datasets.
- Design and implement efficient data models for reporting and analytics in BigQuery.
- Collaborate with cloud infrastructure teams to ensure proper integration of BigQuery with other cloud-based services.
DevOps Practices & Automation :
- Apply DevOps best practices for continuous integration, deployment, and monitoring of data pipelines.
- Automate repetitive tasks, such as pipeline testing, data validation, and deployment processes, to reduce manual intervention and improve efficiency (see the sketch after this list).
- Ensure the scalability and reliability of the data infrastructure by utilizing appropriate monitoring tools and alerting systems.
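As an illustrative sketch of the automation item above, a small Python validation step that could run in a CI/CD pipeline and fail the build on bad data; the file path, column names, and rules are assumptions:

```python
# Minimal sketch of an automated data-validation step of the kind that
# might run in CI/CD; the CSV path and validation rules are illustrative.
import csv
import sys

def validate(path: str, required_columns: list[str], min_rows: int) -> list[str]:
    """Return a list of human-readable validation failures (empty if clean)."""
    errors = []
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        missing = [c for c in required_columns if c not in (reader.fieldnames or [])]
        if missing:
            errors.append(f"missing columns: {missing}")
        rows = list(reader)
    if len(rows) < min_rows:
        errors.append(f"expected at least {min_rows} rows, got {len(rows)}")
    if any(not row.get("id") for row in rows):
        errors.append("null or empty values in 'id' column")
    return errors

if __name__ == "__main__":
    failures = validate("daily_extract.csv", ["id", "event_ts"], min_rows=1)
    if failures:
        print("\n".join(failures))
        sys.exit(1)  # non-zero exit fails the CI job
```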
Documentation & Knowledge Sharing :
- Maintain clear, comprehensive documentation on data pipeline processes, architecture, and troubleshooting steps.
- Share knowledge and best practices with other team members to enhance overall data engineering capabilities within the organization.
Required Skills :
SQL :
- Strong proficiency in SQL for data manipulation, querying, and optimizing large datasets.
- Ability to write efficient queries to extract, transform, and load data from multiple sources.
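For illustration, a self-contained sketch of the pattern described: pushing aggregation and filtering into the query rather than into application code. It runs against an in-memory SQLite database so it is runnable as-is; the table and thresholds are made up:

```python
# Minimal sketch of an extract/transform query, run against an in-memory
# SQLite database to stay self-contained; names and values are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 100, 25.0, '2024-01-01'),
        (2, 100, 40.0, '2024-01-02'),
        (3, 200, 15.0, '2024-01-02');
""")

# Aggregate per customer and filter with HAVING so the database does the
# heavy lifting instead of post-processing rows in application code.
query = """
    SELECT customer_id,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_spend
    FROM orders
    GROUP BY customer_id
    HAVING SUM(amount) > 20
    ORDER BY total_spend DESC;
"""
for row in conn.execute(query):
    print(row)  # e.g. (100, 2, 65.0)
```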
Big Data :
- Hands-on experience with Big Data technologies such as Hadoop, Spark, and Kafka.
- Understanding of distributed computing and data processing principles.
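A minimal PySpark sketch of the distributed aggregation referenced above, assuming a local Spark installation (pip install pyspark); the input path is hypothetical:

```python
# Minimal PySpark sketch: read a dataset and aggregate in parallel.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-rollup").getOrCreate()

# Read a (hypothetical) set of JSON files and count events per type.
events = spark.read.json("/data/events/*.json")
rollup = (
    events
    .groupBy("event_type")
    .agg(F.count("*").alias("n_events"))
    .orderBy(F.desc("n_events"))
)
rollup.show()
spark.stop()
```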
BigQuery :
- Experience working with Google BigQuery, including querying large datasets, creating views, and optimizing performance.
- Ability to design efficient data models and schema in BigQuery for optimized querying and reporting.
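For illustration, a minimal sketch using the google-cloud-bigquery Python client; the project, dataset, and table names are placeholders, and credentials are assumed to be configured in the environment. Filtering on a partition column and selecting only the needed fields is the usual first step in controlling scan costs:

```python
# Minimal BigQuery sketch (pip install google-cloud-bigquery); all
# project/dataset/table names below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Parameterized query that scans only one (assumed) date partition.
sql = """
    SELECT user_id, COUNT(*) AS sessions
    FROM `my-project.analytics.events`
    WHERE event_date = @day
    GROUP BY user_id
"""
job = client.query(
    sql,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", "2024-01-01")]
    ),
)
for row in job.result():
    print(row.user_id, row.sessions)
```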
Web Development (React.js, Node.js) (Nice to Have) :
- Experience in web development, particularly in building dashboards and reports using front-end technologies like React.js.
- Familiarity with server-side development using Node.js for back-end services.
- Understanding of RESTful API design to integrate data pipelines with web-based interfaces for real-time reporting and visualization.
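The posting names Node.js for the back end; purely to keep this document's examples in one language, here is an equivalent Python/Flask sketch (a stand-in, not the stack named above) of a REST endpoint a dashboard front end could poll; the metric values are hard-coded placeholders:

```python
# Minimal Flask sketch (pip install flask) of a reporting endpoint;
# a real service would query pipeline metadata instead of returning
# hard-coded placeholder values.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/pipeline-status")
def pipeline_status():
    return jsonify({"pipeline": "daily_ingest", "status": "ok", "rows_loaded": 12345})

if __name__ == "__main__":
    app.run(port=8080)
```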
Data Pipeline & ETL :
- Experience in designing, developing, and maintaining ETL (Extract, Transform, Load) processes for data ingestion and transformation.
- Familiarity with tools such as Apache Airflow, Talend, or custom Python-based solutions for orchestrating data pipelines.
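A minimal sketch of an Airflow DAG of the kind mentioned above, using Airflow 2.x-style imports and parameters; the DAG name and task bodies are placeholders:

```python
# Minimal Airflow 2.x DAG sketch (pip install apache-airflow);
# each task body is a placeholder for a real ETL step.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from source")      # placeholder step

def transform():
    print("clean and reshape the extract")  # placeholder step

def load():
    print("write to the warehouse")         # placeholder step

with DAG(
    dag_id="daily_etl",                      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3   # run the steps in order
```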
Cloud Platforms (Google Cloud Platform, AWS, Azure) :
- Experience working with cloud-based platforms for data storage, processing, and analytics.
- Knowledge of cloud-based data solutions such as Google Cloud Storage, BigQuery, AWS Redshift, or Azure Synapse Analytics.
Agile Methodology :
- Strong experience working in agile environments, using agile frameworks such as Scrum or Kanban.
- Comfortable with fast-paced development cycles and the ability to adapt to changing business priorities.
Desired Skills and Experience :
DevOps Practices :
- Familiarity with version control systems (e.g., Git) and CI/CD pipelines.
- Knowledge of containerization technologies (Docker, Kubernetes) and cloud-native architectures.
Data Quality & Governance :
- Experience with data quality frameworks and tools for ensuring the accuracy and integrity of data.
- Knowledge of data governance practices and how to implement them in data engineering workflows.
Communication & Collaboration :
- Strong written and verbal communication skills, with the ability to explain complex technical concepts to non-technical stakeholders.
- Ability to collaborate across teams and help guide the organization's data strategy.
Preferred Qualifications :
- Experience with machine learning model integration into production environments.
- Familiarity with data orchestration tools like Apache Airflow or Luigi.
- Experience in real-time data streaming with technologies like Kafka or AWS Kinesis (see the sketch after this list).
- Knowledge of data lakes and data warehouse architectures.
- Exposure to containerized environments and orchestration using Kubernetes.
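As an illustration of the real-time streaming item above, a minimal consumer sketch using the kafka-python package; the broker address and topic name are placeholders:

```python
# Minimal Kafka consumer sketch (pip install kafka-python);
# broker and topic are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events",                                  # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    # Each message would feed a downstream transform or sink in practice.
    print(message.value)
```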
Functional Areas: Software/Testing/Networking