At Providence, we use our voice to advocate for vulnerable populations and needed reforms in health care. We pursue innovative ways to transform health care by keeping people healthy, and by making our services more convenient, accessible, and affordable for all. In an increasingly uncertain world, we are committed to high-quality, compassionate health care for everyone, regardless of coverage or ability to pay. We help people and communities benefit from the best health care model for the future, today.
Together, our 119,000-plus caregivers/employees serve in 51 hospitals, more than 1,000 clinics, and a comprehensive range of health and social services across Alaska, California, Montana, New Mexico, Oregon, Texas, and Washington in the United States.
Providence Global Center, recently launched in Hyderabad, India, is the Global Capability Center for Providence. It leverages India's talent to help meet our global vision and to scale our Information Services and products for the cloud.
What will you be responsible for?
Develop, implement, and maintain data quality standards, metrics, and validation rules.
Perform data profiling, cleansing, and validation to ensure accuracy, completeness, and reliability.
Identify and resolve data discrepancies, inconsistencies, and errors across systems.
Collaborate with data engineering teams to design and implement robust data pipelines for quality checks.
Support ETL/ELT processes by integrating data quality checks at various stages of the pipeline.
Assist in automating data quality workflows to minimize manual intervention.
Work closely with data stewards to define and enforce governance policies and standards.
Maintain a data dictionary and metadata repository to ensure transparency and consistency.
Monitor compliance with data governance policies and regulatory requirements.
Partner with business stakeholders to understand data quality requirements and use cases.
Provide regular reports and insights on data quality issues, trends, and resolutions.
Train and guide teams on data quality best practices and tools.
Prepare and validate data for AI/ML models by ensuring high data quality.
Assist in feature engineering and identifying potential data-related challenges for AI/ML workflows.
Collaborate with data scientists to ensure data quality aligns with model requirements.
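The responsibilities above center on rule-based data quality validation: profiling records against checks for accuracy, completeness, and timeliness. As a hedged illustration only (the rule names, field names, and thresholds below are hypothetical, not Providence's actual standards), this is roughly what such a check looks like in plain Python:

```python
# Minimal sketch of rule-based data quality profiling.
# All rule and field names here are hypothetical examples.
from datetime import date

# Each rule pairs a quality dimension with a per-record predicate.
RULES = {
    "completeness: patient_id present": lambda r: r.get("patient_id") not in (None, ""),
    "accuracy: age in plausible range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    "timeliness: visit not in future": lambda r: r.get("visit_date") is not None and r["visit_date"] <= date.today(),
}

def profile(records):
    """Return per-rule failure counts across a batch of records."""
    failures = {name: 0 for name in RULES}
    for record in records:
        for name, check in RULES.items():
            if not check(record):
                failures[name] += 1
    return failures

records = [
    {"patient_id": "P001", "age": 42, "visit_date": date(2024, 1, 5)},
    {"patient_id": "", "age": 130, "visit_date": date(2024, 1, 6)},
]
print(profile(records))
```

In practice a tool such as Great Expectations packages the same idea (declarative expectations run over batches of data), and the failure counts would feed the reports and dashboards mentioned above rather than a `print`.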
Who are we looking for?
2-6 years of experience, with a strong understanding of data quality dimensions (accuracy, completeness, consistency, timeliness) and data quality tools (e.g., Talend, Informatica Data Quality, Great Expectations).
Proficiency in SQL for querying and analyzing data.
Basic understanding of data pipelines and ETL/ELT processes.
Familiarity with cloud-based data platforms (e.g., Snowflake, Redshift, BigQuery).
Familiarity with data governance frameworks and best practices.
Experience working with metadata management tools and data catalogs.
Ability to analyze large datasets and identify patterns, discrepancies, and root causes.
Strong troubleshooting skills for resolving data-related issues.
Excellent communication skills to convey technical findings to non-technical stakeholders.
Strong interpersonal skills to work effectively with cross-functional teams.
Experience with Python for data manipulation and quality checks.
Familiarity with data visualization tools (e.g., Tableau, Power BI) for reporting quality metrics.
Preferred: Basic understanding of AI/ML concepts and workflows, including feature engineering and data preparation.
Preferred: Knowledge of distributed computing frameworks (e.g., Apache Spark) and cloud-native data engineering practices.
Preferred: Relevant certifications such as DAMA Certified Data Management Professional (CDMP), or cloud data certifications.
Experience with scripting for automating data quality checks and workflows.
Familiarity with CI/CD pipelines for automated data governance.
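The SQL proficiency called for above is typically applied to quality metrics such as per-column null rates. A small self-contained sketch using Python's built-in sqlite3 module (the table and column names are invented for illustration):

```python
# Hedged sketch: computing a completeness metric (null rate per column) in SQL.
# Table and column names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE encounters (patient_id TEXT, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?)",
    [("P001", "J45"), ("P002", None), (None, "E11")],
)

# In SQLite, `col IS NULL` evaluates to 1 or 0, so SUM counts the nulls.
null_rates = conn.execute(
    """
    SELECT
        1.0 * SUM(patient_id IS NULL) / COUNT(*) AS patient_id_null_rate,
        1.0 * SUM(diagnosis IS NULL) / COUNT(*) AS diagnosis_null_rate
    FROM encounters
    """
).fetchone()
print(null_rates)
```

The same query pattern carries over to cloud warehouses such as Snowflake, Redshift, or BigQuery (with their own null-counting idioms), and scripting it is a natural first step toward the automated quality checks and CI/CD integration listed above.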