Recro - Data Scientist - Artificial Intelligence/Machine Learning (5-6 yrs)
posted 5d ago
Flexible timing
As an Applied Scientist, you will work at the intersection of software engineering and machine learning to design and build reusable, modular components for data extraction, particularly from unstructured data sources like PDFs, Word documents, and other complex text formats.
You will leverage your experience in AI/ML to solve challenging, real-world problems, build scalable systems, and integrate cutting-edge commercial and open-source solutions into production workflows.
This role is ideal for someone with a pragmatic mindset who thrives on experimentation and is passionate about building high-quality, reusable software components.
Key Responsibilities:
- Develop and deploy AI/ML models specifically designed for data extraction from unstructured data sources such as PDFs, Word files, and other similar document formats.
- Build machine learning models, particularly natural language processing (NLP) models, to extract meaningful data points from unstructured documents.
- Enhance existing models and develop new methods for improving extraction accuracy, precision, and performance in real-world scenarios.
- Design, implement, and maintain modular, reusable software components that integrate seamlessly into larger data workflows.
- Build systems that are easily maintainable, extensible, and scalable, ensuring that they can evolve over time without significant refactoring.
- Evaluate, select, and integrate commercial AI/ML solutions (such as pre-built document processing or NLP tools) into production workflows to improve the speed and efficiency of data extraction.
- Assess the feasibility of integrating third-party solutions while balancing trade-offs between cost, performance, and customization.
- Work on experimental and unsolved problems related to data extraction, particularly around unstructured text and document formats.
- Apply a pragmatic mindset to balance the feasibility of experiments with the real-world application of solutions.
- Use critical thinking to challenge assumptions, particularly regarding the scope of problems and the limitations of current technologies and solutions.
- Work closely with data engineers, software developers, and business stakeholders to align on objectives and ensure that the models and systems being developed meet business needs.
- Provide thought leadership and insights into the latest advancements in AI/ML to help guide the overall technology strategy of the team.
- Continuously share knowledge, mentor junior team members, and collaborate to ensure best practices are followed.
- Continuously monitor and tune the performance of AI/ML models and data extraction systems, ensuring they deliver high performance and meet operational requirements.
- Optimize workflows to reduce latency and improve throughput while maintaining data integrity and accuracy.
Required Skills & Experience:
- Minimum of 5 years of experience in AI/ML, with a focus on data extraction and natural language processing (NLP).
- Hands-on experience developing machine learning models for text extraction from unstructured data (e.g., PDFs, Word files, and emails).
- Strong proficiency in machine learning frameworks and libraries like TensorFlow, PyTorch, scikit-learn, or spaCy.
- Proficiency in software engineering practices such as modular design, object-oriented programming, and API development.
- Solid understanding of software development principles and the ability to build scalable, reusable components.
- Experience with programming languages like Python, Java, or C++.
- Expertise in NLP techniques like tokenization, named entity recognition (NER), text classification, and semantic analysis.
- Experience working with structured and unstructured data, and the ability to extract meaningful insights from raw, unclean data sources.
- Familiarity with text mining, information retrieval, and knowledge extraction from documents.
- Ability to evaluate and integrate third-party AI/ML solutions into production pipelines.
- Knowledge of commercial platforms such as AWS Textract, Google Cloud Document AI, Microsoft Cognitive Services, or other document extraction tools.
- A pragmatic, solution-oriented approach to experimentation, with the ability to balance theoretical advancements with practical, real-world constraints.
- Strong critical thinking skills to challenge assumptions, refine problem scope, and assess the feasibility of various solutions.
- Strong communication skills to present technical findings to stakeholders and cross-functional teams effectively.
- Experience working in a collaborative, agile environment with both technical and non-technical teams.
Functional Areas: Other