
1 Mind WaveAI Solutions Pvt Ltd Python Developer Job

Python Developer PDF Table Extraction (Open-Source OCR & AI)

5-8 years

Hyderabad / Secunderabad

2 vacancies

Mind WaveAI Solutions Pvt Ltd

posted 22d ago

Job Role Insights

Fixed timing

Job Description

We are seeking a skilled **Python Developer** with expertise in extracting **unstructured tables from PDF documents** using **open-source models**. The ideal candidate should have hands-on experience with **OCR, deep learning, and NLP techniques** to accurately process and structure tabular data.  

### **Key Responsibilities:**  

- Develop and implement **Python-based solutions** to extract tables from **unstructured PDFs**.  

- Utilize **open-source libraries** like **pdfplumber, Tesseract OCR, Camelot, Tabula, PyMuPDF**, and deep learning-based models.  

- Handle **complex table structures, multi-page tables, and merged cells** effectively.  

- Preprocess PDFs, including **noise reduction, skew correction, and text enhancement**.  

- Use AI/ML models (e.g., **Detectron2, LayoutLM, Donut, or Graph Neural Networks**) for intelligent table detection and structure recognition.  

- Optimize the accuracy and reliability of extracted data through **post-processing techniques**.  

- Ensure **scalability, performance, and error handling** for large document processing.  

- Store extracted data in **structured formats** such as **Pandas DataFrames, SQL databases, or JSON**.  

- Collaborate with teams to **integrate the solution into an existing pipeline or API**.  
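
To make the multi-page and merged-cell responsibilities above concrete, here is a minimal sketch of a post-processing step, written in plain Python with no third-party libraries. It assumes the extractor has already returned one list of rows per page, that a header row may be repeated on later pages, and that vertically merged cells arrive as `None` or empty strings. The function names (`stitch_pages`, `fill_merged_cells`, `to_records`) are hypothetical and not taken from any library.

```python
import json
from typing import Dict, List, Optional, Sequence

Row = List[Optional[str]]


def stitch_pages(page_tables: Sequence[List[Row]]) -> List[Row]:
    """Concatenate per-page table fragments into one table.

    Assumption: the first row of the first non-empty fragment is the header,
    and later pages may repeat it verbatim.
    """
    fragments = [table for table in page_tables if table]
    if not fragments:
        return []
    header = fragments[0][0]
    merged: List[Row] = [header]
    for index, table in enumerate(fragments):
        # Skip row 0 when it is the header (first page, or a repeated header).
        start = 1 if index == 0 or table[0] == header else 0
        merged.extend(table[start:])
    return merged


def fill_merged_cells(rows: Sequence[Row], columns: Sequence[int]) -> List[Row]:
    """Forward-fill empty cells in the given columns from the row above."""
    last_seen: Dict[int, Optional[str]] = {}
    filled: List[Row] = []
    for row in rows:
        new_row = list(row)
        for col in columns:
            if new_row[col] in (None, ""):
                new_row[col] = last_seen.get(col)
            else:
                last_seen[col] = new_row[col]
        filled.append(new_row)
    return filled


def to_records(rows: Sequence[Row], merged_columns: Sequence[int]) -> List[dict]:
    """Turn a stitched table into JSON-ready dictionaries keyed by header."""
    if not rows:
        return []
    header, body = rows[0], fill_merged_cells(rows[1:], merged_columns)
    return [dict(zip(header, row)) for row in body]
```

Calling `json.dumps(to_records(stitch_pages(pages), merged_columns=[0]))` would then give a JSON array ready for storage. Which columns to forward-fill is a per-document decision; this sketch leaves it to the caller.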

### **Required Skills:**  

✅ **Strong Python skills** (NumPy, Pandas, OpenCV, TensorFlow/PyTorch).  

✅ **Experience with OCR tools** (Tesseract, EasyOCR, PaddleOCR).  

✅ **PDF processing libraries** (pdfplumber, PyMuPDF, Camelot, Tabula).  

✅ **Deep Learning models** for document understanding (Detectron2, LayoutLM, Donut).  

✅ **Preprocessing techniques** (denoising, deskewing, contour detection).  

✅ **Experience with NLP and Computer Vision for text segmentation**.  

✅ Knowledge of **data extraction, transformation, and validation techniques**.  

✅ Familiarity with **Docker, API integration, and cloud storage solutions**.  
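
As an illustration of the validation skill listed above, the following sketch checks an extracted table for two common problems: rows with the wrong number of cells, and numeric columns containing text that does not parse (for example an OCR misread such as `l0` for `10`). It uses only the standard library. The helper names and the number formats handled (thousands separators, accounting-style parentheses for negatives) are assumptions chosen for the example.

```python
from typing import List, Optional, Sequence


def parse_number(cell: Optional[str]) -> Optional[float]:
    """Parse text such as '1,234.50' or '(12)'; return None if it is not numeric."""
    if cell is None:
        return None
    text = cell.strip().replace(",", "")
    # Assumption: parentheses denote a negative value, as in accounting tables.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]
    try:
        value = float(text)
    except ValueError:
        return None
    return -value if negative else value


def validate_table(
    rows: Sequence[Sequence[Optional[str]]],
    numeric_columns: Sequence[int],
) -> List[str]:
    """Return human-readable problems found in a table whose first row is the header."""
    errors: List[str] = []
    if not rows:
        return ["table is empty"]
    width = len(rows[0])
    for index, row in enumerate(rows[1:], start=1):
        if len(row) != width:
            errors.append(f"row {index}: expected {width} cells, found {len(row)}")
            continue  # column positions are unreliable in a malformed row
        for col in numeric_columns:
            if parse_number(row[col]) is None:
                errors.append(f"row {index}, column {col}: not numeric: {row[col]!r}")
    return errors
```

An empty list from `validate_table` means no problems were detected by these two checks. It does not guarantee the extraction is correct.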

### **Preferred Skills (Bonus):**  

🔹 Experience in **Graph Neural Networks (GNN) for table structure detection**.  

🔹 Working knowledge of **Hugging Face transformers for document AI**.  

🔹 Familiarity with **LLMs for intelligent document parsing (LlamaIndex, LangChain)**.  

### **Project Goal:**  

Develop an **end-to-end open-source solution** that accurately extracts and structures tables from **scanned and text-based PDFs** without using paid services like **AWS Textract, Google Vision, or Azure Form Recognizer**.
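
Because the goal covers both scanned and text-based PDFs, a pipeline needs some way to decide, page by page, whether to read the embedded text layer or fall back to OCR. The sketch below shows one simple heuristic, assuming the text layer of each page has already been extracted as a string by whichever PDF library is in use. The function names and the 20-character threshold are illustrative assumptions, not values from the posting.

```python
from typing import List, Optional, Sequence


def needs_ocr(embedded_text: Optional[str], min_chars: int = 20) -> bool:
    """Guess whether a page is scanned, based on how much usable text it embeds.

    Assumption: a page with fewer than `min_chars` alphanumeric characters in
    its text layer is treated as an image-only (scanned) page.
    """
    if not embedded_text:
        return True
    meaningful = sum(ch.isalnum() for ch in embedded_text)
    return meaningful < min_chars


def plan_extraction(pages_text: Sequence[Optional[str]]) -> List[str]:
    """Label each page 'ocr' or 'text' so later stages can route it."""
    return ["ocr" if needs_ocr(text) else "text" for text in pages_text]
```

A threshold on character count is crude: a scanned PDF that already carries a poor-quality OCR text layer would be routed to the text path. Real documents may call for an additional check, such as comparing the text layer against the rendered image.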




Employment Type: Part Time

What Python Developers at Mind WaveAI Solutions Pvt Ltd are saying

Python Developer salary at Mind WaveAI Solutions Pvt Ltd

reported by 1 employee
₹1.6 L/yr - ₹2.1 L/yr
68% less than the average Python Developer Salary in India

What Mind WaveAI Solutions Pvt Ltd employees are saying about work life

based on 20 employees
55%
100%
50%
75%
Strict timing
Monday to Friday
Within city
Day Shift

Mind WaveAI Solutions Pvt Ltd Benefits

Free Transport
Child care
Gymnasium
Cafeteria
Work From Home
Free Food

Posted 22d ago via naukri.com