I will extract data from any document using OCR
About this service
I build production-ready OCR and Intelligent Document Processing (IDP) systems that extract structured information from scanned documents, images, PDFs, invoices, receipts, forms, and handwritten documents.
With over 5 years of Machine Learning engineering experience, I create OCR pipelines using modern AI models instead of relying only on traditional OCR.
What I can build
- Invoice OCR
- Receipt OCR
- Passport / ID extraction
- Business card OCR
- Bank statement extraction
- PDF to JSON
- PDF to Excel
- Image to Text
- Handwritten text extraction
- Form data extraction
- Table extraction
- Custom document parser
Technologies
- Python
- PaddleOCR
- Tesseract OCR
- EasyOCR
- Donut Transformer
- TrOCR
- OpenCV
- FastAPI
- Hugging Face
- LayoutLM
- AWS Textract (optional)
- Google Document AI (optional)
Output Formats
- JSON
- CSV
- Excel
- XML
- SQL Database
- REST API
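To illustrate the JSON and CSV outputs listed above, here is a minimal sketch using only the Python standard library. The `extracted` record and its field names are hypothetical sample data, not the output of any specific OCR model; a real project would use the fields agreed for your documents.

```python
import csv
import io
import json

# Hypothetical result of an invoice extraction step; field names are illustrative.
extracted = {
    "invoice_number": "INV-001",
    "date": "2024-01-15",
    "total": 120.50,
    "line_items": [
        {"description": "Widget", "quantity": 2, "unit_price": 50.00},
        {"description": "Shipping", "quantity": 1, "unit_price": 20.50},
    ],
}


def to_json(record: dict) -> str:
    """Serialize the full nested record as JSON."""
    return json.dumps(record, indent=2, ensure_ascii=False)


def line_items_to_csv(record: dict) -> str:
    """Flatten line items to CSV, repeating the invoice number on each row."""
    buffer = io.StringIO()
    fieldnames = ["invoice_number", "description", "quantity", "unit_price"]
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for item in record["line_items"]:
        writer.writerow({"invoice_number": record["invoice_number"], **item})
    return buffer.getvalue()


print(to_json(extracted))
print(line_items_to_csv(extracted))
```

JSON keeps the nested structure intact, while CSV needs one flat row per line item, which is why the invoice number is repeated on every row.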
Why work with me?
- Production-ready code
- Clean architecture
- Fast communication
- API documentation
- Deployment support
- Docker support
Please contact me before ordering if your project contains custom document layouts.
Programming language:
Python • Amazon SageMaker
Tools:
OpenCV • TensorFlow • PyTorch
Frequently asked questions
Can you read handwritten documents?
Yes. I use AI models like Donut or TrOCR for handwritten text when appropriate.
Can you create an API?
Yes. I can build REST APIs using FastAPI.
Can you extract tables?
Yes. I can extract tables from invoices, receipts, and reports.
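One common post-processing step for table extraction is rebuilding rows from the word boxes an OCR engine returns. The sketch below is a simplified illustration of that idea, not the exact method used in every project: the `Word` type, the `group_into_rows` helper, the pixel tolerance, and the sample coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the bounding box (hypothetical pixel units)
    y: float  # vertical centre of the bounding box


def group_into_rows(words: list[Word], y_tolerance: float = 5.0) -> list[list[str]]:
    """Cluster words into rows by vertical position, then order each row left to right."""
    rows: list[list[Word]] = []
    for word in sorted(words, key=lambda w: w.y):
        # Join the current row if the word is within the tolerance of the
        # row's first word; otherwise start a new row.
        if rows and abs(word.y - rows[-1][0].y) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [[w.text for w in sorted(row, key=lambda w: w.x)] for row in rows]


# Sample word boxes in arbitrary order, as an OCR engine might return them.
words = [
    Word("Total", 10, 52), Word("Item", 10, 10), Word("Price", 120, 11),
    Word("Widget", 10, 30), Word("100.00", 120, 31), Word("100.00", 120, 51),
]
table = group_into_rows(words)
print(table)
```

Real documents usually need more than this, for example handling skewed scans, merged cells, or multi-line cells, so the tolerance and grouping rule would be tuned per layout.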
Can you process thousands of PDFs?
Yes. I can build batch-processing pipelines for large datasets.
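A batch pipeline of this kind can be sketched with Python's standard `concurrent.futures` module. This is a minimal outline under stated assumptions: `extract_document` is a hypothetical placeholder that only reports the file size, standing in for the real OCR step, and `process_batch` is an illustrative name rather than part of any library.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def extract_document(path: Path) -> dict:
    """Placeholder for the real OCR step; here it only reports the file size."""
    return {"file": path.name, "bytes": path.stat().st_size}


def process_batch(paths: list[Path], max_workers: int = 4) -> tuple[list[dict], list[dict]]:
    """Run extraction over many files, collecting failures instead of stopping."""
    results: list[dict] = []
    failures: list[dict] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its path so errors can be reported per file.
        futures = {pool.submit(extract_document, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results.append(future.result())
            except Exception as exc:  # one bad file should not abort the batch
                failures.append({"file": path.name, "error": str(exc)})
    return results, failures
```

Calling `process_batch(list(Path("inbox").glob("*.pdf")))` would return the successful results and a separate list of failed files (the `inbox` folder name is only an example). Threads suit I/O-heavy steps such as calling a cloud OCR service; for CPU-heavy local models, a process pool or a job queue is often the better fit.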
