PaddlePaddle/PaddleOCR

by PaddlePaddle · Ingestion & Parsing · updated 1d ago

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

85 momentum · 82,074 stars · 10,750 forks · rank #3
ai4science · chineseocr · document-parsing · document-translation · kie · ocr · paddleocr-vl · pdf-extractor-rag · pdf-parser · pdf2markdown · pp-ocr · pp-structure