New OCR model from Rednote (dots.ocr) performs well across benchmarks
07-08-2025

https://github.com/rednote-hilab/dots.ocr
https://github.com/allenai/olmocr
https://github.com/opendatalab/OmniDocBench

Fine-tuning script: https://github.com/wjbmattingly/dots.ocr/blob/master/train_simple.py