Andrej Baranovskij Blog

Blog about Oracle, Full Stack, Machine Learning and Cloud

Thursday, March 12, 2026

Fast Large Table Extraction: Sparrow + dots.ocr to JSON

Sparrow provides a table processing mode. It is optimized to handle large tables and comes with a separate template script (new templates can be...
Wednesday, March 4, 2026

Local OCR Comparison: dots.ocr More Accurate, DeepSeek-OCR 2 Faster (Sparrow + MLX)

I ran local tests with Sparrow to compare DeepSeek OCR2 and dots.ocr (by RedNote), both running on MLX-VLM in FP16 precision. Dots.ocr consisten...
Monday, February 16, 2026

GLM-OCR vs DeepSeek OCR 2: Which One Wins at Markdown Extraction?

I compare two OCR models using real test cases: GLM OCR and DeepSeek OCR2. Both are evaluated on their ability to extract document content a...
Monday, February 9, 2026

Get Vision LLMs to Follow Your Rules: Prompt-Guided JSON Formatting

A JSON query helps fetch structured output from a Vision LLM and extract document data. I describe how to improve such output with additional...
Tuesday, January 27, 2026

Vision LLM Output Control for Better OCR with Prompt Hints

I explain my approach to enforcing better OCR output from vision LLMs with prompt hints. This allows setting rules for output data validation a...
Thursday, January 22, 2026

DeepSeek OCR Markdown Processing in Sparrow for Large Tables

I describe new functionality in Sparrow, where DeepSeek OCR is used to extract text data in markdown format and, in the next step, instruction...
Saturday, December 27, 2025

DeepSeek OCR Review

I'm testing structured data extraction with DeepSeek OCR. It works well and gives good data accuracy and performance to disrupt traditio...

About Me

Andrej Baranovskij
Vilnius, Lithuania
I'm an Oracle ACE Director, Oracle Groundbreaker Ambassador, and CEO and Technical Expert at Red Samurai Consulting, with a focus on Oracle Fusion Middleware and Oracle Cloud technologies.