Andrej Baranovskij Blog

Blog about Oracle, Full Stack, Machine Learning and Cloud

Monday, April 27, 2026

MoE vs Dense Models for Structured Data Extraction — Who Wins?

MoE or Dense — which model architecture wins for structured data extraction from documents? It depends on document complexity. In this vide...
Tuesday, April 21, 2026

Gemma 4 for Structured Data Extraction: Can It Beat Qwen 3.5?

In this video, I put Gemma 4 to the test on a real-world task — extracting structured data from bank statements — and benchmark it head-to-h...
Thursday, April 2, 2026

Running Multiple Models on One GPU with vLLM and GPU Memory Utilization

In this video I show how to run multiple vLLM model instances on the same GPU (Nvidia) in parallel by adjusting the --gpu-memory-utilization...
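As a rough sketch of the idea, assuming each instance is started with `vllm serve` on its own port: the helper below builds one launch command per model and checks that the requested memory fractions fit on the GPU together. The model names, ports, and the `build_commands` helper are hypothetical, and the fractions are illustrative rather than the values used in the video.

```python
import shlex


def build_commands(models, base_port=8000):
    """Build one `vllm serve` command per model.

    `models` maps a model name to the fraction of GPU memory that
    instance may claim. Names and fractions are illustrative only.
    """
    total = sum(models.values())
    if total > 1.0:
        raise ValueError(f"fractions sum to {total:.2f}, which exceeds the GPU")

    commands = []
    for offset, (name, fraction) in enumerate(models.items()):
        args = [
            "vllm", "serve", name,
            "--port", str(base_port + offset),  # one port per instance
            "--gpu-memory-utilization", str(fraction),
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Two hypothetical models, each allowed 45% of the GPU memory.
    for cmd in build_commands({"org/model-a": 0.45, "org/model-b": 0.45}):
        print(cmd)
```

Leaving some headroom below a total of 1.0 is a cautious choice, since other processes on the GPU also need memory.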
Tuesday, March 24, 2026

How to Cache vLLM Model in FastAPI for Faster Inference

I show you how to keep your vLLM model loaded in a FastAPI cache for much faster inference, without reloading it on every request.
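One common way to get this effect, shown here as a minimal sketch and not necessarily the approach taken in the video, is to wrap the model loader in `functools.lru_cache` so the expensive load runs only once per process. `FakeModel`, `get_model`, and `handle_request` are hypothetical stand-ins; in a real service the loader would construct the vLLM model and `handle_request` would be the body of a FastAPI route.

```python
from functools import lru_cache


class FakeModel:
    """Stand-in for a real model object (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name

    def generate(self, prompt):
        return f"[{self.name}] {prompt}"


@lru_cache(maxsize=1)
def get_model(name):
    """Load the model on first use and reuse the same object afterwards."""
    # In a real service the slow load would happen here,
    # for example by constructing the vLLM model for `name`.
    return FakeModel(name)


def handle_request(prompt, model_name="org/model-a"):
    """What a route handler could call on every incoming request."""
    return get_model(model_name).generate(prompt)


if __name__ == "__main__":
    print(handle_request("first"))
    print(handle_request("second"))
    # Only the first call misses the cache; later calls reuse the model.
    print(get_model.cache_info())
```

With `maxsize=1`, requesting a different model name evicts the cached one, which keeps at most one model in memory at a time.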
Monday, March 16, 2026

Qwen 3.5 Test for JSON Structured Data Extraction

Quick test of the new Qwen 3.5 models on JSON structured data extraction from images. Testing and comparing results for 9B FP16, 27B Q8, and...
Thursday, March 12, 2026

Fast Large Table Extraction: Sparrow + dots.ocr to JSON

Sparrow provides a table processing mode. It is optimized to handle large tables and comes with a separate template script (new templates can be...
Wednesday, March 4, 2026

Local OCR Comparison: dots.ocr More Accurate, DeepSeek-OCR 2 Faster (Sparrow + MLX)

I run local tests with Sparrow to compare DeepSeek-OCR 2 and dots.ocr (by RedNote), both running on MLX-VLM in FP16 precision. Dots.ocr consisten...

About Me

Andrej Baranovskij
Vilnius, Lithuania
I'm an Oracle ACE Director, Oracle Groundbreaker Ambassador, and CEO and Technical Expert at Red Samurai Consulting, with a focus on Oracle Fusion Middleware and Oracle Cloud technologies.