Andrej Baranovskij Blog
Blog about Oracle, Full Stack, Machine Learning and Cloud
Sunday, October 20, 2024
Qwen2-VL Performance Boost
I share performance-boosting tips based on my experience using Qwen2-VL in production.
Sunday, October 13, 2024
Sparrow Parse Vision LLM FastAPI Endpoint
Sparrow provides an API for accessing the Sparrow Parse agent, allowing you to run document extraction workflows directly from your existing systems. It helps simplify how data is pulled from documents and integrated into your workflows.
Tuesday, October 8, 2024
Sparrow Parse Invoice Query with Vision LLM
New Sparrow agent: Sparrow Parse, which works with the Qwen2 Vision LLM.
What it does:
1. Accepts a query with a JSON schema. This solves a few things at once: it gives the LLM a JSON structure to follow when generating the response, and it hints which type to use for each response element
2. Runs inference on your GPU of choice, either cloud or local GPU
3. Validates JSON response, based on query schema
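The schema-driven validation in step 3 can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not Sparrow Parse's actual implementation; the field names, the type vocabulary (`str`, `int`, `float`, `bool`) and the `validate_response` helper are all assumptions made for the example.

```python
import json

# Hypothetical query schema: field name -> expected type name.
# Field names and type vocabulary are assumptions for illustration only.
QUERY_SCHEMA = {"invoice_number": "str", "total": "float", "paid": "bool"}

# Maps the schema's type names to Python types used for checking.
TYPE_MAP = {"str": str, "int": int, "float": (int, float), "bool": bool}


def validate_response(raw_response, schema):
    """Return a list of problems; an empty list means the response matches."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for field, type_name in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
            continue
        value = data[field]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        if isinstance(value, bool) and type_name != "bool":
            problems.append(f"wrong type for {field}: expected {type_name}")
        elif not isinstance(value, TYPE_MAP[type_name]):
            problems.append(f"wrong type for {field}: expected {type_name}")

    # Report fields the model produced that the query did not ask for.
    for field in data:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


if __name__ == "__main__":
    reply = '{"invoice_number": "A-1", "total": 10.5, "paid": true}'
    print(validate_response(reply, QUERY_SCHEMA))
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a single model response.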
Labels: DocumentProcessing, OCR, VisionLLM
Monday, September 30, 2024
Running Qwen2 Vision LLM on Hugging Face ZeroGPU API
I explain my experience running Sparrow Parse with Qwen2 Vision LLM inference on a Hugging Face ZeroGPU instance.
Labels: Hugging Face, vision, ZeroGPU
Sunday, September 15, 2024
Document Querying with Qwen2-VL-7B and JSON Output
In this video, I demonstrate how to perform document queries using Qwen2-VL-7B. By simplifying field names, we streamline the prompts, making them more efficient and reusable across different documents. This approach is similar to running SQL queries on a database, but tailored for language models like Qwen2-VL-7B, with results returned in JSON format.
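As a rough sketch of the idea, a reusable query prompt can be assembled from a list of simplified field names, much like listing columns in a SQL `SELECT`. The prompt wording, the field names and the `build_query_prompt` helper below are assumptions for illustration; they are not the prompt used in the video.

```python
def build_query_prompt(fields):
    """Build a document query prompt from simplified field names.

    The wording is a hypothetical example, not the author's actual prompt.
    """
    if not fields:
        raise ValueError("at least one field name is required")
    field_list = ", ".join(fields)
    return (
        f"Retrieve {field_list} from the document. "
        "Return the response in JSON format, using these field names as keys."
    )


if __name__ == "__main__":
    # The same helper can be reused across documents by changing the fields.
    print(build_query_prompt(["invoice_number", "invoice_date", "total"]))
```

Keeping the field names short and generic is what makes the same prompt template reusable across different documents.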
Sunday, September 8, 2024
Table Parsing with Qwen2-VL-7B
I show how to retrieve structured JSON output from a table image using Qwen2-VL-7B. This vision LLM performs OCR and data mapping out of the box, and it can return structured JSON output without intermediate frameworks.
Sunday, August 18, 2024
Sparrow Parse: Table Data Extraction with Table Transformer and OCR
I explain how we extract data with Sparrow Parse, using Table Transformer to identify the table area and build the table structure that is then processed by OCR. Sparrow Parse implements additional logic to clean up and improve the table structure generated by Table Transformer (removing noise, merging columns, adjusting rows).
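To give a feel for this kind of cleanup, here is a minimal sketch that operates on a table already represented as a list of rows of cell strings. It is not Sparrow Parse's logic: the `clean_table` helper and its rules (pad short rows, drop empty rows, drop empty columns) are assumptions for illustration, and column merging is not covered.

```python
def clean_table(rows):
    """Tidy a table given as a list of rows, each a list of cell strings."""
    # Adjust rows: trim whitespace and pad short rows to a common width.
    width = max((len(row) for row in rows), default=0)
    padded = [
        [cell.strip() for cell in row] + [""] * (width - len(row))
        for row in rows
    ]
    # Remove noise: drop rows that contain no text at all.
    padded = [row for row in padded if any(row)]
    # Drop columns that are empty in every remaining row.
    keep = [i for i in range(width) if any(row[i] for row in padded)]
    return [[row[i] for i in keep] for row in padded]


if __name__ == "__main__":
    raw = [["Item", "", "Qty"], ["Pen ", "", "2"], ["", "", ""], ["Ink", ""]]
    print(clean_table(raw))
```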
Labels: Machine Learning, OCR, Python