Andrej Baranovskij Blog
Blog about Oracle, Full Stack, Machine Learning and Cloud
Monday, November 25, 2024
Batch Inference with Qwen2 Vision LLM (Sparrow)
I share several tips on optimizing Qwen2 Vision LLM performance for batch processing.
Sunday, November 17, 2024
Visual LLM Structured Output Validation with Sparrow
I explain how Sparrow validates the structured output of visual LLMs to ensure it complies with the JSON schema provided in the query. This process helps prevent errors and hallucinations generated by the LLM.
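To make the idea concrete, here is a minimal sketch of this kind of check using only the Python standard library. It is not Sparrow's actual implementation: the `SCHEMA` format (field name mapped to a Python type), the field names, and the `validate_output` helper are all assumptions made for illustration.

```python
import json

# Hypothetical schema format: field name -> expected Python type.
# This is an illustrative assumption, not Sparrow's actual schema format.
SCHEMA = {"instrument_name": str, "valuation": (int, float)}


def validate_output(raw_text, schema):
    """Return a list of problems found in the LLM's raw JSON output."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for field, expected in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly;
        # this sketch has no boolean fields in its schema.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            problems.append(f"wrong type for field: {field}")
    # Fields the schema never asked for may indicate a hallucinated value.
    for field in data:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems
```

An empty list means the response matches the schema. Anything else can be logged or used to reject the response before it reaches downstream systems.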
Sunday, November 10, 2024
Extracting Financial Market Stock Data from Images with Vision LLM
In this video, I demonstrate how to extract financial market stock data from images using the Qwen2 Vision LLM, all within a Gradio interface. This setup allows quick and easy extraction of key stock stats from screenshots and other image-based data sources, which makes it a good fit for analysts, traders, and finance enthusiasts looking to streamline data processing. Watch to see how this AI tool can simplify your workflow and make stock data analysis faster and more efficient!
Monday, November 4, 2024
Structured Output Example with Sparrow UI Shell
Structured output is all you need. I deployed a Sparrow demo UI with Gradio to demonstrate the output Sparrow can produce by running a JSON schema query. You can see examples for the Bonds table, Lab results, and Bank statement.
Labels:
Machine Learning,
OCR,
VisionLLM
Sunday, October 20, 2024
Qwen2-VL Performance Boost
I share performance-boosting tips based on my experience using Qwen2-VL in production.
Sunday, October 13, 2024
Sparrow Parse Vision LLM FastAPI Endpoint
Sparrow provides an API for accessing the Sparrow Parse agent, allowing you to run document extraction workflows directly from your existing systems. It helps simplify how data is pulled from documents and integrated into your workflows.
Tuesday, October 8, 2024
Sparrow Parse Invoice Query with Vision LLM
Sparrow Parse is a new Sparrow agent that works with the Qwen2 Vision LLM.
What it does:
1. Accepts a query with a JSON schema. This solves several things at once: it gives the LLM a JSON structure for the response and hints at which type to use for each response element
2. Runs inference on the GPU of your choice, either cloud or local
3. Validates the JSON response against the query schema
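The three steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the Sparrow Parse code: the `QUERY` format, the `build_prompt` and `run_query` helpers, and the `fake_infer` stand-in for a GPU-backed Qwen2 call are all hypothetical.

```python
import json

# Hypothetical query: keys give the structure, values name the expected type.
QUERY = {"invoice_number": "str", "total": "float"}
TYPES = {"str": str, "int": int, "float": (int, float)}


def build_prompt(query):
    """Step 1: the schema doubles as output template and type hint."""
    return (
        "Extract data from the document and reply with JSON only, "
        "using exactly these fields and types: " + json.dumps(query)
    )


def run_query(query, infer):
    """Steps 2 and 3: call an inference backend, then check its answer."""
    answer = json.loads(infer(build_prompt(query)))
    if not isinstance(answer, dict) or set(answer) != set(query):
        raise ValueError("response fields do not match the query schema")
    for field, type_name in query.items():
        if not isinstance(answer[field], TYPES[type_name]):
            raise ValueError(f"field {field!r} is not of type {type_name}")
    return answer


def fake_infer(prompt):
    """Stand-in for a real model call; returns a canned reply."""
    return '{"invoice_number": "INV-001", "total": 120.5}'
```

Calling `run_query(QUERY, fake_infer)` returns the parsed dictionary. Passing the backend as a callable keeps the sketch independent of whether inference runs on a cloud or a local GPU.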
Labels:
DocumentProcessing,
OCR,
VisionLLM