Monday, December 23, 2024
Stateless MLX Inference with FastAPI in Sparrow
I show how to run inference with MLX in stateless mode, where the loaded model is released after inference completes. This is useful when inference requests are infrequent, because it reclaims the resources reserved by MLX.
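The load, infer, release cycle can be sketched roughly as follows. This is only an illustration under stated assumptions: `load_model` is a hypothetical callable standing in for whatever loads the MLX model, the model is assumed to be callable with a prompt, and any backend-specific cache clearing is left out.

```python
import gc
from typing import Any, Callable


def run_stateless(load_model: Callable[[], Any], prompt: str) -> Any:
    """Load the model, run one inference call, then release the model."""
    # Hypothetical loader: the model lives only for the duration of this call.
    model = load_model()
    try:
        return model(prompt)
    finally:
        # Drop the only reference and trigger garbage collection so the
        # memory held by the model can be reclaimed before the next request.
        del model
        gc.collect()
```

In a FastAPI handler, the body of the endpoint would call a function like `run_stateless` once per request, trading a slower first token for a smaller idle footprint.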
Tuesday, December 17, 2024
Streamlined Table Data Extraction with Sparrow | Table Transformer, Qwen2 VL, MLX, & Mac Mini M4 Pro
Learn how to streamline table data extraction with Sparrow, Table Transformer, Qwen2 VL, and MLX on the Mac Mini M4 Pro. Simplify your workflow and get accurate results!
Monday, December 9, 2024
Structured Output from Multipage PDF with Sparrow (Qwen2 Vision LLM and MLX)
I explain how multipage PDFs are handled in Sparrow to extract structured data in a single call.
Tuesday, December 3, 2024
Sparrow Apple MLX Backend on Mac Mini M4 (Qwen2 72B 4bit)
I show how I’m running the Qwen2 72B 4bit model locally on a Mac Mini M4 for Sparrow’s backend. MLX (and MLX-VLM) is the main platform I’m using for local data extraction in Sparrow.
Monday, November 25, 2024
Batch Inference with Qwen2 Vision LLM (Sparrow)
I share several tips for optimizing Qwen2 Vision LLM performance in batch processing.
Sunday, November 17, 2024
Visual LLM Structured Output Validation with Sparrow
I explain how Sparrow validates the structured output of visual LLMs to ensure it complies with the JSON schema provided in the query. This process helps prevent errors and hallucinations generated by the LLM.
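To make the idea concrete, here is a minimal sketch of checking LLM output against a JSON schema using only the standard library. It is not Sparrow's actual validation code: the `validate` and `validate_llm_output` helpers are hypothetical and cover only a small subset of JSON Schema (`type`, `required`, `properties`, `items`).

```python
import json
from typing import Any

# Maps JSON Schema type names to Python types (a deliberately small subset).
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate(value: Any, schema: dict, path: str = "$") -> list[str]:
    """Return a list of readable errors; an empty list means the value is valid."""
    errors: list[str] = []
    expected = schema.get("type")
    if expected in _TYPES:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif expected == "array" and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


def validate_llm_output(raw: str, schema: dict) -> list[str]:
    """Parse the raw LLM text as JSON, then check it against the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"$: not valid JSON ({exc.msg})"]
    return validate(data, schema)
```

An empty error list means the response can be passed on. A non-empty list can be logged or used to reject the response instead of returning a hallucinated structure.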
Sunday, November 10, 2024
Extracting Financial Market Stock Data from Images with Vision LLM
In this video, I demonstrate how to extract financial market stock data from images using the powerful Vision LLM Qwen2, all within a Gradio interface. This setup allows quick and easy extraction of key stock stats from screenshots and other image-based data sources—perfect for analysts, traders, and finance enthusiasts looking to streamline data processing. Watch to see how this AI tool can simplify your workflow and make stock data analysis faster and more efficient!
Monday, November 4, 2024
Structured Output Example with Sparrow UI Shell
Structured output is all you need. I deployed a Sparrow demo UI with Gradio to demonstrate the output Sparrow can produce by running a JSON schema query. You can see examples for the Bonds table, Lab results, and Bank statement.
Labels: Machine Learning, OCR, VisionLLM
Sunday, October 20, 2024
Qwen2-VL Performance Boost
I share performance-boosting tips based on my experience using Qwen2-VL in production.
Sunday, October 13, 2024
Sparrow Parse Vision LLM FastAPI Endpoint
Sparrow provides an API for accessing the Sparrow Parse agent, allowing you to run document extraction workflows directly from your existing systems. It helps simplify how data is pulled from documents and integrated into your workflows.
Tuesday, October 8, 2024
Sparrow Parse Invoice Query with Vision LLM
New Sparrow agent: Sparrow Parse, which works with Qwen2 Vision LLM.
What it does:
1. Accepts a query with a JSON schema. This solves two things at once: it gives the LLM a JSON structure for the response, and it hints which type to use for each response element
2. Runs inference on your GPU of choice, either cloud or local
3. Validates the JSON response against the query schema
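The three steps above can be sketched as a small pipeline. Everything here is an assumption made for illustration: the query format (field name mapped to a type name), the prompt wording, and the `infer` callable that stands in for the Qwen2 Vision LLM call are hypothetical and not taken from Sparrow Parse.

```python
import json
from typing import Callable

# Type names allowed in the illustrative query format used below.
TYPE_NAMES = {"str": str, "int": int, "float": (int, float)}


def build_prompt(query: dict[str, str]) -> str:
    """Step 1: the query doubles as the output structure and the type hints."""
    return (
        "Extract these fields from the document and answer with JSON only, "
        "using exactly these keys and value types: " + json.dumps(query)
    )


def check_response(raw: str, query: dict[str, str]) -> dict:
    """Step 3: make sure the reply has every queried field with the right type."""
    data = json.loads(raw)
    for field, type_name in query.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], TYPE_NAMES[type_name]):
            raise ValueError(f"wrong type for {field}: expected {type_name}")
    return data


def extract(query: dict[str, str], infer: Callable[[str], str]) -> dict:
    """Steps 1-3 in order; `infer` stands in for the vision LLM call (step 2)."""
    return check_response(infer(build_prompt(query)), query)
```

Passing the inference function as a parameter keeps the sketch independent of where the model runs, cloud or local.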
Labels: DocumentProcessing, OCR, VisionLLM
Monday, September 30, 2024
Running Qwen2 Vision LLM on Hugging Face ZeroGPU API
I describe my experience running Sparrow Parse with Qwen2 Vision LLM inference on a Hugging Face ZeroGPU instance.
Labels: Hugging Face, vision, ZeroGPU
Sunday, September 15, 2024
Document Querying with Qwen2-VL-7B and JSON Output
In this video, I demonstrate how to perform document queries using Qwen2-VL-7B. By simplifying field names, we streamline the prompts, making them more efficient and reusable across different documents. This approach is similar to running SQL queries on a database, but tailored for language models like Qwen2-VL-7B, with results returned in JSON format.
Sunday, September 8, 2024
Table Parsing with Qwen2-VL-7B
I show how to retrieve structured JSON output from a table image using Qwen2-VL-7B. This vision LLM performs OCR and data mapping out of the box, and it can return structured JSON output without intermediate frameworks.
Sunday, August 18, 2024
Sparrow Parse: Table Data Extraction with Table Transformer and OCR
I explain how we extract data with Sparrow Parse, using Table Transformer to identify the table area and build the table structure to be processed by OCR. Sparrow Parse implements additional logic to clean up and improve the table structure generated by Table Transformer (removing noise, merging columns, adjusting rows).
Labels: Machine Learning, OCR, Python
Sunday, August 11, 2024
Table Header Extraction with Table Transformer
The Table Transformer model can provide table functional analysis. As a result, we can identify the table header area and build cells to enclose each column header. In the next step, we crop each cell and read its data with OCR. Finally, we get structured data for the table header column names.
Sunday, July 21, 2024
Invoice Table Detection with Table Transformer
I show how an open-source transformer model from Microsoft for table detection and structure recognition works. The code is integrated into Sparrow Parse and runs on a local CPU. This approach helps to crop the table area first and then get coordinates for the table cells. Each cell can be cropped and text can be extracted with OCR. This allows retaining the original table structure and reporting the result in JSON or CSV formats. The data extraction part is not in this video; this will be the topic for the next video.
Sunday, July 14, 2024
Sparrow OCR Service with PaddleOCR
In this video, I demonstrate the latest updates to the Sparrow OCR Service using PaddleOCR. I walk you through the OCR service workflow in Sparrow, showcasing its integration with FastAPI and highlighting the enhanced functionalities brought by the recent PaddleOCR update. Join me to see how you can leverage these powerful tools for efficient OCR processing!
Wednesday, July 3, 2024
FastAPI Endpoint for Sparrow LLM Agent
I show how a FastAPI endpoint is used in Sparrow to run LLM agent functionality from an API client.
Sunday, June 23, 2024
Sparrow Parse API for PDF Invoice Data Extraction
I explain how the Sparrow Parse API is integrated into Sparrow for data extraction from PDF documents, such as invoices and receipts.
Monday, June 17, 2024
Avoid LLM Hallucinations: Use Sparrow Parse for Tabular PDF Data, Instructor LLM for Forms
LLMs tend to hallucinate and produce incorrect results for table data extraction. For this reason, Sparrow uses a combined approach within the same document: Instructor structured output for the LLM to query form data, and Sparrow Parse to process tabular data.
Monday, June 10, 2024
Effective Table Data Extraction from PDF without LLM
Sparrow Parse helps to read tabular data from PDFs, relying on various libraries, such as Unstructured or PyMuPDF4LLM. This allows us to avoid data hallucination errors often produced by LLMs when processing complex data structures.
Monday, June 3, 2024
Instructor and Ollama for Invoice Data Extraction in Sparrow [LLM, JSON]
Structured output from an invoice document, running a local LLM. This works well with Instructor and Ollama.
Labels: Instructor, LLM, Python
Monday, May 27, 2024
Hybrid RAG with Sparrow Parse
To process documents with complex layouts and improve data retrieval from invoices or bank statements, we are implementing Sparrow Parse. It works in combination with an LLM for form data processing. Table data is converted into either HTML or Markdown format and extracted directly by Sparrow Parse. I explain the Hybrid RAG idea in this video.
Monday, May 20, 2024
Sparrow Parse - Data Processing for LLM
Data processing is very important in LLM RAG: it helps to improve data extraction results, especially for documents with complex layouts and large tables. This is why I am building the open-source Sparrow Parse library, which helps to balance LLM and standard Python data extraction methods.
Monday, May 13, 2024
Invoice Data Preprocessing for LLM
Data preprocessing is an important step in an LLM pipeline. I show various approaches to preprocessing invoice data before feeding it to the LLM. This is quite a challenging step, especially for tables.
Monday, May 6, 2024
You Don't Need RAG to Extract Invoice Data
Documents like invoices or receipts can be processed by an LLM directly, without RAG. I explain how you can do this locally with Ollama and Instructor. Thanks to Instructor, structured output from the LLM can be validated with your own Pydantic class.
Monday, April 29, 2024
LLM JSON Output with Instructor RAG and WizardLM-2
With the Instructor library, you can implement simple RAG without a vector DB or dependencies on other LLM libraries. The key RAG components are good data pre-processing and cleaning, a powerful local LLM (such as WizardLM-2, Nous Hermes 2 PRO or Llama3), and an Ollama or MLX backend.
Monday, April 22, 2024
Local RAG Explained with Unstructured and LangChain
In this tutorial, I do a code walkthrough and demonstrate how to implement the RAG pipeline using Unstructured, LangChain, and Pydantic for processing invoice data and extracting structured JSON data.
Monday, April 15, 2024
Local LLM RAG with Unstructured and LangChain [Structured JSON]
I use the Unstructured library to pre-process PDF document content into a cleaner format. This helps the LLM produce a more accurate response. The JSON response is generated by the Nous Hermes 2 PRO LLM, without any additional post-processing. A dynamic Pydantic class validates the response to make sure it matches the request.
Sunday, March 31, 2024
LlamaIndex Upgrade to 0.10.x Experience
I explain key points you should keep in mind when upgrading to LlamaIndex 0.10.x.
Labels: LlamaIndex, LLM, RAG
Monday, March 25, 2024
LLM Structured Output for Function Calling with Ollama
I explain how function calling works with an LLM. This concept is often misunderstood: the LLM doesn't call a function. It returns a JSON response with values to be used for a function call from your environment. In this example, I'm using a Sparrow agent to call a function.
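A minimal sketch of that division of labor: the LLM only produces JSON, and your own code looks up and calls the function. The JSON shape (`name` plus `arguments`), the `TOOLS` registry, and the `add` function are hypothetical choices for this example, not a format required by Ollama or Sparrow.

```python
import json
from typing import Any, Callable


def add(a: float, b: float) -> float:
    """A stand-in for any function your environment exposes to the LLM."""
    return a + b


# Functions the LLM is allowed to request, keyed by the name it will return.
TOOLS: dict[str, Callable[..., Any]] = {"add": add}


def dispatch(llm_reply: str) -> Any:
    """Parse the LLM's JSON reply and perform the call it describes."""
    call = json.loads(llm_reply)
    func = TOOLS[call["name"]]  # KeyError if the LLM names an unknown function
    return func(**call["arguments"])
```

For example, `dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')` runs `add(2, 3)` in your process. The model itself never executes anything.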
Sunday, March 17, 2024
FastAPI File Upload and Temporary Directory for Stateless API
I explain how to handle file upload with FastAPI and how to process the file using a Python temporary directory. Files placed in the temporary directory are automatically removed once the request completes, which is very convenient for a stateless API.
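The temporary directory part can be sketched with the standard library alone. The `process_upload` helper and its `handler` parameter are hypothetical names for this example. In a FastAPI endpoint, `filename` and `content` would typically come from the uploaded file object.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def process_upload(filename: str, content: bytes, handler: Callable[[Path], T]) -> T:
    """Save uploaded bytes to a temporary directory, process them, then clean up."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep only the base name so a crafted filename cannot escape tmp_dir.
        target = Path(tmp_dir) / Path(filename).name
        target.write_bytes(content)
        # The file exists only while the handler runs; the directory and its
        # contents are deleted when the with-block exits, even on error.
        return handler(target)
```

Because nothing survives the call, each request starts from a clean slate and no cleanup job is needed.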
Sunday, March 10, 2024
Optimizing Receipt Processing with LlamaIndex and PaddleOCR
The LlamaIndex Text Completion function lets you execute an LLM request that combines custom data and the question, without using a vector DB. This is very useful when processing OCR output, as it simplifies the RAG pipeline. In this video, I explain how OCR can be combined with an LLM to process image documents in Sparrow.
Labels: LlamaIndex, LLM, RAG
Sunday, March 3, 2024
LlamaIndex Multimodal with Ollama [Local LLM]
I describe how to run LlamaIndex Multimodal with the local LLaVA LLM through Ollama. The advantage of this approach is that you can process image documents with the LLM directly, without running them through OCR, which should lead to better results. This functionality is integrated into Sparrow as a separate LLM agent.
Labels: LlamaIndex, LLM, RAG
Monday, February 26, 2024
LLM Agents with Sparrow
I explain new functionality in Sparrow: LLM agent support. This means you can implement independently running agents and invoke them from the CLI or API. This makes it easier to run various LLM-related processing within Sparrow.
Tuesday, February 20, 2024
Extracting Invoice Structured Output with Haystack and Ollama Local LLM
I implemented a Sparrow agent with Haystack structured output functionality to extract invoice data. This runs locally through Ollama, using an LLM to retrieve key/value pair data.
Sunday, February 4, 2024
Local LLM RAG Pipelines with Sparrow Plugins [Python Interface]
There are many tools and frameworks around LLM, evolving and improving daily. I added plugin support in Sparrow to run different pipelines through the same Sparrow interface. Each pipeline can be implemented with different tech (LlamaIndex, Haystack, etc.) and run independently. The main advantage is that you can test various RAG functionalities from a single app with a unified API and choose the one that works best in the specific use case.
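One common way to get this kind of pluggability is a small registry that maps a pipeline name to a callable. The sketch below is a generic illustration and not Sparrow's plugin code: `register`, `run_pipeline`, and the two placeholder pipelines are hypothetical, and the pipeline bodies only echo their input.

```python
from typing import Callable, Dict

Pipeline = Callable[[str], str]

# Registry of available pipelines, keyed by the name used to select them.
_PIPELINES: Dict[str, Pipeline] = {}


def register(name: str) -> Callable[[Pipeline], Pipeline]:
    """Decorator that adds a pipeline function to the registry under `name`."""
    def decorator(func: Pipeline) -> Pipeline:
        _PIPELINES[name] = func
        return func
    return decorator


@register("llamaindex")
def llamaindex_pipeline(query: str) -> str:
    # Placeholder body; a real pipeline would call into LlamaIndex here.
    return f"[llamaindex] {query}"


@register("haystack")
def haystack_pipeline(query: str) -> str:
    # Placeholder body; a real pipeline would call into Haystack here.
    return f"[haystack] {query}"


def run_pipeline(name: str, query: str) -> str:
    """Single entry point: pick a pipeline by name and run it."""
    try:
        pipeline = _PIPELINES[name]
    except KeyError:
        raise ValueError(f"unknown pipeline: {name}") from None
    return pipeline(query)
```

With this shape, comparing two RAG implementations on the same query is a matter of changing the `name` argument.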
Monday, January 29, 2024
LLM Structured Output with Local Haystack RAG and Ollama
Haystack 2.0 provides functionality to process LLM output and ensure a proper JSON structure, based on a predefined Pydantic class. I show how you can run this on your local machine with Ollama. This is possible thanks to the OllamaGenerator class available in Haystack.
Tuesday, January 23, 2024
JSON Output with Notus Local LLM [LlamaIndex, Ollama, Weaviate]
In this video, I show how to get JSON output from Notus LLM running locally with Ollama. JSON output is generated with LlamaIndex using the dynamic Pydantic class approach.
Labels: LlamaIndex, LLM, RAG
Monday, January 15, 2024
FastAPI and LlamaIndex RAG: Creating Efficient APIs
FastAPI works great with LlamaIndex RAG. In this video, I show how to build a POST endpoint to execute inference requests for LlamaIndex. The RAG implementation is done as part of the Sparrow data extraction solution. I show how FastAPI can handle multiple concurrent requests to initiate the RAG pipeline. I'm using Ollama to execute LLM calls as part of the pipeline. Ollama processes requests sequentially, which means it handles API requests in queue order. Hopefully, in the future, Ollama will support concurrent requests.
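The effect of concurrent API requests meeting a sequential backend can be modeled with plain `asyncio`. This is a toy model under stated assumptions: the `asyncio.Lock` stands in for a backend that serves one call at a time, the `sleep` stands in for the LLM call, and `call_backend` and `handle_requests` are hypothetical names. It does not reflect how Ollama is implemented.

```python
import asyncio


async def call_backend(lock: asyncio.Lock, log: list[str], request_id: int) -> str:
    """Forward one request to a backend that can only serve one call at a time."""
    async with lock:  # waiting requests queue up here
        log.append(f"start {request_id}")
        await asyncio.sleep(0.01)  # placeholder for the actual LLM call
        log.append(f"end {request_id}")
    return f"answer {request_id}"


async def handle_requests(count: int) -> tuple[list[str], list[str]]:
    """Accept `count` requests concurrently and return (answers, backend log)."""
    lock = asyncio.Lock()
    log: list[str] = []
    answers = await asyncio.gather(
        *(call_backend(lock, log, i) for i in range(count))
    )
    return list(answers), log
```

Running `asyncio.run(handle_requests(3))` shows all three requests being accepted at once, while the log records each backend call finishing before the next one starts.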
Labels: FastAPI, LlamaIndex, LLM, RAG
Monday, January 8, 2024
Transforming Invoice Data into JSON: Local LLM with LlamaIndex & Pydantic
This is Sparrow, our open-source solution for document processing with local LLMs. I'm running the local Starling LLM with Ollama. I explain how to get structured JSON output with LlamaIndex and a dynamic Pydantic class. This helps to implement the use case of data extraction from invoice documents. The solution runs on the local machine, thanks to Ollama. I'm using a MacBook Air M1 with 8GB RAM.
Labels: JSON, LlamaIndex, LLM, Pydantic, RAG