Monday, February 26, 2024

LLM Agents with Sparrow

I explain new functionality in Sparrow: LLM agent support. You can implement independently running agents and invoke them from the CLI or the API, which makes it easier to run various LLM-related processing tasks within Sparrow.


Tuesday, February 20, 2024

Extracting Invoice Structured Output with Haystack and Ollama Local LLM

I implemented a Sparrow agent that uses Haystack structured output functionality to extract invoice data. It runs locally through Ollama, using an LLM to retrieve key/value pairs.


Sunday, February 4, 2024

Local LLM RAG Pipelines with Sparrow Plugins [Python Interface]

There are many tools and frameworks around LLMs, evolving and improving daily. I added plugin support in Sparrow to run different pipelines through the same Sparrow interface. Each pipeline can be implemented with a different tech stack (LlamaIndex, Haystack, etc.) and run independently. The main advantage is that you can test various RAG functionalities from a single app with a unified API and choose the one that works best for your specific use case.
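
To illustrate the plugin idea, here is a minimal sketch of a name-based pipeline registry behind a single entry point. It is not Sparrow's actual code: the names `PIPELINES`, `register`, and `run_pipeline` are hypothetical, and the pipeline bodies are placeholders that do not call LlamaIndex or Haystack.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a plugin name to a pipeline function.
PIPELINES: Dict[str, Callable[[str], dict]] = {}


def register(name: str):
    """Decorator that registers a pipeline function under a plugin name."""
    def wrap(func):
        PIPELINES[name] = func
        return func
    return wrap


@register("llamaindex")
def llamaindex_pipeline(query: str) -> dict:
    # Placeholder: a real plugin would run a LlamaIndex pipeline here.
    return {"plugin": "llamaindex", "query": query}


@register("haystack")
def haystack_pipeline(query: str) -> dict:
    # Placeholder: a real plugin would run a Haystack pipeline here.
    return {"plugin": "haystack", "query": query}


def run_pipeline(name: str, query: str) -> dict:
    """Single entry point: look up the plugin by name and run it."""
    if name not in PIPELINES:
        raise ValueError(f"Unknown pipeline: {name}")
    return PIPELINES[name](query)
```

With this shape, switching from one implementation to another only means changing the name passed to `run_pipeline`, which is what makes side-by-side comparison easy.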


Monday, January 29, 2024

LLM Structured Output with Local Haystack RAG and Ollama

Haystack 2.0 provides functionality to process LLM output and ensure a proper JSON structure, based on a predefined Pydantic class. I show how you can run this on your local machine with Ollama. This is possible thanks to the OllamaGenerator class available in Haystack.
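
The core idea of checking LLM output against a predefined schema can be sketched without any framework. The example below is an assumption-heavy stand-in: it uses a standard-library dataclass instead of a Pydantic class, it does not use Haystack or Ollama, and the `Invoice` fields and the `validate_output` helper are hypothetical.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Invoice:
    # Hypothetical fields, chosen only for illustration.
    invoice_number: str
    total: float


def validate_output(raw: str):
    """Return (Invoice, None) on success or (None, error message) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"Invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "Expected a JSON object"
    for f in fields(Invoice):
        if f.name not in data:
            return None, f"Missing field: {f.name}"
        value = data[f.name]
        if f.type is float:
            # JSON numbers may arrive as int; booleans are rejected.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, f.type)
        if not ok:
            return None, f"Wrong type for field: {f.name}"
    # Extra keys in the LLM output are ignored.
    return Invoice(**{f.name: data[f.name] for f in fields(Invoice)}), None
```

The error string is returned rather than raised so that a caller could, for example, include it in a follow-up prompt asking the model to correct its answer.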


Tuesday, January 23, 2024

JSON Output with Notus Local LLM [LlamaIndex, Ollama, Weaviate]

In this video, I show how to get JSON output from the Notus LLM running locally with Ollama. The JSON output is generated with LlamaIndex using the dynamic Pydantic class approach.
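
The "dynamic class" part means the output schema is built at runtime from a field specification rather than hard-coded. Here is a rough sketch of that idea using only the standard library: `make_dataclass` stands in for a dynamically created Pydantic model, and unlike Pydantic it does not validate field types. The field spec, `TYPE_MAP`, `build_model`, and the sample LLM output are all hypothetical.

```python
import json
from dataclasses import asdict, make_dataclass

# Hypothetical mapping from type names in a field spec to Python types.
TYPE_MAP = {"str": str, "int": int, "float": float}


def build_model(name: str, spec: dict):
    """Create a class at runtime from a {field_name: type_name} spec."""
    return make_dataclass(name, [(field, TYPE_MAP[t]) for field, t in spec.items()])


# The spec could come from a request or a config file instead of code.
spec = {"invoice_number": "str", "total": "float"}
Invoice = build_model("Invoice", spec)

# Stand-in for the JSON string an LLM would return.
llm_output = '{"invoice_number": "A-1", "total": 12.5}'

# Missing or unexpected keys raise TypeError here.
record = Invoice(**json.loads(llm_output))
```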


Monday, January 15, 2024

FastAPI and LlamaIndex RAG: Creating Efficient APIs

FastAPI works great with LlamaIndex RAG. In this video, I show how to build a POST endpoint to execute inference requests for LlamaIndex. The RAG implementation is part of the Sparrow data extraction solution. I show how FastAPI can handle multiple concurrent requests that initiate the RAG pipeline. I'm using Ollama to execute LLM calls as part of the pipeline. Ollama processes requests sequentially, so API requests are handled in queue order. Hopefully, in the future, Ollama will support concurrent requests.
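
The behaviour described here, where many requests are accepted concurrently but the backend handles them one at a time in arrival order, can be sketched with plain asyncio. This is only an analogy: it uses neither FastAPI nor Ollama, the `worker` coroutine stands in for the sequential LLM backend, and `handle_request` stands in for an endpoint handler. All names are hypothetical.

```python
import asyncio


async def worker(queue: asyncio.Queue, log: list) -> None:
    """Single consumer: handles one job at a time, in arrival order."""
    while True:
        prompt, future = await queue.get()
        await asyncio.sleep(0.01)  # stand-in for a slow LLM call
        log.append(prompt)
        future.set_result(f"answer to {prompt}")
        queue.task_done()


async def handle_request(queue: asyncio.Queue, prompt: str) -> str:
    """Called concurrently, like an endpoint handler; waits for its turn."""
    future = asyncio.get_running_loop().create_future()
    await queue.put((prompt, future))
    return await future


async def main():
    queue = asyncio.Queue()
    log = []
    task = asyncio.create_task(worker(queue, log))
    # Three requests arrive at once; the worker still serves them one by one.
    answers = await asyncio.gather(
        *(handle_request(queue, p) for p in ["a", "b", "c"])
    )
    task.cancel()
    return answers, log


answers, log = asyncio.run(main())
```

Every caller gets its answer, but total latency grows with queue length, which is why concurrent request support on the LLM side would help.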


Monday, January 8, 2024

Transforming Invoice Data into JSON: Local LLM with LlamaIndex & Pydantic

This is Sparrow, our open-source solution for document processing with local LLMs. I'm running the Starling LLM locally with Ollama. I explain how to get structured JSON output with LlamaIndex and a dynamic Pydantic class. This helps implement the use case of extracting data from invoice documents. The solution runs on the local machine, thanks to Ollama. I'm using a MacBook Air M1 with 8GB RAM.