Sunday, December 17, 2023

From Text to Vectors: Leveraging Weaviate for local RAG Implementation with LlamaIndex

Weaviate provides vector storage and plays an important part in RAG implementation. I'm using local embeddings from the Sentence Transformers library to create vectors for text-based PDF invoices and store them in Weaviate. I explain how the integration with LlamaIndex is done to manage the data ingestion and LLM inference pipeline. 
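
For reference, the ingestion step looks roughly like this - a minimal sketch, not the exact code from the video, assuming recent llama-index packages and a local Weaviate instance with the v4 client; paths, model, and index names are illustrative:

```python
import weaviate
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.weaviate import WeaviateVectorStore

# Local embeddings from the Sentence Transformers family (no external API calls)
Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

client = weaviate.connect_to_local()  # Weaviate running on localhost
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Invoices")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Read PDF invoices, embed them locally, and store the vectors in Weaviate
documents = SimpleDirectoryReader("data/invoices").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```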

 

Monday, December 11, 2023

Enhancing RAG: LlamaIndex and Ollama for On-Premise Data Extraction

LlamaIndex is an excellent choice for RAG implementation. It provides a convenient API to work with different data sources and extract data. LlamaIndex also provides an API for Ollama integration, which means we can easily use LlamaIndex with on-premise LLMs through Ollama. I explain a sample app where LlamaIndex works with Ollama to extract data from PDF invoices. 
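
The wiring is roughly as follows - a minimal sketch, assuming recent llama-index packages with the Ollama integration and an Ollama server running locally; the model name and paths are illustrative:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Point LlamaIndex at the on-premise LLM served by Ollama and at a local embedding model
Settings.llm = Ollama(model="mistral", request_timeout=120.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

documents = SimpleDirectoryReader("data/invoices").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("What is the invoice number and the total amount?"))
```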

 

Tuesday, December 5, 2023

Secure and Private: On-Premise Invoice Processing with LangChain and Ollama RAG

The Ollama desktop tool helps run LLMs locally on your machine. This tutorial explains how I implemented a pipeline with LangChain and Ollama for on-premise invoice processing. Running LLMs on-premise provides many advantages in terms of security and privacy. Ollama works similarly to Docker; you can think of it as Docker for LLMs: you can pull and run multiple LLMs. This allows you to switch between LLMs without changing the RAG pipeline. 
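
Swapping models is mostly a matter of changing the model name passed to the LangChain Ollama wrapper - a minimal sketch, assuming langchain-community and a local Ollama server; model names and the prompt are illustrative:

```python
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate

llm = Ollama(model="llama2")  # e.g. swap to Ollama(model="mistral") without touching the chain

prompt = PromptTemplate.from_template(
    "Extract the invoice number and total from the following text:\n{invoice_text}"
)
chain = prompt | llm  # LCEL: pipe the prompt into the LLM
print(chain.invoke({"invoice_text": "Invoice No. 123, Total: 250.00 EUR"}))
```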

 

Monday, November 27, 2023

Easy-to-Follow RAG Pipeline Tutorial: Invoice Processing with ChromaDB & LangChain

I explain the implementation of a pipeline to process invoice data from PDF documents. The data is loaded into the ChromaDB vector store. Through the LangChain API, the data from the vector store is ready to be consumed by the LLM as part of the RAG infrastructure. 
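
The ingestion step looks roughly like this - a minimal sketch, assuming langchain-community, chromadb, and pypdf are installed; paths and model names are illustrative:

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load the PDF invoice and split it into chunks for embedding
docs = PyPDFLoader("data/invoice.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = Chroma.from_documents(chunks, embeddings, persist_directory="db")

# The retriever is what the RAG chain hands to the LLM
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
```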

 

Sunday, November 19, 2023

Vector Database Impact on RAG Efficiency: A Simple Overview

I explain the importance of the Vector DB for RAG implementation. I show, with a simple example, how data retrieval from the Vector DB can affect LLM performance. Before data is sent to the LLM, you should verify that quality data was fetched from the Vector DB. 
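
A simple way to run this check is to look at the retrieved chunks and their scores directly - a minimal, self-contained sketch using a Chroma store and local embeddings; the texts and query are illustrative:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = Chroma.from_texts(
    ["Invoice No. 123, Total: 250.00 EUR", "Meeting notes from Monday"],
    embeddings,
)

# Inspect the fetched chunks and their distance scores before sending them to the LLM -
# poor matches here usually translate into poor LLM answers
for doc, score in store.similarity_search_with_score("What is the invoice total?", k=2):
    print(round(score, 3), doc.page_content)
```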

 

Monday, November 13, 2023

JSON Output from Mistral 7B LLM [LangChain, Ctransformers]

I explain how to compose a prompt for the Mistral 7B LLM model running with LangChain and CTransformers to retrieve the output as a JSON string, without any additional text. 
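
The core idea is to instruct the model explicitly to answer with JSON only - a minimal sketch, assuming langchain-community and ctransformers are installed; the GGUF repo/file names and the prompt wording are illustrative:

```python
from langchain_community.llms import CTransformers
from langchain.prompts import PromptTemplate

llm = CTransformers(
    model="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    model_type="mistral",
    config={"max_new_tokens": 256, "temperature": 0.1},
)

# The prompt explicitly asks for a JSON object only, with no extra commentary
prompt = PromptTemplate.from_template(
    "[INST] Extract the invoice number, date and total from the text below. "
    "Answer with a valid JSON object only, no additional text.\n{invoice_text} [/INST]"
)
chain = prompt | llm
print(chain.invoke({"invoice_text": "Invoice No. 123, 2023-11-10, Total: 250.00 EUR"}))
```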

 

Monday, November 6, 2023

Structured JSON Output from LLM RAG on Local CPU [Weaviate, Llama.cpp, Haystack]

I explain how to get structured JSON output from LLM RAG running with the Haystack API on top of Llama.cpp. Vector embeddings are stored in the Weaviate database, the same as in my previous video. When extracting data, a structured JSON response is preferred because we are not interested in additional descriptions.

 

Sunday, October 22, 2023

Invoice Data Processing with Llama2 13B LLM RAG on Local CPU [Weaviate, Llama.cpp, Haystack]

I explain how to set up local LLM RAG to process invoice data with Llama2 13B. Based on my experiments, Llama2 13B works better with tabular data compared to the Mistral 7B model. This example presents a production LLM RAG setup with the Weaviate database for vector embeddings, Haystack for the LLM API, and Llama.cpp to run Llama2 13B on a local CPU. 

 

Monday, October 16, 2023

Invoice Data Processing with Mistral LLM on Local CPU

I explain a solution to extract invoice document fields with the open-source LLM Mistral. It runs on a CPU and doesn't require a Cloud machine. I'm using the Mistral 7B LLM model, LangChain, CTransformers, and a FAISS vector store to run it on a local CPU machine. This approach gives a great advantage for enterprise systems where running ML models in the Cloud is not allowed for privacy reasons. 
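
End to end, the CPU setup can be sketched like this - not the exact code from the video, assuming langchain-community, ctransformers, faiss-cpu, and sentence-transformers; model/file names and the sample text are illustrative:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import CTransformers
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Quantized Mistral 7B running fully on CPU via ctransformers
llm = CTransformers(
    model="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    model_type="mistral",
)

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(["Invoice No. 123, Total: 250.00 EUR"], embeddings)

qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.invoke({"query": "What is the invoice total?"}))
```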

 

Monday, October 9, 2023

Skipper MLOps Debugging and Development on Your Local Machine

I explain how to stop some of the Skipper MLOps services running in Docker and debug/develop the code of these services locally. This improves the development workflow: there is no need to deploy a code change to a Docker container, it can be tested locally. A service that runs locally connects to the Skipper infra through a RabbitMQ queue.

 

Monday, October 2, 2023

Pros and Cons of Developing Your Own ChatGPT Plugin

I've been running a ChatGPT plugin in prod for a month and I'm sharing my thoughts about the pros and cons of developing it. Would I build a new ChatGPT plugin? 

 

Monday, September 25, 2023

LLama 2 LLM for PDF Invoice Data Extraction

I show how you can extract data from a text-based PDF invoice using the Llama2 LLM model running on a free Colab GPU instance. I specifically explain how you can improve data retrieval using carefully crafted prompts.

 

Monday, September 11, 2023

Data Filtering and Aggregation with Receipt Assistant Plugin for ChatGPT

I explain the Receipt Assistant plugin for ChatGPT from a user perspective. I show how to fetch previously processed and saved receipt data, including filtering and aggregation. Also, I show how you can fix spelling mistakes in Lithuanian-language receipt items. At the end, numeric data is visualized with the WizeCharts plugin for ChatGPT. 

 

Monday, September 4, 2023

Computer Vision with ChatGPT - Receipt Assistant Plugin

Our plugin, Receipt Assistant, was approved for inclusion in the ChatGPT store. I explain how it works and how to use it in combination with other plugins, for example, to display charts. Receipt Assistant provides a vision and storage option for ChatGPT. It is primarily tuned to work with receipts, but it can handle any structured info of medium complexity. 

 

Saturday, August 19, 2023

How to Host FastAPI from Your Computer with ngrok

With ngrok you can host your FastAPI app from your computer. This can be a handy and cheaper option for some projects. In this video, I explain my experience running FastAPI apps from my very own Cloud with ngrok :)
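
One way to wire it up from Python is the pyngrok wrapper (the video may use the ngrok CLI instead) - a minimal sketch, assuming an ngrok auth token is already configured:

```python
import uvicorn
from fastapi import FastAPI
from pyngrok import ngrok

app = FastAPI()

@app.get("/")
def root():
    return {"message": "served from my own machine"}

if __name__ == "__main__":
    public_url = ngrok.connect(8000)  # open a tunnel to the local port
    print("Public URL:", public_url)
    uvicorn.run(app, host="127.0.0.1", port=8000)
```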

 

Monday, August 14, 2023

ChatGPT Plugin OAuth with Logto

You can set up OAuth for a ChatGPT plugin to be able to get user info. This is needed when the plugin works with user data and you want to keep that data across sessions. With OAuth you can authenticate users. I explain how to set it up with Logto. Logto is an Auth0 alternative for building modern customer identity infrastructure with minimal effort, for both your customers and their organizations. 

 

Monday, August 7, 2023

Deploy Local ML Apps with Ngrok

Ngrok helps to run your local apps online with access on the Web. It provides HTTPS with auto-renewal and content compression. With Ngrok you can serve your ML apps running on local infra to external users, just as if they were running in the Cloud. The main advantage of this approach is that it reduces infra cost.

 

Saturday, July 22, 2023

ChatGPT Plugin Backend with FastAPI

This tutorial explains how to integrate a FastAPI backend with a ChatGPT plugin implemented in Python. The backend stores data from ChatGPT in MongoDB so it is persistent and available across sessions. 

 

Tuesday, July 11, 2023

ChatGPT Plugin with Persistent Storage

Receipt Assistant is our ChatGPT plugin with persistent storage support. I show how it works: upload a scanned receipt and store the OCR result converted to key/value pairs by ChatGPT, then load the data back into ChatGPT, review it, and produce insights. In future videos, I will explain how it works from a technical point of view.

 

Monday, July 3, 2023

FastAPI, Pydantic and MongoDB for Beginners

I show how to initialize a connection to MongoDB from a FastAPI endpoint with a startup event. Before a new record is pushed to a MongoDB collection, it is validated with Pydantic. I like the flexibility of the MongoDB Motor async library. It helps to implement seamless communication from FastAPI to MongoDB.
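
A minimal sketch of this setup, assuming motor and a local MongoDB; database, collection, and field names are illustrative:

```python
from fastapi import FastAPI
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import BaseModel

app = FastAPI()

class Receipt(BaseModel):
    vendor: str
    total: float

@app.on_event("startup")
async def startup_db():
    # Initialize the async Mongo client once, when the app starts
    app.state.mongo = AsyncIOMotorClient("mongodb://localhost:27017")

@app.post("/receipts")
async def create_receipt(receipt: Receipt):
    # Pydantic has already validated the payload at this point
    collection = app.state.mongo["receipts_db"]["receipts"]
    result = await collection.insert_one(receipt.dict())  # .model_dump() on Pydantic v2
    return {"id": str(result.inserted_id)}
```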

 

Sunday, June 25, 2023

File Upload App for ChatGPT

ChatGPT doesn't provide a file upload option out of the box. I explain the app I built with Streamlit to handle file upload and allow ChatGPT to fetch file content through the plugin and a unique key. 

 

Monday, June 19, 2023

Building Your Own ChatGPT Plugin

I explain how to get started with ChatGPT plugin development. It is essential to understand how to define the OpenAPI specification to match your endpoints. In this example, you will see a working use case with endpoints for uploading a file and then fetching the file data into ChatGPT. 

 

Sunday, June 11, 2023

PaddleOCR as a Service with FastAPI

PaddleOCR is a great tool to extract text data from docs, and it can group related words into a sentence. Such functionality can simplify extracted data analysis. In this video, I explain how to run it as a service with FastAPI in Python. 
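
A minimal sketch of such a service, assuming paddleocr, fastapi, and python-multipart are installed; the endpoint name and response shape are illustrative, and the result structure shown matches PaddleOCR 2.x:

```python
import cv2
import numpy as np
from fastapi import FastAPI, File, UploadFile
from paddleocr import PaddleOCR

app = FastAPI()
ocr = PaddleOCR(use_angle_cls=True, lang="en")  # model is loaded once, at startup

@app.post("/ocr")
async def run_ocr(file: UploadFile = File(...)):
    content = await file.read()
    image = cv2.imdecode(np.frombuffer(content, np.uint8), cv2.IMREAD_COLOR)
    result = ocr.ocr(image, cls=True)
    # Each entry holds the bounding box and a (text, confidence) pair
    words = [line[1][0] for line in result[0]]
    return {"words": words}
```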

 

Monday, June 5, 2023

ChatGPT/GPT-4 for Receipt OCR Data Analysis

I show how ChatGPT works with array data generated by an OCR engine extracting text from a receipt document. ChatGPT can parse and understand the data. What is great about it: the GPT-4 model automatically maps key/value pairs without any additional metadata. For example, it matches a receipt item with the item price. I'm using PaddleOCR to construct the OCR input for ChatGPT. 

 

Monday, May 29, 2023

Document AI: How To Convert Colab ML Notebook Into FastAPI App

I explain how I converted Donut ML model fine-tuning code implemented as a Colab notebook into an API running as a FastAPI app. I share several hints on how to simplify code refactoring efforts. 

 

Monday, May 22, 2023

Speeding Up FastAPI App with Background Tasks

FastAPI background tasks run after the response has been returned, so a long-running operation doesn't block the app endpoint. I explain it in this video and show the benefit of running time-consuming operations in background tasks.
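
A minimal sketch; the processing function and endpoint are illustrative:

```python
import time
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def process_document(doc_id: str):
    time.sleep(30)  # stand-in for a long-running operation, e.g. ML inference

@app.post("/process/{doc_id}")
async def process(doc_id: str, background_tasks: BackgroundTasks):
    # The task is scheduled to run after the response is sent,
    # so the endpoint returns immediately
    background_tasks.add_task(process_document, doc_id)
    return {"status": "processing started", "doc_id": doc_id}
```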

 

Monday, May 15, 2023

Optimizing FastAPI for Concurrent Users when Running Hugging Face ML Models

To serve multiple concurrent users accessing a FastAPI endpoint that runs the Hugging Face API, you must start the FastAPI app with several workers. This ensures a user request will not be blocked while another request is still running. I show and describe it in this video. 
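
A minimal sketch of starting the app with multiple worker processes; module and app names are illustrative:

```python
import uvicorn

if __name__ == "__main__":
    # Each worker is a separate process with its own copy of the model,
    # so a slow request in one worker does not block the others
    uvicorn.run("main:app", host="0.0.0.0", port=8000, workers=4)
```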

 

Monday, May 8, 2023

Optimizing ML Model Loading Time Using LRU Cache in FastAPI

Are you facing challenges with the time it takes to load large ML models in your backend API? This video presents a practical solution: an LRU cache applied to a properly annotated function. With this approach, the model is cached in memory, eliminating the need for disk reads on subsequent calls. 
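
A minimal sketch of the idea, assuming a Hugging Face pipeline; the model name is illustrative:

```python
from functools import lru_cache
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

@lru_cache(maxsize=1)
def get_model():
    # Loaded from disk only on the first call; subsequent calls hit the in-memory cache
    return pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

@app.get("/predict")
def predict(text: str):
    return get_model()(text)
```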

 

Monday, April 24, 2023

Efficient Document Data Extraction with Sparrow UI: Streamlit, FastAPI, and Hugging Face's Donut ML

In this easy-to-follow video, I show you how I built Sparrow UI, a tool for pulling data from documents using Streamlit. With Sparrow UI, you can upload a document and quickly run a data extraction task. I'll walk you through how the system works, using a FastAPI app on the backend to run a fine-tuned Donut ML model from Hugging Face. I'll also explain the code that sends POST requests from the Streamlit app, including how it sends files and text to the FastAPI endpoint. This way, you'll get a JSON response with the extracted info from your document. 
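
The Streamlit side of the request can be sketched like this - not the exact Sparrow code; the endpoint URL and field names are illustrative:

```python
import requests
import streamlit as st

uploaded = st.file_uploader("Upload a document", type=["png", "jpg", "pdf"])
fields = st.text_input("Fields to extract", "invoice_number, total")

if uploaded is not None and st.button("Run extraction"):
    response = requests.post(
        "http://localhost:8000/api/v1/sparrow/inference",  # illustrative endpoint
        files={"file": (uploaded.name, uploaded.getvalue(), uploaded.type)},
        data={"fields": fields},
        timeout=180,
    )
    st.json(response.json())  # JSON with the extracted info
```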

 

Monday, April 17, 2023

Deploying FastAPI Applications to Hugging Face Spaces

In this video, I demonstrate how to deploy a FastAPI backend API to Hugging Face Spaces using Docker. I cover creating a Dockerfile, setting up secrets for FastAPI, and deploying the application on the platform.

     

Monday, April 10, 2023

Build a Structured API with FastAPI

Learn how to create a structured API using FastAPI. In this tutorial, we explore the benefits of FastAPI, its code structuring capabilities, and how it connects services within Sparrow to build scalable and efficient applications. 
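
A minimal sketch of structuring endpoints with routers; names are illustrative:

```python
from fastapi import APIRouter, FastAPI

inference_router = APIRouter(prefix="/api/v1/inference", tags=["inference"])

@inference_router.get("/status")
def status():
    return {"status": "ok"}

app = FastAPI(title="Sparrow API")
app.include_router(inference_router)  # each service gets its own router module
```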

 

Monday, March 27, 2023

Donut ML Model Fine-Tuning with Hugging Face API

I explain how the Donut ML model can be fine-tuned on your own dataset, following different approaches: either with PyTorch Lightning or with the Hugging Face Trainer API. I explain the pros and cons of both and what works best for me.

 

Tuesday, March 21, 2023

How I'm Using ChatGPT/GPT-4 as a Solo Python Developer

I'm working as a solo Python developer and using ChatGPT to speed up the development process. In this video, I explain how ChatGPT is helping me with various tasks, from code explanation to suggesting solutions.

 

Sunday, March 12, 2023

Hugging Face Dataset for Donut Model Fine-Tuning (Document AI)

Hugging Face Dataset is a very convenient way to store and share data for ML model fine-tuning. In this post, I share my experience creating a dataset for fine-tuning the Donut model. I made a set of scripts to generate the dataset, push it to the Hub and test it locally.
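
The push/test cycle can be sketched like this - not my exact scripts, assuming the datasets library and a Hugging Face login; the dataset name, paths, and ground-truth fields are illustrative:

```python
from datasets import Dataset, Image, load_dataset

data = {
    "image": ["data/images/invoice_0.jpg"],
    "ground_truth": ['{"gt_parse": {"invoice_number": "123", "total": "250.00"}}'],
}
ds = Dataset.from_dict(data).cast_column("image", Image())

ds.push_to_hub("your-username/invoices-donut")  # share it on the Hub

# Test it locally by loading it back
check = load_dataset("your-username/invoices-donut", split="train")
print(check[0]["ground_truth"])
```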

 

Monday, March 6, 2023

Improve OCR Results with Sparrow (running on Streamlit/Python and Ngrok)

OCR can often return results in a different order. But to produce a dataset for fine-tuning a data extraction ML model (for example, Donut), fields in all documents must be ordered correctly. Our open-source solution for data annotation/labeling, Sparrow, includes functionality for reordering OCRed fields. In this video, I explain and show how it works.

 

Monday, February 27, 2023

Document Data Extraction - Data Mapping for Donut Model Fine-Tuning Dataset (Document AI)

I explain the current status of my work related to dataset preparation for Donut ML model fine-tuning. I plan to use this model to run data extraction tasks on invoice documents. I share hints about data mapping and how to structure the data to achieve better fine-tuning results.

 

Monday, February 20, 2023

Streamlit Button Group UI (Flowbite) Component

Streamlit doesn't provide an option to display multiple buttons side-by-side horizontally. I explain how to achieve this functionality using a custom Streamlit component and Flowbite button group UI.

 

Monday, February 13, 2023

Preparing Dataset for Donut Fine-Tuning (part 3, Document AI)

In this episode, I explain the redesigned Sparrow UI for data annotation. Sparrow UI is improved with the Streamlit AgGrid component. I show how to group related fields generated by OCR into a single entity and map it to a label. I briefly review the code and discuss how you can set up a grid component in Streamlit - a convenient and helpful UI element.
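
A minimal sketch of the grid setup, assuming the streamlit-aggrid package; the dataframe and options are illustrative:

```python
import pandas as pd
import streamlit as st
from st_aggrid import AgGrid, GridOptionsBuilder

df = pd.DataFrame({"field": ["invoice_no", "total"], "value": ["123", "250.00"]})

gb = GridOptionsBuilder.from_dataframe(df)
gb.configure_default_column(editable=True)  # allow fixing OCR values in place
gb.configure_selection("multiple", use_checkbox=True)  # select rows to group into one entity

grid_response = AgGrid(df, gridOptions=gb.build())
st.write(grid_response["selected_rows"])
```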

 

Monday, February 6, 2023

Preparing Dataset for Donut Fine-Tuning (part 2, Document AI)

I explain how to group OCR results into a single entity using the Sparrow annotation tool. This is useful for fields such as an address or item description, where the field text consists of multiple words.

 

Tuesday, January 31, 2023

Preparing Dataset for Donut Fine-Tuning (part 1, Document AI)

I explain the dataset I will be using to fine-tune the Donut model. I show how PDFs are converted to image files for further processing and OCR data extraction. In the next step, the JSON data is converted to a format understood by the Sparrow annotation processing/review tool.
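
The PDF-to-image step can be sketched like this, assuming the pdf2image library (which requires poppler); paths are illustrative:

```python
import os
from pdf2image import convert_from_path

os.makedirs("data/images", exist_ok=True)

pages = convert_from_path("data/invoice.pdf", dpi=200)
for i, page in enumerate(pages):
    page.save(f"data/images/invoice_page_{i}.jpg", "JPEG")  # one image per PDF page
```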

 

Monday, January 23, 2023

How To Fine-tune Donut Model

Donut is an awesome Document AI model to extract data from docs. I share my experience fine-tuning the model on the CORD dataset, based on an example from Transformers Tutorials.

 

Monday, January 16, 2023

Donut 🍩 - ChatGPT for Document AI

Donut is an OCR-free Document Understanding Transformer. This ML model can process documents (images, scans) and return structured JSON info about the content. It works for different use cases: form understanding, visual question answering about the document, and document image classification.
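
A minimal inference sketch, using the publicly available CORD-finetuned Donut checkpoint from the Hugging Face Hub; the image path is illustrative:

```python
import re
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base-finetuned-cord-v2")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base-finetuned-cord-v2")

image = Image.open("receipt.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# The task prompt tells Donut which decoding schema to use
task_prompt = "<s_cord-v2>"
decoder_input_ids = processor.tokenizer(task_prompt, add_special_tokens=False, return_tensors="pt").input_ids

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
)

sequence = processor.batch_decode(outputs)[0]
sequence = sequence.replace(processor.tokenizer.eos_token, "").replace(processor.tokenizer.pad_token, "")
sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()  # drop the task prompt token
print(processor.token2json(sequence))  # structured info about the document
```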

 

Thursday, January 5, 2023

Best Platform for Python Apps Deployment - Hugging Face Spaces with Docker

I walk through the Hugging Face Spaces Docker SDK deployment option. I was using it to deploy our Streamlit/Python app, Sparrow. So far I'm very happy with the Spaces Docker SDK: simple setup, very stable with good runtime performance, HTTPS out of the box, and content compression out of the box too.