RAG-Ollama

A Retrieval-Augmented Generation (RAG) application that uses local Ollama models to answer questions based on your documents. This project demonstrates how to build a complete RAG pipeline with purely local, self-hosted components.

Architecture

Source: https://www.deepchecks.com/glossary/rag-architecture/

The application consists of three main components:

- RAG Application: Python backend using llama-index for document processing and retrieval (runs as a local Python script)
- ChromaDB: Vector database for storing embeddings (runs in Docker)
- Ollama: Local LLM server providing embedding and text generation capabilities (runs as a local service)
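
To make the data flow between these components concrete, here is a toy, standard-library-only sketch of the retrieve-then-prompt step. It is not the project's code: `embed` is a bag-of-words stand-in for the Ollama embedding model, the in-memory list stands in for ChromaDB, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine retrieved chunks and the question into one prompt for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In the real pipeline the embedding and similarity search are handled by Ollama and ChromaDB; only the overall shape (embed, rank, build a grounded prompt) carries over.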

Prerequisites

- Docker (only for running ChromaDB)
- Python 3.13+
- Ollama installed locally

Installation & Setup

Clone the repository:

```
git clone https://github.com/yourusername/RAG-Ollama.git
cd RAG-Ollama
```

Install Python dependencies:
```
pip install -r requirements.txt
```
Install and start Ollama:
- Follow the installation instructions at ollama.com
- Pull required models:
```
ollama pull nomic-embed-text
ollama pull llama3.2
```
Start ChromaDB using Docker:
```
docker-compose up -d
```
Update the .env file to point to your local ChromaDB:
```
CHROMA_DB_HOST=localhost
```
Place your PDF documents in the assets folder
Run the application:
```
python src/app.py
```
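
Before running the application it can help to confirm that both required models were actually pulled. The following is a minimal sketch, assuming Ollama is listening on its default address `http://localhost:11434`; it uses Ollama's `/api/tags` endpoint, which lists local models. The helper names are hypothetical and not part of this repository.

```python
import json
import urllib.request

REQUIRED_MODELS = ("nomic-embed-text", "llama3.2")


def installed_model_names(tags_payload: dict) -> set[str]:
    """Extract base model names (without the ':tag' suffix) from an /api/tags response."""
    return {m["name"].split(":", 1)[0] for m in tags_payload.get("models", [])}


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list[str]:
    """Return the required models that are not present locally, in the order given."""
    have = installed_model_names(tags_payload)
    return [name for name in required if name not in have]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Query the local Ollama server for its installed models (needs Ollama running)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

Calling `missing_models(fetch_tags())` returns an empty list when both models are available; any name it returns can be fetched with `ollama pull`.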

Usage

Once running, the application will:

- Process any PDFs in the assets folder
- Create or update the vector database
- Start an interactive command line interface
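
The first and last of these steps can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `src/app.py`: `find_pdfs` and `chat_loop` are hypothetical names, the `exit`/`quit` commands are invented for the example, and `answer` stands in for whatever function queries the RAG engine.

```python
from pathlib import Path


def find_pdfs(assets_dir: str = "assets") -> list[Path]:
    """Return PDF files under assets_dir, sorted for a stable processing order."""
    root = Path(assets_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )


def chat_loop(answer, read=input, write=print) -> None:
    """Minimal question/answer loop; 'answer' maps a question string to a reply string."""
    while True:
        try:
            question = read("How can I help you?\n").strip()
        except EOFError:
            break  # end of input (e.g. Ctrl-D) ends the session
        if question.lower() in {"exit", "quit"}:  # hypothetical exit commands
            break
        if question:
            write("Searching for answer...")
            write(answer(question))
```

Passing `read` and `write` as parameters keeps the loop testable without a terminal.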

You can then ask questions about your documents, and the application will use RAG to generate relevant answers.

Example:

How can I help you?
What is the main benefit of retrieval augmented generation?

Searching for answer...
Based on the documents, the main benefit of Retrieval-Augmented Generation (RAG) is that it helps reduce hallucinations in large language models by grounding responses in retrieved documents. This makes the system more accurate and trustworthy.

Configuration

Ollama Models

The application uses these Ollama models by default:

- nomic-embed-text for embeddings
- llama3.2 for text generation

To change models, modify the get_embedding_model() and get_llm() methods in src/engine/ChatEngine.py.
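
As a rough guide to what such a change might look like, here is a sketch of the two functions built on the llama-index Ollama integrations. The actual bodies in `ChatEngine.py` may differ: the module-level layout, the `EMBED_MODEL` and `LLM_MODEL` environment variables, and the timeout value are assumptions made for this example.

```python
import os

DEFAULT_EMBED_MODEL = "nomic-embed-text"
DEFAULT_LLM_MODEL = "llama3.2"
OLLAMA_BASE_URL = "http://localhost:11434"  # Ollama's default local address


def model_names(env=os.environ) -> tuple[str, str]:
    """Resolve (embedding model, LLM) names; the env var names are hypothetical overrides."""
    return (
        env.get("EMBED_MODEL", DEFAULT_EMBED_MODEL),
        env.get("LLM_MODEL", DEFAULT_LLM_MODEL),
    )


def get_embedding_model():
    """Build the embedding client (requires the llama-index-embeddings-ollama package)."""
    from llama_index.embeddings.ollama import OllamaEmbedding

    embed_name, _ = model_names()
    return OllamaEmbedding(model_name=embed_name, base_url=OLLAMA_BASE_URL)


def get_llm():
    """Build the text generation client (requires the llama-index-llms-ollama package)."""
    from llama_index.llms.ollama import Ollama

    _, llm_name = model_names()
    return Ollama(model=llm_name, base_url=OLLAMA_BASE_URL, request_timeout=120.0)
```

Whichever models are chosen must be pulled with `ollama pull` first. Switching the embedding model generally means re-ingesting documents, since vectors from different embedding models are not comparable.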

Vector Database

ChromaDB settings can be adjusted in the docker-compose.yml file.
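
Whatever is set in docker-compose.yml, the client side has to use a matching host and port. The sketch below shows one way to keep those values in one place, assuming the `CHROMA_DB_HOST` variable from the `.env` file; `CHROMA_DB_PORT` and the helper names are hypothetical, and 8000 is ChromaDB's default server port.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ChromaSettings:
    host: str = "localhost"
    port: int = 8000  # ChromaDB's default server port


def load_chroma_settings(env=os.environ) -> ChromaSettings:
    """Read connection settings, falling back to local defaults when unset."""
    return ChromaSettings(
        host=env.get("CHROMA_DB_HOST", "localhost"),
        port=int(env.get("CHROMA_DB_PORT", "8000")),  # hypothetical variable
    )


def connect(settings: ChromaSettings):
    """Open a client to the ChromaDB server (requires chromadb and a running container)."""
    import chromadb

    return chromadb.HttpClient(host=settings.host, port=settings.port)
```

If the published port in docker-compose.yml is changed, the port used by the client needs the same change.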

Project Structure

```
RAG-Ollama/
├── assets/                  # Place your PDFs here
├── src/
│   ├── app.py               # Main application entry point
│   ├── engine/
│   │   └── ChatEngine.py    # Chat engine implementation
│   ├── utils/
│   │   └── file_reader.py   # Document processing utilities
│   └── vectorstore/
│       └── ingestion.py     # Vector database ingestion
├── chroma/                  # ChromaDB persistent storage (will be mounted by the docker-compose.yml)
├── docker-compose.yml       # Docker configuration for ChromaDB
└── requirements.txt         # Python dependencies
```

Features

- 100% local and private RAG pipeline
- PDF document processing
- Interactive chat interface
- Persistent vector database storage

License

MIT License
