Streamlit app using Knowledge Graph RAG with Neo4j and Ollama for multi-hop reasoning and verifiable source citations.
A Streamlit application demonstrating how **Knowledge Graph-based Retrieval-Augmented Generation (RAG)** provides multi-hop reasoning with fully verifiable source attribution.
## What Makes This Different?
Traditional vector-based RAG finds similar text chunks, but struggles with:
- Questions requiring information from multiple documents
- Complex reasoning chains
- Providing verifiable sources for each claim
**Knowledge Graph RAG** solves these by:
1. **Building a structured graph** of entities and relationships from documents
2. **Traversing connections** to find related information (multi-hop reasoning)
3. **Tracking provenance** so every claim links back to its source
## Features
| Feature | Description |
|---------|-------------|
| **Multi-hop Reasoning** | Traverse entity relationships to answer complex questions |
| **Verifiable Citations** | Every claim includes source document and text |
| **Reasoning Trace** | See exactly how the answer was derived |
| **Fully Local** | Uses Ollama for LLM, Neo4j for graph storage |
## Quick Start
### Prerequisites
1. **Ollama** - Local LLM inference
```bash
# Install from https://ollama.ai
ollama pull llama3.2
```
2. **Neo4j** - Knowledge graph database
```bash
# Using Docker
docker run -d \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
neo4j:latest
```
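Before launching the app, it can help to confirm that both services are reachable. The following is a minimal sketch using only the standard library. It assumes Neo4j's Bolt port `7687` as mapped in the Docker command above and Ollama's default port `11434`; the `port_open` helper and the `SERVICES` table are illustrative names, not part of the app.

```python
import socket

# Assumed endpoints: Bolt port from the Docker command above,
# and Ollama's default listening port.
SERVICES = {
    "Neo4j (Bolt)": ("localhost", 7687),
    "Ollama": ("localhost", 11434),
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for name, (host, port) in SERVICES.items():
    status = "reachable" if port_open(host, port) else "not reachable"
    print(f"{name} at {host}:{port} is {status}")
```

An open port only shows that something is listening there; it does not verify credentials or that the `llama3.2` model has been pulled.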
### Installation
```bash
# Navigate to the example directory (inside your clone of the repository)
cd knowledge_graph_rag_citations
# Install dependencies
pip install -r requirements.txt
# Run the app
streamlit run knowledge_graph_rag.py
```
## How It Works
### Step 1: Document → Knowledge Graph
```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│    Document     │ ──► │  LLM Extraction  │ ──► │ Knowledge Graph │
│   (Text/PDF)    │     │ (Entities+Rels)  │     │     (Neo4j)     │
└─────────────────┘     └──────────────────┘     └─────────────────┘
```
The LLM extracts:
- **Entities**: People, organizations, concepts, technologies
- **Relationships**: How entities connect (e.g., "works_for", "created", "uses")
- **Provenance**: Source document and chunk for each extraction
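The three kinds of extracted data can be pictured as records that each carry their own provenance. The sketch below is an illustration only: the `Entity` and `Relationship` classes, the `parse_extraction` helper, and the JSON reply shape are assumptions made for this example and are not taken from `knowledge_graph_rag.py`.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    type: str          # e.g. "PERSON", "ORGANIZATION", "TECHNOLOGY"
    source_doc: str    # provenance: document the entity came from
    source_chunk: str  # provenance: text chunk that mentions it


@dataclass(frozen=True)
class Relationship:
    source: str
    relation: str      # e.g. "WORKS_FOR"
    target: str
    source_doc: str
    source_chunk: str


def parse_extraction(raw_json, doc_name, chunk):
    """Turn a (hypothetical) JSON extraction reply into provenance-tagged records."""
    data = json.loads(raw_json)
    entities = [
        Entity(e["name"], e["type"].upper(), doc_name, chunk)
        for e in data.get("entities", [])
    ]
    relationships = [
        Relationship(r["source"], r["relation"].upper(), r["target"], doc_name, chunk)
        for r in data.get("relationships", [])
    ]
    return entities, relationships


# Hand-written stand-in for an LLM reply; a real reply would need validation.
chunk = "GraphRAG was developed by Microsoft Research."
reply = json.dumps({
    "entities": [
        {"name": "GraphRAG", "type": "technology"},
        {"name": "Microsoft Research", "type": "organization"},
        {"name": "Darren Edge", "type": "person"},
    ],
    "relationships": [
        {"source": "Darren Edge", "relation": "works_for", "target": "Microsoft Research"},
    ],
})

entities, relationships = parse_extraction(reply, "AI Research Paper", chunk)
for rel in relationships:
    print(f"{rel.source} --[{rel.relation}]--> {rel.target} (from {rel.source_doc})")
```

Storing the source document and chunk on every record is what later allows each claim in an answer to be traced back to its origin.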
### Step 2: Query → Multi-hop Traversal
```
┌─────────┐     ┌─────────────┐     ┌─────────────┐     ┌───────────┐
│  Query  │ ──► │ Find Start  │ ──► │  Traverse   │ ──► │  Context  │
│         │     │  Entities   │     │  Relations  │     │ + Sources │
└─────────┘     └─────────────┘     └─────────────┘     └───────────┘
```
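The traversal stage can be sketched without a database as a breadth-first walk over an in-memory edge list. This is an assumption-laden illustration: the app itself stores the graph in Neo4j, and the `EDGES` data, the `DEVELOPED` relation name, and the `find_start_entities` and `traverse` helpers are made up for this example.

```python
from collections import deque

# (source, relation, target, source_doc): toy edges for illustration only.
EDGES = [
    ("Darren Edge", "WORKS_FOR", "Microsoft Research", "AI Research Paper"),
    ("Microsoft Research", "DEVELOPED", "GraphRAG", "AI Research Paper"),
]


def find_start_entities(query, edges):
    """Naive matching: an entity is a start point if its name appears in the query."""
    names = {e[0] for e in edges} | {e[2] for e in edges}
    return sorted(n for n in names if n.lower() in query.lower())


def traverse(start_entities, edges, max_hops=2):
    """Breadth-first walk that follows edges in either direction, up to max_hops."""
    seen = set(start_entities)
    queue = deque((name, 0) for name in start_entities)
    context = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for edge in edges:
            src, _rel, dst, _doc = edge
            if node not in (src, dst):
                continue
            if edge not in context:
                context.append(edge)
            neighbor = dst if node == src else src
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return context


start = find_start_entities("Who developed GraphRAG?", EDGES)
for src, rel, dst, doc in traverse(start, EDGES, max_hops=2):
    print(f"{src} --[{rel}]--> {dst}  (source: {doc})")
```

With `max_hops=1` only the edge touching GraphRAG is returned; raising the limit to 2 also pulls in the edge linking Darren Edge to Microsoft Research, which is the multi-hop step.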
### Step 3: Answer → Verified Citations
```
┌─────────────┐     ┌─────────────┐     ┌──────────────────┐
│   Context   │ ──► │  Generate   │ ──► │   Answer with    │
│  + Sources  │     │   Answer    │     │ [1][2] Citations │
└─────────────┘     └─────────────┘     └──────────────────┘
                                                 │
                                                 ▼
                                        ┌──────────────────┐
                                        │ Citation Details │
                                        │ • Source Doc     │
                                        │ • Source Text    │
                                        │ • Reasoning Path │
                                        └──────────────────┘
```
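One way to keep citations verifiable is to number the sources before generation and then check the markers in the generated answer against that numbering. The sketch below assumes this approach; `number_sources`, `build_prompt`, `unknown_citations`, and the prompt wording are hypothetical and do not describe how `generate_answer_with_citations()` is actually implemented.

```python
import re


def number_sources(context):
    """Assign 1, 2, ... to each distinct (document, text) pair, in first-seen order."""
    citations = {}
    for item in context:
        key = (item["source_doc"], item["source_text"])
        if key not in citations:
            citations[key] = len(citations) + 1
    return citations


def build_prompt(question, citations):
    """Build a prompt that lists the numbered sources ahead of the question."""
    lines = [f"[{n}] ({doc}) {text}" for (doc, text), n in citations.items()]
    return (
        "Answer using only the numbered sources and cite them like [1].\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )


def unknown_citations(answer, citations):
    """Return cited numbers that do not correspond to any provided source."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return sorted(cited - set(citations.values()))


context = [
    {"source_doc": "AI Research Paper",
     "source_text": "GraphRAG is a technique developed by Microsoft Research..."},
    {"source_doc": "AI Research Paper",
     "source_text": "...introduced by researchers including Darren Edge..."},
]
citations = number_sources(context)
prompt = build_prompt("Who developed GraphRAG?", citations)

# Stand-in for the model's reply; a real reply would come from the LLM.
answer = ("GraphRAG was developed by researchers at Microsoft Research [1], "
          "with Darren Edge leading the project [2].")
print(unknown_citations(answer, citations))
```

An empty list means every marker in the answer maps to a supplied source; any number that appears in the list points to a citation the model invented.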
## Usage Example
### 1. Add a Document
Paste or select a sample document. The system extracts entities and relationships:
```
Document: "GraphRAG was developed by Microsoft Research.
           Darren Edge led the project..."
Extracted:
├── Entity: GraphRAG (TECHNOLOGY)
├── Entity: Microsoft Research (ORGANIZATION)
├── Entity: Darren Edge (PERSON)
└── Relationship: Darren Edge --[WORKS_FOR]--> Microsoft Research
```
### 2. Ask a Question
```
Question: "Who developed GraphRAG and what organization are they from?"
```
### 3. Get Verified Answer
```
Answer: GraphRAG was developed by researchers at Microsoft Research [1],
with Darren Edge leading the project [2].
Citations:
[1] Source: AI Research Paper
Text: "GraphRAG is a technique developed by Microsoft Research..."
[2] Source: AI Research Paper
Text: "...introduced by researchers including Darren Edge..."
```
## Configuration
| Setting | Default | Description |
|---------|---------|-------------|
| Neo4j URI | `bolt://localhost:7687` | Neo4j connection string |
| Neo4j User | `neo4j` | Database username |
| Neo4j Password | - | Database password |
| LLM Model | `llama3.2` | Ollama model for extraction/generation |
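The defaults in the table can be mirrored in code as a small settings object. The sketch below is one possible arrangement, not the app's own mechanism: the `Settings` class, the `load_settings` helper, and the environment variable names (`NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, `LLM_MODEL`) are assumptions for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    neo4j_uri: str = "bolt://localhost:7687"
    neo4j_user: str = "neo4j"
    neo4j_password: str = ""   # no default in the table; must be supplied
    llm_model: str = "llama3.2"


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), falling back to defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        neo4j_uri=env.get("NEO4J_URI", defaults.neo4j_uri),
        neo4j_user=env.get("NEO4J_USER", defaults.neo4j_user),
        neo4j_password=env.get("NEO4J_PASSWORD", defaults.neo4j_password),
        llm_model=env.get("LLM_MODEL", defaults.llm_model),
    )


settings = load_settings({"NEO4J_PASSWORD": "password"})
print(settings.neo4j_uri, settings.neo4j_user, settings.llm_model)
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise without touching the real environment.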
## Architecture
```
knowledge_graph_rag_citations/
├── knowledge_graph_rag.py    # Main Streamlit application
├── requirements.txt          # Python dependencies
└── README.md                 # This file
```
### Key Components
- **`KnowledgeGraphManager`**: Neo4j interface for graph operations
- **`extract_entities_with_llm()`**: LLM-based entity/relationship extraction
- **`generate_answer_with_citations()`**: Multi-hop RAG with provenance tracking
## Learn More
This example is inspired by [VeritasGraph](https://github.com/bibinprathap/VeritasGraph), an enterprise-grade framework for:
- On-premise knowledge graph RAG
- Visual reasoning traces (Veritas-Scope)
- LoRA-tuned LLM integration
## License
MIT License