πŸ” Research & Analysis

πŸ” Knowledge Graph RAG with Verifiable Citations

Streamlit app using Knowledge Graph RAG with Neo4j and Ollama for multi-hop reasoning and verifiable source citations.

Added Apr 14, 2026

A Streamlit application demonstrating how **Knowledge Graph-based Retrieval-Augmented Generation (RAG)** provides multi-hop reasoning with fully verifiable source attribution.

## 🎯 What Makes This Different?

Traditional vector-based RAG finds similar text chunks, but struggles with:
- Questions requiring information from multiple documents
- Complex reasoning chains
- Providing verifiable sources for each claim

**Knowledge Graph RAG** solves these by:
1. **Building a structured graph** of entities and relationships from documents
2. **Traversing connections** to find related information (multi-hop reasoning)
3. **Tracking provenance** so every claim links back to its source

## ✨ Features

| Feature | Description |
|---------|-------------|
| 🔗 **Multi-hop Reasoning** | Traverse entity relationships to answer complex questions |
| 📚 **Verifiable Citations** | Every claim includes source document and text |
| 🧠 **Reasoning Trace** | See exactly how the answer was derived |
| 🏠 **Fully Local** | Uses Ollama for LLM, Neo4j for graph storage |

## 🚀 Quick Start

### Prerequisites

1. **Ollama** - Local LLM inference
   ```bash
   # Install from https://ollama.ai
   ollama pull llama3.2
   ```

2. **Neo4j** - Knowledge graph database
   ```bash
   # Using Docker
   docker run -d \
     --name neo4j \
     -p 7474:7474 -p 7687:7687 \
     -e NEO4J_AUTH=neo4j/password \
     neo4j:latest
   ```

### Installation

```bash
# After cloning the repository, navigate to the project directory
cd knowledge_graph_rag_citations

# Install dependencies
pip install -r requirements.txt

# Run the app
streamlit run knowledge_graph_rag.py
```

## 📖 How It Works

### Step 1: Document → Knowledge Graph

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Document      │ ──► │  LLM Extraction  │ ──► │ Knowledge Graph │
│   (Text/PDF)    │     │  (Entities+Rels) │     │    (Neo4j)      │
└─────────────────┘     └──────────────────┘     └─────────────────┘
```

The LLM extracts:
- **Entities**: People, organizations, concepts, technologies
- **Relationships**: How entities connect (e.g., "works_for", "created", "uses")
- **Provenance**: Source document and chunk for each extraction
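
The three kinds of extracted data listed above can be sketched as typed records. This is a minimal illustration that assumes the LLM is prompted to return JSON; the `Entity`, `Relationship`, and `parse_extraction` names and the JSON shape are hypothetical and not taken from the app's code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    type: str          # e.g. "PERSON", "ORGANIZATION", "TECHNOLOGY"
    source_doc: str    # provenance: the document this came from
    source_text: str   # provenance: the chunk that mentions it


@dataclass(frozen=True)
class Relationship:
    source: str
    type: str          # e.g. "WORKS_FOR"
    target: str
    source_doc: str
    source_text: str


def parse_extraction(raw_json: str, doc_name: str, chunk: str):
    """Turn a JSON extraction result into records, attaching provenance to each."""
    data = json.loads(raw_json)
    entities = [
        Entity(e["name"], e["type"].upper(), doc_name, chunk)
        for e in data.get("entities", [])
    ]
    known = {e.name for e in entities}
    relationships = [
        Relationship(r["source"], r["type"].upper(), r["target"], doc_name, chunk)
        for r in data.get("relationships", [])
        # Drop relationships that mention entities the model never declared.
        if r["source"] in known and r["target"] in known
    ]
    return entities, relationships


# Hypothetical LLM output for a single chunk (assumed JSON shape).
chunk = "GraphRAG was developed by Microsoft Research. Darren Edge led the project..."
raw = json.dumps({
    "entities": [
        {"name": "Darren Edge", "type": "person"},
        {"name": "Microsoft Research", "type": "organization"},
    ],
    "relationships": [
        {"source": "Darren Edge", "type": "works_for", "target": "Microsoft Research"},
        {"source": "Darren Edge", "type": "created", "target": "Unknown Tool"},
    ],
})
entities, relationships = parse_extraction(raw, "AI Research Paper", chunk)
print(len(entities), len(relationships))  # 2 1
```

Storing the source document and chunk on every record is what later allows each claim to be traced back to the text that produced it.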

### Step 2: Query → Multi-hop Traversal

```
┌─────────┐     ┌─────────────┐     ┌─────────────┐     ┌───────────┐
│  Query  │ ──► │  Find Start │ ──► │  Traverse   │ ──► │  Context  │
│         │     │   Entities  │     │  Relations  │     │  + Sources│
└─────────┘     └─────────────┘     └─────────────┘     └───────────┘
```
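
The traversal step can be sketched as a breadth-first search over an in-memory edge list standing in for Neo4j. This is an illustrative assumption: the app itself queries Neo4j, and the `Edge` type, the `traverse` function, and the sample edges below are hypothetical.

```python
from collections import deque
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str
    source_doc: str  # provenance carried along with every hop


def traverse(edges: list[Edge], start: str, max_hops: int = 2) -> list[list[Edge]]:
    """Return every path of up to max_hops edges reachable from start."""
    outgoing: dict[str, list[Edge]] = {}
    for edge in edges:
        outgoing.setdefault(edge.source, []).append(edge)

    paths: list[list[Edge]] = []
    queue = deque([(start, [])])  # (current node, path taken so far)
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue
        visited = {start} | {e.target for e in path}
        for edge in outgoing.get(node, []):
            # Skip edges that would revisit a node already on this path.
            if edge.target in visited:
                continue
            new_path = path + [edge]
            paths.append(new_path)
            queue.append((edge.target, new_path))
    return paths


# Hypothetical sample graph.
edges = [
    Edge("Darren Edge", "WORKS_FOR", "Microsoft Research", "AI Research Paper"),
    Edge("Microsoft Research", "CREATED", "GraphRAG", "AI Research Paper"),
]
for path in traverse(edges, "Darren Edge"):
    hops = [f"[{e.relation}] {e.target}" for e in path]
    print(" -> ".join([path[0].source] + hops))
```

Each returned path keeps its edges, so the reasoning trace and the source document for every hop remain available when the context is assembled.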

### Step 3: Answer → Verified Citations

```
┌─────────────┐     ┌─────────────┐     ┌──────────────────┐
│   Context   │ ──► │  Generate   │ ──► │  Answer with     │
│ + Sources   │     │   Answer    │     │  [1][2] Citations│
└─────────────┘     └─────────────┘     └──────────────────┘
                                                │
                                                ▼
                                        ┌──────────────────┐
                                        │ Citation Details │
                                        │ • Source Doc     │
                                        │ • Source Text    │
                                        │ • Reasoning Path │
                                        └──────────────────┘
```
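
One way to prepare the context for this step is to number each unique source passage so the model can cite it as `[n]`. The sketch below assumes facts arrive as `(statement, source_doc, source_text)` tuples; that shape, the `build_context` and `build_prompt` names, and the prompt wording are assumptions for illustration only.

```python
def build_context(
    facts: list[tuple[str, str, str]],
) -> tuple[str, dict[int, dict[str, str]]]:
    """Number each unique source passage; return (context text, citation table)."""
    citations: dict[int, dict[str, str]] = {}
    index_by_source: dict[tuple[str, str], int] = {}
    lines = []
    for statement, source_doc, source_text in facts:
        key = (source_doc, source_text)
        if key not in index_by_source:
            # First time this passage appears: assign the next citation number.
            number = len(index_by_source) + 1
            index_by_source[key] = number
            citations[number] = {"source_doc": source_doc, "source_text": source_text}
        lines.append(f"[{index_by_source[key]}] {statement}")
    return "\n".join(lines), citations


def build_prompt(question: str, context: str) -> str:
    """Assemble a prompt that asks the model to cite the numbered facts."""
    return (
        "Answer the question using only the numbered facts below. "
        "Cite the supporting fact after each claim, e.g. [1].\n\n"
        f"Facts:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical facts gathered during traversal.
facts = [
    ("Microsoft Research developed GraphRAG", "AI Research Paper",
     "GraphRAG is a technique developed by Microsoft Research..."),
    ("Darren Edge works for Microsoft Research", "AI Research Paper",
     "...introduced by researchers including Darren Edge..."),
]
context, citations = build_context(facts)
print(build_prompt("Who developed GraphRAG and what organization are they from?", context))
```

Keying the citation table by passage rather than by fact means two facts drawn from the same passage share one citation number.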

## 🖥️ Usage Example

### 1. Add a Document

Paste or select a sample document. The system extracts entities and relationships:

```
Document: "GraphRAG was developed by Microsoft Research. 
           Darren Edge led the project..."

Extracted:
  ├── Entity: GraphRAG (TECHNOLOGY)
  ├── Entity: Microsoft Research (ORGANIZATION)
  ├── Entity: Darren Edge (PERSON)
  └── Relationship: Darren Edge --[WORKS_FOR]--> Microsoft Research
```

### 2. Ask a Question

```
Question: "Who developed GraphRAG and what organization are they from?"
```

### 3. Get Verified Answer

```
Answer: GraphRAG was developed by researchers at Microsoft Research [1], 
        with Darren Edge leading the project [2].

Citations:
  [1] Source: AI Research Paper
      Text: "GraphRAG is a technique developed by Microsoft Research..."
      
  [2] Source: AI Research Paper  
      Text: "...introduced by researchers including Darren Edge..."
```
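
Because the model writes the `[n]` markers itself, it can help to check them against the sources that were actually supplied before displaying the answer. The following sketch uses hypothetical helper names (`cited_numbers`, `check_citations`) and is not taken from the app.

```python
import re


def cited_numbers(answer: str) -> list[int]:
    """Return citation numbers in order of first appearance, without duplicates."""
    seen: list[int] = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def check_citations(
    answer: str, citations: dict[int, dict[str, str]]
) -> dict[str, list[int]]:
    """Classify markers as valid or unknown, and list sources never cited."""
    cited = cited_numbers(answer)
    return {
        "valid": [n for n in cited if n in citations],
        "unknown": [n for n in cited if n not in citations],
        "unused": [n for n in sorted(citations) if n not in cited],
    }


citations = {
    1: {"source_doc": "AI Research Paper",
        "source_text": "GraphRAG is a technique developed by Microsoft Research..."},
    2: {"source_doc": "AI Research Paper",
        "source_text": "...introduced by researchers including Darren Edge..."},
}
answer = ("GraphRAG was developed by researchers at Microsoft Research [1], "
          "with Darren Edge leading the project [2].")
print(check_citations(answer, citations))
# {'valid': [1, 2], 'unknown': [], 'unused': []}
```

This check only confirms that every marker maps to a supplied source. It does not verify that the quoted text supports the claim, which still requires reading the citation.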

## 🔧 Configuration

| Setting | Default | Description |
|---------|---------|-------------|
| Neo4j URI | `bolt://localhost:7687` | Neo4j connection string |
| Neo4j User | `neo4j` | Database username |
| Neo4j Password | - | Database password (`password` if you used the Docker command above) |
| LLM Model | `llama3.2` | Ollama model for extraction/generation |
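
These settings map naturally onto a small configuration object. As a sketch, they could be read from environment variables with the table's defaults as fallbacks. The variable names (`NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, `OLLAMA_MODEL`) and the `Settings` class are assumptions for this example; the app itself may not read them.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    neo4j_uri: str = "bolt://localhost:7687"
    neo4j_user: str = "neo4j"
    neo4j_password: str = ""   # the table lists no default, so it must be supplied
    llm_model: str = "llama3.2"

    @classmethod
    def from_env(cls, env=None) -> "Settings":
        """Build settings from a mapping (os.environ by default)."""
        env = os.environ if env is None else env
        return cls(
            neo4j_uri=env.get("NEO4J_URI", cls.neo4j_uri),
            neo4j_user=env.get("NEO4J_USER", cls.neo4j_user),
            neo4j_password=env.get("NEO4J_PASSWORD", cls.neo4j_password),
            llm_model=env.get("OLLAMA_MODEL", cls.llm_model),
        )

    def missing(self) -> list[str]:
        """Names of required settings that are still empty."""
        return ["neo4j_password"] if not self.neo4j_password else []


# Passing a plain dict keeps the example independent of the real environment.
settings = Settings.from_env({"NEO4J_PASSWORD": "password"})
print(settings.neo4j_uri, settings.llm_model, settings.missing())
```

Accepting any mapping in `from_env` makes the defaults easy to exercise in tests without touching the process environment.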

## πŸ—οΈ Architecture

```
knowledge_graph_rag_citations/
├── knowledge_graph_rag.py   # Main Streamlit application
├── requirements.txt         # Python dependencies
└── README.md                # This file
```

### Key Components

- **`KnowledgeGraphManager`**: Neo4j interface for graph operations
- **`extract_entities_with_llm()`**: LLM-based entity/relationship extraction
- **`generate_answer_with_citations()`**: Multi-hop RAG with provenance tracking

## 🎓 Learn More

This example is inspired by [VeritasGraph](https://github.com/bibinprathap/VeritasGraph), an enterprise-grade framework for:
- On-premise knowledge graph RAG
- Visual reasoning traces (Veritas-Scope)
- LoRA-tuned LLM integration

## πŸ“ License

MIT License