LightRAG Deep Dive

LightRAG is a graph-enhanced Retrieval-Augmented Generation (RAG) system that goes beyond traditional chunk-based RAG by incorporating knowledge graphs into text indexing and retrieval.

May 18, 2026

white and black typewriter with white printer paper

What is LightRAG?

LightRAG is a graph-enhanced Retrieval-Augmented Generation (RAG) system that goes beyond traditional chunk-based RAG by incorporating knowledge graphs into text indexing and retrieval.

Key Innovation: Instead of just storing flat text chunks, LightRAG extracts entities and relationships to build a knowledge graph that enables structural retrieval alongside semantic search.

How LightRAG Differs from Traditional RAG

Traditional RAG

LightRAG

Does LightRAG Do Contextual Embedding?

No - LightRAG does NOT do contextual embedding the way Anthropic defines it.

Anthropic's Contextual Embedding Approach

Before embedding each chunk, prepend document-level context:

This requires an LLM call per chunk to generate ~50-100 tokens of context that gets baked into the embedding.

LightRAG's Approach

LightRAG's "context" comes from graph structure, not document-level prepending:

Aspect	Contextual Embedding	LightRAG
Context source	LLM summarizes document for each chunk	Entities/relationships extracted from chunk
How context is used	Prepended to chunk before embedding	Stored separately as graph nodes/edges
Retrieval enrichment	Embedded directly in vector	1-hop neighbors fetched at query time

LightRAG trades embedded context for structural context. The graph relationships serve a similar purpose (providing context about what a chunk relates to), but through graph traversal rather than prepended text in embeddings.

Storage Architecture

LightRAG uses 4 storage types for different purposes:

Storage Backend Options

Storage Type	Implementations
KV Storage	JsonFile (default), PostgreSQL, Redis, MongoDB
Vector Storage	NanoVectorDB (default), PostgreSQL (pgvector), Milvus, Qdrant, Faiss, MongoDB
Graph Storage	NetworkX (default), Neo4J, PostgreSQL (AGE), Memgraph
Doc Status	JsonFile (default), PostgreSQL, MongoDB

What Gets Stored Where

1. Original Chunks --> KV_STORAGE + VECTOR_STORAGE

Chunks are stored in TWO places:

# KV_STORAGE (text content):
{
    "chunk_id_123": {
        "content": "The company's revenue grew by 3%...",
        "source_id": "doc-1",
        "file_path": "report.pdf"
    }
}

# VECTOR_STORAGE (embedding):
{
    "id": "chunk_id_123",
    "vector": [0.023, -0.156, 0.089, ...],  # 1536 dims
    "payload": {"source_id": "doc-1"}
}

# KV_STORAGE (text content):
{
    "chunk_id_123": {
        "content": "The company's revenue grew by 3%...",
        "source_id": "doc-1",
        "file_path": "report.pdf"
    }
}

# VECTOR_STORAGE (embedding):
{
    "id": "chunk_id_123",
    "vector": [0.023, -0.156, 0.089, ...],  # 1536 dims
    "payload": {"source_id": "doc-1"}
}

# KV_STORAGE (text content):
{
    "chunk_id_123": {
        "content": "The company's revenue grew by 3%...",
        "source_id": "doc-1",
        "file_path": "report.pdf"
    }
}

# VECTOR_STORAGE (embedding):
{
    "id": "chunk_id_123",
    "vector": [0.023, -0.156, 0.089, ...],  # 1536 dims
    "payload": {"source_id": "doc-1"}
}

2. Entities --> KV_STORAGE + VECTOR_STORAGE + GRAPH_STORAGE

Entities are stored in THREE places:

# KV_STORAGE (entity metadata/description):
{
    "ACME CORP": {
        "entity_type": "company",
        "description": "ACME Corp is a technology company...",
        "source_id": "doc-1"
    }
}

# VECTOR_STORAGE (entity embedding for retrieval):
{
    "id": "entity_ACME_CORP",
    "vector": [0.045, -0.234, 0.112, ...],
    "payload": {"entity_name": "ACME CORP", "entity_type": "company"}
}

# GRAPH_STORAGE (node in graph):
Node("ACME CORP", type="company", description="...")

# KV_STORAGE (entity metadata/description):
{
    "ACME CORP": {
        "entity_type": "company",
        "description": "ACME Corp is a technology company...",
        "source_id": "doc-1"
    }
}

# VECTOR_STORAGE (entity embedding for retrieval):
{
    "id": "entity_ACME_CORP",
    "vector": [0.045, -0.234, 0.112, ...],
    "payload": {"entity_name": "ACME CORP", "entity_type": "company"}
}

# GRAPH_STORAGE (node in graph):
Node("ACME CORP", type="company", description="...")

# KV_STORAGE (entity metadata/description):
{
    "ACME CORP": {
        "entity_type": "company",
        "description": "ACME Corp is a technology company...",
        "source_id": "doc-1"
    }
}

# VECTOR_STORAGE (entity embedding for retrieval):
{
    "id": "entity_ACME_CORP",
    "vector": [0.045, -0.234, 0.112, ...],
    "payload": {"entity_name": "ACME CORP", "entity_type": "company"}
}

# GRAPH_STORAGE (node in graph):
Node("ACME CORP", type="company", description="...")

3. Relationships --> KV_STORAGE + VECTOR_STORAGE + GRAPH_STORAGE

Same pattern - THREE places:

# KV_STORAGE (relation metadata):
{
    "ACME_CORP->GMAIL": {
        "description": "ACME Corp develops Gmail",
        "keywords": "develops operates service",
        "weight": 2.0,
        "source_id": "doc-1"
    }
}

# VECTOR_STORAGE (relation embedding):
{
    "id": "rel_ACME_CORP_GMAIL",
    "vector": [0.078, -0.189, 0.234, ...],
    "payload": {"src": "ACME CORP", "tgt": "GMAIL", "keywords": "develops"}
}

# GRAPH_STORAGE (edge in graph):
Edge("ACME CORP" -> "GMAIL", relation="develops", weight=2.0)

# KV_STORAGE (relation metadata):
{
    "ACME_CORP->GMAIL": {
        "description": "ACME Corp develops Gmail",
        "keywords": "develops operates service",
        "weight": 2.0,
        "source_id": "doc-1"
    }
}

# VECTOR_STORAGE (relation embedding):
{
    "id": "rel_ACME_CORP_GMAIL",
    "vector": [0.078, -0.189, 0.234, ...],
    "payload": {"src": "ACME CORP", "tgt": "GMAIL", "keywords": "develops"}
}

# GRAPH_STORAGE (edge in graph):
Edge("ACME CORP" -> "GMAIL", relation="develops", weight=2.0)

# KV_STORAGE (relation metadata):
{
    "ACME_CORP->GMAIL": {
        "description": "ACME Corp develops Gmail",
        "keywords": "develops operates service",
        "weight": 2.0,
        "source_id": "doc-1"
    }
}

# VECTOR_STORAGE (relation embedding):
{
    "id": "rel_ACME_CORP_GMAIL",
    "vector": [0.078, -0.189, 0.234, ...],
    "payload": {"src": "ACME CORP", "tgt": "GMAIL", "keywords": "develops"}
}

# GRAPH_STORAGE (edge in graph):
Edge("ACME CORP" -> "GMAIL", relation="develops", weight=2.0)

Why the Redundancy?

Storage	Purpose
KV Storage	Fast key-based lookup, stores full text/descriptions
Vector Storage	Semantic similarity search via embeddings
Graph Storage	Structural traversal (neighbors, paths, subgraphs)

Data Flow Diagram

Query Modes Explained

LightRAG supports multiple retrieval strategies:

Mode	What it does	Best for
`naive`	Just chunk vector search	Simple lookups
`local`	Entity-focused + their relations	"Who is X?" questions
`global`	Relation/theme-focused	"How does X relate to Y?"
`hybrid`	Local + Global (no chunks)	Graph-only queries
`mix`	Local + Global + Naive + Reranker	Everything (recommended)

Dual-Level Retrieval Paradigm

LightRAG uses two retrieval levels to handle different query types:

Level	Purpose	Example Query
Low-Level (Local)	Specific entities + their attributes/relationships	"Who wrote Pride and Prejudice?"
High-Level (Global)	Broader topics, themes, aggregated insights	"How does AI influence modern education?"

Mix Mode - All Engines Firing

Mix mode combines all retrieval strategies:

Mix Mode Query Flow

Mode Comparison

	Mix	Hybrid	Naive
Latency	Highest	Medium	Fastest
Completeness	Best	Good	Basic
Token cost	Highest	Medium	Lowest

Mix mode is the "I want the best answer, cost be damned" option.

Reading LightRAG Logs

Example log from a mix mode query:

LLM extracted keywords from your query and cached the result.

24 parallel embedding workers spun up for vector searches.

Low-level retrieval: extracted local keywords, found 40 matching entities, retrieved 116 connected relations.

High-level retrieval: extracted global keywords, found 40 matching relations, retrieved 47 connected entities.

Traditional chunk vector search found 20 chunks.

Combined and deduplicated results from all retrieval modes.

Reranker filtered down to top 20 most relevant chunks.

Token budget trimmed to 13 chunks for LLM context window.

Tuning Token Budgets

QueryParam Token Controls

from lightrag import QueryParam

param = QueryParam(
    mode="mix",
    max_total_tokens=30000,      # Total budget (default)
    max_entity_tokens=6000,      # Budget for entity descriptions
    max_relation_tokens=8000,    # Budget for relation descriptions
    chunk_top_k=20,              # Number of chunks to retrieve
    top_k=60,                    # Number of entities/relations
)

from lightrag import QueryParam

param = QueryParam(
    mode="mix",
    max_total_tokens=30000,      # Total budget (default)
    max_entity_tokens=6000,      # Budget for entity descriptions
    max_relation_tokens=8000,    # Budget for relation descriptions
    chunk_top_k=20,              # Number of chunks to retrieve
    top_k=60,                    # Number of entities/relations
)

from lightrag import QueryParam

param = QueryParam(
    mode="mix",
    max_total_tokens=30000,      # Total budget (default)
    max_entity_tokens=6000,      # Budget for entity descriptions
    max_relation_tokens=8000,    # Budget for relation descriptions
    chunk_top_k=20,              # Number of chunks to retrieve
    top_k=60,                    # Number of entities/relations
)

To Keep All Reranked Chunks

If reranker outputs 20 chunks but token budget truncates to 13:

param = QueryParam(
    mode="mix",
    max_total_tokens=50000,      # Increase total budget
    # Or reduce entity/relation budgets:
    max_entity_tokens=4000,
    max_relation_tokens=6000,
)

param = QueryParam(
    mode="mix",
    max_total_tokens=50000,      # Increase total budget
    # Or reduce entity/relation budgets:
    max_entity_tokens=4000,
    max_relation_tokens=6000,
)

param = QueryParam(
    mode="mix",
    max_total_tokens=50000,      # Increase total budget
    # Or reduce entity/relation budgets:
    max_entity_tokens=4000,
    max_relation_tokens=6000,
)

Environment Variables (Server-Wide)

MAX_TOTAL_TOKENS=50000
MAX_ENTITY_TOKENS=6000
MAX_RELATION_TOKENS=8000
CHUNK_TOP_K=20
TOP_K=60

MAX_TOTAL_TOKENS=50000
MAX_ENTITY_TOKENS=6000
MAX_RELATION_TOKENS=8000
CHUNK_TOP_K=20
TOP_K=60

MAX_TOTAL_TOKENS=50000
MAX_ENTITY_TOKENS=6000
MAX_RELATION_TOKENS=8000
CHUNK_TOP_K=20
TOP_K=60

Go Big (Watch Your Context Window)

param = QueryParam(
    mode="mix",
    max_total_tokens=100000,
    max_entity_tokens=20000,
    max_relation_tokens=30000,
    chunk_top_k=50,
    top_k=100,
)

param = QueryParam(
    mode="mix",
    max_total_tokens=100000,
    max_entity_tokens=20000,
    max_relation_tokens=30000,
    chunk_top_k=50,
    top_k=100,
)

param = QueryParam(
    mode="mix",
    max_total_tokens=100000,
    max_entity_tokens=20000,
    max_relation_tokens=30000,
    chunk_top_k=50,
    top_k=100,
)

Warning: If using a 32k context model and sending 50k tokens, it will fail.

LightRAG vs GraphRAG

Feature	GraphRAG	LightRAG
Retrieval	Traverses communities (expensive)	Vector-based keyword matching (fast)
Tokens per query	~610,000 (Legal dataset)	<100
API calls	Hundreds	1
Incremental updates	Must rebuild all community reports	Seamlessly merges new nodes/edges

LLM Requirements

Minimum 32B parameters recommended
32KB context minimum (64KB recommended)
Supports: OpenAI, Ollama, Azure, Gemini, HuggingFace, LlamaIndex

Example Storage Configuration

Split storage across specialized backends:

LIGHTRAG_KV_STORAGE=PGKVStorage              # PostgreSQL
LIGHTRAG_VECTOR_STORAGE=QdrantVectorDBStorage # Qdrant  
LIGHTRAG_GRAPH_STORAGE=Neo4JStorage          # Neo4J
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage # PostgreSQL

LIGHTRAG_KV_STORAGE=PGKVStorage              # PostgreSQL
LIGHTRAG_VECTOR_STORAGE=QdrantVectorDBStorage # Qdrant  
LIGHTRAG_GRAPH_STORAGE=Neo4JStorage          # Neo4J
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage # PostgreSQL

LIGHTRAG_KV_STORAGE=PGKVStorage              # PostgreSQL
LIGHTRAG_VECTOR_STORAGE=QdrantVectorDBStorage # Qdrant  
LIGHTRAG_GRAPH_STORAGE=Neo4JStorage          # Neo4J
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage # PostgreSQL

PostgreSQL stores:

Chunk text content (kv table)
Entity/relation descriptions (kv table)
Document processing status (doc_status table)

Qdrant stores:

Chunk embeddings
Entity embeddings
Relation embeddings

Neo4J stores:

Entity nodes with properties
Relationship edges with properties
Graph structure for traversal

Key Takeaways

LightRAG chunks AND extracts - It doesn't throw away original text
Triple storage for entities/relations - KV + Vector + Graph, each serving a purpose
Dual-level retrieval - Local (entities) + Global (relations) for different query types
Mix mode is comprehensive - Combines all strategies with reranking
Graph provides structural context - 1-hop neighbors enrich retrieval without contextual embedding
Token budgets control output - Tune max_total_tokens to keep more content

References

Systems

May 18, 2026

OAuth 2.0 vs OIDC: A Complete Guide for Developers

Systems

May 18, 2026

OAuth 2.0 vs OIDC: A Complete Guide for Developers

Systems

May 18, 2026