BAEM1N.DEV: AI, RAG, LLMOps Development Blog

Neo4j vs Apache AGE Benchmark — Same Cypher, Same Data, Different Results

Disclosure: The author maintains langchain-age. All benchmark code is open-source and reproducible.

TL;DR: On a 1K-node/2K-edge graph with identical Cypher queries, AGE wins 6 out of 8 tests (point lookup 2.2x, CREATE 3.7x, schema 2.1x faster). Neo4j wins deep traversals by 11–15x (3+ hops). For RAG workloads (1–2 hops + CRUD), AGE is faster and free. For deep graph analytics, Neo4j is clearly superior.

Series

This is Part 2 of the langchain-age series.

  1. GraphRAG with Just PostgreSQL — Overview + Setup
  2. Neo4j vs Apache AGE Benchmark (this post)
  3. Mastering Vector Search — Hybrid, MMR, Filtering
  4. Building a GraphRAG Pipeline — Vector + Graph Integration
  5. Full AI Agent Stack on One PostgreSQL — LangGraph Integration

What You’ll Learn

Why This Benchmark Matters

Most comparisons between Neo4j and Apache AGE are qualitative — “AGE is convenient because it runs on PostgreSQL”, “Neo4j is faster because it’s a native graph database”. No numbers.

This benchmark runs identical queries under identical conditions to provide quantitative data.

Test Environment

| Item | Neo4j | AGE |
|---|---|---|
| Version | Neo4j 5 (Docker) | PostgreSQL 18 + AGE 1.7.0 (Docker) |
| Driver | langchain-neo4j (neo4j Python driver) | langchain-age (psycopg3) |
| Resources | Same machine, Docker container | Same machine, Docker container |

Dataset

The graph has 1K nodes and 2K edges. Nodes carry an idx property and are connected by LINK relationships, as used in the queries below.

Methodology

# 3 warmup runs, then N iterations, reporting p50 (median)
def bench(fn, iterations=50):
    for _ in range(3): fn()  # warmup
    times = [measure(fn) for _ in range(iterations)]
    return median(times)
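
The snippet above leaves measure() and median() undefined. A self-contained sketch of the same procedure follows, assuming measure() is a wall-clock timer built on time.perf_counter() and that results are reported in milliseconds. Both are assumptions, not confirmed details of benchmarks/bench.py.

```python
import time
from statistics import median


def measure(fn):
    """Return the wall-clock duration of one call to fn, in milliseconds."""
    start = time.perf_counter()
    fn()
    return (time.perf_counter() - start) * 1000.0


def bench(fn, iterations=50, warmup=3):
    """Run `warmup` untimed calls, then return the median (p50) of `iterations` timed calls."""
    for _ in range(warmup):
        fn()
    return median(measure(fn) for _ in range(iterations))


if __name__ == "__main__":
    # Stand-in workload (assumption); in the real benchmark fn would issue one query.
    p50 = bench(lambda: sum(range(10_000)))
    print(f"p50 = {p50:.3f} ms")
```

Reporting the median rather than the mean keeps a single slow iteration (for example a garbage-collection pause) from skewing the result.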

Results

Cypher vs Cypher (Fair Comparison)

| Test | Neo4j p50 | AGE p50 | Winner | Factor |
|---|---|---|---|---|
| Point lookup (MATCH by property) | 2.0ms | 0.9ms | AGE | 2.2x |
| 1-hop traversal | 1.7ms | 1.0ms | AGE | 1.7x |
| 3-hop traversal | 1.7ms | 25.8ms | Neo4j | 14.9x |
| 6-hop traversal | 2.4ms | 27.7ms | Neo4j | 11.6x |
| Full count (aggregation) | 1.5ms | 1.0ms | AGE | 1.5x |
| Single CREATE | 3.3ms | 0.9ms | AGE | 3.7x |
| Batch CREATE (100 nodes) | 2.6ms | 1.1ms | AGE | 2.4x |
| Schema introspection | 16.6ms | 7.9ms | AGE | 2.1x |

Where AGE Wins (6 of 8)

Point lookup (2.2x): PostgreSQL’s B-tree indexes are efficient for single-property lookups.

1-hop traversal (1.7x): Shallow relationship traversal works fine with PostgreSQL JOINs. This is the most common pattern in RAG.

Aggregation (1.5x): PostgreSQL’s query planner excels at full-scan operations like count(n).

Single CREATE (3.7x): PostgreSQL’s transaction overhead is lighter, which helps when writing LLM responses to the graph in real time.

Batch CREATE (2.4x): Both use UNWIND with 100 nodes. AGE is still faster.

Schema introspection (2.1x): langchain-age queries ag_catalog system tables directly via SQL. Neo4j goes through APOC metadata.

Where Neo4j Wins (2 of 8)

3-hop traversal (14.9x): Neo4j’s core strength — index-free adjacency. Relationships are physical pointers, no JOINs needed.

6-hop traversal (11.6x): The architectural advantage persists at depth, although on this graph the gap does not widen between 3 and 6 hops (14.9x vs 11.6x). AGE requires a PostgreSQL JOIN per hop.

Analysis: Why the Difference

Where AGE is Fast — PostgreSQL’s Strengths

AGE stores graph data in PostgreSQL tables (graph_name."LabelName"), so all PostgreSQL optimisations apply directly.

Where Neo4j is Fast — Native Graph Architecture

Neo4j stores relationships as physical pointers. Moving from node A to node B requires no index lookup or JOIN — just follow the pointer.

![Neo4j pointer traversal vs AGE SQL JOIN comparison](../../../assets/images/langchain-age/neo4j-vs-age-traversal-en.png)

The difference is negligible at 1 hop but exceeds an order of magnitude at 3+ hops: 1.7ms vs 25.8ms at 3 hops in this benchmark.
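
The contrast can be modelled in a few lines of Python. This is a toy illustration, not how either engine is implemented: `adjacency` stands in for per-node relationship pointers, and `edges` stands in for an edge table that is matched against the current frontier at every hop (with no index, for simplicity). All names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy graph: a ring of 8 nodes, each linked to the next two.
edges = [(i, (i + 1) % 8) for i in range(8)] + [(i, (i + 2) % 8) for i in range(8)]

# "Pointer" model: each node holds direct references to its neighbours.
adjacency = defaultdict(list)
for src, dst in edges:
    adjacency[src].append(dst)


def hop_by_pointer(start, hops):
    """Follow stored neighbour lists; each hop only touches the frontier's own edges."""
    frontier = {start}
    for _ in range(hops):
        frontier = {dst for node in frontier for dst in adjacency[node]}
    return frontier


def hop_by_join(start, hops):
    """Match the frontier against the whole edge list at every hop (an unindexed join)."""
    frontier = {start}
    for _ in range(hops):
        frontier = {dst for src, dst in edges if src in frontier}
    return frontier


# Both strategies reach the same nodes; they differ in how much work each hop costs.
assert hop_by_pointer(0, 3) == hop_by_join(0, 3)
print(sorted(hop_by_pointer(0, 3)))
```

In a real PostgreSQL deployment an index on the edge table's start column narrows each join, so the gap is likely smaller than this unindexed toy suggests.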

AGE’s Escape Hatch: traverse() + WITH RECURSIVE

The reason AGE’s Cypher is slow on deep traversals is clear: AGE’s Cypher-to-SQL translator expands MATCH (a)-[:LINK*6]->(b) into a 6-way self-join. PostgreSQL’s query planner doesn’t optimise this well.
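
To make the shape of such a query concrete, the sketch below builds a chained self-join for a fixed hop count. It is an illustration only and not the SQL that AGE actually generates: the table name `edge`, the columns `start_id`/`end_id`, and the function name are simplified, hypothetical stand-ins.

```python
def fixed_length_join_sql(hops, start_id):
    """Build SQL that finds nodes exactly `hops` edges from start_id via chained self-joins."""
    if hops < 1:
        raise ValueError("hops must be >= 1")
    # One alias of the edge table per hop: e1, e2, ..., e<hops>.
    joins = [
        f"JOIN edge e{i} ON e{i}.start_id = e{i - 1}.end_id"
        for i in range(2, hops + 1)
    ]
    return "\n".join(
        [
            f"SELECT e{hops}.end_id",
            "FROM edge e1",
            *joins,
            f"WHERE e1.start_id = {int(start_id)}",
        ]
    )


# Six aliases of the same table, i.e. a 6-way self-join (five JOIN clauses).
print(fixed_length_join_sql(6, 0))
```

Each result row of a query like this corresponds to one path, so the intermediate result grows with the number of paths rather than the number of distinct nodes.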

But AGE has an escape hatch that Neo4j doesn’t — the data lives in PostgreSQL tables, so you can bypass Cypher and write SQL directly. langchain-age’s traverse() method uses PostgreSQL WITH RECURSIVE CTEs, which the query planner handles far more efficiently.

The generated SQL:

WITH RECURSIVE traverse AS (
    -- Find start nodes
    SELECT e.end_id AS node_id, 1 AS depth
    FROM graph."LINK" e
    JOIN graph."N" n ON e.start_id = n.id
    WHERE n.properties::text::jsonb->>'idx' = '0'

    UNION

    -- Recurse: next hop
    SELECT e.end_id, t.depth + 1
    FROM traverse t
    JOIN graph."LINK" e ON e.start_id = t.node_id
    WHERE t.depth < 6
)
SELECT DISTINCT depth, node_id FROM traverse;
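
The same recursive pattern can be tried without a PostgreSQL instance. The sketch below runs an equivalent query against an in-memory SQLite database from Python's standard library. The `edge` table and the tiny chain-plus-shortcut graph are hypothetical stand-ins for AGE's edge table, and the start node is passed by id rather than looked up by property.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (start_id INTEGER, end_id INTEGER)")
# Hypothetical graph: chain 0->1->2->3->4 plus a shortcut 0->2.
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)],
)

QUERY = """
WITH RECURSIVE traverse(node_id, depth) AS (
    -- Base case: first hop from the start node
    SELECT end_id, 1 FROM edge WHERE start_id = :start
    UNION
    -- Recursive case: extend every row found so far by one hop
    SELECT e.end_id, t.depth + 1
    FROM traverse t
    JOIN edge e ON e.start_id = t.node_id
    WHERE t.depth < :max_depth
)
SELECT depth, node_id FROM traverse ORDER BY depth, node_id
"""


def traverse(start, max_depth):
    """Return (depth, node_id) pairs reachable from start within max_depth hops."""
    return conn.execute(QUERY, {"start": start, "max_depth": max_depth}).fetchall()


rows = traverse(0, 3)
print(rows)
```

Because UNION discards duplicate rows, a (node_id, depth) pair that is reached along several paths is expanded only once.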

The key difference:

Measured on the same 1K-node graph:

| Depth | AGE Cypher | AGE traverse() | Improvement | Neo4j Cypher | traverse() vs Neo4j |
|---|---|---|---|---|---|
| 3-hop | 26.4ms | 1.3ms | 21x | 1.7ms | AGE 1.3x faster |
| 6-hop | 28.2ms | 1.4ms | 19x | 2.4ms | AGE 1.7x faster |

With traverse(), AGE is faster than Neo4j’s Cypher even on the 3- and 6-hop traversals of this graph.

This is possible because of AGE’s architectural property — AGE data is stored in ordinary PostgreSQL tables, accessible via raw SQL. Neo4j uses its own storage engine, so there’s no way to bypass its Cypher engine. The same optimisation cannot be applied to Neo4j.

Usage:

# Cypher *6: 28.2ms
graph.query("MATCH (a:Node {idx: 0})-[:LINK*6]->(b) RETURN count(b)")

# traverse(): 1.4ms, 19x faster; returns every node reached at depths 1-6
results = graph.traverse(
    start_label="Node",
    start_filter={"idx": 0},
    edge_label="LINK",
    max_depth=6,
    direction="outgoing",      # also "incoming", "both"
    return_properties=True,    # False = node IDs only (faster)
)
# [{"depth": 1, "node_id": 123, "properties": {"name": "..."}}, ...]
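
A small sketch of post-processing that output, for instance grouping node IDs by depth. It assumes only the result shape shown in the comment above (a list of dicts with `depth` and `node_id` keys); the sample rows and the helper name are hypothetical.

```python
from collections import defaultdict


def group_by_depth(results):
    """Map each depth to the sorted list of node IDs found at that depth."""
    grouped = defaultdict(list)
    for row in results:
        grouped[row["depth"]].append(row["node_id"])
    return {depth: sorted(ids) for depth, ids in sorted(grouped.items())}


# Hypothetical rows in the shape shown above (properties omitted).
sample = [
    {"depth": 1, "node_id": 123},
    {"depth": 2, "node_id": 456},
    {"depth": 1, "node_id": 124},
]
print(group_by_depth(sample))
```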

Recommended usage:

| Pattern | Method | Why |
|---|---|---|
| 1–3 hops | graph.query() (Cypher) | Readable and fast enough |
| 4+ hops | graph.traverse() | 10–22x performance gain |
| Complex start conditions | graph.create_property_index() first | Index accelerates start-node lookup |

Benchmark Limitations

FAQ

How compatible is AGE’s Cypher with Neo4j?

AGE implements much of the openCypher spec, including MATCH, CREATE, MERGE, DELETE and UNWIND. APOC procedures are not available. All queries in this benchmark ran identically on both systems without modification.

Can AGE handle billions of nodes?

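Storage is not the limiting factor: a single PostgreSQL table can grow to 32TB by default. This benchmark only covers a 1K-node graph, so behaviour at that scale is not measured here. Shallow traversals (1–2 hops) should scale well with indexes, while deep Cypher traversals favour Neo4j.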

This benchmark focuses on graph queries. Both systems support vector search (pgvector / Neo4j Vector Index), which deserves its own benchmark.

Can traverse() be applied to Neo4j?

No. traverse() works because AGE stores data in PostgreSQL tables, allowing raw SQL access. Neo4j uses its own storage engine — you cannot run SQL against it.

Conclusion

| Workload | Recommendation | Reason |
|---|---|---|
| RAG (1–2 hops + CRUD) | AGE | Point lookup 2.2x, CREATE 3.7x faster |
| Social network analysis (3–6 hops) | Neo4j | Deep traversal 11–15x faster |
| Cost optimisation | AGE | $0 vs $15K+/year |
| Existing PostgreSQL infrastructure | AGE | Extension install only |
| Enterprise support | Neo4j | SLA, 24/7 support |

Most LLM/RAG applications need only 1–2 hops. In this range, AGE is faster than Neo4j, free, and simpler to operate.

Reproduce

git clone https://github.com/BAEM1N/langchain-age.git
cd langchain-age

# AGE container
cd docker && docker compose up -d && cd ..

# Neo4j container
docker run -d --name neo4j-bench -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/testpassword neo4j:5

# Run benchmark
pip install -e ".[dev]" langchain-neo4j
python benchmarks/bench.py

The benchmark script is available at benchmarks/bench.py.

External Resources

Key Takeaways


langchain-age is MIT licensed. Benchmark code and data are publicly available on GitHub for anyone to reproduce.

