BAEM1N.DEV: AI, RAG, LLMOps development blog

GraphRAG with Just PostgreSQL — No Neo4j Required

Disclosure: The author maintains langchain-age.

TL;DR: Combine Apache AGE (graph) + pgvector (vector) on PostgreSQL to get the same GraphRAG capabilities as Neo4j + Pinecone — with 1 database, 1 connection string, 1 backup pipeline. The langchain-age package plugs directly into the LangChain ecosystem.

Series

This is Part 1 of the langchain-age series.

  1. GraphRAG with Just PostgreSQL (this post)
  2. Neo4j vs Apache AGE Benchmark — Performance Data
  3. Mastering Vector Search — Hybrid, MMR, Filtering
  4. Building a GraphRAG Pipeline — Vector + Graph Integration
  5. Full AI Agent Stack on One PostgreSQL — LangGraph Integration

What You’ll Be Able to Do

The Problem: Do You Really Need Two Databases for GraphRAG?

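GraphRAG retrieves context from both a knowledge graph and vector embeddings before generating an answer. The most common approach pairs Neo4j with a separate vector database such as Pinecone, but that means two databases to deploy, two backup pipelines to maintain, and GPL or commercial licensing to account for.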

If you already run PostgreSQL, the same instance can handle both graph and vector workloads.

The Answer: Apache AGE + pgvector

Apache AGE adds Cypher graph queries to PostgreSQL. pgvector adds vector similarity search. Both run inside the same PostgreSQL instance.
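
Under the hood, AGE exposes Cypher through a SQL function, cypher(graph_name, query), and every returned column is declared with the agtype type in an AS (...) clause. langchain-age hides this wrapping; the sketch below only illustrates the shape of the SQL that reaches PostgreSQL. The helper name wrap_cypher and its validation rules are assumptions made for illustration, not part of langchain-age.

```python
def wrap_cypher(graph_name: str, cypher: str, columns: list[str]) -> str:
    """Build the SQL that AGE expects around a Cypher query.

    Hypothetical helper for illustration only.
    """
    if "$$" in cypher:
        raise ValueError("query must not contain the $$ dollar-quote delimiter")
    if not graph_name.replace("_", "").isalnum():
        raise ValueError("graph name must be alphanumeric or underscore")
    # AGE needs a column definition list even when the query returns nothing.
    names = columns or ["result"]
    col_defs = ", ".join(f"{name} agtype" for name in names)
    return (
        f"SELECT * FROM cypher('{graph_name}', $$ {cypher} $$) "
        f"AS ({col_defs});"
    )


sql = wrap_cypher(
    "company_kg",
    "MATCH (p:Person) RETURN p.name, p.role",
    ["person_name", "person_role"],
)
print(sql)
```

Because the result is ordinary SQL, it can sit in the same transaction as pgvector queries against regular tables.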

PostgreSQL unified architecture — AGE + pgvector + LangGraph

One database. One connection string.

Neo4j vs PostgreSQL+AGE Comparison

| | Neo4j + Pinecone | PostgreSQL + AGE + pgvector |
| --- | --- | --- |
| Databases | 2 (graph + vector) | 1 |
| Licensing | GPL + proprietary | Apache 2.0 + PostgreSQL License |
| HA cost | $15K+/year + vector DB pricing | $0 (PG native HA) |
| LangChain integration | langchain-neo4j | langchain-age |
| Deployment | 2 clusters to manage | 1 PostgreSQL |
| Backup | 2 pipelines | 1 pg_dump |
| Long-term memory | Separate DB or service | Same DB (LangGraph PostgresStore) |

Setup in 5 Minutes

Step 1: Start the Database

git clone https://github.com/BAEM1N/langchain-age.git
cd langchain-age/docker
docker compose up -d

One container. AGE + pgvector + pg_trgm pre-installed.

Step 2: Install

pip install "langchain-age[all]" langchain-openai

Step 3: Build a Knowledge Graph

from langchain_age import AGEGraph

graph = AGEGraph(
    "host=localhost port=5433 dbname=langchain_age user=langchain password=langchain",
    graph_name="company_kg",
)

# Same Cypher as Neo4j — no new syntax to learn
graph.query("CREATE (:Person {name: 'Alice', role: 'CTO'})")
graph.query("CREATE (:Person {name: 'Bob', role: 'Engineer'})")
graph.query("CREATE (:Product {name: 'AGE', desc: 'Graph extension for PostgreSQL'})")
graph.query(
    "MATCH (a:Person {name: 'Alice'}), (p:Product {name: 'AGE'}) "
    "CREATE (a)-[:LEADS]->(p)"
)
graph.query(
    "MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'}) "
    "CREATE (a)-[:MANAGES]->(b)"
)

Step 4: Vectorize Graph Nodes

from langchain_age import AGEVector
from langchain_openai import OpenAIEmbeddings

store = AGEVector.from_existing_graph(
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    connection_string="host=localhost port=5433 ...",
    graph_name="company_kg",
    node_label="Person",
    text_node_properties=["name", "role"],
    collection_name="person_vectors",
)

Step 5: GraphRAG — Vector Search + Graph Context

# Step 1: Find relevant nodes via vector search
docs = store.similarity_search("engineering leadership", k=2)

# Step 2: Expand context via graph relationships
for doc in docs:
    label = doc.metadata["node_label"]
    neighbors = graph.query(
        f"MATCH (n:{label})-[r]->(m) RETURN type(r) AS rel, m.name AS name"
    )
    print(f"{doc.page_content}: {neighbors}")
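
The loop above matches on the node label alone, so each retrieved document is paired with the outgoing relationships of every node carrying that label. To expand only the retrieved node, the pattern can also filter on an identifying property. A minimal sketch, assuming the node's name is available to the caller; the helper neighbor_query is hypothetical and not part of langchain-age, and how the name is read from the document's metadata is left open because the metadata keys are not shown in this post.

```python
import re

_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def neighbor_query(label: str, name: str) -> str:
    """Build a Cypher query for the outgoing neighbors of one named node."""
    # Labels cannot be passed as parameters, so validate before interpolating.
    if not _LABEL_RE.match(label):
        raise ValueError(f"unsafe label: {label!r}")
    # Escape backslashes and single quotes for a Cypher string literal.
    safe_name = name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"MATCH (n:{label} {{name: '{safe_name}'}})-[r]->(m) "
        "RETURN type(r) AS rel, m.name AS name"
    )


print(neighbor_query("Person", "Alice"))
```

The returned string would be passed to graph.query(...) in place of the label-only pattern.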

Step 6: LLM-Powered Cypher QA

from langchain_age import AGEGraphCypherQAChain
from langchain_openai import ChatOpenAI

chain = AGEGraphCypherQAChain.from_llm(
    ChatOpenAI(model="gpt-4o-mini"),
    graph=graph,
    allow_dangerous_requests=True,
)

answer = chain.run("Who does Alice manage?")
# "Alice manages Bob, who is an Engineer."

Why AGE Instead of Neo4j

1. You Already Have PostgreSQL

Most applications already run PostgreSQL. Adding AGE is CREATE EXTENSION age; — not deploying a new database cluster.
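
In practice the extension's binaries must already be installed on the server, and each session that runs Cypher needs AGE loaded and ag_catalog on the search path. The SQL statements below follow the Apache AGE documentation; collecting them in a Python list and the helper name age_setup_statements are illustrative assumptions, and the Docker image from Step 1 may already take care of some of these steps.

```python
def age_setup_statements(graph_name: str) -> list[str]:
    """Return SQL that enables AGE and creates a graph (hypothetical helper)."""
    if not graph_name.replace("_", "").isalnum():
        raise ValueError("graph name must be alphanumeric or underscore")
    return [
        # Once per database: register the extension.
        "CREATE EXTENSION IF NOT EXISTS age;",
        # Once per session: load the library and expose AGE's catalog functions.
        "LOAD 'age';",
        'SET search_path = ag_catalog, "$user", public;',
        # Once per graph: create it.
        f"SELECT create_graph('{graph_name}');",
    ]


for statement in age_setup_statements("company_kg"):
    print(statement)
```

With any DB-API connection (psycopg, for example), the statements can be executed in order through a cursor.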

2. Licensing Freedom

Neo4j Community is GPL. If you embed it in a distributed product, GPL propagates. Neo4j Enterprise requires a commercial license ($15K+/year).

Apache AGE is Apache 2.0, a permissive license with no copyleft obligations.

3. Total Cost Comparison

| Scenario | Neo4j | AGE |
| --- | --- | --- |
| Dev/test | Free (Community, single node) | Free |
| Production HA | $15K+/year (Enterprise) or AuraDB ($65/GB/month) | $0 (PostgreSQL Patroni/repmgr) |
| Vector search | Separate vector DB needed | Included (pgvector) |
| Long-term memory | Separate service needed | Included (LangGraph PostgresStore) |

4. One Backup, One Monitor, One Team

Your PostgreSQL DBA already knows how to set up streaming replication, run pg_dump, monitor with pg_stat_statements, and manage connection pooling with PgBouncer. No new operational expertise required.

5. Performance Is Fine for RAG — Measured Data

“Isn’t AGE slower than Neo4j?” Here are actual measurements on the same 1K-node / 2K-edge graph, identical Cypher:

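| Test | Neo4j p50 | AGE p50 | Winner |
| --- | --- | --- | --- |
| Point lookup | 2.0ms | 0.9ms | AGE 2.2x |
| 1-hop traversal | 1.7ms | 1.0ms | AGE 1.7x |
| Single CREATE | 3.3ms | 0.9ms | AGE 3.7x |
| 3-hop traversal | 1.7ms | 25.8ms | Neo4j 14.9x |
| 6-hop traversal | 2.4ms | 27.7ms | Neo4j 11.6x |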

In this benchmark, AGE is faster than Neo4j for the patterns most common in RAG (point lookups, 1-hop traversals, single writes). Neo4j wins only on deep traversals of 3 or more hops.
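
The "Winner" factors are simply the ratio of the two p50 latencies. The short sketch below recomputes them from the rounded values in the table above. It reproduces the first three factors exactly, while the deep-traversal rows come out at roughly 15.2x and 11.5x, presumably because the published factors were derived from unrounded measurements. The speedup helper is written for this illustration only.

```python
# p50 latencies in milliseconds, copied from the table above: (Neo4j, AGE).
p50_ms = {
    "Point lookup": (2.0, 0.9),
    "1-hop traversal": (1.7, 1.0),
    "Single CREATE": (3.3, 0.9),
    "3-hop traversal": (1.7, 25.8),
    "6-hop traversal": (2.4, 27.7),
}


def speedup(neo4j_ms: float, age_ms: float) -> tuple[str, float]:
    """Return the faster engine and how many times faster it is."""
    if age_ms < neo4j_ms:
        return "AGE", neo4j_ms / age_ms
    return "Neo4j", age_ms / neo4j_ms


results = {test: speedup(*pair) for test, pair in p50_ms.items()}
for test, (winner, factor) in results.items():
    print(f"{test}: {winner} {factor:.1f}x")
```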

Even the deep-traversal result can be reversed with langchain-age’s traverse(), which uses a PostgreSQL WITH RECURSIVE query instead of Cypher:

| Depth | AGE Cypher | AGE traverse() | Neo4j |
| --- | --- | --- | --- |
| 3-hop | 26.4ms | 1.3ms | 1.7ms |
| 6-hop | 28.2ms | 1.4ms | 2.4ms |

With traverse(), AGE beats Neo4j by 1.7x even at 6 hops. See the full benchmark in Neo4j vs AGE Benchmark.
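
The idea behind a recursive-CTE traversal can be sketched without AGE at all. The example below is not langchain-age's traverse() implementation (its signature and SQL are not shown in this post). It runs WITH RECURSIVE over a hypothetical edges(src, dst) table, using Python's built-in sqlite3 so that it executes without a server; PostgreSQL accepts the same CTE shape, apart from placeholder syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    # A small cycle: Alice -> Bob -> Carol -> Dave -> Alice.
    [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave"), ("Dave", "Alice")],
)


def reachable_within(conn, start, max_depth):
    """Return (node, hops) pairs reachable from start within max_depth hops."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0                      -- anchor: the start node
            UNION
            SELECT e.dst, w.depth + 1        -- step: follow one outgoing edge
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.depth < ?                -- the depth bound also stops cycles
        )
        SELECT node, MIN(depth) AS hops
        FROM walk
        GROUP BY node
        ORDER BY hops, node
        """,
        (start, max_depth),
    )
    return rows.fetchall()


print(reachable_within(conn, "Alice", 2))
```

The whole traversal runs as one set-based SQL statement, and the depth bound keeps the recursion finite even on cyclic graphs.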

When Neo4j IS Better

Being honest:

If your workload is “social network analysis on 10 billion edges with real-time community detection”, use Neo4j and pay the license fee. It’s worth it.

If your workload is “RAG application that needs graph context alongside vector search”, AGE on PostgreSQL is the simpler, cheaper, and more maintainable choice.

Frequently Asked Questions

How compatible is Apache AGE’s Cypher with Neo4j?

AGE implements the openCypher spec rather than Neo4j’s full Cypher dialect. Core syntax including CREATE, MATCH, MERGE, and DELETE is supported, so standard CRUD and pattern-matching queries generally work as-is. Neo4j-specific APOC procedures are not available.

pgvector vs Pinecone — which vector DB is better for RAG?

For workloads under 10 million vectors, pgvector is sufficient. It adds no infrastructure beyond the PostgreSQL instance you already run, and it works within PostgreSQL transactions. For hundreds of millions of vectors with ultra-low latency requirements, consider a dedicated vector DB like Pinecone.

Is langchain-age compatible with LangGraph?

Yes. You can use LangGraph’s PostgresStore and its PostgresSaver checkpointer on the same PostgreSQL instance, managing graph, vector, and agent state in a single database.

Can I migrate an existing Neo4j graph to AGE?

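Export nodes and relationships from Neo4j (for example with APOC’s export procedures, such as apoc.export.cypher.all or apoc.export.csv.all), then recreate them in AGE with equivalent CREATE statements. For large datasets, AGE’s CSV loader functions (load_labels_from_file and load_edges_from_file) are more efficient.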
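
For small graphs, exported rows can be replayed as CREATE statements. The sketch below assumes the export has already been parsed into Python dicts; the row format and both helper names are hypothetical, and only strings, numbers, and booleans are handled.

```python
import re

_IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def cypher_literal(value) -> str:
    """Render a Python value as a Cypher literal (scalar types only)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def create_node_statement(label: str, props: dict) -> str:
    """Build a CREATE statement for one node with the given properties."""
    for ident in (label, *props):
        if not _IDENT_RE.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    body = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in props.items())
    return f"CREATE (:{label} {{{body}}})"


exported_rows = [
    {"label": "Person", "props": {"name": "Alice", "role": "CTO"}},
    {"label": "Person", "props": {"name": "Bob", "role": "Engineer"}},
]
statements = [create_node_statement(r["label"], r["props"]) for r in exported_rows]
for statement in statements:
    print(statement)
```

Each generated string has the same form as the CREATE calls in Step 3 and can be passed to graph.query(...).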

How compatible is langchain-age with the langchain-neo4j API?

langchain-age mirrors the langchain-neo4j API. AGEGraph corresponds to Neo4jGraph, AGEVector to Neo4jVector, and AGEGraphCypherQAChain to GraphCypherQAChain. In most cases, changing import paths and the connection string is all that’s needed. Code using Neo4j-specific APOC procedures will need modification.

Should I migrate from Neo4j to AGE?

Not necessarily. If Neo4j is working well and licensing costs aren’t a concern, there’s no reason to switch. AGE is advantageous when: (1) you already run PostgreSQL and don’t want another database, (2) GPL licensing conflicts with your commercial product, or (3) you want to reduce HA costs.

Is Apache AGE production-ready?

Apache AGE is an Apache Software Foundation Top-Level Project and runs as a PostgreSQL extension, inheriting PostgreSQL’s MVCC, WAL, and crash recovery guarantees. For production HA, you can use existing PG HA solutions like Patroni or repmgr without any AGE-specific configuration.

Getting Started

pip install "langchain-age[all]"

Key Takeaways


langchain-age is MIT licensed. Apache AGE is Apache 2.0. pgvector is PostgreSQL License. No licensing fees, no vendor lock-in.


AI-assisted content