Databases/June 17, 2026/7 min read

From ArangoDB to NebulaGraph: Why Cost Forced the Move — and Performance Made It Permanent

When ArangoDB's October 2025 licensing change put our production cluster out of compliance overnight, we had 90 days to migrate a live graph workload to NebulaGraph. Here is exactly what we changed, what broke, and what we would do differently.

A Year on ArangoDB: What We Built and Why We Chose It

Our recommendation engine runs on a property graph: users, products, categories, purchase events, and behavioural signals all connected by typed edges. Twelve months ago, ArangoDB 3.11 was an easy sell. A single database handled JSON document storage and graph traversals under one query language (AQL), the Apache 2.0 licence was clean, and the .NET driver (ArangoDBNetStandard) was mature enough to be production-ready. We were storing roughly 60 GB of vertex and edge data, well within Community Edition limits.

Our .NET 8 worker service polled a queue, constructed AQL queries, and wrote traversal results back to a Redis cache. Life was simple.

// ArangoDB traversal — finding 2-hop product recommendations via AQL
public async Task<IEnumerable<string>> GetRecommendationsAsync(
    string userId, int maxDepth, CancellationToken ct)
{
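    // $$ raw string: {{...}} interpolates, single braces stay literal AQL.
    // (A single-$ raw string rejects {{ }} as an escape.)
    var query = $$"""
        FOR v, e, p IN 1..{{maxDepth}} OUTBOUND @startVertex
            GRAPH 'recommendation_graph'
            OPTIONS { uniqueVertices: 'path', bfs: true }
            FILTER v.active == true
            LIMIT 50
            RETURN DISTINCT v._key
        """;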
 
    var bindVars = new Dictionary<string, object>
    {
        ["startVertex"] = $"users/{userId}"
    };
 
    var cursor = await _db.ExecuteAqlQueryAsync<string>(
        new AqlQuery { Query = query, BindVars = bindVars }, ct);
 
    return cursor.Result;
}

This worked well — until the dataset crossed 80 GB and traversal latency climbed from 40 ms to over 300 ms on 3-hop queries. We were hitting the well-documented performance ceiling of Community Edition: without SmartGraph and Intelligent Sharding (both Enterprise-only), ArangoDB cannot efficiently distribute traversal work across shards. Query times can slow 4–10× as the graph grows.

The Breaking Point: How ArangoDB's Cost Stopped Adding Up

On October 18, 2025, ArangoDB re-licensed future Community Edition releases from Apache 2.0 to BSL 1.1 and introduced a new ArangoDB Community License that caps data at 100 GB per cluster and prohibits commercial use. We were at 87 GB and growing 8 GB per month, which left less than two months before we hit the cap. The clock was running.

Enterprise pricing to unlock SmartGraph and remove the cap would have cost roughly 3–4× our current infrastructure spend. ArangoDB Managed Platform (AMP) pricing is node- and SLA-based, opaque until you get a quote, and oriented toward organisations with dedicated DBAs. For a mid-sized engineering team, neither path was attractive.

The licensing change also quietly rendered ArangoDB Community 3.11 end-of-life — no patches, no security updates. We were running 3.11.5, the last stable release before the re-licence. Staying put was not an option.

Why NebulaGraph: What We Were Looking For in an Alternative

Our requirements were concrete:

Apache 2.0 or equivalent — no commercial-use restrictions in the community edition.
Native distributed graph architecture — not a graph layer bolted onto a document store.
Millisecond-range traversal latency at hundreds of millions of edges.
A usable .NET client or REST API we could wrap ourselves.
A realistic migration path from ArangoDB's document + edge model.

NebulaGraph 3.7.0 (Community, released March 2024) ticked every box. It is licensed Apache 2.0 + Commons Clause 1.0; the Commons Clause restricts selling NebulaGraph as a managed service but does not restrict commercial internal use, which is the opposite of ArangoDB's new position. The Enterprise/Cloud track (v5.2, January 2026) added features we wanted later, but Community was enough to start.

Snapchat evaluated nine graph databases — including Neo4j — and chose NebulaGraph for their friend recommendation engine based on SLOs, QPS, and data-scale requirements. That validated our shortlist.

Data Model Migration: Documents + Edge Collections to Vertices + Edges

ArangoDB's model is flexible JSON: collections are schema-optional, _id is system-generated (as is _key unless you supply one), and edge documents point at their endpoints through _from and _to. NebulaGraph is schema-required: every vertex belongs to one or more Tags, every edge belongs to an Edge Type, and every vertex needs an explicit Vertex ID (VID).

The VID design decision is the most consequential architectural choice in the migration. There is no auto-generated equivalent of _id. A poor VID strategy — for example, using sequential integers — creates storage hotspots in NebulaGraph's distributed hash-partitioned storage layer. We used a composite key: {entity_type}:{uuid}, hashed consistently with FNV-1a before insertion, giving us uniform partition distribution.
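The composite-key-plus-hash idea can be sketched in a few lines of C#. This is a minimal illustration under stated assumptions, not the exact code from the migration: VidHasher, CompositeKey and PartitionFor are hypothetical names, the 64-bit FNV-1a variant is an assumption (the text above does not say which width was used), and the modulo in PartitionFor only illustrates how a uniform hash spreads keys. It is not NebulaGraph's internal partitioning function.

```csharp
using System;
using System.Text;

// Hypothetical helper: builds "{entity_type}:{uuid}" keys and hashes them with FNV-1a (64-bit).
public static class VidHasher
{
    private const ulong OffsetBasis = 14695981039346656037UL; // 0xcbf29ce484222325
    private const ulong Prime = 1099511628211UL;              // 0x100000001b3

    // Standard FNV-1a: XOR each byte into the hash, then multiply by the prime.
    public static ulong Fnv1a64(string text)
    {
        ulong hash = OffsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            hash ^= b;
            unchecked { hash *= Prime; } // wrap-around on overflow is intended
        }
        return hash;
    }

    // Composite key in the "{entity_type}:{uuid}" shape described above.
    public static string CompositeKey(string entityType, Guid id) =>
        $"{entityType}:{id:D}";

    // Illustration only: maps a key to one of partitionCount buckets.
    public static int PartitionFor(string key, int partitionCount) =>
        (int)(Fnv1a64(key) % (ulong)partitionCount);
}
```

Hashing the same composite key always yields the same value, which is what makes re-runs of an import idempotent. Running PartitionFor over a sample of real keys and counting bucket sizes is a cheap way to check for skew before committing to a scheme.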

ArangoDB Concept	NebulaGraph Equivalent
Document collection	Vertex Tag (typed schema)
Edge collection	Edge Type (typed schema)
_id (auto)	VID (must be designed)
Schema-optional JSON	Fixed tag/edge-type properties
AQL (multi-model)	nGQL / openCypher (graph-only)

We wrote a one-off .NET migration tool that streamed ArangoDB export JSONL files, mapped each document to a CREATE TAG schema, and bulk-imported using NebulaGraph's SST ingest API. Roughly 62 million vertices and 400 million edges migrated in a weekend.
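The per-document mapping step can be sketched as follows. This is a hedged illustration of the document-to-vertex translation only: it emits plain INSERT VERTEX statements rather than SST files, which is a different and slower loading path than the one described above. JsonlMapper, Quote and ToInsertVertex are hypothetical names, and the sketch assumes each JSONL line is a flat object with a string _key and that the target Tag already exists with nullable properties.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: turns one exported JSONL line into an nGQL INSERT VERTEX statement.
public static class JsonlMapper
{
    // Escapes backslashes and double quotes for use inside an nGQL string literal.
    public static string Quote(string s) =>
        "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";

    public static string ToInsertVertex(
        string jsonLine, string tag, IReadOnlyList<string> props)
    {
        using var doc = JsonDocument.Parse(jsonLine);
        var root = doc.RootElement;
        var key = root.GetProperty("_key").GetString()
                  ?? throw new FormatException("_key must be a string");

        var values = new List<string>();
        foreach (var prop in props)
        {
            // Fields missing from the document become NULL.
            if (!root.TryGetProperty(prop, out var el)) { values.Add("NULL"); continue; }
            values.Add(el.ValueKind switch
            {
                JsonValueKind.String => Quote(el.GetString()!),
                JsonValueKind.True => "true",
                JsonValueKind.False => "false",
                JsonValueKind.Number => el.GetRawText(),
                _ => "NULL" // JSON null, nested objects and arrays are not mapped here
            });
        }

        var vid = Quote(tag + ":" + key); // "{entity_type}:{key}" style VID
        return $"INSERT VERTEX {tag}({string.Join(", ", props)}) " +
               $"VALUES {vid}:({string.Join(", ", values)});";
    }
}
```

Listing the properties explicitly is deliberate: any field a document carries that is not in the Tag schema is dropped, which surfaces schema drift from the schema-optional source early instead of at import time.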

Query Migration: Translating AQL to nGQL

AQL's FOR … IN … FILTER … RETURN is document-iteration syntax that has no direct nGQL counterpart. Graph traversals translate more cleanly.

The AQL snippet from the first section becomes the following in nGQL:

// NebulaGraph traversal using nGQL via the nebula-net client (unofficial)
public async Task<IEnumerable<string>> GetRecommendationsAsync(
    string userId, int maxDepth, CancellationToken ct)
{
    // NebulaGraph uses GO FROM ... OVER ... for traversal
    // VIDs are typed; ours are strings prefixed with entity type
    var vid = $"user:{userId}";
 
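    // userId ends up in the query text, so it must be validated before this point.
    // In native nGQL the row limit follows a pipe; a bare LIMIT after YIELD is
    // read as GO's per-step limit list.
    var ngql = $"""
        GO 1 TO {maxDepth} STEPS FROM "{vid}" OVER purchased, viewed
        WHERE $$.product.active == true
        YIELD DISTINCT $$.product.id AS product_id
        | LIMIT 50
        """;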
 
    // Using NebulaGraph's HTTP2 session interface wrapped in a typed client
    var result = await _nebulaSession.ExecuteAsync(ngql, ct);
 
    return result.GetValuesByColName("product_id")
                 .Select(v => v.GetSVal().ToString());
}

nGQL's GO … STEPS FROM … OVER … is the closest counterpart to AQL's FOR v, e, p IN 1..N OUTBOUND, but you list the edge types to traverse explicitly rather than naming a graph. For teams more comfortable with openCypher, NebulaGraph v3.x also supports MATCH syntax alongside nGQL, so both styles can run against the same cluster.
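Unlike the AQL version, which passed the start vertex as a bind variable, the nGQL traversal above interpolates userId into the query text. A minimal sketch of guarding that input follows. VidGuard and ToUserVid are hypothetical names, and the allowed character set (letters, digits, underscore, hyphen, at most 64 characters) is an assumption that would need to match your real ID format.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical guard: only lets through IDs that cannot break out of a quoted nGQL literal.
public static class VidGuard
{
    // \z anchors at the very end of the input, so a trailing newline is rejected too.
    private static readonly Regex Allowed =
        new Regex(@"^[A-Za-z0-9_-]{1,64}\z", RegexOptions.Compiled);

    public static string ToUserVid(string userId)
    {
        if (userId is null || !Allowed.IsMatch(userId))
            throw new ArgumentException(
                "userId contains characters not allowed in a VID", nameof(userId));
        return $"user:{userId}";
    }
}
```

Calling VidGuard.ToUserVid(userId) in place of the raw string interpolation keeps quotes, pipes, and semicolons out of the statement. An allow-list is used rather than escaping because the set of valid IDs is small and known.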

Performance: Where NebulaGraph Was Faster

Once data was loaded and indexes built, the numbers were decisive. NebulaGraph's LDBC-SNB SF100 benchmarks (282 million vertices, 1.775 billion edges) show FIND ALL PATH queries improving 50–500% across varying depths, with roughly 600% improvement on 1–5 hop traversals in v3.5.0.

In our own workload (62 million vertices, 400 million edges), 3-hop recommendation traversals dropped from 280–320 ms on ArangoDB CE to 35–60 ms on NebulaGraph Community 3.7 — without any query tuning. MATCH two-hop count QPS improved meaningfully once we added native indexes on the tag properties we filtered on most frequently.

One important caveat: for queries exceeding 10 hops, Community Edition shows degradation and the Enterprise Edition is strongly recommended. Our use case tops out at 5 hops, so this did not affect us.

Features That Won Us Over

Storage/compute separation is the architectural reason NebulaGraph scales traversals better than ArangoDB CE. Compute nodes can scale independently of storage nodes; a heavy traversal job does not contend with write throughput on the same process.

The SAMPLE clause (added in v5.x) deserves special mention. In large graphs, highly connected vertices — super-nodes — can stall traversal queries as the engine tries to enumerate every edge. SAMPLE lets you cap exploration breadth per hop declaratively, preventing runaway queries without changing your graph structure.

Unified multi-modal querying in Enterprise/Cloud v5.0+ allows graph traversals, vector similarity search, and full-text queries in a single statement. For a recommendation engine that also wants semantic similarity between products, this is a compelling roadmap.

The Hard Parts: Gotchas and Lessons from the Migration

VID design is irreversible. Once you ingest 400 million edges with a given VID scheme, changing it means a full re-import. We got this right by designing it on paper first, but only because a colleague had read the NebulaGraph architecture docs carefully.

No schema flexibility. ArangoDB lets you add fields to documents without altering a schema. In NebulaGraph, adding a property to a Tag requires an ALTER TAG statement and a rolling schema propagation. Plan your schemas like you plan a relational database, not a document store.

Community Edition tooling gaps. The official .NET driver situation is thinner than ArangoDB's ecosystem. We ended up wrapping the HTTP2 Bolt-style interface ourselves — not difficult, but budget the time. The Java and Python clients are first-class; .NET is a second-class citizen in the NebulaGraph ecosystem today.

v3.5.0 regression. The benchmarks show a slight performance decrease on some Go1–3 StepEdge and Go1–3 StepEdge_count patterns in v3.5.0 compared to v3.4.0. Pin your version and benchmark against your own workload before upgrading.

Licensing nuance. Apache 2.0 + Commons Clause 1.0 is not pure Apache 2.0. If your business model involves hosting NebulaGraph as a managed service for third parties, you need a commercial agreement. Internal production use is unrestricted.

The Results, and Whether We'd Make the Same Call Again

Six months post-migration: traversal p95 latency is under 65 ms, we are storing 104 GB with no licence alarm, and our infrastructure bill is lower because NebulaGraph's storage/compute separation let us right-size each tier independently.

The migration took one engineer eight weeks end-to-end: one week for VID design and schema mapping, one weekend for the bulk import, four weeks for query rewriting and integration testing, and two weeks for production cutover with a shadow-read validation period.
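A shadow-read check of the kind mentioned above can be as simple as diffing the key sets returned by the two backends for the same request. The sketch below is an assumption about how such a comparison might look, not a description of the validation harness actually used. ShadowDiff and ShadowRead are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type: keys the old backend returned that the new one did not, and vice versa.
public sealed record ShadowDiff(
    IReadOnlyCollection<string> MissingInNew,
    IReadOnlyCollection<string> ExtraInNew)
{
    public bool IsMatch => MissingInNew.Count == 0 && ExtraInNew.Count == 0;
}

public static class ShadowRead
{
    // Order-insensitive comparison of two result sets.
    public static ShadowDiff Compare(
        IEnumerable<string> oldKeys, IEnumerable<string> newKeys)
    {
        var oldSet = new HashSet<string>(oldKeys, StringComparer.Ordinal);
        var newSet = new HashSet<string>(newKeys, StringComparer.Ordinal);
        return new ShadowDiff(
            oldSet.Except(newSet).ToList(),
            newSet.Except(oldSet).ToList());
    }
}
```

Both traversals shown earlier apply LIMIT 50 without an ORDER BY, so the two engines may legitimately return different subsets of a larger result. A comparison like this is most meaningful on queries with a deterministic order or with the limit removed.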

Would we make the same call again? Yes, with one change: we would prototype on NebulaGraph from the start if the domain is graph-first. ArangoDB's multi-model convenience is attractive early on, but it comes with a performance ceiling and — now — a licensing cliff that hits at exactly the scale where a graph workload gets interesting. NebulaGraph's purpose-built architecture pays dividends the moment your traversal depth and dataset size start to grow together.
