
From ArangoDB to NebulaGraph: Why Cost Forced the Move — and Performance Made It Permanent
When ArangoDB's October 2025 licensing change put our production cluster out of compliance overnight, we had 90 days to migrate a live graph workload to NebulaGraph. Here is exactly what we changed, what broke, and what we would do differently.
A Year on ArangoDB: What We Built and Why We Chose It
Our recommendation engine runs on a property graph: users, products, categories, purchase events, and behavioural signals all connected by typed edges. Twelve months ago, ArangoDB 3.11 was an easy sell. A single database handled JSON document storage and graph traversals under one query language (AQL), the Apache 2.0 licence was clean, and the .NET driver (ArangoDBNetStandard) was mature enough to be production-ready. We were storing roughly 60 GB of vertex and edge data, well within Community Edition limits.
Our .NET 8 worker service polled a queue, constructed AQL queries, and wrote traversal results back to a Redis cache. Life was simple.
```csharp
// ArangoDB traversal: product recommendations up to maxDepth hops via AQL
public async Task<IEnumerable<string>> GetRecommendationsAsync(
    string userId, int maxDepth, CancellationToken ct)
{
    // $$ raw string: {{maxDepth}} is interpolated, while the single braces
    // of the AQL OPTIONS object stay literal. (With a single $, doubled
    // braces do not compile inside a raw string literal.)
    var query = $$"""
        FOR v, e, p IN 1..{{maxDepth}} OUTBOUND @startVertex
            GRAPH 'recommendation_graph'
            OPTIONS { uniqueVertices: 'path', bfs: true }
            FILTER v.active == true
            LIMIT 50
            RETURN DISTINCT v._key
        """;
    // The start vertex is passed as a bind variable, never concatenated.
    var bindVars = new Dictionary<string, object>
    {
        ["startVertex"] = $"users/{userId}"
    };
    var cursor = await _db.ExecuteAqlQueryAsync<string>(
        new AqlQuery { Query = query, BindVars = bindVars }, ct);
    return cursor.Result;
}
```

This worked well until the dataset crossed 80 GB and traversal latency on 3-hop queries climbed from 40 ms to over 300 ms. We were hitting the well-documented performance ceiling of Community Edition: without SmartGraph and Intelligent Sharding (both Enterprise-only), ArangoDB cannot efficiently distribute traversal work across shards. Query times can slow 4–10× as the graph grows.
The Breaking Point: How ArangoDB's Cost Stopped Adding Up
On October 18, 2025, ArangoDB re-licensed future Community Edition releases from Apache 2.0 to BSL 1.1, simultaneously introducing a new ArangoDB Community License that caps data at 100 GB per cluster and prohibits commercial use. We were at 87 GB and growing 8 GB per month. The clock was running.
Enterprise pricing to unlock SmartGraph and remove the cap would have cost roughly 3–4× our current infrastructure spend. ArangoDB Managed Platform (AMP) pricing is node- and SLA-based, opaque until you get a quote, and oriented toward organisations with dedicated DBAs. For a mid-sized engineering team, neither path was attractive.
The licensing change also quietly rendered ArangoDB Community 3.11 end-of-life — no patches, no security updates. We were running 3.11.5, the last stable release before the re-licence. Staying put was not an option.
Why NebulaGraph: What We Were Looking For in an Alternative
Our requirements were concrete:
- Apache 2.0 or equivalent — no commercial-use restrictions in the community edition.
- Native distributed graph architecture — not a graph layer bolted onto a document store.
- Millisecond-range traversal latency at hundreds of millions of edges.
- A usable .NET client or REST API we could wrap ourselves.
- A realistic migration path from ArangoDB's document + edge model.
NebulaGraph 3.7.0 (Community, released March 2024) ticked every box. It is licensed under Apache 2.0 with the Commons Clause 1.0; the Commons Clause restricts selling NebulaGraph as a managed service but does not restrict internal commercial use, the opposite of ArangoDB's new position. The Enterprise/Cloud track (v5.2, January 2026) added features we wanted later, but Community was enough to start.
Snapchat evaluated nine graph databases — including Neo4j — and chose NebulaGraph for their friend recommendation engine based on SLOs, QPS, and data-scale requirements. That validated our shortlist.
Data Model Migration: Documents + Edge Collections to Vertices + Edges
ArangoDB's model is flexible JSON: collections are schema-optional, _key and _id can be generated by the system, and edge documents reference their endpoints through _from and _to. NebulaGraph is schema-required: every vertex belongs to one or more Tags, every edge belongs to an Edge Type, and every vertex needs an explicit Vertex ID (VID).
The VID design decision is the most consequential architectural choice in the migration. There is no auto-generated equivalent of _id. A poor VID strategy — for example, using sequential integers — creates storage hotspots in NebulaGraph's distributed hash-partitioned storage layer. We used a composite key: {entity_type}:{uuid}, hashed consistently with FNV-1a before insertion, giving us uniform partition distribution.
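To make the hashing step concrete, here is a minimal sketch of 64-bit FNV-1a over a composite key of the shape described above. The class and method names, and the `Bucket` helper, are hypothetical; the article does not say exactly how the hash was applied to the VID, and NebulaGraph performs its own partition hashing internally, so treat this only as a way to sanity-check how evenly a key scheme spreads across a given number of buckets.

```csharp
using System;
using System.Text;

public static class VidHashing
{
    // Standard 64-bit FNV-1a parameters.
    private const ulong OffsetBasis = 14695981039346656037UL; // 0xcbf29ce484222325
    private const ulong Prime = 1099511628211UL;              // 0x100000001b3

    // FNV-1a: for each UTF-8 byte, XOR it into the hash, then multiply by the prime.
    public static ulong Fnv1a64(string key)
    {
        ulong hash = OffsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(key))
        {
            hash ^= b;
            hash = unchecked(hash * Prime); // wrap-around is intended
        }
        return hash;
    }

    // Composite key in the {entity_type}:{uuid} shape (lowercase, hyphenated GUID).
    public static string CompositeKey(string entityType, Guid id) =>
        $"{entityType}:{id:D}";

    // Hypothetical helper: map a key to one of partitionCount buckets,
    // useful for checking distribution offline before any import.
    public static int Bucket(string key, int partitionCount) =>
        (int)(Fnv1a64(key) % (ulong)partitionCount);
}
```

Running `Bucket` over a sample of real keys and counting hits per bucket gives a quick picture of skew before committing to a scheme.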
| ArangoDB Concept | NebulaGraph Equivalent |
|---|---|
| Document collection | Vertex Tag (typed schema) |
| Edge collection | Edge Type (typed schema) |
| _id (auto) | VID (must be designed) |
| Schema-optional JSON | Fixed tag/edge-type properties |
| AQL (multi-model) | nGQL / openCypher (graph-only) |
We wrote a one-off .NET migration tool that streamed ArangoDB export JSONL files, mapped each document to a CREATE TAG schema, and bulk-imported using NebulaGraph's SST ingest API. Roughly 62 million vertices and 400 million edges migrated in a weekend.
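The mapping step of such a tool can be sketched as follows. This is an illustration only: it emits plain INSERT VERTEX statements rather than SST files, and the `product` tag with `name` and `active` properties, the `product:` VID prefix, and all class and method names are assumptions, not the schema or code we actually shipped.

```csharp
using System;
using System.Text.Json;

public static class JsonlToNgql
{
    // Escape a value for use inside a double-quoted nGQL string literal
    // (assumes backslash escaping of backslashes and double quotes).
    public static string Quote(string value) =>
        "\"" + value.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";

    // Map one exported JSONL line to an INSERT VERTEX statement.
    // Hypothetical target schema: tag product(name string, active bool).
    public static string ProductLineToInsert(string jsonLine)
    {
        using var doc = JsonDocument.Parse(jsonLine);
        JsonElement root = doc.RootElement;

        // _key is ArangoDB's per-collection document key.
        string key = root.GetProperty("_key").GetString()
            ?? throw new FormatException("document has a null _key");

        // Schema-optional source: missing or non-string names become "".
        string name = root.TryGetProperty("name", out var n) && n.ValueKind == JsonValueKind.String
            ? n.GetString()!
            : "";
        bool active = root.TryGetProperty("active", out var a) && a.ValueKind == JsonValueKind.True;

        string vid = $"product:{key}";
        string activeLiteral = active ? "true" : "false";
        return $"INSERT VERTEX product(name, active) VALUES {Quote(vid)}:({Quote(name)}, {activeLiteral});";
    }
}
```

The point of the sketch is the shape of the work: every optional JSON field needs an explicit decision about its default, because the target tag has a fixed set of properties.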
Query Migration: Translating AQL to nGQL
AQL's FOR … IN … FILTER … RETURN is document-iteration syntax that has no direct nGQL counterpart. Graph traversals translate more cleanly.
The AQL traversal from the first section becomes the following in nGQL:
```csharp
// NebulaGraph traversal using nGQL via the nebula-net client (unofficial)
public async Task<IEnumerable<string>> GetRecommendationsAsync(
    string userId, int maxDepth, CancellationToken ct)
{
    // NebulaGraph uses GO ... FROM ... OVER ... for traversal.
    // VIDs are typed; ours are strings prefixed with the entity type.
    var vid = $"user:{userId}";
    // In nGQL 3.x a plain row limit is applied through a pipe (| LIMIT n);
    // a bare LIMIT inside GO expects a per-step list instead.
    var ngql = $"""
        GO 1 TO {maxDepth} STEPS FROM "{vid}" OVER purchased, viewed
        WHERE $$.product.active == true
        YIELD DISTINCT $$.product.id AS product_id
        | LIMIT 50
        """;
    // Using NebulaGraph's HTTP2 session interface wrapped in a typed client
    var result = await _nebulaSession.ExecuteAsync(ngql, ct);
    return result.GetValuesByColName("product_id")
        .Select(v => v.GetSVal().ToString());
}
```

nGQL's GO … STEPS FROM … OVER … is the closest counterpart to AQL's FOR v, e, p IN 1..N OUTBOUND, but you specify edge types explicitly rather than naming a graph. For teams more comfortable with openCypher, NebulaGraph v3.x also supports MATCH syntax alongside nGQL, so you can mix both in the same cluster.
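One thing the nGQL version loses is the bind variable: the VID is interpolated straight into the statement text. A minimal guard, assuming user IDs are hyphenated GUIDs (consistent with the {entity_type}:{uuid} scheme, but still an assumption), is to parse the ID strictly before building the VID. The `VidBuilder` name is hypothetical.

```csharp
using System;

public static class VidBuilder
{
    // Returns a VID such as "user:3f2504e0-..." or throws if userId is not
    // a hyphenated GUID, so quotes or nGQL fragments never reach the query.
    public static string UserVid(string userId)
    {
        if (!Guid.TryParseExact(userId, "D", out Guid parsed))
        {
            throw new ArgumentException(
                "userId must be a hyphenated GUID", nameof(userId));
        }
        // Re-format from the parsed value so the VID is always canonical lowercase.
        return $"user:{parsed:D}";
    }
}
```

Building the VID from the parsed value rather than the raw input also normalises casing, which matters because string VIDs are compared byte for byte.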
Performance: Where NebulaGraph Was Faster
Once data was loaded and indexes built, the numbers were decisive. NebulaGraph's LDBC-SNB SF100 benchmarks (282 million vertices, 1.775 billion edges) show FIND ALL PATH queries improving 50–500% across varying depths, with roughly 600% improvement on 1–5 hop traversals in v3.5.0.
In our own workload (62 million vertices, 400 million edges), 3-hop recommendation traversals dropped from 280–320 ms on ArangoDB CE to 35–60 ms on NebulaGraph Community 3.7 — without any query tuning. MATCH two-hop count QPS improved meaningfully once we added native indexes on the tag properties we filtered on most frequently.
One important caveat: for queries exceeding 10 hops, Community Edition shows degradation and the Enterprise Edition is strongly recommended. Our use case tops out at 5 hops, so this did not affect us.
Features That Won Us Over
Storage/compute separation is the architectural reason NebulaGraph scales traversals better than ArangoDB CE. Compute nodes can scale independently of storage nodes; a heavy traversal job does not contend with write throughput on the same process.
The SAMPLE clause (added in v5.x) deserves special mention. In large graphs, highly connected vertices — super-nodes — can stall traversal queries as the engine tries to enumerate every edge. SAMPLE lets you cap exploration breadth per hop declaratively, preventing runaway queries without changing your graph structure.
Unified multi-modal querying in Enterprise/Cloud v5.0+ allows graph traversals, vector similarity search, and full-text queries in a single statement. For a recommendation engine that also wants semantic similarity between products, this is a compelling roadmap.
The Hard Parts: Gotchas and Lessons from the Migration
VID design is irreversible. Once you ingest 400 million edges with a given VID scheme, changing it means a full re-import. We got this right by designing it on paper first, but only because a colleague had read the NebulaGraph architecture docs carefully.
No schema flexibility. ArangoDB lets you add fields to documents without altering a schema. In NebulaGraph, adding a property to a Tag requires an ALTER TAG statement and a rolling schema propagation. Plan your schemas like you plan a relational database, not a document store.
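For comparison with ArangoDB's "just write the new field" model, this is roughly what a schema change looks like when generated from .NET. A minimal sketch: the helper name is hypothetical, the identifier check is deliberately conservative, and the default value is passed as an already-formatted nGQL literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SchemaChange
{
    private static readonly Regex Identifier =
        new Regex("^[A-Za-z_][A-Za-z0-9_]*$", RegexOptions.CultureInvariant);

    // Builds an ALTER TAG ... ADD statement for one new property.
    // defaultLiteral must already be a valid nGQL literal, e.g. 0.0 or "n/a".
    public static string AddProperty(
        string tag, string property, string dataType, string defaultLiteral)
    {
        foreach (var name in new[] { tag, property, dataType })
        {
            if (!Identifier.IsMatch(name))
                throw new ArgumentException($"not a plain identifier: {name}");
        }
        return $"ALTER TAG {tag} ADD ({property} {dataType} DEFAULT {defaultLiteral});";
    }
}
```

For example, `SchemaChange.AddProperty("product", "discount", "double", "0.0")` yields `ALTER TAG product ADD (discount double DEFAULT 0.0);`, which has to be executed and propagated before any write can set the new property.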
Community Edition tooling gaps. The official .NET driver situation is thinner than ArangoDB's ecosystem. We ended up wrapping the HTTP2 Bolt-style interface ourselves — not difficult, but budget the time. The Java and Python clients are first-class; .NET is a second-class citizen in the NebulaGraph ecosystem today.
v3.5.0 regression. The benchmarks show a slight performance decrease on some Go1–3 StepEdge and Go1–3 StepEdge_count patterns in v3.5.0 compared to v3.4.0. Pin your version and benchmark against your own workload before upgrading.
Licensing nuance. Apache 2.0 with the Commons Clause 1.0 is not pure Apache 2.0. If your business model involves hosting NebulaGraph as a managed service for third parties, you need a commercial agreement. Internal production use is unrestricted.
The Results, and Whether We'd Make the Same Call Again
Six months post-migration: traversal p95 latency is under 65 ms, we are storing 104 GB with no licence alarm, and our infrastructure bill is lower because NebulaGraph's storage/compute separation let us right-size each tier independently.
The migration took one engineer eight weeks end-to-end: one week for VID design and schema mapping, one weekend for the bulk import, four weeks for query rewriting and integration testing, and two weeks for production cutover with a shadow-read validation period.
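A shadow-read period needs some way to score agreement between the two stores. One simple option, shown here as a sketch rather than the exact check we ran, is the Jaccard similarity between the recommendation set served from the old store and the one computed from the new store for the same request. The class name and any alerting threshold you attach to the score are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShadowRead
{
    // Jaccard similarity |A ∩ B| / |A ∪ B| between two result sets.
    // 1.0 means identical sets; 0.0 means no overlap.
    public static double Jaccard(
        IEnumerable<string> primary, IEnumerable<string> shadow)
    {
        var a = new HashSet<string>(primary);
        var b = new HashSet<string>(shadow);

        // Two empty result sets agree completely.
        if (a.Count == 0 && b.Count == 0) return 1.0;

        int intersection = a.Count(x => b.Contains(x));
        int union = a.Count + b.Count - intersection;
        return (double)intersection / union;
    }
}
```

Because both queries apply a LIMIT before any ordering, some disagreement is expected even when both stores hold the same data, so a score slightly below 1.0 is not by itself a migration bug.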
Would we make the same call again? Yes, with one change: we would prototype on NebulaGraph from the start if the domain is graph-first. ArangoDB's multi-model convenience is attractive early on, but it comes with a performance ceiling and — now — a licensing cliff that hits at exactly the scale where a graph workload gets interesting. NebulaGraph's purpose-built architecture pays dividends the moment your traversal depth and dataset size start to grow together.