
Graph-Native Data Structures in C#, Part 2 — Trees & Hierarchies with NebulaGraph
Relational adjacency lists and nested sets buckle under real hierarchy workloads. This article shows how to model, query, and mutate category trees in NebulaGraph using nGQL and the nebula-net C# client — with cycle guards built into your application layer.
Part 1 of this series covered the property graph fundamentals: VID design, tag schemas, basic CRUD, and standing up a NebulaPool singleton. If you landed here first, read that one before continuing — the patterns here build directly on that plumbing.
This article uses a running e-commerce category tree (Electronics → Phones → Smartphones → Android) as its working example, with an org-chart aside where the same patterns apply identically.
Why Relational Hierarchy Models Break Down
Every .NET team eventually builds a category tree in SQL. The first pass is an adjacency list: a ParentId column on the same table, a recursive CTE at query time, and a prayer that the tree stays shallow. It works fine at a few hundred nodes and three levels deep.
Pressure arrives in two forms. First, depth: SQL Server and PostgreSQL recursive CTEs are unbounded by default, and performance degrades non-linearly as both depth and fan-out grow. Second, writes: the nested-sets (Celko) model trades fast reads for catastrophic write amplification — every insert or move requires an UPDATE … SET lft/rgt scan that rewrites a proportion of the entire table. Moving a subtree of 200 categories under a new parent triggers thousands of row updates.
Org charts compound both problems. A 10,000-person company with a matrix structure hits all three pain points simultaneously: fan-out at the top, depth at the leaves, and frequent re-orgs that are pure write torture on nested sets.
Graph databases model this natively. Depth-bounded traversal is a first-class operation, and moving a subtree is two edge mutations regardless of subtree size.
Modeling the Hierarchy
Schema and VID Design
The schema is minimal:
CREATE TAG category(name STRING, slug STRING);
CREATE EDGE parent_of();VIDs follow the 'cat:' + slug convention from Part 1. For the space definition, use vid_type = FIXED_STRING(64) to accommodate slugs of realistic length — the default fixed-length VID is only 8 bytes, which is too short for most real slugs.
Edge direction is parent → child. This feels counter-intuitive to people used to thinking "a child knows its parent," but the REVERSELY traversal keyword makes ancestor walks trivial, and subtree descent follows the natural direction. Committing to parent_of pointing downward is the right call.
Enforcing Single-Parent
NebulaGraph has no schema-level constraint that prevents a second parent_of edge pointing at the same child. You must enforce this in C# before inserting. Before any insert, execute:
GO 1 STEPS FROM 'cat:phones' OVER parent_of REVERSELY YIELD dst(edge) AS parentVidIf that returns a row, the node already has a parent. Throw before inserting the edge. Since NebulaGraph 3.3.0, every vertex must also carry at least one tag — bare VID inserts cause a schema error — so always pair your INSERT VERTEX with the category tag.
Reading Subtrees
For subtree descent, use MATCH with a depth-bounded variable-length pattern:
MATCH p=(r:category)-[:parent_of*1..5]->(d:category)
WHERE id(r)=='cat:electronics'
RETURN id(d) AS vid, d.category.name AS name, d.category.slug AS slug, length(p) AS depthAlways set the upper bound. *1.. with no ceiling can time out or OOM on unexpectedly deep or accidentally cyclic graphs. For category trees, *1..10 is a safe ceiling. For org charts, you might push to *1..20.
Important: MATCH and GO have different cycle semantics. If a back-edge is accidentally introduced, MATCH stops silently at the edge that would be revisited rather than looping — safe but quiet. GO will keep traversing vertices (though not the same edge twice). Use MATCH for subtree reads; you get the safer default.
For children-only (one hop), GO is more concise:
GO FROM 'cat:electronics' OVER parent_of YIELD dst(edge) AS childVidFor ancestors (breadcrumb trail), GO REVERSELY:
GO 1 TO 10 STEPS FROM 'cat:phones' OVER parent_of REVERSELY YIELD dst(edge) AS ancestorVidRebuilding the Tree in C#
The query returns a flat list of (vid, name, slug, depth) rows. Rebuild the in-memory tree with a dictionary keyed by VID:
public record CategoryNode(string Vid, string Name, string Slug, int Depth)
{
public List<CategoryNode> Children { get; } = new();
}
public async Task<CategoryNode?> GetSubtreeAsync(string rootVid, int maxDepth = 5)
{
var nql = $"""
MATCH p=(r:category)-[:parent_of*1..{maxDepth}]->(d:category)
WHERE id(r)=='{rootVid}'
RETURN id(r) AS rootVid, id(d) AS vid,
d.category.name AS name, d.category.slug AS slug,
length(p) AS depth
""";
await using var session = await _nebulaPool.GetSessionAsync();
var rows = await session.ExecuteAsync(nql)
.ToListAsync<FlatCategoryRow>();
session.Release();
if (rows.Count == 0) return null;
// Build root from first row's rootVid (we didn't fetch root's own properties here;
// for brevity assume a separate fetch or include root in a UNION)
var index = new Dictionary<string, CategoryNode>();
var root = new CategoryNode(rootVid, rows[0].RootName, rows[0].RootSlug, 0);
index[rootVid] = root;
// Sort ascending by depth so parents are always indexed before children
foreach (var row in rows.OrderBy(r => r.Depth))
{
var node = new CategoryNode(row.Vid, row.Name, row.Slug, row.Depth);
index[row.Vid] = node;
}
// Wire children — requires parent VID per row; adjust query to RETURN id(nodes[-2]) as parentVid
foreach (var row in rows.OrderBy(r => r.Depth))
{
if (index.TryGetValue(row.ParentVid, out var parent))
parent.Children.Add(index[row.Vid]);
}
return root;
}The pattern is deliberate: one round-trip, flat result set, in-memory assembly. Avoid fetching children level-by-level in a loop — that reintroduces the N+1 problem that graph databases are supposed to eliminate.
Mutations from C#
public async Task InsertNodeAsync(string parentVid, string name, string slug)
{
var childVid = $"cat:{slug}";
// Guard: child must not already have a parent
var existingParentNql = $"GO 1 STEPS FROM '{childVid}' OVER parent_of REVERSELY YIELD dst(edge)";
await using var session = await _nebulaPool.GetSessionAsync();
var existing = await session.ExecuteAsync(existingParentNql).ToListAsync<VidRow>();
if (existing.Count > 0)
throw new InvalidOperationException($"Node {childVid} already has a parent.");
// Guard: inserting childVid as child of parentVid must not create a cycle.
// A cycle would exist if childVid is already an ancestor of parentVid.
var ancestorsNql = $"GO 1 TO 10 STEPS FROM '{parentVid}' OVER parent_of REVERSELY YIELD dst(edge) AS vid";
var ancestors = await session.ExecuteAsync(ancestorsNql).ToListAsync<VidRow>();
if (ancestors.Any(a => a.Vid == childVid))
throw new InvalidOperationException("Insert would create a cycle.");
var batch = $"""
INSERT VERTEX category(name, slug) VALUES '{childVid}':('{name}', '{slug}');
INSERT EDGE parent_of() VALUES '{parentVid}'->'{childVid}':();
""";
await session.ExecuteAsync(batch);
session.Release();
}
public async Task MoveSubtreeAsync(string nodeVid, string newParentVid)
{
// Delete old parent_of edge, insert new one — two DML ops in one session call.
// Not atomic; design callers to be idempotent on partial failure.
var oldParentNql = $"GO 1 STEPS FROM '{nodeVid}' OVER parent_of REVERSELY YIELD dst(edge) AS vid";
await using var session = await _nebulaPool.GetSessionAsync();
var oldParents = await session.ExecuteAsync(oldParentNql).ToListAsync<VidRow>();
if (oldParents.Count == 0) throw new InvalidOperationException("Node has no parent to move from.");
var oldParentVid = oldParents[0].Vid;
var batch = $"""
DELETE EDGE parent_of '{oldParentVid}'->'{nodeVid}';
INSERT EDGE parent_of() VALUES '{newParentVid}'->'{nodeVid}':();
""";
await session.ExecuteAsync(batch);
session.Release();
}
public async Task<List<string>> GetAncestorsAsync(string nodeVid)
{
var nql = $"GO 1 TO 10 STEPS FROM '{nodeVid}' OVER parent_of REVERSELY YIELD dst(edge) AS vid";
await using var session = await _nebulaPool.GetSessionAsync();
var rows = await session.ExecuteAsync(nql).ToListAsync<VidRow>();
session.Release();
return rows.Select(r => r.Vid).ToList(); // ordered root-outward by hop count
}MoveSubtreeAsync is the operation that tortures nested sets. Here it is two edge mutations. The subtree itself needs no changes — all descendant edges remain valid. The caveat is atomicity: NebulaGraph does not offer multi-statement transactions across edge deletes and inserts by default. Batch both statements in the same session call using semicolons, and design your application layer to be idempotent on retry.
Performance and Safety Considerations
Deep Trees and Fan-Out
Depth-bounded traversal is essential — *1..N with a realistic ceiling prevents accidental unbounded walks. Fan-out at super-nodes (a category with hundreds of direct children) is where performance degrades. NebulaGraph v5.2 (released November 2025) introduced a SAMPLE clause that intelligently throttles super-node traversals for stable response times. If your category tree has a high-fan-out root, SAMPLE is worth evaluating.
The GET SUBGRAPH statement is an alternative to MATCH p=... when you want the full induced subgraph (vertices and edges) rather than a path-annotated result. It can be faster for wide subtrees where path depth metadata isn't needed.
Validating Acyclicity
A tree that gains a back-edge is no longer a tree. NebulaGraph enforces nothing at the schema level. The application guards in InsertNodeAsync above cover the common case: before inserting a parent_of edge from P to C, check that C does not appear in P's ancestor chain. If it does, the insert would close a cycle.
For MoveSubtreeAsync, the same check applies: confirm that the new parent is not itself inside the subtree being moved. You can reuse GetSubtreeAsync to enumerate the subtree's VIDs and assert the new parent is not among them.
v5.2's GQL stored procedures open the door to encapsulating cycle detection server-side, reducing round-trips — worth watching as that feature matures.
Portability Note
All MATCH queries in this article use openCypher-compatible syntax and will port to other property graph engines. The GO … REVERSELY syntax is nGQL-specific. NebulaGraph v5.0 was the first distributed graph database to implement ISO/IEC 39075 GQL, and v5.2 expands that compliance — if vendor portability matters to your team, prefer MATCH-based queries and the WITH clause over nGQL pipes (|) for compound queries.
The NebulaGraph migration article on this site covers schema upgrade paths if you're moving from a 3.x space to a 5.x deployment.
What's Next
Part 3 will move from trees to general graphs: many-to-many relationships, weighted edges, and shortest-path queries — the cases where the relational model doesn't just strain, it breaks entirely.
Sources
- NebulaGraph Query Language (nGQL)
- Graph Query Language: What You Should Know
- NebulaGraph Database Manual 3.3.0
- FAQ - NebulaGraph Database Manual
- NebulaGraph v3.0.0 Release Note
- NebulaGraph Query Statements Commentary: to Enhance nGQL’s Usability | by NebulaGraph Database | Medium
- nGQL cheatsheet - NebulaGraph Database Manual
- Overview of NebulaGraph general query statements
Keep reading

June 26, 2026 · 6 min
Two Bugs Hiding in Our OpenTelemetry Pipeline: CORS Preflights and the NUL Byte That Killed a Whole Batch
Two silent production bugs in an OTLP-based observability pipeline — one blocking all browser telemetry clients, one dropping entire log batches — exposed how quickly boundary gaps become blind spots. Here's the root cause, the fix, and the repro for each.
Read
June 11, 2026 · 6 min
Vertical Slice Architecture in ASP.NET Core: Features as First-Class Citizens
Vertical Slice Architecture flips the organizational instinct of layered systems — instead of grouping code by technical concern, you group it by use case. Here's how to structure, share, test, and migrate to slices in a real ASP.NET Core codebase.
Read
June 21, 2026 · 3 min
Introducing OpenPulse: Self-Hosted Observability with AI Root-Cause Analysis
Observability forces a bad trade-off: an expensive SaaS bill with your data offsite, or a DIY stack you babysit with no AI insight. OpenPulse is the third option — self-hosted logs, traces and metrics with AI root-cause analysis and auto-remediation built in.
Read