
Introduction to Graph API
Graph API is an ergonomic library for working with in memory graphs in Rust that provides a flexible and type-safe way to interact with graph data structures.
Heavily inspired by Apache TinkerPop, it offers an iterator-like interface specifically designed for graph traversal and manipulation.
Overview
This library offers a unified interface for working with different types of graphs while maintaining strong type safety and ergonomic usage patterns. It includes features for graph traversal, modification, and custom extensions.
Advantages
- Iterator-like interface for graphs: Intuitive API that feels familiar to Rust developers
- Works with multiple backends:
- simple graph - A simple adjacency list based graph with index support
- petgraph - An excellent established graph library
- You can also create your own!
- Composable operations: Chain graph operations together in a fluent, declarative style
- Separation of graph model and implementation: Define your domain model once and use it with any supported backend
Key Features
- Type-safe graph operations: Work with graph data in a strongly-typed manner
- Flexible vertex and edge traversal: Explore graph relationships with powerful walker API
- Custom graph implementations support: Adapt to various graph storage backends
- Derive macros for extending graph functionality: Automatically generate boilerplate code
- Comprehensive testing utilities: Including fuzzing support
Example
// Find a person using the username index (exact match)
let bryn = graph
.walk()
.vertices(Vertex::person_by_username("bryn123"))
.first()
.expect("Bryn should exist in the graph");
println!("Found person: {:?}", bryn);
// Find projects created by a person
let projects_created_by_bryn = graph
.walk()
.vertices_by_id(vec![bryn])
.edges(Edge::created().outgoing())
.head()
.collect::<Vec<_>>();
println!("Bryn created {} projects", projects_created_by_bryn.len());
// Find all people followed by Bryn
let followed_by_bryn = graph
.walk()
.vertices_by_id(vec![bryn])
.edges(Edge::follows().outgoing())
.head()
.collect::<Vec<_>>();
println!("Bryn follows {} people", followed_by_bryn.len());
// Find all people with "graph" in their biography
let graph_enthusiasts = graph
.walk()
.vertices(Vertex::person_by_biography("graph"))
.collect::<Vec<_>>();
println!("Found {} graph enthusiasts", graph_enthusiasts.len());
// Find people in a specific age range
let people_in_30s = graph
.walk()
.vertices(Vertex::person_by_age_range(30..40))
.collect::<Vec<_>>();
println!("Found {} people in their 30s", people_in_30s.len());
Book Organization
This book is organized into several sections:
- User Guide: How to use the Graph API to work with graph data
- Implementation Guide: How to implement the Graph API traits for your own graph types
- Reference: Detailed information about API components and functionality
- Appendix: Additional resources and reference materials
Whether you're a graph API user or implementing your own graph backend, this book provides comprehensive documentation to help you make the most of the Graph API library.
Getting Started
Overview
graph-api
is an API that helps you work with graphs in Rust. Graphs are powerful data structures used to represent
relationships between entities. With the Graph API, you can create, manipulate, and traverse graphs with ease.
Any implementation of the Graph trait will benefit from an ergonomic walker API that allows you to perform complex
traversals. For the purposes of this documentation, we will use SimpleGraph
, but you may choose any other graph
implementation.
Basic Concepts
A graph consists of:
- Vertices (also called nodes): These represent entities in your data model
- Edges (also called links or connections): These represent relationships between entities
- Properties: Both vertices and edges can have properties that store data
The Graph API provides:
- A common interface for working with graph data
- Powerful traversal capabilities with the Walker API
- Index support for efficient lookups
- Derive macros for easy model definition
Installation
Add the Graph API to your project dependencies:
[dependencies]
graph-api-lib = "0.2.0"
graph-api-derive = "0.1.4" # For derive macros
For a simple graph implementation:
[dependencies]
graph-api-simplegraph = "0.2.1"
Examples
Your First Graph
Let's define a simple graph model with people and projects:
// Define our vertex types using derive macro
#[derive(Debug, Clone, VertexExt)]
enum Vertex {
// A person vertex with name and age properties
Person {
name: String,
#[index(hash)]
username: String,
age: u8,
},
// A project vertex with just a name
Project {
name: String,
},
}
// Define our edge types using derive macro
#[derive(Debug, Clone, EdgeExt)]
enum Edge {
// Simple relationship types
Knows,
Created,
WorksOn,
}
Now we can create a graph and add some data:
// Create a new empty graph
let mut graph = SimpleGraph::<Vertex, Edge>::new();
// Add vertices
let alice = graph.add_vertex(Vertex::Person {
name: "Alice".to_string(),
username: "alice123".to_string(),
age: 30,
});
let bob = graph.add_vertex(Vertex::Person {
name: "Bob".to_string(),
username: "bob456".to_string(),
age: 28,
});
let project = graph.add_vertex(Vertex::Project {
name: "Graph API".to_string(),
});
// Connect vertices with edges
graph.add_edge(alice, bob, Edge::Knows);
graph.add_edge(alice, project, Edge::Created);
graph.add_edge(bob, project, Edge::WorksOn);
And finally, we can query the graph:
// Find all people who created a project
let creators = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person() // Type-safe filter using generated helper
.edges(Edge::created()) // Using generated search function
.head()
.filter_project() // Type-safe filter using generated helper
.map(|v, _| {
// Use projection for type-safe property access
v.project::<Project<_>>().unwrap().name().to_string()
})
.collect::<Vec<_>>();
// Find all people who know someone
let people_with_friends = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person() // Type-safe filter using generated helper
.edges(Edge::knows()) // Using generated search function
.tail()
.filter_person() // Type-safe filter using generated helper
.map(|v, _| {
// Use projection for type-safe property access
v.project::<Person<_>>().unwrap().name().to_string()
})
.collect::<Vec<_>>();
Advanced Usage
Graph API supports more advanced features that we'll explore in later sections:
- Complex traversals with the Walker API
- Different index types for optimized queries
- Transaction support (with appropriate graph implementations)
- Custom property types
Next Steps
Now that you have created your first graph, you can:
- Learn about defining a model for your graph data
- Explore basic operations for working with graphs
- Discover graph traversal techniques using walkers
Defining a Model
Overview
Graph API provides a flexible way to define your graph data model using Rust's enum types and derive macros. This approach gives you the benefits of Rust's type system while maintaining the flexibility of property graphs.
Basic Concepts
A graph consists of two primary elements:
- Vertices (nodes): The entities in your graph
- Edges: The relationships connecting vertices
For each of these, you'll define a Rust enum that represents all possible types.
Model Definition
The equivalent definition using Graph API is:
// Define vertex types for a social media application
#[derive(Debug, Clone, VertexExt)]
pub enum Vertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(hash)] // Hash index for exact lookups
username: String,
#[index(full_text)] // Full-text index for text search
biography: String,
#[index(range)] // Range index for range queries
age: u8,
},
// Project vertex with minimal properties
Project {
name: String,
},
// Comment vertex
Comment {
text: String,
date: String,
},
}
// Define edge types that connect vertices
#[derive(Debug, Clone, EdgeExt)]
pub enum Edge {
// Simple edges without properties
Created,
Follows,
// Edges with properties
Liked { timestamp: String },
Commented { timestamp: String },
}
This model defines vertex types for people, projects, and comments, along with edge types for the relationships between them.
Creating Instances
Once you've defined your model, you can create instances of vertices and edges:
let mut graph = SimpleGraph::new();
// Create vertices
let bryn = graph.add_vertex(Vertex::Person {
name: "Bryn".to_string(),
username: "bryn123".to_string(),
biography: "Graph enthusiast".to_string(),
age: 28,
});
let julia = graph.add_vertex(Vertex::Person {
name: "Julia".to_string(),
username: "julia456".to_string(),
biography: "Software developer".to_string(),
age: 34,
});
let eve = graph.add_vertex(Vertex::Person {
name: "Eve".to_string(),
username: "eve789".to_string(),
biography: "Network specialist".to_string(),
age: 31,
});
let graph_api = graph.add_vertex(Vertex::Project {
name: "GraphApi".to_string(),
});
let alpaca = graph.add_vertex(Vertex::Project {
name: "Alpaca".to_string(),
});
// Create edges
graph.add_edge(bryn, graph_api, Edge::Created);
graph.add_edge(julia, alpaca, Edge::Created);
graph.add_edge(julia, bryn, Edge::Follows);
graph.add_edge(eve, julia, Edge::Follows);
graph.add_edge(bryn, eve, Edge::Follows);
graph.add_edge(
bryn,
alpaca,
Edge::Liked {
timestamp: "2023-01-01".to_string(),
},
);
graph.add_edge(
bryn,
alpaca,
Edge::Commented {
timestamp: "2023-01-02".to_string(),
},
);
Using Derive Macros
The VertexExt
and EdgeExt
derive macros generate implementations for your model types that enable them to work with
Graph API's traversal and query features.
VertexExt
This macro provides:
- Integration with the indexing system
- Projection types
- Type-safe accessors for properties
- Type-safe filters for traversals
EdgeExt
This macro provides:
- Integration with label-based indexing
- Projection types
- Type-safe accessors for properties
- Type-safe filters for traversals
Indexing
You can define indexes for efficient lookups using attributes:
#[index(hash)]
: Creates a hash index for exact match lookups#[index(range)]
: Creates a range index for range queries#[index(full_text)]
: Creates a full-text index for text search
For more details on indexes and examples, see the Property Graphs section.
Best Practices
When defining your graph model:
- Use descriptive names - Choose clear names for vertex and edge types
- Index strategically - Only index fields used in frequent queries
- Use appropriate index types - Match index types to query patterns
Property Graphs
A property graph is a powerful data model that allows you to represent complex, interconnected data in a highly intuitive way. Graph API leverages Rust's type system to provide a strongly-typed property graph experience.
What are Property Graphs?
Property graphs consist of two primary elements:
-
Vertices (nodes): Represent entities in your data model. Each vertex has a type (label) and can hold multiple key-value properties.
-
Edges (relationships): Connect vertices to express relationships. Edges are directed, have a type (label), and can also hold properties.
Both vertices and edges can have properties (attributes) that store additional information. In Graph API, these properties are represented using Rust's enum types, giving you full type safety.
Why Use Property Graphs?
Property graphs excel at representing highly connected data where relationships are as important as the entities themselves. They're particularly useful for:
- Social networks (people, friendships, interests)
- Knowledge graphs (concepts, relationships)
- Recommendation systems (users, products, preferences)
- Network topologies (devices, connections)
- Dependency graphs (components, dependencies)
The key advantages of property graphs include:
- Intuitive modeling: Reflects how we naturally think about connected data
- Relationship-centric: Makes connections first-class citizens
- Flexible schema: Easily adapt to changing data models
- Performance: Efficient for traversal and relationship-based queries
Graph API's Approach to Property Graphs
Graph API uses Rust's strong type system to create a safe, ergonomic property graph experience:
- Enum-based modeling: Define vertices and edges using Rust enums
- Derive macros: Generate boilerplate code for traversal and querying
- Type-safe queries: Leverage Rust's type checking for query correctness
- Efficient indexing: First-class support for various index types
Property Graph Example
Here's a simple example of how Graph API models a property graph:
// Define vertex types for a social media application
#[derive(Debug, Clone, VertexExt)]
pub enum Vertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(hash)] // Hash index for exact lookups
username: String,
#[index(full_text)] // Full-text index for text search
biography: String,
#[index(range)] // Range index for range queries
age: u8,
},
// Project vertex with minimal properties
Project {
name: String,
},
// Comment vertex
Comment {
text: String,
date: String,
},
}
// Define edge types that connect vertices
#[derive(Debug, Clone, EdgeExt)]
pub enum Edge {
// Simple edges without properties
Created,
Follows,
// Edges with properties
Liked { timestamp: String },
Commented { timestamp: String },
}
Understanding Indexes
Indexes are a crucial part of an efficient graph database. They allow you to quickly locate vertices and edges without scanning the entire graph.
In Graph API, indexes are defined as part of your model using derive macros. The following sections explore different types of indexes and how to use them effectively:
- Exploring Without Indexes: Understand the challenges of graph traversal without indexes
- Hash Indexes: Fast lookups by exact property value
- Range Indexes: Range queries for numeric and range properties
- Full-text Indexes: Text search capabilities for string properties
Exploring Without Indexes
Before we dive into indexes, let's understand why they're necessary by exploring graph traversal without them.
The Challenge of Finding Starting Points
In a property graph, one of the biggest challenges is finding the right starting points for your traversal. Without indexes, your only option is to scan the entire graph, examining each vertex to find matches.
In this diagram:
- The graph contains several vertices (A-E), some of which have the property
name: "Alice"
. - The query at the top left indicates the desired starting condition: find vertices where
name = "Alice"
. - The blue arrows illustrate the process without an index: To satisfy the query, the traversal mechanism must
conceptually examine every single vertex (A, B, C, D, E) to check if its
name
property matches "Alice". This represents a full graph scan. - The orange highlighting on vertices
A
andC
shows the result of this scan – these are the only vertices found that satisfy the query condition.
This full scan becomes computationally expensive and slow, especially in large graphs, highlighting the need for indexes to quickly locate starting vertices.
Example: Graph Without Indexes
Let's consider a simple social network model without any indexes:
// Define vertex types for a social media application
#[derive(Debug, Clone, VertexExt)]
pub enum Vertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(hash)] // Hash index for exact lookups
username: String,
#[index(full_text)] // Full-text index for text search
biography: String,
#[index(range)] // Range index for range queries
age: u8,
},
// Project vertex with minimal properties
Project {
name: String,
},
// Comment vertex
Comment {
text: String,
date: String,
},
}
// Define edge types that connect vertices
#[derive(Debug, Clone, EdgeExt)]
pub enum Edge {
// Simple edges without properties
Created,
Follows,
// Edges with properties
Liked { timestamp: String },
Commented { timestamp: String },
}
Scanning the Entire Graph
Without indexes, the only way to find vertices matching specific criteria is to scan the entire graph.
Finding by Name (Without Index)
When a field isn't indexed (like name
in the Person
vertex), we have to scan all vertices to find matches:
// INEFFICIENT: Find all people named "Bryn" by scanning
// Since name is not indexed, we must scan all vertices
let bryn_vertices = graph
.walk()
.vertices(VertexSearch::scan()) // Must scan ALL vertices
.filter_by_person(|person, _| person.name() == "Bryn")
.collect::<Vec<_>>();
println!("Found {} vertices for Bryn", bryn_vertices.len());
Finding Projects (Without Index)
Similarly, to find a project by name, we need to scan the entire graph:
// INEFFICIENT: Find all projects with a specific name by scanning
// Since Project::name is not indexed, we must scan all vertices
let graphapi_projects = graph
.walk()
.vertices(VertexSearch::scan()) // Must scan ALL vertices
.filter_by_project(|project, _| project.name() == "GraphApi")
.collect::<Vec<_>>();
println!("Found {} GraphApi projects", graphapi_projects.len());
Performance Comparison
Let's compare the performance of a full scan versus using an index:
- Inefficient approach - scanning all vertices:
// COMPARISON: Using an index vs. not using an index
// 1. Inefficient: Find people with a specific username using a scan
let julia_by_scan = graph
.walk()
.vertices(VertexSearch::scan())
.filter_by_person(|person, _| person.username() == "julia456")
.collect::<Vec<_>>();
println!("Found {} with username julia456", julia_by_scan.len());
- Efficient approach - using the index:
// 2. Efficient: Find the same person using the username index
let julia_by_index = graph
.walk()
.vertices(Vertex::person_by_username("julia456"))
.collect::<Vec<_>>();
println!("Found {} with username julia456", julia_by_index.len());
Performance Implications
For small graphs, scanning might be acceptable, but as graphs grow, scanning becomes increasingly inefficient:
- Linear Time Complexity: Scanning requires examining every vertex, making it O(n) where n is the number of vertices
- Resource Intensive: Needs to load and process every vertex, even those not matching criteria
- Poor Scalability: Performance degrades linearly as the graph grows
When to Use Full Scans
Despite their inefficiency, full scans do have legitimate uses:
- During initial development: When you're still figuring out your data model
- For admin or maintenance tasks: When completeness is more important than speed
- For small graphs: When the overhead of maintaining indexes exceeds their benefit
- For exploratory analysis: When you need to examine all data without preconceptions
Moving Beyond Scans
In the following sections, we'll explore how different types of indexes can dramatically improve traversal performance by allowing you to:
- Find vertices by exact property values using standard indexes
- Search for vertices within ranges using range indexes
- Find vertices containing specific text using full-text indexes
Indexes provide the entry points for efficient graph traversal, turning potentially expensive operations into near-constant time lookups.
Hash Indexes
Hash indexes are the most common type of index in a property graph. They enable fast lookups for exact matches on property values, dramatically improving query performance.
What are Hash Indexes?
A hash index maps property values to vertices, allowing you to quickly find all vertices with a specific property value without scanning the entire graph. It uses a hash table data structure for O(1) lookups.
In this diagram:
- The graph on the right contains vertices (A, B, C, D) with properties.
- The index on the left is specifically for the
name
property. - The dashed arrows show the mapping:
- Looking up
"Alice"
in the index quickly leads to verticesA
andC
. - Looking up
"Bob"
leads to vertexB
.
- Looking up
This allows a query like "find all vertices where name is 'Alice'" to directly access nodes A and C via the index, instead of checking every vertex in the graph.
Defining Hash Indexes
In Graph API, you define a hash index by adding the #[index(hash)]
attribute to a field in your vertex enum:
#[derive(Debug, VertexExt)]
pub enum IndexedVertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(hash)] // Hash index for exact lookups
username: String,
},
}
How Hash Indexes Work
When you apply the #[index(hash)]
attribute to a field:
- The derive macro generates a hash index entry for that field
- The graph implementation maintains a hash map from property values to vertices
- When you query using the index, the graph can directly look up matching vertices in constant time
Querying with Hash Indexes
The real power of hash indexes becomes apparent when querying the graph:
// Find a person by their username (using hash index)
let julia = graph
.walk()
.vertices(Vertex::person_by_username("julia456"))
.first();
println!("Found Julia: {}", julia.is_some());
Performance Benefits
Hash indexes offer significant performance advantages:
- Constant time lookups: O(1) rather than O(n) for full scans
- Reduced memory pressure: Only relevant vertices are loaded
- Improved scalability: Performance remains consistent as the graph grows
When to Use Hash Indexes
Hash indexes are ideal for:
- Unique identifiers: User IDs, product codes, etc.
- Common lookup fields: Names, titles, categories
- Fields used in equality predicates: Where you need exact matches
Best Practices
When using hash indexes:
- Be selective: Only index fields you frequently query
- Choose appropriate fields: Index fields with high selectivity
- Consider cardinality: Fields with many unique values benefit most from indexing
- Balance maintenance cost: Each index adds storage and update overhead
Index Limitations
Hash indexes have some limitations:
- Exact matches only: Cannot handle range or partial text queries
- Memory overhead: Each index increases memory usage
- Update cost: Indexes must be maintained when data changes
For range queries or partial text matching, consider range indexes or full-text indexes respectively.
Range Indexes
Range indexes enable efficient querying for values within a specific range, such as finding people of a certain age range or products within a price bracket.
What are Range Indexes?
A range index organizes data in a way that optimizes searches for values within ranges rather than exact matches. This allows for efficient "greater than," "less than," and "between" queries.
Consider an index on an age
property:
In this diagram:
- The graph on the right contains vertices (A, B, C, D) with an
age
property. - The range index on the left stores
(age, vertex)
pairs, crucially sorted by age. Notice howage: 35
appears twice, pointing to bothB
andD
. - When a range query like
age >= 35
is executed:- The index efficiently locates the starting point for the value
35
. - It then scans forward through the sorted index entries (35, 35, 42) until the condition is no longer met.
- The vertices associated with these index entries (
B
,D
,C
) are identified as the result.
- The index efficiently locates the starting point for the value
- The orange highlighting shows the portion of the index scanned (
age >= 35
) and the resulting vertices (B
,D
,C
) in the graph. - Blue arrows point from the selected index entries to the corresponding graph vertices.
This is much faster than checking the age
property of every single vertex in the graph for large datasets.
Defining Range Indexes
In Graph API, you define a range index by adding the #[index(range)]
attribute to a field in your vertex or edge enum:
#[derive(Debug, VertexExt)]
pub enum IndexedVertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(range)] // Range index for range lookups
age: u8,
},
}
How Range Indexes Work
When you apply the #[index(range)]
attribute to a field:
- The derive macro generates a range index entry for that field
- The graph implementation maintains an ordered data structure mapping property values to vertices
- When you make a range query, the graph can efficiently find all values within the specified range
Range indexes typically use ordered data structures like B-trees or skip lists to enable efficient range lookups.
Querying with Range Indexes
The primary benefit of range indexes is the ability to perform efficient range queries:
// Find people within a specific age range (29-35)
let age_range_results = graph
.walk()
.vertices(Vertex::person_by_age_range(29..36))
.collect::<Vec<_>>();
println!("Found {} people aged 29-35", age_range_results.len());
Performance Benefits
Range indexes offer significant performance advantages for range-based queries:
- Logarithmic lookup time: O(log n) rather than O(n) for full scans
- Reduced memory pressure: Only relevant vertices are loaded
- Efficient for slices: Get only the elements within a specific range
When to Use Range Indexes
Range indexes are ideal for:
- Numeric properties: Age, price, quantity, ratings
- Date/time values: Timestamps, creation dates, expiration dates
- Sequential identifiers: Version numbers, sequential IDs
- Properties used in inequality filters: Where you need >, <, >=, <= comparisons
Best Practices
When using range indexes:
- Apply to ordered data: Only index fields where ordering is meaningful
- Consider data distribution: Range indexes work best with evenly distributed values
- Time vs. space tradeoff: Range indexes may use more space than hash indexes
- Use for common queries: Index fields frequently used in range filters
- Choose proper data types: Use comparable types that have a meaningful ordering
Index Limitations
Range indexes have some limitations:
- Storage overhead: Typically larger than hash indexes
- Update cost: Maintaining order requires more work when updating
- Not for full-text search: For text search use full-text indexes instead
Supported Types
Range indexes can be applied to:
- Numeric types: integers (
u8
,i32
, etc.), floating-point numbers (f32
,f64
) - String types:
String
,&str
(lexicographic ordering) - Date/time types: When represented as ISO-8601 strings or numeric timestamps
- Other ordered types: Any type that implements
Ord
and appropriate serialization
Full-text Indexes
Full-text indexes enable powerful text search capabilities, allowing you to find vertices containing specific words or phrases within a text field.
What are Full-text Indexes?
A full-text index is a specialized index that processes text fields to enable efficient searching based on word content. Unlike standard indexes that require exact matches, full-text indexes allow you to find vertices that contain specific words, regardless of their position within the text.
Consider an index on a description
property:
In this diagram:
- The graph on the right has vertices (A, B, C) with text
description
properties. - The full-text index on the left is an inverted index:
- It lists processed tokens (like 'fast', 'graph', 'traversal').
- For each token, it points to the vertices (
A
,B
,C
) whosedescription
contains that token after processing. Note how 'graph' points to bothA
andB
.
- When a query like
description CONTAINS 'traversal'
is performed:- The index is used to look up the token 'traversal'.
- The index directly provides the list of matching vertices:
[ B, C ]
.
- The orange highlighting shows the index entry for 'traversal' being used and the resulting vertices (
B
,C
) identified in the graph. - Blue arrows point from the selected index entry to the corresponding graph vertices.
This approach is fundamental to searching documentation, product descriptions, user comments, or any unstructured text associated with graph elements.
Defining Full-text Indexes
In Graph API, you define a full-text index by using the #[index(full_text)]
attribute on string fields:
#[derive(Debug, VertexExt)]
pub enum IndexedVertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(hash)] // Hash index for exact lookups
username: String,
},
}
How Full-text Indexes Work
Behind the scenes, full-text indexes:
- Process text by splitting into words (tokenization)
- Normalize words (lowercasing, removing punctuation)
- Create an inverted index mapping words to vertices
- Enable efficient lookup by word or phrase
Querying with Full-text Indexes
Full-text indexes dramatically simplify text search operations:
// Find people with "developer" in their biography
let developers = graph
.walk()
.vertices(Vertex::person_by_biography("developer"))
.collect::<Vec<_>>();
println!("Found {} people who are developers", developers.len());
Performance Benefits
Full-text indexes provide significant advantages for text search:
- Efficient keyword matching: Find text containing specific words without scanning
- Reduced memory requirements: Only load relevant vertices
- Better user experience: Enable natural language search patterns
- Improved relevance: Return results based on word presence rather than exact matches
When to Use Full-text Indexes
Full-text indexes are ideal for:
- Content search: Articles, posts, descriptions
- User profiles: Biographies, skills, interests
- Product descriptions: Features, benefits, specifications
- Documentation: API details, manuals, guides
- Search functionality: Implementing search features in your application
Best Practices
When using full-text indexes:
- Choose appropriate fields: Apply to content-rich text fields
- Consider search patterns: Think about how users will search
- Balance with standard indexes: Use standard indexes for fields requiring exact matches
- Be mindful of size: Full-text indexes can be larger than standard indexes
Limitations
Full-text indexes have some limitations:
- String fields only: Only applicable to string properties
- Implementation dependent: Search capabilities vary by graph implementation
- Tokenization limitations: Basic word splitting may not handle all languages equally
- Update complexity: Maintaining the index adds overhead during updates
For range-based queries, see range indexes.
Derive Macros
Graph API's derive macros automatically generate code that integrates your custom types with the Graph API framework, making your graph operations type-safe and ergonomic. This section explains the types that are generated and how to use them effectively.
Overview
Graph API provides two primary derive macros:
VertexExt
- For vertex enum typesEdgeExt
- For edge enum types
// Create a new graph with our model
let mut graph = SimpleGraph::<Vertex, Edge>::new();
// Add vertices of different types
let alice = graph.add_vertex(Vertex::Person {
name: "Alice".to_string(),
username: "alice_dev".to_string(),
biography: "Software engineer and graph enthusiast".to_string(),
age: 29,
});
let project = graph.add_vertex(Vertex::Project {
name: "GraphDB".to_string(),
});
// Add edges connecting vertices
graph.add_edge(alice, project, Edge::Created);
// Use type-safe projections for property access
let person_info = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person() // Generated helper method
.map(|v, _| {
// Use type-safe projection
let person = v.project::<Person<_>>().unwrap();
// Return formatted string with person information
format!(
"Person: {} (@{})\n Bio: {}\n Age: {}",
person.name(),
person.username(),
person.biography(),
person.age()
)
})
.collect::<Vec<_>>();
// Print the collected information
for info in person_info {
println!("{}", info);
}
Generated Types for VertexExt
When you apply #[derive(VertexExt)]
to an enum, several useful types are generated:
1. Label Enum (VertexLabel)
A type-safe enum representing the variants of your vertex enum:
// Given this vertex definition
#[derive(VertexExt)]
enum Vertex {
Person { /* fields */ },
Project { /* fields */ },
Comment { /* fields */ },
}
// This is generated
pub enum VertexLabel {
Person,
Project,
Comment,
}
Purpose: Provides type-safe representations of vertex types, used for label-based indexing and filtering.
2. Index selectors
Selectors to make querying easier:
impl Vertex {
// Find all Person vertices
pub fn person() -> VertexSearch { /* ... */ }
// Find Person vertices by indexed properties
pub fn person_by_username(username: &str) -> VertexSearch { /* ... */ }
pub fn person_by_biography(text: &str) -> VertexSearch { /* ... */ }
pub fn person_by_age_range(range: Range<u8>) -> VertexSearch { /* ... */ }
}
Purpose: Enables efficient querying of vertices by label and properties.
3. Projection Types
For each variant with fields, two projection structs are generated:
// For Person variant
pub struct Person<'a, V> {
// Immutable references to fields
name: &'a String,
username: &'a String,
biography: &'a String,
age: &'a u8,
unique_id: &'a Uuid,
}
pub struct PersonMut<'a, V> {
// Mutable references to fields
name: &'a mut String,
username: &'a mut String,
biography: &'a mut String,
age: &'a mut u8,
unique_id: &'a mut Uuid,
}
Purpose: Provides type-safe access to variant fields without manually matching on enum variants.
4. Filter Extension Traits
Extension methods for the walker builder to filter by vertex type:
// On WalkerBuilder
// For all Person vertices
fn filter_person(self) -> Self;
// For Person vertices with custom criteria
fn filter_by_person<F>(self, filter: F) -> Self
where
F: Fn(Person<Vertex>, &Context) -> bool;
Purpose: Enables type-safe filtering of vertices during traversals.
Generated Types for EdgeExt
When you apply #[derive(EdgeExt)]
to an enum, similar types are generated:
1. Label Enum (EdgeLabel)
// Given this edge definition
#[derive(EdgeExt)]
pub enum Edge {
Created,
Follows,
Liked { timestamp: String },
Commented { timestamp: String },
}
// This is generated
pub enum EdgeLabel {
Created,
Follows,
Liked,
Commented,
}
Purpose: Provides type-safe representations of edge types, used for label-based indexing and filtering.
2. Index selectors
Selectors to make querying easier
impl Edge {
// Find all Created edges
pub fn created() -> EdgeSearch { /* ... */ }
// Find all Follows edges
pub fn follows() -> EdgeSearch { /* ... */ }
// Find all Liked edges
pub fn liked() -> EdgeSearch { /* ... */ }
// Find all Commented edges
pub fn commented() -> EdgeSearch { /* ... */ }
}
Purpose: Enables efficient querying of edges by label.
3. Projection Types
For each variant with fields, projection structs are generated:
// For Liked variant
pub struct Liked<'a, E> {
timestamp: &'a String,
}
pub struct LikedMut<'a, E> {
timestamp: &'a mut String,
}
Purpose: Provides type-safe access to edge properties.
4. Filter Extension Traits
Extension methods for the walker builder to filter by edge type:
// On WalkerBuilder
// For all Liked edges
fn filter_liked(self) -> Self;
// For Liked edges with custom criteria
fn filter_by_liked<F>(self, filter: F) -> Self
where
F: Fn(Liked<Edge>, &Context) -> bool;
Purpose: Enables type-safe filtering of edges during traversals.
Using the Generated Types
Vertex Querying
// Find all Person vertices
let people = graph
.walk()
.vertices(Vertex::person())
.collect::<Vec<_>>();
// Find Person vertices with a specific username
let user = graph
.walk()
.vertices(Vertex::person_by_username("bryn123"))
.first()
.unwrap();
// Find Person vertices in an age range
let adults = graph
.walk()
.vertices(Vertex::person_by_age_range(18..65))
.collect::<Vec<_>>();
Edge Traversal
// Find outgoing Created edges from a person
let created_projects = graph
.walk()
.vertices_by_id([person_id])
.edges(Edge::created().outgoing())
.tail() // Follow edges to target vertices
.collect::<Vec<_>>();
Type-Safe Filtering
// Find Person vertices that match specific criteria
let experienced_devs = graph
.walk()
.vertices(Vertex::person())
.filter_by_person(|person, _| {
person.age() > 25 && person.biography().contains("developer")
})
.collect::<Vec<_>>();
Projection for Type Safety
// Working with a vertex reference
if let Some(vertex_ref) = graph.vertex(vertex_id) {
// Project to Person variant
if let Some(person) = vertex_ref.project::<Person<_>>() {
println!("Name: {}, Age: {}", person.name(), person.age());
}
}
Index Attributes
You can apply these attributes to fields to control indexing behavior:
#[index]
- Standard index for exact match queries#[index(range)]
- Range index for range queries#[index(full_text)]
- Full-text index for text search (String fields only)
#[derive(VertexExt)]
enum User {
Profile {
#[index] // Standard index
username: String,
#[index(range)] // Range queries possible
age: u32,
#[index(full_text)] // Text search possible
bio: String,
// Not indexed
email: String,
}
}
Best Practices
-
Selective Indexing: Only add indexes to fields you'll frequently query, as each index increases memory usage.
-
Choose Appropriate Index Types:
- Use
#[index]
for exact match lookups - Use
#[index(range)]
for numeric fields that need range queries - Use
#[index(full_text)]
for text fields that need substring or keyword search
- Use
-
Use Type-Safe Methods:
- Prefer
filter_by_person()
over genericfilter()
with manual pattern matching - Use the generated search methods (
person_by_username()
, etc.) for efficient queries
- Prefer
-
Leverage Projections:
- Use the projection types to safely access variant fields without repetitive pattern matching
- This makes your code more maintainable and less error-prone
-
Consider Query Performance:
- Using an indexed search in the initial
vertices()
step is typically more efficient than scanning and then filtering - The more you can narrow your search using indexes, the better your graph traversal performance
- Using an indexed search in the initial
For more detailed information on the derive macros, see the API Reference.
Basic Operations
Overview
Once you've defined your graph model, you can perform basic operations like adding vertices and edges, querying the graph, and modifying elements. This guide covers the fundamental operations available in Graph API.
Creating a Graph
To start working with graphs, first create a new graph instance:
// Import the standard model and create a new graph
let mut graph = SimpleGraph::<Vertex, Edge>::new();
Adding Vertices
Once you have a graph instance, you can add vertices to it:
// Add a person vertex
let bryn = graph.add_vertex(Vertex::Person {
name: "Bryn".to_string(),
username: "bryn123".to_string(),
biography: "Graph enthusiast".to_string(),
age: 28,
});
// Add a project vertex
let project = graph.add_vertex(Vertex::Project {
name: "GraphAPI".to_string(),
});
// Add a comment vertex
let comment = graph.add_vertex(Vertex::Comment {
text: "Great project!".to_string(),
date: "2023-05-15".to_string(),
});
Adding Edges
Edges connect vertices in your graph:
// Add simple edges
graph.add_edge(bryn, project, Edge::Created);
graph.add_edge(julia, bryn, Edge::Follows);
// Add edges with properties
graph.add_edge(
bryn,
project,
Edge::Liked {
timestamp: "2023-01-15".to_string(),
},
);
Querying Vertices
You can query vertices using the walker API. Here are some examples:
Full Scan
When you need to find all vertices in the graph:
// Get all vertices in the graph
let all_vertices = graph
.walk()
.vertices() // Start with all vertices
.collect::<Vec<_>>();
// Count vertices by type using type-safe filters
let person_count = graph
.walk()
.vertices()
.filter_person() // Use generated helper method
.count();
let project_count = graph
.walk()
.vertices()
.filter_project() // Use generated helper method
.count();
Label-Based Lookup
For more efficient queries, use label-based indexes:
// Find all Person vertices using label index
let all_people = graph
.walk()
.vertices()
.filter_person() // Use generated helper for label filtering
.collect::<Vec<_>>();
// Find all Project vertices using label index
let all_projects = graph
.walk()
.vertices()
.filter_project() // Use generated helper for label filtering
.collect::<Vec<_>>();
Property Filtering
Filter vertices based on their properties:
// Find people over 30 years old
let older_people = graph
.walk()
.vertices()
.filter_by_person(|person, _| {
// Type-safe access to Person properties
person.age() > 30
})
.map(|v, _| {
// Use type-safe projection and accessor methods
let person = v.project::<Person<_>>().unwrap();
format!("{} ({})", person.name(), person.age())
})
.collect::<Vec<_>>();
// Find people with "programmer" in their biography
let programmers = graph
.walk()
.vertices()
.filter_by_person(|person, _| {
// Type-safe access to Person properties
person.biography().contains("programmer")
})
.map(|v, _| {
// Use type-safe projection and accessor methods
v.project::<Person<_>>().unwrap().name().to_string()
})
.collect::<Vec<_>>();
Working with Edges
The walker API also provides powerful tools for working with edges:
Finding Connected Edges
Get all edges connected to a vertex:
// Get all edges connected to a specific vertex
let connected_edges = graph
.walk()
.vertices_by_id(vec![person_id])
.edges(EdgeSearch::scan()) // Get all connected edges
.collect::<Vec<_>>();
Directional Edge Queries
Specify whether you want incoming or outgoing edges:
// Get only outgoing edges
let outgoing_edges = graph
.walk()
.vertices_by_id(vec![person_id])
.edges(EdgeSearch::scan().outgoing()) // Only outgoing edges
.collect::<Vec<_>>();
// Get only incoming edges
let incoming_edges = graph
.walk()
.vertices_by_id(vec![person_id])
.edges(EdgeSearch::scan().incoming()) // Only incoming edges
.collect::<Vec<_>>();
Label-Based Edge Filtering
Filter edges by their label:
// Find all "Created" edges in the graph
let creation_edges = graph
.walk()
.edges(EdgeSearch::scan())
.filter_created() // Type-safe filter using generated helper
.collect::<Vec<_>>();
// Find all "Follows" edges in the graph
let follow_edges = graph
.walk()
.edges(EdgeSearch::scan())
.filter_follows() // Type-safe filter using generated helper
.collect::<Vec<_>>();
// Find all edges with timestamp properties
let timestamped_edges = graph
.walk()
.edges(EdgeSearch::scan())
.filter(|e, _| match e.weight() {
Edge::Liked { timestamp } => true,
Edge::Commented { timestamp } => true,
_ => false,
})
.collect::<Vec<_>>();
Modifying the Graph
You can modify the graph during traversal using the mutation API:
Updating Vertex Properties
// Update all Person vertices to increment their age
let updated_count = graph
.walk_mut()
.vertices()
.filter_person() // Type-safe filter using generated helper
.mutate(|v| {
// Type-safe mutation using projection
if let Some(mut person) = v.project_mut::<Person<_>>() {
// Update the age property
let current_age = person.age();
person.set_age(current_age + 1);
true // Indicate that we modified the vertex
} else {
false // No modification
}
});
// Add a property to a specific person
graph
.walk_mut()
.vertices()
.filter_by_person(|p, _| p.username() == "bryn123")
.mutate(|v| {
// Type-safe mutation using projection
if let Some(mut person) = v.project_mut::<Person<_>>() {
// Update the biography property
person.set_biography("Updated biography: Graph API expert");
true // Indicate that we modified the vertex
} else {
false // No modification
}
});
Verifying Changes
After modification, you can verify the changes:
// Check that Bryn's biography was updated
let updated_bio = graph
.walk()
.vertices()
.filter_by_person(|p, _| p.username() == "bryn123")
.map(|v, _| {
// Type-safe access using projection
v.project::<Person<_>>().unwrap().biography().to_string()
})
.first();
// Verify all people are now older
let all_ages = graph
.walk()
.vertices()
.filter_person() // Type-safe filter using generated helper
.map(|v, _| {
// Type-safe access using projection
let person = v.project::<Person<_>>().unwrap();
(person.name().to_string(), person.age())
})
.collect::<Vec<_>>();
Next Steps
Now that you understand the basic operations, you can:
- Learn about Advanced Traversals using the Walker API
- Explore Context and State in graph traversals
- See Examples of Walker Steps to build more complex queries
Graph Traversal
Overview
The Graph API provides a powerful traversal interface called "walkers" that enables you to navigate and analyze your graph in a flexible, type-safe way. Walkers are the primary method for querying and manipulating graph data.
Basic Concepts
Walkers are built by chaining steps that define how to traverse the graph. Each step performs a specific operation, such as:
- Starting at specific vertices
- Moving to connected edges
- Filtering elements
- Collecting data
- Modifying the graph
Traversal Order (Depth-First)
By default, the Graph API walker performs a Depth-First Search (DFS) traversal. This means it explores as far as possible along each branch before backtracking.
In the diagram above:
- The traversal starts at node A.
- It explores the left branch first: A → B → D → E.
- After reaching the end of the left branch (E), it backtracks to B, then A.
- It then explores the right branch: A → C → F.
- The final DFS order is indicated by the numbers: A(1) → B(2) → D(3) → E(4) → C(5) → F(6).
- The orange arrows highlight the path taken during the traversal.
This DFS behaviour is fundamental to how walkers navigate the graph.
Examples
Basic Traversal
Here's an example that finds all Project vertices that were created by someone that a Person knows:
// Basic traversal example showing a complex path through the graph
pub fn basic_traversal_example<G>(graph: &G)
where
G: Graph<Vertex = Vertex, Edge = Edge> + SupportsVertexLabelIndex + SupportsEdgeLabelIndex,
{
// Find all Person vertices who know someone who created a Project
let _results = graph
.walk()
.vertices(Vertex::person()) // Start with Person vertices
.edges(Edge::follows()) // Follow "follows" edges
.tail() // Move to the target Person
.edges(Edge::created()) // Follow "created" edges
.tail() // Move to the Project
.filter_project()
.collect::<Vec<_>>(); // Collect results
}
This traversal:
- Starts with Person vertices
- Follows "knows" edges to find friends
- Follows "created" edges from friends
- Filters for Project vertices
- Collects the results
Working with Context
Walkers have a built-in context system that allows you to store and transform data as you traverse the graph:
// Context example showing how to calculate the total age of all friends
pub fn context_traversal_example<G>(graph: &G, person_id: G::VertexId)
where
G: Graph<Vertex = Vertex, Edge = Edge> + SupportsEdgeLabelIndex,
{
// Calculate total age of all friends of a person
let _total_age = graph
.walk()
.vertices_by_id(vec![person_id])
.edges(Edge::follows())
.tail()
.filter_person()
.push_context(|v, _| {
// Store age in context
if let Some(person) = v.project::<Person<_>>() {
person.age()
} else {
0
}
})
.fold(0, |acc, _, age| acc + age.deref());
}
In this example, we:
- Start with a specific person
- Find all friends (via "knows" edges)
- Filter for Person vertices
- Store each person's age in the context
- Sum up all the ages
Walker Steps
The Graph API provides many steps that can be combined to create complex traversals. Each step is documented with:
See the Walker Steps section for detailed documentation on each available step.
Walker Overview
Walkers are the central concept in Graph API traversals. They provide a fluent interface for exploring and manipulating graph data.
What are Walkers?
A walker represents a traversal through a graph. It consists of:
- A current position in the graph (vertices or edges)
- Context data that can be carried along the traversal
- A sequence of steps that define the traversal path
Walkers use a builder pattern where each step returns a new walker with the accumulated operations.
How Walkers Work
- Start with an empty walker using
graph.walk()
- Chain steps to define your traversal:
.vertices().edges().tail()
- End with a terminal operation:
.collect()
,.first()
,.count()
, etc.
Each step in the chain modifies the traversal in a specific way, moving to different elements, filtering, or collecting data.
Types of Walkers
The Graph API has two main types of walkers:
- Vertex Walkers: Traverse vertex elements
- Edge Walkers: Traverse edge elements
Some operations switch between these types. For example, .edges()
converts a vertex walker to an edge walker, while
.head()
and .tail()
convert an edge walker to a vertex walker.
Walker States
A walker can be in one of several states:
- Empty: No elements to traverse (starting state)
- Active: Contains elements to traverse
- Terminal: Has performed its final operation
Creating Walkers
You create a walker by calling the walk()
method on a graph:
let walker = graph.walk();
This returns an empty walker that can be used to start your traversal.
Starting Traversals
There are several ways to populate a walker with initial elements:
// Start with all vertices
graph.walk().vertices(VertexSearch::scan())
// Start with specific vertex IDs
graph.walk().vertices_by_id(vec![vertex_id1, vertex_id2])
// Start with vertices matching criteria
graph.walk().vertices(VertexSearch::scan().with_label(Person::label()))
Walker Flow Control
Walkers provide several ways to control the traversal flow:
- Filtering: Keep only elements matching a condition
- Limiting: Restrict the number of elements processed
- Branching: Create sub-traversals that explore different paths
- Early Termination: Stop traversal after finding a match
Example Walker Traversals
Basic Traversal
// Find all Project vertices created by a Person
let _results = graph
.walk()
.vertices(Vertex::person()) // Start with all Person vertices
.edges(Edge::created()) // Follow "Created" edges
.tail() // Move to the target vertices (Projects)
.collect::<Vec<_>>(); // Collect the Project vertices
Multi-Step Traversal
// Find all people who follow someone who created a project
let _results = graph
.walk()
.vertices(Vertex::person()) // Start with all Person vertices
.edges(Edge::follows()) // Follow "follows" edges
.tail() // Move to the followed person
.edges(Edge::created()) // Find "created" edges from these people
.tail() // Move to the Project vertices
.collect::<Vec<_>>(); // Collect results
Traversal with Detour
// For each person, find the projects they created
let _results = graph
.walk()
.vertices(Vertex::person()) // Start with all Person vertices
.detour(|person_walker| {
// For each person, collect the projects they created
person_walker
.edges(Edge::created()) // Follow "created" edges
.tail() // Move to the target (projects)
.take(1) // Only need one match to qualify
})
// Continue with original person vertices that have created projects
.collect::<Vec<_>>(); // Collect results
Next Steps
To learn more about specific walker steps, see the Walker Steps documentation.
Walker Steps
Walker steps are the building blocks for graph traversals. Each step performs a specific operation on the traversal, such as moving the position, filtering elements, or collecting results.
What are Walker Steps?
Walker steps are chainable operations that build a graph traversal. Each step performs a specific operation on the traversal, such as:
- Moving the traversal position (vertices, edges, head, tail)
- Filtering elements (filter)
- Limiting results (limit, first)
- Collecting results (collect)
- Modifying the graph (mutate)
- And many more
Available Steps
Traversal Initiation
- vertices - Start traversal from vertices matching criteria
- vertices_by_id - Start traversal from vertices with specific IDs
Traversal Movement
- detour - Create a sub-traversal from the current position
- edges - Traverse along edges
- head - Move to source vertices of edges
- tail - Move to target vertices of edges
Filtering and Limiting
- filter - Filter elements based on a predicate
- first - Get only the first element
- take - Take a specified number of elements
Context and Data Handling
- push_context - Associate custom data with traversal elements
- default_context - Use predefined context for common patterns
- mutate_context - Modify context during traversal
Terminal Operations
- collect - Gather results into a collection
- count - Count elements in the traversal
- into_iter - Convert traversal to an iterator
- fold - Fold elements into an accumulated value
- map - Transform elements during traversal
- reduce - Combine elements using a reduction function
Control Flow
- control_flow - Control traversal flow and early termination
Side effects
Debugging
- dbg - Print debug information during traversal
Vertices Step
The vertices
step starts a traversal by selecting an initial set of vertices from the graph based on specified criteria (e.g., using an index or a scan). This step initiates the stream of elements for subsequent walker operations.
In this diagram:
- The Conceptual Graph Vertices represent the available vertices in the graph (A: Person, B: Product, C: Person, D: Review).
- The
graph.walk().vertices(Vertex::person())
step is applied, using criteria (like a label index) to select only "Person" vertices. - The Output Stream contains only the selected vertices A and C, which match the criteria and become the initial elements for the rest of the traversal.
Syntax
graph.walk().vertices(search_criteria)
Where search_criteria
is a VertexSearch
object or a predefined search from an index.
Parameters
search_criteria
: AVertexSearch
object that defines the criteria for selecting vertices
Return Value
Returns a new walker positioned at the vertices matching the search criteria.
Examples
Full Scan
When you need to find all vertices in a graph:
// Scan all vertices in the graph
// This performs a full graph scan, which can be expensive for large graphs
let all_vertices = graph
.walk()
.vertices(VertexSearch::scan())
.collect::<Vec<_>>();
println!("Found {} total vertices in the graph", all_vertices.len());
Using a Label Index
For more efficient queries, use label-based indexes:
// Use a label-based index for more efficient lookups
// This narrows the search to only person vertices
let people = graph.walk().vertices(Vertex::person()).collect::<Vec<_>>();
println!("Found {} person vertices", people.len());
Property-Based Filtering
Find vertices based on their properties:
// Use property-based filtering
// This finds vertices with a specific property value
let people_named_bob = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person() // First filter by vertex type
.filter_by_person(|person, _| {
// Then use type-safe accessor methods
person.name() == "Julia"
})
.collect::<Vec<_>>();
println!("Found {} people named Bob", people_named_bob.len());
Combined Filtering
Chain multiple conditions for complex queries:
// Combine filtering to find young people
// Filter after retrieval when specialized indexes aren't available
let young_people = graph
.walk()
.vertices(Vertex::person()) // Get all Person vertices
.filter_by_person(|person, _| {
// Use type-safe accessor methods
person.age() < 30 // Find people under 30
})
.collect::<Vec<_>>();
println!("Found {} people under age 30", young_people.len());
Best Practices
- Start with the most specific index when possible instead of scanning all vertices
- Use specialized indexes for frequently queried properties to improve performance
- Combine multiple search criteria to narrow results early in the traversal
- For ordered results, rely on range indexes rather than sorting later
Common Use Cases
- Entity retrieval: Finding vertices of a specific type (e.g., all users, products, etc.)
- Initial selection: Starting traversals by selecting entry points based on criteria
- Filtered starting sets: Beginning with a targeted subset that matches complex conditions
- Index-driven queries: Leveraging custom indexes for specialized lookups based on specific properties
Vertices By ID Step
The vertices_by_id
step starts a traversal from a specific set of vertices identified by their IDs. This is typically the most efficient way to start a traversal when you know exactly which vertices you want to begin with.
In this diagram:
- The Conceptual Graph Vertices represent available vertices V1 (id=1), V2 (id=2), V3 (id=3), and V4 (id=4).
- The
graph.walk().vertices_by_id([1, 3])
step is applied, specifying IDs 1 and 3. - Pointers indicate that the step uses these IDs to look up the corresponding vertices (V1 and V3) from the graph.
- The Output Stream contains only the selected vertices V1 and V3, which become the initial elements for the rest of the traversal.
Syntax
graph.walk().vertices_by_id(ids)
Where ids
is an iterator yielding vertex IDs.
Parameters
ids
: An iterator yielding vertex IDs to include in the traversal
Return Value
Returns a new walker positioned at the vertices with the specified IDs.
Examples
Basic Usage
Start a traversal with specific vertex IDs:
// Start a traversal with specific vertex IDs
let specific_vertices = graph
.walk()
.vertices_by_id(vec![bryn_id, julia_id])
.collect::<Vec<_>>();
println!("Found {} specific vertices", specific_vertices.len());
Following Relationships
Start from a specific vertex and follow its relationships:
// Start with specific vertices and follow relationships
let knows_relationships = graph
.walk()
.vertices_by_id(vec![bryn_id])
.edges(Edge::follows().outgoing())
.tail()
.collect::<Vec<_>>();
println!("Person knows {} other people", knows_relationships.len());
Using Dynamically Collected IDs
Use IDs collected from a previous traversal:
// Using dynamically collected IDs
// First, find all project vertices
let project_vertices = graph.walk().vertices(Vertex::project()).collect::<Vec<_>>();
// Use those IDs to start a new traversal
let projects = graph
.walk()
.vertices_by_id(project_vertices)
.collect::<Vec<_>>();
println!("Found {} projects using vertices_by_id", projects.len());
Best Practices
- Prefer
vertices_by_id
over other methods when you already have the exact IDs - Be aware that invalid IDs are silently skipped rather than causing errors
- Keep track of the original ID order if the order of results matters
- Consider batching large ID collections for better performance in extensive traversals
Common Use Cases
- Known entry points: Starting traversals from specific, known vertices
- Multi-stage traversals: Using the results of one traversal as the starting point for another
- External ID mapping: Starting from IDs provided by external systems or caches
- Selective subgraph processing: Working with a specific subset of vertices identified by ID
Edges Step
The edges
step traverses from vertices to their connecting edges, allowing navigation along relationships in the
graph. This step shifts the walker's position from vertices to their adjacent edges, transforming a stream of vertices into a stream of edges.
In this diagram:
- An Input Stream contains vertex A.
- Vertex A has outgoing edges: A->B (likes) and A->C (created).
- The
.edges(EdgeSearch::scan())
step processes vertex A. - The Output Stream contains the edge elements A->B and A->C connected to vertex A.
Syntax
walker.edges(search_criteria)
Where search_criteria
is an EdgeSearch
object or a predefined search from an index.
Parameters
search_criteria
: AnEdgeSearch
object that defines criteria for selecting edges, including:- Edge labels
- Direction (incoming, outgoing, or both)
- Property values (when supported)
Return Value
Returns a new walker positioned at the edges matching the search criteria.
Examples
Finding All Connected Edges
Get all edges connected to a vertex:
// Get all edges (both incoming and outgoing) from a vertex
let all_connected_edges = graph
.walk()
.vertices_by_id([bryn])
.edges(EdgeSearch::scan())
.collect::<Vec<_>>();
println!(
"Found {} total edges connected to Bryn",
all_connected_edges.len()
);
Directional Edge Queries
Specify whether you want incoming or outgoing edges:
// Get only outgoing edges from a vertex
let outgoing_edges = graph
.walk()
.vertices_by_id([bryn])
.edges(EdgeSearch::scan().outgoing())
.collect::<Vec<_>>();
println!("Found {} outgoing edges from Bryn", outgoing_edges.len());
// Get only incoming edges to a vertex
let incoming_edges = graph
.walk()
.vertices_by_id([bryn])
.edges(EdgeSearch::scan().incoming())
.collect::<Vec<_>>();
println!("Found {} incoming edges to Bryn", incoming_edges.len());
Label-Based Edge Filtering
Filter edges by their label:
// Get only edges with a specific label
// Using the label index is more efficient
let created_edges = graph
.walk()
.vertices_by_id([bryn])
.edges(Edge::created())
.collect::<Vec<_>>();
println!("Found {} 'Created' edges for Bryn", created_edges.len());
Combined Filtering
Combine direction and label filtering:
// Combine direction and label filtering
let outgoing_follows_edges = graph
.walk()
.vertices_by_id([bryn])
.edges(Edge::follows().outgoing())
.collect::<Vec<_>>();
println!("Bryn follows {} people", outgoing_follows_edges.len());
// Find incoming follows edges (people who follow Bryn)
let incoming_follows_edges = graph
.walk()
.vertices_by_id([bryn])
.edges(Edge::follows().incoming())
.collect::<Vec<_>>();
println!("{} people follow Bryn", incoming_follows_edges.len());
Best Practices
- Specify the direction when possible to limit the search space
- Use label-based indexes to avoid scanning all edges
- Follow edges step with head() or tail() to continue vertex-based traversals
- Consider the naming of relationships to match conceptual understanding
Common Use Cases
- Relationship navigation: Moving from vertices to their connections
- Filtered relationships: Finding specific types of connections between vertices
- Direction-specific queries: Finding incoming or outgoing relationships
- Relationship property examination: Inspecting metadata on connections
Head Step
The head
step navigates from edges to their target (destination) vertices, allowing traversal to where the edges point to. It transforms a stream of edges into a stream of vertices.
In this diagram:
- An Input Stream contains edge elements (e.g., A->B, C->D).
- The
.head()
step processes each edge. - The Output Stream contains the corresponding target (head) vertices (B, D) for each input edge.
Syntax
walker.head()
Parameters
This step takes no parameters.
Return Value
Returns a new walker positioned at the target vertices of the edges in the current traversal.
Example
Find projects created by people known by a starting person:
// Find the projects created by people followed by the starting person
let projects: Vec<_> = graph
.walk()
.vertices_by_id([start_person_id]) // Start at a specific person
.edges(EdgeSearch::scan().outgoing()) // Follow outgoing edges
.filter_follows() // Keep only 'Follows' edges
.head() // Move to the target vertices (people followed by the start person)
.edges(EdgeSearch::scan().outgoing()) // Follow outgoing edges from these people
.filter_created() // Keep only 'Created' edges
.head() // Move to the target vertices (projects created by known people)
.collect();
println!(
"Projects created by people followed by the starting person ({:?}):",
projects
);
Best Practices
- Use
head()
to follow relationships in their natural direction (to the target). - Chain edge-head sequences for multi-hop traversals towards targets.
- Maintain context information when necessary to preserve edge properties while moving to the target.
- Consider traversal depth carefully in highly connected graphs when following edges.
Common Use Cases
- Following relationships: Finding what vertices are connected to your starting points (targets).
- Multi-hop traversals: Discovering indirect connections through multiple relationships towards targets.
- Graph exploration: Navigating through the graph in a directed manner towards targets.
- Social network queries: Implementing patterns like "friends of friends" or "recommendations" by moving to targets.
Tail Step
The tail
step navigates from edges to their source (origin) vertices, allowing traversal back to where the edges
start from. It transforms a stream of edges into a stream of vertices.
In this diagram:
- An Input Stream contains edge elements (e.g., A->B, C->D).
- The
.tail()
step processes each edge. - The Output Stream contains the corresponding source (tail) vertices (A, C) for each input edge.
Syntax
walker.tail()
Parameters
This step takes no parameters.
Return Value
Returns a new walker positioned at the source vertices of the edges in the current traversal.
Examples
Basic Tail Step
Find people who created projects by getting back to the source vertex:
// Find projects created by people
// 1. Start with all people
// 2. Find "created" edges they made
// 3. Use head() to get to the projects (the target vertices)
let projects = graph
.walk()
.vertices(Vertex::person())
.edges(Edge::created().outgoing())
.head() // Move to the target vertex of the edge
.collect::<Vec<_>>();
println!("Found {} projects created by people", projects.len());
Multi-Step Traversal
Traverse multiple relationships to find indirect connections:
// Find people who follow someone who created a project
// This demonstrates chaining head and tail multiple times
let indirect_creators = graph
.walk()
.vertices(Vertex::person())
.edges(Edge::follows().outgoing())
.tail() // Move to the people they follow
.edges(Edge::created().outgoing())
.tail() // Move to the projects created by those people
.collect::<Vec<_>>();
println!(
"Found {} projects created by people the original people know",
indirect_creators.len()
);
Best Practices
- Always follow edge traversals with either
head()
ortail()
to return to vertices. - Use contexts to retain information about the edge when moving back to the source vertex via
tail()
. - Remember that
tail()
returns the source vertex (where the edge starts). - For retrieving both endpoints, consider using context to store one while visiting the other.
Common Use Cases
- Author identification: Finding who created something by looking at edge sources (
tail()
). - Relationship sources: Identifying the initiators of connections (
tail()
). - Backtracking: Returning to source vertices after examining edges (
tail()
). - Edge-based filtering: Finding vertices that have specific incoming edges (by traversing the edge and using
tail()
).
Filter Step
The filter
step narrows a traversal by keeping only vertices or edges that match a specified predicate.
In this diagram:
- An Input Stream contains elements A (age=35), B (age=25), and C (age=40).
- The
.filter(|v| v.age > 30)
step processes each element, applying the predicate. - Elements A and C satisfy the predicate (age > 30) and pass through to the Output Stream.
- Element B does not satisfy the predicate and is Discarded, marked with an 'X'.
Syntax
walker.filter(|element, context| /* predicate logic */)
Parameters
predicate
: A function that takes a reference to a graph element and optional context, and returns a boolean.- Returns
true
to keep the element in the traversal - Returns
false
to remove it from the traversal
- Returns
Return Value
Returns a new walker containing only the elements that match the predicate.
Examples
Basic Filter
Filter vertices based on a simple condition:
// Basic filter using a closure
let adult_people = graph
.walk()
.vertices(Vertex::person())
.filter_by_person(|person, _| person.age() >= 18)
.collect::<Vec<_>>();
println!("Found {} adults", adult_people.len());
Type-Specific Filtering
Use the type-specific filter methods generated by derive macros:
// Use the type-specific filter methods generated by the VertexExt derive macro
let people_named_b = graph
.walk()
.vertices(Vertex::person())
.filter_by_person(|person, _| {
// This closure gets a strongly-typed view of the Person data
person.name().starts_with('B')
})
.collect::<Vec<_>>();
println!(
"Found {} people whose names start with B",
people_named_b.len()
);
Chained Filters
Combine multiple filters for complex queries:
// Chain multiple filters for complex conditions
let specific_people = graph
.walk()
.vertices(Vertex::person())
// First filter: age range
.filter_by_person(|person, _| person.age() > 25 && person.age() < 40)
// Second filter: name contains 'y'
.filter_by_person(|person, _| person.name().contains('y'))
.collect::<Vec<_>>();
println!(
"Found {} people aged 26-39 with 'y' in their name",
specific_people.len()
);
Filter with Context
Filter based on context information:
// Use filter with context
let result = graph
.walk()
.vertices(Vertex::person())
.push_context(|v, _| {
// Store original vertex in context
if let Some(person) = v.project::<Person<_>>() {
person.name().to_string()
} else {
String::new()
}
})
.filter(|_, ctx| {
// Filter based on context
ctx.len() > 3
})
.collect::<Vec<_>>();
println!(
"Found {} people with names longer than 3 characters",
result.len()
);
Best Practices
- Prefer indexed searches over filter steps when querying by property values
- Break complex filtering logic into multiple chained filters for readability
- Use pattern matching to handle different vertex or edge types correctly
- Leverage generated filter methods from derive macros for stronger type safety
Common Use Cases
- Post-retrieval refinement: Filtering elements after initial selection when indexes don't fully cover criteria
- Dynamic filtering: Applying runtime conditions that can't be encoded in initial searches
- Complex conditions: Implementing filtering logic that combines multiple properties or calculations
- Context-aware filtering: Using information from previous traversal steps to inform filtering decisions
ControlFlow Step
The control_flow
step provides precise control over traversal by evaluating each element with a predicate that returns a std::ops::ControlFlow
value. This allows for both filtering and conditional traversal termination in a single operation.
In this diagram:
- Before
control_flow()
: The walker contains all highlighted vertices A, B, and C. - The
.control_flow(predicate)
step is applied, which makes decisions for each vertex:- Continue and keep the vertex (Vertex B)
- Continue but skip the vertex (Vertex A, which is faded)
- Break traversal and optionally keep a final vertex (Vertex C, which terminates the traversal)
Syntax
walker.control_flow(|element, context| /* control flow logic */)
Parameters
predicate
: A function that takes a reference to a graph element and a mutable reference to its context, and returns astd::ops::ControlFlow
value:ControlFlow::Continue(Some(element))
: Include the element and continue traversalControlFlow::Continue(None)
: Skip the element and continue traversalControlFlow::Break(Some(element))
: Include the element and stop traversalControlFlow::Break(None)
: Stop traversal without including any more elements
Return Value
Returns a new walker that applies the control flow logic to the traversal.
Examples
Vertex Control Flow
// Use control_flow to either skip a vertex (None), include it (Some), or stop traversal (Break)
let _project = graph
.walk()
.vertices(VertexSearch::scan())
.control_flow(|vertex, _| {
if let Vertex::Project { name } = vertex.weight() {
// If we find a project with "Graph" in the name, stop traversal
if name.contains("Graph") {
return ControlFlow::Break(Some(vertex));
}
// Include other project vertices
return ControlFlow::Continue(Some(vertex));
}
// Skip non-project vertices
ControlFlow::Continue(None)
})
.first();
Edge Control Flow
// Use control_flow to skip edges (None), include them (Some), or stop traversal (Break)
let _early_connection = graph
.walk()
.vertices_by_id(vec![start_id])
.edges(EdgeSearch::scan())
.control_flow(|edge, _| {
if let Edge::Follows = edge.weight() {
// With Follows edge type, always break
return ControlFlow::Break(Some(edge));
}
// Skip non-'follows' edges
ControlFlow::Continue(None)
})
.first();
Best Practices
- Use
control_flow
when you need to conditionally terminate traversal based on finding specific elements - Prefer
filter
for simple inclusion/exclusion if you don't need to stop traversal - Use the context parameter to track state or accumulate data during traversal
- Return
Break
as soon as you find what you're looking for to optimize performance
Common Use Cases
- Early termination: Stop traversal as soon as a match is found
- Conditional processing: Apply different logic based on element properties
- Limited collection: Gather elements until a specific condition is met
- State-driven traversal: Use context to make decisions based on previously seen elements
- Performance optimization: Avoid unnecessary traversal when a sufficient result is found
Map Step
The map
step transforms vertices or edges in the traversal by applying a mapping function, returning a standard Rust iterator over the transformed elements. This is a terminal step that consumes the walker and ends the Graph API traversal chain.
In this diagram:
- An Input Stream contains elements A, B, C.
- The
.map(|v| v.name())
step processes each element, applying the transformation function. - The output is represented as a Rust Iterator box, containing the resulting values (e.g.,
"NameA"
,"NameB"
,"NameC"
). - The diagram indicates that this step Terminates Walker, meaning no further Graph API steps can be chained after
map
. Standard Rust iterator methods can be used on the result.
Syntax
walker.map(|element, context| {
// transformation logic
})
Parameters
mapping
: A function that takes:- A reference to the current element (vertex or edge)
- The element's context
- Returns a transformed value
Return Value
Returns an iterator that yields the transformed elements. The type of the iterator items is determined by the return type of the mapping function.
Example
pub fn map_example() {
// Stub example - to be implemented
}
Best Practices
- Structure your mapping logic to handle all expected element types and properties
- Use pattern matching in your mapping function to handle different vertex or edge types
- Leverage context data for transformations that require information from previous steps
- Chain standard iterator methods after mapping to further refine results
Common Use Cases
- Property extraction: Transforming graph elements into specific properties or attributes
- Type conversion: Converting graph elements into domain-specific data structures
- Data aggregation: Creating composite values from multiple element properties
- Format transformation: Preparing graph data for serialization or external systems
Fold Step
The fold
step accumulates a result by processing each element in a traversal, operating like the standard Rust fold
operation but specifically for graph traversals. It's a powerful terminal operation that builds a single value from all elements in the stream.
In this diagram:
- An Input Stream contains elements A (age=30), B (age=25), and C (age=40).
- The
.fold(0, |acc, v| acc + v.age())
step is applied. It starts with an initial accumulator value of0
. - The Accumulator visualization shows the value being updated as each element is processed:
- Initial:
0
- After A:
0 + 30 = 30
- After B:
30 + 25 = 55
- After C:
55 + 40 = 95
- Initial:
- The Final Result box shows the final accumulated value (
95
). - This step Terminates Walker, meaning no further Graph API steps can be chained after
fold
.
Syntax
walker.fold(initial_value, |accumulator, element, context| {
// accumulation logic
})
Parameters
initial_value
: The starting value for the accumulatorf
: A closure that takes:- The current accumulator value
- A reference to the current element (vertex or edge)
- The current element's context
- Returns the updated accumulator value
Return Value
Returns the final accumulated value after processing all elements in the traversal.
Example
pub fn fold_example() {
// Create a graph with standard test data
let graph = standard_populated_graph();
// Calculate the sum of ages of all people using fold
let total_age = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person()
.fold(0, |acc, vertex, _ctx| {
// Add the person's age to the accumulator
if let Some(person) = vertex.project::<Person<_>>() {
acc + person.age() as u32
} else {
acc
}
});
println!("Total age of all people: {}", total_age);
// Example with context: Collect names of people older than the context age
let initial_age_threshold = 30;
let names_older_than_threshold = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person()
.push_context(|_, _| initial_age_threshold) // Push the threshold as context
.fold(Vec::new(), |mut names, vertex, ctx| {
if let Some(person) = vertex.project::<Person<_>>() {
// Use context (threshold) in the fold logic
if person.age() as u32 > **ctx {
names.push(person.name().to_string());
}
}
names
});
println!(
"Names of people older than {}: {:?}",
initial_age_threshold, names_older_than_threshold
);
}
Best Practices
- Choose an appropriate initial value that handles edge cases (empty traversals)
- Design fold closures to be commutative when possible for predictable results
- Use type annotations for complex accumulator types to improve readability
- Consider specialized steps like
count()
when their behavior matches your needs
Common Use Cases
- Aggregation: Calculating sums, averages, or other numerical aggregates
- Collection building: Creating custom collections or data structures from traversal results
- State tracking: Building a final state that incorporates data from all elements
- Custom reductions: Implementing specialized reduction operations not covered by built-in steps
Reduce Step
The reduce
step combines all elements in a traversal using a binary operation, returning a single result element. It repeatedly applies a function to pairs of elements, keeping one from each pair, until only one element remains.
In this diagram:
- An Input Stream contains elements A (age=30), B (age=25), and C (age=40).
- The
.reduce(|acc, v| ...)
step is applied, using a reducer function that keeps the element with the maximum age. - The Reduction Process visualization shows the pairwise comparisons:
- A vs B: Element A is kept (30 > 25), B is discarded.
- (A) vs C: Element C is kept (40 > 30), A is discarded.
- The Result of the reduction process is the single remaining element (C), which continues in the walker chain.
Syntax
walker.reduce(|accumulated, next_element, context| {
// combine elements and return either accumulated or next_element
})
Parameters
reducer
: A function that takes:- The accumulated element from previous reduction steps
- The next element to be considered
- The parent walker's context (immutable)
- Returns a ControlFlow with either the accumulated or next element
Return Value
Returns an Option
containing the result element if the traversal is not empty, or None
if the traversal is empty.
Example
use crate::standard_model::{Person, VertexExt, standard_populated_graph};
use graph_api_lib::{Graph, VertexReference, VertexSearch};
pub fn reduce_example() {
// Create a graph with standard test data
let graph = standard_populated_graph();
// Find the oldest person in the graph using reduce
let oldest = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person()
.reduce(|acc, vertex, _ctx| {
let acc_age = if let Some(person) = acc.project::<Person<_>>() {
person.age()
} else {
0
};
let vertex_age = if let Some(person) = vertex.project::<Person<_>>() {
person.age()
} else {
0
};
// Return the person with higher age
if vertex_age > acc_age { vertex } else { acc }
})
.map(|vertex, _ctx| {
if let Some(person) = vertex.project::<Person<_>>() {
format!(
"The oldest person is {:?}, age {}",
vertex.id(),
person.age()
)
} else {
format!("Unexpected non-person vertex: {:?}", vertex.id())
}
})
.next()
.expect("Should find at least one person");
println!("{}", oldest);
}
Best Practices
- Design reducers that follow associative properties when possible
- Handle empty traversals by checking for None in the result
- The reducer can only select one of the elements, not create new ones
- Use
ControlFlow::Continue
to keep reducing, andControlFlow::Break
to halt early - Consider
fold
instead when you need to build a new value rather than select among elements (note:fold
terminates the walker) - Remember that unlike the old API, the context is immutable in the reducer function
Common Use Cases
- Extrema finding: Selecting maximum or minimum elements by some property
- Best match selection: Choosing the most relevant element from a set of results
- Representative selection: Picking a single representative element from similar options
- Priority determination: Finding highest/lowest priority elements in a graph
Take Step
The take
step restricts a traversal to return at most a specified number of elements, helping to control result size
and improve performance.
In this diagram:
- An Input Stream contains elements A, B, C, D.
- The
.take(2)
step acts like a gate, processing elements sequentially. - Only the first two elements, A and B, pass through to the Output Stream.
- Elements C and D are Discarded because the limit of 2 was reached. The traversal stops after yielding the second element.
Syntax
walker.take(n)
Parameters
n
: Ausize
value specifying the maximum number of elements the traversal should return
Return Value
Returns a new walker that will yield at most n
elements.
Examples
Basic Usage
// Get the first 2 people in the graph
println!("First 2 people in the graph:");
let people = graph
.walk()
.vertices(Vertex::person())
.take(2) // Only take the first 2 vertices
.collect::<Vec<_>>();
for person in &people {
println!(" - {:?}", person);
}
println!("Found {} people (limited to 2)", people.len());
With Filtering
// Get up to 3 people over the age of 30
println!("\nUp to 3 people over 30:");
let older_people = graph
.walk()
.vertices(Vertex::person()) // Get all Person vertices
.filter_by_person(|person, _| {
// Using the typed projection with accessor methods
person.age() > 30
})
.take(3) // Only take up to 3 matches
.collect::<Vec<_>>();
for person in &older_people {
println!(" - {:?}", person);
}
println!("Found {} people (limited to 3)", older_people.len());
Edge Traversal Example
// Find the first 2 connections from the first person
if let Some(first_person) = graph.walk().vertices(Vertex::person()).first() {
println!("\nFirst 2 connections from a person:");
let connections = graph
.walk()
.vertices_by_id([first_person])
.edges(EdgeSearch::scan()) // All edge types
.take(2) // Only take the first 2 edges
.collect::<Vec<_>>();
for edge in &connections {
println!(" - {:?}", edge);
}
println!("Found {} connections (limited to 2)", connections.len());
}
Best Practices
- Use
take
early in the traversal chain to reduce computation on intermediate steps - Combine with ordered indexes when sequence matters to ensure consistent results
- For single element retrieval, prefer the more idiomatic
first()
overtake(1)
- Set conservative limits when exploring large graphs to prevent memory issues
Common Use Cases
- Performance optimization: Restricting result size for large graph traversals
- Pagination: Implementing "page-by-page" data retrieval when combined with skip/offset mechanisms
- Top-N queries: Getting the first N elements matching certain criteria
- Resource control: Preventing excessive memory or processing use in production systems
First Step
The first
step consumes the walker and returns the ID of the very first element encountered in the traversal stream, wrapped in an Option
. If the stream is empty, it returns None
. This is a terminal operation that efficiently short-circuits the traversal as soon as the first element is found.
In this diagram:
- Input Elements: The walker starts with elements A, B, C, D.
- The
.first()
step processes the stream, immediately takes element A, and consumes the walker. Elements B, C, and D are never processed or considered. - Returns: Option<ID>: The step returns
Some(ID(A))
, containing the ID of the first element found. - Terminates Walker: This step ends the Graph API walker chain.
Syntax
walker.first()
Parameters
This step takes no parameters.
Return Value
Returns an Option
containing the first element from the traversal, or None
if the traversal is empty.
Examples
Basic Usage
Retrieve the first person vertex from the graph:
// Get the first person in the graph (if any)
let first_person = graph.walk().vertices(Vertex::person()).first();
match first_person {
Some(person) => println!("Found a person: {:?}", person),
None => println!("No people in the graph"),
}
With Filtering
Get the first person matching specific criteria:
// Get the first person with a specific name
let first_bryn = graph
.walk()
.vertices(Vertex::person()) // Get all Person vertices
.filter_by_person(|person, _| {
// Using the typed projection with accessor methods
person.name().contains("Bryn")
})
.first();
match first_bryn {
Some(bryn) => println!("Found Bryn: {:?}", bryn),
None => println!("No one named Bryn in the graph"),
}
Existence Check
Check if any elements match a condition:
// Use first to check if any element matches a condition
let has_young_person = graph
.walk()
.vertices(Vertex::person())
.filter_by_person(|person, _| {
// Using type-safe accessor methods
person.age() < 30
})
.first()
.is_some();
println!(
"Graph {} people under 30",
if has_young_person {
"contains"
} else {
"doesn't contain"
}
);
Best Practices
- Always handle the
None
case when usingfirst()
on potentially empty traversals - Use filtering before
first()
to ensure you get the element you want - For deterministic results, combine with ordered indexes or explicit sorting
- Prefer
first()
overlimit(1).collect::<Vec<_>>()
for better performance
Common Use Cases
- Single element retrieval: Getting exactly one element matching specific criteria
- Existence checking: Determining if any elements match a condition with
is_some()
- Quick lookup: Finding a representative example of a vertex or edge type
- Early termination: Efficiently stopping traversal after finding the first match
Context
The push_context
step allows you to carry information along a graph traversal, making it possible to access data from
previous steps while working with the current element. Context creates a typed value that travels with the traversal
without changing its position.
In this diagram:
- An Input Stream contains elements A and B.
- The
.push_context(|v| v.id())
step is applied. The context function (e.g.,|v| v.id()
) is evaluated once when the step is encountered (conceptually, when element A is processed). - The resulting context value (e.g.,
"A_ID"
) is then attached to all subsequent elements (A and B) in the Output Stream. - This demonstrates that the context value is determined when the
push_context
step is applied and remains fixed for the rest of that traversal segment.
Methods for Adding Context
// Adding context to a traversal
walker.push_context(|element, current_context| {
// Create and return new context value
})
// Adding current vertex/edge as context
walker.push_default_context()
Parameters
context_fn
: A function that takes:- A reference to the current element (vertex or edge)
- The current context (if any)
- Returns a new context value to be carried along the traversal
Return Value
Returns a new walker with the context added, preserving the current traversal position.
Accessing Context
Any step that accepts a function with context (like map
, filter
, etc.) will receive:
- The current element as the first parameter
- The context as the second parameter
// Using context in a map step
walker
.push_context(|v, _| v.id()) // Store vertex ID
.map(|current, context| {
// Use the stored vertex ID from context
format!("Current: {}, Source: {}", current.id(), context)
})
Examples
Vertex Context
Store information about the source vertex during a traversal:
// Use push_default_context to make source vertex information available during traversal
let knows: Vec<_> = graph
.walk()
.vertices_by_id(vec![bryn_id, julia_id])
.push_default_context()
.edges(EdgeSearch::scan().outgoing())
.filter_follows()
.head()
.map(|target, ctx| {
if let Vertex::Person { name, .. } = ctx.vertex() {
format!(
"{} follows {}",
name,
target.project::<Person<_>>().unwrap().name()
)
} else {
"Not a person".to_string()
}
})
.collect::<Vec<_>>();
// Check the results - should have 2 person descriptions
assert_eq!(knows.len(), 2);
println!("Vertex Context Example - Relationships found:");
for relationship in &knows {
println!("- {}", relationship);
}
Edge Context
Track edge types during traversal:
// Walk the graph starting from the person vertex
let edge_types = graph
.walk()
.vertices_by_id(vec![person_id])
.edges(EdgeSearch::scan().outgoing())
.push_context(|edge, _ctx| {
// Determine edge type based on the edge type
let edge_type = match edge.weight() {
Edge::Created => "Created",
Edge::Follows => "Follows",
Edge::Liked { .. } => "Liked",
Edge::Commented { .. } => "Commented",
};
// Return the edge type as context
edge_type
})
.map(|_v, c| *c)
.collect::<Vec<_>>();
println!("{:?}", edge_types);
Path Tracking
Build a representation of the path taken during traversal:
// Track the path while traversing
let paths = graph
.walk()
.vertices_by_id(vec![start_id])
// Start with an empty path containing just the start vertex name
.push_context(|v, _| {
vec![match v.weight() {
Vertex::Person { name, .. } => name.clone(),
_ => "Unknown".to_string(),
}]
})
// Follow outgoing follows edges
.edges(EdgeSearch::scan().outgoing())
.filter_follows()
.tail()
// Add each person to the path
.push_context(|v, ctx| {
let mut new_path = (**ctx).clone();
if let Vertex::Person { name, .. } = v.weight() {
new_path.push(name.clone());
}
new_path
})
// Collect all paths
.map(|_, ctx| ctx.join(" -> "))
.collect::<Vec<_>>();
// Print all paths
println!("All paths from start:");
for path in paths {
println!("- {}", path);
}
Type Safety
The context system is fully type-safe:
- Each context value has a concrete type
- Context transformation functions must return the correct type
- Closures that receive context are provided with the correctly typed context
Common Use Cases
- Path tracking: Store the sequence of vertices or edges traversed
- Metadata collection: Gather information from different parts of the graph
- Aggregation: Build up composite results during traversal
- Decision making: Use information from earlier steps to influence later decisions
Best Practices
- Keep contexts immutable - create new contexts rather than modifying existing ones
- Use
push_default_context()
when you simply need to track the current vertex/edge - Chain multiple context operations to build complex data structures
- Consider type-safety when designing context pipelines
Default Context
The push_default_context
step provides a simplified way to track vertex and edge information during traversal, without
having to define custom context types. It automatically captures the current vertex or edge to make it available in
downstream steps.
In this diagram:
- An Input Stream contains elements A and B.
- The
.push_default_context()
step is applied. - In the Output Stream, each element is paired with a context that automatically contains that specific element.
- Element A is paired with
Context: A
. - Element B is paired with
Context: B
.
- Element A is paired with
- This shows how the default context dynamically reflects the current element being processed.
Syntax
walker.push_default_context()
Parameters
This step takes no parameters.
Return Value
Returns a new walker with the default context added, preserving the current traversal position.
Examples
Tracking Relationships
Use default context to describe relationships between people:
pub fn default_context_example<G>(graph: &G, bryn_id: G::VertexId, julia_id: G::VertexId)
where
G: Graph<Vertex = Vertex, Edge = Edge>,
{
// Use default context to access vertex information directly from prior in the traversal
let knows = graph
.walk()
.vertices_by_id(vec![bryn_id, julia_id])
.push_default_context()
.edges(EdgeSearch::scan().outgoing())
.filter_follows()
.head()
.map(|target_vertex, ctx| {
// Access source person name from context
let source_name = match ctx.vertex() {
Vertex::Person { name, .. } => name.clone(),
_ => "Unknown".to_string(),
};
// Access target person name from vertex
let person = target_vertex.project::<Person<_>>().unwrap();
format!("{} knows {}", source_name, person.name())
})
.collect::<Vec<_>>();
// Check the results
println!("Relationships found:");
for relationship in &knows {
println!("- {}", relationship);
}
}
Working with Edge Properties
Combine edge properties with source vertex information:
pub fn edge_properties_example<G>(graph: &G, person_id: G::VertexId)
where
G: Graph<Vertex = Vertex, Edge = Edge>,
{
// Find relationships with metadata
let relationships = graph
.walk()
.vertices_by_id(vec![person_id])
.push_default_context()
.edges(EdgeSearch::scan().outgoing())
.map(|edge, ctx| {
// Get the source person's name
let source_name = match ctx.vertex() {
Vertex::Person { name, .. } => name.clone(),
_ => "Unknown".to_string(),
};
// Format based on edge type
match edge.weight() {
Edge::Follows => {
format!("{} follows someone", source_name)
}
Edge::Created => {
format!("{} created something", source_name)
}
Edge::Liked { timestamp } => {
format!("{} liked something at {}", source_name, timestamp)
}
Edge::Commented { timestamp } => {
format!("{} commented on something at {}", source_name, timestamp)
}
}
})
.collect::<Vec<_>>();
println!("Person relationships with metadata:");
for rel in relationships {
println!("- {}", rel);
}
}
Default Context Structure
The default context automatically tracks:
- For vertices: The current vertex reference
- For edges: Both the source and target vertex references
Accessing Default Context
You can access the context in subsequent steps like this:
// After pushing default context
walker
.push_default_context()
.map(|current_element, ctx| {
// Access vertex from context
let context_vertex = ctx.vertex();
// Work with the context vertex
// ...
})
Best Practices
- Use default context for simple element tracking rather than creating custom context types
- Chain default contexts in multi-step traversals to maintain element history
- Access context values using the appropriate type-safe methods (e.g.,
ctx.vertex()
,ctx.edge()
) - Consider default context before writing complex custom context functions for basic traversals
Common Use Cases
- Relationship description: Building natural language descriptions (e.g., "Person A knows Person B")
- Path tracking: Recording the sequence of vertices and edges in a traversal
- Element comparison: Comparing properties between current and previous elements
- Data collection: Gathering information from connected elements in the graph
Mutate Context
The mutate_context
step allows you to modify context data during a graph traversal without affecting the traversal position. This provides a way to update state or accumulate information as you move through the graph.
In this diagram:
- An Input Stream contains elements A and B, each associated with a context value.
- The
.mutate_context(|e, ctx| { /* modify ctx */ })
step is applied, allowing in-place modification of the context associated with each element. - The resulting Output Stream shows the same elements (A and B) but with modified context values.
- This demonstrates that
mutate_context
affects only the context data, not the traversal elements or position.
Syntax
walker.mutate_context(|element, context| {
// Modify context in-place via mutable reference
})
Parameters
callback
: A function that takes:- A reference to the current element (vertex or edge)
- A mutable reference to the current context
- Performs in-place modifications to the context
Return Value
Returns a new walker with the same position but modified context data.
Detour Step
The detour
step is a powerful operation that allows you to execute a sub-traversal for each element currently in the
main walker stream. The results yielded by each sub-traversal do not replace the original element in the main stream.
This enables complex conditional logic and exploration without losing the main traversal context.
In this diagram:
- Input Elements: The main walker stream arrives at the
detour
step with elements A and X. .detour(|sub| ...)
: For each input element, a sub-walker is initiated:- Sub-Walker (from A): Starts at A, executes the steps defined in the closure (e.g.,
edges().head()
), and yields results B and C. - Sub-Walker (from X): Starts at X, executes the same steps, and yields result Y.
- Sub-Walker (from A): Starts at A, executes the steps defined in the closure (e.g.,
- Output Elements (Results): The main walker stream continues, now containing the combined results from all sub-walkers (A, A, X). The original elements (A, X) have been replaced.
Syntax
walker.detour(|sub_walker_builder| {
// Configure the sub-walker with additional steps
sub_walker_builder
.edges(...) // Example step
.head() // Example step
// ... more steps ...
})
Parameters
detour_fn
: A closure that receives aStartWalkerBuilder
positioned at the current element from the main stream. This closure defines the steps for the sub-traversal.
Return Value
Returns a new walker where the elements are the combined results of all the sub-traversals executed within the detour
.
Examples
Basic Detour
This example shows how to use a detour to find projects created by people who follow someone over 30:
// Basic detour example: Find projects created by people who follow someone over 30
println!("Projects created by people who follow someone over 30:");
let projects = graph
.walk()
.vertices(Vertex::person()) // Start with all people
.detour(|sub_walker| {
// The detour checks if this person follows someone over 30
sub_walker
.edges(Edge::follows()) // Follow "follows" edges
.head() // Move to the target person
.filter_by_person(|person, _| person.age() > 30) // Check if they're over 30
.take(1) // We only need one match to qualify
})
// Back to original person vertices that passed the detour check
.edges(Edge::created()) // Find what they created
.head() // Move to the created project
.filter_project() // Keep only project vertices
.collect::<Vec<_>>();
for project in projects {
println!("{:?}", project);
}
Nested Detours
This example demonstrates using nested detours for more complex traversal logic:
// Complex detour example: Find people who created projects that were liked by others
println!("\nPeople who created projects that were liked by others:");
let creators = graph
.walk()
.vertices(Vertex::person()) // Start with all people
.push_default_context()
.detour(|creator_walker| {
// Detour to check if this person created something that was liked
creator_walker
.edges(Edge::created()) // Find what they created
.head() // Move to the created project
.detour(|project_walker| {
// Nested detour to check if the project was liked
project_walker
.edges(Edge::liked()) // Find "liked" edges pointing to this project
.tail() // Move to the person who liked it
.filter(|liker, ctx| {
// Check that the liker is different from the creator
*ctx.vertex_id() != liker.id()
})
})
})
.collect::<Vec<_>>();
for creator in creators {
println!("{:?} created something that was liked", creator);
}
When to use detour
Detour shines in several common graph traversal scenarios:
-
Conditional filtering based on connected elements:
For example, "Find people who follow someone over 30" requires exploring connections before making filtering decisions -
Looking ahead without changing position:
When you need to check properties of elements further in the graph before deciding how to proceed -
Complex multi-step conditions:
When a condition requires checking multiple properties or relationships that would be awkward with chained filters -
Collecting contextual information:
When you need to gather information from related elements to inform decisions in the main traversal -
Avoiding traversal state mutation:
When you want to explore potential paths without permanently changing your traversal state
Implementation Notes
- The detour closure receives a fresh walker starting at the current position
- All walker operations inside the detour are executed in a sub-traversal context
- Results from the detour replace the current elements in the traversal
- The main traversal continues from where it left off, but with the filtered/transformed elements
- For better performance, try to use indexed lookups within detours whenever possible
- Detours can be nested for extremely complex traversal logic
- Using detours improves code readability by logically grouping related traversal operations
Collect Step
The collect
step gathers the IDs of all elements from a traversal into a specified Rust collection type (e.g., Vec
, HashSet
, BTreeSet
). This is a terminal operation that consumes the walker and returns the populated collection.
In this diagram:
- Input Elements: The walker starts with elements A, B, C.
- The
.collect::<Vec<_>>()
step processes the stream and consumes the walker. The type parameter (Vec<_>
) specifies the desired collection type. - Returns: Vec<ID>: The step returns a Rust
Vec
containing the unique identifiers (IDs) of the elements that were processed ([ID(A), ID(B), ID(C)]
). - Terminates Walker: This step ends the Graph API walker chain.
Syntax
walker.collect::<C>()
Where C
is the collection type you want to gather results into.
Parameters
This step takes no parameters, but uses a type parameter to specify the collection type.
Return Value
Returns a collection of type C
where C
implements FromIterator
.
Examples
Collecting into a Vec
The most common use case is collecting elements into a Vec
:
// Collect results into a Vec
// Use the person() index method to get all Person vertices
let person_vertices: Vec<_> = graph
.walk()
.vertices(Vertex::person()) // Type-safe vertex lookup by label
.collect::<Vec<_>>();
println!("Found {} person vertices", person_vertices.len());
Collecting Unique Values
You can collect into a HashSet
to get unique values:
// Collect into a HashSet for unique elements
let unique_names: HashSet<String> = graph
.walk()
.vertices(Vertex::person())
// Use map to extract properties from each person
.map(|person, _| {
// Use projection to access Person methods in a type-safe way
person
.project::<Person<_>>() // Project to Person type
.map(|p| p.name().to_string()) // Use accessor method from projection
.unwrap_or_else(|| "Unknown".to_string())
})
.collect::<HashSet<String>>();
println!("Found {} unique person names", unique_names.len());
Collecting into an Range Set
Use a BTreeSet
when you need range unique values:
// Collect into a BTreeSet for range unique elements
let range_ages: BTreeSet<u8> = graph
.walk()
.vertices(Vertex::person())
// Use filter_person() to work exclusively with Person vertices (no closure needed)
.filter_person() // Label-based type filter with no closure
// Extract age from each Person vertex
.map(|person, _| {
// Use projection to access Person methods in a type-safe way
person
.project::<Person<_>>() // Project to Person type
.map(|p| p.age()) // Use age() accessor method
.unwrap_or(0)
})
.collect::<BTreeSet<u8>>();
// Print ages in ascending order (BTreeSet maintains order)
println!("Person ages (range): {:?}", range_ages);
Best Practices
- Use
limit
beforecollect
when working with large graphs to control memory usage - Choose the right collection type for your needs:
Vec
: When order matters and duplicates are acceptableHashSet
: When you need unique elements and don't care about orderBTreeSet
: When you need unique elements in a specific order
Common Use Cases
- Result accumulation: Collecting all vertices meeting specific criteria
- Set operations: Gathering unique elements via
HashSet
for later processing - Ordered results: Using
BTreeSet
when elements need to be in a specific order - Custom collections: Feeding traversal results into specialized data structures
Count Step
The count
step consumes the walker and efficiently counts the number of elements that have passed through the traversal up to that point. It is a terminal step that returns a single usize
value representing the total count.
In this diagram:
- Input Elements: The walker starts with elements A, B, C, D.
- The
.count()
step processes the stream and consumes the walker. - Returns: usize: The step returns a single
usize
value, which is the total number of elements processed (4 in this case). - Terminates Walker: This step ends the Graph API walker chain.
Syntax
walker.count()
Parameters
This step takes no parameters.
Return Value
Returns a usize
representing the number of elements in the traversal.
Examples
Basic Count
Count the number of people in the graph:
// Basic count - how many people are in the graph?
let person_count = graph.walk().vertices(Vertex::person()).count();
println!("Total people in graph: {}", person_count);
Filtered Count
Count elements that match specific criteria:
// Count with filtering - how many people are over 30?
let older_person_count = graph
.walk()
.vertices(Vertex::person())
.filter_by_person(|person, _| person.age() > 30)
.count();
println!("People over 30: {}", older_person_count);
Edge Count
Count relationships in the graph:
// Count relationships - how many 'follows' relationships exist?
let knows_count = graph
.walk()
.vertices(VertexSearch::scan())
.edges(Edge::follows())
.count();
println!("Total 'follows' relationships: {}", knows_count);
Analytics
Use counts to calculate graph analytics:
// Count for analytics - average number of people followed per person
let person_count = graph.walk().vertices(Vertex::person()).count();
if person_count > 0 {
let knows_count = graph
.walk()
.vertices(Vertex::person())
.edges(Edge::follows().incoming())
.count();
let avg_known = knows_count as f64 / person_count as f64;
println!("Average people followed per person: {:.2}", avg_known);
}
Best Practices
- Use count directly rather than collecting results just to count them
- Consider indexed counts for large graphs when available in your implementation
- Combine with filter steps for specific counting queries
- Use count to validate expectations in tests and assertions
Common Use Cases
- Existence checking: Determining if any elements match criteria (count > 0)
- Graph analytics: Calculating statistics like average connections per node
- Validation: Ensuring expected numbers of elements exist in certain conditions
- Performance metrics: Measuring graph size and density characteristics
Into Iterator
The into_iter
step converts a walker traversal into a standard Rust iterator. This is a terminal step that consumes the walker and returns an iterator yielding the IDs of the vertices or edges that were in the traversal stream. It allows you to bridge the Graph API's walker system with Rust's standard iterator ecosystem.
In this diagram:
- Input Elements: The walker starts with elements A, B, C.
- The
.into_iter()
step processes the stream and consumes the walker. - Rust Iterator: The step returns a standard Rust iterator. This iterator yields the unique identifiers (IDs) of the elements that were processed (ID(A), ID(B), ID(C)).
- Terminates Walker: This step ends the Graph API walker chain. Subsequent operations must use standard Rust iterator methods.
Syntax
walker.into_iter()
Parameters
This step takes no parameters.
Return Value
Returns a standard Rust iterator that yields vertex or edge IDs from the traversal.
Examples
Basic Usage
Convert a traversal to an iterator and collect results:
// Basic iteration to collect IDs
let vertex_ids = graph
.walk()
.vertices(VertexSearch::scan())
.into_iter()
.collect::<Vec<_>>();
// There should be at least 4 vertices in the graph
assert!(vertex_ids.len() >= 4);
Filtering with Standard Iterators
Use standard Rust iterator methods:
// We can use standard iterator operations like filtering
let filtered_vertices = graph
.walk()
.vertices(VertexSearch::scan())
.into_iter()
.filter(|vertex_id| {
// Get the vertex reference from the ID
if let Some(vertex) = graph.vertex(*vertex_id) {
// Check if it's a Person
matches!(vertex.weight(), Vertex::Person { .. })
} else {
false
}
})
.collect::<Vec<_>>();
// There should be exactly 2 Person vertices (bryn and julia)
assert_eq!(filtered_vertices.len(), 2);
Comparing with Walker Methods
Walker methods vs standard iterator methods:
// Using .map() on the walker directly yields references with context
let vertex_names = graph
.walk()
.vertices(VertexSearch::scan())
.map(|vertex, _ctx| match vertex.weight() {
Vertex::Person { name, .. } => name.clone(),
Vertex::Project { name } => name.clone(),
_ => "Unknown".to_string(),
})
.collect::<Vec<_>>();
assert!(vertex_names.contains(&"Bryn".to_string()));
assert!(vertex_names.contains(&"Julia".to_string()));
assert!(vertex_names.contains(&"GraphApi".to_string()));
assert!(vertex_names.contains(&"Rust".to_string()));
Best Practices
- Use
into_iter
only when you need to leverage standard Rust iterators - Remember that standard iterator methods lose access to graph context
- Consider extracting essential data into context before converting to an iterator
- When working with large graphs, use the Graph API's lazy evaluation before converting to an iterator
Common Use Cases
- Integration with existing code: Bridging Graph API traversals with existing iterator-based systems
- Complex iterator chains: Using specialized iterator adaptors not available as walker steps
- ID-based operations: Working with element IDs directly for memory efficiency
- Ecosystem integration: Connecting graph traversals with Rust's extensive iterator ecosystem
Probe Step
The probe
step allows you to execute a callback function for each element in the traversal, primarily for side effects like logging, metrics collection, or debugging, without altering the elements themselves or the flow of the traversal.
In this diagram:
- Input Elements: The walker starts with elements A, B, C.
- The
.probe(callback)
step processes each element. - Side Effect: As each element (A, B, C) passes through, the provided callback function is executed. This is shown triggering actions like logging or updating metrics.
- Output Elements (Unchanged): The elements A, B, C continue to the next step in the traversal, completely unaffected by the
probe
operation.
Syntax
walker.probe(|element, context| {
// inspection logic
})
Parameters
inspector
: A function that takes:- A reference to the current element (vertex or edge)
- The element's context
- Performs some side effect like logging or debugging
Return Value
Returns the same traversal unchanged, allowing you to continue chaining steps.
Examples
Inspecting Vertices
This example shows how to use the probe
step to inspect and count vertices during traversal:
// Use probe to inspect vertices during traversal without affecting the flow
println!("Examining each person vertex:");
let mut person_count = 0;
graph
.walk()
.vertices(Vertex::person()) // Type-safe vertex lookup by label
.probe(|vertex, _| {
// Inspect vertex properties using type-safe projection
if let Some(person) = vertex.project::<Person<_>>() {
person_count += 1;
println!(" Found person: {} (age {})", person.name(), person.age());
}
})
.count();
println!(" Total people found: {}", person_count);
Inspecting Edges
This example demonstrates using the probe
step to examine relationships between vertices:
// Use probe to inspect relationships between vertices
println!("\nExamining Bryn's outgoing relationships:");
graph
.walk()
.vertices(Vertex::person_by_username("bryn123"))
.edges(EdgeSearch::scan().outgoing())
.probe(|edge, _| {
// Get the target vertex to display relationship context
let target = graph.vertex(edge.head()).unwrap();
// Display relationship information
match edge.weight() {
crate::standard_model::Edge::Created => {
if let Some(project) = target.project::<Project<_>>() {
println!(" Bryn created project: {}", project.name());
}
}
crate::standard_model::Edge::Follows => {
if let Some(person) = target.project::<Person<_>>() {
println!(" Bryn follows: {}", person.name());
}
}
crate::standard_model::Edge::Liked { timestamp } => {
if let Some(project) = target.project::<Project<_>>() {
println!(" Bryn liked project {} on {}", project.name(), timestamp);
}
}
crate::standard_model::Edge::Commented { timestamp } => {
if let Some(project) = target.project::<Project<_>>() {
println!(
" Bryn commented on project {} on {}",
project.name(),
timestamp
);
}
}
}
})
.count();
Best Practices
- Insert probe steps at key points in complex traversals to verify correct behavior
- Use descriptive logging within probes to make debugging output meaningful
- Add counters or statistics collection to understand traversal performance
- Keep probe functions simple and side-effect only; don't try to affect the traversal flow
Common Use Cases
- Debugging: Inserting temporary probe steps to understand traversal behavior
- Logging: Recording traversal information during development or in production
- Metrics collection: Gathering statistics about traversal performance and results
- Inspection: Examining element properties at specific points in the traversal
Mutate Step
The mutate
step iterates over elements in the traversal, applying a callback function to each one. This callback receives mutable access to the graph, allowing for in-place modification of vertices or edges. mutate
is a terminal step that consumes the walker and returns the total count (usize
) of elements processed.
In this diagram:
- Input Elements: The walker starts with elements A, B, C, each having an initial value (e.g.,
val=1
). - The
.mutate(callback)
step processes each element. The callback function has access to modify the graph. - Mutated Elements (In Graph): The diagram shows the state of the elements after the mutation callback has been applied (e.g., A' with
val=10
). This highlights the in-place nature of the operation. - Returns: usize: The step completes and returns the count of elements processed (3 in this case), terminating the walker chain.
Syntax
walker.mutate(|element, context, graph| {
// mutation logic using graph access
})
Parameters
callback
: A function that takes:- A reference to the current element (vertex or edge) - Note: Direct mutable access to the element itself might vary; mutation often happens via the
graph
reference using the element's ID. - The element's context
- A mutable reference to the graph (
&mut G
) - Performs modifications using the mutable graph reference (e.g.,
graph.vertex_mut(element.id())
).
- A reference to the current element (vertex or edge) - Note: Direct mutable access to the element itself might vary; mutation often happens via the
Return Value
Returns a usize
representing the number of elements processed by the mutate
step.
Example
// Example 1: Update all person vertices to increment their age
println!("Incrementing age of all people:");
// Print original ages
graph
.walk()
.vertices(Vertex::person())
.probe(|vertex, _| {
if let Some(person) = vertex.project::<Person<_>>() {
println!(" Before: {} is {} years old", person.name(), person.age());
}
})
.count();
// Perform the mutation - increment everyone's age by 1
let updated_count = graph
.walk_mut() // Must use walk_mut() for mutations
.vertices(Vertex::person())
.mutate(|graph, vertex_id, _| {
// Note: updating requires cloning and replacing due to Rust's ownership model
if let Some(mut vertex) = graph.vertex_mut(vertex_id) {
if let Some(mut person) = vertex.project_mut::<PersonMut<_, _>>() {
// Increment the person age
person.set_age(person.age() + 1);
}
}
});
println!(" Updated {} person vertices", updated_count);
Best Practices
- Use
mutate
specifically for modifying graph elements based on traversal results. - Access mutable elements via the provided
graph
reference and the element's ID (e.g.,graph.vertex_mut(id)
). - Be mindful that mutations are applied directly and immediately to the graph state.
- Consider using
filter
orcontrol_flow
beforemutate
to precisely target which elements should be modified. - Understand the graph implementation's behavior regarding mutations during iteration (e.g., adding/removing elements might invalidate iterators in some graph types).
Common Use Cases
- Updating properties: Modifying attributes of vertices or edges (e.g., incrementing age, changing status).
- Graph cleaning: Standardizing data or fixing inconsistencies found during traversal.
- State transitions: Updating element states based on workflow logic during traversal.
- Bulk updates: Applying a change to a set of elements identified by prior traversal steps.
Debug Step
The dbg
step provides a simple way to inspect the elements flowing through a traversal without altering them. It
prints detailed information about each element to the console as it passes through the step.
In this diagram:
- Input Elements: The walker starts with elements A, B, C.
- The
.dbg("Label")
step processes each element. - Console Output: As each element (A, B, C) passes through the
dbg
step, its details are printed to the console, prefixed with the provided label. This is shown as a side effect. - Output Elements (Unchanged): The elements A, B, C continue to the next step in the traversal, completely
unaffected by the
dbg
operation.
Syntax
walker.dbg("Custom label")
Parameters
label
: An optional string label to identify the debug output.
Return Value
Returns the same traversal unchanged, allowing you to continue chaining steps.
Example
// Create a pipeline with multiple dbg steps to see how data transforms
let result = graph
.walk()
.vertices(Vertex::person()) // Start with all people
.dbg("all-people") // Debug all people vertices
.filter_by_person(|person, _| {
// Find people over 30
person.age() > 30
})
.dbg("people-over-30") // Debug filtered results
.count();
println!("Found {} people over 30", result);
Best Practices
- Use meaningful labels to identify each debug checkpoint in complex traversals
- Remove or comment out debug steps before deploying to production
- Add debug steps before and after steps that might be causing issues
- For production logging, replace with
probe
steps that have custom formatting
Common Use Cases
- Traversal troubleshooting: Understanding why a traversal isn't returning expected results
- Learning: Exploring how traversals work by seeing each element's detailed state
- Development checkpoints: Verifying the state of a traversal at key points
- Context inspection: Examining the full context structure during traversal
Context System
The context system is a powerful feature of the Graph API that enriches your traversals by carrying data alongside the stream of elements (vertices or edges). This allows you to maintain state, collect information, and make decisions based on previously processed elements as you explore the graph.
What is Context?
In this diagram:
- An Input Stream contains elements A and B.
- The
push_context
step is applied. It calculates a context value (e.g.,"Ctx_A"
) based on the first element encountered (A) and attaches this fixed context to all subsequent elements. - The Stream with Context now contains pairs of
(Element, Context)
:(A, "Ctx_A")
,(B, "Ctx_A")
. - A subsequent
map
step receives both the current element and its associated context, allowing it to perform transformations using both pieces of information. - The Final Output shows the results produced by the
map
step.
Context allows you to:
- Carry information from previous steps to later steps in the traversal
- Transform data as you move through the graph
- Make decisions based on a combination of current and past elements
- Build complex data structures by accumulating information during traversal
- Maintain state without changing the traversal position
Context Methods
The Graph API provides two primary methods for managing context:
push_context
// Add or transform context
walker.push_context(|element, current_context| {
// Extract data from the current element
// Optionally use the existing context
// Return a new context value
})
This method creates or transforms context based on the current element and any existing context.
push_default_context
// Add the current vertex or edge as context
walker.push_default_context()
This method stores the current vertex or edge in context, making it available in later steps.
Basic Context Usage
Here's a simple example of storing information in context and using it later:
pub fn basic_context_example() {
let graph = standard_populated_graph();
// Store person name in context and use it later
let person_projects = graph
.walk()
.vertices()
.filter_person() // Type-safe filter using generated helper
.push_context(|v, _| {
// Extract and store person's name
v.project::<Person<_>>().unwrap().name().to_string()
})
.edges(EdgeSearch::scan().outgoing())
.filter_created() // Type-safe filter using generated helper
.vertices()
.filter_project() // Type-safe filter using generated helper
.map(|project, ctx| {
// Format using project name and person name from context
format!(
"{} created the {} project",
ctx,
project.project::<Project<_>>().unwrap().name()
)
})
.collect::<Vec<_>>();
}
Nested Context with Detours
When using the detour
step, context becomes nested, allowing you to access the parent context:
pub fn nested_context_example() {
let graph = standard_populated_graph();
// Use nested context to track relationships
let follows_ages = graph
.walk()
.vertices()
.filter_person() // Type-safe filter using generated helper
.push_context(|v, _| {
// Store source person's age in context
v.project::<Person<_>>().unwrap().age()
})
.detour(|w| {
// Start a sub-traversal that keeps parent context
w.edges(EdgeSearch::scan().outgoing())
.filter_follows() // Type-safe filter using generated helper
.vertices()
.filter_person() // Type-safe filter using generated helper
.map(|target, ctx| {
// Access source's age from context
let source_age = *ctx;
let target_age = target.project::<Person<_>>().unwrap().age();
// Calculate and return age difference
let diff = target_age as i32 - source_age as i32;
if diff > 0 {
"follows someone older"
} else if diff < 0 {
"follows someone younger"
} else {
"follows someone the same age"
}
})
})
.collect::<Vec<_>>();
}
Default Context
The Graph API provides a built-in default context that simplifies common patterns:
pub fn default_context_example() {
let graph = standard_populated_graph();
// Use the built-in default context to access source vertex
let relationships = graph
.walk()
.vertices()
.filter_person() // Type-safe filter using generated helper
.push_default_context() // Store current vertex in default context
.edges(EdgeSearch::scan().outgoing())
.vertices()
.map(|target, ctx| {
// Access the source vertex from context
let source = ctx.vertex();
let source_name = source
.project::<Person<_>>()
.map(|p| p.name())
.unwrap_or("Unknown");
// Format the relationship based on target type
match target.weight() {
Vertex::Person { name, .. } => format!("{} is connected to {}", source_name, name),
Vertex::Project { name, .. } => format!("{} works on {}", source_name, name),
_ => "Unknown relationship".to_string(),
}
})
.collect::<Vec<_>>();
}
Type Safety
The context system is fully type-safe:
- Each context value has a specific type determined by your context function
- The compiler enforces correct handling of context types
- Context functions must return values compatible with downstream operations
- Errors in context type handling are caught at compile time, not runtime
Context Lifecycle
- Creation: Context is created when you first call a context method
- Transformation: Context can be transformed at any step in the traversal
- Access: Any step that accepts a closure can access the context
- Nesting: Detours create nested contexts with access to parent contexts
- Immutability: Context is immutable; transformations create new contexts
Best Practices
- Keep context small: Large contexts can impact performance
- Use immutable data structures: Create new contexts rather than modifying existing ones
- Leverage type safety: Let the compiler ensure your context manipulations are valid
- Consider cloning costs: For large data, use
Arc
or similar for cheap cloning - Use default context: For simple cases,
push_default_context()
is cleaner than custom context - Chain context operations: Build complex data structures through multiple context steps
- Document context types: When using complex context chains, document the context type at each step
Common Use Cases
- Path tracking: Record the path taken through the graph
- Property collection: Gather properties from different vertices/edges
- Decision making: Use information from earlier elements to influence traversal decisions
- Aggregation: Build composite results during traversal
- Statistical analysis: Calculate metrics as you traverse the graph
Walker Best Practices
This page provides guidance on effectively using the Graph API walker system for graph traversals.
General Best Practices
-
Chain steps logically
- Build your traversal in a logical sequence that mirrors how you would describe the path
- Group related operations together to improve readability
-
Use appropriate search criteria
- Limit vertices and edges early in the traversal to reduce the traversal set
- Use the most specific search criteria available (label, index, property)
-
Leverage type projections
- Use
.project::<Type<_>>()
to access type-specific methods - Handle projection failures gracefully with
match
orif let
- Use
-
Use context for data collection
- Store intermediate results in context rather than using external collections
- Use context to carry state through the traversal
-
Consider performance
- For very large graphs, filter early to reduce the traversal set
- Use indexed lookups when available
- Limit traversal depth for potentially unbounded searches
Optimization Tips
Early Filtering
Filter vertices and edges as early as possible in the traversal:
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Efficient - filters early
graph.walk()
.vertices(VertexSearch::scan().with_label(Person::label()))
.filter(|v, _| v.project::<Person<_>>().unwrap().age() > 30)
.edges(...)
// ... rest of traversal
// Less efficient - processes all edges before filtering
graph.walk()
.vertices(VertexSearch::scan().with_label(Person::label()))
.edges(...)
.head()
.filter(|v, _| v.project::<Person<_>>().unwrap().age() > 30)
// ... rest of traversal
Use Indexes
Take advantage of indexes when available:
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Using an index (more efficient)
graph.walk()
.vertices(VertexSearch::index(Person::by_name_index(), "Bryn"))
// ... rest of traversal
// Full scan (less efficient)
graph.walk()
.vertices(VertexSearch::scan())
.filter(|v, _| {
if let Ok(person) = v.project::<Person<_>>() {
person.name() == "Bryn"
} else {
false
}
})
// ... rest of traversal
Limit Traversal Size
Use take()
to prevent processing excessive elements:
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Limit to first 10 results
graph.walk()
.vertices(VertexSearch::scan())
.take(10)
.collect::<Vec<_>>();
Use Detours Effectively
Detours allow for complex traversals without losing your place:
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Find people and their projects with ratings
graph.walk()
.vertices(VertexSearch::scan().with_label(Person::label()))
.push_context(|v, _| v.id()) // Store person ID
.detour(|v| {
v.edges(EdgeSearch::scan().with_label(Edge::created_label()))
.tail()
.push_context(|v, ctx| {
// Return both the person ID and project
(ctx.clone(), v.project::<Project<_>>().unwrap().name().to_string())
})
})
.collect::<Vec<_>>();
Common Patterns
Finding Connected Vertices
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Find all friends of Bryn
let friends = graph.walk()
.vertices(VertexSearch::index(Person::by_name_index(), "Bryn"))
.edges(EdgeSearch::scan().with_label(Edge::knows_label()))
.tail()
.collect::<Vec<_>>();
Filtering by Properties
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Find all people over 30
let seniors = graph.walk()
.vertices(VertexSearch::scan().with_label(Person::label()))
.filter(|v, _| v.project::<Person<_>>().unwrap().age() > 30)
.collect::<Vec<_>>();
Collecting Property Values
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Collect names of all projects
let project_names = graph.walk()
.vertices(VertexSearch::scan().with_label(Project::label()))
.map(|v, _| v.project::<Project<_>>().unwrap().name().to_string())
.collect::<Vec<_>>();
Computing Aggregates
use graph_api_test::Vertex;
use graph_api_test::Edge;
use graph_api_test::VertexExt;
use graph_api_test::EdgeExt;
use graph_api_test::Person;
use graph_api_test::Project;
use graph_api_test::populate_graph;
use graph_api_lib::EdgeSearch;
use graph_api_lib::VertexSearch;
use graph_api_simplegraph::SimpleGraph;
use graph_api_lib::Graph;
use graph_api_lib::VertexReference;
use graph_api_lib::EdgeReference;
use std::collections::HashSet;
// Create a new graph
let mut graph = SimpleGraph::new();
// Populate the graph with test data
let refs = populate_graph(&mut graph);
// Calculate average age of all people
let (sum, count) = graph.walk()
.vertices(VertexSearch::scan().with_label(Person::label()))
.fold((0, 0), |(sum, count), v, _| {
let age = v.project::<Person<_>>().unwrap().age();
(sum + age, count + 1)
});
let average_age = if count > 0 { sum as f64 / count as f64 } else { 0.0 };
Implementation Guide for Graph Backends
This guide will walk you through the process of implementing a new graph backend for the Graph API. While the Graph API
provides several ready-to-use implementations (like SimpleGraph
and PetGraph
), you might want to build your own
implementation to support specific requirements or optimize for particular use cases.
Overview
Implementing a graph backend involves satisfying the requirements of the Graph
trait and its associated types. This
guide will walk you through:
- Understanding the core components needed for a graph implementation
- Creating the fundamental data structures
- Implementing the required traits
- Building support for indexing (optional but recommended)
- Testing your implementation
Core Components
A Graph API implementation requires several components:
- Graph Structure: The primary data structure that stores vertices and edges
- Vertex and Edge IDs: Unique identifiers for graph elements
- References: Structures that refer to vertices and edges
- Iterators: Types for traversing vertices and edges
- Index Support: Optional components for efficient element lookup by property values
Step 1: Define IDs and Basic Structures
Start by defining your graph ID types and core data structure:
#![allow(unused)] fn main() { // Example vertex and edge IDs pub struct MyVertexId { label: u16, // Identifies the variant of the vertex enum index: u32, // Unique index within that label } pub struct MyEdgeId { label: u16, // Identifies the variant of the edge enum index: u32, // Unique index within that label from: MyVertexId, to: MyVertexId, } // Import the necessary traits use graph_api_lib::{ Graph, Element, EdgeSearch, VertexSearch, SupportsVertexLabelIndex, SupportsEdgeLabelIndex, SupportsVertexHashIndex, SupportsVertexRangeIndex, SupportsEdgeRangeIndex, SupportsVertexFullTextIndex, SupportsEdgeAdjacentLabelIndex, SupportsClear, SupportsElementRemoval }; // Main graph structure pub struct MyGraph<Vertex, Edge> where Vertex: Element, // From graph_api_lib Edge: Element, // From graph_api_lib { // Your internal storage goes here // Example: vertices: Vec<YourVertexStorage<Vertex>>, edges: Vec<YourEdgeStorage<Edge>>, // More fields as needed (indexes, etc.) } }
Step 2: Define Reference Types
Create reference types that will be returned when accessing vertices and edges:
#![allow(unused)] fn main() { pub struct MyVertexReference<'graph, Graph> where Graph: graph_api_lib::Graph, { id: Graph::VertexId, weight: &'graph Graph::Vertex, // Additional fields if needed } pub struct MyVertexReferenceMut<'graph, Graph> where Graph: graph_api_lib::Graph, { id: Graph::VertexId, weight: &'graph mut Graph::Vertex, // Additional fields (like a reference to indexes) indexes: &'graph mut YourIndexStorage, } // Similar for EdgeReference and EdgeReferenceMut }
Step 3: Implement the Required Traits
Now implement the required traits for your reference types:
#![allow(unused)] fn main() { impl<'graph, Graph> graph_api_lib::VertexReference<'graph, Graph> for MyVertexReference<'graph, Graph> where Graph: graph_api_lib::Graph, { fn id(&self) -> Graph::VertexId { self.id } fn weight(&self) -> &Graph::Vertex { self.weight } fn project< 'reference, T: graph_api_lib::Project<'reference, <Graph as graph_api_lib::Graph>::Vertex>, >( &'reference self, ) -> Option<T> { graph_api_lib::Project::project(self.weight) } } // Implement VertexReferenceMut, EdgeReference, EdgeReferenceMut similarly }
Step 4: Define Iterator Types
Create iterator types for walking through vertices and edges:
#![allow(unused)] fn main() { pub struct MyVertexIter<'search, 'graph, Graph> where Graph: graph_api_lib::Graph + 'graph, { // Internal state needed for iteration // Example: vertices: &'graph [YourVertexStorage<Graph::Vertex>], current_label: usize, current_index: usize, count: usize, limit: usize, } // Implement Iterator for MyVertexIter impl<'graph, Graph> Iterator for MyVertexIter<'_, 'graph, Graph> where Graph: graph_api_lib::Graph<VertexId=MyVertexId> + 'graph, { type Item = MyVertexReference<'graph, Graph>; fn next(&mut self) -> Option<Self::Item> { // Your iterator implementation // Return the next vertex reference, or None if done } } // Similarly for MyEdgeIter }
Step 5: Implement the Graph Trait
Now implement the Graph
trait for your graph structure:
#![allow(unused)] fn main() { // First implement the core Graph trait impl<Vertex, Edge> Graph for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Define the core types type Vertex = Vertex; type Edge = Edge; type VertexId = MyVertexId; type EdgeId = MyEdgeId; // Reference types type VertexReference<'graph> = MyVertexReference<'graph, Self> where Self: 'graph; type VertexReferenceMut<'graph> = MyVertexReferenceMut<'graph, Self> where Self: 'graph; type EdgeReference<'graph> = MyEdgeReference<'graph, Self> where Self: 'graph; type EdgeReferenceMut<'graph> = MyEdgeReferenceMut<'graph, Self> where Self: 'graph; // Iterator types type EdgeIter<'search, 'graph> = MyEdgeIter<'search, 'graph, Self> where Self: 'graph; type VertexIter<'search, 'graph> = MyVertexIter<'search, 'graph, Self> where Self: 'graph; // Implement the core graph operations fn add_vertex(&mut self, vertex: Self::Vertex) -> Self::VertexId { // Implementation details } fn add_edge( &mut self, from: Self::VertexId, to: Self::VertexId, edge: Self::Edge, ) -> Self::EdgeId { // Implementation details } fn vertex(&self, id: Self::VertexId) -> Option<Self::VertexReference<'_>> { // Implementation details } fn vertex_mut(&mut self, id: Self::VertexId) -> Option<Self::VertexReferenceMut<'_>> { // Implementation details } fn vertices<'search>( &self, vertex_search: &VertexSearch<'search, Self>, ) -> Self::VertexIter<'search, '_> { // Implementation details } fn edge(&self, id: Self::EdgeId) -> Option<Self::EdgeReference<'_>> { // Implementation details } fn edge_mut(&mut self, id: Self::EdgeId) -> Option<Self::EdgeReferenceMut<'_>> { // Implementation details } fn edges<'search>( &self, id: Self::VertexId, search: &EdgeSearch<'search, Self>, ) -> Self::EdgeIter<'search, '_> { // Implementation details } // No clear method here - it's moved to the SupportsClear trait } // Then implement support traits for the features you want to provide // Implement SupportsElementRemoval if you want to support removing vertices and edges impl<Vertex, Edge> SupportsElementRemoval for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { fn remove_vertex(&mut self, id: Self::VertexId) -> Option<Self::Vertex> { // Implementation details for removing a vertex // Remember to handle connected edges and update indexes } fn remove_edge(&mut self, id: Self::EdgeId) -> Option<Self::Edge> { // Implementation details for removing an edge // Remember to update adjacency lists and indexes } } impl<Vertex, Edge> SupportsVertexLabelIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Any trait-specific methods would go here } impl<Vertex, Edge> SupportsEdgeLabelIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Any trait-specific methods would go here } impl<Vertex, Edge> SupportsVertexHashIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Any trait-specific methods would go here } // Add more support trait implementations as needed // Implement SupportsClear if you want to support clearing the graph impl<Vertex, Edge> SupportsClear for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { fn clear(&mut self) { // Clear all graph data } } }
Step 6: Implement Indexing (Optional)
If your graph supports indexes, you'll need to implement the index handling. This typically involves:
- Creating a mutation listener for updating indexes on vertex/edge changes
- Creating storage for different index types (hash, range, full-text)
- Implementing index update logic when vertices or edges are modified
#![allow(unused)] fn main() { // Example MutationListener for vertex modifications pub struct MyVertexMutationListener<'reference, Element> { // References to indexes and ID information indexes: &'reference mut YourIndexStorage, id: YourInternalId, } impl<'reference, Element> graph_api_lib::MutationListener<'reference, Element> for MyVertexMutationListener<'reference, Element> where Element: graph_api_lib::Element, { fn update(&mut self, index: <Element::Label as Label>::Index, before: Value, after: Value) { // Update the indexes with the changed value // Remove old index entry // Add new index entry } } }
Step 7: Advanced Features
Consider implementing additional features:
- Custom Traversal Optimizations: For specific types of queries
- Specialized Index Types: For domain-specific data types
- Persistence Layer: If your graph needs to be saved/loaded
- Concurrency Support: For multi-threaded access
Step 8: Testing
Test your implementation using the Graph API's test suite:
#![allow(unused)] fn main() { #[cfg(test)] mod test { use crate::MyGraph; use graph_api_test::test_suite; test_suite!(MyGraph::new()); } }
Best Practices
- Optimized ID Types: Use small, efficient ID types (consider using newtype patterns)
- Memory Efficiency: Consider spatial locality and data structure layout
- Index Handling: Carefully manage indexes to avoid inconsistencies
- Error Handling: Provide meaningful errors for invalid operations
- Documentation: Document the specific characteristics of your implementation
Example: Implementing a Simple Adjacency List Graph
Here's a sketch of how you might implement a simple adjacency list graph:
#![allow(unused)] fn main() { pub struct AdjListGraph<V, E> { vertices: Vec<Option<V>>, edges: Vec<Option<EdgeData<E>>>, adjacency: Vec<Vec<EdgeIndex>>, // Outgoing edges for each vertex } struct EdgeData<E> { edge: E, from: VertexIndex, to: VertexIndex, } // Implement the necessary traits... }
Next Steps
Implementing a graph backend requires careful attention to detail, but the Graph API's trait system provides a clear structure to follow. By implementing the core traits and considering performance implications, you can create a specialized graph backend that perfectly fits your use case.
Remember to check the source code of existing implementations like SimpleGraph
for more detailed examples of how to
handle complex scenarios like indexing and traversal.
Creating a Graph Implementation
This chapter provides detailed guidance for implementing a new Graph API backend. It expands on the concepts introduced in the Implementation Guide for Graph Backends.
Design Considerations
Before starting implementation, consider these key design decisions:
-
Storage Strategy: How will you store vertices and edges? Options include:
- Adjacency lists (good for sparse graphs)
- Adjacency matrices (good for dense graphs)
- Edge lists
- Custom hybrid approaches
-
Memory vs. Performance Trade-offs: Will your implementation prioritize:
- Low memory usage
- Fast traversal operations
- Fast mutation operations
- Fast index lookups
-
Feature Support: Which Graph API features will you support?
- Basic graph operations (required)
- Label indexes
- Hash indexes
- Range indexes
- Full-text indexes
- Other specialized indexes
-
Usage Context: Is your implementation designed for:
- General-purpose use
- Specific application domains
- Particular data patterns
Core Implementation Approach
A typical implementation follows these steps:
1. Define ID Types
#![allow(unused)] fn main() { pub struct MyVertexId { // Implementation-specific fields label: u16, index: u32, } pub struct MyEdgeId { // Implementation-specific fields label: u16, index: u32, from: MyVertexId, to: MyVertexId, } }
Make sure your ID types:
- Implement
Copy
,Clone
,Debug
,Eq
,PartialEq
, andHash
- Are small and efficient (IDs are used extensively)
- Can be converted to
ElementId
2. Create Core Graph Structure
#![allow(unused)] fn main() { pub struct MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Your internal storage vertices: Vec<VertexCollection>, edges: Vec<EdgeCollection>, // Optional index structures // ... } }
3. Implement Reference Types
Reference types provide safe access to graph elements:
#![allow(unused)] fn main() { pub struct MyVertexReference<'graph, Graph> where Graph: graph_api_lib::Graph, { id: Graph::VertexId, weight: &'graph Graph::Vertex, } pub struct MyVertexReferenceMut<'graph, Graph> where Graph: graph_api_lib::Graph, { id: Graph::VertexId, weight: &'graph mut Graph::Vertex, // Reference to indexes for updating indexes: &'graph mut IndexCollection, } }
Implement similar types for EdgeReference
and EdgeReferenceMut
.
4. Implement Iterator Types
Create iterator types for traversing vertices and edges:
#![allow(unused)] fn main() { pub struct MyVertexIter<'search, 'graph, Graph> where Graph: graph_api_lib::Graph + 'graph, { // Internal state graph: &'graph Graph, current_position: usize, // ... } impl<'graph, Graph> Iterator for MyVertexIter<'_, 'graph, Graph> where Graph: graph_api_lib::Graph<VertexId=MyVertexId> + 'graph, { type Item = MyVertexReference<'graph, Graph>; fn next(&mut self) -> Option<Self::Item> { // Implementation } } }
5. Mutation Listener for Indexes
If your graph supports indexes, implement a mutation listener:
#![allow(unused)] fn main() { pub struct MyMutationListener<'reference, Element> { // References to index structures indexes: &'reference mut IndexCollection, id: VertexInternalId, } impl<'reference, Element> graph_api_lib::MutationListener<'reference, Element> for MyMutationListener<'reference, Element> where Element: graph_api_lib::Element, { fn update(&mut self, index: <Element::Label as Label>::Index, before: Value, after: Value) { // Update indexes as needed } } }
Implementation Tips
Efficiency Considerations
-
Memory Layout: Keep related data together to improve cache locality.
-
Data Structures: Choose the right data structures for your use case:
Vec
for dense storage with stable IDsHashMap
for sparse storage with integer IDsBTreeMap
for ordered collections- Consider specialized structures like
smallbox
for tiny collections
-
Avoid Cloning: Pass references whenever possible instead of cloning data.
-
Index Carefully: Indexes improve query performance but add overhead to mutations.
Supporting Advanced Features
For implementing advanced features like range and full-text indexing:
- Range Indexes: Consider using
BTreeMap
or similar ordered collections. - Full-text Indexes: Implement tokenization, stemming, and inverted indexes.
- Custom Properties: Support for user-defined property types may require generics.
Example Implementation Structure
A typical implementation might be organized like this:
my_graph/
├── src/
│ ├── lib.rs # Main exports and Graph trait implementation
│ ├── id.rs # ID type definitions
│ ├── graph/
│ │ ├── mod.rs # Core graph implementation
│ │ ├── iter.rs # Iterator implementations
│ │ └── debug.rs # Debug helpers
│ └── index/
│ ├── mod.rs # Index type definitions
│ ├── hash.rs # Hash index implementation
│ ├── range.rs # Range index implementation
│ └── full_text.rs # Full-text index implementation
└── tests/
└── integration.rs # Integration tests using test_suite!
Next Steps
After implementing the basic graph functionality:
- Review the Testing Your Implementation chapter for testing strategies.
- Check the Implementing Indexes chapter for adding index support.
Remember that implementing a graph backend is an iterative process. Start with a minimal working implementation, test it thoroughly, and then add more features incrementally.
Testing Your Implementation
Testing a graph implementation thoroughly is essential to ensure correctness, reliability, and performance. The Graph API provides a comprehensive test suite to verify that your implementation meets all the requirements of the Graph API traits.
Using the Test Suite
The graph-api-test
crate provides a test suite that can be used to verify your graph implementation. The test suite
covers:
- Basic graph operations (adding/removing vertices and edges)
- Graph traversal and walker steps
- Index functionality
- Edge cases and error handling
- Fuzzing tests for robustness
Setting Up the Test Suite
To test your implementation:
- Add
graph-api-test
as a dev-dependency in yourCargo.toml
:
[dev-dependencies]
graph-api-test = { version = "0.2.0", features = ["vertex-hash-index", "vertex-label-index", "vertex-full-text-index", "vertex-range-index", "edge-label-index"] }
- Create a test module in your crate:
#![allow(unused)] fn main() { #[cfg(test)] mod test { use crate::MyGraph; use graph_api_test::test_suite; test_suite!(MyGraph::new()); } }
The test_suite!
macro generates test cases for all the functionality supported by the Graph API.
Understanding Test Coverage
The test suite covers several areas:
Basic Graph Operations
- Adding vertices and edges
- Removing vertices and edges
- Retrieving vertices and edges by ID
- Modifying vertex and edge properties
Walker Steps
Tests for all walker steps:
vertices
,vertices_by_id
,edges
filter
,map
,limit
,first
head
,tail
,detour
collect
,count
,into_iter
probe
,mutate
,dbg
- Context operations
Index Tests
If your graph supports indexes, tests for:
- Vertex and edge label indexes
- Hash indexes
- Range indexes
- Full-text indexes
Conditional Feature Testing
The test suite adapts to the features your graph implementation supports. For example, if your graph doesn't support range indexes, those tests will be skipped.
Writing Additional Tests
While the test suite covers most functionality, you should write additional tests for:
- Implementation-specific features: Any custom functionality your graph provides.
- Edge cases: Unusual usage patterns specific to your implementation.
- Performance tests: Verify that your implementation meets performance requirements.
Example: Testing a Custom Feature
#![allow(unused)] fn main() { #[test] fn test_my_custom_feature() { let mut graph = MyGraph::new(); // Test setup let vertex = graph.add_vertex(Vertex::Person { name: "Test".to_string(), ... }); // Test the custom feature let result = graph.my_custom_feature(vertex); // Assertions assert_eq!(result, expected_value); } }
Fuzzing Tests
The test suite includes fuzzing tests that apply random sequences of operations to your graph to find edge cases and bugs. These tests help ensure your implementation is robust against unexpected usage patterns.
Features and Extensions
This chapter covers how to extend your graph implementation with additional features and capabilities beyond the basic Graph API requirements.
Core vs. Extended Features
The Graph API defines a set of core features required for all graph implementations, plus optional extended features:
Required Core Features
All graph implementations must support:
- Basic Graph Management: Adding/removing vertices and edges
- Element Access: Retrieving vertices and edges by ID
- Graph Traversal: Supporting the walker API
- Vertex and Edge Types: Using the client's vertex and edge enum definitions
Optional Extended Features
Optional features you may want to support:
- Indexing: Various indexing strategies (label, hash, range, full-text)
- Graph Clearing: Efficiently removing all elements
- Custom Traversal Steps: Graph-specific optimization for certain traversals
- Persistence: Saving and loading graphs from storage
- Concurrency: Thread-safe access to graph elements
Declaring Feature Support
The Graph API uses separate support traits to indicate which features a graph implementation supports:
#![allow(unused)] fn main() { // First implement the core Graph trait impl<Vertex, Edge> Graph for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Core Graph functionality // ... } // Then implement the relevant support traits impl<Vertex, Edge> SupportsVertexLabelIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Any trait-specific methods if needed } impl<Vertex, Edge> SupportsEdgeLabelIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Any trait-specific methods if needed } // Skip implementing SupportsVertexFullTextIndex if not supported }
Implementing Optional Features
Index Support
To add index support to your graph implementation:
- Declare Support: Update your
Graph
implementation to declare support for specific index types. - Create Index Structures: Implement the appropriate data structures for each index type.
- Update Index Maintenance: Ensure indexes are updated when vertices/edges are added, modified, or removed.
Example for hash index support:
#![allow(unused)] fn main() { // Declare support type SupportsVertexHashIndex = Supported; // Create index structure struct HashIndex<K, V> { map: HashMap<K, HashSet<V>>, } impl<K: Hash + Eq, V: Copy + Eq + Hash> HashIndex<K, V> { fn new() -> Self { Self { map: HashMap::new(), } } fn insert(&mut self, key: K, value: V) { self.map.entry(key).or_default().insert(value); } fn remove(&mut self, key: &K, value: &V) { if let Some(values) = self.map.get_mut(key) { values.remove(value); if values.is_empty() { self.map.remove(key); } } } fn get(&self, key: &K) -> impl Iterator<Item=V> + '_ { self.map .get(key) .into_iter() .flat_map(|values| values.iter().copied()) } } }
Graph Clearing
To support clearing all elements from a graph, implement the SupportsClear
trait:
#![allow(unused)] fn main() { // First implement the Graph trait impl<Vertex, Edge> Graph for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // ...core graph functionality } // Then implement the SupportsClear trait impl<Vertex, Edge> SupportsClear for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { fn clear(&mut self) { self.vertices.clear(); self.edges.clear(); self.indexes.iter_mut().for_each(|idx| idx.clear()); // Clear any other data structures } } }
Extending the Walker API
You can extend the Walker API for your graph implementation by creating custom steps:
#![allow(unused)] fn main() { // Define a custom step extension trait pub trait CustomStepExt<'g, G: Graph> { // A custom step that applies a specific algorithm fn my_custom_step(self, param: CustomParam) -> Self; } // Implement for the appropriate walker type impl<'g, G: Graph> CustomStepExt<'g, G> for VertexWalker<'g, G> { fn my_custom_step(self, param: CustomParam) -> Self { // Implementation that uses the walker's state // and applies your custom algorithm // ... } } }
See custom_step.rs for a full example.
Integration with External Systems
You might want to integrate your graph implementation with external systems:
Serialization/Deserialization
Add support for saving and loading graphs:
#![allow(unused)] fn main() { impl<V, E> MyGraph<V, E> where V: Element + Serialize + for<'de> Deserialize<'de>, E: Element + Serialize + for<'de> Deserialize<'de>, { pub fn save<W: Write>(&self, writer: W) -> Result<(), Error> { // Serialize the graph } pub fn load<R: Read>(reader: R) -> Result<Self, Error> { // Deserialize the graph } } }
Database Integration
Create adapters for database systems:
#![allow(unused)] fn main() { pub struct DatabaseBackedGraph<V, E> { connection: DbConnection, // Other fields for caching, etc. _phantom: PhantomData<(V, E)>, } impl<V, E> Graph for DatabaseBackedGraph<V, E> where V: Element, E: Element, { // Implement Graph trait with database operations // ... } }
Feature Compatibility
When implementing optional features, use trait bounds in your API:
#![allow(unused)] fn main() { pub fn search_full_text<G>(graph: &G, text: &str) -> Vec<G::VertexId> where G: Graph + SupportsVertexFullTextIndex, { // Can safely use full text search here graph.walk() .vertices(Vertex::person_by_biography(text)) .collect() } }
Best Practices for Extensions
When adding features and extensions, follow these best practices:
- Maintain Core API Compatibility: Ensure extensions don't break the core Graph API.
- Document Extensions Thoroughly: Clearly document what extensions are available and how to use them.
- Test Extensions Separately: Write dedicated tests for extended features.
- Consider Performance Impact: Ensure extensions don't negatively impact core operations.
Implementing Indexes
Indexes are a critical component for efficient graph operations, allowing for quick lookups of elements based on property values. This chapter guides you through implementing different types of indexes in your graph backend.
Index Types Overview
The Graph API supports four main types of indexes:
- Label Indexes: Index vertices and edges by their label (enum variant)
- Hash Indexes: Lookup elements by a property value using exact matching
- Range Indexes: Find elements with property values in a specific range
- Full-Text Indexes: Search text properties using more complex matching
The Index Infrastructure
The Index Trait
The Graph API defines the Index
trait to represent element indexes:
#![allow(unused)] fn main() { pub trait Index where Self: Sized + Copy + Eq + Hash + Debug, { /// The type of the element being indexed fn ty(&self) -> TypeId; /// The index ordinal fn ordinal(&self) -> usize; /// The type of index fn index_type(&self) -> IndexType; } }
Index Types
The IndexType
enum defines the supported index types:
#![allow(unused)] fn main() { pub enum IndexType { Hash, // Exact matching Range, // Range queries FullText, // Text search } }
Declare Index Support
To support indexes in your implementation, implement the appropriate support traits:
#![allow(unused)] fn main() { impl<Vertex, Edge> Graph for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, { // Core Graph functionality // ... } // Implement support traits for the indexing features you want to provide impl<Vertex, Edge> SupportsVertexLabelIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} impl<Vertex, Edge> SupportsEdgeLabelIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} impl<Vertex, Edge> SupportsVertexHashIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} impl<Vertex, Edge> SupportsEdgeHashIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} impl<Vertex, Edge> SupportsVertexRangeIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} impl<Vertex, Edge> SupportsEdgeRangeIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} impl<Vertex, Edge> SupportsVertexFullTextIndex for MyGraph<Vertex, Edge> where Vertex: Element, Edge: Element, {} }
Implementing Label Indexes
Label indexes allow for quickly finding all vertices or edges of a specific label (enum variant).
Data Structure
A simple label index can be implemented using a vector of sets:
#![allow(unused)] fn main() { /// Stores vertex IDs by label struct LabelIndex { // One entry per label, containing all vertex IDs with that label indexes: Vec<HashSet<VertexId>>, } impl LabelIndex { fn new(label_count: usize) -> Self { Self { indexes: (0..label_count).map(|_| HashSet::new()).collect(), } } fn insert(&mut self, label: usize, id: VertexId) { self.indexes[label].insert(id); } fn remove(&mut self, label: usize, id: &VertexId) { self.indexes[label].remove(id); } fn get(&self, label: usize) -> impl Iterator<Item=VertexId> + '_ { self.indexes[label].iter().copied() } } }
Integration with Vertex/Edge Operations
Update label indexes during vertex/edge operations:
#![allow(unused)] fn main() { fn add_vertex(&mut self, vertex: Self::Vertex) -> Self::VertexId { let label_idx = vertex.label().ordinal(); let vertex_id = // create a new vertex ID // Add to label index self.label_index.insert(label_idx, vertex_id); // Rest of implementation // ... } }
Implementing Hash Indexes
Hash indexes allow for quick lookups by property value using exact matching.
Data Structure
A hash index maps property values to sets of element IDs:
#![allow(unused)] fn main() { struct HashIndex<K, V> { map: HashMap<K, HashSet<V>>, } impl<K: Hash + Eq, V: Copy + Eq + Hash> HashIndex<K, V> { fn new() -> Self { Self { map: HashMap::new(), } } fn insert(&mut self, key: K, value: V) { self.map.entry(key).or_default().insert(value); } fn remove(&mut self, key: &K, value: &V) { if let Some(values) = self.map.get_mut(key) { values.remove(value); if values.is_empty() { self.map.remove(key); } } } fn get(&self, key: &K) -> impl Iterator<Item=V> + '_ { self.map .get(key) .into_iter() .flat_map(|values| values.iter().copied()) } } }
Integration with Mutation Handling
When a vertex or edge property changes, you need to update the hash indexes:
#![allow(unused)] fn main() { pub struct MyMutationListener<'reference, Element> { indexes: &'reference mut IndexCollection, id: VertexInternalId, } impl<'reference, Element> MutationListener<'reference, Element> for MyMutationListener<'reference, Element> where Element: Element, { fn update(&mut self, index: <Element::Label as Label>::Index, before: Value, after: Value) { // Remove the old value from the index self.indexes[index.ordinal()].remove(&before, self.id); // Add the new value to the index self.indexes[index.ordinal()].insert(after, self.id); } } }
Implementing Range Indexes
Range indexes allow for finding elements with property values in a specific range.
Data Structure
A range index typically uses an ordered map like BTreeMap
:
#![allow(unused)] fn main() { struct RangeIndex<K, V> { map: BTreeMap<K, HashSet<V>>, } impl<K: Ord, V: Copy + Eq + Hash> RangeIndex<K, V> { fn new() -> Self { Self { map: BTreeMap::new(), } } fn insert(&mut self, key: K, value: V) { self.map.entry(key).or_default().insert(value); } fn remove(&mut self, key: &K, value: &V) { if let Some(values) = self.map.get_mut(key) { values.remove(value); if values.is_empty() { self.map.remove(key); } } } fn get(&self, key: &K) -> impl Iterator<Item=V> + '_ { self.map .get(key) .into_iter() .flat_map(|values| values.iter().copied()) } fn range<'a, R>(&'a self, range: R) -> impl Iterator<Item=V> + 'a where R: RangeBounds<K> + 'a, K: 'a, { self.map .range(range) .flat_map(|(_, values)| values.iter().copied()) } } }
Handling Range Queries
Implement range query support in your vertex and edge iterators:
#![allow(unused)] fn main() { fn vertices<'search>( &self, search: &VertexSearch<'search, Self>, ) -> Self::VertexIter<'search, '_> { // ...other search handling if let VertexSearch::Range { index, range, .. } = search { let index_storage = &self.indexes[index.ordinal()]; return VertexIter { iter: index_storage.range(range.clone(), index), // ...other fields }; } // ...default handling } }
Implementing Full-Text Indexes
Full-text indexes enable searching through text properties with capabilities like prefix matching, fuzzy matching, or tokenized searches.
Data Structure
A simple full-text index implementation might use an inverted index:
#![allow(unused)] fn main() { struct FullTextIndex<V> { // Maps tokens to element IDs tokens: HashMap<String, HashSet<V>>, // Maps element IDs to their full text contents: HashMap<V, String>, } impl<V: Copy + Eq + Hash> FullTextIndex<V> { fn new() -> Self { Self { tokens: HashMap::new(), contents: HashMap::new(), } } fn insert(&mut self, id: V, text: &str) { // Remove old content if it exists self.remove(&id); // Store the full text self.contents.insert(id, text.to_string()); // Tokenize the text and add to inverted index for token in tokenize(text) { self.tokens.entry(token).or_default().insert(id); } } fn remove(&mut self, id: &V) { if let Some(text) = self.contents.remove(id) { // Remove from token index for token in tokenize(&text) { if let Some(ids) = self.tokens.get_mut(&token) { ids.remove(id); if ids.is_empty() { self.tokens.remove(&token); } } } } } fn search(&self, query: &str) -> impl Iterator<Item=V> + '_ { // Tokenize the query let query_tokens: Vec<_> = tokenize(query).collect(); // Find matches query_tokens .into_iter() .filter_map(move |token| self.tokens.get(&token)) .flatten() .copied() .collect::<HashSet<_>>() // Deduplicate .into_iter() } } // Helper function to tokenize text fn tokenize(text: &str) -> impl Iterator<Item=String> + '_ { text.to_lowercase() .split_whitespace() .map(|s| s.to_string()) } }
Advanced Full-Text Features
For more advanced full-text search capabilities, consider:
- Stemming: Reducing words to their root form (e.g., "running" → "run")
- N-grams: Creating token sequences for partial matching
- Fuzzy Matching: Supporting approximate matching with edit distance
- Relevance Scoring: Ranking results by relevance to the query
Efficient Index Updates
For efficient index updates, consider these strategies:
Lazy Updates
Only update indexes when needed:
#![allow(unused)] fn main() { fn remove_vertex(&mut self, id: Self::VertexId) -> Option<Self::Vertex> { // Fetch the vertex let vertex = self.vertices.get(id.index())?; // Only update indexes if the vertex exists for index in vertex.label().indexes() { self.indexes[index.ordinal()].remove(&self.get_value(vertex, index), id.index()); } // Remove the vertex // ... } }
Batched Updates
For bulk operations, batch index updates:
#![allow(unused)] fn main() { fn add_vertices_batch(&mut self, vertices: Vec<Self::Vertex>) -> Vec<Self::VertexId> { let mut ids = Vec::with_capacity(vertices.len()); let mut index_updates = Vec::new(); // First pass: add vertices and collect index updates for vertex in vertices { let id = self.add_vertex_no_index(vertex); ids.push(id); // Collect index updates for index in vertex.label().indexes() { index_updates.push((index, self.get_value(&vertex, index), id)); } } // Second pass: apply all index updates for (index, value, id) in index_updates { self.indexes[index.ordinal()].insert(value, id); } ids } }
Common Pitfalls
When implementing indexes, watch out for these common issues:
- Index Inconsistency: Ensure indexes are always updated when elements change
- Memory Overhead: Indexes increase memory usage; consider selective indexing
- Type Safety: Ensure index type safety, especially with range and full-text indexes
- Empty Sets: Handle empty index entries to avoid wasted memory
- Concurrency: If supporting concurrent access, protect indexes with appropriate synchronization
Performance Considerations
For optimal index performance:
- Index Selection: Only index properties that will be frequently queried
- Data Structure Choice: Choose appropriate data structures based on workload
- Memory vs. Speed: Balance memory usage with lookup speed
- Measure: Benchmark index operations to identify bottlenecks
Testing Indexes
Test your index implementation thoroughly:
#![allow(unused)] fn main() { #[test] fn test_hash_index() { let mut graph = MyGraph::new(); // Add vertices with indexed properties let v1 = graph.add_vertex(Vertex::Person { name: "Alice".to_string(), age: 30, // ...other fields }); let v2 = graph.add_vertex(Vertex::Person { name: "Bob".to_string(), age: 25, // ...other fields }); // Test index lookup let results = graph.walk() .vertices(Vertex::person_by_name("Alice")) .collect::<Vec<_>>(); assert_eq!(results.len(), 1); assert_eq!(results[0].id(), v1); // Test index update after mutation graph.vertex_mut(v1).unwrap() .project_mut::<PersonMut<_>>().unwrap() .set_name("Alicia"); let results = graph.walk() .vertices(Vertex::person_by_name("Alicia")) .collect::<Vec<_>>(); assert_eq!(results.len(), 1); assert_eq!(results[0].id(), v1); } }
Next Steps
Implementing efficient indexes is key to building a high-performance graph backend. By carefully designing index structures and ensuring proper updates during mutations, you can provide fast lookup capabilities while maintaining reasonable memory usage. Remember that different use cases may require different indexing strategies, so consider the expected query patterns when deciding which indexes to implement and how to optimize them.
Benchmarking
Benchmarking is essential for measuring and comparing the performance of your graph implementation. This chapter covers how to set up and run benchmarks, interpret results, and use benchmarking to guide optimization.
Setting Up Benchmarks
The Graph API provides built-in benchmarking tools to help you evaluate your implementation's performance.
Adding Benchmarking Dependencies
First, add the necessary dependencies to your Cargo.toml
:
[dev-dependencies]
criterion = "0.5"
graph-api-benches = { version = "0.2.0", features = ["vertex-hash-index", "vertex-label-index", "vertex-full-text-index", "vertex-range-index", "edge-label-index"] }
[[bench]]
name = "my_graph_benchmarks"
harness = false
Creating a Basic Benchmark Suite
Create a benchmark file in your project's benches
directory:
#![allow(unused)] fn main() { // benches/my_graph_benchmarks.rs use criterion::{criterion_group, criterion_main, Criterion}; use graph_api_benches::bench_suite; use my_graph::MyGraph; fn criterion_benchmark(c: &mut Criterion) { bench_suite!(c, || MyGraph::new()); } criterion_group!(benches, criterion_benchmark); criterion_main!(benches); }
The bench_suite!
macro runs a standardized set of benchmarks against your graph implementation.
Standard Benchmark Suite
The standard benchmark suite in graph-api-benches
tests several aspects of graph performance:
1. Construction Benchmarks
Measures the performance of creating graph elements:
- Creating vertices
- Creating edges
- Creating graphs of different sizes
2. Query Benchmarks
Evaluates lookup and traversal performance:
- Vertex retrieval by ID
- Vertex retrieval by index
- Edge traversal
- Path finding
3. Mutation Benchmarks
Tests the efficiency of modifying the graph:
- Adding vertices and edges
- Removing vertices and edges
- Modifying vertex and edge properties
4. Traversal Benchmarks
Measures the performance of walking the graph:
- Simple steps (vertices, edges)
- Filter operations
- Map operations
- Complex traversals
5. Scale Benchmarks
Assesses how performance scales with graph size:
- Small graphs (10s of vertices)
- Medium graphs (100s of vertices)
- Large graphs (1000s of vertices)
- Huge graphs (10,000s of vertices)
Running Benchmarks
To run the benchmarks:
# Run all benchmarks
cargo bench
# Run a specific benchmark
cargo bench -- vertex_insertion
API Reference
Check out the docs on docs.rs,
Support Traits
The Graph API uses a trait-based approach to indicate which optional features a graph implementation supports. This allows for a more flexible and extensible API, where new capabilities can be added over time without breaking existing implementations.
Core Trait
The core Graph
trait provides the essential functionality that all graph implementations must provide:
#![allow(unused)] fn main() { pub trait Graph: Sized + Debug { type Vertex: Debug + Element; type Edge: Debug + Element; type VertexId: Debug + Eq + PartialEq + Copy + Clone + Hash + Into<ElementId<Self>> + 'static; type EdgeId: Debug + Eq + PartialEq + Copy + Clone + Hash + Into<ElementId<Self>> + 'static; // Reference types type VertexReference<'graph>: VertexReference<'graph, Self> where Self: 'graph; type VertexReferenceMut<'graph>: VertexReferenceMut<'graph, Self> where Self: 'graph; type EdgeReference<'graph>: EdgeReference<'graph, Self> where Self: 'graph; type EdgeReferenceMut<'graph>: EdgeReferenceMut<'graph, Self> where Self: 'graph; // Iterator types type EdgeIter<'search, 'graph>: Iterator<Item=Self::EdgeReference<'graph>> where Self: 'graph; type VertexIter<'search, 'graph>: Iterator<Item=Self::VertexReference<'graph>> where Self: 'graph; // Core methods fn add_vertex(&mut self, vertex: Self::Vertex) -> Self::VertexId; fn add_edge(&mut self, from: Self::VertexId, to: Self::VertexId, edge: Self::Edge) -> Self::EdgeId; fn vertex(&self, id: Self::VertexId) -> Option<Self::VertexReference<'_>>; fn vertex_mut(&mut self, id: Self::VertexId) -> Option<Self::VertexReferenceMut<'_>>; fn vertices<'search>(&self, vertex_search: &VertexSearch<'search, Self>) -> Self::VertexIter<'search, '_>; fn edge(&self, id: Self::EdgeId) -> Option<Self::EdgeReference<'_>>; fn edge_mut(&mut self, id: Self::EdgeId) -> Option<Self::EdgeReferenceMut<'_>>; fn edges<'search>(&self, id: Self::VertexId, search: &EdgeSearch<'search, Self>) -> Self::EdgeIter<'search, '_>; // Default implementations fn dbg<T: Into<ElementId<Self>>>(&self, id: T) -> String { ... } fn walk(&self) -> StartWalkerBuilder<ImmutableMarker, Self> { ... } fn walk_mut(&mut self) -> StartWalkerBuilder<MutableMarker, Self> { ... } // Default implementation that panics if SupportsClear is not implemented fn clear(&mut self) { panic!("This graph implementation does not support clearing. Implement the SupportsClear trait for this graph type to add clearing support.") } } }
Support Traits
These traits enable optional features for graph implementations. Each support trait extends the Graph
trait, so
implementations must first implement Graph
before they can implement any support traits.
Vertex Index Support
#![allow(unused)] fn main() { /// Supports indexing of vertices by label pub trait SupportsVertexLabelIndex: Graph {} /// Supports indexing of vertices by field using a hash index pub trait SupportsVertexHashIndex: Graph {} /// Supports indexing of vertices by field with range queries pub trait SupportsVertexRangeIndex: Graph {} /// Supports indexing of vertices by field using a full text index pub trait SupportsVertexFullTextIndex: Graph {} }
Edge Index Support
#![allow(unused)] fn main() { /// Supports indexing of edges by label pub trait SupportsEdgeLabelIndex: Graph {} /// Supports indexing of edges by field using a hash index pub trait SupportsEdgeHashIndex: Graph {} /// Supports indexing of edges by field with range queries pub trait SupportsEdgeRangeIndex: Graph {} /// Supports indexing of edges by adjacent vertex label pub trait SupportsEdgeAdjacentLabelIndex: Graph {} }
Mutation Support
#![allow(unused)] fn main() { /// Supports clearing all vertices and edges pub trait SupportsClear: Graph { /// Clears the graph, removing all vertices and edges fn clear(&mut self); } /// Supports removal of individual vertices and edges pub trait SupportsElementRemoval: Graph { /// Removes a vertex from the graph and returns the vertex. fn remove_vertex(&mut self, id: Self::VertexId) -> Option<Self::Vertex>; /// Removes an edge from the graph and returns the edge. fn remove_edge(&mut self, id: Self::EdgeId) -> Option<Self::Edge>; } }
Using Support Traits
When you implement a graph, first implement the core Graph
trait, and then implement any support traits for the
features you want to provide:
#![allow(unused)] fn main() { // Core implementation impl<V, E> Graph for MyGraph<V, E> where V: Element, E: Element, { // Implement required methods and types... } // Add support for vertex label indexing impl<V, E> SupportsVertexLabelIndex for MyGraph<V, E> where V: Element, E: Element, { // No additional methods required } // Add support for clearing impl<V, E> SupportsClear for MyGraph<V, E> where V: Element, E: Element, { fn clear(&mut self) { // Implementation to clear all vertices and edges self.vertices.clear(); self.edges.clear(); // Clear any indexes... } } }
Using Supported Features
When writing code that uses supported features, use trait bounds to ensure the graph implementation supports the required features:
#![allow(unused)] fn main() { // Function that uses vertex label indexing fn get_people<G>(graph: &G) -> Vec<G::VertexId> where G: Graph + SupportsVertexLabelIndex, { // Can safely use label indexing here graph.walk() .vertices(Vertex::person()) .map(|v| v.id()) .collect() } // Function that uses range indexing fn get_adults<G>(graph: &G) -> Vec<G::VertexId> where G: Graph + SupportsVertexRangeIndex, { // Can safely use range indexing here graph.walk() .vertices(Vertex::person_by_age_range(18..)) .map(|v| v.id()) .collect() } // Function that removes elements fn remove_vertex_and_connected_edges<G>(graph: &mut G, id: G::VertexId) where G: Graph + SupportsElementRemoval, { // Can safely use element removal here graph.remove_vertex(id); } }
Adding New Support Traits
The trait-based approach allows for adding new support traits over time without breaking existing code. To add a new support trait:
- Define the new trait extending the
Graph
trait - Add any required methods
- Implement the trait for graph implementations that support the feature
#![allow(unused)] fn main() { // Example: Adding support for spatial indexing pub trait SupportsSpatialIndex: Graph { // Optional specialized methods for spatial indexing fn nearest_neighbors(&self, point: Point, k: usize) -> Vec<Self::VertexId>; } // Implement for a graph that supports spatial indexing impl<V, E> SupportsSpatialIndex for SpatialGraph<V, E> where V: Element, E: Element, { fn nearest_neighbors(&self, point: Point, k: usize) -> Vec<Self::VertexId> { // Implementation... } } }
Derive Macros
The Graph API provides derive macros to make working with your graph model types straightforward and type-safe. These macros generate code that integrates your custom types with the Graph API framework.
Overview
There are two primary derive macros:
VertexExt
- For vertex enum typesEdgeExt
- For edge enum types
These macros generate:
- Label enums for type-safe queries
- Index enums for efficient property lookups
- Helper methods for traversing and querying the graph
- Projection types for type-safe access to variant fields
- Walker builder filter extensions for type-safe filtering
VertexExt Derive Macro
Generated Types
When you apply #[derive(VertexExt)]
to an enum, the following types are generated:
- VertexLabel enum - Contains variants matching your enum's variants
- VertexIndex enum - Contains variants for each indexed field
- Projection structs - For accessing fields in a type-safe way
Example
use graph_api_derive::VertexExt;
use uuid::Uuid;
#[derive(Debug, Clone, VertexExt)]
pub enum Vertex {
Person {
#[index]
name: String,
#[index(range)]
age: u64,
#[index(full_text)]
biography: String,
unique_id: Uuid, // Not indexed
},
Project {
name: String,
},
Tag, // Unit variant
}
This generates:
// Label enum
pub enum VertexLabel {
Person,
Project,
Tag,
}
// Index enum with methods
pub enum VertexIndex {
PersonName,
PersonAge,
PersonBiography,
}
// Projection structs (simplified)
pub struct Person<'a, V> {
name: &'a String,
age: &'a u64,
biography: &'a String,
unique_id: &'a Uuid,
}
pub struct PersonMut<'a, V, L> {
name: &'a mut String,
age: &'a mut u64,
biography: &'a mut String,
unique_id: &'a mut Uuid,
}
Generated Methods
The derive macro generates several methods on the VertexIndex
enum:
Label-based Querying
For each enum variant, a method is generated to query by label:
// Query for all Person vertices
Vertex::person() -> VertexSearch<'_, Graph>
Property-based Querying
For each indexed field, methods are generated for exact matching:
// Query for Person vertices with a specific name
Vertex::person_by_name(value: & str) -> VertexSearch<'_, Graph>
// Query for Person vertices with a specific age
Vertex::person_by_age(value: u64) -> VertexSearch<'_, Graph>
Range-based Querying
For fields with the #[index(range)]
attribute:
// Query for Person vertices with age in a range
Vertex::person_by_age_range(range: Range<u64>) -> VertexSearch<'_, Graph>
Full-text Querying
For fields with the #[index(full_text)]
attribute:
// Query for Person vertices with matching text in biography
Vertex::person_by_biography(search: & str) -> VertexSearch<'_, Graph>
EdgeExt Derive Macro
Generated Types
When you apply #[derive(EdgeExt)]
to an enum, similar types are generated:
- EdgeLabel enum - Contains variants matching your enum's variants
- EdgeIndex enum - Contains variants for each indexed field
- Projection structs - For accessing fields in a type-safe way
Example
use graph_api_derive::EdgeExt;
#[derive(Debug, Clone, EdgeExt)]
pub enum Edge {
Knows {
since: u32,
},
Created,
Rated(Rating),
}
This generates:
// Label enum
pub enum EdgeLabel {
Knows,
Created,
Rated,
}
// Index enum with methods
pub enum EdgeIndex {
// EdgeIndex variants (if indexed fields exist)
}
Generated Methods
The EdgeIndex enum offers methods for edge traversal:
// Query for all Knows edges
EdgeIndex::knows() -> EdgeSearch<'_, Graph>
// Specify outgoing direction
EdgeIndex::knows().outgoing()
// Specify incoming direction
EdgeIndex::knows().incoming()
// Limit result count
EdgeIndex::knows().limit(n)
Walker Builder Filter Extensions
The derive macros also generate filter extension methods on the walker builders to simplify filtering based on vertex/edge types.
For Unit Variants
For unit variants (without fields), a single filter method is generated:
// Filter for all instances of the unit variant
fn filter_tag(self) -> /* walker builder */
Usage example:
// Get all Tag vertices
let tags = graph
.walk()
.vertices(VertexSearch::scan())
.filter_tag()
.collect::<Vec<_ > > ();
For Named Fields Variants
For variants with named fields, two filter methods are generated:
// 1. Filter for all instances of this variant
fn filter_person(self) -> /* walker builder */
// 2. Filter with custom logic using the projected fields
fn filter_by_person<F>(self, filter: F) -> /* walker builder */
where
F: Fn(Person<Graph::Vertex>, &Context) -> bool
Usage example:
// Get all Person vertices
let all_persons = graph
.walk()
.vertices(VertexSearch::scan())
.filter_person()
.collect::<Vec<_ > > ();
// Get Person vertices with specific criteria
let adults = graph
.walk()
.vertices(VertexSearch::scan())
.filter_by_person( | person, _ | person.age() > = 18)
.collect::<Vec<_ > > ();
For Tuple Variants
For tuple variants, similar filter methods are generated:
// 1. Filter for all instances of this variant
fn filter_rated(self) -> /* walker builder */
// 2. Filter with custom logic using the tuple fields
fn filter_by_rated<F>(self, filter: F) -> /* walker builder */
where
F: Fn(&Rating, &Context) -> bool
Usage example:
// Get edges with high ratings
let high_ratings = graph
.walk()
.vertices_by_id([person_id])
.edges(EdgeSearch::scan())
.filter_by_rated( | rating, _ | rating.stars > = 4)
.collect::<Vec<_ > > ();
Benefits of Filter Extensions
These filter extensions provide several advantages:
- Type Safety - The closures receive strongly typed projections
- Code Clarity - Filters are expressive and self-documenting
- IDE Support - Better autocompletion for variant fields
- Context Access - Access to the walker's context object
- Pattern Matching - No need for manual pattern matching
Using Generated Types
In Graph Queries
The generated types integrate with the Graph walker pattern:
// Find all person vertices
let people = graph
.walk()
.vertices(Vertex::person())
.collect::<Vec<_ > > ();
// Find people with a specific name
let named_people = graph
.walk()
.vertices(Vertex::person_by_name("Bryn"))
.collect::<Vec<_ > > ();
// Find people in an age range
let adults = graph
.walk()
.vertices(Vertex::person_by_age_range(18..65))
.collect::<Vec<_ > > ();
// Find outgoing 'knows' edges from a vertex
let friends = graph
.walk()
.vertices_by_id([person_id])
.edges(EdgeIndex::knows().outgoing())
.collect::<Vec<_ > > ();
Combined with Filter Extensions
Filter extensions can be combined with other walker steps:
// Find adults named "Bryn" with a complex filter
let result = graph
.walk()
.vertices(Vertex::person())
.filter_by_person( | person, _ | {
person.age() > = 18 & & person.name().contains("Bryn")
})
.collect::<Vec<_ > > ();
// Find friendship edges created before 2000
let old_friendships = graph
.walk()
.vertices_by_id([person_id])
.edges(EdgeIndex::knows().outgoing())
.filter_by_knows( | knows, _ | knows.since() < 2000)
.collect::<Vec<_ > > ();
Type Constraints
When using these types, your Graph type needs to implement appropriate support:
fn example<G>(graph: &G)
where
G: Graph<Vertex=Vertex, Edge=Edge>,
G::SupportsVertexLabelIndex: Supported,
{
// Now you can use label-based indexes
graph.walk().vertices(Vertex::person())...
}
Index Attributes
You can use these attributes on struct fields:
#[index]
- Basic indexing for efficient lookups#[index(range)]
- Enables range queries#[index(full_text)]
- Enables text search (String fields only)
Best Practices
-
Use the appropriate index type for your query pattern:
- Use label index for type filtering
- Use property index for exact matches
- Use range index for numeric ranges
- Use full-text for searching text content
-
Apply indexes sparingly:
- Each index adds memory overhead
- Only index fields you'll query frequently, it's OK to filter once you are on the graph.
-
Consider the query planner:
- Using an index in a vertices() step is typically more efficient than filtering the entire graph
- Combining indices with other walker steps can create efficient traversal patterns
-
Use filter extensions for type-safety:
- Prefer
filter_by_person()
overfilter()
with manual pattern matching - Leverage the projection types for field access
- Use specific filter methods for clearer, more maintainable code
- Prefer
Existing Implementations
The Graph API ecosystem includes two ready-to-use graph implementations that cater to different use cases and performance requirements. This section provides an overview of the existing implementations, their features, and suitability for different applications.
Core Implementations
The Graph API project provides two main graph implementations:
-
SimpleGraph: A reference implementation built specifically for the Graph API. It fully supports all Graph API features, including most index types, making it ideal for testing Graph API functionality and serving as a blueprint for new implementations.
-
PetGraph: An adapter for the excellent and widely-used petgraph crate. This demonstrates Graph API compatibility with established Rust graph libraries and is a great choice for performance-sensitive applications or projects already using petgraph.
Choosing an Implementation
When deciding which graph implementation to use, consider the following factors:
Feature Support
Different implementations support different features:
Feature | SimpleGraph | PetGraph |
---|---|---|
Vertex label indexes | ✅ | ❌ |
Edge label indexes | ✅ | ❌ |
Vertex hash indexes | ✅ | ❌ |
Edge hash indexes | ❌ | ❌ |
Vertex range indexes | ✅ | ❌ |
Edge range indexes | ❌ | ❌ |
Vertex full-text indexes | ✅ | ❌ |
Edge adjacent label indexes* | ✅ | ❌ |
Graph clearing | ✅ | ✅ |
* Edge adjacent indexes are not fully supported in graph-api yet.
Performance Characteristics
Performance varies between implementations:
- SimpleGraph: Primarily designed for feature completeness and testing; not optimized for high performance.
- PetGraph: A mature, well-optimized, and widely-used graph library suitable for production use.
Memory Usage
- SimpleGraph: Memory usage is straightforward but not heavily optimized.
- PetGraph: Leverages petgraph's memory model, which is generally efficient.
Integration
- SimpleGraph: Best for demonstrating and testing Graph API features or as a starting point for custom implementations.
- PetGraph: Ideal for integrating Graph API capabilities into projects already using petgraph or when requiring petgraph's performance characteristics.
Creating Your Own Implementation
If the existing implementations don't meet your needs, you can create your own by implementing the Graph
trait. See
the Implementation Guide for detailed instructions on creating a custom graph
implementation.
Future Implementations
The Graph API is designed to support a variety of graph implementations.
Community contributions of new graph implementations are welcomed and encouraged!
SimpleGraph
SimpleGraph
is the reference implementation for the Graph API, designed primarily to showcase and test the full
capabilities of the API, including comprehensive indexing support for property graphs. While functional, it's not
optimized for high performance.
Installation
Add SimpleGraph to your project dependencies:
[dependencies]
graph-api-lib = "0.2.0"
graph-api-derive = "0.1.4" # For derive macros
graph-api-simplegraph = "0.2.1"
Overview
SimpleGraph
is a custom in-memory graph implementation built specifically to demonstrate and validate all Graph API
features, including every type of index. It serves as a clear example for developers implementing the Graph API traits
and is invaluable for testing Graph API consumers against a fully compliant backend. It handles property graph use cases
where elements have labels (enum variants) and indexed properties.
#![allow(unused)] fn main() { use graph_api_simplegraph::SimpleGraph; use graph_api_derive::{VertexExt, EdgeExt}; use graph_api_lib::Graph; // Define vertex and edge types #[derive(Debug, Clone, VertexExt)] pub enum Vertex { Person { #[index(hash)] name: String, #[index(range)] age: u64, }, Project { name: String }, } #[derive(Debug, Clone, EdgeExt)] pub enum Edge { Knows { since: i32 }, Created, } // Create a new graph let mut graph = SimpleGraph::new(); // Use the graph let alice = graph.add_vertex(Vertex::Person { name: "Alice".to_string(), age: 30 }); let project = graph.add_vertex(Vertex::Project { name: "Graph API".to_string() }); graph.add_edge(alice, project, Edge::Created); }
Architecture
SimpleGraph
uses a custom data structure designed specifically for property graphs:
-
Vertex Storage: Vertices are stored in collections grouped by label (enum variant), allowing for efficient label-based filtering.
-
Edge Storage: Edges are stored with both head and tail connections, enabling fast traversal in both directions.
-
Indexes: Multiple index types are implemented to support different query patterns:
- Label indexes for finding vertices/edges by label
- Hash indexes for exact property matching
- Range indexes for numeric and string range queries
- Full-text indexes for text search
-
Adjacency Lists: Each vertex maintains an adjacency list for fast edge traversal.
Features
SimpleGraph
supports all Graph API features:
- ✅ Vertex label indexes
- ✅ Edge label indexes
- ✅ Vertex hash indexes
- ❌ Edge hash indexes
- ✅ Vertex range indexes
- ❌ Edge range indexes
- ✅ Vertex full-text indexes
- ✅ Edge adjacent label indexes (this isn't fully supported in graph-api-lib yet)
- ✅ Graph clearing
Performance Characteristics
SimpleGraph
is primarily designed for feature completeness and ease of understanding, not for high performance:
- Correctness over Speed: Implementation prioritizes demonstrating API features correctly.
- Basic Optimizations: Includes fundamental optimizations like adjacency lists but lacks advanced performance tuning.
Benchmarks
Benchmarks exist primarily to validate the functionality of the API features rather than to showcase raw speed. Expect:
- Functional insertion and removal.
- Correct index lookups (hash, range, full-text) but potentially slower than highly optimized alternatives.
- Basic traversal efficiency.
Use Cases
SimpleGraph
is ideal for:
- Testing: Verifying code that uses the Graph API against a fully compliant implementation.
- Reference: Understanding how to implement the Graph API traits and features.
- Learning: Exploring the capabilities of the Graph API in a controlled environment.
- Prototyping: Quickly building graph-based applications where performance is not the primary concern initially.
Implementation Notes
SimpleGraph
is implemented using several key components:
Core Data Structures
LabelledVertices
: Stores vertices by label, with efficient access by IDLabelledEdges
: Stores edges by labelVertexStorage
: Maintains vertex data and adjacency informationVertexIndexStorage
: Handles different index types (hash, range, full-text)
Indexes
SimpleGraph
implements several index types:
HashIndex
: Maps property values to sets of vertex/edge IDsRangeIndex
: Uses ordered maps for range queriesFullTextIndex
: Implements inverted indexes for text search
Mutation Handling
To maintain index consistency, SimpleGraph
uses a mutation listener system:
#![allow(unused)] fn main() { // When a vertex property changes, the index is automatically updated graph.vertex_mut(vertex_id) .unwrap() .project_mut::<PersonMut<_ > > () .unwrap() .set_name("New Name"); }
Limitations
While SimpleGraph
is a robust implementation, it has some limitations:
- In-memory only: All graph data must fit in memory
- Single-threaded: No built-in support for concurrent access
- No persistence: No built-in support for saving/loading graphs
Source Code
The source code for SimpleGraph
is available in
the graph-api-simplegraph crate.
Example Usage
Here's a complete example demonstrating key features of SimpleGraph
:
#![allow(unused)] fn main() { use graph_api_simplegraph::SimpleGraph; use graph_api_derive::{VertexExt, EdgeExt}; use graph_api_lib::{Graph, VertexSearch, EdgeSearch, Direction}; // Define vertex and edge types #[derive(Debug, Clone, VertexExt)] pub enum Vertex { Person { #[index(hash)] name: String, #[index(range)] age: u64, #[index(full_text)] bio: String, }, Project { name: String }, } #[derive(Debug, Clone, EdgeExt)] pub enum Edge { Knows { since: i32 }, Created, } // Create a new graph let mut graph = SimpleGraph::new(); // Add vertices let alice = graph.add_vertex(Vertex::Person { name: "Alice".to_string(), age: 30, bio: "Graph enthusiast".to_string(), }); let bob = graph.add_vertex(Vertex::Person { name: "Bob".to_string(), age: 25, bio: "Software developer".to_string(), }); let project = graph.add_vertex(Vertex::Project { name: "Graph API".to_string() }); // Add edges graph.add_edge(alice, bob, Edge::Knows { since: 2020 }); graph.add_edge(alice, project, Edge::Created); graph.add_edge(bob, project, Edge::Created); // Query by label let people = graph.walk() .vertices(Vertex::person_all()) .collect::<Vec<_ > > (); assert_eq!(people.len(), 2); // Query by property (hash index) let alice_found = graph.walk() .vertices(Vertex::person_by_name("Alice")) .collect::<Vec<_ > > (); assert_eq!(alice_found.len(), 1); // Query by range (range index) let young_people = graph.walk() .vertices(Vertex::person_by_age(20..30)) .collect::<Vec<_ > > (); assert_eq!(young_people.len(), 1); // Text search (full-text index) let developers = graph.walk() .vertices(Vertex::person_by_bio("developer")) .collect::<Vec<_ > > (); assert_eq!(developers.len(), 1); // Traversal let alices_creations = graph.walk() .vertices(Vertex::person_by_name("Alice")) .edges(Edge::created().direction(Direction::Outgoing)) .head() .collect::<Vec<_ > > (); assert_eq!(alices_creations.len(), 1); }
Advanced Features
SimpleGraph
supports several advanced features:
Custom Indexing
You can define custom indexes on vertex and edge properties:
#![allow(unused)] fn main() { #[derive(Debug, Clone, VertexExt)] pub enum Vertex { Person { #[index(hash)] id: u64, #[index(range)] score: f64, #[index(full_text)] description: String, }, } }
Mutation with Index Updates
When properties are modified, indexes are automatically updated:
#![allow(unused)] fn main() { // Update an indexed property graph.vertex_mut(vertex_id) .unwrap() .project_mut::<PersonMut<_ > > () .unwrap() .set_age(31); // The age index is automatically updated, so this query now works let people = graph.walk() .vertices(Vertex::person_by_age(31..32)) .collect::<Vec<_ > > (); assert_eq!(people.len(), 1); }
PetGraph Adapter
The PetGraph
adapter provides Graph API compatibility for the excellent and
widely-used petgraph Rust graph library. This allows projects using petgraph
to
benefit from the Graph API's ergonomic query interface while retaining petgraph
's robust performance, extensive
algorithm suite, and maturity.
Installation
Add PetGraph support to your project dependencies:
[dependencies]
graph-api-lib = "0.2.0"
graph-api-derive = "0.1.4" # For derive macros
graph-api-petgraph = "0.1.5" # Graph API adapter for petgraph
petgraph = "0.6" # The underlying graph library
Overview
This adapter wraps petgraph::stable_graph::StableGraph
, enabling it to be used seamlessly with Graph API traits and
walkers. It acts as a bridge, translating Graph API calls into petgraph
operations. Note that it primarily exposes
petgraph
's core graph structure and does not add the advanced indexing features found in SimpleGraph
.
#![allow(unused)] fn main() { use petgraph::stable_graph::StableGraph; use graph_api_derive::{VertexExt, EdgeExt}; use graph_api_lib::Graph; // Define vertex and edge types #[derive(Debug, Clone, VertexExt)] pub enum Vertex { Person { name: String, age: u64, }, Project { name: String }, } #[derive(Debug, Clone, EdgeExt)] pub enum Edge { Knows { since: i32 }, Created, } // Create a new petgraph StableGraph (which implements Graph) let mut graph = StableGraph::new(); // Use the graph through the Graph API let alice = graph.add_vertex(Vertex::Person { name: "Alice".to_string(), age: 30 }); let project = graph.add_vertex(Vertex::Project { name: "Graph API".to_string() }); graph.add_edge(alice, project, Edge::Created); }
Architecture
The PetGraph adapter:
- Maps Graph API concepts to petgraph: Translates between Graph API's model and petgraph's model
- Provides wrapper types: Wraps petgraph references to implement Graph API reference traits
- Adapts traversal patterns: Adapts petgraph's traversal methods to match Graph API expectations
Features
PetGraph
supports a subset of Graph API features:
- ❌ Vertex label indexes
- ❌ Edge label indexes
- ❌ Vertex hash indexes
- ❌ Edge hash indexes
- ❌ Vertex range indexes
- ❌ Edge range indexes
- ❌ Vertex full-text indexes
- ❌ Edge adjacent label indexes
- ✅ Graph clearing
Petgraph Integration
The primary advantage of PetGraph
is access to petgraph's rich ecosystem:
Graph Algorithms
Petgraph provides many graph algorithms that can be used alongside Graph API:
#![allow(unused)] fn main() { use petgraph::algo::dijkstra; use petgraph::stable_graph::StableGraph; use graph_api_lib::Graph; // Create and populate graph using Graph API let mut graph = StableGraph::new(); let a = graph.add_vertex(Vertex::Person { name: "A".to_string(), age: 30 }); let b = graph.add_vertex(Vertex::Person { name: "B".to_string(), age: 25 }); let c = graph.add_vertex(Vertex::Person { name: "C".to_string(), age: 40 }); graph.add_edge(a, b, Edge::Knows { since: 2010 }); graph.add_edge(b, c, Edge::Knows { since: 2015 }); graph.add_edge(a, c, Edge::Knows { since: 2020 }); // Use petgraph algorithms directly on the same graph let path = dijkstra( & graph, a, Some(c), | _ | 1); }
Visualization
Petgraph supports graph visualization with Graphviz:
#![allow(unused)] fn main() { use petgraph::dot::{Dot, Config}; // Create and populate graph using Graph API let mut graph = StableGraph::new(); // ... add vertices and edges ... // Use petgraph's Dot export println!("{:?}", Dot::with_config(&graph, &[Config::EdgeNoLabel])); }
Performance Characteristics
PetGraph
inherits performance characteristics from petgraph:
- Efficient traversal: Petgraph's adjacency list representation enables fast traversal
- Fast mutations: Quick addition and removal of vertices and edges
- Algorithm optimizations: Petgraph's algorithms are optimized for performance
However, PetGraph
lacks indexing support, which means:
- All label-based searches require full scans
- Property-based lookups require iterating through all vertices/edges
Use Cases
The PetGraph
adapter is an excellent choice when:
- Performance is critical: Leverage
petgraph
's optimized data structures and algorithms. - Integrating with existing
petgraph
code: Use the Graph API interface on existingpetgraph
graphs. - Needing advanced graph algorithms: Access
petgraph
's rich library of algorithms directly. - Working with simpler graph structures: When the advanced indexing features of
SimpleGraph
are not required. - Building production systems: Benefit from
petgraph
's maturity and widespread use.
Implementation Notes
The PetGraph
adapter consists of several wrapper types:
VertexReferenceWrapper
: Wraps petgraph node referencesVertexReferenceWrapperMut
: Wraps mutable petgraph node referencesEdgeReferenceWrapper
: Wraps petgraph edge referencesEdgeReferenceWrapperMut
: Wraps mutable petgraph edge referencesVertexIter
: Adapts petgraph's vertex iterationEdgeIter
: Adapts petgraph's edge iteration
These wrappers implement the corresponding Graph API traits to provide compatibility.
Limitations
When using PetGraph
, be aware of these limitations:
- No indexing support: All lookups by label or property require full scans
- Limited filtering: Edge and vertex filtering must be done after retrieval
- Compatibility gaps: Some petgraph features may not map directly to Graph API concepts
Source Code
The source code for the PetGraph
adapter is available in
the graph-api-lib crate under the
petgraph
module.
Example Usage
Here's an example demonstrating PetGraph
usage:
#![allow(unused)] fn main() { use petgraph::stable_graph::StableGraph; use graph_api_derive::{VertexExt, EdgeExt}; use graph_api_lib::{Graph, VertexSearch, EdgeSearch, Direction}; // Define vertex and edge types #[derive(Debug, Clone, VertexExt)] pub enum Vertex { Person { name: String, age: u64, }, Project { name: String }, } #[derive(Debug, Clone, EdgeExt)] pub enum Edge { Knows { since: i32 }, Created, } // Create a new graph let mut graph = StableGraph::new(); // Add vertices let alice = graph.add_vertex(Vertex::Person { name: "Alice".to_string(), age: 30, }); let bob = graph.add_vertex(Vertex::Person { name: "Bob".to_string(), age: 25, }); let project = graph.add_vertex(Vertex::Project { name: "Graph API".to_string() }); // Add edges graph.add_edge(alice, bob, Edge::Knows { since: 2020 }); graph.add_edge(alice, project, Edge::Created); graph.add_edge(bob, project, Edge::Created); // Basic traversal (without indexing) let all_vertices = graph.walk() .vertices(VertexSearch::scan()) .collect::<Vec<_ > > (); assert_eq!(all_vertices.len(), 3); // Find all people (manual filtering since indexing isn't supported) let people = graph.walk() .vertices(VertexSearch::scan()) .filter_by_person( | _, _ | true) .collect::<Vec<_ > > (); assert_eq!(people.len(), 2); // Find projects connected to Alice let alices_projects = graph.walk() .vertices_by_id(vec![alice]) .edges(EdgeSearch::scan()) .filter_by_created( | _, _ | true) .head() .collect::<Vec<_ > > (); assert_eq!(alices_projects.len(), 1); }
Integrating with Petgraph Algorithms
One of the main advantages of using PetGraph
is access to petgraph's algorithms:
#![allow(unused)] fn main() { use petgraph::stable_graph::StableGraph; use petgraph::algo::{dijkstra, is_cyclic_directed}; use graph_api_lib::Graph; // Create and populate graph using Graph API let mut graph = StableGraph::new(); let a = graph.add_vertex(Vertex::Person { name: "A".to_string(), age: 30 }); let b = graph.add_vertex(Vertex::Person { name: "B".to_string(), age: 25 }); let c = graph.add_vertex(Vertex::Person { name: "C".to_string(), age: 40 }); graph.add_edge(a, b, Edge::Knows { since: 2010 }); graph.add_edge(b, c, Edge::Knows { since: 2015 }); // Use petgraph algorithms let distances = dijkstra( & graph, a, None, | _ | 1); assert_eq!(distances[&c], 2); // Distance from A to C is 2 let is_cyclic = is_cyclic_directed( & graph); assert_eq!(is_cyclic, false); // Add an edge to create a cycle graph.add_edge(c, a, Edge::Knows { since: 2020 }); let is_cyclic = is_cyclic_directed( & graph); assert_eq!(is_cyclic, true); }
When to Choose PetGraph
Consider using PetGraph
when:
- You're already using petgraph in your project
- You need access to petgraph's graph algorithms
- You want to use the Graph API's ergonomic interface
- You don't need indexing features
If you require index-based lookups or other advanced Graph API features, consider using SimpleGraph
instead.