Exploring Without Indexes
Before we dive into indexes, let's understand why they're necessary by exploring graph traversal without them.
The Challenge of Finding Starting Points
In a property graph, one of the biggest challenges is finding the right starting points for your traversal. Without indexes, your only option is to scan the entire graph, examining each vertex to find matches.
In this diagram:
- The graph contains several vertices (A-E), some of which have the property
name: "Alice"
. - The query at the top left indicates the desired starting condition: find vertices where
name = "Alice"
. - The blue arrows illustrate the process without an index: To satisfy the query, the traversal mechanism must
conceptually examine every single vertex (A, B, C, D, E) to check if its
name
property matches "Alice". This represents a full graph scan. - The orange highlighting on vertices
A
andC
shows the result of this scan – these are the only vertices found that satisfy the query condition.
This full scan becomes computationally expensive and slow, especially in large graphs, highlighting the need for indexes to quickly locate starting vertices.
Example: Graph Without Indexes
Let's consider a simple social network model without any indexes:
// Define vertex types for a social media application
#[derive(Debug, Clone, VertexExt)]
pub enum Vertex {
// Person vertex with various properties
Person {
name: String, // Not indexed
#[index(hash)] // Hash index for exact lookups
username: String,
#[index(full_text)] // Full-text index for text search
biography: String,
#[index(range)] // Range index for range queries
age: u8,
},
// Project vertex with minimal properties
Project {
name: String,
},
// Comment vertex
Comment {
text: String,
date: String,
},
}
// Define edge types that connect vertices
#[derive(Debug, Clone, EdgeExt)]
pub enum Edge {
// Simple edges without properties
Created,
Follows,
// Edges with properties
Liked { timestamp: String },
Commented { timestamp: String },
}
Scanning the Entire Graph
Without indexes, the only way to find vertices matching specific criteria is to scan the entire graph.
Finding by Name (Without Index)
When a field isn't indexed (like name
in the Person
vertex), we have to scan all vertices to find matches:
// INEFFICIENT: Find all people named "Bryn" by scanning
// Since name is not indexed, we must scan all vertices
let bryn_vertices = graph
.walk()
.vertices(VertexSearch::scan()) // Must scan ALL vertices
.filter_by_person(|person, _| person.name() == "Bryn")
.collect::<Vec<_>>();
println!("Found {} vertices for Bryn", bryn_vertices.len());
Finding Projects (Without Index)
Similarly, to find a project by name, we need to scan the entire graph:
// INEFFICIENT: Find all projects with a specific name by scanning
// Since Project::name is not indexed, we must scan all vertices
let graphapi_projects = graph
.walk()
.vertices(VertexSearch::scan()) // Must scan ALL vertices
.filter_by_project(|project, _| project.name() == "GraphApi")
.collect::<Vec<_>>();
println!("Found {} GraphApi projects", graphapi_projects.len());
Performance Comparison
Let's compare the performance of a full scan versus using an index:
- Inefficient approach - scanning all vertices:
// COMPARISON: Using an index vs. not using an index
// 1. Inefficient: Find people with a specific username using a scan
let julia_by_scan = graph
.walk()
.vertices(VertexSearch::scan())
.filter_by_person(|person, _| person.username() == "julia456")
.collect::<Vec<_>>();
println!("Found {} with username julia456", julia_by_scan.len());
- Efficient approach - using the index:
// 2. Efficient: Find the same person using the username index
let julia_by_index = graph
.walk()
.vertices(Vertex::person_by_username("julia456"))
.collect::<Vec<_>>();
println!("Found {} with username julia456", julia_by_index.len());
Performance Implications
For small graphs, scanning might be acceptable, but as graphs grow, scanning becomes increasingly inefficient:
- Linear Time Complexity: Scanning requires examining every vertex, making it O(n) where n is the number of vertices
- Resource Intensive: Needs to load and process every vertex, even those not matching criteria
- Poor Scalability: Performance degrades linearly as the graph grows
When to Use Full Scans
Despite their inefficiency, full scans do have legitimate uses:
- During initial development: When you're still figuring out your data model
- For admin or maintenance tasks: When completeness is more important than speed
- For small graphs: When the overhead of maintaining indexes exceeds their benefit
- For exploratory analysis: When you need to examine all data without preconceptions
Moving Beyond Scans
In the following sections, we'll explore how different types of indexes can dramatically improve traversal performance by allowing you to:
- Find vertices by exact property values using standard indexes
- Search for vertices within ranges using range indexes
- Find vertices containing specific text using full-text indexes
Indexes provide the entry points for efficient graph traversal, turning potentially expensive operations into near-constant time lookups.