Exploring Without Indexes

Before we dive into indexes, let's understand why they're necessary by exploring graph traversal without them.

The Challenge of Finding Starting Points

In a property graph, one of the biggest challenges is finding the right starting points for your traversal. Without indexes, your only option is to scan the entire graph, examining each vertex to find matches.

In this diagram:

  • The graph contains several vertices (A-E), some of which have the property name: "Alice".
  • The query at the top left indicates the desired starting condition: find vertices where name = "Alice".
  • The blue arrows illustrate the process without an index: To satisfy the query, the traversal mechanism must conceptually examine every single vertex (A, B, C, D, E) to check if its name property matches "Alice". This represents a full graph scan.
  • The orange highlighting on vertices A and C shows the result of this scan – these are the only vertices found that satisfy the query condition.

This full scan becomes computationally expensive and slow, especially in large graphs, highlighting the need for indexes to quickly locate starting vertices.

Example: Graph Without Indexes

Let's consider a simple social network model without any indexes:

// Define vertex types for a social media application
#[derive(Debug, Clone, VertexExt)]
pub enum Vertex {
    // Person vertex with various properties
    Person {
        name: String, // Not indexed

        #[index(hash)] // Hash index for exact lookups
        username: String,

        #[index(full_text)] // Full-text index for text search
        biography: String,

        #[index(range)] // Range index for range queries
        age: u8,
    },

    // Project vertex with minimal properties
    Project {
        name: String,
    },

    // Comment vertex
    Comment {
        text: String,
        date: String,
    },
}

// Define edge types that connect vertices
#[derive(Debug, Clone, EdgeExt)]
pub enum Edge {
    // Simple edges without properties
    Created,
    Follows,

    // Edges with properties
    Liked { timestamp: String },
    Commented { timestamp: String },
}

Scanning the Entire Graph

Without indexes, the only way to find vertices matching specific criteria is to scan the entire graph.

Finding by Name (Without Index)

When a field isn't indexed (like name in the Person vertex), we have to scan all vertices to find matches:

    // INEFFICIENT: Find all people named "Bryn" by scanning
    // Since name is not indexed, we must scan all vertices
    let bryn_vertices = graph
        .walk()
        .vertices(VertexSearch::scan()) // Must scan ALL vertices
        .filter_by_person(|person, _| person.name() == "Bryn")
        .collect::<Vec<_>>();

    println!("Found {} vertices for Bryn", bryn_vertices.len());

Finding Projects (Without Index)

Similarly, to find a project by name, we need to scan the entire graph:

    // INEFFICIENT: Find all projects with a specific name by scanning
    // Since Project::name is not indexed, we must scan all vertices
    let graphapi_projects = graph
        .walk()
        .vertices(VertexSearch::scan()) // Must scan ALL vertices
        .filter_by_project(|project, _| project.name() == "GraphApi")
        .collect::<Vec<_>>();

    println!("Found {} GraphApi projects", graphapi_projects.len());

Performance Comparison

Let's compare the performance of a full scan versus using an index:

  1. Inefficient approach - scanning all vertices:
    // COMPARISON: Using an index vs. not using an index
    // 1. Inefficient: Find people with a specific username using a scan
    let julia_by_scan = graph
        .walk()
        .vertices(VertexSearch::scan())
        .filter_by_person(|person, _| person.username() == "julia456")
        .collect::<Vec<_>>();

    println!("Found {} with username julia456", julia_by_scan.len());
  1. Efficient approach - using the index:
    // 2. Efficient: Find the same person using the username index
    let julia_by_index = graph
        .walk()
        .vertices(Vertex::person_by_username("julia456"))
        .collect::<Vec<_>>();

    println!("Found {} with username julia456", julia_by_index.len());

Performance Implications

For small graphs, scanning might be acceptable, but as graphs grow, scanning becomes increasingly inefficient:

  1. Linear Time Complexity: Scanning requires examining every vertex, making it O(n) where n is the number of vertices
  2. Resource Intensive: Needs to load and process every vertex, even those not matching criteria
  3. Poor Scalability: Performance degrades linearly as the graph grows

When to Use Full Scans

Despite their inefficiency, full scans do have legitimate uses:

  • During initial development: When you're still figuring out your data model
  • For admin or maintenance tasks: When completeness is more important than speed
  • For small graphs: When the overhead of maintaining indexes exceeds their benefit
  • For exploratory analysis: When you need to examine all data without preconceptions

Moving Beyond Scans

In the following sections, we'll explore how different types of indexes can dramatically improve traversal performance by allowing you to:

  1. Find vertices by exact property values using standard indexes
  2. Search for vertices within ranges using range indexes
  3. Find vertices containing specific text using full-text indexes

Indexes provide the entry points for efficient graph traversal, turning potentially expensive operations into near-constant time lookups.