Virtual Entity Resolution Design

Problem Statement

Catalogue groups, Category members, and Index results are "virtual" entities that exist in indexes but cannot be looked up directly by path. For example:

// This works:
var catalogue = yield engine.get_entity_async(EntityPath.parse("/spry/users/users/by_username"));
var group = yield catalogue.get_child_async("testuser");

// This fails with ENTITY_NOT_FOUND:
var group = yield engine.get_entity_or_null_async(EntityPath.parse("/spry/users/users/by_username/testuser"));

The diagnosis correctly identified that:

  • Catalogue config exists: catcfg:/spry/users/users/by_username
  • Catalogue group index contains members ✓
  • BUT: No entity:/spry/users/users/by_username/testuser file exists ✗

This is a design gap: virtual children of indexed entities should be resolvable by path, just like children of containers.

Design Goals

  1. Path Transparency: get_entity_async("/catalogue/group") should work the same as catalogue.get_child_async("group")
  2. Performance: No full path traversal needed - only check immediate parent for indexed entity types
  3. Consistency: Category, Catalogue, and Index all support direct path lookup for their virtual children
  4. Backward Compatibility: Existing code continues to work

Key Insight

Only Documents can be children of Category, Catalogue, and Index entities.

This means:

  • No recursive traversal is needed - we only need to check one level up
  • The parent path tells us everything we need to know about how to resolve the child
  • We can determine resolution strategy based solely on the parent's entity type
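The "one level up" rule can be illustrated on plain strings. The sketch below is a stand-alone approximation, not the engine's code: `split_parent_and_name` is a hypothetical helper standing in for `Core.EntityPath.parent` and `Core.EntityPath.name`, and it assumes absolute, `/`-separated paths.

```vala
// Hypothetical stand-in for Core.EntityPath.parent / Core.EntityPath.name.
// Assumes absolute, '/'-separated paths with no trailing slash.
bool split_parent_and_name (string path, out string parent, out string name) {
    parent = "";
    name = "";
    int idx = path.last_index_of_char ('/');
    if (idx < 0 || path == "/") {
        return false; // root (or malformed) path: there is no parent to inspect
    }
    if (idx == 0) {
        parent = "/";
    } else {
        parent = path.substring (0, idx);
    }
    name = path.substring (idx + 1);
    return name.length > 0;
}

void main () {
    string parent;
    string name;
    if (split_parent_and_name ("/spry/users/users/by_username/testuser", out parent, out name)) {
        // Only `parent` needs a type check; no ancestor above it is visited.
        // prints "parent=/spry/users/users/by_username child=testuser"
        print ("parent=%s child=%s\n", parent, name);
    }
}
```

Whatever the depth of the path, the resolver only has to load the type of the single `parent` value produced here.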

Entity Type Resolution Matrix

| Parent Type | Child Storage | Lookup Method |
|---|---|---|
| Container | Persistent (entity: + children:) | Direct entity lookup |
| Category | Virtual (member index catx:) | Check parent's member index |
| Catalogue | Virtual (group index catl:) | Check parent's group index, then resolve to Document |
| Index | Virtual (search pattern) | Execute search, find matching Document |

Architecture

flowchart TB
    subgraph "Path Resolution Flow"
        START["get_entity_async(path)"]
        CHECK_ENTITY["Check entity:prefix for path"]
        ENTITY_EXISTS{"Entity exists?"}
        
        GET_PARENT["Get parent path"]
        CHECK_PARENT_TYPE["Load parent entity type"]
        
        PARENT_CONTAINER{"Parent is Container?"}
        PARENT_CATEGORY{"Parent is Category?"}
        PARENT_CATALOGUE{"Parent is Catalogue?"}
        PARENT_INDEX{"Parent is Index?"}
        
        NOT_FOUND["Return ENTITY_NOT_FOUND"]
        RETURN_ENTITY["Return Entity"]
        
        CAT_CHECK["Check if child_name in member index"]
        CAT_EXISTS{"In index?"}
        CAT_RETURN["Return Document from index"]
        
        CATL_CHECK["Check if child_name is a group key"]
        CATL_IS_GROUP{"Is group key?"}
        CATL_GROUP["Return CatalogueGroup virtual entity"]
        CATL_CHECK_DOC["Check if child_name is a document in any group"]
        CATL_DOC_EXISTS{"Found in group?"}
        CATL_RETURN_DOC["Return Document"]
        
        IDX_SEARCH["Execute search with child_name as pattern"]
        IDX_RESULTS{"Has results?"}
        IDX_RETURN["Return IndexResult virtual entity"]
    end
    
    START --> CHECK_ENTITY
    CHECK_ENTITY --> ENTITY_EXISTS
    ENTITY_EXISTS -->|Yes| RETURN_ENTITY
    ENTITY_EXISTS -->|No| GET_PARENT
    GET_PARENT --> CHECK_PARENT_TYPE
    CHECK_PARENT_TYPE --> PARENT_CONTAINER
    
    PARENT_CONTAINER -->|Yes| NOT_FOUND
    PARENT_CONTAINER -->|No| PARENT_CATEGORY
    
    PARENT_CATEGORY -->|Yes| CAT_CHECK
    CAT_CHECK --> CAT_EXISTS
    CAT_EXISTS -->|Yes| CAT_RETURN
    CAT_EXISTS -->|No| NOT_FOUND
    
    PARENT_CATEGORY -->|No| PARENT_CATALOGUE
    PARENT_CATALOGUE -->|Yes| CATL_CHECK
    CATL_CHECK --> CATL_IS_GROUP
    CATL_IS_GROUP -->|Yes| CATL_GROUP
    CATL_IS_GROUP -->|No| CATL_CHECK_DOC
    CATL_CHECK_DOC --> CATL_DOC_EXISTS
    CATL_DOC_EXISTS -->|Yes| CATL_RETURN_DOC
    CATL_DOC_EXISTS -->|No| NOT_FOUND
    
    PARENT_CATALOGUE -->|No| PARENT_INDEX
    PARENT_INDEX -->|Yes| IDX_SEARCH
    IDX_SEARCH --> IDX_RESULTS
    IDX_RESULTS -->|Yes| IDX_RETURN
    IDX_RESULTS -->|No| NOT_FOUND
    PARENT_INDEX -->|No| NOT_FOUND

Implementation Plan

Phase 1: Add Virtual Entity Resolution to EmbeddedEngine

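Modify EmbeddedEngine.get_entity_async() and entity_exists_async() to fall back to virtual entity resolution when direct entity lookup fails. The resolvers reuse _create_entity_from_storage_async() to load the underlying Documents.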

1.1 New Internal Method: _try_resolve_virtual_child_async()

/**
 * Attempts to resolve a path as a virtual child of an indexed entity.
 *
 * This method is called when direct entity lookup fails. It checks if
 * the parent is a Category, Catalogue, or Index and attempts to resolve
 * the child name through the appropriate index.
 *
 * @param path The path to resolve
 * @return The entity, or null if not a virtual child
 */
private async Core.Entity? _try_resolve_virtual_child_async(Core.EntityPath path) throws Core.EngineError {
    // Root has no parent
    if (path.is_root) {
        return null;
    }
    
    var parent_path = path.parent;
    var child_name = path.name;
    
    // Check if parent exists
    bool parent_exists = yield _entity_exists_async_internal(parent_path);
    if (!parent_exists) {
        return null;
    }
    
    // Get parent entity type
    Core.EntityType? parent_type;
    try {
        parent_type = yield _get_entity_type_async_internal(parent_path);
    } catch (Storage.StorageError e) {
        return null;
    }
    
    if (parent_type == null) {
        return null;
    }
    
    // Resolve based on parent type
    switch ((!) parent_type) {
        case Core.EntityType.CATEGORY:
            return yield _resolve_category_child_async(parent_path, child_name);
            
        case Core.EntityType.CATALOGUE:
            return yield _resolve_catalogue_child_async(parent_path, child_name);
            
        case Core.EntityType.INDEX:
            return yield _resolve_index_child_async(parent_path, child_name);
            
        default:
            // Container children must be persisted entities
            return null;
    }
}

1.2 Category Child Resolution

/**
 * Resolves a child of a Category by checking the member index.
 */
private async Core.Entity? _resolve_category_child_async(
    Core.EntityPath parent_path, 
    string child_name
) throws Core.EngineError {
    // Check if child_name is in the category's member index
    // Members are stored as full document paths
    foreach (var doc_path in _category_store.get_members(parent_path)) {
        var doc_entity_path = Core.EntityPath.parse(doc_path);
        if (doc_entity_path.name == child_name) {
            // Found - return the actual document
            return yield _create_entity_from_storage_async(doc_entity_path);
        }
    }
    
    return null;
}

1.3 Catalogue Child Resolution

Catalogues have two types of virtual children:

  1. Group Keys - e.g., /catalogue/admin returns a CatalogueGroup
  2. Documents within groups - e.g., /catalogue/someuser returns the Document named someuser if any group contains it

/**
 * Resolves a child of a Catalogue.
 *
 * First checks if child_name is a group key (returns CatalogueGroup).
 * Then checks if it's a document name within any group (returns Document).
 */
private async Core.Entity? _resolve_catalogue_child_async(
    Core.EntityPath parent_path, 
    string child_name
) throws Core.EngineError {
    // First: Check if child_name is a group key
    foreach (var key in _catalogue_store.get_group_keys(parent_path)) {
        if (key == child_name) {
            // Return a CatalogueGroup virtual entity
            var catalogue = yield _create_entity_from_storage_async(parent_path) as Catalogue;
            if (catalogue != null) {
                return new CatalogueGroup(_engine, catalogue, child_name);
            }
        }
    }
    
    // Second: Check if child_name is a document within any group
    foreach (var key in _catalogue_store.get_group_keys(parent_path)) {
        foreach (var doc_path in _catalogue_store.get_group_members(parent_path, key)) {
            var doc_entity_path = Core.EntityPath.parse(doc_path);
            if (doc_entity_path.name == child_name) {
                // Found - return the actual document
                return yield _create_entity_from_storage_async(doc_entity_path);
            }
        }
    }
    
    return null;
}

1.4 Index Child Resolution

/**
 * Resolves a child of an Index by executing a search.
 *
 * The child_name is treated as a search pattern (e.g., "*term*").
 */
private async Core.Entity? _resolve_index_child_async(
    Core.EntityPath parent_path, 
    string child_name
) throws Core.EngineError {
    // Load the index entity
    var index = yield _create_entity_from_storage_async(parent_path) as Index;
    if (index == null) {
        return null;
    }
    
    // Execute search with child_name as pattern
    var result = ((!) index).search(child_name);
    return result;  // Returns IndexResult or null
}

1.5 Modify get_entity_async()

public async Core.Entity? get_entity_async(Core.EntityPath path) throws Core.EngineError {
    // First: Try direct entity lookup
    bool exists = yield _entity_exists_async_internal(path);
    if (exists) {
        return yield _create_entity_from_storage_async(path);
    }
    
    // Second: Try virtual child resolution
    var virtual_entity = yield _try_resolve_virtual_child_async(path);
    if (virtual_entity != null) {
        return (!) virtual_entity;
    }
    
    // Not found anywhere
    throw new Core.EngineError.ENTITY_NOT_FOUND(
        "Entity not found: %s".printf(path.to_string())
    );
}

1.6 Modify entity_exists_async()

public async bool entity_exists_async(Core.EntityPath path) throws Core.EngineError {
    // First: Check persistent storage
    bool exists = yield _entity_exists_async_internal(path);
    if (exists) {
        return true;
    }
    
    // Second: Check if it's a virtual child
    var virtual_entity = yield _try_resolve_virtual_child_async(path);
    return virtual_entity != null;
}

Phase 2: Sync Methods for Hook Context

The sync methods used by hooks also need to support virtual entity resolution.

2.1 Modify get_entity_or_null_sync()

internal Core.Entity? get_entity_or_null_sync(Core.EntityPath path) {
    // First: Try direct lookup
    var entity = _get_entity_or_null_sync_internal(path);
    if (entity != null) {
        return entity;
    }
    
    // Second: Try virtual child resolution (sync)
    return _try_resolve_virtual_child_sync(path);
}

2.2 Add _try_resolve_virtual_child_sync()

/**
 * Synchronous virtual child resolution for hook context.
 */
private Core.Entity? _try_resolve_virtual_child_sync(Core.EntityPath path) {
    if (path.is_root) {
        return null;
    }
    
    var parent_path = path.parent;
    var child_name = path.name;
    
    // Check if parent exists
    if (!_storage.entity_exists(parent_path)) {
        return null;
    }
    
    // Get parent type
    Core.EntityType? parent_type;
    try {
        parent_type = _storage.get_entity_type(parent_path);
    } catch (Storage.StorageError e) {
        return null;
    }
    
    if (parent_type == null) {
        return null;
    }
    
    switch ((!) parent_type) {
        case Core.EntityType.CATEGORY:
            return _resolve_category_child_sync(parent_path, child_name);
            
        case Core.EntityType.CATALOGUE:
            return _resolve_catalogue_child_sync(parent_path, child_name);
            
        case Core.EntityType.INDEX:
            return _resolve_index_child_sync(parent_path, child_name);
            
        default:
            return null;
    }
}
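
The three `_resolve_*_sync()` helpers that this dispatcher calls are not shown in this document. Assuming they mirror their async counterparts, the core of the Category case is a name match over member paths. The stand-alone sketch below shows that match only; `find_member_by_name` is a hypothetical name, and a plain array stands in for the result of `_category_store.get_members()`.

```vala
// Hypothetical core of a sync category lookup: returns the full path of the
// member whose last segment equals child_name, or null if there is none.
// member_paths stands in for _category_store.get_members (parent_path).
string? find_member_by_name (string[] member_paths, string child_name) {
    foreach (unowned string doc_path in member_paths) {
        string name = doc_path.substring (doc_path.last_index_of_char ('/') + 1);
        if (name == child_name) {
            return doc_path;
        }
    }
    return null;
}

void main () {
    string[] members = { "/spry/users/users/alice", "/spry/users/users/bob" };

    // A virtual path such as <category>/bob maps to the persisted document path.
    string? path = find_member_by_name (members, "bob");
    print ("%s\n", path ?? "(not a member)");
}
```

In the engine, the returned path would then presumably be loaded with the existing sync lookup (`_get_entity_or_null_sync_internal()`) rather than printed.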

Phase 3: Update CatalogueGroup for Direct Path Access

The CatalogueGroup class currently creates its path as parent.path.append_child(group_key). This is correct, but the engine also needs to create a CatalogueGroup directly from a path when the caller does not already hold the parent Catalogue instance.

3.1 Add Factory Method to CatalogueGroup

/**
 * Creates a CatalogueGroup from a path.
 *
 * This is used by EmbeddedEngine for virtual entity resolution.
 *
 * @param engine The engine
 * @param path The full path including group key
 * @return The CatalogueGroup, or null if the group doesn't exist
 */
public static CatalogueGroup? from_path(Core.Engine engine, Core.EntityPath path) {
    if (path.is_root) {
        return null;
    }
    
    var parent_path = path.parent;
    var group_key = path.name;
    
    // Virtual resolution needs the embedded engine's catalogue store
    var embedded = engine as Engine.EmbeddedEngine;
    if (embedded == null) {
        return null;
    }
    
    var catalogue_store = ((!) embedded).catalogue_store;
    
    // Check if group key exists
    foreach (var key in catalogue_store.get_group_keys(parent_path)) {
        if (key == group_key) {
            // Load the parent; the cast yields null if it is not a Catalogue
            var catalogue = ((!) embedded).get_entity_or_null_sync(parent_path) as Catalogue;
            if (catalogue != null) {
                return new CatalogueGroup(engine, catalogue, group_key);
            }
        }
    }
    
    return null;
}

Edge Cases and Considerations

1. Nested Paths in Catalogues

For path /catalogue/group/document:

  • First level (group) resolves to a CatalogueGroup
  • Second level (document) is resolved by CatalogueGroup.get_child_async()

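This already works when navigating from a resolved CatalogueGroup, because CatalogueGroup.get_child_async() looks up documents in the group. A direct get_entity_async() call on the full three-segment path is a different case: the parent (/catalogue/group) is itself virtual and has no persisted entity, so _try_resolve_virtual_child_async() as written appears to return null at its parent-exists check. Supporting that lookup would require resolving the virtual parent first.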

2. Index Search Patterns with Slashes

Index search patterns like *term* should work, but patterns containing / would be parsed as multiple path segments. This is a pre-existing limitation.
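
To make the limitation concrete, the sketch below splits a path string on `/` using GLib's `string.split()`. It only approximates what a path parser does; the actual behaviour of `EntityPath.parse()` may differ, and `/docs/search` is a hypothetical index path.

```vala
// Naive segment split, assumed to approximate a '/'-based path parser.
string[] split_segments (string raw) {
    return raw.substring (1).split ("/");
}

void main () {
    // A search pattern containing '/' appended under a hypothetical index path.
    string raw = "/docs/search/*foo/bar*";

    foreach (unowned string segment in split_segments (raw)) {
        print ("segment: %s\n", segment);
    }
    // The pattern "*foo/bar*" ends up as two segments ("*foo" and "bar*"),
    // so "bar*" would be treated as a child of ".../search/*foo".
}
```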

3. Performance Considerations

  • Category lookup: O(k) where k = number of members (must scan to match name)
  • Catalogue group lookup: O(g) where g = number of groups
  • Catalogue document lookup: O(g × m) where m = average group size
  • Index lookup: Depends on search pattern complexity

For large catalogues, consider adding a name→path index if this becomes a bottleneck.
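
One possible shape for such a name→path index is sketched below. `NameIndex` and its methods are hypothetical, the index lives in memory only, and the sketch assumes document names are unique within a catalogue. None of this is part of the existing stores.

```vala
// Hypothetical in-memory name -> path index for one catalogue.
// Assumes each document name occurs at most once per catalogue.
class NameIndex : Object {
    private HashTable<string, string> paths =
        new HashTable<string, string> (str_hash, str_equal);

    // Extracts the final path segment, which serves as the lookup key.
    private static string name_of (string doc_path) {
        return doc_path.substring (doc_path.last_index_of_char ('/') + 1);
    }

    // Records a member path under its final segment.
    public void add (string doc_path) {
        paths.insert (name_of (doc_path), doc_path);
    }

    // Drops a member, e.g. when it leaves every group.
    public void remove (string doc_path) {
        paths.remove (name_of (doc_path));
    }

    // Average O(1) lookup instead of scanning every group.
    public string? lookup (string name) {
        return paths.lookup (name);
    }
}

void main () {
    var index = new NameIndex ();
    index.add ("/spry/users/users/alice");
    index.add ("/spry/users/users/bob");

    string? hit = index.lookup ("bob");
    print ("%s\n", hit ?? "(not found)");
}
```

The trade-off is that the index has to be updated wherever group membership changes, which this sketch leaves to the caller.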

4. Caching

The current implementation doesn't cache virtual entity lookups. If performance becomes an issue, consider:

  1. Caching the parent entity during resolution
  2. Adding a name→path lookup index for catalogue groups

Testing Strategy

Unit Tests

  1. Category Resolution

    • Test direct path lookup for category member
    • Test non-existent member returns null
    • Test that persisted entities still work
  2. Catalogue Resolution

    • Test direct path lookup for catalogue group
    • Test direct path lookup for document within group
    • Test non-existent group/document returns null
  3. Index Resolution

    • Test direct path lookup with search pattern
    • Test pattern with no matches returns null
  4. Mixed Paths

    • Test paths that mix containers and indexed entities
    • Test deeply nested paths

Integration Tests

  1. Existing Tests Compatibility

    • Ensure all existing tests pass
    • Verify no regression in navigation-based access
  2. Performance Tests

    • Benchmark virtual entity resolution
    • Compare with navigation-based access

Migration

No migration needed - this is purely an enhancement to the resolution logic. Existing data structures remain unchanged.

Summary

| Component | Change |
|---|---|
| EmbeddedEngine.get_entity_async() | Add virtual child resolution fallback |
| EmbeddedEngine.entity_exists_async() | Add virtual child existence check |
| EmbeddedEngine.get_entity_or_null_sync() | Add sync virtual child resolution |
| New: _try_resolve_virtual_child_async() | Async virtual entity resolution dispatcher |
| New: _try_resolve_virtual_child_sync() | Sync virtual entity resolution dispatcher |
| New: _resolve_category_child_async() | Category member lookup |
| New: _resolve_catalogue_child_async() | Catalogue group/document lookup |
| New: _resolve_index_child_async() | Index search execution |
| CatalogueGroup | Add from_path() factory method |