# Key Schema

This document describes the key schema used by Implexus for storing data in the underlying key-value store.

Overview

Implexus uses a prefix-based key schema to organize different types of data within the key-value store. All keys are strings, and values are serialized using the ElementWriter / ElementReader serialization system.

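The storage layer consists of two main components:

  • BasicStorage: manages entity metadata, document properties, container children, and category and catalogue configuration
  • IndexManager: manages the type index, category members, catalogue groups, and text search indices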

Key Prefixes

| Prefix | Description | Value Type | Managed By |
|---|---|---|---|
| entity: | Entity metadata | Serialized type and label | BasicStorage |
| props: | Document properties | Serialized Properties | BasicStorage |
| children: | Container children | Serialized string array | BasicStorage |
| config: | Category configuration | Serialized type_label + expression | BasicStorage |
| catcfg: | Catalogue configuration | Serialized type_label + expression | BasicStorage |
| typeidx: | Type index | Serialized string array | IndexManager |
| cat: | Category members | Serialized string array | IndexManager |
| catl: | Catalogue groups/keys | Serialized string array | IndexManager |
| idx: | Text search indices | Serialized string array | IndexManager |

Key Patterns

Entity Storage

Entity metadata is stored with the entity: prefix:

entity:<entity_path> → [type_code: int64, type_label: string]

Examples:

entity:/users/john         → [1, "User"]
entity:/products/widget    → [1, "Product"]
entity:/categories/active  → [4, null]

The type code corresponds to EntityType enum values:

  • 0 = UNKNOWN
  • 1 = DOCUMENT
  • 2 = CATEGORY
  • 3 = CATALOGUE
  • 4 = CONTAINER
  • 5 = INDEX
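
The codes above could be modelled as a Vala enum. This is a minimal sketch, not the Implexus source: the enum name `EntityType` comes from this document, but the exact declaration and the `entity_key` helper are assumptions made for illustration.

```vala
// Hypothetical declaration mirroring the codes listed above.
enum EntityType {
    UNKNOWN = 0,
    DOCUMENT = 1,
    CATEGORY = 2,
    CATALOGUE = 3,
    CONTAINER = 4,
    INDEX = 5
}

// Hypothetical helper: builds the metadata key for an entity path.
string entity_key (string entity_path) {
    return "entity:" + entity_path;
}

void main () {
    // The type code is stored as an int64; cast it back to the enum after reading.
    int64 stored_code = 1;
    var type = (EntityType) stored_code;
    assert (type == EntityType.DOCUMENT);
    assert (entity_key ("/users/john") == "entity:/users/john");
}
```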

Document Properties

Document properties are stored with the props: prefix:

props:<document_path> → Serialized Properties

Example:

props:/users/john → {"name": "John Doe", "email": "john@example.com", "age": 30}

Properties are serialized as a dictionary using ElementWriter.write_dictionary().

Container Children

Container child names are stored with the children: prefix:

children:<container_path> → [child_name1, child_name2, ...]

Example:

children:/users → ["john", "jane", "admin"]
children:/products → ["widget", "gadget", "tool"]

Category Configuration

Category configuration is stored with the config: prefix:

config:<category_path> → [type_label: string, expression: string]

Example:

config:/categories/active-users → ["User", "status == 'active'"]
config:/categories/expensive-products → ["Product", "price > 100"]

Catalogue Configuration

Catalogue configuration is stored with the catcfg: prefix:

catcfg:<catalogue_path> → [type_label: string, expression: string]

Example:

catcfg:/catalogues/users-by-role → ["User", "role"]
catcfg:/catalogues/products-by-category → ["Product", "category"]

Type Index

The global type index maps type labels to document paths:

typeidx:<type_label> → [doc_path1, doc_path2, ...]

Examples:

typeidx:User → ["/users/john", "/users/jane", "/users/admin"]
typeidx:Product → ["/products/widget", "/products/gadget"]
typeidx:Order → ["/orders/001", "/orders/002"]

This enables fast lookup of all documents of a specific type.

Category Members

Category member sets are stored with the cat: prefix and :members suffix:

cat:<category_path>:members → [doc_path1, doc_path2, ...]

Examples:

cat:/categories/active-users:members → ["/users/john", "/users/jane"]
cat:/categories/expensive-products:members → ["/products/widget", "/products/gadget"]
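
Because this pattern has both a prefix and a suffix, building and parsing the key is slightly more involved than for the other prefixes. The following is a sketch under assumptions: both helper names are hypothetical and are not part of the IndexManager API described here.

```vala
const string CAT_PREFIX = "cat:";
const string MEMBERS_SUFFIX = ":members";

// Hypothetical helper: builds the member-set key for a category path.
string category_members_key (string category_path) {
    return CAT_PREFIX + category_path + MEMBERS_SUFFIX;
}

// Hypothetical helper: recovers the category path from a member-set key.
// Returns null when the key does not follow the cat:<path>:members pattern.
string? category_path_from_key (string key) {
    if (!key.has_prefix (CAT_PREFIX) || !key.has_suffix (MEMBERS_SUFFIX)) {
        return null;
    }
    int start = CAT_PREFIX.length;
    int len = key.length - start - MEMBERS_SUFFIX.length;
    if (len < 0) {
        // Prefix and suffix overlap (for example "cat:members"), so there is no path.
        return null;
    }
    return key.substring (start, len);
}

void main () {
    string key = category_members_key ("/categories/active-users");
    assert (key == "cat:/categories/active-users:members");
    assert (category_path_from_key (key) == "/categories/active-users");
    // Keys under the catl: prefix are not category member keys.
    assert (category_path_from_key ("catl:/catalogues/users-by-role:keys") == null);
}
```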

Catalogue Groups

Catalogue groups use the catl: prefix with two key patterns:

Group Members

catl:<catalogue_path>:group:<key_value> → [doc_path1, doc_path2, ...]

Examples:

catl:/catalogues/users-by-role:group:admin → ["/users/admin"]
catl:/catalogues/users-by-role:group:user → ["/users/john", "/users/jane"]
catl:/catalogues/products-by-category:group:electronics → ["/products/widget", "/products/gadget"]

Group Keys List

catl:<catalogue_path>:keys → [key1, key2, ...]

Examples:

catl:/catalogues/users-by-role:keys → ["admin", "user", "guest"]
catl:/catalogues/products-by-category:keys → ["electronics", "clothing", "food"]
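
The two catalogue patterns work together: the keys list names every group, and each name leads to one group key. The sketch below shows that relationship with plain string helpers. All three function names are hypothetical and only illustrate the key layout.

```vala
// Hypothetical helper: key holding the members of one catalogue group.
string catalogue_group_key (string catalogue_path, string key_value) {
    return "catl:" + catalogue_path + ":group:" + key_value;
}

// Hypothetical helper: key holding the list of group key values.
string catalogue_keys_key (string catalogue_path) {
    return "catl:" + catalogue_path + ":keys";
}

// Expands a keys list into the group keys it refers to.
string[] all_group_keys (string catalogue_path, string[] key_values) {
    string[] result = {};
    foreach (string key_value in key_values) {
        result += catalogue_group_key (catalogue_path, key_value);
    }
    return result;
}

void main () {
    string path = "/catalogues/users-by-role";
    string[] roles = { "admin", "user", "guest" };
    string[] group_keys = all_group_keys (path, roles);
    assert (catalogue_keys_key (path) == "catl:/catalogues/users-by-role:keys");
    assert (group_keys.length == 3);
    assert (group_keys[0] == "catl:/catalogues/users-by-role:group:admin");
}
```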

N-gram Index

Text search indices use the idx: prefix with n-gram type specifiers:

Trigram Index

idx:<index_path>:tri:<trigram> → [doc_path1, doc_path2, ...]

Examples:

idx:/indices/document-search:tri:the → ["/docs/doc1", "/docs/doc2", "/docs/doc3"]
idx:/indices/document-search:tri:ing → ["/docs/doc1", "/docs/doc4"]

Bigram Reverse Index

Used for finding trigrams that contain a specific bigram:

idx:<index_path>:bi:<bigram> → [trigram1, trigram2, ...]

Example:

idx:/indices/document-search:bi:th → ["the", "tha", "thi"]

Unigram Reverse Index

Used for finding bigrams that start with a specific character:

idx:<index_path>:uni:<unigram> → [bigram1, bigram2, ...]

Example:

idx:/indices/document-search:uni:t → ["th", "tr", "to"]

Document Content Cache

Stores the indexed content for a document (used for reindexing):

idx:<index_path>:doc:<doc_path> → <indexed_content>

Example:

idx:/indices/document-search:doc:/docs/doc1 → "The quick brown fox jumps over the lazy dog"
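
To make the trigram keys concrete, the sketch below extracts the distinct n-grams of a string and builds the matching `:tri:` key. It is an illustration only: `extract_ngrams` and `trigram_key` are hypothetical names, and this document does not say whether Implexus normalises case or whitespace before indexing, so the text is used as-is.

```vala
// Returns the distinct n-grams of `text` in order of first appearance.
// Offsets are counted in Unicode characters, not bytes.
string[] extract_ngrams (string text, int n) {
    var seen = new GenericSet<string> (str_hash, str_equal);
    string[] result = {};
    int char_count = text.char_count ();
    for (int i = 0; i + n <= char_count; i++) {
        int start = text.index_of_nth_char (i);
        int end = text.index_of_nth_char (i + n);
        string gram = text.substring (start, end - start);
        if (!seen.contains (gram)) {
            seen.add (gram);
            result += gram;
        }
    }
    return result;
}

// Hypothetical helper: key for one trigram of a text index.
string trigram_key (string index_path, string trigram) {
    return "idx:" + index_path + ":tri:" + trigram;
}

void main () {
    // "the then" yields "the", "he ", "e t", " th", "hen"; the second "the" is a duplicate.
    string[] grams = extract_ngrams ("the then", 3);
    assert (grams.length == 5);
    assert (grams[0] == "the");
    assert (trigram_key ("/indices/document-search", grams[0])
            == "idx:/indices/document-search:tri:the");
}
```

Calling `index_of_nth_char` inside the loop makes this quadratic in the text length, which is acceptable for a sketch but not for indexing large documents.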

Value Serialization

All values are serialized using the ElementSerializer system with type tags for proper deserialization.

Type Codes

| Code | Type | Description |
|---|---|---|
| 0x00 | NULL | Null value |
| 0x01 | BOOL | Boolean (1 byte) |
| 0x02 | INT64 | 64-bit signed integer (big-endian) |
| 0x03 | UINT64 | 64-bit unsigned integer (big-endian) |
| 0x04 | DOUBLE | 64-bit IEEE 754 floating point |
| 0x05 | STRING | Length-prefixed UTF-8 string |
| 0x06 | BINARY | Length-prefixed binary data |
| 0x07 | ARRAY | Count-prefixed array of elements |
| 0x08 | DICTIONARY | Count-prefixed key-value pairs |

String Encoding

Strings are serialized as:

[length: int64][utf-8 bytes]

Set/Array Encoding

Sets (like member lists) are serialized as arrays:

[ARRAY_CODE: 0x07][count: int64][element1][element2]...[elementN]

Each element is a string element with its own type code and length prefix.
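
The layout above can be sketched byte by byte with GLib's `ByteArray`. This is not the ElementWriter implementation. It assumes that the length and count prefixes use the same big-endian order as the INT64 type and that the string length is measured in bytes. All function names are hypothetical.

```vala
const uint8 TYPE_STRING = 0x05;
const uint8 TYPE_ARRAY = 0x07;

// Appends a 64-bit integer in big-endian byte order (assumed for prefixes).
void append_int64_be (ByteArray buf, int64 value) {
    var bytes = new uint8[8];
    for (int i = 0; i < 8; i++) {
        bytes[i] = (uint8) ((value >> (8 * (7 - i))) & 0xFF);
    }
    buf.append (bytes);
}

// Appends [STRING code][length][utf-8 bytes] for one string element.
void append_string_element (ByteArray buf, string text) {
    uint8[] tag = { TYPE_STRING };
    buf.append (tag);
    append_int64_be (buf, text.length);   // byte length, not character count
    buf.append (text.data);
}

// Encodes a member list as [ARRAY code][count][element1]...[elementN].
ByteArray encode_string_array (string[] items) {
    var buf = new ByteArray ();
    uint8[] tag = { TYPE_ARRAY };
    buf.append (tag);
    append_int64_be (buf, items.length);
    foreach (string item in items) {
        append_string_element (buf, item);
    }
    return buf;
}

void main () {
    string[] members = { "john", "jane" };
    var buf = encode_string_array (members);
    // 1 + 8 bytes of array header, then 2 × (1 + 8 + 4) bytes of elements.
    assert (buf.len == 35);
    assert (buf.data[0] == TYPE_ARRAY);
    assert (buf.data[8] == 2);            // low byte of the count
    assert (buf.data[9] == TYPE_STRING);
    assert (buf.data[17] == 4);           // low byte of the first length
}
```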

Properties Encoding

Properties dictionaries are serialized as:

[DICTIONARY_CODE: 0x08][count: int64][key1][value1][key2][value2]...

Index Management

Indices are kept in sync with document changes through the HookManager event system.

Event Flow

  1. Document is created, updated, or deleted via EmbeddedEngine
  2. HookManager fires the appropriate event
  3. Indexed entities (Category, Catalogue, Index) listen for events
  4. IndexManager updates the relevant index entries
  5. All operations within a transaction are committed atomically (when supported)

Transaction Safety

When using a backend with native transaction support (such as LMDB), the document write and its index updates can be committed together:

dbm.begin_transaction();
try {
    // Update document properties
    storage.store_properties(path, properties);
    
    // Update type index
    index_manager.add_to_type_index(type_label, path);
    
    // Update category indices
    index_manager.add_to_category(category_path, path);
    
    // All changes commit atomically
    dbm.commit_transaction();
} catch (StorageError e) {
    dbm.rollback_transaction();
}

Index Rebuilding

If indices become corrupted or need rebuilding:

  1. Iterate through all keys with the relevant prefix
  2. For each document, re-evaluate index expressions
  3. Update index entries accordingly

// Example: Rebuild type index for "User"
var user_paths = new Vector<string>();
foreach (var key in dbm.keys) {
    if (key.has_prefix("props:")) {
        var path = key.substring(6);
        var type_label = storage.get_entity_type_label(path);
        if (type_label == "User") {
            user_paths.add(path);
        }
    }
}

// Clear and rebuild
dbm.delete("typeidx:User");
foreach (var path in user_paths) {
    index_manager.add_to_type_index("User", path);
}

Key Enumeration Patterns

Finding All Entities

foreach (var key in dbm.keys) {
    if (key.has_prefix("entity:")) {
        var path = key.substring(7);
        // Process entity
    }
}
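
The hard-coded offset in `key.substring(7)` only works while it matches the prefix length. A small helper can derive the offset from the prefix itself. The sketch below works on a plain string array rather than on `dbm.keys`, and `paths_with_prefix` is a hypothetical name, not an Implexus function.

```vala
// Collects the path portion of every key that starts with `prefix`.
string[] paths_with_prefix (string[] keys, string prefix) {
    string[] paths = {};
    foreach (string key in keys) {
        if (key.has_prefix (prefix)) {
            // Skip exactly the prefix, whatever its length.
            paths += key.substring (prefix.length);
        }
    }
    return paths;
}

void main () {
    string[] keys = {
        "entity:/users/john",
        "props:/users/john",
        "entity:/products/widget",
        "typeidx:User"
    };
    string[] entities = paths_with_prefix (keys, "entity:");
    assert (entities.length == 2);
    assert (entities[0] == "/users/john");
    assert (entities[1] == "/products/widget");
}
```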

Finding All Documents of a Type

var user_paths = index_manager.get_paths_for_type("User");
foreach (var path in user_paths) {
    var props = storage.load_properties(new EntityPath(path));
    // Process document
}

Finding Category Members

var members = index_manager.get_category_members("/categories/active-users");
foreach (var member_path in members) {
    // Process member
}

Finding Catalogue Groups

var keys = index_manager.get_catalogue_keys("/catalogues/users-by-role");
foreach (var key in keys) {
    var members = index_manager.get_catalogue_group_members(
        "/catalogues/users-by-role", key
    );
    // Process group
}

Storage Efficiency Considerations

Key Length

Keys are stored verbatim, so shorter paths reduce storage overhead:

  • Prefer /users/john over /application/data/users/john
  • Type labels should be concise but descriptive

Set Storage

Member sets are stored as arrays of strings. For very large sets:

  • Consider splitting into multiple categories with filter expressions
  • Use catalogues for grouping to enable efficient subset queries

N-gram Index Size

Text indices can grow large for text-heavy documents:

  • Trigram index size grows as O(unique_trigrams × matching_documents)
  • Consider indexing only specific fields, not entire documents
  • Use the document content cache sparingly, since it stores a copy of each document's indexed content

See Also