# Key Schema

This document describes the key schema used by Implexus for storing data in the underlying key-value store.

## Overview

Implexus uses a prefix-based key schema to organize different types of data within the key-value store. All keys are strings, and values are serialized using the [`ElementWriter`](src/Storage/ElementSerializer.vala:53) / [`ElementReader`](src/Storage/ElementSerializer.vala:271) serialization system.

The storage layer consists of two main components:

- [`BasicStorage`](src/Storage/Storage.vala:186) - Entity metadata and properties
- [`IndexManager`](src/Storage/IndexManager.vala:49) - Index data for fast lookups

## Key Prefixes

| Prefix | Description | Value Type | Managed By |
|--------|-------------|------------|------------|
| `entity:` | Entity metadata | Serialized type and label | [`BasicStorage`](src/Storage/Storage.vala:186) |
| `props:` | Document properties | Serialized Properties | [`BasicStorage`](src/Storage/Storage.vala:186) |
| `children:` | Container children | Serialized string array | [`BasicStorage`](src/Storage/Storage.vala:186) |
| `config:` | Category configuration | Serialized type_label + expression | [`BasicStorage`](src/Storage/Storage.vala:186) |
| `catcfg:` | Catalogue configuration | Serialized type_label + expression | [`BasicStorage`](src/Storage/Storage.vala:186) |
| `typeidx:` | Type index | Serialized string array | [`IndexManager`](src/Storage/IndexManager.vala:49) |
| `cat:` | Category members | Serialized string array | [`IndexManager`](src/Storage/IndexManager.vala:49) |
| `catl:` | Catalogue groups/keys | Serialized string array | [`IndexManager`](src/Storage/IndexManager.vala:49) |
| `idx:` | Text search indices | Serialized string array | [`IndexManager`](src/Storage/IndexManager.vala:49) |

## Key Patterns

### Entity Storage

Entity metadata is stored with the `entity:` prefix:

```
entity:<path> → [type_code: int64, type_label: string]
```

**Examples:**

```
entity:/users/john → [1, "User"]
entity:/products/widget → [1, "Product"]
entity:/categories/active → [4, null]
```

The type code corresponds to [`EntityType`](src/Core/EntityType.vala) enum values:

- 0 = UNKNOWN
- 1 = DOCUMENT
- 2 = CATEGORY
- 3 = CATALOGUE
- 4 = CONTAINER
- 5 = INDEX

### Document Properties

Document properties are stored with the `props:` prefix:

```
props:<path> → Serialized Properties
```

**Example:**

```
props:/users/john → {"name": "John Doe", "email": "john@example.com", "age": 30}
```

Properties are serialized as a dictionary using [`ElementWriter.write_dictionary()`](src/Storage/ElementSerializer.vala:237).

### Container Children

Container child names are stored with the `children:` prefix:

```
children:<path> → [child_name1, child_name2, ...]
```

**Examples:**

```
children:/users → ["john", "jane", "admin"]
children:/products → ["widget", "gadget", "tool"]
```

### Category Configuration

Category configuration is stored with the `config:` prefix:

```
config:<path> → [type_label: string, expression: string]
```

**Examples:**

```
config:/categories/active-users → ["User", "status == 'active'"]
config:/categories/expensive-products → ["Product", "price > 100"]
```

### Catalogue Configuration

Catalogue configuration is stored with the `catcfg:` prefix:

```
catcfg:<path> → [type_label: string, expression: string]
```

**Examples:**

```
catcfg:/catalogues/users-by-role → ["User", "role"]
catcfg:/catalogues/products-by-category → ["Product", "category"]
```

### Type Index

The global type index maps type labels to document paths:

```
typeidx:<type_label> → [doc_path1, doc_path2, ...]
```

**Examples:**

```
typeidx:User → ["/users/john", "/users/jane", "/users/admin"]
typeidx:Product → ["/products/widget", "/products/gadget"]
typeidx:Order → ["/orders/001", "/orders/002"]
```

This enables fast lookup of all documents of a specific type.

### Category Members

Category member sets are stored with the `cat:` prefix and `:members` suffix:

```
cat:<category_path>:members → [doc_path1, doc_path2, ...]
```

**Examples:**

```
cat:/categories/active-users:members → ["/users/john", "/users/jane"]
cat:/categories/expensive-products:members → ["/products/widget", "/products/gadget"]
```

### Catalogue Groups

Catalogue groups use the `catl:` prefix with two key patterns:

#### Group Members

```
catl:<catalogue_path>:group:<key> → [doc_path1, doc_path2, ...]
```

**Examples:**

```
catl:/catalogues/users-by-role:group:admin → ["/users/admin"]
catl:/catalogues/users-by-role:group:user → ["/users/john", "/users/jane"]
catl:/catalogues/products-by-category:group:electronics → ["/products/widget", "/products/gadget"]
```

#### Group Keys List

```
catl:<catalogue_path>:keys → [key1, key2, ...]
```

**Examples:**

```
catl:/catalogues/users-by-role:keys → ["admin", "user", "guest"]
catl:/catalogues/products-by-category:keys → ["electronics", "clothing", "food"]
```

### N-gram Index

Text search indices use the `idx:` prefix with n-gram type specifiers:

#### Trigram Index

```
idx:<index_path>:tri:<trigram> → [doc_path1, doc_path2, ...]
```

**Examples:**

```
idx:/indices/document-search:tri:the → ["/docs/doc1", "/docs/doc2", "/docs/doc3"]
idx:/indices/document-search:tri:ing → ["/docs/doc1", "/docs/doc4"]
```

#### Bigram Reverse Index

Used for finding trigrams that contain a specific bigram:

```
idx:<index_path>:bi:<bigram> → [trigram1, trigram2, ...]
```

**Example:**

```
idx:/indices/document-search:bi:th → ["the", "tha", "thi"]
```

#### Unigram Reverse Index

Used for finding bigrams that start with a specific character:

```
idx:<index_path>:uni:<char> → [bigram1, bigram2, ...]
```

**Example:**

```
idx:/indices/document-search:uni:t → ["th", "tr", "to"]
```

#### Document Content Cache

Stores the indexed content for a document (used for reindexing):

```
idx:<index_path>:doc:<doc_path> → content: string
```

**Example:**

```
idx:/indices/document-search:doc:/docs/doc1 → "The quick brown fox jumps over the lazy dog"
```

## Value Serialization

All values are serialized using the [`ElementSerializer`](src/Storage/ElementSerializer.vala) system with type tags for proper deserialization.
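To make the idea of a type tag concrete, here is a minimal sketch of a tagged string element. It is not the actual `ElementWriter` / `ElementReader` code: `encode_string_element` and `decode_string_element` are hypothetical helpers written only for illustration. The sketch assumes the STRING code `0x05` and the int64 length prefix documented in the sections below, and it further assumes the length is written big-endian like the INT64 type.

```vala
// Illustrative sketch only; these helpers are hypothetical and not part of Implexus.
// Assumed layout: [0x05][length: int64, big-endian (assumption)][utf-8 bytes]

uint8[] encode_string_element (string s) {
    int n = s.length;                      // byte length of the UTF-8 data
    var buf = new uint8[1 + 8 + n];
    buf[0] = 0x05;                         // STRING type tag
    int64 len = n;
    for (int i = 0; i < 8; i++) {
        // Most significant byte first
        buf[1 + i] = (uint8) ((len >> (8 * (7 - i))) & 0xff);
    }
    for (int i = 0; i < n; i++) {
        buf[9 + i] = (uint8) s[i];
    }
    return buf;
}

string? decode_string_element (uint8[] buf) {
    // The tag tells the reader how to interpret the bytes that follow.
    if (buf.length < 9 || buf[0] != 0x05) {
        return null;
    }
    int64 n = 0;
    for (int i = 0; i < 8; i++) {
        n = (n << 8) | buf[1 + i];
    }
    if (buf.length < 9 + n) {
        return null;                       // truncated element
    }
    var sb = new StringBuilder ();
    for (int i = 0; i < n; i++) {
        sb.append_c ((char) buf[9 + i]);
    }
    return sb.str;
}

void main () {
    uint8[] encoded = encode_string_element ("john");
    stdout.printf ("%d bytes, tag 0x%02x\n", encoded.length, encoded[0]);  // 13 bytes, tag 0x05
    stdout.printf ("%s\n", decode_string_element (encoded));               // john
}
```

Because the reader checks the tag before interpreting the payload, a value written as one type cannot be silently misread as another; in this sketch the decoder simply returns `null` on a mismatch.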
### Type Codes

| Code | Type | Description |
|------|------|-------------|
| 0x00 | NULL | Null value |
| 0x01 | BOOL | Boolean (1 byte) |
| 0x02 | INT64 | 64-bit signed integer (big-endian) |
| 0x03 | UINT64 | 64-bit unsigned integer (big-endian) |
| 0x04 | DOUBLE | 64-bit IEEE 754 floating point |
| 0x05 | STRING | Length-prefixed UTF-8 string |
| 0x06 | BINARY | Length-prefixed binary data |
| 0x07 | ARRAY | Count-prefixed array of elements |
| 0x08 | DICTIONARY | Count-prefixed key-value pairs |

### String Encoding

Strings are serialized as:

```
[length: int64][utf-8 bytes]
```

### Set/Array Encoding

Sets (like member lists) are serialized as arrays:

```
[ARRAY_CODE: 0x07][count: int64][element1][element2]...[elementN]
```

Each element is a string element with its own type code and length prefix.

### Properties Encoding

Properties dictionaries are serialized as:

```
[DICTIONARY_CODE: 0x08][count: int64][key1][value1][key2][value2]...
```

## Index Management

Indices are kept in sync with document changes through the [`HookManager`](src/Engine/HookManager.vala) event system.

### Event Flow

1. Document is created, updated, or deleted via [`EmbeddedEngine`](src/Engine/EmbeddedEngine.vala)
2. [`HookManager`](src/Engine/HookManager.vala) fires the appropriate event
3. Indexed entities ([`Category`](src/Entities/Category.vala), [`Catalogue`](src/Entities/Catalogue.vala), [`Index`](src/Entities/Index.vala)) listen for events
4. [`IndexManager`](src/Storage/IndexManager.vala) updates the relevant index entries
5. All operations within a transaction are committed atomically (when supported)

### Transaction Safety

When using a backend with native transaction support (such as LMDB), wrap related updates in a single transaction:

```vala
dbm.begin_transaction();
try {
    // Update document properties
    storage.store_properties(path, properties);

    // Update type index
    index_manager.add_to_type_index(type_label, path);

    // Update category indices
    index_manager.add_to_category(category_path, path);

    // All changes commit atomically
    dbm.commit_transaction();
} catch (StorageError e) {
    // Discard every change made since begin_transaction()
    dbm.rollback_transaction();
}
```

### Index Rebuilding

If indices become corrupted or need rebuilding:

1. Iterate through all keys with the relevant prefix
2. For each document, re-evaluate index expressions
3. Update index entries accordingly

```vala
// Example: Rebuild type index for "User"
var user_paths = new Vector<string>();
foreach (var key in dbm.keys) {
    if (key.has_prefix("props:")) {
        // Strip the 6-character "props:" prefix to recover the document path
        var path = key.substring(6);
        var type_label = storage.get_entity_type_label(path);
        if (type_label == "User") {
            user_paths.add(path);
        }
    }
}

// Clear and rebuild
dbm.delete("typeidx:User");
foreach (var path in user_paths) {
    index_manager.add_to_type_index("User", path);
}
```

## Key Enumeration Patterns

### Finding All Entities

```vala
foreach (var key in dbm.keys) {
    if (key.has_prefix("entity:")) {
        // Strip the 7-character "entity:" prefix to recover the entity path
        var path = key.substring(7);
        // Process entity
    }
}
```

### Finding All Documents of a Type

```vala
var user_paths = index_manager.get_paths_for_type("User");
foreach (var path in user_paths) {
    var props = storage.load_properties(new EntityPath(path));
    // Process document
}
```

### Finding Category Members

```vala
var members = index_manager.get_category_members("/categories/active-users");
foreach (var member_path in members) {
    // Process member
}
```

### Finding Catalogue Groups

```vala
var keys = index_manager.get_catalogue_keys("/catalogues/users-by-role");
foreach (var key in keys) {
    var members = index_manager.get_catalogue_group_members(
        "/catalogues/users-by-role",
        key
    );
    // Process group
}
```

## Storage Efficiency Considerations

### Key Length

Keys are stored verbatim, so shorter paths reduce storage overhead:

- Prefer `/users/john` over `/application/data/users/john`
- Type labels should be concise but descriptive

### Set Storage

Member sets are stored as arrays of strings. For very large sets:

- Consider splitting into multiple categories with filter expressions
- Use catalogues for grouping to enable efficient subset queries

### N-gram Index Size

Text indices can grow large for text-heavy documents:

- Trigram indices: O(unique_trigrams × matching_documents)
- Consider indexing only specific fields, not entire documents
- Use the document content cache sparingly

## See Also

- [STORAGE-BACKENDS.md](STORAGE-BACKENDS.md) - Available storage backends
- [Architecture/07-Storage-Layer.md](Architecture/07-Storage-Layer.md) - Storage layer architecture
- [Architecture/11-Indexed-Entities.md](Architecture/11-Indexed-Entities.md) - Indexed entity documentation