Migrate from legacy BasicStorage + IndexManager to the new HighLevel + LowLevel architecture while preserving all performance optimizations.
- BasicStorage - High-level storage interface
- IndexManager - Index operations with performance optimizations

| Optimization | Location | Description |
|---|---|---|
| HashSet dedup on load | load_string_set() L764-770 | Uses HashSet during deserialization to deduplicate while preserving order |
| HashSet for membership checks | add_to_ngram_index() L407-408 | Creates HashSet for O(1) contains() checks instead of O(n) Vector.contains() |
| HashSet for remove | remove_from_ngram_index() L428-429 | HashSet for efficient membership test before rebuild |
| Batch add with HashSet | add_to_ngram_index_batch() L446-456 | Tracks changes, only saves if modified, uses HashSet for dedup |
| Batch remove with HashSet | remove_from_ngram_index_batch() L463-474 | HashSet for values, rebuilds vector without matches |
| Set members with dedup | set_category_members() L205-213 | Uses HashSet to deduplicate input enumerable |
| Batch reverse index | add_bigrams_reverse_batch() L564-570 | Dictionary-based batch operations |
| Trigram batch ops | add_trigrams_batch() L622-627 | Dictionary-based batch trigram operations |

Add the missing batch methods and HashSet optimizations to LowLevel storage classes.

CategoryIndexStorage

File: src/Storage/LowLevel/CategoryIndexStorage.vala

Changes:
- Add set_members() method with HashSet deduplication (from IndexManager L205-213)
- Update add_member() to use HashSet for O(1) membership check
- Update remove_member() to use HashSet for O(1) membership check

Current vs Optimized:

    // Current (O(n) contains check)
    public void add_member(string category_path, string doc_path) throws StorageError {
        string key = members_key(category_path);
        var members = load_string_set(key);
        if (!members.contains(doc_path)) { // O(n) operation
            members.add(doc_path);
            save_string_set(key, members);
        }
    }

    // Optimized (O(1) contains check)
    public void add_member(string category_path, string doc_path) throws StorageError {
        string key = members_key(category_path);
        var members = load_string_set(key);
        var members_hash = new Invercargill.DataStructures.HashSet<string>();
        foreach (var m in members) members_hash.add(m);
        if (!members_hash.has(doc_path)) { // O(1) operation
            members.add(doc_path);
            save_string_set(key, members);
        }
    }
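
The `set_members()` change listed above depends on deduplicating the input while keeping its order. The following is a minimal sketch of that step only, not the IndexManager code itself. It assumes libgee (`Gee.ArrayList`, `Gee.HashSet`) as a stand-in for the Invercargill `Vector` and `HashSet` types, and `dedup_preserving_order()` is a hypothetical helper name. It should build with `valac --pkg gee-0.8`.

```vala
// Sketch only: Gee collections stand in for Invercargill's Vector/HashSet,
// and dedup_preserving_order() is a hypothetical helper name.
using Gee;

ArrayList<string> dedup_preserving_order(Iterable<string> input) {
    var seen = new HashSet<string>();
    var result = new ArrayList<string>();
    foreach (var doc_path in input) {
        // Collection.add() returns true only when the set actually changed,
        // i.e. when doc_path had not been seen before.
        if (seen.add(doc_path)) {
            result.add(doc_path);
        }
    }
    return result;
}

void main() {
    string[] raw = { "a.md", "b.md", "a.md", "c.md", "b.md" };
    var input = new ArrayList<string>();
    foreach (var s in raw) input.add(s);

    // Keeps the first occurrence of each path, in input order.
    var members = dedup_preserving_order(input);
    foreach (var m in members) {
        print("%s\n", m);
    }
}
```

With the sample input this prints `a.md`, `b.md`, `c.md`, one per line. In the real method the deduplicated result would then be passed to `save_string_set()`.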

CatalogueIndexStorage

File: src/Storage/LowLevel/CatalogueIndexStorage.vala

Changes:
- Add set_group_members() method with HashSet deduplication
- Update add_to_group() with HashSet membership check
- Update remove_from_group() with HashSet membership check

TextIndexStorage (Critical - most complex)

File: src/Storage/LowLevel/TextIndexStorage.vala

Changes:
- Update add_trigram() with HashSet membership check (from IndexManager L404-414)
- Update remove_trigram() with HashSet membership check (from IndexManager L425-440)
- Add add_trigram_batch() method (from IndexManager L442-457)
- Add remove_trigram_batch() method (from IndexManager L459-475)
- Add add_bigram_mapping_batch() method (from IndexManager L564-570)
- Add add_unigram_mapping_batch() method (from IndexManager L614-620)
- Add add_trigrams_batch() dictionary method (from IndexManager L622-628)
- Add remove_trigrams_batch() dictionary method (from IndexManager L630-636)
- Update load_string_set() to use HashSet for deduplication (from IndexManager L747-779)

TypeIndexStorage

File: src/Storage/LowLevel/TypeIndexStorage.vala

Changes:
- Update add_document() with HashSet membership check
- Update remove_document() with HashSet membership check
- Update load_string_set() to use HashSet for deduplication

Expose the new LowLevel batch methods through the HighLevel facades.

CategoryStore

File: src/Storage/HighLevel/CategoryStore.vala

Changes:
- Add set_members() facade method

CatalogueStore

File: src/Storage/HighLevel/CatalogueStore.vala

Changes:
- Add set_group_members() facade method

IndexStore

File: src/Storage/HighLevel/IndexStore.vala

Changes:
- Add add_trigram_batch() facade method
- Add remove_trigram_batch() facade method
- Add add_bigram_mappings_batch() facade method
- Add add_unigram_mappings_batch() facade method
- Add add_trigrams_batch() dictionary method
- Add remove_trigrams_batch() dictionary method

Update Category, Catalogue, and Index entities to use HighLevel stores instead of IndexManager.

Category Entity

File: src/Entities/Category.vala

Changes:
- Replace get_index_manager() calls with get_category_store() calls
- Update populate_index() to use CategoryStore.set_members()
- Update add_document() to use CategoryStore.add_member()
- Update remove_document() to use CategoryStore.remove_member()
- Update contains_document() to use CategoryStore.get_members()
- Update batch_update_members() to use CategoryStore methods
- Update clear_index() to use CategoryStore.clear_index()

Example migration:

    // Before (using IndexManager)
    var index_manager = get_index_manager();
    if (index_manager != null) {
        ((!) index_manager).add_to_category(_path.to_string(), doc_path);
    }

    // After (using CategoryStore)
    var store = get_category_store();
    if (store != null) {
        ((!) store).add_member(_path, doc_path);
    }

Catalogue Entity

File: src/Entities/Catalogue.vala

Changes:
- Replace get_index_manager() calls with get_catalogue_store() calls

Index Entity

File: src/Entities/Index.vala

Changes:
- Replace get_index_manager() calls with get_index_store() calls

EngineConfiguration

File: src/Engine/EngineConfiguration.vala

Changes:
- Remove index_manager property (L171)

Core.Engine interface

File: src/Core/Engine.vala

Changes:
- Remove index_manager property (L215-216)

EmbeddedEngine

File: src/Engine/EmbeddedEngine.vala

Changes:
- Remove _index_manager field (L67)
- Remove _configuration.index_manager assignment (L190)

EngineFactory

File: src/Engine/EngineFactory.vala

Changes:
- Replace BasicStorage with direct Dbm + HighLevel stores

RemoteEngine

File: src/Engine/RemoteEngine.vala

Changes:

Server

File: src/Server/Server.vala

Changes:

Legacy files to delete:
- src/Storage/Storage.vala (BasicStorage class)
- src/Storage/IndexManager.vala

File: src/meson.build

Changes:
- Remove 'Storage/Storage.vala' from storage_sources (L35)
- Remove 'Storage/IndexManager.vala' from storage_sources (L36)

File: tests/Storage/LowLevelStorageTest.vala (new)
- CategoryIndexStorage with HashSet optimizations
- CatalogueIndexStorage with HashSet optimizations
- TextIndexStorage batch operations
- TypeIndexStorage with HashSet optimizations

File: tests/Storage/HighLevelStorageTest.vala (new)
- CategoryStore facade
- CatalogueStore facade
- IndexStore batch methods

Existing tests:
- Update StorageTest.vala to remove BasicStorage tests

HashSet patterns to preserve:

    // When checking if item exists before add/remove
    var set_hash = new Invercargill.DataStructures.HashSet<string>();
    foreach (var item in set) set_hash.add(item);
    if (!set_hash.has(new_item)) { // O(1) instead of O(n)
        set.add(new_item);
        save_string_set(key, set);
    }

    // When deserializing, deduplicate while preserving order
    var result = new Invercargill.DataStructures.Vector<string>();
    var hash_set = new Invercargill.DataStructures.HashSet<string>();
    foreach (var item in array) {
        if (!item.is_null()) {
            string value = item.as<string>();
            if (!hash_set.has(value)) { // Prevent duplicates
                hash_set.add(value);
                result.add(value);
            }
        }
    }

    // Only save if changes were made
    bool changed = false;
    var existing_hash = new Invercargill.DataStructures.HashSet<string>();
    foreach (var ex in existing) existing_hash.add(ex);
    foreach (var val in values) {
        if (!existing_hash.has(val)) {
            existing_hash.add(val);
            existing.add(val);
            changed = true;
        }
    }
    if (changed) save_string_set(key, existing); // Only save if modified
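
The batch add pattern has a removal counterpart, which the optimization table describes as "HashSet for values, rebuilds vector without matches". A rough sketch of that idea follows. It again assumes libgee as a stand-in for the Invercargill collections, and `remove_batch()` is a hypothetical helper, not the signature of `remove_from_ngram_index_batch()`.

```vala
// Sketch only: Gee collections stand in for Invercargill's Vector/HashSet.
using Gee;

// Hypothetical helper: removes every value in to_remove from existing and
// reports whether anything changed, so the caller can skip the save.
bool remove_batch(ArrayList<string> existing, Iterable<string> to_remove) {
    // Hash the values to remove once, so each lookup below is O(1).
    var remove_hash = new HashSet<string>();
    foreach (var val in to_remove) remove_hash.add(val);

    // Rebuild the list without the matching entries, preserving order.
    var kept = new ArrayList<string>();
    foreach (var ex in existing) {
        if (!remove_hash.contains(ex)) kept.add(ex);
    }

    // Nothing matched: leave the list alone and report "unchanged".
    if (kept.size == existing.size) return false;

    existing.clear();
    existing.add_all(kept);
    return true;
}

void main() {
    string[] docs = { "doc1", "doc2", "doc3", "doc4" };
    var existing = new ArrayList<string>();
    foreach (var d in docs) existing.add(d);

    var to_remove = new ArrayList<string>();
    to_remove.add("doc2");
    to_remove.add("doc9"); // not present, so it is ignored

    bool changed = remove_batch(existing, to_remove);
    print("changed=%s size=%d\n", changed.to_string(), existing.size);
}
```

The sample run prints `changed=true size=3`. A storage method built on this would call `save_string_set()` only when the helper returns true, mirroring the change tracking in the batch add pattern.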

    // Process multiple keys in batch
    public void add_trigrams_batch(string index_path,
            Invercargill.DataStructures.Dictionary<string, Invercargill.DataStructures.Vector<string>> additions)
            throws StorageError {
        foreach (var trigram in additions.keys) {
            Invercargill.DataStructures.Vector<string> docs;
            additions.try_get(trigram, out docs);
            add_to_ngram_index_batch(index_path, "tri", trigram, docs);
        }
    }
| Risk | Impact | Mitigation |
|---|---|---|
| Performance regression | High | Benchmark before/after, preserve HashSet patterns |
| Data corruption | Critical | Same key prefixes used, no data migration needed |
| API breakage | Medium | HighLevel stores already exist, just need to use them |
| Test coverage | Medium | Add tests for LowLevel classes before migration |

| Phase | Complexity |
|---|---|
| Phase 1: LowLevel optimizations | Medium - careful pattern copying |
| Phase 2: HighLevel batch methods | Low - simple facades |
| Phase 3: Entity migration | Medium - many call sites |
| Phase 4: Engine configuration | Low - few changes |
| Phase 5: Remaining usage | Low - few files |
| Phase 6: Remove legacy | Low - delete files |
| Phase 7: Testing | Medium - comprehensive testing |
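
For the "Benchmark before/after" mitigation in the risk table, a micro-benchmark along the following lines could serve as a starting point. It is a sketch under stated assumptions: libgee replaces the Invercargill collections, the helper names and data sizes are made up for illustration, and the timings say nothing about the real storage layer or Dbm I/O. Building the HashSet is itself O(n), so the gain shows up when one set serves many lookups, as in the batch methods.

```vala
// Sketch only: Gee collections stand in for Invercargill's Vector/HashSet.
using Gee;

// Counts how many probes are found using linear ArrayList.contains().
int count_hits_linear(ArrayList<string> members, ArrayList<string> probes) {
    int hits = 0;
    foreach (var p in probes) {
        if (members.contains(p)) hits++;
    }
    return hits;
}

// Same count, but builds a HashSet once and probes it instead.
int count_hits_hashed(ArrayList<string> members, ArrayList<string> probes) {
    var hash = new HashSet<string>();
    hash.add_all(members);
    int hits = 0;
    foreach (var p in probes) {
        if (hash.contains(p)) hits++;
    }
    return hits;
}

void main() {
    var members = new ArrayList<string>();
    var probes = new ArrayList<string>();
    for (int i = 0; i < 20000; i++) members.add("doc%d".printf(i));
    // Every second probe falls outside the member range, so half of them miss.
    for (int i = 0; i < 2000; i++) probes.add("doc%d".printf(i * 20));

    var timer = new Timer(); // starts timing on construction
    int linear = count_hits_linear(members, probes);
    double linear_s = timer.elapsed();

    timer.start(); // restart the clock for the second measurement
    int hashed = count_hits_hashed(members, probes);
    double hashed_s = timer.elapsed();

    print("linear: %d hits in %.4f s\n", linear, linear_s);
    print("hashed: %d hits in %.4f s\n", hashed, hashed_s);
}
```

Both functions must report the same hit count. Only the elapsed times should differ, and those depend on the machine and data size.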