# HookManager Batch Optimization Plan

## Problem

Even after fixing the double-processing bug, batched operations are still slower than individual inserts:

| Operation | Individual | Batched | Ratio |
|-----------|------------|---------|-------|
| create_document_small | 4.19ms | 238.51ms per batch (23.85ms/doc) | 5.7× slower |
| create_document_large | 42.92ms | 452.75ms per batch (45.27ms/doc) | 1.05× slower |

## Root Cause Analysis

In `commit_batch()`, the code always calls `batch.execute()`, even when ALL handlers are batched handlers (Category, Catalogue, and Index all implement `BatchedHookHandler` with `supports_batch = true`):

```vala
public void commit_batch() {
    // Execute batch for batched handlers
    execute_batch_for_handlers((!) _current_batch);  // ← Correct: calls on_batch_change()

    // Also execute individual events for non-batched handlers
    ((!) _current_batch).execute(this);  // ← WASTEFUL when no non-batched handlers!

    _current_batch = null;
    _batch_mode = false;
}
```

### What `batch.execute()` does (unnecessarily when all handlers are batched):

1. **`get_consolidated_events()`** - Creates a new Vector and Dictionary and iterates over all events
2. **For each consolidated event:**
   - Calls `engine.get_entity_or_null()` - **Storage lookup!**
   - Calls `notify_entity_change_from_event()` → `notify_entity_change_immediate()`
   - Iterates ALL handlers just to skip them (they're all batched)
3. **`execute_property_changes()`** - Iterates property changes and calls handlers, which skip them

### Why This is Expensive

For 10 documents with 2 properties each:
- 30 events recorded
- 10 entity lookups from storage (expensive!)
- 30 handler iterations (all skipped, but still iterated)

## Solution

Modify `commit_batch()` to check whether any non-batched handlers are registered before calling `batch.execute()`:

```vala
public void commit_batch() {
    if (_current_batch == null) {
        return;
    }

    // Execute batch for batched handlers
    execute_batch_for_handlers((!) _current_batch);

    // Only execute individual events if there are non-batched handlers
    if (has_non_batched_handlers()) {
        ((!) _current_batch).execute(this);
    }

    _current_batch = null;
    _batch_mode = false;
}

// Returns true if at least one handler needs per-event notification:
// either it is not a BatchedHookHandler, or it reports supports_batch = false.
private bool has_non_batched_handlers() {
    foreach (var handler in _handlers) {
        if (!(handler is BatchedHookHandler)) {
            return true;
        }
        var batched = (BatchedHookHandler) handler;
        if (!batched.supports_batch) {
            return true;
        }
    }
    return false;
}
```

## Expected Outcome

After the fix:
- `batch.execute()` is skipped entirely when all handlers support batching
- No unnecessary entity lookups
- No unnecessary handler iterations
- Batched inserts should be **faster** than individual inserts (single transaction vs N transactions)

## Verification

1. Run tests: `meson test -C builddir`
2. Run benchmarks: `builddir/tools/implexus-perf/implexus-perf gdbm:///tmp/perf-test`
3. Compare batched vs individual insert times
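
Before running the benchmarks, the handler check itself can be exercised in isolation. The following is a minimal, standalone sketch and not the project's actual code: `HookHandler`, `BatchedStub`, `OptOutStub`, and `PlainStub` are hypothetical stand-ins, the interface shapes are assumed from the names used in this plan, and the check is written as a free function over an array instead of a method over `_handlers`. It assumes Vala's default (non-strict) null handling.

```vala
// Hypothetical stand-ins for the real handler interfaces (assumed shapes).
public interface HookHandler : Object {
}

public interface BatchedHookHandler : Object, HookHandler {
    public abstract bool supports_batch { get; }
}

// A handler that fully supports batching.
public class BatchedStub : Object, HookHandler, BatchedHookHandler {
    public bool supports_batch { get { return true; } }
}

// A handler that implements the batched interface but opts out.
public class OptOutStub : Object, HookHandler, BatchedHookHandler {
    public bool supports_batch { get { return false; } }
}

// A handler that only understands individual events.
public class PlainStub : Object, HookHandler {
}

// Same decision as the proposed has_non_batched_handlers(),
// taking the handler list as a parameter so it can be tested alone.
bool has_non_batched_handlers(HookHandler[] handlers) {
    foreach (var handler in handlers) {
        var batched = handler as BatchedHookHandler;
        if (batched == null || !batched.supports_batch) {
            return true;
        }
    }
    return false;
}

void main() {
    HookHandler[] all_batched = { new BatchedStub(), new BatchedStub() };
    HookHandler[] with_plain = { new BatchedStub(), new PlainStub() };
    HookHandler[] with_opt_out = { new BatchedStub(), new OptOutStub() };
    HookHandler[] empty = {};

    assert(!has_non_batched_handlers(all_batched));  // execute() can be skipped
    assert(has_non_batched_handlers(with_plain));    // execute() still needed
    assert(has_non_batched_handlers(with_opt_out));  // execute() still needed
    assert(!has_non_batched_handlers(empty));        // nothing to notify
}
```

If saved as a single file, this should build with `valac --pkg gobject-2.0 <file>.vala`. The mixed cases matter most: as soon as one handler is non-batched or opts out, `batch.execute()` must still run, so the optimization only applies to the all-batched configuration described above.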