hookmanager-batch-fix.md 3.8 KB

HookManager Batch Processing Fix

Problem

Batched document creation is significantly slower than regular inserts due to a double-processing bug in HookManager.commit_batch().

Performance Impact

Operation Regular Batched Slowdown
create_document_small_indexed 1.95ms 62.22ms 32×
create_document_large_indexed 33.98ms 128.26ms

Root Cause

In HookManager.commit_batch():

public void commit_batch() {
    // Execute batch for batched handlers
    execute_batch_for_handlers((!) _current_batch);  // Step 1
    
    // Also execute individual events for non-batched handlers
    ((!) _current_batch).execute(this);  // Step 2 - BUG: executes ALL handlers
}

The comment says "non-batched handlers" but HookBatch.execute() calls notify_entity_change_immediate() which iterates ALL handlers including BatchedHookHandler instances.

Solution

Modify the notify_*_immediate() methods to skip handlers that support batch processing, since they already processed events in execute_batch_for_handlers().

Changes Required

1. Fix notify_entity_change_immediate() (around line 905)

Before:

private void notify_entity_change_immediate(Core.Entity entity, EntityChangeType change_type) {
    foreach (var handler in _handlers) {
        try {
            handler.on_entity_change(entity, change_type);
        } catch (Error e) {
            warning("Hook handler threw error for %s: %s", entity.path.to_string(), e.message);
        }
    }
}

After:

private void notify_entity_change_immediate(Core.Entity entity, EntityChangeType change_type) {
    foreach (var handler in _handlers) {
        // Skip batched handlers - they already processed events in execute_batch_for_handlers()
        if (handler is BatchedHookHandler && ((BatchedHookHandler) handler).supports_batch) {
            continue;
        }
        try {
            handler.on_entity_change(entity, change_type);
        } catch (Error e) {
            warning("Hook handler threw error for %s: %s", entity.path.to_string(), e.message);
        }
    }
}

2. Fix notify_property_change_immediate() (around line 919)

Before:

private void notify_property_change_immediate(Core.Entity entity, string property_name, Value? old_value, Value? new_value) {
    foreach (var handler in _handlers) {
        try {
            handler.on_property_change(entity, property_name, old_value, new_value);
        } catch (Error e) {
            warning("Hook handler threw error for %s.%s: %s", entity.path.to_string(), property_name, e.message);
        }
    }
}

After:

private void notify_property_change_immediate(Core.Entity entity, string property_name, Value? old_value, Value? new_value) {
    foreach (var handler in _handlers) {
        // Skip batched handlers - they already processed events in execute_batch_for_handlers()
        if (handler is BatchedHookHandler && ((BatchedHookHandler) handler).supports_batch) {
            continue;
        }
        try {
            handler.on_property_change(entity, property_name, old_value, new_value);
        } catch (Error e) {
            warning("Hook handler threw error for %s.%s: %s", entity.path.to_string(), property_name, e.message);
        }
    }
}

Expected Outcome

After the fix:

  • Batched handlers process events exactly once via on_batch_change()
  • Non-batched handlers process events individually via on_entity_change() / on_property_change()
  • Batched inserts should be faster than regular inserts (as intended)

Verification

  1. Run all tests: meson test -C builddir
  2. Run performance benchmarks: builddir/tools/implexus-perf/implexus-perf
  3. Compare batched vs regular insert times