Transaction Batching Design

This document describes a design that defers hook execution to transaction commit time and runs the hooks in batch. The goal is to improve performance by avoiding a separate hook invocation, and a separate storage write, for every insert operation.

Problem Statement

Current Hook Execution Flow

The current implementation invokes hooks immediately for each entity change:

sequenceDiagram
    participant App as Application
    participant Engine as EmbeddedEngine
    participant Hook as HookManager
    participant Cat as Category
    participant Storage as IndexManager
    
    App->>Engine: create_document with type_label
    Engine->>Engine: store document
    Engine->>Hook: notify_entity_change CREATED
    Hook->>Cat: on_entity_change CREATED
    Cat->>Storage: add_to_category
    Storage->>Storage: write to DBM
    
    App->>Engine: set_property on document
    Engine->>Engine: store property
    Engine->>Hook: notify_document_property_change
    Hook->>Cat: on_document_property_change
    Cat->>Storage: update_category_membership
    Storage->>Storage: write to DBM
    
    Note over App,Storage: Each operation triggers immediate storage I/O

Performance Issues

From performance benchmarks in PERF.md:

| Operation             | Avg Time | Ops/sec |
|-----------------------|----------|---------|
| create_document_small | ~1.5ms   | ~640    |
| create_category       | ~17ms    | ~58     |
| create_catalogue      | ~58ms    | ~17     |
| create_index          | ~4000ms  | ~0.25   |

The issues are:

  1. Immediate Hook Invocation: Each document operation triggers all registered hooks immediately
  2. Multiple Storage Writes: Each hook invocation performs separate storage I/O
  3. No Batching Opportunity: Hooks cannot optimize for bulk operations
  4. Transaction Overhead: Each hook may start its own transaction or write outside transaction scope

Example: Bulk Document Insert

When inserting 1000 documents in a transaction:

var tx = engine.begin_transaction();
try {
    for (int i = 0; i < 1000; i++) {
        var doc = container.create_document("doc%d".printf(i), "Post");
        doc.set_entity_property("title", new ValueElement("Title %d".printf(i)));
        doc.set_entity_property("author", new ValueElement("user123"));
        // Each create and set_property triggers hooks immediately!
    }
    tx.commit();
} catch (Error e) {
    tx.rollback();
}

Current behavior:

  • 1000 document creations × N hooks = 1000N hook invocations
  • 2000 property sets × M hooks = 2000M hook invocations
  • Each invocation does storage I/O
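
To make the cost concrete, the counts above can be worked through for specific hook counts. The sketch below is illustrative only: the helper name and the sample values N = 3 and M = 2 are assumptions, not figures from the project or from PERF.md.

```vala
// Hypothetical helper: counts hook invocations under immediate dispatch.
// entity_hooks corresponds to N and property_hooks to M in the list above.
int immediate_invocations(int documents, int properties_per_document,
                          int entity_hooks, int property_hooks) {
    int creations = documents * entity_hooks;
    int property_sets = documents * properties_per_document * property_hooks;
    return creations + property_sets;
}

void main() {
    // 1000 documents with 2 properties each, assuming N = 3 and M = 2.
    int total = immediate_invocations(1000, 2, 3, 2);
    print("%d hook invocations\n", total); // prints "7000 hook invocations"
}
```

With these assumed values, a single bulk insert produces 3000 entity hook calls and 4000 property hook calls, each of which performs its own storage I/O.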

Proposed Solution

Batched Hook Execution

Hook events should be accumulated during a transaction, and the hooks themselves executed in batch at commit time:

sequenceDiagram
    participant App as Application
    participant Engine as EmbeddedEngine
    participant Batch as HookBatch
    participant Hook as HookManager
    participant Cat as Category
    participant Storage as IndexManager
    
    App->>Engine: begin_transaction
    Engine->>Batch: create new HookBatch
    
    loop For each document
        App->>Engine: create_document
        Engine->>Engine: store document
        Engine->>Batch: record CREATED event
        Note over Batch: Event queued, no hook call
        
        App->>Engine: set_property
        Engine->>Engine: store property
        Engine->>Batch: record MODIFIED event
        Note over Batch: Event queued, no hook call
    end
    
    App->>Engine: commit_transaction
    Engine->>Batch: flush
    Batch->>Hook: execute_batch events
    Hook->>Cat: on_batch_change events
    Cat->>Storage: batch_update_category
    Storage->>Storage: single batched write
    
    Note over App,Storage: All hooks executed once with all changes

Key Components

1. HookEvent Record

Represents a single entity change event:

namespace Implexus.Engine {

/**
 * Represents a queued hook event for batched processing.
 */
public class HookEvent : Object {
    /**
     * The type of change that occurred.
     */
    public EntityChangeType change_type { get; construct set; }
    
    /**
     * The entity path affected.
     */
    public Core.EntityPath entity_path { get; construct set; }
    
    /**
     * The entity type.
     */
    public Core.EntityType entity_type { get; construct set; }
    
    /**
     * The type label for documents.
     */
    public string? type_label { get; construct set; }
    
    /**
     * The property name for property changes.
     */
    public string? property_name { get; construct set; }
    
    /**
     * The old property value.
     */
    public Invercargill.Element? old_value { get; construct set; }
    
    /**
     * The new property value.
     */
    public Invercargill.Element? new_value { get; construct set; }
    
    /**
     * Creates a new hook event.
     */
    public HookEvent(
        EntityChangeType change_type,
        Core.EntityPath entity_path,
        Core.EntityType entity_type,
        string? type_label = null,
        string? property_name = null,
        Invercargill.Element? old_value = null,
        Invercargill.Element? new_value = null
    ) {
        Object(
            change_type: change_type,
            entity_path: entity_path,
            entity_type: entity_type,
            type_label: type_label,
            property_name: property_name,
            old_value: old_value,
            new_value: new_value
        );
    }
}

} // namespace Implexus.Engine

2. HookBatch Class

Accumulates events during a transaction:

namespace Implexus.Engine {

/**
 * Accumulates hook events during a transaction for batched execution.
 *
 * The HookBatch collects entity change events during a transaction
 * and provides methods to consolidate and execute them efficiently.
 */
public class HookBatch : Object {
    
    // === Private Fields ===
    
    /**
     * Accumulated events in order of occurrence.
     */
    private Invercargill.DataStructures.Vector<HookEvent> _events;
    
    /**
     * Map of entity path to final state for consolidation.
     */
    private Invercargill.DataStructures.Dictionary<string, EntityFinalState> _entity_states;
    
    // === Constructors ===
    
    /**
     * Creates a new empty HookBatch.
     */
    public HookBatch() {
        _events = new Invercargill.DataStructures.Vector<HookEvent>();
        _entity_states = new Invercargill.DataStructures.Dictionary<string, EntityFinalState>();
    }
    
    // === Event Recording ===
    
    /**
     * Records an entity creation event.
     *
     * @param entity The entity that was created
     */
    public void record_created(Core.Entity entity) {
        var evt = new HookEvent(
            EntityChangeType.CREATED,
            entity.path,
            entity.entity_type,
            entity.type_label
        );
        _events.add(evt);
        update_entity_state(entity.path.to_string(), StateChangeType.CREATED, entity);
    }
    
    /**
     * Records an entity modification event.
     *
     * @param entity The entity that was modified
     */
    public void record_modified(Core.Entity entity) {
        var evt = new HookEvent(
            EntityChangeType.MODIFIED,
            entity.path,
            entity.entity_type,
            entity.type_label
        );
        _events.add(evt);
        update_entity_state(entity.path.to_string(), StateChangeType.MODIFIED, entity);
    }
    
    /**
     * Records an entity deletion event.
     *
     * @param path The path of the deleted entity
     * @param entity_type The type of the deleted entity
     * @param type_label The type label if it was a document
     */
    public void record_deleted(
        Core.EntityPath path,
        Core.EntityType entity_type,
        string? type_label
    ) {
        var evt = new HookEvent(
            EntityChangeType.DELETED,
            path,
            entity_type,
            type_label
        );
        _events.add(evt);
        update_entity_state(path.to_string(), StateChangeType.DELETED, null);
    }
    
    /**
     * Records a property change event.
     *
     * @param document The document whose property changed
     * @param property_name The name of the property
     * @param old_value The previous value
     * @param new_value The new value
     */
    public void record_property_change(
        Core.Entity document,
        string property_name,
        Invercargill.Element? old_value,
        Invercargill.Element? new_value
    ) {
        var evt = new HookEvent(
            EntityChangeType.MODIFIED,
            document.path,
            document.entity_type,
            document.type_label,
            property_name,
            old_value,
            new_value
        );
        _events.add(evt);
        
        // Track property changes in entity state
        var path = document.path.to_string();
        if (!_entity_states.has_key(path)) {
            _entity_states.set(path, new EntityFinalState(document));
        }
        var state = _entity_states.get(path);
        state.record_property_change(property_name, old_value, new_value);
    }
    
    // === Event Consolidation ===
    
    /**
     * Gets consolidated events for efficient batch processing.
     *
     * This method consolidates multiple events for the same entity:
     * - CREATED followed by MODIFIED → just CREATED with final state
     * - Multiple MODIFIED → single MODIFIED with final state
     * - CREATED followed by DELETED → no events (cancelled out)
     *
     * @return Consolidated vector of events
     */
    public Invercargill.DataStructures.Vector<HookEvent> get_consolidated_events() {
        var consolidated = new Invercargill.DataStructures.Vector<HookEvent>();
        
        // Group by entity path
        var events_by_path = new Invercargill.DataStructures.Dictionary<
            string, 
            Invercargill.DataStructures.Vector<HookEvent>
        >();
        
        foreach (var evt in _events) {
            var path = evt.entity_path.to_string();
            if (!events_by_path.has_key(path)) {
                events_by_path.set(path, new Invercargill.DataStructures.Vector<HookEvent>());
            }
            events_by_path.get(path).add(evt);
        }
        
        // Consolidate each entity's events
        foreach (var entry in events_by_path.entries) {
            var path = entry.key;
            var events = entry.value;
            var final_event = consolidate_entity_events(events);
            
            if (final_event != null) {
                consolidated.add((!) final_event);
            }
        }
        
        return consolidated;
    }
    
    /**
     * Consolidates events for a single entity.
     */
    private HookEvent? consolidate_entity_events(
        Invercargill.DataStructures.Vector<HookEvent> events
    ) {
        if (events.peek_count() == 0) {
            return null;
        }
        
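        // was_created: the entity is new in this transaction (first event is CREATED).
        // was_deleted: the entity is gone at commit time (last event is DELETED).
        // Tracking first and last events keeps a delete-then-recreate sequence
        // from being cancelled out; it is reported as MODIFIED instead.
        bool was_created = false;
        bool was_deleted = false;
        HookEvent? last_event = null;
        
        foreach (var evt in events) {
            if (last_event == null) {
                was_created = (evt.change_type == EntityChangeType.CREATED);
            }
            was_deleted = (evt.change_type == EntityChangeType.DELETED);
            last_event = evt;
        }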
        
        // If created and deleted in same transaction, cancel out
        if (was_created && was_deleted) {
            return null;
        }
        
        // Return appropriate event
        if (was_created) {
            return new HookEvent(
                EntityChangeType.CREATED,
                last_event.entity_path,
                last_event.entity_type,
                last_event.type_label
            );
        }
        
        if (was_deleted) {
            return new HookEvent(
                EntityChangeType.DELETED,
                last_event.entity_path,
                last_event.entity_type,
                last_event.type_label
            );
        }
        
        // Just modifications
        return new HookEvent(
            EntityChangeType.MODIFIED,
            last_event.entity_path,
            last_event.entity_type,
            last_event.type_label
        );
    }
    
    // === Batch Execution ===
    
    /**
     * Executes all batched events through the hook manager.
     *
     * @param hook_manager The hook manager to notify
     */
    public void execute(HookManager hook_manager) {
        var consolidated = get_consolidated_events();
        
        foreach (var evt in consolidated) {
            switch (evt.change_type) {
                case EntityChangeType.CREATED:
                case EntityChangeType.MODIFIED:
                case EntityChangeType.DELETED:
                    // Get entity if not deleted
                    Core.Entity? entity = null;
                    if (evt.change_type != EntityChangeType.DELETED) {
                        entity = hook_manager.engine.get_entity_or_null(evt.entity_path);
                    }
                    
                    if (entity != null || evt.change_type == EntityChangeType.DELETED) {
                        hook_manager.notify_entity_change_from_event(evt, entity);
                    }
                    break;
            }
        }
        
        // Execute property change events
        execute_property_changes(hook_manager);
    }
    
    /**
     * Executes property change events.
     */
    private void execute_property_changes(HookManager hook_manager) {
        foreach (var entry in _entity_states.entries) {
            var state = entry.value;
            foreach (var prop_change in state.property_changes.entries) {
                var property_name = prop_change.key;
                var change = prop_change.value;
                
                // Skip entities that were deleted later in the same transaction.
                if (state.entity != null && !state.was_deleted) {
                    hook_manager.notify_document_property_change(
                        state.entity,
                        property_name,
                        change.old_value,
                        change.new_value
                    );
                }
            }
        }
    }
    
    // === Utility Methods ===
    
    /**
     * Updates the entity state tracking.
     */
    private void update_entity_state(
        string path,
        StateChangeType change_type,
        Core.Entity? entity
    ) {
        if (!_entity_states.has_key(path)) {
            if (entity != null) {
                _entity_states.set(path, new EntityFinalState((!) entity));
            }
        }
        
        var state = _entity_states.get(path);
        if (state != null) {
            state.record_change(change_type);
        }
    }
    
    /**
     * Clears all accumulated events.
     */
    public void clear() {
        _events.clear();
        _entity_states.clear();
    }
    
    /**
     * Gets the number of accumulated events.
     */
    public int event_count {
        get { return (int) _events.peek_count(); }
    }
    
    /**
     * Checks if there are any events to process.
     */
    public bool has_events {
        get { return _events.peek_count() > 0; }
    }
}

/**
 * Tracks the final state of an entity during a transaction.
 */
internal class EntityFinalState : Object {
    public Core.Entity entity;
    public bool was_created = false;
    public bool was_deleted = false;
    public Invercargill.DataStructures.Dictionary<string, PropertyChange> property_changes;
    
    public EntityFinalState(Core.Entity entity) {
        this.entity = entity;
        this.property_changes = new Invercargill.DataStructures.Dictionary<string, PropertyChange>();
    }
    
    public void record_change(StateChangeType change_type) {
        switch (change_type) {
            case StateChangeType.CREATED:
                was_created = true;
                break;
            case StateChangeType.DELETED:
                was_deleted = true;
                break;
            case StateChangeType.MODIFIED:
                break;
        }
    }
    
    public void record_property_change(
        string property_name,
        Invercargill.Element? old_value,
        Invercargill.Element? new_value
    ) {
        if (!property_changes.has_key(property_name)) {
            property_changes.set(property_name, new PropertyChange(old_value, new_value));
        } else {
            // Update the new value, keep the original old value
            var existing = property_changes.get(property_name);
            existing.new_value = new_value;
        }
    }
}

/**
 * Represents a property change with old and new values.
 *
 * Public because it appears in the signature of the public
 * BatchedHookHandler interface.
 */
public class PropertyChange : Object {
    public Invercargill.Element? old_value;
    public Invercargill.Element? new_value;
    
    public PropertyChange(Invercargill.Element? old_value, Invercargill.Element? new_value) {
        this.old_value = old_value;
        this.new_value = new_value;
    }
}

internal enum StateChangeType {
    CREATED,
    MODIFIED,
    DELETED
}

} // namespace Implexus.Engine
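
The three consolidation rules listed in the get_consolidated_events comment can be checked in isolation. The following is a minimal sketch, not the project's implementation: the Change enum and the consolidate function are hypothetical stand-ins that use only plain arrays, and the sketch decides the outcome from the first and last event recorded for one entity.

```vala
// Simplified stand-in for EntityChangeType; not the project's enum.
enum Change { CREATED, MODIFIED, DELETED }

// Returns the name of the single change that survives consolidation for one
// entity, or "none" when the events cancel out.
string consolidate(Change[] events) {
    if (events.length == 0) {
        return "none";
    }
    // New in this transaction only if the first event created it.
    bool created_here = events[0] == Change.CREATED;
    // Gone at commit time only if the last event deleted it.
    bool deleted_at_end = events[events.length - 1] == Change.DELETED;

    if (created_here && deleted_at_end) {
        return "none";
    }
    if (created_here) {
        return "CREATED";
    }
    if (deleted_at_end) {
        return "DELETED";
    }
    return "MODIFIED";
}

void main() {
    // CREATED followed by MODIFIED: prints "CREATED"
    print("%s\n", consolidate(new Change[] { Change.CREATED, Change.MODIFIED }));
    // Multiple MODIFIED: prints "MODIFIED"
    print("%s\n", consolidate(new Change[] { Change.MODIFIED, Change.MODIFIED }));
    // CREATED followed by DELETED: prints "none"
    print("%s\n", consolidate(new Change[] { Change.CREATED, Change.DELETED }));
}
```

For the bulk insert example, each document contributes one CREATED and two MODIFIED events, so 3000 queued entity events consolidate to 1000 CREATED events.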

3. BatchedHookHandler Interface

New interface for hooks that support batch processing:

namespace Implexus.Engine {

/**
 * Interface for hooks that can process batched events efficiently.
 *
 * Implementing this interface allows hooks to optimize their index
 * updates when processing multiple changes at once.
 */
public interface BatchedHookHandler : Object, EntityChangeHandler {
    
    /**
     * Called with a batch of entity changes.
     *
     * This method receives all changes for entities matching the
     * hook's type filter. The hook can optimize storage writes
     * by processing all changes together.
     *
     * @param events The consolidated events for matching entities
     */
    public abstract void on_batch_change(Invercargill.DataStructures.Vector<HookEvent> events);
    
    /**
     * Called with batched property changes.
     *
     * @param document The document that changed
     * @param changes Map of property name to old/new values
     */
    public abstract void on_batch_property_change(
        Core.Entity document,
        Invercargill.DataStructures.Dictionary<string, PropertyChange> changes
    );
    
    /**
     * Indicates whether this handler prefers batch processing.
     *
     * If true, on_batch_change will be called instead of individual
     * on_entity_change calls.
     */
    public abstract bool supports_batch { get; }
}

} // namespace Implexus.Engine
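
The supports_batch contract says a batch-capable handler receives one on_batch_change call instead of individual on_entity_change calls. A minimal sketch of that dispatch rule follows. All names in it (Handler, BatchHandler, PlainHandler, BatchedHandler, dispatch) are hypothetical simplifications that pass paths as strings; they are not the project's types.

```vala
// Hypothetical, simplified counterparts of EntityChangeHandler and
// BatchedHookHandler that carry only an entity path.
interface Handler : Object {
    public abstract void on_change(string path);
}

interface BatchHandler : Object, Handler {
    public abstract void on_batch(string[] paths);
    public abstract bool supports_batch { get; }
}

class PlainHandler : Object, Handler {
    public int single_calls = 0;
    public void on_change(string path) { single_calls++; }
}

class BatchedHandler : Object, Handler, BatchHandler {
    public int single_calls = 0;
    public int batch_calls = 0;
    public bool supports_batch { get { return true; } }
    public void on_change(string path) { single_calls++; }
    public void on_batch(string[] paths) { batch_calls++; }
}

// Delivers the whole batch once to batch-capable handlers and falls back to
// one call per path for every other handler.
void dispatch(Handler[] handlers, string[] paths) {
    foreach (var handler in handlers) {
        var batched = handler as BatchHandler;
        if (batched != null && batched.supports_batch) {
            batched.on_batch(paths);
            continue;
        }
        foreach (var path in paths) {
            handler.on_change(path);
        }
    }
}

void main() {
    var plain = new PlainHandler();
    var batched = new BatchedHandler();
    dispatch(new Handler[] { plain, batched }, new string[] { "/a", "/b", "/c" });
    print("plain: %d single calls\n", plain.single_calls);       // plain: 3 single calls
    print("batched: %d single, %d batch\n",
        batched.single_calls, batched.batch_calls);              // batched: 0 single, 1 batch
}
```

The point of the sketch is that a handler must be reached through exactly one of the two paths; delivering both would apply the same change twice.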

4. Modified HookManager

Updated HookManager with batch support:

namespace Implexus.Engine {

/**
 * Manages hooks for entity change notifications with batch support.
 */
public class HookManager : Object {
    
    // === Private Fields ===
    
    private GLib.List<EntityChangeHandler> _handlers;
    private GLib.List<DocumentPropertyChangeHandler> _property_handlers;
    private GLib.List<BatchedHookHandler> _batched_handlers;
    
    /**
     * The engine this hook manager is associated with.
     */
    public weak Core.Engine engine { get; set; }
    
    /**
     * The current batch for transaction mode, or null if not in transaction.
     */
    private HookBatch? _current_batch = null;
    
    /**
     * Whether batch mode is active (i.e., within a transaction).
     */
    private bool _batch_mode = false;
    
    // === Constructors ===
    
    public HookManager() {
        _handlers = new GLib.List<EntityChangeHandler>();
        _property_handlers = new GLib.List<DocumentPropertyChangeHandler>();
        _batched_handlers = new GLib.List<BatchedHookHandler>();
    }
    
    // === Batch Mode Control ===
    
    /**
     * Begins batch mode for transaction processing.
     *
     * In batch mode, all events are accumulated instead of being
     * immediately dispatched to handlers.
     */
    public void begin_batch() {
        _batch_mode = true;
        _current_batch = new HookBatch();
    }
    
    /**
     * Commits the current batch, executing all accumulated events.
     */
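    public void commit_batch() {
        if (_current_batch == null) {
            return;
        }
        
        // Detach the batch and leave batch mode before dispatching, so that
        // notifications raised while flushing are delivered immediately
        // instead of being queued back into the batch being executed.
        var batch = (!) _current_batch;
        _current_batch = null;
        _batch_mode = false;
        
        // Execute batch for batched handlers
        execute_batch_for_handlers(batch);
        
        // Also execute individual events for non-batched handlers
        batch.execute(this);
    }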
    
    /**
     * Rolls back the current batch, discarding all accumulated events.
     */
    public void rollback_batch() {
        if (_current_batch != null) {
            ((!) _current_batch).clear();
        }
        _current_batch = null;
        _batch_mode = false;
    }
    
    // === Event Notification ===
    
    /**
     * Notifies handlers of an entity change.
     *
     * In batch mode, events are queued. Otherwise, handlers are
     * invoked immediately.
     */
    public void notify_entity_change(Core.Entity entity, EntityChangeType change_type) {
        if (_batch_mode && _current_batch != null) {
            // Queue the event
            switch (change_type) {
                case EntityChangeType.CREATED:
                    ((!) _current_batch).record_created(entity);
                    break;
                case EntityChangeType.MODIFIED:
                    ((!) _current_batch).record_modified(entity);
                    break;
                case EntityChangeType.DELETED:
                    ((!) _current_batch).record_deleted(
                        entity.path,
                        entity.entity_type,
                        entity.type_label
                    );
                    break;
            }
        } else {
            // Immediate dispatch
            notify_entity_change_immediate(entity, change_type);
        }
    }
    
    /**
     * Notifies handlers of a document property change.
     */
    public void notify_document_property_change(
        Core.Entity document,
        string property_name,
        Invercargill.Element? old_value,
        Invercargill.Element? new_value
    ) {
        if (_batch_mode && _current_batch != null) {
            ((!) _current_batch).record_property_change(
                document,
                property_name,
                old_value,
                new_value
            );
        } else {
            notify_property_change_immediate(document, property_name, old_value, new_value);
        }
    }
    
    // === Internal Methods ===
    
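    /**
     * Notifies non-batched handlers from a stored event.
     *
     * Handlers that report supports_batch have already received the event
     * through on_batch_change and are skipped here.
     */
    internal void notify_entity_change_from_event(HookEvent evt, Core.Entity? entity) {
        if (entity == null) {
            // DELETED events carry no entity, so only batched handlers see them.
            return;
        }
        foreach (var handler in _handlers) {
            var batched = handler as BatchedHookHandler;
            if (batched != null && batched.supports_batch) {
                continue;
            }
            try {
                handler.on_entity_change((!) entity, evt.change_type);
            } catch (Error e) {
                warning("Hook handler threw error for %s: %s",
                    evt.entity_path.to_string(), e.message);
            }
        }
    }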
    
    /**
     * Immediately notifies all handlers.
     */
    private void notify_entity_change_immediate(Core.Entity entity, EntityChangeType change_type) {
        foreach (var handler in _handlers) {
            try {
                handler.on_entity_change(entity, change_type);
            } catch (Error e) {
                warning("Hook handler threw error for %s: %s",
                    entity.path.to_string(), e.message);
            }
        }
    }
    
    /**
     * Immediately notifies all property handlers.
     */
    private void notify_property_change_immediate(
        Core.Entity document,
        string property_name,
        Invercargill.Element? old_value,
        Invercargill.Element? new_value
    ) {
        foreach (var handler in _property_handlers) {
            try {
                handler.on_document_property_change(document, property_name, old_value, new_value);
            } catch (Error e) {
                warning("Property hook handler threw error for %s.%s: %s",
                    document.path.to_string(), property_name, e.message);
            }
        }
    }
    
    /**
     * Executes batch for handlers that support batch processing.
     */
    private void execute_batch_for_handlers(HookBatch batch) {
        var consolidated = batch.get_consolidated_events();
        
        foreach (var handler in _batched_handlers) {
            if (handler.supports_batch) {
                try {
                    // Filter events by type_label if handler is type-specific
                    var filtered = filter_events_for_handler(consolidated, handler);
                    if (filtered.peek_count() > 0) {
                        handler.on_batch_change(filtered);
                    }
                } catch (Error e) {
                    warning("Batched hook handler threw error: %s", e.message);
                }
            }
        }
    }
    
    /**
     * Filters events to only those relevant to a handler.
     */
    private Invercargill.DataStructures.Vector<HookEvent> filter_events_for_handler(
        Invercargill.DataStructures.Vector<HookEvent> events,
        BatchedHookHandler handler
    ) {
        // Placeholder: filtering by type_label (for handlers implementing
        // TypeFilteredHook) is not implemented yet.
        var filtered = new Invercargill.DataStructures.Vector<HookEvent>();
        
        // For now, return all events - handlers can filter internally
        foreach (var evt in events) {
            filtered.add(evt);
        }
        
        return filtered;
    }
    
    // === Handler Registration ===
    
    /**
     * Registers a handler for entity changes.
     */
    public void register_handler(EntityChangeHandler handler) {
        _handlers.append(handler);
        
        // Also track as batched handler if applicable
        if (handler is BatchedHookHandler) {
            _batched_handlers.append((BatchedHookHandler) handler);
        }
    }
    
    /**
     * Unregisters a handler.
     */
    public void unregister_handler(EntityChangeHandler handler) {
        _handlers.remove(handler);
        
        if (handler is BatchedHookHandler) {
            _batched_handlers.remove((BatchedHookHandler) handler);
        }
    }
    
    /**
     * Registers a handler for property changes.
     */
    public void register_property_handler(DocumentPropertyChangeHandler handler) {
        _property_handlers.append(handler);
    }
    
    /**
     * Unregisters a property handler.
     */
    public void unregister_property_handler(DocumentPropertyChangeHandler handler) {
        _property_handlers.remove(handler);
    }
    
    // === Utility Methods ===
    
    public void clear_all() {
        _handlers = new GLib.List<EntityChangeHandler>();
        _property_handlers = new GLib.List<DocumentPropertyChangeHandler>();
        _batched_handlers = new GLib.List<BatchedHookHandler>();
    }
    
    public uint handler_count {
        get { return _handlers.length(); }
    }
    
    public uint property_handler_count {
        get { return _property_handlers.length(); }
    }
}

} // namespace Implexus.Engine

Modified EmbeddedTransaction

The EmbeddedTransaction needs to integrate with the batch system:

namespace Implexus.Engine {

public class EmbeddedTransaction : Object, Core.Transaction {
    
    private weak EmbeddedEngine _engine;
    private bool _active = true;
    private Invercargill.DataStructures.Vector<PendingOperation> _operations;
    private Invercargill.DataStructures.Dictionary<string, Invercargill.Element?> _snapshots;
    
    public EmbeddedTransaction(EmbeddedEngine engine) throws Core.EngineError {
        _engine = engine;
        _operations = new Invercargill.DataStructures.Vector<PendingOperation>();
        _snapshots = new Invercargill.DataStructures.Dictionary<string, Invercargill.Element?>();
        
        // Notify engine that transaction started. This runs first so that a
        // failure here does not leave the hook manager stuck in batch mode.
        engine.begin_transaction_internal();
        
        // Begin hook batching
        engine.hook_manager.begin_batch();
    }
    
    public bool active {
        get { return _active; }
    }
    
    public void commit() throws Core.EngineError {
        if (!_active) {
            throw new Core.EngineError.TRANSACTION_ERROR("Transaction is not active");
        }
        
        try {
            // Apply all pending operations
            foreach (var op in _operations) {
                apply_operation(op);
            }
            
            // Commit hook batch - executes all accumulated hooks
            _engine.hook_manager.commit_batch();
            
            // Clear and deactivate
            _operations.clear();
            _snapshots.clear();
            _active = false;
            
            // Notify engine that transaction ended
            _engine.end_transaction_internal();
        } catch (Core.EngineError e) {
            rollback();
            throw e;
        }
    }
    
    public void rollback() {
        if (!_active) {
            return;
        }
        
        // Rollback hook batch - discards all accumulated hooks
        _engine.hook_manager.rollback_batch();
        
        // Restore snapshots
        foreach (var key in _snapshots.keys) {
            var value = _snapshots.get(key);
            if (value != null && !((!) value).is_null()) {
                // Would restore the snapshot
            }
        }
        
        // Clear and deactivate
        _operations.clear();
        _snapshots.clear();
        _active = false;
        
        // Notify engine that transaction ended
        _engine.end_transaction_internal();
    }
    
    // ... rest of the class remains the same ...
}

} // namespace Implexus.Engine

Updated Indexed Entity Implementations

Indexed entities should implement BatchedHookHandler for optimal performance:

namespace Implexus.Entities {

public class Category : AbstractEntity, Engine.BatchedHookHandler {
    
    // ... existing code ...
    
    /**
     * Indicates this handler supports batch processing.
     */
    public bool supports_batch {
        get { return true; }
    }
    
    /**
     * Handles a batch of entity changes efficiently.
     *
     * This method processes all changes in one pass, then performs
     * a single batch update to the category index.
     */
    public void on_batch_change(Invercargill.DataStructures.Vector<Engine.HookEvent> events) {
        ensure_config_loaded();
        
        var to_add = new Invercargill.DataStructures.Vector<string>();
        var to_remove = new Invercargill.DataStructures.Vector<string>();
        
        foreach (var evt in events) {
            // Skip non-documents and wrong type
            if (evt.entity_type != Core.EntityType.DOCUMENT) {
                continue;
            }
            if (evt.type_label != _type_label) {
                continue;
            }
            
            var doc_path = evt.entity_path.to_string();
            
            switch (evt.change_type) {
                case Engine.EntityChangeType.CREATED:
                    // Evaluate predicate and add if matches
                    var entity = _engine.get_entity_or_null(evt.entity_path);
                    if (entity != null && evaluate_predicate((!) entity)) {
                        to_add.add(doc_path);
                    }
                    break;
                    
                case Engine.EntityChangeType.MODIFIED:
                    // Re-evaluate and update membership
                    var entity = _engine.get_entity_or_null(evt.entity_path);
                    if (entity != null) {
                        bool should_include = evaluate_predicate((!) entity);
                        bool is_included = contains_document(doc_path);
                        
                        if (should_include && !is_included) {
                            to_add.add(doc_path);
                        } else if (!should_include && is_included) {
                            to_remove.add(doc_path);
                        }
                    }
                    break;
                    
                case Engine.EntityChangeType.DELETED:
                    to_remove.add(doc_path);
                    break;
            }
        }
        
        // Batch update the index
        try {
            batch_update_members(to_add, to_remove);
        } catch (Storage.StorageError e) {
            warning("Failed to batch update category: %s", e.message);
        }
    }
    
    /**
     * Batch updates the category membership.
     */
    private void batch_update_members(
        Invercargill.DataStructures.Vector<string> to_add,
        Invercargill.DataStructures.Vector<string> to_remove
    ) throws Storage.StorageError {
        var index_manager = get_index_manager();
        if (index_manager == null) {
            return;
        }
        
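        // Resolve these once instead of on every loop iteration
        var manager = (!) index_manager;
        var category_path = _path.to_string();
        
        // Add all new members
        foreach (var path in to_add) {
            manager.add_to_category(category_path, path);
        }
        
        // Remove members that were deleted or no longer match the predicate
        foreach (var path in to_remove) {
            manager.remove_from_category(category_path, path);
        }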
    }
    
    /**
     * Handles batched property changes.
     */
    public void on_batch_property_change(
        Core.Entity document,
        Invercargill.DataStructures.Dictionary<string, Engine.PropertyChange> changes
    ) {
        // Intentionally empty: a category only needs to re-evaluate its
        // predicate, which on_batch_change already does for MODIFIED events.
    }
    
    // ... rest of the class ...
}

} // namespace Implexus.Entities
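
Note that batch_update_members above still makes one IndexManager call per path. As a minimal sketch of what a single write per batch could look like, the following applies both lists to an in-memory member set and counts one simulated write per non-empty batch. MemberSetSketch, apply_batch, and write_count are hypothetical names used only for illustration; they are not part of the Implexus API, and plain GLib containers stand in for the Invercargill ones so the sketch is self-contained.

```vala
// Hypothetical in-memory stand-in for a stored category member set.
// Not part of the Implexus API; names and behavior are assumptions.
public class MemberSetSketch : Object {
    private GLib.HashTable<string, bool> _members =
        new GLib.HashTable<string, bool>(str_hash, str_equal);

    // Number of simulated storage writes performed so far.
    public int write_count { get; private set; default = 0; }

    // Applies additions, then removals, in memory and records ONE write.
    // The order mirrors batch_update_members: a path present in both
    // lists ends up removed.
    public void apply_batch(GLib.GenericArray<string> to_add,
                            GLib.GenericArray<string> to_remove) {
        if (to_add.length == 0 && to_remove.length == 0) {
            return; // nothing changed, so no write is needed
        }
        for (uint i = 0; i < to_add.length; i++) {
            _members.insert(to_add[i], true);
        }
        for (uint i = 0; i < to_remove.length; i++) {
            _members.remove(to_remove[i]);
        }
        // A real implementation would serialize the member set here.
        write_count = write_count + 1;
    }

    public bool contains(string path) {
        return _members.contains(path);
    }

    public uint size() {
        return _members.size();
    }
}
```

With this shape, the number of writes grows with the number of committed batches rather than with the number of changed documents.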

API Changes

New Interfaces

| Type | Purpose |
|------|---------|
| BatchedHookHandler | Interface for hooks that support batch processing |
| HookEvent | Represents a single entity change event |
| HookBatch | Accumulates events during a transaction |

Modified Classes

| Class | Changes |
|-------|---------|
| HookManager | Added batch mode support |
| EmbeddedTransaction | Integrates with batch system |
| Category | Implements BatchedHookHandler |
| Catalogue | Implements BatchedHookHandler |
| Index | Implements BatchedHookHandler |

No Public API Changes

The batching is entirely internal. Applications continue to use the existing transaction pattern:

var tx = engine.begin_transaction();
try {
    // ... operations ...
    tx.commit();
} catch (Error e) {
    tx.rollback();
}

Thread-Safety Considerations

Single-Threaded Design

The current design assumes single-threaded access:

  1. Transaction Isolation: Only one transaction per engine at a time
  2. Batch Ownership: Batch belongs to the current transaction
  3. Handler Registration: Handlers should be registered before transactions

Thread-Safety Recommendations

If multi-threading is needed in the future, HookManager could guard its batch state with a mutex:

public class HookManager : Object {
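    // Initialize explicitly; GLib only guarantees implicit initialization
    // for statically allocated mutexes.
    private GLib.Mutex _batch_mutex = GLib.Mutex();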
    
    public void begin_batch() {
        _batch_mutex.lock();
        try {
            // ... existing code ...
        } finally {
            _batch_mutex.unlock();
        }
    }
    
    // ... other methods with mutex protection ...
}

Event Accumulation Safety

HookBatch uses Invercargill data structures, which are not thread-safe. The design therefore relies on the following rules:

  • Events are only added from the transaction thread
  • Batch execution happens on the thread that calls commit
  • Batches are never accessed concurrently
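
One way to make the first rule checkable is to remember which thread created the batch and refuse events from any other thread. The sketch below assumes this approach; OwnedBatchSketch and try_add are hypothetical names, events are reduced to plain strings, and none of this is part of the design above.

```vala
// Hypothetical guard around event accumulation; not part of the design.
public class OwnedBatchSketch : Object {
    // Thread that created the batch (assumed to be the transaction thread).
    private unowned GLib.Thread<void*> _owner = GLib.Thread.self<void*>();

    // Events are simplified to strings for this sketch.
    private GLib.GenericArray<string> _events = new GLib.GenericArray<string>();

    public int count {
        get { return (int) _events.length; }
    }

    // Returns false, and records nothing, when called from another thread.
    public bool try_add(string event_description) {
        if (GLib.Thread.self<void*>() != _owner) {
            return false;
        }
        _events.add(event_description);
        return true;
    }
}
```

A check like this detects misuse but does not make the batch thread-safe; concurrent writers would still need the mutex approach shown earlier.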

Performance Implications

Expected Improvements

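| Scenario | Current | With Batching | Improvement |
|----------|---------|---------------|-------------|
| 1000 doc inserts | 1000 × N hook calls | 1 batch call | ~1000× |
| 1000 property sets | 1000 × M hook calls | 1 batch call | ~1000× |
| Mixed operations | O(n) hook calls | O(1) batch | ~n× |

The improvement column compares the number of hook invocations, not measured wall-clock time.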

Memory Overhead

During a transaction:

  • Events stored in memory until commit
  • Entity states tracked for consolidation
  • Property changes accumulated

For 10,000 operations:

  • ~10,000 HookEvent objects (~160 bytes each) = ~1.6 MB
  • Entity state dictionary: ~500 KB
  • Total overhead: ~2 MB (acceptable)
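
Because entity state is tracked per entity, the state dictionary grows with the number of distinct entities touched rather than with the number of operations. The following sketch shows one way such consolidation could work. The merge rules in it are assumptions made for illustration and may differ from the rules HookBatch actually applies; ChangeSketch, PendingSketch, and ConsolidatorSketch are hypothetical names.

```vala
// Hypothetical change kinds, mirroring CREATED / MODIFIED / DELETED.
public enum ChangeSketch { CREATED, MODIFIED, DELETED }

// Holds the single pending change kept for one entity path.
public class PendingSketch : Object {
    public ChangeSketch change;

    public PendingSketch(ChangeSketch change) {
        this.change = change;
    }
}

// Keeps at most one pending change per entity path.
public class ConsolidatorSketch : Object {
    private GLib.HashTable<string, PendingSketch> _state =
        new GLib.HashTable<string, PendingSketch>(str_hash, str_equal);

    public void record(string path, ChangeSketch change) {
        unowned PendingSketch? prior = _state.lookup(path);
        if (prior == null) {
            _state.insert(path, new PendingSketch(change));
            return;
        }
        if (prior.change == ChangeSketch.CREATED && change == ChangeSketch.DELETED) {
            // Assumed rule: created and deleted in one transaction nets to nothing.
            _state.remove(path);
        } else if (prior.change == ChangeSketch.CREATED) {
            // Assumed rule: later edits to a new entity still count as CREATED.
        } else if (prior.change == ChangeSketch.DELETED && change == ChangeSketch.CREATED) {
            // Assumed rule: delete followed by re-create is reported as MODIFIED.
            prior.change = ChangeSketch.MODIFIED;
        } else {
            // Otherwise the most recent change wins.
            prior.change = change;
        }
    }

    public uint pending_count() {
        return _state.size();
    }

    // True when a pending change exists for the path and has the expected kind.
    public bool is_pending(string path, ChangeSketch expected) {
        unowned PendingSketch? entry = _state.lookup(path);
        return entry != null && entry.change == expected;
    }
}
```

Under these assumed rules, creating a document and setting ten properties on it leaves a single CREATED entry, so handlers see one event for that document at commit.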

Trade-offs

| Aspect | Benefit | Cost |
|--------|---------|------|
| Latency | Lower per-operation | Higher at commit |
| Throughput | Much higher overall | Slightly higher memory |
| Complexity | Transparent to users | Internal complexity |
| Error Handling | Atomic batch | All-or-nothing |

Migration Path

Phase 1: Add Batch Infrastructure

  1. Add HookEvent, HookBatch classes
  2. Add BatchedHookHandler interface
  3. Update HookManager with batch mode
  4. No behavior changes yet

Phase 2: Integrate with Transactions

  1. Update EmbeddedTransaction to use batch mode
  2. Enable batch mode in begin_transaction()
  3. Execute batch in commit()
  4. Discard batch in rollback()
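
The lifecycle in this phase can be sketched with a toy manager: changes are queued while a batch is open, delivered in one handler call on commit, and dropped on rollback. This is a simplified model under stated assumptions, not the actual HookManager or EmbeddedTransaction code. Only begin_batch appears elsewhere in this document; CountingHandlerSketch, BatchingManagerSketch, notify_change, commit_batch, and discard_batch are hypothetical names, and events are reduced to path strings.

```vala
// Hypothetical handler that only counts how often it is invoked.
public class CountingHandlerSketch : Object {
    public int batch_calls = 0;
    public int events_seen = 0;

    public void on_batch(GLib.GenericArray<string> events) {
        batch_calls++;
        events_seen += (int) events.length;
    }
}

// Toy stand-in for a hook manager with a batch mode.
public class BatchingManagerSketch : Object {
    private GLib.GenericArray<string>? _batch = null;
    private CountingHandlerSketch _handler;

    public BatchingManagerSketch(CountingHandlerSketch handler) {
        _handler = handler;
    }

    // Called when a transaction begins.
    public void begin_batch() {
        _batch = new GLib.GenericArray<string>();
    }

    // Queues the change while a batch is open; otherwise delivers it at once.
    public void notify_change(string path) {
        if (_batch != null) {
            _batch.add(path);
            return;
        }
        var single = new GLib.GenericArray<string>();
        single.add(path);
        _handler.on_batch(single);
    }

    // Called from commit: delivers all queued changes in one handler call.
    public void commit_batch() {
        if (_batch == null) {
            return;
        }
        var pending = _batch;
        _batch = null;
        if (pending.length > 0) {
            _handler.on_batch(pending);
        }
    }

    // Called from rollback: drops queued changes without notifying anyone.
    public void discard_batch() {
        _batch = null;
    }
}
```

In this model, 1000 calls to notify_change inside a batch result in exactly one on_batch call at commit, and a rolled-back batch results in none.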

Phase 3: Update Indexed Entities

  1. Update Category to implement BatchedHookHandler
  2. Update Catalogue to implement BatchedHookHandler
  3. Update Index to implement BatchedHookHandler
  4. Optimize batch update methods in IndexManager

Phase 4: Testing and Optimization

  1. Add unit tests for batch processing
  2. Add performance benchmarks
  3. Optimize hot paths
  4. Profile and tune memory usage

Summary

This design introduces batched hook execution for transactions:

  1. Accumulate Events: Hook events are queued during transactions
  2. Consolidate Events: Multiple events for the same entity are merged
  3. Batch Execute: All events processed at commit time
  4. Optimized Updates: Indexed entities can batch their storage writes

The result is significantly improved performance for bulk operations while maintaining the same public API and transaction semantics.