phase-5-async-refactoring.md 15 KB

Phase 5: Async Support Refactoring - Implementation Plan


⚠️ CRITICAL IMPLEMENTATION NOTES - READ FIRST ⚠️

Code Style Requirements

  1. DO NOT use GLib.List, GLib.HashTable, or Libgee collections

    • Use ONLY Invercargill.DataStructures for all collections
    • Vector<T> instead of GLib.List<T> or arrays
    • Dictionary<TKey, TValue> instead of GLib.HashTable or GLib.Map
    • HashSet<T> for set operations
  2. Reference the Invercargill Library

    • Analyze ../Invercargill/src/lib/ for DataStructures and Expressions patterns
    • Key directories: DataStructures/, Expressions/, Mapping/
    • Use Invercargill.Expressions for all expression handling
  3. No Raw SQL in High-Level APIs

    • All expression parameters use Invercargill.Expressions syntax
    • Raw SQL is only generated internally by dialect implementations

Async Pattern Requirements

  1. Thread Offloading at Lowest Level

    • SQLite has no native async API - thread offloading is acceptable
    • Thread offloading should happen in interfaces (Command, Connection, Transaction)
    • Higher-level code should use yield to propagate async correctly
  2. No Blocking Async Methods

    • Async methods must NOT simply call their sync counterparts
    • Use yield to call async methods on dependencies
    • Maintain the async chain all the way down to the interface level

Executive Summary

This document provides a detailed implementation plan for Phase 5: Async Support Refactoring. The goal is to fix async method stubs throughout the project that currently block and call their sync counterparts, instead of properly using async/yield patterns.

Problem Analysis

Current State

The codebase has two categories of async implementations:

Category 1: Correctly Implemented (Thread Offloading at Interface Level)

These are implemented correctly - they spawn a thread and yield back to the main loop:

File Method Pattern
command.vala execute_query_async() Thread + Idle.add + yield
command.vala execute_non_query_async() Thread + Idle.add + yield
command.vala execute_scalar_async() Thread + Idle.add + yield
command.vala execute_batch_async() Thread + Idle.add + yield
connection.vala open_async() Thread + Idle.add + yield
connection.vala begin_transaction_async() Thread + Idle.add + yield
connection.vala execute_async() Thread + Idle.add + yield
transaction.vala commit_async() Thread + Idle.add + yield
transaction.vala rollback_async() Thread + Idle.add + yield
connection-provider.vala create_connection_async() Thread + Idle.add + yield

These are acceptable because SQLite has no native async API. Thread offloading at the interface level is the correct approach.

Category 2: Problematic Stubs (Need Refactoring)

These methods just call their sync counterparts, blocking the calling thread:

File Method Current Implementation
orm-session.vala:328 execute_query_async<T>() return execute_query<T>(query);
projection-query.vala:274 materialise_async() return materialise();
projection-query.vala:323 first_async() return first();

Category 3: Already Correctly Propagating Async

These methods correctly use yield to call async dependencies:

File Method Implementation
query.vala:175 materialise_async() yield _session.execute_query_async<T>(this)
query.vala:206 first_async() yield _session.execute_query_async<T>(limited_query)
connection-factory.vala:69 create_and_open_async() yield provider.create_connection_async(cs)

Architecture Flow

flowchart TB
    subgraph High-Level API
        PQ[ProjectionQuery.materialise_async]
        Q[Query.materialise_async]
        QF[Query.first_async]
    end
    
    subgraph Session Layer
        OS[OrmSession.execute_query_async T]
    end
    
    subgraph Interface Layer - Thread Boundary
        CMD[Command.execute_query_async]
        CONN[Connection.open_async]
        TX[Transaction.commit_async]
    end
    
    subgraph SQLite Implementation
        SC[SqliteCommand]
        SCON[SqliteConnection]
        STX[SqliteTransaction]
    end
    
    PQ --> OS
    Q --> OS
    QF --> OS
    OS --> CMD
    CMD --> SC
    CONN --> SCON
    TX --> STX

Implementation Plan

Part 1: Refactor OrmSession.execute_query_async

File: src/orm/orm-session.vala

Current Implementation:

internal async Invercargill.Enumerable<T> execute_query_async<T>(Query<T> query) throws SqlError {
    // For now, delegate to synchronous version
    // TODO: Implement true async execution when needed
    return execute_query<T>(query);
}

Required Changes:

The method needs to be refactored to use async/await properly. The key insight is that the sync version does three things:

  1. Builds the SQL
  2. Creates a command and executes it
  3. Materializes results

We need to use the async version of command execution:

internal async Invercargill.Enumerable<T> execute_query_async<T>(Query<T> query) throws SqlError {
    var mapper = get_mapper<T>();
    
    // Build SELECT SQL - same as sync version
    var sql = new StringBuilder();
    sql.append("SELECT * FROM ");
    sql.append(mapper.table_name);
    
    // Add WHERE clause if filter exists
    if (query.filter != null) {
        var visitor = new ExpressionToSqlVisitor(_dialect, mapper);
        query.filter.accept(visitor);
        sql.append(" WHERE ");
        sql.append(visitor.get_sql());
    }
    
    // Add ORDER BY
    if (query.orderings.length > 0) {
        sql.append(" ORDER BY ");
        bool first = true;
        foreach (var ordering in query.orderings) {
            if (!first) {
                sql.append(", ");
            }
            sql.append(ordering.expression);
            if (ordering.descending) {
                sql.append(" DESC");
            }
            first = false;
        }
    }
    
    // Add LIMIT and OFFSET
    if (query.limit_value != null) {
        sql.append_printf(" LIMIT %d", query.limit_value);
    }
    if (query.offset_value != null) {
        sql.append_printf(" OFFSET %d", query.offset_value);
    }
    
    var command = _connection.create_command(sql.str);
    
    // Add parameters from expression visitor if filter exists
    if (query.filter != null) {
        var visitor = new ExpressionToSqlVisitor(_dialect, mapper);
        query.filter.accept(visitor);
        var parameters = visitor.get_parameters();
        var param_names = visitor.get_parameter_names();
        
        // Bind parameters using their generated names
        var param_array = parameters.to_array();
        var name_array = param_names.to_array();
        for (int i = 0; i < param_array.length && i < name_array.length; i++) {
            command.with_parameter<Invercargill.Element>(name_array[i], param_array[i]);
        }
    }
    
    // Execute asynchronously - THIS IS THE KEY CHANGE
    var results = yield command.execute_query_async();
    
    // Materialize results
    var entities = new Vector<T>();
    
    foreach (var row in results) {
        try {
            var entity = mapper.materialise(row);
            entities.add(entity);
        } catch (Error e) {
            throw new SqlError.GENERAL_ERROR("Failed to materialize entity: %s".printf(e.message));
        }
    }
    
    return entities;
}

Part 2: Refactor ProjectionQuery.materialise_async

File: src/orm/projections/projection-query.vala

Current Implementation:

public async Vector<TProjection> materialise_async() throws SqlError {
    // For now, delegate to synchronous version
    // TODO: Implement true async execution when needed
    return materialise();
}

Required Changes:

The async version should use async command execution:

public async Vector<TProjection> materialise_async() throws SqlError {
    // Build the SQL using the projection SQL builder
    string? where_expr = get_combined_where();
    
    BuiltQuery built = _query_sql_builder.build_with_split(
        where_expr,
        _order_by_clauses,
        _limit_value,
        _offset_value
    );
    
    string sql = built.sql;
    
    // Execute through session's connection
    var connection = get_connection_from_session();
    var command = connection.create_command(sql);
    
    // Execute asynchronously - THIS IS THE KEY CHANGE
    var results = yield command.execute_query_async();
    
    // Materialize the results using the projection mapper
    var mapper = new ProjectionMapper<TProjection>(_query_definition);
    
    try {
        return mapper.map_all(results);
    } catch (ProjectionError e) {
        throw new SqlError.GENERAL_ERROR(
            @"Failed to materialize projection: $(e.message)"
        );
    }
}

Part 3: Refactor ProjectionQuery.first_async

File: src/orm/projections/projection-query.vala

Current Implementation:

public async TProjection? first_async() throws SqlError {
    // For now, delegate to synchronous version
    return first();
}

Required Changes:

The async version should use the async materialise:

public async TProjection? first_async() throws SqlError {
    // Create a copy with limit 1
    var limited_query = new ProjectionQuery<TProjection>(
        _query_session, 
        _query_definition,
        _query_sql_builder
    );
    
    // Copy state
    foreach (var clause in _where_clauses) {
        limited_query._where_clauses.add(clause);
    }
    foreach (var order in _order_by_clauses) {
        limited_query._order_by_clauses.add(order);
    }
    limited_query._use_or = _use_or;
    limited_query._offset_value = _offset_value;
    limited_query._limit_value = 1;  // Override limit
    
    // Use async materialise - THIS IS THE KEY CHANGE
    var results = yield limited_query.materialise_async();
    
    if (results.length > 0) {
        return results.get(0);
    }
    return null;
}

Files to Modify

File Lines Changes
src/orm/orm-session.vala 328-332 Replace stub with proper async implementation using yield command.execute_query_async()
src/orm/projections/projection-query.vala 274-278 Replace stub with proper async implementation using yield command.execute_query_async()
src/orm/projections/projection-query.vala 323-326 Replace stub with proper async implementation using yield limited_query.materialise_async()

Implementation Order

  1. Step 1: Refactor OrmSession.execute_query_async<T>()

    • This is the core method used by Query<T> async methods
    • Changes propagate to Query.materialise_async() and Query.first_async()
  2. Step 2: Refactor ProjectionQuery.materialise_async()

    • Uses async command execution
  3. Step 3: Refactor ProjectionQuery.first_async()

    • Uses async materialise from Step 2

Verification

Test Cases to Add/Update

  1. Test async query execution doesn't block

    • Verify that Query.materialise_async() actually yields
    • Verify that Query.first_async() actually yields
  2. Test async projection query execution doesn't block

    • Verify that ProjectionQuery.materialise_async() actually yields
    • Verify that ProjectionQuery.first_async() actually yields
  3. Test async results match sync results

    • Verify async methods return same data as sync counterparts

Existing Tests to Verify

Design Decisions

Why Thread Offloading at Interface Level is Acceptable

SQLite is a file-based database with no network communication. It has no native async API. The only way to achieve non-blocking behavior is to offload to a thread. Doing this at the interface level (Command, Connection, Transaction) is the correct approach because:

  1. Single point of threading logic - All threading is isolated to interfaces
  2. Future database support - When PostgreSQL/MySQL are added, they can implement native async at the interface level
  3. Consistent API - Higher-level code always uses yield, regardless of the underlying implementation

Why Higher-Level Code Should Use yield

Higher-level code (ORM, projections) should use yield to call async methods because:

  1. Proper async propagation - The async/await pattern requires the whole call chain to be async
  2. Main loop integration - Yielding allows the main loop to process other events
  3. Cancellation support - Proper async enables future cancellation token support

Architecture After Refactoring

sequenceDiagram
    participant User Code
    participant Query T
    participant OrmSession
    participant Command
    participant Thread Pool
    participant SQLite
    
    User Code->>Query T: materialise_async
    Query T->>OrmSession: execute_query_async T
    OrmSession->>OrmSession: Build SQL
    OrmSession->>Command: create_command
    OrmSession->>Command: execute_query_async
    Command->>Thread Pool: new Thread
    Thread Pool->>SQLite: execute_query
    SQLite-->>Thread Pool: results
    Thread Pool->>Command: Idle.add callback
    Command-->>OrmSession: yield returns
    OrmSession->>OrmSession: Materialize entities
    OrmSession-->>Query T: Enumerable T
    Query T-->>User Code: Enumerable T

Summary

This refactoring ensures that async methods throughout the project properly use the async/yield pattern instead of blocking on sync calls. The key changes are:

  1. OrmSession.execute_query_async<T>() - Use yield command.execute_query_async()
  2. ProjectionQuery.materialise_async() - Use yield command.execute_query_async()
  3. ProjectionQuery.first_async() - Use yield limited_query.materialise_async()

These changes maintain the thread-offloading at the lowest level (interfaces) while ensuring higher-level code properly propagates async behavior using yield.