
Phase 12: Unified select_many API

Overview

This phase implements a unified select_many<T> API that works with:

  • Scalar types: Collect column values into ImmutableBuffer<T>
  • Entity types: Materialize joined entities directly
  • Projection types: Materialize nested projections (existing behavior)

Problem Statement

The current select_many<T> only works with registered projection types. When users try to collect scalar values from a join, a null assertion fails:

// This fails with an assertion error
.select_many<string>("permissions", expr("p.permission"), (o, v) => o._permissions = v.to_immutable_buffer())

The error occurs because:

  1. The mapper tries to extract Enumerable<string> from a scalar database value
  2. The extraction fails, returning null
  3. Calling .to_immutable_buffer() on null triggers the assertion

Design Decisions

1. Type Detection Strategy

Decision: Automatic detection from TypeProvider

When select_many<T> is called, the item mode is determined from T by checking, in this order:

  1. Check if T is registered as an entity in TypeProvider → Entity mode
  2. Check if T is registered as a projection in TypeProvider → Projection mode
  3. Otherwise treat as scalar → Scalar mode

    flowchart TD
    A[select_many T called] --> B{T registered as entity?}
    B -->|Yes| C[Entity Mode]
    B -->|No| D{T registered as projection?}
    D -->|Yes| E[Projection Mode]
    D -->|No| F[Scalar Mode]
        
    C --> G[Materialize using EntityMapper]
    E --> H[Materialize using ProjectionMapper]
    F --> I[Extract scalar values directly]
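
The detection order can be modelled in a few lines. This is only a sketch: TypeRegistry is a hypothetical stand-in for TypeProvider (whose real API is not shown in this document), and the two empty classes merely borrow names from the examples so the snippet compiles on its own.

```vala
public enum CollectionItemMode {
    SCALAR,
    ENTITY,
    PROJECTION
}

// Empty stand-ins for a registered entity and a registered projection.
public class UserPermissionEntity : Object {}
public class OrderSummary : Object {}

// Hypothetical registry; the real TypeProvider may look quite different.
public class TypeRegistry {
    private Type[] entity_types = {};
    private Type[] projection_types = {};

    public void register_entity (Type t) {
        entity_types += t;
    }

    public void register_projection (Type t) {
        projection_types += t;
    }

    // Entities are checked first, then projections; anything else is a scalar.
    public CollectionItemMode detect (Type t) {
        foreach (Type e in entity_types) {
            if (e == t) {
                return CollectionItemMode.ENTITY;
            }
        }
        foreach (Type p in projection_types) {
            if (p == t) {
                return CollectionItemMode.PROJECTION;
            }
        }
        return CollectionItemMode.SCALAR;
    }

    // Generic convenience wrapper, mirroring how select_many<T> receives T.
    public CollectionItemMode detect_for<T> () {
        return detect (typeof (T));
    }
}
```

With this ordering, a type registered both as an entity and as a projection resolves to Entity mode, because the entity check runs first.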
    

2. Expression Semantics

Decision: Allow both variable references and column expressions

  • Variable reference (expr("p")): Used for entities and projections
  • Column expression (expr("p.permission")): Used for scalars

The expression type is validated against the detected mode:

  • Entity/Projection mode: Expression must be a variable reference
  • Scalar mode: Expression must be a column reference
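
As a rough illustration of this validation rule, the sketch below classifies an expression by its text alone. That is an assumption made for brevity: the real check would presumably inspect the parsed Expression, and the names ExpressionShape, classify and shape_allowed are hypothetical.

```vala
public enum ExpressionShape {
    VARIABLE,   // e.g. "p"
    COLUMN,     // e.g. "p.permission"
    OTHER
}

// True for ASCII identifiers such as "p" or "user_id".
bool is_identifier (string s) {
    if (s.length == 0) {
        return false;
    }
    for (int i = 0; i < s.length; i++) {
        char c = s[i];
        bool ok = c == '_' || c.isalpha () || (i > 0 && c.isdigit ());
        if (!ok) {
            return false;
        }
    }
    return true;
}

// Text-based stand-in for inspecting a parsed expression tree.
ExpressionShape classify (string text) {
    string[] parts = text.strip ().split (".");
    if (parts.length == 1 && is_identifier (parts[0])) {
        return ExpressionShape.VARIABLE;
    }
    if (parts.length == 2 && is_identifier (parts[0]) && is_identifier (parts[1])) {
        return ExpressionShape.COLUMN;
    }
    return ExpressionShape.OTHER;
}

// Scalar mode needs a column reference; entity and projection modes
// need a variable reference.
bool shape_allowed (ExpressionShape shape, bool scalar_mode) {
    if (scalar_mode) {
        return shape == ExpressionShape.COLUMN;
    }
    return shape == ExpressionShape.VARIABLE;
}
```

Under these assumptions, classify ("p.permission") is accepted only in Scalar mode and classify ("p") only in Entity or Projection mode.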

3. Collection Type

Decision: Use ImmutableBuffer<T> instead of Enumerable<T>

Rationale:

  • ImmutableBuffer<T> is more appropriate for materialized collections
  • Better represents the actual data structure returned
  • More efficient for iteration and access

Setter signature changes from:

PropertySetter<TProjection, Enumerable<TItem>>

To:

PropertySetter<TProjection, ImmutableBuffer<TItem>>

4. Grouping Key Inference

Decision: Analyze join expression to detect parent vs child references

For a join like:

.join<UserPermissionEntity>("p", expr("p.user_id == u.id"))

The analyzer:

  1. Parses both sides of the equality expression
  2. Identifies which side references the parent (source or earlier join)
  3. Identifies which side references the new join variable
  4. Extracts the parent-side expression as the grouping key

    flowchart LR
    A["p.user_id == u.id"] --> B{Which side is parent?}
    B --> C["p.user_id references p - child"]
    B --> D["u.id references u - parent"]
    D --> E[Grouping key: u.id]
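
The inference steps above can be sketched on the textual form of a simple equality join. This is a deliberately simplified, hypothetical helper: the planned JoinConditionAnalyzer works on Expression objects rather than strings, and this version only understands a single "==" comparison between two column references.

```vala
// Returns the side of the equality that references the parent variable,
// or null when the condition does not have the expected
// "<child>.<col> == <parent>.<col>" shape (in either order).
string? extract_parent_key (string join_condition,
                            string parent_variable,
                            string child_variable) {
    string[] sides = join_condition.split ("==");
    if (sides.length != 2) {
        return null;
    }

    string left = sides[0].strip ();
    string right = sides[1].strip ();
    string parent_prefix = parent_variable + ".";
    string child_prefix = child_variable + ".";

    if (left.has_prefix (parent_prefix) && right.has_prefix (child_prefix)) {
        return left;
    }
    if (right.has_prefix (parent_prefix) && left.has_prefix (child_prefix)) {
        return right;
    }
    return null;
}
```

For the join shown earlier, extract_parent_key ("p.user_id == u.id", "u", "p") yields "u.id", and the result is the same if the two sides are swapped.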
    

5. Empty Collections

Decision: Return empty ImmutableBuffer<T> when no child rows exist

The setter always receives a non-null collection:

  • With child rows: ImmutableBuffer<T> with values
  • Without child rows: Empty ImmutableBuffer<T>

API Examples

Scalar Collection

public class UserProjection : Object {
    public int64 id { get; set; }
    public string username { get; set; }
    public ImmutableBuffer<string> permissions { get; set; }
}

session.register_projection<UserProjection>(p => p
    .source<UserEntity>("u")
    .select<int64?>("id", expr("u.id"), (o, v) => o.id = v)
    .select<string>("username", expr("u.username"), (o, v) => o.username = v)
    .join<UserPermissionEntity>("p", expr("p.user_id == u.id"))
    .select_many<string>("permissions", expr("p.permission"), (o, v) => o.permissions = v)
);

Entity Collection

public class UserWithPermissions : Object {
    public int64 id { get; set; }
    public ImmutableBuffer<UserPermissionEntity> permissions { get; set; }
}

session.register_projection<UserWithPermissions>(p => p
    .source<UserEntity>("u")
    .select<int64?>("id", expr("u.id"), (o, v) => o.id = v)
    .join<UserPermissionEntity>("p", expr("p.user_id == u.id"))
    .select_many<UserPermissionEntity>("permissions", expr("p"), (o, v) => o.permissions = v)
);

Projection Collection (Existing)

public class UserWithOrderSummaries : Object {
    public int64 id { get; set; }
    public ImmutableBuffer<OrderSummary> orders { get; set; }
}

// OrderSummary must be registered as a projection
session.register_projection<UserWithOrderSummaries>(p => p
    .source<UserEntity>("u")
    .select<int64?>("id", expr("u.id"), (o, v) => o.id = v)
    .join<OrderEntity>("o", expr("o.user_id == u.id"))
    .select_many<OrderSummary>("orders", expr("o"), (o, v) => o.orders = v)
);

Implementation Plan

Step 1: Add CollectionItemMode Enum

Create an enum to represent the three modes:

public enum CollectionItemMode {
    SCALAR,      // Simple values like string, int64
    ENTITY,      // Registered entity types
    PROJECTION   // Registered projection types
}

Step 2: Create Unified CollectionSelection Class

Replace CollectionProjectionSelection with a unified CollectionSelection that handles all three modes:

public class CollectionSelection<TProjection, TItem> : SelectionDefinition {
    // For SCALAR mode: column expression like "p.permission"
    // For ENTITY/PROJECTION mode: variable reference like "p"
    public Expression entry_point_expression { get; private set; }
    public PropertySetter<TProjection, ImmutableBuffer<TItem>> setter { get; private set; }
    public CollectionItemMode item_mode { get; private set; }
}

Step 3: Update ProjectionBuilder.select_many

Modify select_many<TItem> to:

  1. Detect the item mode using TypeProvider
  2. Validate the expression matches the mode
  3. Create the appropriate CollectionSelection

Step 4: Add JoinConditionAnalyzer

Create a utility class to extract the grouping key from join conditions:

public class JoinConditionAnalyzer {
    public Expression? extract_parent_key_expression(
        Expression join_condition,
        string parent_variable,
        string child_variable
    );
}

Step 5: Update ProjectionMapper

Modify map_all() to:

  1. Detect collection selections
  2. Group rows by the inferred parent key
  3. For each group, collect child values based on item mode:
    • SCALAR: Extract scalar values directly
    • ENTITY: Materialize using EntityMapper
    • PROJECTION: Materialize using ProjectionMapper
  4. Call setters with ImmutableBuffer<TItem>
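
A minimal sketch of the grouping step for SCALAR mode follows. It makes several assumptions that the plan does not state: rows arrive as simple (parent key, child value) pairs, a parent without children still appears once with a null child value (for example from a LEFT JOIN), and GLib's GenericArray stands in for ImmutableBuffer. JoinedRow, ScalarGroup and group_scalars are hypothetical names.

```vala
public class JoinedRow {
    public int64 parent_key;
    public string? child_value;   // null when the parent has no child row

    public JoinedRow (int64 parent_key, string? child_value) {
        this.parent_key = parent_key;
        this.child_value = child_value;
    }
}

public class ScalarGroup {
    public int64 parent_key;
    // Stand-in for ImmutableBuffer<string>; created empty, never null.
    public GenericArray<string> values = new GenericArray<string> ();

    public ScalarGroup (int64 parent_key) {
        this.parent_key = parent_key;
    }
}

// Groups rows by parent key, keeping the order in which parents first appear.
// A linear search keeps the sketch short; a real mapper would likely use a map.
GenericArray<ScalarGroup> group_scalars (JoinedRow[] rows) {
    var groups = new GenericArray<ScalarGroup> ();
    foreach (JoinedRow row in rows) {
        ScalarGroup? group = null;
        for (int i = 0; i < groups.length; i++) {
            if (groups[i].parent_key == row.parent_key) {
                group = groups[i];
                break;
            }
        }
        if (group == null) {
            group = new ScalarGroup (row.parent_key);
            groups.add (group);
        }
        // Skip the null placeholder so childless parents get an empty collection.
        if (row.child_value != null) {
            group.values.add (row.child_value);
        }
    }
    return groups;
}
```

Each group's values would then be handed to the setter. Because the collection is created together with the group, a parent without child rows receives an empty collection rather than null.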

Step 6: Add Tests

Add comprehensive tests for all three modes:

  • test_select_many_scalar_strings()
  • test_select_many_scalar_ints()
  • test_select_many_entities()
  • test_select_many_projections()
  • test_select_many_empty_collection()
  • test_select_many_multiple_joins()

Files to Modify

File                                              Changes
src/orm/projections/selection-types.vala          Add CollectionItemMode enum, update CollectionSelection
src/orm/projections/projection-builder.vala       Update select_many for type detection
src/orm/projections/projection-mapper.vala        Implement grouping and materialization logic
src/orm/projections/projection-definition.vala    Add grouping key storage
src/orm/projections/join-condition-analyzer.vala  New file for join analysis
src/tests/projection-test.vala                    Add tests for all modes

Migration Notes

Breaking Changes

The setter signature changes from Enumerable<TItem> to ImmutableBuffer<TItem>:

// Before
.select_many<OrderSummary>("orders", expr("o"), (o, v) => o.orders = v)

// After - same syntax, but v is now ImmutableBuffer
.select_many<OrderSummary>("orders", expr("o"), (o, v) => o.orders = v)

The call syntax is unchanged, so for most users the change is source-compatible, provided the property being assigned accepts an ImmutableBuffer<TItem>.

Compatibility

Existing projections that use select_many with registered projection types will continue to work. The only change is the collection type passed to the setter: ImmutableBuffer<TItem> instead of Enumerable<TItem>.

Open Questions

  1. Multiple collections per projection: Should we support multiple select_many calls on the same projection? This would require more complex SQL generation (multiple joins).

  2. Nested collections: Should we support select_many inside another select_many? This would require recursive grouping.

  3. Custom grouping: Should we allow overriding the inferred grouping key with an explicit expression?

Success Criteria

  • select_many<string> works for scalar string collections
  • select_many<int64> works for scalar integer collections
  • select_many<TEntity> works for entity collections
  • select_many<TProjection> works for projection collections (existing)
  • Empty collections return empty ImmutableBuffer, not null
  • Grouping key is correctly inferred from join conditions
  • All tests pass