# Phase 4: Projections and Joins ## Overview This phase introduces **Projections** - a new concept for read-only, composable query shapes that can join multiple entities and nested projections together. ### Key Concepts | Concept | Description | |---------|-------------| | **Entity** | Read/write, table-mapped, full CRUD operations | | **Projection** | Read-only, composable, join-aware query shape | ### Important: Invercargill Expressions All expression parameters throughout this API use **Invercargill.Expressions** syntax, NOT raw SQL. These are expression strings that the system parses, analyzes, and translates to SQL. This allows: - Type-safe analysis of expression structure - Variable scoping and translation - Aggregate detection for WHERE/HAVING splitting - Cross-dialect compatibility ```vala // ✅ Correct: Invercargill expression syntax .select("user_id", "r.id", ...) .where("user_id > 100 && user_name contains 'admin'") // ❌ Incorrect: Raw SQL (not supported) .select("user_id", "users.id AS user_id", ...) .where("user_id > 100 AND user_name LIKE '%admin%'") ``` ## API Design ### Projection Registration ```vala // Register a projection with the session session.register_projection(p => p .source("r") .join("o", "r.id == o.user_id") .group_by("r.id") .select("user_id", "r.id", (x, v) => x.user_id = v) .select("user_name", "r.name", (x, v) => x.user_name = v) .select("order_count", "COUNT(o.id)", (x, v) => x.order_count = v) .select("total_spent", "SUM(o.total)", (x, v) => x.total_spent = v) ); ``` ### Projection Query ```vala // Query projections using the same API as entities var results = session.query() .where("user_id > 100") .or_where("order_count >= 5") .order_by_desc("total_spent") .limit(10) .materialise_async(); ``` ## Builder Methods ### source\(string variable_name) Defines the primary entity source and declares a variable for use in expressions. ```vala .source("r") // r refers to User entity ``` ### join\(string variable_name, string join_expression) Joins another entity with a condition expression. ```vala .join("o", "r.id == o.user_id") ``` **Multiple joins to same entity type are allowed:** ```vala .join("sender", "t.sender_id == sender.id") .join("receiver", "t.receiver_id == receiver.id") ``` ### select\(string friendly_name, string expression, PropertySetter setter) Selects a scalar value with a friendly name for use in where/order_by expressions. | Parameter | Description | |-----------|-------------| | `friendly_name` | Name used in `where()` and `order_by()` expressions | | `expression` | **Invercargill expression** using declared variables (e.g., `"r.id"`, `"COUNT(o.id)"`) | | `setter` | Function to set the value on the result object | ```vala .select("user_id", "r.id", (x, v) => x.user_id = v) .select("user_name", "r.name", (x, v) => x.user_name = v) .select("order_count", "COUNT(o.id)", (x, v) => x.order_count = v) ``` ### select\(string friendly_name, string entry_point_expression, PropertySetter setter) Selects a nested projection for 1:1 relationships. | Parameter | Description | |-----------|-------------| | `friendly_name` | Name used for nested property navigation in `where()` expressions | | `entry_point_expression` | **Invercargill expression** selecting the variable that provides the source type for the nested projection (e.g., `"pr"` if `pr` is a Profile and the nested projection's source is Profile) | | `setter` | Function to set the nested projection on the result object | ```vala session.register_projection(p => p .source("p") .select("id", "p.id", (x, v) => x.id = v) .select("bio", "p.bio", (x, v) => x.bio = v) ); session.register_projection(p => p .source("u") .join("pr", "u.profile_id == pr.id") .select("id", "u.id", (x, v) => x.id = v) .select("profile", "pr", (x, v) => x.profile = v) // entry_point "pr" must be of type Profile (matches ProfileSummary's source) ); // Query with nested navigation var results = session.query() .where("profile.id == 5") // Navigate into nested projection .materialise_async(); ``` ### select_many\(string friendly_name, string entry_point_expression, PropertySetter setter) Selects a collection of nested projections for 1:N relationships. | Parameter | Description | |-----------|-------------| | `friendly_name` | Name used for nested collection property navigation | | `entry_point_expression` | **Invercargill expression** selecting the variable that provides the source type for the nested projection (e.g., `"o"` if `o` is an Order and the nested projection's source is Order) | | `setter` | Function to set the collection on the result object | ```vala // Child projection session.register_projection(p => p .source("r") .join("oi", "r.id == oi.order_id") .group_by("r.id") .select("order_id", "r.id", (x, v) => x.order_id = v) .select("total", "SUM(oi.price)", (x, v) => x.total = v) ); // Parent projection with collection session.register_projection(p => p .source("r") .join("o", "r.id == o.user_id") .select("user_id", "r.id", (x, v) => x.user_id = v) .select_many("orders", "o", (x, v) => x.orders = v) // entry_point "o" must be of type Order (matches OrderSummary's source) ); // Query var results = session.query() .where("user_id == 55") .materialise_async(); ``` ### group_by(params string[] expressions) Explicitly declares GROUP BY columns. Required for aggregate queries. ```vala .group_by("r.id", "r.name") ``` ## Variable Scoping and Translation ### Per-Projection Scope Variable names are scoped to each projection. The same variable name can map to different types in different projections. ```vala // OrderSummary: r → Order, o → OrderItem session.register_projection(p => p .source("r") .join("o", "r.id == o.order_id") .select("order_id", "r.id", ...) ); // UserWithOrders: r → User, o → Order session.register_projection(p => p .source("r") .join("o", "r.id == o.user_id") .select_many("o", ...) // Entry point type must match child's source ); ``` ### Type-Based Translation When nesting projections, translation happens by TYPE lookup, not variable name: ```mermaid flowchart LR subgraph Child - OrderSummary C_Map["Order → r
OrderItem → o"] end subgraph Parent - UserWithOrders P_Map["User → r
Order → o"] end subgraph Translation T["Child needs Order
Parent's Order is o
Translate: r → o"] end C_Map --> T P_Map --> T ``` ### Generated SQL Aliases SQL uses auto-generated aliases to prevent collisions. The format includes the type name for debugging clarity: **Format:** `val_N_TypeName` where N is incremental and TypeName is `typeof(T).name()` ```vala // UserWithOrders projection session.register_projection(p => p .source("r") .join("o", "r.id == o.user_id") .select("user_id", "r.id", ...) .select_many("o", ...) ); // Generated SQL (with debug-friendly aliases): // SELECT val_1_User.id, val_2_Order.id, SUM(val_4_OrderItem.price) // FROM users val_1_User // JOIN orders val_2_Order ON val_1_User.id = val_2_Order.user_id // JOIN orders val_3_Order ON val_2_Order.id = val_3_Order.id // JOIN order_items val_4_OrderItem ON val_3_Order.id = val_4_OrderItem.order_id // GROUP BY val_1_User.id, val_3_Order.id ``` This makes debugging queries much easier - you can immediately see which table each alias refers to. ## Auto-Join for Nested Projections When using `select_many`, the child projection's additional joins are automatically added: ```mermaid flowchart TB subgraph Parent PSrc["source User - r"] PJoin["join Order - o"] PSM["select_many OrderSummary - o"] end subgraph Child - OrderSummary CSrc["source Order - r"] CJoin["join OrderItem - o"] end subgraph Generated SQL V1["val1 = users"] V2["val2 = orders"] V3["val3 = orders - for child"] V4["val4 = order_items - auto-added"] end PSrc --> V1 PJoin --> V2 PSM --> V3 CJoin --> V4 ``` ### Entry Point Type Check The entry point variable type must match the child projection's source type: ```vala // ✅ Valid: "o" is Order, child's source is Order .select_many("o", ...) // ❌ Invalid: "r" is User, child's source is Order .select_many("r", ...) // Error: ProjectionEntryTypeMismatchError // "select_many requires source of type Order, but 'r' is type User" ``` ## Expression Translation ### Friendly Name Resolution Chain ```mermaid flowchart LR subgraph Input I["where - user_id == 55"] end subgraph Step 1 - Friendly Name Lookup S1["user_id → r.id"] end subgraph Step 2 - Variable Translation S2["r.id → val1.id"] end subgraph SQL SQL["WHERE val1.id = 55"] end I --> S1 --> S2 --> SQL ``` ### Nested Property Navigation ```vala // Navigate into nested projections .where("profile.id == 5") .where("profile.bio contains 'developer'") ``` Translation: `profile.id` → lookup `profile` → find nested `ProfileSummary` → lookup `id` in ProfileSummary → `pr.id` → `val2.id` ### WHERE vs HAVING Detection Aggregate expressions are automatically placed in HAVING clause: ```vala .where("user_id > 100") // Non-aggregate → WHERE .where("order_count >= 5") // Aggregate → HAVING .where("user_id > 100 && order_count >= 5") // Split: WHERE + HAVING ``` ### Mixed Aggregate/Non-Aggregate OR - Auto Subquery When OR combines aggregate and non-aggregate expressions, the query is automatically wrapped in a subquery: ```vala .where("user_id > 100 || order_count >= 5") // Generated SQL: // SELECT * FROM ( // SELECT u.id as user_id, COUNT(o.id) as order_count // FROM users u // LEFT JOIN orders o ON u.id = o.user_id // GROUP BY u.id // ) subq // WHERE user_id > 100 OR order_count >= 5 ``` ## Complete Example ```vala // Result types public class OrderSummary : Object { public int64 order_id { get; set; } public double total { get; set; } } public class UserWithOrderStats : Object { public int64 user_id { get; set; } public string user_name { get; set; } public int64 order_count { get; set; } public double total_spent { get; set; } public Enumerable orders { get; set; } } // Register projections session.register_projection(p => p .source("r") .join("oi", "r.id == oi.order_id") .group_by("r.id") .select("order_id", "r.id", (x, v) => x.order_id = v) .select("total", "SUM(oi.price)", (x, v) => x.total = v) ); session.register_projection(p => p .source("r") .join("o", "r.id == o.user_id") .group_by("r.id") .select("user_id", "r.id", (x, v) => x.user_id = v) .select("user_name", "r.name", (x, v) => x.user_name = v) .select("order_count", "COUNT(o.id)", (x, v) => x.order_count = v) .select("total_spent", "SUM(o.total)", (x, v) => x.total_spent = v) .select_many("orders", "o", (x, v) => x.orders = v) ); // Query var results = session.query() .where("user_id > 100") .or_where("order_count >= 5") .order_by_desc("total_spent") .limit(10) .materialise_async(); // Generated SQL: // SELECT * FROM ( // SELECT // val1.id as user_id, // val1.name as user_name, // COUNT(val2.id) as order_count, // SUM(val2.total) as total_spent, // val3.id as order_id, // SUM(val4.price) as total // FROM users val1 // JOIN orders val2 ON val1.id = val2.user_id // JOIN orders val3 ON val1.id = val3.user_id // JOIN order_items val4 ON val3.id = val4.order_id // GROUP BY val1.id, val3.id // ) subq // WHERE user_id > 100 OR order_count >= 5 // ORDER BY total_spent DESC // LIMIT 10 ``` ## Implementation Components ### New Classes | Class | Responsibility | |-------|---------------| | `ProjectionDefinition` | Stores source, joins, selections, group_by for a projection | | `ProjectionBuilder` | Fluent builder for registering projections | | `ProjectionQuery` | Query builder for projections (extends existing Query API) | | `ProjectionMapper` | Materializes projection results from database rows | | `VariableTranslator` | Translates variables to SQL aliases with type information | | `FriendlyNameResolver` | Resolves friendly names to SQL expressions | | `AggregateAnalyzer` | Detects aggregates in expressions for WHERE/HAVING split | | `ProjectionSqlBuilder` | Builds complete SELECT queries from ProjectionDefinition | ### Modified Classes | Class | Changes | |-------|---------| | `OrmSession` | Add `register_projection()`, modify `query()` to support projections | | `Query` | Support friendly name resolution, WHERE/HAVING split | | `ExpressionToSqlVisitor` | Support variable translation, aggregate detection | ### SqlDialect Interface Extensions The `SqlDialect` interface needs new methods for projection support: ```vala public interface SqlDialect : Object { // ... existing methods ... /** * Builds a SELECT statement with JOINs for a projection query. * * @param sources The source and join definitions * @param selections The column selections with expressions * @param group_by The GROUP BY expressions * @param where_clause The WHERE clause expression (already translated) * @param having_clause The HAVING clause expression (already translated) * @param order_by The ORDER BY expressions * @param limit Optional limit * @param offset Optional offset * @return The complete SQL SELECT statement */ public abstract string build_projection_select( Vector sources, Vector selections, Vector group_by, string? where_clause, string? having_clause, Vector order_by, int64? limit, int64? offset ); /** * Builds a subquery wrapper for mixed aggregate/non-aggregate OR conditions. * * @param inner_query The inner query with GROUP BY * @param combined_where The combined WHERE condition to apply to the subquery result * @return The wrapped SQL query */ public abstract string wrap_subquery_for_mixed_or( string inner_query, string combined_where ); /** * Generates a table alias with type information for debugging. * * @param index The alias index (1-based) * @param type_name The entity type name * @return The formatted alias (e.g., "val_1_User") */ public abstract string generate_table_alias(int index, string type_name); } ``` ### SQLite-Specific Implementation (SqliteDialect) ```vala public class SqliteDialect : Object, SqlDialect { // ... existing methods ... public string build_projection_select( Vector sources, Vector selections, Vector group_by, string? where_clause, string? having_clause, Vector order_by, int64? limit, int64? offset ) { var sql = new StringBuilder(); // SELECT clause sql.append("SELECT "); var select_parts = new Vector(); foreach (var sel in selections) { select_parts.add(@"$(sel.expression) AS $(sel.friendly_name)"); } sql.append(string.joinv(", ", select_parts.to_array())); // FROM clause var primary = sources[0]; sql.append(@" FROM $(primary.table_name) $(primary.alias)"); // JOIN clauses for (int i = 1; i < sources.size; i++) { var join = sources[i]; sql.append(@" JOIN $(join.table_name) $(join.alias) ON $(join.condition)"); } // WHERE clause if (where_clause != null) { sql.append(@" WHERE $(where_clause)"); } // GROUP BY clause if (group_by.size > 0) { sql.append(" GROUP BY "); sql.append(string.joinv(", ", group_by.to_array())); } // HAVING clause if (having_clause != null) { sql.append(@" HAVING $(having_clause)"); } // ORDER BY clause if (order_by.size > 0) { sql.append(" ORDER BY "); var order_parts = new Vector(); foreach (var ob in order_by) { order_parts.add(ob.descending ? @"$(ob.expression) DESC" : ob.expression); } sql.append(string.joinv(", ", order_parts.to_array())); } // LIMIT clause if (limit != null) { sql.append(@" LIMIT $(limit)"); } // OFFSET clause if (offset != null) { sql.append(@" OFFSET $(offset)"); } return sql.str; } public string wrap_subquery_for_mixed_or(string inner_query, string combined_where) { return @"SELECT * FROM ($inner_query) subq WHERE $combined_where"; } public string generate_table_alias(int index, string type_name) { return @"val_$(index)_$(type_name)"; } } ``` ### ExpressionToSqlVisitor Extensions The existing [`ExpressionToSqlVisitor`](src/expressions/expression-to-sql-visitor.vala) needs to support: 1. **Variable translation**: Replace user-defined variables with generated aliases 2. **Aggregate function detection**: Identify COUNT, SUM, AVG, MIN, MAX calls 3. **Friendly name resolution**: Look up friendly names in the projection context ```vala // New constructor parameter for projection context public ExpressionToSqlVisitor( EntityMapper? mapper, ProjectionDefinition? projection = null, // NEW Map? variable_aliases = null // NEW: user_var -> sql_alias ) { // ... } // New method to check if expression contains aggregates public bool contains_aggregate(Expression expr) { // Visit expression and return true if any aggregate function found } // New method to translate variable references protected override void visit_property_access(PropertyAccess expr) { if (variable_aliases != null && variable_aliases.has_key(expr.property_name)) { // Translate user variable to SQL alias builder.append(variable_aliases[expr.property_name]); } else { // Existing logic } } ``` ### Error Types | Error | When Thrown | |-------|-------------| | `ProjectionEntryTypeMismatchError` | Entry point type doesn't match child's source type | | `UndefinedFriendlyNameError` | Friendly name not found in projection | | `DuplicateVariableError` | Variable name already defined in projection | | `MissingGroupByError` | Aggregate query without group_by | | `ProjectionNotRegisteredError` | Query requested for unregistered projection type | ## Constraints 1. **Unique variable names** - Each source/join must have a unique variable name within a projection 2. **Entry point type match** - `select_many(entry_point, ...)` requires entry_point type to match child's source type 3. **Explicit GROUP BY** - Aggregate queries require explicit `.group_by()` declaration 4. **Client-side grouping** - 1:N collections are grouped client-side (no JSON aggregation) 5. **Invercargill Expressions only** - All expression parameters must use Invercargill.Expressions syntax, not raw SQL