This document describes the design for a SafePath API that provides a succinct, variadic constructor for creating URL-encoded entity paths in Implexus.
The existing EntityPath class provides:
Multiple constructors:
- EntityPath(string path_string) - parses a path string
- EntityPath.root() - creates the root path
- EntityPath.from_segments(Enumerable<string> segments) - creates from segment collection
- EntityPath.with_child(EntityPath parent, string name) - creates child path
Current escaping mechanism (tilde-based, similar to RFC 6901 JSON Pointer):
- ~ → ~7e
- / → ~2f
- \ → ~5c
- \0 → ~00
Limitations:
From EntityPathTest.vala, typical usage includes:
// Current verbose patterns
var root = new EntityPath.root();
var users = root.append_child("users");
var john = users.append_child("john");
// Or string-based
var path = new EntityPath("/users/john/profile");
EntityPath instances for seamless integration
namespace Implexus.Core {
/**
* SafePath provides a convenient factory for creating EntityPath instances
* with automatic URL encoding of path segments.
*
* Example usage:
* {{{
* var path = SafePath.path("users", "john doe", "profile", null);
* // Creates EntityPath for /users/john%20doe/profile
*
* var root = SafePath.path(null); // Root path
* var simple = SafePath.path("catalogue", null); // Single segment
* }}}
*/
public class SafePath : Object {
/**
* Characters that MUST be encoded in path segments.
* Based on RFC 3986 with additional safety characters.
*/
private const string RESERVED_CHARS = "!*'();:@&=+$,/?#[]%\"\\<>^`{|}~";
/**
* Creates an EntityPath from variadic segments, URL-encoding each segment.
*
* @param first_segment The first path segment (required to start variadic args)
* @param ... Additional segments, terminated by null
* @return A new EntityPath with encoded segments
*
* Example:
* {{{
* var path = SafePath.path("users", "john", "profile", null);
* // Result: /users/john/profile
*
* var encoded = SafePath.path("data", "2024/01", "file name", null);
* // Result: /data/2024%2F01/file%20name
* }}}
*/
public static EntityPath path(string? first_segment, ...) {
var segments = new Invercargill.DataStructures.Vector<string>();
if (first_segment == null) {
return new EntityPath.root();
}
// Add first segment
segments.add(encode_segment(first_segment));
// Process variadic arguments
va_list args = va_list();
while (true) {
string? segment = args.arg();
if (segment == null) {
break;
}
segments.add(encode_segment(segment));
}
return new EntityPath.from_segments(segments.as_enumerable());
}
/**
* Creates an EntityPath from an array of segments.
* Alternative API for when array-based construction is preferred.
*
* @param segments Array of path segments
* @return A new EntityPath with encoded segments
*/
public static EntityPath from_array(string[] segments) {
if (segments.length == 0) {
return new EntityPath.root();
}
var encoded_segments = new Invercargill.DataStructures.Vector<string>();
foreach (var segment in segments) {
encoded_segments.add(encode_segment(segment));
}
return new EntityPath.from_segments(encoded_segments.as_enumerable());
}
/**
* URL-encodes a path segment according to RFC 3986.
*
* Encodes:
* - All reserved URI characters
* - Space (as %20, not +)
* - Non-ASCII characters (as percent-encoded UTF-8)
* - Control characters
*
* @param segment The raw segment to encode
* @return The URL-encoded segment
*/
public static string encode_segment(string segment) {
if (segment.length == 0) {
return "";
}
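// GLib.Uri.escape_string() escapes every character except the RFC 3986
// unreserved set (A-Z a-z 0-9 - . _ ~) and any characters listed in its
// second argument, reserved_chars_allowed. RESERVED_CHARS is therefore
// deliberately NOT passed here: passing it would leave those characters
// unescaped. allow_utf8 = false percent-encodes non-ASCII bytes as well.
// Space is encoded as %20; '~' is unreserved and is left as-is.
return Uri.escape_string(segment, null, false);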
}
/**
* Decodes a URL-encoded path segment.
*
* @param encoded The encoded segment
* @return The decoded segment
* @throws EntityError.INVALID_PATH if the segment contains invalid percent-encoding
*/
public static string decode_segment(string encoded) throws EntityError {
string? decoded = Uri.unescape_string(encoded);
if (decoded == null) {
throw new EntityError.INVALID_PATH(
"Invalid percent-encoding in path segment: %s".printf(encoded)
);
}
return decoded;
}
}
} // namespace Implexus.Core
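The null-terminated variadic pattern used by path() can be tried in isolation. The following is a minimal sketch, not part of the Implexus code base: collect_segments is a hypothetical helper that gathers its arguments into a plain string array instead of building an EntityPath, so it runs without the Implexus or Invercargill libraries.

```vala
// Hypothetical helper: collects variadic string arguments up to the first null.
string[] collect_segments (string? first, ...) {
    string[] result = {};
    if (first == null) {
        return result;              // no segments: corresponds to the root path
    }
    result += first;

    var args = va_list ();
    while (true) {
        string? next = args.arg ();
        if (next == null) {
            break;                  // first null terminates the list
        }
        result += next;
    }
    return result;
}

void main () {
    // Anything after the first null is ignored.
    var parts = collect_segments ("a", "b", null, "c", null);
    assert (parts.length == 2);
    print ("/%s\n", string.joinv ("/", parts));   // prints "/a/b"

    // A null first argument yields an empty list.
    assert (collect_segments (null).length == 0);
}
```

Because a null anywhere in the argument list silently truncates the path, callers that might hold nullable strings may prefer the array-based from_array() variant.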
An alternative design extends EntityPath directly with static factory methods:
// Add to EntityPath class
public partial class EntityPath {
/**
* Creates an EntityPath from variadic segments with automatic URL encoding.
* Terminate with null.
*
* Example:
* {{{
* var path = EntityPath.from_parts("users", "john doe", null);
* }}}
*/
public static EntityPath from_parts(string? first_segment, ...) {
var segments = new Invercargill.DataStructures.Vector<string>();
if (first_segment == null) {
return new EntityPath.root();
}
segments.add(SafePath.encode_segment(first_segment));
va_list args = va_list();
while (true) {
string? segment = args.arg();
if (segment == null) break;
segments.add(SafePath.encode_segment(segment));
}
return new EntityPath.from_segments(segments.as_enumerable());
}
}
Following RFC 3986 with additional safety considerations:
| Category | Characters | Encoding Example |
|---|---|---|
| Space | (space) | %20 |
| Reserved | ! * ' ( ) ; : @ & = + $ , / ? # [ ] | %21, %2A, etc. |
| Percent | % | %25 |
| Control | \x00-\x1F | %00-%1F |
| Non-ASCII | Unicode chars | UTF-8 percent-encoded |
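Use these GLib methods for encoding/decoding:
// Encoding: escapes everything except unreserved characters (A-Z a-z 0-9 - . _ ~)
string encoded = Uri.escape_string(segment, null, false);
// Decoding: returns null if the input contains invalid percent-encoding
string? decoded = Uri.unescape_string(encoded);
Note: The second parameter of Uri.escape_string() is reserved_chars_allowed, the set of reserved characters to leave unescaped, and the third is allow_utf8. Passing null and false therefore encodes every reserved character and every non-ASCII byte. Passing RESERVED_CHARS as the second argument would have the opposite effect and leave those characters unescaped.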
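The effect of the two optional arguments is easiest to see side by side. The sketch below uses only GLib calls; encode_strict is a hypothetical wrapper name chosen for illustration, standing in for what encode_segment() is meant to do.

```vala
// Hypothetical wrapper: strict per-segment encoding.
string encode_strict (string segment) {
    // null  -> no reserved characters are allowed through unescaped
    // false -> non-ASCII (UTF-8) bytes are percent-encoded too
    return Uri.escape_string (segment, null, false);
}

void main () {
    string raw = "2024/01 café";

    print ("%s\n", encode_strict (raw));                  // 2024%2F01%20caf%C3%A9
    // Listing "/" as allowed leaves it unescaped.
    print ("%s\n", Uri.escape_string (raw, "/", false));  // 2024/01%20caf%C3%A9
    // allow_utf8 = true leaves valid UTF-8 sequences as they are.
    print ("%s\n", Uri.escape_string (raw, null, true));  // 2024%2F01%20café

    // Unreserved characters, including '~', are never escaped.
    assert (encode_strict ("a-b.c_d~e") == "a-b.c_d~e");

    // Decoding reverses the encoding; malformed or NUL escapes yield null.
    assert (Uri.unescape_string (encode_strict (raw)) == raw);
    assert (Uri.unescape_string ("bad%zz") == null);
    assert (Uri.unescape_string ("a%00b") == null);
}
```

Note that a tilde passes through unchanged, which matters if the resulting segments are later handed to code that treats ~ as an escape character.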
The current EntityPath uses tilde escaping (~2f for /). SafePath uses standard URL encoding (%2F for /) because:
- %XX format is immediately recognizable
// Empty string segment - allowed but produces empty encoded result
var path = SafePath.path("users", "", "profile", null);
// Result: /users//profile (double slash normalized by EntityPath parsing)
// Recommendation: Validate segments before calling SafePath
// null terminates the variadic list
var path = SafePath.path("a", "b", null, "c", null);
// Result: /a/b (stops at first null)
// Slashes in segment names are encoded
var path = SafePath.path("data", "2024/01/15", "log", null);
// Result: /data/2024%2F01%2F15/log
// Percent signs are encoded as %25, so a literal % never forms an accidental escape sequence
var path = SafePath.path("query", "100%", null);
// Result: /query/100%25
// Unicode characters are UTF-8 percent-encoded
var path = SafePath.path("users", "日本語", null);
// Result: /users/%E6%97%A5%E6%9C%AC%E8%AA%9E
flowchart LR
A[Raw segments] --> B[SafePath.path]
B --> C[URL encode each segment]
C --> D[EntityPath.from_segments]
D --> E[EntityPath instance]
E --> F[to_string: escaped display]
E --> G[to_key: raw storage key]
One option is for EntityPath to store raw (unencoded) segments internally and to apply encoding only when the path is rendered with to_string():
// Input: "john doe" (contains space)
var path = SafePath.path("users", "john doe", null);
// Internal storage: segments = ["users", "john doe"]
// to_string(): "/users/john%20doe" (URL encoded for display)
// to_key(): "users/john doe" (raw for storage keys)
Important: This design stores raw segments, not encoded ones. This matches the current EntityPath behavior where escaping is only applied in to_string().
For true safety, we should store the encoded segments:
public static EntityPath path(string? first_segment, ...) {
var segments = new Invercargill.DataStructures.Vector<string>();
if (first_segment == null) {
return new EntityPath.root();
}
// Store ENCODED segments
segments.add(encode_segment(first_segment));
va_list args = va_list();
while (true) {
string? segment = args.arg();
if (segment == null) break;
segments.add(encode_segment(segment));
}
return new EntityPath.from_segments(segments.as_enumerable());
}
With this approach:
- to_string(): /users/john%20doe (segments already encoded)
- to_key(): users/john%20doe (encoded in storage too)
This ensures special characters never appear in storage keys.
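To illustrate why encoded storage keys are safer, the following sketch compares a key built from encoded segments with one built from raw segments. It assumes keys are formed by joining segments with "/", as the to_key() examples suggest; build_key is a hypothetical helper and does not reflect the actual EntityPath implementation.

```vala
// Hypothetical helper: joins percent-encoded segments into a storage key.
string build_key (string[] raw_segments) {
    string[] encoded = {};
    foreach (var segment in raw_segments) {
        encoded += Uri.escape_string (segment, null, false);
    }
    return string.joinv ("/", encoded);
}

void main () {
    string[] raw = { "logs", "2024/01/15" };

    // Encoded key: the separator appears only between segments.
    string key = build_key (raw);
    print ("%s\n", key);                            // logs/2024%2F01%2F15
    string[] parts = key.split ("/");
    assert (parts.length == 2);
    assert (Uri.unescape_string (parts[1]) == "2024/01/15");

    // Raw key: the slashes inside the second segment look like separators.
    string raw_key = string.joinv ("/", raw);
    assert (raw_key.split ("/").length == 4);
}
```

With encoded segments the key can always be split back into the original two segments, whereas the raw key is indistinguishable from a four-segment path.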
// Simple path
var catalogue = SafePath.path("catalogue", null);
// EntityPath: /catalogue
// Nested path
var document = SafePath.path("catalogue", "category", "document", null);
// EntityPath: /catalogue/category/document
// Spaces
var user_path = SafePath.path("users", "John Smith", null);
// EntityPath: /users/John%20Smith
// Slashes in names
var date_path = SafePath.path("logs", "2024/01/15", null);
// EntityPath: /logs/2024%2F01%2F15
// Query strings (common in document IDs)
var doc = SafePath.path("docs", "id=123&type=pdf", null);
// EntityPath: /docs/id%3D123%26type%3Dpdf
// When segments are already in an array
string[] parts = { "users", user_id, "settings" };
var settings_path = SafePath.from_array(parts);
// Creating documents with safe paths
public async Document create_document(Engine engine, string catalogue,
string category, string doc_name) throws Error {
var path = SafePath.path(catalogue, category, doc_name, null);
return yield engine.create_document(path);
}
// Building index paths
var index_path = SafePath.path("catalogue", "products", "indexes", "price", null);
When implementing this design:
- SafePath class in src/Core/SafePath.vala
- path() variadic method with null terminator
- from_array() array-based method
- encode_segment() using GLib.Uri.escape_string
- decode_segment() using GLib.Uri.unescape_string
- tests/Core/SafePathTest.vala:
- src/meson.build to include new file
Segment Validation: Should SafePath reject empty segments, or pass them through?
Encoding Storage: Should segments be stored encoded or raw?
Error on Invalid Input: What should happen with null bytes in segments?
- %00 (already handled by URL encoding)
API Style: Static factory class vs. EntityPath extension method?
- SafePath class, add EntityPath.from_parts() as convenience alias
The SafePath API provides:
- SafePath.path("a", "b", "c", null)
- EntityPath instances
This design enables safer, more readable path construction while maintaining full compatibility with the existing EntityPath system.