
The Setup: Week 2’s Hidden Debt

After building seven CRM domain models with a custom DynamoDB macro and full API layer, I thought the foundation was solid. Initial reality check: Event logs showed something wrong.
[WARN] Capsule isolation violation detected
Entity: RoleAssignment
Event: RoleAssignmentCreated
Issue: Missing capsule_id parameter
Impact: Cross-tenant data leak risk
Wait. The verifier approved this code. How did a capsule isolation bug make it to main? The answer changed how I think about AI-assisted verification.

The Journey: Following the Event Trail

The Event That Revealed Everything

I opened a fresh Evaluator session to investigate:
Analyze event logs from the last 48 hours.

Context:
- Multi-tenant platform with capsule isolation
- Every entity must be scoped to tenant + capsule
- Event sourcing captures all domain operations

Task:
Find patterns in events that suggest isolation violations.
Evaluator’s analysis: The agent didn’t just grep for “error” or “warn”. It did something more interesting - it analyzed event schemas across services.
Pattern analysis:
  1. Scanned 2,847 events across auth, CRM, and catalog services
  2. Extracted event schemas by grouping events by type
  3. Compared field patterns across related events
  4. Flagged inconsistencies where similar operations had different fields
Example finding:
AccountCreated event:
✅ Has: tenant_id, capsule_id, account_id
✅ Pattern: TENANT#{tenant}#CAPSULE#{capsule}#...

RoleAssignmentCreated event:
❌ Has: tenant_id, assignment_id
❌ Missing: capsule_id
❌ Pattern: TENANT#{tenant}#ROLE#{role}#...

INCONSISTENCY: RoleAssignment operations are not capsule-isolated
RISK LEVEL: HIGH (security isolation boundary violation)
This was fascinating. Evaluator didn’t need explicit rules about capsule isolation. It inferred the pattern from seeing other events and spotted the deviation.
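The grouping-and-comparison steps above can be sketched in a few lines of Rust. This is a minimal illustration, not the Evaluator's actual procedure: the `Event` and `Deviation` types, the `find_deviations` function, and the "field is present in at least a given share of event types" threshold are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A simplified event: its type name and the names of the fields it carries.
/// (Hypothetical shape; real events would be parsed from the event log.)
pub struct Event {
    pub event_type: String,
    pub fields: BTreeSet<String>,
}

/// A field that is common across event types but absent from `event_type`.
#[derive(Debug, PartialEq)]
pub struct Deviation {
    pub event_type: String,
    pub missing_field: String,
}

/// Flags event types lacking a field that at least `threshold` (0.0..=1.0)
/// of all observed event types carry.
pub fn find_deviations(events: &[Event], threshold: f64) -> Vec<Deviation> {
    // Step 1: group by type, keeping the union of fields seen per type.
    let mut schemas: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for e in events {
        let schema = schemas.entry(e.event_type.as_str()).or_default();
        schema.extend(e.fields.iter().map(String::as_str));
    }

    // Step 2: count how many event types carry each field.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for schema in schemas.values() {
        for f in schema {
            *counts.entry(*f).or_insert(0) += 1;
        }
    }

    // Step 3: report types missing a field that is common elsewhere.
    let total = schemas.len() as f64;
    let mut out = Vec::new();
    for (ty, schema) in &schemas {
        for (field, n) in &counts {
            if (*n as f64) / total >= threshold && !schema.contains(field) {
                out.push(Deviation {
                    event_type: ty.to_string(),
                    missing_field: field.to_string(),
                });
            }
        }
    }
    out
}
```

With a threshold of 0.6, a field such as `capsule_id` that appears in two of three event types is treated as expected, and the third type is flagged. Entity-specific fields such as `account_id` fall below the threshold and are ignored.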

The Macro Compliance Sweep

With the event analysis showing 6 entities missing capsule isolation, I realized: our DynamoDB macro didn’t enforce this pattern. The macro we built:
#[derive(DynamoDbEntity)]
#[table_name = "platform_data"]
#[pk = "TENANT#{tenant_id}#ACCOUNT#{id}"]
pub struct AccountEntity {
    pub tenant_id: TenantId,
    pub capsule_id: CapsuleId,  // ← Present but not enforced
    // ...
}
The problem: The macro generated correct code for entities that included capsule_id, but didn’t require it. Builder sessions for RoleAssignment and SecurityGroup entities simply… forgot to include it.
Next task: Make capsule isolation mandatory at compile time.

Compile-Time Enforcement

Builder session:
Enhance DynamoDbEntity macro to enforce capsule isolation.

Requirements:
1. Every entity MUST have capsule_id field
2. Partition key pattern MUST include CAPSULE#{capsule_id}
3. Compilation fails if either is missing
4. Add helpful compiler messages

Reference: .plans/348-capsule-isolation.md
Builder’s implementation:
// Updated macro with compile-time checks
#[proc_macro_derive(DynamoDbEntity, attributes(table_name, pk, sk, capsule_isolated))]
pub fn derive_dynamodb_entity(input: TokenStream) -> TokenStream {
    let ast = parse_macro_input!(input as DeriveInput);

    // NEW: Check for capsule_isolated attribute
    let requires_capsule = has_capsule_isolated_attr(&ast);

    if requires_capsule {
        // NEW: Validate capsule_id field exists
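        if !has_field(&ast, "capsule_id") {
            // compile_error! here would fire while building the macro crate itself,
            // so emit the error into the caller's code via syn::Error instead
            return syn::Error::new_spanned(
                &ast.ident,
                "Entity marked #[capsule_isolated] must have capsule_id field",
            )
            .to_compile_error()
            .into();
        }

        // NEW: Validate PK pattern includes CAPSULE#
        let pk_pattern = get_pk_pattern(&ast);
        if !pk_pattern.contains("CAPSULE#{capsule_id}") {
            return syn::Error::new_spanned(
                &ast.ident,
                "Entity marked #[capsule_isolated] must include CAPSULE#{capsule_id} in partition key",
            )
            .to_compile_error()
            .into();
        }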
    }

    // Generate implementation...
}
Usage:
// ✅ CORRECT - will compile
#[derive(DynamoDbEntity)]
#[capsule_isolated]
#[pk = "TENANT#{tenant_id}#CAPSULE#{capsule_id}#ROLE#{id}"]
pub struct RoleAssignmentEntity {
    pub tenant_id: TenantId,
    pub capsule_id: CapsuleId,  // Required
    pub id: RoleId,
    // ...
}

// ❌ WRONG - won't compile
#[derive(DynamoDbEntity)]
#[capsule_isolated]
#[pk = "TENANT#{tenant_id}#ROLE#{id}"]  // Missing CAPSULE#
pub struct BadEntity {
    pub tenant_id: TenantId,
    // capsule_id missing - compile error
    pub id: RoleId,
}
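The partition-key rule can also be expressed as a plain function, which makes it easy to unit-test outside the macro. The sketch below is an assumption-laden illustration: `validate_pk_pattern` is a hypothetical name, and it is deliberately stricter than the `contains` check in the macro, requiring the tenant segment first and the capsule segment immediately after it.

```rust
/// Checks that a partition-key template is scoped to tenant, then capsule.
/// Returns a human-readable message on failure, in the spirit of the
/// compile errors the macro produces. (Hypothetical helper.)
pub fn validate_pk_pattern(pattern: &str) -> Result<(), String> {
    const TENANT: &str = "TENANT#{tenant_id}";
    const CAPSULE: &str = "#CAPSULE#{capsule_id}";

    // The key must begin with the tenant segment.
    let rest = pattern.strip_prefix(TENANT).ok_or_else(|| {
        format!("partition key `{}` must start with {}", pattern, TENANT)
    })?;

    // The capsule segment must follow directly.
    // In a format string, `{{` and `}}` print literal braces.
    let rest = rest.strip_prefix(CAPSULE).ok_or_else(|| {
        format!(
            "partition key `{}` must include CAPSULE#{{capsule_id}} right after the tenant segment",
            pattern
        )
    })?;

    // Anything after the capsule segment must begin a new `#` segment.
    if rest.is_empty() || rest.starts_with('#') {
        Ok(())
    } else {
        Err(format!("partition key `{}` has a malformed capsule segment", pattern))
    }
}
```

Checking the prefix rather than a substring rules out keys where the capsule segment appears in the wrong position, at the cost of rejecting any layout that does not lead with tenant and capsule.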
Verification session:
Verify capsule isolation enforcement in DynamoDB macro.

Test plan:
1. Create test entities with/without capsule_id
2. Verify compile errors for violations
3. Check error messages are helpful
4. Ensure existing entities still compile
Verifier found an edge case:
Issue: Nested Capsule References
Builder’s macro checked for “capsule_id field exists” but didn’t handle nested structs:
#[derive(DynamoDbEntity)]
#[capsule_isolated]
pub struct ComplexEntity {
    pub tenant_id: TenantId,
    pub context: EntityContext,  // Contains capsule_id internally
    pub id: EntityId,
}

pub struct EntityContext {
    pub capsule_id: CapsuleId,  // Macro doesn't see this
}
Impact: Macro would reject valid nested patterns.
Fix: Enhanced field checking to traverse nested structs one level deep.
Rework cycle: 2 hours to handle nested fields, re-verify, approved.
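A derive macro only receives the tokens of the struct it is attached to, so it cannot read the definition of a field's type such as EntityContext directly. The post does not show how the nested check was implemented. One possible approach, sketched here under that caveat, is to have the generated code delegate through a trait so that the compiler performs the nested check. The `HasCapsuleId` trait and `partition_key` function are hypothetical, and `String` stands in for the ID newtypes.

```rust
/// Hypothetical trait: anything that can report the capsule it belongs to.
pub trait HasCapsuleId {
    fn capsule_id(&self) -> &str;
}

pub struct EntityContext {
    pub capsule_id: String, // stand-in for CapsuleId
}

impl HasCapsuleId for EntityContext {
    fn capsule_id(&self) -> &str {
        &self.capsule_id
    }
}

pub struct ComplexEntity {
    pub tenant_id: String, // stand-in for TenantId
    pub context: EntityContext,
    pub id: String, // stand-in for EntityId
}

// What a derive could emit for the nested case: delegate to the field.
// If the field's type does not implement HasCapsuleId, this impl does not
// compile, so the check happens without the macro inspecting EntityContext.
impl HasCapsuleId for ComplexEntity {
    fn capsule_id(&self) -> &str {
        self.context.capsule_id()
    }
}

/// Code that builds keys can then require the bound.
pub fn partition_key<E: HasCapsuleId>(tenant_id: &str, entity: &E) -> String {
    format!("TENANT#{}#CAPSULE#{}", tenant_id, entity.capsule_id())
}
```

The trade-off is that the error surfaces as a missing trait implementation rather than a custom macro message, so the wording of the diagnostic is less under the macro author's control.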

What We Built

Macro Enhancements

DynamoDB Macro:
  • Compile-time capsule isolation enforcement
  • Nested struct field validation
  • Clear error messages
API Macro:
  • Removed automatic parameter inference
  • Explicit path/query parameter declarations
  • OpenAPI spec correctness guarantees

Capsule Isolation Fixes

Entities updated:
  • RoleAssignment
  • SecurityGroup
  • Entitlement
  • Session (auth)
  • Federal compliance data (CRM)
Pattern: All user-scoped data now has tenant + capsule isolation

Documentation

CLAUDE.md improvements:
  • 12 architectural patterns documented
  • DynamoDB entity guidelines
  • API route conventions
  • Repository patterns
  • Event naming conventions
Impact: Agents reference these during planning

Infrastructure Principles

New ADR added:
  • ADR-0010: Capsule Isolation Enforcement
  • Infrastructure Principles §2.1: API Capsule Context Resolution
Governance: Architecture decisions now codified
Metrics:
  • 72 commits
  • 6 entities fixed for capsule isolation
  • 23 API routes refactored
  • 2 macros enhanced
  • 5 documentation files improved
  • 0 new features (all refinement)

What We Learned: Event Sourcing as a Verification Tool

Learning 1: Events Reveal Inconsistencies Humans Miss

The traditional approach:
  • Write code
  • Write tests
  • Code review
  • Merge
The problem: Tests verify “does this work?” not “is this consistent with everything else?”
Event sourcing changes this: Every operation emits an event. Events have schemas. Schema inconsistencies reveal architectural violations.
Example:
// AccountCreated event schema:
{
  "tenant_id": "...",
  "capsule_id": "...",  // Present
  "account_id": "...",
  "timestamp": "..."
}

// RoleAssignmentCreated event schema:
{
  "tenant_id": "...",
  "capsule_id": null,  // Missing!
  "assignment_id": "...",
  "timestamp": "..."
}

// Inconsistency detected: RoleAssignment operations are not capsule-isolated
AI advantage: Evaluator can analyze thousands of events, spot patterns, and flag deviations in minutes.
Human disadvantage: We review code files, not event logs. Cross-service consistency is hard to verify manually.

Learning 2: Macros Are Force Multipliers - For Good and Bad

Before enforcement: DynamoDB macro made it easy to create entities. Also made it easy to forget capsule isolation. Impact:
  • 6 entities created without proper isolation
  • Each one a potential data leak
  • All passed code review (including AI verification)
After enforcement: For any entity marked #[capsule_isolated], the enhanced macro makes it impossible to forget.
// This code won't compile anymore
#[derive(DynamoDbEntity)]
#[capsule_isolated]
pub struct MyEntity {
    pub tenant_id: TenantId,
    // ERROR: Missing required field 'capsule_id' for capsule-isolated entity
    pub id: EntityId,
}
Principle: Macros should encode invariants, not just reduce boilerplate.
Good macro:
  • Reduces repetition
  • Enforces correctness
  • Fails at compile time (not runtime)
  • Generates helpful error messages
Bad macro:
  • Just saves typing
  • Allows incorrect patterns
  • “Helpful” inference that’s sometimes wrong

Learning 3: Documentation Changes AI Behavior

Before CLAUDE.md patterns: Builder would ask: “Should this entity be capsule-isolated?” Verifier would check: “Does the implementation match the plan?”
Problem: Neither had a reference for “what’s the standard pattern?”
After CLAUDE.md patterns: Builder reads patterns first, applies them by default. Verifier checks: “Does this match the documented standard pattern?”
Concrete example: Before (EntitlementEntity):
// Builder's first attempt (no guidance)
#[derive(DynamoDbEntity)]
#[pk = "TENANT#{tenant_id}#ENTITLEMENT#{id}"]  // Missing CAPSULE#
pub struct EntitlementEntity {
    pub tenant_id: TenantId,
    // capsule_id missing
    pub id: EntitlementId,
}
Verifier didn’t catch this because the plan didn’t specify capsule isolation explicitly.
After (SecurityGroupEntity):
// Builder's first attempt (with CLAUDE.md guidance)
#[derive(DynamoDbEntity)]
#[capsule_isolated]  // Applied pattern automatically
#[pk = "TENANT#{tenant_id}#CAPSULE#{capsule_id}#GROUP#{id}"]
pub struct SecurityGroupEntity {
    pub tenant_id: TenantId,
    pub capsule_id: CapsuleId,
    pub id: GroupId,
}
Difference: Builder referenced CLAUDE.md patterns during planning. Got it right the first time.
Insight: AI doesn’t just need instructions - it needs reference patterns to apply consistently.

Learning 4: Verification Is Not Just “Does It Work?”

Builder’s verification focus:
  • Does it compile? ✅
  • Do tests pass? ✅
  • Does it meet requirements? ✅
Verifier’s additional checks:
  • Is this consistent with other entities?
  • Does this follow documented patterns?
  • What are the edge cases?
  • What could go wrong in production?
Example from API macro verification: Builder implemented explicit path/query parameter declarations. Tests passed. All routes worked.
Verifier asked: “What happens if a developer accidentally declares the same parameter as both path and query?”
// Potential bug
#[api_endpoint(
    path = "/accounts/{account_id}",
    path_params(account_id),
    query_params(account_id),  // ← Same parameter in both!
)]
Builder’s response: “The macro would generate invalid OpenAPI spec. Runtime error.”
Verifier’s recommendation: “Add compile-time check to prevent duplicate parameter declarations.”
Builder added the check. This bug never existed in real code - Verifier ruled it out before anyone could write it.
Lesson: Good verification asks “what COULD go wrong?” not just “what IS wrong?”
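The duplicate-declaration check comes down to a set intersection. The sketch below shows only that core logic as a standalone function. `duplicate_params` is a hypothetical name, and in the real macro the two lists would come from the parsed `path_params(...)` and `query_params(...)` attributes rather than from slices.

```rust
use std::collections::BTreeSet;

/// Returns the parameter names declared as both path and query parameters,
/// in sorted order. An empty result means the declaration is valid.
pub fn duplicate_params(path_params: &[&str], query_params: &[&str]) -> Vec<String> {
    let path: BTreeSet<&str> = path_params.iter().copied().collect();
    let query: BTreeSet<&str> = query_params.iter().copied().collect();
    path.intersection(&query).map(|name| name.to_string()).collect()
}
```

Inside a procedural macro, a non-empty result would be turned into a compile error that names the offending parameters, so the mistake is reported at the endpoint declaration rather than when the OpenAPI spec is generated.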

Principles We Established

What we learned: Capsule isolation bugs made it to main because enforcement was runtime-only (tests).
New rule: Architectural invariants MUST be enforced at compile time via macros or type system.
Example:
// ❌ Runtime check (can be forgotten)
fn create_entity(entity: Entity) -> Result<(), &'static str> {
    if entity.capsule_id.is_none() {
        return Err("Missing capsule_id");
    }
    // ...
}

// ✅ Compile-time enforcement (can't be forgotten)
#[derive(DynamoDbEntity)]
#[capsule_isolated]  // Compiler enforces capsule_id field exists
pub struct Entity {
    pub capsule_id: CapsuleId,  // Required by macro
    // ...
}
What we learned: Event logs revealed bugs that state-based testing missed.
New rule: When debugging data consistency issues, always start with event logs, not database state.
Practice:
  • Every operation emits events
  • Event schemas are validated
  • Schema inconsistencies = architectural violations
  • AI agents analyze event patterns during verification
What we learned: CLAUDE.md patterns changed how Builder and Verifier approached tasks.
New rule: Document architectural patterns proactively. AI will reference them.
Format:
  • Pattern name
  • When to use it
  • Code example
  • Anti-patterns (what NOT to do)
  • Rationale (why this pattern?)
What we learned: Macros that silently accept incorrect patterns are worse than no macros.
New rule: Macros should validate ALL invariants and produce clear compile errors.
Checklist for new macros:
  • Validates all required fields exist
  • Validates field types are correct
  • Validates attribute patterns (e.g., PK format)
  • Produces helpful error messages
  • Has tests for invalid inputs (should not compile)

The Event Schema Validator

That event log warning wasn’t from production. It was from our test environment’s event analysis system.
Event Schema Validator (Lambda function):
// Runs on every event published to EventBridge
async fn validate_event_schema(event: EventBridgeEvent) -> Result<()> {
    // 1. Extract event type (borrowed, because `event` is used again in step 5)
    let event_type = &event.detail_type;

    // 2. Load expected schema for this event type
    let expected_schema = load_schema(event_type)?;

    // 3. Compare actual event fields to expected schema
    let actual_fields = extract_fields(&event.detail);

    // 4. Check for required fields
    for required_field in &expected_schema.required_fields {
        if !actual_fields.contains(required_field) {
            // Clone per violation so the loop can report more than one
            log_violation(MissingRequiredField {
                event_type: event_type.clone(),
                missing_field: required_field.clone(),
                severity: Critical,
            });
        }
    }

    // 5. Check for consistency across related events
    if let Some(related_events) = get_related_events(event_type) {
        for related in related_events {
            check_field_consistency(&event, &related)?;
        }
    }

    Ok(())
}
This Lambda caught 6 isolation violations in the test environment BEFORE any code reached production.
Why this matters: Traditional testing asks: “Does this code work?” Event validation asks: “Is this code consistent with our architectural patterns?” Different questions. Different bugs caught.

Code Examples (Sanitized)

Here’s the final capsule-isolated entity pattern we established:
// Generic pattern for all user-scoped entities
#[derive(DynamoDbEntity, Debug, Clone, Serialize, Deserialize)]
#[capsule_isolated]
#[table_name = "platform_data"]
#[pk = "TENANT#{tenant_id}#CAPSULE#{capsule_id}#ENTITY#{entity_type}#{id}"]
#[sk = "METADATA#v{version}"]
pub struct GenericEntity {
    // Required isolation fields (enforced by macro)
    pub tenant_id: TenantId,
    pub capsule_id: CapsuleId,

    // Entity identification
    pub id: EntityId,
    pub entity_type: EntityType,

    // Event sourcing metadata
    pub version: u64,
    pub last_event_id: Option<EventId>,

    // Audit fields
    pub created_at: DateTime<Utc>,
    pub created_by: UserId,
    pub updated_at: DateTime<Utc>,
    pub updated_by: UserId,

    // Entity-specific data
    pub data: serde_json::Value,
}

// Macro enforces:
// 1. tenant_id and capsule_id fields must exist
// 2. PK must include both TENANT# and CAPSULE# segments
// 3. Compile error if either is missing
// 4. Generated code validates isolation at query time
What this pattern gave us:
  • Zero capsule isolation bugs after enforcement
  • Consistent PK/SK patterns across all entities
  • Compile-time guarantees for security boundaries
  • Clear audit trail via event sourcing metadata
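Point 4 in the comments above mentions validating isolation at query time. The post does not show the generated code, so the following is only a sketch of what such a check might look like. `scope_prefix` and `key_in_scope` are hypothetical names, and the IDs are assumed to be plain strings that never contain `#`.

```rust
/// Builds the key prefix shared by every item in one tenant + capsule scope.
/// The trailing `#` matters: without it, capsule "c1" would also match "c10".
pub fn scope_prefix(tenant_id: &str, capsule_id: &str) -> String {
    format!("TENANT#{}#CAPSULE#{}#", tenant_id, capsule_id)
}

/// Returns true only if `key` belongs to the caller's tenant and capsule.
/// Assumes IDs never contain `#`; otherwise segments could be spoofed.
pub fn key_in_scope(key: &str, tenant_id: &str, capsule_id: &str) -> bool {
    key.starts_with(scope_prefix(tenant_id, capsule_id).as_str())
}
```

A repository layer could call `key_in_scope` before issuing a read, or use `scope_prefix` as the partition-key prefix of the query itself, so that a request scoped to one capsule cannot address items in another.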

Metrics

Planned work: Fix capsule isolation gaps
Actual work:
  • 6 entities fixed for isolation ✅
  • DynamoDB macro enhanced ✅
  • API macro refactored ✅
  • 23 routes updated ✅
  • CLAUDE.md patterns added ✅
  • Infrastructure ADR written ✅
Commits: 72 (3x more than Week 2, but smaller/focused)
Speedup: 3.1x faster than manual estimate