Week 3: When AI Discovers Your Bugs Through Events
This is Week 3 of “Building with AI” - a 10-week journey documenting how I use multi-agent AI
workflows to build a production-grade SaaS platform. This week: The event sourcing system we built became our debugging tool. 72 commits reveal how AI agents learned to detect architectural violations by analyzing event patterns.
Previously, I thought we had completed Week 2 successfully. Seven CRM domain models, a custom DynamoDB macro, a full API layer - all verified and merged.

Initial reality check: Event logs showed something wrong.

Wait. The verifier approved this code last week. How did a capsule isolation bug make it to main? The answer changed how I think about AI-assisted verification.
I opened a fresh Evaluator session to investigate:
```text
Analyze event logs from the last 48 hours.

Context:
- Multi-tenant platform with capsule isolation
- Every entity must be scoped to tenant + capsule
- Event sourcing captures all domain operations

Task:
Find patterns in events that suggest isolation violations.
```
Evaluator’s analysis (2 hours): The agent didn’t just grep for “error” or “warn”. It did something more interesting - it analyzed event schemas across services.
Evaluator's Detection Logic
Pattern analysis:
Scanned 2,847 events across auth, CRM, and catalog services
Extracted event schemas by grouping events by type
Compared field patterns across related events
Flagged inconsistencies where similar operations had different fields
Example finding:
```text
AccountCreated event:
✅ Has: tenant_id, capsule_id, account_id
✅ Pattern: TENANT#{tenant}#CAPSULE#{capsule}#...

RoleAssignmentCreated event:
❌ Has: tenant_id, assignment_id
❌ Missing: capsule_id
❌ Pattern: TENANT#{tenant}#ROLE#{role}#...

INCONSISTENCY: RoleAssignment operations are not capsule-isolated
RISK LEVEL: HIGH (security isolation boundary violation)
```
This was fascinating. Evaluator didn’t need explicit rules about capsule isolation. It
inferred the pattern from seeing other events and spotted the deviation.
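To make that concrete, here's a rough sketch of the kind of cross-event field comparison Evaluator described - not its actual code, just the shape of the analysis, with illustrative types:

```rust
use std::collections::{BTreeSet, HashMap};

/// A parsed domain event: its type plus the field names it carried.
struct Event {
    event_type: String,
    fields: BTreeSet<String>,
}

/// Flag "Created" events that are missing fields which most other
/// "Created" events share (e.g. capsule_id).
fn find_schema_outliers(events: &[Event]) -> Vec<String> {
    // 1. Group observed field names by event type.
    let mut schemas: HashMap<&str, BTreeSet<String>> = HashMap::new();
    for e in events.iter().filter(|e| e.event_type.ends_with("Created")) {
        schemas
            .entry(e.event_type.as_str())
            .or_default()
            .extend(e.fields.iter().cloned());
    }

    // 2. Find the fields that appear in a majority of event types.
    let mut field_counts: HashMap<&str, usize> = HashMap::new();
    for fields in schemas.values() {
        for f in fields {
            *field_counts.entry(f.as_str()).or_default() += 1;
        }
    }
    let majority = schemas.len() / 2 + 1;
    let common: BTreeSet<&str> = field_counts
        .iter()
        .filter(|(_, &n)| n >= majority)
        .map(|(&f, _)| f)
        .collect();

    // 3. Report event types that lack one of those common fields.
    let mut findings = Vec::new();
    for (event_type, fields) in &schemas {
        for field in &common {
            if !fields.contains(*field) {
                findings.push(format!("{event_type} is missing common field {field}"));
            }
        }
    }
    findings
}
```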
With the event analysis showing 6 entities missing capsule isolation, I realized:
our DynamoDB macro from Week 2 didn’t enforce this pattern. The macro we built accepted whatever fields and key pattern each entity declared.
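A reconstruction of how that derive was used - field names are illustrative and this isn't the exact Week 2 code, but the key point holds: capsule_id was just another field, and nothing required it:

```rust
// Sketch of the Week 2 derive in use (reconstructed, field names illustrative).
// The macro generates key-building and persistence code from these attributes,
// but it never checks that capsule_id exists.
#[derive(DynamoDbEntity)]
#[table_name("crm")]
#[pk("TENANT#{tenant_id}#CAPSULE#{capsule_id}#ACCOUNT#{account_id}")]
#[sk("METADATA")]
pub struct AccountEntity {
    pub tenant_id: TenantId,
    pub capsule_id: CapsuleId, // present only because the Builder remembered it
    pub account_id: AccountId,
    pub name: String,
}
```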
The problem: The macro generated correct code for entities that included capsule_id, but didn’t require it. Builder sessions for the RoleAssignment and SecurityGroup entities simply… forgot to include it.

Next task: Make capsule isolation mandatory at compile time.
```text
Enhance DynamoDbEntity macro to enforce capsule isolation.

Requirements:
1. Every entity MUST have capsule_id field
2. Partition key pattern MUST include CAPSULE#{capsule_id}
3. Compilation fails if either is missing
4. Add helpful compiler messages

Reference: .plans/348-capsule-isolation.md
```

**Builder's implementation (4 hours):**

```rust
// Updated macro with compile-time checks
#[proc_macro_derive(DynamoDbEntity, attributes(table_name, pk, sk, capsule_isolated))]
pub fn derive_dynamodb_entity(input: TokenStream) -> TokenStream {
    let ast = parse_macro_input!(input as DeriveInput);

    // NEW: Check for capsule_isolated attribute
    let requires_capsule = has_capsule_isolated_attr(&ast);

    if requires_capsule {
        // NEW: Validate capsule_id field exists
        if !has_field(&ast, "capsule_id") {
            // compile_error! here stands in for emitting a compile error from
            // the macro (e.g. via syn::Error::to_compile_error)
            return compile_error!(
                "Entity marked #[capsule_isolated] must have capsule_id field"
            );
        }

        // NEW: Validate PK pattern includes CAPSULE#
        let pk_pattern = get_pk_pattern(&ast);
        if !pk_pattern.contains("CAPSULE#{capsule_id}") {
            return compile_error!(
                "Entity marked #[capsule_isolated] must include CAPSULE#{{capsule_id}} in partition key"
            );
        }
    }

    // Generate implementation...
}
```
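To show what the stricter macro buys us, here's a hypothetical entity that would now fail to build - the type and field names are illustrative, but the error is the one the macro emits above:

```rust
// Hypothetical entity: marked capsule_isolated but missing capsule_id.
// With the enhanced macro this no longer compiles:
//   error: Entity marked #[capsule_isolated] must have capsule_id field
#[derive(DynamoDbEntity)]
#[capsule_isolated]
#[table_name("crm")]
#[pk("TENANT#{tenant_id}#ROLE#{role_id}")]
#[sk("ASSIGNMENT#{assignment_id}")]
pub struct RoleAssignmentEntity {
    pub tenant_id: TenantId,
    pub role_id: RoleId,
    pub assignment_id: AssignmentId,
    // capsule_id missing -> compile error instead of a silent isolation gap
}
```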
With DynamoDB entities fixed, I moved to the API layer.

Problem discovered: API route handlers automatically inferred path parameters but got them wrong sometimes.

Example bug:
```rust
// ❌ WRONG - included query in path params
params(
    ("account_id" = AccountId, Path, ...),
    ("contact_id" = ContactId, Path, ...),
    ("query" = ActivityQuery, Path, ...), // Should be Query, not Path
)
```
Why did this happen? The macro tried to be “helpful” by inferring: “If parameter name matches a word in the path, it’s a path param. Otherwise, it’s also a path param.”

Wrong heuristic.
Principle established: Macros should be explicit and boring, not clever and inference-heavy. The fix was to make every route declare its path and query parameters explicitly (a sketch follows below).

Impact: Refactored 23 API routes across 3 services (auth, CRM, catalog).

Verifier’s role: Caught 5 routes where Builder incorrectly categorized parameters during the refactor.
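Here's a hedged sketch of the explicit style, reusing the #[api_endpoint] attribute shape that appears later in this post - handler and type names beyond AccountId/ContactId/ActivityQuery are illustrative, and the real attribute syntax may differ slightly:

```rust
// Explicit, boring declaration: the macro infers nothing.
#[api_endpoint(
    path = "/accounts/{account_id}/contacts/{contact_id}/activities",
    path_params(account_id, contact_id), // must appear in the path template
    query_params(query),                 // everything else is declared, not guessed
)]
async fn list_contact_activities(
    account_id: AccountId,
    contact_id: ContactId,
    query: ActivityQuery,
) -> ApiResult<Vec<Activity>> {
    // ...
}
```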
After fixing 6 isolation bugs, enhancing 2 macros, and refactoring 23 routes, a pattern emerged.

Pattern I noticed: Verifier kept asking the same questions:
“Should this entity be capsule-isolated?”
“What’s the standard partition key pattern?”
“How do we handle nested resources in API paths?”
Idea: What if we documented patterns proactively so Verifier could reference them?

Next task: Update CLAUDE.md with architectural patterns.
Learning 1: Events Reveal Inconsistencies Humans Miss
The traditional approach:
Write code
Write tests
Code review
Merge
The problem: Tests verify “does this work?”, not “is this consistent with everything else?”

Event sourcing changes this: Every operation emits an event. Events have schemas. Schema inconsistencies reveal architectural violations.

Example from Week 3: The RoleAssignmentCreated event was missing capsule_id while its sibling events all carried it - an inconsistency no unit test was checking for.
AI advantage: Evaluator can analyze thousands of events, spot patterns, and flag deviations in minutes.

Human disadvantage: We review code files, not event logs. Cross-service consistency is hard to verify manually.
Before CLAUDE.md patterns (Week 2):

Builder would ask: “Should this entity be capsule-isolated?”
Verifier would check: “Does the implementation match the plan?”
Problem: Neither had a reference for “what’s the standard pattern?”

After CLAUDE.md patterns (Week 3):

Builder reads patterns first, applies them by default.
Verifier checks: “Does this match the documented standard pattern?”

Concrete example: The EntitlementEntity from Week 2, before and after the patterns.
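I can't reproduce the original diff exactly, so treat this as a hedged reconstruction - the second struct shows how the same entity would be written under the Week 3 pattern:

```rust
// Before (Week 2): built without a documented pattern to follow.
// Nothing marks it capsule-isolated, and capsule_id is simply absent.
#[derive(DynamoDbEntity)]
#[table_name("crm")]
#[pk("TENANT#{tenant_id}#ENTITLEMENT#{entitlement_id}")]
#[sk("METADATA")]
pub struct EntitlementEntity {
    pub tenant_id: TenantId,
    pub entitlement_id: EntitlementId,
}

// After (Week 3): Builder reads the documented pattern first, so the entity
// arrives with #[capsule_isolated], capsule_id, and the standard key shape.
#[derive(DynamoDbEntity)]
#[capsule_isolated]
#[table_name("crm")]
#[pk("TENANT#{tenant_id}#CAPSULE#{capsule_id}#ENTITLEMENT#{entitlement_id}")]
#[sk("METADATA")]
pub struct EntitlementEntity {
    pub tenant_id: TenantId,
    pub capsule_id: CapsuleId,
    pub entitlement_id: EntitlementId,
}
```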
Difference: Builder referenced CLAUDE.md patterns during planning. Got it right the first time.

Insight: AI doesn’t just need instructions - it needs reference patterns to apply consistently.
Learning 4: Verification Is Not Just “Does It Work?”
Builder’s verification focus:
Does it compile? ✅
Do tests pass? ✅
Does it meet requirements? ✅
Verifier’s additional checks:
Is this consistent with other entities?
Does this follow documented patterns?
What are the edge cases?
What could go wrong in production?
Example from API macro verification: Builder implemented explicit path/query parameter declarations. Tests passed. All routes worked.

Verifier asked: “What happens if a developer accidentally declares the same parameter as both path and query?”
```rust
// Potential bug
#[api_endpoint(
    path = "/accounts/{account_id}",
    path_params(account_id),
    query_params(account_id), // ← Same parameter in both!
)]
```
Builder’s response: “The macro would generate invalid OpenAPI spec. Runtime error.”

Verifier’s recommendation: “Add compile-time check to prevent duplicate parameter declarations.”

Builder added the check. This bug never existed in real code - Verifier prevented it hypothetically.

Lesson: Good verification asks “what COULD go wrong?” not just “what IS wrong?”
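A minimal sketch of what that compile-time guard could look like inside the macro - the helper shown here is hypothetical, not the actual implementation:

```rust
// Hypothetical check inside the api_endpoint macro: reject any identifier
// declared as both a path parameter and a query parameter.
fn check_duplicate_params(
    path_params: &[syn::Ident],
    query_params: &[syn::Ident],
) -> Result<(), syn::Error> {
    for p in path_params {
        if let Some(dup) = query_params.iter().find(|q| *q == p) {
            return Err(syn::Error::new(
                dup.span(),
                format!("parameter `{p}` is declared as both a path and a query parameter"),
            ));
        }
    }
    Ok(())
}
```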
Principle 1: Enforce Invariants at Compile Time

What we learned: Capsule isolation bugs made it to main because enforcement was runtime-only (tests).

New rule: Architectural invariants MUST be enforced at compile time via macros or the type system.

Example:
```rust
// ❌ Runtime check (can be forgotten)
fn create_entity(entity: Entity) -> Result<()> {
    if entity.capsule_id.is_none() {
        return Err("Missing capsule_id");
    }
    // ...
}

// ✅ Compile-time enforcement (can't be forgotten)
#[derive(DynamoDbEntity)]
#[capsule_isolated] // Compiler enforces capsule_id field exists
pub struct Entity {
    pub capsule_id: CapsuleId, // Required by macro
    // ...
}
```
Principle 2: Events Are the Truth, State Is the Cache
What we learned: Event logs revealed bugs that state-based testing missed.

New rule: When debugging data consistency issues, always start with event logs, not database state.

Practice (an illustrative event shape follows the list):
Every operation emits events
Event schemas are validated
Schema inconsistencies = architectural violations
AI agents analyze event patterns during verification
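For context, here's an illustrative event payload (not our exact envelope) - the isolation fields travel with every event, which is what makes this kind of schema-level checking possible:

```rust
use serde::Serialize;

// Illustrative event payload: tenant_id and capsule_id ride along with the data,
// so a validator can flag any event type where capsule_id goes missing.
#[derive(Serialize)]
struct RoleAssignmentCreated {
    tenant_id: String,
    capsule_id: String, // the field whose absence triggered the Week 3 findings
    assignment_id: String,
    role_id: String,
    occurred_at: String, // ISO-8601 timestamp
}
```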
Principle 3: Documentation Is AI Context
What we learned: CLAUDE.md patterns changed how Builder and Verifier approached tasks.

New rule: Document architectural patterns proactively. AI will reference them.

Format (an example entry follows the list):
Pattern name
When to use it
Code example
Anti-patterns (what NOT to do)
Rationale (why this pattern?)
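As an example of the format (an illustrative entry, not a verbatim excerpt from our CLAUDE.md):

```markdown
## Pattern: Capsule-isolated DynamoDB entity

When to use: any entity that belongs to a tenant's capsule (almost everything).

Example:
    #[derive(DynamoDbEntity)]
    #[capsule_isolated]
    #[pk("TENANT#{tenant_id}#CAPSULE#{capsule_id}#<ENTITY>#{id}")]

Anti-patterns:
- Omitting capsule_id "because this entity is global" without an explicit review
- Putting capsule_id in the sort key instead of the partition key

Rationale: capsule isolation is a security boundary, and the partition key is
where it is enforced; the macro rejects entities that skip it.
```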
Principle 4: Macros Must Fail Loudly
What we learned: Macros that silently accept incorrect patterns are worse than no macros.

New rule: Macros should validate ALL invariants and produce clear compile errors.

Checklist for new macros: validate every documented invariant, fail at compile time rather than at runtime, and emit error messages that name the rule that was broken.
After fixing the API macro, I merged it and started using it immediately for new routes.

Later, during CI: CI failed. 8 routes using the new macro weren’t compiling.

What happened? I updated the macro to require explicit path_params/query_params attributes, but I didn’t update existing routes to use the new syntax.

The failed routes were all existing handlers still written against the old implicit syntax (a sketch of the mismatch is below).
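A hedged sketch of that mismatch - route, handler, and type names are illustrative, and the old implicit form is reconstructed from the description above:

```rust
// Old implicit syntax (pre-change): the macro inferred Path vs Query on its own.
// After the update, this form no longer compiles because path_params/query_params
// are now required.
#[api_endpoint(path = "/accounts/{account_id}/contacts")]
async fn list_contacts(account_id: AccountId, query: ContactQuery) -> ApiResult<Vec<Contact>> {
    // ...
}

// New explicit syntax that the 23 existing routes had to migrate to.
#[api_endpoint(
    path = "/accounts/{account_id}/contacts",
    path_params(account_id),
    query_params(query),
)]
async fn list_contacts_explicit(account_id: AccountId, query: ContactQuery) -> ApiResult<Vec<Contact>> {
    // ...
}
```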
Why did this happen? I tested the macro with new routes (worked fine). I verified the macro logic (correct). But I didn’t check backwards compatibility with existing routes.

Classic mistake: Changed the API without a migration plan.

The fix (2 hours):
Found all existing routes using old syntax (23 routes across 3 services)
Updated each to new explicit syntax
Added deprecation warnings to old syntax (instead of breaking immediately)
Created migration guide in CLAUDE.md
Lesson: API changes (even internal macros) need backwards compatibility checks. Should’ve been part of verification.
Commits: 72 (3x more than Week 2, but smaller/focused)
Time estimate (manual): 2 weeks
Actual time (AI workflow): 4.5 days
Speedup: 3.1x
Bugs found by event analysis: 6 critical isolation violations
Bugs found by Verifier: 12 (macro edge cases, API inconsistencies)
Bugs prevented by compile-time enforcement: Unknown (can’t ship what won’t compile)
Rework cycles: 1.8 per sub-task (down from 2.3 in Week 2)

Why less rework? CLAUDE.md patterns reduced ambiguity.
That event wasn’t from production. It was from our test environment’s event analysis system.

Here’s what I built (quietly, over the weekend before Week 3):

Event Schema Validator (Lambda function):
```rust
// Runs on every event published to EventBridge
async fn validate_event_schema(event: EventBridgeEvent) -> Result<()> {
    // 1. Extract event type
    let event_type = event.detail_type;

    // 2. Load expected schema for this event type
    let expected_schema = load_schema(&event_type)?;

    // 3. Compare actual event fields to expected schema
    let actual_fields = extract_fields(&event.detail);

    // 4. Check for required fields
    for required_field in expected_schema.required_fields {
        if !actual_fields.contains(&required_field) {
            log_violation(MissingRequiredField {
                event_type: event_type.clone(),
                missing_field: required_field,
                severity: Critical,
            });
        }
    }

    // 5. Check for consistency across related events
    if let Some(related_events) = get_related_events(&event_type) {
        for related in related_events {
            check_field_consistency(&event, &related)?;
        }
    }

    Ok(())
}
```
This Lambda caught 6 isolation violations in the test environment BEFORE any code reached production.

Why this matters: Traditional testing asks “Does this code work?” Event validation asks “Is this code consistent with our architectural patterns?” Different questions. Different bugs caught.
Week 3 was about hardening patterns. Week 4 will be about scaling patterns.

The challenge: Our CRM domain now has:
15 entity types
37 API routes
8 background workers
3 event processing pipelines
The questions:
Can our macro patterns scale to 100+ entities?
Can event validation handle 10,000 events/second?
How do we verify cross-service consistency at scale?
Week 4 focus:
Multi-service integration testing
Event replay for debugging
AI-assisted performance optimization
Distributed tracing patterns
The big experiment: Can AI help us find performance bottlenecks BEFORE load testing? We’ll use the event logs to simulate production traffic patterns and see what breaks.
Disclaimer: This content documents my personal AI workflow experiments. All examples are from
personal projects and have been sanitized to remove proprietary information. Code snippets represent generic patterns for educational purposes. This does not represent my
employer’s technologies or approaches.