
Event Flow E2E Testing (The Work Nobody Wants to Do)

The Context: After building a CRM domain layer with event sourcing, we had unit tests, integration tests, even some event flow tests. But we didn’t have comprehensive end-to-end verification that events actually flow correctly through the entire system.
Why this matters: In event-sourced systems, bugs in event flow are catastrophic:
  • Miss an event → audit trail broken
  • Wrong event order → state corruption
  • Cross-tenant event leak → compliance violation
The traditional problem: Writing E2E tests for event flows is tedious:
  1. Set up test infrastructure (EventBridge, SQS, DynamoDB Streams)
  2. Create test data for each entity type
  3. Trigger actions and wait for async event delivery
  4. Verify event payload, ordering, and side effects
  5. Clean up test resources
  6. Repeat for every entity and workflow combination
Estimated manual effort: 2-3 weeks for comprehensive coverage.
What I tried: Give the entire task to AI.

The Planning Session

Planning session for comprehensive E2E event flow testing.

Context:
- Event-sourced CRM with 7 entities (Account, Contact, Lead,
  Opportunity, Activity, Product, Address)
- Events flow: DynamoDB Streams → EventBridge → SQS → Consumers
- Need to verify event delivery, ordering, and payload correctness
- Must test cross-entity workflows (e.g., Lead conversion)

Requirements:
1. Test each entity's event flow independently
2. Test multi-entity workflows (create Account → add Contact → convert Lead)
3. Test failure scenarios (event delivery failure, consumer errors)
4. All tests must run against LocalStack (no AWS resources)

Design comprehensive test suite with:
- Test data fixtures
- Event verification utilities
- Async event waiting helpers
- Clear assertion patterns
The result: Comprehensive E2E event testing that would have taken 2-3 weeks manually was done in 1.5 days with AI. But more importantly: I actually have confidence in the event system now. Without AI, I would’ve written 3-4 “smoke tests” and hoped for the best.
Key Insight: AI makes thorough testing economically viable. The marginal cost of going from “some tests” to “comprehensive coverage” dropped from weeks to hours.

The Event Collector Implementation

Here’s the event collector utility we built for E2E testing:
use std::time::{Duration, Instant};

// `Event`, `Error`, and `Result` are crate-local types; minimal sketches
// of plausible definitions follow this block.

/// Async event collector with timeout and filtering
pub struct EventCollector {
    queue_url: String,
    sqs_client: aws_sdk_sqs::Client,
    timeout: Duration,
}

impl EventCollector {
    /// Collect events matching predicate within timeout
    pub async fn collect_events<F>(
        &self,
        predicate: F,
        expected_count: usize,
    ) -> Result<Vec<Event>>
    where
        F: Fn(&Event) -> bool,
    {
        let start = Instant::now();
        let mut collected = Vec::new();

        // Poll interval: starts short, backs off exponentially below
        let mut delay = Duration::from_millis(100);

        while start.elapsed() < self.timeout {
            // Poll SQS queue
            let messages = self.sqs_client
                .receive_message()
                .queue_url(&self.queue_url)
                .max_number_of_messages(10)
                .wait_time_seconds(1)
                .send()
                .await?
                .messages
                .unwrap_or_default();

            // Note: received messages are not deleted, so each test should
            // use its own dedicated queue.
            for msg in messages {
                // SQS message bodies are optional in the SDK; skip empty ones
                let Some(body) = msg.body.as_deref() else { continue };
                let event: Event = serde_json::from_str(body)?;

                if predicate(&event) {
                    collected.push(event);

                    if collected.len() >= expected_count {
                        return Ok(collected);
                    }
                }
            }

            // Exponential backoff between polls, capped at 5s
            tokio::time::sleep(delay).await;
            delay = (delay * 2).min(Duration::from_secs(5));
        }

        Err(Error::EventCollectionTimeout {
            expected: expected_count,
            received: collected.len(),
            elapsed: start.elapsed(),
        })
    }
}
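The collector references a few crate-local items that aren’t shown above: the Event envelope, the Error type, and the new constructor used in the tests below. Here is a minimal sketch of plausible definitions, assuming thiserror for the error type and LocalStack’s default endpoint; the tenant_id field, the endpoint URL, the region, and the dummy credentials are assumptions, not the project’s code:
/// Sketch of the event envelope, based on the fields the tests touch.
/// `tenant_id` is an assumption (the system is multi-tenant).
#[derive(Debug, serde::Deserialize)]
pub struct Event {
    pub entity_type: String,
    pub event_type: String,
    pub tenant_id: String,
    pub payload: serde_json::Value,
}

/// Sketch of the collector's error type, assuming `thiserror`.
#[derive(Debug, thiserror::Error)]
pub enum Error {
    #[error("timed out: expected {expected} events, got {received} after {elapsed:?}")]
    EventCollectionTimeout {
        expected: usize,
        received: usize,
        elapsed: Duration,
    },
    #[error("failed to deserialize event payload")]
    Deserialize(#[from] serde_json::Error),
    #[error("SQS receive_message failed")]
    Sqs(#[from]
        aws_sdk_sqs::error::SdkError<
            aws_sdk_sqs::operation::receive_message::ReceiveMessageError,
        >),
}

pub type Result<T> = std::result::Result<T, Error>;

impl EventCollector {
    /// Build a collector against LocalStack; endpoint, region, and the
    /// dummy credentials below are assumptions for local testing.
    pub fn new(queue_url: impl Into<String>, timeout: Duration) -> Self {
        let conf = aws_sdk_sqs::Config::builder()
            .behavior_version(aws_sdk_sqs::config::BehaviorVersion::latest())
            .endpoint_url("http://localhost:4566") // LocalStack default
            .region(aws_sdk_sqs::config::Region::new("us-east-1"))
            .credentials_provider(aws_sdk_sqs::config::Credentials::new(
                "test", "test", None, None, "localstack",
            ))
            .build();
        Self {
            queue_url: queue_url.into(),
            sqs_client: aws_sdk_sqs::Client::from_conf(conf),
            timeout,
        }
    }
}
A synchronous constructor keeps test setup terse; swapping in aws_config::load_from_env() would pick up the endpoint and credentials from the environment instead.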
Usage in tests:
#[tokio::test]
async fn test_lead_conversion_emits_events() -> Result<()> {
    let collector = EventCollector::new("test-queue-url", Duration::from_secs(10));

    // Trigger lead conversion (creation of the test lead elided)
    convert_lead_to_opportunity(lead_id).await?;

    // Collect events
    let events = collector
        .collect_events(
            |e| e.entity_type == "Lead" || e.entity_type == "Opportunity",
            2, // Expect: LeadConverted + OpportunityCreated
        )
        .await?;

    // Verify event ordering and payload
    assert_eq!(events[0].event_type, "LeadConverted");
    assert_eq!(events[1].event_type, "OpportunityCreated");
    assert_eq!(events[1].payload["lead_id"], lead_id);

    Ok(())
}
What made this work:
  • Exponential backoff handles EventBridge → SQS delays
  • Predicate filtering allows flexible event matching
  • Helpful error messages on timeout (shows expected vs received)
  • Reusable across all 21 test scenarios

Testing Principles Established

What we learned: Work that requires consistent application of testing patterns across many cases is perfectly suited for AI.
Example: 21 E2E test scenarios following the same pattern in 1.5 days.
Rule: If testing requires “do the same validation many times consistently,” delegate it entirely to AI.
Anti-pattern: Using AI for exploratory testing (AI needs clear success criteria).
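To make that rule concrete, the scenarios shared one skeleton: trigger an action, collect the expected events, assert ordering. A hypothetical sketch of the repeated shape (the macro and helpers such as create_test_account are illustrative, not the project’s code):
/// Hypothetical macro capturing the shared shape of the entity
/// event-flow tests.
macro_rules! event_flow_test {
    ($name:ident, $entity:literal, $trigger:expr, $expected:expr) => {
        #[tokio::test]
        async fn $name() -> Result<()> {
            let collector =
                EventCollector::new("test-queue-url", Duration::from_secs(10));

            // Trigger the entity-specific action under test.
            $trigger.await?;

            // Collect exactly as many events as this entity should emit.
            let expected: Vec<&str> = $expected;
            let events = collector
                .collect_events(|e| e.entity_type == $entity, expected.len())
                .await?;

            // Verify delivery order matches the expected sequence.
            for (event, expected_type) in events.iter().zip(&expected) {
                assert_eq!(event.event_type.as_str(), *expected_type);
            }
            Ok(())
        }
    };
}

// One of the 21 instantiations might look like:
event_flow_test!(
    test_account_event_flow,
    "Account",
    create_test_account(),
    vec!["AccountCreated"]
);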
What we learned: Comprehensive testing that was “too expensive” manually becomes “obviously worth it” with AI.
Example: E2E event flow testing
  • Manual estimate: 2-3 weeks (not worth it)
  • With AI: 1.5 days (absolutely worth it)
Result: Testing quality improved not because AI writes better tests, but because comprehensive testing became economically viable.
Rule: Re-evaluate “not worth the effort” decisions when AI changes the effort equation.
What we learned: Builder wrote tests that covered code paths but didn’t map to requirements.
Practice: Every test must reference a requirement in the test name.
Example:
#[test]
fn test_account_name_validation_per_prd_section_2_1() {
    // Test specific requirement from PRD
}

Metrics

Test coverage created:
  • Event flow tests: 21 scenarios
  • Integration tests: 47 scenarios
  • Unit tests: 156 tests
  • Overall coverage: 89%
Critical issues prevented by comprehensive testing:
  1. Race condition in event collection (async timing)
  2. Cross-tenant event leak in negative tests (see the sketch below)
  3. Incomplete cleanup between tests
Value: Prevented at least 2 weeks of production debugging and customer issues.
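The cross-tenant leak above was caught by a negative test along these lines — a hypothetical sketch reusing the collector, the sketched Error type, and the assumed tenant_id field (create_account_for_tenant and the queue URL are invented placeholders):
#[tokio::test]
async fn test_events_never_leak_across_tenants() -> Result<()> {
    // Listen on tenant A's queue while acting as tenant B.
    let collector =
        EventCollector::new("tenant-a-queue-url", Duration::from_secs(10));

    create_account_for_tenant("tenant-b").await?;

    // Waiting for a tenant-b event on tenant A's queue must time out
    // with zero events received.
    let leaked = collector
        .collect_events(|e| e.tenant_id == "tenant-b", 1)
        .await;

    assert!(matches!(
        leaked,
        Err(Error::EventCollectionTimeout { received: 0, .. })
    ));
    Ok(())
}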