Most enterprise AI is contextually blind.
It knows what was true when the database was last queried. In a mortgage servicing platform processing 5,000 loan onboardings in six months, "last queried" means stale — and a stale context fed to an LLM produces confident answers about a state that no longer exists.
The fix is not a faster database query. It is a different architectural pattern: event-driven context assembly, where the AI agent subscribes to the same event stream that drives your business logic, and context is assembled from live events rather than polled state.
At MortgageIQ, this is the pattern behind SO — the Servicing Operations AI agent. Every loan boarding event on Kafka becomes context available to AI agents in real time. This post documents the pattern, the tradeoffs, and when it matters.
Why Request-Driven Context Fails at Scale
The standard pattern: a user sends a question → the application queries a database for relevant context → context is injected into the LLM prompt → response is generated.
This works in demos. It breaks under three production conditions:
1. High event velocity. At 5,000 loan onboardings in six months — roughly 27 per day, with processing spikes — a database poll every N seconds either misses events or hammers the database. Neither is acceptable when a loan officer is asking "what's the status of loan #84732?"
2. State that changes mid-session. A loan's status changes during an underwriting session. The user asks a follow-up question. The database-backed context reflects the state at session start, not the current state. The AI answers confidently about a status that changed two minutes ago.
3. Multi-service context assembly. A mortgage servicing platform has loan status in one service, document status in another, borrower communications in a third. Assembling context from three database queries per request, under load, is a latency problem that compounds as services are added.
Kafka solves all three — not because it's faster than a database, but because it inverts the model: instead of the AI pulling context when needed, context is pushed to the AI as events occur.
The Architecture
There are two context channels feeding the LLM simultaneously:
- Live context — assembled from Kafka events, keyed by loan ID in Redis, always current
- Static knowledge — retrieved from the RAG knowledge base (loan guidelines, policy documents)
The LLM sees both. Live context answers "what is happening with this loan right now." Static knowledge answers "what does the guideline say about this situation."
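In code terms, the agent folds both channels into a single input before prompting. The record below is an illustrative shape for that pairing, not a MortgageIQ type:

// Two channels, one prompt: live event-sourced state plus retrieved policy.
public record GroundedContext(
    IReadOnlyDictionary<string, string> Live,   // Redis projection, current to the last event
    IReadOnlyList<GuidelineSnippet> Knowledge); // RAG results: source name plus snippet

public record GuidelineSnippet(string SourceName, string Snippet);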
The Context Assembler
The Context Assembler is a .NET worker service that consumes Kafka topics and maintains a Redis key per loan. It is not an LLM. It is event processing — deterministic, fast, and cheap.
using System.Text.Json;
using Confluent.Kafka;
using Microsoft.Extensions.Hosting;

public class LoanContextConsumer : BackgroundService
{
    private readonly IConsumer<string, string> _consumer;
    private readonly IContextStore _contextStore;

    public LoanContextConsumer(IConsumer<string, string> consumer, IContextStore contextStore) =>
        (_consumer, _contextStore) = (consumer, contextStore);

    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        _consumer.Subscribe(["loan.boarded", "loan.status.changed",
                             "document.indexed", "borrower.contacted"]);

        while (!ct.IsCancellationRequested)
        {
            // Blocks until the next event arrives or shutdown is requested.
            var result = _consumer.Consume(ct);

            var loanEvent = JsonSerializer.Deserialize<LoanEvent>(result.Message.Value);
            if (loanEvent is null) continue; // skip malformed payloads

            // Merge the event into the loan's live-context projection; sliding 24h TTL.
            await _contextStore.MergeAsync(
                key: $"loan-context:{loanEvent.LoanId}",
                patch: loanEvent.ToContextPatch(),
                ttl: TimeSpan.FromHours(24)
            );
        }
    }
}
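The consumer delegates persistence to a context store. Below is a minimal sketch of that store over StackExchange.Redis; the IContextStore interface, the hash-per-loan layout, and string-valued patches are my assumptions, not the production implementation (list-valued fields such as documents_received would be stored as serialized JSON strings under this scheme):

using StackExchange.Redis;

public interface IContextStore
{
    Task MergeAsync(string key, IReadOnlyDictionary<string, string> patch, TimeSpan ttl);
}

public class RedisContextStore : IContextStore
{
    private readonly IDatabase _db;

    public RedisContextStore(IConnectionMultiplexer mux) => _db = mux.GetDatabase();

    public async Task MergeAsync(string key, IReadOnlyDictionary<string, string> patch, TimeSpan ttl)
    {
        // HSET merges field-by-field: each event overwrites only the fields it
        // carries, so the hash accumulates the loan's current state over time.
        var entries = patch.Select(kv => new HashEntry(kv.Key, kv.Value)).ToArray();
        await _db.HashSetAsync(key, entries);

        // Sliding expiry: every new event keeps the loan's context alive another window.
        await _db.KeyExpireAsync(key, ttl);
    }
}

The merge-don't-replace semantics matter: a document.indexed event should not wipe out the status field written by an earlier loan.status.changed event.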
ToContextPatch() normalizes the event into a flat context object:
{
  "loan_id": "84732",
  "status": "underwriting",
  "documents_received": ["W-2", "pay-stub", "bank-statement"],
  "documents_pending": ["appraisal"],
  "last_event": "document.indexed",
  "last_event_ts": "2026-03-23T14:22:00Z",
  "borrower_contacted": true,
  "contact_channel": "email"
}
When the AI agent handles a query for loan #84732, it reads this key from Redis. The context is current to the last Kafka event — typically within seconds of the originating action.
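For illustration, here is a plausible shape for ToContextPatch(). The LoanEvent fields are assumptions (the real event carries many more), and list-valued fields like documents_received would need accumulation logic this sketch omits; the point is that most of the event never reaches the patch:

// Illustrative event shape; the production event carries far more fields.
public record LoanEvent(
    string LoanId,
    string EventType,          // e.g. "document.indexed"
    string? Status,            // e.g. "underwriting"
    DateTimeOffset Timestamp)
{
    public Dictionary<string, string> ToContextPatch()
    {
        // Keep only what an agent needs to answer servicing questions;
        // everything else stays out of the prompt's token budget.
        var patch = new Dictionary<string, string>
        {
            ["last_event"] = EventType,
            ["last_event_ts"] = Timestamp.ToString("O")
        };
        if (Status is not null)
            patch["status"] = Status;
        if (EventType == "borrower.contacted")
            patch["borrower_contacted"] = "true";
        return patch;
    }
}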
Assembling the Grounded Prompt
The AI agent combines both context channels before calling the LLM:
// Live, event-sourced state for this loan: one Redis read, current to the last event.
var liveContext = await _contextStore.GetAsync($"loan-context:{loanId}");

// Static policy knowledge, retrieved per question from the RAG store.
var guidelines = await _retriever.QueryAsync(userQuestion, cancellationToken);

var systemPrompt = $"""
You are SO, the Servicing Operations AI agent at MortgageIQ.

Current loan status for loan #{loanId}:
{JsonSerializer.Serialize(liveContext, _jsonOptions)}

Relevant loan guidelines:
{string.Join("\n\n", guidelines.Select(g => $"[{g.SourceName}]\n{g.Snippet}"))}

Answer only from the provided loan status and guidelines.
Cite the guideline source when you reference policy.
If the loan status does not contain the information needed, say so.
""";
The model receives current state AND policy context in a single prompt. It answers "loan #84732 is in underwriting, pending the appraisal document" by reading live context — not by querying a database, not by hallucinating a status.
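To make the request path explicit, here is the same flow wrapped as a handler. BuildSystemPrompt is the template above factored into a method, and _llm is a stand-in for whatever chat-completion client you use; both names are hypothetical:

public async Task<string> AnswerAsync(string loanId, string userQuestion, CancellationToken ct)
{
    // 1. Live state: a single Redis read; no database query, no Kafka consumption here.
    var liveContext = await _contextStore.GetAsync($"loan-context:{loanId}");

    // 2. Static knowledge: RAG retrieval scoped to the user's question.
    var guidelines = await _retriever.QueryAsync(userQuestion, ct);

    // 3. One grounded prompt, one completion call.
    var systemPrompt = BuildSystemPrompt(loanId, liveContext, guidelines); // template shown above
    return await _llm.CompleteAsync(systemPrompt, userQuestion, ct);       // hypothetical client
}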
The Orkes Conductor Integration
At MortgageIQ, Kafka events trigger Orkes Conductor (Saga orchestration) workflows. The Context Assembler is one step in a larger Saga that coordinates the loan boarding process. This means the AI agent's live context is kept consistent with the Saga state — the same event that triggers the next step in loan processing also updates the AI agent's context.
The AI doesn't poll Conductor for workflow state. It reads the Redis projection that the Context Assembler maintains from Kafka events. This decouples the AI layer from the orchestration layer — SO has no dependency on Conductor's internal state representation.
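That decoupling shows up in the read path. Extending the RedisContextStore sketched earlier (still an assumption, not the production code), GetAsync touches nothing but the projection key:

public async Task<Dictionary<string, string>?> GetAsync(string key)
{
    // One hash read: no Conductor API, no Kafka consumer, no workflow-state
    // representation leaks into the AI layer.
    var entries = await _db.HashGetAllAsync(key);
    return entries.Length == 0
        ? null // the assembler has not seen this loan (or its context expired)
        : entries.ToDictionary(e => e.Name.ToString(), e => e.Value.ToString());
}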
When to Use This Pattern
Use event-driven context when:
- State changes faster than a reasonable poll interval
- Context spans multiple services (joining at query time is expensive)
- You need AI responses to reflect current state, not snapshot state
- Your platform already produces Kafka events for business logic
Stick with request-driven context when:
- State is stable (a knowledge base that updates weekly doesn't need Kafka)
- Query latency is acceptable and state doesn't change mid-session
- You're in Phase 1 of an AI integration and operational complexity matters
MortgageIQ Phase 4A uses request-driven RAG against a static knowledge base — because the knowledge base updates infrequently and loan-specific context isn't required. Event-driven context is the Phase 5 evolution, when SO needs to answer questions about specific in-flight loans.
What I've Seen Fail
1. Polling the database for context under load. Every AI query triggers N database reads. Works at 10 queries/minute. Falls over at 1,000. The database becomes the bottleneck for the AI layer — the opposite of the design intent.
2. Including the full event payload in the LLM prompt. Kafka events can be large; a mortgage servicing platform's loan status event can carry 50+ fields. Injecting the raw event into the prompt consumes the token budget, adds noise, and reduces answer quality. The Context Assembler's job is to normalize and reduce: the LLM should receive a structured summary, not a raw event.
3. No TTL on context store entries. Redis fills up with context for loans that closed six months ago. The Context Assembler writes but never expires. Set a TTL that matches your session window — 24 hours for active loan processing, longer for archival queries.
4. Context store as source of truth. Redis is a projection of Kafka events — it's a read model, not a write model. Teams that start writing loan state back to Redis (from the AI layer) have introduced a second source of truth. The database or the Saga state is authoritative. Redis is the read-optimized projection for AI context only.
5. Skipping the Context Assembler and reading Kafka directly from the AI agent. The AI agent then owns event deserialization, schema evolution, consumer group management, and offset tracking. It becomes a Kafka consumer AND an AI orchestrator. These are different concerns, and the Context Assembler is the seam that keeps them separated and the AI layer clean (a minimal consumer setup follows below).
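As a sketch of the consumer setup that failure #5 pushes into the assembler, here is a minimal Confluent.Kafka configuration for the worker; the broker address and group ID are illustrative:

using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "kafka:9092",            // illustrative address
    GroupId = "loan-context-assembler",         // one group, owned by the assembler
    AutoOffsetReset = AutoOffsetReset.Earliest, // replay retained history to rebuild the projection
    EnableAutoCommit = true                     // at-least-once is fine: re-merging a patch is idempotent
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();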
The Architecture Implication
Event-driven context delivery is not an AI pattern — it's a data engineering pattern applied to AI. The Kafka topics, the consumer group, the Redis projection, the TTL strategy — these are data engineering decisions that determine how fresh and how reliable your AI's context is.
The teams that get enterprise AI right treat context delivery as a first-class architectural concern, not an afterthought. The LLM is the last mile; 80% of the work is getting the right context to it, at the right time, in the right shape.
At MortgageIQ, that work happens in the event stream. At 5,000 loan onboardings in six months, it has to.