AI Governance & Safety June 19, 2026 · 39 min read

Building a Production Knowledge Graph at Lakeside Trust Bank: The Relationship-Banker Agent

Part 11c of the Knowledge Graph Practitioner's Guide. Closes the Lakeside Trust Bank trilogy. The same graph from Parts 11a and 11b now backs the relationship-banker agent. Covers how the CoALA four-layer memory model maps to Lakeside's named graphs, how three production workflows (portfolio decisions, client-meeting prep, advisor-facing summaries) map to three trust-tier policies (strict-tier-floor, tier-segregated, tier-explicit-citation), how the agent and the EU AI Act conformance assessor read the same substrate, what Lakeside got wrong on the way, the contract discipline that keeps the agent operable across quarterly FIBO releases, a cost-and-benefit roll-up that previews Appendix B, and a Do Next table that spans foundation, operational, governance, and agent layers.

By Vikas Pratap Singh
#knowledge-graph #reference-architecture #financial-services #ai-agents #relationship-banker #coala-memory #trust-tier-retrieval #eu-ai-act #graphrag


Recap: Where Parts 11a And 11b Left Off

Lakeside Trust Bank is the series’ hypothetical worked-example bank, a composite drawn from publicly documented failure patterns rather than a real client engagement; the numbers in this piece are illustrative, not measured telemetry from a single institution.

Part 11a introduced Lakeside Trust Bank ($75B in assets, US-headquartered with an EU subsidiary, 1.2M retail customers, 22,000 commercial counterparties, ~280 CDEs at 1:20 implementing-field ratio) and the Monday-morning Müller-family question that exposed the four-or-five-store anti-pattern. The bank’s response was a deliberate knowledge graph: a modular ontology imported from FIBO BE plus FIBO LOAN plus FIBO SEC plus FIBO FBC plus W3C Time plus PROV-O plus SKOS plus DCAT 3 plus SHACL with a thin lksb: in-house module under 100 classes, an 8-stage construction pipeline that converged Track 1 (R2RML over the cloud data warehouse plus an open table format plus the retail master-data platform) and Track 2 (the LLM-assisted extraction pipeline over the enterprise document store’s credit memos, KYC files, and advisor notes) at one identity, and a hybrid serving tier (the canonical RDF triple store’s SPARQL endpoint plus a property-graph traversal view) (see Appendix A for the specific tools). Part 11b absorbed the governance back office onto the same substrate: the OpenLineage-to-PROV-O bridge across ~480K Run events per day, the 280 CDEs as typed nodes (Counterparty Credit Exposure with 22 implementing fields across 5 systems as the worked example), four SPARQL templates that answer BCBS 239 Principle 3, ECB RDARR, GDPR Article 30 ROPA, and EU AI Act Article 10, the trust-tier-by-reporting-surface table across 9 surfaces, and the named-graph version chain across a quarterly FIBO release. This piece closes the trilogy.

Wednesday Morning, Q1 2026: The Banker, The Agent, And A 30-Minute Window

Two days after the Monday Müller-family reconciliation that took three weeks pre-KG and 180ms post-KG, the same relationship banker opened her Wednesday calendar to a 12:00 client meeting with the Müller patriarch in Chicago. She had a 30-minute window between her 11:30 internal review and the noon meeting to prepare. Her brief to the agent was three sentences: “Brief me on any material changes to the Müller-family exposure since the last quarterly review. Surface any open AML or credit issues that could come up. Suggest two open-ended questions about the family’s succession plan that I can use if the conversation slows.”

Pre-agent (and pre-KG), the same prep would have meant a one-hour Slack thread with senior bankers, a thirty-minute call with the AML team, and a hope that the credit officer happened to be at her desk. Pre-KG with an agent, the agent would have done what the Apex Capital research analyst agent did in Part 9: pulled relevant chunks from a flat vector index that mixed the audited gold counterparty profile with a draft pitch deck about a competitor acquisition that an associate had uploaded to the enterprise document store a year earlier and never finalized. The agent would have confidently cited the draft deck as if the acquisition had happened. The relationship banker would have walked into the noon meeting with a two-million-dollar talking point about a transaction the family had never done. That is the Apex Capital incident, replayed at Lakeside, with the same root cause: a retrieval surface that did not know which evidence was gold and which was bronze.

Post-KG and post-agent, the prep ran differently. The agent classified the three sub-tasks. “Material changes to exposure since the last quarterly review” was a portfolio-decision-adjacent question. The agent’s retrieval planner picked the strict-tier-floor policy and emitted a SPARQL query that filtered to gold-tier-only triples in the operational graph. The query traversed the same Müller-family subgraph from Monday’s reconciliation, applied a validFrom > 2026-01-01 filter from the bitemporal annotations, and returned three triples: a $1.2M increase in commercial loan exposure to the operating company in February, a new $0.3M trust-asset position created when the irrevocable trust was established (the same trust that Monday’s reconciliation had nearly missed), and a $0.4M reduction in the offshore vehicle’s securities position. Every triple carried a Source IRI, a validatedAgainstShapes link to the SHACL ShapeGraph that approved the underlying Run, and a tier annotation of gold.
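A minimal sketch of what the planner's strict-tier-floor query could look like when built from Python. Only `lksb:trustTier`, the `"gold"` literal, the `validFrom` filter, and the `validatedAgainstShapes` link come from the article; the namespace IRI, the `lksb:aboutFamily` and `lksb:source` predicates, the fact-node modeling, and the function names are assumptions made for illustration.

```python
from datetime import date

# Hypothetical namespace IRI; the article names only the lksb: prefix.
LKSB = "https://lakeside.com/ontology/lksb#"
_FORBIDDEN = set('<>"{}|^`\\ \t\n\r')


def _check_iri(iri: str) -> str:
    """Reject characters that cannot appear inside a SPARQL <IRI> reference."""
    if not iri or any(ch in _FORBIDDEN for ch in iri):
        raise ValueError(f"unsafe IRI: {iri!r}")
    return iri


def strict_tier_floor_query(family_iri: str, since: date, graph_iri: str) -> str:
    """Build a gold-only query for facts that became valid after `since`."""
    return f"""PREFIX lksb: <{LKSB}>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?fact ?source ?shapes ?validFrom
WHERE {{
  GRAPH <{_check_iri(graph_iri)}> {{
    ?fact lksb:aboutFamily <{_check_iri(family_iri)}> ;   # assumed predicate
          lksb:trustTier ?trustTier ;
          lksb:validFrom ?validFrom ;
          lksb:source ?source ;                            # assumed predicate
          lksb:validatedAgainstShapes ?shapes .
    FILTER(?trustTier IN ("gold"))
    FILTER(?validFrom > "{since.isoformat()}"^^xsd:date)
  }}
}}"""


query = strict_tier_floor_query(
    "https://lakeside.com/id/family/mueller",  # hypothetical entity IRI
    date(2026, 1, 1),
    "https://lakeside.com/graph/operational/2026-Q1",
)
```

The tier filter lives in the query text itself, so a silver or bronze fact cannot reach the prompt regardless of what the model is later instructed to do.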

The “open AML or credit issues” sub-task was a tier-segregated question. The agent’s planner emitted a query that returned gold-tier facts plus silver-tier signals in separately labeled context blocks. The gold-tier result was an empty set: no open AML investigations on any Müller-controlled entity. The silver-tier result, returned in a separate <silver-evidence> block in the prompt, was a recent advisor note (Track 2, extracted from a credit officer’s free-form memo) flagging a question about whether the irrevocable trust’s funding source had been documented to the AML team’s satisfaction. The agent surfaced the silver-tier signal explicitly, with a tier badge and a citation back to the advisor note’s IRI, and recommended that the relationship banker raise the question with the AML team before the noon meeting rather than during.

The “two open-ended questions about succession” sub-task was a tier-explicit-citation question. The agent retrieved across all tiers (gold trust-structure facts, silver advisor notes from prior client meetings, bronze-tier exploratory analyses by the bank’s wealth-planning research desk) and generated two suggested questions, each rendered with explicit tier citations: one based on a gold-tier fact about the irrevocable trust’s beneficiary structure, one based on a silver-tier note that the patriarch’s middle daughter had asked about charitable-giving structures eight months earlier. The relationship banker read both suggestions, understood the tier of each, and chose to use the gold-tier-anchored question as the conversation opener.

Wednesday’s prep took eleven minutes. The relationship banker walked into the noon meeting with a brief that had no fabricated facts, no laundered tiers, and an audit trail per cited claim that the bank’s compliance team could read after the fact. The same graph that answered the Monday operational question and the Tuesday governance question answered the Wednesday agent question. Three days, three consumers, one substrate. The agent layer is the third of the three production layers Part 11a’s foundation was designed to carry.

Three Workflows, Three Trust-Tier Policies

The relationship-banker agent at Lakeside has three production workflows that map cleanly onto the three trust-tier policies from Part 9. Each workflow has a different consequence model, and the policy is matched to the consequence.

| Workflow | Consequence model | Trust-tier policy | What it allows; what it blocks |
| --- | --- | --- | --- |
| Portfolio decision (credit expansion, exposure increase, suitability assertion, regulatory disclosure) | High-stakes; a wrong fact moves money or appears in a regulatory filing | Strict-tier-floor (gold only) | Allows: triples that have passed the SHACL gate, carry full provenance, and were generated by a gold-tier source. Blocks: any triple at silver, bronze, or quarantined tier. Fails closed if no gold-tier evidence exists |
| Client-meeting prep (open issues, recent changes, talking points, suggested questions) | Medium-stakes; the banker reads the brief and makes the judgment | Tier-segregated context (gold and silver in labeled blocks; bronze on request) | Allows: gold-tier facts in the primary context block, silver-tier signals in a labeled <silver-evidence> block, bronze-tier hints only if the planner explicitly asks. Blocks: tier hiding (silver evidence appearing in the gold block) |
| Advisor-facing summary (relationship review, monthly digest, post-meeting recap) | Low-stakes individually; high-stakes if patterns of misattribution accumulate | Tier-explicit-citation (every claim carries a tier badge) | Allows: any tier in the rendered summary. Blocks: any cited claim whose tier is below what the post-processing layer can verify against the source IRI |

The mapping is not a single switch on the agent. It is a runtime classification step that runs once per turn. The agent’s retrieval planner reads the user message, classifies it against the three workflow categories using a small fine-tuned classifier, and emits the SPARQL or Cypher query parameterized by the corresponding tier policy. Lakeside trained the classifier on a few thousand labeled internal banker queries and tracks precision and recall on a held-out set as the gating metric (the high bar is illustrative, not measured telemetry). A query that the classifier judges ambiguous (the banker asked a question that does not fit cleanly into one category, or the question spans more than one) defaults to the strictest applicable policy.
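The fallback rule can be sketched as follows. This is an illustration under stated assumptions, not Lakeside's classifier: the workflow keys, the 0.8 confidence threshold, and the strictness ordering (strict-tier-floor above tier-segregated above tier-explicit-citation) are all hypothetical.

```python
from enum import IntEnum


class TierPolicy(IntEnum):
    # Higher value means stricter; this ordering is an assumption.
    TIER_EXPLICIT_CITATION = 1
    TIER_SEGREGATED = 2
    STRICT_TIER_FLOOR = 3


WORKFLOW_POLICY = {
    "portfolio_decision": TierPolicy.STRICT_TIER_FLOOR,
    "client_meeting_prep": TierPolicy.TIER_SEGREGATED,
    "advisor_summary": TierPolicy.TIER_EXPLICIT_CITATION,
}


def select_policy(scores: dict, threshold: float = 0.8) -> TierPolicy:
    """Map per-workflow classifier confidences to a single tier policy.

    One workflow above the threshold: use its policy.
    Several above the threshold (the query spans categories): strictest of those.
    None above the threshold (no clean fit): strictest of every scored workflow.
    """
    confident = [w for w, s in scores.items() if s >= threshold]
    candidates = confident or list(scores)
    return max(WORKFLOW_POLICY[w] for w in candidates)


clean = select_policy({"client_meeting_prep": 0.95, "advisor_summary": 0.10})
spanning = select_policy({"portfolio_decision": 0.85, "client_meeting_prep": 0.90})
unclear = select_policy({"client_meeting_prep": 0.50, "advisor_summary": 0.40})
```

Encoding the policies as an ordered enum keeps "strictest applicable" a one-line `max`, which makes the fail-safe direction easy to review.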

The classifier itself is a procedural-memory artifact at Lakeside, not a hard-coded if/else block. It lives in the skill subgraph (described below) with a version IRI, a training-data provenance chain that the Article 10 SPARQL template from Part 11b reads against, a quality measurement on every release, and an owl:priorVersion link to the previous classifier release. When a misclassification incident is reported (a portfolio-decision query was routed to tier-segregated, allowing silver-tier evidence into a high-stakes brief), the incident becomes a labeled training example for the next classifier release, the classifier is retrained, the new version IRI is minted, and the prior version is deprecated through the Part 8 migration playbook. The classifier is operated like a model, not a rule.

What this looks like in practice. Most agent platforms in 2026 still treat retrieval policy as a system-prompt instruction (“when you answer regulatory questions, only use authoritative sources”). The instruction is not load-bearing. The agent will obey it sometimes and ignore it other times, and there is no way to audit which times. The Lakeside discipline is to make the policy a first-class field on the retrieval call, enforce it in the SPARQL query, and verify it in the response post-processing. Three layers, mechanical enforcement, audit trail per query. If your agent’s tier policy lives in a system prompt, the next incident is a question of when, not whether.

The CoALA Four-Layer Memory Model At Lakeside

The agent’s memory architecture follows the CoALA four-layer model from Part 9: working, episodic, semantic, procedural. The Lakeside specialization is that all four layers share the same graph substrate, partitioned by named graph, with the same identity discipline and the same provenance contract that the operational and governance layers already use.

| CoALA layer | Lakeside named graph | What it holds | Tier profile |
| --- | --- | --- | --- |
| Working memory | The current prompt context (not persisted) | The current banker query, the current retrieval result set, the current scratchpad reasoning | Tier per source; assembled per turn from semantic and episodic recall |
| Episodic memory | https://lakeside.com/graph/episodic/banker/{bankerId}/{period} | Per-banker conversation history with the agent, prior client meeting recaps the banker filed, prior agent decisions and recommendations the banker accepted or overrode | Silver-tier typical; bronze for unverified banker free-text annotations; gold for facts the banker explicitly attested to |
| Semantic memory | The operational graph from Part 11a plus the governance triples from Part 11b | Counterparty entities, exposures, beneficial ownership, transaction risk patterns, CDEs, lineage, regulatory crosswalks | Gold-tier dominant; silver for Track 2 extractions; bronze for legacy migrations not yet validated |
| Procedural memory | https://lakeside.com/graph/procedural/agent/{agentVersion} | Skill nodes (account-review skill, meeting-prep skill, suitability-check skill, BCBS-aware-response-framing skill, tier-policy-classifier skill) with preconditions, postconditions, success metrics, training-data provenance, and version IRIs | Gold-tier-only at production release; silver during development; quarantined when a skill regressed in evaluation |

Each layer reads and writes through the same retrieval planner. The planner does not know about CoALA labels at runtime; it knows about named graphs and trust tiers. The CoALA layering is the team’s mental model for what each named graph contains; the engineering enforcement is at the graph level.
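To make the "named graphs, not CoALA labels" point concrete, here is a small sketch that resolves the graphs a turn may read from. The episodic and procedural IRI patterns come from the table; the operational and governance patterns are inferred from the sample IRIs in the figure description, and the function names are hypothetical.

```python
BASE = "https://lakeside.com/graph"


def graphs_for_turn(banker_id: str, period: str, agent_version: str) -> dict:
    """Return the named graphs behind each CoALA layer for one turn."""
    return {
        "working": [],  # the prompt itself; nothing is persisted
        "episodic": [f"{BASE}/episodic/banker/{banker_id}/{period}"],
        "semantic": [
            f"{BASE}/operational/{period}",  # pattern assumed from a sample IRI
            f"{BASE}/governance/{period}",   # pattern assumed from a sample IRI
        ],
        "procedural": [f"{BASE}/procedural/agent/{agent_version}"],
    }


def from_named_clauses(graph_iris) -> str:
    """Render SPARQL FROM NAMED clauses for the chosen graphs."""
    return "\n".join(f"FROM NAMED <{iri}>" for iri in graph_iris)


layers = graphs_for_turn("b-3271", "2026-Q1", "relationship-banker-v3")
clauses = from_named_clauses(layers["episodic"] + layers["semantic"])
```

The layer names exist only as dictionary keys for the team's benefit; what reaches the query engine is a plain list of graph IRIs.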

Working memory is straightforward. The agent’s prompt for the Wednesday meeting prep contained the banker’s query, three retrieval result blocks (gold-tier exposure changes, silver-tier AML signal, gold-and-silver succession context), and a system instruction that anchored the response template. Working memory is not persisted past the turn. It does not need to be; the contributing source IRIs are.

Episodic memory is the most operationally novel of the four at Lakeside. Every banker has a per-banker named graph that accumulates the agent’s conversation history with that banker, the banker’s filed client-meeting recaps, and the agent’s prior decisions and recommendations. Episodic facts carry their own provenance (the conversation timestamp, the banker IRI, the client IRI, the source agent version) and a tier annotation that defaults to silver. When the banker explicitly attests to an episodic fact (“yes, the patriarch did say that in the November meeting”), the fact is promoted to gold; when an episodic fact is contradicted by a later semantic update (the operational graph is corrected and the underlying counterparty fact is restated), the affected episode is flagged stale rather than allowed to be recalled as if still true. The invalidation flow is the answer to the Part 9 stale-episodic-recall failure mode.
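The promotion and invalidation rules can be sketched with an in-memory model. This is an illustration only: the `EpisodicFact` fields and function names are hypothetical, and a production version would write these state changes to the per-banker named graph rather than mutate Python objects.

```python
from dataclasses import dataclass


@dataclass
class EpisodicFact:
    iri: str
    depends_on: frozenset  # IRIs of semantic facts the episode relied on
    tier: str = "silver"   # episodic facts default to silver
    stale: bool = False


def attest(fact: EpisodicFact) -> None:
    """Banker explicitly confirms the fact: promote it to gold."""
    fact.tier = "gold"


def flag_stale(episodes, restated_iri: str):
    """Mark every episode that relied on a restated semantic fact."""
    flagged = [e for e in episodes if restated_iri in e.depends_on and not e.stale]
    for episode in flagged:
        episode.stale = True
    return flagged


def recallable(episodes):
    """Only non-stale episodes are eligible for recall."""
    return [e for e in episodes if not e.stale]


# Hypothetical IRIs for illustration.
recap = EpisodicFact("ep:recap-nov", frozenset({"fact:trust-beneficiaries"}))
note = EpisodicFact("ep:note-feb", frozenset({"fact:loan-exposure"}))
attest(recap)
flagged = flag_stale([recap, note], "fact:loan-exposure")
```

Staleness is tracked separately from tier here, so an attested gold episode can still be withheld from recall once the semantic fact underneath it is restated.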

Semantic memory is the heart. The operational graph from Part 11a (counterparties, exposures, beneficial ownership, transaction risk) plus the governance triples from Part 11b (CDEs, lineage, policies, regulatory crosswalks) make up the semantic substrate. The agent reads from semantic memory through the same SPARQL endpoint that the operational and governance consumers use. Semantic memory is the only layer where the agent reads gold-tier evidence in the strict-tier-floor policy; the other three layers contribute lower-tier signal where the policy permits.

Procedural memory is the layer most teams underbuild in 2026. At Lakeside, the agent’s skills live as typed nodes in a per-version skill subgraph: an account-review skill (preconditions: the banker has authority over the counterparty; postconditions: the brief covers exposure, recent changes, AML, credit; success metric: banker acceptance rate above 90%); a meeting-prep skill; a suitability-check skill; a BCBS-aware-response-framing skill (which constrains the agent’s outputs to the format that the BCBS 239 reporting team will accept if the agent’s brief becomes part of an audit trail); the tier-policy-classifier skill. Each skill carries a training-data provenance chain back to the operational and governance graphs, which is why the Article 10 SPARQL template can answer training-data provenance for the agent without a separate compliance project. The skill subgraph is versioned per agent release; pinning the agent to a version IRI is how Lakeside reproduces a past decision the agent made.

A diagram showing the CoALA four-layer memory model mapped to Lakeside's named graphs as a vertical stack. Top of the stack labeled "Working memory (the prompt; not persisted)" shown as a thin slate band; an arrow from below indicates "selective injection per turn." Below that, a deep teal layer labeled "Episodic memory (per-banker named graph)" with a sample IRI "https://lakeside.com/graph/episodic/banker/b-3271/2026-Q1" inset; visible content includes "prior client meeting recaps," "agent conversation history," "banker overrides and acceptances"; a tier badge reads "silver typical; gold on banker attestation; stale-flagged on semantic update." Below that, a deeper teal layer (the largest in the stack) labeled "Semantic memory (the operational graph from Part 11a + governance triples from Part 11b)" with sample IRIs "https://lakeside.com/graph/operational/2026-Q1" and "https://lakeside.com/graph/governance/2026-Q1"; visible content includes "Counterparty entities, exposures, beneficial ownership, CDEs, lineage, policies, regulatory crosswalks"; a tier badge reads "gold dominant; silver for Track 2; bronze for legacy migrations." Bottom of the stack, an amber layer labeled "Procedural memory (versioned skill subgraph)" with a sample IRI "https://lakeside.com/graph/procedural/agent/relationship-banker-v3"; visible content includes "account-review skill, meeting-prep skill, suitability-check skill, BCBS-aware-response-framing skill, tier-policy-classifier skill, each with preconditions, postconditions, success metrics, training-data provenance"; a tier badge reads "gold at production; silver during development; quarantined on regression." A vertical column on the right side spanning all four layers labeled "Same identity discipline + same provenance contract + same trust tiers" with arrows pointing into each layer indicating the unified substrate. 
A horizontal bar across the bottom labeled "Retrieval planner (reads from all four layers; tier policy per workflow)" with three branches up into the stack indicating per-turn retrieval composition. Caption: "four CoALA memory layers, four named-graph partitions, one substrate, one identity, one provenance contract; the agent reads through the planner like the regulator reads through a SPARQL template."

Trust-Tier-Aware Retrieval In Production

The Wednesday brief above is the worked example of Part 9’s trust-tier-aware retrieval pattern operationalized at Lakeside. The pattern enforces the policy at three layers (retrieval planner, prompt assembly, response post-processing). All three are required; single-layer enforcement leaks. The Lakeside specialization is that each layer reads the tier from the same lksb:trustTier predicate that the operational queries, the governance reports, and the SHACL gate already populate.

| Enforcement layer | Lakeside implementation | What it catches |
| --- | --- | --- |
| Retrieval planner | The planner emits SPARQL with FILTER(?trustTier IN (...)) matching the workflow policy. Bronze content is structurally excluded for gold-only queries; silver content appears in a separate result set for tier-segregated queries; tier is projected as a column for tier-explicit queries | Tier laundering at retrieval (the Apex Capital failure mode); a bronze fact never reaches the prompt for a gold-only query |
| Prompt assembly | If lower-tier content reaches the prompt under tier-segregated policy, it is wrapped in a labeled XML block (<gold-evidence>, <silver-evidence>, <bronze-evidence>) with a system instruction that the model must weight gold over silver and surface the tier on every cited fact | Tier hiding inside the prompt (a silver fact appearing in the gold block by mistake); the model conflating tiers without surfacing them in the response |
| Response post-processing | The agent’s response is parsed for cited facts; each cited fact’s IRI is looked up in the source graph; the fact’s actual tier is compared to the workflow policy floor; the response is blocked if any cited fact’s actual tier is below the policy floor | The agent reading bronze and emitting gold-sounding citations; a hallucinated citation IRI that does not resolve to a real fact in the graph; tier promotion that did not happen at the source |
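The prompt-assembly layer can be sketched as a function that places each retrieved fact in the block for its own tier and fails closed on anything the policy does not permit. The block names come from the table; the `<fact iri="...">` inner element, the tuple layout, and the function name are assumptions.

```python
from xml.sax.saxutils import escape, quoteattr

BLOCK_ORDER = ("gold", "silver", "bronze")


def assemble_evidence(facts, allowed):
    """Wrap retrieved facts in per-tier blocks.

    facts: iterable of (iri, tier, text); allowed: tiers the policy permits.
    """
    blocks = {tier: [] for tier in BLOCK_ORDER if tier in allowed}
    for iri, tier, text in facts:
        if tier not in blocks:
            # Fail closed rather than silently dropping or relabeling.
            raise ValueError(f"tier {tier!r} not permitted for {iri}")
        blocks[tier].append(f"  <fact iri={quoteattr(iri)}>{escape(text)}</fact>")
    lines = []
    for tier, rows in blocks.items():
        lines += [f"<{tier}-evidence>", *rows, f"</{tier}-evidence>"]
    return "\n".join(lines)


prompt_context = assemble_evidence(
    [("https://lakeside.com/fact/note-1", "silver",
      "Trust funding source & AML documentation question")],  # hypothetical fact
    allowed={"gold", "silver"},
)
```

An empty `<gold-evidence>` block is still emitted in this sketch, so the model sees that the gold result was an empty set and not merely omitted.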

The third layer is the one that catches the failures the first two miss. A planner that filters correctly can still emit a query whose result the agent reorders; a prompt that wraps tiers correctly can still be misread by a model under load; the response post-processing is the final mechanical check. At Lakeside, the post-processing service runs as a sidecar to the agent runtime; every agent response transits the sidecar before it returns to the banker. The sidecar adds a small latency budget to each response (illustrative, not instrumented; well under the operational latency budgets Lakeside cares about) and blocks a small number of responses across the relationship-banker fleet for tier-policy violations. The blocked responses are sent to a queue that the AI engineering team reviews; the queue is the leading indicator that either the planner classifier needs retraining or the policy itself needs adjustment.
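A minimal sketch of the sidecar check follows. The `[cite:IRI]` citation syntax, the tier ranking, and the `tier_of` dictionary standing in for a lookup against the source graph are all assumptions; the article does not specify how citations are encoded in the response.

```python
import re
from dataclasses import dataclass

TIER_RANK = {"quarantined": 0, "bronze": 1, "silver": 2, "gold": 3}
CITATION = re.compile(r"\[cite:(\S+?)\]")  # hypothetical citation syntax


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    violations: tuple


def verify_response(text: str, tier_of: dict, floor: str) -> Verdict:
    """Block the response if any cited fact is unresolved or below the floor.

    tier_of maps fact IRI -> actual tier as recorded in the source graph.
    Only cited facts are checked; uncited claims are out of scope here.
    """
    violations = []
    for iri in CITATION.findall(text):
        actual = tier_of.get(iri)
        if actual is None:
            violations.append(f"unresolved citation: {iri}")
        elif TIER_RANK[actual] < TIER_RANK[floor]:
            violations.append(f"{iri} is {actual}, below floor {floor}")
    return Verdict(not violations, tuple(violations))


tiers = {
    "https://lakeside.com/fact/1": "gold",    # hypothetical fact IRIs
    "https://lakeside.com/fact/2": "silver",
}
ok = verify_response("Exposure rose [cite:https://lakeside.com/fact/1].", tiers, "gold")
bad = verify_response(
    "A [cite:https://lakeside.com/fact/2] and B [cite:https://lakeside.com/fact/404]",
    tiers,
    "gold",
)
```

The violation strings are what a review queue would read, which is why the sketch records the reason for each block instead of returning a bare boolean.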

The mechanism is the same one the Microsoft VeriTrail provenance work generalizes across multi-step agent workflows. Lakeside’s specialization is that the verification target is the trust tier, not just the existence of a citation. A citation that resolves to a real fact in the graph is not enough; the cited fact must have the tier the workflow allowed.

A diagram showing Lakeside's trust-tier policy enforcement as three workflows feeding a three-layer enforcement pipeline. Left side: three workflow chips stacked vertically. Top chip in deep teal labeled "Portfolio decision (credit expansion, exposure increase, regulatory disclosure)" with annotation "policy: strict-tier-floor (gold only)." Middle chip in slate labeled "Client-meeting prep (open issues, recent changes, talking points)" with annotation "policy: tier-segregated context (gold + silver in labeled blocks)." Bottom chip in amber labeled "Advisor-facing summary (relationship review, monthly digest)" with annotation "policy: tier-explicit-citation (every claim carries a tier badge)." Center: a horizontal three-column band labeled "Enforcement at three layers." Column 1 in deep blue labeled "Retrieval planner (SPARQL FILTER on lksb:trustTier)" with a small SPARQL fragment annotation FILTER(?trustTier IN (kg:gold)). Column 2 in violet labeled "Prompt assembly (labeled XML blocks: gold-evidence, silver-evidence, bronze-evidence)" with a small annotation <gold-evidence>... </gold-evidence>. Column 3 in green labeled "Response post-processing sidecar (verify cited fact tier against policy)" with a small annotation "blocks a small number of responses; small latency add (illustrative, not instrumented)." Each workflow chip on the left has three thin arrows fanning into the three columns, with arrowhead colors matching the workflow chip color (teal arrows from portfolio decision, slate from client prep, amber from advisor summary). Each column emits a single arrow on the right side merging into a single output box labeled "Verified response with tier-explicit audit trail." Below the merged box, a small green annotation reads "every cited fact resolves to a real IRI; every IRI's tier is verified against the workflow policy; mismatches block the response." 
A small dotted lane below the diagram in red shows the unprotected path: workflow → flat retrieval → unlabeled prompt → unverified response → "Apex Capital incident" red X. Caption: "the protected path is three layers; the dotted lane is what happens when any one layer is skipped."

The Wednesday Retrieval Flow With And Without Tier Discipline

The Wednesday brief was the protected path. The unprotected path (the Apex-style failure replayed at Lakeside) is the dotted lane. The two flows side by side make the architectural choice concrete. The same banker query, the same underlying corpus, two retrieval architectures, two outcomes.

The unprotected flow runs the banker query against a flat vector index over the same corpus the operational graph ingests. The “irrevocable trust” Track 2 fact (gold-tier in the operational graph because it was extracted, resolved, validated, and attested) and the “competitor acquisition” draft pitch deck (bronze in any sane tiering, but flat in the vector index) come back as similarly-relevant chunks. The agent reads both and produces a brief that mixes them. The banker walks into the noon meeting with a fabricated acquisition. The post-mortem at Lakeside would take three weeks, surface in a regulatory exam, and require a Material Weakness disclosure. The Apex Capital fix that would have prevented the original incident is the same fix Lakeside implemented at design time.

The protected flow uses the three-layer enforcement above. The portfolio-decision sub-task hits strict-tier-floor and returns only gold-tier triples. The client-meeting-prep sub-task hits tier-segregated and returns gold facts plus a labeled silver-tier signal about the AML question on the irrevocable trust funding source. The advisor-facing-summary sub-task hits tier-explicit-citation and renders every cited succession-planning suggestion with its source tier visible. No fabricated fact reaches the brief. The audit trail per cited claim survives a regulatory question six months later.

A diagram showing two retrieval flows side by side for the same banker query. Left panel labeled "Unprotected (Apex-style)" with five horizontally stacked stages: (1) banker query "brief me on the Müller meeting," (2) flat vector index over the entire corpus including audited gold profiles and unfinalized document-store pitch decks, (3) chunk return ranked by cosine similarity with no tier metadata visible, (4) prompt assembled with both gold and bronze chunks treated identically, (5) agent response that confidently cites the bronze pitch deck as if the competitor acquisition had happened. The five stages use red dashed borders. A red X icon at the right with annotation "fabricated fact reaches the banker; Apex incident replayed at Lakeside; regulatory exam finding; Material Weakness disclosure trajectory." Right panel labeled "Protected (Lakeside production)" with eight stages: (1) banker query (same), (2) tier-policy classifier categorizes three sub-tasks (portfolio: strict-tier-floor; meeting prep: tier-segregated; succession: tier-explicit-citation), (3) retrieval planner emits three SPARQL queries with FILTER(?trustTier IN (...)) per policy, (4) graph traversal returns triples with full provenance per fact, (5) prompt assembly wraps results in labeled blocks (gold-evidence, silver-evidence) with tier badges, (6) agent response generated with tier-explicit citations, (7) response post-processing sidecar verifies each cited fact's tier against the workflow policy, (8) verified response delivered to the banker. The eight stages use green solid borders. A green check icon at the right with annotation "no fabricated facts; every cited claim auditable; brief ready in 11 minutes." Across the bottom of both panels, a small grey legend reads "same banker, same query, same corpus; the architectural difference is the tier discipline at retrieval, prompt, and post-processing." 
Caption: "the unprotected lane is what happens when any one of the three enforcement layers is skipped; the protected lane is what survives a regulatory exam."

The Agent And The Regulator Read The Same Substrate

Lakeside classified the relationship-banker agent as a high-risk AI system under the EU AI Act Article 6 classification rules, because it influences credit decisions and surfaces in client-facing interactions. High-risk AI obligations were originally set to begin enforcement on August 2, 2026 per the implementation timeline. As of mid-2026, that date remains legally active, but the EU’s provisional Digital Omnibus agreement of May 7, 2026 would postpone high-risk Annex III obligations, including Article 10 data governance, to December 2, 2027 (embedded Annex I systems to August 2, 2028), pending formal adoption and publication in the Official Journal. Either way, Lakeside’s EU subsidiary is in scope. Three Articles in particular shape the agent’s design.

| Article | Obligation | How Lakeside answers from the same graph |
| --- | --- | --- |
| Article 10 (Data and Data Governance) | Training, validation, and test data must be relevant, representative, free of errors, and complete; data sources, collection, and preparation must be documented | The Article 10 SPARQL template from Part 11b traverses ?model prov:wasDerivedFrom+ ?trainingDataset and filters to lksb:trustTier "gold" for the production training corpus. Every training dataset has the seven-field provenance contract from Part 7 and a quality measurement from the SHACL gate. The agent’s procedural-memory skills (including the tier-policy classifier) are model artifacts whose training data is queryable through the same template |
| Article 12 (Record-Keeping) | High-risk AI systems must automatically log events relevant to identifying risks and post-market monitoring | Every agent response is logged into an append-only named graph https://lakeside.com/graph/agent-logs/relationship-banker/{periodId}; the log entries are PROV-O Activities with prov:wasAssociatedWith linking to the agent version IRI, prov:used linking to every retrieved triple’s IRI, and the response post-processing verdict. Article 12 audits read against this graph; no second logging system |
| Article 14 (Human Oversight) | High-risk AI systems must allow effective human oversight, including the ability for a human to override or disregard the system’s output | The banker’s accept/override decision per agent recommendation is captured into the per-banker episodic graph as a typed event; rate of acceptance and rate of override per skill per quarter are queryable; a skill that is being overridden more than 30% of the time is flagged for retraining |
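The Article 14 retraining flag reduces to a per-skill ratio over the accept/override events. The sketch below computes it over plain tuples; in production this would presumably be a SPARQL aggregate over the episodic graphs, and the event shape and function name here are assumptions.

```python
from collections import Counter


def skills_to_retrain(events, threshold_pct: int = 30):
    """Return skills whose override rate exceeds threshold_pct.

    events: iterable of (skill, decision), decision in {"accept", "override"}.
    Integer arithmetic avoids float rounding at the exact boundary.
    """
    total, overridden = Counter(), Counter()
    for skill, decision in events:
        total[skill] += 1
        if decision == "override":
            overridden[skill] += 1
    return sorted(
        skill for skill in total
        if overridden[skill] * 100 > threshold_pct * total[skill]
    )


quarter = (
    [("meeting-prep", "override")] * 4 + [("meeting-prep", "accept")] * 6
    + [("account-review", "override")] * 3 + [("account-review", "accept")] * 7
)
flagged_skills = skills_to_retrain(quarter)
```

A skill at exactly 30% is not flagged, matching the "more than 30%" wording in the table.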

The structural insight is that the regulator reading the agent and the regulator reading the data lineage are reading the same graph through different SPARQL templates. The Article 10 template that Part 11b showed for the relationship-banker-agent-v3 model is the same template the EU AI Act conformance assessor will run when enforcement begins. The Article 12 logs land in a named graph the regulator can read directly. Article 14 oversight is not a separate workflow; it is a typed event in the per-banker episodic graph. The fragmentation that would have been the default (agent telemetry in one system, training-data lineage in another, oversight logs in a third) has been collapsed into the same substrate the operational front office and the governance back office already use.
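As an illustration of an Article 12 log entry landing in the named graph, the sketch below emits one agent response as a list of quads. PROV-O links an Activity to its Agent with `prov:wasAssociatedWith`, so the sketch uses that property; `prov:used` and `prov:endedAtTime` are standard PROV-O terms. The `lksb:postProcessingVerdict` predicate, the `lksb:` namespace IRI, and the example IRIs are hypothetical.

```python
from datetime import datetime, timezone

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LKSB = "https://lakeside.com/ontology/lksb#"  # hypothetical namespace IRI


def log_entry_quads(activity_iri, agent_version_iri, used_iris,
                    verdict, ended_at, period_id):
    """Describe one agent response as (subject, predicate, object, graph) quads."""
    graph = f"https://lakeside.com/graph/agent-logs/relationship-banker/{period_id}"
    quads = [
        (activity_iri, RDF_TYPE, PROV + "Activity", graph),
        (activity_iri, PROV + "wasAssociatedWith", agent_version_iri, graph),
        (activity_iri, PROV + "endedAtTime", ended_at.isoformat(), graph),
        (activity_iri, LKSB + "postProcessingVerdict", verdict, graph),  # assumed
    ]
    # One prov:used statement per retrieved fact, so audits can trace inputs.
    quads += [(activity_iri, PROV + "used", iri, graph) for iri in used_iris]
    return quads


entry = log_entry_quads(
    "https://lakeside.com/id/agent-run/0001",  # hypothetical activity IRI
    "https://lakeside.com/graph/procedural/agent/relationship-banker-v3",
    ["https://lakeside.com/fact/1", "https://lakeside.com/fact/2"],
    "allowed",
    datetime(2026, 3, 4, 11, 41, tzinfo=timezone.utc),
    "2026-Q1",
)
```

Because the agent version IRI is the same one that names the skill subgraph, an auditor can join a logged response to the exact skills and training-data provenance that produced it.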

KEY INSIGHT: when the regulator’s question about your agent is answerable as a SPARQL template against the same graph that answers the regulator’s question about your risk submission, you have crossed the threshold the four-or-five-store anti-pattern from Part 10 named, applied to AI compliance specifically. Most banks will hit high-risk enforcement (currently August 2, 2026, with a proposed deferral to December 2, 2027 under review) with a separate AI-compliance store stood up alongside their Data Governance store; Lakeside hits it with one substrate. The cost of the separation will compound through every subsequent AI regulation.

What Lakeside Got Wrong On The Way

The architecture above is the eighteen-month outcome. The path was not clean. Four mistakes are worth naming because each is the failure mode the next bank will reach for first.

Month 4: ontology bloat. The first six months of any KG program are when the in-house ontology grows fastest. Lakeside’s lksb: module drifted well past a hundred classes by the end of month four (the figures here are illustrative), above the under-100-class ceiling Part 11a’s discipline named. The drift was driven by good-faith proposals from domain experts who wanted their specific concept modeled exactly the way they thought about it, even when a FIBO class would have served. The retraction took two weeks: the ontology team reviewed every lksb: class against the criterion “does this subclass a FIBO class, or does it model a strictly internal concept that FIBO does not cover,” retired the classes that failed the test, and republished the ontology at version 2.5.1. The cost of the retraction was small because it happened at month 4; a peer bank that let drift run for two years had to spend a quarter consolidating over a thousand in-house classes. The lesson Lakeside encodes: review the in-house module ceiling monthly during the first year; treat any growth above the cap as a discipline incident, not a feature request.
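The month-4 review criterion is mechanical enough to sketch. The Python below is a minimal illustration, assuming each in-house class has already been annotated with its FIBO superclass (if any) and with a reviewer's judgment on whether it is strictly internal. The record shape and function names are hypothetical, and the sketch reads "under 100" strictly.

```python
from dataclasses import dataclass
from typing import Optional

CLASS_CEILING = 100  # the in-house module is meant to stay under this many classes


@dataclass(frozen=True)
class InHouseClass:
    iri: str
    fibo_parent: Optional[str]  # IRI of the FIBO superclass, or None
    strictly_internal: bool     # reviewer's call: FIBO has no concept that covers it


def review_module(classes):
    """Split classes into (keep, retire) using the review criterion."""
    keep, retire = [], []
    for cls in classes:
        if cls.fibo_parent is not None or cls.strictly_internal:
            keep.append(cls)
        else:
            retire.append(cls)
    return keep, retire


def is_discipline_incident(class_count, ceiling=CLASS_CEILING):
    """True when the module is no longer under the ceiling."""
    return class_count >= ceiling
```

Running `is_discipline_incident` on the module's class count each month is one way to turn the ceiling into an alert rather than a retrospective finding.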

Month 7: IRI minting drift. Lakeside’s IRI discipline (one mint authority for the whole bank, IRIs minted only after entity resolution completes) was found to have been quietly violated when the commercial onboarding team stood up its own minter for a fast-tracked client-onboarding workflow. The team’s justification was speed: routing every new commercial counterparty through the central minter added 200ms to onboarding, and the team needed to hit an SLA. For four months, commercial counterparties had two IRIs (the onboarding-team minter’s and the central minter’s) that were silently reconciled by the ER pipeline, mostly correctly. The drift was caught when an AML investigator queried for a counterparty’s beneficial ownership and got two distinct subgraphs that had to be hand-reconciled. The retraction took three weeks: the onboarding-team minter was retired, the central minter was optimized to meet the SLA, and the orphan IRIs from the four-month period were reconciled programmatically. The lesson Lakeside encodes: the central minter is a single point of failure; treat it with the same investment as the central authentication system; never let a downstream team build a parallel one for SLA reasons.
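The two rules in that discipline (one authority, mint only after entity resolution) can be shown in a few lines. This is a minimal in-memory Python sketch, not Lakeside's minter: the namespace path, the class name, and the use of an ER cluster id as the key are all assumptions made for illustration.

```python
import uuid

# Hypothetical namespace; the article does not give the entity IRI pattern.
BASE = "https://lakeside.com/id/counterparty/"


class CentralMinter:
    """Single mint authority: one IRI per resolved entity, minted only after ER."""

    def __init__(self):
        self._by_cluster = {}  # ER cluster id -> minted IRI

    def mint(self, er_cluster_id):
        if not er_cluster_id:
            raise ValueError("refusing to mint before entity resolution completes")
        # Idempotent: asking again for the same resolved entity returns the same IRI.
        if er_cluster_id not in self._by_cluster:
            self._by_cluster[er_cluster_id] = BASE + uuid.uuid4().hex
        return self._by_cluster[er_cluster_id]
```

A second minter breaks the idempotence guarantee because it keeps its own map, which is how one counterparty ends up with two IRIs.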

Month 11: SHACL gate over-rejection. The SHACL gate from Part 11a’s stage 7 was tuned conservatively at first: any Track 2 extraction with non-standard date formats (the bank’s credit memos used at least seven date formats across the historical corpus), missing required attributes, or unexpected predicate cardinalities was quarantined. The intent was right; the calibration was wrong. By month 11, the quarantine queue had tens of thousands of unprocessed extractions (the figure is illustrative), mostly legitimate facts from credit memos that the gate had rejected for cosmetic reasons. The agent’s silver-tier signal was not just incomplete; it was systematically biased toward credit memos that happened to use the modal date format. The retraction took six weeks: the SHACL shapes were relaxed (date-format normalization moved upstream into a Track 2 preprocessing step; cardinality constraints were softened where the underlying domain allowed multiple values), the quarantine queue was reprocessed, and a quarantine-review workflow was added so that future quarantined extractions go to a human reviewer rather than accumulating silently. The lesson Lakeside encodes: a SHACL gate that quarantines silently is an invisible ceiling on the agent’s silver-tier signal; the queue must be visible, monitored, and bounded.
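The two fixes, upstream date normalization and a bounded queue, can be sketched together. The Python below is an illustration under assumptions: the five format strings are examples only (the article says "at least seven" without listing them), and the queue class, its limit, and the record shape are hypothetical.

```python
from datetime import datetime

# Illustrative formats only; the real corpus formats are not listed in the article.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%d-%b-%Y", "%B %d, %Y"]


def normalize_date(raw):
    """Return an ISO 8601 date string, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


class QuarantineQueue:
    """A visible, bounded queue: it raises instead of growing silently."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.items = []

    def add(self, extraction, reason):
        if len(self.items) >= self.max_size:
            raise OverflowError("quarantine queue is full; human review is overdue")
        self.items.append({"extraction": extraction, "reason": reason})


def preprocess(extraction, queue):
    """Normalize the date before the SHACL gate; quarantine only real failures."""
    iso = normalize_date(extraction["date"])
    if iso is None:
        queue.add(extraction, "unparseable date")
        return None
    return {**extraction, "date": iso}
```

Format order matters when day-first and month-first slash dates coexist in one corpus; a real preprocessing step would need a per-source hint rather than a single global list.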

Month 14: episodic memory invalidation gap. The episodic-memory layer was added in month 9 and ran without an invalidation flow until month 14. During those five months, the agent recalled prior client-meeting facts that had since been corrected in the operational graph; one specific incident involved a counterparty’s parent-company restructuring that had been corrected in semantic memory but not propagated to the per-banker episodic graph. The agent recalled the pre-restructuring parent in a meeting prep brief; the banker walked into the meeting and discussed the wrong corporate structure. The retraction took four weeks: a semantic-to-episodic invalidation flow was added (when a semantic fact changes, every episode that referenced the affected IRI is flagged stale; the agent’s retrieval planner refuses to recall stale episodes without a freshness check), and a backfill ran across the five months of accumulated episodes to flag any whose referenced IRIs had since changed. The lesson Lakeside encodes: episodic memory without an invalidation flow is a liability; the bridge from semantic to episodic must be designed in from the start, not added as a fix-up.
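The invalidation flow described above is essentially an inverted index from IRIs to episodes plus a guard on recall. The following is a minimal in-memory Python sketch of that idea; the class, method names, and episode shape are hypothetical, and a real implementation would sit on the named graphs rather than on dictionaries.

```python
from collections import defaultdict


class EpisodicStore:
    """Toy per-banker episodic store with a semantic-to-episodic invalidation hook."""

    def __init__(self):
        self.episodes = {}               # episode id -> {"text", "iris", "stale"}
        self._by_iri = defaultdict(set)  # referenced IRI -> episode ids

    def record(self, episode_id, text, referenced_iris):
        iris = set(referenced_iris)
        self.episodes[episode_id] = {"text": text, "iris": iris, "stale": False}
        for iri in iris:
            self._by_iri[iri].add(episode_id)

    def on_semantic_change(self, changed_iri):
        """Flag every episode that referenced the changed IRI; return their ids."""
        affected = sorted(self._by_iri.get(changed_iri, ()))
        for episode_id in affected:
            self.episodes[episode_id]["stale"] = True
        return affected

    def recall(self, episode_id, freshness_check=None):
        """Refuse stale episodes unless a freshness check confirms them."""
        episode = self.episodes[episode_id]
        if episode["stale"]:
            if freshness_check is None or not freshness_check(episode):
                return None
            episode["stale"] = False  # confirmed fresh against semantic memory
        return episode["text"]
```

The backfill the paragraph mentions is the same `on_semantic_change` call replayed over every IRI that changed during the gap.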

The four mistakes are recoverable individually. The pattern across them is what matters. Each mistake is the failure mode that the Part 9 seven-failure-mode list named in the abstract; living through them is what turned the abstract failure mode into a discipline.

The Contract Discipline That Keeps The Agent Operable

The agent is a consumer of the operational and governance graphs in the Part 8 minimum-viable-contract sense. Lakeside publishes a six-element contract per agent release. The relationship-banker-agent-v3 contract names the ontology version IRI it depends on (FIBO 2026-Q1 plus lksb: 2.5.1), the named-graph IRIs it reads from (operational/2026-Q1, governance/2026-Q1, episodic/banker/{bankerId}, procedural/agent/relationship-banker-v3), the class and predicate IRIs it specifically uses (counterparty, exposure, beneficial ownership, AML investigation, advisor note, plus the seventy-some predicates the SPARQL templates reference), the SHACL shapes it requires upstream (the gate that produces the gold-tier triples), the migration window for any breaking change to those upstream artifacts (8 weeks dual-write minimum), and the notification channel (a #kg-platform-changes Slack channel plus a quarterly email digest to the AI engineering team).
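One way to make the six elements checkable is to publish the contract as a typed record. The Python sketch below is an assumption about shape, not Lakeside's format: the field names are hypothetical, the `ex:` identifiers are placeholders for the real class, predicate, and shape IRIs, and the remaining values are copied from the paragraph above.

```python
from dataclasses import dataclass

MIN_MIGRATION_WEEKS = 8  # the dual-write minimum named in the contract


@dataclass(frozen=True)
class AgentContract:
    ontology_versions: tuple         # 1. ontology versions the release depends on
    named_graphs: tuple              # 2. named graphs it reads from
    class_and_predicate_iris: tuple  # 3. classes and predicates it uses
    required_shacl_shapes: tuple     # 4. SHACL shapes required upstream
    migration_window_weeks: int      # 5. dual-write window for breaking changes
    notification_channels: tuple     # 6. where changes are announced

    def problems(self):
        """Return a list of contract defects; an empty list means publishable."""
        found = []
        for name in ("ontology_versions", "named_graphs", "class_and_predicate_iris",
                     "required_shacl_shapes", "notification_channels"):
            if not getattr(self, name):
                found.append(name + " must not be empty")
        if self.migration_window_weeks < MIN_MIGRATION_WEEKS:
            found.append("migration window is shorter than %d weeks" % MIN_MIGRATION_WEEKS)
        return found


V3_CONTRACT = AgentContract(
    ontology_versions=("FIBO 2026-Q1", "lksb: 2.5.1"),
    named_graphs=("operational/2026-Q1", "governance/2026-Q1",
                  "episodic/banker/{bankerId}", "procedural/agent/relationship-banker-v3"),
    class_and_predicate_iris=("ex:Counterparty", "ex:Exposure"),  # placeholders
    required_shacl_shapes=("ex:GoldTierShape",),                  # placeholder
    migration_window_weeks=8,
    notification_channels=("#kg-platform-changes", "quarterly email digest"),
)
```

A record like this can be diffed between agent releases, which is what turns a contract change into a reviewable event.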

When the FIBO 2026-Q2 release lands, the platform team runs the Part 8 six-stage migration playbook once. Impact analysis identifies the FIBO predicates the agent’s SPARQL templates use; design proposes the new agent contract version; dual-write runs the old and new ontology versions in parallel for the migration window; the agent team retrains relationship-banker-agent-v4 against the new ontology; the v4 release goes to production with v3 deprecated; v3 is removed two quarters after v4 lands. The agent’s training-data provenance chain (the Article 10 SPARQL template from Part 11b) traces every v4 training fact back through the same provenance contract that v3 used. Reproducibility at the agent layer is a consequence of the same versioning discipline that keeps Pillar 3 reports reproducible.
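The impact-analysis stage is, at its core, an intersection between what the contract uses and what the release changes. Here is a minimal Python sketch, assuming the release diff is available as a mapping from old predicate IRI to its replacement (or to None when removed). That diff shape and the example.org IRIs are hypothetical; real FIBO release notes would have to be parsed into this form.

```python
def impact_analysis(contract_predicates, release_changes):
    """Return the contract's predicates that the release renames or removes.

    `release_changes` maps an old predicate IRI to its replacement IRI,
    or to None when the predicate is removed outright.
    """
    affected = {}
    for predicate in sorted(contract_predicates):
        if predicate in release_changes:
            affected[predicate] = release_changes[predicate]
    return affected


def needs_new_contract_version(contract_predicates, release_changes):
    """A new contract version is proposed only when something the agent uses changed."""
    return bool(impact_analysis(contract_predicates, release_changes))


# Placeholder IRIs for illustration; these are not real FIBO identifiers.
USED = {"https://example.org/onto/hasExposure", "https://example.org/onto/hasOwner"}
CHANGES = {
    "https://example.org/onto/hasOwner": "https://example.org/onto/hasBeneficialOwner",
    "https://example.org/onto/unusedThing": None,
}
```

With these inputs, only `hasOwner` is reported, so the design stage has one rename to absorb and the unused removal is ignored.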

The other contract Lakeside publishes is the inverse: the operational graph and the governance graph are also consumers of the agent. The agent writes back into the per-banker episodic graph and the per-agent log graph, and those writes go through the same SHACL gate that any other write does. An agent that emits a malformed PROV-O activity for an Article 12 log entry has the write rejected; the AI engineering team is notified through the same #kg-platform-changes channel. The agent is not a privileged consumer; it is a consumer with the same contract discipline as every other consumer.
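The write-back path can be sketched as a gate in front of an append-only log. The Python below is a plain-Python stand-in for the SHACL gate, not a SHACL engine: the required-property list, the `lksb:postProcessingVerdict` name, and the `notify` callback are assumptions. It uses prov:wasAssociatedWith for the agent link because that is the PROV-O property whose domain is prov:Activity.

```python
class WriteRejected(Exception):
    """Raised when a write fails the gate."""


# Stand-in for a SHACL shape; lksb:postProcessingVerdict is a hypothetical name.
REQUIRED = ("prov:wasAssociatedWith", "prov:used", "lksb:postProcessingVerdict")


def check_log_entry(entry):
    """Return a list of defects in a log entry; an empty list means it conforms."""
    problems = []
    if entry.get("rdf:type") != "prov:Activity":
        problems.append("entry is not typed prov:Activity")
    for prop in REQUIRED:
        if not entry.get(prop):
            problems.append("missing " + prop)
    return problems


def write_log_entry(entry, log_graph, notify):
    """Append a conforming entry; reject and notify on a malformed one."""
    problems = check_log_entry(entry)
    if problems:
        notify("#kg-platform-changes", problems)
        raise WriteRejected("; ".join(problems))
    log_graph.append(dict(entry))  # append-only: entries are never updated in place
```

The point of the design is that the agent gets no bypass: its writes fail the same way any other consumer's would.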

Cost Preview: What Each Layer Buys And What It Costs

Appendix B will cover the full cost model. The trilogy-spanning roll-up is the precondition. The four layers (foundation, operational, governance, agent) buy distinct things and cost distinct things. The Lakeside numbers below are illustrative ranges, not vendor quotes; Appendix B will decompose each line into infrastructure, license, headcount, and ramp components.

| Layer | Annual cost range at Lakeside scale | Annual benefit range at Lakeside scale | Net |
| --- | --- | --- | --- |
| Foundation (graph store, ontology, IRI minter, ER pipeline, SHACL gate) | $1.8M to $2.4M (canonical RDF triple store plus property-graph traversal store licenses, infrastructure, central platform team of ~6) | Enabling layer; benefit attributes to the layers above | Investment |
| Operational (customer 360, beneficial ownership, real-time transaction risk) | $0.6M to $0.9M (incremental compute, per-business-unit ramp, integration work) | $7M to $10M (banker time recovered from reconciliation; AML investigation throughput; transaction risk service unit cost reduction) | Net positive in year one |
| Governance (BCBS 239, ECB RDARR, GDPR Article 30 ROPA, EU AI Act Article 10) | $0.4M to $0.7M (governance-team headcount; quarterly FIBO release management; regulator-facing reporting build-out) | $5M to $8M (regulatory remediation cost avoidance; reduced finding cycle time; cross-walk authoring instead of per-regulation reconciliation projects) | Net positive in year two |
| Agent (relationship-banker agent v3, including memory frameworks, retrieval sidecar, response post-processing) | $1.2M to $1.6M (the agent episodic-memory store or equivalent memory framework, agent runtime, AI engineering team of ~4, model inference) | $4M to $6M (banker productivity recovered through agent-assisted prep; reduced reliance on senior-banker reach-around for routine questions; new revenue from improved client-meeting throughput) | Net positive in year two |
| Trilogy total | $4M to $5.6M annual run rate | $16M to $24M annual benefit | 3-4x annual return at steady state |

[Figure: Lakeside's series-spanning cost-and-benefit roll-up. Four horizontal bars stacked from foundation (bottom) to agent (top), each split into a cost segment and a benefit segment proportional to the dollar ranges in the table; a bracket on the right carries the trilogy total. Callouts: (1) foundation cost is amortized across all three application layers and counted once; (2) operational benefit is the largest single contributor, with banker reconciliation time the dominant cost recovered; (3) agent-layer benefit grows with adoption, with year one as ramp and year three as steady state. Ranges are illustrative; Appendix B decomposes each line into infrastructure, license, headcount, and ramp components.]
Caption: "The trilogy is a single program with one foundation cost and three application benefits; the agent layer is the third, not a separate investment."

The interpretation that matters operationally is that the agent layer is not the largest benefit and not the largest cost. The operational layer is the dominant benefit (banker reconciliation time recovered is the single largest dollar line); the foundation layer is the dominant cost (the platform team and the graph stores). The agent layer is the multiplier: it cannot pay back without the foundation, the foundation does not pay back without the operational layer, and the governance layer is what keeps all three defensible against the regulator. The trilogy is a single program with one foundation cost and three application benefits, not three separate investments.
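The roll-up arithmetic is easy to check by hand, and a short script keeps it honest when the ranges change. The Python below copies the illustrative ranges from the table (in thousands of dollars, to stay in integers) and recomputes the totals. The article does not say whether its return figure is gross or net, so the sketch computes both as an assumption about what "return" could mean.

```python
# (low, high) annual ranges in thousands of US dollars, copied from the table.
COSTS = {
    "foundation": (1800, 2400),
    "operational": (600, 900),
    "governance": (400, 700),
    "agent": (1200, 1600),
}
BENEFITS = {
    "operational": (7000, 10000),
    "governance": (5000, 8000),
    "agent": (4000, 6000),
}


def total(ranges):
    """Sum the low ends and the high ends separately."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


cost_low, cost_high = total(COSTS)           # 4000, 5600  -> $4M to $5.6M
benefit_low, benefit_high = total(BENEFITS)  # 16000, 24000 -> $16M to $24M

# Pair low with low and high with high; other pairings widen the spread.
gross_multiple = (benefit_low / cost_low, benefit_high / cost_high)
net_multiple = ((benefit_low - cost_low) / cost_low,
                (benefit_high - cost_high) / cost_high)
```

Under these pairings the gross multiple comes out at roughly 4.0x to 4.3x and the net multiple at roughly 3.0x to 3.3x, which brackets the table's rounded 3-4x figure.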

Diagnostic: Is Your Firm Ready For The Agent Layer?

The eight-question diagnostic for the agent layer (the third in the trilogy after Part 11a’s foundation diagnostic and Part 11b’s governance diagnostic) is the readiness check Lakeside ran before deploying relationship-banker-agent-v3 to the first cohort of bankers.

| Diagnostic question | Yes if… | No means… |
| --- | --- | --- |
| Does every triple your agent retrieves carry a trust tier? | The seven-field provenance contract is populated on every fact, including Track 2 extractions, with lksb:trustTier set | The Apex Capital incident is the failure mode you have not yet had; fix this before any agent goes to bankers |
| Is your retrieval planner enforcing the tier policy in the SPARQL or Cypher query, not just in the system prompt? | The query carries FILTER(?trustTier IN (...)) matched to the workflow; the planner refuses to emit a query without a policy | Single-layer enforcement leaks; tier laundering at retrieval is the most common production incident |
| Is your prompt assembly wrapping lower-tier evidence in labeled blocks with explicit tier instructions? | The prompt has <gold-evidence>, <silver-evidence>, and <bronze-evidence> blocks with system-instruction tier weighting | Tier hiding inside the prompt is the second-most-common production incident |
| Is your response post-processing verifying every cited fact’s tier against the source graph? | A sidecar parses the response, looks up cited fact IRIs, and blocks responses whose cited tiers violate the policy | The response post-processing layer is what catches what the first two miss; without it, the agent eventually launders a tier |
| Is your episodic memory bridged to semantic memory invalidation? | When a semantic fact changes, every episode referencing the affected IRI is flagged stale | Stale episodic recall is the second-most-likely incident (after tier laundering); the gap surfaces during a client meeting, not an audit |
| Is your procedural memory (skill subgraph) versioned per agent release with training-data provenance? | Each skill has a version IRI, an owl:priorVersion chain, and a prov:wasDerivedFrom+ chain back to the training data | The Article 10 audit will request this when high-risk enforcement begins (currently August 2, 2026, with a proposed deferral to December 2, 2027 under review); without it, the agent is not deployable in the EU |
| Are agent responses logged into an append-only named graph that the regulator can read? | Every agent response is a PROV-O Activity in a per-period log graph; immutable; queryable | Article 12 record-keeping cannot be answered; the agent is non-conformant when high-risk enforcement begins (currently August 2, 2026, with a proposed deferral to December 2, 2027 under review) |
| Are banker overrides captured as typed events in the episodic graph and queryable per skill? | Override rate per skill per quarter is a SPARQL query; high-override skills are flagged for retraining | Article 14 human oversight is not measurable; skills regress silently |

A firm that answers yes on six or more is ready to deploy a high-risk agent against a knowledge graph; a firm at all eight is in Lakeside’s posture for the EU AI Act enforcement deadline. The gap between “agent v1 in pilot” and “agent v3 in production with all eight” is roughly three quarters of focused work for a mid-size bank that already has the foundation and operational layers from Part 11a.
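Questions two through four of the diagnostic describe three enforcement layers, and each is small enough to sketch. The Python below is an illustration under assumptions: trust tiers are treated as the string literals "gold", "silver", and "bronze", the fact and lookup shapes are hypothetical, and the functions only build or check text rather than talk to a graph store.

```python
TIER_ORDER = ("gold", "silver", "bronze")  # assumed literal values of lksb:trustTier


def build_tier_filter(allowed_tiers):
    """Layer 1 (retrieval planner): emit the FILTER clause, or refuse without a policy."""
    if not allowed_tiers:
        raise ValueError("planner refuses to emit a query without a tier policy")
    values = ", ".join('"%s"' % tier for tier in allowed_tiers)
    return "FILTER(?trustTier IN (%s))" % values


def wrap_evidence(facts):
    """Layer 2 (prompt assembly): group fact texts into labeled blocks by tier."""
    blocks = []
    for tier in TIER_ORDER:
        lines = [fact["text"] for fact in facts if fact["tier"] == tier]
        if lines:
            blocks.append("<%s-evidence>\n%s\n</%s-evidence>" % (tier, "\n".join(lines), tier))
    return "\n".join(blocks)


def verify_citations(cited_iris, tier_of, allowed_tiers):
    """Layer 3 (post-processing): list cited facts whose tier violates the policy.

    `tier_of` maps fact IRI -> tier as looked up in the source graph; an IRI
    missing from it yields None and is treated as a violation.
    """
    violations = []
    for iri in cited_iris:
        tier = tier_of.get(iri)
        if tier not in allowed_tiers:
            violations.append((iri, tier))
    return violations
```

A response would be blocked whenever `verify_citations` returns a non-empty list, which is how the third layer catches what the first two let through.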

Cross-Trilogy Do Next: Foundation, Operational, Governance, Agent

The Do Next table that closes the trilogy spans all four layers. Each row is anchored to one layer; the priority is keyed to the order Lakeside followed.

| Priority | Action | Layer | Why it matters |
| --- | --- | --- | --- |
| Now (this quarter) | Audit your in-house ontology module against the under-100-class ceiling. If it has drifted above, retract before the next ontology release lands. | Foundation | Ontology bloat compounds; month 4 retraction cost two weeks; year-three retraction cost a peer bank a quarter |
| Now (this quarter) | Confirm one IRI minting authority for the entire firm. If a downstream team has stood up a parallel minter (the month-7 Lakeside incident), retire it before the next quarter closes. | Foundation | The central minter is the single point of identity discipline; parallel minters silently drift and surface during AML investigations |
| Now (this quarter) | Make lksb:trustTier (or the equivalent) a first-class predicate on every fact in your operational graph, including Track 2 extractions. Backfill if necessary. | Operational | The agent layer cannot enforce the tier policies without it; the governance layer cannot answer Article 10 without it |
| Next (next two quarters) | Author SPARQL templates for the four regulators in scope (BCBS 239 Principle 3, ECB RDARR, GDPR Article 30 ROPA, EU AI Act Article 10) against your governance graph. Run them quarterly as part of the standard control set. | Governance | Cross-walking regulations is what a governance graph is for; the templates are the deliverable |
| Next (next two quarters) | Stand up the three-layer trust-tier enforcement (retrieval planner, prompt assembly, response post-processing). Do not deploy a high-risk agent without all three. | Agent | Single-layer enforcement leaks; the Apex incident replays at any bank that ships with one or two |
| Next (next two quarters) | Map your agent’s memory pattern onto the CoALA four layers (working, episodic, semantic, procedural) and pin each layer to a named-graph partition with explicit IRIs. | Agent | The mapping is the architecture; without it, agent memory accumulates as a separate liability |
| Soon (next year) | Add the semantic-to-episodic invalidation flow before episodic memory has been running long enough to accumulate stale references. | Agent | The month-14 Lakeside incident is the avoidable one if the flow is designed in from the start |
| Soon (next year) | Stand up a named-graph version chain per ontology release. Pin regulatory submissions and agent training corpora by Version IRI; bind operational consumers and the agent runtime to the alias. | Foundation | A program that cannot reproduce a 2025 report or a 2025 agent decision against the 2025 ontology has the same archeology problem as Northwind, just delayed by two years |
| Soon (next year) | Publish a six-element contract per agent release naming the ontology version, the named-graph IRIs, the class and predicate IRIs, the SHACL shapes, the migration window, and the notification channel. | Agent | Without the contract, every quarterly FIBO release is a fire drill; with the contract, it is a scheduled migration |
| Eventually (when stable) | Treat agent responses as PROV-O Activities in an append-only log graph the regulator can read directly. Treat banker overrides as typed events in the episodic graph. | Agent | Article 12 record-keeping and Article 14 human oversight become queries against the same substrate, not separate compliance projects |

What Comes Next: Appendix A, Then The Conclusion

Part 11c closes the Lakeside Trust Bank trilogy. The data that took three weeks to reconcile pre-KG now lives in one graph, which answers the relationship banker on Monday (operational), the BCBS 239 examiner on Tuesday (governance), and the agent-assisted client-meeting prep on Wednesday (agent). Three days, three consumers, one substrate, one identity discipline, one provenance contract.

Appendix A is the first place in the series where specific vendor names appear for tool selection rather than as illustrative examples. It covers the 2026 KG tooling landscape across the same layers this trilogy used generically: RDF triple stores, property-graph traversal stores, hybrid stores, SPARQL-to-SQL virtualization layers, entity-resolution engines, LLM-assisted extraction pipelines, and governance-metadata platforms, with the named products at each layer and the Lakeside picks with the rationale. Appendix B (cost modeling) and Appendix C (politics, sponsorship, the “just use a database” argument) close out the practical-program guidance. Part 12 wraps the series with what building a knowledge graph actually teaches you. The Lakeside trilogy is the worked example the rest of the series can refer back to.

Sources & References

  1. Sumers, Yao, Narasimhan, Griffiths: Cognitive Architectures for Language Agents (CoALA) (2024)
  2. Packer et al. (Letta/MemGPT): MemGPT: Towards LLMs as Operating Systems (2024)
  3. Letta: Skills, Long-Term Memory, and the Next Phase of Agent Operating Systems (2025)
  4. Zep / Graphiti: Build Real-Time Knowledge Graphs for AI Agents (2025)
  5. Microsoft Research: VeriTrail: Detecting hallucination and tracing provenance in multi-step AI workflows (2025)
  6. EU AI Act Article 10: Data and Data Governance (2024)
  7. EU AI Act Article 12: Record-Keeping (2024)
  8. EU AI Act Article 14: Human Oversight (2024)
  9. EU AI Act: Implementation Timeline (high-risk obligations originally August 2, 2026, with a proposed Digital Omnibus deferral to December 2, 2027) (2024)
  10. Gibson Dunn: EU AI Act Omnibus Agreement: Postponed High-Risk Deadlines and Other Key Changes (2026)
  11. Hogan Lovells: EU legislators agree to delay for high-risk AI rules (2026)
  12. FIBO Quarterly Release Notes (EDM Council, 2025 Q4) (2025)
  13. W3C PROV-O: The PROV Ontology (2013)
  14. Hogan et al.: Knowledge Graphs (ACM Computing Surveys 2021; Synthesis Lectures, Morgan and Claypool 2022) (2022)
  15. Edge et al. (Microsoft Research): From Local to Global: A Graph RAG Approach to Query-Focused Summarization (2024)
