Data Architecture & Engineering June 19, 2026 · 30 min read

Building a Production Knowledge Graph at Lakeside Trust Bank: The Governance Layer

Part 11b of the Knowledge Graph Practitioner's Guide. The same Lakeside Trust Bank graph from Part 11a takes a Tuesday-morning examiner question, then collapses BCBS 239 attribute-level lineage, ECB RDARR Data Lineage expectations, GDPR Article 30 ROPA, and EU AI Act Article 10 training-data provenance into four SPARQL templates against one graph. Covers the OpenLineage to PROV-O bridge through Lakeside's Spark and dbt pipelines, the 280 CDEs as typed nodes with 5,600 hasImplementation edges (Counterparty Credit Exposure as the worked example), the trust-tier-by-reporting-surface table specialized to Lakeside, and the named-graph version chain across a quarterly FIBO release.

By Vikas Pratap Singh
#knowledge-graph #reference-architecture #financial-services #data-governance #bcbs-239 #openlineage #prov-o #eu-ai-act #critical-data-elements #regulatory-lineage


Recap: Where Part 11a Left Off

Part 11a introduced Lakeside Trust Bank ($75B in assets, US-headquartered with an EU subsidiary, 1.2M retail customers, 22,000 commercial counterparties, ~280 CDEs at 1:20 implementing-field ratio) and decomposed the operational use case for a deliberate knowledge graph. The Monday-morning question (the Müller-family exposure reconciliation that took three weeks pre-KG and 180ms post-KG) forced the architecture: a modular ontology Lakeside imports (FIBO BE plus FIBO LOAN plus FIBO SEC plus FIBO FBC plus W3C Time plus PROV-O plus SKOS plus DCAT 3 plus SHACL, with a thin lksb: in-house module under 100 classes), an 8-stage construction pipeline that converges Track 1 (R2RML over the cloud data warehouse plus an open table-format lake plus the retail master-data MDM platform) and Track 2 (LLM extraction over the enterprise document store’s credit memos, KYC files, and advisor notes) at one identity, and a hybrid serving tier (the canonical RDF triple store on SPARQL plus a property-graph traversal-store view, see Appendix A for the specific tools). This piece picks up from there: the same graph absorbs the governance back office.

Tuesday Morning, Q1 2026: The Examiner Asked Five Questions

A Federal Reserve examiner sat down with Lakeside’s BCBS 239 program lead two months after the Müller-family incident and asked a five-part question. For Counterparty Credit Exposure on the Q1 2026 risk submission, attribute by attribute, the examiner wanted to see: every source field that contributes, every transformation that touches each contributing field, every owner accountable at every stage, every quality measurement that ran on every execution, and every change to the underlying ontology in the reporting period. The examiner had read the 2024 ECB Guide on effective risk data aggregation and risk reporting, and attribute-level Data Lineage was one of the seven priority areas the Guide had named. The five-part question was a structured BCBS 239 Principle 3 ask in the new ECB shape.

Pre-KG, the program lead would have responded the way Northwind Bancshares responded in Part 10: three weeks of reconciling four governance stores (catalog, lineage tool, glossary, policy register) that did not share identifiers, a partial answer, and the next quarter’s risk-management committee inheriting a known gap. Post-KG, the program lead opened a SPARQL template, parameterized it with the Counterparty Credit Exposure CDE IRI and the Q1 2026 reporting window, ran it once against the governance endpoint, and produced a structured response with attribute-level traversal, owner attribution, quality scores, and ontology version metadata in under 90 seconds. The examiner asked two follow-up questions in the same session. Both ran as variations on the same template.

The Tuesday-morning examiner question is the governance counterpart to the Monday-morning relationship-banker question from Part 11a. Same bank. Same graph. Different consumer, different reading pattern. The substrate did not change.

Specializing The Governance Ontology Shape To Lakeside

Part 10 locked the governance ontology shape: DCAT 3 plus PROV-O plus SKOS plus FIBO plus a thin in-house module, with six entity types (Dataset, Field, Job/Activity, CriticalDataElement, Policy, Owner) and a small set of typed relationships (partOf, wasGeneratedBy, implementsCDE, validatedAgainstShapes, applies, hasOwner). Lakeside imports that exact shape; the foundational layers from Part 11a already include DCAT 3, PROV-O, SKOS, and SHACL. The governance specialization is not a second ontology. It is a small set of governance-specific extensions to the same lksb: module.

| Governance extension | Lakeside class or property | What it adds beyond Part 11a |
| --- | --- | --- |
| Critical Data Element | lksb:CriticalDataElement rdfs:subClassOf cde:CriticalDataElement | Typed node for each of the ~280 CDEs; carries owner, quality rule, regulatory driver |
| Field-to-CDE link | cde:hasImplementation (CDE to Field; inverse cde:implementsCDE, Field to CDE) | The 1:N edge that anchors attribute-level lineage at the business-concept layer |
| Policy node | lksb:Policy rdfs:subClassOf policy:Policy | Each regulatory citation (BCBS 239 Principle 3, GDPR Art 30, AI Act Art 10) is a node, not a PDF reference |
| Personal data flag | gdpr:PersonalData SKOS Concept attached to Fields and CDEs | The ROPA query traverses this tag to bound personal-data scope |
| Training data flag | lksb:TrainingDataset subclass of dcat:Dataset | Every dataset used to train or evaluate the relationship-banker agent carries this type |
| Quality measurement | dqv:hasQualityMeasurement link from Field or Activity to a Measurement node | Already in Part 11a’s SHACL gate; promoted here to a first-class governance edge |
| Regulatory cross-walk link | lksb:satisfiesPrinciple (Activity to Policy) | The link a SPARQL template walks to answer “which BCBS 239 principles does this control satisfy” |

The discipline from Part 11a holds. The lksb: module stays under 100 classes. Every new governance class either subclasses an existing FIBO, DCAT, PROV-O, or SKOS construct or models a concept that none of those vocabularies cover at sufficient resolution. The thin in-house module is what prevents the governance project from accidentally inventing a parallel ontology for governance purposes. The failure mode this avoids is the one a governance program slides into without that discipline: a governance ontology that grows to hundreds of bespoke classes nobody can explain to FIBO consumers. Lakeside’s governance extensions instead add roughly twenty classes to the eighty already in lksb:.

What this looks like in practice. Most governance-KG projects fail at the ontology stage by assuming the governance vocabulary is its own thing. It is not. Governance is a reading pattern over the same operational graph, with a few additional typed edges. If your governance ontology cannot be expressed as a thin extension of your operational ontology, the operational graph and the governance graph will accidentally diverge, and within twelve months you will be reconciling them the way Northwind reconciled its four stores.

OpenLineage To PROV-O At Lakeside Scale

The connective tissue between the operational pipeline and the governance graph is the OpenLineage to PROV-O bridge that Part 7 introduced and Part 10 made mechanical. At Lakeside, the bridge runs across roughly 4,600 Spark and dbt jobs that emit OpenLineage natively, plus around 200 Airflow-orchestrated ingest pipelines that emit through the OpenLineage Airflow integration. The events land in the OpenLineage metadata store that Lakeside operates as its event store (see Appendix A for the specific tools). A small ingestion service reads from that store, applies the eight-row mapping table from Part 10, specialized for Lakeside’s Job naming conventions and operator vocabulary, and asserts the result into a designated named graph in the governance KG.

| OpenLineage concept | PROV-O concept | Lakeside KG predicate | Lakeside-specific note |
| --- | --- | --- | --- |
| Run | prov:Activity | One node per run; IRI under https://lakeside.com/kg/run/{run-uuid} | Run UUIDs come from the OpenLineage event; never re-minted |
| Job | prov:Plan | prov:wasAssociatedWith from Run to Job | Job IRIs use the namespace + name pattern from the OL spec; one Job per dbt model or Spark application |
| Input dataset | Entity used by Activity | prov:used | Datasets are already DCAT 3 nodes in the governance graph; the Run links them by IRI |
| Output dataset | Entity generated by Activity | prov:wasGeneratedBy | Same Dataset IRI is used across Run, governance reporting, and the Part 9 agent retrieval surface |
| Producer (operator team) | prov:Agent | prov:wasAttributedTo | Lakeside’s Agent IRIs are minted from the HRIS team registry; one Team Agent per dbt model owner |
| Column-level lineage facet | prov:wasDerivedFrom at Field granularity | Field-to-Field edge with the producing Run as middle term | The facet’s transformation type carries through as lksb:transformationKind |
| Data quality facet | dqv:hasQualityMeasurement | Measurement node attached to the Run and propagated to the Outputs | SHACL validation results from Part 11a stage 7 land here as additional Measurements |
| Schema facet | dcat:schema | DCAT schema node per Dataset version | Drives the schema-drift reconciliation that keeps the CDE mappings honest |

A diagram showing the OpenLineage to PROV-O bridge in Lakeside's pipeline. Top band: three source boxes labeled "Apache Spark (3,200 jobs/day)," "dbt (1,400 models)," and "Airflow ingest (200 pipelines)" each with a small lineage icon and an arrow pointing down. The arrows converge into a central horizontal panel labeled "OpenLineage metadata store (event store)" in slate color. Below it, a single arrow points down to a deep-teal box labeled "OL-to-PROV-O ingestion service." Below that, an arrow points into a panel labeled "Governance Knowledge Graph (named graph: lksb:graph/lineage/{release-window})" with eight inset chips arranged in two rows showing the eight-row mapping: Run to prov:Activity, Job to prov:Plan, Input to prov:used, Output to prov:wasGeneratedBy, Producer to prov:Agent, Column lineage to prov:wasDerivedFrom, Quality facet to dqv:hasQualityMeasurement, Schema facet to dcat:schema. To the right of the central panel, a small annotation box reads "illustrative volumes: events per day on the order of hundreds of thousands of runs; ingest service throughput target: <30s end-to-end from emit to graph-readable." A red dotted callout on the left side reads "the operational pipeline already emits these events; the governance bridge is the consumption pattern, not new instrumentation." Bottom caption: "one event stream feeds operational lineage, governance reporting, and the Part 11c agent retrieval surface; no second pipeline."

The Lakeside-specific operational details are worth stating because they are where most governance programs lose the bridge. First, the named-graph naming convention is https://lakeside.com/graph/lineage/{system}/{release-window} per the Part 8 versioning rules; Q1 2026 lineage from the warehouse lands in https://lakeside.com/graph/lineage/cloud-warehouse/2026-Q1 and never gets overwritten. Second, the seven-field provenance contract from Part 7 plus the validatedAgainstShapes field from Part 8 are populated automatically: the Run carries where (system from the OL job namespace), what process (the Job IRI), who (the Agent IRI mapped from HRIS), when (the Run timestamps), trust level (set by source policy: gold for production warehouse jobs, silver for sandbox, bronze for ad hoc query exports), source hash (the Dataset content hash from the OL schema facet), and validatedAgainstShapes (the SHACL ShapeGraph IRI that approved the Run’s outputs). Third, when an OL event arrives without a column-level lineage facet (some legacy Spark jobs do not emit it), the ingestion service flags the Run as silver-tier rather than dropping it, so the governance graph carries the gap explicitly.
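The mapping and the tier rule described above can be sketched in a few lines of Python. This is a minimal illustration, not Lakeside's ingestion service: the event is a plain dict shaped like an OpenLineage run event, the `SOURCE_POLICY` namespaces and the dataset and job IRI bases are hypothetical, and the column-lineage edge is flattened to a direct Field-to-Field triple rather than routed through the Run as the mapping table specifies. It also treats `eventTime` as the end time, which only holds for completion events.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

RUN_BASE = "https://lakeside.com/kg/run/"            # run IRI pattern from the mapping table
GRAPH_BASE = "https://lakeside.com/graph/lineage/"   # named-graph convention from the text
DATASET_BASE = "https://lakeside.com/kg/dataset/"    # hypothetical IRI base
JOB_BASE = "https://lakeside.com/kg/job/"            # hypothetical IRI base

# Hypothetical source policy: OL job namespace -> (system segment, base trust tier).
SOURCE_POLICY: Dict[str, Tuple[str, str]] = {
    "prod-warehouse": ("cloud-warehouse", "gold"),
    "sandbox": ("sandbox", "silver"),
    "adhoc-export": ("adhoc", "bronze"),
}


def dataset_iri(ds: dict) -> str:
    return f"{DATASET_BASE}{ds['namespace']}/{ds['name']}"


def map_event(event: dict, release_window: str) -> Tuple[str, List[Triple]]:
    """Map one OpenLineage-style run event to (named-graph IRI, triples)."""
    run = RUN_BASE + event["run"]["runId"]  # reuse the OL run UUID, never re-mint
    job = f"{JOB_BASE}{event['job']['namespace']}/{event['job']['name']}"
    system, tier = SOURCE_POLICY.get(event["job"]["namespace"], ("unknown", "bronze"))

    triples: List[Triple] = [
        (run, "rdf:type", "prov:Activity"),
        (job, "rdf:type", "prov:Plan"),
        (run, "prov:wasAssociatedWith", job),
        (run, "prov:endedAtTime", event["eventTime"]),
    ]
    for ds in event.get("inputs", []):
        triples.append((run, "prov:used", dataset_iri(ds)))

    has_column_lineage = False
    for ds in event.get("outputs", []):
        out = dataset_iri(ds)
        triples.append((out, "prov:wasGeneratedBy", run))
        fields = ds.get("facets", {}).get("columnLineage", {}).get("fields", {})
        for out_col, spec in fields.items():
            has_column_lineage = True
            for src in spec.get("inputFields", []):
                src_field = f"{dataset_iri(src)}#{src['field']}"
                triples.append((f"{out}#{out_col}", "prov:wasDerivedFrom", src_field))

    # A Run without column-level lineage is kept but cannot stay gold.
    if tier == "gold" and not has_column_lineage:
        tier = "silver"
    triples.append((run, "lksb:trustTier", tier))
    return f"{GRAPH_BASE}{system}/{release_window}", triples


sample = {
    "eventType": "COMPLETE",
    "eventTime": "2026-02-03T06:15:00Z",
    "run": {"runId": "0190a1b2-0000-7000-8000-000000000001"},
    "job": {"namespace": "prod-warehouse", "name": "dbt.fct_exposure"},
    "inputs": [{"namespace": "warehouse", "name": "stg.positions"}],
    "outputs": [{"namespace": "warehouse", "name": "mart.fct_exposure", "facets": {}}],
}
graph_iri, triples = map_event(sample, "2026-Q1")
```

Because the sample event carries no column-lineage facet, the Run is asserted into the 2026-Q1 warehouse graph with a silver tier instead of being dropped, which is how the gap stays visible to later governance queries.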

The bridge is what turns “we have lineage” (the OpenLineage metadata store has events) into “we can answer attribute-level lineage queries” (the governance graph has typed edges). Most banks have the first; Lakeside built the second.

CDEs As Typed Nodes: Counterparty Credit Exposure Across 22 Fields

Lakeside’s CDE inventory has 280 elements at the CDE meta-model shape (1:20 implementing-field ratio gives roughly 5,600 implementing fields). Pre-KG, the inventory lived in a Confluence space cross-referenced to a spreadsheet that pointed at column paths in the warehouse; the spreadsheet had drifted across two quarterly cycles before the Müller incident exposed it. Post-KG, every CDE is a typed node in the governance graph with the small fixed set of relationships from Part 10.

Counterparty Credit Exposure (CCE) is the worked example because it is the CDE the examiner asked about and because its fan-out is on the larger end at Lakeside (22 implementing fields across 5 systems). The pattern:

| Edge | From CCE node to | What it carries at Lakeside |
| --- | --- | --- |
| cde:typeReference | FIBO concepts | fibo-loan:CreditExposure plus fibo-be:LegalEntity for the counterparty term |
| cde:hasOwner | Treasury Risk Reporting team Agent IRI | The single accountable team; HRIS-sourced; inherits to all 22 implementing fields |
| cde:appliesPolicy | BCBS 239 Principle 3 plus ECB RDARR plus IFRS 9 disclosure | Three Policy nodes; cross-walked once at the CDE level rather than per-field |
| cde:hasQualityRule | SHACL ShapeGraph plus Lakeside DQ rule | “must be non-negative; must reconcile to GL within 0.5%; must include all positions resolved by the operational graph” |
| cde:dependsOn | Five upstream CDEs | Counterparty Identifier, Exposure Amount, Recovery Rate, Default Probability, Collateral Coverage Ratio |
| cde:hasImplementation | 22 Field nodes across 5 systems | Cloud-warehouse fct_exposure (8 fields), table-format-lake position-stream (5 fields), trust accounting GL feed (4 fields), credit-memo-extracted Track 2 fields (3 fields), private-wealth holdings system (2 fields) |

A diagram showing Counterparty Credit Exposure as a typed CDE node at the center of its surrounding subgraph. Center: a deep teal hexagon labeled "CDE: Counterparty Credit Exposure (lksb:cde-CCE-001)" with a FIBO badge underneath reading "typeReference: fibo-loan:CreditExposure + fibo-be:LegalEntity." Around the hexagon, six radiating edge groups. Top: one edge labeled "cde:hasOwner" leading to a green Owner chip "Treasury Risk Reporting Team (Agent)." Upper right: three edges labeled "cde:appliesPolicy" leading to three red-orange Policy chips "BCBS 239 Principle 3," "ECB RDARR Attribute-Level Lineage," "IFRS 9 Disclosure." Right: one edge labeled "cde:hasQualityRule" leading to an amber Rule chip "SHACL: must be non-negative; reconcile to GL within 0.5%; include all positions" with a small SHACL ShapeGraph badge. Lower right: five edges labeled "cde:dependsOn" leading to five smaller deep-teal CDE hexagons "Counterparty Identifier (CCE-002)," "Exposure Amount (CCE-003)," "Recovery Rate (CCE-004)," "Default Probability (CCE-005)," "Collateral Coverage Ratio (CCE-006)." Bottom: a fan-out of 22 thin slate-gray edges all labeled "cde:hasImplementation" leading to small Field nodes grouped into 5 subgroups by system: "Cloud-warehouse fct_exposure (8 fields: cce_amt, cpty_id, eff_dt, rcvry_rt, dflt_prob, ...)," "Table-format-lake position-stream (5 fields)," "Trust accounting GL feed (4 fields)," "Credit-memo Track 2 extraction (3 fields, silver tier)," "Private-wealth holdings system (2 fields)." Lower left: one edge labeled "prov:wasAttributedTo" tracing back through one Field's last-modifying Activity to its Agent. Annotation in the upper left reads "1 CDE → 22 Fields across 5 systems; 280 CDEs total at Lakeside; 1:20 ratio per the CDE meta-model; reconciles automatically against OpenLineage schema-facet events." 
Caption at the bottom: "the CDE node is the load-bearing surface for the regulator's attribute-level lineage question; one node, six edge types, the entire governance answer."
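As a rough illustration of the CDE-as-typed-node shape, the sketch below models the CCE node and its 22 implementing fields as plain Python dataclasses. The class and attribute names are hypothetical, the field names are placeholders, and the assumption that every non-Track-2 field is gold-tier is mine; only the per-system counts, the owner, the policies, and the silver tier on the three Track 2 fields come from the table above.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass(frozen=True)
class FieldNode:
    system: str
    name: str
    tier: str  # "gold", "silver", or "bronze"


@dataclass
class CriticalDataElement:
    cde_id: str
    label: str
    owner: str
    policies: List[str]
    depends_on: List[str]
    implementations: List[FieldNode] = field(default_factory=list)

    def fields_for(self, allowed_tiers: Iterable[str]) -> List[FieldNode]:
        """Return implementing fields whose tier the reporting surface allows."""
        allowed = set(allowed_tiers)
        return [f for f in self.implementations if f.tier in allowed]


# (system, field count, tier); gold for non-Track-2 systems is an assumption.
SYSTEMS = [
    ("cloud-warehouse fct_exposure", 8, "gold"),
    ("table-format-lake position-stream", 5, "gold"),
    ("trust accounting GL feed", 4, "gold"),
    ("credit-memo Track 2 extraction", 3, "silver"),
    ("private-wealth holdings system", 2, "gold"),
]

cce = CriticalDataElement(
    cde_id="CCE-001",
    label="Counterparty Credit Exposure",
    owner="Treasury Risk Reporting team",
    policies=["BCBS 239 Principle 3", "ECB RDARR", "IFRS 9 disclosure"],
    depends_on=[
        "Counterparty Identifier",
        "Exposure Amount",
        "Recovery Rate",
        "Default Probability",
        "Collateral Coverage Ratio",
    ],
    implementations=[
        FieldNode(system, f"field_{i + 1}", tier)  # placeholder field names
        for system, count, tier in SYSTEMS
        for i in range(count)
    ],
)

gold_only = cce.fields_for({"gold"})                # e.g. a Pillar 3 style surface
gold_silver = cce.fields_for({"gold", "silver"})    # e.g. an internal report
```

The point of the shape is that the tier travels with each Field node, so a gold-only surface and a gold-plus-silver surface read the same CDE and differ only in the filter they pass.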

The reconciliation discipline that keeps the CDE-to-Field mapping from rotting is the same one Part 10 specified: every OpenLineage schema-facet event that detects a new column in a tracked Dataset triggers a stewardship task (“does this implement an existing CDE, or is it a new concept”); every column rename updates the cde:hasImplementation edge atomically; every dropped column expires the edge with a prov:invalidatedAtTime annotation rather than deleting it. The Track 2 credit-memo-extracted fields (the three silver-tier fields above) carry a tier annotation forward through every governance query: when the regulator’s question allows silver-tier inputs, those fields appear; when the question requires gold-only inputs (Pillar 3 risk submission), those fields are filtered out at query time.
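The three reconciliation rules can be sketched as one function over an in-memory edge map. This is an illustrative sketch under stated assumptions: the edge map keyed by column name, the explicit `renames` input (a schema diff alone cannot distinguish a rename from a drop plus an add), and the column name `lgd_pct` are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set


def reconcile(
    edges: Dict[str, Dict[str, Optional[str]]],
    old_cols: Set[str],
    new_cols: Set[str],
    renames: Dict[str, str],
    now: datetime,
) -> List[str]:
    """Apply schema drift to hasImplementation edges; return stewardship tasks."""
    tasks: List[str] = []

    # Renamed columns: move the edge to the new name in a single step.
    for old, new in renames.items():
        if old in edges:
            edges[new] = edges.pop(old)

    # Dropped columns: expire the edge with a timestamp instead of deleting it.
    for col in sorted(old_cols - new_cols - set(renames)):
        edge = edges.get(col)
        if edge is not None and edge["invalidatedAtTime"] is None:
            edge["invalidatedAtTime"] = now.isoformat()

    # New columns: a steward decides whether they implement an existing CDE.
    for col in sorted(new_cols - old_cols - set(renames.values())):
        tasks.append(
            f"stewardship: does column '{col}' implement an existing CDE, "
            "or is it a new concept?"
        )
    return tasks


edges = {
    "cce_amt": {"cde": "CCE-001", "invalidatedAtTime": None},
    "rcvry_rt": {"cde": "CCE-001", "invalidatedAtTime": None},
}
tasks = reconcile(
    edges,
    old_cols={"cce_amt", "rcvry_rt", "eff_dt"},
    new_cols={"cce_amount", "eff_dt", "lgd_pct"},
    renames={"cce_amt": "cce_amount"},
    now=datetime(2026, 2, 1, tzinfo=timezone.utc),
)
```

After the call, the renamed column keeps its CDE link under the new name, the dropped column's edge is still present but carries an invalidation timestamp, and the one genuinely new column has produced a stewardship task.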

The 22-field fan-out for CCE is on the larger end at Lakeside; the median CDE has 12-15 implementing fields. The largest CDE (Customer Identifier, which spans every customer-touching system) has 64 implementing fields; the smallest CDEs are configuration values with one implementing field each. The 5,600 total implementing-field count is what the catalog crawler sees; the CDE layer is what governance queries against.

Regulatory Cross-Walks: Four Templates Against One Graph

The Tuesday-morning examiner question is one shape of regulatory ask; it is not the only shape. Lakeside operates against four regulators concurrently, each with a different question shape, all running as templates against the same governance graph. The four regulators in scope:

| Regulator | Regulation | What it asks Lakeside |
| --- | --- | --- |
| Federal Reserve plus OCC | BCBS 239 Principle 3 (Accuracy and Integrity) | Attribute-level lineage from risk reports back through every contributing transformation, with quality and ownership |
| ECB (via the EU subsidiary) | ECB RDARR Guide (2024) priority area: attribute-level Data Lineage | Same as BCBS 239 Principle 3 in shape, with stricter expectations on column-level coverage and timeliness of remediation |
| EU subsidiary scope | GDPR Article 30 (Records of Processing Activities) | Records of every processing activity touching personal data, with purpose, recipient, retention, and security measures |
| EU AI Act conformance assessor | EU AI Act Article 10 (Data Governance for high-risk AI systems; application deferred from 2 August 2026 to 2 December 2027 under the May 2026 EU Digital Omnibus agreement, provisional pending formal adoption) | Provenance, collection methodology, bias detection, and quality checks for all data used to train, validate, or test the relationship-banker agent |

Each regulation reduces to a SPARQL template that traverses the same graph along different terminal paths. The four templates are short enough to show in shape if not in full detail.

# PREFIX declarations are omitted in all four templates for brevity.

# BCBS 239 Principle 3 / ECB RDARR: attribute-level lineage for a CDE
SELECT ?field ?activity ?agent ?qualityScore ?ts ?tier
WHERE {
  ?cde lksb:cdeId "CCE-001" .
  ?field cde:implementsCDE ?cde .
  ?field prov:wasGeneratedBy ?activity .
  ?activity prov:wasAttributedTo ?agent ;
            prov:endedAtTime ?ts ;
            lksb:trustTier ?tier .
  OPTIONAL { ?activity dqv:hasQualityMeasurement ?qualityScore . }
  # xsd:dateTime literals need a time component; the window is Q1 2026, end-exclusive
  FILTER (?ts >= "2026-01-01T00:00:00Z"^^xsd:dateTime &&
          ?ts < "2026-04-01T00:00:00Z"^^xsd:dateTime &&
          ?tier = "gold")
}
ORDER BY DESC(?ts)

# GDPR Article 30 ROPA: every Activity touching personal data in the EU subsidiary
SELECT ?activity ?purpose ?recipient ?retention ?dataCategory ?tier
WHERE {
  ?activity a prov:Activity ;
            lksb:jurisdiction lksb:EU-Subsidiary ;
            lksb:processingPurpose ?purpose ;
            prov:wasAttributedTo ?recipient ;
            lksb:trustTier ?tier .
  # Datasets the Activity read, or (inverse path) datasets it generated
  ?activity prov:used|^prov:wasGeneratedBy ?dataset .
  ?dataset dcat:schema ?schema .
  ?schema rdfs:member ?field .
  ?field skos:related gdpr:PersonalData ;
         lksb:retentionPolicy ?retention ;
         lksb:dataCategory ?dataCategory .
}

# EU AI Act Article 10: training-data provenance for the relationship-banker agent
SELECT ?trainingDataset ?activity ?agent ?qualityMeasurement ?biasFinding
WHERE {
  ?model lksb:modelId "relationship-banker-agent-v3" .
  ?model prov:wasDerivedFrom+ ?trainingDataset .
  ?trainingDataset a lksb:TrainingDataset ;
                   prov:wasGeneratedBy ?activity ;
                   lksb:trustTier "gold" .
  ?activity prov:wasAttributedTo ?agent .
  OPTIONAL { ?trainingDataset dqv:hasQualityMeasurement ?qualityMeasurement . }
  OPTIONAL { ?trainingDataset lksb:biasFinding ?biasFinding . }
}

# BCBS 239 Principle 6: ad hoc cross-scenario report production
# MAX keeps only the most recent run per CDE, dataset, and scenario
SELECT ?cde ?datasetMaterializing ?stressScenario (MAX(?ended) AS ?lastRun)
WHERE {
  ?cde a cde:CriticalDataElement ;
       cde:appliesPolicy lksb:bcbs239-principle-6 .
  ?datasetMaterializing prov:used ?fieldImplementing ;
                        lksb:stressScenario ?stressScenario .
  ?fieldImplementing cde:implementsCDE ?cde .
  ?datasetMaterializing prov:wasGeneratedBy ?run .
  ?run prov:endedAtTime ?ended .
}
GROUP BY ?cde ?datasetMaterializing ?stressScenario

The four templates share a common substrate: every query starts from a CDE, a Model, an Activity, or a Dataset that already exists in the operational graph from Part 11a, traverses through PROV-O edges that the OpenLineage bridge populates, filters by trust tier, and projects governance metadata that the seven-field provenance contract guarantees is present. New regulations become new templates. The 2024 Golpayegani et al. paper on KG-based mappings between the EU AI Act and international standards is the methodological grounding for this approach; the SPARQL templates above are the Lakeside-specific application.
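To make "parameterize the template with a CDE and a reporting window" concrete, here is a minimal Python sketch that renders the BCBS 239 lineage query for a given CDE identifier and calendar quarter. The function names, the CDE identifier pattern, and the use of `string.Template` are illustrative assumptions; a production service would more likely bind parameters through its SPARQL client than through string substitution.

```python
import re
from datetime import date
from string import Template
from typing import Tuple

BCBS239_P3 = Template("""\
SELECT ?field ?activity ?agent ?qualityScore ?ts ?tier
WHERE {
  ?cde lksb:cdeId "$cde_id" .
  ?field cde:implementsCDE ?cde ;
         prov:wasGeneratedBy ?activity .
  ?activity prov:wasAttributedTo ?agent ;
            prov:endedAtTime ?ts ;
            lksb:trustTier ?tier .
  OPTIONAL { ?activity dqv:hasQualityMeasurement ?qualityScore . }
  FILTER (?ts >= "${start}T00:00:00Z"^^xsd:dateTime &&
          ?ts < "${end}T00:00:00Z"^^xsd:dateTime &&
          ?tier = "gold")
}
ORDER BY DESC(?ts)
""")

# Assumed identifier shape (e.g. "CCE-001"); rejecting anything else keeps
# quotes and braces out of the rendered query.
CDE_ID = re.compile(r"[A-Z]+-\d{3}")


def quarter_window(year: int, quarter: int) -> Tuple[date, date]:
    """Return (inclusive start, exclusive end) dates for a calendar quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {quarter}")
    start = date(year, 3 * (quarter - 1) + 1, 1)
    end = date(year + 1, 1, 1) if quarter == 4 else date(year, 3 * quarter + 1, 1)
    return start, end


def render_bcbs239(cde_id: str, year: int, quarter: int) -> str:
    if not CDE_ID.fullmatch(cde_id):
        raise ValueError(f"unexpected CDE id: {cde_id!r}")
    start, end = quarter_window(year, quarter)
    return BCBS239_P3.substitute(
        cde_id=cde_id, start=start.isoformat(), end=end.isoformat()
    )


query = render_bcbs239("CCE-001", 2026, 1)
```

The examiner's follow-up questions then amount to calling the same function with a different CDE identifier or quarter, which is the sense in which they "ran as variations on the same template."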

A diagram showing Lakeside's regulatory cross-walk against one governance graph. Center: a horizontal panel labeled "Lakeside Governance Knowledge Graph" filled with a dense web of nodes and edges (CDE hexagons, Field circles, Activity rectangles, Dataset rounded rectangles, Agent triangles, Policy chips, Model squares for the relationship-banker agent), with a small grey legend at the bottom of the panel reading "shared provenance contract: where, what process, who, when, trust level, source hash, validatedAgainstShapes." Four arrows radiate outward to four query-template chips around the perimeter. Upper left chip in slate: "BCBS 239 Principle 3 + ECB RDARR" with a SPARQL fragment annotation ?field cde:implementsCDE ?cde . ?field prov:wasGeneratedBy ?activity and a badge "attribute-level lineage; gold-tier-only." Upper right chip in teal: "GDPR Article 30 ROPA (EU subsidiary scope)" with annotation ?activity prov:used ?dataset . ?field skos:related gdpr:PersonalData and a badge "all-tiers; tier-as-column." Lower right chip in amber: "EU AI Act Article 10 (relationship-banker agent training data)" with annotation ?model prov:wasDerivedFrom+ ?trainingDataset . ?trainingDataset lksb:trustTier 'gold' and a badge "high-risk AI; application date deferred to Dec 2027; gold-only training." Lower left chip in green: "BCBS 239 Principle 6 (cross-scenario ad hoc reports)" with annotation ?cde cde:appliesPolicy ... ?datasetMaterializing lksb:stressScenario and a badge "ad-hoc capacity test." Around the perimeter, a small dotted box on the lower edge labeled "Future regulation X (e.g. UK FCA, MAS, FFIEC update)" with annotation 'new SPARQL template; same graph; no new store.' Bottom caption: 'four regulators, four query templates, one graph, one provenance contract; the work is template authoring, not store reconciliation.'

KEY INSIGHT: when your governance program treats new regulations as new SPARQL templates against an existing graph rather than new compliance projects against new stores, you have moved past the four-or-five-store anti-pattern that Part 10 named. Lakeside has not eliminated regulatory work; it has eliminated the per-regulation reconciliation work. The per-regulation work that remains is template authoring, owner attestation, and explanatory narrative for the regulator, all of which scale linearly with the number of regulations rather than quadratically.

Trust Tiers Across Lakeside’s Reporting Surfaces

The four-tier trust pattern from Part 7 (gold, silver, bronze, quarantine) carries over into the governance-reporting policy that Part 10 specified. At Lakeside, the policy is operationalized as a tier-by-surface table that every reporting team carries and every SPARQL template enforces.

| Reporting surface at Lakeside | Allowed tiers | Why this discipline at Lakeside |
| --- | --- | --- |
| Pillar 3 risk submission to Federal Reserve / OCC | Gold only | BCBS 239 Principle 3 expects verified, owner-attested, validated content; submitting silver content would be a finding |
| Y-9C and FFIEC quarterly call reports | Gold only | Same posture; trust-tier laundering at this surface is a regulatory finding |
| EU subsidiary FINREP and COREP submissions to the ECB | Gold only | Same posture under ECB RDARR |
| Internal management reports (ALCO, CRO desk reports) | Gold and silver, with tier visible per row | Speed of availability matters; tier visibility lets readers calibrate trust |
| Operational dashboards (relationship-banker, AML investigator) | Gold and silver; bronze allowed only with a “preliminary” badge | Bronze is signal where the consumer can see the tier; tier hiding is the failure mode |
| GDPR Article 30 ROPA (EU subsidiary) | All tiers; tier becomes a column | Article 30 expects the full picture, including tentative or in-flight processing activities |
| Internal exploratory analysis and data-science notebooks | All tiers, with tier explicit in returned columns | Bronze is a signal worth surfacing to analysts; mislabeling it as gold is the failure |
| EU AI Act Article 10 training data for the relationship-banker agent | Gold only for production training; silver for evaluation; bronze quarantined | Article 10 requires high-quality training data; tier discipline is the operational implementation |
| EU AI Act Article 12 model behavior logs (post-deployment) | Append-only into a per-model named graph; no tier filtering | Article 12 wants the full audit trail without filtering; the immutability is the assurance |

The mechanical enforcement is the same as the Part 9 trust-tier-aware retrieval pattern: the SPARQL template carries a FILTER (?tier IN (...)) clause appropriate for the surface, the report-generation service refuses to render rows that violate the policy, and the tier appears explicitly in the rendered output where the policy permits silver or bronze content. The Tuesday-morning BCBS 239 template above filters to gold only; the ROPA template returns all tiers with a tier column; the AI Act Article 10 template filters to gold for production training. Three different surfaces, one substrate, three different filter clauses.
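A small Python sketch shows how one policy table can drive both the query-side filter and the render-side guard. The surface keys, the `TierViolation` exception, and the choice to reject the whole render when any row violates policy are assumptions for illustration; the allowed-tier sets follow the tier-by-surface table for four of its rows.

```python
from typing import Dict, FrozenSet, List

TIER_ORDER = ("gold", "silver", "bronze", "quarantine")

# Subset of the tier-by-surface table; the keys are hypothetical short names.
SURFACE_POLICY: Dict[str, FrozenSet[str]] = {
    "pillar3": frozenset({"gold"}),
    "management_report": frozenset({"gold", "silver"}),
    "operational_dashboard": frozenset({"gold", "silver", "bronze"}),
    "ropa": frozenset(TIER_ORDER),
}


class TierViolation(Exception):
    """Raised when a row's tier is not allowed on the target surface."""


def tier_filter(surface: str) -> str:
    """Build the SPARQL FILTER clause for a reporting surface."""
    allowed = SURFACE_POLICY[surface]
    tiers = ", ".join(f'"{t}"' for t in TIER_ORDER if t in allowed)
    return f"FILTER (?tier IN ({tiers}))"


def render_rows(surface: str, rows: List[dict]) -> List[dict]:
    """Second line of defense: re-check tiers at report generation."""
    allowed = SURFACE_POLICY[surface]
    bad = [r for r in rows if r["tier"] not in allowed]
    if bad:
        raise TierViolation(f"{len(bad)} row(s) violate the {surface} tier policy")
    # The tier column is shown wherever the surface admits non-gold content.
    show_tier = allowed != frozenset({"gold"})
    if show_tier:
        return [dict(r) for r in rows]
    return [{k: v for k, v in r.items() if k != "tier"} for r in rows]


pillar3_clause = tier_filter("pillar3")
mgmt_clause = tier_filter("management_report")
mgmt_rows = render_rows("management_report", [{"field": "cce_amt", "tier": "silver"}])
```

Checking the tier again at render time, rather than trusting the query filter alone, is what guards against a silver row reaching a gold-only surface through a template that was edited or bypassed.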

Versioning Across A Quarterly FIBO Release

The governance graph cannot stand still. FIBO publishes quarterly releases (the 2026 Q1 Production release contains roughly 2,450 classes across the FIBO domains and succeeds the 2025 Q4 release that the worked example below pins against). Every quarterly FIBO release brings additions, deprecations, and occasionally renames. The bank’s Pillar 3 submission for Q1 2026 must reproduce against the FIBO version that was current when its data was generated; the relationship-banker agent in production must run against the latest FIBO without breaking; the ROPA report due in Q2 must reflect any new personal-data classifications that the FIBO Q1 release introduced. The Part 8 named-graph versioning pattern is what makes all three coexist.

Lakeside’s discipline is mechanical. Every FIBO release lands as a new ontology version IRI: https://lakeside.com/onto/fibo-extended/2025-Q4, https://lakeside.com/onto/fibo-extended/2026-Q1. The governance graph carries a named-graph version chain per ontology release; consumers pin to a specific version, or they bind to a stable alias (https://lakeside.com/onto/fibo-extended/current) that the platform redirects. The migration playbook from Part 8 (six stages: impact analysis, design, dual-write, migration window, deprecation enforcement, removal) runs once per quarterly cycle. A typical quarterly cycle at Lakeside is six to eight weeks of dual-write while the candidate version stabilizes, two weeks of consumer migration, then deprecation enforcement and the alias flip.

A diagram showing the named-graph version chain for the Lakeside FIBO-extended ontology across a quarterly release. Vertical axis is time top-to-bottom. Five named graphs stacked vertically as rounded rectangles in deep blue: v2025-Q3 (top, "September 2025; FIBO 2025 Q3"), v2025-Q4 (October 2025; FIBO 2025 Q4 import; LOAN module additive change for syndicated participations; MINOR bump to lksb 2.5.0), v2026-Q1-candidate (early February 2026; FIBO 2026 Q1 candidate; new BE classes for trust ownership; under dual-write), v2026-Q1 (March 2026; FIBO 2026 Q1 promoted; alias 'current' flips here), v2026-Q1-patch (April 2026; PATCH for SKOS glossary corrections). Each named graph rectangle carries metadata fields visible inside it: "version IRI," "dcterms:created," "prov:wasGeneratedBy," "owl:priorVersion to previous version." Vertical arrows on the right side connect each version to the next via "owl:priorVersion." To the right of the chain, a column shows five consumer applications and their current pinning: "Pillar 3 risk submission Q1 2026 (pinned to v2025-Q4 for reproducibility)," "ECB RDARR submission Q1 2026 (pinned to v2025-Q4)," "Relationship-banker agent v3 (pinned to v2026-Q1, gold-only training)," "ROPA Q2 2026 report (pinned to alias 'current' = v2026-Q1)," "Operational queries (alias 'current')." A horizontal bracket on the left side spans v2025-Q4 to v2026-Q1 labeled "deprecation cycle: 6-8 weeks dual-write + 2 weeks migration window." A small green annotation on the lower right reads "regulatory submissions stay reproducible by IRI; agents and operational consumers track current; ROPA picks up new personal-data classifications when the alias flips." Bottom caption: "the alias 'current' flips on a known date; consumers pin by IRI when reproducibility outweighs freshness; quarterly FIBO releases never break a past report."

The pinning matrix is what matters operationally. The Pillar 3 risk submission for Q1 2026 must reproduce against v2025-Q4 because that was the ontology version current when the data was generated; the relationship-banker agent runs against v2026-Q1 because the agent retrains every quarter and the trainer pins to the latest; the ROPA tracks the alias because GDPR ROPA wants the current state, not a frozen snapshot. The platform serves all three from the same graph store; the version IRIs are immutable. When the regulator asks “show me the Q1 2026 Pillar 3 submission against the ontology that was current at the time,” the answer is the named graph at v2025-Q4. When the regulator asks “show me your current ROPA,” the answer is the alias-resolved current view. Same question shape, different version semantics, both serviceable.
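The pinning behavior reduces to two rules: version IRIs are immutable once published, and only the alias moves. A minimal Python sketch under that reading follows; the `VersionChain` class and its method names are hypothetical, while the IRI base, the two version labels, and the pinning of each consumer come from the text.

```python
from typing import Dict, List, Optional

BASE = "https://lakeside.com/onto/fibo-extended/"


class VersionChain:
    """Append-only chain of ontology versions with one movable alias."""

    def __init__(self) -> None:
        self._versions: List[str] = []
        self._current: Optional[str] = None

    def publish(self, version: str) -> None:
        if version in self._versions:
            raise ValueError(f"{version} already published; versions are immutable")
        self._versions.append(version)

    def flip_alias(self, version: str) -> None:
        if version not in self._versions:
            raise KeyError(version)
        self._current = version

    def prior_version(self, version: str) -> Optional[str]:
        """Mirror of owl:priorVersion: the release published just before."""
        idx = self._versions.index(version)
        return self._versions[idx - 1] if idx > 0 else None

    def resolve(self, ref: str) -> str:
        """Resolve a pinned version or the 'current' alias to a version IRI."""
        if ref == "current":
            if self._current is None:
                raise KeyError("alias 'current' has not been set")
            return BASE + self._current
        if ref not in self._versions:
            raise KeyError(ref)
        return BASE + ref


chain = VersionChain()
chain.publish("2025-Q4")
chain.flip_alias("2025-Q4")
chain.publish("2026-Q1")
chain.flip_alias("2026-Q1")  # the alias flip at promotion

PINNING: Dict[str, str] = {
    "Pillar 3 risk submission Q1 2026": "2025-Q4",
    "Relationship-banker agent v3": "2026-Q1",
    "ROPA Q2 2026 report": "current",
}
resolved = {consumer: chain.resolve(ref) for consumer, ref in PINNING.items()}
```

After the alias flip, the Pillar 3 entry still resolves to the 2025-Q4 IRI while the ROPA entry follows the alias to 2026-Q1. A rollback would be another `flip_alias` call, with no version rewritten.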

What this looks like in practice. Most KG programs handle quarterly FIBO releases in one of two ways: they freeze the import (and accumulate technical debt that will explode at year three) or they roll forward unsafely (and discover a year later that a report cannot reproduce). The discipline that keeps Lakeside out of both traps is the named-graph version chain plus the consumer pinning matrix, applied to the ontology and to every governance subgraph. The cost is roughly one engineer-quarter per year of release-management work; the alternative is a multi-quarter ontology-debt remediation project that will surface during a regulatory exam.

Failure Modes Specific To The Governance Layer At Lakeside

Even with the operational graph in place from Part 11a, the governance layer has its own ways to fail. Six failure modes follow directly from the architecture’s seams; Lakeside is designed to avoid each.

| Failure | What it would look like at Lakeside | What the architecture does to prevent it |
| --- | --- | --- |
| Governance KG as a fifth store | A separate governance graph runs alongside the operational graph; the two share no IRIs | The governance graph is the same store as the operational graph; named-graph partitions inside one store, not a separate database |
| OpenLineage events ingested as raw events, not mapped to PROV-O | The OpenLineage metadata store has events; the governance KG cannot answer attribute-level lineage queries | The OL-to-PROV-O ingestion service is required; raw OL events are not the substrate |
| CDE-to-Field mapping rot | The mapping spreadsheet drifts; six months later the regulator’s question hits stale links | cde:hasImplementation reconciles automatically against OpenLineage schema-facet events; mapping rot is structurally prevented |
| Tier laundering at the regulatory submission boundary | A silver-tier Track 2 fact appears in a Pillar 3 row because the tier filter was applied at retrieval but not at report generation | Tier filtering is at every layer (retrieval, prompt assembly, report generation, post-render audit); the Part 9 three-layer pattern extended to reports |
| Schema drift across a quarterly FIBO release | A regulator asks for Q3 2025 lineage; the Q3 2025 ontology is gone; the report cannot reproduce | Named-graph version chain plus consumer pinning matrix; immutable version IRIs; rollback is an alias change |
| Policy-as-document, not policy-as-node | The policy register exports PDFs; the governance KG has Policy nodes but no applies edges; cross-walks cannot run | Every Policy node is connected via applies edges to CDEs, Fields, and Activities at ingestion time; the cross-walk discipline is enforced as part of the policy-register-to-KG sync |
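
The tier-laundering row is the easiest of these to illustrate. The sketch below assumes a two-level ranking (gold above silver, the two tiers named in this series) and uses hypothetical names (`Fact`, `enforce_floor`, `render_report`, `audit_rendered`); it shows the filter being reapplied at report generation and checked again after rendering, instead of being trusted from retrieval.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumed ranking for illustration: higher number means more trusted.
TIER_RANK = {"gold": 2, "silver": 1}


@dataclass(frozen=True)
class Fact:
    subject: str
    value: str
    tier: str


def enforce_floor(facts: Iterable[Fact], floor: str) -> List[Fact]:
    """Keep only facts at or above the tier floor."""
    return [f for f in facts if TIER_RANK[f.tier] >= TIER_RANK[floor]]


def render_report(facts: Iterable[Fact], floor: str) -> List[Fact]:
    # Filter again at report generation, even if retrieval already filtered.
    return enforce_floor(facts, floor)


def audit_rendered(rows: Iterable[Fact], floor: str) -> None:
    """Post-render audit: fail loudly if any row sits below the floor."""
    offenders = [r.subject for r in rows if TIER_RANK[r.tier] < TIER_RANK[floor]]
    if offenders:
        raise ValueError(f"rows below {floor} tier: {offenders}")


# Suppose retrieval forgot its filter and let a silver-tier fact through.
retrieved = [
    Fact("counterparty:A", "exposure=10", "gold"),
    Fact("counterparty:B", "exposure=7", "silver"),
]
rows = render_report(retrieved, floor="gold")
audit_rendered(rows, floor="gold")  # passes: the silver fact was dropped
print([r.subject for r in rows])
```

The audit step is deliberately redundant: it exists to catch the case where a report path bypasses `render_report` altogether.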

KEY INSIGHT: The fix in every case is to refuse to treat governance as a separate program. The same graph, the same identity discipline, the same provenance contract, the same SHACL gate, the same versioning pattern. When a governance question requires reaching outside this set, the substrate has not yet absorbed the governance layer; the next regulator question will reopen the gap.

Diagnostic: Is Your Governance Layer Ready For Lakeside Discipline?

The diagnostic for whether a firm is ready for the governance layer (not the operational layer; that diagnostic was in Part 11a) is the eight-question variant below. Lakeside answered yes on all eight before the BCBS 239 program lead opened the SPARQL template for the Tuesday-morning examiner.

| Diagnostic question | Yes if… | No means… |
| --- | --- | --- |
| Are CDEs nodes in the governance graph (not rows in a spreadsheet)? | Every CDE has a typed node; cde:hasImplementation edges reconcile against OpenLineage schema events | Spreadsheet rot will surface within six months; build the CDE layer in the graph first |
| Is the OpenLineage to PROV-O bridge running across every CDE-implementing pipeline? | The OpenLineage metadata store plus the ingestion service plus the eight-row mapping plus named-graph-per-release-window are all in production | Attribute-level lineage queries cannot run; the regulator’s question already exceeds your coverage |
| Are Policies first-class nodes with applies edges to CDEs, Fields, and Activities? | Every regulatory citation is a node connected to the artifacts it governs | Policy-as-document means cross-walks cannot run as queries; the per-regulation reconciliation work returns |
| Is the trust tier carried forward through every reporting surface, not just at retrieval? | Pillar 3 templates, ROPA templates, AI Act templates each carry an explicit tier filter; report generation enforces it | Tier laundering is the second incident waiting to happen |
| Is the seven-field provenance contract populated automatically by the OL bridge? | Where, what process, who, when, trust level, source hash, validatedAgainstShapes are all set on every Run | Manual provenance is not sustainable; the bridge or its equivalent is required |
| Is there a named-graph version chain per ontology release, with consumer pinning? | Quarterly FIBO releases land as new version IRIs; consumers pin or bind to alias | Schema drift will break a past report during a regulatory exam |
| Are SPARQL templates the deliverable for new regulations, not new tools? | The next regulation is template authoring + owner attestation, not vendor procurement | The substrate is not yet the substrate; the four-store pattern is reconstructing inside the KG program |
| Is the governance graph the same store as the operational graph, partitioned by named graph? | Operational consumers and regulators read from one substrate; partitioning is logical, not physical | The fifth-store anti-pattern; reconciliation will return through a different door |
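
The provenance-contract question lends itself to an automated check. The following is a rough sketch under stated assumptions: the dictionary keys are hypothetical stand-ins for the seven fields listed in the table (only `validatedAgainstShapes` is spelled as in the article), and a Run is modeled as a plain dictionary, not a PROV-O activity.

```python
from typing import Dict, List

# Hypothetical key names for the seven contract fields.
REQUIRED_FIELDS = (
    "where",
    "process",
    "who",
    "when",
    "trust_level",
    "source_hash",
    "validatedAgainstShapes",
)


def missing_fields(run: Dict[str, object]) -> List[str]:
    """List contract fields that are absent or empty on a Run record."""
    return [f for f in REQUIRED_FIELDS if run.get(f) in (None, "", [])]


complete_run = {
    "where": "warehouse.exposure_daily",
    "process": "dbt:exposure_rollup",
    "who": "svc-lineage-bridge",
    "when": "2026-03-31T02:00:00Z",
    "trust_level": "gold",
    "source_hash": "sha256:placeholder",
    "validatedAgainstShapes": ["ExposureShape"],
}
partial_run = {k: v for k, v in complete_run.items() if k != "source_hash"}

print(missing_fields(complete_run))
print(missing_fields(partial_run))
```

Running a check like this over every Run is one way to turn the "all set on every Run" criterion into a pass/fail control; the sample values above are invented for the example.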

A firm that answers yes on six or more is ready to start the governance layer; a firm that answers yes on all eight is in Lakeside posture. The gap between “operational graph is built” and “governance layer is absorbed” is roughly two quarters of focused work for a mid-size bank. Lakeside is at quarter four counting from the program start; quarter two of the governance layer specifically.

Do Next: Governance Layer Discipline Tier List

| Priority | Action | Why It Matters |
| --- | --- | --- |
| Now (this quarter) | Audit how your governance graph (or governance program if no graph yet) treats CDEs. If they are spreadsheet rows instead of typed nodes, that is the highest-leverage fix. | The CDE layer is the load-bearing surface for attribute-level lineage; spreadsheet rot is the most reliable predictor of governance-program failure. |
| Now (this quarter) | Confirm that your OpenLineage emission covers every pipeline that touches a CDE-implementing dataset. Backfill emission before extending governance investment in any other direction. | The 2024 ECB RDARR Guide and the EU AI Act Article 10 both demand attribute-level Data Lineage; the bridge is what turns OL events into governance answers. |
| Next (next two quarters) | Author SPARQL templates for BCBS 239 Principle 3, GDPR Article 30 ROPA, and EU AI Act Article 10 against your governance graph. Run them quarterly as part of the standard control set. | Cross-walking regulations is what a governance graph is for; new regulations become new templates, not new tools. |
| Next (next two quarters) | Extend the trust-tier policy from agent retrieval to every reporting surface. Make tier a first-class field on every reportable triple. Enforce it at report generation, not just at retrieval. | Tier laundering at the regulatory boundary is a finding; tier-at-retrieval-only is the second incident waiting to happen. |
| Soon (next year) | Stand up a named-graph version chain per ontology release. Pin regulatory submissions by Version IRI; bind operational consumers and the agent to the alias. Run a quarterly FIBO release through the migration playbook to validate the discipline. | A governance program that cannot reproduce a 2025 report against the 2025 ontology has the same archeology problem as Northwind, just delayed by two years. |
| Soon (next year) | If your governance graph and your operational graph are still separate stores, plan the merge. Same store, named-graph partitioning, shared IRIs. | The fifth-store anti-pattern is the most common reason a governance KG program fails to absorb the regulatory layer it was supposed to absorb. |
| Eventually (when stable) | Document and publish a minimum viable consumer contract for every consumer of the governance graph (Pillar 3 reporting team, ROPA team, agent team) per the Part 8 contract pattern. | Contracts are the discipline that lets the governance graph evolve without breaking past reports; without them, schema evolution is a recurring incident. |
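
For the template-authoring action in the table above, a parameterized query might look like the sketch below. It is illustrative only: the `cde:` namespace IRI, the `cde:trustTier` property, and the `render_template` helper are assumptions introduced here (`cde:hasImplementation` is the only term taken from this series), and the query text is rendered as a string, not executed against a store.

```python
from string import Template

ALLOWED_TIERS = {"gold", "silver"}

# Hypothetical vocabulary: the namespace IRI and cde:trustTier are assumed.
CDE_IMPLEMENTATIONS = Template("""\
PREFIX cde: <https://example.org/cde#>
SELECT ?cde ?field
FROM <$graph_iri>
WHERE {
  ?cde cde:hasImplementation ?field .
  ?field cde:trustTier "$tier" .
}
""")


def render_template(graph_iri: str, tier: str) -> str:
    """Fill the template with a pinned graph IRI and an explicit tier filter."""
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    # Reject characters that would break out of the <...> IRI reference.
    if any(ch in graph_iri for ch in '<>" \n\t'):
        raise ValueError(f"unsafe graph IRI: {graph_iri!r}")
    return CDE_IMPLEMENTATIONS.substitute(graph_iri=graph_iri, tier=tier)


query = render_template("https://example.org/lksb/ontology/v2025-Q4", "gold")
print(query)
```

Making the graph IRI and the tier required parameters means a template cannot be run without stating which version it reproduces against and which tier floor it enforces.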

What Comes Next: Part 11c

Part 11b covered how Lakeside’s governance back office runs against the same graph as Part 11a’s operational front office: the OL-to-PROV-O bridge across Spark and dbt, CDEs as typed nodes anchoring attribute-level lineage, four SPARQL templates that absorb four regulators, the trust-tier-by-surface table, and the named-graph version chain that survives quarterly FIBO releases.

Part 11c takes the same graph and points the relationship-banker agent at it. The CoALA four-layer memory model from Part 9 maps to named graphs (semantic memory plus episodic memory) plus the agent’s working context plus a skill subgraph for procedural memory. The trust-tier-aware retrieval pattern enforces three policies in production: portfolio decisions strict-tier-floor (gold-only), client-meeting prep tier-segregated (gold plus silver in separate context), advisor-facing summaries tier-explicit-citation (every claim carries a tier badge). Part 11c also closes the trilogy with the cross-cutting lessons (what Lakeside got wrong, the contract discipline that keeps the graph operable, the cost-modeling preview that points to Appendix B) and a Do Next table that spans the foundation, the operational layer, the governance layer, and the agent layer.

One graph, one identity discipline, one provenance contract, three use cases. The relationship-banker agent is the third.

Sources & References

  1. Basel Committee on Banking Supervision: BCBS 239 Principles for effective risk data aggregation and risk reporting (2013)
  2. ECB Guide on effective risk data aggregation and risk reporting (RDARR) (2024)
  3. EU AI Act Article 10: Data and Data Governance (2024)
  4. GDPR Article 30: Records of processing activities (2018)
  5. OpenLineage Specification (2024)
  6. OpenLineage Column-Level Lineage Dataset Facet (2024)
  7. Marquez Project: Reference Implementation of OpenLineage (2024)
  8. W3C PROV-O: The PROV Ontology (2013)
  9. W3C DCAT 3: Data Catalog Vocabulary (2024)
  10. W3C SKOS: Simple Knowledge Organization System (2009)
  11. FIBO Quarterly Release Notes (EDM Council) (2026)
  12. FIB-DM: Finance Domain ontology transformed into an Enterprise Data Model (FIBO 2026 Q1 class count) (2026)
  13. FinCEN Beneficial Ownership Information Reporting Rule (2024)
  14. Golpayegani et al.: An Open Knowledge Graph-Based Approach for Mapping Concepts and Requirements between the EU AI Act and International Standards (2024)
  15. Gibson Dunn: EU AI Act Omnibus Agreement - Postponed High-Risk Deadlines and Other Key Changes (2026)
