Building a Production Knowledge Graph at Lakeside Trust Bank: The Governance Layer
Part 11b of the Knowledge Graph Practitioner's Guide. The same Lakeside Trust Bank graph from Part 11a takes a Tuesday-morning examiner question, then collapses BCBS 239 attribute-level lineage, ECB RDARR Data Lineage expectations, GDPR Article 30 ROPA, and EU AI Act Article 10 training-data provenance into four SPARQL templates against one graph. Covers the OpenLineage to PROV-O bridge through Lakeside's Spark and dbt pipelines, the 280 CDEs as typed nodes with 5,600 hasImplementation edges (Counterparty Credit Exposure as the worked example), the trust-tier-by-reporting-surface table specialized to Lakeside, and the named-graph version chain across a quarterly FIBO release.
Recap: Where Part 11a Left Off
Part 11a introduced Lakeside Trust Bank ($75B in assets, US-headquartered with an EU subsidiary, 1.2M retail customers, 22,000 commercial counterparties, ~280 CDEs at 1:20 implementing-field ratio) and decomposed the operational use case for a deliberate knowledge graph. The Monday-morning question (the Müller-family exposure reconciliation that took three weeks pre-KG and 180ms post-KG) forced the architecture: a modular ontology Lakeside imports (FIBO BE plus FIBO LOAN plus FIBO SEC plus FIBO FBC plus W3C Time plus PROV-O plus SKOS plus DCAT 3 plus SHACL, with a thin lksb: in-house module under 100 classes), an 8-stage construction pipeline that converges Track 1 (R2RML over the cloud data warehouse plus an open table-format lake plus the retail master-data MDM platform) and Track 2 (LLM extraction over the enterprise document store’s credit memos, KYC files, and advisor notes) at one identity, and a hybrid serving tier (the canonical RDF triple store on SPARQL plus a property-graph traversal-store view, see Appendix A for the specific tools). This piece picks up from there: the same graph absorbs the governance back office.
Tuesday Morning, Q1 2026: The Examiner Asked Five Questions
A Federal Reserve examiner sat down with Lakeside’s BCBS 239 program lead two months after the Müller-family incident and asked a five-part question. For Counterparty Credit Exposure on the Q1 2026 risk submission, attribute by attribute, the examiner wanted to see: every source field that contributes, every transformation that touches each contributing field, every owner accountable at every stage, every quality measurement that ran on every execution, and every change to the underlying ontology in the reporting period. The examiner had read the 2024 ECB Guide on effective risk data aggregation and risk reporting, and attribute-level Data Lineage was one of the seven priority areas the Guide had named. The five-part question was a structured BCBS 239 Principle 3 ask in the new ECB shape.
Pre-KG, the program lead would have responded the way Northwind Bancshares responded in Part 10: three weeks of reconciling four governance stores (catalog, lineage tool, glossary, policy register) that did not share identifiers, a partial answer, and the next quarter’s risk-management committee inheriting a known gap. Post-KG, the program lead opened a SPARQL template, parameterized it with the Counterparty Credit Exposure CDE IRI and the Q1 2026 reporting window, ran it once against the governance endpoint, and produced a structured response with attribute-level traversal, owner attribution, quality scores, and ontology version metadata in under 90 seconds. The examiner asked two follow-up questions in the same session. Both ran as variations on the same template.
The Tuesday-morning examiner question is the governance counterpart to the Monday-morning relationship-banker question from Part 11a. Same bank. Same graph. Different consumer, different reading pattern. The substrate did not change.
Specializing The Governance Ontology Shape To Lakeside
Part 10 locked the governance ontology shape: DCAT 3 plus PROV-O plus SKOS plus FIBO plus a thin in-house module, with six entity types (Dataset, Field, Job/Activity, CriticalDataElement, Policy, Owner) and a small set of typed relationships (partOf, wasGeneratedBy, implementsCDE, validatedAgainstShapes, applies, hasOwner). Lakeside imports that exact shape; the foundational layers from Part 11a already include DCAT 3, PROV-O, SKOS, and SHACL. The governance specialization is not a second ontology. It is a small set of governance-specific extensions to the same lksb: module.
| Governance extension | Lakeside class or property | What it adds beyond Part 11a |
|---|---|---|
| Critical Data Element | lksb:CriticalDataElement rdfs:subClassOf cde:CriticalDataElement | Typed node for each of the ~280 CDEs; carries owner, quality rule, regulatory driver |
| Field-to-CDE link | cde:hasImplementation (CDE to Field; inverse cde:implementsCDE, Field to CDE) | The 1:N edge that anchors attribute-level lineage at the business-concept layer |
| Policy node | lksb:Policy rdfs:subClassOf policy:Policy | Each regulatory citation (BCBS 239 Principle 3, GDPR Art 30, AI Act Art 10) is a node, not a PDF reference |
| Personal data flag | gdpr:PersonalData SKOS Concept attached to Fields and CDEs | The ROPA query traverses this tag to bound personal-data scope |
| Training data flag | lksb:TrainingDataset subclass of dcat:Dataset | Every dataset used to train or evaluate the relationship-banker agent carries this type |
| Quality measurement | dqv:hasQualityMeasurement link from Field or Activity to a Measurement node | Already in Part 11a’s SHACL gate; promoted here to a first-class governance edge |
| Regulatory cross-walk link | lksb:satisfiesPrinciple (Activity to Policy) | The link a SPARQL template walks to answer “which BCBS 239 principles does this control satisfy” |
The discipline from Part 11a holds. The lksb: module stays under 100 classes. Every new governance class either subclasses an existing FIBO, DCAT, PROV-O, or SKOS construct or models a concept that none of those vocabularies cover at sufficient resolution. The thin in-house module is what prevents the governance project from accidentally inventing a parallel ontology for governance purposes. The failure mode this avoids is the one a governance program slides into without that discipline: a governance ontology that grows to hundreds of bespoke classes nobody can explain to FIBO consumers. Lakeside’s governance extensions instead add roughly twenty classes to the eighty already in lksb:.
What this looks like in practice. Most governance-KG projects fail at the ontology stage by assuming the governance vocabulary is its own thing. It is not. Governance is a reading pattern over the same operational graph, with a few additional typed edges. If your governance ontology cannot be expressed as a thin extension of your operational ontology, the operational graph and the governance graph will accidentally diverge, and within twelve months you will be reconciling them the way Northwind reconciled its four stores.
OpenLineage To PROV-O At Lakeside Scale
The connective tissue between the operational pipeline and the governance graph is the OpenLineage to PROV-O bridge that Part 7 introduced and Part 10 made mechanical. At Lakeside, the bridge runs across roughly 4,600 Spark and dbt jobs that emit OpenLineage natively, plus around 200 Airflow-orchestrated ingest pipelines that emit through the OpenLineage Airflow integration. The events land in the OpenLineage metadata store that Lakeside operates as its event store (see Appendix A for the specific tools). A small ingestion service reads from that store, applies the eight-row mapping table from Part 10 (specialized for Lakeside’s Job naming conventions and operator vocabulary), and asserts the result into a designated named graph in the governance KG.
| OpenLineage concept | PROV-O concept | Lakeside KG predicate | Lakeside-specific note |
|---|---|---|---|
| Run | prov:Activity | One node per run; IRI under https://lakeside.com/kg/run/{run-uuid} | Run UUIDs come from the OpenLineage event; never re-minted |
| Job | prov:Plan | prov:wasAssociatedWith from Run to Job | Job IRIs use the namespace + name pattern from the OL spec; one Job per dbt model or Spark application |
| Input dataset | Entity used by Activity | prov:used | Datasets are already DCAT 3 nodes in the governance graph; the Run links them by IRI |
| Output dataset | Entity generated by Activity | prov:wasGeneratedBy | Same Dataset IRI is used across Run, governance reporting, and the Part 9 agent retrieval surface |
| Producer (operator team) | prov:Agent | prov:wasAttributedTo | Lakeside’s Agent IRIs are minted from the HRIS team registry; one Team Agent per dbt model owner |
| Column-level lineage facet | prov:wasDerivedFrom at Field granularity | Field-to-Field edge with the producing Run as middle term | The facet’s transformation type carries through as lksb:transformationKind |
| Data quality facet | dqv:hasQualityMeasurement | Measurement node attached to the Run and propagated to the Outputs | SHACL validation results from Part 11a stage 7 land here as additional Measurements |
| Schema facet | dcat:schema | DCAT schema node per Dataset version | Drives the schema-drift reconciliation that keeps the CDE mappings honest |
The Lakeside-specific operational details are worth stating because they are where most governance programs lose the bridge. First, the named-graph naming convention is https://lakeside.com/graph/lineage/{system}/{release-window} per the Part 8 versioning rules; Q1 2026 lineage from the warehouse lands in https://lakeside.com/graph/lineage/cloud-warehouse/2026-Q1 and never gets overwritten. Second, the seven fields of the provenance contract (the Part 7 fields plus the validatedAgainstShapes field from Part 8) are populated automatically: the Run carries where (system from the OL job namespace), what process (the Job IRI), who (the Agent IRI mapped from HRIS), when (the Run timestamps), trust level (set by source policy: gold for production warehouse jobs, silver for sandbox, bronze for ad hoc query exports), source hash (the Dataset content hash from the OL schema facet), and validatedAgainstShapes (the SHACL ShapeGraph IRI that approved the Run’s outputs). Third, when an OL event arrives without a column-level lineage facet (some legacy Spark jobs do not emit it), the ingestion service flags the Run as silver-tier rather than dropping it, so the governance graph carries the gap explicitly.
The bridge is what turns “we have lineage” (the OpenLineage metadata store has events) into “we can answer attribute-level lineage queries” (the governance graph has typed edges). Most banks have the first; Lakeside built the second.
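The mapping table and the three operational rules above can be sketched in a few lines of Python. This is a minimal illustration, not Lakeside's ingestion service: the event layout follows the OpenLineage run-event shape (run, job, inputs, outputs, and the columnLineage dataset facet), while the `https://lakeside.com/kg/job/` and `https://lakeside.com/kg/dataset/` IRI patterns, the `map_event` function, the `source_tier` parameter, and the sample event are assumptions made for the example. Triples are plain tuples rather than objects from an RDF library.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

RUN_BASE = "https://lakeside.com/kg/run/"
GRAPH_BASE = "https://lakeside.com/graph/lineage/"
KG_BASE = "https://lakeside.com/kg/"  # assumed base for Job and Dataset IRIs


def dataset_iri(ds: Dict) -> str:
    # Hypothetical IRI scheme built from the OpenLineage namespace + name.
    return f"{KG_BASE}dataset/{ds['namespace']}/{ds['name']}"


def map_event(event: Dict, system: str, release_window: str,
              source_tier: str = "gold") -> Tuple[str, List[Triple]]:
    """Map one OpenLineage run event to (named-graph IRI, PROV-O style triples)."""
    graph = f"{GRAPH_BASE}{system}/{release_window}"
    run = RUN_BASE + event["run"]["runId"]  # reuse the OL run UUID, never re-mint
    job = f"{KG_BASE}job/{event['job']['namespace']}/{event['job']['name']}"
    triples: List[Triple] = [
        (run, "rdf:type", "prov:Activity"),
        (run, "prov:wasAssociatedWith", job),
    ]
    if event.get("eventType") == "COMPLETE":
        triples.append((run, "prov:endedAtTime", event["eventTime"]))
    for ds in event.get("inputs", []):
        triples.append((run, "prov:used", dataset_iri(ds)))

    has_column_lineage = False
    for ds in event.get("outputs", []):
        out = dataset_iri(ds)
        triples.append((out, "prov:wasGeneratedBy", run))
        fields = ds.get("facets", {}).get("columnLineage", {}).get("fields", {})
        for column, spec in fields.items():
            for src in spec.get("inputFields", []):
                has_column_lineage = True
                triples.append((f"{out}#{column}", "prov:wasDerivedFrom",
                                f"{dataset_iri(src)}#{src['field']}"))

    # A gold-source run without column-level lineage is kept but marked silver.
    downgrade = source_tier == "gold" and not has_column_lineage
    triples.append((run, "lksb:trustTier", "silver" if downgrade else source_tier))
    return graph, triples


# Illustrative event; identifiers are made up for the example.
SAMPLE_EVENT = {
    "eventType": "COMPLETE",
    "eventTime": "2026-02-03T04:05:06Z",
    "run": {"runId": "3f6c0d2e-0000-4000-8000-000000000001"},
    "job": {"namespace": "dbt", "name": "fct_exposure"},
    "inputs": [{"namespace": "warehouse", "name": "stg_positions"}],
    "outputs": [{
        "namespace": "warehouse", "name": "fct_exposure",
        "facets": {"columnLineage": {"fields": {"exposure_amount": {"inputFields": [
            {"namespace": "warehouse", "name": "stg_positions", "field": "amount"}]}}}},
    }],
}
```

Calling `map_event(SAMPLE_EVENT, "cloud-warehouse", "2026-Q1")` returns the Q1 2026 warehouse lineage graph IRI and the triples to assert into it. Removing the columnLineage facet from the output dataset keeps the run in the graph but records it at silver tier, which is how the gap stays visible to governance queries.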
CDEs As Typed Nodes: Counterparty Credit Exposure Across 22 Fields
Lakeside’s CDE inventory has 280 elements at the CDE meta-model shape (1:20 implementing-field ratio gives roughly 5,600 implementing fields). Pre-KG, the inventory lived in a Confluence space cross-referenced to a spreadsheet that pointed at column paths in the warehouse; the spreadsheet had drifted across two quarterly cycles before the Müller incident exposed it. Post-KG, every CDE is a typed node in the governance graph with the small fixed set of relationships from Part 10.
Counterparty Credit Exposure (CCE) is the worked example because it is the CDE the examiner asked about and because its fan-out is on the larger end at Lakeside (22 implementing fields across 5 systems). The pattern:
| Edge | From CCE node to | What it carries at Lakeside |
|---|---|---|
| cde:typeReference | FIBO concepts | fibo-loan:CreditExposure plus fibo-be:LegalEntity for the counterparty term |
| cde:hasOwner | Treasury Risk Reporting team Agent IRI | The single accountable team; HRIS-sourced; inherits to all 22 implementing fields |
| cde:appliesPolicy | BCBS 239 Principle 3 plus ECB RDARR plus IFRS 9 disclosure | Three Policy nodes; cross-walked once at the CDE level rather than per-field |
| cde:hasQualityRule | SHACL ShapeGraph plus Lakeside DQ rule | “must be non-negative; must reconcile to GL within 0.5%; must include all positions resolved by the operational graph” |
| cde:dependsOn | Five upstream CDEs | Counterparty Identifier, Exposure Amount, Recovery Rate, Default Probability, Collateral Coverage Ratio |
| cde:hasImplementation | 22 Field nodes across 5 systems | Cloud-warehouse fct_exposure (8 fields), table-format-lake position-stream (5 fields), trust accounting GL feed (4 fields), credit-memo-extracted Track 2 fields (3 fields), private-wealth holdings system (2 fields) |
The reconciliation discipline that keeps the CDE-to-Field mapping from rotting is the same one Part 10 specified: every OpenLineage schema-facet event that detects a new column in a tracked Dataset triggers a stewardship task (“does this implement an existing CDE, or is it a new concept”); every column rename updates the cde:hasImplementation edge atomically; every dropped column expires the edge with a prov:invalidatedAtTime annotation rather than deleting it. The Track 2 credit-memo-extracted fields (the three silver-tier fields above) carry a tier annotation forward through every governance query: when the regulator’s question allows silver-tier inputs, those fields appear; when the question requires gold-only inputs (Pillar 3 risk submission), those fields are filtered out at query time.
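The three reconciliation rules (new column raises a stewardship task, rename updates the edge, drop expires the edge) can be sketched as a small diff routine. This is an illustration under assumptions: `ImplementationEdge`, `reconcile`, and the column names are hypothetical, the `invalidated_at` attribute stands in for the prov:invalidatedAtTime annotation, and the rename map is assumed to be supplied by whatever detects renames upstream, since a schema diff alone cannot distinguish a rename from a drop plus an add.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class ImplementationEdge:
    cde_id: str
    field: str
    invalidated_at: Optional[str] = None  # stands in for prov:invalidatedAtTime


def reconcile(edges: List[ImplementationEdge], old_cols: Set[str],
              new_cols: Set[str], renames: Dict[str, str], now: str) -> List[str]:
    """Apply one schema diff to the edges in place; return stewardship tasks."""
    for edge in edges:
        if edge.invalidated_at is not None:
            continue  # already expired by an earlier diff
        if edge.field in renames:
            edge.field = renames[edge.field]   # rename: same edge, new field name
        elif edge.field in old_cols and edge.field not in new_cols:
            edge.invalidated_at = now          # dropped: expire, never delete

    renamed_targets = set(renames.values())
    tasks = []
    for col in sorted(new_cols - old_cols - renamed_targets):
        tasks.append(f"review new column '{col}': existing CDE or new concept?")
    return tasks
```

Because dropped columns are expired rather than removed, the edge list never shrinks, so a query pinned to an earlier reporting window can still see which field implemented the CDE at that time.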
The 22-field fan-out for CCE is on the larger end at Lakeside; the median CDE has 12-15 implementing fields. The largest CDE (Customer Identifier, which spans every customer-touching system) has 64 implementing fields; the smallest CDEs are configuration values with one implementing field each. The 5,600 total implementing-field count is what the catalog crawler sees; the CDE layer is what governance queries against.
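The fan-out arithmetic is easy to check. The sketch below expands the per-system counts from the CCE table into cde:hasImplementation edges and applies a tier filter. The per-system counts and the silver tier of the Track 2 fields come from the text; the gold tier on the other four systems, the placeholder field names, and the function names are assumptions for illustration.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str, str, str]  # (cde, predicate, field, tier)

# (system, implementing-field count, tier). Only the Track 2 tier is stated
# in the text; gold for the remaining systems is an assumption.
CCE_IMPLEMENTATIONS: List[Tuple[str, int, str]] = [
    ("cloud-warehouse/fct_exposure", 8, "gold"),
    ("table-format-lake/position-stream", 5, "gold"),
    ("trust-accounting/gl-feed", 4, "gold"),
    ("track2/credit-memo-extraction", 3, "silver"),
    ("private-wealth/holdings", 2, "gold"),
]


def implementation_edges(cde_id: str,
                         implementations: Iterable[Tuple[str, int, str]]) -> List[Edge]:
    """Expand per-system counts into one hasImplementation edge per field."""
    edges: List[Edge] = []
    for system, count, tier in implementations:
        for i in range(1, count + 1):
            # Placeholder field names; real fields would be column IRIs.
            edges.append((cde_id, "cde:hasImplementation", f"{system}#field-{i}", tier))
    return edges


def visible_fields(edges: Iterable[Edge], allowed_tiers: Set[str]) -> List[Edge]:
    """Keep only the edges whose tier the reporting surface allows."""
    return [e for e in edges if e[3] in allowed_tiers]
```

Under these assumptions the five systems sum to the 22 implementing fields, and a gold-only surface such as the Pillar 3 submission sees 19 of them because the three Track 2 fields are filtered out at query time. The inventory-level figure follows the same way: 280 CDEs at a 1:20 ratio is 5,600 implementing fields.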
Regulatory Cross-Walks: Four Templates Against One Graph
The Tuesday-morning examiner question is one shape of regulatory ask; it is not the only shape. Lakeside operates against four regulators concurrently, each with a different question shape, all running as templates against the same governance graph. The four regulators in scope:
| Regulator | Regulation | What it asks Lakeside |
|---|---|---|
| Federal Reserve plus OCC | BCBS 239 Principle 3 (Accuracy and Integrity) | Attribute-level lineage from risk reports back through every contributing transformation, with quality and ownership |
| ECB (via the EU subsidiary) | ECB RDARR Guide (2024) priority area: attribute-level Data Lineage | Same as BCBS 239 Principle 3 in shape, with stricter expectations on column-level coverage and timeliness of remediation |
| EU subsidiary scope | GDPR Article 30 (Records of Processing Activities) | Records of every processing activity touching personal data, with purpose, recipient, retention, and security measures |
| EU AI Act conformance assessor | EU AI Act Article 10 (Data Governance for high-risk AI systems; application deferred from 2 August 2026 to 2 December 2027 under the May 2026 EU Digital Omnibus agreement, provisional pending formal adoption) | Provenance, collection methodology, bias detection, and quality checks for all data used to train, validate, or test the relationship-banker agent |
Each regulation reduces to a SPARQL template that traverses the same graph along different terminal paths. The four templates are short enough to show in shape if not in full detail.
# BCBS 239 Principle 3 / ECB RDARR: attribute-level lineage for a CDE
SELECT ?field ?activity ?agent ?qualityScore ?ts ?tier
WHERE {
  ?cde lksb:cdeId "CCE-001" .
  ?field cde:implementsCDE ?cde ;
         prov:wasGeneratedBy ?activity .
  ?activity prov:wasAttributedTo ?agent ;
            prov:endedAtTime ?ts ;
            lksb:trustTier ?tier .
  OPTIONAL { ?activity dqv:hasQualityMeasurement ?qualityScore . }
  # Q1 2026 reporting window (half-open, UTC assumed); gold tier only
  FILTER (?ts >= "2026-01-01T00:00:00Z"^^xsd:dateTime &&
          ?ts <  "2026-04-01T00:00:00Z"^^xsd:dateTime &&
          ?tier = "gold")
}
ORDER BY DESC(?ts)
# GDPR Article 30 ROPA: every Activity touching personal data in the EU subsidiary
SELECT ?activity ?purpose ?recipient ?retention ?dataCategory ?tier
WHERE {
  ?activity a prov:Activity ;
            lksb:jurisdiction lksb:EU-Subsidiary ;
            lksb:processingPurpose ?purpose ;
            prov:wasAttributedTo ?recipient ;
            lksb:trustTier ?tier .
  # Datasets the Activity read (prov:used) or wrote (inverse of
  # prov:wasGeneratedBy, which points from the Dataset to the Activity)
  ?activity (prov:used|^prov:wasGeneratedBy) ?dataset .
  ?dataset dcat:schema ?schema .
  ?schema rdfs:member ?field .
  ?field skos:related gdpr:PersonalData ;
         lksb:retentionPolicy ?retention ;
         lksb:dataCategory ?dataCategory .
}
# EU AI Act Article 10: training-data provenance for the relationship-banker agent
SELECT ?trainingDataset ?activity ?agent ?qualityMeasurement ?biasFinding
WHERE {
  ?model lksb:modelId "relationship-banker-agent-v3" .
  ?model prov:wasDerivedFrom+ ?trainingDataset .
  ?trainingDataset a lksb:TrainingDataset ;
                   prov:wasGeneratedBy ?activity ;
                   lksb:trustTier "gold" .
  ?activity prov:wasAttributedTo ?agent .
  OPTIONAL { ?trainingDataset dqv:hasQualityMeasurement ?qualityMeasurement . }
  OPTIONAL { ?trainingDataset lksb:biasFinding ?biasFinding . }
}
# BCBS 239 Principle 6: ad hoc cross-scenario report production
SELECT ?cde ?datasetMaterializing ?stressScenario ?lastRun
WHERE {
  ?cde a cde:CriticalDataElement ;
       cde:appliesPolicy lksb:bcbs239-principle-6 .
  ?fieldImplementing cde:implementsCDE ?cde .
  ?datasetMaterializing prov:wasGeneratedBy ?run ;
                        lksb:stressScenario ?stressScenario .
  # prov:used is asserted on the producing Run (the Activity), not on the Dataset
  ?run prov:used ?fieldImplementing ;
       prov:endedAtTime ?lastRun .
}
The four templates share a common substrate: every query starts from a CDE, a Model, an Activity, or a Dataset that already exists in the operational graph from Part 11a, traverses through PROV-O edges that the OpenLineage bridge populates, filters by trust tier, and projects governance metadata that the seven-field provenance contract guarantees is present. New regulations become new templates. The 2024 Golpayegani et al. paper on KG-based mappings between the EU AI Act and international standards is the methodological grounding for this approach; the SPARQL templates above are the Lakeside-specific application.
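Parameterizing a template is mostly string handling plus input validation. The sketch below renders the attribute-level lineage template for a given CDE and reporting quarter using only the Python standard library. The function names, the CDE-id pattern (inferred from the single example "CCE-001"), and the choice of UTC midnight boundaries are assumptions; PREFIX declarations are omitted, as in the templates above, and submitting the query to an endpoint is left out.

```python
import re
from datetime import date
from string import Template
from typing import Tuple

LINEAGE_TEMPLATE = Template("""\
SELECT ?field ?activity ?agent ?qualityScore ?ts ?tier
WHERE {
  ?cde lksb:cdeId "$cde_id" .
  ?field cde:implementsCDE ?cde ;
         prov:wasGeneratedBy ?activity .
  ?activity prov:wasAttributedTo ?agent ;
            prov:endedAtTime ?ts ;
            lksb:trustTier ?tier .
  OPTIONAL { ?activity dqv:hasQualityMeasurement ?qualityScore . }
  FILTER (?ts >= "$start"^^xsd:dateTime &&
          ?ts <  "$end"^^xsd:dateTime &&
          ?tier = "gold")
}
ORDER BY DESC(?ts)
""")

# Assumed id format, inferred from the single example "CCE-001".
CDE_ID = re.compile(r"[A-Z]{2,10}-\d{3}")


def quarter_window(year: int, quarter: int) -> Tuple[str, str]:
    """Return the half-open [start, end) window of a calendar quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    start = date(year, 3 * quarter - 2, 1)
    end = date(year + 1, 1, 1) if quarter == 4 else date(year, 3 * quarter + 1, 1)
    return f"{start.isoformat()}T00:00:00Z", f"{end.isoformat()}T00:00:00Z"


def render_lineage_query(cde_id: str, year: int, quarter: int) -> str:
    """Fill the template; reject ids that could smuggle SPARQL into the literal."""
    if not CDE_ID.fullmatch(cde_id):
        raise ValueError(f"unexpected CDE id: {cde_id!r}")
    start, end = quarter_window(year, quarter)
    return LINEAGE_TEMPLATE.substitute(cde_id=cde_id, start=start, end=end)
```

`render_lineage_query("CCE-001", 2026, 1)` produces the Q1 2026 query. Validating the id before substitution matters because the value lands inside a quoted literal; a production service would more likely use the parameter-binding facility of its SPARQL client than string substitution.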
KEY INSIGHT: when your governance program treats new regulations as new SPARQL templates against an existing graph rather than new compliance projects against new stores, you have moved past the four-or-five-store anti-pattern that Part 10 named. Lakeside has not eliminated regulatory work; it has eliminated the per-regulation reconciliation work. The per-regulation work that remains is template authoring, owner attestation, and explanatory narrative for the regulator, all of which scale linearly with the number of regulations rather than quadratically.
Trust Tiers Across Lakeside’s Reporting Surfaces
The four-tier trust pattern from Part 7 (gold, silver, bronze, quarantine) extends the governance-reporting policy that Part 10 specified. At Lakeside, the policy is operationalized as a tier-by-surface table that every reporting team carries and every SPARQL template enforces.
| Reporting surface at Lakeside | Allowed tiers | Why this discipline at Lakeside |
|---|---|---|
| Pillar 3 risk submission to Federal Reserve / OCC | Gold only | BCBS 239 Principle 3 expects verified, owner-attested, validated content; submitting silver content would be a finding |
| Y-9C and FFIEC quarterly call reports | Gold only | Same posture; trust-tier laundering at this surface is a regulatory finding |
| EU subsidiary FINREP and COREP submissions to the ECB | Gold only | Same posture under ECB RDARR |
| Internal management reports (ALCO, CRO desk reports) | Gold and silver, with tier-visible per row | Speed of availability matters; tier visibility lets readers calibrate trust |
| Operational dashboards (relationship-banker, AML investigator) | Gold and silver; bronze allowed only with a “preliminary” badge | Bronze is signal where the consumer can see the tier; tier hiding is the failure mode |
| GDPR Article 30 ROPA (EU subsidiary) | All tiers; tier becomes a column | Article 30 expects the full picture, including tentative or in-flight processing activities |
| Internal exploratory analysis and data-science notebooks | All tiers, with tier explicit in returned columns | Bronze is a signal worth surfacing to analysts; mislabeling it as gold is the failure |
| EU AI Act Article 10 training data for the relationship-banker agent | Gold only for production training; silver for evaluation; bronze quarantined | Article 10 requires high-quality training data; tier discipline is the operational implementation |
| EU AI Act Article 12 model behavior logs (post-deployment) | Append-only into a per-model named graph; no tier filtering | Article 12 wants the full audit trail without filtering; the immutability is the assurance |
The mechanical enforcement is the same as the Part 9 trust-tier-aware retrieval pattern: the SPARQL template carries a FILTER (?tier IN (...)) clause appropriate for the surface, the report-generation service refuses to render rows that violate the policy, and the tier appears explicitly in the rendered output where the policy permits silver or bronze content. The Tuesday-morning BCBS 239 template above filters to gold only; the ROPA template returns all tiers with a tier column; the AI Act Article 10 template filters to gold for production training. Three different surfaces, one substrate, three different filter clauses.
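A minimal sketch of that two-point enforcement, assuming a policy table keyed by surface: one function emits the FILTER clause for the template, and a second re-checks every row at report generation. The surface keys, `tier_filter`, `render_rows`, and `TierViolation` are hypothetical names, only four of the surfaces from the table are included, and raising an error is one interpretation of "refuses to render"; "all tiers" is read here as including quarantine.

```python
from typing import Dict, FrozenSet, Iterable, List

TIER_ORDER = ("gold", "silver", "bronze", "quarantine")

# Subset of the tier-by-surface table; keys are illustrative.
SURFACE_POLICY: Dict[str, FrozenSet[str]] = {
    "pillar3_submission": frozenset({"gold"}),
    "management_report": frozenset({"gold", "silver"}),
    "gdpr_ropa": frozenset(TIER_ORDER),
    "ai_act_production_training": frozenset({"gold"}),
}


class TierViolation(Exception):
    """Raised when a row's tier is not allowed on the target surface."""


def tier_filter(surface: str) -> str:
    """Build the FILTER clause the SPARQL template carries for this surface."""
    allowed = [t for t in TIER_ORDER if t in SURFACE_POLICY[surface]]
    values = ", ".join(f'"{t}"' for t in allowed)
    return f"FILTER (?tier IN ({values}))"


def render_rows(surface: str, rows: Iterable[Dict[str, str]]) -> List[str]:
    """Second check at report generation, independent of the query filter."""
    allowed = SURFACE_POLICY[surface]
    show_tier = len(allowed) > 1  # tier is shown wherever non-gold rows may appear
    rendered = []
    for row in rows:
        if row["tier"] not in allowed:
            raise TierViolation(f"{row['field']} is {row['tier']}; not allowed on {surface}")
        rendered.append(f"{row['field']} [{row['tier']}]" if show_tier else row["field"])
    return rendered
```

Checking again at render time is the point of the design: a silver row that slips past retrieval (for example through a cached result or a hand-edited query) still cannot reach a gold-only surface.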
Versioning Across A Quarterly FIBO Release
The governance graph cannot stand still. FIBO publishes quarterly releases (the 2026 Q1 Production release contains roughly 2,450 classes across the FIBO domains, having promoted from the 2025 Q4 release that the worked example below pins against). Every quarterly FIBO release brings additions, deprecations, and occasionally renames. The bank’s Pillar 3 submission for Q1 2026 must reproduce against the FIBO version that was current when the submission was made; the relationship-banker agent in production must run against the latest FIBO without breaking; the ROPA report due in Q2 must reflect any new personal-data classifications that the FIBO Q1 release introduced. The Part 8 named-graph versioning pattern is what makes all three coexist.
Lakeside’s discipline is mechanical. Every FIBO release lands as a new ontology version IRI: https://lakeside.com/onto/fibo-extended/2025-Q4, https://lakeside.com/onto/fibo-extended/2026-Q1. The governance graph carries a named-graph version chain per ontology release; consumers pin to a specific version, or they bind to a stable alias (https://lakeside.com/onto/fibo-extended/current) that the platform redirects. The migration playbook from Part 8 (six stages: impact analysis, design, dual-write, migration window, deprecation enforcement, removal) runs once per quarterly cycle. A typical quarterly cycle at Lakeside is six to eight weeks of dual-write while the candidate version stabilizes, two weeks of consumer migration, then deprecation enforcement and the alias flip.
The pinning matrix is what matters operationally. The Pillar 3 risk submission for Q1 2026 must reproduce against v2025-Q4 because that was the ontology version current when the data was generated; the relationship-banker agent runs against v2026-Q1 because the agent retrains every quarter and the trainer pins to the latest; the ROPA tracks the alias because GDPR ROPA wants the current state, not a frozen snapshot. The platform serves all three from the same graph store; the version IRIs are immutable. When the regulator asks “show me the Q1 2026 Pillar 3 submission against the ontology that was current at the time,” the answer is the named graph at v2025-Q4. When the regulator asks “show me your current ROPA,” the answer is the alias-resolved current view. Same question shape, different version semantics, both serviceable.
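The pinning matrix reduces to a small lookup: immutable version IRIs, one movable alias, and a per-consumer pin that names either a version or the alias. The sketch below is an in-memory illustration under assumptions; `VersionChain`, `lakeside_chain`, and the consumer keys are hypothetical, and a real platform would resolve the alias at the store or HTTP layer rather than in application code.

```python
from typing import Dict, List

ONTO_BASE = "https://lakeside.com/onto/fibo-extended/"


class VersionChain:
    """Append-only list of ontology versions plus a movable 'current' alias."""

    def __init__(self) -> None:
        self.versions: List[str] = []
        self.current: str = ""
        self.pins: Dict[str, str] = {}

    def release(self, version: str) -> None:
        if version in self.versions:
            raise ValueError(f"{version} already released; version IRIs are immutable")
        self.versions.append(version)

    def flip_alias(self, version: str) -> None:
        if version not in self.versions:
            raise ValueError(f"unknown version {version}")
        self.current = version  # rollback is the same operation with an older version

    def pin(self, consumer: str, target: str) -> None:
        if target != "current" and target not in self.versions:
            raise ValueError(f"unknown version {target}")
        self.pins[consumer] = target

    def resolve(self, consumer: str) -> str:
        target = self.pins[consumer]
        return ONTO_BASE + (self.current if target == "current" else target)


def lakeside_chain() -> VersionChain:
    """Build the three-consumer matrix described in the text."""
    chain = VersionChain()
    chain.release("2025-Q4")
    chain.release("2026-Q1")
    chain.flip_alias("2026-Q1")
    chain.pin("pillar3-submission-2026-Q1", "2025-Q4")  # frozen for reproduction
    chain.pin("relationship-banker-agent", "2026-Q1")   # trainer pins to latest
    chain.pin("gdpr-ropa", "current")                   # tracks the alias
    return chain
```

With this chain, the Pillar 3 consumer keeps resolving to the 2025-Q4 IRI regardless of alias flips, while the ROPA consumer follows whatever the alias currently points at.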
What this looks like in practice. Most KG programs handle quarterly FIBO releases in one of two bad ways: they freeze the import (and accumulate technical debt that will explode at year three) or they roll forward unsafely (and discover a year later that a report cannot reproduce). The discipline that keeps Lakeside out of both traps is the named-graph version chain plus the consumer pinning matrix, applied to the ontology and to every governance subgraph. The cost is roughly one engineer-quarter per year of release-management work; the alternative is a multi-quarter ontology-debt remediation project that will surface during a regulatory exam.
Failure Modes Specific To The Governance Layer At Lakeside
Even with the operational graph in place from Part 11a, the governance layer has its own ways to fail. Six failure modes follow directly from the architecture’s seams; Lakeside is designed to avoid each.
| Failure | What it would look like at Lakeside | What the architecture does to prevent it |
|---|---|---|
| Governance KG as a fifth store | A separate governance graph runs alongside the operational graph; the two share no IRIs | The governance graph is the same store as the operational graph; named-graph partitions inside one store, not a separate database |
| OpenLineage events ingested as raw events, not mapped to PROV-O | The OpenLineage metadata store has events; the governance KG cannot answer attribute-level lineage queries | The OL-to-PROV-O ingestion service is required; raw OL events are not the substrate |
| CDE-to-Field mapping rot | The mapping spreadsheet drifts; six months later the regulator’s question hits stale links | cde:hasImplementation reconciles automatically against OpenLineage schema-facet events; mapping rot is structurally prevented |
| Tier laundering at the regulatory submission boundary | A silver-tier Track 2 fact appears in a Pillar 3 row because the tier filter was applied at retrieval but not at report generation | Tier filtering is at every layer (retrieval, prompt assembly, report generation, post-render audit); the Part 9 three-layer pattern extended to reports |
| Schema drift across a quarterly FIBO release | A regulator asks for Q3 2025 lineage; the Q3 2025 ontology is gone; the report cannot reproduce | Named-graph version chain plus consumer pinning matrix; immutable version IRIs; rollback is an alias change |
| Policy-as-document, not policy-as-node | The policy register exports PDFs; the governance KG has Policy nodes but no applies edges; cross-walks cannot run | Every Policy node is connected via applies edges to CDEs, Fields, and Activities at ingestion time; the cross-walk discipline is enforced as part of the policy-register-to-KG sync |
KEY INSIGHT: The fix in every case is to refuse to treat governance as a separate program. The same graph, the same identity discipline, the same provenance contract, the same SHACL gate, the same versioning pattern. When a governance question requires reaching outside this set, the substrate has not yet absorbed the governance layer; the next regulator question will reopen the gap.
Diagnostic: Is Your Governance Layer Ready For Lakeside Discipline?
The diagnostic for whether a firm is ready for the governance layer (not the operational layer; that diagnostic was in Part 11a) is the eight-question variant below. Lakeside answered yes on all eight before the BCBS 239 program lead opened the SPARQL template for the Tuesday-morning examiner.
| Diagnostic question | Yes if… | No means… |
|---|---|---|
| Are CDEs nodes in the governance graph (not rows in a spreadsheet)? | Every CDE has a typed node; cde:hasImplementation edges reconcile against OpenLineage schema events | Spreadsheet rot will surface within six months; build the CDE layer in the graph first |
| Is the OpenLineage to PROV-O bridge running across every CDE-implementing pipeline? | The OpenLineage metadata store plus the ingestion service plus the eight-row mapping plus named-graph-per-release-window are all in production | Attribute-level lineage queries cannot run; the regulator’s question already exceeds your coverage |
| Are Policies first-class nodes with applies edges to CDEs, Fields, and Activities? | Every regulatory citation is a node connected to the artifacts it governs | Policy-as-document means cross-walks cannot run as queries; the per-regulation reconciliation work returns |
| Is the trust tier carried forward through every reporting surface, not just at retrieval? | Pillar 3 templates, ROPA templates, AI Act templates each carry an explicit tier filter; report generation enforces it | Tier laundering is the second incident waiting to happen |
| Is the seven-field provenance contract populated automatically by the OL bridge? | Where, what process, who, when, trust level, source hash, validatedAgainstShapes are all set on every Run | Manual provenance is not sustainable; the bridge or its equivalent is required |
| Is there a named-graph version chain per ontology release, with consumer pinning? | Quarterly FIBO releases land as new version IRIs; consumers pin or bind to alias | Schema drift will break a past report during a regulatory exam |
| Are SPARQL templates the deliverable for new regulations, not new tools? | The next regulation is template authoring + owner attestation, not vendor procurement | The substrate is not yet the substrate; the four-store pattern is reconstructing inside the KG program |
| Is the governance graph the same store as the operational graph, partitioned by named graph? | Operational consumers and regulators read from one substrate; partitioning is logical, not physical | The fifth-store anti-pattern; reconciliation will return through a different door |
A firm that answers yes on six or more is ready to start the governance layer; a firm that answers yes on all eight is in Lakeside posture. The gap between “operational graph is built” and “governance layer is absorbed” is roughly two quarters of focused work for a mid-size bank. Lakeside is at quarter four counting from the program start; quarter two of the governance layer specifically.
Do Next: Governance Layer Discipline Tier List
| Priority | Action | Why It Matters |
|---|---|---|
| Now (this quarter) | Audit how your governance graph (or governance program if no graph yet) treats CDEs. If they are spreadsheet rows instead of typed nodes, that is the highest-leverage fix. | The CDE layer is the load-bearing surface for attribute-level lineage; spreadsheet rot is the most reliable predictor of governance-program failure. |
| Now (this quarter) | Confirm that your OpenLineage emission covers every pipeline that touches a CDE-implementing dataset. Backfill emission before extending governance investment in any other direction. | The 2024 ECB RDARR Guide and the EU AI Act Article 10 both demand attribute-level Data Lineage; the bridge is what turns OL events into governance answers. |
| Next (next two quarters) | Author SPARQL templates for BCBS 239 Principle 3, GDPR Article 30 ROPA, and EU AI Act Article 10 against your governance graph. Run them quarterly as part of the standard control set. | Cross-walking regulations is what a governance graph is for; new regulations become new templates, not new tools. |
| Next (next two quarters) | Extend the trust-tier policy from agent retrieval to every reporting surface. Make tier a first-class field on every reportable triple. Enforce it at report generation, not just at retrieval. | Tier laundering at the regulatory boundary is a finding; tier-at-retrieval-only is the second incident waiting to happen. |
| Soon (next year) | Stand up a named-graph version chain per ontology release. Pin regulatory submissions by Version IRI; bind operational consumers and the agent to the alias. Run a quarterly FIBO release through the migration playbook to validate the discipline. | A governance program that cannot reproduce a 2025 report against the 2025 ontology has the same archeology problem as Northwind, just delayed by two years. |
| Soon (next year) | If your governance graph and your operational graph are still separate stores, plan the merge. Same store, named-graph partitioning, shared IRIs. | The fifth-store anti-pattern is the most common reason a governance KG program fails to absorb the regulatory layer it was supposed to absorb. |
| Eventually (when stable) | Document and publish a minimum viable consumer contract for every consumer of the governance graph (Pillar 3 reporting team, ROPA team, agent team) per the Part 8 contract pattern. | Contracts are the discipline that lets the governance graph evolve without breaking past reports; without them, schema evolution is a recurring incident. |
What Comes Next: Part 11c
Part 11b covered how Lakeside’s governance back office runs against the same graph as Part 11a’s operational front office: the OL-to-PROV-O bridge across Spark and dbt, CDEs as typed nodes anchoring attribute-level lineage, four SPARQL templates that absorb four regulators, the trust-tier-by-surface table, and the named-graph version chain that survives quarterly FIBO releases.
Part 11c takes the same graph and points the relationship-banker agent at it. The CoALA four-layer memory model from Part 9 maps to named graphs (semantic memory plus episodic memory) plus the agent’s working context plus a skill subgraph for procedural memory. The trust-tier-aware retrieval pattern enforces three policies in production: portfolio decisions strict-tier-floor (gold-only), client-meeting prep tier-segregated (gold plus silver in separate context), advisor-facing summaries tier-explicit-citation (every claim carries a tier badge). Part 11c also closes the trilogy with the cross-cutting lessons (what Lakeside got wrong, the contract discipline that keeps the graph operable, the cost-modeling preview that points to Appendix B) and a Do Next table that spans the foundation, the operational layer, the governance layer, and the agent layer.
One graph, one identity discipline, one provenance contract, three use cases. The relationship-banker agent is the third.
Sources & References
- Basel Committee on Banking Supervision: BCBS 239 Principles for effective risk data aggregation and risk reporting (2013)
- ECB Guide on effective risk data aggregation and risk reporting (RDARR) (2024)
- EU AI Act Article 10: Data and Data Governance (2024)
- GDPR Article 30: Records of processing activities (2018)
- OpenLineage Specification (2024)
- OpenLineage Column-Level Lineage Dataset Facet (2024)
- Marquez Project: Reference Implementation of OpenLineage (2024)
- W3C PROV-O: The PROV Ontology (2013)
- W3C DCAT 3: Data Catalog Vocabulary (2024)
- W3C SKOS: Simple Knowledge Organization System (2009)
- FIBO Quarterly Release Notes (EDM Council) (2026)
- FIB-DM: Finance Domain ontology transformed into an Enterprise Data Model (FIBO 2026 Q1 class count) (2026)
- FinCEN Beneficial Ownership Information Reporting Rule (2024)
- Golpayegani et al.: An Open Knowledge Graph-Based Approach for Mapping Concepts and Requirements between the EU AI Act and International Standards (2024)
- Gibson Dunn: EU AI Act Omnibus Agreement - Postponed High-Risk Deadlines and Other Key Changes (2026)