An Enterprise Data Governance Glossary Operators Can Use

A practical enterprise data governance glossary that turns business intelligence, stewardship, metadata, security, privacy, quality, and lifecycle terms into usable review language.

By Jovani Pink February 20, 2026 18 min — Systems & Complexity Notes

Outcome focus: Created a shared vocabulary and term-entry contract that helps governance, data engineering, analytics, security, and business teams align definitions before certifying data products.

The first enterprise glossary failed because it defined words and left decisions untouched.

The definitions were reasonable. Business Intelligence meant turning raw data into information for decisions. Data Steward meant the business role responsible for metadata and data quality. Metadata meant data about data. Data retention meant how long data was kept. Nobody argued much with the words.

Then the customer dashboard shipped.

Sales used one definition of active customer. Finance used another. Support wanted canceled customers included while service obligations were still open. Security wanted a sensitive field masked. Analytics wanted drill-through access to transaction-level detail. Legal asked whether the retention schedule allowed the historical table to exist in the first place.

The glossary had entries. It did not have operating force.

That is the standard I use now: a governance glossary is useful only if it changes how a data product is reviewed, approved, accessed, monitored, and retired. A definition should name the thing, but it should also tell the team who owns it, where it appears, how it is measured, and what decision it affects.

This post is a practical glossary for enterprise data governance. It is not a legal taxonomy, and it is not a replacement for frameworks like EDM Council DCAM, NIST CSF 2.0, NIST Privacy Framework, or NIST SP 800-53. It is the operating language I want in the room when business, analytics, platform, security, privacy, and governance teams have to make the same data product mean the same thing.

How to Use the Glossary

A glossary entry should answer more than "what does this term mean?"

It should answer:

  • Who owns the term?
  • Which data products use it?
  • Which source is authoritative?
  • Which quality rules prove it is fit for use?
  • Which privacy or security controls apply?
  • Which downstream decisions break if the meaning changes?

A glossary term becomes useful when it connects definition, rule, product, control, decision, and evidence.

The tradeoff is precision over speed. It is faster to let every team define terms locally. It is also how enterprises end up with five revenue numbers, three customer counts, and a privacy review that discovers sensitive data after the dashboard is already trusted.

The better path is slower at the definition boundary and faster everywhere downstream.

Business and Analytics Terms

These terms describe how data becomes decision support.

| Term | Working definition | Review question |
| --- | --- | --- |
| Business Intelligence (BI) | Tools, processes, and practices that turn raw or modeled data into information for decision-making. | Which decision or operating rhythm does this BI asset support? |
| Dashboard | A monitoring surface that shows current or frequently refreshed status. | Is the dashboard for live monitoring, or is it being used as a static report? |
| Scorecard | A performance snapshot against targets, goals, or thresholds. | Are targets approved, current, and owned? |
| Report | A structured analytical output, often with more detail, context, and interpretation than a dashboard. | Is the report exploratory, official, regulatory, or operational? |
| Drill down | Navigation from a summary level into lower levels within the same hierarchy, such as year to quarter to month. | Does the hierarchy match the governed dimensional model? |
| Drill through | Navigation from one analytical view into a related detail page or dataset filtered to a selected context. | Is the detail view governed at the same classification and access level? |
| Metric | A quantifiable measure of activity, quality, risk, cost, or outcome. | Is the calculation documented and reproducible? |
| Key Performance Indicator (KPI) | A metric tied to a strategic or operational target. | Who owns the target, tolerance, and interpretation? |
| OLAP | Online Analytical Processing: multidimensional analysis across measures, dimensions, hierarchies, and aggregations. | Are dimensions, grains, and aggregations consistent across tools? |
| Data mining | Statistical or computational analysis to discover patterns, associations, clusters, anomalies, or predictive signals. | Is the discovered pattern approved for action or only exploration? |
| Semantic layer | A governed business-facing model of metrics, dimensions, relationships, and calculations. | Is the semantic layer the source for official BI definitions? |
| Self-service analytics | A model where business users explore governed data with approved tools and guardrails. | Which datasets are certified for self-service use? |

Microsoft's Power BI drillthrough documentation is a useful reminder that navigation features carry governance implications. Drill-through often moves a user from summary to detail. That can change privacy risk, row-level access needs, and interpretation.
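One way to make the drill-through review question checkable is to compare the classification of the summary view with the classification of the detail view it opens. The sketch below is illustrative only: the classification levels, their ordering, and the function name are assumptions, not part of Power BI or any catalog product.

```python
# Hypothetical classification levels, ordered from least to most restrictive.
CLASSIFICATION_RANK = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}


def drill_through_needs_review(summary_level: str, detail_level: str) -> bool:
    """Return True when the detail view is more restrictive than the summary.

    A user cleared for the summary is then not automatically cleared for
    the detail, so the navigation path needs its own access decision.
    """
    return CLASSIFICATION_RANK[detail_level] > CLASSIFICATION_RANK[summary_level]


print(drill_through_needs_review("internal", "confidential"))      # detail is stricter
print(drill_through_needs_review("confidential", "confidential"))  # same level
```

A check like this does not replace row-level security. It only flags the navigation paths where the access question has to be asked before the report ships.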

Governance Roles and Structures

These terms describe who can define, approve, implement, and escalate.

| Term | Working definition | Review question |
| --- | --- | --- |
| Data Governance (DG) | The operating system for managing, improving, protecting, and using data as an enterprise asset. | Which decision rights, artifacts, and cadences make the governance real? |
| Governance framework | The roles, policies, processes, standards, controls, and measures that define how governance runs. | Is the framework used in release gates or only documented? |
| Governance maturity model | A staged way to assess how repeatable, measured, and embedded governance practices are. | What evidence moves a domain from one level to the next? |
| Data Owner | Senior business leader accountable for domain data outcomes, risk, source-of-truth decisions, and policy approval. | Can this person approve tradeoffs when functions disagree? |
| Data Steward | Business-facing authority for meaning, quality rules, classifications, usage guidance, and issue triage. | Has the steward translated definitions into testable rules? |
| Data Custodian | Technical role that implements controls, pipelines, access, metadata capture, lineage, and operational reliability. | Are approved governance rules automated and observable? |
| Business SME | Functional expert who validates process reality, edge cases, and fitness for use. | Which process or report does the SME represent? |
| Data Steward Working Group | Domain or cross-domain forum where stewards coordinate definitions, rules, issues, and changes. | Does it resolve issues or only discuss them? |
| Executive Sponsor | Senior leader who provides funding, advocacy, escalation, and strategic priority for the governance program. | What decision can the sponsor unblock? |
| Governance co-chair | Leader responsible for running governance forums, managing agendas, tracking decisions, and linking working groups to executives. | Are decisions captured with owners and due dates? |
| EDGC | Enterprise Data Governance Committee: escalation body for cross-domain policy, exceptions, and unresolved risk. | Which issues qualify for escalation? |

I wrote a companion role model in "Data Governance Roles Need Decision Rights." The short version here is simple: owners decide, stewards define, custodians implement, SMEs validate, and the committee escalates enterprise risk.

Governance Concepts

These terms describe the rules of the operating model.

| Term | Working definition | Review question |
| --- | --- | --- |
| Data governance standard | A mandatory rule or practice adopted by governed domains. | How is compliance measured and enforced? |
| Policy | A formal statement of required behavior, risk posture, or control expectation. | Who approved it, and what happens when it is violated? |
| Standard | A specific required implementation pattern or minimum bar. | Is it testable? |
| Procedure | Step-by-step process for executing a policy or standard. | Who follows it, and how is evidence captured? |
| Control | A safeguard or process that reduces risk or enforces policy. | Is the control preventive, detective, or corrective? |
| Exception | Approved deviation from policy, standard, or threshold. | Who accepted the risk, for how long, and with what compensating control? |
| Waiver | Temporary permission to proceed despite unmet criteria. | What expiry date and remediation plan exist? |
| Data domain | Business area with related concepts, processes, data products, and ownership. | Are domain boundaries clear enough to assign accountability? |
| Data product | Governed dataset, view, API, feature set, or analytical asset designed for consumption. | What contract proves it is fit for use? |
| Source of truth | The authoritative source used to resolve conflicts for a defined data element or domain. | Is authority scoped by purpose and time? |
| Golden record | Consolidated best representation of an entity, often produced through matching, survivorship, and merge rules. | Which survivorship rules choose winning values? |
| Data contract | Explicit agreement for schema, semantics, quality, freshness, ownership, and change behavior. | Does breaking the contract block release? |
| Certification | Governance approval that a data asset meets defined quality, metadata, access, lineage, and ownership standards. | What evidence supports the certified label? |

The failure mode is treating "source of truth" as a universal title. It is rarely universal. A billing system may be authoritative for invoice status. A CRM may be authoritative for account owner. A support platform may be authoritative for service obligations. A governed domain needs scoped authority, not slogans.
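Scoped authority can be written down as a small registry rather than a slogan. The sketch below is a minimal illustration: the systems match the examples in the paragraph above, while the element names, the dates, and the `authoritative_system` helper are hypothetical.

```python
from datetime import date
from typing import NamedTuple, Optional


class Authority(NamedTuple):
    element: str      # the data element the authority is scoped to
    system: str       # system treated as authoritative for that element
    valid_from: date  # authority is scoped by time as well as by element


# Hypothetical registry; the dates are placeholders.
AUTHORITIES = [
    Authority("invoice_status", "billing", date(2020, 1, 1)),
    Authority("account_owner", "crm", date(2020, 1, 1)),
    Authority("service_obligation", "support", date(2020, 1, 1)),
]


def authoritative_system(element: str, as_of: date) -> Optional[str]:
    """Return the system that is authoritative for one element on a given date."""
    candidates = [a for a in AUTHORITIES if a.element == element and a.valid_from <= as_of]
    if not candidates:
        return None  # no scoped authority on record: escalate rather than guess
    # The most recent assignment wins when authority has changed over time.
    return max(candidates, key=lambda a: a.valid_from).system


print(authoritative_system("invoice_status", date(2026, 2, 20)))
print(authoritative_system("customer_name", date(2026, 2, 20)))
```

Returning `None` for an unregistered element is deliberate. A missing entry is a governance gap to resolve, not a reason to fall back to whichever system is most convenient.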

Data Management Concepts

These terms describe how data is created, moved, described, and made reusable.

| Term | Working definition | Review question |
| --- | --- | --- |
| ETL | Extract, Transform, Load: data is transformed before loading into the target. | Where are transformation rules versioned and tested? |
| ELT | Extract, Load, Transform: data is loaded first, then transformed inside the target platform. | Which layers are raw, curated, and certified? |
| MDM | Master Data Management: rules, processes, and systems that maintain authoritative shared entities such as customer, product, supplier, or employee. | Which domain entity needs a golden record, and why? |
| Reference data | Shared code sets, classifications, hierarchies, and lookup values used across systems. | Who approves changes to shared codes and hierarchies? |
| Metadata | Data about data: meaning, structure, ownership, lineage, classification, quality, use, and context. | Is the metadata complete enough to support trust and impact analysis? |
| Business metadata | Business definitions, policies, owners, usage constraints, classifications, and context. | Can a business user understand and use it? |
| Technical metadata | Schemas, data types, jobs, tables, columns, partitions, code, lineage, and operational properties. | Can an engineer trace and operate it? |
| Operational metadata | Runtime information such as freshness, failures, volume, latency, cost, and usage. | Can the team see whether the asset is healthy? |
| Metadata management | Processes and tools for maintaining metadata quality, access, lineage, and discoverability. | Who keeps metadata current after release? |
| Data dictionary | Technical inventory of fields, attributes, formats, constraints, and definitions. | Is it synchronized with actual schemas? |
| Data catalog | Searchable inventory of data assets, metadata, ownership, classifications, lineage, and usage signals. | Can consumers find certified assets and understand restrictions? |
| Business glossary | Approved vocabulary of business terms, definitions, owners, and relationships. | Are glossary terms linked to physical data assets? |
| Data lineage | Record of where data originates, how it moves, how it transforms, and where it is consumed. | Can the team trace impact before changing a field? |
| Data profiling | Analysis of data structure, values, distributions, patterns, nulls, duplicates, and anomalies. | Did profiling produce rules or only observations? |

Microsoft Purview's data governance glossary and lineage documentation are useful examples of how catalog terms, classifications, assets, and lineage become a connected operating surface.
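The lineage review question, whether the team can trace impact before changing a field, comes down to a graph walk. The sketch below assumes lineage is available as a simple mapping from each asset to its direct consumers. The asset names are borrowed from the term entry contract later in this post, and the structure is a simplification, not the Purview data model.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
DOWNSTREAM = {
    "crm.account_contract_status": ["customer_profile_certified"],
    "customer_profile_certified": ["executive_customer_scorecard", "retention_model_features"],
    "executive_customer_scorecard": [],
    "retention_model_features": [],
}


def impacted_assets(changed: str) -> list[str]:
    """Breadth-first walk returning every asset downstream of a change."""
    seen = {changed}  # guards against cycles leading back to the start
    queue = deque([changed])
    order: list[str] = []
    while queue:
        current = queue.popleft()
        for consumer in DOWNSTREAM.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


print(impacted_assets("crm.account_contract_status"))
```

The returned list is the notification list for a breaking change: every owner of an asset in it should hear about the change before it ships.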

The most common mistake is separating glossary and catalog work. A glossary without asset links is vocabulary. A catalog without business terms is inventory. Governance needs both.
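Both gaps are easy to detect once glossary terms and catalog assets can be exported side by side. The sketch below assumes two such exports exist as plain Python structures. The term and asset names are hypothetical, and a real catalog would supply them through its own API.

```python
# Hypothetical exports: glossary terms, and the terms tagged on each catalog asset.
glossary_terms = {"active customer", "billable customer", "churned customer"}
asset_terms = {
    "customer_profile_certified": {"active customer", "billable customer"},
    "legacy_customer_extract": set(),  # cataloged but never linked to a term
}

# Every term that at least one asset points at.
linked_terms = set().union(*asset_terms.values())

# Terms no asset points at: vocabulary without asset links.
terms_without_assets = sorted(glossary_terms - linked_terms)
# Assets carrying no business term: inventory without meaning.
assets_without_terms = sorted(name for name, terms in asset_terms.items() if not terms)

print(terms_without_assets)
print(assets_without_terms)
```

Either list being non-empty is a steward work item, and tracking both counts over time is a cheap measure of whether glossary and catalog work are converging.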

Security, Privacy, and Compliance Terms

These terms describe how data is protected and constrained.

| Term | Working definition | Review question |
| --- | --- | --- |
| Information security | Policies, controls, and practices that protect confidentiality, integrity, and availability. | Which control objective applies to this asset? |
| Privacy | Rules and practices for responsible collection, use, sharing, retention, and disposal of personal or sensitive data. | What purpose, lawful basis, notice, and minimization constraints apply? |
| Data security | Technical and administrative protection for data assets, including access control, encryption, masking, monitoring, and incident response. | What protects the data at rest, in transit, and in use? |
| Data classification | Categorization of data by sensitivity, confidentiality, regulatory obligation, or handling requirement. | Is classification applied at dataset and field level? |
| Sensitive data | Data requiring additional protection because disclosure, misuse, or alteration creates harm or legal risk. | Which fields require masking, approval, or special handling? |
| PII | Personally identifiable information: data that identifies or can reasonably identify a person. | Is this field direct, indirect, derived, or linkable? |
| PHI | Protected health information: individually identifiable health information regulated under HIPAA. | Is the organization a covered entity, business associate, or outside HIPAA scope? |
| Data Loss Prevention (DLP) | Processes and technologies that detect, classify, monitor, and prevent unauthorized exposure or exfiltration. | Are DLP findings routed to owners and remediated? |
| Data masking | Obscuring sensitive values while preserving some operational utility. | Is masking irreversible, format-preserving, dynamic, or only display-level? |
| Tokenization | Replacing sensitive values with tokens managed by a protected mapping service. | Who can re-identify, and under what approval? |
| Anonymization | Transformation intended to prevent identification of individuals. | Has re-identification risk been assessed for the actual context? |
| De-identification | Removal or alteration of identifiers to reduce privacy risk, often under a specific regulatory or analytical framework. | Which method and evidence prove the data is de-identified enough for the use? |
| Pseudonymization | Replacing identifiers while retaining a way to relink with additional information. | Is the key separated and controlled? |
| RBAC | Role-Based Access Control: granting permissions through assigned roles. | Do roles map to business need and least privilege? |
| Audit logging | Records of access, modification, administrative action, and policy events. | Are logs complete enough to reconstruct what happened? |
| Audit trail | End-to-end evidence chain showing who did what, when, and under which approval. | Can audit evidence survive staff turnover and tool migration? |
| Data ethics | Principles for fair, accountable, transparent, and responsible data collection and use. | Could the use be legal but still unacceptable? |
| Regulatory compliance | Meeting applicable legal, contractual, and industry obligations. | Which regulation or control actually applies? |
| Data residency | Requirement that data be stored in a defined geography. | Which storage, backup, and replication locations are in scope? |
| Data sovereignty | Legal or jurisdictional control over data based on where it is located, processed, or accessed. | Which laws govern processing and access? |

For security and privacy terms, current link-outs matter. NIST CSF 2.0 provides a general cyber risk management frame. NIST SP 800-53 organizes detailed control families such as access control, audit and accountability, incident response, PII processing and transparency, and system and information integrity. NIST's RBAC project gives historical and technical context for role-based access control.
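The core RBAC idea, that permissions reach users only through roles, fits in a few lines. The sketch below is a teaching illustration with hypothetical role, user, and permission names. It is not drawn from NIST's model or from any product's API.

```python
# Hypothetical roles and permissions for one certified asset.
ROLE_PERMISSIONS = {
    "executive_viewer": {"read_summary"},
    "customer_analyst": {"read_summary", "read_detail_masked"},
    "customer_steward": {"read_summary", "read_detail_masked", "read_detail_unmasked"},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {
    "dana": {"executive_viewer"},
    "lee": {"customer_analyst", "executive_viewer"},
}


def is_allowed(user: str, permission: str) -> bool:
    """A user holds a permission only through a role; unknown users get nothing."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed("dana", "read_detail_masked"))
print(is_allowed("lee", "read_detail_masked"))
```

The least-privilege review then becomes a question about the two tables: does each role carry only the permissions its business need requires, and does each user hold only the roles they need?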

For de-identification, Google Sensitive Data Protection documents transformations such as masking, and HHS explains HIPAA de-identification through Expert Determination and Safe Harbor. For GDPR-oriented teams, the European Commission's GDPR principles are a cleaner anchor than secondhand summaries.
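The difference between display-level masking and pseudonymization is easier to see in code. The sketch below uses only the Python standard library. The function names and the masking format are assumptions for illustration, and neither function anonymizes data or meets a regulatory de-identification standard on its own.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Display-level masking: keep the first character and the domain."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}{'*' * max(len(local) - 1, 0)}@{domain}"


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash (HMAC-SHA256) used as a stable pseudonym.

    The same value and key always yield the same pseudonym, so records can
    still be joined. Whoever holds the key can recompute the pseudonym for a
    known identifier, which is why the key must be separated and controlled.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


print(mask_email("jane@example.com"))
print(pseudonymize("jane@example.com", b"example-key-do-not-use"))
```

Masking here changes only what is displayed, while the pseudonym replaces the stored identifier. They answer different review questions in the table above and should not be treated as interchangeable.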

Lifecycle Management Terms

These terms describe how data quality and data lifespan are governed.

| Term | Working definition | Review question |
| --- | --- | --- |
| Data quality | Fitness of data for intended use across dimensions such as accuracy, completeness, consistency, timeliness, validity, uniqueness, and reliability. | Which quality dimensions are critical for this decision? |
| Accuracy | Data correctly represents the real-world object, event, or measurement. | What source or process verifies correctness? |
| Completeness | Required values or records are present. | Which missing values block use? |
| Consistency | Values do not conflict across systems, records, or time. | Which conflicts need survivorship rules? |
| Timeliness | Data is current enough for its use. | What freshness SLA matters? |
| Validity | Values conform to allowed formats, ranges, codes, and rules. | Which checks are automated? |
| Uniqueness | Entities or records are not duplicated beyond accepted rules. | What match and merge logic applies? |
| Reliability | Data can be depended on repeatedly under expected conditions. | What monitoring proves stability? |
| Data retention | Policy defining how long data is stored for business, legal, risk, or operational needs. | Which clock starts retention? |
| Archival | Moving inactive data to long-term storage while preserving access, integrity, and policy compliance. | Who can retrieve archived data and why? |
| Disposal or disposition | Secure deletion, destruction, anonymization, or transfer at the end of the lifecycle. | What evidence proves disposal happened? |
| Legal hold | Suspension of normal disposal because of litigation, investigation, or regulatory need. | Which datasets and backups are included? |
| Data minimization | Collecting, storing, and processing only what is necessary for the defined purpose. | Which fields can be removed without harming the purpose? |
| Purpose limitation | Using data only for specified, legitimate, and compatible purposes. | Is the secondary use approved? |
| Storage limitation | Keeping identifiable personal data no longer than necessary for the purpose, subject to legitimate exceptions. | Which retention rule prevents indefinite storage? |

Lifecycle terms are where privacy, cost, and analytics collide. Analysts often want history. Legal may need retention. Privacy may require minimization and deletion. Platform teams want storage cost under control. The glossary should not pretend those tensions disappear. It should name the decision owner and the evidence required.
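A retention rule and a legal hold can be combined into one disposal decision. The sketch below is a minimal illustration that assumes the retention clock starts on the snapshot date. That is only one possible answer to "which clock starts retention?", and the function name and parameters are hypothetical.

```python
from datetime import date


def disposal_eligible(snapshot_date: date, as_of: date,
                      retention_years: int, legal_hold: bool) -> bool:
    """Return True when retention has elapsed and no legal hold applies."""
    if legal_hold:
        return False  # a legal hold suspends normal disposal
    target_year = snapshot_date.year + retention_years
    try:
        expires = snapshot_date.replace(year=target_year)
    except ValueError:
        # Feb 29 snapshot with a non-leap target year: fall back to Feb 28.
        expires = snapshot_date.replace(year=target_year, day=28)
    return as_of >= expires


print(disposal_eligible(date(2018, 1, 31), date(2026, 2, 20), 7, legal_hold=False))
print(disposal_eligible(date(2018, 1, 31), date(2026, 2, 20), 7, legal_hold=True))
```

The function only says a snapshot may be disposed of. The evidence that disposal actually happened still has to be captured separately.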

Term Entry Contract

The glossary becomes operational when each governed term has a minimum contract.

governance-glossary-entry.yaml
term: "active customer"
domain: "customer"
status: "approved"
owner: "customer data owner"
steward: "customer data steward"
definition: "A customer with an active billable contract during the reporting period."
business_context: "Used for executive reporting, retention analysis, and revenue operations."
authoritative_source: "crm.account_contract_status"
related_terms:
  - "billable customer"
  - "service obligation"
  - "churned customer"
quality_rules:
  - name: "contract status is populated"
    threshold: ">= 99.5 percent"
  - name: "reporting period has one status per account"
    threshold: "100 percent for certified reporting"
security_and_privacy:
  classification: "confidential"
  pii_fields:
    - "account_contact_email"
  access_model: "RBAC with steward approval for detail-level drill-through"
lineage:
  source: "crm"
  certified_asset: "customer_profile_certified"
  consumers:
    - "executive_customer_scorecard"
    - "retention_model_features"
change_control:
  breaking_change_requires:
    - "steward review"
    - "owner approval"
    - "consumer migration note"
retention:
  rule: "retain certified monthly snapshots for 7 years unless legal hold applies"
evidence:
  - "definition approval record"
  - "quality dashboard"
  - "lineage registration"
  - "access review log"

This looks heavier than a plain definition because plain definitions are not enough for enterprise use. A term that drives reporting, access, AI features, compliance, or customer action needs ownership, source, rules, controls, lineage, and change behavior.
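The quality rules in the entry only become tests when something evaluates them. The sketch below assumes the thresholds are restated as numeric minimum percentages (the YAML above stores them as text) and that a profiling or monitoring job supplies measured values. The measurements shown are invented for illustration.

```python
# Quality rules from the entry above, with thresholds restated as numeric
# minimum percentages (an assumption; the YAML stores them as text).
quality_rules = [
    {"name": "contract status is populated", "min_percent": 99.5},
    {"name": "reporting period has one status per account", "min_percent": 100.0},
]

# Hypothetical measurements from a profiling or monitoring job.
measured = {
    "contract status is populated": 99.8,
    "reporting period has one status per account": 99.9,
}


def failing_rules(rules: list[dict], results: dict[str, float]) -> list[str]:
    """Return the names of rules that are unmeasured or below threshold."""
    failures = []
    for rule in rules:
        value = results.get(rule["name"])
        if value is None or value < rule["min_percent"]:
            failures.append(rule["name"])
    return failures


print(failing_rules(quality_rules, measured))
```

Treating an unmeasured rule as a failure is a design choice: a rule nobody measures should not count as passing when the asset comes up for certification.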

What I Would Add Before Calling It Complete

The starter glossary above covers the core vocabulary from business analytics, roles, governance, data management, security, privacy, compliance, and lifecycle management. Before using it as an enterprise standard, I would add a few organization-specific columns:

| Column | Why it matters |
| --- | --- |
| Domain | Prevents global terms from hiding local ownership |
| Owner | Names final accountability |
| Steward | Names the person who maintains meaning and rules |
| System or asset links | Connects words to real tables, dashboards, APIs, and models |
| Classification | Connects meaning to access and handling |
| Quality rules | Turns definitions into tests |
| Consumers | Shows blast radius when a term changes |
| Change history | Preserves decisions and reversals |
| Status | Separates proposed, approved, deprecated, and retired terms |

The status field is underrated. Proposed terms should not be treated like approved terms. Deprecated terms should not vanish quietly. Retired terms should leave a pointer to their replacement or the reason they stopped being valid.
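Those status rules can be enforced rather than remembered. The sketch below is one possible encoding: the four states come from the table above, while the allowed transitions and the `change_status` helper are assumptions that a real program would adjust.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


# Hypothetical transition rules; a real program may allow other paths.
ALLOWED = {
    Status.PROPOSED: {Status.APPROVED, Status.RETIRED},
    Status.APPROVED: {Status.DEPRECATED},
    Status.DEPRECATED: {Status.RETIRED, Status.APPROVED},
    Status.RETIRED: set(),
}


def change_status(current: Status, target: Status,
                  replacement: Optional[str] = None,
                  reason: Optional[str] = None) -> Status:
    """Validate a status change; retiring a term requires a pointer or a reason."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {target.value} is not allowed")
    if target is Status.RETIRED and not (replacement or reason):
        raise ValueError("retired terms need a replacement or a reason")
    return target


print(change_status(Status.DEPRECATED, Status.RETIRED, replacement="billable customer"))
```

In this sketch an approved term cannot jump straight to retired. It has to pass through deprecated first, which gives consumers a window to migrate.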

The Tradeoff

Glossaries slow people down at the start.

Someone has to decide whether "customer" means buyer, account, household, patient, subscriber, merchant, tenant, or legal entity. Someone has to separate dashboard convenience from source-of-truth authority. Someone has to say that a field cannot be used for a new purpose until privacy and access have been reviewed.

That friction is the cost of enterprise meaning.

The alternative is local speed and global confusion. Every team moves quickly until a metric dispute reaches leadership, a model trains on an unstable feature, a regulatory request takes weeks, or a data quality incident reveals that nobody knows which definition was official.

I would rather pay the definition cost once and attach it to artifacts than keep paying the trust tax forever.

The Review Checklist

Use this checklist when adding or approving a glossary term.

| Check | Pass condition |
| --- | --- |
| Meaning | Definition is specific enough to distinguish from adjacent terms |
| Owner | Accountable owner and steward are named |
| Source | Authoritative source is scoped and linked |
| Assets | Physical datasets, reports, models, or APIs are connected |
| Rules | Quality rules and thresholds exist for governed use |
| Controls | Classification, access, masking, and retention are documented |
| Lineage | Upstream source and downstream consumers are known |
| Change | Breaking-change approval and notification path exists |
| Evidence | Approval record, tests, and catalog metadata are available |
| Status | Term lifecycle state is clear |

That is the difference between a glossary and a governance asset.
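Part of the checklist can run automatically against a term entry. The sketch below covers only the checks that reduce to "is this field present", using field names from the term entry contract. The predicates are illustrative, and a check like Meaning still needs a human reviewer to judge whether the definition is specific enough.

```python
# Each check maps to a predicate over a glossary entry held as a plain dict.
# Field names follow the term entry contract; the predicates are illustrative.
CHECKS = {
    "Meaning": lambda e: bool(e.get("definition")),
    "Owner": lambda e: bool(e.get("owner")) and bool(e.get("steward")),
    "Source": lambda e: bool(e.get("authoritative_source")),
    "Rules": lambda e: bool(e.get("quality_rules")),
    "Lineage": lambda e: bool(e.get("lineage", {}).get("consumers")),
    "Status": lambda e: e.get("status") in {"proposed", "approved", "deprecated", "retired"},
}


def failed_checks(entry: dict) -> list[str]:
    """Return the checklist items an entry does not yet pass."""
    return [name for name, passes in CHECKS.items() if not passes(entry)]


# A hypothetical draft entry with several fields still missing.
draft = {
    "term": "active customer",
    "definition": "A customer with an active billable contract during the reporting period.",
    "owner": "customer data owner",
    "status": "proposed",
}
print(failed_checks(draft))
```

An empty result means the entry is ready for human review, not that it is approved.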

A glossary is not successful because it has many entries. It is successful when the next dashboard, data product, AI feature, access request, and incident review all use the same language without reopening the same argument.
