The Hidden Risk of Institutional Knowledge

Wooden door with a paper exit sign stapled to it hangs open, showing a set of wooden stairs in the background.
Photo by Erik Mclean / Unsplash

At many universities, critical data doesn’t live in systems. It lives in people.

The analyst who understands how enrolment is actually calculated. The staff member who knows which fields to ignore. The one person who can explain why two reports don’t match.

Most of the time, this works. But eventually, that person leaves.

When Knowledge Walks Out the Door

When the only person who understands a dataset leaves the institution, the impact is rarely immediate, but it is almost always significant.

Questions that used to be answered in minutes now take days. Reports that once felt reliable become uncertain. Small discrepancies start to appear, and no one is quite sure why.

Over time, the institution begins to lose confidence in its own data.

I once worked with a university that had happily automated one of its government enrolment reporting tasks. The job ran annually without fail and prepared the data for submission, and someone submitted it on time.

One day, an analyst discovered that the institution was reporting that well over half its students were part-time. Given they were primarily serving full-time students, this was a problem. 

So, as any good institutional research analyst would, they set out to find the root cause.

Ultimately, they learned: 

  1. The way full-time was encoded in the database had changed
  2. The person who had built the automation had retired, and their replacement was unaware of the change in data encoding. Because the process was fully automated, the replacement also did not know that the outputs should be checked manually

This is not a technical failure.

It is a governance failure.
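
One way to make that kind of check part of the process, rather than part of someone's memory, is to run a small sanity test on the output before it is submitted. The following is a minimal sketch, not the university's actual process: the status codes ("FT", "PT"), the 30% threshold, and the function name are all hypothetical and would need to reflect your own institution's definitions.

```python
from collections import Counter

# Hypothetical: the status codes the automation was written to understand.
KNOWN_STATUS_CODES = {"FT", "PT"}
# Hypothetical: the highest part-time share this institution would expect.
MAX_EXPECTED_PART_TIME_SHARE = 0.30


def check_enrolment_output(statuses):
    """Return a list of human-readable warnings for a list of status codes."""
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        return ["No enrolment records found."]

    warnings = []

    # A change in encoding usually shows up as codes nobody planned for.
    unknown = sorted(set(counts) - KNOWN_STATUS_CODES)
    if unknown:
        warnings.append(f"Unknown status codes: {unknown}")

    # A result that contradicts what everyone knows about the institution
    # should stop the submission until a person has looked at it.
    part_time_share = counts["PT"] / total
    if part_time_share > MAX_EXPECTED_PART_TIME_SHARE:
        warnings.append(
            f"Part-time share is {part_time_share:.0%}, above the "
            f"expected maximum of {MAX_EXPECTED_PART_TIME_SHARE:.0%}"
        )
    return warnings
```

The point of the sketch is less the code than where the expectation lives: the threshold and the list of valid codes are written down, so the next person inherits them along with the automation.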

Why This Happens

Institutional knowledge builds up naturally over time.

Analysts develop an understanding of data quirks, historical decisions, and informal rules that are never formally captured. They learn which tables to trust, which joins to avoid, and which edge cases matter.

Much of this knowledge never makes it into documentation. Even when it does, it is often incomplete or quickly becomes outdated.

In many cases, the only place the full logic exists is in someone’s head.

Where the Risk Shows Up

The risks associated with institutional knowledge are often subtle at first, but they compound over time:

  • Reports become inconsistent as different people interpret the data differently
  • Metrics drift because definitions are not applied consistently
  • New team members struggle to understand how data is structured
  • Analysts spend time rediscovering logic that was previously known
  • Decision-making slows as confidence in the data decreases

In more serious cases, this can lead to incorrect reporting to leadership or external stakeholders.

The Connection to Shadow Systems and Reports

In previous posts, I discussed how shadow systems emerge and how reports can become systems of record.

Institutional knowledge is often what holds those environments together.

In effect, people are acting as a substitute for application programming interfaces (APIs), or for newer equivalents such as the Model Context Protocol (MCP) or agent-to-agent communication.

A spreadsheet may only make sense because someone knows how it was built. A dashboard may only be trusted because someone understands the logic behind it. When that knowledge is lost, the underlying systems become fragile.

This is one of the reasons shadow systems can be so difficult to manage: they often rely heavily on implicit knowledge.

Making Knowledge Visible

The goal is not to eliminate institutional knowledge, but to make it visible, shared, and transparent.

In practice, this means:

  • Making metric definitions explicit and accessible
  • Documenting key assumptions and transformations (and resourcing the knowledge owners with adequate time and tools to do so)
  • Identifying systems of record for important data
  • Using data lineage to map how data flows and evolves

These practices help move knowledge from individuals into the institution.
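
As an illustration of what "explicit" can look like, the sketch below records a metric's definition, system of record, owner, and assumptions as data, and walks its upstream sources to give a very simple form of lineage. Everything in it is an assumption made for the example: the `MetricDefinition` fields, the metric names, and the `trace_lineage` helper are hypothetical, and a real data catalogue or lineage tool would cover far more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    description: str
    system_of_record: str
    owner: str
    assumptions: tuple = ()
    upstream: tuple = ()  # names of the metrics or tables this one derives from


def trace_lineage(name, registry):
    """Return every upstream source of a metric, nearest sources first."""
    seen = []
    queue = list(registry[name].upstream)
    while queue:
        source = queue.pop(0)
        if source in seen:
            continue
        seen.append(source)
        # Raw tables are not in the registry, so the walk stops at them.
        if source in registry:
            queue.extend(registry[source].upstream)
    return seen


# Hypothetical registry with three related enrolment metrics.
registry = {
    "full_time_headcount": MetricDefinition(
        name="full_time_headcount",
        description="Students enrolled full-time on the census date",
        system_of_record="student_information_system",
        owner="Institutional Research",
        assumptions=("Full-time is encoded as status code 'FT'",),
        upstream=("enrolment_records",),
    ),
    "total_headcount": MetricDefinition(
        name="total_headcount",
        description="All students enrolled on the census date",
        system_of_record="student_information_system",
        owner="Institutional Research",
        upstream=("enrolment_records",),
    ),
    "part_time_share": MetricDefinition(
        name="part_time_share",
        description="Share of enrolled students who are not full-time",
        system_of_record="student_information_system",
        owner="Institutional Research",
        upstream=("full_time_headcount", "total_headcount"),
    ),
}

print(trace_lineage("part_time_share", registry))
```

Even a record this small answers the questions that usually live in someone's head: what the number means, where it comes from, who owns it, and which assumptions would break it.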

A More Resilient Data Environment

When knowledge is shared and visible, institutions become more resilient.

New team members can onboard more quickly. Analysts can spend less time rediscovering logic. Decisions can be made with greater confidence.

Perhaps most importantly, the institution is no longer dependent on any single individual to understand how its data works.

Governance as Continuity

Data governance is often framed as a compliance or documentation exercise.

In my view, data governance should be living documentation that updates as the data changes, identifies gaps, and can even recommend fixes. It should ensure that knowledge persists beyond individuals, that systems remain understandable over time, and that institutions can continue to trust their data even as people change.

Because in the end, the risk is not just that someone leaves. It is that institutional knowledge that cannot easily be replaced walks out the door with them.

What processes or datasets at your institution rely heavily on “the one person who knows how it works”?

In future posts, we’ll continue exploring how institutions can move from static documentation toward living governance, including the role of lineage, shared standards, and AI-supported governance approaches.