Why calling an `Apple` an apple isn’t enough – why we need SNOMED CT to break down #DataSilos
This article is part of our `Data Standards 101` series, where we discuss common themes we come across as a team of informaticians and terminologists. In today’s article we deal with the ever so common theme of:
If we all agree to call an `Apple` an apple, then why are you telling us to use standards? Surely you are selling a `solution` that solves no `real problem`… aka `snake oil`
I refer to this as the `finite conceptualisation` or `Humpty Dumpty problem`, which in the computational logic world is called `closed world` reasoning. A classic example in clinical medicine (but feel free to swap to another domain) is → My network/specialists (aka buddies) and I all agree that there is only one possible `meaning` for `fundus`, so when we write `abnormality in fundus`, everyone knows what it means…
For those of you who know what `fundus` means, well done! But for those who don’t — `fundus` could mean:
- Fundus of the eye — a part of your eye that your optician/ophthalmologist will look at
- Fundus of the stomach — a part of the stomach which has a specific context, if you want to know where an ulcer or growth is located
- Fundus of the uterus — a part of a woman’s reproductive organs, if you are interested in knowing where the placenta is attached or some growth is located
So what exactly did the `fundus` in the above statement refer to? We simply don’t know, because we have no context of what medical speciality the `specialists` belonged to. Yes, I can already hear some saying — surely, you are exaggerating Jay cos we read what is written and we already know what this fundus is.
The Humpty Dumpty Problem — `Apples` in Disneyland
You are right, and that is exactly why I call it the `Humpty Dumpty` problem, after the character who famously corrects Alice when she misunderstands him, saying:
`Words mean exactly what I say they mean` — Humpty Dumpty
So let us understand how Humpty solves this problem:
- He has a gigantic head and knows the meaning of every word (and can arguably read others’ thoughts and correct them when they misunderstand the meaning)
- He knows everyone around and can shout loud enough to broadcast his words (and meaning) to everyone around
- …
I wish I were an `egg head` like Humpty Dumpty and could do all that, but sadly I am not. Nor do I, or the rest of us, live in Wonderland, where we can ignore the real world which has multiple contexts, specialities, languages, etc. So let us take a simple example:
- Mickey Mouse and Donald Duck agree that there is a thing called an `apple`.
- They agree that an apple is a red, edible fruit with a few seeds and a fleshy body.
- Now in Disneyland all Apples are red and they are always edible and grow on trees.
- Great, problem solved!
Meanwhile, in the Marvel Universe, we can blame the Avengers for being an unruly bunch cos they can’t agree what an `Apple` means. Here is what Marvel Universe complicated Apples look like:
Reconciling `Apples` from Disneyland and Marvel Universe
Most of us would recognise that red and green apples are legit and yes we can recognise them as types of apples and would reconcile them as:
Apple
|__ Red Apple
|__ Green Apple
Yes, the botanists might want to complicate this and say
Apple
|_ Edible Apple
|   |_ Red Apple
|   |_ Green Apple
|_ Non edible Apple
    |_ Crab Apple
    |_ Hedge Apple
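The botanists' tree above can be sketched as a handful of `is a` links that a computer can walk. This is only a toy illustration: the names come from the tree, while the `IS_A` dictionary and the helper functions are made up for this example and are not how SNOMED CT itself is stored.

```python
# Hypothetical toy hierarchy: each entry is a child -> parent ("is a") link.
IS_A = {
    "Edible Apple": "Apple",
    "Non edible Apple": "Apple",
    "Red Apple": "Edible Apple",
    "Green Apple": "Edible Apple",
    "Crab Apple": "Non edible Apple",
    "Hedge Apple": "Non edible Apple",
}


def ancestors(name):
    """Walk up the is-a links and return every ancestor, nearest first."""
    found = []
    while name in IS_A:
        name = IS_A[name]
        found.append(name)
    return found


def is_a(child, parent):
    """True if child is the same thing as parent, or sits somewhere below it."""
    return child == parent or parent in ancestors(child)


def descendants(parent):
    """Every name that sits below parent in the hierarchy, sorted by name."""
    return sorted(n for n in IS_A if parent in ancestors(n))


print(ancestors("Red Apple"))
print(is_a("Crab Apple", "Edible Apple"))
print(descendants("Edible Apple"))
```

With links like these, a question such as "give me every kind of edible apple" no longer depends on someone remembering the full list by hand.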
For convenience, we are avoiding those `golden green apples`… So continuing with the story, what on earth is Iron Man referring to? Well…
For those of us who use an Apple Mac, we know that `Apple` is not a fruit but a company that makes computers that metaphorically chew on words and numbers!
So how do we now reconcile our `world` view in this case? Where would this new `Apple` go? Now in order to sufficiently disambiguate (err simplify the world), we will have to complicate it a bit first by introducing some other `things` called
- Fruit
- Company
Luckily, we can all agree about these words and can use them to formalise our world a bit more. I am deliberately avoiding calling Apple `Malus`, so please let’s not go into Linnaean nomenclature yet… 🙈
Note that, to avoid casual readers searching for `apple` and seeing two results with the same name, Apple (the fruit) and Apple (the company), we have already tweaked the latter’s name to `Apple Inc`. Moving on quickly, before you tell me off for being very American and using `Inc` for a business, let us focus on the topic of this post.
Solving the `same name, different things` problem
Mickey Mouse and Donald Duck can continue to refer to `apple` happily in Disneyland and never care about the rest of us. Everyone in Disneyland can do this and live happily ever after!
But for the rest of us, we need to know about different types of `apples` and, more importantly, have a unique way of identifying them. In order to uniquely identify the things in the above list, we used `Apple` for the fruit and `Apple Inc` for the company. However, we share our world with computers that prefer `numbers/identifiers` instead of `names`. So to make things easier for a computer to record/retrieve the hundreds of fruits and companies in our list, we allow computers to give things in our list numerical (or alphanumeric) identifiers.
This in a way forms the basis of why SNOMED CT has two very important notions for every `conceptual thing` in clinical medicine.
- Concept Id — the unique identifier for the `conceptual thing`
- Fully Specified Name — the unique name for the `conceptual thing`
As you can imagine, clinical medicine is huge and has hundreds of thousands of these conceptual things (called concepts). So every single one of those concepts has a concept id and a fully specified name.
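To make the idea of `unique identifier plus unique name` concrete, here is a minimal sketch using our apples. Everything in it is an assumption for illustration: the `Concept` class, the identifiers `C-0001` and `C-0002`, and the bracketed suffixes are made up and are not real SNOMED CT content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str            # hypothetical identifier, not a real SNOMED CT id
    fully_specified_name: str  # unique, unambiguous name
    kind: str                  # the broader `thing` it belongs to


# Hypothetical entries for the two `apples` in our story.
CONCEPTS = [
    Concept("C-0001", "Apple (fruit)", "Fruit"),
    Concept("C-0002", "Apple Inc (company)", "Company"),
]

# A computer retrieves things by identifier, never by the ambiguous word.
BY_ID = {c.concept_id: c for c in CONCEPTS}


def find_by_word(word):
    """Return every concept whose fully specified name contains the word."""
    word = word.lower()
    return [c for c in CONCEPTS if word in c.fully_specified_name.lower().split()]


matches = find_by_word("apple")
print([c.fully_specified_name for c in matches])
print(BY_ID["C-0002"].kind)
```

Searching for the bare word `apple` finds both candidates, but each has its own identifier and its own fully specified name, so a person (or a system) can pick the right one and record it unambiguously.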
Why do we need to solve this problem?
If we could be Mickey Mouse, Donald Duck or Humpty Dumpty and blissfully live in our own versions of `Lala land`, then we would never have to worry about these unique names and identifiers. Over the past century, clinical medicine has become more complicated due to an ever-growing understanding of diseases, treatments, etc. This has had an unintended but insidious effect: healthcare is now fragmented in delivery and practice. So while it is nice to think of the `patient as the center of care`, information about a patient is often fragmented across multiple systems in hospitals, primary care, community care, etc. So let us accept that there are legitimate reasons why my `information` needs to exist across different systems.
Historically, many systems and thought processes have evolved around the notion that everyone using a system knows exactly what an `Apple` means. This wasn’t much of an issue in the past, since most healthcare was delivered by patients visiting the same GP (family doctor) or the local hospital. However, that is no longer the case: patients move around, demands on healthcare mean we get referred to the best hospital available, and we use virtual consultations (post COVID). With this change in the healthcare delivery landscape, and with clinical medicine continuously exploding thanks to the genomics revolution, do we really think that the old `my Apple means exactly what I say it means` still holds when patient data is shared across different systems?
Let us also accept that an Apple or a Sodium test or a CABG in one system might not mean exactly the same thing in another system. Let us not fool ourselves into thinking that my system or Apple is the only way everyone (or every other system) will understand things.
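As a rough sketch of what goes wrong, imagine two systems that each keep their own local code list. All the codes and shared identifiers below are invented for illustration; the point is only that the same local label can mean different things in different systems, and that comparing meanings needs a shared identifier rather than the label.

```python
# Hypothetical local code lists, each mapped to a made-up shared identifier.
GP_SYSTEM_MAP = {
    "NA": "SHARED-SODIUM-SERUM",
    "CABG": "SHARED-CABG",
}
HOSPITAL_MAP = {
    "SOD1": "SHARED-SODIUM-SERUM",
    "NA": "SHARED-SODIUM-URINE",  # same label as the GP system, different meaning
}


def to_shared(system_map, local_code):
    """Translate a local code to the shared identifier, or None if unmapped."""
    return system_map.get(local_code)


def same_meaning(code_a, map_a, code_b, map_b):
    """Two local codes mean the same thing only if their shared ids match."""
    a = to_shared(map_a, code_a)
    b = to_shared(map_b, code_b)
    return a is not None and a == b


print(same_meaning("NA", GP_SYSTEM_MAP, "SOD1", HOSPITAL_MAP))
print(same_meaning("NA", GP_SYSTEM_MAP, "NA", HOSPITAL_MAP))
```

Matching on the label `NA` alone would wrongly merge two different tests, and would miss that `NA` and `SOD1` are the same one.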
Where does SNOMED CT fit in?
Yes, there is value in us all talking and agreeing what things mean. Yes, specialists in a given discipline still get together and agree on what something actually means. So what changes? What changes is something very simple — we look for consensus beyond our system borders, we think outside the narrow confines of `my system` or `Disneyland`. Then we can think of what this means for the patient and for my colleagues beyond my immediate context. If all that sounds like hard work, here comes the good news!
This is exactly what SNOMED CT gives us out of the box. It is a consolidation of the broad consensus of what `conceptual things` in clinical practice must be called and how they relate to each other. It comes with ~300,000 concepts split across most of the commonly used `types` of things in clinical practice. So is there a reason why you aren’t thinking about using it? Over 40 countries in the world already agree that SNOMED CT is hugely valuable and have made it free for use for all their citizens and recognise it as the standard way for recording healthcare information.
So next time you are about to sit down and knock up your favourite list of `apples` or surgical procedures, diagnoses, interventions, lab tests, stop and think if you are about to reinvent the wheel…
SNOMED CT is more than just a dictionary
If all the above made it sound like SNOMED CT is just a dictionary of `conceptual things` in clinical practice, then that is an oversimplification. That is like saying Google is a dictionary of website links (URLs)! There is a lot of knowledge and thinking that went into how SNOMED CT is both created and maintained on an ongoing basis. Attempting to cover all that in this article would make it overly long. So here is a checklist for those attempting to create a list of `apples` in their shed, to help decide when you might want to use SNOMED CT:
- Does your list of apples/procedures/diagnoses/tests contain things that will be used across multiple systems, e.g. primary care, secondary care, community care, public health?
- Does your list of apples/procedures/diagnoses/tests need to support multiple use cases from the same data (aka support business intelligence, analytics for downstream systems)? E.g. do you want to know how many patients with `chest cough` and `flu like symptoms` went on to become `SARS-CoV-2 positive` in 2 weeks?
- Is your list of apples/procedures/diagnoses/tests going to change in the near future?
- Does this change need to be synchronised across multiple systems (including downstream ones)?
- Does your list of apples/procedures/diagnoses/tests have things in it whose meaning could change over time or across borders? E.g. are you now recording `diabetes mellitus` and think you have `Type I Diabetes mellitus` and `Type II Diabetes Mellitus` covered? If in the future `Mono-genetic Diabetes Mellitus` becomes a thing, how would your `apples list` cope?
- Last but not least, does your list have the `same thing that has multiple names`? E.g. Primary Adrenal Insufficiency is also known as `Addison’s Disease`, so do you care about finding the same thing using either of those terms? Again, this is an important issue that SNOMED CT solves.
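The last point on the checklist, one thing with several names, can be sketched as several terms pointing at a single identifier. The identifiers (`X-100`, `X-200`), the records and the function names below are all hypothetical and only illustrate the idea; they are not real SNOMED CT identifiers or a real API.

```python
# Hypothetical description table: several terms can point at one concept id.
DESCRIPTIONS = {
    "primary adrenal insufficiency": "X-100",
    "addison's disease": "X-100",
    "diabetes mellitus": "X-200",
}

# Hypothetical patient records that store the concept id, not the typed term.
RECORDS = [
    {"patient": "p1", "concept_id": "X-100"},
    {"patient": "p2", "concept_id": "X-200"},
    {"patient": "p3", "concept_id": "X-100"},
]


def concept_for(term):
    """Normalise the term (case, spacing, curly apostrophes) and look it up."""
    key = term.strip().lower().replace("\u2019", "'")
    return DESCRIPTIONS.get(key)


def patients_with(term):
    """Find patients by any known name for the same concept."""
    concept_id = concept_for(term)
    if concept_id is None:
        return []
    return [r["patient"] for r in RECORDS if r["concept_id"] == concept_id]


print(patients_with("Addison's Disease"))
print(patients_with("Primary Adrenal Insufficiency"))
```

Because the records hold the identifier rather than whichever name the clinician happened to type, both searches return the same patients.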
I would be remiss if I didn’t paraphrase an inspiring mentor, Alan Rector, who famously once said → the world needs to move on from the hand-crafted cottage industry of carefully crafted lists of words to something more robust and computable. That computable thing, while seemingly complex and foreign to some, is what forms the logical basis of SNOMED CT. This should form the basis of a truly interoperable healthcare system, where what happens to a patient and why doesn’t depend on talking to Humpty Dumpty or understanding `apples`…
If you disagree with anything I have said, then please get in touch with me or leave a comment. If you have come across those who are compiling `apple lists`, then please share this article with them. If you need help getting away from `apple lists` and moving to SNOMED CT, but don’t know where to start, then drop me a line! This article is also available on our Medium site, if you wanted to check out other articles in the series!
For those of you who want to see how the original `fundus` example plays out in SNOMED CT, here is a tiny table that shows some `fundii` in clinical medicine with a bonus added in…
Other Resources
- Termlex SNOMED CT Implementation and Content Management services: https://termlex.com/snomed-ct/
- Termlex SNOMED CT Authoring, Maintenance and Migration services: https://termlex.com/snomed-ct-content-authoring-migration-services/