What’s in the SNOMED CT Galaxy? Exploring Content from different editions
We all know and hear that SNOMED CT is quite comprehensive and has a lot of content. The current International Edition (July 2022) release of SNOMED CT has 354,259 concepts in it!! 😱 Different member countries (e.g. UK, USA, Sweden, Australia, Netherlands, etc) all also add their own national content to their extensions! Some like the UK and Australia add content in the tens and hundreds (?) of thousands to their national extensions.
Are you curious to broadly see what sort of content (categories) exist in SNOMED CT?
Do you want to know what sort of content (categories) different countries have been adding to their national extensions?
Now you can!! Thanks to some great analysis by Jeremy Rogers at NHS Digital, we can finally peek under the bonnet/hood 😅
Here is a pretty picture 🎉 and what follows below are key take away messages
Content Distribution across Extensions
When visualised in this graphical and macro sense, a few things stand out. Perhaps these have always been obvious, but it's nice to review them. So here are some of the standout features.
Member Extensions – Small, Medium & Large
One of the main things that stood out for us, once we aimed to visualise extensions side by side, is the difference in scale – ranging from hundreds to hundreds of thousands! So while this might not be a scientific way to describe these things, we had to define the following arbitrary categories for extensions based on size (total concept count):
Here is a screenshot, showing the various extensions grouped using the above criteria
We’ll broadly classify large extensions as ones with more than 100,000 concepts! We use this cut-off because that is approximately a third of the size of the current SNOMED CT International release (354,259 concepts). The big extensions are:
- 🇦🇺 Australian Extension: 150,300
- 🇬🇧 United Kingdom: 411,859
- 🇮🇳 Indian Extension: 114,332 – published by CDACINDIA
- 🇸🇬 Singapore Extension published by MOH Holdings Pte Ltd (Singapore) perhaps belongs here, but we have no official releases available
- We suspect that if the National Library of Medicine (NLM) ever decided to merge RxNorm into the US Extension, then it might get into this category
Medium extensions are those with between 10,000 and 100,000 concepts. Here are the extensions that fall into this category:
- Argentinian extension: 40,842
- Spanish: 20,264
- Dutch: 13,344 published by Nictiz
- Norwegian: 16,611 published by Directorate of e-health
Small extensions are those with fewer than 10,000 concepts. The vast majority of member country extensions seem to fall into this category. The surprise entry here was the US extension, but we suspect that is because they do not include their drug and diagnostic content in this extension.
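As a rough illustration of the size categories above, here is a minimal Python sketch that buckets extensions by total concept count. The function and variable names (`classify_extension`, `group_by_size`, `EXTENSION_SIZES`) are made up for this example, and the handling of counts that fall exactly on 10,000 or 100,000 is an assumption, since the categories above do not say which side those belong to. The counts are the ones quoted in this post.

```python
from collections import defaultdict

# Thresholds taken from the categories described in this post.
LARGE_MIN = 100_000   # large: more than 100,000 concepts
MEDIUM_MIN = 10_000   # medium: between 10,000 and 100,000 concepts


def classify_extension(concept_count: int) -> str:
    """Return 'large', 'medium' or 'small' for a total concept count.

    Assumption: exactly 100,000 counts as medium and exactly 10,000
    counts as medium; the post does not define the boundary cases.
    """
    if concept_count > LARGE_MIN:
        return "large"
    if concept_count >= MEDIUM_MIN:
        return "medium"
    return "small"


# Concept counts quoted in this post (illustrative subset only).
EXTENSION_SIZES = {
    "Australia": 150_300,
    "United Kingdom": 411_859,
    "India": 114_332,
    "Argentina": 40_842,
    "Spain": 20_264,
    "Netherlands": 13_344,
    "Norway": 16_611,
}


def group_by_size(sizes: dict) -> dict:
    """Group extension names by their size category."""
    groups = defaultdict(list)
    for name, count in sizes.items():
        groups[classify_extension(count)].append(name)
    return dict(groups)


if __name__ == "__main__":
    for category, names in group_by_size(EXTENSION_SIZES).items():
        print(category, sorted(names))
```

Keeping the thresholds as named constants makes it easy to try a different cut-off and see which extensions change category.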
- The category where the most content exists in SNOMED CT as a whole (excluding drugs) is Clinical findings. This makes sense, since most of medicine is about recording findings, disease, disorders etc.
- The bulk of the content added to the national extensions is related to medicinal products (what are called drug dictionaries)
- The UK edition (maintained by our former colleagues at NHS Digital) takes the top prize for being the largest extension (note Singapore extension excluded due to the unavailability of an official release)
Here is a quick screenshot of the content distribution for a few selected extensions.
If you are curious about the exact numbers you can visualise them directly using Termnexus
Hierarchies extended by many extensions
- Surprisingly, there are stable/core categories where we see multiple countries adding significant content; these are shown here
- We think this points to the fact that some of the related hierarchies are in need of extending/revising (See recent Expo presentation by Sarah Harry for lab/organisms related extensions)
- A lesser-known concept type that seems to be used more than expected by extensions is `Record artifact`. This isn’t the official definition, but a record artifact is the sort of concept that helps you combine/represent things at the interface of a terminology and information model.
- Believe it or not, the UK Edition has a lot of Simple refsets – 1152 refsets to be precise 😂
Drugs and Medicinal Products!
- Drug content deserves a special mention, given the extensive range of content that different extensions have added. Special mentions go to:
- Clinical drug content added by 🇦🇷Argentina, 🇮🇳 India and 🇳🇴 Norway
- Australia has ~140,000 concepts in different parts of the drug hierarchy
What is Termnexus
Termnexus is a terminology content management platform – offering terminology server, authoring and mapping functionality. It of course supports popular terminology standards including SNOMED CT.
Here is how you can visualise all these SNOMED CT editions in Termnexus.
- Go to https://uat.termnexus.com
- Click on the big purple button in the `Want to browse SNOMED CT` section
- In the `Dashboard` page, you can select the different `Editions size` buttons to toggle between different edition size categories described above
- If you want to explore what is inside any extension, just click on its corresponding bar in the bar graph; the pie chart updates automatically to show that extension!
- Check it out before you come to the next section, we could use some more input from you after exploring it…
Here is an animated screenshot of the steps!
What else of note?
Let us note that some of the `categories` that are displayed aren’t the 19 top-level SNOMED CT hierarchies but correspond to the `semantic tags`. This actually gives us a richer representation of what is in there, so I think Jeremy Rogers did the right thing there. Do you agree with this or would you want to see them collapsed to the top 19 hierarchies?
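To make the idea of grouping by semantic tag a little more concrete: the semantic tag is the parenthesised suffix at the end of a concept's fully specified name (FSN), for example `(disorder)`. Below is a minimal Python sketch of counting concepts per tag. It is not how the analysis above was produced; the helper names (`semantic_tag`, `count_by_semantic_tag`) and the sample strings are purely illustrative assumptions.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Matches the last parenthesised group at the end of the string,
# e.g. "(disorder)" in "Myocardial infarction (disorder)".
_TAG_PATTERN = re.compile(r"\(([^()]+)\)\s*$")


def semantic_tag(fsn: str) -> Optional[str]:
    """Return the trailing semantic tag of an FSN, or None if absent."""
    match = _TAG_PATTERN.search(fsn)
    return match.group(1) if match else None


def count_by_semantic_tag(fsns: Iterable[str]) -> Counter:
    """Count how many FSNs carry each semantic tag (untagged ones are skipped)."""
    tags = (semantic_tag(fsn) for fsn in fsns)
    return Counter(tag for tag in tags if tag is not None)


# Illustrative FSN-style strings, not an extract from any release.
SAMPLE_FSNS = [
    "Myocardial infarction (disorder)",
    "Appendectomy (procedure)",
    "Fever (finding)",
    "Asthma (disorder)",
    "Clinical report (record artifact)",
]

if __name__ == "__main__":
    for tag, count in count_by_semantic_tag(SAMPLE_FSNS).most_common():
        print(f"{tag}: {count}")
```

Collapsing to the 19 top-level hierarchies instead would need the concept relationships, not just the names, which is one reason semantic tags are a convenient grouping for a quick overview.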
We thought that perhaps there is a `latent` category called `Mini` (concept count < 5,000). This is not to suggest that these extensions are less important; it is just a way of visualising them better. You can still click on the `small graphs` of these mini extensions to see that they contain quite a good spread of content. See the Belgium extension below for example:
Did you notice other things while browsing the hierarchies or content displayed in the graphs above or directly in Termnexus? If so, please share your observations, findings and more from the galaxy of SNOMED CT editions!! Lastly, let us also appreciate all the enormous work that goes into creating and maintaining SNOMED CT – across all the amazing national release centres in different countries! 👏