Why the SNOMED CT and LOINC `concept model` matters to patients and healthcare

Earlier this year we revisited work our team did looking at the gaps in the Unified Test List (broadly the lab medicine subset of SNOMED CT for use in the UK). As part of that, we revisited one of the most common philosophical choices we make as terminologists:

Do we just provide/publish a `standard code` with only an identifier and a standardised name (fully specified name)… Or do we ensure it is modelled sufficiently and accurately?

Often our users, and requesters more broadly, say `please just urgently give us a code/identifier`. They of course rightly want to fix an immediate problem in their system and interoperate using that identifier. Why then would terminologists agonise over the wording and the modelling of things (aka concepts) in a terminology?

Background

If you are reading this article, I assume you already know about `concepts`, `descriptions` and `relationships` as the fundamental notions in a well developed terminology/ontology. If not, please refer to this `Introduction to SNOMED CT — Part I` series which should take only 5 minutes to understand. SNOMED CT, LOINC and most well developed terminologies follow the same principles, even if the words they use are slightly different.

Concept — is the `thing` or `entity` that we are trying to refer to

The principal protagonists of this article are:

Lexical Representation: the set of `descriptions` that sufficiently describe (for humans) what the concept is about
Concept Model: the set of `relationships` that sufficiently define (or logically describe) the concept, often referred to simply as the model

Both `descriptions` and `relationships` matter, but for different purposes. Much like the `in your face protagonist with great looks`, the Lexical Representation is highly visible and is seen by most as what the concept is about. The `Concept Model` is the more `behind the scenes, somewhat aloof and oft-misunderstood` protagonist (e.g. Mr Darcy from Pride and Prejudice… 😂) who quietly does important work that takes a bit of understanding. I am not here to showcase one over the other, but rather to explain why the often overlooked concept model matters. Just for context:

There are `1,506,901` active descriptions in the October 2022 UK SNOMED CT edition. In contrast there are `1,275,368` active relationships in it. So broadly for every 5 descriptions there are 4 relationships.

This should show the effort that goes into creating the `concept model` and not just the `lexical representation`. Why would anyone put so much into creating them if they didn’t matter….

So why does the Concept Model matter?

If you have read the first part of our `Intro to SNOMED CT series`, then you might as well refer to the second part, which describes the usefulness of `relationships` in both interoperability and analytics. This article builds on it with some concrete examples of the consequences of modelling that might matter to us all. Here are some example concepts from the UTL:

Current modelling of some concepts from the UTL

This seems straightforward, does it not (assuming you know the UTL model)? You read the Lexical Representation (Fully Specified Name) and try to follow what the Concept Model is. In this case, the `Concept Model` is the combination of `component`, `property` and `specimen`. So in every case we are describing something about:

Antimony presence in some specimen (blood, urine, etc)
and what we measured about it (e.g. substance concentration).

So by reading `Substance concentration of antimony in 24 hour urine (observable entity)` you can guess it will be about measuring `42449005 Antimony` in `276833005 24 hour urine specimen`. Now you understand why users most often just say `give me an identifier and a standardised term please`. So why then would we care about the concept model?

Now think about data analysis, or even all the clever AI work that people refer to. If we want to identify all tests that are done on `Antimony` in `urine`, we want to be able to pull out both:

Substance concentration of antimony in 24 hour urine (observable entity)
Substance concentration ratio of antimony to creatinine in urine (observable entity)

I know the common way of finding these would be lexical matching on the word `urine`, but there are times when this won’t work (medicine is complex and I won’t digress). However, notionally we also know that `24 hour urine specimen` is a type of `urine specimen`. In fact, SNOMED CT provides this information out of the box for us via (you guessed it) the concept model! So here is what the concept model says about `urine specimen` in SNOMED CT.

As an aside, there are 42 types of urine specimens listed in SNOMED CT, and you thought this would be straightforward. Actually, let’s spare some thought for the amazing amount of work that goes into creating standards like SNOMED CT and LOINC by the incredible teams behind them (SNOMED International, Regenstrief Institute, Inc., NHS England). It is their painstaking work over the years that has helped create the rich concept model in these standards. Thanks to their work, I can now trivially write a query like the one below to pull out all antimony related tests that have some type of urine specimen.

<< Observable entity: component=<<42449005|Antimony| AND specimen=<<122575003|Urine specimen|

Please don’t worry if you don’t understand that query as it was deliberately written in pseudo ECL. Just before we move on, also think if we had just used the `lexical match only` approach how many matches we’d have found from all over SNOMED CT. For the record a quick search reveals 5,416 matches in all of SNOMED CT for the word `urine`. So the concept model allows us to whittle down this list to exactly what we want — from 5000+ to just 42. Of course we could constrain this even further using the power of the concept model, but let’s move on.
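For readers who prefer code to ECL, here is a minimal Python sketch of what such a query does conceptually. It is only an illustration under stated assumptions: the tiny `IS_A` table, the `SPECIMEN` and `BLOOD_SPECIMEN` keys, the third test entry and the function names are all made up for this example, and only the identifiers 42449005, 122575003 and 276833005 come from the text above. A real implementation would run ECL against a terminology server rather than a hand-written dictionary.

```python
# Toy is-a hierarchy (child -> parents). Keys that are not numeric are
# placeholders invented for this sketch, not real SNOMED CT identifiers.
IS_A = {
    "276833005": {"122575003"},      # 24 hour urine specimen -> Urine specimen
    "122575003": {"SPECIMEN"},       # Urine specimen -> placeholder root
    "BLOOD_SPECIMEN": {"SPECIMEN"},  # placeholder sibling of urine specimen
}


def descendants_or_self(concept_id):
    """Return the concept plus everything below it (the ECL '<<' operator)."""
    found = {concept_id}
    changed = True
    while changed:
        changed = False
        for child, parents in IS_A.items():
            if child not in found and parents & found:
                found.add(child)
                changed = True
    return found


# Each test carries a slice of its concept model (component and specimen).
# The component values are assumed for illustration.
TESTS = [
    {"fsn": "Substance concentration of antimony in 24 hour urine (observable entity)",
     "component": "42449005", "specimen": "276833005"},
    {"fsn": "Substance concentration ratio of antimony to creatinine in urine (observable entity)",
     "component": "42449005", "specimen": "122575003"},
    {"fsn": "Antimony test on blood (placeholder, not a real UTL concept)",
     "component": "42449005", "specimen": "BLOOD_SPECIMEN"},
]


def antimony_urine_tests():
    """Select tests whose component is antimony and whose specimen is any kind of urine."""
    components = descendants_or_self("42449005")
    specimens = descendants_or_self("122575003")
    return [t["fsn"] for t in TESTS
            if t["component"] in components and t["specimen"] in specimens]


for name in antimony_urine_tests():
    print(name)
```

The point of the sketch is that the 24 hour urine test is found without its specimen ever being compared to the string `urine`: it is found because the hierarchy says a 24 hour urine specimen is a urine specimen.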

With great power comes…

If you have followed the examples till now, please now scroll back up to the table to look at the modelling for `Substance concentration ratio of antimony to creatinine in urine (observable entity)`. Now, I will draw your attention to something the eagle eyed readers would have already spotted. This test is about the `Substance concentration ratio of antimony to creatinine`. So we aren’t measuring the substance concentration like the other tests. We are comparing the `Substance concentration of antimony` to that of another substance called creatinine. So the simple mathematical representation of this would be:

Ratio = (substance concentration of antimony) / (substance concentration of creatinine)

Referring to the maths that we all learnt in school, we know that a ratio of two like quantities is always a number, nothing more than a number. Metrologically, a number on its own is a dimensionless entity, unlike say a `Substance concentration`, which can have dimensions (A level physics: dimensional representation) of moles/mass per volume/weight etc. So now let’s revisit this concept’s current modelling:

Current modelling of Substance concentration ratio of Antimony to Creatinine in urine in the UTL
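The dimensional argument above can be sketched in a few lines of Python. This is a simplified illustration, not a metrology library: the `divide` helper and the dictionary representation of dimensions (base quantity mapped to exponent) are assumptions made for this example.

```python
def divide(numerator, denominator):
    """Divide two quantities' dimensions by subtracting their exponents."""
    result = dict(numerator)
    for base, power in denominator.items():
        result[base] = result.get(base, 0) - power
    # Drop anything that cancelled out completely.
    return {base: power for base, power in result.items() if power != 0}


# A substance concentration such as mol/L: amount of substance per volume,
# where volume is length cubed.
SUBSTANCE_CONCENTRATION = {"amount_of_substance": 1, "length": -3}

# Antimony concentration divided by creatinine concentration.
ratio_dimensions = divide(SUBSTANCE_CONCENTRATION, SUBSTANCE_CONCENTRATION)
print(ratio_dimensions)  # {}
```

The empty result is the dimensionless case: everything cancels, so the ratio cannot sensibly be treated as a concentration.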

Since this is really a ratio, it would make sense to represent this using the `ratio` property concept or one of its types in SNOMED CT. First let’s look at what the updated concept model might look like:

Proposed modelling of Substance concentration ratio of Antimony to Creatinine in urine in the UTL

Note that the updated concept model uses `118557008 Substance concentration ratio` instead of the previous `118556004 Substance concentration`. In fact, the SNOMED CT concept model has a hierarchy of ratios too, as shown below:

Concept Model in LOINC

While we have discussed the concept model from SNOMED CT, LOINC also has similar notions. When it comes to modelling tests, LOINC does a great job of defining every test. For example, here is the concept model for the Antimony to Creatinine ratio (52941-2 Antimony/Creatinine [Molar ratio] in Urine) in LOINC:

Current modelling of Substance concentration ratio of Antimony to Creatinine in urine in LOINC

And here are the various ratios (from the LOINC Parts) that are published in LOINC. Of course, LOINC has not traditionally organised its content using the same hierarchical relationships that SNOMED CT has, so you don’t get that cool hierarchy/tree view. But make no mistake, the information is still there if you look in the right place.

As an aside, note that LOINC models some other similar concepts using the top level concept `ratio` instead of using any of the sub types. For example, here is 56652-1 Arsenic/Creatinine [Ratio] in Urine in LOINC:

However, when I proposed the updated modelling I used `substance concentration ratio` which is a sub type of ratio. Interestingly, from a quick search there are 22 ratio subtype concepts in LOINC vs 19 ratio subtype concepts in SNOMED CT. There is another side question – do we want to distinguish `substance concentration ratio` from say `mass concentration ratio` or are they all just the top level ratio…

Reconciling such differences between the LOINC and SNOMED CT concept models and their use would be key to the LOINC SNOMED CT Extension work.

Implications of Concept Model

In the previous sections we covered the usefulness of the concept model in aggregating data and in interoperability. However, particularly in the lab space, the concept model (correct or incorrect) has other implications. For example, when I receive a test result which is a `substance concentration`, I can perform some automated validation on it to verify that the `units of measurement` that accompany the result are what I expect (e.g. moles/L, mmol/mL, etc). I would want to do this to ensure that the sending or receiving system did not inadvertently truncate or change something about the result, creating a safety issue. There is an entire standard called UCUM that deals with the machine representation of units and how to convert them, say from moles per litre to millimoles per litre. When the maths is sound, as it is in UCUM, you can write converters (like those that exist in the HL7 world) that use UCUM to convert units safely as part of interoperability.

We would all like to be able to use such validations to ensure our lab results are being shared correctly, without any safety issues being introduced when data crosses systems.
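As a rough illustration of this kind of validation, here is a minimal Python sketch. It is an assumption-laden toy, not a UCUM implementation: the unit tables are hand-written and far from exhaustive, the function names `validate` and `convert` are invented for this example, and a production system would rely on a proper UCUM library. The two property identifiers are the ones mentioned earlier in this article.

```python
# UCUM-style unit codes mapped to the factor that converts them to mol/L.
# Hand-written stand-in for a real UCUM library, illustrative only.
TO_MOL_PER_L = {"mol/L": 1.0, "mmol/L": 1e-3, "umol/L": 1e-6, "nmol/L": 1e-9}

# Units we would accept for each property in the concept model.
# The ratio entries are dimensionless units, again illustrative only.
EXPECTED_UNITS = {
    "118556004": set(TO_MOL_PER_L),             # Substance concentration
    "118557008": {"1", "mmol/mol", "umol/mol"},  # Substance concentration ratio
}


def validate(property_id, unit):
    """Return True when the unit is one we expect for the modelled property."""
    return unit in EXPECTED_UNITS.get(property_id, set())


def convert(value, from_unit, to_unit):
    """Convert between substance concentration units via mol/L."""
    return value * TO_MOL_PER_L[from_unit] / TO_MOL_PER_L[to_unit]


print(validate("118556004", "mmol/L"))    # concentration reported in mmol/L
print(validate("118556004", "umol/mol"))  # concentration reported as a ratio unit
print(convert(2.5, "mol/L", "mmol/L"))
```

Note that the check is driven entirely by the property in the concept model, not by the name of the test, which is exactly why the accuracy of that property matters.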

On the flip side, if the concept modelling is inaccurate then it would be possible to infer the wrong sort of thing. For example, the unfortunate representation of `11641000237102 | Substance concentration ratio of antimony to creatinine in urine (observable entity)|` using `118556004 Substance concentration` could lead to a validation engine expecting units like moles/L to accompany this result, and flagging it as invalid or, worse, as something it is not. Of course, in the real world a lot of testing would be done to ensure the validity of such rules, but the real issue might not be picked up if you looked only at the `name` of the test and not at its concept model… We spotted a range of similar modelling patterns for other concepts in the HTML version of the UTL, which we suspect is under review by NHS England:

15011000237107 Substance ratio of 3-methoxytyramine to creatinine in urine (observable entity)
12421000237102 Arbitrary concentration ratio of aspartate transaminotransferase/alanine aminotransferase in serum (observable entity)
118556004 Substance concentration ratio of barium to creatinine in urine (observable entity)

It is quite possible that a clear and principled rationale was used for modelling these concepts in this particular pattern. However, since we were unable to find the corresponding principle in the UTL Editorial Principles, we can do no more than observe the possible unintended consequences of the concept model.
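A simple way to surface candidates like these for review is to compare the name with the model. The Python sketch below is a hypothetical check, not the method we or NHS England actually use: the `CONCEPTS` list, the `EXAMPLE-1` identifier and the `name_model_mismatches` function are invented for illustration, and a real check would include every subtype of `ratio` rather than the single property listed here.

```python
import re

# Properties that count as ratios. Only one is listed here; a real check
# would include the whole ratio hierarchy.
RATIO_PROPERTIES = {"118557008"}  # Substance concentration ratio

CONCEPTS = [
    {"id": "11641000237102",
     "fsn": "Substance concentration ratio of antimony to creatinine in urine (observable entity)",
     "property": "118556004"},  # modelled as Substance concentration
    {"id": "EXAMPLE-1",  # placeholder identifier, not a real concept id
     "fsn": "Substance concentration of antimony in 24 hour urine (observable entity)",
     "property": "118556004"},
]

# Match "ratio" as a whole word. A plain substring test would also hit the
# word "concentration", which happens to contain the letters "ratio".
RATIO_WORD = re.compile(r"\bratio\b", re.IGNORECASE)


def name_model_mismatches(concepts):
    """Return ids whose name says 'ratio' but whose property is not a ratio."""
    return [c["id"] for c in concepts
            if RATIO_WORD.search(c["fsn"]) and c["property"] not in RATIO_PROPERTIES]


print(name_model_mismatches(CONCEPTS))  # ['11641000237102']
```

A check like this only produces a worklist for a terminologist to review; it cannot tell whether a flagged pattern was a deliberate editorial choice.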

I will conclude by saying that the work of a terminologist is quite difficult, so it is not always possible to inspect a concept and infer all the thinking that went into it. I would like to credit the amazing work that my colleagues Sarah Harry and Sudha Kodati did on reviewing the UTL content a year or two ago. Likewise, just looking at the name (lexical representation) of a concept does not give enough credit to the terminologist, who would have spent as much time on the concept model. Hopefully this article showcases some of the reasons why the concept model in SNOMED CT and LOINC is quite useful and offers great power for both interoperability and analytics.

Want to discuss lab data standards, results sharing, pathology data harmonisation? We’re always open to chat

We at Termlex believe that all clinicians and patients would like to have the ability to access and compare diagnostic results from anywhere irrespective of their origin. If this is something you believe in and would like to chat, come talk to us…

Jay Kola
February 24, 2023

Categories: LOINC, Ontology, SNOMED CT

Tags: Concept Model, Semantic model