Value sets proposal

The Problem

Currently, value set constraints in ADL are defined in two ways: by reference to an external value set, or inline. An external value set is referenced via an ac-code, which is defined in the archetype terminology section. A binding can be added for that code in the terminology section; this is a URI pointing to an external ref-set or value set, e.g. one held in a CTS2 terminology service containing a copy of SNOMED CT.

For many required value sets, there is no externally available value set / ref-set definition. While there might be one day, archetype authors commonly need to be able to create an 'inline' value set. This is done by defining a constraint that looks like the following:

ELEMENT[id9] occurrences matches {0..1} matches { -- Position
value matches {
DV_CODED_TEXT[id1059] matches {
defining_code matches {
[local::
at1001, -- Standing
at1002, -- Sitting
at1003, -- Reclining
at1004, -- Lying
at1015; -- Lying with tilt to left
at1002] -- Assumed
}
}
}
}

In the terminology section we have the following:

term_definitions = <

....

 ["at1001"] = <
text = <"Standing">
description = <"Standing at the time of blood pressure measurement.">
>
["at1002"] = <
text = <"Sitting">
description = <"Sitting (for example on bed or chair) at the time of blood pressure measurement.">
>
["at1003"] = <
text = <"Reclining">
description = <"Reclining at the time of blood pressure measurement.">
>
["at1004"] = <
text = <"Lying">
description = <"Lying flat at the time of blood pressure measurement.">
>
["at1015"] = <
text = <"Lying with tilt to left">
description = <"Lying flat with some lateral tilt, usually angled towards the left side. Commonly required in the last trimester of pregnancy to relieve aortocaval compression.">
>

In this example the openEHR types ELEMENT and DV_CODED_TEXT are shown, but with a different reference model, e.g. an HL7-based one, you would see Act and CD respectively. The [local::at1001, at1002,...] is the part that matters: it is an instance of the AOM class C_TERMINOLOGY_CODE, in which the code list is set to be one of at1001, at1002, etc. These codes are all defined in the terminology section of the archetype, and they might individually have bindings to SNOMED CT or other external codes. The code after the semicolon (at1002 here) is the assumed value.
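To make the structure of the inline form concrete, here is a minimal Python sketch that pulls the terminology id, the code list and the assumed value out of such a constraint. It is an illustration only, not how any ADL parser actually works; the function name parse_inline_code_list and the regular expression are my own assumptions, and it only handles the simple `[terminology::code, code; assumed]` shape shown above.

```python
import re
from typing import List, Optional, Tuple


def parse_inline_code_list(text: str) -> Tuple[str, List[str], Optional[str]]:
    """Split an inline code constraint into (terminology, codes, assumed).

    Hypothetical helper: handles only the simple shape
    [terminology:: code, code, ...; assumed_code] with '--' comments.
    """
    # Drop ADL '--' comments line by line, then join into a single line.
    stripped = " ".join(line.split("--", 1)[0] for line in text.splitlines())
    match = re.search(
        r"\[\s*([\w.-]+)\s*::\s*([^;\]]*)(?:;\s*([^\]]*))?\]", stripped
    )
    if match is None:
        raise ValueError("not an inline code constraint")
    terminology, codes_part, assumed_part = match.groups()
    codes = [c.strip() for c in codes_part.split(",") if c.strip()]
    # The part after ';' is optional; an empty string counts as "no assumed value".
    assumed = assumed_part.strip() if assumed_part and assumed_part.strip() else None
    return terminology, codes, assumed
```

Applied to the Position constraint above, this yields the terminology id "local", the five distinct at-codes in order, and at1002 as the assumed value.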

The question here is: where should the value set really be defined? As described here, it's inline in the archetype constraint definition, right in the middle of all the quantity, structural and other constraints. But since a value set is a terminological entity, should it not instead be defined in the archetype terminology section, and just referred to from the constraint definition part?

Implemented Solution

An earlier proposal is shown in the final section, 'Original Proposed Solution (not implemented)'. What I have actually implemented is shown here, and is more like Daniel Karlsson's suggestion on the technical list. In it, all inline constraints are treated as either:

  • a single term, in which case a single at-code is used
  • a value set, in which case a single ac-code is used

Where an inline constraint contains more than one code, it is converted to a value set defined in the terminology, and the constraint refers to it by its ac-code.

If the codes or value sets found in ADL 1.4 or 1.5 archetypes are for external terminologies (including 'openehr'), they are converted to at-codes and/or ac-codes as above, and bindings to the original external codes are synthesised in the archetype.

Below is how a value set looks in ADL 1.4, and in any 'old' ADL 1.5 archetypes:

  ELEMENT[id31] occurrences matches {0..1} matches { -- Body exposure
value matches {
DV_CODED_TEXT matches {
defining_code matches {
[local::
at32, -- Naked
at33, -- Reduced clothing/bedding
at34, -- Appropriate clothing/bedding
at35; -- Increased clothing/bedding
at34] -- assumed value
}
}
}
}

When this is reprocessed according to the new approach, we get the following in the body of the archetype:

  ELEMENT[id31] occurrences matches {0..1} matches { -- Body exposure
value matches {
DV_CODED_TEXT[id62] matches {
defining_code matches {[ac1; at34]} -- Body exposure
}
}
}

And this in the terminology:

  term_definitions = <
    ["en"] = <
      ["at32"] = <
        text = <"Naked">
        description = <"No clothing, bedding or covering.">
      >
      ["at33"] = <
        text = <"Reduced clothing/bedding">
        description = <"The person is covered by a lesser amount of clothing or bedding than deemed appropriate for the environmental circumstances.">
      >
      ["at34"] = <
        text = <"Appropriate clothing/bedding">
        description = <"The person is covered by an amount of clothing or bedding deemed appropriate for the environmental circumstances.">
      >
      ["at35"] = <
        text = <"Increased clothing/bedding">
        description = <"The person is covered by an increased amount of clothing or bedding than deemed appropriate for the environmental circumstances.">
      >
      ["ac1"] = <
        text = <"Body exposure">
        description = <"The thermal situation of the person who is having the temperature taken.">
      >
    >
  >
 
  value_sets = <
    ["ac1"] = <
      id = <"ac1">
      members = <"at32", "at33", "at34", "at35">
    >
  >
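The rewriting rule illustrated above can be sketched in a few lines of Python. This is not the actual implementation; the function name, the simplified value_sets dictionary (ac-code mapped to a member list, without the separate id field) and the "next free number" scheme for ac-codes are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def convert_inline_constraint(
    codes: List[str],
    assumed: Optional[str],
    value_sets: Dict[str, List[str]],
) -> str:
    """Return the rewritten constraint text for an inline list of at-codes.

    One code stays a single at-code; more than one code becomes a value set
    registered in `value_sets` under a fresh ac-code (hypothetical numbering).
    """
    if not codes:
        raise ValueError("an inline constraint needs at least one code")
    if assumed is not None and assumed not in codes:
        raise ValueError("assumed value must be one of the listed codes")
    if len(codes) == 1:
        # Single term: no value set is created.
        return f"[{codes[0]}]"
    ac_code = f"ac{len(value_sets) + 1}"  # assumption: next unused number
    value_sets[ac_code] = list(codes)
    suffix = f"; {assumed}" if assumed is not None else ""
    return f"[{ac_code}{suffix}]"
```

Calling it with the codes at32 to at35 and the assumed value at34 on an empty value_sets dictionary returns the text `[ac1; at34]` and registers the four codes under ac1, matching the Body exposure example.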

Some existing archetypes have inline value-sets consisting of external codes, e.g. openehr codes, or SNOMED codes. In the new approach, all of this would move to the value-set representation based on internal at-codes, with each external code also having a binding added. The following shows the effect of re-processing an archetype containing an externally coded inline value set. First the original archetype (1.5 format, new id-codes):

archetype (adl_version=1.5)
	openEHR-EHR-OBSERVATION.external_code_list.v1

language
	original_language = <[ISO_639-1::en]>

description
	original_author = <
		["name"] = <"Sam Heard">
		["organisation"] = <"Ocean Informatics">
		["email"] = <"sam.heard@oceaninformatics.com">
		["date"] = <"22/03/2006">
	>
	details = <
		["en"] = <
			language = <[ISO_639-1::en]>
			purpose = <"To test rewriting of external code list">
			copyright = <"© openEHR Foundation">
		>
	>
	lifecycle_state = <"AuthorDraft">
	other_details = <
		["regression"] = <"PASS">
	>

definition
	OBSERVATION[id1] matches {	
		protocol matches {
			ITEM_TREE[id2] matches {	
				items cardinality matches {0..*; unordered} matches {
					ELEMENT[id3] occurrences matches {0..1} matches {	
						value matches {
							DV_CODED_TEXT matches {
								defining_code matches {
									[openehr::
									249,
									251,
									252,
									253,
									523,
									666;
									251]
								}
							}
						}
					}
				}
			}
		}
	}

terminology
	term_definitions = <
		["en"] = <
			["id1"] = <
				text = <"Test Obs">
				description = <"Test Obs">
			>
			["id3"] = <
				text = <"document state">
				description = <"document state">
			>
		>
	>

Now the reprocessed version:

archetype (adl_version=1.5.1)
	openEHR-EHR-OBSERVATION.external_code_list.v1.0.0

language
	original_language = <[ISO_639-1::en]>

description
	original_author = <
		["name"] = <"Sam Heard">
		["organisation"] = <"Ocean Informatics">
		["email"] = <"sam.heard@oceaninformatics.com">
		["date"] = <"22/03/2006">
	>
	details = <
		["en"] = <
			language = <[ISO_639-1::en]>
			purpose = <"To test rewriting of external code list">
			copyright = <"© openEHR Foundation">
		>
	>
	lifecycle_state = <"initial">
	other_details = <
		["regression"] = <"PASS">
	>

definition
	OBSERVATION[id1] matches {	-- Test Obs
		protocol matches {
			ITEM_TREE[id2] matches {	-- history
				items matches {
					ELEMENT[id3] occurrences matches {0..1} matches {	-- document state
						value matches {
							DV_CODED_TEXT[id4] matches {
								defining_code matches {[ac1; at2]}		-- document state
							}
						}
					}
				}
			}
		}
	}

terminology
	term_definitions = <
		["en"] = <
			["id1"] = <
				text = <"Test Obs">
				description = <"Test Obs">
			>
			["id3"] = <
				text = <"document state">
				description = <"document state">
			>
			["ac1"] = <
				text = <"document state">
				description = <"document state">
			>
			["at1"] = <
				text = <"(added by post-parse processor)">
				description = <"(added by post-parse processor)">
			>
			["at2"] = <
				text = <"(added by post-parse processor)">
				description = <"(added by post-parse processor)">
			>
			["at3"] = <
				text = <"(added by post-parse processor)">
				description = <"(added by post-parse processor)">
			>
			["at4"] = <
				text = <"(added by post-parse processor)">
				description = <"(added by post-parse processor)">
			>
			["at5"] = <
				text = <"(added by post-parse processor)">
				description = <"(added by post-parse processor)">
			>
			["at6"] = <
				text = <"(added by post-parse processor)">
				description = <"(added by post-parse processor)">
			>
		>
	>
	term_bindings = <
		["openehr"] = <
			["at1"] = <http://openehr.info/id/249>
			["at2"] = <http://openehr.info/id/251>
			["at3"] = <http://openehr.info/id/252>
			["at4"] = <http://openehr.info/id/253>
			["at5"] = <http://openehr.info/id/523>
			["at6"] = <http://openehr.info/id/666>
		>
	>
	value_sets = <
		["ac1"] = <
			id = <"ac1">
			members = <"at1", "at2", "at3", "at4", "at5", "at6">
		>
	>
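The transformation from the original to the reprocessed archetype can be approximated by the following Python sketch. It is a simplification under stated assumptions: the function name internalise_external_codes is made up, the archetype is assumed to contain no at-codes or ac-codes beforehand (so numbering starts at at1 and ac1), and the binding URI is formed by appending the code to a caller-supplied prefix, as in the openehr bindings above.

```python
from typing import Dict, List, Optional, Tuple

PLACEHOLDER = "(added by post-parse processor)"


def internalise_external_codes(
    terminology_id: str,
    codes: List[str],
    assumed: Optional[str],
    uri_prefix: str,
) -> Tuple[str, Dict[str, Dict[str, str]], Dict[str, Dict[str, str]], Dict[str, List[str]]]:
    """Replace external codes by at-codes, a value set and synthesised bindings.

    Returns (constraint text, term_definitions, term_bindings, value_sets).
    Assumes an archetype with no existing at-codes or ac-codes.
    """
    # One new at-code per external code, in the order the codes were listed.
    at_for = {code: f"at{i}" for i, code in enumerate(codes, start=1)}
    term_definitions = {
        at: {"text": PLACEHOLDER, "description": PLACEHOLDER}
        for at in at_for.values()
    }
    # Each at-code is bound back to the external code it replaced.
    term_bindings = {
        terminology_id: {at: uri_prefix + code for code, at in at_for.items()}
    }
    value_sets = {"ac1": list(at_for.values())}
    suffix = f"; {at_for[assumed]}" if assumed is not None else ""
    return f"[ac1{suffix}]", term_definitions, term_bindings, value_sets
```

With the six openehr codes from the original archetype, the assumed value 251 and the prefix http://openehr.info/id/, the sketch produces the constraint text `[ac1; at2]` and a binding from at2 to http://openehr.info/id/251, as in the reprocessed archetype.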

Original Proposed Solution (not implemented)

Imagine that the Position constraint shown in the first section looked like the following:

ELEMENT[id9] occurrences matches {0..1} matches { -- Position
value matches {
DV_CODED_TEXT[id1059] matches {
defining_code matches {[local::vs1001]}
}
}
}

where 'vs1001' is a code referring to a value set defined in the terminology. Now imagine that in the terminology you have the following:

term_definitions = <

....

 ["vs1001"] = <
text = <"Blood pressure measuring position">
description = <"Position of patient at time of measuring blood pressure.">
>
 ["at1001"] = <
text = <"Standing">
description = <"Standing at the time of blood pressure measurement.">
>
["at1002"] = <
text = <"Sitting">
description = <"Sitting (for example on bed or chair) at the time of blood pressure measurement.">
>
["at1003"] = <
text = <"Reclining">
description = <"Reclining at the time of blood pressure measurement.">
>
["at1004"] = <
text = <"Lying">
description = <"Lying flat at the time of blood pressure measurement.">
>
["at1015"] = <
text = <"Lying with tilt to left">
description = <"Lying flat with some lateral tilt, ....">
>

relationships = <
  ["rel1"] = <origin=<"at1001"> rel=<"instance-of"> target=<"vs1001">>
  ["rel2"] = <origin=<"at1002"> rel=<"instance-of"> target=<"vs1001">>
  ["rel3"] = <origin=<"at1003"> rel=<"instance-of"> target=<"vs1001">>
  ["rel4"] = <origin=<"at1004"> rel=<"instance-of"> target=<"vs1001">>
  ["rel5"] = <origin=<"at1015"> rel=<"instance-of"> target=<"vs1001">>
>

 

The 'relationships' section doesn't exist in current ADL 1.5, but we can see that it is simply a set of triples (the exact representation is just made up for now), linking the body position value set members to the value set, identified by "vs1001". And the archetype terminology section is starting to look like a small SNOMED CT structured terminology.
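Since the relationships are just triples, recovering the members of each value set is a simple grouping step. The Python sketch below shows one way this might be done; the tuple representation and the function name are assumptions, in the same way that the ODIN representation above is made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (origin, rel, target) triples keyed by relationship id, mirroring the
# made-up 'relationships' section above.
RELATIONSHIPS: Dict[str, Tuple[str, str, str]] = {
    "rel1": ("at1001", "instance-of", "vs1001"),
    "rel2": ("at1002", "instance-of", "vs1001"),
    "rel3": ("at1003", "instance-of", "vs1001"),
    "rel4": ("at1004", "instance-of", "vs1001"),
    "rel5": ("at1015", "instance-of", "vs1001"),
}


def value_sets_from_relationships(
    relationships: Dict[str, Tuple[str, str, str]]
) -> Dict[str, List[str]]:
    """Group 'instance-of' triples by target to get value set members."""
    members: Dict[str, List[str]] = defaultdict(list)
    for origin, rel, target in relationships.values():
        if rel == "instance-of":  # other relationship kinds are ignored here
            members[target].append(origin)
    return dict(members)
```

For the triples above, the result maps vs1001 to the five body position at-codes, in the order the relationships were declared.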

Why might we do this? Well, several reasons come to mind:

  • Now the value set is explicitly represented in the archetype terminology section, which is where terminology people would probably like to see it. 
    • This would enable various interesting kinds of processing and reasoning on archetype terminologies.
    • We are forced to provide a term corresponding to the value set itself (vs1001 above), which itself could be bound to external terminology codes.
    • Now we have an explicit grouping of a set of at-codes into a value set, whereas before in the archetype terminology we just had a long table of term_definitions, potentially containing at-codes from numerous value sets defined in the archetype terminology - all mixed together.
  • The value set is referred to by reference from the main definition. If in the future, the value set were to become available in e.g. SNOMED CT, the definition part of the archetype could stay the same; all that would be needed would be a binding from "vs1001" to the URI for SNOMED's blood pressure measuring position ref set.
  • Any given value set could be represented both by a local-to-archetype value set, as in the above example, and (at some point in time) a binding to an external ref set. However, in some circumstances the external ref-set might not be available, and the locally defined one would then work as a fall-back. This could be made explicit when generating the operational template in a given deployment context.
  • Value sets represented in archetypes like this could be exported into a SNOMED-like format pretty easily, and offered to IHTSDO as ref sets for inclusion in SNOMED, or in a specific extension. Archetype authors are often the world experts on a particular topic, and may in many cases develop the best possible value sets for many specific purposes.
  • Other kinds of relationships could be represented. The idea of an archetype isn't to replace external terminology, but it could be useful to augment it by asserting local value sets as above, as well as other relationships such as synonyms.
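The fall-back behaviour mentioned in the list above (prefer a bound external ref set, otherwise use the locally defined value set) could look roughly like the following Python sketch. Nothing here is specified anywhere; the function name, the external_available callback and the example URI used in the usage note are purely hypothetical.

```python
from typing import Callable, Dict, List, Tuple, Union


def resolve_value_set(
    vs_code: str,
    bindings: Dict[str, str],
    local_value_sets: Dict[str, List[str]],
    external_available: Callable[[str], bool],
) -> Tuple[str, Union[str, List[str]]]:
    """Pick the external ref set if bound and reachable, else the local set.

    `external_available` is a hypothetical callback that reports whether the
    deployment context can resolve the given URI.
    """
    uri = bindings.get(vs_code)
    if uri is not None and external_available(uri):
        return ("external", uri)
    if vs_code in local_value_sets:
        # Fall back to the value set defined in the archetype terminology.
        return ("local", local_value_sets[vs_code])
    raise KeyError(f"no definition available for value set {vs_code}")
```

For example, with vs1001 bound to a made-up URI such as http://example.org/refset/bp-position and a callback that always returns False, the function falls back to the local member list; with a callback that returns True, it returns the URI instead.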