Input to terminology constraint expression

Input from the UK LRA design experience

The following is extracted from the document, 'The LRA Technical Model Developer's Design Manual Part 1: Model Design' (Version 0.6.1), describing the LRA's use of SNOMED CT as a reference terminology.  It is posted here as a potential starting point for discussing CIMI's evolving terminology binding approach.  Other users are encouraged to comment or suggest changes to the below, based on what they feel might be appropriate for CIMI.

1      Terminology Expression Constraint Design

A key aspect of the design of LRA Technical Models is the specification of constraints that restrict the semantics and, if necessary, the form of the SNOMED CT instance expressions allowed for the terminology-bound model attributes.  The following sections summarise the key features and dependencies of terminology expression constraint design.

1.1    Use of Post-Coordination

The main aspects of SNOMED CT post-coordination used to support the encoding of care record information with LRA Technical Models are:

  • Context wrapper
    • The SNOMED CT context model explicitly represents the situation/context that applies to individual model elements.
  • Clinical focus concept
    • A clinical finding, observable, event, or procedure.
  • Refinements
    • Values applied to SNOMED CT concept model attributes that add detail and specificity to the focus concept.
    • Refinement includes adding specificity to an existing defining relationship and applying values to other attributes sanctioned by the concept model.

Refinement also includes 'indirect' use of attributes such as the application of laterality to a finding or procedure. In this case, the laterality logically applies to the finding or procedure site.

The result is a set of constraints that can be expressed in a form suitable for engineering, to ensure that LRA design and implementation conform to the choices made. The complete set of constraints is defined in the appendix to this document.

1.2    Expression Constraints

The LRA currently supports the specification and application of the semantic and literal expression constraint types defined by the terminology binding mechanism developed for use within the LRA. [5] These constraint types restrict the semantics and form respectively of the instance expressions allowable for terminology bound model attributes. 

Every attribute capable of being terminology bound, in every model element, inherits default expression constraints from the underlying Reference Model.  These constraints are listed for the appropriate classes described in section 4.1.  Any constraint subsequently defined to restrict further the permissible semantics and / or the form in which they are expressed must conform to the default constraint(s). If no other constraint is specified for a model attribute then the default constraint will apply.  Whilst it ensures that any instance expression applied to the attribute is conformant with the Reference Model, the default constraint is unlikely to be sufficiently restrictive to satisfy any real-world requirement.

This document provides guidance on specifying constraints for a range of care record requirements.  All constraints are expressed using the Extended SNOMED CT Compositional Grammar (see section 7.1) although their underlying serialisation uses an XML grammar. 

1.2.1     Expression Constraint Types

1.2.1.1    Semantic Expression Constraints

Semantic expression constraints restrict what it is permissible for an instance expression to state.  A semantic expression constraint asserts that only terminology expressions whose meaning falls within a domain specified by the constraint can be used as values for the model attribute to which the constraint applies.  A semantic expression constraint is therefore concerned only with ensuring that the meaning conveyed by the instance conforms to the set of permissible meanings defined by the constraint, and not with whether that meaning is conveyed as a pre-coordinated or post-coordinated expression.

1.2.1.2    Literal Expression Constraints

Literal expression constraints restrict the literal form of an instance expression.  A literal expression constraint specifies the permitted or required post-coordination of any instance expression used as a value for the model attribute to which the constraint applies.  Literal expression constraints and semantic expression constraints are interrelated in that:

  • a semantic expression constraint may indirectly constrain the literal expression (e.g. prohibiting a specific semantic facet also prohibits that aspect of post-coordination); and
  • a literal expression constraint imposes some constraints on semantics (e.g. if post-coordination is not permitted, then meanings for which no pre-coordinated concept exists in SNOMED CT cannot be represented).

Literal expression constraints can be used to limit variation in forms of expression and thus simplify implementation.  They can also be used to require explicit post-coordination where issues with the consistency and completeness of SNOMED CT content are expected to interfere with a mission critical processing requirement.
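The distinction between the two constraint types can be illustrated with a minimal Python sketch. It is not the LRA binding mechanism: the concept labels, the DEFINITIONS table and the function names are all hypothetical, and the semantic test is simplified to equality of normalised forms, whereas a real check would test subsumption against the constraint's domain.

```python
from dataclasses import dataclass

# Toy definitions: pre-coordinated concept -> (focus, refinements) it is equivalent to.
# Labels are placeholders for illustration, not SNOMED CT content.
DEFINITIONS = {
    "fracture of left femur": ("fracture of femur", {"laterality": "left"}),
}

@dataclass(frozen=True)
class Expression:
    focus: str
    refinements: tuple = ()  # (attribute, value) pairs

def normalise(expression):
    """Expand a pre-coordinated focus into its definition and merge refinements."""
    focus, defined = DEFINITIONS.get(expression.focus, (expression.focus, {}))
    merged = dict(defined)
    merged.update(dict(expression.refinements))
    return Expression(focus, tuple(sorted(merged.items())))

def satisfies_semantic(expression, permitted):
    """Semantic check: compares meaning only, regardless of the form used."""
    return normalise(expression) in permitted

def satisfies_literal(expression, allow_post_coordination):
    """Literal check: looks at the form, here whether refinements are present."""
    return allow_post_coordination or not expression.refinements

pre = Expression("fracture of left femur")
post = Expression("fracture of femur", (("laterality", "left"),))
permitted = {normalise(post)}
```

In this sketch both `pre` and `post` satisfy the semantic constraint, but only `pre` satisfies a literal constraint that prohibits post-coordination.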

1.2.2     Open Versus Closed Constraints

The following convention, which goes beyond the current SNOMED CT standard, is used in this document. The constraints on an expression, or on a refinement within an expression constraint, may be 'open' or 'closed'. An 'open' constraint implies that an attribute that is permitted by the SNOMED CT Concept Model but not mentioned in the constraint is permitted to have the range of values specified by the Concept Model. A 'closed' constraint implies that only those attributes explicitly mentioned may be present.
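The open and closed conventions can be sketched as follows. This is an illustrative simplification with assumed names: CONCEPT_MODEL is a toy stand-in for the SNOMED CT Concept Model, the labels are placeholders, and the check covers only which attributes and values are permitted, not which are required.

```python
# Toy concept model: attribute -> values the concept model permits.
# All labels are illustrative placeholders, not SNOMED CT content.
CONCEPT_MODEL = {
    "finding site": {"lung", "heart"},
    "severity": {"mild", "severe"},
}

def conforms(refinement, constraint, closed):
    """Check a refinement (attribute -> value) against a constraint.

    constraint maps attribute -> set of permitted values.
    closed=True:  only attributes named in the constraint may appear.
    closed=False: unmentioned attributes fall back to the concept model range.
    """
    for attribute, value in refinement.items():
        if attribute in constraint:
            if value not in constraint[attribute]:
                return False
        elif closed:
            return False
        elif value not in CONCEPT_MODEL.get(attribute, set()):
            return False
    return True

constraint = {"finding site": {"lung"}}
refinement = {"finding site": "lung", "severity": "mild"}
```

Here `refinement` conforms when the constraint is treated as open, because "severity" is sanctioned by the toy concept model, and fails when the constraint is treated as closed.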

1.2.3     Constraint Inheritance

A requirement to test multiple inherited constraints is likely to increase the burden of validation.  Expression constraints can be represented in one of three different normal forms.

  • Short-form - includes only refinements to inherited constraints.
    • This is the most efficient in terms of authoring and management of constraints.
  • Intermediate-form - includes and refines inherited constraints but does not restate rules that are part of the concept model.
    • This is efficient for both processing and distribution because its only dependencies are on a single common set of concept model constraints.
  • Long-form - includes all applicable rules, including those inherited from the concept model.
    • This is potentially more efficient for processing since it does not have any dependencies.

The Terminology Binding Technical Specification [5] supports short, intermediate and long forms, using a single XML schema which includes an attribute specifying the form. The constraints used in this document use the intermediate normal form except where otherwise stated.
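The relationship between the three forms can be pictured as successive merging of rule sets. The sketch below is an assumption-laden illustration rather than the XML schema referred to above: rules are modelled as plain dictionaries, the rule text and labels are placeholders, and `merge` simply lets the more specific layer override the less specific one without checking that it is a genuine narrowing.

```python
def merge(parent, child):
    """Return the parent's rules overridden or extended by the child's rules."""
    merged = dict(parent)
    merged.update(child)
    return merged

# Hypothetical rule sets, keyed by attribute name.
concept_model_rules = {"method": "<< action", "procedure site": "<< body structure"}
inherited = {"procedure site": "<< limb structure"}
short_form = {"laterality": "< side"}  # only the local refinements

# Intermediate form: inherited constraints plus local refinements.
intermediate_form = merge(inherited, short_form)

# Long form: everything, including rules from the concept model.
long_form = merge(concept_model_rules, intermediate_form)
```

The short form is the smallest to author, while the long form can be validated without consulting any other rule set.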

1.3    Instance Expressions

This guidance also includes many examples of instance expressions, using the SNOMED CT Compositional Grammar without extensions. 

1.3.1     Close To User Form

It is often possible to express a given SNOMED CT instance expression in a number of equivalent forms.  The close-to-user form is more natural for clinicians to use and more readily understood by human readers, and it is used for many of the instance examples in this document.  Although semantic expression constraints are intended to constrain the meaning rather than the form of the expression, if too restrictive they may unintentionally prevent the use of some close-to-user forms.  For example, the organising concept 4908009 | history of (situation) | and its subtypes have a temporal context value of 410513005 | in the past |.  A semantic constraint requiring that instances in which the time of the event was unknown have a temporal context value of 410587003 | in the past | (or a subtype thereof) would prohibit the use of any concepts from this hierarchy.  This is taken into account in the constraints specified in section 4.2.3.

1.3.2     Default Context

When a SNOMED CT expression appears in a record without any explicitly stated context, then the concept is considered to have a "soft-default" context.

For a clinical finding this means that the finding has actually occurred, that it is occurring to the subject of record, and that it is occurring currently or at a stated past time.

The soft default context for a procedure means that the procedure was completed, was performed on the subject of record (the patient), and was done in the present time or at a stated past time.

Regardless of any forms used by instance expressions, semantic expression constraints always make explicit any permitted contexts. 

For further information the reader should refer to Section 2.5.2 of SNOMED CT Abstract Logical Model and Representational Forms and to SNOMED CT Transforming Expressions to Normal Forms.
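As a rough illustration of how soft-default context might be made explicit, the following Python sketch fills in any context attribute that an instance leaves unstated. The attribute and value labels paraphrase the descriptions above and are assumptions made for this example; they are not taken from the referenced SNOMED CT documents.

```python
# Hypothetical labels for the soft-default context described above.
DEFAULTS = {
    "finding": {
        "finding context": "known present",
        "temporal context": "current or specified time",
        "subject relationship context": "subject of record",
    },
    "procedure": {
        "procedure context": "done",
        "temporal context": "current or specified time",
        "subject relationship context": "subject of record",
    },
}

def with_default_context(kind, explicit_context=None):
    """Return the full context: soft defaults overridden by any explicit values."""
    if kind not in DEFAULTS:
        raise ValueError(f"no soft-default context defined for {kind!r}")
    context = dict(DEFAULTS[kind])
    context.update(explicit_context or {})
    return context

history = with_default_context("finding", {"temporal context": "in the past"})
```

An explicitly stated context value replaces the corresponding default, while unstated values keep their defaults.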

1.4    Extended SNOMED CT Compositional Grammar

The grammar used in this guidance is an extension beyond the current standard and is unlikely to be formally accepted for SNOMED CT for several years. The annex to this guidance document defines the extended grammar and should be used as an authoritative reference for the use of the grammar in LRA.

1.5    Reliance on Unpublished Work

The terminology guidance in this document makes use of the work of the IHTSDO Machine Readable Concept Model Project Group. This work is not yet generally published or approved and therefore no specific further reference to this area can be made in this guidance document. Although not part of the Standard SNOMED CT release, a machine and human readable representation of the International SNOMED CT concept model can be retrieved from the IHTSDO collaborative space following the link https://TheCAP.seework.com/P25441247 . Please note this is a secure site - a user account should be requested first from support@ihtsdo.org, including "Machine and Human Readable Concept Model" in the requested project areas (see http://www.ihtsdo.org/about-ihtsdo/collaborative-space/ for details).

1      Appendices

1.1    Extended Compositional Grammar  

This annex specifies extensions to the SNOMED CT Compositional Grammar which have been used in HL7 TermInfo and in early NHS work to represent constraints on expressions. This material, previously published in the NHS CFH document on 'Terminology Binding Requirements and Principles' (version 1.0, May 2008), is repeated here for reference.

The source form for the constraints and examples in this document is the richer and more expressive XML representation recommended in the Terminology Binding Technical Specification. However, within the text of the document they are rendered into the Extended SNOMED CT Compositional Grammar (ESCG) described in this appendix. The rendering is derived by applying an XSLT transform to the source representation. The rendered result is easier to read than the more verbose XML but, in the case of complex constraints, does result in some loss of specificity. For this reason, where practical, hyperlinks have been included to reference the XML source files. These hyperlinks may not be navigable in some published forms of the document but are available from the XHTML (.html) and 'HTML Help' (.chm) versions of the document.

The following tables provide an overview of the SNOMED CT Compositional Grammar, which is documented in 'SNOMED CT Guide to Abstract Models and Representational Forms', and of the various enhancements supported by the 'Extended Compositional Grammar' (ESCG).

To support clear documentation of relatively simple constraints, and to enable simple representation of constrained value-sets of concepts and expressions based on post-coordinated refinement, an informal extension has been made to the compositional grammar. The extensions include:

  • Symbols < and << to represent constraints that include subtypes of a specified concept.
  • The ^ symbol to represent inclusion of members of a Refset (subset).
  • The ! symbol to represent exclusion (i.e. the logical complement of the specified constraint).
  • Logical AND (intersection) and OR (union) operations between permitted value-sets.

This human-readable rendering of constraints arose from changes to the grammar proposed in the 'Guide to Use of SNOMED CT in HL7 Version 3' and in the NHS CFH report on 'Design of Adverse Reaction Archetypes and Templates for the Vaccination Summary Record', the pilot project on use of EN13606 and SNOMED CT.

To further improve clarity, a colour scheme has been applied to distinguish concept identifiers (grey), terms (blue), significant operators (red) and term delimiters (green).

Table 100. Summary of SNOMED CT Compositional Grammar
 Symbol
Notes
Examples

digits

ConceptId

A sequence of digits in an expression represents a SNOMED CT concept identifier. The two exceptions to this are:
1) Where digits occur between a pair of pipe symbols, in which case the digits are part of the display name (see | text | row in this table).
2) Where a string of digits is immediately preceded by a caret symbol ^, in which case in the extended constraint grammar this represents a subset (or Refset) identifier (see next table).
The simplest expression is a concept identifier on its own. For example: 87628006 | bacterial infectious disease |

| text |

Display name delimiter

A pair of pipe ("|") symbols are used to delimit an optional display name for the immediately preceding concept identifier. For example: 87628006 | bacterial infectious disease |
The display name may be the term string of any of the descriptions associated with the concept in a current version of SNOMED CT. For example, the following are all valid representations of the same concept:
87628006 | bacterial infectious disease |
87628006 | enfermedad infecciosa bacteriana |
87628006 | maladie infectieuse bactérienne |
Note: In constraint expressions where a subset (or Refset) identifier is used, the | text | is the name of the subset (or Refset).

space
tab
linefeed
return

Whitespace characters

Whitespace characters are ignored and can thus be used to format the appearance of an expression where this aids clarity. The only exception to this rule is that spaces are not ignored within a display name.
Note: Spaces before the first or after the last non-whitespace character of a display name are ignored. The text between the pair of pipe characters is trimmed of any surrounding whitespace but spaces within the enclosed text are treated as part of the display name.

:

Refinement prefix

A colon (:) precedes a refinement of the meaning of the concept to the left of the colon. A refinement consists of one or more attributes and/or attribute groups and these are illustrated by examples in subsequent rows of this table.

=

Attribute value prefix

Each of the attributes that make up a refinement consists of an attribute name and an attribute value. The attribute name precedes the value and is separated from it by an equals sign ( = ).
The attribute name is represented by a concept identifier. The attribute value may be represented by a concept identifier, as in the following example, or by a nested expression (see example later in this table).
The following example specifies a bacterial infectious disease caused by Streptococcus pneumoniae.
87628006 | bacterial infectious disease |:
246075003 | causative agent | = 9861002 | Streptococcus pneumoniae |

,

Attribute separator

A refinement may include more than one attribute. In this case, a comma (,) is used to separate attributes from one another.
The following example specifies a bacterial infectious disease affecting the upper lobe of the lung and caused by Streptococcus pneumoniae.
87628006 | bacterial infectious disease |:
246075003 | causative agent | = 9861002 | Streptococcus pneumoniae |
, 363698007 | finding site | = 45653009 | structure of upper lobe of lung |

( exp )

Nested expression

The value of an attribute may be represented by a nested expression rather than a single concept identifier. In this case, the nested expression is enclosed in parentheses.
The following example specifies a bacterial infectious disease affecting the left upper lobe of the lung and caused by Streptococcus pneumoniae. The nested expression localises and lateralises the site of the disease.
87628006 | bacterial infectious disease |:
246075003 | causative agent | = 9861002 | Streptococcus pneumoniae |
, 363698007 | finding site | =( 45653009 | structure of upper lobe of lung |:
272741003 | laterality | = 7771000 | left | )

{ grp }

Attribute group

In some cases, different sets of attributes apply to different facets of the same concept. For example, some common fractures involve two adjacent bones and the nature of the fracture of each bone may differ. Similarly, some procedures involve removal of one structure and repair of another and different refinements of these actions may be required.
In SNOMED CT, concepts that have multiple facets are defined with each facet represented by a separate relationship group. When these concepts are refined, it may be necessary to specify which group is being refined. In these cases, curly braces are used to group together sets of attributes that act together.
The following example represents a fracture of the shaft of the tibia and fibula. The tibia has a spiral fracture while the nature of the fracture of the fibula is incomplete.
271577005 | fracture of shaft of tibia and fibula |:
{ 116676008 | associated morphology | = 30543000 | fracture, incomplete |
, 363698007 | finding site | = 113224005 | bone structure of shaft of fibula | }
,{ 116676008 | associated morphology | = 73737008 | fracture, spiral |
, 363698007 | finding site | = 52687003 | bone structure of shaft of tibia | }

+

Combination

87628006 | bacterial infectious disease | + 50043002 | disorder of respiratory system |
This means a disorder that is both a bacterial disease and a disorder of the respiratory system. For example "bacterial pneumonia".
It does not mean two separate disorders that are for some reason being linked. For example, this use of the plus sign would not be the appropriate way to represent that someone has a non-bacterial respiratory disorder (e.g. allergic asthma) and also has a bacterial disease (e.g. impetigo).
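To make the symbols in this table concrete, the following Python sketch tokenises an expression and parses the simplest case: a focus concept with an optional flat refinement. It is a minimal illustration, not a conformant parser. The function names are hypothetical, and nested expressions, attribute groups and the + operator are recognised as tokens but deliberately not parsed.

```python
import re

# One token is a run of digits, a |display name|, or a single grammar symbol.
TOKEN = re.compile(r"\s*(?:(?P<id>\d+)|\|(?P<term>[^|]*)\||(?P<sym>[:=,+(){}]))")

def tokenize(expression):
    """Split an expression into (kind, text) tokens, ignoring whitespace."""
    expression = expression.rstrip()
    tokens, pos = [], 0
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "term":
            text = text.strip()  # surrounding spaces are not part of the name
        tokens.append((kind, text))
        pos = match.end()
    return tokens

def read_concept(tokens, i):
    """Read a concept identifier and its optional display name."""
    if i >= len(tokens) or tokens[i][0] != "id":
        raise ValueError("expected a concept identifier")
    concept_id, term = tokens[i][1], None
    i += 1
    if i < len(tokens) and tokens[i][0] == "term":
        term = tokens[i][1]
        i += 1
    return (concept_id, term), i

def parse_flat(expression):
    """Parse 'focus : name = value , name = value' (no nesting or groups)."""
    tokens = tokenize(expression)
    focus, i = read_concept(tokens, 0)
    attributes = []
    if i < len(tokens):
        if tokens[i] != ("sym", ":"):
            raise ValueError("expected ':' before the refinement")
        i += 1
        while True:
            name, i = read_concept(tokens, i)
            if i >= len(tokens) or tokens[i] != ("sym", "="):
                raise ValueError("expected '=' after the attribute name")
            value, i = read_concept(tokens, i + 1)
            attributes.append((name, value))
            if i == len(tokens):
                break
            if tokens[i] != ("sym", ","):
                raise ValueError("expected ',' between attributes")
            i += 1
    return focus, attributes

focus, attributes = parse_flat("""
87628006 | bacterial infectious disease |:
246075003 | causative agent | = 9861002 | Streptococcus pneumoniae |
, 363698007 | finding site | = 45653009 | structure of upper lobe of lung |
""")
```

Because display names are consumed as single tokens, digits and commas inside a pair of pipe symbols (as in | fracture, incomplete |) are not mistaken for identifiers or attribute separators.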


Table 101. Compositional Grammar extension - Constraint symbols
 Symbol
Notes
Examples


This concept
(No symbol prefix)

71388002 | procedure |
The concept "procedure" SHALL be used. Note: By default, unless the surrounding context states otherwise, this implies this precise concept (i.e. not one of its subtypes). However, the context within a sentence or parsable expression may imply a less specific requirement. For example, if the concept is followed by any options for addition of refinements these implicitly permit refinement of the concept.

^

The identifier (and optional text) that follows refers to a subset (or Refset) and any member of that set is permitted.

^ 8181000000134 | Encounter disposition |
A concept that is a member of the 'Encounter disposition' subset SHALL be used.
Note: This symbol cannot be combined with the "<<" or "<" symbols. This could conflict with the membership definition for the set. Where subtypes of members of a set are intended to be included this must be a property of the set (or of its members) and the set must be specified using an 'intensional definition'.

<<

This concept or any subtype permitted

<< 71388002 | procedure |
Either the concept "procedure" or one of its subtypes SHALL be used. Note: this differs from the "<=" symbol used to indicate the same constraint in other HL7 specifications. The reason for the difference is to reserve "=" for its role as the operator that joins an attribute name and an attribute value in the unextended compositional grammar.

<

Any subtype of this concept (but not the concept itself)

71388002 | procedure |: 363704007 | procedure site | =( 29836001 | hip region structure |: 272741003 | laterality | = < 182353008 | side | )
The procedure site SHALL be the value "hip region structure" and SHALL include the attribute "laterality". The value of "laterality" SHALL be a subtype of "side" but SHALL NOT be "side" itself.

~

Optional attribute (only applicable as a prefix to AttributeName)

71388002 | procedure |: << 363704007 | procedure site | =( << 29836001 | hip region structure |:~ 272741003 | laterality | = < 182353008 | side | )
The attribute "procedure site" or one of its subtypes (e.g. "procedure site - direct") SHALL be applied and its value SHALL be "hip region structure" or one of its subtypes. The attribute "laterality" MAY be applied and if present its value SHALL be a subtype of "side" but SHALL NOT be "side" itself.

!

This concept is prohibited and SHALL NOT be used.

71388002 | procedure |: 363704007 | procedure site | =( 29836001 | hip region structure |: ! 272741003 | laterality | )
The procedure site SHALL be the value "hip region structure" and SHALL NOT include the attribute "laterality".
Note: This example conflicts with the SNOMED CT compositional grammar as no value is supplied for the laterality attribute. Since the laterality attribute is not permitted, it makes no sense to provide a value. Alternatively, a dummy value could be provided but it has been omitted here and in the examples in this document as it would decrease rather than enhance clarity.

! <

This concept and all its subtypes are prohibited and SHALL NOT be used.

71388002 | procedure |: 363704007 | procedure site | =( 29836001 | hip region structure |: 272741003 | laterality | = ! < 66459002 | unilateral | )
The procedure site SHALL be the value "hip region structure" and MAY include the attribute "laterality". The value of "laterality" SHALL NOT be "unilateral" or a subtype of "unilateral".

! ^

The identifier (and optional text) that follows refers to a subset (or Refset) and NO member of that set is permitted.

! ^ 10301000003139 | Clinical Exclusions |
A concept that is a member of the 'Clinical exclusions' subset SHALL NOT be used.
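The concept-level symbols in this table can be read as tests of a candidate concept against a target. The Python sketch below illustrates that reading over a toy hierarchy. The hierarchy, the refset and the function names are assumptions made for the example, and the "! <" case follows the wording of the table row above, which prohibits the concept itself as well as its subtypes. The ~ symbol is omitted because it applies to attribute names rather than to concept values.

```python
# Toy is-a hierarchy (child -> direct parents) and refset.
# Labels are placeholders for illustration, not SNOMED CT release data.
PARENTS = {
    "side": [],
    "left": ["side"],
    "right": ["side"],
    "right and left": ["side"],
}
REFSETS = {"lateral sides": {"left", "right"}}

def ancestors(concept):
    """Return every supertype of a concept, excluding the concept itself."""
    seen, stack = set(), list(PARENTS.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(PARENTS.get(parent, []))
    return seen

def matches(symbol, target, candidate):
    """Test a candidate concept against a constraint symbol and its target."""
    is_self = candidate == target
    is_subtype = target in ancestors(candidate)
    if symbol == "":       # this concept only
        return is_self
    if symbol == "<<":     # this concept or any subtype
        return is_self or is_subtype
    if symbol == "<":      # any subtype, but not the concept itself
        return is_subtype
    if symbol == "!":      # this concept is prohibited
        return not is_self
    if symbol == "! <":    # per the table: concept and all subtypes prohibited
        return not (is_self or is_subtype)
    if symbol == "^":      # any member of the named refset
        return candidate in REFSETS[target]
    if symbol == "! ^":    # no member of the named refset
        return candidate not in REFSETS[target]
    raise ValueError(f"unknown constraint symbol: {symbol!r}")
```

For example, `matches("<", "side", "left")` holds while `matches("<", "side", "side")` does not, mirroring the "subtype of side but not side itself" rule.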



Table 102. Compositional Grammar Extension – Constrainable elements
Element
Notes and examples

ConceptId

A constraint symbol MAY directly precede a ConceptId. In this case, it requires, allows, or prohibits use of the referenced concept (and/or subtypes of that concept) in that logical position in the expression.
Unless otherwise stated, the comparison between an instance expression and a constraint assumes both are transformed to normal forms before testing.
For example, the following constraint:
71388002 | procedure |:
260686004 | method | = << 129304002 | excision - action |
permits expressions such as | cholecystectomy | or 80146002 | appendectomy | because the concepts | cholecystectomy | and 80146002 | appendectomy | are defined in SNOMED CT release data as subtypes of 71388002 | procedure | with 260686004 | method | = 129304002 | excision - action |.

Attribute Name

A constraint symbol MAY directly precede the ConceptId that specifies the name of an attribute. In this case it requires, allows or prohibits use of that attribute (or a subtype of that attribute). Unless the use of the attribute is prohibited, the value of that attribute MAY be separately constrained.
The following example asserts that the attribute "procedure site" or one of its subtypes (e.g. "procedure site - direct") SHALL be applied and its value SHALL be "hip region structure" or one of its subtypes.
71388002 | procedure |: << 363704007 | procedure site | = << 29836001 | hip region structure |

Nested Expression

A constraint symbol may directly precede an expression enclosed in parentheses. In this case, it requires, allows or prohibits inclusion of the parenthesised expression (and/or subtypes of that expression) in that logical position in the expression.
Note: From a human-readable perspective it is clearer to specify individual constraints on the elements within the nested expression, rather than to apply a constraint to the nested expression as a whole. However, the nested form avoids repetition of common elements and, in the XML representation provides a more flexible, efficient approach.

Attribute Group

A constraint symbol MAY directly precede an attribute group. In this case, it requires, allows or prohibits inclusion of the specified group (and/or subtypes of that group) in that logical position in the expression.
The following example asserts that the group shown or a subtype of that group must be present. Thus this will include any abdominal excision.
71388002 | procedure |: << 260686004 | method | = 129304002 | excision - action |, 405813007 | procedure site - Direct | = 113345001 | abdominal structure |

Other

The constraints cannot be used elsewhere in the expression. In particular, a constraint cannot be applied to a refinement as a whole or to a display name. Therefore, the constraint symbols cannot immediately follow the concept identifier, nor can they precede the pipe ("|") or colon (":") symbols.

Table 103. Grammar Extension - Logical constraint combinations
 Symbol
Notes
Examples

OR

Where two or more values are permitted, the set of conditions and the individual expressions SHALL both be enclosed in standard curved brackets () and the word "OR" SHALL be placed between the expressions.

71388002 | procedure |:
363704007 | procedure site | = ( 29836001 | hip region structure |:
~ 272741003 | laterality | =
( 7771000 | left | )
OR ( 24028007 | right | ) )
The procedure site SHALL be the value "hip region structure" and MAY include the attribute "laterality". The value of "laterality" SHALL be either "left" or "right".

AND

Where two or more conditions are both required to apply, the individual expressions SHALL be enclosed in standard curved brackets and the word "AND" SHALL be placed between the expressions. ((exp1) AND (exp2))

71388002 | procedure |:
363704007 | procedure site | = ( 29836001 | hip region structure |:
~ 272741003 | laterality | =
( < 182353008 | side | )
AND ( ! << 51440002 | right and left | ) )
The procedure site SHALL be the value "hip region structure" and MAY include the attribute "laterality". The value of "laterality" SHALL be a subtype of "side" AND SHALL NOT be "right and left" (i.e. bilateral) or a subtype of "right and left".
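Since OR and AND are described earlier in this annex as union and intersection of permitted value-sets, the two examples in this table can be sketched with Python sets. The sets below are toy stand-ins with placeholder labels, exclusion is modelled as a complement within a small assumed universe, and "right and left" is assumed to have no subtypes.

```python
# Toy value-sets; labels are placeholders, not SNOMED CT release data.
SUBTYPES_OF_SIDE = {"left", "right", "right and left"}  # < side
UNIVERSE = SUBTYPES_OF_SIDE | {"side"}

def excluded(value_set):
    """'!' as the logical complement within the toy universe."""
    return UNIVERSE - value_set

# ( left ) OR ( right ): union of the permitted value-sets.
left_or_right = {"left"} | {"right"}

# ( < side ) AND ( ! << right and left ): intersection of the two conditions.
unilateral_sides = SUBTYPES_OF_SIDE & excluded({"right and left"})
```

In this toy universe both constraints yield the same permitted values, "left" and "right", although they are stated differently.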