
Correct the regex published for the ARCHETYPE_ID type

There is currently a problem with the published regex for the ARCHETYPE_ID class in the openEHR support information model. It does not deal correctly with case, version ids, or special characters. In addition, the explanatory text in section 4.2.2 does not correspond to the current reality of openEHR identifiers.





Thomas Beale

Raised By

Eric Browne
John Arnett
Peter Gummer

Change Description

The proposed solution is as below, and has the following features:
- the characters #, ( and ) are not allowed
- no part (i.e. any section between '-' or '.' delimiters) can be only 1 character long
- no name part can start with a digit
- upper case is allowed in the alphabetic parts of the id
- the version part of the id is limited to numbers (i.e. no letters)
- a leading '0' is not allowed in the version identifier

The grammar published in the Support IM becomes:

    # -------- production rules --------
    archetype_id: qualified_rm_entity '.' domain_concept '.' version_id
    qualified_rm_entity: rm_originator '-' rm_name '-' rm_entity
    rm_originator: V_ALPHANUMERIC_NAME
    rm_name: V_ALPHANUMERIC_NAME
    rm_entity: V_ALPHANUMERIC_NAME
    domain_concept: concept_name { '-' specialisation }*
    concept_name: V_ALPHANUMERIC_NAME
    specialisation: V_ALPHANUMERIC_NAME
    version_id: 'v' V_NONZERO_DIGIT [ V_NUMBER ]

    # -------- lexical patterns --------
    V_ALPHANUMERIC_NAME: [a-zA-Z][a-zA-Z0-9_]+
    V_NONZERO_DIGIT: [1-9]
    V_NUMBER: [0-9]+

The PERL regular expression equivalent of the above is as follows:

    [a-zA-Z]\w+(-[a-zA-Z]\w+){2}\.[a-zA-Z]\w+(-[a-zA-Z]\w+)*\.v[1-9]\d*

The classic regular expression equivalent is generated from the above with the following substitutions:

    \w -> [a-zA-Z0-9_]
    \d -> [0-9]

The explanatory text is improved as follows:
- move the explanatory material next to the grammar up to a dedicated subsection under section 4.2.2 Composite Identifiers;
- improve the text under section 4.2.2 generally;
- add a warning that the non-conforming .v1draft form of the identifier may still need to be supported.
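
To illustrate how the proposed pattern behaves, here is a minimal Python sketch. It is an illustration, not part of the specification: the function name is_valid_archetype_id and the sample ids are assumptions chosen for the example. It uses the classic form of the regex (with \w and \d expanded), because Python's \w also matches non-ASCII letters by default, and it matches against the whole string because the published regex carries no anchors of its own.

```python
import re

# Lexical pattern V_ALPHANUMERIC_NAME from the grammar above:
# a letter followed by at least one letter, digit or underscore.
NAME = r"[a-zA-Z][a-zA-Z0-9_]+"

# Classic (ASCII-only) form of the proposed regex:
# three '-'-separated names, '.', concept name with optional
# '-'-separated specialisations, '.', then 'v' and a number
# with no leading zero.
ARCHETYPE_ID_RE = re.compile(
    rf"{NAME}(-{NAME}){{2}}\.{NAME}(-{NAME})*\.v[1-9][0-9]*"
)


def is_valid_archetype_id(candidate: str) -> bool:
    """Return True if the whole string conforms to the proposed grammar.

    Hypothetical helper for illustration only.
    """
    return ARCHETYPE_ID_RE.fullmatch(candidate) is not None


if __name__ == "__main__":
    # Sample ids, assumed for illustration.
    samples = [
        "openEHR-EHR-OBSERVATION.blood_pressure.v1",
        "openEHR-EHR-OBSERVATION.blood_pressure.v1draft",
        "openEHR-EHR-OBSERVATION.blood_pressure.v01",
        "openEHR-EHR-OBSERVATION.x.v1",
    ]
    for sample in samples:
        print(sample, is_valid_archetype_id(sample))
```

Under this sketch, ids ending in .v1draft are rejected, so a tool that still has to accept the deprecated form would need to handle it separately.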

Impact Analysis

The impact should be minimal, since the grammar and regex changes above conform to all known archetypes, other than deprecated ones with '.v1draft' at the end. A warning about these identifiers has been added to the spec, and a separate guideline on fixing this situation will be published on openEHR.


Thomas Beale
Eric Browne
Peter Gummer

