  1. SPEC-260

Correct the regex published for the ARCHETYPE_ID type

    Details

    • Type: Change Request
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Release 1.0.2
    • Component/s: openehr.rm.support
    • Labels:
      None
    • Change Description:
      The proposed solution is given below. It has the following features:
      - the characters #, ( and ) are not allowed
      - no part (i.e. any section between '-' or '.' delimiters) can be only 1 character long
      - no name part can start with a digit
      - upper case is allowed in the alphabetic parts of the id
      - the version part of the id is limited to numbers (i.e. no letters)
      - a leading '0' is not allowed in the version identifier


      The grammar published in the Support IM becomes:

      # -------- production rules --------
      archetype_id: qualified_rm_entity '.' domain_concept '.' version_id
      qualified_rm_entity: rm_originator '-' rm_name '-' rm_entity
      rm_originator: V_ALPHANUMERIC_NAME
      rm_name: V_ALPHANUMERIC_NAME
      rm_entity: V_ALPHANUMERIC_NAME
      domain_concept: concept_name { '-' specialisation }*
      concept_name: V_ALPHANUMERIC_NAME
      specialisation: V_ALPHANUMERIC_NAME
      version_id: 'v' V_NONZERO_DIGIT [ V_NUMBER ]
      # -------- lexical patterns --------
      V_ALPHANUMERIC_NAME: [a-zA-Z][a-zA-Z0-9_]+
      V_NONZERO_DIGIT: [1-9]
      V_NUMBER: [0-9]+
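
      As an illustration only, the production rules above can be followed step by step in a few lines of Python. This is a minimal sketch, not part of the specification: the function name parse_archetype_id, the dictionary keys and the sample identifiers are assumptions made for the example.

```python
import re

# Lexical patterns copied from the grammar above.
V_ALPHANUMERIC_NAME = re.compile(r"[a-zA-Z][a-zA-Z0-9_]+")
# version_id: 'v' V_NONZERO_DIGIT [ V_NUMBER ]
VERSION_ID = re.compile(r"v[1-9][0-9]*")


def parse_archetype_id(value):
    """Split an archetype id into its grammar parts; return None if it is invalid.

    Hypothetical helper written for this example only.
    """
    # archetype_id: qualified_rm_entity '.' domain_concept '.' version_id
    sections = value.split(".")
    if len(sections) != 3:
        return None
    qualified_rm_entity, domain_concept, version_id = sections

    # qualified_rm_entity: rm_originator '-' rm_name '-' rm_entity
    rm_parts = qualified_rm_entity.split("-")
    if len(rm_parts) != 3:
        return None

    # domain_concept: concept_name { '-' specialisation }*
    concept_parts = domain_concept.split("-")

    # Every name part must be a V_ALPHANUMERIC_NAME (at least 2 characters,
    # starting with a letter).
    if not all(V_ALPHANUMERIC_NAME.fullmatch(p) for p in rm_parts + concept_parts):
        return None
    if not VERSION_ID.fullmatch(version_id):
        return None

    return {
        "rm_originator": rm_parts[0],
        "rm_name": rm_parts[1],
        "rm_entity": rm_parts[2],
        "concept_name": concept_parts[0],
        "specialisations": concept_parts[1:],
        "version": int(version_id[1:]),
    }
```

      With this sketch, an identifier such as openEHR-EHR-OBSERVATION.blood_pressure.v1 is split into its parts, while forms ending in .v1draft or .v01 are rejected.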

      The Perl regular expression equivalent of the above is as follows:
      [a-zA-Z]\w+(-[a-zA-Z]\w+){2}\.[a-zA-Z]\w+(-[a-zA-Z]\w+)*\.v[1-9]\d*

      The classic regular expression equivalent of this is generated from the above with the following substitutions:
      \w -> [a-zA-Z0-9_]
      \d -> [0-9]
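
      The substitution can be checked mechanically. The following Python sketch derives the classic form from the Perl form and uses it to validate identifiers; the names PERL_FORM, CLASSIC_FORM and is_valid_archetype_id are assumptions made for this example. Note that in Python 3 \w and \d match Unicode word characters and digits by default, so the classic form is the safer choice there.

```python
import re

# The Perl-style expression exactly as published above.
PERL_FORM = r"[a-zA-Z]\w+(-[a-zA-Z]\w+){2}\.[a-zA-Z]\w+(-[a-zA-Z]\w+)*\.v[1-9]\d*"

# Apply the two substitutions listed above to obtain the classic form.
CLASSIC_FORM = PERL_FORM.replace(r"\w", "[a-zA-Z0-9_]").replace(r"\d", "[0-9]")

ARCHETYPE_ID = re.compile(CLASSIC_FORM)


def is_valid_archetype_id(value):
    """Return True if the whole string conforms to the classic-form regex."""
    # fullmatch anchors the pattern at both ends of the string.
    return ARCHETYPE_ID.fullmatch(value) is not None
```

      Because the expression as published is not anchored, the sketch uses fullmatch so that trailing text such as the "draft" in .v1draft causes the identifier to be rejected rather than partially matched.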


      The explanatory text is improved as follows:
      - move the explanatory material next to the grammar up to a dedicated subsection under section 4.2.2 Composite Identifiers;
      - improve the text under section 4.2.2 generally;
      - add a warning that the non-conforming .v1draft form of the identifier may still need to be supported.
    • Impact Analysis:
      The impact should be minimal, since the grammar and regex changes above conform to all known archetypes other than deprecated ones with '.v1draft' at the end. A warning about these identifiers has been added to the specification, and a separate guideline on fixing this situation will be published on openEHR.

      Description

      There is currently a problem with the published regex for the ARCHETYPE_ID class in the openEHR support information model. It does not deal correctly with case, version ids, or special characters. In addition, the explanatory text in section 4.2.2 does not correspond to the current reality of openEHR identifiers.

            People

            • Assignee:
              Unassigned
              Reporter:
              thomas.beale Thomas Beale
              Raised By:
              Eric Browne, John Arnett, Peter Gummer
              Analyst:
              Eric Browne, Peter Gummer, Thomas Beale
            • Votes:
              1 Vote for this issue