Simplified Serial Formats - Data Types

At the SEC virtual meeting of 25-27 Mar 2020, we agreed to produce a simpler specification for the ‘simplified data template’ format. As part of that work, we need stringified forms of the data types.

Starting from the EtherCIS and Marand Web Template formats, and also taking into account the ‘standard’ serialisations used in ODIN (and thus in ADL), the table below proposes a global openEHR specification for this need.

The data types below are grouped into Foundation, Base and RM (i.e. DV types).

The columns are as follows:

  • Preferred compact format: compact form derived from ODIN, ADL, EtherCIS and Marand, with JSON wrapping where needed;

  • Marand Web Template: an allowed flavour, given the number of users of EhrScape;

  • Standard JSON: always allowed.

Column content convention: each column should contain a Pattern or JSON-Type (in black), an example (in teal, or as a JSON formatted block) and optionally also comment(s) (in red).

A few questions need to be addressed with these formats:

  • how will a given JSON template text indicate which flavour it is in?

  • can one template contain more than one format, e.g. Compact or MWT mixed with standard JSON?

An assumption made in the table is that some of the formats are not self-describing - a data instance needs to be read with a schema available. This is what would allow “5” to be read as either an Integer or a DvCount, depending on the schema at that location. If we want the data to be self-describing we would need a few more syntax differences.
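A minimal sketch of what schema-dependent reading could look like is shown below. The paths, the `SCHEMA` mapping and the `DvCount` class are hypothetical names introduced for illustration only; they are not taken from any openEHR library.

```python
from dataclasses import dataclass


@dataclass
class DvCount:
    """Illustrative stand-in for the RM DV_COUNT type."""
    magnitude: int


# Hypothetical schema: maps a location in the template to the expected type.
SCHEMA = {
    "/example/plain_integer": "Integer",
    "/example/count": "DV_COUNT",
}


def read_value(path: str, raw: str):
    """Interpret the same raw text differently depending on the schema."""
    expected = SCHEMA[path]
    if expected == "Integer":
        return int(raw)
    if expected == "DV_COUNT":
        return DvCount(magnitude=int(raw))
    raise ValueError(f"unsupported type {expected!r} at {path}")


print(read_value("/example/plain_integer", "5"))
print(read_value("/example/count", "5"))
```

Without the schema lookup, the reader has no way to choose between the two results, which is the sense in which the format is not self-describing.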

 

Name

Preferred compact format

Marand Web Template
(allowed alternative)

Standard openEHR JSON
(canonical, not simplified)
(also legal)

Primitive types

Boolean

true | false

true

true | false

true | false (Boolean, see schema info)

true

Integer

Integer

12

 

Integer (see schema info)

12

Real

Real (Number in JSON)

12.0
39.4

 

Number (see schema info)

12.0
39.4

Character

String

“c”

 

String (see schema info)

“c”

String

String

“a string”

 

String (see schema info)

“a string”

Uri

“scheme:xyz” (String following RFC 3986)

“http://openEHR.org/home”

 

“scheme:xyz” (String following RFC 3986, see schema info)

“http://openEHR.org/home”

Terminology_code

(=CODE_PHRASE)

“[terminology_id::code_string]” (String)

“[icd10AM::F60.1]”
“[snomed_ct(3.1)::3415004]”

{ "|code": "238", "|terminology": "openehr" }
{ "terminology_id": "icd10AM", "code_string": "F60.1" }

Terminology_term
(=DV_CODED_TEXT ?)

“[terminology_id::code_string|value|]” (String)

“[icd10AM::F60.1|Schizoid personality disorder|]”

“[snomed_ct(3.1)::3415004|cyanosis|]”

{ "|code": "238", "|value": "other care", "|terminology": "openehr" }

 

Iso8601_date

“YYYY-MM-DD”

“2020-04-01”

 

“YYYY-MM-DD”

“2020-04-01”

Iso8601_date_time

“YYYY-MM-DDTHH:MM:SS[.SSS][+HH:MM]”

“2020-04-01T13:45:00”

"2017-09-14T15:44:57.722+03:00"

 

“YYYY-MM-DDTHH:MM:SS.SSS+HH:MM”

“2016-06-23T13:42:16.117+02:00”

Iso8601_time

“HH:MM:SS[.SSS][+HH:MM]”

“13:45:00”

"15:44:57.722+03:00"

 

“HH:MM:SS”

“13:45:00”

S.I.: I think timezone and msec are also supported

Iso8601_duration

“P\dW | P[\dY][\dM][\dD][T[\dH][\dM][\d[.\d]S]]”

“P2Y4M10D”

“P1DT3H”

“PT2H5M0S”

 

“P\dW | P[\dY][\dM][\dD][T[\dH][\dM][\d[.\d]S]]”

“P2Y4M10D”

“P1DT3H”

“PT2H5M0S”
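The duration pattern in the table is informal, so one possible reading of it as a regular expression is sketched below. It assumes that each numeric field may have more than one digit and that the designators are matched case-sensitively, exactly as written in the pattern; `parse_duration` is a hypothetical helper name.

```python
import re
from typing import Optional

# Either the week form (P<n>W) or the date/time form, as in the table's pattern.
DURATION_RE = re.compile(
    r"P(?:(?P<weeks>\d+)W"
    r"|(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?)"
)


def parse_duration(text: str) -> Optional[dict]:
    """Return the fields present in the duration, or None if it is invalid."""
    match = DURATION_RE.fullmatch(text)
    if match is None or text.endswith("T"):
        return None  # no match, or a 'T' with no time fields after it
    fields = {
        name: (float(value) if name == "seconds" else int(value))
        for name, value in match.groupdict().items()
        if value is not None
    }
    return fields or None  # a bare "P" carries no fields and is rejected


print(parse_duration("P2Y4M10D"))
print(parse_duration("P1DT3H"))
```

Dates, times and date/times in the forms listed above can usually be checked with the `fromisoformat` methods of Python's `datetime` module, so only the duration needs a hand-written pattern in this sketch.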

Container of Primitive

List<String>

Array (of strings)

["cyan", "magenta", "yellow", "black"]

 

Array (of strings) (see schema info)

["cyan", "magenta", "yellow", "black"]

List<Integer>

Array (of integers)

[1, 1, 2, 3, 5]

 

Array (of integers) (see schema info)

[1, 1, 2, 3, 5]

List<Terminology_code>

Array of ODIN coded term (see syntax above)

[“[icd10AM::F60.1]”, “[icd10AM::F64.2]”]

 

Hash<K, V>

(standard JSON) Object

 

(standard JSON) Object (see schema info)

Interval of Ordered Primitive

Interval<Integer>

ODIN interval syntax

“|0..5|”

“|>=0|”

 

Interval<Real>

ODIN interval syntax

“|0.0..1000.0|”

“|0.0..<1000.0|”

“|5.0 ±0.5|”

“|5.0 +/-0.5|”

 

Interval<Iso8601_date>

ODIN interval syntax

“|>=1939-02-01|”

“|1927-05-17..2008-01-18|”

 

Interval<Iso8601_time>

ODIN interval syntax

“|08:02..09:10|”
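As an illustration of how the ODIN interval strings for Integer and Real might be read, the sketch below handles the two-sided, one-sided and tolerance forms shown in the examples. It is not a full ODIN parser: date and time bounds are not handled, and the `Interval` class and `parse_interval` function are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    lower: Optional[float]
    upper: Optional[float]
    lower_included: bool = True
    upper_included: bool = True


NUM = r"[-+]?\d+(?:\.\d+)?"
RANGE_RE = re.compile(rf"\|(>)?({NUM})\.\.(<)?({NUM})\|")
ONE_SIDED_RE = re.compile(rf"\|(>=|<=|>|<)({NUM})\|")
TOLERANCE_RE = re.compile(rf"\|({NUM})\s*(?:±|\+/-)\s*({NUM})\|")


def parse_interval(text: str) -> Interval:
    """Parse a numeric ODIN interval given without surrounding quotes."""
    m = RANGE_RE.fullmatch(text)
    if m:
        # A leading '>' or '<' marks the corresponding bound as excluded.
        return Interval(float(m[2]), float(m[4]), m[1] is None, m[3] is None)
    m = ONE_SIDED_RE.fullmatch(text)
    if m:
        op, value = m[1], float(m[2])
        if op.startswith(">"):
            return Interval(value, None, op == ">=", False)
        return Interval(None, value, False, op == "<=")
    m = TOLERANCE_RE.fullmatch(text)
    if m:
        centre, delta = float(m[1]), float(m[2])
        return Interval(centre - delta, centre + delta)
    raise ValueError(f"not a recognised numeric interval: {text!r}")


print(parse_interval("|0..5|"))
print(parse_interval("|0.0..<1000.0|"))
```

An unbounded side is represented here as `None`, with its "included" flag set to false; that is a design choice of this sketch rather than something the table prescribes.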

 

 

 

 

 

RM DV types

CODE_PHRASE

“[terminology_id::code_string]” (minimal)
“[terminology_id(ver_id)::code_string|value|]” (maximal)
(specify with regex or BNF/Antlr)

“[snomed_ct::313267000]”

“[snomed_ct::313267000|Stroke|]”

S.I.: there is no support for ver_id & value
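Since the compact CODE_PHRASE form is still to be specified with a regex or grammar, one candidate regular expression is sketched below. The character classes (for example, which characters may appear in a terminology id or code string) are assumptions made for this example and would need to be agreed; `parse_coded` is a hypothetical helper name.

```python
import re
from typing import Optional

# [terminology_id(ver_id)::code_string|value|], with ver_id and value optional.
CODED_RE = re.compile(
    r"\[(?P<terminology_id>[^\[\]():|]+)"
    r"(?:\((?P<ver_id>[^()]+)\))?"
    r"::(?P<code_string>[^\[\]|]+)"
    r"(?:\|(?P<value>[^|]*)\|)?\]"
)


def parse_coded(text: str) -> Optional[dict]:
    """Split a compact coded string into its parts; None if it does not match."""
    match = CODED_RE.fullmatch(text)
    return match.groupdict() if match else None


print(parse_coded("[snomed_ct::313267000]"))
print(parse_coded("[snomed_ct(3.1)::3415004|cyanosis|]"))
```

The same expression covers both the minimal and the maximal form, so a reader can tell a bare code from a coded term by checking whether `value` is present.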

DV_TEXT

CHECK - this will only work with schema
“Stroke”

 

DV_CODED_TEXT

(Expressed as DV_CODED_TEXT attributes)

“[defining_code.terminology_id::defining_code.code_string|value|]”
“[snomed_ct::313267000|Stroke|]”

 

DV_ORDINAL

“1|[snomed_ct::313267000|Stroke|]”

 

DV_QUANTITY

“value,unit” (with the number of decimals in value indicating precision)
“78.500,kg”

 

DV_PROPORTION

“numerator, denominator, proportion_kind”, where proportion_kind is one of RATIO, UNITARY, PERCENT, FRACTION or INTEGER_FRACTION and corresponds to the type attribute of DV_PROPORTION

“25.3, 100, PERCENT”
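The compact DV_QUANTITY and DV_PROPORTION strings could be read along the following lines. This is a sketch under stated assumptions: the function names and the returned dictionary keys are hypothetical, and `Decimal` is used only so that trailing zeros in the value survive and the precision can be derived from them.

```python
from decimal import Decimal

PROPORTION_KINDS = ("RATIO", "UNITARY", "PERCENT", "FRACTION", "INTEGER_FRACTION")


def parse_quantity(text: str) -> dict:
    """Parse 'value,unit'; precision is the number of decimals written."""
    value_text, units = text.split(",", 1)
    magnitude = Decimal(value_text.strip())
    # The exponent of "78.500" is -3, i.e. three decimal places were written.
    precision = max(0, -magnitude.as_tuple().exponent)
    return {"magnitude": magnitude, "precision": precision, "units": units.strip()}


def parse_proportion(text: str) -> dict:
    """Parse 'numerator, denominator, proportion_kind'."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated fields: {text!r}")
    numerator, denominator, kind = parts
    if kind not in PROPORTION_KINDS:
        raise ValueError(f"unknown proportion kind: {kind!r}")
    return {
        "numerator": float(numerator),
        "denominator": float(denominator),
        "kind": kind,
    }


print(parse_quantity("78.500,kg"))
print(parse_proportion("25.3, 100, PERCENT"))
```

Splitting on the first comma only keeps the quantity parser tolerant of units that themselves contain a comma, although whether such units should be allowed is a question for the specification.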

 

DV_EHR_URI

“scheme:xyz”

“http://openEHR.org/home”

 

DV_COUNT

CHECK - bare integer will only work with schema.

“5”

 

DV_DATE

CHECK - will only work with schema

“2016-06-23”

 

DV_DATE_TIME

CHECK - will only work with schema

“2016-06-23T13:42:16.117+02:00”

 

DV_TIME

CHECK - will only work with schema

“13:42:16”

 

DV_DURATION

CHECK - will only work with schema

“PT2H5M0S”

 

DV_MULTIMEDIA

Std JSON structure with compressed items internally

Same as preferred format but with MWT bars

 

DV_PARSABLE

CHECK
{“value”, “formalism”}

T.B.: Std JSON might be better

S.I.: switching: {“formalism”: “value”}

{"Hello world!", "text/plain"}

 

Standard JSON with MWT bars added

 

DV_IDENTIFIER

Standard JSON

Standard JSON with MWT bars added

Standard JSON

DV_SCALE

“1.0|[term::code|symbol|]”

 

DV_BOOLEAN

true | false

 

Interval of RM DV types

DV_INTERVAL
<DV_QUANTITY>

 

 

 

 

 

 

 

RM Common types

PARTICIPATION

Propose:

 

PARTY_IDENTIFIED

 

 

 

 

 

 

 

Higher Level Structures

Ian: example from the EhrScape documentation.

2.2.4. Node structure for ANY (CHOICE) elements

When an element in a template is defined so that it allows multiple data types, multiple nodes are also created in the web template. Node ids for ANY elements are generated by taking the type name after the initial DV_ and adding a _value suffix. For example, for a choice element allowing DV_CODED_TEXT and DV_QUANTITY, the following nodes will be generated:

  • coded_text_value

  • quantity_value

Complete node structure will be as follows:
A sample node with choice:

The following table lists RM type names and resulting web template ids:

rm type name      web template node id

DV_CODED_TEXT     coded_text_value
DV_TEXT           text_value
DV_MULTIMEDIA     multimedia_value
DV_PARSABLE       parsable_value
DV_STATE          state_value
DV_BOOLEAN        boolean_value
DV_IDENTIFIER     identifier_value
DV_URI            uri_value
DV_EHR_URI        ehr_uri_value
DV_DURATION       duration_value
DV_QUANTITY       quantity_value
DV_COUNT          count_value
DV_PROPORTION     proportion_value
DV_DATE_TIME      date_time_value
DV_DATE           date_value
DV_TIME           time_value
DV_ORDINAL        ordinal_value
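The naming rule described above (drop the initial DV_, add a _value suffix) can be sketched as a small function. Lower-casing the remainder is inferred from the ids listed in the table rather than stated in the text, and `web_template_node_id` is a hypothetical name.

```python
def web_template_node_id(rm_type_name: str) -> str:
    """Derive the web template node id for a choice (ANY) element type."""
    prefix = "DV_"
    if not rm_type_name.startswith(prefix):
        raise ValueError(f"expected a DV_ type name, got {rm_type_name!r}")
    # Strip the prefix, lower-case the rest, and append the suffix.
    return rm_type_name[len(prefix):].lower() + "_value"


for name in ("DV_CODED_TEXT", "DV_QUANTITY", "DV_EHR_URI"):
    print(name, "->", web_template_node_id(name))
```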