Current State of Data Tagging Overview

Responding to customers' needs, vendors have implemented functionality for decorating data in the EHR with additional “tags” that encode non-structured, non-clinical (technical) context that need not be versioned. This is not yet part of the openEHR standard and has been introduced by vendors with barely any coordination, so the implementations are incompatible. We hope to unify our approaches, at least at the level of their “least common denominator”, and to standardise this aspect of CDRs. Vendors might still add their own bells and whistles to the standardised bare minimum, but the basic idea would be uniform.

The first step is to gather information on existing implementations, which is the purpose of this page.

Better (ex Marand)

Our CDR can work with tags on any AQL-path-addressable element from a COMPOSITION downward. Technically, a tag has two fields that address the target element (the full COMPOSITION UID and an AQL path) and two content fields (the tag name and an optional tag value, both strings).
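
The four fields described above can be pictured as a small record. This is only a sketch: the class name, the field names, the sample UID and the sample archetype path are illustrative assumptions, not Better's actual schema, and treating "/" as the path of the whole composition is inferred from the tags(c) output shown further down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tag:
    """One tag; field names are illustrative assumptions, not a vendor schema."""
    composition_uid: str          # full COMPOSITION UID (addresses the version)
    aql_path: str                 # AQL path of the tagged element
    name: str                     # tag name
    value: Optional[str] = None   # optional tag value


# Hypothetical UID and path, for illustration only.
uid = "8849182c-82ad-4088-a07f-48ead4180515::example.domain::1"
whole = Tag(uid, "/", "formName", "Diagnosis of diabetes - LDL order")
flag = Tag(uid, "/content[openEHR-EHR-OBSERVATION.lab_test.v1]", "reviewed")
```

The second tag has no value, which is the "optional tag value" case.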

Tagging can be done in two ways.

  • Tags (with optional values) can be part of the contributed composition in our proprietary “flat JSON” format under the ctx/tags field. The canonical format has no place for tags, obviously.

  • Via a proprietary REST API; see /tagging under Electronic Health Record APIs on https://www.ehrscape.com/api-explorer.html. Adding a tag this way does not create a new version of the composition, and non-current versions can be tagged as well.

Tags can be used in multiple ways:

  • Tags on a composition can be retrieved via that same API.

  • Objects tagged with specific tags can be retrieved via that same API; the search can constrain the tag name only, or the value as well if required. The CDR can be instructed to (a) search only among the current versions of compositions, (b) return only the most recent version of each object that has the requested tag(s), which may or may not be the current (i.e. generally the most recent) version, (c) return all tagged versions, or (d) return only the current version of any object that ever had a requested tag, even if the current version no longer has it.

  • AQL has been extended with a function tags(c) that returns a list of the composition's tags (including values and AQL paths, if they have them), like this:

    <te:tag_with_value> <te:aqlPath>/</te:aqlPath> <te:tag>formName</te:tag> <te:value>Diagnosis of diabetes - LDL order</te:value> </te:tag_with_value>

    (te is our proprietary namespace http://schemas.marand.com/thinkehr/v1)
    This function can, at the moment, only be used in the projection (SELECT) part of the AQL.

  • AQL has been extended with a keyword TAGGED BY that can be used in WHERE clauses, at the moment only for entire compositions: … WHERE c TAGGED BY 'key::value'. The double colon and value can be omitted to filter by tag name (key) only. We propose that this approach not be standardised and that the function tags be used in this context instead, to avoid modifying the grammar; we believe DIPS does filtering this way.
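
The 'key::value' filter string used with TAGGED BY can be read as sketched below. The function names are hypothetical, and representing a composition's tags as a name-to-value dictionary (one value per tag name) is a simplifying assumption; the sketch does not reflect how the CDR evaluates the clause internally.

```python
from typing import Dict, Optional, Tuple


def parse_tag_filter(expr: str) -> Tuple[str, Optional[str]]:
    """Split 'key::value' into (key, value); value is None when '::value' is omitted."""
    key, sep, value = expr.partition("::")
    return key, (value if sep else None)


def tagged_by(tags: Dict[str, Optional[str]], expr: str) -> bool:
    """True if the tags satisfy the filter: name only, or name and value."""
    key, value = parse_tag_filter(expr)
    if key not in tags:
        return False
    return value is None or tags[key] == value


# Hypothetical composition-level tags.
tags = {"formName": "Diagnosis of diabetes - LDL order", "reviewed": None}
by_name = tagged_by(tags, "formName")
by_value = tagged_by(tags, "formName::Something else")
```

Here by_name is true because only the tag name is constrained, while by_value is false because the value differs.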

Code24

What we have:

  • a tag is a key-value pair associated with the latest version of a composition (or party, or folder); tag values are mandatory

  • tags are not stored in the version’s data (so not in the composition) but instead are associated with the versioned_object (of that composition)

  • tags are used as indexes; their values are auto-extracted from pre-configured paths every time a version is committed; in theory we could also deal with explicit values (instead of extracted ones), but this functionality is not used

  • tags are not folders

  • we also have labels, which in comparison with tags are just values associated with the versioned_object; these labels have to be set explicitly (they are not auto-extracted like the tags above)

  • tags and values are returned, and can be set, as metadata in the REST API of the specific resource

  • when querying with our query engine we use tags and labels in a similar way to Better above, although the syntax differs; it is equivalent to … WHERE tags.key = 'value' as well as … WHERE tags.key LIKE 'val%'. This can be seen as a reserved operand prefix.
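
The auto-extraction behaviour listed above (tags kept on the versioned object and re-derived from pre-configured paths on every commit) could look roughly like this. Everything here is an assumption made for illustration: the class and function names, the configured tag keys, and the slash-separated dictionary paths, which stand in for real paths into a composition.

```python
from typing import Any, Dict, List, Optional

# Hypothetical configuration: tag key -> path into the committed data.
TAG_PATHS: Dict[str, str] = {
    "EpisodeOfCareId": "context/episode_id",
    "FormName": "name/value",
}


def resolve(data: Any, path: str) -> Optional[Any]:
    """Walk a slash-separated path through nested dicts; None if any step is missing."""
    node = data
    for step in path.split("/"):
        if not isinstance(node, dict) or step not in node:
            return None
        node = node[step]
    return node


def extract_tags(composition: Dict[str, Any]) -> Dict[str, str]:
    """Build the tag set from the configured paths, skipping paths with no value."""
    tags: Dict[str, str] = {}
    for key, path in TAG_PATHS.items():
        value = resolve(composition, path)
        if value is not None:
            tags[key] = str(value)
    return tags


class VersionedObject:
    """Holds the versions plus the tags, which live outside the version data."""

    def __init__(self) -> None:
        self.versions: List[Dict[str, Any]] = []
        self.tags: Dict[str, str] = {}

    def commit(self, composition: Dict[str, Any]) -> None:
        self.versions.append(composition)
        # Tags are re-extracted on every commit, so they follow the latest version.
        self.tags = extract_tags(composition)


vo = VersionedObject()
vo.commit({"name": {"value": "Referral"}, "context": {"episode_id": 42}})
```

After this commit, vo.tags holds both configured keys; committing a later version without the context block would leave only FormName.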

What we are open for:

  • dropping ‘labels’ in favor of ‘tags’.

  • allowing tags to have an optional value.

  • supporting the use of tags in queries as functions, as @Bjørn Næss suggested: select c/name/value, tag(c, 'EpisodeOfCareId') from composition c where tag(c, 'EpisodeOfCareId') = 'test' limit 5

  • if the tag() function is not a valid option (curious about @Seref Arikan's feedback), then TAGGED BY will also do, but preferably as … WHERE c TAGGED BY key = 'value'

DIPS

Tags were developed to support integration with proprietary systems which don’t do versioning of data. Some metadata might change without committing a new version. The tags are used to update metadata on Compositions from the client.

The client may update the tags using the REST API:

POST /api/v1/{uid}/tags - Update tags on a COMPOSITION or FOLDER. Existing tags will be replaced.

There is also a REST API to get all tag keys in the CDR:

GET /api/v1/tag/keys - Get all tag keys currently defined in the CDR.
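
The semantics of the two endpoints above can be mimicked with a small in-memory stand-in. This is not DIPS code and makes no claim about request or response formats; the class and method names are hypothetical, and it assumes that "currently defined" keys are the keys in use on at least one object.

```python
from typing import Dict, List


class TagStore:
    """In-memory stand-in for the two endpoints; names are illustrative."""

    def __init__(self) -> None:
        self._tags: Dict[str, Dict[str, str]] = {}

    def post_tags(self, uid: str, tags: Dict[str, str]) -> None:
        # Mirrors POST /api/v1/{uid}/tags: the new set replaces any existing tags.
        self._tags[uid] = dict(tags)

    def get_tag_keys(self) -> List[str]:
        # Mirrors GET /api/v1/tag/keys; sorted only to give a stable order.
        return sorted({key for tags in self._tags.values() for key in tags})


store = TagStore()
store.post_tags("comp-a", {"EpisodeOfCareId": "ep-1", "ReferralId": "ref-1"})
store.post_tags("comp-a", {"EpisodeOfCareId": "ep-2"})  # replaces, does not merge
store.post_tags("folder-b", {"Ward": "W3"})
keys = store.get_tag_keys()
```

Because the second call replaces the first, ReferralId is no longer among the keys.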

For querying we have one function, tag(c, <tag key>), which returns the value of <tag key> for the composition. The function can be used in both SELECT and WHERE, like this:

select tag(c, 'EpisodeOfCareId') as Episode from Composition c where tag(c, 'ReferralId') = '<some referral id>'
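
To make the intended meaning of the query above concrete, here is a minimal Python sketch of the same selection over in-memory data. The composition identifiers and tag values are made up, the helper names are hypothetical, and returning None for a missing tag key is an assumption rather than documented behaviour.

```python
from typing import Dict, List, Optional

# Hypothetical data: composition uid -> its tags.
COMPOSITIONS: Dict[str, Dict[str, str]] = {
    "comp-1": {"EpisodeOfCareId": "ep-100", "ReferralId": "ref-1"},
    "comp-2": {"EpisodeOfCareId": "ep-200", "ReferralId": "ref-2"},
    "comp-3": {"EpisodeOfCareId": "ep-300", "ReferralId": "ref-1"},
}


def tag(tags: Dict[str, str], key: str) -> Optional[str]:
    """Value of the tag key for one composition, or None if it is not set."""
    return tags.get(key)


def episodes_for_referral(referral_id: str) -> List[Optional[str]]:
    """Same tag() helper in the filter (WHERE) and in the projection (SELECT)."""
    return [
        tag(tags, "EpisodeOfCareId")
        for tags in COMPOSITIONS.values()
        if tag(tags, "ReferralId") == referral_id
    ]


episodes = episodes_for_referral("ref-1")
```

The point of the design is that one function serves both clauses, so no grammar change is needed.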