For some time there have been discussions about how best to support anonymous population querying in openEHR EHR AQL, given that demographic entities are intentionally separate. By anonymous, we mean queries that are not directly identifiable e.g on Year of Birth, sex ,gender, broad location alive/dead etc.
e.g. How do we do “Give me a list of EHRIds for Females over 65 who have elevated blood pressures”?
Of course, most CDR implementation environments will have a way of ‘making this work’ but there is no ‘standard’ way of defining this in current AQL
The barriers:
1. Even if the Demographics RM was supported in AQL, it is not widely used, and in many cases, other demographics services are used. So we need something that is agnostic of the exact demographic service used.
2. There is also need to support minimal demographics support within the EHR itself. Although ‘discouraged’ we know there are examples of use of the ehr_status/other_details structure , or a separate composition holding simple subject demographic details, usually anonymised
The borad solution proposed is to define an AQL function(s) or to represent common population query variables such as Age/Year of birth, sex , gender etc. It would be up to the CDR implementation (probably needing quite local adapttion) to resolve the queried varaible to an actual datapoint
So.. in the context of an AQL like
SELECT c FROM EHR e CONTAINS COMPOSITION c WHERE YearOfBirth() > '1958'
YearofBirth() might resolve to ….
An openEHR Demographics data path
An openEHR EHR service path
a FHIR path
another api endpoint
depending on the local demographics service provision
Challenges
1. Exactly how do we describe/define the population query variables
Do we try to align with existing demographics models e.g openEHR or FHIR
We are definitely not trying to suggest or enforce any particular actual demographics representation
Options
Make use of an existing EHR/EHR_STATUS path and a pseduo archetype to act as the object definition
SELECT e FROM EHR e WHERE EHR e/ehr_status/subject/@entities/[openEHR-EHR-CLUSTER.pop_demographics.v1]/items[at0001|Birth Sex|] = 'at0005|Male|'
In this case, although the CLUSTER.pop_demographics.v1 exists, it is only definitional and is not actually carried in the CDR
Use an indepedent object such as SUBJECT(). (Sebastian Iancu )
2aSELECT SUBJECT('dateofBirth') as dob, FROM EHR e CONTAINS COMPOSITION c WHERE SUBJECT('dateofBirth') < '1958' and SUBJECT('gender') matches {'at0005|Male|'}
or a variation,
2b
SELECT SUBJECT().dateOfBirth as dob, FROM EHR e CONTAINS COMPOSITION c WHERE SUBJECT().dateofBirth < 1958 and SUBJECT().gender matches {'at0005|Male|'}
or with clean link between EHR and SUBJECT:
2c
SELECT SUBJECT(e).dateOfBirth as dob, FROM EHR e CONTAINS COMPOSITION c WHERE SUBJECT(e).dateOfBirth < '1958-01-01' and SUBJECT(e).gender matches {'at0005|Male|'}
Making use of some of the ideas around a broader subject proxy service
SELECT s.proxy('age') as age, a/items[at0001]/value as analyteResult FROM EHR e (CONTAINS EHR_STATUS CONTAINS PARTY_PROXY s) AND (CONTAINS COMPOSITION c CONTAINS CLUSTER a[openEHR-EHR-CLUSTER.laboratory_test_analyte.v1]) WHERE s.proxy('age') < 25 and s.proxy('gender') matches {'at0005', 'Male', 'M'}
See more at https://discourse.openehr.org/t/demographic-virtual-values-workgroup/5359/18
Other thoughts
Some of the ideas in SQL on FHIR seem pertinent
https://www.health-samurai.io/articles/sql-on-fhir-an-inside-look
and AGE() might be better implemented as a simple AQL function, along with NOW() as a convenience
WHERE AGE(SUBJECT('dateofBirth'), NOW()) < 58
2. Kickoff Virtual Demographics 16.07.24
Participants
Ian McNicoll | Bostjan Lah | Matija Polajnar |
Diego Bosca | Severin Kohler |
All of the below solutions are still WIP
We discussed different AQL solutions. It was agreed that the following version is probably the best, but it still needs further discussion. The syntax introduces the @ symbol to indicate the use of a virtual demographics archetype. Dots were used, as they are easier to write.
SELECT e@subject.dateOfBirth as dob, FROM EHR e CONTAINS COMPOSITION c WHERE e@subject.dateOfBirth < '1958-01-01' and e@subject.gender matches {'LA2-8'}
The model behind this query is provided as an archetype for documentation and modeling purposes, using the FHIR Patient as an orientation. The scope of the virtual demographics is for population queries; therefore, there is no need for a non-pseudonymized/anonymized person representation. Codings used should also derive from FHIR values used in Patient.
The model can be found at https://github.com/freshehrteam/openehr-sec.git
Besides the archetype description, the idea of introducing an Implementation Guide (IG) as a usage guideline was welcomed.
3. Virtual Demographics 28.08.24
Participants
Ian McNicoll | Bostjan Lah | Matija Polajnar |
Severin Kohler |
We discussed Diego Boscas' and Chunla Ma's comments. Introducing age comes with a lot of challenges. We currently cannot anticipate all the problems and requirements, such as what to use as an output format (e.g., Julian date, ISO date). With ISO, there is the issue of converting days to months, for example. Therefore, we agreed not to support age for now and instead use a date difference. Chunla Ma's proposal would require supporting a generic PROXY_PARTY logic, which has implications beyond demographics, like querying a PROXY_PARTY inside a composition. Consequently, we did not adopt this solution.
We discussed how much of the new virtual demographics query logic needs to be consistent with the current AQL and to what extent we can introduce new grammar and syntax. We agreed that, for now, we want to maintain two versions of the grammar: Version One, which aligns with current AQL, and Version Two, which is a novel approach to a modern, intuitive AQL syntax for virtual demographics. Version Two is separated into two proposals. For Version Two, we want to make use of FHIR coding, as it provides a simpler approach to querying terminologies than having term mappings and defining codes. However, this is still open for discussion.
Archetype:
From: https://github.com/freshehrteam/openehr-sec
Version 1
SELECT e as ehr, FROM EHR e CONTAINS COMPOSITION c WHERE e@subject/dateOfBirth/value < '1958-01-01' and e@subject/gender/defining_code/code_string matches {'LA2-8'} and e@subject/dateOfBirth/value > current_timestamp('-P12Y')
Version 2a
SELECT e as ehr, FROM EHR e WHERE e@subject.dateOfBirth.value < '1958-01-01' and e@subject.gender matches {'LA2-8'} and -- matches is interchangable with = e@subject.dateOfBirth.value > current_timestamp('-P12Y')
Version 2b
SELECT e as ehr, FROM EHR e WHERE e@subject.dateOfBirth < '1958-01-01' and e@subject.gender.coding.code matches 'LA2-8' and -- matches is interchangable with = e@subject.dateOfBirth > current_timestamp('-P12Y')
4. Virtual Demographics 05.09.24
Ian McNicoll | Bostjan Lah | Matija Polajnar |
Severin Kohler | Diego Bosca | Sebastian Iancu |
We discussed whether all functions should be allowed on the virtual demographics. We agreed that, for simplicity, we will only allow LIMIT
, OFFSET
, and ORDER BY
. Functions like COUNT
, MEAN
, MAX
, MIN
, and AVERAGE
will not be supported. We also agreed that the virtual demographics parameters should be based on FHIRsearch parameters.
Sebastian made a comment that we need to emphasize that the virtual demographics extension to AQL is agnostic of the format in which demographic data is maintained by vendors. It is oriented towards FHIRsearch parameters but does not require the underlying data to be in FHIR.
TODOs:
Diego agreed to define some sample queries.
Severin agreed to extend ANTLR AQL grammar to support the first draft of virtual demographics.
Open for discussion:
Define the
e@subject
return.Should we create a REST API endpoint to connect to demographic data for the AQL engine?