AQL Evolution Notes
Working group:
- Seref Arikan (Ocean)
- Diego Bosca (Veratech)
- Ian McNicoll (Inidus)
- Bjorn Naess (DIPS)
- Chunlan Ma (Ocean AQL expert)
- Boštjan Lah (Marand)
- Luis Marco (NSE/PLRI)
Work plan:
- solve the permutations problem
- add LINK-resolution/following capability to AQL
- consider conformance test additions
- ....
Functions
Assume BI etc. is done in another form; if so, define some ETL semantics?
- Question: should we write a specification for a more advanced query syntax covering use cases related to secondary use of data (business analytics, aggregation and statistical functions)?
Better support for terminology
Allow SNOMED CT expressions?
Ian: We have made a start on some low-level openEHRText/Coded_text <> FHIR equivalents - see https://github.com/inidus/openehr-care-connect-adaptor/blob/master/docs/mapping_guidance/openEHR-AdverseReactionRisk-to-FHIR-AllergyIntolerance-STU3-mappings.adoc
Bjorn: Expand in AQL - like suggested here?
https://openehrspecs.slack.com/archives/C2CD84RUL/p1504168030000398?thread_ts=1474025119.000021&cid=C2CD84RUL
Permutations problem
Fix semantics problems - Pablo, Seref
- Seref blog - discussion
- AQL discussions by Seref: https://github.com/bjornna/aql-discussion
- Permutation discussion by Bjørn Næss : https://github.com/bjornna/openehr-conformance/tree/master/aql
Fundamental problem: what's the underlying formalism?
See also Seref's tree diagrams.
Use cases
Ian ex:
Bjorn:
- https://github.com/bjornna/openehr-conformance/blob/master/aql/case1.1-permutation_bp/index.adoc
- https://github.com/bjornna/openehr-conformance/blob/master/aql/case1.1-permutation_gcs/index.adoc
- we want to get a single row with one column containing Container<String>
Diego: see thesis p100-104;
Bostjan: use a 'compress' modifier to indicate to 'reduce' the rows
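To make the problem concrete, here is a minimal Python sketch (not AQL, and not any engine's actual behaviour). It assumes a naive evaluation strategy that emits one row per combination of matches when two selected paths are multi-valued, and contrasts it with a hypothetical 'compressed' result in the spirit of Bostjan's suggestion. The data, path labels and function names are all illustrative assumptions.

```python
from itertools import product

# Hypothetical matches for two multi-valued paths under one COMPOSITION:
# path_a matched two values, path_b matched three.
matches = {
    "path_a": ["a1", "a2"],
    "path_b": ["b1", "b2", "b3"],
}

def permuted_rows(matches):
    """Naive evaluation: one row per combination of matched values."""
    columns = list(matches)
    return [dict(zip(columns, combo)) for combo in product(*matches.values())]

def compressed_row(matches):
    """'Compressed' evaluation: a single row, each column holding a collection."""
    return {column: list(values) for column, values in matches.items()}

rows = permuted_rows(matches)
single = compressed_row(matches)

print(len(rows))  # 2 x 3 combinations
print(single)
```

The row count of the naive form grows multiplicatively with each additional multi-valued path, which is why a single row with collection-valued columns is attractive for the use cases above.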
General shape of Solution
Add one standard function - 'extract', or 'filter' (use regex); or 'construct'
Bostjan suggestion: extract(parent node, 'aql path1', 'aql path2',...) which would give a collection of those columns
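A rough Python sketch of how such an extract() might behave, under heavy assumptions: the parent node is modelled as nested dicts and lists, and paths are plain slash-separated keys rather than real AQL paths with archetype node predicates. The helper resolve_path and the sample data are hypothetical.

```python
def resolve_path(node, path):
    """Return all values reached by a slash-separated path; lists fan out."""
    current = [node]
    for step in path.strip("/").split("/"):
        next_nodes = []
        for item in current:
            child = item.get(step) if isinstance(item, dict) else None
            if child is None:
                continue
            if isinstance(child, list):
                next_nodes.extend(child)
            else:
                next_nodes.append(child)
        current = next_nodes
    return current

def extract(parent, *paths):
    """Return one collection per path, each evaluated relative to parent."""
    return [resolve_path(parent, p) for p in paths]

# Illustrative parent node with two events.
parent = {
    "events": [
        {"value": 4, "time": "t1"},
        {"value": 5, "time": "t2"},
    ]
}

columns = extract(parent, "events/value", "events/time")
print(columns)
```

Because every path is evaluated against the same parent, the result is one collection per column rather than a cross product of rows.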
Check sparql, Xquery etc for precedents.
SPARQL
There are several functions that have some relationship with the intended use of query patterns:
a) FILTER: the closest function to what we are looking for. It allows filters to be defined using strings (regex) and numeric values (arithmetic operators).
https://www.w3.org/TR/rdf-sparql-query/#scopeFilters
b) CONSTRUCT: builds a new graph, which may follow some pattern, from the results of a SPARQL query. https://www.w3.org/TR/rdf-sparql-query/#construct
c) GROUP https://www.w3.org/TR/rdf-sparql-query/#GroupPatterns
For Eligibility Criteria (EC), SPARQL seems to be the most used language, according to https://www.w3.org/wiki/images/archive/3/32/20130224153627!Convergence-Meeting-2013-Summary.pdf
Gremlin
If the EXTRACT function is meant to allow concatenation, a good example to model it on is Gremlin, the graph traversal language of Apache TinkerPop: http://tinkerpop.apache.org/
ECLECTIC
Coming from the medical informatics domain, it is a language designed for Eligibility Criteria (EC) in clinical studies. It is closer to human language than AQL, shielding the query author from the technical details of the query language. Temporal semantics are user-friendly (e.g. first diagnosis, last result etc.) but there is no complex pattern-matching functionality. https://www.thieme-connect.com/DOI/DOI?10.3414/ME13-02-0027
SQWRL
Defines basic aggregation functions to be applied over an OWL ontology. It is less flexible than SPARQL FILTER but relies on the logic underlying the ontology. https://github.com/protegeproject/swrlapi/wiki/SQWRLCore#Aggregation
Others: OCL-based GELLO, ...
Query by patterns
Can be done with ADL2 structures;
See Diego's work.
Link-resolution/following in AQL
One approach is to add a resolve() function that resolves links to their targets in the result.
Another approach: a new keyword such as CONTAINS-RESOLVE? (Heath: bad idea... need to take account of type in LINK)
Problem of recursion, i.e. graph of LINKs
One suggestion (from Ian?) was to have three different versions of operators
- CONTAINS (as today?)
- RESOLVE or a function resolve(...)
- CONTAINS_OR_RESOLVE - a combination of both
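The three variants and the recursion problem can be sketched in Python as a graph traversal. This is only an illustration under assumptions: nodes live in a hypothetical in-memory dict, LINK targets are plain ids, "resolve" is assumed to follow links transitively, and the LINK type that Heath mentions is not modelled. A visited set is one simple way to stop cycles in a graph of LINKs.

```python
# Hypothetical in-memory model: each node lists its contained children
# and its LINK targets (by id). comp1 and comp2 link to each other.
nodes = {
    "comp1": {"children": ["obs1"], "links": ["comp2"]},
    "obs1": {"children": [], "links": []},
    "comp2": {"children": ["obs2"], "links": ["comp1"]},  # cycle back to comp1
    "obs2": {"children": [], "links": []},
}

# Which edge kinds each (hypothetical) operator variant follows.
EDGES = {
    "contains": ("children",),
    "resolve": ("links",),
    "contains_or_resolve": ("children", "links"),
}

def reachable(start, mode, nodes):
    """Collect ids reachable from start; the visited set prevents looping."""
    visited = {start}
    stack = [start]
    found = []
    while stack:
        current = stack.pop()
        for edge in EDGES[mode]:
            for target in nodes[current][edge]:
                if target not in visited:
                    visited.add(target)
                    found.append(target)
                    stack.append(target)
    return found

for mode in EDGES:
    print(mode, sorted(reachable("comp1", mode, nodes)))
```

Even with the cycle between comp1 and comp2, each mode terminates; a real design would also need to decide on a maximum depth and on how LINK types restrict which links are followed.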
See
Update/Delete
Demographic Querying
Tags and Querying
DIPS ex:
- Query by TAG: select c/name/value, tag(c, 'EpisodeOfCareId') from composition c where tag(c, 'EpisodeOfCareId') = 'test' limit 5
How to represent tags?
Marand ex:
- Query by TAG: SELECT tags(c) FROM COMPOSITION c WHERE c TAGGED BY 'key::value'
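As one possible answer to the representation question, the Python sketch below assumes tags are single-valued key/value pairs attached to each composition, and shows that both example queries can be expressed over that model. The data, the tag() and tagged_by() helpers, and the 'key::value' parsing are assumptions for illustration, not either vendor's implementation.

```python
# Hypothetical compositions with tags held as a key -> value mapping.
compositions = [
    {"name": "Encounter", "tags": {"EpisodeOfCareId": "test"}},
    {"name": "Report", "tags": {"EpisodeOfCareId": "other"}},
    {"name": "Discharge", "tags": {}},
]

def tag(composition, key):
    """Function style: return the tag value, or None if absent."""
    return composition["tags"].get(key)

def tagged_by(composition, expr):
    """Operator style: expr is 'key::value'."""
    key, _, value = expr.partition("::")
    return tag(composition, key) == value

# Function-style query: select name and tag value, filter on tag, limit 5.
dips_rows = [
    (c["name"], tag(c, "EpisodeOfCareId"))
    for c in compositions
    if tag(c, "EpisodeOfCareId") == "test"
][:5]

# Operator-style query: select all tags of compositions tagged key::value.
marand_rows = [c["tags"] for c in compositions if tagged_by(c, "EpisodeOfCareId::test")]

print(dips_rows)
print(marand_rows)
```

Under this model the two syntaxes select the same compositions; they differ mainly in whether a tag is exposed as a function value or as a predicate.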
Using Folders
Basic requirement - able to add post-hoc containers / bundles collections
Semantics of FOLDER CONTAINS
Folders +/- containment concept
- Examples of queries with FOLDER: https://github.com/bjornna/openehr-conformance/tree/master/aql/case2-folders
Erik's interpretation of DIPS's current jumping from FOLDER→COMPOSITION:
- Let the CONTAINS operator first recursively traverse down the hierarchy of FOLDERs (like // in XPath) according to the query parameters. Then, on reaching a matching folder, let CONTAINS temporarily mean following the query-matching .items links in that specific folder to any matching COMPOSITIONs in that folder. Then, when traversal continues inside the composition(s), let CONTAINS again mean only real parent-child containment within the COMPOSITION(s), thus not following links from the COMPOSITION(s).
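The first two steps of this interpretation can be sketched in Python as follows. The model is an assumption: FOLDERs are nested dicts with "folders" and "items" keys, items are composition ids, and matching is by a simple name or archetype id comparison. Traversal inside the COMPOSITION (the third step) is left out, since there CONTAINS would behave as it does today.

```python
def find_folders(folder, name):
    """Recursively descend the FOLDER hierarchy (like // in XPath)."""
    hits = [folder] if folder["name"] == name else []
    for sub in folder.get("folders", []):
        hits.extend(find_folders(sub, name))
    return hits

def folder_contains(root, folder_name, archetype_id, compositions):
    """Follow .items of each matching folder to matching COMPOSITIONs."""
    result = []
    for folder in find_folders(root, folder_name):
        for comp_id in folder.get("items", []):
            if compositions[comp_id]["archetype_id"] == archetype_id:
                result.append(comp_id)
    return result

# Illustrative data: ids and names are hypothetical.
root = {
    "name": "root",
    "folders": [
        {"name": "episodes", "items": ["c1", "c2"], "folders": []},
        {"name": "other", "items": ["c3"]},
    ],
}
compositions = {
    "c1": {"archetype_id": "encounter"},
    "c2": {"archetype_id": "report"},
    "c3": {"archetype_id": "encounter"},
}

matched = folder_contains(root, "episodes", "encounter", compositions)
print(matched)
```

Note that c3 is not returned even though it matches the archetype, because it is referenced from a folder that does not match the query.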
Post-hoc reference lists, Folders - enables 'views'
See Bjorn notes & papers.