OpenAPI Strategy
This page documents a strategy for creating or generating an OpenAPI expression of the openEHR models and API semantics, suited to efficient application development.
Original Discussion
See SEC call notes 30-05-2022.
@Seref Arikan (Personal) - can we find a more developer-oriented set of models? (GraphQL) - reduce inheritance, generics etc.
@Heath Frankel (Unlicensed) - many devs only care about OpenAPI or similar level of spec
@Thomas Beale - consider repurposing e.g. SIM-B spec
@Pieter Bos - NB: not all code generating tools support all of OpenAPI
@Heath Frankel (Unlicensed) - FHIR doesn’t ‘do’ OpenAPI directly - devs do it themselves
@Erik Sundvall - template-specific API v canonical-based
@Seref Arikan (Personal) What if we had OpenAPI repr. of RM, and archetyping on this? What would we lose?
@Heath Frankel (Unlicensed) What would we gain? Even FHIR too hard for many devs
Proposals
Initial proposal: initiate a workgroup to explore the state of the OpenAPI specification and its tooling, and the feasibility of using OpenAPI as a computable specification for openEHR’s REST API.
Expected outcomes
(please add yours below)
@Seref Arikan : Providing an OpenAPI specification for the REST API of openEHR, so that developers can feed it to code generators and end up with code in their programming language of preference that can serialise/deserialise/validate JSON based on the OpenAPI document provided by openEHR. The generated code should support both clients of the REST API (building actual RM instances and calling REST endpoints) and server-side implementations (generating stubs/interfaces which should be implemented to deliver the functionality defined in the contract, i.e. the OpenAPI document). As a minimum, Java, C# and TypeScript should be supported as code generation targets. The serialised JSON should be as aligned as possible with the current JSON representations, which are in turn mostly parallel to the XSDs published by openEHR.
@Sebastian Iancu : I mostly agree with statements from @Seref Arikan, but as additions, I think we should:
have the openEHR REST API specs formally expressed in a popular, consumer-friendly and machine-processable format, hence we need to migrate from api-blueprint to OpenAPI
the resulting openEHR OpenAPI specs should be easy to use to (order matters):
generate HTML docs
validate data (API payload)
generate client code or server stubs
(in the future) generate mock-servers (for generic testing)
address not only the schema for the JSON payload, but also for XML
Status Updates/Progress
A workgroup consisting of @Sebastian Iancu , @Pablo Pazos , @Seref Arikan (Personal) and @Pieter Bos started exploring OpenAPI. The initial challenge has been representing inheritance as it is used by the RM. OpenAPI supports different ways of expressing what we’d commonly refer to as inheritance in OO languages, and existing code generators produce code in different styles depending on the approach used in the OpenAPI schema.
To narrow the behaviour in this fundamental case, we adopted a simple approach: a stripped-down Composition type defined in OpenAPI, with its name property having type DvText, and DvCodedText inheriting from DvText. @Sebastian Iancu produced different variations of expressing inheritance for this simple case with help from @Pieter Bos (which is much appreciated) as can be seen here.
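For orientation, one of the ways OpenAPI 3.0 can express this case is `allOf` composition plus a `discriminator` on `_type`. The sketch below is not one of the variations in the repository; it is a minimal illustration written as a Python dictionary so that it can be dumped to JSON. The property set is an assumption: `defining_code` is simplified to a plain string, and the helper `resolve_schema_name` is hypothetical.

```python
import json

# Hypothetical, stripped-down component schemas (not the workgroup's files).
SCHEMAS = {
    "DvText": {
        "type": "object",
        "required": ["_type", "value"],
        "properties": {
            "_type": {"type": "string"},
            "value": {"type": "string"},
        },
        # The discriminator tells tools which schema a payload should match.
        "discriminator": {
            "propertyName": "_type",
            "mapping": {
                "DV_TEXT": "#/components/schemas/DvText",
                "DV_CODED_TEXT": "#/components/schemas/DvCodedText",
            },
        },
    },
    # "Inheritance" expressed as allOf: everything in DvText plus extra fields.
    "DvCodedText": {
        "allOf": [
            {"$ref": "#/components/schemas/DvText"},
            {
                "type": "object",
                "required": ["defining_code"],
                # Assumption: simplified to a string for this sketch.
                "properties": {"defining_code": {"type": "string"}},
            },
        ]
    },
    "Composition": {
        "type": "object",
        "required": ["name"],
        "properties": {"name": {"$ref": "#/components/schemas/DvText"}},
    },
}


def resolve_schema_name(payload: dict, base: str = "DvText") -> str:
    """Return the schema name the discriminator selects for a payload."""
    disc = SCHEMAS[base]["discriminator"]
    ref = disc["mapping"][payload[disc["propertyName"]]]
    return ref.rsplit("/", 1)[-1]


# JSON text that could be pasted under "components" in an OpenAPI document.
OPENAPI_FRAGMENT = json.dumps({"components": {"schemas": SCHEMAS}}, indent=2)
```

How a given code generator treats this `allOf` plus `discriminator` shape is exactly what varies between tools, which is why several variations were tried.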
Looking at the code generated from these different approaches, we found that code generators cannot handle some valid OpenAPI models. At least for Java and C# targets, the problem is that the _type discriminator field ends up having an enum type generated for it, but then the generated classes attempt to assign a string value to this field, hence the generated code won’t build. Aside from this, the code generated by the tools uses a quite cumbersome way of representing inheritance, namely a wrapper type that takes the actual field value in its constructor, and developers are supposed to do a type check (via a generated method) to use the actual name field: is the value a DvText or a DvCodedText? Given the set of models Sebastian produced, we may be able to relax the OpenAPI schema a bit and then the code generator can generate code that’ll actually build. This would give us JSON serialisation/deserialisation/validation, though the convenience of the resulting code for developers remains to be seen.
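To make the usability concern concrete, the sketch below contrasts the two shapes in Python. It is hand-written for illustration and is not output from swagger or openapi-generator; the names `NameWrapper`, `is_dv_coded_text`, `get_actual` and `parse_dv_text` are hypothetical, and `defining_code` is again simplified to a string.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class DvText:
    value: str


@dataclass
class DvCodedText(DvText):
    defining_code: str  # assumption: simplified to a string for this sketch


class NameWrapper:
    """Wrapper style: holds the real value; callers must type-check it."""

    def __init__(self, actual: Union[DvText, DvCodedText]):
        self._actual = actual

    def is_dv_coded_text(self) -> bool:
        return isinstance(self._actual, DvCodedText)

    def get_actual(self) -> Union[DvText, DvCodedText]:
        return self._actual


# Plain-inheritance style: the _type discriminator picks the subclass directly.
_TYPES = {"DV_TEXT": DvText, "DV_CODED_TEXT": DvCodedText}


def parse_dv_text(data: dict) -> DvText:
    """Build a DvText or DvCodedText from a JSON-like dict using _type."""
    fields = {key: val for key, val in data.items() if key != "_type"}
    return _TYPES[data["_type"]](**fields)


name = parse_dv_text(
    {"_type": "DV_CODED_TEXT", "value": "Report", "defining_code": "local::at0001"}
)
wrapped = NameWrapper(name)
```

With the plain style a caller reads `name.value` directly and only checks the subclass when the coded part is needed; with the wrapper style every access goes through `wrapped.get_actual()` after a type check.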
Tools used:
YAML schema validators from swagger, redoc and prism/stoplight, all run as CLI tools (via docker).
code generated with swagger and openapi-generator https://openapi-generator.tech/
mock-servers from redoc, stoplight, and danielgtaylor/apisprout
code generated in many languages: android, csharp, csharp-netcore, dart, eiffel, go, groovy, java, javascript, javascript-closure-angular, kotlin, objc, perl, php, python, ruby, spring, swift5, typescript-fetch, typescript-node
validators: NetworkNT, Openapi4j, AJV, schemaSafe
Tools can be used (via docker) from https://github.com/openEHR/sanbox-openapi/tree/master/dev . Test files are located in https://github.com/openEHR/sanbox-openapi/tree/master/docs . Generated code is in https://github.com/openEHR/sanbox-openapi/tree/master/gen .
Besides the DvText/DvCodedText examples, we also converted all of https://specifications.openehr.org/releases/ITS-REST/Release-1.0.2 to OpenAPI v3.0.3 YAML format and rendered it as HTML. The goal is to release the openEHR REST API specs as Release 1.0.3.
The other line of work was done by @Pablo Pazos, who looked at the rest of the RM from an OpenAPI modelling perspective. (Pablo, feel free to fill this part in with a proper description of the work you’ve done; I failed to do that.)
Strategy
(please add yours)
@Seref Arikan : My view is that we may be able to find a working combination of OpenAPI schema and code generators for Java/C#/TypeScript, even though we may have to modify code generator tools for the specific way we represent inheritance in OpenAPI speak. Then, in time, code generators may catch up and our patch may not be needed. Nothing stops us from making a PR to these open source generators, by the way. Frankly, even this is a non-trivial commitment. We need to identify an almost-working configuration of OpenAPI schema and generators, then close the gaps where necessary, and maintain it for some languages, probably mainstream ones. This is the lowest-cost way forward that I can see for the expected outcomes I listed above. We don't know yet if we’ll have issues with the OpenAPI/code generation setup when modelling other aspects of the RM. I think at a high level, the question is: can we express what is expressed by the XSDs using OpenAPI, and can we have code generation for these? I think it’s also possible that we may conclude that OpenAPI is not there yet and park it. So my suggested strategy is to relax the OpenAPI schema a bit, going back from what Sebastian has done, and if we can handle the broken code generation issue with that, see what can be done beyond our simple Composition experiment. Then have another discussion at that point and either drop it or move further.
@Sebastian Iancu : Apparently it is hard to find codegen tools that work well with schemas developed for validation. Changing codegen templates to match our use case, and fixing the bugs that we find, is an option to increase the effectiveness of those codegen tools, while we focus our schemas mainly on validation. But this strategy might be expensive in terms of work hours. My thinking is that we need to have two flavours of schema: one for validation and one for codegen. Improving codegen templates is still important to do, but it won't be a blocker.
Steps to take:
make some tools/helpers so that based on a single schema source we could generate two different yaml schemas (validation and codegen)
focus mainly on schema validation, from a generic perspective (not necessarily a schema to be used with a specific validation tool)
schema codegen comes second, but we should do whatever is necessary to maximize the quality of the generated code
improve (if possible) 3-5 codegen templates (like @Seref Arikan suggested, Java, C#, etc)
provide generated code in min 3-5 languages, as implementation example, for demonstration purposes, not aiming to be used in production or real software development (so not public reference libraries)
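The first step could start as a small transformation script. The sketch below assumes the single source is the strict validation schema, and that the codegen flavour differs only in dropping the `enum` constraint on the `_type` discriminator property (the constraint behind the build failure described under Status Updates). Both assumptions are illustrative, and the function name `to_codegen_flavour` is hypothetical.

```python
import copy


def to_codegen_flavour(schema: dict, discriminator: str = "_type") -> dict:
    """Return a deep copy of `schema` with `enum` removed from every
    discriminator property, leaving the validation source untouched."""

    def relax(node):
        if isinstance(node, dict):
            props = node.get("properties")
            if isinstance(props, dict) and isinstance(props.get(discriminator), dict):
                props[discriminator].pop("enum", None)
            for child in node.values():
                relax(child)
        elif isinstance(node, list):
            for child in node:
                relax(child)

    relaxed = copy.deepcopy(schema)
    relax(relaxed)
    return relaxed


# Hypothetical strict source schema, used for validation.
VALIDATION_SCHEMA = {
    "DvCodedText": {
        "type": "object",
        "properties": {
            "_type": {"type": "string", "enum": ["DV_CODED_TEXT"]},
            "value": {"type": "string"},
        },
    }
}

# Relaxed flavour intended as code generator input.
CODEGEN_SCHEMA = to_codegen_flavour(VALIDATION_SCHEMA)
```

Keeping the strict schema as the source and deriving the relaxed one means validation never loses constraints; whether other differences between the two flavours are needed would have to come out of further experiments.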