Design and implementation
At the time this software was developed, there already existed several
open-source solutions which could be adopted by a patient database owner
in order to connect to the MME network as an independent node
(https://github.com/ga4gh/mme-apis/wiki/Implementations). After
analyzing the existing implementations, we concluded that none of them
addressed our needs: they were either too tightly coupled to the data
structures and routines of a specific host research center, or were
software demos not intended for production settings
(https://github.com/MatchmakerExchange/reference-server). The
first technical reason that prompted us to launch PatientMatcher was the
need to develop an application written in Python
(https://www.python.org/). This is the language of choice for most
of the projects developed at our facility, which gives the project a
better chance of being maintained over time. Another advantage is that
developing the solution in a very popular programming language will
likely increase the chances that PatientMatcher, or some of its modules,
will be used by other research centers or diagnostic laboratories
willing to connect to MME as distinct nodes. The second technical
requirement that led us to develop a custom solution was the need to
store data in a document-oriented
database such as MongoDB (https://www.mongodb.com/), where patient
data documents are very similar to data objects used in Scout
(https://github.com/Clinical-Genomics/scout), the application used
by our clinical laboratories for handling results from NGS analyses.
Additionally, MongoDB stores documents in a JSON-like format (BSON),
which maps directly onto the JSON used by MME nodes for exchanging
patient data via HTTP requests. Technical
considerations aside, our primary reason for developing the software
from scratch was the opportunity to introduce a highly customizable
patient similarity scoring algorithm, which lets data contributors
fine-tune the parameters used in the patient similarity computation.
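As an illustration only, the idea of tunable scoring parameters can be sketched as a weighted combination of a genotype and a phenotype similarity. The class name, function name, parameter names and default weights below are hypothetical and chosen for the example; they are not taken from the PatientMatcher code base, and the actual algorithm may combine its components differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringWeights:
    """Hypothetical tunable parameters; names and defaults are illustrative."""

    max_genotype_score: float = 0.75
    max_phenotype_score: float = 0.25

    def __post_init__(self):
        # Keep the combined score in [0, 1] by requiring the weights to sum to 1.
        total = self.max_genotype_score + self.max_phenotype_score
        if abs(total - 1.0) > 1e-9:
            raise ValueError("scoring weights must sum to 1.0")


def combined_score(genotype_similarity, phenotype_similarity,
                   weights=ScoringWeights()):
    """Return a weighted sum of two similarity values, each in [0, 1]."""
    for value in (genotype_similarity, phenotype_similarity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("similarity values must lie in [0, 1]")
    return (weights.max_genotype_score * genotype_similarity
            + weights.max_phenotype_score * phenotype_similarity)


if __name__ == "__main__":
    # A contributor who trusts phenotypes more could rebalance the weights.
    phenotype_heavy = ScoringWeights(max_genotype_score=0.4,
                                     max_phenotype_score=0.6)
    print(combined_score(0.5, 1.0, phenotype_heavy))
```

Under this assumption, changing the two weights is the only step a data contributor would need to take to shift the emphasis between genotype and phenotype evidence.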
PatientMatcher consists of a Python (3.6+) backend connected to a web
app built in Flask 2.0+
(https://flask.palletsprojects.com/en/2.0.x/). The application
data is stored in a MongoDB database.
The backend provides a command to update database resources: the HPO
and disease term ontologies, downloaded respectively from the OBO
Foundry (https://github.com/OBOFoundry) and from the Jenkins automation
server of the Monarch Initiative
(https://ci.monarchinitiative.org/). These resources are the core
of the software’s phenotype similarity scoring algorithm. The command
line interface is also used to add or remove MME clients (connected
nodes allowed to run queries on PatientMatcher by presenting a security
token that is unique to each node) and MME nodes (external nodes queried
by PatientMatcher using a token assigned in turn by those servers).
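The per-client token check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the in-memory registry, the client identifiers, the token values and the function name are all hypothetical, whereas PatientMatcher keeps its clients in MongoDB and its own lookup logic may differ.

```python
import hmac

# Hypothetical registry of clients allowed to query the server.
# In a real deployment these records would live in the database.
CLIENTS = {
    "node_a": "token-for-node-a",
    "node_b": "token-for-node-b",
}


def authorize(token, clients=CLIENTS):
    """Return the id of the client owning `token`, or None if no client matches."""
    if not token:
        return None
    presented = token.encode("utf-8")
    for client_id, client_token in clients.items():
        # compare_digest performs a constant-time comparison, which avoids
        # leaking how many leading characters of a guessed token were correct.
        if hmac.compare_digest(presented, client_token.encode("utf-8")):
            return client_id
    return None


if __name__ == "__main__":
    print(authorize("token-for-node-a"))
    print(authorize("not-a-valid-token"))
```

Because each token is unique to one node, a successful lookup both authenticates the request and identifies which client sent it.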
PatientMatcher is essentially a Representational State Transfer (REST)
API that allows users to programmatically submit data, download results,
and either perform an exhaustive comparison against the internal
database or submit queries to external nodes. The application implements
the Matchmaker Exchange API specifications (Buske et al., 2015). The
available server endpoints are illustrated in Table 1.
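To make the request format concrete, the sketch below builds (without sending) a match query in the style of the Matchmaker Exchange API, using only the Python standard library. The base URL, token, patient identifier, contact details and helper name are placeholders invented for the example; the media type, the X-Auth-Token header and the /match path follow our reading of the MME API specification and should be checked against it and against the endpoints actually exposed by the target node.

```python
import json
import urllib.request

# Media type used by MME API v1.0 for requests and responses.
MME_MEDIA_TYPE = "application/vnd.ga4gh.matchmaker.v1.0+json"

# Placeholder patient: an id, a contact, and at least one phenotype
# feature or genomic feature, as an MME patient object requires.
EXAMPLE_PATIENT = {
    "id": "P0001",
    "contact": {"name": "Example Clinician",
                "href": "mailto:clinician@example.org"},
    "features": [{"id": "HP:0001263", "observed": "yes"}],
    "genomicFeatures": [{"gene": {"id": "ENSG00000128573"}}],
}


def build_match_request(base_url, token, patient):
    """Build a POST request for a node's match endpoint; nothing is sent."""
    body = json.dumps({"patient": patient}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/match",
        data=body,
        method="POST",
        headers={
            "Content-Type": MME_MEDIA_TYPE,
            "Accept": MME_MEDIA_TYPE,
            "X-Auth-Token": token,
        },
    )


if __name__ == "__main__":
    request = build_match_request("https://node.example.org",
                                  "my-secret-token", EXAMPLE_PATIENT)
    print(request.get_method(), request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```

Keeping request construction separate from the network call makes it possible to inspect or test the payload before any patient data leaves the local node.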