Design and implementation

When this software was developed, several open-source solutions already existed that a patient database owner could adopt to connect to the MME network as an independent node (https://github.com/ga4gh/mme-apis/wiki/Implementations). After analyzing the existing implementations, we concluded that none of them addressed our needs: they were either too tightly coupled to the data structures and routines of a specific host research center, or software demos not intended for production settings (https://github.com/MatchmakerExchange/reference-server). The first technical reason that prompted us to launch PatientMatcher was the need for an application written in Python (https://www.python.org/). This is the language of choice for most projects developed at our facility, which gives the project a better chance of being maintained over time. A further advantage is that developing the solution in a very popular programming language will likely increase the chances that PatientMatcher, or some of its modules, will be used by other research centers or diagnostic laboratories willing to connect to MME as distinct nodes. The second technical requirement that led us to develop a custom solution was the need to store data in a document-oriented database such as MongoDB (https://www.mongodb.com/), where patient data documents are very similar to the data objects used in Scout (https://github.com/Clinical-Genomics/scout), the application used by our clinical laboratories for handling results from NGS analyses. Additionally, MongoDB stores documents in a JSON-like format, which closely matches the JSON used by MME nodes to exchange patient data via HTTP requests.
Technical considerations aside, our primary reason for developing the software from scratch was the opportunity to introduce a highly customizable patient similarity scoring algorithm that lets data contributors fine-tune the parameters used in the patient similarity computation. PatientMatcher consists of a Python (3.6+) backend connected to a web app built in Flask 2.0+ (https://flask.palletsprojects.com/en/2.0.x/). The application data is stored in a MongoDB database.
The program backend provides a command to update the database resources: the HPO and disease term ontologies, downloaded respectively from the OBO Foundry (https://github.com/OBOFoundry) and from the Jenkins automation server of the Monarch Initiative (https://ci.monarchinitiative.org/). These resources are the core of the software’s phenotype similarity scoring algorithm. The command line is also used to add or remove MME clients (connected nodes allowed to query PatientMatcher by presenting a security token that is unique to each node) and MME nodes (external nodes that PatientMatcher queries using a token assigned in turn by these servers).
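As an illustration of the client token check described above, the following minimal sketch assumes a hypothetical in-memory client registry (CLIENTS) and a hypothetical helper (authorize_client). PatientMatcher keeps its clients in MongoDB, and its actual implementation may differ.

```python
import hmac
from typing import Dict, Optional

# Hypothetical registry of MME clients, keyed by client id.
# Assumption: each connected node is assigned exactly one unique token.
CLIENTS: Dict[str, Dict[str, str]] = {
    "node_a": {"auth_token": "token-for-node-a", "base_url": "https://a.example.org"},
    "node_b": {"auth_token": "token-for-node-b", "base_url": "https://b.example.org"},
}


def authorize_client(token: Optional[str], clients: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the id of the client that owns `token`, or None if no client does."""
    if not token:
        return None
    supplied = token.encode("utf-8")
    for client_id, client in clients.items():
        expected = client["auth_token"].encode("utf-8")
        # Constant-time comparison avoids leaking token contents through timing.
        if hmac.compare_digest(expected, supplied):
            return client_id
    return None


known = authorize_client("token-for-node-b", CLIENTS)
unknown = authorize_client("not-a-valid-token", CLIENTS)
missing = authorize_client(None, CLIENTS)
```

In a web application, a check of this kind would typically run before any match query is processed, so that requests carrying an unknown or missing token are rejected early.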
PatientMatcher is essentially a Representational State Transfer (REST) API tool that allows users to programmatically submit data, download results, and either perform an exhaustive comparison against the internal database or submit queries to external nodes. The application implements the Matchmaker Exchange API specifications (J. Buske et al., 2015). The available server endpoints are illustrated in Table 1.
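A query submitted to an external node can be sketched as follows. The node URL, the token, and the patient values are placeholders, and build_match_request is a hypothetical helper rather than part of PatientMatcher. The /match path, the X-Auth-Token header, and the versioned media type are taken from the MME API specification (v1.0) and should be verified against the version a given node supports.

```python
import json
import urllib.request

# Media type defined by the MME API v1.0 specification.
MME_MEDIA_TYPE = "application/vnd.ga4gh.matchmaker.v1.0+json"


def build_match_request(node_url: str, token: str, patient: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for a node's /match endpoint."""
    body = json.dumps({"patient": patient}).encode("utf-8")
    return urllib.request.Request(
        url=node_url.rstrip("/") + "/match",
        data=body,
        method="POST",
        headers={
            "X-Auth-Token": token,  # token assigned by the queried node
            "Content-Type": MME_MEDIA_TYPE,
            "Accept": MME_MEDIA_TYPE,
        },
    )


# Placeholder patient: an id, a contact, and at least one phenotype
# or genomic feature, as required by the MME patient format.
patient = {
    "id": "P0001",
    "contact": {"name": "Jane Doe", "href": "mailto:jane.doe@example.org"},
    "features": [{"id": "HP:0001250", "observed": "yes"}],
    "genomicFeatures": [{"gene": {"id": "SCN1A"}}],
}

request = build_match_request("https://mme.example.org", "placeholder-token", patient)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
```

Because patient records are plain JSON objects, the same dictionary could be stored as a database document and sent in a request body without an intermediate conversion step.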