To unite genotype and phenotype data, we have developed an extract, transform, load (ETL) pipeline called DIPper (DataIngest Pipeline). We extract data in multiple formats from numerous sources. For the mouse database, MGI, and the fly database, FlyBase, we extract dumps of relational databases. For the zebrafish database, ZFIN, we ingest a flat file (.csv), and for the genomic variation database, ClinVar, we ingest XML. Harmonizing file formats, however, is trivial compared with harmonizing the way the underlying data are modeled, which varies widely between (and even within) sources. Because the component pieces were not designed to be used together, the task is not unlike building a single structure from a combination of Legos, Lincoln Logs, and Tinkertoys. These pieces do not fit together without the “semantic glue” of shared data modeling and ontologies; this is precisely what Monarch provides.
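The extract, transform, load pattern described above can be illustrated with a minimal sketch. This is not the DIPper implementation: the input snippets, field names, the `EX:` identifiers, and the `EX:has_phenotype` relation are all hypothetical placeholders, and the real source formats are considerably richer. The sketch only shows the general idea that records arriving as CSV and as XML can be mapped onto one shared subject-relation-object model.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical inputs: neither mirrors a real ZFIN or ClinVar export.
CSV_TEXT = "genotype_id,phenotype_term\nEX:geno1,EX:0000001\n"
XML_TEXT = (
    "<records>"
    "<record><variant>EX:var7</variant><trait>EX:0000002</trait></record>"
    "</records>"
)

# Placeholder for a relation drawn from a shared ontology (assumption).
HAS_PHENOTYPE = "EX:has_phenotype"


def extract_csv(text):
    """Extract: parse a flat file into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_xml(text):
    """Extract: parse XML records into a list of dict rows."""
    root = ET.fromstring(text)
    return [
        {"variant": rec.findtext("variant"), "trait": rec.findtext("trait")}
        for rec in root.iter("record")
    ]


def transform(rows, subject_key, object_key):
    """Transform: map source-specific fields onto shared triples."""
    return [(row[subject_key], HAS_PHENOTYPE, row[object_key]) for row in rows]


def load(triples, store):
    """Load: add triples to a common store (here, an in-memory set)."""
    store.update(triples)
    return store


def run_pipeline():
    """Run both hypothetical sources through the same three stages."""
    store = set()
    load(transform(extract_csv(CSV_TEXT), "genotype_id", "phenotype_term"), store)
    load(transform(extract_xml(XML_TEXT), "variant", "trait"), store)
    return store


if __name__ == "__main__":
    for triple in sorted(run_pipeline()):
        print(triple)
```

Only the extract step and the field mapping differ per source; once both sources share the same relation and identifier scheme, their records can sit in one store and be queried together.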