AMI-diagram: Mining Facts from Images


There are at least 10 million diagrams published in the scientific literature each year and many of them represent factual information. AMI-Diagram is a flexible tool which can mine facts from diagrams and convert the graphics primitives into XML. The targets include X-Y plots, barcharts, chemical structure diagrams and phylogenetic trees. AMI can ingest born-digital diagrams either as latent vectors (converted from Postscript), pixel diagrams (PNGs and JPEgs) or scanned documents. For high-quality/resolution diagrams the process is automatic; commandline parameters can be used for noisy or complex diagrams. AMI is part of the ContentMine framework ( for automatically extracting science from the published literature.