Performing analytics on your data (Bert, Johannes and Marcus)
When many small calculations (especially machine learning calculations on structures in a line notation such as SMILES) have to be performed on a large set of molecules, starting a task monitor and a Docker container for each individual calculation is very inefficient. For this reason, users can submit a batch of calculations as a single task. Such a task uses only one task monitor and one Docker container, and the container is free to choose how to run the calculations (one at a time, one per core, etc.). In addition, each calculation is first checked against the database: if it has been performed before, it is skipped and the stored output is reused. An example of this strategy is shown below, where ChemML results for a list of SMILES are compared in a plot.
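The skip-if-already-computed logic described above can be illustrated with a minimal Python sketch. It is not the platform's actual implementation: `run_batch`, `ResultStore` and `count_carbons` are hypothetical names, an in-memory dictionary stands in for the database, and a toy character count stands in for a real ChemML calculation. Calculations are run one at a time here; a real container could just as well distribute them over cores.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for the results database:
# maps (calculation name, SMILES) to a stored result.
ResultStore = Dict[Tuple[str, str], float]


def run_batch(
    smiles_list: Iterable[str],
    calc_name: str,
    calculate: Callable[[str], float],
    store: ResultStore,
) -> Tuple[List[float], int]:
    """Run one calculation per SMILES inside a single task, reusing stored results.

    Returns the results in input order and the number of calculations
    that actually had to be run.
    """
    results: List[float] = []
    n_computed = 0
    for smiles in smiles_list:
        key = (calc_name, smiles)
        if key not in store:
            # Not performed before: run it and save the output.
            store[key] = calculate(smiles)
            n_computed += 1
        # Performed before (or just now): use the stored output.
        results.append(store[key])
    return results, n_computed


def count_carbons(smiles: str) -> float:
    """Toy calculation, not a ChemML call: count 'C' and 'c' characters.

    This is deliberately naive (it would also count the 'C' in 'Cl').
    """
    return float(sum(ch in "Cc" for ch in smiles))


if __name__ == "__main__":
    store: ResultStore = {}
    first, ran_first = run_batch(
        ["CCO", "c1ccccc1", "CCO"], "count_carbons", count_carbons, store
    )
    second, ran_second = run_batch(
        ["CCO", "CC(=O)O"], "count_carbons", count_carbons, store
    )
    print(first, ran_first)
    print(second, ran_second)
```

In this sketch the first batch runs only two calculations for three inputs, because the repeated SMILES is served from the store, and the second batch runs only one, because ethanol was already computed by the first batch.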