A Rigorous Evaluation Pipeline