FEP performance in terms of thermodynamic reversibility. (A) Convergence of the error-weighted average of \(\Delta\Delta G_R\) at different simulation lengths for the two force fields, CHARMM36 (FEPc, red) and Amber99sb*ILDN (FEPa, cyan). (B) Percentage of the dataset with \(\Delta\Delta G_R\) below 0.5 kcal/mol (red bars) and below 1 kcal/mol (blue bars) at different simulation lengths for the Amber99sb*ILDN force field.
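
As a rough illustration of the two quantities plotted in panels A and B, the sketch below computes an error-weighted average and the percentage of values under a threshold. It assumes inverse-variance weighting (\(w_i = 1/\sigma_i^2\)), which the caption does not specify, and it assumes the \(\Delta\Delta G_R\) values are supplied as non-negative magnitudes in kcal/mol. The function names and the sample numbers are hypothetical and are not taken from the dataset.

```python
def error_weighted_average(values, errors):
    """Weighted mean with weights 1/sigma^2 (assumed scheme, not confirmed by the caption)."""
    weights = [1.0 / (e * e) for e in errors]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


def percent_below(values, threshold):
    """Percentage of entries strictly below the threshold."""
    count = sum(1 for v in values if v < threshold)
    return 100.0 * count / len(values)


# Hypothetical Delta-Delta-G_R magnitudes and their uncertainties, in kcal/mol.
ddg_r = [0.2, 0.4, 0.8, 1.5]
sigma = [0.1, 0.2, 0.2, 0.4]

average = error_weighted_average(ddg_r, sigma)
share_half = percent_below(ddg_r, 0.5)  # panel B, red bars
share_one = percent_below(ddg_r, 1.0)   # panel B, blue bars
print(average, share_half, share_one)
```

Repeating the calculation for each simulation length would give one point per length in panel A and one pair of bars per length in panel B.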