Impact of simulation length and number of FEP windows on \(\Delta G\) accuracy. (A) Errors of the \(\Delta\Delta G_R\) calculations shown in Fig. 6 of the main text: red, the standard FEPc simulation setup (10 ns for 14 \(\lambda\)-windows); cyan, a longer simulation length (20 ns for 14 \(\lambda\)-windows); blue, a shorter simulation length with a larger number of FEP windows (5 ns for 28 \(\lambda\)-windows). (B) Overall FEPc performance after substituting the 15 calculations labeled in Fig. S9A with those obtained with the last setup (5 ns for 28 \(\lambda\)-windows, labeled mFEPc), with the dataset subdivided into the categories analyzed in Fig. 5 of the main text. (C) Impact of simulation length on the \(\Delta\Delta G_R\) calculations for the same 15 mutations with the 28 \(\lambda\)-window FEPc setup.