Understanding the Sources of Error in MBAR through Asymptotic Analysis
Many sampling strategies commonly used in molecular dynamics, such as umbrella sampling and alchemical free energy methods, draw samples from multiple thermodynamic states. The data are then typically recombined to construct estimates of free energies and ensemble averages using the Multistate Bennett Acceptance Ratio (MBAR) formalism. However, the error of the MBAR estimator is not well understood: previous error analyses of MBAR assumed independent samples and did not allow the total error to be attributed to individual thermodynamic states. In this work, we derive a novel central limit theorem for MBAR estimates. This central limit theorem yields an error estimator that can be decomposed into contributions from the individual Markov chains used to sample the states. We demonstrate the error estimator for an umbrella sampling calculation of the alanine dipeptide in two dimensions and an alchemical calculation of the hydration free energy of methane. In both cases, the states' individual contributions to the error provide insight into the sources of error in the simulations. Our numerical results demonstrate that the time required for the Markov chain to decorrelate in individual thermodynamic states contributes considerably to the total MBAR error. Moreover, they indicate that it may be possible to use these contributions to tune the sampling and improve the accuracy of MBAR calculations.
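For readers unfamiliar with how MBAR recombines data from several states, the following is a minimal sketch of the standard self-consistent MBAR iteration for reduced free energies, applied to a toy problem. It is not the error estimator derived in this work. The function name `mbar_free_energies`, the three harmonic states, and the use of independent Gaussian samples (rather than correlated Markov chains) are all assumptions made purely for illustration.

```python
import numpy as np

def mbar_free_energies(u_kn, N_k, tol=1e-10, max_iter=10000):
    """Hypothetical helper: solve the MBAR self-consistent equations.

    u_kn[k, n] is the reduced potential of pooled sample n evaluated in state k.
    N_k[k] is the number of samples drawn from state k.
    Returns reduced free energies f_k, shifted so that f_0 = 0.
    """
    f = np.zeros(u_kn.shape[0])
    log_N = np.log(N_k)
    for _ in range(max_iter):
        # log of the mixture denominator: log sum_k N_k exp(f_k - u_k(x_n))
        log_denom = np.logaddexp.reduce(log_N[:, None] + f[:, None] - u_kn, axis=0)
        # f_i = -log sum_n exp(-u_i(x_n)) / denominator(x_n)
        f_new = -np.logaddexp.reduce(-u_kn - log_denom[None, :], axis=1)
        f_new -= f_new[0]  # free energies are defined up to an additive constant
        if np.max(np.abs(f_new - f)) < tol:
            return f_new
        f = f_new
    return f

# Toy problem (assumption): three 1D harmonic states u_k(x) = 0.5 * k_k * x^2,
# sampled independently, so the exact answer is f_k - f_0 = 0.5 * ln(k_k / k_0).
rng = np.random.default_rng(0)
spring = np.array([1.0, 2.0, 4.0])
N_k = np.array([5000, 5000, 5000])
x = np.concatenate([rng.normal(0.0, 1.0 / np.sqrt(k), n) for k, n in zip(spring, N_k)])
u_kn = 0.5 * spring[:, None] * x[None, :] ** 2

f_hat = mbar_free_energies(u_kn, N_k)
f_exact = 0.5 * np.log(spring / spring[0])
print("MBAR estimate:", f_hat)
print("Exact values: ", f_exact)
```

Because the toy samples are independent, the scatter of `f_hat` around `f_exact` reflects only the number of samples and the overlap between states. In a real simulation the samples within each state are correlated, which is the situation the analysis above addresses.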