The mbar module: MBAR

The mbar module contains the MBAR class, which implements the multistate Bennett acceptance ratio (MBAR) method [1].

A module implementing the multistate Bennett acceptance ratio (MBAR) method for the analysis of equilibrium samples from multiple arbitrary thermodynamic states in computing equilibrium expectations, free energy differences, potentials of mean force, and entropy and enthalpy contributions.

Please reference the following if you use this code in your research:

[1] Shirts MR and Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 129:124105, 2008. http://dx.doi.org/10.1063/1.2978177

This module contains implementations of

  • MBAR - multistate Bennett acceptance ratio estimator
class pymbar.mbar.MBAR(u_kn, N_k, maximum_iterations=10000, relative_tolerance=1e-07, verbose=False, initial_f_k=None, solver_protocol=None, initialize='zeros', x_kindices=None, **kwargs)

Multistate Bennett acceptance ratio method (MBAR) for the analysis of multiple equilibrium samples.

Notes

Note that this method assumes the data are uncorrelated.

Correlated data must be subsampled to extract uncorrelated (effectively independent) samples.

References

[1] Shirts MR and Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 129:124105, 2008 http://dx.doi.org/10.1063/1.2978177

Initialize multistate Bennett acceptance ratio (MBAR) on a set of simulation data.

Upon initialization, the dimensionless free energies for all states are computed. This may take anywhere from seconds to minutes, depending upon the quantity of data. After initialization, the computed free energies may be obtained by a call to getFreeEnergyDifferences(), or expectation at any state of interest can be computed by calls to computeExpectations().

Parameters:

u_kn : np.ndarray, float, shape=(K, N_max)

u_kn[k,n] is the reduced potential energy of uncorrelated configuration n evaluated at state k.

u_kln : np.ndarray, float, shape (K, L, N_max)

If the simulation is in form u_kln[k,l,n] it is converted to u_kn format

u_kn = [ u_1(x_1) u_1(x_2) u_1(x_3) . . . u_1(x_n)
         u_2(x_1) u_2(x_2) u_2(x_3) . . . u_2(x_n)
                                    . . .
         u_k(x_1) u_k(x_2) u_k(x_3) . . . u_k(x_n)]

N_k : np.ndarray, int, shape=(K)

N_k[k] is the number of uncorrelated snapshots sampled from state k. Some may be zero, indicating that there are no samples from that state.

We assume that the states are ordered such that the first N_k are from the first state, the 2nd N_k the second state, and so forth. This only becomes important for BAR – MBAR does not care which samples are from which state. We should eventually allow this assumption to be overwritten by parameters passed from above, once u_kln is phased out.

maximum_iterations : int, optional

Set to limit the maximum number of iterations performed (default 1000)

relative_tolerance : float, optional

Set to determine the relative tolerance convergence criteria (default 1.0e-6)

verbosity : bool, optional

Set to True if verbose debug output is desired (default False)

initial_f_k : np.ndarray, float, shape=(K), optional

Set to the initial dimensionless free energies to use as a guess (default None, which sets all f_k = 0)

solver_protocol : list(dict) or None, optional, default=None

List of dictionaries to define a sequence of solver algorithms and options used to estimate the dimensionless free energies. See pymbar.mbar_solvers.solve_mbar() for details. If None, use the developers best guess at an appropriate algorithm.

The default will try to solve with an adaptive solver algorithm which alternates between self-consistent iteration and Newton-Raphson, where the method with the smallest gradient is chosen to improve numerical stability.

initialize : ‘zeros’ or ‘BAR’, optional, Default: ‘zeros’

If equal to ‘BAR’, use BAR between the pairwise state to initialize the free energies. Eventually, should specify a path; for now, it just does it zipping up the states.

The ‘BAR’ option works best when the states are ordered such that adjacent states maximize the overlap between states. Its up to the user to arrange the states in such an order, or at least close to such an order. If you are uncertain what the order of states should be, or if it does not make sense to think of states as adjacent, then choose ‘zeroes’.

(default: ‘zeros’, unless specific values are passed in.)

x_kindices

Which state is each x from? Usually doesn’t matter, but does for BAR. We assume the samples are in K order (the first N_k[0] samples are from the 0th state, the next N_k[1] samples from the 1st state, and so forth.

Notes

The reduced potential energy u_kn[k,n] = u_k(x_{ln}), where the reduced potential energy u_l(x) is defined (as in the text) by: u_k(x) = beta_k [ U_k(x) + p_k V(x) + mu_k' n(x) ] where

beta_k = 1/(kB T_k) is the inverse temperature of condition k, where kB is Boltzmann’s constant

U_k(x) is the potential energy function for state k

p_k is the pressure at state k (if an isobaric ensemble is specified)

V(x) is the volume of configuration x

mu_k is the M-vector of chemical potentials for the various species, if a (semi)grand ensemble is specified, and ' denotes transpose

n(x) is the M-vector of numbers of the various molecular species for configuration x, corresponding to the chemical potential components of mu_m.

x_n indicates that the samples are from k different simulations of the n states. These simulations need only be a subset of the k states.

The configurations x_ln must be uncorrelated. This can be ensured by subsampling a correlated timeseries with a period larger than the statistical inefficiency, which can be estimated from the potential energy timeseries {u_k(x_ln)}_{n=1}^{N_k} using the provided utility pymbar.timeseries.statisticalInefficiency(). See the help for this function for more information.

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
W_nk

Retrieve the weight matrix W_nk from the MBAR algorithm.

Necessary because they are stored internally as log weights.

Returns:

weights : np.ndarray, float, shape=(N, K)

NxK matrix of weights in the MBAR covariance and averaging formulas

computeCovarianceOfSums(d_ij, K, a)

We wish to calculate the variance of a weighted sum of free energy differences. for example var(\sum a_i df_i).

We explicitly lay out the calculations for four variables (where each variable is a logarithm of a partition function), then generalize.

The uncertainty in the sum of two weighted differences is

var(a1(f_i1 - f_j1) + a2(f_i2 - f_j2)) =
    a1^2 var(f_i1 - f_j1) +
    a2^2 var(f_i2 - f_j2) +
    2 a1 a2 cov(f_i1 - f_j1, f_i2 - f_j2)
cov(f_i1 - f_j1, f_i2 - f_j2) =
    cov(f_i1,f_i2) -
    cov(f_i1,f_j2) -
    cov(f_j1,f_i2) +
    cov(f_j1,f_j2)

call:

f_i1 = a
f_j1 = b
f_i2 = c
f_j2 = d
a1^2 var(a-b) + a2^2 var(c-d) + 2a1a2 cov(a-b,c-d)

we want 2cov(a-b,c-d) = 2cov(a,c)-2cov(a,d)-2cov(b,c)+2cov(b,d), since var(x-y) = var(x) + var(y) - 2cov(x,y), then, 2cov(x,y) = -var(x-y) + var(x) + var(y). So, we get

2cov(a,c) = -var(a-c) + var(a) + var(c)
-2cov(a,d) = +var(a-d) - var(a) - var(d)
-2cov(b,c) = +var(b-c) - var(b) - var(c)
2cov(b,d) = -var(b-d) + var(b) + var(d)

adding up, finally :

2cov(a-b,c-d) = 2cov(a,c)-2cov(a,d)-2cov(b,c)+2cov(b,d) =
    - var(a-c) + var(a-d) + var(b-c) - var(b-d)

a1^2 var(a-b)+a2^2 var(c-d)+2a1a2cov(a-b,c-d) =
    a1^2 var(a-b)+a2^2 var(c-d)+a1a2 [-var(a-c)+var(a-d)+var(b-c)-var(b-d)]

var(a1(f_i1 - f_j1) + a2(f_i2 - f_j2)) =
    a1^2 var(f_i1 - f_j1) + a2^2 var(f_i2 - f_j2) + 2a1 a2 cov(f_i1 - f_j1, f_i2 - f_j2)
= a1^2 var(f_i1 - f_j1) + a2^2 var(f_i2 - f_j2) + a1 a2 [-var(f_i1 - f_i2) + var(f_i1 - f_j2) + var(f_j1-f_i2) - var(f_j1 - f_j2)]

assume two arrays of free energy differences, and and array of constant vectors a. we want the variance var(\sum_k a_k (f_i,k - f_j,k)) Each set is separated from the other by an offset K same process applies with the sum, with the single var terms and the pair terms

Parameters:

d_ij : a matrix of standard deviations of the quantities f_i - f_j

K : The number of states in each ‘chunk’, has to be constant

outputs : KxK variance matrix for the sums or differences \sum a_i df_i

computeEffectiveSampleNumber(verbose=False)

Compute the effective sample number of each state; essentially, an estimate of how many samples are contributing to the average at given state. See pymbar/examples for a demonstration.

It also counts the efficiency of the sampling, which is simply the ratio of the effective number of samples at a given state to the total number of samples collected. This is printed in verbose output, but is not returned for now.

Parameters:

verbose : print out information about the effective number of samples

Returns:

N_eff : np.ndarray, float, shape=(K)

estimated number of samples contributing to estimates at each state i. An estimate to how many samples collected just at state i would result in similar statistical efficiency as the MBAR simulation. Valid for both sampled states (in which the weight will be greater than N_k[i], and unsampled states.

Notes

Using Kish (1965) formula (Kish, Leslie (1965). Survey Sampling. New York: Wiley)

As the weights become more concentrated in fewer observations, the effective sample size shrinks. from http://healthcare-economist.com/2013/08/22/effective-sample-size/

effective number of samples contributing to averages carried out at state i
    =  (\sum_{n=1}^N w_in)^2 / \sum_{n=1}^N w_in^2
    =  (\sum_{n=1}^N w_in^2)^-1

the effective sample number is most useful to diagnose when there are only a few samples contributing to the averages.

Examples

>>> from pymbar import testsystems
>>> [x_kn, u_kln, N_k, s_n] = testsystems.HarmonicOscillatorsTestCase().sample()
>>> mbar = MBAR(u_kln, N_k)
>>> N_eff = mbar.computeEffectiveSampleNumber()
computeEntropyAndEnthalpy(u_kn=None, uncertainty_method=None, verbose=False, warning_cutoff=1e-10)

Decompose free energy differences into enthalpy and entropy differences.

Compute the decomposition of the free energy difference between states 1 and N into reduced free energy differences, reduced potential (enthalpy) differences, and reduced entropy (S/k) differences.

Parameters:

u_kn : float, NxK array

The energies of the state that are being used.

uncertainty_method : string , optional

Choice of method used to compute asymptotic covariance method, or None to use default See help for computeAsymptoticCovarianceMatrix() for more information on various methods. (default: None)

warning_cutoff : float, optional

Warn if squared-uncertainty is negative and larger in magnitude than this number (default: 1.0e-10)

Returns:

result_vals : dictionary

Keys in the result_vals dictionary:

‘Delta_f’ : np.ndarray, float, shape=(K, K)

results[‘Delta_f’] is the dimensionless free energy difference f_j - f_i

‘dDelta_f’ : np.ndarray, float, shape=(K, K)

uncertainty in results[‘Delta_f’]

‘Delta_u’ : np.ndarray, float, shape=(K, K)

results[‘Delta_u’] is the reduced potential energy difference u_j - u_i

‘dDelta_u’ : np.ndarray, float, shape=(K, K)

uncertainty in results[‘Delta_u’]

‘Delta_s’ : np.ndarray, float, shape=(K, K)

results[‘Delta_s’] is the reduced entropy difference S/k between states i and j (s_j - s_i)

‘dDelta_s’ : np.ndarray, float, shape=(K, K)

uncertainty in results[‘Delta_s’]

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> results = mbar.computeEntropyAndEnthalpy()
computeExpectations(A_n, u_kn=None, output='averages', state_dependent=False, compute_uncertainty=True, uncertainty_method=None, warning_cutoff=1e-10, return_theta=False)

Compute the expectation of an observable of a phase space function.

Compute the expectation of an observable of a single phase space function A(x) at all states where potentials are generated.

Parameters:

A_n : np.ndarray, float

A_n (N_max np float64 array) - A_n[n] = A(x_n)

u_kn : np.ndarray

u_kn (energies of state of interest length N) default is self.u_kn

output : string, optional

‘averages’ outputs expectations of observables and ‘differences’ outputs a matrix of differences in the observables.

compute_uncertainty : bool, optional

If False, the uncertainties will not be computed (default: True)

uncertainty_method : string, optional

Choice of method used to compute asymptotic covariance method, or None to use default See help for _computeAsymptoticCovarianceMatrix() for more information on various methods. (default: None)

warning_cutoff : float, optional

Warn if squared-uncertainty is negative and larger in magnitude than this number (default: 1.0e-10)

state_dependent: bool, whether the expectations are state-dependent.

Returns:

result_vals : dictionary

Possible keys in the result_vals dictionary:

‘mu’ : np.ndarray, float

if output is ‘averages’ A_i (K np float64 array) - A_i[i] is the estimate for the expectation of A(x) for state i. if output is ‘differences’

‘sigma’ : np.ndarray, float

dA_i (K np float64 array) - dA_i[i] is uncertainty estimate (one standard deviation) for A_i[i] or dA_ij (K np float64 array) - dA_ij[i,j] is uncertainty estimate (one standard deviation) for the difference in A beteen i and j or None, if compute_uncertainty is False.

‘Theta’ ((KxK np float64 array): Covariance matrix of log weights

References

See Section IV of [1].

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> A_n = x_n
>>> results = mbar.computeExpectations(A_n)
>>> A_n = u_kn[0,:]
>>> results = mbar.computeExpectations(A_n, output='differences')
computeExpectationsInner(A_n, u_ln, state_map, uncertainty_method=None, warning_cutoff=1e-10, return_theta=False)

Compute the expectations of multiple observables of phase space functions in multiple states.

Compute the expectations of multiple observables of phase space functions [A_0(x),A_1(x),…,A_i(x)] along with the covariances of their estimates at multiple states.

intended as an internal function to keep all the optimized and robust expectation code in one place, but will leave it open to allow for later modifications

It calculates all input observables at all states which are specified by the list of states in the state list.

Parameters:

A_n : np.ndarray, float, shape=(I, N)

A_in[i,n] = A_i(x_n), the value of phase observable i for configuration n

u_ln : np.ndarray, float, shape=(L, N)

u_ln[l,n] is the reduced potential of configuration n at state l if u_ln = None, we use self.u_kn

state_map : np.ndarray, int, shape (2,NS) or shape(1,NS)

If state_map has only one dimension where NS is the total number of states we want to simulate things a. The list will be of the form [[0,1,2],[0,1,1]]. This particular example indicates we want to output the properties of three observables total: the first property A[0] at the 0th state, the 2nd property A[1] at the 1th state, and the 2nd property A[1] at the 2nd state. This allows us to tailor our output to a large number of different situations.

uncertainty_method : string, optional

Choice of method used to compute asymptotic covariance method, or None to use default See help for computeAsymptoticCovarianceMatrix() for more information on various methods. (default: None)

warning_cutoff : float, optional

Warn if squared-uncertainty is negative and larger in magnitude than this number (default: 1.0e-10)

return_theta : bool, optional

Whether or not to return the theta matrix. Can be useful for complicated differences of observables.

Returns:

result_vals : dictionary

Possible keys in the result_vals dictionary:

‘observables’: np.ndarray, float, shape = (S)

results_vals[‘observables’][i] is the estimate for the expectation of A_state_map[i](x) at the state specified by u_n[state_map[i],:]

‘Theta’ : np.ndarray, float, shape = (K+len(state_list), K+len(state_list)) the covariance matrix of log weights.

‘Amin’ : np.ndarray, float, shape = (S), needed for reconstructing the covariance one level up.

‘f’ : np.ndarray, float, shape = (K+len(state_list)), ‘free energies’ of the new states (i.e. ln (<A>-Amin+1)) as the log form is easier to work with.

Notes

Situations this will be used for :

  • Multiple observables, single state (called though computeMultipleExpectations)

  • Single observable, multiple states (called through computeExpectations)

    This has two cases: observables that don’t change with state, and observables that do change with state. For example, the set of energies at state k consist in energy function of state 1 evaluated at state 1, energies of state 2 evaluated at state 2, and so forth.

  • Computing only free energies at new states.

  • Would require additional work to work with potentials of mean force, because we need to ignore the functions that are zero when integrating.

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> A_n = np.array([x_n,x_n**2,x_n**3])
>>> u_n = u_kn[:2,:]
>>> state_map = np.array([[0,0],[1,0],[2,0],[2,1]],int)
>>> results = mbar.computeExpectationsInner(A_n, u_n, state_map)
computeMultipleExpectations(A_in, u_n, compute_uncertainty=True, compute_covariance=False, uncertainty_method=None, warning_cutoff=1e-10, return_theta=False)

Compute the expectations of multiple observables of phase space functions.

Compute the expectations of multiple observables of phase space functions [A_0(x),A_1(x),…,A_i(x)] at a single state, along with the error in the estimates and the uncertainty in the estimates. The state is specified by the choice of u_n, which is the energy of the n samples evaluated at a the chosen state.

Returns:

result_vals : dictionary

Possible keys in the result_vals dictionary:

‘mu’ : np.ndarray, float, shape=(I)

result_vals[‘mu’] is the estimate for the expectation of A_i(x) at the state specified by u_kn

‘sigma’ : np.ndarray, float, shape = (I)

result_vals[‘sigma’] is the uncertainty in the expectation of A_state_map[i](x) at the state specified by u_n[state_map[i],:] or None if compute_uncertainty is False

‘covariances’ : np.ndarray, float, shape=(I, I)

result_vals[‘covariances’] is the COVARIANCE in the estimates of A_i[i] and A_i[j]: we can’t actually take a square root or None if compute_covariance is False

‘Theta’: np.ndarray, float, shape=(I, I), covariances of the log weights, useful for some additional calculations.

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> A_in = np.array([x_n,x_n**2,x_n**3])
>>> u_n = u_kn[0,:]
>>> results = mbar.computeMultipleExpectations(A_in, u_kn)
computeOverlap()

Compute estimate of overlap matrix between the states.

Returns:

result_vals : dictonary

Possible keys in the result_vals dictionary:

‘scalar’ : np.ndarray, float, shape=(K, K)

One minus the largest nontrival eigenvalue (largest is 1 or -1)

‘eigenvalues’ : np.ndarray, float, shape=(K)

The sorted (descending) eigenvalues of the overlap matrix.

‘O’ : np.ndarray, float, shape=(K, K)

Estimated state overlap matrix: O[i,j] is an estimate of the probability of observing a sample from state i in state j

Notes

W.T * W pprox \int (p_i p_j /\sum_k N_k p_k)^2 \sum_k N_k p_k dq^N
    = \int (p_i p_j /\sum_k N_k p_k) dq^N

Multiplying elementwise by N_i, the elements of row i give the probability for a sample from state i being observed in state j.

Examples

>>> from pymbar import testsystems
>>> (x_kn, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> results = mbar.computeOverlap()
computePMF(u_n, bin_n, nbins, uncertainties='from-lowest', pmf_reference=None)

Compute the free energy of occupying a number of bins.

This implementation computes the expectation of an indicator-function observable for each bin.

Parameters:

u_n : np.ndarray, float, shape=(N)

u_n[n] is the reduced potential energy of snapshot n of state k for which the PMF is to be computed.

bin_n : np.ndarray, float, shape=(N)

bin_n[n] is the bin index of snapshot n of state k. bin_n can assume a value in range(0,nbins)

nbins : int

The number of bins

uncertainties : string, optional

Method for reporting uncertainties (default: ‘from-lowest’)

  • ‘from-lowest’ - the uncertainties in the free energy difference with lowest point on PMF are reported
  • ‘from-specified’ - same as from lowest, but from a user specified point
  • ‘from-normalization’ - the normalization sum_i p_i = 1 is used to determine uncertainties spread out through the PMF
  • ‘all-differences’ - the nbins x nbins matrix df_ij of uncertainties in free energy differences is returned instead of df_i

pmf_reference : int, optional

the reference state that is zeroed when uncertainty = ‘from-specified’

Returns:

result_vals : dictionary

Possible keys in the result_vals dictionary:

‘f_i’ : np.ndarray, float, shape=(K)

result_vals[‘f_i’][i] is the dimensionless free energy of state i, relative to the state of lowest free energy

‘df_i’ : np.ndarray, float, shape=(K)

result_vals[‘df_i’][i] is the uncertainty in the difference of f_i with respect to the state of lowest free energy

Notes

  • All bins must have some samples in them from at least one of the states – this will not work if bin_n.sum(0) == 0. Empty bins should be removed before calling computePMF().
  • This method works by computing the free energy of localizing the system to each bin for the given potential by aggregating the log weights for the given potential.
  • To estimate uncertainties, the NxK weight matrix W_nk is augmented to be Nx(K+nbins) in order to accomodate the normalized weights of states where
  • the potential is given by u_kn within each bin and infinite potential outside the bin. The uncertainties with respect to the bin of lowest free energy are then computed in the standard way.

Examples

>>> # Generate some test data
>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> # Initialize MBAR on data.
>>> mbar = MBAR(u_kn, N_k)
>>> # Select the potential we want to compute the PMF for (here, condition 0).
>>> u_n = u_kn[0, :]
>>> # Sort into nbins equally-populated bins
>>> nbins = 10 # number of equally-populated bins to use
>>> import numpy as np
>>> N_tot = N_k.sum()
>>> x_n_sorted = np.sort(x_n) # unroll to n-indices
>>> bins = np.append(x_n_sorted[0::int(N_tot/nbins)], x_n_sorted.max()+0.1)
>>> bin_widths = bins[1:] - bins[0:-1]
>>> bin_n = np.zeros(x_n.shape, np.int64)
>>> bin_n = np.digitize(x_n, bins) - 1
>>> # Compute PMF for these unequally-sized bins.
>>> results = mbar.computePMF(u_n, bin_n, nbins)
>>> # If we want to correct for unequally-spaced bins to get a PMF on uniform measure
>>> f_i_corrected = results['f_i'] - np.log(bin_widths)
computePerturbedFreeEnergies(u_ln, compute_uncertainty=True, uncertainty_method=None, warning_cutoff=1e-10)

Compute the free energies for a new set of states.

Here, we desire the free energy differences among a set of new states, as well as the uncertainty estimates in these differences.

Parameters:

u_ln : np.ndarray, float, shape=(L, Nmax)

u_ln[l,n] is the reduced potential energy of uncorrelated configuration n evaluated at new state k. Can be completely indepednent of the original number of states.

compute_uncertainty : bool, optional, default=True

If False, the uncertainties will not be computed (default: True)

uncertainty_method : string, optional

Choice of method used to compute asymptotic covariance method, or None to use default See help for computeAsymptoticCovarianceMatrix() for more information on various methods. (default: None)

warning_cutoff : float, optional

Warn if squared-uncertainty is negative and larger in magnitude than this number (default: 1.0e-10)

Returns:

result_vals : dictionary

Possible keys in the result_vals dictionary:

‘Delta_f’ : np.ndarray, float, shape=(L, L)

result_vals[‘Delta_f’] = f_j - f_i, the dimensionless free energy difference between new states i and j

‘dDelta_f’ : np.ndarray, float, shape=(L, L)

result_vals[‘dDelta_f’] is the estimated statistical uncertainty in result_vals[‘Delta_f’] or not included if compute_uncertainty is False

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> results = mbar.computePerturbedFreeEnergies(u_kn)
getFreeEnergyDifferences(compute_uncertainty=True, uncertainty_method=None, warning_cutoff=1e-10, return_theta=False)

Get the dimensionless free energy differences and uncertainties among all thermodynamic states.

Parameters:

compute_uncertainty : bool, optional

If False, the uncertainties will not be computed (default: True)

uncertainty_method : string, optional

Choice of method used to compute asymptotic covariance method, or None to use default. See help for computeAsymptoticCovarianceMatrix() for more information on various methods. (default: svd)

warning_cutoff : float, optional

Warn if squared-uncertainty is negative and larger in magnitude than this number (default: 1.0e-10)

return_theta : bool, optional

Whether or not to return the theta matrix. Can be useful for complicated differences.

Returns:

result_vals : dictionary

Possible keys in the result_vals dictionary:

‘Delta_f’ : np.ndarray, float, shape=(K, K)

Deltaf_ij[i,j] is the estimated free energy difference

‘dDelta_f’ : np.ndarray, float, shape=(K, K)

If compute_uncertainty==True, dDeltaf_ij[i,j] is the estimated statistical uncertainty (one standard deviation) in Deltaf_ij[i,j]. Otherwise not included.

‘Theta’ : np.ndarray, float, shape=(K, K)

The theta_matrix if return_theta==True, otherwise not included.

Notes

Computation of the covariance matrix may take some time for large K.

The reported statistical uncertainty should, in the asymptotic limit, reflect one standard deviation for the normal distribution of the estimate. The true free energy difference should fall within the interval [-df, +df] centered on the estimate 68% of the time, and within the interval [-2 df, +2 df] centered on the estimate 95% of the time. This will break down in cases where the number of samples is not large enough to reach the asymptotic normal limit.

See Section III of Reference [1].

Examples

>>> from pymbar import testsystems
>>> (x_n, u_kn, N_k, s_n) = testsystems.HarmonicOscillatorsTestCase().sample(mode='u_kn')
>>> mbar = MBAR(u_kn, N_k)
>>> results = mbar.getFreeEnergyDifferences()
getWeights()

Retrieve the weight matrix W_nk from the MBAR algorithm.

Necessary because they are stored internally as log weights.

Returns:

weights : np.ndarray, float, shape=(N, K)

NxK matrix of weights in the MBAR covariance and averaging formulas