Utilities : pymbar.utils

Miscellaneous helper functions used by other parts of the pymbar library.

exception pymbar.utils.BoundsError

Could not determine bounds on free energy

exception pymbar.utils.ConvergenceError

Convergence could not be achieved.

exception pymbar.utils.DataError

Data is inconsistent.

exception pymbar.utils.ParameterError

An error in the input parameters has been detected.

exception pymbar.utils.TypeCastPerformanceWarning

pymbar.utils.check_w_normalized(W, N_k, tolerance=0.0001)

Check that the weight matrix W is properly normalized. The sum over n should be 1, and the sum over k, weighted by N_k, should also be 1.

  • W (np.ndarray, shape=(N, K), dtype='float') – The normalized weight matrix for snapshots and states. W[n, k] is the weight of snapshot n in state k.

  • N_k (np.ndarray, shape=(K), dtype='int') – N_k[k] is the number of samples from state k.

  • tolerance (float, optional, default=1.0e-4) – Tolerance for checking equality of sums


None – Returns a None object if test passes

Return type:

None

ParameterError – Appropriate message if W is not normalized within tolerance.
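The two normalization conditions can be illustrated with plain NumPy. This is a minimal sketch, not the pymbar implementation: the helper name weights_are_normalized is hypothetical, and it assumes "the sum over k by N_k" means the N_k-weighted sum of each row of W.

```python
import numpy as np

def weights_are_normalized(W, N_k, tolerance=1.0e-4):
    """Hypothetical helper mirroring the two documented conditions."""
    W = np.asarray(W, dtype=float)
    N_k = np.asarray(N_k)
    # Condition 1: each column (state k) sums to 1 over the N snapshots.
    columns_ok = np.all(np.abs(W.sum(axis=0) - 1.0) <= tolerance)
    # Condition 2 (assumed reading): each row (snapshot n), weighted by N_k,
    # sums to 1 over the K states.
    rows_ok = np.all(np.abs((W * N_k).sum(axis=1) - 1.0) <= tolerance)
    return bool(columns_ok and rows_ok)

# Toy case: K = 2 identical states, so every snapshot has weight 1/N in each state.
N_k = np.array([3, 2])
N = int(N_k.sum())
W = np.full((N, len(N_k)), 1.0 / N)

print(weights_are_normalized(W, N_k))        # True
print(weights_are_normalized(2.0 * W, N_k))  # False
```

The sketch returns a boolean for convenience, whereas the documented function returns None on success and raises ParameterError on failure.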

pymbar.utils.ensure_type(val, dtype, ndim, name, length=None, can_be_none=False, shape=None, warn_on_cast=True, add_newaxis_on_deficient_ndim=False)

Typecheck the size, shape and dtype of a numpy array, with optional casting.

  • val ({np.ndarray, None}) – The array to check

  • dtype ({np.dtype, str}) – The dtype you’d like the array to have

  • ndim (int) – The number of dimensions you’d like the array to have

  • name (str) – name of the array. This is used when throwing exceptions, so that we can describe to the user which array is messed up.

  • length (int, optional) – How long should the array be?

  • can_be_none (bool) – Is val == None acceptable?

  • shape (tuple, optional) – What should the shape of the array be? If the provided tuple has Nones in it, those are interpreted as matching any length in that dimension. For example, the shape spec (None, None, 3) ensures that the last dimension is of length three without constraining the first two dimensions.

  • warn_on_cast (bool, default=True) – Raise a warning when the dtypes don’t match and a cast is done.

  • add_newaxis_on_deficient_ndim (bool, default=False) – Add a new axis to the beginning of the array if the number of dimensions is deficient by one compared to your specification. For instance, if you’re trying to get out an array of ndim == 3, but the user provides an array of shape == (10, 10), a new axis will be created with length 1 in front, so that the return value is of shape (1, 10, 10).


The returned value will always be C-contiguous.


typechecked_val – If val=None and can_be_none=True, then this will return None. Otherwise, it will return val (or a copy of val). If the dtype wasn’t right, it’ll be cast to the right dtype. If the array was not C-contiguous, it’ll be copied as well.

Return type:

np.ndarray, None
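The behaviours described above (wildcard shape matching, promotion of a deficient dimension, and casting to a C-contiguous array) can be sketched with NumPy alone. This is an illustration under stated assumptions, not the pymbar code: shape_matches is a hypothetical helper, and the array names are made up for the example.

```python
import numpy as np

def shape_matches(actual, spec):
    """Hypothetical helper: True if `actual` fits `spec`; None matches any length."""
    if len(actual) != len(spec):
        return False
    return all(s is None or a == s for a, s in zip(actual, spec))

# A shape spec of (None, None, 3) only constrains the last dimension.
print(shape_matches((5, 7, 3), (None, None, 3)))  # True
print(shape_matches((5, 7, 4), (None, None, 3)))  # False

# An array that is one dimension short of a requested ndim == 3.
frame = np.zeros((10, 10), dtype=np.float32)
promoted = frame[np.newaxis, ...]  # new length-1 axis in front
print(promoted.shape)              # (1, 10, 10)

# Cast to the requested dtype and guarantee C-contiguous memory layout.
checked = np.ascontiguousarray(promoted, dtype=np.float64)
print(checked.dtype, checked.flags["C_CONTIGUOUS"])
```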

pymbar.utils.kln_to_kn(kln, N_k=None, cleanup=False)

Convert a KxLxN_max array to an LxN array

  • kln (np.ndarray, float, shape=(KxLxN_max)) – the array to convert

  • N_k (np.array, optional) – the N_k array (number of samples from each state) used with the previous format

  • cleanup (bool, optional) – optional command to clean up, since kln can get very large



Return type:

np.ndarray, float, shape=(LxN)
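A minimal sketch of this reshaping, assuming that only the first N_k[k] entries along the last axis are valid samples from state k and that a missing N_k means every state has N_max samples. The function name kln_to_kn_sketch is hypothetical and the code is not taken from pymbar.

```python
import numpy as np

def kln_to_kn_sketch(kln, N_k=None):
    """Flatten a (K, L, N_max) array into (L, N), with N = sum(N_k)."""
    kln = np.asarray(kln, dtype=float)
    K, L, N_max = kln.shape
    if N_k is None:
        # Assumption: every state k contributed N_max samples.
        N_k = np.full(K, N_max, dtype=int)
    # Keep the first N_k[k] samples of each origin state k, in state order,
    # and join them along the sample axis.
    blocks = [kln[k, :, :N_k[k]] for k in range(K)]
    return np.concatenate(blocks, axis=1)

# K = 2 origin states, L = 2 evaluated states, N_max = 3.
u_kln = np.arange(12).reshape(2, 2, 3)
u_kn = kln_to_kn_sketch(u_kln, N_k=[2, 3])
print(u_kn.shape)  # (2, 5)
print(u_kn)
```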

pymbar.utils.kn_to_n(kn, N_k=None, cleanup=False)

Convert KxN_max array to N array

  • kn (np.ndarray, float, shape=(KxN_max)) – the array to convert

  • N_k (np.array, optional) – the N_k array (number of samples from each state) used with the previous format

  • cleanup (bool, optional) – optional command to clean up, since kn can get very large



Return type:

np.ndarray, float, shape=(N)
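The same idea in one dimension less, again as a hedged sketch: kn_to_n_sketch is a hypothetical name, and the code assumes that row k holds N_k[k] valid samples followed by padding.

```python
import numpy as np

def kn_to_n_sketch(kn, N_k=None):
    """Flatten a (K, N_max) array into a length-N vector, with N = sum(N_k)."""
    kn = np.asarray(kn, dtype=float)
    K, N_max = kn.shape
    if N_k is None:
        # Assumption: every state k contributed N_max samples.
        N_k = np.full(K, N_max, dtype=int)
    # Drop the padding in each row and concatenate the rows in state order.
    return np.concatenate([kn[k, :N_k[k]] for k in range(K)])

# Row 0 has two valid samples (the trailing 0.0 is padding); row 1 has three.
padded = np.array([[1.0, 2.0, 0.0],
                   [3.0, 4.0, 5.0]])
flat = kn_to_n_sketch(padded, N_k=[2, 3])
print(flat)  # [1. 2. 3. 4. 5.]
```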

pymbar.utils.logsumexp(a, axis=None, b=None, use_numexpr=True)

Compute the log of the sum of exponentials of input elements.

  • a (array_like) – Input array.

  • axis (None or int, optional, default=None) – Axis or axes over which the sum is taken. By default axis is None, and all elements are summed.

  • b (array-like, optional) – Scaling factor for exp(a); must be of the same shape as a or broadcastable to a.

  • use_numexpr (bool, optional, default=True) – If True, use the numexpr library to speed up the calculation, which can give a 2-4X speedup when working with large arrays.


res – The result, log(sum(exp(a))) calculated in a numerically more stable way. If b is given then log(sum(b*exp(a))) is returned.

Return type:


See also

numpy.logaddexp, numpy.logaddexp2, scipy.special.logsumexp


This is based on scipy.special.logsumexp but with optional numexpr support for improved performance.
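The "numerically more stable way" usually refers to subtracting the maximum before exponentiating, so that exp never overflows. The sketch below shows that technique with NumPy only; logsumexp_sketch is a hypothetical name, it assumes finite inputs, and it does not reproduce the numexpr path of the pymbar function.

```python
import numpy as np

def logsumexp_sketch(a, axis=None, b=None):
    """log(sum(b * exp(a))) computed by shifting with max(a) to avoid overflow."""
    a = np.asarray(a, dtype=float)
    a_max = np.max(a, axis=axis, keepdims=True)
    terms = np.exp(a - a_max)  # every exponent is <= 0, so no overflow
    if b is not None:
        terms = np.asarray(b, dtype=float) * terms
    total = np.sum(terms, axis=axis, keepdims=True)
    # Add the shift back after taking the log, then drop the reduced axes.
    return np.squeeze(np.log(total) + a_max, axis=axis)

# exp(1000) overflows in double precision, but the shifted form does not.
big = np.array([1000.0, 1000.0])
print(logsumexp_sketch(big))  # approximately 1000 + log(2)

# With scaling factors b: log(2*exp(0) + 1*exp(0)) = log(3).
print(logsumexp_sketch([0.0, 0.0], b=[2.0, 1.0]))
```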