molecular_simulations.simulate.free_energy module¶
Free energy calculations using Empirical Valence Bond (EVB) methods.
- class molecular_simulations.simulate.free_energy.PMFResult(bin_centers, pmf, pmf_uncertainty, free_energies, free_energy_uncertainty)[source]¶
Bases:
objectResults from PMF calculation using MBAR.
- bin_centers¶
Reaction coordinate values at bin centers (nm).
- pmf¶
Free energy values relative to minimum (kJ/mol).
- pmf_uncertainty¶
Standard error of PMF estimates (kJ/mol).
- free_energies¶
Raw free energies for each window from MBAR (kJ/mol).
- free_energy_uncertainty¶
Uncertainty in window free energies (kJ/mol).
- __init__(bin_centers, pmf, pmf_uncertainty, free_energies, free_energy_uncertainty)¶
- class molecular_simulations.simulate.free_energy.ConvergenceResult(window_idx, mean_rc, sem, n_blocks, block_means, is_converged)[source]¶
Bases:
objectResults from convergence analysis.
- window_idx¶
Window index.
- mean_rc¶
Mean reaction coordinate value.
- sem¶
Standard error of the mean.
- n_blocks¶
Number of blocks used in analysis.
- block_means¶
Mean RC for each block.
- is_converged¶
Whether the window appears converged (SEM < threshold).
- __init__(window_idx, mean_rc, sem, n_blocks, block_means, is_converged)¶
- class molecular_simulations.simulate.free_energy.OverlapResult(overlap_matrix, min_overlap, problem_pairs)[source]¶
Bases:
objectResults from window overlap analysis.
- overlap_matrix¶
Pairwise overlap between adjacent windows.
- min_overlap¶
Minimum overlap value (should be > 0.03 for MBAR).
- problem_pairs¶
List of window pairs with insufficient overlap.
- __init__(overlap_matrix, min_overlap, problem_pairs)¶
- class molecular_simulations.simulate.free_energy.EquilibrationResult(window_idx, t0, g, n_effective, fraction_discarded)[source]¶
Bases:
objectResults from equilibration detection.
- window_idx¶
Window index.
- t0¶
Index of first equilibrated frame.
- g¶
Statistical inefficiency.
- n_effective¶
Effective number of uncorrelated samples.
- fraction_discarded¶
Fraction of trajectory discarded as equilibration.
- __init__(window_idx, t0, g, n_effective, fraction_discarded)¶
- class molecular_simulations.simulate.free_energy.EVBAnalysisResult(pmf, convergence, overlap, equilibration, rc_data, temperature, k_umbrella)[source]¶
Bases:
objectComprehensive results from EVB free energy analysis.
- pmf¶
PMF calculation results.
- convergence¶
Per-window convergence analysis.
- overlap¶
Window overlap analysis.
- equilibration¶
Per-window equilibration detection.
- rc_data¶
Raw reaction coordinate data per window (after equilibration).
- temperature¶
Temperature used for analysis (K).
- k_umbrella¶
Umbrella force constant used (kJ/mol/nm²).
- convergence: list[ConvergenceResult]¶
- overlap: OverlapResult¶
- equilibration: list[EquilibrationResult]¶
- __init__(pmf, convergence, overlap, equilibration, rc_data, temperature, k_umbrella)¶
- class molecular_simulations.simulate.free_energy.EVBAnalyzer(log_path, log_prefix, k_umbrella, rc0_values, output_path=None)[source]¶
Bases:
objectStandalone analyzer for existing EVB umbrella sampling data.
Use this class when you have already run EVB simulations (possibly across multiple HPC jobs) and want to analyze the results without re-instantiating the full EVB class with Parsl configuration.
This class can: - Load RC data from log files - Detect equilibration - Check convergence - Analyze window overlap - Compute PMF using MBAR or WHAM
Example
>>> # Analyze existing EVB run >>> analyzer = EVBAnalyzer( ... log_path=Path("/scratch/evb_run/logs"), ... log_prefix="reactant", ... k_umbrella=160000.0, # Must match what was used in simulation ... rc0_values=np.linspace(-0.2, 0.2, 50), # Window centers ... ) >>> result = analyzer.run_full_analysis(temperature=300.0) >>> print(f"Barrier: {result.pmf.pmf.max():.2f} kJ/mol")
>>> # Or load from a metadata file saved during simulation >>> analyzer = EVBAnalyzer.from_metadata("/scratch/evb_run/evb_metadata.toml") >>> result = analyzer.run_full_analysis()
Initialize the EVB analyzer.
- Parameters:
log_path (
Path) – Directory containing RC log files (e.g., reactant_0.log, …).log_prefix (
str) – Prefix for log file names (e.g., ‘reactant’ for reactant_0.log).k_umbrella (
float) – Umbrella force constant in kJ/mol/nm². Must match simulation.rc0_values (
ndarray|list[int|float]) – Array of target RC values for each window. Must match simulation.output_path (
Path|None) – Directory for saving results. Defaults to log_path.
- __init__(log_path, log_prefix, k_umbrella, rc0_values, output_path=None)[source]¶
Initialize the EVB analyzer.
- Parameters:
log_path (
Path) – Directory containing RC log files (e.g., reactant_0.log, …).log_prefix (
str) – Prefix for log file names (e.g., ‘reactant’ for reactant_0.log).k_umbrella (
float) – Umbrella force constant in kJ/mol/nm². Must match simulation.rc0_values (
ndarray|list[int|float]) – Array of target RC values for each window. Must match simulation.output_path (
Path|None) – Directory for saving results. Defaults to log_path.
- classmethod from_metadata(metadata_path)[source]¶
Create analyzer from a metadata TOML file.
The metadata file should contain the parameters used during simulation. This is useful for reproducibility and avoiding parameter mismatches.
- Parameters:
metadata_path (
Path) – Path to TOML metadata file.- Return type:
- Returns:
Configured EVBAnalyzer instance.
- Example metadata.toml:
[evb] log_path = “/scratch/evb_run/logs” log_prefix = “reactant” k_umbrella = 160000.0 rc0_values = [-0.2, -0.19, …, 0.2]
- classmethod from_evb_instance(evb)[source]¶
Create analyzer from an existing EVB instance.
Useful when you want to decouple analysis from the simulation object.
- Parameters:
evb (
EVB) – Existing EVB instance.- Return type:
- Returns:
EVBAnalyzer with matching parameters.
- load_rc_data()[source]¶
Load reaction coordinate data from all window log files.
- Return type:
- Returns:
List of RC arrays, one per window.
- Raises:
FileNotFoundError – If any log file is missing.
- get_available_windows()[source]¶
Detect which window log files are available.
Useful for checking progress of incomplete runs.
- check_convergence(rc_data, block_size=None, sem_threshold=0.01)[source]¶
Check convergence of each window using block averaging.
- Return type:
- analyze_overlap(rc_data, n_bins=50, min_overlap_threshold=0.03)[source]¶
Analyze overlap between adjacent umbrella windows.
- Return type:
- compute_pmf(rc_data, temperature=300.0, n_bins=50)[source]¶
Compute PMF using MBAR (preferred) or WHAM fallback.
- Return type:
- run_full_analysis(temperature=300.0, n_bins=50, block_size=None, sem_threshold=0.01, overlap_threshold=0.03, discard_equilibration=True)[source]¶
Run comprehensive free energy analysis.
This is the main entry point for analyzing existing EVB data.
- Parameters:
temperature (
float) – Temperature in Kelvin.n_bins (
int) – Number of bins for PMF.block_size (
int|None) – Block size for convergence analysis.sem_threshold (
float) – SEM threshold for convergence.overlap_threshold (
float) – Minimum required window overlap.discard_equilibration (
bool) – Whether to discard equilibration frames.
- Return type:
- Returns:
EVBAnalysisResult with complete analysis.
- class molecular_simulations.simulate.free_energy.EVB(topology, coordinates, donor_atom, acceptor_atom, reactive_atom, parsl_config, log_path, log_prefix='reactant', rc_write_freq=5, steps=500000, dt=0.002, k=160000.0, k_path=100.0, D_e=392.46, alpha=13.275, r0=0.109, platform='CUDA', n_windows=50, reaction_coordinate=None, restraint_sel=None)[source]¶
Bases:
objectEVB orchestrator. Sets up full EVB run for a set of reactants or products, and distributes calculations using Parsl.
Initialize the EVB orchestrator.
- Parameters:
topology (
Path) – Path to the system topology file (prmtop).coordinates (
Path) – Path to the system coordinate file (inpcrd).umbrella_atoms – List of three atom indices [i, j, k] for umbrella sampling where the reaction coordinate is dist(i,k) - dist(j,k).
morse_atoms – List of two atom indices [i, j] for the Morse bond.
reaction_coordinate (
list[float] |None) – List of [min, max, increment] defining the reaction coordinate windows.parsl_config (
Config) – Parsl configuration for distributed execution.log_path (
Path) – Directory path for writing reaction coordinate logs.log_prefix (
str) – Prefix for log file names. Defaults to ‘reactant’.rc_write_freq (
int) – Steps between reaction coordinate writes. Defaults to 5.steps (
int) – Number of simulation steps per window. Defaults to 500000.dt (
float) – Integration timestep in picoseconds. Defaults to 0.002.k (
float) – Umbrella force constant in kJ/mol/nm^2. Defaults to 160000.0.k_path (
float) – Path restraint force constant in kJ/mol. Defaults to 100.0.D_e (
float) – Morse well depth in kJ/mol. Defaults to 392.46.alpha (
float) – Morse width parameter in nm^-1. Defaults to 13.275.r0 (
float) – Equilibrium bond distance in nm. Defaults to 0.1.platform (
str) – OpenMM platform name. Defaults to ‘CUDA’.restraint_sel (
str|None) – Optional MDAnalysis selection string for backbone restraints. Defaults to None.
- __init__(topology, coordinates, donor_atom, acceptor_atom, reactive_atom, parsl_config, log_path, log_prefix='reactant', rc_write_freq=5, steps=500000, dt=0.002, k=160000.0, k_path=100.0, D_e=392.46, alpha=13.275, r0=0.109, platform='CUDA', n_windows=50, reaction_coordinate=None, restraint_sel=None)[source]¶
Initialize the EVB orchestrator.
- Parameters:
topology (
Path) – Path to the system topology file (prmtop).coordinates (
Path) – Path to the system coordinate file (inpcrd).umbrella_atoms – List of three atom indices [i, j, k] for umbrella sampling where the reaction coordinate is dist(i,k) - dist(j,k).
morse_atoms – List of two atom indices [i, j] for the Morse bond.
reaction_coordinate (
list[float] |None) – List of [min, max, increment] defining the reaction coordinate windows.parsl_config (
Config) – Parsl configuration for distributed execution.log_path (
Path) – Directory path for writing reaction coordinate logs.log_prefix (
str) – Prefix for log file names. Defaults to ‘reactant’.rc_write_freq (
int) – Steps between reaction coordinate writes. Defaults to 5.steps (
int) – Number of simulation steps per window. Defaults to 500000.dt (
float) – Integration timestep in picoseconds. Defaults to 0.002.k (
float) – Umbrella force constant in kJ/mol/nm^2. Defaults to 160000.0.k_path (
float) – Path restraint force constant in kJ/mol. Defaults to 100.0.D_e (
float) – Morse well depth in kJ/mol. Defaults to 392.46.alpha (
float) – Morse width parameter in nm^-1. Defaults to 13.275.r0 (
float) – Equilibrium bond distance in nm. Defaults to 0.1.platform (
str) – OpenMM platform name. Defaults to ‘CUDA’.restraint_sel (
str|None) – Optional MDAnalysis selection string for backbone restraints. Defaults to None.
- initialize()[source]¶
Initialize Parsl for runs.
If Parsl is already running (e.g., from a parent pipeline), reuses the existing DataFlowKernel. Otherwise, loads a new one from parsl_config.
- Return type:
- shutdown()[source]¶
Clean up Parsl after runs.
Only cleans up if this instance loaded Parsl itself. If Parsl was borrowed from a parent pipeline, leaves it running.
- Return type:
- run_evb(parsl_func=<parsl.app.python.PythonApp object>)[source]¶
Collect futures for each EVB window and distribute.
- Return type:
- process_evb_run()[source]¶
Reads in RCReporter logs from an EVB run and collects RC data.
This method collects the reaction coordinate values from each umbrella window. For actual free energy calculation, use compute_pmf() or run_full_analysis() after this method.
- Returns:
window, RC, rc0 (target RC for window).
- Return type:
DataFrame- Raises:
ValueError – If no EVB windows are found.
FileNotFoundError – If RC log files are missing.
- load_rc_data()[source]¶
Load reaction coordinate data from all windows.
- Return type:
- Returns:
List of RC arrays, one per window.
- Raises:
ValueError – If no windows found.
FileNotFoundError – If log files missing.
- detect_equilibration(rc_data, method='statistical_inefficiency')[source]¶
Detect equilibration time for each window.
Uses statistical inefficiency to identify when each trajectory has equilibrated. Frames before t0 should be discarded for analysis.
- Parameters:
- Return type:
- Returns:
List of EquilibrationResult for each window.
- check_convergence(rc_data, block_size=None, sem_threshold=0.01)[source]¶
Check convergence of each window using block averaging.
Block averaging divides the trajectory into blocks and computes the standard error of the mean across blocks. Well-converged simulations should have low SEM values.
- Parameters:
- Return type:
- Returns:
List of ConvergenceResult for each window.
- analyze_overlap(rc_data, n_bins=50, min_overlap_threshold=0.03)[source]¶
Analyze overlap between adjacent umbrella windows.
Good overlap between windows is critical for accurate free energy estimation with WHAM/MBAR. Adjacent windows should have >3% overlap for reliable results.
- Parameters:
- Return type:
- Returns:
OverlapResult with overlap matrix and problem identification.
- compute_pmf(rc_data, temperature=300.0, n_bins=50)[source]¶
Compute the Potential of Mean Force using MBAR.
Uses the pymbar library to compute free energies from umbrella sampling data. Falls back to WHAM-like histogram reweighting if pymbar is not available.
- Parameters:
- Return type:
- Returns:
PMFResult with free energy profile and uncertainties.
- Raises:
ValueError – If rc_data is empty or windows have no samples.
- run_full_analysis(temperature=300.0, n_bins=50, block_size=None, sem_threshold=0.01, overlap_threshold=0.03, discard_equilibration=True)[source]¶
Run comprehensive free energy analysis on EVB data.
This method performs: 1. Equilibration detection and removal 2. Convergence checking via block averaging 3. Window overlap analysis 4. PMF calculation using MBAR (or WHAM fallback)
- Parameters:
temperature (
float) – Temperature in Kelvin. Defaults to 300.0.n_bins (
int) – Number of bins for PMF. Defaults to 50.block_size (
int|None) – Block size for convergence analysis.sem_threshold (
float) – SEM threshold for convergence (nm).overlap_threshold (
float) – Minimum required window overlap.discard_equilibration (
bool) – Whether to discard equilibration frames.
- Return type:
- Returns:
EVBAnalysisResult with complete analysis results.
Example
>>> evb = EVB(topology, coordinates, ...) >>> evb.run_evb() # Run umbrella sampling >>> result = evb.run_full_analysis(temperature=300.0) >>> print(f"Barrier height: {result.pmf.pmf.max():.2f} kJ/mol")
- save_analysis_results(result, output_dir=None)[source]¶
Save analysis results to files.
Saves: - PMF data as CSV - Full results as parquet - Summary statistics as text
- Parameters:
result (
EVBAnalysisResult) – EVBAnalysisResult from run_full_analysis().output_dir (
Path|None) – Output directory. Defaults to log_path.
- Return type:
- save_metadata(output_path=None)[source]¶
Save run metadata for later analysis without re-instantiation.
This is crucial for HPC workflows where simulations may be interrupted or run across multiple jobs. The metadata file contains all parameters needed to create an EVBAnalyzer for post-hoc analysis.
- Parameters:
output_path (
Path|None) – Path to save metadata TOML. Defaults to log_path/evb_metadata.toml.- Return type:
- Returns:
Path to saved metadata file.
Example
>>> evb = EVB(...) >>> evb.save_metadata() # Save before running >>> evb.run_evb() # May get interrupted by walltime >>> >>> # Later, in a new job: >>> analyzer = EVBAnalyzer.from_metadata("path/to/evb_metadata.toml") >>> result = analyzer.run_full_analysis()
- get_analyzer()[source]¶
Create an EVBAnalyzer from this EVB instance.
Useful when you want to run analysis separately from simulation, or when you want to decouple the analysis code from Parsl.
- Return type:
- Returns:
EVBAnalyzer configured with this instance’s parameters.
Example
>>> evb = EVB(...) >>> evb.run_evb() >>> analyzer = evb.get_analyzer() >>> result = analyzer.run_full_analysis()
- property umbrella: dict[str, Any]¶
Sets up Umbrella force settings for force calculation.
Because the windows are decided at run time we leave rc0 as None for now. The dict includes both k (umbrella force constant) and k_path (path restraint force constant) since both are passed to the EVBCalculation.
- Returns:
Umbrella and path restraint parameters.
- Return type:
- property morse_bond: dict[str, Any]¶
Sets up Morse bond settings for potential creation.
D_e is the potential well depth (bond dissociation energy) and can be computed using QM or obtained from ML predictions such as ALFABET (https://bde.ml.nrel.gov). Must be in kJ/mol for OpenMM.
alpha is the potential well width computed from the harmonic force constant via the Taylor expansion of the second derivative:
alpha = sqrt(k_bond / (2 * D_e))
- Unit conversions from AMBER frcmod (kcal/mol/A^2) to OpenMM (kJ/mol/nm^2):
k_openmm = k_amber * 4.184 * 100
- Example calculation for C-H bond:
D_e = 93.8 kcal/mol = 392.46 kJ/mol k_bond = 330.6 kcal/(mol*A^2) = 138323 kJ/(mol*nm^2) alpha = sqrt(138323 / (2 * 392.46)) = 13.275 nm^-1
r0 is the equilibrium bond distance in nm.
- Returns:
Morse bond parameters with keys atom_i, atom_j, D_e, alpha, r0.
- Return type:
- class molecular_simulations.simulate.free_energy.EVBCalculation(topology, coord_file, out_path, rc_file, umbrella, morse_bond, rc_freq=5, steps=500000, dt=0.002, platform='CUDA', restraint_sel=None)[source]¶
Bases:
objectRuns a single EVB window.
Initialize a single EVB window calculation.
- Parameters:
topology (
Path) – Path to the system topology file (prmtop).coord_file (
Path) – Path to the system coordinate file (inpcrd).out_path (
Path) – Directory path for simulation output files.rc_file (
Path) – Path to the reaction coordinate log file.umbrella (
dict) – Dictionary containing umbrella sampling parameters including atom_i, atom_j, atom_k, k, k_path, and rc0.morse_bond (
dict) – Dictionary containing Morse bond parameters including atom_i, atom_j, D_e, alpha, and r0.rc_freq (
int) – Steps between reaction coordinate writes. Defaults to 5.steps (
int) – Number of simulation steps. Defaults to 500000.dt (
float) – Integration timestep in picoseconds. Defaults to 0.002.platform (
str) – OpenMM platform name. Defaults to ‘CUDA’.restraint_sel (
str|None) – Optional MDAnalysis selection string for backbone restraints. Defaults to None.
- __init__(topology, coord_file, out_path, rc_file, umbrella, morse_bond, rc_freq=5, steps=500000, dt=0.002, platform='CUDA', restraint_sel=None)[source]¶
Initialize a single EVB window calculation.
- Parameters:
topology (
Path) – Path to the system topology file (prmtop).coord_file (
Path) – Path to the system coordinate file (inpcrd).out_path (
Path) – Directory path for simulation output files.rc_file (
Path) – Path to the reaction coordinate log file.umbrella (
dict) – Dictionary containing umbrella sampling parameters including atom_i, atom_j, atom_k, k, k_path, and rc0.morse_bond (
dict) – Dictionary containing Morse bond parameters including atom_i, atom_j, D_e, alpha, and r0.rc_freq (
int) – Steps between reaction coordinate writes. Defaults to 5.steps (
int) – Number of simulation steps. Defaults to 500000.dt (
float) – Integration timestep in picoseconds. Defaults to 0.002.platform (
str) – OpenMM platform name. Defaults to ‘CUDA’.restraint_sel (
str|None) – Optional MDAnalysis selection string for backbone restraints. Defaults to None.
- prepare()[source]¶
Generates simulation object containing all custom forces to compute free energy. Leverages standard Simulator as backend, adding in Morse potential and Umbrella forces.
- static umbrella_force(atom_i, atom_j, atom_k, k, rc0, **kwargs)[source]¶
Difference of distances umbrella force. Think pulling an oxygen off
- Parameters:
atom_i (
int) – Index of first atom participating (from reactant).atom_j (
int) – Index of second atom participating (from product).atom_k (
int) – Index of shared atom participating in both reactant and product.k (
float) – Harmonic spring constant.rc0 (
float) – Target equilibrium distance for current window.
- Returns:
Force that drives sampling in each umbrella window.
- Return type:
CustomCompoundBondForce
- static path_restraint(atom_i, atom_j, atom_k, k_path, **kwargs)[source]¶
Enforce collinearity of moving atom with respect to the initial and final positions. By avoiding a custom angle force we avoid instability related to the asymptote at 180 degrees, which is what we are attempting to enforce. The cosine of the dot product of the vectors from i -> k and i -> j allows the penalty to scale quadratically with deviation, thus keeping the mobile atom snapped along the progress coordinate vector.
Note: k_path has units of kJ/mol (not kJ/mol/nm^2 like the umbrella k) since (1 - costheta)^2 is dimensionless. Typical values are 50-200 kJ/mol.
- static morse_bond_force(atom_i, atom_j, D_e, alpha, r0)[source]¶
Generates a custom Morse potential between two atom indices.
- The Morse potential has the form:
V(r) = D_e * (1 - exp(-alpha * (r - r0)))^2
- The alpha parameter can be computed from harmonic force constant k_bond:
alpha = sqrt(k_bond / (2 * D_e))
- All parameters must be in OpenMM’s native unit system:
Distances in nanometers (nm)
Energies in kJ/mol
Force constants in kJ/(mol*nm^2)
Alpha in nm^-1
- Parameters:
- Returns:
Force corresponding to a Morse potential.
- Return type:
CustomBondForce
- static remove_harmonic_bond(system, atom_i, atom_j)[source]¶
Remove the bond/constraint between two atoms.
This is necessary when replacing a harmonic bond with a Morse potential to avoid double-counting the bonded interaction. The method handles both: 1. Harmonic bonds (sets force constant to zero) 2. SHAKE/RATTLE constraints (removes the constraint entirely)