molecular_simulations.analysis.constant_pH_analysis module¶
Improved constant pH analysis with UWHAM reweighting.
This implementation adds multistate analysis capabilities to the basic curve fitting approach. Uses log-space arithmetic for numerical stability.
- class molecular_simulations.analysis.constant_pH_analysis.UWHAMSolver(tol=1e-07, maxiter=10000)[source]¶
Bases:
objectUnbinned Weighted Histogram Analysis Method (UWHAM) solver.
NOTE: This class is NOT currently used because UWHAM/MBAR is designed for umbrella sampling and replica exchange, not independent constant pH simulations. For standard constant pH MD where each pH is an independent equilibrium simulation, simple curve fitting is the correct approach.
This class is retained for potential use with replica exchange constant pH (REX-cpH) simulations where samples ARE correlated across pH values.
Uses log-space arithmetic throughout for numerical stability with large systems (100+ titratable residues).
- solve(verbose=False)[source]¶
Solve UWHAM self-consistent equations iteratively.
Uses the MBAR equation: f_k = -log(Σ_n exp(-u_k(x_n)) / Σ_l N_l exp(f_l - u_l(x_n)))
where the sum over n includes ALL samples from ALL states.
- Returns:
f – Free energy offsets for each pH simulation
- Return type:
np.ndarray
- compute_log_weights(target_pH)[source]¶
Compute log weights for reweighting to target pH.
Uses MBAR formula: w_n ∝ exp(-u_target(x_n)) / Σ_l N_l exp(f_l - u_l(x_n))
- class molecular_simulations.analysis.constant_pH_analysis.TitrationCurve(log_file, make_plots=True, out=PosixPath('.'), method='uwham')[source]¶
Bases:
objectAnalyze constant pH simulations with multiple fitting methods.
Available methods: - curvefit: Simple least squares fit of Hill equation to per-pH averages - weighted: Weighted least squares (weight by 1/variance) - bootstrap: Curve fitting with bootstrap confidence intervals
Note: For independent constant pH simulations (not replica exchange), simple curve fitting is the statistically correct approach. UWHAM/MBAR is only appropriate for replica exchange constant pH where samples are correlated across pH values.
- compute_titrations_curvefit()[source]¶
Compute pKa and Hill coefficient using scipy curve_fit.
This is the simple approach that treats each pH independently.
- Return type:
DataFrame
- compute_titrations_weighted(verbose=False)[source]¶
Compute pKa and Hill coefficient using weighted least squares.
Weights each pH point by 1/variance, giving more influence to points with more samples and intermediate protonation fractions.
This is more statistically rigorous than unweighted curve fitting when sample sizes vary across pH values.
- Return type:
DataFrame
- compute_titrations_bootstrap(n_bootstrap=1000, verbose=False)[source]¶
Compute pKa and Hill coefficient with bootstrap confidence intervals.
Resamples the data at each pH to estimate uncertainty in fitted parameters. This gives robust error estimates even when the Hill equation doesn’t perfectly fit the data.
- compute_titrations(verbose=False, n_bootstrap=1000)[source]¶
Compute titrations using selected method.
- Return type:
- diagnose_residue(resid, verbose=True)[source]¶
Diagnose why a residue might have failed pKa determination.
- static hill_equation(pH, pKa, n)[source]¶
Hill equation for acid-base equilibrium.
Returns fraction protonated as function of pH.
- class molecular_simulations.analysis.constant_pH_analysis.TitrationAnalyzer(log_files, output_dir=None)[source]¶
Bases:
objectHigh-level analyzer for constant pH simulations.
Provides a streamlined API that runs both curve fitting and UWHAM analysis, generates comparisons, and creates publication-quality plots.
Example usage¶
>>> analyzer = TitrationAnalyzer(["cpH.log"]) >>> analyzer.run() >>> analyzer.summary() >>> analyzer.plot_residue("145") >>> analyzer.plot_all(output_dir="plots/") >>> analyzer.save_results("results/")
Initialize the analyzer.
- param log_files:
Path(s) to constant pH log file(s)
- type log_files:
- param output_dir:
Directory for output files. If None, uses current directory.
- type output_dir:
- run(methods=['curvefit', 'weighted'], verbose=True, n_bootstrap=1000)[source]¶
Run the analysis with specified methods.
- Parameters:
methods (
list[str]) – Methods to run: ‘curvefit’, ‘weighted’, ‘bootstrap’ - curvefit: Simple least squares fit of Hill equation - weighted: Weighted least squares (by 1/variance) - bootstrap: Curve fit with bootstrap confidence intervalsverbose (
bool) – Print progress informationn_bootstrap (
int) – Number of bootstrap iterations (only used if ‘bootstrap’ in methods)
- Returns:
self
- Return type:
- plot_residue(resid, ax=None, show_curvefit=True, show_weighted=True, show_data=True, figsize=(8, 6), save=None)[source]¶
Plot titration curve for a single residue.
- Parameters:
resid (
str) – Residue ID to plotax (
Axes|None) – Axes to plot on. If None, creates new figure.show_curvefit (
bool) – Show curve fitting resultshow_weighted (
bool) – Show weighted fit resultshow_data (
bool) – Show raw data pointsfigsize (
tuple[float,float]) – Figure size if creating new figure
- Return type:
- plot_all(output_dir=None, format='png', show_curvefit=True, show_weighted=True, residues=None, verbose=True)[source]¶
Generate plots for all (or selected) residues.
- Parameters:
output_dir (
str|Path|None) – Directory for plots. Uses self.output_dir / ‘plots’ if None.format (
str) – Image format (‘png’, ‘pdf’, ‘svg’)show_curvefit (
bool) – Include curve fitting resultsshow_weighted (
bool) – Include weighted fit resultsresidues (
list[str] |None) – Specific residues to plot. If None, plots all.verbose (
bool) – Print progress
- Return type:
- plot_summary(figsize=(12, 5), save=None)[source]¶
Generate summary plot comparing methods.
Creates a 2-panel figure: - Left: pKa comparison scatter plot - Right: Distribution of pKa differences
- Return type:
- recommend_protonation(target_pH, confidence_threshold=0.7, use_bootstrap=False, verbose=True)[source]¶
Recommend protonation states for a target pH.
Uses the fitted titration curves to predict which residues are protonated vs deprotonated at the specified pH, with confidence estimates based on distance from pKa.
- Parameters:
target_pH (
float) – pH value to make predictions for (e.g., 3.0, 7.4)confidence_threshold (
float) – Probability threshold for “confident” predictions (default 0.7) Residues with P(protonated) between (1-threshold) and threshold are marked as “uncertain”use_bootstrap (
bool) – If True and bootstrap results available, use bootstrap CI for uncertainty estimationverbose (
bool) – Print summary of recommendations
- Returns:
resid: residue ID
resname: canonical residue name (ASP, GLU, HIS, LYS, CYS)
pKa: fitted pKa value
prob_protonated: probability of being protonated at target pH
recommendation: ‘protonated’, ‘deprotonated’, or ‘uncertain’
confidence: ‘high’, ‘medium’, or ‘low’
state_name: recommended state name (e.g., ‘ASH’ or ‘ASP’)
- Return type:
DataFrame
- get_protonation_string(target_pH, confidence_threshold=0.7)[source]¶
Get a simple string of recommended protonation states.
Useful for setting up simulations.
- export_protonation_states(target_pH, output_file=None, format='csv', confidence_threshold=0.7)[source]¶
Export protonation state recommendations to file.
- Parameters:
- Return type:
DataFrame
- molecular_simulations.analysis.constant_pH_analysis.analyze_cph(log_files, output_dir=None, methods=['curvefit', 'weighted'], plot=True, verbose=True)[source]¶
Convenience function to run complete constant pH analysis.
- Parameters:
- Return type:
Example
>>> results = analyze_cph("cpH.log", output_dir="analysis/") >>> results.summary() >>> results.plot_residue("145")