molecular_simulations.build.build_ligand module

Ligand parameterization and complex building module.

This module provides classes for parameterizing small molecule ligands using GAFF2 force fields and building protein-ligand complex systems for molecular dynamics simulations.

Classes:

LigandError: Custom exception for ligand parameterization failures. LigandBuilder: Parameterize ligands with GAFF2 force field. PLINDERBuilder: Build complexes from PLINDER database entries. ComplexBuilder: Build general protein-ligand complex systems.

exception molecular_simulations.build.build_ligand.LigandError(message='This system contains ligands which we cannot model!')[source]

Bases: Exception

Custom exception for ligand parameterization errors.

Raised when antechamber or SQM fails to parameterize a ligand, or when other ligand-related issues occur during system building.

Parameters:

message – Error message describing the failure. Defaults to a generic message about ligand modeling failure.

Example

>>> raise LigandError('Antechamber failed for molecule XYZ')

Initialize the LigandError.

__init__(message='This system contains ligands which we cannot model!')[source]

Initialize the LigandError.

class molecular_simulations.build.build_ligand.LigandBuilder(path, lig, lig_number=0, file_prefix='')[source]

Bases: object

Parameterize a ligand molecule with GAFF2 force field.

Generates all relevant force field files (.frcmod, .lib, .mol2) for running tleap with small molecule ligands.

Parameters:
  • path (str | Path) – Directory path for input/output files.

  • lig (str | Path) – Ligand filename (SDF or PDB format).

  • lig_number (int) – Numeric identifier for the ligand residue name. Creates residue names like ‘LG0’, ‘LG1’, etc. Defaults to 0.

  • file_prefix (str) – Optional prefix for output files. Defaults to ‘’.

path

Path object for working directory.

lig

Path to input ligand file.

ln

Ligand number for residue naming.

out_lig

Path for output ligand files (without extension).

Example

>>> builder = LigandBuilder(path='./build', lig='ligand.sdf', lig_number=0)
>>> builder.parameterize_ligand()

Initialize the LigandBuilder.

__init__(path, lig, lig_number=0, file_prefix='')[source]

Initialize the LigandBuilder.

parameterize_ligand()[source]

Generate GAFF2 parameters for the ligand.

Ensures consistent treatment of all ligand files by: 1. Adding hydrogens using RDKit 2. Converting to mol2 format 3. Running antechamber for GAFF2 atom typing and AM1-BCC charges 4. Running parmchk2 to generate missing parameters 5. Creating a tleap library file

Raises:

LigandError – If antechamber fails to parameterize the ligand.

Return type:

None

process_sdf()[source]

Process an SDF file and add hydrogens.

Uses RDKit to add hydrogens based on atom hybridization from the input SDF file. Note that incorrect hybridization in the input will result in incorrect hydrogen placement.

Return type:

None

process_pdb()[source]

Process a PDB file and add hydrogens.

Reads a small molecule PDB, adds hydrogens using RDKit, and writes the result to an SDF file.

Return type:

None

convert_to_mol2()[source]

Convert SDF to mol2 format using OpenBabel.

Return type:

None

move_antechamber_outputs()[source]

Clean up antechamber output files.

Removes unnecessary outputs and renames sqm.out for later verification that antechamber completed successfully.

Return type:

None

check_sqm()[source]

Verify that SQM calculations completed successfully.

Checks the sqm.out file for completion message. If absent, indicates parameter generation failed.

Raises:

LigandError – If SQM calculations did not complete.

Return type:

None

write_leap(inp)[source]

Write a tleap input file.

Parameters:

inp (str) – The tleap input file contents as a string.

Return type:

tuple[str, str]

Returns:

Tuple of (input_file_path, log_file_path).

class molecular_simulations.build.build_ligand.PLINDERBuilder(path, system_id, out, **kwargs)[source]

Bases: ImplicitSolvent

Build protein-ligand complexes from PLINDER database entries.

Extends ImplicitSolvent to handle PLINDER-format input directories containing receptor PDB, ligand SDF files, and sequence information.

Parameters:
  • path (str | Path) – Base path to PLINDER system directory.

  • system_id (str) – PLINDER system identifier.

  • out (str | Path) – Output directory path.

  • **kwargs – Additional arguments passed to ImplicitSolvent.

system_id

PLINDER system identifier.

build_dir

Directory for intermediate build files.

ions

List of ions extracted from ligand files.

ligs

List of parameterized ligand names.

Example

>>> builder = PLINDERBuilder(
...     path='./plinder_data', system_id='1abc_A_B_ligand', out='./output'
... )
>>> builder.build()

Initialize the PLINDERBuilder.

__init__(path, system_id, out, **kwargs)[source]

Initialize the PLINDERBuilder.

build()[source]

Build the protein-ligand complex system.

Migrates files, parameterizes ligands, and assembles the final system with topology and coordinate files.

Raises:

LigandError – If no ligands are found or parameterization fails.

Return type:

None

ligand_handler(ligs)[source]

Parameterize all ligands in the system.

Parameters:

ligs (list[str]) – List of ligand file paths to parameterize.

Return type:

list[str]

Returns:

List of parameterized ligand names (without extensions).

migrate_files()[source]

Prepare input files for system building.

Copies sequence files, fixes and moves the receptor PDB, and processes ligand files. Handles ions separately.

Return type:

list[str]

Returns:

List of ligand filenames to parameterize.

place_ions()[source]

Add ion records to the PDB file.

Appends ATOM records for extracted ions to the receptor PDB. Handles various ion naming conventions for AMBER compatibility.

Note

This method handles complex PDB formatting requirements. Proceed with caution if modifications are needed.

Return type:

None

assemble_system()[source]

Assemble the protein-ligand complex with tleap.

Creates topology and coordinate files for the complex, including all parameterized ligands and ions.

Return type:

None

prep_protein()[source]

Prepare the receptor protein structure.

Runs PDBFixer to repair missing residues, then pdb4amber to remove hydrogens and waters for clean tleap input.

Return type:

None

triage_pdb(broken_pdb, repaired_pdb)[source]

Repair a PDB structure using PDBFixer.

Runs PDBFixer to add missing loops and atoms. Uses FASTA sequence from PLINDER to properly model missing residues, including non-canonical residues like phosphorylations.

Parameters:
  • broken_pdb (str | Path) – Path to input PDB requiring repairs.

  • repaired_pdb (str | Path) – Path for output repaired PDB.

Return type:

None

inject_fasta(chain_map)[source]

Inject FASTA sequence data for PDBFixer.

Checks FASTA against actual sequence and modifies to correctly handle non-canonical residues (e.g., SER -> SEP for phosphoserine).

Parameters:

chain_map (dict[str, list[Any]]) – Dictionary mapping chain IDs to lists of residues.

Return type:

list[Sequence]

Returns:

List of PDBFixer Sequence objects for all chains.

Raises:

LigandError – If unknown residues are found in the FASTA.

check_ptms(sequence, chain_residues)[source]

Check for post-translational modifications in the sequence.

Compares the full sequence from FASTA against the partial sequence from the structural model and updates non-canonical residue names appropriately.

Parameters:
  • sequence (list[str]) – List of three-letter residue codes from FASTA.

  • chain_residues (list[Any]) – List of residue objects from the structure.

Return type:

list[str]

Returns:

Updated sequence list with PTM residue names.

Raises:

LigandError – If sequence length mismatches occur.

check_ligand(ligand)[source]

Check if a ligand file contains ions or valid small molecules.

Identifies species that are ions based on element type and formal charge. True ligands are returned for parameterization while ions are stored separately.

Parameters:

ligand (str | Path) – Path to ligand SDF file.

Return type:

bool

Returns:

True if ligand should be parameterized, False if it’s an ion.

property cation_list: list[str]

List of common cation element symbols (lowercase).

property anion_list: list[str]

List of common anion element symbols (lowercase).

class molecular_simulations.build.build_ligand.ComplexBuilder(path, pdb, lig, padding=10.0, lig_param_prefix=None, **kwargs)[source]

Bases: ExplicitSolvent

Build protein-ligand complexes with explicit solvent.

Extends ExplicitSolvent to handle ligand parameterization and complex assembly. Supports both automatic parameterization via antechamber and pre-computed parameter files.

Parameters:
  • path (str) – Directory path for output files.

  • pdb (str) – Path to protein PDB file.

  • lig (str | list[str]) – Path to ligand file(s). Can be a single path or list of paths.

  • padding (float) – Box padding in Angstroms. Defaults to 10.0.

  • lig_param_prefix (str | None) – Optional path prefix to pre-computed ligand parameters (.frcmod, .lib, .mol2). If None, parameters are generated automatically. Defaults to None.

  • **kwargs – Additional attributes (e.g., ‘ion’ for ion PDB path).

lig

Path(s) to ligand file(s).

build_dir

Directory for intermediate build files.

lig_param_prefix

Prefix for pre-computed parameter files.

Example

>>> builder = ComplexBuilder(
...     path='./build', pdb='protein.pdb', lig='ligand.sdf', padding=12.0
... )
>>> builder.build()

Initialize the ComplexBuilder.

__init__(path, pdb, lig, padding=10.0, lig_param_prefix=None, **kwargs)[source]

Initialize the ComplexBuilder.

build()[source]

Build the solvated protein-ligand complex.

Parameterizes ligands (if needed), prepares the protein, and assembles the solvated system.

Return type:

None

process_ligand(lig, prefix=None)[source]

Process and parameterize a single ligand.

Parameters:
  • lig (str | Path) – Path to ligand file.

  • prefix (int | None) – Optional numeric prefix for multi-ligand systems.

Return type:

Path

Returns:

Path to parameterized ligand (without extension).

add_ion_to_pdb()[source]

Add ion coordinates to the protein PDB file.

Reads ion coordinates from a separate file and appends them to the protein PDB before the END record.

Return type:

None

assemble_system(dim, num_ions)[source]

Assemble the solvated protein-ligand complex.

Loads ligand parameters, combines with protein, solvates, and ionizes the system.

Parameters:
  • dim (float) – Box dimension in Angstroms.

  • num_ions (int) – Number of Na+/Cl- pairs for 150mM concentration.

Return type:

None