protactm.sch.protac module

protactm.sch.protac.ProtacAtoms

Bases: LigandAtoms

A list of atoms representing a PROTAC.

analyze_conf_essemble(length_of='terminals', from_confs=200, cpus=0, calc_energy=False)

Predict the linker length distribution, and optionally the conformer energies, from generated conformations of the PROTAC. Length is defined as the distance between a pair of corresponding points on the two ligands: their terminal atoms, anchor atoms, extension atoms, or centers of mass.

Parameters:

Name Type Description Default
length_of

Which pair of points to measure the distance between:
- 'terminals': Terminal atoms of each ligand
- 'anchors': Anchor atoms connecting ligands to linker
- 'extensions': Extension atoms of each ligand
- 'ligand_coms': Centers of mass of each ligand

'terminals'
from_confs int

Number of conformations to generate

200
cpus int

Number of CPU threads to use (0 for auto)

0
calc_energy bool

Whether to calculate MMFF94 energies (slower)

False

Returns:

Type Description

If calc_energy=False (default): List of distances for all conformations

If calc_energy=True: Tuple of (distances, energies):
- distances: List of distances for all conformations
- energies: List of MMFF94 energies (kcal/mol) for all conformations
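
To make the 'ligand_coms' option concrete, here is a minimal sketch of a center-of-mass distance for a single conformation. It is an illustration only: the helper names `center_of_mass` and `com_distance` are hypothetical and are not part of protactm, and the coordinates are toy values.

```python
import math


def center_of_mass(coords, masses):
    """Return the mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Euclidean distance between the centers of mass of two atom groups."""
    return math.dist(
        center_of_mass(coords_a, masses_a),
        center_of_mass(coords_b, masses_b),
    )


# Two toy "ligands", each made of two carbon atoms (mass 12.011).
ligand_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ligand_b = [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
masses = [12.011, 12.011]

distance = com_distance(ligand_a, masses, ligand_b, masses)
```

Repeating this measurement over every generated conformation would give one distance per conformation, which matches the shape of the list described above.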

Example
import numpy as np

# Get just the distance distribution
protac = ProtacAtoms.build_smiles(...)  # Your PROTAC molecule
distances = protac.analyze_conf_essemble(
    length_of='terminals',  # or 'anchors', 'extensions', 'ligand_coms'
    from_confs=1000  # number of conformations to generate
)

# Get both distance and energy distributions
calc_energy = True
distances, energies = protac.analyze_conf_essemble(
    length_of='terminals',
    from_confs=1000,
    calc_energy=calc_energy
)

# Basic statistics
mean_dist = np.mean(distances)
std_dist = np.std(distances)
print(f"Mean distance: {mean_dist:.2f} ± {std_dist:.2f} Å")

if calc_energy:
    mean_energy = np.mean(energies)
    std_energy = np.std(energies)
    print(f"Mean energy: {mean_energy:.2f} ± {std_energy:.2f} kcal/mol")
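
When both lists are available, one possible follow-up is to weight each distance by the Boltzmann factor of its conformer energy instead of taking a plain mean. The sketch below is not part of protactm: the function name and the default temperature are assumptions, and it only requires the energies to be in kcal/mol, as stated above.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def boltzmann_weighted_mean(distances, energies, temperature=298.15):
    """Average distances with weights exp(-(E - E_min) / RT)."""
    kt = R_KCAL * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, distances)) / total


# Equal energies reduce to the plain mean.
flat = boltzmann_weighted_mean([10.0, 20.0], [5.0, 5.0])

# A conformer 100 kcal/mol above the minimum contributes almost nothing.
skewed = boltzmann_weighted_mean([10.0, 20.0], [0.0, 100.0])
```

Whether force-field energies of generated conformations are accurate enough to justify such weighting depends on the system, so treat the result as a rough indicator.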

centrality cached property

Get the mapping of how close each atom is to the center. The greater the centrality, the closer to the center.

Returns:

Type Description

{atom index: centrality}
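
The source does not say how centrality is computed. As an illustration of a mapping with the stated property (greater value means closer to the center), the sketch below uses the negated graph eccentricity on the bond graph. This choice of metric and the function names are assumptions, not the library's definition.

```python
from collections import deque


def bond_distances(adjacency, start):
    """Breadth-first search: number of bonds from start to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def centrality_by_eccentricity(adjacency):
    """Map each atom to minus its largest bond distance to any other atom."""
    return {
        atom: -max(bond_distances(adjacency, atom).values())
        for atom in adjacency
    }


# A linear chain of five atoms: 1-2-3-4-5.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
centrality = centrality_by_eccentricity(chain)
most_central = max(centrality, key=centrality.get)
```

In this toy chain the middle atom gets the greatest value and the two end atoms the smallest.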

confsearch_attach_ligand(method='confsearch', **kw)

Run confsearch on the separated linker part, then attach the ligands in their original conformations.

NOTE: This method cannot fully search the torsions around the anchor atoms that join the linker to each ligand, and cannot handle chirality around the anchor atoms.

NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.

confsearch_lig_constrained(host='localhost:1', max_steps=None, ligands=[], energy_window=500, force=1.0, parallel_strategy='confgen', as_files=False, **kw)

Run confsearch on the whole PROTAC with torsions in both ligands constrained. This method allows small changes in torsion in the ligand. However, the efficiency of generation may be reduced.

NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.

Parameters:

Name Type Description Default
host str

Host specified by "hostname:concurrency"

'localhost:1'
max_steps

Max steps for conformational search. Default value is 10000 * sqrt(n_torsions)

None
ligands List[LigandAtoms]

Ligands to be constrained. The conformational search process will keep torsions in the MCS part constrained. If not specified, the ligands in the PROTAC will be used.

[]
energy_window float

Affects the number of outputs. Values of 100 - 1000 are appropriate for avoiding significant structural conflicts. A lower value yields fewer results, with lower structural energies and fewer structural conflicts. When you lower this value, you may need to raise max_steps.

500
force float

Strength of the ligand restraint, between 0 and 1.

1.0
parallel_strategy Literal['confgen', 'frozen_torsion']

How to take advantage of multiple cores for conformation generation. Only available when concurrency > 1. Available options:
- confgen: First use confgen to generate several initial conformations. Then run multiple confsearch jobs with both ligands constrained, each with a different random seed. Finally, combine all generated conformations.
- frozen_torsion (not recommended): Combine the conformations generated by multiple confsearch jobs with both ligands constrained, each with a different random seed. For each subjob, a different additional linker torsion is constrained.

'confgen'
as_files

Memory overflow may occur when a large number of conformations are generated. When True, the task results will not be read, but a file list will be returned.

False
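
The "hostname:concurrency" format and the default step budget of 10000 * sqrt(n_torsions) can be sketched as follows. Both helpers are hypothetical illustrations and not protactm functions; rounding the step count to an integer is an assumption, since the source does not say how the value is rounded.

```python
import math


def parse_host(host):
    """Split a 'hostname:concurrency' string into (hostname, concurrency)."""
    hostname, sep, concurrency = host.rpartition(":")
    if not sep or not hostname:
        raise ValueError(f"expected 'hostname:concurrency', got {host!r}")
    return hostname, int(concurrency)


def default_max_steps(n_torsions):
    """Default budget from the description: 10000 * sqrt(n_torsions)."""
    # Rounding to the nearest integer is an assumption of this sketch.
    return round(10000 * math.sqrt(n_torsions))


hostname, concurrency = parse_host("localhost:1")
steps = default_max_steps(16)
```

With this format, a host string such as 'localhost:4' would give a concurrency above 1, which is the condition under which parallel_strategy applies.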

confsearch_only_linker_macromodel(host='localhost:1', max_steps=None, energy_window=500, scan_ring_torsions=False, as_files=False, **kw)

Run confsearch only on torsions in the linker. This allows the ligands on both sides to maintain the original conformation. If you need to adjust the ligand conformation in the input structure, use the use_lig_pose() method.

NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.

Parameters:

Name Type Description Default
host str

Host specified by "hostname:concurrency"

'localhost:1'
max_steps int

Max steps for conformational search. Default value is 10000 * n_linker_torsions

None
energy_window float

Affects the number of outputs. Values of 100 - 5000 are appropriate for avoiding significant structural conflicts. A lower value yields fewer results, with lower structural energies and fewer structural conflicts. When you lower this value, you may need to raise max_steps.

500
scan_ring_torsions

Whether to re-search the torsions in unsaturated rings.

False
as_files

Memory overflow may occur when a large number of conformations are generated. When True, the task results will not be read, but a file list will be returned.

False
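
The effect of energy_window on the number of outputs can be pictured with a simple filter. The sketch assumes the window is measured relative to the lowest energy found, which is a common convention but is not confirmed by the source; the function name is hypothetical and the energies are toy values in the same unit as the window.

```python
def within_energy_window(energies, energy_window):
    """Return indices of conformations within the window above the minimum."""
    e_min = min(energies)
    return [i for i, e in enumerate(energies) if e - e_min <= energy_window]


energies = [0.0, 120.0, 480.0, 900.0]

kept_default = within_energy_window(energies, 500)  # wider window
kept_narrow = within_energy_window(energies, 100)   # narrower window
```

Under this assumption, narrowing the window keeps fewer and lower-energy structures, which is consistent with the description of energy_window above.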

confsearch_only_linker_numpy(host='localhost:1', max_steps=10000, mc_n_torsions=3)

Run confsearch only on torsions in the linker. This method is a numpy-based Monte Carlo approach that is well suited for the creation of very large conformation collections for further fusion.

NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.
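
The general idea of a Monte Carlo search over linker torsions can be sketched in a few lines. This is not the library's implementation: it uses the standard library instead of numpy, applies no energy evaluation or acceptance test, and interprets mc_n_torsions as the number of torsions changed per step, which is an assumption.

```python
import random


def sample_linker_torsions(start, max_steps, mc_n_torsions, seed=0):
    """Random walk in torsion space, changing a few torsions per step."""
    rng = random.Random(seed)
    current = list(start)
    samples = []
    for _ in range(max_steps):
        n_moves = min(mc_n_torsions, len(current))
        # Pick distinct torsions and assign each a new random angle (degrees).
        for idx in rng.sample(range(len(current)), n_moves):
            current[idx] = rng.uniform(-180.0, 180.0)
        samples.append(tuple(current))
    return samples


# Six linker torsions, all starting at 0 degrees.
samples = sample_linker_torsions([0.0] * 6, max_steps=100, mc_n_torsions=3)
```

Each sample is a tuple of torsion angles; a real workflow would still need to build coordinates from them and discard clashing structures.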

guess_linker()

Guess the linker part in this PROTAC. A built-in empirical fragment-based criterion is used to determine the linker.

guess_linker_from(st, by='full')

Guess the linker part in this PROTAC, using information from protein-ligand complexes.

Parameters:

Name Type Description Default
st Union[List[Structure], BinaryPose]

Either:
1. A list of 1-2 protein-ligand complexes, each containing a ligand that corresponds to a substructure of the PROTAC.
2. A BinaryPose object containing ligands that correspond to substructures of the PROTAC.

required
by Literal['sasa', 'full']

- sasa: The portion of the ligand with the larger SASA is recognized as the linker.
- full: The entire ligand will be recognized as a ligand in the PROTAC; other parts are recognized as linkers.

'full'

guess_partners()

Guess the partner chains of this PROTAC, sorted from the larger partner to the smaller partner.

Returns:

Type Description
Tuple[List[str], Atoms]

Example: for a ternary complex system Brd4/PROTAC/VHL/ElongC/ElongB, guess_partners returns: ( ([''], ), (['', '', ''], ), )

Note: PROTAC atoms are not included in partner atoms.

ligands: Tuple[ProtacLigand, ProtacLigand] property

Get the two ligands (warhead and E3 ligand) by splitting the PROTAC.

ligands_sorted: Tuple[ProtacLigand, ProtacLigand] property

Get the two ligands (warhead and E3 ligand) by splitting the PROTAC, sorted in the same order as the partners.

linker: ProtacLinker property writable

Guess the linker part of the PROTAC as a ProtacLinker object.

partner_ligand(partner)

Get the ligand (warhead or E3 ligand) in contact with the specified partner chain.

partners cached property

Guess the partner chains of the PROTAC, see: ProtacAtoms.guess_partners().

side_atoms cached property

Get the atoms on each side without knowing the linker part.

terminal_atoms cached property

Get the atom pair that maximizes the shortest bond path length, without knowing the linker part.
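
Finding the atom pair with the longest shortest bond path is the graph-diameter problem on the bond graph. A minimal sketch with breadth-first search follows; the function names and the toy adjacency map are illustrative and are not taken from protactm.

```python
from collections import deque


def bond_path_lengths(adjacency, start):
    """Breadth-first search: shortest bond path length from start to each atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def farthest_atom_pair(adjacency):
    """Return the pair of atoms separated by the longest shortest path."""
    best_pair, best_length = None, -1
    for atom in adjacency:
        for other, length in bond_path_lengths(adjacency, atom).items():
            if length > best_length:
                best_pair, best_length = (atom, other), length
    return best_pair, best_length


# Chain 1-2-3-4-5 with a one-atom branch (6) on atom 3.
graph = {1: [2], 2: [1, 3], 3: [2, 4, 6], 4: [3, 5], 5: [4], 6: [3]}
pair, length = farthest_atom_pair(graph)
```

Running one search per atom is quadratic in the number of atoms, which is acceptable for molecules of PROTAC size.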

use_lig_pose(ligands, method='snap')

Transform the pose of the ligands in the PROTAC molecule to match the input ligand conformations.

NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.

Parameters:

Name Type Description Default
ligands List[LigandAtoms]

LigandAtoms of a small molecule inhibitor. It will be used as a reference to mimic and adjust the conformation of the ligands in the PROTAC.

required
method Literal['snap', 'snap_one_side', 'fast3d', 'torsion']

- snap: Recommended method. Will produce a precise alignment.
- fast3d (slow!): Generate a series of conformations using fast3d, then analyze shape similarity.
- snap_one_side (slow!): Run snap on one ligand, and fast3d on other ligands.
- torsion: Adjust all bond angles to those in the reference ligands. Unable to adjust ring conformations.

'snap'

Returns: A tuple of (conf, rmsd):
- conf: A Structure containing the best alignment.
- rmsd: Root mean square of the ligand superposition RMSDs.
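
The combined rmsd value is described as the root mean square of the per-ligand RMSDs. As a worked illustration with made-up numbers (the helper name is hypothetical):

```python
import math


def root_mean_square(values):
    """Square root of the mean of the squared values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Toy per-ligand superposition RMSDs in angstroms.
per_ligand_rmsd = [0.3, 0.4]
combined = root_mean_square(per_ligand_rmsd)  # sqrt((0.09 + 0.16) / 2)
```

Because the values are squared before averaging, a poor fit on one ligand raises the combined value more than a plain average would.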

protactm.sch.protac.ProtacLigand

Bases: ProtacPart

partner cached property

Get the representative chain in direct contact with this part.

terminal cached property

Get the atom farthest from the anchor.

with_anchor()

Get atoms of the ligand with the anchor atom added.

protactm.sch.protac.ProtacLinker

Bases: ProtacPart

with_anchors()

Get atoms of the linker.

protactm.sch.protac.ProtacPart

Bases: Atoms

A list of atoms representing part of a PROTAC (warheads, linkers and E3 ligands).

For linkers, anchor atoms are included by default. For ligands, anchor atoms should be added using ProtacLigand.with_anchor().

protac cached property

Get the parent PROTAC object.

reorder_to_smarts()

Standardize the atom index order in this atom list according to SMARTS.