protactm.sch.protac module
protactm.sch.protac.ProtacAtoms
Bases: LigandAtoms
A list of atoms representing a PROTAC.
analyze_conf_essemble(length_of='terminals', from_confs=200, cpus=0, calc_energy=False)
Predict the linker length distribution, and optionally the energies, from generated conformations of the PROTAC. Length is defined as the distance between a pair of points, one on each ligand: the terminal atoms, the anchor atoms, the extension atoms, or the centers of mass.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| length_of | | Which points to measure the distance between. 'terminals': terminal atoms of each ligand. 'anchors': anchor atoms connecting the ligands to the linker. 'extensions': extension atoms of each ligand. 'ligand_coms': centers of mass of each ligand. | 'terminals' |
| from_confs | int | Number of conformations to generate. | 200 |
| cpus | int | Number of CPU threads to use (0 for auto). | 0 |
| calc_energy | bool | Whether to calculate MMFF94 energies (slower). | False |
Returns:
| Type | Description |
|---|---|
| | If calc_energy=False (default): list of distances for all conformations. |
| | If calc_energy=True: tuple of (distances, energies), where distances is a list of distances for all conformations and energies is a list of MMFF94 energies (kcal/mol) for all conformations. |
Example
# Get just the distance distribution
protac = ProtacAtoms.build_smiles(...)  # Your PROTAC molecule
distances = protac.analyze_conf_essemble(
    length_of='terminals',  # or 'anchors', 'extensions', 'ligand_coms'
    from_confs=1000,        # number of conformations to generate
)
# Get both distance and energy distributions
distances, energies = protac.analyze_conf_essemble(
    length_of='terminals',
    from_confs=1000,
    calc_energy=True,
)
# Basic statistics
import numpy as np
mean_dist = np.mean(distances)
std_dist = np.std(distances)
print(f"Mean distance: {mean_dist:.2f} ± {std_dist:.2f} Å")
# Energies are only returned when calc_energy=True
mean_energy = np.mean(energies)
std_energy = np.std(energies)
print(f"Mean energy: {mean_energy:.2f} ± {std_energy:.2f} kcal/mol")
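For intuition about the 'ligand_coms' option, the distance between two centers of mass can be sketched in plain Python. This is only an illustration of the definition above, not the library's implementation: the helper names `center_of_mass` and `com_distance` are hypothetical, and the coordinates and masses are made-up placeholder values.

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    )

def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Euclidean distance between the centers of mass of two atom sets."""
    return math.dist(
        center_of_mass(coords_a, masses_a),
        center_of_mass(coords_b, masses_b),
    )

# Placeholder data: two tiny "ligands" (coordinates in Å, masses in Da).
ligand_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ligand_b = [(10.0, 0.0, 0.0), (10.0, 4.0, 0.0)]
masses_a = [12.0, 12.0]
masses_b = [12.0, 12.0]

d = com_distance(ligand_a, masses_a, ligand_b, masses_b)
print(f"COM distance: {d:.2f} Å")
```

The same pattern applies to the other options, with the two centers of mass replaced by the coordinates of the chosen atom pair.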
centrality
cached
property
Get the mapping of how close each atom is to the center. The greater the centrality, the closer to the center.
Returns:
| Type | Description |
|---|---|
| | {atom index: centrality} |
confsearch_attach_ligand(method='confsearch', **kw)
Run confsearch on the separated linker part and attach the ligands with their original conformations.
NOTE: This method cannot fully search the torsion around the anchor atoms between the linker and the ligand, and cannot handle chirality around the anchor atoms.
NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.
confsearch_lig_constrained(host='localhost:1', max_steps=None, ligands=[], energy_window=500, force=1.0, parallel_strategy='confgen', as_files=False, **kw)
Run confsearch on the whole PROTAC with torsions in both ligands constrained. This method allows small changes in torsion in the ligand. However, the efficiency of generation may be reduced.
NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| host | str | Host specified as "hostname:concurrency". | 'localhost:1' |
| max_steps | | Maximum number of steps for the conformational search. The default value is 10000 * sqrt(n_torsions). | None |
| ligands | List[LigandAtoms] | Ligands to be constrained. The conformational search keeps torsions in the MCS part constrained. If not specified, the ligands in the PROTAC are used. | [] |
| energy_window | float | Affects the number of outputs; 100 - 1000 is appropriate for avoiding significant structural conflicts. A lower value generates fewer structures, with lower structural energy and fewer structural conflicts. When you lower this value, you may need to raise | 500 |
| force | float | Strength of the ligand restraint, between 0 and 1. | 1.0 |
| parallel_strategy | Literal['confgen', 'frozen_torsion'] | How to use multiple cores for conformation generation. Only available when concurrency > 1. Available options: confgen: first use confgen to generate several initial conformations, then run multiple confsearch jobs with both ligands constrained and with different random seeds, and finally combine all generated conformations. frozen_torsion (not recommended): combine the conformations generated by multiple confsearch jobs with both ligands constrained and with different random seeds; for each subjob, a different additional torsion in the linker is constrained. | 'confgen' |
| as_files | | Memory overflow may occur when a large number of conformations are generated. When True, the task results are not read; a file list is returned instead. | False |
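As a rough illustration of two of these parameters, the sketch below splits a "hostname:concurrency" string and evaluates the documented default for max_steps. Both helper names are hypothetical and are not part of protactm; rounding the default to an integer is also an assumption, since the source gives only the formula.

```python
import math

def parse_host(host):
    """Split a 'hostname:concurrency' string into (hostname, concurrency).

    Assumes the string contains a colon followed by an integer.
    """
    hostname, _, concurrency = host.rpartition(":")
    return hostname, int(concurrency)

def default_max_steps(n_torsions):
    """Documented default: 10000 * sqrt(n_torsions).

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(10000 * math.sqrt(n_torsions))

hostname, concurrency = parse_host("localhost:4")
steps = default_max_steps(16)
print(hostname, concurrency, steps)
```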
confsearch_only_linker_macromodel(host='localhost:1', max_steps=None, energy_window=500, scan_ring_torsions=False, as_files=False, **kw)
Run confsearch only on torsions in the linker. This allows the ligands on both sides to maintain the original conformation. If you need to adjust the ligand conformation in the input structure, use the use_lig_pose() method.
NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| host | str | Host specified as "hostname:concurrency". | 'localhost:1' |
| max_steps | int | Maximum number of steps for the conformational search. The default value is 10000 * n_linker_torsions. | None |
| energy_window | float | Affects the number of outputs; 100 - 5000 is appropriate for avoiding significant structural conflicts. A lower value generates fewer structures, with lower structural energy and fewer structural conflicts. When you lower this value, you may need to raise | 500 |
| scan_ring_torsions | | Whether to re-search the torsions in unsaturated rings. | False |
| as_files | | Memory overflow may occur when a large number of conformations are generated. When True, the task results are not read; a file list is returned instead. | False |
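One common reading of an energy window is that only conformations within that many energy units of the lowest-energy conformation are kept. The sketch below assumes that reading; the source does not state the exact rule or the units, and the function name and energy values are hypothetical.

```python
def within_energy_window(energies, energy_window):
    """Indices of conformations whose energy lies within `energy_window`
    of the lowest energy (window in the same units as the energies)."""
    lowest = min(energies)
    return [i for i, e in enumerate(energies) if e - lowest <= energy_window]

# Made-up energies for five conformations.
energies = [-120.0, -95.0, 310.0, 420.0, -118.5]

kept_wide = within_energy_window(energies, 500)
kept_narrow = within_energy_window(energies, 100)
print(len(kept_wide), len(kept_narrow))
```

Under this assumption, narrowing the window keeps fewer, lower-energy conformations, which matches the parameter description above.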
confsearch_only_linker_numpy(host='localhost:1', max_steps=10000, mc_n_torsions=3)
Run confsearch only on torsions in the linker. This method is a numpy-based Monte Carlo approach that is well suited for the creation of very large conformation collections for further fusion.
NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.
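The general idea of Monte Carlo sampling over linker torsions can be sketched with the standard library alone. This is not the library's algorithm: it assumes that mc_n_torsions is the number of torsions perturbed per step, it has no energy evaluation or acceptance criterion, and the function name is hypothetical.

```python
import random

def mc_torsion_samples(n_linker_torsions, max_steps, mc_n_torsions=3, seed=0):
    """Yield torsion-angle vectors (degrees) from a simple random walk.

    At each step, `mc_n_torsions` randomly chosen torsions are reset to
    uniformly random angles; every visited state is yielded.
    """
    rng = random.Random(seed)
    torsions = [180.0] * n_linker_torsions  # start from an extended linker
    k = min(mc_n_torsions, n_linker_torsions)
    for _ in range(max_steps):
        for idx in rng.sample(range(n_linker_torsions), k):
            torsions[idx] = rng.uniform(-180.0, 180.0)
        yield tuple(torsions)

samples = list(mc_torsion_samples(n_linker_torsions=6, max_steps=5))
print(len(samples), "torsion vectors of length", len(samples[0]))
```

Each torsion vector would still need to be turned into 3D coordinates and screened for clashes, which this sketch leaves out.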
guess_linker()
Guess the linker part in this PROTAC. A built-in empirical fragment-based criterion is used to determine the linker.
guess_linker_from(st, by='full')
Guess the linker part in this PROTAC, by virtue of information from protein-ligand complexes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| st | Union[List[Structure], BinaryPose] | Either: 1. A list of 1-2 protein-ligand complexes, each containing a ligand that corresponds to a substructure of the PROTAC. 2. A BinaryPose object containing ligands that correspond to substructures of the PROTAC. | required |
| by | Literal['sasa', 'full'] | sasa: the portion of the ligand with the larger SASA is recognized as the linker. full: the entire ligand is recognized as a ligand of the PROTAC, and the other parts are recognized as the linker. | 'full' |
guess_partners()
Guess the partner chains of this PROTAC, sorted from the larger partner to the smaller partner.
Returns: Tuple[List[str], Atoms]
Example: For a ternary complex system: Brd4/PROTAC/VHL/ElongC/ElongB,
Note: PROTAC atoms are not included in partner atoms.
ligands: Tuple[ProtacLigand, ProtacLigand]
property
Get the two ligands (warhead and E3 ligand) by splitting the PROTAC.
ligands_sorted: Tuple[ProtacLigand, ProtacLigand]
property
Get the two ligands (warhead and E3 ligand) by splitting the PROTAC, sorted in the same order as the partners.
linker: ProtacLinker
property
writable
Guess the linker part of the PROTAC as a ProtacLinker object.
partner_ligand(partner)
Get the ligand (warhead or E3 ligand) in contact with the specified partner chain.
partners
cached
property
Guess the partner chains of the PROTAC, see: ProtacAtoms.guess_partners().
side_atoms
cached
property
Get the atoms on each side without knowing the linker part.
terminal_atoms
cached
property
Get the atom pair that maximizes the shortest bond path length, without knowing the linker part.
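Finding the atom pair with the longest shortest bond path is a graph-diameter search, which can be sketched with a breadth-first search from every atom. The code below is a generic illustration on a toy adjacency dictionary; the function names and the graph are hypothetical, and the library may compute this differently.

```python
from collections import deque

def bond_path_lengths(adjacency, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist

def farthest_atom_pair(adjacency):
    """Return (atom_a, atom_b, n_bonds) for the pair with the longest shortest path."""
    best = None
    for atom in adjacency:
        dist = bond_path_lengths(adjacency, atom)
        other = max(dist, key=dist.get)
        if best is None or dist[other] > best[2]:
            best = (atom, other, dist[other])
    return best

# Toy bond graph: chain 1-2-3-4-5 with a one-atom branch 3-6.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4, 6], 4: [3, 5], 5: [4], 6: [3]}
pair = farthest_atom_pair(adjacency)
print(pair)
```

Running one search per atom costs O(N * (N + B)) for N atoms and B bonds, which is cheap for molecules of PROTAC size.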
use_lig_pose(ligands, method='snap')
Transform the pose of the ligands in the PROTAC molecule to match the input ligand conformations.
NOTE: This approach is inefficient and is not used in the proposed ternary complex modeling process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ligands | List[LigandAtoms] | LigandAtoms of a small molecule inhibitor, used as a reference to mimic and adjust the conformation of the ligands in the PROTAC. | required |
| method | Literal['snap', 'snap_one_side', 'fast3d', 'torsion'] | snap: recommended method; produces a precise alignment. fast3d (slow!): generate a series of conformations using fast3d, then analyze shape similarity. snap_one_side (slow!): run snap on one ligand and fast3d on the other ligands. torsion: adjust all bond angles to those in the reference ligands; unable to adjust ring conformations. | 'snap' |
Returns: A tuple of (conf, rmsd): conf: A Structure containing the best alignment. rmsd: Root mean square of the ligand superposition RMSDs.
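The "root mean square of RMSDs" can be read as the square root of the mean of the squared per-ligand RMSDs. The sketch below assumes that reading; the function name and the input values are hypothetical.

```python
import math

def combined_rmsd(ligand_rmsds):
    """Root mean square of a list of per-ligand superposition RMSDs (Å)."""
    return math.sqrt(sum(r * r for r in ligand_rmsds) / len(ligand_rmsds))

# Made-up RMSDs for the warhead and the E3 ligand.
rmsd = combined_rmsd([0.3, 0.4])
print(f"Combined RMSD: {rmsd:.3f} Å")
```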
protactm.sch.protac.ProtacLigand
Bases: ProtacPart
protactm.sch.protac.ProtacLinker
Bases: ProtacPart
with_anchors()
Get atoms of the linker.