deeporigin.drug_discovery.LigandSet
A class representing a set of Ligand objects.
Attributes:
Name | Type | Description |
---|---|---|
ligands | list[Ligand] | A list of Ligand instances contained in the set. |
network | dict | A dictionary containing the network of ligands estimated using Konnektor. |
Functions
admet_properties
admet_properties(use_cache: bool = True)
Predict ADMET properties for all ligands in the set. This calls the admet_properties() method on each Ligand in the set. Returns a list of the results for each ligand. Shows a progress bar using tqdm.
compute_constraints
compute_constraints(
*, reference: Ligand, mcs_mol=None
) -> list[list[dict]]
Align the ligands in the set to a reference ligand.
embed
embed()
Minimize all ligands in the set using their 3D optimization routines. This calls the embed() method on each Ligand in the set.
filter_top_poses
filter_top_poses(
*, by_pose_score: bool = False
) -> LigandSet
Filter ligands to keep only the best pose for each unique molecule.
Groups ligands by their 'initial_smiles' property and retains only the one with:
- Minimum binding energy (default), or
- Maximum pose score (when by_pose_score=True)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
by_pose_score | bool | If True, select by maximum pose score. If False (default), select by minimum binding energy. | False |
Returns:
Name | Type | Description |
---|---|---|
LigandSet | LigandSet | A new LigandSet containing only the best pose for each unique molecule. |
Raises:
Type | Description |
---|---|
DeepOriginException | If required properties are missing from ligands. |
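The grouping rule described for filter_top_poses can be illustrated with plain dictionaries. This is a minimal sketch of the selection logic only, not the library's implementation: the record layout and the key names 'binding_energy' and 'pose_score' are assumptions made for illustration, and KeyError stands in for DeepOriginException.

```python
from collections import defaultdict


def filter_top_poses(records, by_pose_score=False):
    """Keep the best record for each 'initial_smiles' group.

    `records` is a list of plain dicts standing in for Ligand properties.
    The keys 'binding_energy' and 'pose_score' are assumed names.
    """
    key = "pose_score" if by_pose_score else "binding_energy"
    groups = defaultdict(list)
    for rec in records:
        if "initial_smiles" not in rec or key not in rec:
            # Stand-in for the DeepOriginException raised by the documented method.
            raise KeyError(f"record is missing 'initial_smiles' or '{key}'")
        groups[rec["initial_smiles"]].append(rec)

    # Lowest binding energy wins by default; highest pose score wins otherwise.
    pick = max if by_pose_score else min
    return [pick(group, key=lambda r: r[key]) for group in groups.values()]


poses = [
    {"initial_smiles": "CCO", "binding_energy": -5.1, "pose_score": 0.60},
    {"initial_smiles": "CCO", "binding_energy": -6.3, "pose_score": 0.40},
    {"initial_smiles": "c1ccccc1", "binding_energy": -4.0, "pose_score": 0.90},
]
best_by_energy = filter_top_poses(poses)
best_by_score = filter_top_poses(poses, by_pose_score=True)
```

Note that the two criteria can disagree: for "CCO" the pose with the lowest binding energy is not the one with the highest pose score.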
from_csv
classmethod
from_csv(
file_path: str, smiles_column: str = "smiles"
) -> LigandSet
Create a LigandSet instance from a CSV file containing SMILES strings and additional properties.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | str | The path to the CSV file. | required |
smiles_column | str | The name of the column containing SMILES strings. Defaults to "smiles". | 'smiles' |
Returns:
Name | Type | Description |
---|---|---|
LigandSet | LigandSet | A LigandSet instance containing Ligand objects created from the CSV file. |
Raises:
Type | Description |
---|---|
FileNotFoundError | If the file does not exist. |
DeepOriginException | If the CSV does not contain the specified smiles column or if SMILES strings are invalid. |
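The CSV layout that from_csv expects (one SMILES column plus arbitrary property columns) can be sketched with the standard library alone. The helper read_smiles_rows below is hypothetical and only shows how a SMILES column is separated from the remaining properties; it does not validate SMILES strings, and ValueError stands in for DeepOriginException.

```python
import csv
import io


def read_smiles_rows(handle, smiles_column="smiles"):
    """Return (smiles, properties) pairs from a CSV file-like object."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or smiles_column not in reader.fieldnames:
        # Stand-in for the DeepOriginException raised by the documented method.
        raise ValueError(f"column '{smiles_column}' not found in CSV header")
    rows = []
    for row in reader:
        smiles = row.pop(smiles_column)  # everything left over is a property
        rows.append((smiles, row))
    return rows


# An in-memory CSV keeps the example self-contained.
csv_text = "smiles,name,logp\nCCO,ethanol,-0.31\nc1ccccc1,benzene,2.13\n"
rows = read_smiles_rows(io.StringIO(csv_text))
```

Property values come back as strings here; whether the library converts them to numbers is not stated in this reference.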
from_dir
classmethod
from_dir(directory: str) -> LigandSet
Create a LigandSet instance from a directory containing SDF files.
from_rdkit_mols
classmethod
from_rdkit_mols(mols: list[Mol])
Create a LigandSet from a list of RDKit molecules.
from_sdf
classmethod
from_sdf(
file_path: str,
*,
sanitize: bool = True,
remove_hydrogens: bool = False
) -> LigandSet
Create a LigandSet instance from an SDF file containing one or more molecules.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | str | The path to the SDF file. | required |
sanitize | bool | Whether to sanitize molecules. Defaults to True. | True |
remove_hydrogens | bool | Whether to remove hydrogens. Defaults to False. | False |
Returns:
Name | Type | Description |
---|---|---|
LigandSet | LigandSet | A LigandSet instance containing Ligand objects created from the SDF file. |
Raises:
Type | Description |
---|---|
FileNotFoundError | If the file does not exist. |
DeepOriginException | If the file cannot be parsed correctly. |
from_sdf_files
classmethod
from_sdf_files(
file_paths: list[str],
*,
sanitize: bool = True,
remove_hydrogens: bool = False
) -> LigandSet
Create a LigandSet instance from multiple SDF files by concatenating them together.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_paths | list[str] | A list of paths to SDF files. | required |
sanitize | bool | Whether to sanitize molecules. Defaults to True. | True |
remove_hydrogens | bool | Whether to remove hydrogens. Defaults to False. | False |
Returns:
Name | Type | Description |
---|---|---|
LigandSet | LigandSet | A LigandSet instance containing Ligand objects from all SDF files. |
Raises:
Type | Description |
---|---|
FileNotFoundError | If any of the files do not exist. |
DeepOriginException | If any of the files cannot be parsed correctly. |
from_smiles
classmethod
from_smiles(smiles: list[str] | set[str]) -> LigandSet
Create a LigandSet from a list or set of SMILES strings.
map_network
map_network(
*,
use_cache: bool = True,
operation: Literal[
"mapping", "network", "full"
] = "network",
network_type: Literal["star", "mst", "cyclic"] = "mst"
)
Map a network of ligands from an SDF file using the DeepOrigin API.
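The network_type options are not defined in this reference. Assuming "star" and "mst" carry their usual graph-theory meanings (a hub connected to every other node, and a minimum spanning tree), the difference between the two topologies can be sketched on a small, made-up distance matrix. The functions star_edges and mst_edges are hypothetical helpers; the actual networks are produced by the DeepOrigin API.

```python
def star_edges(n, center=0):
    """Connect one central node to every other node."""
    return [(center, i) for i in range(n) if i != center]


def mst_edges(dist):
    """Minimum spanning tree over a dense symmetric distance matrix (Prim's algorithm)."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge that joins the tree to a node outside it.
        u, v = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


# Made-up pairwise dissimilarities between four ligands (lower means more similar).
dist = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.3, 0.7],
    [0.9, 0.3, 0.0, 0.1],
    [0.8, 0.7, 0.1, 0.0],
]
star = star_edges(len(dist))
mst = mst_edges(dist)
```

Both topologies use n - 1 edges for n ligands; the star forces every comparison through one ligand, while the spanning tree chains the most similar pairs.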
mcs
mcs() -> str
Generate the maximum common substructure (MCS) for the ligands in the LigandSet.
Returns:
Type | Description |
---|---|
str | SMARTS string representing the MCS. |
protonate
protonate(
*, ph: number = 7.4, filter_percentage: number = 1.0
)
Protonate the ligands in the LigandSet. Only the most abundant species is retained for each ligand.
random_sample
random_sample(n: int) -> LigandSet
Return a new LigandSet containing n randomly selected ligands.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | Number of ligands to randomly sample. | required |
Returns:
Name | Type | Description |
---|---|---|
LigandSet | LigandSet | A new LigandSet with n randomly selected ligands. |
Raises:
Type | Description |
---|---|
ValueError | If n is greater than the total number of ligands. |
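The sampling contract above (n distinct ligands, ValueError when n exceeds the set size) matches sampling without replacement. A minimal sketch on a plain list follows; the seed parameter is a hypothetical addition for reproducibility and is not part of the documented signature.

```python
import random


def random_sample(items, n, seed=None):
    """Return n distinct items chosen at random, without replacement."""
    if n > len(items):
        raise ValueError(f"cannot sample {n} items from a collection of {len(items)}")
    rng = random.Random(seed)  # hypothetical seed, for repeatable examples only
    return rng.sample(items, n)


# Strings stand in for Ligand objects.
ligand_names = ["lig-a", "lig-b", "lig-c", "lig-d"]
subset = random_sample(ligand_names, 2, seed=0)
```

The input list is left unchanged, which mirrors the documented behavior of returning a new LigandSet.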
show_grid
show_grid(
mols_per_row: int = 3,
sub_img_size: tuple[int, int] = (300, 300),
)
Show all ligands in the LigandSet in a grid.
to_sdf
to_sdf(output_path: Optional[str] = None) -> str
Write all ligands in the set to a single SDF file, preserving all properties from each Ligand's mol field.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_path | Optional[str] | The path to the output SDF file. | None |
Returns:
Name | Type | Description |
---|---|---|
str | str | The path to the written SDF file. |