deeporigin.drug_discovery.Ligand
Bases: Entity
A class representing a ligand molecule in drug discovery workflows.
The Ligand class provides functionality to create, manipulate, and analyze small molecules (ligands) in computational drug discovery. It supports various input formats and provides methods for property prediction, visualization, and file operations.
After running Molprops (deeporigin.drug_discovery.molprops.Molprops), predicted ADMET values
are available on dedicated attributes (log_s, log_d, log_p, herg, cyp,
ames, has_pains, pains_fragments) as well as in properties.
The RDKit molecule must be passed as the keyword-only argument mol (typically via
from_smiles(), from_rdkit_mol(), or similar factory methods).
Attributes
available_for_docking
class-attribute
instance-attribute
available_for_docking: bool = field(
init=False, default=True
)
canonical_smiles
property
canonical_smiles: str
Canonical (RDKit) SMILES for this ligand.
Notes:
- Canonicalization is RDKit-specific.
- Returns implicit-H SMILES by default (explicit Hs removed).
- Preserves stereochemistry if present.
contains_boron
property
contains_boron: bool
Check if the ligand contains boron atoms.
Currently, ligands with boron atoms are not supported for docking.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if the ligand contains boron atoms, False otherwise. |
formal_charge
property
formal_charge: int
Compute the formal charge of the ligand molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | The sum of formal charges of all atoms in the molecule. |
hbond_acceptor_count
property
hbond_acceptor_count: int
Compute the number of hydrogen bond acceptors in the ligand molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | The number of hydrogen bond acceptors. |
hbond_donor_count
property
hbond_donor_count: int
Compute the number of hydrogen bond donors in the ligand molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | The number of hydrogen bond donors. |
local_path
class-attribute
instance-attribute
local_path: str | None = field(default=None, kw_only=True)
molecular_weight
property
molecular_weight: float
Compute the exact molecular weight of the ligand molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| float | float | The exact molecular weight in atomic mass units. |
project_id
class-attribute
instance-attribute
project_id: str | None = field(default=None, kw_only=True)
remote_path
class-attribute
instance-attribute
remote_path: str | None = field(default=None, kw_only=True)
rotatable_bond_count
property
rotatable_bond_count: int
Compute the number of rotatable bonds in the ligand molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | The number of rotatable bonds. |
tpsa
property
tpsa: float
Compute the Topological Polar Surface Area (TPSA) of the ligand molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| float | float | The TPSA value in square Angstroms. |
Functions
add_hydrogens
add_hydrogens(add_coordinates: bool = True)
Add hydrogens to the molecule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| add_coordinates | bool | Whether to generate coordinates for added hydrogens | True |
download
download(
*,
lazy: bool = True,
client: DeepOriginClient | None = None
) -> str
Download the entity file from remote storage.
No-ops if local_path is already set.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | DeepOriginClient \| None | DeepOriginClient instance. If None, uses DeepOriginClient(). | None |
Returns:
| Type | Description |
|---|---|
| str | The local file path. |
Raises:
| Type | Description |
|---|---|
| ValueError | If neither local_path nor remote_path is available. |
embed
embed(add_hydrogens: bool = True, seed: int = -1)
Generate 3D coordinates for the molecule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| add_hydrogens | bool | Whether to add hydrogens | True |
| seed | int | Random seed for coordinate generation | -1 |
ensure_remote_path
ensure_remote_path(
*, client: DeepOriginClient, label: str
) -> None
Ensure remote_path is set after a lazy sync() may have no-oped.
If the entity already has a platform id but remote_path was never
populated (e.g. rehydrated metadata only), sync(lazy=True) returns
early. This performs a full sync when needed, then raises if the path is
still missing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | DeepOriginClient | Authenticated client for sync/upload. | required |
| label | str | Human-readable name for error messages (e.g. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
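The control flow described for ensure_remote_path can be sketched with a small stand-in class. Everything below is hypothetical: `SyncableEntity`, its `full_syncs` counter, and the placeholder id and path are invented for illustration, and the real method also takes a `client` argument that is omitted here. It is a sketch of the documented behavior (lazy sync returns early when an id exists, a full sync follows when remote_path is still unset, and a ValueError is raised if it remains unset), not the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncableEntity:
    """Minimal stand-in for a platform entity; not the real Ligand class."""

    id: Optional[str] = None
    remote_path: Optional[str] = None
    full_syncs: int = 0  # counts how often a non-lazy sync actually ran

    def sync(self, *, lazy: bool = False) -> None:
        if lazy and self.id is not None:
            return  # lazy sync no-ops once the entity has an id
        # Pretend the upload succeeded and the platform assigned values.
        self.full_syncs += 1
        self.id = self.id or "hypothetical-id"
        self.remote_path = "hypothetical/remote/path.sdf"

    def ensure_remote_path(self, *, label: str) -> None:
        self.sync(lazy=True)
        if self.remote_path is None:
            # The lazy sync returned early; fall back to a full sync.
            self.sync(lazy=False)
        if self.remote_path is None:
            raise ValueError(f"{label} has no remote_path after sync")


# An entity rehydrated from metadata only: it has an id but no remote_path.
rehydrated = SyncableEntity(id="abc")
rehydrated.ensure_remote_path(label="ligand")
```

In this toy model the rehydrated entity keeps its id and gains a remote_path after exactly one full sync.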
from_base64
classmethod
from_base64(
base64_string: str, name: str = "", **kwargs: Any
) -> Self
Create a Ligand instance from a base64 encoded SDF string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base64_string | str | Base64 encoded SDF content | required |
| name | str | Name of the ligand. Defaults to "". | '' |
| **kwargs | Any | Additional arguments to pass to the constructor | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| Ligand | Self | A new Ligand instance |
Raises:
| Type | Description |
|---|---|
| DeepOriginException | If the base64 string cannot be decoded or parsed |
from_file
classmethod
from_file(
file_path: str | Path,
*,
sanitize: bool = True,
remove_hydrogens: bool = False
) -> Self
Create a Ligand from a file after verifying it is an SDF (extension and content).
This delegates to from_sdf() after validation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str \| Path | Path to the SDF file. | required |
| sanitize | bool | Whether to sanitize molecules. Defaults to True. | True |
| remove_hydrogens | bool | Whether to remove hydrogens. Defaults to False. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| Ligand | Self | The Ligand instance created from the SDF file. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
| DeepOriginException | If the path is not an SDF file or loading fails. |
from_id
classmethod
from_id(
id: str,
*,
client: Optional[DeepOriginClient] = None,
download: bool = True
) -> Self
Create a Ligand instance from a Deep Origin Data Platform ID.
Fetches the ligand record from the platform. If the record has an associated mol file it is downloaded and used to construct the ligand; otherwise the SMILES string is used.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| id | str | The Deep Origin Data Platform ID of the ligand. | required |
| client | Optional[DeepOriginClient] | Optional DeepOriginClient instance. If not provided, uses the default client. | None |
| download | bool | If False, skip mol file download and hydrate from SMILES with | True |
Returns:
| Name | Type | Description |
|---|---|---|
| Ligand | Self | A new Ligand instance. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the ligand data contains neither a mol file nor a SMILES string. |
from_identifier
classmethod
from_identifier(identifier: str) -> Self
Create a Ligand instance from a compound name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| identifier | str | The identifier to resolve to a SMILES string. | required |
Raises:
| Type | Description |
|---|---|
| DeepOriginException | If no compound is found for the given name |
| AssertionError | If neither smiles nor name is provided |
from_rdkit_mol
classmethod
from_rdkit_mol(
mol: Mol, name: Optional[str] = None, **kwargs: Any
) -> Self
Create a Ligand instance from an RDKit Mol object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol | Mol | RDKit molecule object to convert to a Ligand | required |
| name | Optional[str] | Name of the ligand. Defaults to None. | None |
| **kwargs | Any | Additional arguments to pass to the constructor | {} |
from_remote_file
classmethod
from_remote_file(
remote_path: str,
*,
client: DeepOriginClient | None = None,
lazy: bool = True,
sanitize: bool = True,
remove_hydrogens: bool = False
) -> Self
Create a Ligand from an SDF file stored on the platform.
Downloads the file via deeporigin.platform.files.FilesClient.download(),
then loads it with from_sdf(). The SDF must contain exactly one molecule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| remote_path | str | Platform file path (e.g. org storage path) to the SDF file. | required |
| client | DeepOriginClient \| None | DeepOrigin client used for download. If | None |
| lazy | bool | Passed to | True |
| sanitize | bool | Whether to sanitize molecules when reading the SDF (see :meth: | True |
| remove_hydrogens | bool | Whether to strip hydrogens when reading the SDF (see :meth: | False |
Returns:
| Name | Type | Description |
|---|---|---|
Ligand |
Self
|
A ligand with :attr: |
Self
|
set to |
|
Self
|
set to the downloaded file path. |
from_sdf
classmethod
from_sdf(
file_path: str | Path,
*,
sanitize: bool = True,
remove_hydrogens: bool = False
) -> Self
Create a single Ligand instance from an SDF file containing exactly one molecule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str \| Path | The path to the SDF file. | required |
| sanitize | bool | Whether to sanitize molecules. Defaults to True. | True |
| remove_hydrogens | bool | Whether to remove hydrogens. Defaults to False. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| Ligand | Self | The Ligand instance created from the SDF file. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
| DeepOriginException | If the file cannot be parsed correctly or contains more than one molecule. |
from_smiles
classmethod
from_smiles(
smiles: str, name: str = "", **kwargs: Any
) -> Self
Create a Ligand instance from a SMILES string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| smiles | str | SMILES string representing the ligand | required |
| name | str | Name of the ligand. Defaults to "". | '' |
| **kwargs | Any | Additional arguments to pass to the constructor | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| Ligand | Self | A new Ligand instance |
get_center
get_center() -> list[number]
Get the center of the ligand based on its coordinates.
Returns:
- list: The center coordinates of the ligand.
- None: If coordinates are not available.
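As a rough illustration of the return contract above, the sketch below computes a center from a list of atom positions and returns None when there are none. It assumes the center is the arithmetic mean of the coordinates, which this page does not confirm; `center_of_coordinates` is a hypothetical helper, not part of the Ligand API.

```python
from typing import Optional, Sequence


def center_of_coordinates(
    coords: Sequence[Sequence[float]],
) -> Optional[list[float]]:
    """Arithmetic mean of atom positions, or None when there are no coordinates."""
    if not coords:
        return None
    n = len(coords)
    # zip(*coords) groups all x values, all y values, and all z values.
    return [sum(axis) / n for axis in zip(*coords)]


center = center_of_coordinates([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
missing = center_of_coordinates([])
```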
get_conformer
get_conformer(conformer_id: int = 0)
Get a specific conformer of the molecule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| conformer_id | int | Conformer index | 0 |
get_conformer_id
get_conformer_id() -> int
Get the ID of the current conformer.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | Conformer ID |
get_coordinates
get_coordinates(i: int = 0)
Get the coordinates of atoms in a specific conformer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| i | int | Conformer index | 0 |
get_property
get_property(prop_name: str)
Get the value of a property for the ligand molecule.
Parameters:
- prop_name (str): Name of the property to retrieve.
Returns:
- The value of the property if it exists, otherwise None.
get_species
get_species() -> list[str]
Get the atomic symbols of all atoms in the molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| list | list[str] | List of atomic symbols |
has_3d_structure
has_3d_structure() -> bool
Check if the ligand has 3D coordinates (not just 2D).
This method checks if the molecule has conformers with 3D coordinates. Ligands created from SMILES typically have 2D coordinates (z=0 for all atoms), while ligands with actual 3D structure have non-zero z coordinates.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if the ligand has 3D coordinates (non-zero z values), False if it only has 2D coordinates or no conformers. |
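The z-coordinate test described above can be sketched on plain coordinate lists. The function name `looks_3d` and the tolerance `tol` are assumptions introduced for this example; the library's own check operates on RDKit conformers and may differ in detail.

```python
from typing import Sequence

Coordinates = Sequence[Sequence[float]]


def looks_3d(conformers: Sequence[Coordinates], tol: float = 1e-6) -> bool:
    """True if any conformer has an atom whose z coordinate is non-zero."""
    for coords in conformers:
        if any(abs(atom[2]) > tol for atom in coords):
            return True
    # Either there are no conformers or every atom lies in the z=0 plane.
    return False


flat = [[(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]]       # 2D depiction
embedded = [[(0.0, 0.0, 0.0), (1.0, 1.0, 0.5)]]   # one atom out of plane
```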
has_hydrogens
has_hydrogens() -> bool
Check if the molecule contains hydrogen atoms.
This method determines if hydrogens are present by comparing the canonical SMILES string of the molecule with and without explicit hydrogens added. If the SMILES strings differ, the molecule contains hydrogens.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if the molecule contains hydrogen atoms, False otherwise |
has_unsupported_atoms
has_unsupported_atoms() -> bool
Whether mol contains any atom type not supported for docking workflows.
mol_from_block
classmethod
mol_from_block(
block_type: str,
block: str,
sanitize: bool = True,
remove_hs: bool = False,
) -> Mol
Create a molecule from a block of text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| block_type | str | Type of the input block | required |
| block | str | Text block containing molecular data | required |
| sanitize | bool | Whether to sanitize the molecule | True |
| remove_hs | bool | Whether to remove hydrogens | False |
Returns:
| Type | Description |
|---|---|
| Mol | Chem.Mol: RDKit molecule object |
mol_from_file
classmethod
mol_from_file(
*,
file_type: FILE_FORMATS,
file_path: str,
sanitize: bool = True,
remove_hs: bool = False
) -> Mol
Create a molecule from a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_type | str | Type of the input file (must be in FILE_FORMATS) | required |
| file_path | str | Path to the input file | required |
| sanitize | bool | Whether to sanitize the molecule | True |
| remove_hs | bool | Whether to remove hydrogens | False |
Returns:
| Type | Description |
|---|---|
| Mol | Chem.Mol: RDKit molecule object |
Raises:
| Type | Description |
|---|---|
| DeepOriginException | If the file format is invalid or parsing fails |
| NotImplementedError | If the file type is not supported |
prepare
prepare(*, remove_hydrogens: bool = False) -> Self
Prepare the ligand for downstream workflows.
The routine performs the following using RDKit and internal utilities:
- Salt removal
- Kekulization
- Fragment validation (rejects multiple non-identical fragments)
- Wildcard atom validation (rejects '*' atoms)
- Validation of atom types against supported symbols
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| remove_hydrogens | bool | Whether to remove hydrogens from the SMILES representation. Defaults to False (preserve hydrogens). | False |
Returns:
| Name | Type | Description |
|---|---|---|
| Ligand | Self | The prepared ligand (self), for chaining. |
Raises:
| Type | Description |
|---|---|
| DeepOriginException | If preparation fails, unsupported atom types are present, or multiple non-identical fragments are detected. |
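To make the fragment and wildcard checks listed for prepare more concrete, here is a minimal string-level sketch. It is not how the library works internally (that uses RDKit and runs salt removal first); it assumes identical fragments are written identically in the SMILES, and collapsing duplicates to a single fragment is an assumption of this example. `PreparationError` and `validate_smiles_fragments` are hypothetical names.

```python
class PreparationError(ValueError):
    """Stand-in for DeepOriginException in this sketch."""


def validate_smiles_fragments(smiles: str) -> str:
    """Reject wildcard atoms and mixtures of different fragments.

    Returns a single fragment SMILES when all fragments are identical.
    """
    if "*" in smiles:
        raise PreparationError("wildcard atoms ('*') are not supported")
    # In SMILES, '.' separates disconnected fragments.
    fragments = [frag for frag in smiles.split(".") if frag]
    if not fragments:
        raise PreparationError("empty SMILES")
    if len(set(fragments)) > 1:
        raise PreparationError("multiple non-identical fragments")
    return fragments[0]


single = validate_smiles_fragments("CCO")
duplicated = validate_smiles_fragments("CCO.CCO")
```

A real implementation would compare canonical SMILES per fragment rather than raw substrings, since the same fragment can be written in more than one way.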
process_mol
process_mol() -> None
Clean the ligand molecule by removing hydrogens and sanitizing the structure.
Raises:
| Type | Description |
|---|---|
| DeepOriginException | If salt removal or kekulization fails |
protonate
protonate(
*,
ph: number = 7.4,
filter_percentage: number = 1.0,
client: Optional[DeepOriginClient] = None,
use_cache: bool = True,
quote: bool = False
) -> FunctionResult
Protonate the ligand at a given pH using the DeepOrigin API.
Returns a FunctionResult whose .ligands attribute contains
the protonated ligand. When quote=True, .ligands is empty
and .estimate gives the cost in dollars.
The ligand is mutated in place: self.mol, self.smiles, and
self.protonated_at_ph are updated with the most abundant
protonation state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ph | number | pH value at which to protonate the ligand. Defaults to 7.4 (physiological pH). | 7.4 |
| filter_percentage | number | Percentage threshold for filtering protonation states. Only species with abundance above this threshold are considered. Defaults to 1.0. | 1.0 |
| client | Optional[DeepOriginClient] | DeepOrigin client instance. If None, uses | None |
| use_cache | bool | Whether to use cached protonation results. | True |
| quote | bool | If True, request a cost estimate without executing. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| FunctionResult | FunctionResult | A FunctionResult with a |
register
register(
*,
client: Optional[DeepOriginClient] = None,
remote_path: Optional[str] = None
) -> None
Register the ligand as a new record in the data platform.
Uploads the ligand file to remote storage (if available) and creates a new ligand record, regardless of whether one already exists for this canonical SMILES.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | Optional[DeepOriginClient] | DeepOriginClient instance. If None, uses DeepOriginClient(). | None |
| remote_path | Optional[str] | Custom remote path to upload to. Overrides the default hash-based path. | None |
Returns:
| Type | Description |
|---|---|
None
|
None. As a side effect, uploads the ligand and sets |
None
|
to the newly created record's ID. |
Note
If the ligand was created from a SMILES string without an SDF file, only the SMILES will be used (no file upload will occur).
resolved_project_id
resolved_project_id(
client: DeepOriginClient | None = None,
) -> str | None
Data platform project id for API calls.
Returns project_id when set; otherwise client.project_id when
client is given; otherwise None. Does not read the filesystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | DeepOriginClient \| None | Optional platform client (e.g. the one passed to :meth: | None |
Returns:
| Type | Description |
|---|---|
| str \| None | Project id string, or None if neither the entity nor the client provides one. |
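The fallback order described for resolved_project_id can be written out with two stand-in dataclasses. `FakeClient` and `ProjectScopedEntity` are hypothetical and model only the `project_id` attribute; treating "when set" as "is not None" is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeClient:
    """Stand-in for DeepOriginClient; only project_id is modeled."""

    project_id: Optional[str] = None


@dataclass
class ProjectScopedEntity:
    """Stand-in for an entity such as Ligand; only project_id is modeled."""

    project_id: Optional[str] = None

    def resolved_project_id(
        self, client: Optional[FakeClient] = None
    ) -> Optional[str]:
        # 1. The entity's own project_id wins when it is set.
        if self.project_id is not None:
            return self.project_id
        # 2. Otherwise fall back to the client's project_id, if a client was given.
        if client is not None:
            return client.project_id
        # 3. Neither source provides a project id.
        return None


client = FakeClient(project_id="client-project")
own = ProjectScopedEntity(project_id="entity-project").resolved_project_id(client)
fallback = ProjectScopedEntity().resolved_project_id(client)
nothing = ProjectScopedEntity().resolved_project_id()
```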
set_conformer_id
set_conformer_id(i=0)
Set the ID of the current conformer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| i | int | New conformer ID | 0 |
set_property
set_property(prop_name: str, prop_value)
Set a property for the ligand molecule.
Parameters:
- prop_name (str): Name of the property.
- prop_value: Value of the property.
show
show() -> str | None
Visualize the current state of the ligand molecule.
Returns:
- str: HTML representation of the visualization.
Raises:
- Exception: If visualization fails.
sync
sync(
*,
lazy: bool = False,
client: Optional[DeepOriginClient] = None,
remote_path: Optional[str] = None
) -> None
Sync the ligand to the data platform.
Uploads the ligand file and links to an existing record if one with
the same canonical SMILES already exists (setting id and
remote_path from the record's mol_file when present), otherwise
creates a new record via register().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lazy | bool | If True, skip syncing when the ligand already has an ID. Defaults to False. | False |
| client | Optional[DeepOriginClient] | DeepOriginClient instance. If None, uses DeepOriginClient(). | None |
| remote_path | Optional[str] | Custom remote path to upload to. Overrides the default hash-based path. | None |
Note
If the ligand was created from a SMILES string without an SDF file, only the SMILES will be used for syncing (no file upload will occur).
to_base64
to_base64() -> str
Convert the ligand to base64 encoded SDF format.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Base64 encoded string of the SDF file content |
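As a rough illustration of the encoding involved in to_base64 and from_base64, the sketch below round-trips SDF text through base64 using only the standard library. The helper names `sdf_to_base64` and `sdf_from_base64`, the UTF-8 encoding, and the placeholder SDF text are assumptions made for this example; they are not part of the Ligand API.

```python
import base64


def sdf_to_base64(sdf_text: str) -> str:
    """Encode SDF text as a base64 string (hypothetical helper, UTF-8 assumed)."""
    return base64.b64encode(sdf_text.encode("utf-8")).decode("ascii")


def sdf_from_base64(encoded: str) -> str:
    """Decode a base64 string back into SDF text; raises ValueError on bad input."""
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(encoded, validate=True).decode("utf-8")


# Placeholder text standing in for real SDF content.
sdf_text = "ethanol\n  placeholder SDF body\nM  END\n$$$$\n"
encoded = sdf_to_base64(sdf_text)
decoded = sdf_from_base64(encoded)
```

With the library itself, the equivalent round trip would presumably be Ligand.from_base64(ligand.to_base64()), which this page documents as raising DeepOriginException on undecodable input.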
to_file
to_file(file_path: Optional[str | Path] = None) -> str
Dump state to a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | Optional[str \| Path] | Path where the file will be written. If None, uses default path. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Path to the written file. |
to_hash
to_hash() -> str
Convert the ligand to SHA256 hash of the SDF file content.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | SHA256 hash string of the SDF file content |
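A content hash of this kind can be sketched with hashlib. The helper `sdf_sha256`, the UTF-8 encoding, and the hex output format are assumptions for illustration; the page does not say exactly which bytes the library hashes or how the digest is formatted.

```python
import hashlib


def sdf_sha256(sdf_text: str) -> str:
    """Return the hex SHA256 digest of SDF text (hypothetical helper, UTF-8 assumed)."""
    return hashlib.sha256(sdf_text.encode("utf-8")).hexdigest()


# Identical content hashes identically; any change produces a different digest.
digest_a = sdf_sha256("placeholder SDF body\n$$$$\n")
digest_b = sdf_sha256("placeholder SDF body\n$$$$\n")
digest_c = sdf_sha256("a different SDF body\n$$$$\n")
```

Because the digest depends only on the content, it is a natural basis for stable, deduplicated storage paths such as the hash-based default path mentioned for register and sync.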
to_molblock
to_molblock() -> str
Generate a MOL block representation of the molecule.
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | MOL block string |
to_sdf
to_sdf(output_path: Optional[str] = None) -> str
Write the ligand to an SDF file.
This is a local operation: it serializes the current mol. If the ligand
has remote_path set but no local file yet, this method raises; rehydrate with
download() first.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_path | Optional[str] | Path for the SDF file, or default under | None |
unsupported_atom_symbols
unsupported_atom_symbols() -> list[str]
Sorted unique atom symbols in mol that are not in SUPPORTED_ATOM_SYMBOLS.
update_coordinates
update_coordinates(coordinates: ndarray)
Update the coordinates of the ligand structure.
upload
upload(
*,
client: DeepOriginClient | None = None,
remote_path: str | None = None
) -> None
Upload the entity to the remote server.
Serializes via to_file() with remote_path temporarily
cleared so subclasses that guard exports when only remote metadata is
present (e.g. Ligand.to_sdf()) still write from in-memory state
on repeat uploads.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | DeepOriginClient \| None | DeepOriginClient instance. If None, uses DeepOriginClient(). | None |
| remote_path | str \| None | Custom remote path to upload to. When provided, sets :attr: | None |
write_to_file
write_to_file(
output_path: Optional[str] = None,
output_format: Literal["mol", "sdf", "pdb"] = "sdf",
) -> str | Path
Writes the ligand molecule to a file, including all properties.
Parameters:
- output_path (str): Path where the ligand will be written.
- output_format (Literal[".mol", ".sdf", ".pdb", "mol", "sdf", "pdb"]): Format to write the ligand in.
Raises:
| Type | Description |
|---|---|
| DeepOriginException | If the file extension is unsupported. |
| Exception | If writing to the file fails. |
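Since the parameter description accepts both dotted and bare format names, one plausible way to normalize them is sketched below. `normalize_output_format` and `SUPPORTED_FORMATS` are hypothetical names, case-insensitive matching is an assumption, and ValueError stands in for the DeepOriginException the library documents.

```python
SUPPORTED_FORMATS = ("mol", "sdf", "pdb")


def normalize_output_format(output_format: str) -> str:
    """Accept 'sdf' or '.sdf' style names and return the bare lowercase form."""
    normalized = output_format.lower().lstrip(".")
    if normalized not in SUPPORTED_FORMATS:
        # The library documents DeepOriginException here; ValueError is a stand-in.
        raise ValueError(f"unsupported output format: {output_format!r}")
    return normalized


dotted = normalize_output_format(".sdf")
bare = normalize_output_format("pdb")
```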
ADMET (molprops) attributes
After you run Molprops on the ligand, scalar and structured predictions from the Deep Origin molprops tools are stored on the ligand as well as in properties and RDKit properties:
- log_s, log_d, log_p: map to API keys logS, logD, logP
- herg, cyp, ames: nested structures keyed as hERG, cyp, ames in the API response
- has_pains, pains_fragments: PAINS screening results
Until molprops has been run successfully (via Molprops), these fields remain None.
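The key-to-attribute mapping listed above can be expressed as a small lookup table. This is only a sketch of the naming relationship: `API_KEY_TO_ATTRIBUTE` and `map_molprops_response` are hypothetical, the example response values and the nested `probability` field are made up, and the API keys for the PAINS fields are not stated on this page, so they are left out.

```python
from typing import Any

# API response key -> ligand attribute name, as listed above.
API_KEY_TO_ATTRIBUTE = {
    "logS": "log_s",
    "logD": "log_d",
    "logP": "log_p",
    "hERG": "herg",
    "cyp": "cyp",
    "ames": "ames",
}


def map_molprops_response(response: dict[str, Any]) -> dict[str, Any]:
    """Rename known API keys to attribute names; absent keys map to None."""
    return {attr: response.get(key) for key, attr in API_KEY_TO_ATTRIBUTE.items()}


# Made-up response values, for illustration only.
example = map_molprops_response(
    {"logS": -1.2, "logP": 0.3, "hERG": {"probability": 0.1}}
)
```

Keys missing from the response come back as None, mirroring the statement that these fields remain None until molprops has been run.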
Preparation
Use Ligand.prepare() to perform common preparation steps before docking:
- salt removal, kekulization
- fragment validation (rejects multiple non-identical fragments)
- validation of atom symbols against supported types
Example:
from deeporigin.drug_discovery.structures import Ligand
lig = Ligand.from_smiles("CCO", name="Ethanol")
lig.prepare() # Preserves hydrogens by default
lig.prepare(remove_hydrogens=True) # Remove hydrogens from SMILES