deeporigin.drug_discovery.Ligand

Bases: Entity

A class representing a ligand molecule in drug discovery workflows.

The Ligand class provides functionality to create, manipulate, and analyze small molecules (ligands) in computational drug discovery. It supports various input formats and provides methods for property prediction, visualization, and file operations.

After running Molprops (deeporigin.drug_discovery.molprops.Molprops), predicted ADMET values are available on dedicated attributes (log_s, log_d, log_p, herg, cyp, ames, has_pains, pains_fragments) as well as in properties.

The RDKit molecule must be passed as the keyword-only argument mol. This is typically handled by a factory method such as from_smiles or from_rdkit_mol.

Attributes

ames class-attribute instance-attribute

ames: dict[str, Any] | None = None

atom_types property

atom_types

available_for_docking class-attribute instance-attribute

available_for_docking: bool = field(
    init=False, default=True
)

block_content class-attribute instance-attribute

block_content: str | None = None

block_type class-attribute instance-attribute

block_type: str | None = None

canonical_smiles property

canonical_smiles: str

Canonical (RDKit) SMILES for this ligand.

Notes:

  • Canonicalization is RDKit-specific.
  • Returns implicit-H SMILES by default (explicit Hs removed).
  • Preserves stereochemistry if present.

contains_boron property

contains_boron: bool

Check if the ligand contains boron atoms.

Currently, ligands with boron atoms are not supported for docking.

Returns:

Name Type Description
bool bool

True if the ligand contains boron atoms, False otherwise.

coordinates property

coordinates

cyp class-attribute instance-attribute

cyp: dict[str, Any] | None = None

formal_charge property

formal_charge: int

Compute the formal charge of the ligand molecule.

Returns:

Name Type Description
int int

The sum of formal charges of all atoms in the molecule.

has_pains class-attribute instance-attribute

has_pains: bool | None = None

hbond_acceptor_count property

hbond_acceptor_count: int

Compute the number of hydrogen bond acceptors in the ligand molecule.

Returns:

Name Type Description
int int

The number of hydrogen bond acceptors.

hbond_donor_count property

hbond_donor_count: int

Compute the number of hydrogen bond donors in the ligand molecule.

Returns:

Name Type Description
int int

The number of hydrogen bond donors.

herg class-attribute instance-attribute

herg: dict[str, Any] | None = None

id class-attribute instance-attribute

id: str | None = field(default=None, kw_only=True)

identifier class-attribute instance-attribute

identifier: str | None = None

local_path class-attribute instance-attribute

local_path: str | None = field(default=None, kw_only=True)

log_d class-attribute instance-attribute

log_d: float | None = None

log_p class-attribute instance-attribute

log_p: float | None = None

log_s class-attribute instance-attribute

log_s: float | None = None

mol class-attribute instance-attribute

mol: Mol = field(kw_only=True)

molecular_weight property

molecular_weight: float

Compute the exact molecular weight of the ligand molecule.

Returns:

Name Type Description
float float

The exact molecular weight in atomic mass units.

name class-attribute instance-attribute

name: str | None = None

pains_fragments class-attribute instance-attribute

pains_fragments: list[Any] | None = None

prepared class-attribute instance-attribute

prepared: bool = field(init=False, default=False)

project_id class-attribute instance-attribute

project_id: str | None = field(default=None, kw_only=True)

properties class-attribute instance-attribute

properties: dict = field(default_factory=dict)

protonated_at_ph class-attribute instance-attribute

protonated_at_ph: float | None = None

remote_path class-attribute instance-attribute

remote_path: str | None = field(default=None, kw_only=True)

rotatable_bond_count property

rotatable_bond_count: int

Compute the number of rotatable bonds in the ligand molecule.

Returns:

Name Type Description
int int

The number of rotatable bonds.

seed class-attribute instance-attribute

seed: int | None = None

smiles class-attribute instance-attribute

smiles: str | None = None

tpsa property

tpsa: float

Compute the Topological Polar Surface Area (TPSA) of the ligand molecule.

Returns:

Name Type Description
float float

The TPSA value in square Angstroms.

xref_ins_code class-attribute instance-attribute

xref_ins_code: str | None = None

xref_protein class-attribute instance-attribute

xref_protein: str | None = None

xref_protein_chain_id class-attribute instance-attribute

xref_protein_chain_id: str | None = None

xref_residue_id class-attribute instance-attribute

xref_residue_id: str | None = None

Functions

add_hydrogens

add_hydrogens(add_coordinates: bool = True)

Add hydrogens to the molecule.

Parameters:

Name Type Description Default
add_coordinates bool

Whether to generate coordinates for added hydrogens

True

download

download(
    *,
    lazy: bool = True,
    client: DeepOriginClient | None = None
) -> str

Download the entity file from remote storage.

No-ops if local_path is already set.

Parameters:

Name Type Description Default
client DeepOriginClient | None

DeepOriginClient instance. If None, uses DeepOriginClient().

None

Returns:

Type Description
str

The local file path.

Raises:

Type Description
ValueError

If neither local_path nor remote_path is available.

draw

draw()

Draw the contained RDKit molecule using RDKit's drawing methods.

embed

embed(add_hydrogens: bool = True, seed: int = -1)

Generate 3D coordinates for the molecule.

Parameters:

Name Type Description Default
add_hydrogens bool

Whether to add hydrogens

True
seed int

Random seed for coordinate generation

-1

ensure_remote_path

ensure_remote_path(
    *, client: DeepOriginClient, label: str
) -> None

Ensure remote_path is set, even when a lazy sync returned early without populating it.

If the entity already has a platform id but remote_path was never populated (e.g. rehydrated metadata only), sync(lazy=True) returns early. This performs a full sync when needed, then raises if the path is still missing.

Parameters:

Name Type Description Default
client DeepOriginClient

Authenticated client for sync/upload.

required
label str

Human-readable name for error messages (e.g. "Protein").

required

Raises:

Type Description
ValueError

If remote_path cannot be determined after sync.

from_base64 classmethod

from_base64(
    base64_string: str, name: str = "", **kwargs: Any
) -> Self

Create a Ligand instance from a base64 encoded SDF string.

Parameters:

Name Type Description Default
base64_string str

Base64 encoded SDF content

required
name str

Name of the ligand. Defaults to "".

''
**kwargs Any

Additional arguments to pass to the constructor

{}

Returns:

Name Type Description
Ligand Self

A new Ligand instance

Raises:

Type Description
DeepOriginException

If the base64 string cannot be decoded or parsed

from_file classmethod

from_file(
    file_path: str | Path,
    *,
    sanitize: bool = True,
    remove_hydrogens: bool = False
) -> Self

Create a Ligand from a file after verifying it is an SDF (extension and content).

This delegates to :meth:from_sdf after validation.

Parameters:

Name Type Description Default
file_path str | Path

Path to the SDF file.

required
sanitize bool

Whether to sanitize molecules. Defaults to True.

True
remove_hydrogens bool

Whether to remove hydrogens. Defaults to False.

False

Returns:

Name Type Description
Ligand Self

The Ligand instance created from the SDF file.

Raises:

Type Description
FileNotFoundError

If the file does not exist.

DeepOriginException

If the path is not an SDF file or loading fails.

from_id classmethod

from_id(
    id: str,
    *,
    client: Optional[DeepOriginClient] = None,
    download: bool = True
) -> Self

Create a Ligand instance from a Deep Origin Data Platform ID.

Fetches the ligand record from the platform. If the record has an associated mol file it is downloaded and used to construct the ligand; otherwise the SMILES string is used.

Parameters:

Name Type Description Default
id str

The Deep Origin Data Platform ID of the ligand.

required
client Optional[DeepOriginClient]

Optional DeepOriginClient instance. If not provided, uses the default client.

None
download bool

If False, skip mol file download and hydrate from SMILES with remote_path set to the platform mol file path.

True

Returns:

Name Type Description
Ligand Self

A new Ligand instance.

Raises:

Type Description
ValueError

If the ligand data contains neither a mol file nor a SMILES string.

from_identifier classmethod

from_identifier(identifier: str) -> Self

Create a Ligand instance from a compound name.

Parameters:

Name Type Description Default
identifier str

The identifier to resolve to a SMILES string.

required

Raises:

Type Description
DeepOriginException

If no compound is found for the given name

AssertionError

If neither smiles nor name is provided

from_rdkit_mol classmethod

from_rdkit_mol(
    mol: Mol, name: Optional[str] = None, **kwargs: Any
) -> Self

Create a Ligand instance from an RDKit Mol object.

Parameters:

Name Type Description Default
mol Mol

RDKit molecule object to convert to a Ligand

required
name Optional[str]

Name of the ligand. Defaults to None.

None
**kwargs Any

Additional arguments to pass to the constructor

{}

from_remote_file classmethod

from_remote_file(
    remote_path: str,
    *,
    client: DeepOriginClient | None = None,
    lazy: bool = True,
    sanitize: bool = True,
    remove_hydrogens: bool = False
) -> Self

Create a Ligand from an SDF file stored on the platform.

Downloads the file via deeporigin.platform.files.FilesClient.download, then loads it with from_sdf. The SDF must contain exactly one molecule.

Parameters:

Name Type Description Default
remote_path str

Platform file path (e.g. org storage path) to the SDF file.

required
client DeepOriginClient | None

DeepOrigin client used for download. If None, uses DeepOriginClient().

None
lazy bool

Passed to files.download; if True, skip download when the file already exists locally at the default cache location.

True
sanitize bool

Whether to sanitize molecules when reading the SDF (see from_sdf).

True
remove_hydrogens bool

Whether to strip hydrogens when reading the SDF (see from_sdf).

False

Returns:

Name Type Description
Ligand Self

A ligand with remote_path set to the given remote_path and local_path set to the downloaded file path.

from_sdf classmethod

from_sdf(
    file_path: str | Path,
    *,
    sanitize: bool = True,
    remove_hydrogens: bool = False
) -> Self

Create a single Ligand instance from an SDF file containing exactly one molecule.

Parameters:

Name Type Description Default
file_path str | Path

The path to the SDF file.

required
sanitize bool

Whether to sanitize molecules. Defaults to True.

True
remove_hydrogens bool

Whether to remove hydrogens. Defaults to False.

False

Returns:

Name Type Description
Ligand Self

The Ligand instance created from the SDF file.

Raises:

Type Description
FileNotFoundError

If the file does not exist.

DeepOriginException

If the file cannot be parsed correctly or contains more than one molecule.

from_smiles classmethod

from_smiles(
    smiles: str, name: str = "", **kwargs: Any
) -> Self

Create a Ligand instance from a SMILES string.

Parameters:

Name Type Description Default
smiles str

SMILES string representing the ligand

required
name str

Name of the ligand. Defaults to "".

''
**kwargs Any

Additional arguments to pass to the constructor

{}

Returns:

Name Type Description
Ligand Self

A new Ligand instance

get_center

get_center() -> list[number]

Get the center of the ligand based on its coordinates.

Returns:

  • list: The center coordinates of the ligand.
  • None: If coordinates are not available.
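As a rough illustration of what a coordinate-based center can look like, the sketch below computes the arithmetic mean of atom positions in plain Python. Whether get_center uses the mean or some other definition (for example a bounding-box midpoint) is not stated here, so centroid is a hypothetical helper and not the library's implementation.

```python
from typing import List, Optional, Sequence


def centroid(coords: Sequence[Sequence[float]]) -> Optional[List[float]]:
    """Return the arithmetic mean of (x, y, z) rows, or None if there are none."""
    if not coords:
        # Mirrors the documented "None if coordinates are not available".
        return None
    n = len(coords)
    return [sum(row[axis] for row in coords) / n for axis in range(3)]


# Hypothetical coordinates for three atoms, in angstroms.
example = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
center = centroid(example)
```

A center of this kind is commonly used to position a docking box around a reference ligand.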

get_conformer

get_conformer(conformer_id: int = 0)

Get a specific conformer of the molecule.

Parameters:

Name Type Description Default
conformer_id int

Conformer index

0

get_conformer_id

get_conformer_id() -> int

Get the ID of the current conformer.

Returns:

Name Type Description
int int

Conformer ID

get_coordinates

get_coordinates(i: int = 0)

Get the coordinates of atoms in a specific conformer.

Parameters:

Name Type Description Default
i int

Conformer index

0

get_formula

get_formula() -> str

Get the chemical formula of the molecule.

get_heavy_atom_count

get_heavy_atom_count() -> int

Get the number of heavy atoms in the molecule.

get_property

get_property(prop_name: str)

Get the value of a property for the ligand molecule.

Parameters:

  • prop_name (str): Name of the property to retrieve.

Returns:

  • The value of the property if it exists, otherwise None.

get_species

get_species() -> list[str]

Get the atomic symbols of all atoms in the molecule.

Returns:

Name Type Description
list list[str]

List of atomic symbols

has_3d_structure

has_3d_structure() -> bool

Check if the ligand has 3D coordinates (not just 2D).

This method checks if the molecule has conformers with 3D coordinates. Ligands created from SMILES typically have 2D coordinates (z=0 for all atoms), while ligands with actual 3D structure have non-zero z coordinates.

Returns:

Name Type Description
bool bool

True if the ligand has 3D coordinates (non-zero z values), False if it only has 2D coordinates or no conformers.
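The 2D-versus-3D distinction described above comes down to whether any z coordinate is non-zero. A minimal sketch of that test on raw coordinates follows; the function name and the tolerance tol are assumptions made for illustration, and the real method works on RDKit conformers.

```python
from typing import Sequence


def looks_three_dimensional(coords: Sequence[Sequence[float]], tol: float = 1e-6) -> bool:
    """Return True if any atom's z coordinate is farther than tol from zero."""
    return any(abs(row[2]) > tol for row in coords)


flat = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]       # 2D depiction: every z is 0
embedded = [(0.0, 0.0, 0.0), (1.2, 0.3, -0.8)]  # at least one non-zero z
```

An empty coordinate list returns False here, which matches the documented behavior for a ligand with no conformers.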

has_hydrogens

has_hydrogens() -> bool

Check if the molecule contains hydrogen atoms.

This method determines if hydrogens are present by comparing the canonical SMILES string of the molecule with and without explicit hydrogens added. If the SMILES strings differ, the molecule contains hydrogens.

Returns:

Name Type Description
bool bool

True if the molecule contains hydrogen atoms, False otherwise

has_unsupported_atoms

has_unsupported_atoms() -> bool

Check whether mol contains any atom type not supported for docking workflows.

is_charged

is_charged() -> bool

Check if the molecule is charged.

mol_from_block classmethod

mol_from_block(
    block_type: str,
    block: str,
    sanitize: bool = True,
    remove_hs: bool = False,
) -> Mol

Create a molecule from a block of text.

Parameters:

Name Type Description Default
block_type str

Type of the input block

required
block str

Text block containing molecular data

required
sanitize bool

Whether to sanitize the molecule

True
remove_hs bool

Whether to remove hydrogens

False

Returns:

Type Description
Mol

Chem.Mol: RDKit molecule object

mol_from_file classmethod

mol_from_file(
    *,
    file_type: FILE_FORMATS,
    file_path: str,
    sanitize: bool = True,
    remove_hs: bool = False
) -> Mol

Create a molecule from a file.

Parameters:

Name Type Description Default
file_type str

Type of the input file (must be in FILE_FORMATS)

required
file_path str

Path to the input file

required
sanitize bool

Whether to sanitize the molecule

True
remove_hs bool

Whether to remove hydrogens

False

Returns:

Type Description
Mol

Chem.Mol: RDKit molecule object

Raises:

Type Description
DeepOriginException

If the file format is invalid or parsing fails

NotImplementedError

If the file type is not supported

prepare

prepare(*, remove_hydrogens: bool = False) -> Self

Prepare the ligand for downstream workflows.

The routine performs the following using RDKit and internal utilities:

  • Salt removal
  • Kekulization
  • Fragment validation (rejects multiple non-identical fragments)
  • Wildcard atom validation (rejects '*' atoms)
  • Validation of atom types against supported symbols

Parameters:

Name Type Description Default
remove_hydrogens bool

Whether to remove hydrogens from the SMILES representation. Defaults to False (preserve hydrogens).

False

Returns:

Name Type Description
Ligand Self

The prepared ligand (self), for chaining.

Raises:

Type Description
DeepOriginException

If preparation fails, unsupported atom types are present, or multiple non-identical fragments are detected.
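To make the fragment and wildcard checks concrete, here is a simplified, string-level sketch. It assumes the input fragments are already canonical so that string equality can stand in for molecular identity. It skips salt removal and kekulization entirely, and it raises ValueError as a stand-in for DeepOriginException. validate_fragments is a hypothetical name, not part of the library.

```python
from typing import List


def validate_fragments(smiles: str) -> List[str]:
    """Split a SMILES string into fragments and reject unsupported input."""
    if "*" in smiles:
        raise ValueError("wildcard atoms ('*') are not allowed")
    # In SMILES, '.' separates disconnected components.
    fragments = smiles.split(".")
    distinct = sorted(set(fragments))
    if len(distinct) > 1:
        raise ValueError(f"multiple non-identical fragments: {distinct}")
    return fragments


single = validate_fragments("CCO")          # one fragment
duplicated = validate_fragments("CCO.CCO")  # identical fragments are accepted
```

Under this sketch an input such as "CCO.[Na+]" is rejected. The real routine runs salt removal first, so it may treat such input differently.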

process_mol

process_mol() -> None

Clean the ligand molecule by removing hydrogens and sanitizing the structure.

Raises:

Type Description
DeepOriginException

If salt removal or kekulization fails

protonate

protonate(
    *,
    ph: number = 7.4,
    filter_percentage: number = 1.0,
    client: Optional[DeepOriginClient] = None,
    use_cache: bool = True,
    quote: bool = False
) -> FunctionResult

Protonate the ligand at a given pH using the DeepOrigin API.

Returns a FunctionResult whose .ligands attribute contains the protonated ligand. When quote=True, .ligands is empty and .estimate gives the cost in dollars.

The ligand is mutated in place: self.mol, self.smiles, and self.protonated_at_ph are updated with the most abundant protonation state.

Parameters:

Name Type Description Default
ph number

pH value at which to protonate the ligand. Defaults to 7.4 (physiological pH).

7.4
filter_percentage number

Percentage threshold for filtering protonation states. Only species with abundance above this threshold are considered. Defaults to 1.0.

1.0
client Optional[DeepOriginClient]

DeepOrigin client instance. If None, uses DeepOriginClient().

None
use_cache bool

Whether to use cached protonation results.

True
quote bool

If True, request a cost estimate without executing.

False

Returns:

Name Type Description
FunctionResult FunctionResult

A FunctionResult with a .ligands attribute (list of Ligand).
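The role of filter_percentage can be pictured as a threshold applied before the most abundant state is chosen. The sketch below uses made-up abundance values and a hypothetical helper, most_abundant. The actual service computes the abundances, and its response format is not shown here.

```python
from typing import Dict, Optional


def most_abundant(states: Dict[str, float], filter_percentage: float = 1.0) -> Optional[str]:
    """Return the SMILES with the highest abundance above the threshold, if any."""
    kept = {smi: pct for smi, pct in states.items() if pct > filter_percentage}
    if not kept:
        return None
    return max(kept, key=kept.get)


# Illustrative (not computed) abundances, in percent, for two states of an amine.
states = {"CC[NH3+]": 99.2, "CCN": 0.8}
best = most_abundant(states)
```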

register

register(
    *,
    client: Optional[DeepOriginClient] = None,
    remote_path: Optional[str] = None
) -> None

Register the ligand as a new record in the data platform.

Uploads the ligand file to remote storage (if available) and creates a new ligand record, regardless of whether one already exists for this canonical SMILES.

Parameters:

Name Type Description Default
client Optional[DeepOriginClient]

DeepOriginClient instance. If None, uses DeepOriginClient().

None
remote_path Optional[str]

Custom remote path to upload to. Overrides the default hash-based path.

None

Returns:

Type Description
None

None. As a side effect, uploads the ligand and sets self.id to the newly created record's ID.

Note

If the ligand was created from a SMILES string without an SDF file, only the SMILES will be used (no file upload will occur).

resolved_project_id

resolved_project_id(
    client: DeepOriginClient | None = None,
) -> str | None

Data platform project id for API calls.

Returns project_id when set; otherwise client.project_id when client is given; otherwise None. Does not read the filesystem.

Parameters:

Name Type Description Default
client DeepOriginClient | None

Optional platform client (e.g. the one passed to sync).

None

Returns:

Type Description
str | None

Project id string, or None if neither the entity nor the client provides one.
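The fallback order described above (entity first, then client, then None) can be sketched with two stand-in dataclasses. StubClient and StubEntity are hypothetical and exist only to show the precedence. They are not the library's classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubClient:
    """Stand-in for DeepOriginClient, carrying only project_id."""
    project_id: Optional[str] = None


@dataclass
class StubEntity:
    """Stand-in for a Ligand, carrying only project_id."""
    project_id: Optional[str] = None

    def resolved_project_id(self, client: Optional[StubClient] = None) -> Optional[str]:
        if self.project_id is not None:  # 1. the entity's own value wins
            return self.project_id
        if client is not None:           # 2. otherwise fall back to the client
            return client.project_id
        return None                      # 3. nothing available


client = StubClient(project_id="proj-client")
```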

set_conformer_id

set_conformer_id(i=0)

Set the ID of the current conformer.

Parameters:

Name Type Description Default
i int

New conformer ID

0

set_property

set_property(prop_name: str, prop_value)

Set a property for the ligand molecule.

Parameters:

  • prop_name (str): Name of the property.
  • prop_value: Value of the property.

show

show() -> str | None

Visualize the current state of the ligand molecule.

Returns:

  • str: HTML representation of the visualization.

Raises:

  • Exception: If visualization fails.

sync

sync(
    *,
    lazy: bool = False,
    client: Optional[DeepOriginClient] = None,
    remote_path: Optional[str] = None
) -> None

Sync the ligand to the data platform.

Uploads the ligand file and links to an existing record if one with the same canonical SMILES already exists (setting id and remote_path from the record's mol_file when present); otherwise creates a new record via register.

Parameters:

Name Type Description Default
lazy bool

If True, skip syncing when the ligand already has an ID. Defaults to False.

False
client Optional[DeepOriginClient]

DeepOriginClient instance. If None, uses DeepOriginClient().

None
remote_path Optional[str]

Custom remote path to upload to. Overrides the default hash-based path.

None

Note

If the ligand was created from a SMILES string without an SDF file, only the SMILES will be used for syncing (no file upload will occur).
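The decision flow of sync (skip when lazy and already identified, link when a matching record exists, register otherwise) can be sketched against an in-memory dictionary. Everything here is a stand-in: StubLigand, sync_sketch, the records mapping, and the ID scheme are hypothetical. File upload and remote_path handling are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StubLigand:
    canonical_smiles: str
    id: Optional[str] = None


def sync_sketch(ligand: StubLigand, records: Dict[str, str], lazy: bool = False) -> str:
    """Set ligand.id and report which branch ran: skipped, linked, or registered."""
    if lazy and ligand.id is not None:
        return "skipped"                       # lazy mode: an ID is already present
    existing = records.get(ligand.canonical_smiles)
    if existing is not None:
        ligand.id = existing                   # link to the record with the same SMILES
        return "linked"
    ligand.id = f"ligand-{len(records) + 1}"   # hypothetical ID scheme
    records[ligand.canonical_smiles] = ligand.id
    return "registered"


records: Dict[str, str] = {}
first = StubLigand("CCO")
second = StubLigand("CCO")
outcomes = [
    sync_sketch(first, records),
    sync_sketch(second, records),
    sync_sketch(second, records, lazy=True),
]
```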

to_base64

to_base64() -> str

Convert the ligand to base64 encoded SDF format.

Returns:

Name Type Description
str str

Base64 encoded string of the SDF file content
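For readers unfamiliar with the encoding, the following sketch shows a base64 round trip of SDF text using only the standard library. The SDF content is a placeholder, and the choice of UTF-8 is an assumption. This shows the encoding itself, not how to_base64 or from_base64 are implemented.

```python
import base64

# Placeholder SDF text; a real record would contain atom and bond blocks.
sdf_text = "placeholder\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"

# Encode the text to a base64 string, then decode it back.
encoded = base64.b64encode(sdf_text.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")
```

The encoded form is plain ASCII, which makes it safe to embed in JSON payloads.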

to_file

to_file(file_path: Optional[str | Path] = None) -> str

Dump state to a file.

Parameters:

Name Type Description Default
file_path Optional[str | Path]

Path where the file will be written. If None, uses default path.

None

Returns:

Name Type Description
str str

Path to the written file.

to_hash

to_hash() -> str

Convert the ligand to SHA256 hash of the SDF file content.

Returns:

Name Type Description
str str

SHA256 hash string of the SDF file content
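A content hash of this kind can be reproduced with hashlib. The sketch below hashes placeholder SDF text. Whether to_hash hashes the UTF-8 bytes of the SDF exactly as written, and returns a hex digest, are assumptions.

```python
import hashlib


def sha256_of_text(text: str) -> str:
    """Return the hex SHA256 digest of the UTF-8 bytes of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Placeholder SDF text used only to demonstrate the hashing step.
sdf_text = "placeholder\n  sketch\n\nM  END\n$$$$\n"
digest = sha256_of_text(sdf_text)
```

Because the digest depends only on the content, identical SDF text always maps to the same value. That is what makes a hash-based default remote path possible.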

to_mol

to_mol(output_path: Optional[str] = None) -> str | Path

Write the ligand to a MOL file.

to_molblock

to_molblock() -> str

Generate a MOL block representation of the molecule.

Returns:

Name Type Description
str str

MOL block string

to_pdb

to_pdb(output_path: Optional[str] = None) -> str | Path

Write the ligand to a PDB file.

to_sdf

to_sdf(output_path: Optional[str] = None) -> str

Write the ligand to an SDF file.

This is a local operation: it serializes the current mol. If the ligand has a remote_path but no local file yet, this method raises an error; rehydrate with download first.

Parameters:

Name Type Description Default
output_path Optional[str]

Path for the SDF file, or default under LIGANDS_DIR.

None

unsupported_atom_symbols

unsupported_atom_symbols() -> list[str]

Sorted unique atom symbols in mol that are not in SUPPORTED_ATOM_SYMBOLS.
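The "sorted unique symbols not in the supported set" rule amounts to a set difference. In the sketch below the allow-list SUPPORTED is hypothetical and may not match the library's SUPPORTED_ATOM_SYMBOLS. The input list stands in for what get_species might return.

```python
from typing import Iterable, List

# Hypothetical allow-list; the library's SUPPORTED_ATOM_SYMBOLS may differ.
SUPPORTED = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}


def unsupported_symbols(symbols: Iterable[str]) -> List[str]:
    """Return the sorted unique symbols that are absent from the allow-list."""
    return sorted(set(symbols) - SUPPORTED)


species = ["C", "C", "O", "B", "Si", "B"]  # stand-in for a get_species() result
missing = unsupported_symbols(species)
```

A boolean check in the spirit of has_unsupported_atoms then reduces to whether this list is non-empty.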

update_coordinates

update_coordinates(coordinates: ndarray)

Update the coordinates of the ligand structure.

upload

upload(
    *,
    client: DeepOriginClient | None = None,
    remote_path: str | None = None
) -> None

Upload the entity to the remote server.

Serializes via to_file with remote_path temporarily cleared, so that subclasses which guard exports when only remote metadata is present (e.g. Ligand.to_sdf) still write from in-memory state on repeat uploads.

Parameters:

Name Type Description Default
client DeepOriginClient | None

DeepOriginClient instance. If None, uses DeepOriginClient().

None
remote_path str | None

Custom remote path to upload to. When provided, sets remote_path before uploading. If remote_path is still unset, it is set to the default hash-based path.

None

write_to_file

write_to_file(
    output_path: Optional[str] = None,
    output_format: Literal["mol", "sdf", "pdb"] = "sdf",
) -> str | Path

Writes the ligand molecule to a file, including all properties.

Parameters:

  • output_path (str): Path where the ligand will be written.
  • output_format (Literal[".mol", ".sdf", ".pdb", "mol", "sdf", "pdb"]): Format to write the ligand in.

Raises:

Type Description
DeepOriginException

If the file extension is unsupported.

Exception

If writing to the file fails.
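Since the parameter description lists both dotted and undotted format names, one plausible way to handle them is to strip a leading dot before validating. The helper below is a hypothetical sketch of that idea. It raises ValueError in place of DeepOriginException and is not taken from the library.

```python
SUPPORTED_FORMATS = ("mol", "sdf", "pdb")


def normalize_format(output_format: str) -> str:
    """Map '.sdf' to 'sdf' and reject anything outside the supported formats."""
    fmt = output_format[1:] if output_format.startswith(".") else output_format
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {output_format!r}")
    return fmt


dotted = normalize_format(".pdb")
plain = normalize_format("sdf")
```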

ADMET (molprops) attributes

After you run Molprops on the ligand, scalar and structured predictions from the Deep Origin molprops tools are stored on the ligand as well as in properties and RDKit properties:

  • log_s, log_d, log_p — map to API keys logS, logD, logP
  • herg, cyp, ames — nested structures keyed as hERG, cyp, ames in the API response
  • has_pains, pains_fragments — PAINS screening results

Until molprops has been run successfully (via Molprops), these fields remain None.
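The key mapping listed above can be pictured as a small lookup table applied to a response dictionary. In this sketch AdmetFields is a stand-in dataclass, apply_response is a hypothetical helper, and the response contents (including the nested "probability" key) are invented for illustration. The real response schema is not shown in this document.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# API key -> ligand attribute, following the list above.
API_TO_ATTR = {
    "logS": "log_s", "logD": "log_d", "logP": "log_p",
    "hERG": "herg", "cyp": "cyp", "ames": "ames",
}


@dataclass
class AdmetFields:
    """Stand-in holding only the ADMET attributes; all default to None."""
    log_s: Optional[float] = None
    log_d: Optional[float] = None
    log_p: Optional[float] = None
    herg: Optional[Dict[str, Any]] = None
    cyp: Optional[Dict[str, Any]] = None
    ames: Optional[Dict[str, Any]] = None


def apply_response(target: AdmetFields, response: Dict[str, Any]) -> AdmetFields:
    """Copy known API keys onto the matching attributes; leave the rest as None."""
    for key, attr in API_TO_ATTR.items():
        if key in response:
            setattr(target, attr, response[key])
    return target


# Invented response used only to exercise the mapping.
fields = apply_response(AdmetFields(), {"logS": -0.5, "logP": 0.1, "hERG": {"probability": 0.02}})
```

Attributes with no corresponding key stay None, which matches the documented state before molprops has been run.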

Preparation

Use Ligand.prepare() to perform common preparation steps before docking:

  • salt removal, kekulization
  • fragment validation (rejects multiple non-identical fragments)
  • validation of atom symbols against supported types

Example:

from deeporigin.drug_discovery.structures import Ligand

lig = Ligand.from_smiles("CCO", name="Ethanol")
lig.prepare()  # Preserves hydrogens by default
lig.prepare(remove_hydrogens=True)  # Remove hydrogens from SMILES