In [ ]:
%load_ext autoreload
%autoreload 2
%load_ext jupyter_black

Docking Workflow¶

This notebook demonstrates how to perform molecular docking using Deep Origin's drug discovery platform. You'll learn how to:

  1. Load and prepare proteins - Load a protein structure and prepare it for docking
  2. Find binding pockets - Identify potential binding sites on the protein
  3. Dock ligands - Perform docking calculations for single or multiple ligands
  4. Monitor jobs - Track the progress of docking calculations
  5. Analyze results - Visualize and filter docking poses

Let's get started!

Setup¶

First, we'll import the necessary Deep Origin drug discovery modules.

In [ ]:
from deeporigin.drug_discovery import (
    Complex,
    DATA_DIR,
    Protein,
    LigandSet,
)
import deeporigin

deeporigin.__version__

Load Protein Structure¶

Load a protein structure from a PDB file and wrap it in a Complex object. The Complex represents a protein-ligand system and is used throughout the docking workflow.

In [ ]:
protein = Protein.from_file(DATA_DIR / "brd" / "brd.pdb")
sim = Complex(protein=protein)
sim

Load Ligands¶

Load a set of ligands from a CSV file containing SMILES strings. The LigandSet object allows you to work with multiple ligands at once. You can visualize them in a grid to see what molecules you're working with.

In [ ]:
ligands = LigandSet.from_csv(DATA_DIR / "ligands" / "smiles_to_dock.csv")
ligands
In [ ]:
ligands.show_grid()
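To make the input format concrete, here is a minimal standard-library sketch of pulling SMILES strings out of a CSV file. It does not use or reproduce LigandSet.from_csv. The column name `smiles`, the helper `read_smiles`, and the sample rows are assumptions made for illustration, so check the header of your own file before relying on it.

```python
import csv
import io


def read_smiles(handle, column="smiles"):
    """Return the non-empty SMILES strings found in one column of a CSV file."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in CSV header")
    # Skip rows where the SMILES cell is missing or blank.
    return [
        row[column].strip()
        for row in reader
        if row[column] and row[column].strip()
    ]


# Hypothetical CSV content; a real file would be opened with open(path, newline="").
sample_csv = io.StringIO("name,smiles\nethanol,CCO\nbenzene,c1ccccc1\nblank,\n")
smiles = read_smiles(sample_csv)
print(smiles)
```

Rows with an empty SMILES cell are dropped rather than passed on, since an empty string is not a usable molecule.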

Assign Ligands to Complex¶

Associate the ligands with the protein complex. This prepares the system for docking calculations.

In [ ]:
sim.ligands = ligands
sim

Visualize the Protein¶

Display the protein in 3D to inspect its overall structure before docking.

In [ ]:
sim.protein.show()

Prepare the Protein¶

Before docking, we need to prepare the protein structure. Water molecules are typically removed from crystal structures as they can interfere with docking calculations.

In [ ]:
sim.protein.remove_water()

sim.protein.show()
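To illustrate what water removal means at the file level, the sketch below filters water records out of PDB-formatted text. It is not how remove_water() is implemented. The helper `strip_waters`, the set of water residue names, and the three sample records are assumptions for illustration only.

```python
WATER_RESIDUE_NAMES = {"HOH", "WAT"}


def strip_waters(pdb_text):
    """Drop ATOM/HETATM records whose residue name marks a water molecule."""
    kept = []
    for line in pdb_text.splitlines():
        is_atom_record = line.startswith(("ATOM", "HETATM"))
        # In the PDB format the residue name occupies columns 18-20 (1-based),
        # which is the slice [17:20] in Python.
        if is_atom_record and line[17:20].strip() in WATER_RESIDUE_NAMES:
            continue
        kept.append(line)
    return "\n".join(kept)


# Hypothetical three-record structure: one protein atom, one water, END.
sample_pdb = "\n".join(
    [
        "ATOM      1  N   MET A   1      27.340  24.430   2.614",
        "HETATM    2  O   HOH A 201      10.000  11.000  12.000",
        "END",
    ]
)
cleaned = strip_waters(sample_pdb)
print(cleaned)
```

Only the water record is removed; the protein atom and the END record pass through unchanged.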

Find Pockets¶

The find_pockets() method of Protein detects cavities and potential binding sites on the protein surface. The pocket_count argument sets how many pockets to return; here we request one.

In [ ]:
pockets = sim.protein.find_pockets(pocket_count=1)
sim.protein.show(pockets=pockets)

Inspect Binding Pockets¶

View the detected binding pockets. Each pocket represents a potential binding site. When several pockets are returned, you'll typically want to dock ligands into the most promising one (often the largest or most druggable).

In [ ]:
pockets

Single Ligand Docking Example¶

Let's start with a simple example: docking a single ligand into a pocket. This demonstrates the basic docking workflow:

  1. Dock the ligand - Calculate possible binding poses
  2. View the poses - Visualize the docked conformations
  3. Analyze results - Examine binding energies and scores
  4. Filter top poses - Select the best binding pose

The dock() method of Protein returns a LigandSet object containing all calculated binding poses.

In [ ]:
poses = sim.protein.dock(
    pocket=pockets[0],
    ligand=sim.ligands[0],
)

View Docking Poses¶

Visualize all the calculated poses for the ligand. Each pose represents a different binding conformation with its own binding energy and score.

In [ ]:
sim.protein.show(poses=poses)

Analyze Docking Results¶

Convert the poses to a pandas DataFrame for detailed analysis. This allows you to:

  • Compare binding energies across poses
  • Examine pose scores
  • Filter and sort poses based on various criteria
In [ ]:
poses.to_dataframe()
In [ ]:
poses

Filter Best Poses¶

Select the top pose (best binding conformation) for the ligand. The filter_top_poses() method selects poses based on binding energy and score criteria.

In [ ]:
top_pose = poses.filter_top_poses()
sim.protein.show(poses=top_pose)
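As a rough picture of what selecting a top pose involves, the sketch below keeps the lowest-energy pose for each ligand from a plain list of records. It is not the implementation of filter_top_poses(). The field names `ligand`, `pose_id`, and `binding_energy`, the helper `best_pose_per_ligand`, and the numbers are all made up for illustration, and it assumes that a lower (more negative) energy is better.

```python
def best_pose_per_ligand(poses, key="binding_energy"):
    """Keep, for each ligand, the pose with the lowest value of `key`."""
    best = {}
    for pose in poses:
        current = best.get(pose["ligand"])
        if current is None or pose[key] < current[key]:
            best[pose["ligand"]] = pose
    return list(best.values())


# Hypothetical pose records with made-up energies.
example_poses = [
    {"ligand": "lig-1", "pose_id": 0, "binding_energy": -7.2},
    {"ligand": "lig-1", "pose_id": 1, "binding_energy": -8.1},
    {"ligand": "lig-2", "pose_id": 0, "binding_energy": -6.4},
]
top = best_pose_per_ligand(example_poses)
for pose in top:
    print(pose["ligand"], pose["pose_id"], pose["binding_energy"])
```

Ranking on a single column is the simplest possible criterion. A real filter may combine several scores, so treat this only as a mental model.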
In [ ]:
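# Dock the ligands assigned to `sim` into the first detected pocket
# through the Complex-level docking interface.
jobs = sim.docking.run(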
    pocket=pockets[0],
    quote=True,
    batch_size=8,
)
jobs
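The batch_size argument above suggests that ligands are grouped before being submitted. The sketch below shows one plausible reading, in which a ligand list is split into consecutive groups of at most batch_size items. This is an assumption about the parameter's meaning, not documented behavior, and the helper `chunk` and the ligand identifiers are hypothetical.

```python
def chunk(items, batch_size):
    """Split a list into consecutive batches of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


# Hypothetical ligand identifiers standing in for a larger ligand set.
ligand_ids = [f"lig-{i}" for i in range(20)]
batches = chunk(ligand_ids, 8)
print([len(batch) for batch in batches])
```

Under this reading, twenty ligands with a batch size of 8 give three groups, the last one smaller than the others.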