Plots

The deeporigin.plots module provides visualization functions for drug discovery data, including interactive scatter plots with molecular structure visualization.

Functions

scatter

scatter(
    *,
    x: ndarray,
    y: ndarray,
    smiles_list: list[str],
    x_label: str = "X",
    y_label: str = "Y",
    title: str = "Scatter Plot"
)

Create and display a Bokeh scatter plot that shows each point's molecule image on hover.

The function automatically detects the environment (notebook vs. script) and displays the plot accordingly: inline in notebooks, or in a browser window for scripts.
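As an illustration of what notebook detection can look like, here is a minimal sketch of a common heuristic. It is not taken from the deeporigin source; the helper name `in_notebook` is hypothetical, and the check on the IPython shell class name is a widely used convention rather than a guaranteed API.

```python
def in_notebook() -> bool:
    """Heuristically report whether the code runs inside a Jupyter kernel.

    Hypothetical helper for illustration only; deeporigin may detect
    the environment differently.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a notebook.
        return False

    shell = get_ipython()  # None when not running under IPython
    # Jupyter kernels use the ZMQInteractiveShell class.
    return shell is not None and shell.__class__.__name__ == "ZMQInteractiveShell"


if __name__ == "__main__":
    print("notebook" if in_notebook() else "script")
```

In a plain Python script this reports "script". A terminal IPython session is also treated as a script, because its shell class differs from the one used by Jupyter kernels.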

Parameters:

Name | Type | Description | Default
--- | --- | --- | ---
x | ndarray | X-coordinates for the scatter plot points. | required
y | ndarray | Y-coordinates for the scatter plot points. | required
smiles_list | list[str] | List of SMILES strings corresponding to each point. Must be the same length as x and y. | required
x_label | str | Label for the x-axis. | 'X'
y_label | str | Label for the y-axis. | 'Y'
title | str | Title for the plot. | 'Scatter Plot'

Raises:

Type | Description
--- | ---
ValueError | If the input arrays have different lengths, or if no valid SMILES strings are found.
ImportError | If RDKit is not available (required for molecule rendering).
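The length requirement can also be checked before calling scatter, which gives an earlier and more specific error message. The sketch below is an assumption-laden illustration: `check_scatter_inputs` is a hypothetical helper, not part of deeporigin, and it only mirrors the length rule stated above.

```python
from typing import Sequence


def check_scatter_inputs(
    x: Sequence[float], y: Sequence[float], smiles_list: Sequence[str]
) -> None:
    """Raise ValueError if x, y, and smiles_list differ in length.

    Hypothetical pre-check for illustration; scatter performs its own validation.
    """
    lengths = {"x": len(x), "y": len(y), "smiles_list": len(smiles_list)}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Inputs must have equal lengths, got {lengths}")


# Equal lengths pass silently.
check_scatter_inputs([1.0, 2.0], [3.0, 4.0], ["CCO", "CC(=O)O"])

# Mismatched lengths raise ValueError with the offending sizes.
try:
    check_scatter_inputs([1.0, 2.0], [3.0], ["CCO", "CC(=O)O"])
except ValueError as err:
    print(f"Invalid input: {err}")
```

`len()` works on NumPy arrays as well as lists, so the same check applies to the ndarray inputs that scatter expects.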

Examples

Basic Scatter Plot with Molecule Images

import numpy as np
from deeporigin.plots import scatter

# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
smiles_list = [
    "CCO",           # ethanol
    "CC(=O)O",       # acetic acid
    "c1ccccc1",      # benzene
    "CCN(CC)CC",     # triethylamine
    "CC(C)O"         # isopropanol
]

# Create and display scatter plot
scatter(
    x=x, 
    y=y, 
    smiles_list=smiles_list, 
    x_label="X Coordinate", 
    y_label="Y Coordinate",
    title="Molecule Analysis Plot"
)

Working with Drug Discovery Data

import pandas as pd
from deeporigin.plots import scatter

# Load data from a CSV file with SMILES and properties
df = pd.read_csv("ligand_data.csv")

# Create and display scatter plot of molecular weight vs logP
scatter(
    x=df["molecular_weight"].values,
    y=df["logp"].values,
    smiles_list=df["smiles"].tolist(),
    x_label="Molecular Weight (Da)",
    y_label="LogP",
    title="Drug Discovery: Molecular Properties Analysis"
)

Features

  • Interactive Hover: Hover over any point to see the molecular structure image
  • SMILES Validation: Automatically filters out invalid SMILES strings
  • High-Quality Images: Generates 200x200 pixel molecular structure images
  • Responsive Design: Follows mouse movement for optimal user experience
  • Error Handling: Gracefully handles invalid SMILES and rendering errors
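To make the SMILES filtering behavior concrete, the following sketch shows one way to drop invalid entries while keeping x, y, and SMILES aligned. It is not the library's implementation: `filter_valid_points` and `rdkit_is_valid` are hypothetical names, and the validity test assumes the usual RDKit convention that `Chem.MolFromSmiles` returns `None` for a string it cannot parse.

```python
from typing import Callable, Sequence


def rdkit_is_valid(smiles: str) -> bool:
    """Return True if RDKit can parse the SMILES string."""
    # Imported lazily so filter_valid_points can be used with another validator.
    from rdkit import Chem

    return Chem.MolFromSmiles(smiles) is not None


def filter_valid_points(
    x: Sequence[float],
    y: Sequence[float],
    smiles_list: Sequence[str],
    is_valid: Callable[[str], bool] = rdkit_is_valid,
) -> tuple[list, list, list]:
    """Keep only the points whose SMILES string passes is_valid.

    Hypothetical helper for illustration; raises ValueError when nothing survives,
    mirroring the behavior documented for scatter.
    """
    kept = [(xi, yi, s) for xi, yi, s in zip(x, y, smiles_list) if is_valid(s)]
    if not kept:
        raise ValueError("No valid SMILES strings found")
    xs, ys, valid_smiles = zip(*kept)
    return list(xs), list(ys), list(valid_smiles)
```

Filtering the three sequences together matters: removing a SMILES string without removing its coordinates would shift every later molecule onto the wrong point.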

Requirements

The plots module requires the following optional dependencies:

  • bokeh: For interactive plotting
  • rdkit: For molecular structure rendering

Install them with:

pip install "deeporigin[plots,tools]"
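To confirm that the optional dependencies are importable after installation, a small check with the standard library can help. This is a sketch: `missing_modules` is a hypothetical helper, and it assumes the packages are imported under the top-level names `bokeh` and `rdkit`.

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["bokeh", "rdkit"])
if missing:
    print(f"Missing optional dependencies: {', '.join(missing)}")
else:
    print("All plotting dependencies are available.")
```

`importlib.util.find_spec` locates a module without importing it, so the check is fast and has no side effects from the packages themselves.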

Notes

  • Invalid SMILES strings are automatically filtered out
  • If all SMILES strings are invalid, a ValueError is raised
  • The function returns a Bokeh figure object that can be further customized
  • Molecular images are generated at 200x200 pixels for optimal display