Plots
The deeporigin.plots module provides visualization functions for drug discovery data, including interactive scatter plots with molecular structure visualization.
Functions
scatter
scatter(
    *,
    x: ndarray,
    y: ndarray,
    smiles_list: list[str],
    x_label: str = "X",
    y_label: str = "Y",
    title: str = "Scatter Plot",
    output_file: Optional[str] = None,
    x_lim_min: Optional[float] = None,
    x_lim_max: Optional[float] = None,
    y_lim_min: Optional[float] = None,
    y_lim_max: Optional[float] = None,
    width: int = 800,
    height: int = 800
)
Create and display a Bokeh scatter plot with molecule images displayed on hover.
The function detects whether it is running in a notebook or a script and displays the plot accordingly: inline in notebooks, or in a browser window for scripts. If output_file is provided, the plot is saved to an HTML file instead of being displayed.
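The detection logic that scatter uses is not described here. As a rough sketch of how notebook detection is commonly done, one can check whether IPython is loaded and which shell class is active. The helper name running_in_notebook is hypothetical and is not part of deeporigin.

```python
import sys


def running_in_notebook() -> bool:
    """Best-effort guess at whether the code runs inside a Jupyter kernel.

    Hypothetical helper for illustration; not part of deeporigin.
    """
    # If IPython was never imported, we cannot be inside a notebook kernel.
    ipython = sys.modules.get("IPython")
    if ipython is None:
        return False
    shell = ipython.get_ipython()
    # Jupyter kernels use ZMQInteractiveShell; terminal IPython uses another class.
    return shell is not None and type(shell).__name__ == "ZMQInteractiveShell"


print(running_in_notebook())
```

A check like this is a heuristic: other front ends may use different shell classes, so the result should be treated as a hint rather than a guarantee.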
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | ndarray | X-coordinates for the scatter plot points. | required |
| y | ndarray | Y-coordinates for the scatter plot points. | required |
| smiles_list | list[str] | List of SMILES strings corresponding to each point. Must be the same length as x and y. | required |
| x_label | str | Label for the x-axis. Defaults to "X". | 'X' |
| y_label | str | Label for the y-axis. Defaults to "Y". | 'Y' |
| title | str | Title for the plot. Defaults to "Scatter Plot". | 'Scatter Plot' |
| output_file | Optional[str] | Optional file path to save the HTML figure. If provided, the plot is saved to this file instead of being displayed. Defaults to None. | None |
| x_lim_min | Optional[float] | Optional minimum value for the x-axis. If provided, sets the lower bound of the x-axis. Defaults to None (auto-scale). | None |
| x_lim_max | Optional[float] | Optional maximum value for the x-axis. If provided, sets the upper bound of the x-axis. Defaults to None (auto-scale). | None |
| y_lim_min | Optional[float] | Optional minimum value for the y-axis. If provided, sets the lower bound of the y-axis. Defaults to None (auto-scale). | None |
| y_lim_max | Optional[float] | Optional maximum value for the y-axis. If provided, sets the upper bound of the y-axis. Defaults to None (auto-scale). | None |
| width | int | Width of the plot in pixels. Defaults to 800. | 800 |
| height | int | Height of the plot in pixels. Defaults to 800. | 800 |
Raises:
| Type | Description |
|---|---|
| ValueError | If the input arrays have different lengths or if no valid SMILES strings are found. |
| ImportError | If RDKit is not available (required for molecule rendering). |
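Since mismatched input lengths raise a ValueError, it can be convenient to check lengths before calling scatter so the error message points at your own data. The sketch below uses a hypothetical helper, check_scatter_inputs, which is not part of deeporigin and only mirrors the documented length requirement.

```python
from typing import Sequence


def check_scatter_inputs(
    x: Sequence[float], y: Sequence[float], smiles_list: Sequence[str]
) -> None:
    """Raise ValueError if the three inputs differ in length.

    Hypothetical helper for illustration; not part of deeporigin.
    """
    lengths = {"x": len(x), "y": len(y), "smiles_list": len(smiles_list)}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Input lengths differ: {lengths}")


# Three coordinates but only two SMILES strings: the check reports the mismatch.
try:
    check_scatter_inputs([1, 2, 3], [2, 4, 6], ["CCO", "CC(=O)O"])
except ValueError as err:
    print(f"Invalid input: {err}")
```

The same helper accepts one-dimensional NumPy arrays, because len() works on them as well.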
Examples
Basic Scatter Plot with Molecule Images
import numpy as np
from deeporigin.plots import scatter

# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
smiles_list = [
    "CCO",  # ethanol
    "CC(=O)O",  # acetic acid
    "c1ccccc1",  # benzene
    "CCN(CC)CC",  # triethylamine
    "CC(C)O",  # isopropanol
]

# Create and display scatter plot
scatter(
    x=x,
    y=y,
    smiles_list=smiles_list,
    x_label="X Coordinate",
    y_label="Y Coordinate",
    title="Molecule Analysis Plot",
)
Working with Drug Discovery Data
import pandas as pd
from deeporigin.plots import scatter

# Load data from a CSV file with SMILES and properties
df = pd.read_csv("ligand_data.csv")

# Create and display scatter plot of molecular weight vs logP
scatter(
    x=df["molecular_weight"].values,
    y=df["logp"].values,
    smiles_list=df["smiles"].tolist(),
    x_label="Molecular Weight (Da)",
    y_label="LogP",
    title="Drug Discovery: Molecular Properties Analysis",
)
Saving Plot to HTML File
You can save the scatter plot to an HTML file instead of displaying it by providing the output_file parameter:
import numpy as np
from deeporigin.plots import scatter

# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
smiles_list = [
    "CCO",  # ethanol
    "CC(=O)O",  # acetic acid
    "c1ccccc1",  # benzene
    "CCN(CC)CC",  # triethylamine
    "CC(C)O",  # isopropanol
]

# Save scatter plot to HTML file
scatter(
    x=x,
    y=y,
    smiles_list=smiles_list,
    x_label="X Coordinate",
    y_label="Y Coordinate",
    title="Molecule Analysis Plot",
    output_file="scatter_plot.html",
)
When output_file is provided, the plot is saved as an interactive HTML file that can be opened in any web browser. The file includes all interactive features such as hover tooltips with molecule images.
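To view a saved file from a script, Python's standard webbrowser module can open it in the default browser. The following is a minimal sketch; the helper names saved_plot_uri and open_saved_plot are hypothetical, and the file name simply matches the example above.

```python
import webbrowser
from pathlib import Path


def saved_plot_uri(path: str) -> str:
    """Return a file:// URI for a saved HTML plot (hypothetical helper)."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).resolve().as_uri()


def open_saved_plot(path: str) -> bool:
    """Open the saved plot in the default browser; True if a browser was launched."""
    return webbrowser.open(saved_plot_uri(path))


print(saved_plot_uri("scatter_plot.html"))
```

Calling open_saved_plot("scatter_plot.html") after scatter has written the file should bring up the interactive plot, provided a browser is available on the machine.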
Setting Axis Limits
You can control the axis limits using individual x_lim_min, x_lim_max, y_lim_min, and y_lim_max parameters. This gives you fine-grained control - you can set just the minimum, just the maximum, or both:
import numpy as np
from deeporigin.plots import scatter

# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
smiles_list = [
    "CCO",  # ethanol
    "CC(=O)O",  # acetic acid
    "c1ccccc1",  # benzene
    "CCN(CC)CC",  # triethylamine
    "CC(C)O",  # isopropanol
]

# Create scatter plot with custom axis limits
scatter(
    x=x,
    y=y,
    smiles_list=smiles_list,
    x_label="X Coordinate",
    y_label="Y Coordinate",
    title="Molecule Analysis Plot",
    x_lim_min=0,  # Set x-axis minimum to 0
    x_lim_max=6,  # Set x-axis maximum to 6
    y_lim_max=12,  # Set only y-axis maximum to 12 (min auto-scales)
)
Each limit parameter is optional and independent. If not provided, that limit will auto-scale based on the data range. This allows you to, for example, set only the maximum value for an axis while letting the minimum auto-scale.
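When explicit limits are wanted but should still track the data, they can be derived from the data range with a little padding. The helper below, padded_limits, is a hypothetical sketch and not part of deeporigin; the padding fraction is an arbitrary choice.

```python
from typing import Sequence, Tuple


def padded_limits(
    values: Sequence[float], pad_fraction: float = 0.05
) -> Tuple[float, float]:
    """Return (lower, upper) bounds that extend the data range by a fraction."""
    lowest, highest = min(values), max(values)
    pad = (highest - lowest) * pad_fraction
    if pad == 0:
        # All values are identical; fall back to a fixed margin.
        pad = 1.0
    return lowest - pad, highest + pad


x_min, x_max = padded_limits([1, 2, 3, 4, 5], pad_fraction=0.25)
print(x_min, x_max)  # prints "0.0 6.0"
```

The resulting values can then be passed as x_lim_min=x_min and x_lim_max=x_max, and the same approach applies to the y-axis parameters.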
Features
- Interactive Hover: Hover over any point to see the molecular structure image
- SMILES Validation: Automatically filters out invalid SMILES strings
- High-Quality Images: Generates 200x200 pixel molecular structure images
- Responsive Design: The hover tooltip follows mouse movement
- Error Handling: Gracefully handles invalid SMILES and rendering errors
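The SMILES filtering described above can be approximated outside the library. This sketch is not deeporigin's implementation: filter_valid_points and rdkit_is_valid are hypothetical names, and the only RDKit call used is Chem.MolFromSmiles, which returns None for a string it cannot parse. The validity check is passed in as a parameter so the filtering logic can be tried without RDKit installed.

```python
from typing import Callable, List, Sequence, Tuple


def rdkit_is_valid(smiles: str) -> bool:
    """Return True if RDKit can parse the SMILES string."""
    from rdkit import Chem  # imported lazily; requires the rdkit package

    return Chem.MolFromSmiles(smiles) is not None


def filter_valid_points(
    x: Sequence[float],
    y: Sequence[float],
    smiles_list: Sequence[str],
    is_valid: Callable[[str], bool] = rdkit_is_valid,
) -> Tuple[List[float], List[float], List[str]]:
    """Keep only the points whose SMILES string passes the validity check."""
    kept = [(xi, yi, s) for xi, yi, s in zip(x, y, smiles_list) if is_valid(s)]
    if not kept:
        # Mirrors the documented behavior when nothing valid remains.
        raise ValueError("No valid SMILES strings found")
    xs, ys, smiles = zip(*kept)
    return list(xs), list(ys), list(smiles)
```

Filtering the coordinates together with the SMILES strings keeps the three sequences aligned, so each remaining point still maps to its own molecule.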
Requirements
The plots module requires the following optional dependencies:
- bokeh: For interactive plotting
- rdkit: For molecular structure rendering
Install them with:
pip install "deeporigin[plots,tools]"
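To confirm that the optional dependencies are importable in the current environment, the standard library can look them up without importing them. The helper name missing_modules below is hypothetical; only the module names bokeh and rdkit come from this page.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(["bokeh", "rdkit"])
if missing:
    print("Missing optional dependencies: " + ", ".join(missing))
else:
    print("bokeh and rdkit are importable")
```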
Notes
- Invalid SMILES strings are automatically filtered out
- If all SMILES strings are invalid, a ValueError is raised
- The function returns a Bokeh figure object that can be further customized
- Molecular images are generated at 200x200 pixels for optimal display