API reference: High-level API
How to use this reference
This page describes each class and function in this module and is meant as a detailed reference. If you're looking for an introduction, we recommend reviewing the How to section.
The deeporigin.data_hub.api module contains high-level functions for interacting with your Deep Origin data hub.
add_database_column
add_database_column(
*,
database_id: str,
type: DataType,
name: str,
cardinality: Cardinality = "one",
required: bool = False,
client=None,
_stash: bool = False
)
Add a column to a database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
database_id | str | ID (or human ID) of a database on Deep Origin. | required |
type | DataType | Type of the column. Should be one of DataType. | required |
name | str | Name of the column. | required |
cardinality | Cardinality | Cardinality of the column. Specifies whether cells in this column can contain one or many items. Should be one of "one" or "many". | 'one' |
required | bool | Whether the column is required. If True, cells in this column cannot be empty. | False |
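The cardinality constraint above can be checked locally before a request is made. The following is a minimal sketch, not part of the library: the helper column_kwargs and the values "my-database" and "text" are hypothetical placeholders, and the valid DataType values should be taken from the DataType definition itself.

```python
VALID_CARDINALITIES = ("one", "many")


def column_kwargs(database_id, name, column_type, cardinality="one", required=False):
    """Build keyword arguments for add_database_column after a local sanity check."""
    if cardinality not in VALID_CARDINALITIES:
        raise ValueError(
            f"cardinality must be one of {VALID_CARDINALITIES}, got {cardinality!r}"
        )
    return {
        "database_id": database_id,
        "type": column_type,
        "name": name,
        "cardinality": cardinality,
        "required": required,
    }


# "my-database" and "text" are placeholder values, not taken from this reference.
kwargs = column_kwargs("my-database", "Notes", "text", cardinality="many")

# With the deeporigin package installed and configured, the call would be:
# from deeporigin.data_hub import api
# api.add_database_column(**kwargs)
```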
add_database_rows
add_database_rows(
*,
database_id: str,
data: dict,
client=None,
_stash: bool = False
) -> list[str]
Add new data to a database.
Use this function to add new rows, or fragments of rows, to a Deep Origin database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
database_id | str | Human ID or system ID of the database. | required |
data | dict | A dictionary where each key is a column name and each value is a list of values. All lists should have the same length. Key names should match column names in the database. | required |
Returns:
Type | Description |
---|---|
list[str] | A list of row IDs |
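Because every list in data must have the same length, it can help to check the dictionary before sending it. The sketch below is illustrative only: check_column_lengths is a hypothetical helper, and the column names and the database ID "my-database" are placeholders.

```python
def check_column_lengths(data: dict) -> int:
    """Return the number of rows described by `data`, or raise if columns disagree."""
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns have different lengths: {lengths}")
    # An empty dict describes zero rows.
    return next(iter(lengths.values()), 0)


# Keys are column names, values are equal-length lists (one entry per new row).
data = {
    "Sample": ["A1", "A2", "A3"],
    "Concentration": [0.5, 1.0, 2.0],
}
n_rows = check_column_lengths(data)

# With the deeporigin package installed and configured, the call would be:
# from deeporigin.data_hub import api
# row_ids = api.add_database_rows(database_id="my-database", data=data)
```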
assign_files_to_cell
assign_files_to_cell(
*,
file_ids: list[str],
database_id: str,
column_id: str,
row_id: Optional[str] = None,
client=None,
_stash: bool = False
)
Assign existing file(s) to a cell
Assign files to a cell in a database table, where the cell is identified by the database ID, row ID, and column ID.
If row_id is None, a new row will be created.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_ids | list[str] | IDs of the files to assign | required |
database_id | str | ID of the database to assign to | required |
column_id | str | ID of the column | required |
row_id | Optional[str] | ID of the row | None |
convert_id_format
convert_id_format(
*,
hids: Optional[Union[list[str], set[str]]] = None,
ids: Optional[Union[list[str], set[str]]] = None,
client=None,
_stash: bool = False
) -> list[dict]
Convert a list of human IDs to system IDs, or vice versa.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hids | Optional[Union[list[str], set[str]]] | List of human IDs | None |
ids | Optional[Union[list[str], set[str]]] | List of IDs (system IDs) | None |
create_database
create_database(
*,
name: str,
client=None,
_stash: bool = False,
parent_id: Optional[str] = None,
hid: Optional[str] = None,
hid_prefix: Optional[str] = None
)
Create a new database in the data hub
A database contains rows of data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the database to create | required |
hid | Optional[str] | Human ID. If not specified, the name will be used | None |
parent_id | Optional[str] | ID of the parent. If None, the database is created at the root level | None |
hid_prefix | Optional[str] | Human ID prefix to be used for each row. If not specified, the name will be used | None |
create_workspace
create_workspace(
*,
name: str,
hid: Optional[str] = None,
parent_id: Optional[str] = None,
client=None,
_stash: bool = False
)
Create a new folder (workspace) in the data hub
A folder can contain other rows and databases.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the folder to create | required |
hid | Optional[str] | Human ID. If not specified, the name will be used | None |
parent_id | Optional[str] | ID of the parent. If None, the folder is created at the root level | None |
download
download(
source: str,
destination: str,
*,
include_files: bool = False,
client=None,
_stash: bool = False
) -> None
Download resources from Deep Origin and save them to a local destination.
Download databases, objects and other entities from your Deep Origin data hub and save them to local disk.
Work in progress
Not all features of this function have been implemented yet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | str | ID (or human ID) of a resource on Deep Origin. | required |
destination | str | Path to local directory to save resources. | required |
include_files | bool | if | False |
download_database
download_database(
source: Any,
destination: str = getcwd(),
*,
include_files: bool = False,
client=None,
_stash: bool = False
) -> None
Download a database and save it to a CSV file on the local disk.
Download a database from your Deep Origin data hub and save to local disk as a CSV file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Any | ID (or human ID) of a resource on Deep Origin. | required |
destination | str | Path to local directory to save resources. | getcwd() |
include_files | bool | if | False |
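The destination default is rendered as getcwd(). In Python, a default written this way is evaluated once, when the function is defined, not on each call. Assuming the library's source matches the rendered signature (this has not been verified here), changing the working directory after import would not change the default. The sketch below demonstrates the general Python behavior with a hypothetical stand-in function, save_report, rather than the library itself.

```python
import os
import tempfile


def save_report(destination: str = os.getcwd()) -> str:
    """Stand-in for a function whose default destination is written as getcwd()."""
    return destination


dir_at_definition = os.getcwd()

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)
    try:
        # The default still points at the directory that was current when
        # save_report was defined, not at tmp.
        default_destination = save_report()
        # Passing the directory explicitly avoids the surprise.
        explicit_destination = save_report(destination=os.getcwd())
    finally:
        # Restore the working directory before the temporary directory is removed.
        os.chdir(dir_at_definition)
```

Passing destination explicitly sidesteps the question entirely.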
download_file
download_file(
file_id: str,
*,
destination: str | Path = getcwd(),
client=None,
_stash: bool = False
) -> None
Download a file to a destination folder (workspace).
Download a file synchronously from Deep Origin to a folder on the local file system.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_id | str | ID of the file on Deep Origin | required |
destination | str | Path | Path to the destination folder | getcwd() |
download_files
download_files(
files: Optional[list | dict] = None,
*,
save_to_dir: Path | str = Path("."),
use_file_names: bool = True,
client=None,
_stash: bool = False
) -> None
Download multiple files in parallel to local disk.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
files | Optional[list | dict] | list of files to download. These can be of type | None |
save_to_dir | Path | str | directory to save files to on local computer | Path('.') |
get_cell_data
get_cell_data(
*,
row_id: str,
column_name: str,
client=None,
_stash: bool = False
) -> Any
Extract data from a cell in a database, referenced by row_id and column_name.
Returns the value in a single cell in a database.
Caution
This function internally calls get_row_data, so it is not efficient to write a loop to get all values of cells from a row. It will be faster to call get_row_data directly.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
row_id | str | ID (or human ID) of a row. | required |
column_name | str | Name of column. | required |
Returns:
Type | Description |
---|---|
Any | Value of that cell. |
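The caution above suggests reading a row once instead of calling get_cell_data per column. A minimal sketch of that pattern follows. The helper read_cells is hypothetical, and a stub stands in for the real function so the example runs without the library; in real use, api.get_row_data would be passed instead.

```python
from typing import Any, Callable, Iterable


def read_cells(
    row_id: str,
    column_names: Iterable[str],
    get_row_data: Callable[[str], dict],
) -> dict[str, Any]:
    """Fetch the row once, then pick the requested columns from the result."""
    row = get_row_data(row_id)  # a single lookup for the whole row
    return {name: row.get(name) for name in column_names}


# Stub that records how often it is called; the row contents are made up.
calls = []


def fake_get_row_data(row_id: str) -> dict:
    calls.append(row_id)
    return {"Sample": "A1", "Concentration": 0.5, "Notes": None}


cells = read_cells("row-1", ["Sample", "Concentration"], fake_get_row_data)

# With the deeporigin package installed and configured:
# from deeporigin.data_hub import api
# cells = read_cells("row-1", ["Sample", "Concentration"], api.get_row_data)
```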
get_columns
get_columns(
row_id: str, *, client=None, _stash: bool = False
) -> list[dict]
Get information about the columns of a row or database.
If row_id is a database, then column metadata and names are returned. If row_id is a row, then a dictionary of human IDs and values is returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
row_id | str | ID (or human ID) of a row or database on Deep Origin. | required |
get_dataframe
get_dataframe(
database_id: str,
*,
use_file_names: bool = True,
reference_format: IDFormat = "human-id",
return_type: DatabaseReturnType = "dataframe",
filter: Optional[dict] = None,
client=None,
_stash: bool = False
)
Generate a pandas.DataFrame or dictionary for a database.
Download a database from your Deep Origin data hub and return it as a data frame or dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
database_id | str | ID (or human ID) of a database on Deep Origin. | required |
use_file_names | bool | If | True |
reference_format | IDFormat | Refer to rows on Deep Origin using human IDs or system IDs. | 'human-id' |
return_type | DatabaseReturnType | Whether to return a | 'dataframe' |
get_notebook
get_notebook(
row_id: str, *, client=None, _stash: bool = False
) -> list
Get the notebook of a row, if it exists
Parameters:
Name | Type | Description | Default |
---|---|---|---|
row_id | str | ID (or human ID) of a row on Deep Origin. | required |
Returns:
Type | Description |
---|---|
list | The notebook of the row, returned as a list of blocks |
get_row_data
get_row_data(
row_id: str,
*,
use_column_keys: bool = False,
client=None,
_stash: bool = False
) -> dict
Get the data in a row.
Read data from a row, and return it as a dictionary, where the keys are column names (or keys), and the values are the values of those cells.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
row_id | str | ID (or human ID) of a row or database on Deep Origin. | required |
use_column_keys | bool | if | False |
Raises:
Type | Description |
---|---|
DeepOriginException | If row_id is not a row |
get_tree
get_tree(
*,
include_rows: bool = True,
client=None,
_stash: bool = False
) -> list
Construct a tree of all folders (workspaces), databases and rows.
Returns a tree that contains all folders, databases and (optionally) rows. The tree is returned as a dictionary, and children of each object are contained in a field called children.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
include_rows | bool | If | True |
Returns:
Type | Description |
---|---|
list | A dictionary describing the tree structure of all folders and databases. |
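Since each object keeps its children in a field called children, the tree can be traversed recursively. The sketch below assumes the top level is a list of nodes (following the annotated return type). The sample data and the name field are invented for illustration; only children comes from the description above.

```python
def iter_tree(nodes, depth=0):
    """Yield (depth, node) pairs in depth-first order."""
    for node in nodes:
        yield depth, node
        # Treat a missing or empty `children` field as a leaf.
        yield from iter_tree(node.get("children") or [], depth + 1)


# Hand-written sample; real nodes will carry other fields.
sample_tree = [
    {
        "name": "Project",
        "children": [
            {"name": "Assays", "children": [{"name": "row-1", "children": []}]},
            {"name": "Empty folder"},
        ],
    },
]

outline = [(depth, node["name"]) for depth, node in iter_tree(sample_tree)]
# outline == [(0, "Project"), (1, "Assays"), (2, "row-1"), (1, "Empty folder")]

# With the deeporigin package installed and configured:
# from deeporigin.data_hub import api
# for depth, node in iter_tree(api.get_tree(include_rows=False)):
#     ...
```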
list_files
list_files(
*,
assigned_row_ids: Optional[list[str]] = None,
is_unassigned: Optional[bool] = None,
file_ids: Optional[list[str]] = None,
client=None,
_stash: bool = False
) -> list
List files, optionally filtered by assigned rows or assignment status.
Returns a list of files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
assigned_row_ids | Optional[list[str]] | List of IDs of rows that files are assigned to | None |
is_unassigned | Optional[bool] | If | None |
Returns:
Type | Description |
---|---|
list | A list of files, where each entry is an object that corresponds to a file on Deep Origin |
list_rows
list_rows(
*,
parent_id: Optional[str] = None,
row_type: ObjectType = None,
parent_is_root: Optional[bool] = None,
client=None,
_stash: bool = False
) -> list
List rows in a database or folder (workspace).
Returns a list of rows from folders and databases, based on the parent, row type, or whether the parent is the root.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
parent_id | Optional[str] | ID (or human ID) of the parent. | None |
row_type | ObjectType | One of | None |
parent_is_root | Optional[bool] | If | None |
Returns:
Type | Description |
---|---|
list | A list of objects, where each entry corresponds to a row. |
make_database_rows
make_database_rows(
database_id: str,
n_rows: int = 1,
*,
client=None,
_stash: bool = False
) -> dict
Make one or more new rows in a database table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
database_id | str | ID or human ID of the database | required |
n_rows | int | Number of rows to create. Must be an integer greater than 0 | 1 |
Returns:
Type | Description |
---|---|
dict | A dictionary that conforms to an EnsureRowsResponse |
row_to_dict
row_to_dict(
row: dict,
*,
file_ids: Optional[list] = None,
reference_ids: Optional[list] = None
) -> dict
Convert a database row (as returned by api.list_database_rows) to a dictionary where keys are column IDs and values are the values in the row.
This function mutates inputs
This function mutates file_ids and reference_ids
Parameters:
Name | Type | Description | Default |
---|---|---|---|
row | dict | database row (as returned by api.list_database_rows) | required |
file_ids | Optional[list] | list of file IDs, will be mutated in-place | None |
reference_ids | Optional[list] | list of reference IDs, will be mutated in-place | None |
Returns:
Type | Description |
---|---|
dict | dict |
set_cell_data
set_cell_data(
value: Any,
*,
database_id: str,
row_id: str,
column_id: str,
client=None,
_stash: bool = False
) -> Any
Set data in a cell to some value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
value | Any | Value to set in the cell | required |
database_id | str | ID (or human ID) of a database | required |
row_id | str | ID (or human ID) of a row | required |
column_id | str | ID (or human ID) of a column | required |
set_data_in_cells
set_data_in_cells(
*,
values: list,
row_ids: list[str],
column_id: str,
database_id: str,
columns: Optional[list[dict]] = None,
client=None,
_stash: bool = False
)
Set data in multiple cells to some value.
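The reference does not describe how values and row_ids relate. A natural reading, which is an assumption here and not confirmed by this page, is that values[i] is written to the row row_ids[i] in the given column. Under that assumption, a local length check before the call can catch mismatches early. The helper pair_values_with_rows and all IDs below are hypothetical.

```python
def pair_values_with_rows(values: list, row_ids: list[str]) -> list[tuple]:
    """Pair each row ID with its value, assuming positions correspond one-to-one."""
    if len(values) != len(row_ids):
        raise ValueError(
            f"got {len(values)} values for {len(row_ids)} rows; expected equal counts"
        )
    return list(zip(row_ids, values))


# Placeholder row IDs and values.
row_ids = ["row-1", "row-2", "row-3"]
values = [0.5, 1.0, 2.0]
pairs = pair_values_with_rows(values, row_ids)

# With the deeporigin package installed and configured, the call would be:
# from deeporigin.data_hub import api
# api.set_data_in_cells(
#     values=values,
#     row_ids=row_ids,
#     column_id="my-column",
#     database_id="my-database",
# )
```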
upload_file
upload_file(
file_path: str,
*,
client=None,
_stash: bool = False,
compute_hash: bool = True
) -> None
Upload a file to Deep Origin.
This uploads a file to your Deep Origin data hub. To assign the file to a cell, run assign_files_to_cell next.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | str | Path to the file to upload | required |
upload_file_to_new_database_row
upload_file_to_new_database_row(
*,
database_id: str,
file_path: str,
column_id: str,
client=None,
_stash: bool = False
)
Upload a file to a new row in a database.
Upload a file to a new row in a database. This utility function wraps two other functions:
Parameters:
Name | Type | Description | Default |
---|---|---|---|
database_id | str | ID (or human ID) of a database. | required |
file_path | str | Path to the file to upload. | required |
column_id | str | ID (or human ID) of a column in the database. | required |