Utility functions¶
How to use this reference
This page contains detailed reference information about each class and function in this module. If you're looking for an introduction, we recommend reviewing the How to section.
This module contains utility functions used during tool execution. In general, you will not need to call most of these functions directly.
get_job_dataframe¶
get_job_dataframe(update: bool = False) -> Any
Returns a dataframe of all jobs and their statuses, reading from the local cache.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
update | bool | Whether to check for updates on non-terminal jobs. Defaults to False. | False |
Note that this function is deliberately not annotated with a return type because pandas is imported inside the function body.
Returns:

Type | Description |
---|---|
Any | pd.DataFrame: A dataframe containing job information. |
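For example, a minimal sketch of inspecting the cached job table; the import path below is an assumption, so adjust it to wherever this module lives in your installation:

```python
# Assumed import path, shown for illustration only
from deeporigin.tools.utils import get_job_dataframe

# Read job statuses from the local cache, checking for updates
# on any jobs that are not yet in a terminal state
df = get_job_dataframe(update=True)
print(df.head())
```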
make_payload¶
make_payload(
*,
inputs: dict,
outputs: dict,
tool_id: str,
cluster_id: Optional[str] = None,
cols: Optional[list] = None
) -> dict
Helper function to create a payload for tool execution. It is used by all wrapper functions in the run module to construct the payload.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
inputs | dict | Inputs to the tool. | required |
outputs | dict | Outputs of the tool. | required |
tool_id | str | Tool ID of the tool to run. | required |
cluster_id | Optional[str] | Cluster ID. Defaults to None. If not provided, the default cluster (us-west-2) is used. | None |
cols | Optional[list] | List of columns. Defaults to None. If provided, column names (in inputs or outputs) are converted to column IDs. | None |
Returns:

Name | Type | Description |
---|---|---|
dict | dict | Correctly formatted payload, ready to be passed to execute_tool. |
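As a sketch, constructing a payload might look like the following; the import path, tool ID, and input/output mappings are placeholders, not values defined by this module:

```python
# Assumed import path and placeholder values, for illustration only
from deeporigin.tools.utils import make_payload

payload = make_payload(
    inputs={"protein": "protein-file-id"},      # placeholder input mapping
    outputs={"results": "results-column-name"}, # placeholder output mapping
    tool_id="example-tool-id",                  # placeholder tool ID
    cols=["results"],                           # convert column names to column IDs
)
# payload is a dict, ready to be passed to the tool execution functions
```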
query_run_status¶
query_run_status(job_id: str) -> str
Determine the status of a run, identified by its job ID.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
job_id | str | Job ID | required |
Returns:

Type | Description |
---|---|
str | One of "Created", "Queued", "Running", "Succeeded", or "Failed" |
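A minimal sketch of checking a run's status; the import path and job ID below are placeholders:

```python
# Assumed import path; "job-1234" is a placeholder job ID
from deeporigin.tools.utils import query_run_status

status = query_run_status("job-1234")
if status in ("Succeeded", "Failed"):
    print(f"Job reached a terminal state: {status}")
else:
    print(f"Job is still in progress: {status}")
```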
run_tool¶
run_tool(data: dict)
Run any tool using the provided data transfer object (DTO).
Parameters:

Name | Type | Description | Default |
---|---|---|---|
data | dict | Data transfer object. This is typically generated by the make_payload function. | required |
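A sketch of running a tool with a payload built by make_payload; the import path and all IDs are placeholders:

```python
# Assumed import path and placeholder values, for illustration only
from deeporigin.tools.utils import make_payload, run_tool

payload = make_payload(
    inputs={"protein": "protein-file-id"},      # placeholder input mapping
    outputs={"results": "results-column-name"}, # placeholder output mapping
    tool_id="example-tool-id",                  # placeholder tool ID
)

# Start the tool run using the payload as the data transfer object
run_tool(data=payload)
```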
wait_for_job¶
wait_for_job(
job_id: str, *, poll_interval: int = 4
) -> None
Repeatedly poll Deep Origin for the job status until the status is "Succeeded" or "Failed" (a terminal state).
This function is useful for blocking execution of your code until a specific task is complete.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
job_id | str | Job ID. This is typically printed to screen and returned when a job is initialized. | required |
poll_interval | int | Number of seconds to wait between polling. Defaults to 4. | 4 |
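A minimal sketch of blocking until a job finishes; the import path and job ID are placeholders:

```python
# Assumed import path; "job-1234" is a placeholder job ID
from deeporigin.tools.utils import wait_for_job

# Block until the job reaches "Succeeded" or "Failed",
# polling Deep Origin every 10 seconds
wait_for_job("job-1234", poll_interval=10)
```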
wait_for_jobs¶
wait_for_jobs(
refresh_time: int = 3, hide_succeeded: bool = True
) -> Any
Wait for all jobs started via this client to complete.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
refresh_time | int | Number of seconds to wait between polling. Defaults to 3. | 3 |
hide_succeeded | bool | Whether to hide jobs that have already completed. Defaults to True. | True |
Note that this function is deliberately not annotated with a return type to avoid importing pandas outside this function.
Returns:

Type | Description |
---|---|
Any | pd.DataFrame: A dataframe of all jobs. |
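A sketch of waiting on every job started by this client; the import path is an assumption:

```python
# Assumed import path, shown for illustration only
from deeporigin.tools.utils import wait_for_jobs

# Block until all jobs started via this client reach a terminal state,
# refreshing every 5 seconds and keeping succeeded jobs visible
df = wait_for_jobs(refresh_time=5, hide_succeeded=False)
print(df)
```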