Renku Python API
The following sections describe the Renku Python API. If you work with the R programming language, you can also use this API through the reticulate package. For more information, visit our dedicated tutorial.
Activity
Renku API Activity.
Activity represents executed workflows in a Renku project. You can get a list of all activities in a project by calling its list method:
from renku.api import Activity
activities = Activity.list()
The Activity class provides a static filter method that returns a subset of activities. It can filter activities based on their inputs, outputs, parameter names, and parameter values. You can pass a literal value, a list of values, or a predicate function for each of these fields:
from numbers import Number
from renku.api import Activity
# Return activities that use ``path/to/an/input``
Activity.filter(inputs="path/to/an/input")
# Return activities that use ``input-1`` or ``input-2`` AND generate
# output files whose names start with ``data-``
Activity.filter(inputs=["input-1", "input-2"], outputs=lambda path: path.startswith("data-"))
# Return activities that use values between ``0.5`` and ``1.5`` for the
# parameter ``lr``
Activity.filter(parameters="lr", values=lambda value: 0.5 <= value <= 1.5 if isinstance(value, Number) else False)
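The three kinds of filter argument described above (a literal, a list of values, or a predicate) can be illustrated with a small stand-alone sketch. This is not Renku's implementation: matches and filter_records are hypothetical helpers, and plain dictionaries stand in for activities. The sketch assumes that a field matches when any of its values satisfies the criterion and that different fields are combined with AND, as the comments in the example above suggest.

```python
from typing import Any, Callable, Iterable, Union

Criterion = Union[Any, Iterable[Any], Callable[[Any], bool]]


def matches(criterion: Criterion, values: Iterable[Any]) -> bool:
    """Return True if any value satisfies the criterion (hypothetical helper)."""
    if callable(criterion):
        # Predicate: call it on every value.
        return any(criterion(v) for v in values)
    if isinstance(criterion, (list, tuple, set)):
        # List of accepted values: any overlap counts as a match.
        return any(v in criterion for v in values)
    # Literal: require an exact match with at least one value.
    return any(v == criterion for v in values)


def filter_records(records, **criteria):
    """Keep the records for which every given field matches (fields are ANDed)."""
    return [
        record
        for record in records
        if all(matches(c, record.get(field, [])) for field, c in criteria.items())
    ]


# Plain dictionaries standing in for activities.
records = [
    {"name": "a", "inputs": ["input-1"], "outputs": ["data-1.csv"]},
    {"name": "b", "inputs": ["input-2"], "outputs": ["plot.png"]},
    {"name": "c", "inputs": ["input-3"], "outputs": ["data-3.csv"]},
]

selected = filter_records(
    records,
    inputs=["input-1", "input-2"],
    outputs=lambda path: path.startswith("data-"),
)
print([record["name"] for record in selected])  # ['a']
```

Only record a is kept: b uses an accepted input but produces no output starting with data-, and c produces such an output but uses neither accepted input.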
Dataset
Renku API Dataset.
The Dataset class allows listing datasets and files inside a Renku project and accessing their metadata.
To get a list of available datasets in a Renku project, use the list method:
from renku.api import Dataset
datasets = Dataset.list()
You can then access the metadata of a dataset, such as name, title, keywords, etc. To get the list of files inside a dataset, use the files property:
for dataset in datasets:
    for dataset_file in dataset.files:
        print(dataset_file.path)
Inputs, Outputs, and Parameters
Renku API Workflow Models.
The Input and Output classes can be used to define the inputs and outputs of a script within the script itself. Paths defined with these classes are added to the explicit inputs and outputs in the workflow’s metadata. For example, the following marks data/data.csv as an input named my-input to the script:
from renku.api import Input
with open(Input("my-input", "data/data.csv")) as input_data:
    for line in input_data:
        print(line)
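The example above passes an Input object directly to open. One standard way for an object to be accepted wherever a path is expected is Python's os.PathLike protocol. The sketch below shows that mechanism with a hypothetical TrackedPath class; it is an assumption about how such an object could behave, not Renku's actual Input class, and it records no workflow metadata.

```python
import os
import tempfile


class TrackedPath:
    """Hypothetical stand-in: a named path usable wherever a path is expected."""

    def __init__(self, name: str, path: str):
        self.name = name
        self.path = path

    def __fspath__(self) -> str:
        # os.PathLike protocol: open() and os.fspath() call this method.
        return self.path


with tempfile.TemporaryDirectory() as tmp:
    # Create a small file so the example is self-contained.
    csv_path = os.path.join(tmp, "data.csv")
    with open(csv_path, "w") as handle:
        handle.write("a,b\n1,2\n")

    tracked = TrackedPath("my-input", csv_path)
    # The object is passed to open() directly, as in the example above.
    with open(tracked) as input_data:
        lines = [line.strip() for line in input_data]

print(lines)  # ['a,b', '1,2']
```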
Users can track parameters’ values in a workflow by defining them using the Parameter function:
from renku.api import Parameter
nc = Parameter(name="n_components", value=10)
print(nc.value) # 10
Once a Parameter is tracked like this, its value can be overridden in commands like renku workflow execute with the --set option.
Plan, CompositePlan
Renku API Plan.
The Plan and CompositePlan classes represent Renku workflow plans executed in a Project. Each of these classes has a static list method that returns a list of all active plans/composite-plans in a project:
from renku.api import CompositePlan, Plan
plans = Plan.list()
composite_plans = CompositePlan.list()
Project
Renku API Project.
The Project class acts as a context for other Renku entities like Dataset or Inputs/Outputs. It provides access to the internals of a Renku project for such entities.
Normally, you do not need to create an instance of the Project class directly unless you want to access Project metadata (e.g. path) or get its status. To separate the parts of your script that use Renku entities, you can create a Project context manager and interact with Renku inside it:
from renku.api import Project, Input
with Project():
    input_1 = Input("input_1", "path_1")
You can use Project’s status method to get info about outdated outputs and activities, and modified or deleted inputs:
from renku.api import Project
outdated_generations, outdated_activities, modified_inputs, deleted_inputs = Project().status()
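A script may want to reduce the four values returned by status to a single up-to-date check. The sketch below shows one way to do that with a hypothetical is_up_to_date helper. It assumes only that each of the four values is a collection that is empty when there is nothing to report; the example values are stand-ins, not real Renku output.

```python
from typing import Collection, Tuple

StatusTuple = Tuple[Collection, Collection, Collection, Collection]


def is_up_to_date(status: StatusTuple) -> bool:
    """Return True if none of the four status collections has any entries."""
    outdated_generations, outdated_activities, modified_inputs, deleted_inputs = status
    return not (
        outdated_generations or outdated_activities or modified_inputs or deleted_inputs
    )


# Stand-in values (assumed shapes): one outdated output and one modified input.
example_status = (
    {"data/output.csv": {"data/input.csv"}},
    {},
    {"data/input.csv"},
    set(),
)
clean_status = ({}, {}, set(), set())

print(is_up_to_date(example_status))  # False
print(is_up_to_date(clean_status))  # True
```

With a real project, the tuple returned by Project().status() would be passed in place of example_status, provided the assumption about empty collections holds.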
RDF Graph
Renku RDF Graph API.
The RDFGraph class allows for the quick creation of a searchable graph object based on the project’s metadata.
To create the graph and query it:
from rdflib import URIRef
from renku.ui.api import RDFGraph
g = RDFGraph()
# get a list of contributors to the project
list(g.subjects(object=URIRef("http://schema.org/Person")))
For more information on querying the graph, see the RDFLib documentation.