Renku Python API

Project

Renku API Project.

The Project class acts as a context for other Renku entities such as Dataset, Input, and Output, and gives those entities access to the internals of a Renku project.

Normally, you do not need to create an instance of the Project class directly unless you want access to Project metadata (e.g. path). To separate the parts of your script that use Renku entities, you can create a Project context manager and interact with Renku inside it:

from renku.api import Project, Input

with Project():
    input_1 = Input("data_1")

Dataset

Renku API Dataset.

The Dataset class allows listing the datasets and files inside a Renku project and accessing their metadata.

To get a list of the datasets available in a Renku project, use the list method:

from renku.api import Dataset

datasets = Dataset.list()

You can then access the metadata of each dataset, such as name, title, and keywords. To get the list of files inside a dataset, use its files property:

for dataset in datasets:
    # Each entry in dataset.files describes one file in the dataset.
    for dataset_file in dataset.files:
        print(dataset_file.path)
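The metadata fields named above (name, title, keywords, files) can be combined into a small summary. The following is a minimal sketch that runs outside a Renku project: the SimpleNamespace objects are made-up stand-ins shaped like the result of Dataset.list(), and summarize is a hypothetical helper, not part of renku.api.

```python
from types import SimpleNamespace


def summarize(datasets):
    """Map each dataset name to the list of its file paths."""
    return {ds.name: [f.path for f in ds.files] for ds in datasets}


# Stand-in data shaped like the result of Dataset.list(); all values are made up.
datasets = [
    SimpleNamespace(
        name="weather",
        title="Weather data",
        keywords=["climate"],
        files=[SimpleNamespace(path="data/weather/2020.csv")],
    ),
    SimpleNamespace(name="empty", title="Empty", keywords=[], files=[]),
]

for name, paths in summarize(datasets).items():
    print(name, paths)
```

Inside a real project, the stand-in list would be replaced by the value returned from Dataset.list().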

Inputs, Outputs, and Parameters

Renku API Workflow Models.

The Input and Output classes can be used to declare a script's inputs and outputs from within the script itself. Paths defined with these classes are added to the explicit inputs and outputs in the workflow’s metadata. For example, the following marks data/data.csv as an input to the script:

from renku.api import Input

with open(Input("data/data.csv")) as input_data:
    for line in input_data:
        print(line)
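The example above passes an Input directly to open(), which suggests that Input behaves as a path-like object. The sketch below illustrates that general Python mechanism (the os.PathLike protocol) with a hypothetical TrackedPath class. It is an assumption about how such a wrapper can work, not Renku's actual implementation.

```python
import os
import tempfile


class TrackedPath:
    """Hypothetical stand-in for a path-like wrapper; not Renku's Input class."""

    seen = []  # paths recorded so far

    def __init__(self, path):
        self.path = str(path)
        TrackedPath.seen.append(self.path)  # record the path when it is declared

    def __fspath__(self):
        # open() and os.fspath() call this to obtain the real filesystem path.
        return self.path


with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    with open(csv_path, "w") as handle:
        handle.write("a,b\n1,2\n")

    # The wrapper is passed straight to open(), as in the Input example above.
    with open(TrackedPath(csv_path)) as input_data:
        lines = [line.rstrip("\n") for line in input_data]

print(lines)
```

Because the path is recorded when the wrapper is constructed, declaring it and opening it happen in a single expression, which is the pattern the Input example relies on.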

Users can track parameters’ values in a workflow by defining them using the Parameter function:

from renku.api import Parameter

nc = Parameter(name="n_components", value=10)

Once a Parameter is tracked like this, its value can be overridden in commands such as renku workflow execute by using the --set option.
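The override semantics can be pictured as a lookup that prefers a value supplied at execution time over the default written in the script. The sketch below is only an illustration of that idea: resolve_parameter and the overrides dictionary are hypothetical names standing in for whatever --set supplies, and none of this is Renku's implementation.

```python
def resolve_parameter(name, default, overrides):
    """Return the override for `name` if one was supplied, else the default."""
    return overrides.get(name, default)


# Default written in the script, as in Parameter(name="n_components", value=10).
without_override = resolve_parameter("n_components", 10, {})

# Hypothetical override, standing in for a value passed with --set.
with_override = resolve_parameter("n_components", 10, {"n_components": 20})

print(without_override, with_override)
```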