Renku Python API
Project
Renku API Project.
The Project class acts as a context for other Renku entities such as Dataset or Inputs/Outputs. It gives those entities access to the internals of a Renku project.
Normally, you do not need to create an instance of the Project class directly unless you want access to Project metadata (e.g. path). To separate the parts of your script that use Renku entities, you can create a Project context manager and interact with Renku inside it:
from renku.api import Project, Input
with Project():
    input_1 = Input("data_1")
Dataset
Renku API Dataset.
The Dataset class allows listing the datasets and files inside a Renku project and accessing their metadata.
To get a list of the available datasets in a Renku project, use the list method:
from renku.api import Dataset
datasets = Dataset.list()
You can then access the metadata of a dataset, such as name, title, keywords, etc. To get the list of files inside a dataset, use the files property:
for dataset in datasets:
    for dataset_file in dataset.files:
        print(dataset_file.path)
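As a minimal sketch, the metadata fields and the files property mentioned above can be combined into a single summary. The helper name describe_datasets and the output layout are illustrative assumptions, not part of the Renku API. The helper only relies on the name, title, keywords, files and path attributes described in this section, and it treats keywords as an iterable of strings, which is also an assumption.

```python
def describe_datasets(datasets):
    """Return one summary line per dataset, followed by its file paths.

    `datasets` is any iterable of objects exposing the attributes described
    above (name, title, keywords, files), e.g. the result of Dataset.list().
    """
    lines = []
    for dataset in datasets:
        # Assumption: keywords is an iterable of strings (or empty/None).
        keywords = ", ".join(dataset.keywords or [])
        lines.append(f"{dataset.name}: {dataset.title} [{keywords}]")
        for dataset_file in dataset.files:
            lines.append(f"  {dataset_file.path}")
    return lines


def main():
    # Imported here so the helper above can be used without Renku installed.
    from renku.api import Dataset

    for line in describe_datasets(Dataset.list()):
        print(line)
```

Calling main() from inside a Renku project should print one block per dataset. Keeping the Renku import inside main() is a design choice that lets the formatting logic be exercised with plain stand-in objects.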
Inputs, Outputs, and Parameters
Renku API Workflow Models.
The Input and Output classes can be used to define the inputs and outputs of a script from within the script itself. Paths defined with these classes are added to the explicit inputs and outputs in the workflow's metadata. For example, the following marks data/data.csv as an input to the script:
from renku.api import Input
with open(Input("data/data.csv")) as input_data:
    for line in input_data:
        print(line)
Users can track the values of parameters in a workflow by defining them with the Parameter function:
from renku.api import Parameter
nc = Parameter(name="n_components", value=10)
Once a Parameter is tracked like this, its value can be overridden in commands such as renku workflow execute by passing the --set option.
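To illustrate the override, the sketch below assembles such a command line from Python. It assumes that --set takes a name=value pair and that the workflow name is passed as the final positional argument. The helper build_execute_command and the workflow name my-workflow are hypothetical, so check renku workflow execute --help for the exact syntax in your Renku version.

```python
import shlex


def build_execute_command(workflow_name, overrides):
    """Build an argument list for `renku workflow execute`.

    Assumption: each override is passed as `--set <name>=<value>` and the
    workflow name comes last. Both names used below are hypothetical.
    """
    command = ["renku", "workflow", "execute"]
    for name, value in overrides.items():
        command += ["--set", f"{name}={value}"]
    command.append(workflow_name)
    return command


# Override the tracked n_components parameter for one execution.
command = build_execute_command("my-workflow", {"n_components": 20})
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) from inside a Renku project; the sketch itself only prints the command and does not run it.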