renku workflow

Manage the set of execution templates created by the renku run command.

Commands and options

renku workflow

Workflow commands.

renku workflow [OPTIONS] COMMAND [ARGS]...

compose

Create a composite workflow consisting of multiple steps.

renku workflow compose [OPTIONS] NAME [STEPS]...

Options

-d, --description <description>

Workflow step’s description.

-m, --map <mappings>

Mapping for a workflow parameter.

-s, --set <defaults>

Default value for a workflow parameter.

-l, --link <links>

Link source and sink parameters to connect steps.

-p, --describe-param <describe_param>

Add description for a workflow parameter.

--map-inputs

Exposes all child inputs as inputs on the CompositePlan.

--map-outputs

Exposes all child outputs as outputs on the CompositePlan.

--map-params

Exposes all child parameters as parameters on the CompositePlan.

--map-all

Combination of --map-inputs, --map-outputs, --map-params.

--link-all

Automatically link steps based on default values.

--keyword <keyword>

List of keywords for the workflow.

--from <sources>

Start a composite plan from this file as input.

--to <sinks>

End a composite plan at this file as output.

--creator <creators>

Creator’s name, email, and affiliation. Accepted format is ‘Forename Surname <email> [affiliation]’.

Arguments

NAME

Required argument

STEPS

Optional argument(s)

edit

Edit workflow details.

renku workflow edit [OPTIONS] <name or uuid>

Options

-n, --name <new name>

New name of the workflow

-d, --description <new desc>

New description of the workflow

-s, --set <parameter>=<value>

Set default <value> for a <parameter>/add new parameter

-m, --map <parameter>=<parameter or expression>

New mapping on the workflow

-r, --rename-param <parameter>="name"

New name for parameter

--describe-param <parameter>="description"

New description for parameter

-m, --metadata <metadata>

Custom metadata to be associated with the workflow.

--creator <creators>

Creator’s name, email, and affiliation. Accepted format is ‘Forename Surname <email> [affiliation]’.

--keyword <keywords>

List of keywords for the workflow.

Arguments

<name or uuid>

Required argument

execute

Execute a given workflow.

renku workflow execute [OPTIONS] NAME_OR_ID

Options

-p, --provider <provider>

The workflow engine to use.

Default:

toil

Options:

toil | local | cwltool

-c, --config <config file>

YAML file containing configuration for the provider.

-s, --set <parameter>=<value>

Set <value> for a <parameter> to be used in execution.

--values <values-file>

YAML file containing parameter mappings to be used.

--skip-metadata-update

Do not update the metadata store for the execution.

Arguments

NAME_OR_ID

Required argument

export

Export workflow.

renku workflow export [OPTIONS] <name or uuid>

Options

-f, --format <format>

Workflow language format.

Default:

cwl

Options:

renku | cwl

-o, --output <path>

Save to <path> instead of printing to terminal

--values <file>

YAML file containing parameter mappings to be used.

Arguments

<name or uuid>

Required argument

inputs

Show all inputs used by workflows.

<PATHS> Limit results to these paths.

renku workflow inputs [OPTIONS] [PATHS]...

Arguments

PATHS

Optional argument(s)

iterate

Execute a workflow by iterating through a range of provided parameters.

renku workflow iterate [OPTIONS] NAME_OR_ID

Options

--skip-metadata-update

Do not update the metadata store for the execution.

--mapping <file>

YAML file containing parameter mappings to be used.

-n, --dry-run

Print the generated plans with their parameters instead of executing.

Default:

False

-p, --provider <provider>

The workflow engine to use.

Default:

toil

Options:

toil | local | cwltool

-m, --map <mappings>

Mapping for a workflow parameter.

-c, --config <config file>

YAML file containing config for the provider.

Arguments

NAME_OR_ID

Required argument

ls

List or manage workflows with subcommands.

renku workflow ls [OPTIONS]

Options

--format <format>

Choose an output format.

Options:

tabular | json-ld | json

-c, --columns <columns>

Comma-separated list of columns to display: id, name, keywords, description, command.

Default:

id,name,command

outputs

Show all outputs generated by workflows.

<PATHS> Limit results to these paths.

renku workflow outputs [OPTIONS] [PATHS]...

Arguments

PATHS

Optional argument(s)

remove

Remove a workflow named <name>.

renku workflow remove [OPTIONS] <name>

Options

--force

Override the existence check.

Arguments

<name>

Required argument

revert

Revert activity metadata and generations.

renku workflow revert [OPTIONS] ACTIVITY_ID

Options

-m, --metadata-only

Only undo metadata, leave generated outputs unchanged.

Default:

False

-f, --force

Force-revert the activity, even if it breaks things.

Default:

False

-p, --plan

Delete activity’s plan if no other activity is using it.

Default:

False

Arguments

ACTIVITY_ID

Required argument

show

Show details for workflow <name_or_id_or_path>.

renku workflow show [OPTIONS] <name_or_id_or_path>

Arguments

<name_or_id_or_path>

Required argument

visualize

Visualization of workflows that produced outputs at the specified paths.

Either PATHS or --from needs to be set.

renku workflow visualize [OPTIONS] [PATHS]...

Options

--from <sources>

Start drawing the graph from this file.

-c, --columns <columns>

Comma-separated list of columns to display: command, id, date, plan.

Default:

command

-x, --exclude-files

Hide file nodes, only show Runs.

-a, --ascii

Only use ASCII characters for formatting.

--revision <revision>

Git revision to generate the graph for.

--format <format>

Choose an output format.

Options:

console | dot

-i, --interactive

Interactively explore run graph. Only available for console output.

--no-color

Don’t colorize console output.

--pager

Force use of pager (less) for console output.

--no-pager

Don’t use pager (less) for console output.

Arguments

PATHS

Optional argument(s)

Description

Renku records two different kinds of metadata when a workflow is executed: Run and Plan. Plans describe a recipe for a command. They function as a template that can be used directly or combined with other workflow templates to create more complex recipes. Plans can be run in several ways: on creation with renku run, through renku rerun or renku update, or manually with renku workflow execute.

Each time a Plan is run, we track that instance of it as a Run. Runs track workflow execution through time. They track which Plan was run, at what time, with which specific values. This gives insight into which steps were taken in a repository, how they were taken and what results they produced.

The renku workflow group of commands contains most of the commands used to interact with Plans and Runs.

Working with Plans

Listing Plans

$ renku workflow ls
ID                                       NAME
---------------------------------------  ---------------
/plans/11a3702184394b93ac422df760e40999  cp-B-C-ca4da
/plans/96642cac86d9435e8abce2384f8618b9  cat-A-C-fa017
/plans/96c70626575c41c5a13853b070eaaaf5  my-other-run
/plans/9a0961844fcc46e1816fde00f57e24a8  my-run

Each entry corresponds to a recorded Plan/workflow template. You can also show additional columns using the --columns parameter, which takes any combination of values from id, name, keywords, description and command.

Showing Plan Details

You can see the details of a plan by using renku workflow show:

$ renku workflow show my-run
Id: /plans/9a0961844fcc46e1816fde00f57e24a8
Name: my-run
Command: cp A B
Success Codes:
Inputs:
        - input-1:
                Default Value: A
                Position: 1
Outputs:
        - output-2:
                Default Value: B
                Position: 2

This shows the unique Id of the Plan, its name, the full command of the Plan if it was run without any modifications (more on that later), which exit codes should be considered successful executions (defaults to 0) as well as its inputs, outputs and parameters.

Executing Plans

Plans can be executed using renku workflow execute. They can be run as-is or their parameters can be modified as needed. Renku has a plugin architecture to allow execution using various execution backends.

$ renku workflow execute --provider cwltool --set input-1=file.txt my-run

Parameters can be set using the --set option or by specifying them in a values YAML file and passing that file with --values. When passing a file for a composite workflow such as:

$ renku run --name train -- python train.py --lr=0.1 --gamma=0.5 --output=result.csv
$ renku run --name eval -- python eval.py --image=graph.png --data=result.csv
$ renku workflow compose --map learning_rate=train.lr --map graph=eval.image <name> train eval

the YAML file could look like:

# composite (mapped) parameters
learning_rate: 0.9
graph: overview.png
train: # child workflow name
    # child workflow parameters
    gamma: 1.0

This would rerun the two steps with lr set to 0.9, gamma set to 1.0 and the graph image saved as overview.png.

Note that this would be the same as using:

train:
    lr: 0.9
    gamma: 1.0
eval:
    image: overview.png
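
As a rough illustration of why the two files above are equivalent, the following sketch expands composite-level keys into per-step values. It is not Renku's implementation: the MAPPINGS table and the expand_values helper are hypothetical and only model the two --map options from the example.

```python
# Hypothetical model of the two --map options from the example above:
# composite parameter name -> (child workflow name, child parameter name).
MAPPINGS = {
    "learning_rate": ("train", "lr"),
    "graph": ("eval", "image"),
}


def expand_values(values, mappings):
    """Expand mapped (composite-level) keys into per-step parameter values."""
    expanded = {}
    # Route each mapped key to the child parameter it points to.
    for key, value in values.items():
        if key in mappings:
            step, param = mappings[key]
            expanded.setdefault(step, {})[param] = value
    # Values nested under a child workflow name are applied afterwards,
    # so a value given directly to a parameter wins over a mapped one.
    for key, value in values.items():
        if isinstance(value, dict):
            expanded.setdefault(key, {}).update(value)
    return expanded


mapped_form = {
    "learning_rate": 0.9,
    "graph": "overview.png",
    "train": {"gamma": 1.0},
}
explicit_form = {
    "train": {"lr": 0.9, "gamma": 1.0},
    "eval": {"image": "overview.png"},
}

# Both forms describe the same per-step values.
print(expand_values(mapped_form, MAPPINGS) == explicit_form)
```

The dictionaries stand in for the parsed YAML files, so the sketch needs no YAML library.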

For a regular (non-composite) workflow it is enough to just specify key-value pairs like:

lr: 0.9
gamma: 1.0

In addition to being passed on the command line and being available to renku.ui.api.* classes in Python scripts, parameters are also set as environment variables when executing the command, in the form of RENKU_ENV_<parameter name>.
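
A minimal sketch of how a script might read such a variable with a fallback. The parameter name lr comes from the example above; whether Renku changes the parameter name (for example its case) when building the variable name is not stated here, so the exact name RENKU_ENV_lr is an assumption, and read_parameter is a hypothetical helper.

```python
import os


def read_parameter(name, default=None):
    """Return the value of RENKU_ENV_<name> if set, else the default.

    Assumption: the variable name is the prefix followed by the parameter
    name exactly as written; adjust if your environment differs.
    """
    return os.environ.get(f"RENKU_ENV_{name}", default)


# Environment variables are strings, so convert explicitly.
learning_rate = float(read_parameter("lr", default="0.1"))
print(learning_rate)
```

Falling back to a default keeps the script usable outside of a Renku execution, where the variable is not set.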

Provider-specific settings can be passed as a file using the --config parameter.

In some cases it may be desirable to avoid updating the renku metadata and to avoid committing this and any other change in the repository when a workflow is executed. If this is the case then you can pass the --skip-metadata-update flag to renku workflow execute.

Iterate Plans

To execute a Plan with different parametrizations, use renku workflow iterate. This sub-command performs a ‘grid search’-like execution of a Plan, with parameter sets provided by the user.

$ renku workflow iterate --map parameter-1=[1,2,3] --map parameter-2=[10,20] my-run

The set of possible values for a parameter can be given with the --map command-line option or by specifying them in a YAML file and passing that file with --mapping. For the example above, the mapping file would contain:

parameter-1: [1,2,3]
parameter-2: [10,20]

By default, renku workflow iterate executes all combinations of the given parameters’ possible values. Sometimes you want to execute only specific tuples of values instead. You can do this by marking the parameters that should be bound together with the same @tag suffix in their names.

$ renku workflow iterate --map parameter-1@tag1=[1,2,3] --map parameter-2@tag1=[10,5,30] my-run

This results in only three distinct executions of the my-run Plan, with the following parameter combinations: [(1,10), (2,5), (3,30)]. Note that parameters with the same tag must have the same number of possible values, i.e. their value lists must have the same length.
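
The difference between untagged and tagged parameters can be sketched as a Cartesian product versus a position-wise pairing. The expand helper below is hypothetical and not Renku's code; in particular, how tagged groups combine with untagged parameters (here: as one additional axis of the grid) is an assumption.

```python
from itertools import product


def expand(mappings):
    """Return one dict of parameter values per execution.

    Keys are parameter names, optionally suffixed with "@<tag>".
    """
    untagged = {}
    tagged = {}  # tag -> {parameter name: values}
    for key, values in mappings.items():
        name, _, tag = key.partition("@")
        if tag:
            tagged.setdefault(tag, {})[name] = values
        else:
            untagged[name] = values

    # Each untagged parameter is its own axis of the grid.
    axes = [[{name: v} for v in values] for name, values in untagged.items()]

    # Parameters sharing a tag advance together and form a single axis.
    for tag, group in tagged.items():
        lengths = {len(values) for values in group.values()}
        if len(lengths) != 1:
            raise ValueError(f"values for tag {tag!r} differ in length")
        names = list(group)
        axes.append([dict(zip(names, combo)) for combo in zip(*group.values())])

    # Cartesian product over all axes, merged into one dict per execution.
    result = []
    for combo in product(*axes):
        merged = {}
        for part in combo:
            merged.update(part)
        result.append(merged)
    return result


# Untagged: 3 x 2 combinations.
print(len(expand({"parameter-1": [1, 2, 3], "parameter-2": [10, 20]})))
# Tagged: values are paired by position.
print(expand({"parameter-1@tag1": [1, 2, 3], "parameter-2@tag1": [10, 5, 30]}))
```

The length check mirrors the requirement that value lists sharing a tag have the same length.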

There’s a special template variable for parameter values {iter_index}, which can be used to mark each iteration’s index in a value of a parameter. The template variable is going to be substituted with the iteration index (0, 1, 2, …).

$ renku workflow iterate --map parameter-1=[10,20,30] --map output=output_{iter_index}.txt my-run

This would execute my-run three times, with parameter-1 set to 10, 20 and 30 and with the output files output_0.txt, output_1.txt and output_2.txt, in this order.
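
The substitution can be pictured with the following sketch. The render_iterations helper is hypothetical; it only illustrates replacing {iter_index} with a zero-based counter and makes no claim about how Renku performs the substitution internally.

```python
def render_iterations(values, template):
    """Pair each value with the template rendered for its iteration index."""
    return [
        # Plain string replacement leaves any other braces untouched.
        (value, template.replace("{iter_index}", str(index)))
        for index, value in enumerate(values)
    ]


for value, output in render_iterations([10, 20, 30], "output_{iter_index}.txt"):
    print(value, output)
```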

In some cases it may be desirable to avoid updating the renku metadata and to avoid committing this and any other change in the repository when a workflow is iterated through. If this is the case then you can pass the --skip-metadata-update flag to renku workflow iterate.

Exporting Plans

You can export a Plan to a number of different workflow languages, such as CWL (Common Workflow Language) by using renku workflow export:

$ renku workflow export --format cwl my-run
baseCommand:
- cp
class: CommandLineTool
cwlVersion: v1.0
id: 63e3a2a8-5b40-49b2-a2f4-eecc37bc76b0
inputs:
- default: B
  id: _plans_9a0961844fcc46e1816fde00f57e24a8_outputs_2_arg
  inputBinding:
    position: 2
  type: string
- default:
    class: File
    location: file:///home/user/my-project/A
  id: _plans_9a0961844fcc46e1816fde00f57e24a8_inputs_1
  inputBinding:
    position: 1
  type: File
- default:
    class: Directory
    location: file:///home/user/my-project/.renku
  id: input_renku_metadata
  type: Directory
- default:
    class: Directory
    location: file:///home/user/my-project/.git
  id: input_git_directory
  type: Directory
outputs:
- id: _plans_9a0961844fcc46e1816fde00f57e24a8_outputs_2
  outputBinding:
    glob: $(inputs._plans_9a0961844fcc46e1816fde00f57e24a8_outputs_2_arg)
  type: File
requirements:
  InitialWorkDirRequirement:
    listing:
    - entry: $(inputs._plans_9a0961844fcc46e1816fde00f57e24a8_inputs_1)
      entryname: A
      writable: false
    - entry: $(inputs.input_renku_metadata)
      entryname: .renku
      writable: false
    - entry: $(inputs.input_git_directory)
      entryname: .git
      writable: false

You can export into a file directly with -o <path>.

Composing Plans into larger workflows

For more complex workflows consisting of several steps, you can use the renku workflow compose command. This creates a new workflow that has sub-steps.

The basic usage is:

$ renku run --name step1 -- cp input intermediate
$ renku run --name step2 -- cp intermediate output
$ renku workflow compose my-composed-workflow step1 step2

This would create a new workflow called my-composed-workflow that consists of step1 and step2 as steps. This new workflow is just like any other workflow in renku in that it can be executed, exported or composed with other workflows.

Workflows can also be composed based on past Runs and their inputs/outputs, using the --from and --to parameters. This finds chains of Runs from inputs to outputs and then adds them to the composed plan, applying mappings (see below) where appropriate to make sure the correct values for execution are used in the composite. This also means that all the parameters in the used plans are exposed on the composed plan directly. In the example above, this would be:

$ renku workflow compose --from input --to output my-composed-workflow

You can expose parameters of child steps on the parent workflow using --map/-m arguments followed by a mapping expression. Mapping expressions take the form of <name>=<expression> where name is the name of the property to be created on the parent workflow and expression points to one or more fields on the child steps that should be mapped to this property. The expressions come in two flavors, absolute references using the names of workflows and properties, and relative references specifying the position within a workflow.

An absolute expression in the example above could be step1.my_dataset to refer to the input, output or argument named my_dataset on the step step1. A relative expression could be @step2.@output1 to refer to the first output of the second step of the composed workflow.

Valid relative expressions are @input<n>, @output<n> and @param<n> for the nth input, output or argument of a step, respectively. For referring to steps inside a composed workflow, you can use @step<n>. For referencing a mapping on a composed workflow, you can use @mapping<n>. Of course, the names of the objects for all these cases also work.

The expressions can also be combined using a comma (,) if a mapping should point to more than one parameter of a child step.

You can mix absolute and relative references in the same expression, as you see fit.

A full example of this would be:

$ renku workflow compose --map input_file=step1.@input2 \
    --map output_file=@step1.my-output,@step2.step2s_output \
    my-composed-workflow step1 step2

This would create a mapping called input_file on the parent workflow that points to the second input of step1 and a mapping called output_file that points to both the output my-output on step1 and step2s_output on step2.
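
To make the structure of such expressions concrete, here is a small parsing sketch. It is a hypothetical helper, not Renku's parser: it assumes that names contain no dots, commas or equals signs, and it only classifies each segment as relative (starting with @) or absolute.

```python
def parse_mapping(text):
    """Split '<name>=<expression>' into a name and a list of targets.

    Each target is a list of (segment, kind) pairs, where kind is
    "relative" for segments starting with "@" and "absolute" otherwise.
    """
    name, sep, expression = text.partition("=")
    if not sep or not name or not expression:
        raise ValueError(f"expected <name>=<expression>, got {text!r}")
    targets = []
    # A comma separates references to several child parameters.
    for reference in expression.split(","):
        segments = [
            (segment, "relative" if segment.startswith("@") else "absolute")
            for segment in reference.split(".")
        ]
        targets.append(segments)
    return name, targets


name, targets = parse_mapping("output_file=@step1.my-output,@step2.step2s_output")
print(name)
for target in targets:
    print(target)
```

The second mapping from the example yields two targets, each mixing a relative step reference with an absolute output name.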

You can also set default values for mappings, which override the default values of the parameters they’re pointing to by using the --set/-s parameter, for instance:

$ renku workflow compose --map input_file=step1.@input2 \
    --set input_file=data.csv \
    my-composed-workflow step1 step2

This would lead to data.csv being used for the second input of step1 when my-composed-workflow is executed (if it isn’t overridden at execution time).

You can add a description to the mappings to make them more human-readable by using the --describe-param/-p parameter, as shown here:

$ renku workflow compose --map input_file=step1.@input2 \
    -p input_file="The dataset to process" \
    my-composed-workflow step1 step2

You can also expose all inputs, outputs or parameters of child steps by using --map-inputs, --map-outputs or --map-params, respectively.

On execution, renku will automatically detect links between steps, if an input of one step uses the same path as an output of another step, and execute them in the correct order. Since this depends on what values are passed at runtime, you might want to enforce a certain order of steps by explicitly mapping outputs to inputs.

You can do that using the --link <source>=<sink> parameters, e.g. --link step1.@output1=step2.@input1. This gets recorded on the workflow template and forces step2.@input1 to always be set to the same path as step1.@output1, irrespective of which values are passed at execution time.

This way, you can ensure that the steps in your workflow are always executed in the correct order and that the dependencies between steps are modeled correctly.

Renku can also add links for you automatically based on the default values of inputs and outputs, where inputs/outputs that have the same path get linked in the composed run. To do this, pass the --link-all flag.

Warning

Due to workflows having to be directed acyclic graphs, cycles in the dependencies are not allowed. E.g. step1 depending on step2 depending on step1 is not allowed. Additionally, the flow of information has to be from outputs to inputs or parameters, so you cannot map an input to an output, only the other way around.
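
The acyclicity requirement can be illustrated with a topological sort. The sketch below uses graphlib from the Python standard library (Python 3.9 or later); execution_order is a hypothetical helper and says nothing about how Renku itself orders steps or reports cycles.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return steps in an order that respects the given dependencies.

    `dependencies` maps each step to the set of steps it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"cyclic dependency: {error.args[1]}") from error


# step2 reads the output of step1, so step1 has to run first.
print(execution_order({"step1": set(), "step2": {"step1"}}))

# step1 depending on step2 depending on step1 is rejected.
try:
    execution_order({"step1": {"step2"}, "step2": {"step1"}})
except ValueError as error:
    print(error)
```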

Values on inputs/outputs/parameters get set according to the following order of precedence (lower precedence first):

  • Default value on an input/output/parameter

  • Default value on a mapping to the input/output/parameter

  • Value passed to a mapping to the input/output/parameter

  • Value passed to the input/output/parameter

  • Value propagated to an input from the source of a workflow link
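
The order of precedence above can be sketched as a lookup in which later sources override earlier ones. The source names in PRECEDENCE and the resolve helper are hypothetical labels for the five list items, not identifiers used by Renku.

```python
# Sources in the order listed above, lowest precedence first.
PRECEDENCE = (
    "parameter_default",
    "mapping_default",
    "mapping_value",
    "parameter_value",
    "link_value",
)


def resolve(sources):
    """Return the value from the highest-precedence source that is set."""
    result = None
    for key in PRECEDENCE:
        if sources.get(key) is not None:
            result = sources[key]  # later (higher-precedence) sources win
    return result


# A default on the mapping overrides the default on the parameter.
print(resolve({"parameter_default": "A", "mapping_default": "data.csv"}))
# A value propagated through a link overrides a directly passed value.
print(resolve({"parameter_default": "A", "parameter_value": "file.txt",
               "link_value": "intermediate"}))
```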

Editing Plans

Plans can be edited in a limited fashion, but we do not allow structural changes, as that might cause issues with the reproducibility and provenance of the project. If you want to make structural changes (e.g. adding/removing parameters), we recommend you record a new plan instead.

You can change the name and description of Plans and of their parameters, as well as changing default values of the parameters using the renku workflow edit command:

$ renku workflow edit my-run --name new-run --description "my description" \
  --rename-param input-1=my-input --set my-input=other-file.txt \
  --describe-param my-input="My input parameter"

This would rename the Plan my-run to new-run, change its description, rename its parameter input-1 to my-input and set the default of this parameter to other-file.txt and set its description.

Option                Description
--------------------  ------------------------------------------------------------------------------
-n, --name            Plan’s name.
-d, --description     Plan’s description.
-s, --set             Set default value for a parameter. Accepted format is ‘<name>=<value>’.
-m, --map             Add a new mapping on the Plan. Accepted format is ‘<name>=<name or expression>’.
-r, --rename-param    Rename a parameter. Accepted format is ‘<name>="new name"’.
--describe-param      Add a description for a parameter. Accepted format is ‘<name>="description"’.
-m, --metadata        Path to file containing custom JSON-LD metadata to be added to the Plan.

Removing Plans

Sometimes you might want to discard a recorded Plan or reuse its name with a new Plan. In these cases, you can delete the old plan using renku workflow remove <plan name>. Once a Plan is removed, it doesn’t show up in most renku workflow commands. renku update ignores deleted Plans, but renku rerun will still rerun them if needed, to ensure reproducibility.

Working with Runs

Listing Runs

To get a view of what commands have been executed in the project, you can use the renku log --workflows command:

$ renku log --workflows
DATE                 TYPE  DESCRIPTION
-------------------  ----  -------------
2021-09-21 15:46:02  Run   cp A C
2021-09-21 10:52:51  Run   cp A B

Refer to the documentation of the renku log command for more details.

Visualizing Executions

You can visualize past Runs made with renku using the renku workflow visualize command. This will show a directed graph of executions and how they are connected. This way you can see exactly how a file was generated and what steps it involved. It also supports an interactive mode that lets you explore the graph in a more detailed way.

$ renku run echo "input" > input
$ renku run cp input intermediate
$ renku run cp intermediate output
$ renku workflow visualize
     ╔════════════╗
     ║echo > input║
     ╚════════════╝
             *
             *
             *
         ┌─────┐
         │input│
         └─────┘
             *
             *
             *
 ╔═════════════════════╗
 ║cp input intermediate║
 ╚═════════════════════╝
             *
             *
             *
     ┌────────────┐
     │intermediate│
     └────────────┘
             *
             *
             *
 ╔══════════════════════╗
 ║cp intermediate output║
 ╚══════════════════════╝
             *
             *
             *
         ┌──────┐
         │output│
         └──────┘

 $ renku workflow visualize intermediate
     ╔════════════╗
     ║echo > input║
     ╚════════════╝
         *
         *
         *
         ┌─────┐
         │input│
         └─────┘
         *
         *
         *
 ╔═════════════════════╗
 ║cp input intermediate║
 ╚═════════════════════╝
         *
         *
         *
     ┌────────────┐
     │intermediate│
     └────────────┘
 $ renku workflow visualize --from intermediate
     ┌────────────┐
     │intermediate│
     └────────────┘
             *
             *
             *
 ╔══════════════════════╗
 ║cp intermediate output║
 ╚══════════════════════╝
             *
             *
             *
         ┌──────┐
         │output│
         └──────┘

You can also run in interactive mode using the --interactive flag.

$ renku workflow visualize --interactive

This will allow you to navigate between workflow executions and see details by pressing the <Enter> key.

If you want to process the output graph further, or export it, you can use the --format option to specify an output format.

The following example generates the graph using the dot format. It can be stored in a file or piped directly to any compatible tool. Here we use the dot command line tool from graphviz to generate an SVG file.

$ renku workflow visualize --format dot <path> | dot -Tsvg > graph.svg

Use renku workflow visualize -h to see all available options.

Removing Runs

Renku allows you to undo a Run in a project by using renku workflow revert <activity ID>. You can obtain <activity ID> from the renku log command. If the reverted run generated files, Renku either deletes them (if there are no earlier versions of them and they are not used in other activities) or reverts them to their earlier versions. You can ask Renku to keep the generated files and only delete the metadata by passing the --metadata-only option.

Warning

Renku only checks the project’s runs and plans to see whether files are used. It doesn’t check, for example, whether files that are going to be deleted have been added to a dataset. Make sure that the project doesn’t use such files elsewhere, or always use the --metadata-only option when reverting a run.

If you want to delete a run along with its plan use the --plan option. This only deletes the plan if it’s not used by any other activity.

Renku won’t remove a run if there are downstream runs that depend on it. The reason is that removing a run will break the link between its upstream and downstream runs. If this is not an issue for you or if you want to delete the downstream runs later, then pass the --force option to make Renku delete the run anyway.

Input and output files

You can list the input and output files of workflows in the repository by running the renku workflow inputs and renku workflow outputs commands. Alternatively, if you pass paths as arguments, the commands check whether all of them are input or output files, respectively.

$ renku run wc < source.txt > result.wc
$ renku workflow inputs
source.txt
$ renku workflow outputs
result.wc
$ renku workflow outputs source.txt
$ echo $?  # last command finished with an error code
1
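
The exit code makes this check easy to use from a script. The following is a minimal sketch, assuming renku is installed and on the PATH; all_outputs is a hypothetical wrapper that relies only on the command and exit-code behaviour shown above.

```python
import subprocess


def all_outputs(paths, run=subprocess.run):
    """Return True if `renku workflow outputs <paths>` exits with code 0.

    `run` is injectable so the function can be exercised without Renku.
    """
    completed = run(
        ["renku", "workflow", "outputs", *paths],
        capture_output=True,
        text=True,
    )
    return completed.returncode == 0
```

In the repository from the example, all_outputs(["source.txt"]) would be expected to return False, since source.txt is an input rather than an output.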