renku workflow
Manage the set of execution templates created by the renku run
command.
Commands and options
renku workflow
Workflow commands.
renku workflow [OPTIONS] COMMAND [ARGS]...
compose
Create a composite workflow consisting of multiple steps.
renku workflow compose [OPTIONS] NAME [STEPS]...
Options
- -d, --description <description>
Workflow step’s description.
- -m, --map <mappings>
Mapping for a workflow parameter.
- -s, --set <defaults>
Default value for a workflow parameter.
- -l, --link <links>
Link source and sink parameters to connect steps.
- -p, --describe-param <describe_param>
Add description for a workflow parameter.
- --map-inputs
Exposes all child inputs as inputs on the CompositePlan.
- --map-outputs
Exposes all child outputs as outputs on the CompositePlan.
- --map-params
Exposes all child parameters as parameters on the CompositePlan.
- --map-all
Combination of --map-inputs, --map-outputs, --map-params.
- --link-all
Automatically link steps based on default values.
- --keyword <keyword>
List of keywords for the workflow.
- --from <sources>
Start a composite plan from this file as input.
- --to <sinks>
End a composite plan at this file as output.
- --creator <creators>
Creator’s name, email, and affiliation. Accepted format is ‘Forename Surname <email> [affiliation]’.
Arguments
- NAME
Required argument
- STEPS
Optional argument(s)
edit
Edit workflow details.
renku workflow edit [OPTIONS] <name or uuid>
Options
- -n, --name <new name>
New name of the workflow
- -d, --description <new desc>
New description of the workflow
- -s, --set <parameter>=<value>
Set default <value> for a <parameter>/add new parameter
- -m, --map <parameter>=<parameter or expression>
New mapping on the workflow
- -r, --rename-param <parameter>="name"
New name for parameter
- --describe-param <parameter>="description"
New description of the parameter
- -m, --metadata <metadata>
Custom metadata to be associated with the workflow.
- --creator <creators>
Creator’s name, email, and affiliation. Accepted format is ‘Forename Surname <email> [affiliation]’.
- --keyword <keywords>
List of keywords for the workflow.
Arguments
- <name or uuid>
Required argument
execute
Execute a given workflow.
renku workflow execute [OPTIONS] NAME_OR_ID
Options
- -p, --provider <provider>
The workflow engine to use.
- Default:
toil
- Options:
toil | local | cwltool
- -c, --config <config file>
YAML file containing configuration for the provider.
- -s, --set <parameter>=<value>
Set <value> for a <parameter> to be used in execution.
- --values <values-file>
YAML file containing parameter mappings to be used.
- --skip-metadata-update
Do not update the metadata store for the execution.
Arguments
- NAME_OR_ID
Required argument
export
Export workflow.
renku workflow export [OPTIONS] <name or uuid>
Options
- -f, --format <format>
Workflow language format.
- Default:
cwl
- Options:
renku | cwl
- -o, --output <path>
Save to <path> instead of printing to terminal
- --values <file>
YAML file containing parameter mappings to be used.
Arguments
- <name or uuid>
Required argument
inputs
Show all inputs used by workflows.
<PATHS> Limit results to these paths.
renku workflow inputs [OPTIONS] [PATHS]...
Arguments
- PATHS
Optional argument(s)
iterate
Execute a workflow by iterating through a range of provided parameters.
renku workflow iterate [OPTIONS] NAME_OR_ID
Options
- --skip-metadata-update
Do not update the metadata store for the execution.
- --mapping <file>
YAML file containing parameter mappings to be used.
- -n, --dry-run
Print the generated plans with their parameters instead of executing.
- Default:
False
- -p, --provider <provider>
The workflow engine to use.
- Default:
toil
- Options:
toil | local | cwltool
- -m, --map <mappings>
Mapping for a workflow parameter.
- -c, --config <config file>
YAML file containing config for the provider.
Arguments
- NAME_OR_ID
Required argument
ls
List or manage workflows with subcommands.
renku workflow ls [OPTIONS]
Options
- --format <format>
Choose an output format.
- Options:
tabular | json-ld | json
- -c, --columns <columns>
Comma-separated list of columns to display: id, name, keywords, description, command.
- Default:
id,name,command
outputs
Show all outputs generated by workflows.
<PATHS> Limit results to these paths.
renku workflow outputs [OPTIONS] [PATHS]...
Arguments
- PATHS
Optional argument(s)
remove
Remove a workflow named <name>.
renku workflow remove [OPTIONS] <name>
Options
- --force
Override the existence check.
Arguments
- <name>
Required argument
revert
Revert activity metadata and generations.
renku workflow revert [OPTIONS] ACTIVITY_ID
Options
- -m, --metadata-only
Only undo metadata, leave generated outputs unchanged.
- Default:
False
- -f, --force
Force-revert the activity, even if it breaks things.
- Default:
False
- -p, --plan
Delete activity’s plan if no other activity is using it.
- Default:
False
Arguments
- ACTIVITY_ID
Required argument
show
Show details for workflow <name_or_id_or_path>.
renku workflow show [OPTIONS] <name_or_id_or_path>
Arguments
- <name_or_id_or_path>
Required argument
visualize
Visualization of workflows that produced outputs at the specified paths.
Either PATHS or --from must be set.
renku workflow visualize [OPTIONS] [PATHS]...
Options
- --from <sources>
Start drawing the graph from this file.
- -c, --columns <columns>
Comma-separated list of columns to display: command, id, date, plan.
- Default:
command
- -x, --exclude-files
Hide file nodes, only show Runs.
- -a, --ascii
Only use Ascii characters for formatting.
- --revision <revision>
Git revision to generate the graph for.
- --format <format>
Choose an output format.
- Options:
console | dot
- -i, --interactive
Interactively explore run graph. Only available for console output
- --no-color
Don’t colorize console output.
- --pager
Force use pager (less) for console output.
- --no-pager
Don’t use pager (less) for console output.
Arguments
- PATHS
Optional argument(s)
Description
Renku records two different kinds of metadata when a workflow is executed:
Run and Plan.
Plans describe a recipe for a command. They function as a template that
can be used directly or combined with other workflow templates to create more
complex recipes.
These Plans can be run in various ways: on creation with renku run, by doing
a renku rerun or renku update, or manually using renku workflow execute.
Each time a Plan is run, we track that instance of it as a Run.
Runs track workflow execution through time. They track which Plan was
run, at what time, with which specific values. This gives insight into which
steps were taken in a repository, how they were taken and what results they
produced.
The renku workflow group of commands contains most of the commands used
to interact with Plans and Runs.
Working with Plans
Listing Plans
$ renku workflow ls
ID NAME
--------------------------------------- ---------------
/plans/11a3702184394b93ac422df760e40999 cp-B-C-ca4da
/plans/96642cac86d9435e8abce2384f8618b9 cat-A-C-fa017
/plans/96c70626575c41c5a13853b070eaaaf5 my-other-run
/plans/9a0961844fcc46e1816fde00f57e24a8 my-run
Each entry corresponds to a recorded Plan/workflow template. You can also
show additional columns using the --columns parameter, which takes any
combination of values from id, name, keywords, description and command.
Showing Plan Details
You can see the details of a plan by using renku workflow show:
$ renku workflow show my-run
Id: /plans/9a0961844fcc46e1816fde00f57e24a8
Name: my-run
Command: cp A B
Success Codes:
Inputs:
    - input-1:
        Default Value: A
        Position: 1
Outputs:
    - output-2:
        Default Value: B
        Position: 2
This shows the unique Id of the Plan, its name, the full command of the Plan
if it was run without any modifications (more on that later), which exit codes
should be considered successful executions (defaults to 0), as well as its
inputs, outputs and parameters.
Executing Plans
Plans can be executed using renku workflow execute. They can be run as-is
or their parameters can be modified as needed. Renku has a plugin architecture
to allow execution using various execution backends.
$ renku workflow execute --provider cwltool --set input-1=file.txt my-run
Parameters can be set using the --set option or by specifying them in a
values YAML file and passing that using --values. When passing a file
for a composite workflow like:
$ renku run --name train -- python train.py --lr=0.1 --gamma=0.5 --output=result.csv
$ renku run --name eval -- python eval.py --image=graph.png --data=result.csv
$ renku workflow compose --map learning_rate=train.lr --map graph=eval.image <name> train eval
the YAML file could look like:
# composite (mapped) parameters
learning_rate: 0.9
graph: overview.png
train: # child workflow name
    # child workflow parameters
    gamma: 1.0
This would rerun the two steps, but with lr set to 0.9, gamma set to 1.0
and the output saved under overview.png.
Note that this would be the same as using:
train:
    lr: 0.9
    gamma: 1.0
eval:
    image: overview.png
For a regular (non-composite) workflow it is enough to just specify key-value pairs like:
lr: 0.9
gamma: 1.0
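To illustrate why the mapped and the nested values files above describe the same execution, here is a minimal Python sketch. It is not Renku's implementation: the function name expand_values and the dictionary layout are hypothetical, and the values files are assumed to be already parsed into plain dictionaries.

```python
def expand_values(values, mappings, steps):
    """Resolve a parsed values file into per-step parameter values.

    `mappings` maps a composite-level name to a list of (step, parameter)
    targets. This layout is an assumption made for the illustration.
    """
    resolved = {step: {} for step in steps}
    for key, value in values.items():
        if key in mappings:
            # A mapped parameter: forward the value to every target it points to.
            for step, param in mappings[key]:
                resolved[step][param] = value
        elif key in resolved and isinstance(value, dict):
            # A child workflow name: its entries are that step's own parameters.
            resolved[key].update(value)
        else:
            raise KeyError(f"unknown parameter or step: {key}")
    return resolved


# Mappings created by the compose command in the example above.
mappings = {"learning_rate": [("train", "lr")], "graph": [("eval", "image")]}
steps = ["train", "eval"]

mapped_form = {"learning_rate": 0.9, "graph": "overview.png", "train": {"gamma": 1.0}}
nested_form = {"train": {"lr": 0.9, "gamma": 1.0}, "eval": {"image": "overview.png"}}

same = expand_values(mapped_form, mappings, steps) == expand_values(nested_form, mappings, steps)
print(same)
```

Both forms resolve to the same per-step values, which is why either file can be passed with --values.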
In addition to being passed on the command line and being available to
renku.ui.api.* classes in Python scripts, parameters are also set as
environment variables when executing the command, in the form of
RENKU_ENV_<parameter name>.
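As a rough sketch of how a script might pick up such an environment variable, the following Python snippet reads RENKU_ENV_<parameter name> with a fallback default. The helper renku_param is hypothetical, and the exact spelling of the variable name for a given parameter (case, handling of dashes) is an assumption; check the environment of a real execution before relying on it.

```python
import os


def renku_param(name, default=None):
    """Return the value of a workflow parameter exposed as RENKU_ENV_<name>.

    Environment variables are always strings, so callers convert as needed.
    """
    return os.environ.get(f"RENKU_ENV_{name}", default)


# Simulate what an execution might provide (for demonstration only).
os.environ["RENKU_ENV_lr"] = "0.9"
os.environ.pop("RENKU_ENV_gamma", None)

learning_rate = float(renku_param("lr", default="0.1"))
gamma = float(renku_param("gamma", default="0.5"))  # not set, uses the default
print(learning_rate, gamma)
```

Reading parameters this way keeps the script runnable outside of Renku, since the defaults apply when the variables are absent.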
Provider-specific settings can be passed as a file using the --config
parameter.
In some cases it may be desirable to avoid updating the renku metadata
and to avoid committing this and any other change in the repository when a workflow
is executed. If this is the case, you can pass the --skip-metadata-update
flag to renku workflow execute.
Iterating Plans
To execute a Plan with different parametrizations, use renku workflow iterate.
This sub-command performs a ‘grid search’-like execution of a Plan, with
parameter sets provided by the user.
$ renku workflow iterate --map parameter-1=[1,2,3] --map parameter-2=[10,20] my-run
The set of possible values for a parameter can be given with the --map command
line argument or by specifying them in a values YAML file and passing that
using --mapping. The content of the mapping file for the above example
should be:
parameter-1: [1,2,3]
parameter-2: [10,20]
By default, renku workflow iterate will execute all combinations of the
given parameters’ lists of possible values. Sometimes, instead of all
combinations of possible values, only specific tuples of values should be
executed. This can be done by marking the parameters that should be bound
together with the @tag suffix in their names.
$ renku workflow iterate --map parameter-1@tag1=[1,2,3] --map parameter-2@tag1=[10,5,30] my-run
This will result in only three distinct executions of the my-run Plan,
with the following parameter combinations: [(1,10), (2,5), (3,30)]. It is
important to note that parameters that have the same tag should have the same
number of possible values, i.e. their value lists should have the same length.
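The difference between the default grid and tagged parameters can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not Renku's code; the function expand_iterations and its input format are hypothetical.

```python
from itertools import product


def expand_iterations(mappings):
    """Expand {"name" or "name@tag": [values]} into a list of parameter dicts."""
    groups = {}  # tag (or a unique key for untagged parameters) -> {name: values}
    for key, values in mappings.items():
        name, _, tag = key.partition("@")
        group = tag if tag else f"untagged:{name}"
        groups.setdefault(group, {})[name] = values

    axes = []
    for group, params in groups.items():
        lengths = {len(values) for values in params.values()}
        if len(lengths) != 1:
            raise ValueError(f"parameters tagged '{group}' need lists of equal length")
        names = list(params)
        # Parameters sharing a tag advance together (zip) and form one axis.
        axes.append([dict(zip(names, combo)) for combo in zip(*params.values())])

    # Independent axes are combined as a Cartesian product (the grid).
    return [
        {name: value for part in parts for name, value in part.items()}
        for parts in product(*axes)
    ]


grid = expand_iterations({"parameter-1": [1, 2, 3], "parameter-2": [10, 20]})
tagged = expand_iterations({"parameter-1@tag1": [1, 2, 3], "parameter-2@tag1": [10, 5, 30]})
print(len(grid), len(tagged))
```

With untagged parameters every value of one list is paired with every value of the other, while parameters sharing a tag are paired position by position.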
There’s a special template variable for parameter values, {iter_index}, which
can be used to mark each iteration’s index in a value of a parameter. The template
variable is substituted with the iteration index (0, 1, 2, …).
$ renku workflow iterate --map parameter-1=[10,20,30] --map output=output_{iter_index}.txt my-run
This would execute my-run three times, with parameter-1 taking the values
10, 20 and 30 and producing the output files output_0.txt, output_1.txt
and output_2.txt, in this order.
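The substitution of {iter_index} can be pictured with the following Python sketch. It only mimics the behaviour described above; render_iteration is a hypothetical helper, and plain string replacement is an assumption about how the template is filled in.

```python
def render_iteration(values, index):
    """Substitute {iter_index} in every string value of one iteration."""
    return {
        name: value.replace("{iter_index}", str(index)) if isinstance(value, str) else value
        for name, value in values.items()
    }


# One iteration per value of parameter-1, numbered from 0.
iterations = [
    render_iteration({"parameter-1": value, "output": "output_{iter_index}.txt"}, index)
    for index, value in enumerate([10, 20, 30])
]
for iteration in iterations:
    print(iteration)
```

Each iteration gets its own output path, so later iterations do not overwrite the results of earlier ones.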
In some cases it may be desirable to avoid updating the renku metadata
and to avoid committing this and any other change in the repository when a workflow
is iterated through. If this is the case, you can pass the --skip-metadata-update
flag to renku workflow iterate.
Exporting Plans
You can export a Plan to a number of different workflow languages, such as CWL
(Common Workflow Language), by using renku workflow export:
$ renku workflow export --format cwl my-run
baseCommand:
- cp
class: CommandLineTool
cwlVersion: v1.0
id: 63e3a2a8-5b40-49b2-a2f4-eecc37bc76b0
inputs:
- default: B
  id: _plans_9a0961844fcc46e1816fde00f57e24a8_outputs_2_arg
  inputBinding:
    position: 2
  type: string
- default:
    class: File
    location: file:///home/user/my-project/A
  id: _plans_9a0961844fcc46e1816fde00f57e24a8_inputs_1
  inputBinding:
    position: 1
  type: File
- default:
    class: Directory
    location: file:///home/user/my-project/.renku
  id: input_renku_metadata
  type: Directory
- default:
    class: Directory
    location: file:///home/user/my-project/.git
  id: input_git_directory
  type: Directory
outputs:
- id: _plans_9a0961844fcc46e1816fde00f57e24a8_outputs_2
  outputBinding:
    glob: $(inputs._plans_9a0961844fcc46e1816fde00f57e24a8_outputs_2_arg)
  type: File
requirements:
  InitialWorkDirRequirement:
    listing:
    - entry: $(inputs._plans_9a0961844fcc46e1816fde00f57e24a8_inputs_1)
      entryname: A
      writable: false
    - entry: $(inputs.input_renku_metadata)
      entryname: .renku
      writable: false
    - entry: $(inputs.input_git_directory)
      entryname: .git
      writable: false
You can export into a file directly with -o <path>.
Composing Plans into larger workflows
For more complex workflows consisting of several steps, you can use the
renku workflow compose command. This creates a new workflow that has
sub-steps.
The basic usage is:
$ renku run --name step1 -- cp input intermediate
$ renku run --name step2 -- cp intermediate output
$ renku workflow compose my-composed-workflow step1 step2
This would create a new workflow called my-composed-workflow that
consists of step1 and step2 as steps. This new workflow is just
like any other workflow in renku in that it can be executed, exported
or composed with other workflows.
Workflows can also be composed based on past Runs and their
inputs/outputs, using the --from and --to parameters. This finds
chains of Runs from inputs to outputs and then adds them to the
composed plan, applying mappings (see below) where appropriate to make
sure the correct values for execution are used in the composite. This
also means that all the parameters in the used plans are exposed on the
composed plan directly.
In the example above, this would be:
$ renku workflow compose --from input --to output my-composed-workflow
You can expose parameters of child steps on the parent workflow using
--map/-m arguments followed by a mapping expression. Mapping expressions
take the form of <name>=<expression>, where name is the name of the
property to be created on the parent workflow and expression points to one
or more fields on the child steps that should be mapped to this property.
The expressions come in two flavors: absolute references using the names
of workflows and properties, and relative references specifying the
position within a workflow.
An absolute expression in the example above could be step1.my_dataset
to refer to the input, output or argument named my_dataset on the step
step1. A relative expression could be @step2.@output1 to refer
to the first output of the second step of the composed workflow.
Valid relative expressions are @input<n>, @output<n> and @param<n>
for the nth input, output or argument of a step, respectively. For referring
to steps inside a composed workflow, you can use @step<n>. For referencing
a mapping on a composed workflow, you can use @mapping<n>. Of course, the
names of the objects for all these cases also work.
The expressions can also be combined using a comma (,) if a mapping should
point to more than one parameter of a child step.
You can mix absolute and relative references in the same expression, as you see fit.
A full example of this would be:
$ renku workflow compose --map input_file=step1.@input2 --map output_file=@step1.my-output,@step2.step2s_output my-composed-workflow step1 step2
This would create a mapping called input_file on the parent workflow that
points to the second input of step1 and a mapping called output_file
that points to both the output my-output on step1 and
step2s_output on step2.
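The structure of these mapping expressions can be made concrete with a small Python parser. This is a sketch of the syntax as described above, not the parser Renku uses; the function names and the returned dictionary layout are hypothetical, and names are assumed not to contain dots or commas.

```python
import re

# Relative references: @step<n>, @input<n>, @output<n>, @param<n>, @mapping<n>.
RELATIVE = re.compile(r"^@(step|input|output|param|mapping)(\d+)$")


def parse_reference(part):
    """Classify one dotted component as relative (@kind<n>) or absolute (name)."""
    match = RELATIVE.match(part)
    if match:
        return {"kind": match.group(1), "index": int(match.group(2))}
    return {"name": part}


def parse_mapping(text):
    """Split '<name>=<expression>' into the mapping name and its targets."""
    name, separator, expression = text.partition("=")
    if not separator or not name or not expression:
        raise ValueError(f"expected <name>=<expression>, got {text!r}")
    # A comma separates targets; a dot separates the step from its parameter.
    targets = [
        [parse_reference(part) for part in target.split(".")]
        for target in expression.split(",")
    ]
    return name, targets


print(parse_mapping("input_file=step1.@input2"))
print(parse_mapping("output_file=@step1.my-output,@step2.step2s_output"))
```

The first call yields one target that mixes an absolute step name with a relative input position; the second yields two targets, one per child step.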
You can also set default values for mappings, which override the default values
of the parameters they’re pointing to, by using the --set/-s parameter, for
instance:
$ renku workflow compose --map input_file=step1.@input2 --set input_file=data.csv \
    my-composed-workflow step1 step2
This would lead to data.csv being used for the second input of
step1 when my-composed-workflow is executed (if it isn’t overridden
at execution time).
You can add a description to the mappings to make them more human-readable
by using the --describe-param/-p parameter, as shown here:
$ renku workflow compose --map input_file=step1.@input2 -p input_file="The dataset to process" \
    my-composed-workflow step1 step2
You can also expose all inputs, outputs or parameters of child steps by
using --map-inputs, --map-outputs or --map-params, respectively.
On execution, renku will automatically detect links between steps, if an input of one step uses the same path as an output of another step, and execute them in the correct order. Since this depends on what values are passed at runtime, you might want to enforce a certain order of steps by explicitly mapping outputs to inputs.
You can do that using the --link <source>=<sink> parameter, e.g.
--link step1.@output1=step2.@input1. This gets recorded on the
workflow template and forces step2.@input1 to always be set to the same
path as step1.@output1, irrespective of which values are passed at
execution time.
This way, you can ensure that the steps in your workflow are always executed in the correct order and that the dependencies between steps are modeled correctly.
Renku can also add links for you automatically based on the default values
of inputs and outputs, where inputs/outputs that have the same path get
linked in the composed run. To do this, pass the --link-all flag.
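The idea of linking steps through matching paths and deriving an execution order from those links can be sketched in Python as follows. This is an illustration only, not how Renku implements it: execution_order and the step dictionaries are hypothetical, and the standard-library graphlib module (Python 3.9 or later) is used for the ordering.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps):
    """Order steps so that the producer of a path runs before its consumers.

    `steps` maps a step name to {"inputs": [paths], "outputs": [paths]}.
    """
    producers = {path: name for name, io in steps.items() for path in io["outputs"]}
    # For each step, collect the steps whose outputs it reads.
    dependencies = {
        name: {producers[path] for path in io["inputs"] if path in producers}
        for name, io in steps.items()
    }
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # Cyclic dependencies cannot be ordered, so they are rejected.
        raise ValueError(f"steps form a cycle: {error.args[1]}") from error


steps = {
    "step2": {"inputs": ["intermediate"], "outputs": ["output"]},
    "step1": {"inputs": ["input"], "outputs": ["intermediate"]},
}
print(execution_order(steps))
```

Because the links are derived from paths, changing a path at execution time changes the derived order, which is the motivation for recording explicit links with --link.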
Warning
Due to workflows having to be directed acyclic graphs, cycles in the dependencies are not allowed. E.g. step1 depending on step2 depending on step1 is not allowed. Additionally, the flow of information has to be from outputs to inputs or parameters, so you cannot map an input to an output, only the other way around.
Values on inputs/outputs/parameters get set according to the following order of precedence (lowest precedence first):
- Default value on an input/output/parameter
- Default value on a mapping to the input/output/parameter
- Value passed to a mapping to the input/output/parameter
- Value passed to the input/output/parameter
- Value propagated to an input from the source of a workflow link
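The order of precedence listed above can be expressed as a short Python function in which later sources override earlier ones. This is a sketch for illustration; resolve_value and its argument names are hypothetical and None is used here to mean "not provided".

```python
def resolve_value(
    parameter_default=None,
    mapping_default=None,
    mapping_value=None,
    parameter_value=None,
    linked_value=None,
):
    """Return the effective value, letting later sources override earlier ones."""
    effective = None
    # Listed from lowest to highest precedence, as in the text above.
    for candidate in (
        parameter_default,
        mapping_default,
        mapping_value,
        parameter_value,
        linked_value,
    ):
        if candidate is not None:
            effective = candidate
    return effective


# A default set on a mapping overrides the default of the parameter itself.
print(resolve_value(parameter_default="A", mapping_default="data.csv"))
```

A value propagated through a link always wins, which matches the statement that a linked input is set to the path of its source irrespective of the values passed at execution time.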
Editing Plans
Plans can be edited in a limited fashion, but we do not allow structural changes, as that might cause issues with the reproducibility and provenance of the project. If you want to make structural changes (e.g. adding/removing parameters), we recommend you record a new plan instead.
You can change the name and description of Plans and of their parameters, as
well as the default values of the parameters, using the renku workflow edit
command:
$ renku workflow edit my-run --name new-run --description "my description" \
    --rename-param input-1=my-input --set my-input=other-file.txt \
    --describe-param my-input="My input parameter"
This would rename the Plan my-run to new-run, change its description,
rename its parameter input-1 to my-input, set the default of this
parameter to other-file.txt and set its description.
Option | Description |
---|---|
-n, --name | Plan’s name |
-d, --description | Plan’s description. |
-s, --set | Set default value for a parameter. Accepted format is ‘<name>=<value>’ |
-m, --map | Add a new mapping on the Plan. Accepted format is ‘<name>=<name or expression>’ |
-r, --rename-param | Rename a parameter. Accepted format is ‘<name>="new name"’ |
--describe-param | Add a description for a parameter. Accepted format is ‘<name>="description"’ |
-m, --metadata | Path to file containing custom JSON-LD metadata to be added to the Plan. |
Removing Plans
Sometimes you might want to discard a recorded Plan or reuse its name with a
new Plan. In these cases, you can delete the old plan using renku workflow
remove <plan name>. Once a Plan is removed, it doesn’t show up in most renku
workflow commands.
renku update ignores deleted Plans, but renku rerun will still rerun
them if needed, to ensure reproducibility.
Working with Runs
Listing Runs
To get a view of what commands have been executed in the project, you can use
the renku log --workflows command:
$ renku log --workflows
DATE TYPE DESCRIPTION
------------------- ---- -------------
2021-09-21 15:46:02 Run cp A C
2021-09-21 10:52:51 Run cp A B
Refer to the documentation of the renku log command for more details.
Visualizing Executions
You can visualize past Runs made with renku using the renku workflow visualize
command.
This will show a directed graph of executions and how they are connected. This
way you can see exactly how a file was generated and what steps it involved.
It also supports an interactive mode that lets you explore the graph in a more
detailed way.
$ renku run echo "input" > input
$ renku run cp input intermediate
$ renku run cp intermediate output
$ renku workflow visualize
╔════════════╗
║echo > input║
╚════════════╝
*
*
*
┌─────┐
│input│
└─────┘
*
*
*
╔═════════════════════╗
║cp input intermediate║
╚═════════════════════╝
*
*
*
┌────────────┐
│intermediate│
└────────────┘
*
*
*
╔══════════════════════╗
║cp intermediate output║
╚══════════════════════╝
*
*
*
┌──────┐
│output│
└──────┘
$ renku workflow visualize intermediate
╔════════════╗
║echo > input║
╚════════════╝
*
*
*
┌─────┐
│input│
└─────┘
*
*
*
╔═════════════════════╗
║cp input intermediate║
╚═════════════════════╝
*
*
*
┌────────────┐
│intermediate│
└────────────┘
$ renku workflow visualize --from intermediate
┌────────────┐
│intermediate│
└────────────┘
*
*
*
╔══════════════════════╗
║cp intermediate output║
╚══════════════════════╝
*
*
*
┌──────┐
│output│
└──────┘
You can also run in interactive mode using the --interactive flag.
$ renku workflow visualize --interactive
This will allow you to navigate between workflow executions and see details by pressing the <Enter> key.
If you want to process the output graph further, or export it, you can use
the --format option to specify an output format.
The following example generates the graph in the dot format. It can
be stored in a file or piped directly to any compatible tool. Here we
use the dot command line tool from graphviz to generate an SVG file.
$ renku workflow visualize --format dot <path> | dot -Tsvg > graph.svg
Use renku workflow visualize -h to see all available options.
Removing Runs
Renku allows you to undo a Run in a project by using renku workflow revert
<activity ID>. You can obtain <activity ID> from the renku log command.
If the reverted run generated some files, Renku either deletes these files (if
there are no earlier versions of them and they are not used in other
activities) or reverts them to their earlier versions. You can ask Renku to keep the
generated files and only delete the metadata by passing the --metadata-only
option.
Warning
Renku only checks the project’s runs/plans to see if files are used.
It doesn’t check, for example, whether files that are going to be deleted have
been added to a dataset. Make sure that the project doesn’t use such files in
other places, or always use the --metadata-only option when reverting a run.
If you want to delete a run along with its plan, use the --plan option.
This only deletes the plan if it’s not used by any other activity.
Renku won’t remove a run if there are downstream runs that depend on it. The
reason is that removing a run will break the link between its upstream and
downstream runs. If this is not an issue for you or if you want to delete the
downstream runs later, then pass the --force option to make Renku delete
the run anyway.
Input and output files
You can list input and output files generated in the repository by running
the renku workflow inputs and renku workflow outputs commands. Alternatively,
you can check whether all paths specified as arguments are input or output
files, respectively.
$ renku run wc < source.txt > result.wc
$ renku workflow inputs
source.txt
$ renku workflow outputs
result.wc
$ renku workflow outputs source.txt
$ echo $? # last command finished with an error code
1