renku update
Update outdated files created by the “run” command.
Commands and options
renku update
Update existing files by rerunning their outdated workflow.
renku update [OPTIONS] [PATHS]...
Options
- -a, --all
Update all outdated files (default).
- -n, --dry-run
Show a preview of plans that will be executed.
- -p, --provider <provider>
The workflow engine to use.
- Default: toil
- Options: toil | local | cwltool
- -c, --config <config file>
YAML file containing configuration for the provider.
- -i, --ignore-deleted
Ignore deleted paths.
- --skip-metadata-update
Do not update the metadata store for the execution.
Arguments
- PATHS
Optional argument(s)
Recreating outdated files
Renku stores the dependency information for each file in a project in its metadata.
When an update command is executed, Renku looks at the most recent execution of each workflow (a Plan and Activity combination) and checks which ones are outdated, i.e. at least one of their inputs has been modified. It then generates a minimal dependency graph for each outdated file stored in the repository, so that only the necessary steps are executed.
Assume that the following history for the file H exists.
      C---D---E
     /         \
A---B---F---G---H
The first example shows the situation where D is modified and files E and H become outdated.
      C--*D*--(E)
     /          \
A---B---F---G---(H)
** - modified
() - needs update
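The marking in the diagram above can be reproduced with a small graph traversal. The following is only an illustrative sketch, not Renku's implementation: the `GRAPH` dictionary and the `outdated_files` helper are hypothetical names, and the history is modeled as a plain mapping from each file to the files generated from it.

```python
from collections import deque

# Hypothetical in-memory model of the example history: each key is a file,
# each value lists the files generated directly from it.
GRAPH = {
    "A": ["B"],
    "B": ["C", "F"],
    "C": ["D"],
    "D": ["E"],
    "E": ["H"],
    "F": ["G"],
    "G": ["H"],
    "H": [],
}


def outdated_files(graph, modified):
    """Return every file downstream of a modified file (breadth-first)."""
    outdated = set()
    queue = deque(modified)
    while queue:
        current = queue.popleft()
        for child in graph[current]:
            if child not in outdated:
                outdated.add(child)
                queue.append(child)
    # Files marked as modified are reported as modified, not as outdated,
    # which matches the notation used in the diagram.
    return outdated - set(modified)


print(sorted(outdated_files(GRAPH, {"D"})))  # ['E', 'H']
```

Files upstream of D (A, B, C) and files on the other branch (F, G) are never visited, which is why they stay unmarked.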
In this situation, you can do one of three things:
Update all files:
$ renku update --all
Update only E:
$ renku update E
Update E and H:
$ renku update H
Note
If there are uncommitted changes, the command fails. Check git status for details.
In some cases it may be desirable to avoid updating the Renku metadata and to avoid committing this or any other change to the repository when the update command is run. In that case, pass the --skip-metadata-update flag to renku update.
Pre-update checks
In the next example, files A or B are modified, hence the majority of dependent files must be recreated.
        (C)--(D)--(E)
       /             \
*A*--*B*--(F)--(G)--(H)
To avoid recreating a large number of files that could be affected by a simple change to an input file, consider specifying a single file (e.g. renku update G). See also renku status.
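To see why naming a single file reduces the work, the example above can be modeled as follows. This is a rough sketch under assumptions, not Renku's actual algorithm: `PARENTS`, `ancestors`, and `files_to_recreate` are hypothetical names, and the set of outdated files is written out by hand from the diagram.

```python
# Hypothetical model of the example: file -> files it is generated from.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"],
    "F": ["B"], "G": ["F"], "H": ["E", "G"],
}


def ancestors(parents, target):
    """Collect every file the target depends on, directly or indirectly."""
    seen = set()
    stack = list(parents[target])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def files_to_recreate(parents, outdated, target):
    """Outdated files that must be regenerated to bring the target up to date."""
    return (ancestors(parents, target) | {target}) & outdated


# A and B are marked as modified in the diagram, so everything else is outdated.
OUTDATED = {"C", "D", "E", "F", "G", "H"}
print(sorted(files_to_recreate(PARENTS, OUTDATED, "G")))  # ['F', 'G']
print(sorted(files_to_recreate(PARENTS, OUTDATED, "H")))  # ['C', 'D', 'E', 'F', 'G', 'H']
```

In this model, asking for G touches only the F and G branch, while asking for H pulls in both branches.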
Update siblings
If a workflow step produces multiple output files, these outputs will always be updated together.
                (B)
               /
*A*--[step 1]--(C)
               \
                (D)
An attempt to update a single file would update its siblings as well.
The following commands will produce the same result.
$ renku update C
$ renku update B C D
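The equivalence of the two commands can be pictured as expanding the requested paths to all outputs of the step that generates them. The snippet below is a minimal sketch of that idea only; `STEP_OUTPUTS` and `expand_with_siblings` are hypothetical names and do not correspond to Renku internals.

```python
# Hypothetical mapping from a workflow step to the files it generates.
STEP_OUTPUTS = {"step 1": ["B", "C", "D"]}


def expand_with_siblings(step_outputs, requested):
    """Add every sibling output of the steps that generate the requested paths."""
    expanded = set(requested)
    for outputs in step_outputs.values():
        # If the step generates any requested path, all of its outputs are included.
        if expanded.intersection(outputs):
            expanded.update(outputs)
    return expanded


print(sorted(expand_with_siblings(STEP_OUTPUTS, {"C"})))            # ['B', 'C', 'D']
print(sorted(expand_with_siblings(STEP_OUTPUTS, {"B", "C", "D"})))  # ['B', 'C', 'D']
```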
Ignoring deleted paths
The update command regenerates any deleted files or directories. If you don’t want to regenerate deleted paths, pass --ignore-deleted to the update command. You can make this the default behavior by setting the update_ignore_delete config value for a project or globally:
$ renku config set [--global] update_ignore_delete True
Note that deleted paths are always regenerated if they have siblings or downstream dependencies that aren’t deleted.
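The rule for deleted paths can be written down as a small predicate. This is a sketch of the rule as described here, assuming the path is already known to be outdated. The function name `should_regenerate` and the `siblings` and `downstream` mappings are hypothetical and are not part of Renku's API.

```python
def should_regenerate(path, deleted, siblings, downstream, ignore_deleted):
    """Decide whether an outdated path is regenerated under the rule above."""
    # Existing paths, or any path when deleted paths are not ignored,
    # are regenerated as usual.
    if path not in deleted or not ignore_deleted:
        return True
    # A deleted path is still regenerated if any sibling or downstream
    # dependency has not been deleted.
    related = set(siblings.get(path, ())) | set(downstream.get(path, ()))
    return any(other not in deleted for other in related)


siblings = {"B": ["C", "D"]}
downstream = {}

# B was deleted, but its siblings C and D still exist.
print(should_regenerate("B", {"B"}, siblings, downstream, ignore_deleted=True))            # True
# B and all of its siblings were deleted.
print(should_regenerate("B", {"B", "C", "D"}, siblings, downstream, ignore_deleted=True))  # False
```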