Gateways¶
Renku uses several gateways to abstract away dependencies on external systems such as the database or git.
Interfaces¶
Interfaces that the Gateways implement.
Renku activity gateway interface.
- class renku.core.interface.activity_gateway.IActivityGateway[source]¶
Bases:
abc.ABC
Interface for the ActivityGateway.
- get_activities_by_generation(path, checksum=None)[source]¶
Return the list of all activities that generate a path.
- get_activities_by_usage(path, checksum=None)[source]¶
Return the list of all activities that use a path.
- get_downstream_activities(activity, max_depth=None)[source]¶
Get downstream activities that depend on this activity.
- get_downstream_activity_chains(activity)[source]¶
Get a list of tuples of all downstream paths of this activity.
- get_upstream_activities(activity, max_depth=None)[source]¶
Get upstream activities that this activity depends on.
- get_upstream_activity_chains(activity)[source]¶
Get a list of tuples of all upstream paths of this activity.
- remove(activity, keep_reference=True, force=False)[source]¶
Remove an activity from the storage.
- Parameters
activity (Activity) – The activity to be removed.
keep_reference (bool) – Whether to keep the activity in the activities index or not.
force (bool) – Force-delete the activity even if it has downstream activities.
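The traversal methods above can be pictured with a small in-memory stand-in. This sketch is not Renku's implementation: `ToyActivityGraph`, its `add_edge` helper, and the use of plain strings as activities are hypothetical names chosen for illustration. It assumes that `max_depth` limits the number of hops followed and that a "chain" is the tuple of activities along one downstream path.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple


class ToyActivityGraph:
    """Hypothetical in-memory stand-in; activities are plain strings."""

    def __init__(self) -> None:
        self._downstream: Dict[str, List[str]] = {}

    def add_edge(self, upstream: str, downstream: str) -> None:
        """Record that ``downstream`` depends on ``upstream``."""
        self._downstream.setdefault(upstream, []).append(downstream)
        self._downstream.setdefault(downstream, [])

    def get_downstream_activities(self, activity: str, max_depth: Optional[int] = None) -> Set[str]:
        """Breadth-first search that stops after ``max_depth`` hops, if given."""
        seen: Set[str] = set()
        queue = deque([(activity, 0)])
        while queue:
            current, depth = queue.popleft()
            if max_depth is not None and depth >= max_depth:
                continue
            for child in self._downstream.get(current, []):
                if child not in seen:
                    seen.add(child)
                    queue.append((child, depth + 1))
        return seen

    def get_downstream_activity_chains(self, activity: str) -> List[Tuple[str, ...]]:
        """Return every downstream path, of any length, as a tuple."""
        chains: List[Tuple[str, ...]] = []
        stack: List[Tuple[str, ...]] = [(child,) for child in self._downstream.get(activity, [])]
        while stack:
            chain = stack.pop()
            chains.append(chain)
            for child in self._downstream.get(chain[-1], []):
                if child not in chain:  # guard against cycles
                    stack.append(chain + (child,))
        return chains


graph = ToyActivityGraph()
graph.add_edge("a", "b")
graph.add_edge("b", "c")
graph.add_edge("a", "d")

direct = graph.get_downstream_activities("a", max_depth=1)
everything = graph.get_downstream_activities("a")
chains = graph.get_downstream_activity_chains("a")
```

The upstream variants would mirror this with the edge direction reversed.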
Renku database gateway interface.
- class renku.core.interface.database_gateway.IDatabaseGateway[source]¶
Bases:
abc.ABC
Gateway interface for basic database operations.
Renku dataset gateway interface.
- class renku.core.interface.dataset_gateway.IDatasetGateway[source]¶
Bases:
abc.ABC
Interface for the DatasetGateway.
External storage interface.
- class renku.core.interface.storage.FileHash(base_uri, path, hash=None, hash_type=None, modified_datetime=None)[source]¶
Bases:
object
The hash for a file at a specific location.
- property full_uri¶
Return the full uri to the file.
- class renku.core.interface.storage.IStorage(storage_scheme, provider, credentials, provider_configuration, provider_uri_convertor)[source]¶
Bases:
abc.ABC
Interface for the external storage handler.
- property credentials¶
Return the provider credentials for this storage handler.
- property provider¶
Return the dataset provider for this storage handler.
- property storage_scheme¶
Storage’s URI scheme.
- class renku.core.interface.storage.IStorageFactory[source]¶
Bases:
abc.ABC
Interface to get an external storage.
- abstract static get_storage(storage_scheme, provider, credentials, configuration, uri_convertor)[source]¶
Return a storage that handles provider.
- Parameters
storage_scheme (str) – Storage name.
provider (ProviderApi) – The backend provider.
credentials (ProviderCredentials) – Credentials for the provider.
configuration (Dict[str, str]) – Storage-specific configuration that is passed to the IStorage implementation.
uri_convertor (Callable[[str], str]) – A function that converts backend-specific URI to a URI that is usable by the IStorage implementation.
- Returns
An instance of IStorage.
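The factory interface above follows the abstract static method pattern. The sketch below reproduces only that pattern: `ToyStorage`, `ToyStorageFactoryInterface`, `ToyStorageFactory`, the `backend_uri` helper, and the example URI conversion are all hypothetical and do not describe how Renku's storage classes behave internally.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class ToyStorage:
    """Hypothetical storage handler; it only stores what the factory passes in."""

    def __init__(
        self,
        storage_scheme: str,
        provider: object,
        credentials: object,
        provider_configuration: Dict[str, str],
        provider_uri_convertor: Callable[[str], str],
    ) -> None:
        self.storage_scheme = storage_scheme
        self.provider = provider
        self.credentials = credentials
        self.provider_configuration = provider_configuration
        self.provider_uri_convertor = provider_uri_convertor

    def backend_uri(self, uri: str) -> str:
        """Hypothetical helper: apply the convertor supplied by the caller."""
        return self.provider_uri_convertor(uri)


class ToyStorageFactoryInterface(ABC):
    """Abstract factory: subclasses must provide a static ``get_storage``."""

    @staticmethod
    @abstractmethod
    def get_storage(storage_scheme, provider, credentials, configuration, uri_convertor) -> ToyStorage:
        raise NotImplementedError


class ToyStorageFactory(ToyStorageFactoryInterface):
    """Concrete factory that wires the arguments into a storage handler."""

    @staticmethod
    def get_storage(storage_scheme, provider, credentials, configuration, uri_convertor) -> ToyStorage:
        return ToyStorage(
            storage_scheme=storage_scheme,
            provider=provider,
            credentials=credentials,
            provider_configuration=configuration,
            provider_uri_convertor=uri_convertor,
        )


# All argument values below are placeholders, not real provider objects.
storage = ToyStorageFactory.get_storage(
    storage_scheme="s3",
    provider="toy-provider",
    credentials={"access_key_id": "placeholder"},
    configuration={"type": "s3"},
    uri_convertor=lambda uri: uri.replace("s3://", "s3:", 1),
)
converted = storage.backend_uri("s3://bucket/data.csv")
```

Because `get_storage` is static, callers need no factory instance, and code can depend on the interface while the concrete factory is swapped out (for example in tests).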
Renku plan gateway interface.
- class renku.core.interface.plan_gateway.IPlanGateway[source]¶
Bases:
abc.ABC
Interface for the PlanGateway.
Renku project gateway interface.
Implementations¶
Implementation of Gateway interfaces.
Renku activity database gateway implementation.
- class renku.infrastructure.gateway.activity_gateway.ActivityGateway[source]¶
Bases:
renku.core.interface.activity_gateway.IActivityGateway
Gateway for activity database operations.
- get_activities_by_generation(path, checksum=None)[source]¶
Return the list of all activities that generate a path.
- get_activities_by_usage(path, checksum=None)[source]¶
Return the list of all activities that use a path.
- get_downstream_activities(activity, max_depth=None)[source]¶
Get downstream activities that depend on this activity.
- get_downstream_activity_chains(activity)[source]¶
Get a list of tuples of all downstream paths of this activity.
- get_upstream_activities(activity, max_depth=None)[source]¶
Get upstream activities that this activity depends on.
- get_upstream_activity_chains(activity)[source]¶
Get a list of tuples of all upstream paths of this activity.
- remove(activity, keep_reference=True, force=False)[source]¶
Remove an activity from the storage.
- Parameters
activity (Activity) – The activity to be removed.
keep_reference (bool) – Whether to keep the activity in the activities index or not.
force (bool) – Force-delete the activity even if it has downstream activities.
- renku.infrastructure.gateway.activity_gateway.reindex_catalog(database)[source]¶
Clear and re-create database’s activity-catalog and its relations.
Renku generic database gateway implementation.
- class renku.infrastructure.gateway.database_gateway.ActivityDownstreamRelation(downstream, upstream)[source]¶
Bases:
object
Implementation of Downstream interface.
- class renku.infrastructure.gateway.database_gateway.DatabaseGateway[source]¶
Bases:
renku.core.interface.database_gateway.IDatabaseGateway
Gateway for base database operations.
- renku.infrastructure.gateway.database_gateway.dump_activity(activity, catalog, cache)[source]¶
Get storage token for an activity.
- renku.infrastructure.gateway.database_gateway.dump_downstream_relations(relation, catalog, cache)[source]¶
Dump relation entry to database.
- renku.infrastructure.gateway.database_gateway.initialize_database(database)[source]¶
Initialize an empty database with all required metadata.
- renku.infrastructure.gateway.database_gateway.load_activity(token, catalog, cache)[source]¶
Load activity from storage token.
- renku.infrastructure.gateway.database_gateway.load_downstream_relations(token, catalog, cache)[source]¶
Load relation entry from database.
Renku dataset gateway interface.
- class renku.infrastructure.gateway.dataset_gateway.DatasetGateway[source]¶
Bases:
renku.core.interface.dataset_gateway.IDatasetGateway
Gateway for dataset database operations.
Storage factory implementation.
- class renku.infrastructure.storage.factory.StorageFactory[source]¶
Bases:
renku.core.interface.storage.IStorageFactory
Return an external storage.
- static get_storage(storage_scheme, provider, credentials, configuration, uri_convertor)[source]¶
Return a storage that handles provider.
- Parameters
storage_scheme (str) – Storage name.
provider (ProviderApi) – The backend provider.
credentials (ProviderCredentials) – Credentials for the provider.
configuration (Dict[str, str]) – Storage-specific configuration that is passed to the IStorage implementation.
uri_convertor (Callable[[str], str]) – A function that converts backend-specific URI to a URI that is usable by the IStorage implementation.
- Returns
An instance of IStorage.
Base storage handler.
- class renku.infrastructure.storage.rclone.RCloneStorage(storage_scheme, provider, credentials, provider_configuration, provider_uri_convertor)[source]¶
Bases:
renku.core.interface.storage.IStorage
External storage implementation that uses RClone.
- get_hashes(uri, hash_type='md5')[source]¶
Download hashes with rclone and parse them.
Returns a tuple containing a list of parsed hashes.
- Parameters
uri (str) – Provider uri.
hash_type (str) – Type of hash to get from rclone (Default value = md5).
Example
hashes_raw json:
[
  {
    "Path": "resources/hg19.window.masker.bed.gz.tbi",
    "Name": "hg19.window.masker.bed.gz.tbi",
    "Size": 578288,
    "MimeType": "application/x-gzip",
    "ModTime": "2022-02-07T18:45:52.000000000Z",
    "IsDir": false,
    "Hashes": {"md5": "e93ac5364e7799bbd866628d66c7b773"},
    "Tier": "STANDARD"
  }
]
- run_command(command, *args, **kwargs)[source]¶
Run a RClone command with storage-specific configuration.
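The `hashes_raw` example shown for `get_hashes` can be parsed along the following lines. This is a minimal sketch under stated assumptions: `ToyFileHash` and `parse_hashes` are hypothetical names, the `s3://example-bucket` base URI is a placeholder, and joining `base_uri` and `path` with a slash is only a guess at what `full_uri` might return.

```python
import json
from typing import List, NamedTuple, Optional


class ToyFileHash(NamedTuple):
    """Hypothetical stand-in that mirrors the FileHash constructor fields."""

    base_uri: str
    path: str
    hash: Optional[str] = None
    hash_type: Optional[str] = None
    modified_datetime: Optional[str] = None

    @property
    def full_uri(self) -> str:
        # Assumption: the full URI is the base URI joined with the path.
        return f"{self.base_uri.rstrip('/')}/{self.path}"


def parse_hashes(hashes_raw: str, base_uri: str, hash_type: str = "md5") -> List[ToyFileHash]:
    """Turn the JSON listing into one record per file, skipping directories."""
    result = []
    for entry in json.loads(hashes_raw):
        if entry.get("IsDir"):
            continue
        result.append(
            ToyFileHash(
                base_uri=base_uri,
                path=entry["Path"],
                hash=entry.get("Hashes", {}).get(hash_type),
                hash_type=hash_type,
                modified_datetime=entry.get("ModTime"),
            )
        )
    return result


hashes_raw = """
[
  {
    "Path": "resources/hg19.window.masker.bed.gz.tbi",
    "Name": "hg19.window.masker.bed.gz.tbi",
    "Size": 578288,
    "MimeType": "application/x-gzip",
    "ModTime": "2022-02-07T18:45:52.000000000Z",
    "IsDir": false,
    "Hashes": {"md5": "e93ac5364e7799bbd866628d66c7b773"},
    "Tier": "STANDARD"
  }
]
"""

hashes = parse_hashes(hashes_raw, base_uri="s3://example-bucket")
```

Entries without a hash of the requested type end up with `hash=None` rather than raising, which keeps the parser tolerant of backends that do not report every hash type.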
- renku.infrastructure.storage.rclone.get_rclone_env_var_name(provider_name, name)[source]¶
Get name of an RClone env var config.
- renku.infrastructure.storage.rclone.run_rclone_command(command, *args, env=None, **kwargs)[source]¶
Execute an RClone command.
- renku.infrastructure.storage.rclone.transform_args(*args)[source]¶
Transforms args to command line args.
- renku.infrastructure.storage.rclone.transform_kwargs(**kwargs)[source]¶
Transforms kwargs to command line args.
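The helper functions above are documented only by one-line summaries, so the following is a sketch of what such helpers could look like, not a description of Renku's code. The names `toy_rclone_env_var_name` and `toy_transform_kwargs` are hypothetical. The environment variable shape follows rclone's documented `RCLONE_CONFIG_<REMOTE>_<OPTION>` convention; the keyword-to-flag rules (underscores become dashes, `True` becomes a bare flag, `False` and `None` are dropped) are assumptions.

```python
from typing import List


def toy_rclone_env_var_name(provider_name: str, name: str) -> str:
    """Build an rclone-style config env var name for a remote option."""
    return f"RCLONE_CONFIG_{provider_name}_{name}".upper()


def toy_transform_kwargs(**kwargs) -> List[str]:
    """Convert keyword arguments to command line flags (assumed rules)."""
    args: List[str] = []
    for key, value in kwargs.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            args.append(flag)  # boolean switch without a value
        elif value is not False and value is not None:
            args.extend([flag, str(value)])
    return args


env_name = toy_rclone_env_var_name("s3", "access_key_id")
cli_args = toy_transform_kwargs(hash_type="md5", recursive=True, dry_run=False)
```

Passing configuration through environment variables rather than command line flags keeps credentials out of the process argument list.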
Renku plan database gateway implementation.
- class renku.infrastructure.gateway.plan_gateway.PlanGateway[source]¶
Bases:
renku.core.interface.plan_gateway.IPlanGateway
Gateway for plan database operations.
Renku project gateway interface.
Repository¶
Renku uses git repositories for tracking changes. To abstract away git internals,
we delegate all git calls to the Repository class.
An abstraction layer for the underlying VCS.
- class renku.infrastructure.repository.Actor(name, email)[source]¶
Bases:
NamedTuple
Author/creator of a commit.
Create new instance of Actor(name, email)
- email¶
Alias for field number 1
- name¶
Alias for field number 0
- class renku.infrastructure.repository.BaseRepository(path='.', repository=None)[source]¶
Bases:
object
Abstract Base repository.
- property active_branch¶
Return current checked out branch.
- property all_files¶
Return absolute paths of all files in the index and untracked files.
- property branches¶
Return all branches.
- commit(message, *, amend=False, author=None, committer=None, no_verify=False, no_edit=False, paths=None)[source]¶
Commit added files to the VCS.
- copy_content_to_file(path, *, revision=None, checksum=None, output_path=None, apply_filters=True)[source]¶
Get content of an object using its checksum, write it to a file, and return the file’s path.
- Parameters
path (Union[Path, str]) – Relative or absolute path to the file.
revision (Optional[Union[Reference, str]]) – A commit/branch/tag to get the file from. This cannot be passed with checksum.
checksum (Optional[str]) – Git hash of the file to be retrieved. This cannot be passed with revision.
output_path (Optional[Union[Path, str]]) – A path to copy the content to. A temporary file is created if it is None.
apply_filters (bool) – Whether to apply Git filter on the retrieved object. Note that apply_filters still works if repository is cloned with --skip-smudge or if GIT_LFS_SKIP_SMUDGE is set. It also works if there is no entry for the file in .gitattributes (e.g. when a file was deleted). The reason is that we use git lfs smudge command to get the file content if this option is passed and we also disable GIT_LFS_SKIP_SMUDGE.
- Returns
The path to the created file.
- create_worktree(path, reference, branch=None, checkout=True, detach=False)[source]¶
Create a git worktree.
- Parameters
path (Path) – Target folder.
reference (Union[Branch, Commit, Reference, str]) – The reference to base the tree on.
branch (str, optional) – Optional new branch to create in the worktree.
checkout (bool, optional) – Whether to perform a checkout of the reference (Default value = True).
detach (bool, optional) – Whether to detach HEAD in worktree (Default value = False).
- fetch(remote=None, refspec=None, all=False, tags=False, unshallow=False, depth=None)[source]¶
Update remote branches.
- property files¶
Return a list of all files in the current version of the repository.
- get_attributes(*paths)[source]¶
Return a map from paths to their attributes.
NOTE: Dict keys are the same relative or absolute path as inputs.
- get_configuration(writable=False, scope=None)[source]¶
Return git configuration.
NOTE: Scope can be “global” or “local”.
- get_content(path, *, revision=None, checksum=None, binary=False)[source]¶
Get content of a file in a given revision as text or binary.
- get_existing_paths_in_revision(paths=None, revision='HEAD')[source]¶
List all paths that exist in a revision.
- get_ignored_paths(*paths)[source]¶
Return ignored paths matching .gitignore file.
NOTE: This function returns paths in the same form as its inputs: if an input is an absolute path, the output is an absolute path; the same is true for relative paths.
NOTE: Relative paths should be relative to the current working directory and not the repository’s root.
- get_object_hash(path, revision=None)[source]¶
Return git hash of an object in a Repo or its submodule.
NOTE: path must be relative to the repo’s root regardless if this function is called from a subdirectory or not.
- get_object_hashes(paths, revision=None)[source]¶
Return git hashes of objects in a Repo or its submodule.
NOTE: paths must be relative to the repo’s root regardless if this function is called from a subdirectory or not.
- get_previous_commit(path, revision=None, first=False, full_history=True, submodule=False)[source]¶
Return a previous commit for a given path starting from revision.
- get_raw_content(*, path, revision=None, checksum=None)[source]¶
Get raw content of a file in a given revision as text without applying any filter on it.
- get_revisions_paths(*checksums)[source]¶
Return a revision:path tuple for each checksum so that revision contains the given blob with the checksum.
- static hash_object(path)[source]¶
Create a git hash for a path. The path doesn’t need to be in a repository.
- static hash_objects(paths)[source]¶
Create a git hash for a list of paths. The paths don’t need to be in a repository.
- property head¶
HEAD of the repository.
- is_dirty(untracked_files=True)[source]¶
Return True if the repository has modified or untracked files ignoring submodules.
- iterate_commits(*paths, revision=None, reverse=False, full_history=False, max_count=-1)[source]¶
Return a list of commits.
- property lfs¶
Return a Git LFS manager.
- property path¶
Absolute path to the repository’s root.
- push(remote=None, refspec=None, *, no_verify=False, set_upstream=False, delete=False, force=False)[source]¶
Push local changes to a remote repository.
- property remotes¶
Return all remotes.
- remove(*paths, index=False, not_exists_ok=False, recursive=False, force=False)[source]¶
Remove paths from repository or index.
- property staged_changes¶
Return a list of staged changes.
NOTE: This can be implemented by git diff --cached --name-status -z.
- property submodules¶
Return a list of submodules.
- property tags¶
Return all available tags.
- property unmerged_blobs¶
Return a map of path to stage and blob for unmerged blobs in the current index.
- property unstaged_changes¶
Return a list of changes that are not staged.
- property untracked_files¶
Return the list of untracked files.
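As background for `hash_object` and `hash_objects`, a git blob hash can be computed without a repository because it depends only on the file content. The sketch below uses git's standard blob format for SHA-1 repositories (a `blob <size>` header, a NUL byte, then the content). The function name `git_blob_hash` is hypothetical, and whether Renku computes the hash this way or delegates to git is not stated in this reference; the sketch also ignores any git filters such as LFS or line-ending conversion.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 git blob hash for raw, unfiltered content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known hash in every SHA-1 git repository.
empty_hash = git_blob_hash(b"")  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Since the size is part of the header, two files hash identically only if their bytes match exactly, which is why the path does not need to be tracked.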
- class renku.infrastructure.repository.Branch(repository, path)[source]¶
Bases:
renku.infrastructure.repository.Reference
A git branch.
- property remote_branch¶
Return the remote branch if any.
- class renku.infrastructure.repository.BranchManager(repository)[source]¶
Bases:
object
Manage branches of a Repository.
- class renku.infrastructure.repository.Commit(repository, commit)[source]¶
Bases:
object
A VCS commit.
- property author¶
Author of the commit.
- property authored_datetime¶
Commit authored date.
- property committed_datetime¶
Commit date.
- property committer¶
Committer of the commit.
- get_changes(*paths, commit=None, patch=False)[source]¶
Return list of changes in a commit.
NOTE: This function can be implemented with git diff-tree.
NOTE: When patch is False, Diff.diff will be empty. We need to call Commit.diff twice when patch is True because GitPython won’t set Diff.change_type in this case.
- property hexsha¶
Commit sha.
- property message¶
Commit message.
- property parents¶
List of commit parents.
- property root¶
Return True if this commit is the root commit.
- property tree¶
Return all objects in the commit’s tree.
- class renku.infrastructure.repository.Configuration(repository=None, scope=None, writable=True)[source]¶
Bases:
object
Git configuration manager.
- class renku.infrastructure.repository.Diff(a_path, b_path, change_type, diff)[source]¶
Bases:
NamedTuple
A single diff object between two trees.
Create new instance of Diff(a_path, b_path, change_type, diff)
- a_path¶
Alias for field number 0
- property added¶
True if file was added.
- b_path¶
Alias for field number 1
- change_type¶
Alias for field number 2
- property deleted¶
True if file was deleted.
- diff¶
Alias for field number 3
- class renku.infrastructure.repository.DiffChangeType(value)[source]¶
Bases:
enum.Enum
Type of change in a Diff.
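The "Alias for field number N" entries on Diff come from its NamedTuple base: each field is reachable both by name and by position. The sketch below illustrates that together with the `added` and `deleted` properties. `ToyDiff` and `ToyChangeType` are hypothetical stand-ins, and the enum members and their values are assumptions modeled on git's status letters, not the actual members of DiffChangeType.

```python
from enum import Enum
from typing import NamedTuple


class ToyChangeType(Enum):
    """Hypothetical change types; values mimic git status letters."""

    ADDED = "A"
    DELETED = "D"
    MODIFIED = "M"


class ToyDiff(NamedTuple):
    """Hypothetical stand-in with the same four fields as Diff."""

    a_path: str
    b_path: str
    change_type: ToyChangeType
    diff: str = ""

    @property
    def added(self) -> bool:
        return self.change_type is ToyChangeType.ADDED

    @property
    def deleted(self) -> bool:
        return self.change_type is ToyChangeType.DELETED


change = ToyDiff("data.csv", "data.csv", ToyChangeType.ADDED)

# Positional and named access return the same object.
same_field = change[2] is change.change_type
a_path, b_path, change_type, patch = change  # tuples unpack in field order
```

Deriving `added` and `deleted` from `change_type` keeps the tuple itself minimal while still giving callers readable checks.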
- class renku.infrastructure.repository.DiffLine(text, change_type)[source]¶
Bases:
NamedTuple
A single line in a patch.
Create new instance of DiffLine(text, change_type)
- property added¶
True if line was added.
- change_type¶
Alias for field number 1
- property deleted¶
True if line was deleted.
- text¶
Alias for field number 0
- class renku.infrastructure.repository.DiffLineChangeType(value)[source]¶
Bases:
enum.Enum
Type of change in a DiffLine.
- class renku.infrastructure.repository.Object(path, type, size, hexsha)[source]¶
Bases:
NamedTuple
Represent a git object.
Create new instance of Object(path, type, size, hexsha)
- hexsha¶
Alias for field number 3
- path¶
Alias for field number 0
- size¶
Alias for field number 2
- type¶
Alias for field number 1
- class renku.infrastructure.repository.Reference(repository, path)[source]¶
Bases:
object
A git reference.
- property commit¶
Commit pointed to by the reference.
- property name¶
Reference name.
- property path¶
Reference path.
- class renku.infrastructure.repository.Remote(repository, name)[source]¶
Bases:
object
Remote of a Repository.
- property head¶
The head commit of the remote.
- property name¶
Remote’s name.
- property references¶
Return a list of remote references.
- property url¶
Remote’s URL.
- class renku.infrastructure.repository.RemoteManager(repository)[source]¶
Bases:
object
Manage remotes of a Repository.
- class renku.infrastructure.repository.RemoteReference(repository, path)[source]¶
Bases:
renku.infrastructure.repository.Reference
A git remote reference.
- property remote¶
Return reference’s remote.
- class renku.infrastructure.repository.Repository(path='.', search_parent_directories=False, repository=None)[source]¶
Bases:
renku.infrastructure.repository.BaseRepository
Abstract Base repository.
- class renku.infrastructure.repository.Submodule(parent, name, path, url)[source]¶
Bases:
renku.infrastructure.repository.BaseRepository
A git submodule.
- property name¶
Return submodule’s name.
- property relative_path¶
Submodule’s path relative to its parent repository.
- property url¶
Return submodule’s url.
- class renku.infrastructure.repository.SubmoduleManager(repository)[source]¶
Bases:
object
Manage submodules of a Repository.
- class renku.infrastructure.repository.SymbolicReference(repository, path)[source]¶
Bases:
renku.infrastructure.repository.Reference
A git symbolic reference.
- property reference¶
Return the reference that this object points to.
- class renku.infrastructure.repository.Tag(repository, path)[source]¶
Bases:
renku.infrastructure.repository.Reference
A git tag.
- property commit¶
Return the commit the tag refers to.
- class renku.infrastructure.repository.TagManager(repository)[source]¶
Bases:
object
Manage tags of a Repository.