Database

Renku keeps its metadata in an internal database stored under .renku/metadata. The database is a custom implementation of the ZODB object database and uses a separate file per main entity.

Database

Custom database for storing Persistent objects.

class renku.infrastructure.database.Cache[source]

Database Cache.

clear()[source]

Remove all entries.

get(oid, default=None)[source]

See IPickleCache.

Parameters:
  • oid – The oid of the object to get.

  • default – Default value to return if object wasn’t found (Default value = None).

Returns:

The object or default value if the object wasn’t found.

new_ghost(oid, object)[source]

See IPickleCache.

pop(oid, default=<object object>)[source]

Remove and return an object.

Parameters:
  • oid – The oid of the object to remove from the cache.

  • default – Default value to return (Default value = MARKER).

Raises:

KeyError – If object wasn’t found and no default was given.

Returns:

The removed object or the default value if it doesn’t exist.
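
The `default=<object object>` in the signature is a sentinel (called MARKER in the parameter description). It lets pop tell "no default given" apart from an explicit default of None. A minimal sketch of that pattern follows; `SketchCache` and its `set` helper are hypothetical stand-ins backed by a plain dict, not Renku's Cache:

```python
# Sentinel object: distinguishes "no default passed" from default=None.
MARKER = object()


class SketchCache:
    """Hypothetical stand-in for illustration; not Renku's Cache."""

    def __init__(self):
        self._entries = {}

    def set(self, oid, obj):
        # Hypothetical helper used only to fill the sketch with data.
        self._entries[oid] = obj

    def get(self, oid, default=None):
        return self._entries.get(oid, default)

    def pop(self, oid, default=MARKER):
        if default is MARKER:
            # No default supplied: a missing oid raises KeyError.
            return self._entries.pop(oid)
        return self._entries.pop(oid, default)


cache = SketchCache()
cache.set("abc", {"name": "example"})
removed = cache.pop("abc")        # entry exists, so it is returned
missing = cache.pop("abc", None)  # entry is gone, so the default is returned
```

Calling `cache.pop("abc")` once more without a default would raise KeyError, matching the Raises entry above.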

class renku.infrastructure.database.Database(storage)[source]

The Metadata Object Database.

This class is equivalent to a persistent.DataManager and implements the persistent.interfaces.IPersistentDataManager interface.

add(object, oid)[source]

Add a new object to the database.

NOTE: Normally, we add objects to indexes but this method adds objects directly to the database’s root. Use it only for singleton objects that have no Index defined for them (e.g. Project).

Parameters:
  • object (persistent.Persistent) – The object to add.

  • oid (OID_TYPE, optional) – The oid for the object (Default value = None).

add_index(name, object_type, attribute=None, key_type=None)[source]

Add an index.

Parameters:
  • name (str) – The name of the index.

  • object_type (type) – The type contained within the index.

  • attribute (str, optional) – The attribute of the contained object to create a key from (Default value = None).

  • key_type (type, optional) – The type of the key (Default value = None).

Returns:

The created Index object.

Return type:

Index

add_root_object(name, obj)[source]

Add an object to the DB root.

Parameters:
  • name (str) – The key of the object.

  • obj (Persistent) – The object to store.

clear()[source]

Remove all objects and clear all caches. Objects won’t be deleted in the storage.

commit()[source]

Commit modified and new objects.

classmethod from_path(path)[source]

Create a Storage and Database using the given path.

Parameters:

path (Union[pathlib.Path, str]) – The path of the database.

Returns:

The database object.

static generate_oid(object)[source]

Generate an oid for a persistent.Persistent object based on its id.

Parameters:

object (persistent.Persistent) – The object to create an oid for.

Returns:

An oid for the object.

get(oid)[source]

Get the object by oid.

Parameters:

oid (OID_TYPE) – The oid of the object to get.

Returns:

The object.

Return type:

persistent.Persistent

get_by_id(id)[source]

Return an object by its id.

Parameters:

id (str) – The id to look up.

Returns:

The object with the given id.

Return type:

persistent.Persistent

get_cached(oid)[source]

Return an object if it is in the cache or will be committed.

Parameters:

oid (OID_TYPE) – The oid of the object to look up.

Returns:

The cached object.

Return type:

Optional[persistent.Persistent]

get_from_path(path, absolute=False, override_type=None)[source]

Load a database object from a path.

Parameters:
  • path (str) – Path of the database object.

  • absolute (bool) – Whether the path is absolute or a filename inside the database (Default value = False).

  • override_type (Optional[str]) – load object as a different type than what is set inside renku_data_type (Default value = None).

Returns:

The object.

Return type:

persistent.Persistent

static hash_id(id)[source]

Return oid from id.

Parameters:

id (str) – The id to hash.

Returns:

The hashed id.

Return type:

OID_TYPE
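
Because hash_id derives the oid from the id, the same id always resolves to the same stored object. The sketch below illustrates only that deterministic mapping. The function name and the choice of SHA-256 are assumptions for illustration; this reference does not state which hash function or oid format Renku uses.

```python
import hashlib


def sketch_hash_id(id: str) -> str:
    """Hypothetical stand-in: map an id string to a fixed-length oid."""
    # Assumption: SHA-256 hex digest; the real hash function may differ.
    return hashlib.sha256(id.encode("utf-8")).hexdigest()


oid_a = sketch_hash_id("/projects/example")
oid_b = sketch_hash_id("/projects/example")
oid_c = sketch_hash_id("/projects/other")
```

Here `oid_a` and `oid_b` are identical, while `oid_c` differs, so an oid can be recomputed from an id without a lookup table.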

new_ghost(oid, object)[source]

Create a new ghost object.

Parameters:
  • oid (OID_TYPE) – The oid of the new ghost object.

  • object (persistent.Persistent) – The object to create a new ghost entry for.

static new_oid()[source]

Generate a random oid.

oldstate(object, tid)[source]

See persistent.interfaces.IPersistentDataManager::oldstate.

persist_to_path(object, path)[source]

Store an object to path.

readCurrent(object)[source]

We don’t use this method but some Persistent logic requires its existence.

Parameters:

object – The object to read.

register(object)[source]

Register a persistent.Persistent object to be stored.

NOTE: When a persistent.Persistent object is changed it calls this method.

Parameters:

object (persistent.Persistent) – The object to register with the database.
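
The interplay between register and commit can be pictured as a change-tracking loop: a changed object reports itself to its data manager, and the manager writes all reported objects on commit. The sketch below uses hypothetical stand-in classes (`SketchManager`, `SketchObject`) and an in-memory dict instead of real storage; it is not how Renku or the persistent package implement this.

```python
class SketchManager:
    """Hypothetical data manager that collects changed objects."""

    def __init__(self):
        self.pending = []
        self.stored = {}

    def register(self, obj):
        # Avoid queuing the same object twice before a commit.
        if not any(obj is p for p in self.pending):
            self.pending.append(obj)

    def commit(self):
        # Write a snapshot of every registered object, then reset the queue.
        for obj in self.pending:
            self.stored[obj.oid] = dict(obj.state)
        self.pending.clear()


class SketchObject:
    """Hypothetical object that reports its own changes."""

    def __init__(self, oid, manager):
        self.oid = oid
        self.manager = manager
        self.state = {}

    def update(self, key, value):
        self.state[key] = value
        # Every change notifies the manager, mirroring the NOTE above.
        self.manager.register(self)


manager = SketchManager()
obj = SketchObject("oid-1", manager)
obj.update("name", "a")
obj.update("name", "b")
pending_count = len(manager.pending)
manager.commit()
```

Two updates lead to a single pending entry, and only the final state is written on commit.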

remove_from_cache(object)[source]

Remove an object from cache.

Parameters:

object (persistent.Persistent) – The object to remove.

remove_root_object(name)[source]

Remove a root object from the database.

Parameters:

name (str) – The name of the root object to remove.

setstate(object)[source]

Load the state for a ghost object.

Parameters:

object (persistent.Persistent) – The object to set the state on.

class renku.infrastructure.database.Index(*args, **kwargs)[source]

Database index.

Create an index whose keys are either extracted from an object using attribute or provided explicitly.

Parameters:
  • name (str) – Index’s name.

  • object_type – Type of objects that the index points to.

  • attribute (Optional[str], optional) – Name of an attribute to be used to automatically generate a key (e.g. entity.path).

  • key_type – Type of keys. If not None then a key must be provided when updating the index (Default value = None).

add(object, *, key=None, key_object=None, verify=True)[source]

Update index with object.

If Index._attribute is not None then key is automatically generated. Key is extracted from key_object if it is not None; otherwise, it’s extracted from object.

Parameters:
  • object (persistent.Persistent) – Object to add.

  • key (Optional[str], optional) – Key to use in the index (Default value = None).

  • key_object – Object to use to extract a key from (Default value = None).

  • verify – Whether to check if the key is valid (Default value = True).

generate_key(object, *, key_object=None)[source]

Return index key for an object.

Key is extracted from key_object if it is not None; otherwise, it’s extracted from object.

Parameters:
  • object (persistent.Persistent) – The object to generate a key for.

  • key_object – The object to derive a key from (Default value = None).

Returns:

A key for object.
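
The rule "use key_object if it is not None, otherwise use object" can be sketched as follows. The function takes the attribute name as an explicit argument, and the `plan` and `activity` objects are hypothetical; in Index the attribute comes from the index itself.

```python
from types import SimpleNamespace


def sketch_generate_key(obj, attribute, key_object=None):
    """Hypothetical stand-in for the key-extraction rule described above."""
    # Prefer key_object when it is given; fall back to the object itself.
    source = key_object if key_object is not None else obj
    return getattr(source, attribute)


plan = SimpleNamespace(id="/plans/1", name="train")
activity = SimpleNamespace(id="/activities/9", association=plan)

key_from_object = sketch_generate_key(activity, "id")
key_from_key_object = sketch_generate_key(activity, "id", key_object=plan)
```

The second call stores `activity` under a key taken from `plan`, which is the case key_object exists for.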

get(key, default=None)[source]

Return an entry based on its key.

Parameters:
  • key – The key of the entry to get.

  • default – Default value to return if entry wasn’t found (Default value = None).

Returns:

The found entry or the default value if it wasn’t found.

items()[source]

Return an iterator of keys and values.

keys(min=None, max=None, excludemin=False, excludemax=False)[source]

Return an iterator of keys.
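
The min, max, excludemin and excludemax parameters are not described here. Assuming they follow the BTrees convention for range queries (both bounds inclusive unless excluded), the behaviour can be sketched over a plain list of keys; `sketch_keys` is a hypothetical helper, not part of Index.

```python
def sketch_keys(keys, min=None, max=None, excludemin=False, excludemax=False):
    """Hypothetical range filter; parameter names mirror the signature above."""
    for key in sorted(keys):
        # Skip keys below the lower bound, or equal to it when excluded.
        if min is not None and (key < min or (excludemin and key == min)):
            continue
        # Skip keys above the upper bound, or equal to it when excluded.
        if max is not None and (key > max or (excludemax and key == max)):
            continue
        yield key


all_keys = ["a", "b", "c", "d"]
inclusive = list(sketch_keys(all_keys, min="b", max="d"))
half_open = list(sketch_keys(all_keys, min="b", max="d", excludemax=True))
```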

property name

Return Index’s name.

property object_type

Return Index’s object_type.

pop(key, default=<object object>)[source]

Remove and return an object.

Parameters:
  • key – The key of the entry to remove.

  • default – Default value to return if entry wasn’t found (Default value = MARKER).

Returns:

The removed entry or the default value if it wasn’t found.

remove(object, *, key=None, key_object=None, verify=True)[source]

Remove object from the index.

If Index._attribute is not None then key is automatically generated. Key is extracted from key_object if it is not None; otherwise, it’s extracted from object.

Parameters:
  • object (persistent.Persistent) – Object to remove.

  • key (Optional[str], optional) – Key to use in the index (Default value = None).

  • key_object – Object to use to extract a key from (Default value = None).

  • verify – Whether to check if the key is valid (Default value = True).

values()[source]

Return an iterator of values.

renku.infrastructure.database.MARKER = <object object>

These are used as _p_serial to mark if an object was read from storage or is new

class renku.infrastructure.database.ObjectReader(database)[source]

Deserialize objects loaded from storage.

deserialize(data)[source]

Convert JSON to Persistent object.

Parameters:

data – Data to deserialize.

Returns:

Deserialized object.

set_ghost_state(object, data)[source]

Set state of a Persistent ghost object.

Parameters:
  • object – The ghost object to set the state on.

  • data – The data to set as the object’s state.

class renku.infrastructure.database.ObjectWriter(database)[source]

Serialize objects for storage.

serialize(object)[source]

Convert an object to JSON.

Parameters:

object (persistent.Persistent) – Object to serialize.

Returns:

Dictionary containing serialized data.

Return type:

dict

class renku.infrastructure.database.RenkuOOBTree(*args)[source]

Customize BTrees.OOBTree.BTree implementation.

class renku.infrastructure.database.Storage(path)[source]

Store Persistent objects on the disk.

load(filename, absolute=False)[source]

Load the data stored under the given file name.

Parameters:
  • filename (str) – The file name of the data to load.

  • absolute (bool) – Whether the path is absolute or a filename inside the database (Default value = False).

Returns:

The loaded data in dictionary form.

store(filename, data, compress=False, absolute=False)[source]

Store object.

Parameters:
  • filename (str) – Target file name to store data in.

  • data (Union[Dict, List]) – The data to store.

  • compress (bool) – Whether to compress the data or store it as plain JSON (Default value = False).

  • absolute (bool) – Whether filename is an absolute path (Default value = False).
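
A store/load round trip with the compress flag can be sketched as below. The helper names are hypothetical, and gzip is used only because it ships with the standard library; this reference does not say which compression format Storage uses or how it lays out files on disk.

```python
import gzip
import json
import tempfile
from pathlib import Path


def sketch_store(root: Path, filename: str, data, compress: bool = False) -> Path:
    """Hypothetical stand-in: write data as JSON, optionally compressed."""
    path = root / filename
    payload = json.dumps(data).encode("utf-8")
    if compress:
        # Assumption: gzip stands in for whatever compression is actually used.
        payload = gzip.compress(payload)
    path.write_bytes(payload)
    return path


def sketch_load(root: Path, filename: str):
    """Hypothetical stand-in: read data back, detecting compression."""
    raw = (root / filename).read_bytes()
    # gzip streams start with the magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sketch_store(root, "plain", {"name": "a"})
    sketch_store(root, "packed", {"name": "b"}, compress=True)
    plain = sketch_load(root, "plain")
    packed = sketch_load(root, "packed")
```

Detecting compression on load means callers do not need to remember how a file was written.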

renku.infrastructure.database.get_attribute(object, name)[source]

Return an attribute of an object.

Parameters:
  • object – The object to get an attribute on.

  • name (Union[List[str], str]) – The name of the attribute to get.

Returns:

The value of the attribute.
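
Since name may be a string or a list of strings, and Index mentions attributes such as entity.path, a nested lookup is the likely use case. The sketch below assumes that a string name is split on dots and that a list is treated as a chain of attribute names; `sketch_get_attribute` is a hypothetical helper, not the library function.

```python
from functools import reduce
from types import SimpleNamespace
from typing import List, Union


def sketch_get_attribute(obj, name: Union[List[str], str]):
    """Hypothetical stand-in: follow a chain of attribute names."""
    # Assumption: a string name may contain dots, e.g. "entity.path".
    parts = name.split(".") if isinstance(name, str) else name
    # Apply getattr step by step, starting from obj.
    return reduce(getattr, parts, obj)


usage = SimpleNamespace(entity=SimpleNamespace(path="data/input.csv"))
by_string = sketch_get_attribute(usage, "entity.path")
by_list = sketch_get_attribute(usage, ["entity", "path"])
```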

renku.infrastructure.database.get_class(type_name)[source]

Return the class for a fully-qualified type name.

Parameters:

type_name (Optional[str]) – The name of the class to get.

Returns:

The class.

Return type:

Optional[type]

renku.infrastructure.database.get_type_name(object)[source]

Return the fully-qualified type name of an object.

Parameters:

object – The object to get the type name for.

Returns:

The fully qualified type name.

Return type:

Optional[str]
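
get_type_name and get_class are inverse operations: one turns a type into a string that can be written to storage, the other turns that string back into a class. A minimal sketch of such a round trip, assuming names of the form `module.ClassName` for top-level classes; both helper functions are hypothetical stand-ins.

```python
import importlib
from collections import OrderedDict


def sketch_get_type_name(obj):
    """Hypothetical stand-in: return 'module.ClassName' for an object or class."""
    if obj is None:
        return None
    cls = obj if isinstance(obj, type) else type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def sketch_get_class(type_name):
    """Hypothetical stand-in: resolve 'module.ClassName' back to a class."""
    if not type_name:
        return None
    # Assumes a top-level class: everything before the last dot is the module.
    module_name, _, class_name = type_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


type_name = sketch_get_type_name(OrderedDict())
resolved = sketch_get_class(type_name)
```

Both functions pass None through, which matches the Optional types in the signatures above.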

Persistent

Extension of persistent.Persistent that supports immutability.

Base Renku persistent class.

class renku.infrastructure.persistent.Persistent(*args, **kwargs)[source]

Base Persistent class for renku classes.

Subclasses are assumed to be immutable once persisted to the database. If a class shouldn’t be immutable then subclass it directly from persistent.Persistent.

freeze()[source]

Set immutable property.

property immutable

Return whether the object is immutable.

reassign_oid()[source]

Reassign oid (after assigning a new identifier for example).

unfreeze()[source]

Allows modifying an immutable object.

Don’t make an object mutable unless the intention is to drop the changes or modify the object in-place. Modified objects will be updated in-place which results in a binary diff when persisted. Normally, we want to create a mutable copy and persist it as a new object.
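
The freeze and unfreeze behaviour, and the recommended "mutable copy" workflow, can be sketched with a plain Python class. `SketchFrozen` is a hypothetical stand-in that blocks attribute assignment while frozen; it does not reproduce how Renku's Persistent enforces immutability.

```python
import copy


class SketchFrozen:
    """Hypothetical class that rejects attribute changes while frozen."""

    def __init__(self, name):
        # Bypass our own __setattr__ to create the flag itself.
        object.__setattr__(self, "_immutable", False)
        self.name = name

    @property
    def immutable(self):
        return self._immutable

    def freeze(self):
        object.__setattr__(self, "_immutable", True)

    def unfreeze(self):
        object.__setattr__(self, "_immutable", False)

    def __setattr__(self, key, value):
        if self._immutable:
            raise AttributeError(f"Cannot modify frozen object: {key}")
        object.__setattr__(self, key, value)


original = SketchFrozen("v1")
original.freeze()

rejected = False
try:
    original.name = "v2"
except AttributeError:
    rejected = True

# Preferred workflow: change a mutable copy and keep the original intact.
mutable_copy = copy.copy(original)
mutable_copy.unfreeze()
mutable_copy.name = "v2"
```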

Immutable

Classes to support immutability of renku models.

Slots and Immutable classes.

class renku.infrastructure.immutable.DynamicProxy(subject, update=True)[source]

A proxy class to allow adding dynamic fields to slots/immutable classes.

class renku.infrastructure.immutable.Immutable(*args, **kwargs)[source]

An immutable class whose instances can be cached and reused.

Immutable subclasses should only contain immutable members. They must call super().__init__(…) to initialize their instances. They should not redefine the id attribute.

NOTE: Immutable objects must not be modified during the whole provenance, not only during the object’s lifetime. This is because we cache these objects and return a single instance if an object with the same id exists. For example, a DatasetFile object is not immutable across the whole provenance because once it gets removed its date_removed attribute is set, which makes the object different from a previous version. As a rule of thumb, an object can be immutable if all of its attribute values appear in its id.

Create and return an empty instance of the class.

classmethod make_instance(instance)[source]

Return a cached instance if available otherwise create an instance from the given parameters.
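
The caching behaviour can be sketched as a lookup keyed on id: if an instance with the same id is already known, it is returned instead of the new one. `SketchImmutable` and its dict-based cache are hypothetical; the cache type and eviction behaviour of the real class are not described here.

```python
class SketchImmutable:
    """Hypothetical stand-in that reuses instances with the same id."""

    _cache = {}

    def __init__(self, id, value):
        self.id = id
        self.value = value

    @classmethod
    def make_instance(cls, instance):
        # Return the cached object with the same id, if there is one.
        cached = cls._cache.get(instance.id)
        if cached is not None:
            return cached
        cls._cache[instance.id] = instance
        return instance


first = SketchImmutable.make_instance(SketchImmutable("/entities/1", "a"))
second = SketchImmutable.make_instance(SketchImmutable("/entities/1", "a"))
other = SketchImmutable.make_instance(SketchImmutable("/entities/2", "b"))
```

`first` and `second` are the same object, which is only safe because objects with equal ids never differ in any other attribute.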

class renku.infrastructure.immutable.Slots(*args, **kwargs)[source]

An immutable class.

Subclasses are supposed to use __slots__ to define their members.

Create and return an empty instance of the class.

classmethod make_instance(**kwargs)[source]

Instantiate from the given parameters.
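
A slots-based class with a keyword-driven factory can be sketched as follows. `SketchSlots` and its two slot names are hypothetical; the sketch creates an empty instance first and then fills the declared slots, and is not Renku's implementation.

```python
class SketchSlots:
    """Hypothetical stand-in that declares its members via __slots__."""

    __slots__ = ("id", "path")

    @classmethod
    def make_instance(cls, **kwargs):
        # Create an empty instance, then fill the declared slots.
        instance = cls.__new__(cls)
        for name, value in kwargs.items():
            object.__setattr__(instance, name, value)
        return instance


entity = SketchSlots.make_instance(id="/entities/1", path="data.csv")

undeclared_rejected = False
try:
    entity.other = 1  # not listed in __slots__
except AttributeError:
    undeclared_rejected = True
```

Because the class has no per-instance `__dict__`, attributes outside `__slots__` cannot be added.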