Datasets

Models representing datasets.

Dataset object

class renku.core.models.dataset.Dataset(*, creators: List[renku.core.models.provenance.agent.Person] = None, dataset_files: List[renku.core.models.dataset.DatasetFile] = None, date_created: datetime.datetime = None, date_published: datetime.datetime = None, date_removed: datetime.datetime = None, derived_from: renku.core.models.dataset.Url = None, description: str = None, id: str = None, identifier: str = None, images: List[renku.core.models.dataset.ImageObject] = None, in_language: renku.core.models.dataset.Language = None, initial_identifier: str = None, keywords: List[str] = None, license: str = None, name: str = None, project_id: str = None, same_as: renku.core.models.dataset.Url = None, title: str = None, version: str = None)

Represent a dataset.

add_or_update_files(files: Union[renku.core.models.dataset.DatasetFile, List[renku.core.models.dataset.DatasetFile]])

Add new files or update existing files.
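The add-or-update behaviour can be illustrated with a minimal sketch. It does not use renku: FileSketch and add_or_update_files_sketch are hypothetical stand-ins, and keying entries by path is an assumption made for the illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class FileSketch:
    """Hypothetical stand-in for DatasetFile (not renku's class)."""

    path: str
    checksum: str


def add_or_update_files_sketch(
    index: Dict[str, FileSketch],
    files: Union[FileSketch, List[FileSketch]],
) -> None:
    # Accept a single file as well as a list, mirroring the Union in the signature.
    if isinstance(files, FileSketch):
        files = [files]
    for file in files:
        # Assumption: entries are keyed by path, so a known path is overwritten
        # (update) and an unknown path is inserted (add).
        index[file.path] = file


index: Dict[str, FileSketch] = {}
add_or_update_files_sketch(index, FileSketch("data/a.csv", "v1"))
add_or_update_files_sketch(
    index, [FileSketch("data/a.csv", "v2"), FileSketch("data/b.csv", "v1")]
)
```

After both calls the index holds two entries, and the entry for data/a.csv carries the newer checksum.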

clear_files()

Remove all files.

copy() → renku.core.models.dataset.Dataset

Return a clone of this dataset.

creators_csv

Comma-separated list of creators associated with dataset.

creators_full_csv

Comma-separated list of creators with full identity.
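The two creator properties can be pictured with a small sketch. PersonSketch and DatasetSketch are hypothetical stand-ins, not renku classes. The ", " separator and the "name <email>" form of the full identity are assumptions, since the reference does not state the exact format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonSketch:
    """Hypothetical stand-in for a creator (not renku's Person)."""

    name: str
    email: Optional[str] = None

    @property
    def full_identity(self) -> str:
        # Assumed format: the name, followed by the email in angle brackets if known.
        return f"{self.name} <{self.email}>" if self.email else self.name


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for a dataset with a list of creators."""

    creators: List[PersonSketch] = field(default_factory=list)

    @property
    def creators_csv(self) -> str:
        # Names only, joined with an assumed ", " separator.
        return ", ".join(creator.name for creator in self.creators)

    @property
    def creators_full_csv(self) -> str:
        # Full identities, joined with the same separator.
        return ", ".join(creator.full_identity for creator in self.creators)


ds = DatasetSketch(
    creators=[
        PersonSketch("Ada Lovelace", "ada@example.com"),
        PersonSketch("Alan Turing"),
    ]
)
```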

derive_from(dataset: renku.core.models.dataset.Dataset, creator: Optional[renku.core.models.provenance.agent.Person], identifier: str = None)

Make self a derivative of dataset and update related fields.

files

Return list of existing files.

find_file(path: Union[pathlib.Path, str]) → Optional[renku.core.models.dataset.DatasetFile]

Find a file in the dataset using its relative path.
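A lookup that accepts either a str or a pathlib.Path and returns None when nothing matches could look roughly like the sketch below. find_file_sketch is a hypothetical helper that works on plain path strings; it is not renku's implementation.

```python
from pathlib import Path
from typing import List, Optional, Union


def find_file_sketch(paths: List[str], path: Union[Path, str]) -> Optional[str]:
    # Normalise the argument so that str and Path inputs compare the same way.
    wanted = Path(path)
    for candidate in paths:
        if Path(candidate) == wanted:
            return candidate
    # Mirror the Optional return type: no match yields None, not an exception.
    return None


tracked = ["data/a.csv", "data/raw/b.csv"]
```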

freeze()

Set immutable property.

classmethod from_jsonld(data, schema_class=None)

Create an instance from JSON-LD data.
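The pairing of from_jsonld with to_jsonld suggests a round trip between an object and a JSON-LD document. The sketch below shows that idea only. DatasetDocSketch is a hypothetical class, the schema.org property names are assumptions about the vocabulary, and the schema_class parameter of the real method is left out.

```python
import json
from dataclasses import dataclass

SCHEMA = "http://schema.org/"


@dataclass
class DatasetDocSketch:
    """Hypothetical stand-in used to show a JSON-LD round trip."""

    identifier: str
    title: str

    def to_jsonld(self) -> dict:
        # Assumed mapping of fields to schema.org terms.
        return {
            "@type": SCHEMA + "Dataset",
            SCHEMA + "identifier": self.identifier,
            SCHEMA + "name": self.title,
        }

    @classmethod
    def from_jsonld(cls, data: dict) -> "DatasetDocSketch":
        # Read back the same terms that to_jsonld wrote.
        return cls(
            identifier=data[SCHEMA + "identifier"],
            title=data[SCHEMA + "name"],
        )


original = DatasetDocSketch(identifier="1234", title="Weather data")
serialized = json.dumps(original.to_jsonld())
restored = DatasetDocSketch.from_jsonld(json.loads(serialized))
```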

static generate_id(identifier: str) → str

Generate an identifier for Dataset.

immutable

Return whether the object is immutable.

is_removed() → bool

Return true if dataset is removed.

keywords_csv

Comma-separated list of keywords associated with dataset.

reassign_oid()

Reassign oid (after assigning a new identifier for example).

remove(date: datetime.datetime = None)

Mark the dataset as removed.
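Taken together with the date_removed constructor argument and is_removed(), this reads like a soft delete: the object is kept and a removal date is recorded. A minimal sketch of that pattern follows. RemovableSketch is hypothetical, and defaulting to the current UTC time when no date is passed is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


class RemovableSketch:
    """Hypothetical stand-in showing a soft delete via a removal date."""

    def __init__(self) -> None:
        self.date_removed: Optional[datetime] = None

    def remove(self, date: Optional[datetime] = None) -> None:
        # Assumption: fall back to "now" in UTC when the caller gives no date.
        self.date_removed = date or datetime.now(timezone.utc)

    def is_removed(self) -> bool:
        # The object counts as removed as soon as a removal date is set.
        return self.date_removed is not None


kept = RemovableSketch()
dropped = RemovableSketch()
dropped.remove(datetime(2021, 1, 1, tzinfo=timezone.utc))
```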

replace_identifier(identifier: str = None)

Replace dataset’s identifier and update relevant fields.

NOTE: Call this only for newly created or newly imported datasets that don’t have a mutability chain, because it sets initial_identifier.

to_jsonld()

Create JSON-LD.

Mark a file as removed using its relative path.

update_files_from(current_dataset: renku.core.models.dataset.Dataset, date: datetime.datetime = None)

Check the files of current_dataset to reuse existing entries and mark removed files.
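One way to picture this reconciliation is the sketch below: unchanged files keep their existing entry, new or changed files use the new entry, and files that no longer appear are carried over with a removal date. FileSketch and reconcile_sketch are hypothetical, and matching by path and checksum is an assumption, not a description of renku's code.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class FileSketch:
    """Hypothetical stand-in for a dataset file entry."""

    path: str
    checksum: str
    id: str
    date_removed: Optional[datetime] = None


def reconcile_sketch(
    new_files: List[FileSketch],
    current_files: List[FileSketch],
    date: Optional[datetime] = None,
) -> List[FileSketch]:
    date = date or datetime.now(timezone.utc)
    # Only entries that are still live can be reused or marked as removed.
    current = {f.path: f for f in current_files if f.date_removed is None}
    result = []
    for new in new_files:
        old = current.pop(new.path, None)
        if old is not None and old.checksum == new.checksum:
            result.append(old)  # unchanged content: reuse the existing entry
        else:
            result.append(new)  # new or modified file: take the new entry
    # Whatever is left was in the current dataset but is absent from the new one.
    for leftover in current.values():
        result.append(replace(leftover, date_removed=date))
    return result


old_a = FileSketch("a.csv", "1", "id-a")
old_b = FileSketch("b.csv", "2", "id-b")
new_a = FileSketch("a.csv", "1", "id-new-a")
new_c = FileSketch("c.csv", "3", "id-c")
result = reconcile_sketch([new_a, new_c], [old_a, old_b], date=datetime(2021, 1, 1))
```

In this run a.csv keeps its old entry, c.csv is added as new, and b.csv is carried over with the given removal date.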

update_metadata(**kwargs)

Update metadata.

update_metadata_from(other: renku.core.models.dataset.Dataset)

Update metadata from another dataset.

Dataset file

Manage files in the dataset.

class renku.core.models.dataset.DatasetFile(*, based_on: renku.core.models.dataset.RemoteEntity = None, date_added: datetime.datetime = None, date_removed: datetime.datetime = None, entity: renku.core.models.entity.Entity = None, id: str = None, is_external: bool = False, source: Union[pathlib.Path, str] = None)

A file in a dataset.

copy() → renku.core.models.dataset.DatasetFile

Return a clone of this object.

classmethod from_path(client, path: Union[str, pathlib.Path], source=None, based_on: renku.core.models.dataset.RemoteEntity = None) → Optional[renku.core.models.dataset.DatasetFile]

Return an instance from a path.

static generate_id()

Generate an identifier for DatasetFile.

NOTE: The ID should not rely on Entity properties, because the same Entity can be added and removed multiple times and each addition should be represented by a different DatasetFile.

is_equal_to(other: renku.core.models.dataset.DatasetFile)

Compare content.

NOTE: id is generated randomly and should not be included in this comparison.
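The two notes above fit together: because ids are random, two records describing the same content get different ids, so a content comparison has to leave the id out. The sketch below illustrates this. FileRecordSketch and generate_id_sketch are hypothetical, and using uuid4 and comparing path plus checksum are assumptions.

```python
from dataclasses import dataclass, field
from uuid import uuid4


def generate_id_sketch() -> str:
    # Assumption: a random UUID stands in for whatever renku generates.
    return uuid4().hex


@dataclass
class FileRecordSketch:
    """Hypothetical stand-in for a dataset file record."""

    path: str
    checksum: str
    id: str = field(default_factory=generate_id_sketch)

    def is_equal_to(self, other: "FileRecordSketch") -> bool:
        # Compare content only; the randomly generated id is deliberately excluded.
        return (self.path, self.checksum) == (other.path, other.checksum)


first = FileRecordSketch("data/a.csv", "abc123")
second = FileRecordSketch("data/a.csv", "abc123")
changed = FileRecordSketch("data/a.csv", "def456")
```

Note that the dataclass-generated == still includes id, so first == second is false while first.is_equal_to(second) is true.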

is_removed() → bool

Return true if the file is removed and should not be accessed.

classmethod make_instance(**kwargs)

Instantiate from the given parameters.

remove(date: datetime.datetime = None)

Create a new instance and mark it as removed.
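Unlike Dataset.remove, which is described as marking the dataset itself, this method is described as returning a new instance. A minimal sketch of that copy-and-mark pattern follows. FileEntrySketch is hypothetical, and the use of copy.copy and the UTC default date are assumptions.

```python
import copy
from datetime import datetime, timezone
from typing import Optional


class FileEntrySketch:
    """Hypothetical stand-in for a dataset file entry."""

    def __init__(self, path: str, date_removed: Optional[datetime] = None) -> None:
        self.path = path
        self.date_removed = date_removed

    def remove(self, date: Optional[datetime] = None) -> "FileEntrySketch":
        # Leave the original untouched and return a marked clone instead.
        clone = copy.copy(self)
        clone.date_removed = date or datetime.now(timezone.utc)
        return clone

    def is_removed(self) -> bool:
        return self.date_removed is not None


entry = FileEntrySketch("data/a.csv")
removed_entry = entry.remove(datetime(2021, 1, 1, tzinfo=timezone.utc))
```

Returning a clone keeps the earlier entry usable as a record of the file's state before removal.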

to_jsonld()

Create JSON-LD.