What is Renku?

The Renku Project consists of a web platform (Renkulab) and a command-line interface (Renku CLI), both built on top of open-source components. It helps researchers, data scientists, educators, and students manage:

  • code
  • data
  • execution environments
  • workflows

Renku combines many widely used open-source tools to equip every project on the platform with resources that aid reproducibility, reusability, and collaboration. Version control for data and code, containerization for runtime environments, and automatic workflow capture are the core pillars on which the platform is built.

Reproducibility

Renku facilitates reproducibility by keeping a record of each step of the analysis. The sources of all important elements can be identified even when they span multiple projects or originate in external public repositories. This cross-linking of research artifacts is made possible by the Renku Knowledge Graph. The recorded analysis steps can be re-executed to verify the results, reused in other projects on different data, or rerun with different parameters.
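
The idea of recording each step so that a result can be traced back to its sources can be illustrated with a minimal Python sketch. It is not Renku's implementation and does not use the Renku API or the Knowledge Graph; the Step structure, the lineage function, and all file names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One recorded analysis step (hypothetical structure, not Renku's schema)."""
    name: str
    inputs: tuple
    outputs: tuple


def lineage(target, steps):
    """Return the names of the recorded steps that `target` depends on, in execution order."""
    # Map every output file to the step that produced it.
    producers = {out: step for step in steps for out in step.outputs}
    ordered = []
    seen = set()

    def visit(path):
        step = producers.get(path)
        if step is None or step.name in seen:
            return  # a raw input, or a step that was already collected
        seen.add(step.name)
        for dependency in step.inputs:
            visit(dependency)
        ordered.append(step.name)  # appended only after its dependencies

    visit(target)
    return ordered


# Illustrative record of four steps; the file names are made up.
steps = [
    Step("clean", ("raw.csv",), ("clean.csv",)),
    Step("fit", ("clean.csv", "params.yaml"), ("model.pkl",)),
    Step("plot", ("model.pkl",), ("figure.png",)),
    Step("summary", ("clean.csv",), ("summary.txt",)),
]

figure_lineage = lineage("figure.png", steps)
print(figure_lineage)  # ['clean', 'fit', 'plot']
```

Given such a record, re-executing the listed steps in order reproduces the figure, and changing params.yaml shows which steps would need to be rerun.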

The final results can be packaged into a dataset and easily published with all the requisite metadata on public repositories like Zenodo or Dataverse. However, the archived data is only one (static) part of the story; it is the analysis project on the Renku platform that holds the invaluable information about the entire processing chain that led to those published results. Thanks to the versioned runtime environment, all of the calculations can be reproduced and verified by anyone.

Reusability

Once the data is packaged into a dataset in Renku, its use and application can be discovered through the Knowledge Graph search. Collaborators, colleagues, or the interested public can then easily reuse this data with full information on how it needs to be processed and applied. Similarly, the workflows applied to the data can be reused and rerun on the same data with different input parameters to scrutinize the robustness of the conclusions.

Collaboration

Data-driven discovery does not happen in a vacuum. Renku allows researchers and analysts to easily share computational environments for rapid prototyping, so that ideas move forward more quickly. Projects can be discussed through interactive notebooks, and templates with complex runtimes can be reused to efficiently bootstrap new experiments.

To learn more about the hosted part of the platform, read about Renkulab. To explore the possibilities of the lower-level tools, head to the Renku CLI documentation.