Graph services
In Renku, the dependencies of research artifacts are recorded in a knowledge graph. Each project’s local knowledge graph is recorded in its repository; the graph services make it possible to build the global knowledge graph from these. When a project’s repository is pushed to the server, a webhook is triggered. As a result, the changes represented by the commits, together with all of the captured dependencies, are rendered as RDF triples and pushed to the triple store.
The graph services are made up of four micro-services: the webhook-service, triples-generator, token-repository and knowledge-graph. The knowledge graph data is stored in the triple store (currently Apache Jena). The basic architecture is illustrated below.
Sequence diagram of Graph Services APIs and processes.
POST <knowledge-graph>/knowledge-graph/graphql
An endpoint that allows performing GraphQL queries on the Knowledge Graph data.
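As a rough illustration of calling this endpoint, the sketch below builds (but does not send) a GraphQL POST request using only the Python standard library. The base URL `https://renku.example.org` and the helper name `build_graphql_request` are assumptions made for this example. Because the Knowledge Graph schema is not described here, the query is the generic GraphQL introspection query rather than a Renku-specific one, and any authentication the deployment may require is left out.

```python
import json
import urllib.request
from typing import Optional


def build_graphql_request(
    base_url: str, query: str, variables: Optional[dict] = None
) -> urllib.request.Request:
    """Build a POST request for <knowledge-graph>/knowledge-graph/graphql.

    `base_url` stands in for the <knowledge-graph> placeholder (hypothetical value).
    """
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/knowledge-graph/graphql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Generic GraphQL introspection query; it does not rely on any Renku-specific field.
INTROSPECTION_QUERY = "{ __schema { queryType { name } } }"

request = build_graphql_request("https://renku.example.org", INTROSPECTION_QUERY)
```

Passing `request` to `urllib.request.urlopen` would send it to a real deployment; the response should then be a JSON document with a top-level `data` or `errors` key, as is usual for GraphQL servers.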
POST <webhook-service>/projects/:id/webhooks
An endpoint to create a Graph Services webhook for a project in GitLab.
POST <webhook-service>/projects/:id/webhooks/validation
An endpoint to validate a project’s webhook. It checks whether a relevant Graph Services webhook exists on the repository in GitLab, and whether Graph Services hold an Access Token associated with the project that they can use to look up project-specific information in GitLab.
POST <webhook-service>/webhooks/events
An endpoint to which Push Events are sent. Each Push Event contains information about commits pushed to GitLab.
GET <webhook-service>/projects/:id/events/status
An endpoint that returns the processing progress of events for a specific project.
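The four webhook-service endpoints above differ only in HTTP method and path, so they can be sketched as a small lookup table plus a request builder. This is a minimal sketch, not the service's client: the names `ENDPOINTS` and `build_request` and the base URL are hypothetical, and request bodies and authentication headers are omitted because they are not specified here.

```python
import json
import urllib.request
from typing import Optional

# HTTP method and path template for each webhook-service endpoint listed above.
ENDPOINTS = {
    "create_webhook": ("POST", "/projects/{project_id}/webhooks"),
    "validate_webhook": ("POST", "/projects/{project_id}/webhooks/validation"),
    "push_event": ("POST", "/webhooks/events"),
    "events_status": ("GET", "/projects/{project_id}/events/status"),
}


def build_request(
    base_url: str,
    name: str,
    project_id: Optional[int] = None,
    body: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request for one of the endpoints in ENDPOINTS."""
    method, template = ENDPOINTS[name]
    if "{project_id}" in template and project_id is None:
        raise ValueError(f"endpoint {name!r} requires a project_id")
    url = base_url.rstrip("/") + template.format(project_id=project_id)
    # Only attach a JSON body (and its Content-Type header) when one is given.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

For example, `build_request("https://renku.example.org/webhooks-api", "events_status", project_id=42)` yields a `GET` request for `/projects/42/events/status` under that (assumed) base URL.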
Subscription to unprocessed Commit Events
A process initiated and maintained by Triples Generator instances so that the Event Log can send them Events that require triples generation.
Commit Events to RDF Triples
A process responsible for translating unprocessed Commit Events from the Event Log to RDF Triples in the RDF Store. This process runs continuously by polling the Event Log for unprocessed Commit Events.
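The control flow of this process can be sketched with an in-memory stand-in for the Event Log. Everything here is hypothetical and for illustration only: the classes `CommitEvent` and `InMemoryEventLog`, the function names, and the placeholder predicate `example:inProject` are not taken from the Graph Services code, and the real triples are produced from the repository contents rather than from the event alone.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class CommitEvent:
    project_id: int
    commit_id: str
    processed: bool = False


@dataclass
class InMemoryEventLog:
    """Hypothetical stand-in for the Event Log."""
    events: List[CommitEvent] = field(default_factory=list)

    def next_unprocessed(self) -> Optional[CommitEvent]:
        # First event not yet processed, or None when nothing is pending.
        return next((e for e in self.events if not e.processed), None)


def placeholder_triples(event: CommitEvent) -> List[Triple]:
    """Placeholder generator; the predicate is made up for this example."""
    return [(f"commit:{event.commit_id}", "example:inProject",
             f"project:{event.project_id}")]


def drain_event_log(
    event_log: InMemoryEventLog,
    generate: Callable[[CommitEvent], List[Triple]],
    rdf_store: List[Triple],
) -> int:
    """Handle every unprocessed event once; return how many were handled."""
    handled = 0
    while True:
        event = event_log.next_unprocessed()
        if event is None:
            return handled
        rdf_store.extend(generate(event))
        event.processed = True  # mark only after the triples are stored
        handled += 1
```

A continuously running variant would call `drain_event_log` in an outer loop and sleep between passes when nothing is pending.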
Missed commits synchronization job
A scheduled job that synchronizes state between the Event Log and GitLab and generates the Commit Events that are missing from the Event Log. It runs periodically at a configured interval.
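At its core, the synchronization amounts to finding commits that GitLab knows about but the Event Log does not. A minimal sketch of that comparison follows, assuming both sides can be reduced to lists of commit ids; the function name `find_missing_commits` is hypothetical, and how the ids are fetched from GitLab and the Event Log is not shown.

```python
from typing import Iterable, List


def find_missing_commits(
    gitlab_commit_ids: Iterable[str],
    event_log_commit_ids: Iterable[str],
) -> List[str]:
    """Return commit ids present in GitLab but absent from the Event Log.

    The order of the GitLab commits is preserved.
    """
    known = set(event_log_commit_ids)
    return [commit_id for commit_id in gitlab_commit_ids if commit_id not in known]
```

Each id returned would then correspond to one Commit Event to be generated.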
Knowledge Graph re-provisioning process
A process executed on Triples Generator start-up that checks whether the triples in the RDF Store were generated with the version of renku-python currently set in the Triples Generator.