Merged
9 changes: 9 additions & 0 deletions docs/_data/tools_list.yml
@@ -327,3 +327,12 @@
  description: A modern, interactive web application for exploring and visualizing RO-Crates, built with Vue 3.
  url: https://github.com/arunaengine/RO-Crate-Explorer
  status:

- name: PyCOMPSs CLI
  description: Quick and friendly visualisation of RO-Crates from the Command Line Interface with the 'inspect' option. Compatible with Workflow and Workflow Run RO-Crate profiles.
  status: production
  level:
  url: https://pypi.org/project/pycompss-cli/
  subitems:
    - name: Documentation
      url: https://compss-doc.readthedocs.io/en/stable/Sections/04_Ecosystem/09_CLI/02_Usage.html#inspect-workflow-provenance
Contributor

I get a 404 error at this link

30 changes: 19 additions & 11 deletions docs/pages/use_cases/COMPSs.md
@@ -31,38 +31,46 @@ roles: [information_architect, researcher, software_developer] # should match it

COMP Superscalar ([COMPSs](https://compss.bsc.es/)) is a task-based programming model which aims to ease the development of applications for distributed infrastructures, such as large High-Performance Clusters (HPC), Clouds and Container managed clusters. PyCOMPSs is the Python binding of COMPSs.

COMPSs provides a programming interface for the development of applications in Python/Java/C/C++, a runtime system that exploits the inherent parallelism of applications at execution time, and a rich ecosystem for the operation, monitoring, performance evaluation and integration with Jupyter/Jupyterlab.
COMPSs provides a programming interface for the development of applications in Python/Java/C/C++/R, a runtime system that exploits the inherent parallelism of applications at execution time, and a rich ecosystem for the operation, monitoring, performance evaluation and integration with Jupyter/Jupyterlab.

The COMPSs runtime includes the capacity of automatically recording details of the application’s execution as metadata, also known as [Workflow Provenance](https://compss-doc.readthedocs.io/en/stable/Sections/05_Tools/04_Workflow_Provenance.html). The metadata is recorded in RO-Crate format, following [Workflow RO-Crate](https://w3id.org/workflowhub/workflow-ro-crate/1.0) and [Workflow Run Crate](https://www.researchobject.org/workflow-run-crate/) profiles. With workflow provenance, you are able to share not only your workflow application (i.e. the source code) but also your workflow run (i.e. the datasets used as inputs, and the outputs generated as results).
The COMPSs runtime includes the capacity of automatically recording details of the application’s execution as metadata, also known as [Workflow Provenance](https://compss-doc.readthedocs.io/en/stable/Sections/04_Ecosystem/05_Workflow_Provenance.html). The metadata is recorded in RO-Crate format, following [Workflow RO-Crate](https://w3id.org/workflowhub/workflow-ro-crate/1.0) and [Workflow Run RO-Crate](https://www.researchobject.org/workflow-run-crate/) profile collection. With workflow provenance, you are able to share not only your workflow application (i.e. the source code) but also your workflow run (i.e. the datasets used as inputs, the outputs generated as results, and rich information about every single task executed).

Provenance information can be useful for a number of things, including Governance, Reproducibility, Replicability, Traceability, or Knowledge Extraction, among others. In our case, we have initially targeted workflow provenance recording to enable users to publish research results obtained with COMPSs as artifacts that can be cited in scientific publications with their corresponding DOI, by using [WorkflowHub](https://workflowhub.eu/). Both workflow provenance metadata and its publication in WorkflowHub enable the reproducibility of the workflows.
Provenance information can be useful for a number of things, including Governance, Reproducibility, Replicability, Traceability, or Knowledge Extraction, among others. Workflow Provenance enables users to publish research results obtained with COMPSs as artifacts that can be cited in scientific publications with their corresponding DOI, either in [WorkflowHub](https://workflowhub.eu/) or in [Zenodo](https://zenodo.org/) through the [RO-Crate InvenioRDM Deposit library](https://github.com/ResearchObject/ro-crate-inveniordm). Both workflow provenance metadata and its publication in WorkflowHub/Zenodo enable the reproducibility of the workflows.

We have also developed an 'inspect' option in the [PyCOMPSs CLI](https://pypi.org/project/pycompss-cli/) that lets users visualise, in a friendly way, not only COMPSs-generated crates but also those generated by other WMSs that follow the Workflow-related RO-Crate profiles. This means 'pycompss inspect' is interoperable at least with crates generated with: CWL, Nextflow, Galaxy, Autosubmit, WfExS, StreamFlow, Snakemake and Sapporo.

![COMPSs with RO-Crate](assets/img/COMPSs-screenshot.png)
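
To give a rough idea of the kind of information such an inspection can read from a crate, the sketch below summarises an unpacked RO-Crate using only the Python standard library. It is not how 'pycompss inspect' is implemented; the function name `summarise_crate` is hypothetical, and the code assumes an RO-Crate 1.1 layout (a `ro-crate-metadata.json` file whose descriptor points at the root dataset through `about`) with runs recorded as `CreateAction` entities, as in the Workflow Run RO-Crate profiles.

```python
import json
from pathlib import Path


def summarise_crate(crate_dir):
    """Summarise an unpacked RO-Crate directory (hypothetical helper)."""
    metadata_path = Path(crate_dir) / "ro-crate-metadata.json"
    metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    entities = {entity["@id"]: entity for entity in metadata["@graph"]}

    # Assumption: RO-Crate 1.1 naming for the metadata descriptor, whose
    # "about" property references the root dataset (usually "./").
    descriptor = entities["ro-crate-metadata.json"]
    root = entities[descriptor["about"]["@id"]]

    def ids(value):
        # JSON-LD allows either a single reference or a list of references.
        if value is None:
            return []
        items = value if isinstance(value, list) else [value]
        return [item["@id"] for item in items if isinstance(item, dict) and "@id" in item]

    def types(entity):
        declared = entity.get("@type", [])
        return declared if isinstance(declared, list) else [declared]

    # Workflow Run RO-Crate profiles describe each run as a CreateAction,
    # with inputs under "object" and outputs under "result".
    runs = [entity for entity in entities.values() if "CreateAction" in types(entity)]
    return {
        "name": root.get("name"),
        "main_workflow": ids(root.get("mainEntity")),
        "runs": [
            {"inputs": ids(run.get("object")), "outputs": ids(run.get("result"))}
            for run in runs
        ],
    }
```

Calling `summarise_crate("path/to/unpacked-crate")` returns a dictionary with the crate name, the identifier of its main workflow, and the inputs and outputs of each recorded run. Because it relies only on properties defined by the profiles, the same sketch should apply to crates produced by other workflow systems that follow them.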

## Examples of COMPSs RO-Crates

Plenty of examples of COMPSs Workflows with enabled provenance recording can be found at [WorkflowHub](https://workflowhub.eu/workflows?filter%5Bworkflow_type%5D=pycompss) (filtering the browsing by 'COMPSs' workflow type).

In addition, the COMPSs User Manual has a dedicated section on [how to generate Workflow Provenance with COMPSs](https://compss-doc.readthedocs.io/en/stable/Sections/05_Tools/04_Workflow_Provenance.html).
In addition, the COMPSs User Manual has a dedicated section on [how to generate Workflow Provenance with COMPSs](https://compss-doc.readthedocs.io/en/stable/Sections/04_Ecosystem/05_Workflow_Provenance.html).
Contributor

Another 404

Contributor Author

Hi Eli, thanks for checking all the links! I got a bit ahead of myself. The documentation branch that is ready for the next release and currently online is 'major-update', but on Thursday it will be moved to 'stable'. That's why I prepared all the links so that they will work once the stable version is updated. An example from the current 'major-update':

https://compss-doc.readthedocs.io/en/major-update/Sections/04_Ecosystem/05_Workflow_Provenance.html

If you want, we can hold this for 2 days and check on Thursday that every link works.

Contributor

Ahh, okay - then yes, let's hold until the links work, please let me know when your docs update is done

Contributor Author

Hey Eli!!! We advanced the release of the documentation. Links should be working now.

Let me know if there is any issue.
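
The manual link check discussed in this thread could be scripted along the following lines. This is only a sketch under stated assumptions: the helper names (`extract_links`, `http_status`, `broken_links`) are hypothetical, only inline Markdown links with `http(s)` targets are considered, and URLs containing parentheses are not handled by the simple regular expression.

```python
import re
import urllib.error
import urllib.request

# Matches inline Markdown links such as [text](https://example.org/page).
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def extract_links(markdown_text):
    """Return the unique http(s) link targets, in order of appearance."""
    seen = []
    for url in LINK_RE.findall(markdown_text):
        if url not in seen:
            seen.append(url)
    return seen


def http_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions by urllib.
        return error.code


def broken_links(markdown_text, status_of=http_status):
    """Return (url, status) pairs for links that answer with 400 or above."""
    results = []
    for url in extract_links(markdown_text):
        status = status_of(url)
        if status >= 400:
            results.append((url, status))
    return results
```

Running `broken_links` on the contents of `docs/pages/use_cases/COMPSs.md` before and after the documentation is moved to 'stable' would list any link that still returns a 404. The `status_of` parameter lets the network call be swapped for a stub in tests, and some servers reject HEAD requests, so a GET fallback may be needed in practice.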


## Resources

* [COMPSs Homepage](https://compss.bsc.es/)
* [COMPSs documentation](https://compss-doc.readthedocs.io/en/stable/)
* [Workflow Provenance Slides Quick Overview](https://zenodo.org/records/11057731)
* [Workflow Provenance Detailed Slides](https://zenodo.org/records/10046567)
* [RO-Crate InvenioRDM Deposit library](https://github.com/ResearchObject/ro-crate-inveniordm)
* [PyCOMPSs CLI](https://pypi.org/project/pycompss-cli/)

## Publications

Raül Sirvent, Javier Conejero, Francesc Lordan, Jorge Ejarque, Laura Rodríguez-Navas, José M Fernández, Salvador Capella-Gutiérrez, Rosa M Badia (2022):
**Automatic, Efficient and Scalable Provenance Registration for FAIR HPC Workflows**.
_IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS)_ (1-9)
<https://doi.org/10.1109/WORKS56498.2022.00006>
[[preprint](https://upcommons.upc.edu/handle/2117/384589)]
Panna Lukács, Rosa M. Badia, Raül Sirvent:
**Explaining AI Applications through Workflow Provenance**.
_Master Thesis, Universitat Politècnica de Catalunya, 2025_
<https://hdl.handle.net/2117/449230>

Leo, S., Crusoe, M. R., Rodríguez-Navas, L., Sirvent, R., Kanitz, A., De Geest, P., ... & Soiland-Reyes, S. (2024):
Leo, S., Crusoe, M. R., Rodríguez-Navas, L., Sirvent, R., Kanitz, A., De Geest, P., ... & Soiland-Reyes, S.:
**Recording provenance of workflow runs with RO-Crate**.
_PLoS ONE 19(9): e0309210._
_PLoS ONE 19(9): e0309210. 2024_
<https://doi.org/10.1371/journal.pone.0309210>

Raül Sirvent, Javier Conejero, Francesc Lordan, Jorge Ejarque, Laura Rodríguez-Navas, José M Fernández, Salvador Capella-Gutiérrez, Rosa M Badia:
**Automatic, Efficient and Scalable Provenance Registration for FAIR HPC Workflows**.
_IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS) 2022_ (1-9)
<https://doi.org/10.1109/WORKS56498.2022.00006>
[[preprint](https://upcommons.upc.edu/handle/2117/384589)]