Data scientists & domain researchers
Open Source communities & application developers
Data Analytic development framework
COMPSs is a programming framework that aims to facilitate the parallelisation of existing applications written in Java, C/C++ and Python. For that purpose, it offers a simple programming model based on sequential development, in which the user is mainly responsible for identifying the functions to be executed as asynchronous parallel tasks. A runtime system is in charge of exploiting the inherent concurrency of the code, automatically detecting and enforcing the data dependencies between tasks and dispatching these tasks to the available resources, which can be nodes in a cluster, clouds or container platforms. In cloud environments, COMPSs provides scalability and elasticity features that allow resources to be provisioned dynamically.
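The model described above can be illustrated with a minimal sketch built only on the Python standard library. It is not the COMPSs runtime or its API: the `task` decorator, the `wait_on` helper and the thread pool are hypothetical stand-ins, chosen only to show how sequential-looking code can yield asynchronous tasks whose data dependencies are enforced automatically.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# Hypothetical stand-in for a task runtime; not part of COMPSs.
_pool = ThreadPoolExecutor(max_workers=4)


def task(func):
    """Mark a function as an asynchronous task (illustrative only)."""
    @wraps(func)
    def submit(*args):
        def run():
            # Enforce data dependencies: wait for any argument that is
            # still being produced by an earlier task.
            ready = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*ready)
        # Return immediately with a Future instead of the value itself.
        return _pool.submit(run)
    return submit


def wait_on(future):
    """Synchronise: block until the result of a task is available."""
    return future.result()


@task
def square(x):
    return x * x


@task
def add(a, b):
    return a + b


def main():
    # Reads like sequential code. The two square() calls are independent
    # and may run in parallel, while add() depends on both of them.
    a = square(3)
    b = square(4)
    total = add(a, b)
    return wait_on(total)


if __name__ == "__main__":
    print(main())
```

In this sketch a dependent task simply blocks a worker thread until its inputs are ready. A production runtime would instead be expected to track the dependencies explicitly and schedule each task only once its inputs exist.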
In a nutshell, the main added value of COMPSs is that it combines different capabilities in a single framework with a gentle learning curve, as developers do not have to deal with application programming interfaces (APIs). What is more, the development and execution of applications are not restricted to a proprietary infrastructure, as interoperability is a key feature.
Orchestrate a complex application in terms of number and type of tasks
Design a workflow for the parallel simulation of many mutations that can be executed on many platforms. Production tests require long simulations (lasting days).
Implement a distributed application with MPI parts
Manage the resources dynamically distributed in a changing environment
Application that can be integrated into multiple parallel platforms, internally balancing and organising all the necessary subtasks to ensure efficient use of the computing resources.
Computations executed in real time; transparent management of the resources
COMPSs is aimed at developers who do not have the expertise to parallelise their applications and who need to run their code on multiple computing platforms.
The COMPSs programming model is designed to avoid modifications to the original application as far as possible. Consequently, its impact on the application is reduced to a minimum: the only requirement is to provide a description of the functions to be run as tasks, unlike other tools (e.g. Spark) that require adapting the code to a specific data model and operators.
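As a hedged illustration of what such a function description can look like in Python, the sketch below annotates two ordinary functions with the `task` decorator of the PyCOMPSs binding and leaves their bodies untouched. The import paths and the `returns` argument follow the PyCOMPSs binding as generally documented and may differ between releases. The fallback shim, the function names and the sample strings are assumptions made for this example, so that the script also runs sequentially on a machine without COMPSs.

```python
try:
    # PyCOMPSs binding, available when COMPSs is installed.
    from pycompss.api.task import task
    from pycompss.api.api import compss_wait_on
except ImportError:
    # Fallback shim (an assumption of this sketch, not part of COMPSs):
    # the decorator does nothing and the code runs sequentially.
    def task(*_args, **_kwargs):
        return lambda func: func

    def compss_wait_on(obj):
        return obj


@task(returns=1)
def word_count(text):
    # Original sequential function body, unchanged.
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


@task(returns=1)
def merge(left, right):
    # Original sequential function body, unchanged.
    merged = dict(left)
    for word, n in right.items():
        merged[word] = merged.get(word, 0) + n
    return merged


def main():
    # The main program keeps its sequential structure; only the final
    # synchronisation call is added to retrieve the result.
    partial_a = word_count("to be or not to be")
    partial_b = word_count("to do or not to do")
    return compss_wait_on(merge(partial_a, partial_b))


if __name__ == "__main__":
    print(main())
```

The decorators and the single synchronisation call are the only additions; no data model or operator set is imposed on the function bodies.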
Porting an application to COMPSs typically takes a few iterations to identify the tasks and to optimise their parameters and data access (for example, deciding whether a task should access a file on a shared file system or receive the data as an object). After this phase, the code can be executed and tested on different infrastructures without further changes. The main benefit of this approach is that users achieve high performance and scalability without rewriting their code. Applications are easy to adapt and compose using the COMPSs task-based programming model, which is also considered a good fit for Big Data applications.
The COMPSs approach has been applied in many use cases in recent years, with successful results in terms of application usability, performance and adoption. The use cases described above are the ones we currently support in their production phase. The Guidance software is a joint effort between the COMPSs group, the Computational Genomics group of BSC and other European research groups; its use on the MareNostrum supercomputer helped to discover a new gene associated with the risk of type 2 diabetes, and the results of this study, published in Nature Communications, had a great impact on the scientific community (https://www.nature.com/articles/s41467-017-02380-9#Abs1).
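The porting choice mentioned above, between letting a task read a file and sending it the data as an object, can be sketched as follows. The `FILE_IN` direction and the keyword form `path=FILE_IN` follow the PyCOMPSs binding as generally documented and may differ between releases. The fallback shim, the function names and the temporary file are assumptions made for this example.

```python
import os
import tempfile

try:
    # PyCOMPSs binding, available when COMPSs is installed.
    from pycompss.api.task import task
    from pycompss.api.parameter import FILE_IN
    from pycompss.api.api import compss_wait_on
except ImportError:
    # Fallback shim (an assumption of this sketch, not part of COMPSs).
    FILE_IN = None

    def task(*_args, **_kwargs):
        return lambda func: func

    def compss_wait_on(obj):
        return obj


@task(path=FILE_IN, returns=1)
def count_lines_in_file(path):
    # The task receives a path and reads the file itself, for example
    # from a shared file system.
    with open(path) as handle:
        return sum(1 for _ in handle)


@task(returns=1)
def count_lines_in_object(lines):
    # The task receives the data itself as a Python object.
    return len(lines)


def main():
    lines = ["a", "b", "c"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "input.txt")
        with open(path, "w") as handle:
            handle.write("\n".join(lines) + "\n")
        # Wait inside the block so the file still exists when it is read.
        from_file = compss_wait_on(count_lines_in_file(path))
    from_object = compss_wait_on(count_lines_in_object(lines))
    return from_file, from_object


if __name__ == "__main__":
    print(main())
```

Both variants compute the same value. Which one performs better depends on the size of the data and on whether the target infrastructure offers a shared file system, which is why this choice is usually revisited during porting.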
The current release is 2.2; it can be downloaded from http://compss.bsc.es along with its documentation.
COMPSs is available as Open Source under the Apache 2 license. It can be easily installed on different kinds of infrastructure using packages available for different platforms. It is also available as a Docker image, which eases deployment.
Daniele Lezzi: firstname.lastname@example.org
View related publications
--> Badia, R.M., Conejero, J., Diaz, C., Ejarque, J., Lezzi, D., Lordan, F., Ramon-Cortes, C., Sirvent, R. COMP Superscalar, an interoperable programming framework. SoftwareX, Volumes 3-4, December 2015, Pages 32-36. DOI: 10.1016/j.softx.2015.10.004
--> Ramon-Cortes, C., Serven, A., Ejarque, J., Lezzi, D., Badia, R.M. Transparent Orchestration of Task-based Parallel Applications in Containers Platforms. Journal of Grid Computing (2018).