This document describes the implementation of the programming model prototypes developed as part of the EUBra-BIGSEA platform. The programming models offer the tools to abstract the data services for the user scenarios and to execute those scenarios on the QoS infrastructure. COMPSs and Apache Spark are the two frameworks available for porting the scenarios. This document, together with the description of the software components available in the project’s repository, fulfils milestone MS13, "First release of the programming layer".
- COMPSs applications can be written in sequential Java, Python or C/C++, and can make use of other higher-level software components, such as OPHIDIA workflows. The sequential code is annotated with data-flow information, which COMPSs uses to infer parallelism. COMPSs is platform-agnostic: it handles both the execution of the workflows and the negotiation with the computing infrastructure to request the resources they need. In this project, COMPSs has been extended to provide a Mesos framework and to support NoSQL storage. Additional dependencies can easily be included in COMPSs jobs through the use of Docker containers.
- Lemonade (Live Exploration and Mining Of Non-trivial Amount of Data from Everywhere) is a visual platform for distributed computing, aimed at enabling the implementation, experimentation, testing and deployment of data processing and machine learning applications. It provides developers with high-level abstractions, called operations, to build processing workflows using a graphical web interface. Lemonade currently generates Spark code, and it will be extended to support COMPSs workflows during the second year. Lemonade provides (or will provide) many operations typically used for Extraction, Transformation and Loading (ETL), including data transformation, machine learning, statistical analysis, text processing and data visualization. Lemonade is formed by a set of components that together provide the whole functionality.
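
The data-flow idea described in the COMPSs item above can be illustrated with a minimal sketch. It does not use the COMPSs API: the `task` decorator, the thread pool and the `square`, `add` and `sum_of_squares` functions are hypothetical stand-ins built on the Python standard library. It only shows how sequential-looking code can run in parallel once each call is treated as a task whose inputs may be the pending results of earlier tasks.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# Hypothetical local "infrastructure": a small pool of worker threads.
_pool = ThreadPoolExecutor(max_workers=4)


def task(func):
    """Hypothetical stand-in for a task annotation (not the COMPSs API).

    Calling the decorated function returns a Future immediately. Any
    argument that is itself a Future is a data dependency and is resolved
    before the function body runs.
    """
    @wraps(func)
    def submit(*args):
        def run():
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved)
        return _pool.submit(run)
    return submit


@task
def square(x):
    return x * x


@task
def add(a, b):
    return a + b


def sum_of_squares(values):
    """Sequential-looking code; `values` is assumed to be non-empty."""
    # The square() calls share no data, so they can run concurrently.
    partial = [square(v) for v in values]
    # Each add() depends on the previous total and on one partial result.
    total = partial[0]
    for p in partial[1:]:
        total = add(total, p)
    # Synchronization point: block until the final value is available.
    return total.result()


if __name__ == "__main__":
    print(sum_of_squares([1, 2, 3, 4]))
```

In this sketch the dependencies are discovered at call time from the arguments alone. The source describes COMPSs as relying on data-flow information added to the sequential code, so a real application declares more than this toy decorator does.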