EUROPE - BRAZIL COLLABORATION OF BIG DATA SCIENTIFIC RESEARCH THROUGH CLOUD-CENTRIC APPLICATIONS

Deliverable 7.6 Validation of the Requirements

The EUBra-BIGSEA project aims to develop a set of cloud services that empower Big Data analytics and ease the development of massive data processing applications. EUBra-BIGSEA will develop models, predictive and reactive cloud infrastructure QoS techniques, efficient and scalable Big Data operators, and a privacy and quality analysis framework, exposed to several programming environments. EUBra-BIGSEA aims to cover the general requirements of multiple application areas, although it will showcase its results in the treatment of massive connected-society information, and particularly in traffic recommendation.

The validation of the requirements is the last document of the Use Case work package, which aims at demonstrating the capabilities of the platform. This document serves as a compendium of the components developed and their applicability, and is a source of information for readers who want to exploit EUBra-BIGSEA components.

At the beginning of the project, 25 functional and non-functional requirements were identified from 11 user stories derived from the three use cases. After analysing those requirements, 18 technical requirements were identified. Those requirements have been addressed by the components of EUBra-BIGSEA.

EUBra-BIGSEA has developed a set of 19 components that address five layers (infrastructure management, programming models, security and privacy, high-level data analytic services and final user applications). These components provide platform-agnostic automatic deployment and configuration of virtual infrastructures, horizontal elasticity, vertical elasticity at hypervisor and framework level, QoS prediction and scheduling, privacy annotation, quality assurance, security mechanisms, vulnerability assessment, data analytic functions, parallel programming models, high-level design tools for data analytic workflows, a large set of high-level services for traffic data processing and modelling, entity matching services and three final user applications.

The EUBra-BIGSEA ecosystem as a whole addresses the requirements of the three use cases defined during the early stage of the project (data integration, descriptive models and predictive models). These requirements concern the way jobs are processed, the way data is ingested and stored, and the management of the results. The system demonstrates the execution of short jobs, parallel jobs, high-throughput jobs, QoS boundaries, remote datacube analysis, and the creation of parallel applications from a graphical user interface. Several demos have been built by integrating multiple components.

Finally, EUBra-BIGSEA has developed several applications that integrate multiple components of the platform. These applications have been used to validate and demonstrate the components and to exemplify how solutions can be built. The main examples are: tools for convenient management of the deployment of a fully elastic virtual cluster; horizontal and vertical elasticity for applications; data analytics with privacy annotations; and the composition of complex workflows through a graphical interface that generates parallel COMPSs and Spark code.
