Deliverable 4.2 Report about the final implementation of the integrated big and fast data eco-system

The integrated big and fast data eco-system is a central component of the EUBra-BIGSEA platform, devoted to the management and processing of big data. It integrates a wide set of technologies, libraries and services in a cloud environment, providing a general architecture that addresses the data challenges of the massive connected societies use cases by tackling big data issues such as volume, variety and velocity. The eco-system interacts with the other layers of the platform to support data processing in Quality of Service (QoS)-aware, cloud-based scenarios, while also taking data privacy and security into account. Additionally, its strong integration with the abstraction layer (e.g. Lemonade and COMPSs) allows application developers to use the eco-system technologies and libraries directly from the programming frameworks.

This document describes the final implementation of the data eco-system, starting from the requirements and initial design defined in deliverable D4.1 “Design of the integrated big and fast data eco-system”. Technology selection, deployment, testing and validation have played a key role in the implementation of the eco-system. Some applications have been developed at the data platform level to evaluate and validate the features of the eco-system, which supports, among others, fast data analysis over continuous streams from external data sources, general-purpose data mining and machine learning, and OLAP-based analysis of multidimensional data. To test the applications, testbed infrastructures have been deployed in both Europe and Brazil. Additionally, a toolbox of algorithms and libraries has been defined to support end-user analysis and to provide end users with a reference guide.