The exponential growth of available Open Data and the affordability of cloud computing resources offer an excellent opportunity to democratise data analysis. However, developing data analytics applications in the cloud is a complex task that requires high-level technical skills and faces essential challenges such as guaranteeing Quality of Service and minimising privacy risks.
EUBra-BIGSEA developed a framework, a platform and a library to ease the development of highly scalable, privacy-aware data analytics applications running on top of Quality of Service (QoS) cloud infrastructures, reducing development cycles and deployment costs. Although EUBra-BIGSEA targets Data Scientists in general, within the project timeline it was demonstrated by implementing a set of applications for analysing transportation data, with the aim of improving the experience of urban transportation users.
EUBra-BIGSEA has developed a Big Data application development framework comprising three primary assets that are not available on the market:
A QoS Data Analytics Platform, based on cloud services and easy to deploy, that provides transparent horizontal and vertical elasticity so that a running application can meet its execution deadline.
An open-source Data Analytics application development framework that provides a graphical interface for building data analytics workflows, with automatic discovery of parallelism, On-Line Analytical Processing (OLAP) functions, and services for multi-level privacy annotation, quality assurance and Entity Matching.
A toolbox of eight descriptive and predictive models for building traffic data analysis applications.
The software is available on the project's GitHub (https://github.com/eubr-bigsea) and DockerHub (https://hub.docker.com/u/eubrabigsea/) pages, as well as on the EUBra-BIGSEA website (http://www.eubra-bigsea.eu), along with papers and presentations. Video demos are available on the project's YouTube channel (https://goo.gl/FTCq3g).
The work required the collaboration of European and Brazilian experts, combining their expertise in data analysis, application performance modelling, privacy management, data analytics, parallel processing and cloud services.