PRIVAaaS is a software toolkit that provides a set of libraries and tools to control and reduce data leakage in the context of Big Data processing and, consequently, to protect sensitive information that is part of the EUBra-BIGSEA framework.
The anonymization process is divided into two perspectives, each modeling a different aspect of the anonymization problem: the first concerns the anonymization of the loaded input data, while the second concerns the anonymization of the data produced by the data processing algorithms. The result is output data that is anonymized for the intended usage scenario.
The process starts with a definition step, carried out by users with knowledge of the applicable privacy policies. These policies guide the anonymization process according to a set of rules implemented by an ontology. To maximize data utility while keeping disclosure risk low, the policies govern two key anonymization phases: 1) the anonymization of raw data, including how each algorithm may use each data set; and 2) the anonymization of the data that the analytics algorithm provides to the end user, so that knowledge extracted by the algorithm is not unduly accessed.
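The two phases can be illustrated with a minimal Python sketch. It is not the PRIVAaaS API: the policy format, the function names (anonymize_record, anonymize_output), the field names, and the sample records are all hypothetical, and the techniques shown (suppression, generalization, masking, and small-group suppression) are common anonymization techniques chosen only for illustration.

```python
from collections import Counter

# Hypothetical policy: field name -> technique.
# This is an illustrative format, not the PRIVAaaS policy format.
INPUT_POLICY = {
    "name": "suppress",      # drop direct identifiers
    "age": "generalize",     # replace with a 10-year range
    "zip_code": "mask",      # keep only the first two characters
    "diagnosis": "keep",     # needed by the analytics step
}


def anonymize_record(record, policy):
    """Phase 1: apply the per-field policy to one raw input record."""
    out = {}
    for field, value in record.items():
        # Fields without an explicit rule are suppressed by default.
        action = policy.get(field, "suppress")
        if action == "keep":
            out[field] = value
        elif action == "generalize":
            low = (int(value) // 10) * 10
            out[field] = f"{low}-{low + 9}"
        elif action == "mask":
            text = str(value)
            out[field] = text[:2] + "*" * (len(text) - 2)
        # "suppress": the field is simply omitted from the output.
    return out


def anonymize_output(counts, min_group_size):
    """Phase 2: withhold result groups smaller than min_group_size."""
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Made-up sample data, for illustration only.
raw = [
    {"name": "Ana", "age": 34, "zip_code": "13083", "diagnosis": "flu"},
    {"name": "Bruno", "age": 37, "zip_code": "13084", "diagnosis": "flu"},
    {"name": "Carla", "age": 52, "zip_code": "20040", "diagnosis": "asthma"},
]

cleaned = [anonymize_record(r, INPUT_POLICY) for r in raw]

# Stand-in for an analytics algorithm: count cases per (age range, diagnosis).
result = Counter((r["age"], r["diagnosis"]) for r in cleaned)

# Only groups with at least two records are released to the end user.
released = anonymize_output(result, min_group_size=2)
print(released)
```

In this sketch the single-record asthma group is withheld in the second phase, because releasing it could expose knowledge about one individual even though the input data was already anonymized in the first phase.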
PRIVAaaS was developed for the EUBra-BIGSEA use cases. Concretely, it is designed for scenarios that involve processing massive amounts of data that may contain, or lead to, privacy-sensitive information.
Furthermore, because data anonymization is currently relevant in several sectors, PRIVAaaS can be used in any sector that takes advantage of Big Data analytics techniques, such as government, health, and private companies.