In this blog post, we cover an issue that may not seem directly related to the goals of the MEDINA project, but which is at the heart of certification automation: when automated tools are applied to certification tasks, these tools must be trusted. Trust in a tool can depend on several criteria, including code reviews by auditors and thorough evaluations of the tool in question. Only when a tool’s quality is ensured can it be used in a (partly) automated certification process.
CERTIFICATION AND STATIC CODE ANALYSIS
Code analysis is an important approach for demonstrating compliance with certain certification requirements, e.g. in the EUCS. Recently, several software analysis tools have been developed in MEDINA that may cover certification criteria such as the following (taken from the EUCS draft):
- CKM-02.1: “The CSP shall define and implement strong encryption mechanisms for the transmission of cloud customer data over public Networks.”
- CKM-03.1: “The CSP shall document and implement procedures and technical safeguards to encrypt cloud customers’ data during storage.”
- DEV-05.3: “The tests of the security features shall cover all the specified inputs and all specified outcomes, including all specified error conditions.”
- OPS-19.1: “The CSP shall perform on a regular basis tests to detect publicly known vulnerabilities on the system components used to provide the cloud service, in accordance with policies for handling vulnerabilities (cf. OPS-17).”
The requirements shown above may be checked by software analysis, since they often depend on the source code and the configuration thereof.
In this blog post, we dive into a more fundamental problem we have encountered when developing and publishing such tools: they are mostly prototypical implementations of relatively new approaches, so their effectiveness cannot easily be demonstrated. Without knowing how effective the tools are, however, CSPs and auditors cannot trust them to cover requirements such as those listed above.
EXISTING TOOLS AND NOVEL TOOLS DEVELOPED IN MEDINA
One direction of novel research in MEDINA is the development of graph-based analysis tools, more specifically tools based on code property graphs. The basis for our work on static analysis tools was a public code property graph library – the cpg. Based on this library, we have developed the Cloud Property Graph (CloudPG) and the Privacy Property Graph (PPG). The PPG is a new approach for analyzing source code and its deployment for privacy weaknesses.
Testing this tool for effectiveness is, however, tricky. In the existing scientific literature, comparable approaches are often evaluated and compared based on existing benchmarks or testbeds. Such a possibility, however, did not exist for the Privacy Property Graph (PPG).
For this reason, we have developed a new testbed for such analysis tools – the Patient Community social network.
THE PATIENT COMMUNITY SOCIAL NETWORK (PCSN)
The purpose of the PCSN is to provide a social network to patients who can exchange information about their medical journey with other patients who have the same disease. This way, they can learn about what medications others take and which symptoms they experience.
The PCSN is made up of a number of microservices as shown in the figure below.
An example data flow through this architecture is as follows: a patient uploads patient health record (PHR) data via the phr manager, which stores the data in the PHR DB. Another patient who is in the same group can then request PHR data of his/her group via the group phr controller. This controller checks the group membership in the User DB and retrieves the appropriate PHR data from the PHR DB to be shown to the patient.
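The data flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual PCSN code: the class, its method names, and the in-memory dictionaries standing in for the User DB and PHR DB are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PatientCommunity:
    """In-memory stand-in for the PCSN services; all names are hypothetical."""

    user_db: dict[str, str] = field(default_factory=dict)       # user -> group
    phr_db: dict[str, list[str]] = field(default_factory=dict)  # user -> PHR entries

    def register(self, user: str, group: str) -> None:
        """Assumed helper: record a user's group membership in the User DB."""
        self.user_db[user] = group

    def upload_phr(self, user: str, record: str) -> None:
        """Role of the phr manager: store uploaded PHR data in the PHR DB."""
        self.phr_db.setdefault(user, []).append(record)

    def get_group_phr(self, requester: str) -> list[str]:
        """Role of the group phr controller: look up the requester's group in
        the User DB, then collect that group's PHR data from the PHR DB."""
        if requester not in self.user_db:
            raise PermissionError(f"unknown user: {requester}")
        group = self.user_db[requester]
        members = [u for u, g in self.user_db.items() if g == group]
        return [r for u in members for r in self.phr_db.get(u, [])]


pcsn = PatientCommunity()
pcsn.register("alice", "diabetes")
pcsn.register("bob", "diabetes")
pcsn.register("carol", "asthma")
pcsn.upload_phr("alice", "takes metformin")
pcsn.upload_phr("carol", "takes salbutamol")

print(pcsn.get_group_phr("bob"))  # ['takes metformin']
```

Note that `get_group_phr` has to touch both stores to answer a request, which is exactly the kind of spot where user data and health data can be linked.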
Evidently, this network has huge privacy risks, since medical data is among the most sensitive personal data. Other users could, for instance, profile another user, or someone with access to the server could abuse the sensitive data.
USING THE PCSN FOR EVALUATING SOFTWARE ANALYSIS TOOLS
To use the PCSN for evaluating a software analysis tool, the tool is applied to the source code of the PCSN and its results are compared to the implemented threats. Both the source code and the documentation of the implemented threats are included in the public GitHub project.
Consider the following examples:
- An administrator can uniquely identify a patient.
  - LINDDUN threat: Identifiable context
  - Implementation: The PHR manager can identify the user based on identifying metadata of the HTTP request, such as the IP address (and possibly other metadata).
- An administrator can link PHR data and user data.
  - LINDDUN threat: Linkability of retrieved data
  - Implementation: The group-phr-controller accesses both the User DB and the PHR DB and links the data.
Using this approach, we have implemented 26 out of the 35 LINDDUN threats to create an independent evaluation application. We hope that this implementation will help increase trust not only in tools for automated certification processes, but also in privacy and security analysis tools in general.
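Comparing a tool's results with the documented threats could look roughly like the sketch below. It assumes that both the implemented threats and the tool's findings can be reduced to sets of threat labels, and it uses precision and recall as metrics. Neither the label format nor the metrics are prescribed by the PCSN project, and the function and type names are hypothetical.

```python
from typing import NamedTuple


class Score(NamedTuple):
    precision: float   # share of reported findings that match an implemented threat
    recall: float      # share of implemented threats that the tool reported
    missed: list[str]  # implemented threats the tool did not report


def score_findings(implemented: set[str], reported: set[str]) -> Score:
    """Compare a tool's reported threats with the documented ground truth."""
    found = implemented & reported
    precision = len(found) / len(reported) if reported else 0.0
    recall = len(found) / len(implemented) if implemented else 0.0
    return Score(precision, recall, sorted(implemented - reported))


# Ground truth: the two example threats named in this post.
implemented = {"Identifiable context", "Linkability of retrieved data"}
# Hypothetical tool output: one correct finding and one unrelated finding.
reported = {"Linkability of retrieved data", "Hypothetical unrelated finding"}

score = score_findings(implemented, reported)
print(score)
```

In this made-up run the tool finds one of the two implemented threats and reports one finding that is not in the ground truth, so both precision and recall come out at 0.5.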
The PCSN implementation was also presented as a poster at the Conference on Computer and Communications Security 2022.
Wuyts, Kim. Patient Community system – example Privacy analysis. Available at: https://www.linddun.org/_files/ugd/cc602e_b4f5b1fc19da49a9bb8e39f0933cadab.pdf, accessed on 15.12.2022.
Kunz, I., Schneider, A., Banse, C., Weiss, K., & Binder, A. (2022, November). Poster: Patient Community – A Test Bed for Privacy Threat Analysis. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (pp. 3383-3385). Available at: https://medina-project.eu/sites/default/files/Publications/ccs-poster.pdf
Patient Community Example implementation. Available at: https://github.com/clouditor/patient-community-example