easy-smpc / easy-smpc-performance-evaluation

Performance evaluation of EasySMPC

License: Apache License 2.0


EasySMPC - No-Code Secure Multi-Party Computation

EasySMPC is an app for securely summing up distributed confidential data using Secure Multi-Party Computation (SMPC). It is designed to enable simple statistical analysis with maximum usability, easy access and a strict no-coding approach. Your parents should be able to use this and so should your physician!



Prerequisites

EasySMPC requires Java 14 or later. The Java runtime is bundled in our installer package. To use EasySMPC in automated mode, an e-mail account is required that is accessible via SMTP and IMAP from the system running EasySMPC. To use EasySMPC in the automated micro-services mode, a server backend is required.

To compile the app from source, the Maven build system is required in addition to the JDK.

Installation

EasySMPC does not need to be installed and can be run as a Java jar package. However, to increase portability, we bundle the required Java runtime with the application in installers that produce an executable for Linux, Windows and macOS. The installer does not need administrator privileges and should be run as a regular user. The installers for Windows and macOS are not signed, so a corresponding warning must be confirmed during installation.

Get binary installer

Check out our releases page for Windows, Linux and MacOS executables.

Build from Source

To build the executable yourself, clone this repository and build it with Maven (mvn package). The assembled executable will be placed in the target directory. At present, some tests occasionally fail; we are looking into it. Until those tests pass, please compile with mvn package -DskipTests.

To build the installer please build the jar package as described above and then use the supplied scripts for your target platform. E.g.:

cd installer && ./linux.sh

Features

EasySMPC was built to allow non-technical personnel in medical research to perform simple analyses without sharing their input data. We tried to keep the technical prerequisites very low by using e-mail, which in most cases is an already established and configured communication medium. As an alternative, a server based on RESTful micro-services can be used.

  • Easy to use
  • Communication using established channels, e.g. e-mails or RESTful micro-services
  • Microsoft Excel and CSV import and export
  • Automation of the protocol using IMAP-Mailboxes
  • Automatic Proxy-Detection

Security

EasySMPC uses Arithmetic Secret Sharing [DZS15], the arithmetic extension of the GMW protocol [GMW87], to achieve the private computation of the sums. It uses a ring of size 2^127 - 1, the 12th Mersenne prime.
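
As a rough illustration of the idea (not EasySMPC's actual implementation), additive secret sharing modulo the 12th Mersenne prime can be sketched in Python as follows. The function names share and reconstruct, the three-party setup and the input values are assumptions made for this example only.

```python
import secrets

# Modulus: the 12th Mersenne prime, 2^127 - 1.
MODULUS = 2**127 - 1


def share(secret, n_parties):
    """Split `secret` into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Add up shares modulo MODULUS to recover the shared value."""
    return sum(shares) % MODULUS


# Three parties with confidential inputs (made-up values).
inputs = [12, 30, 7]
n = len(inputs)

# Each party splits its input; distributed[i][j] is the share party i sends to party j.
distributed = [share(value, n) for value in inputs]

# Each party j adds up the shares it received and reveals only this partial sum.
partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]

# Combining the partial sums yields the total without revealing any single input.
total = reconstruct(partial_sums)
print(total)  # 49
```

Each individual share is a uniformly random number, so a party that sees only some of the shares learns nothing about the other parties' inputs; only the final sum is revealed.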

We are working on

  • Differential Privacy
  • Use EasySMPC with Slack/Mattermost/IRC/...

Quick start guide

  1. As a study initiator, click on Create new project and enter the names of all participants, their e-mail addresses, the variables to sum up and your own confidential data. For performing the computation, you can choose between an automatic or a manual mode.

    a) In manual mode the users need to exchange all messages by sending and receiving e-mails manually.

    b) In automatic mode, participants receive the initial message and import it into EasySMPC manually (see 2). All further messages are exchanged automatically. If the study initiator chooses automatic mode, all participants also have to choose automatic mode. The user name and connection details provided for the automatic mode are saved for future use; the password is not saved and must be re-entered if the study was closed in the meantime.

  2. As a participant, you copy the message you received via email into the clipboard, click on Participate in project in EasySMPC and paste the content. You will now see the study definition and can enter your own confidential data and connection details (if applicable).

  3. As an initiator or participant, you now click on Proceed. If running in automatic mode, EasySMPC will automatically perform all steps until the final result is displayed. If running in manual mode, all users need to send and receive the e-mails prepared by EasySMPC to perform the computation.

  4. The final perspective shows the result of the secure addition of all variables.

Tutorial

Please see the attached tutorial for a step-by-step guide using EasySMPC.

Command line version

There is also a command-line version of EasySMPC. After building it or downloading it from our release page, use the jar easy-smpc-cli.jar either as a creator or as a participant. The command-line version only supports the automated mode, for both e-mail and micro-services. Please note that when using the automated e-mail mode, the command-line version will delete all previous EasySMPC-related e-mails (the subject of these e-mails starts with [EasySMPC]).

E-mail mode

Creator

Execute the program with java -jar easy-smpc-cli.jar -create -connection-type email -l STUDY_NAME -b FILE_PATH_VARIABLES -d FILE_PATH_DATA -f LIST_PARTICIPANTS -a EMAIL_ADDRESS -p PASSWORD -i IMAP_HOST -x IMAP_PORT -y IMAP_ENCRYPTION -s SMTP_HOST -z SMTP_PORT -q SMTP_ENCRYPTION. The parameters have the following meaning:

  1. -create: Indicates the creation of a new EasySMPC project.
  2. -connection-type email: Indicates that the automated e-mail mode will be used
  3. -l STUDY_NAME: Name/title of the study. Must be consistently used by everyone, the creator and all participants.
  4. -b FILE_PATH_VARIABLES: The path to the Excel or CSV-files containing the variable names in the format firstFile.xlsx,secondFile.csv,... The data needs to be row-oriented and thus must have at least one column. In the case of more than one column, EasySMPC will concatenate all columns of a row into a single column and use this as the name of the variable. The variable names will be shared with all participants. (More setting options are available under the optional parameters.) For an example, see example-cli/variables.xlsx.
  5. -d FILE_PATH_DATA: The path to the Excel or CSV-files containing the creator's data in the format firstFile.xlsx,secondFile.csv,... The data needs to be row-oriented and must have at least two columns. The last column is regarded as the data value and therefore must always contain numbers only. A single dot as a decimal separator is allowed but not necessary. In case of exactly two columns, the first column will be regarded as the sole name. In the case of more than two columns, EasySMPC will concatenate all columns of a row but the last column to a single column and match this name with the variable names defined with the -b option. Variable names for which no value can be found will be set to zero. The data will not be shared with other participants (more setting options are available under the optional parameters). For an example, see example-cli/PKU comorbidities.xlsx.
  6. -f LIST_PARTICIPANTS: The names and e-mail addresses of the participants in the form name1,emailAddress1;name2,emailAddress2;name3,emailAddress3.... The first name and e-mail address will be the creator. In case of separate e-mail addresses for sending and receiving, the e-mail addresses in this parameter will be the e-mail addresses used for receiving (see parameters -a and -v).
  7. -a EMAIL_ADDRESS: E-mail address to be used for communication. If the parameter -v is set, this parameter will only be used as the receiving mail address.
  8. -p PASSWORD: Password of the e-mail address used. If the parameter -v is set, this parameter will be used as password for the receiving mail address provided with -a.
  9. -i IMAP_HOST: Hostname of the IMAP server.
  10. -x IMAP_PORT: Port of the IMAP server.
  11. -y IMAP_ENCRYPTION: Encryption used by the IMAP server, either SSL/TLS or STARTTLS. Use the value SSLTLS or STARTTLS.
  12. -s SMTP_HOST: Hostname of the SMTP server.
  13. -z SMTP_PORT: Port of the SMTP server.
  14. -q SMTP_ENCRYPTION: Encryption used by the SMTP server, either SSL/TLS or STARTTLS. Use the value SSLTLS or STARTTLS.
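
The way variable names (-b) and data rows (-d) are matched, as described in the list above, can be sketched in Python. This is only an illustration under assumptions: the function names are hypothetical, the rows are made up, and the columns are joined without a separator, which may differ from what EasySMPC actually does.

```python
def variable_names(rows):
    """Concatenate all columns of each row into a single variable name."""
    return ["".join(str(cell) for cell in row) for row in rows]


def match_data(names, data_rows):
    """Map each variable name to its value; names without a value are set to 0."""
    values = {}
    for row in data_rows:
        # All columns but the last form the name; the last column is the number.
        *name_cells, value = row
        values["".join(str(cell) for cell in name_cells)] = float(value)
    return {name: values.get(name, 0.0) for name in names}


# Made-up rows as they might be read from the variables and data files.
variable_rows = [["Diabetes", "2020"], ["Asthma", "2020"], ["Stroke", "2020"]]
data_rows = [["Diabetes", "2020", "14"], ["Asthma", "2020", "3.5"]]

names = variable_names(variable_rows)
matched = match_data(names, data_rows)
print(matched)  # {'Diabetes2020': 14.0, 'Asthma2020': 3.5, 'Stroke2020': 0.0}
```

In this sketch, "Stroke2020" has no row in the data file and is therefore set to zero.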

After running a successful EasySMPC process, check the result in the file result_<study name>_<timestamp>.xlsx or check the file easy-smpc.log for details of errors.

Please note that, in addition to the parameters mentioned above, the following optional parameters exist:

  1. -h: Pass this parameter if the data in the data and variables files are oriented horizontally.
  2. -e: Pass this parameter if the data in the data and variables files have headers, which need to be skipped.
  3. -j N_COLUMNS_TO_SKIP: Pass this parameter to skip the first n columns.
  4. -v EMAIL_ADDRESS_SENDING: Pass this parameter if the e-mail address used to send the e-mails is supposed to differ from the receiving e-mail address.
  5. -m PASSWORD_SENDING: Pass this as the password for the sending e-mail address if the parameter -v is set.
  6. -n LOGON_NAME_RECEIVING: Pass this parameter if the user name to the receiving e-mail servers deviates from the e-mail address used (e.g. the user name is name and not [email protected]). The receiving e-mail address still needs to be passed.
  7. -w LOGON_NAME_SENDING: The same as parameter -n but for the sending e-mail address. The sending e-mail address still needs to be passed. The user name is not copied from the parameter -n. Thus, if the same user name is used for receiving and sending, both parameters -n and -w need to be set.
  8. -t AUTH_MECHANISMS_RECEIVING: Pass this parameter to set the IMAP authentication mechanisms of the receiving e-mail account. For details, we refer to the property mail.imap.auth.mechanisms in the Jakarta e-mail documentation.
  9. -u AUTH_MECHANISMS_SENDING: Pass this parameter to set the SMTP authentication mechanisms of the sending e-mail account. For details, we refer to the property mail.smtp.auth.mechanisms in the Jakarta e-mail documentation.

Participant

Execute the program with java -jar easy-smpc-cli.jar -participate -connection-type email -l STUDY_NAME -d FILE_PATH_DATA -o PARTICIPANT_NAME -a EMAIL_ADDRESS -p PASSWORD -i IMAP_HOST -x IMAP_PORT -y IMAP_ENCRYPTION -s SMTP_HOST -z SMTP_PORT -q SMTP_ENCRYPTION. Most parameters are explained in the section above; the remaining parameters are described below:

  1. -participate: Indicates the participation in a (new) EasySMPC project.
  2. -o PARTICIPANT_NAME: Name of the participant as defined in the option -f by the creator.

After execution, check the result in the file result_<study name>_<timestamp>.xlsx or check the file easy-smpc.log for details of errors.
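
To make the relation between the creator's -f list and a participant's -o name concrete, here is a small Python sketch that parses a list in the form name1,emailAddress1;name2,emailAddress2;... and checks whether a participant name is defined in it. The function name parse_participants and the example.org addresses are hypothetical; EasySMPC's own parsing may differ.

```python
def parse_participants(spec):
    """Parse 'name1,address1;name2,address2;...' into (name, address) pairs."""
    participants = []
    for entry in spec.split(";"):
        if not entry.strip():
            continue  # tolerate a trailing semicolon
        name, address = entry.split(",", 1)
        participants.append((name.strip(), address.strip()))
    return participants


# Hypothetical value of the -f parameter.
spec = "Creator,creator@example.org;Participant1,p1@example.org;Participant2,p2@example.org"
participants = parse_participants(spec)

# The first entry is the creator.
creator = participants[0]

# A name passed with -o must be one of the names defined with -f.
known_names = {name for name, _ in participants}
print("Participant1" in known_names)  # True
print("Participant3" in known_names)  # False
```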

Example

Data for an example can be found in the folder example-cli. An exemplary process with this data can be started with these three commands:

  1. java -jar easy-smpc-cli.jar -create -connection-type email -l "Example Study" -b ./example-cli/variables.xlsx -d "./example-cli/PKU comorbidities.xlsx" -f "Creator,[email protected];Participant1,[email protected];Participant2,[email protected]" -a [email protected] -p thePassword -i imap.gmail.com -x 993 -y SSLTLS -s smtp.gmail.com -z 465 -q SSLTLS
  2. java -jar easy-smpc-cli.jar -participate -connection-type email -l "Example Study" -d "./example-cli/PKU comorbidities.xlsx" -o Participant1 -a [email protected] -p thePassword -i imap.ionos.de -x 993 -y SSLTLS -s smtp.ionos.de -z 465 -q SSLTLS
  3. java -jar easy-smpc-cli.jar -participate -connection-type email -l "Example Study" -d "./example-cli/PKU comorbidities.xlsx" -o Participant2 -a [email protected] -p thePassword -i imap.ionos.de -x 993 -y SSLTLS -s smtp.ionos.de -z 465 -q SSLTLS

The three commands are meant to be run on different computers. If you want to try them on a single computer (i.e. as a dry run), please use a different folder for each of the three parties, since otherwise errors can occur when the log and result files are written. Also, in this minimal test the same data file example-cli/PKU comorbidities.xlsx is used for each party. In real-world usage, each party would use a file with different data.
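
For a dry run on a single computer, the separate folders can be prepared with a few lines of Python. This is only a sketch: it assumes that the log and result files are written to the current working directory of each process, and the folder names and the helper dry_run_layout are hypothetical.

```python
import tempfile
from pathlib import Path


def dry_run_layout(base_dir, parties):
    """Create one working directory per party and return them by party name."""
    layout = {}
    for party in parties:
        workdir = Path(base_dir) / party
        workdir.mkdir(parents=True, exist_ok=True)
        layout[party] = workdir
    return layout


# A scratch directory for the dry run (any writable directory works).
base = tempfile.mkdtemp(prefix="easysmpc-dry-run-")
layout = dry_run_layout(base, ["Creator", "Participant1", "Participant2"])

for party, workdir in layout.items():
    # Start each of the three commands with its own working directory,
    # e.g. subprocess.Popen(command_for_party, cwd=workdir).
    print(party, workdir)
```

When the commands are started from these folders, relative paths such as ./example-cli/variables.xlsx have to be adjusted or replaced by absolute paths.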

Micro-services mode

Creator

Execute the program with java -jar easy-smpc-cli.jar -create -connection-type easybackend -l STUDY_NAME -b FILE_PATH_VARIABLES -d FILE_PATH_DATA -f LIST_PARTICIPANTS -i BACKEND_SERVER_NAME -p PASSWORD. The parameters have the following meaning:

  1. -create: Indicates the creation of a new EasySMPC project.
  2. -connection-type easybackend: Indicates that the automated micro-services mode will be used.
  3. -l STUDY_NAME: Name/title of the study. Must be consistently used by everyone, the creator and all participants.
  4. -b FILE_PATH_VARIABLES: The path to the Excel or CSV-files containing the variable names in the format firstFile.xlsx,secondFile.csv,... The data needs to be row-oriented and thus must have at least one column. In the case of more than one column, EasySMPC will concatenate all columns of a row into a single column and use this as the name of the variable. The variable names will be shared with all participants. (More setting options are available under the optional parameters.) For an example, see example-cli/variables.xlsx.
  5. -d FILE_PATH_DATA: The path to the Excel or CSV-files containing the creator's data in the format firstFile.xlsx,secondFile.csv,... The data needs to be row-oriented and must have at least two columns. The last column is regarded as the data value and therefore must always contain numbers only. A single dot as a decimal separator is allowed but not necessary. In case of exactly two columns, the first column will be regarded as the sole name. In the case of more than two columns, EasySMPC will concatenate all columns of a row but the last column to a single column and match this name with the variable names defined with the -b option. Variable names for which no value can be found will be set to zero. The data will not be shared with other participants (more setting options are available under the optional parameters). For an example, see example-cli/PKU comorbidities.xlsx.
  6. -f LIST_PARTICIPANTS: The names and e-mail addresses of the participants in the form name1,emailAddress1;name2,emailAddress2;name3,emailAddress3.... The first name and e-mail address will be the creator. In case of separate e-mail addresses for sending and receiving, the e-mail addresses in this parameter will be the e-mail addresses used for receiving (see parameters -a and -v).
  7. -i BACKEND_SERVER_NAME: The name of the server/micro-services backend as DNS name or ip-address. This can also include a port number (e.g. https://server.org:port). Note: Only HTTPS, no HTTP addresses are allowed.
  8. -p PASSWORD: Password of the e-mail address used. If the parameter -v is set, this parameter will be used as password for the receiving mail address provided with -a.

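The HTTPS-only rule for the backend address can be checked before starting the program. The following Python sketch uses the standard library's urllib.parse; the function name is_valid_backend_address is hypothetical, and the sketch assumes that the address is passed with an explicit https:// prefix as in the example above.

```python
from urllib.parse import urlparse


def is_valid_backend_address(address):
    """Accept only HTTPS addresses that contain a host name or IP address."""
    parsed = urlparse(address)
    return parsed.scheme == "https" and bool(parsed.hostname)


print(is_valid_backend_address("https://server.org:8443"))  # True
print(is_valid_backend_address("http://server.org"))        # False
```
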
After running a successful EasySMPC process, check the result in the file result_<study name>_<timestamp>.xlsx or check the file easy-smpc.log for details of errors.

Participant

Execute the program with java -jar easy-smpc-cli.jar -participate -connection-type easybackend -l STUDY_NAME -d FILE_PATH_DATA -i BACKEND_SERVER_NAME -o PARTICIPANT_NAME -a EMAIL_ADDRESS -p PASSWORD. Most parameters are explained in the section above; the remaining parameters are described below:

  1. -participate: Indicates the participation in a (new) EasySMPC project.
  2. -o PARTICIPANT_NAME: Name of the participant as defined in the option -f by the creator.
  3. -a EMAIL_ADDRESS: E-mail address of the participant as defined in the option -f by the creator.

After execution, check the result in the file result_<study name>_<timestamp>.xlsx or check the file easy-smpc.log for details of errors.

Troubleshooting

Neither an error nor a result in automated mode

Should the program wait for an unreasonably long time without reporting an error, first check whether EasySMPC-related e-mails have landed in a spam folder (the subject of these e-mails starts with [EasySMPC]). If so, copy them into the regular inbox. If nothing can be found in the spam folder, it is likely that the different program instances are working on different EasySMPC studies with the same name. To solve this, either (1) delete all e-mails in all mailboxes whose subject starts with [EasySMPC] or (2) restart the process with a new study name for all participants as well as the creator.
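
When checking a mailbox by hand becomes tedious, the relevant messages can be picked out by their subject prefix. The Python sketch below works on a plain list of subject lines, however they were obtained (for example copied from a mail client); the function name and the sample subjects are made up for illustration.

```python
SUBJECT_PREFIX = "[EasySMPC]"


def easysmpc_subjects(subjects):
    """Return the subjects that start with the EasySMPC prefix."""
    return [s for s in subjects if s.strip().startswith(SUBJECT_PREFIX)]


# Made-up subject lines from a spam folder.
spam_folder = [
    "You have won a prize",
    "[EasySMPC] Example Study",
    "Re: meeting on Friday",
]

found = easysmpc_subjects(spam_folder)
print(found)  # ['[EasySMPC] Example Study']
```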

Command-line version: Error writing the result into an Excel file

When executing the command-line version on Linux systems the following entries can appear in the log:

2022-01-01 12:00:00.000 INFO Start calculating and writing result
Exception in thread "Thread-1" java.lang.InternalError: java.lang.reflect.InvocationTargetException
        at java.desktop/sun.font.FontManagerFactory$1.run(FontManagerFactory.java:86)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:312)
        at java.desktop/sun.font.FontManagerFactory.getInstance(FontManagerFactory.java:74)
        at java.desktop/java.awt.Font.getFont2D(Font.java:497)
        at java.desktop/java.awt.Font.canDisplayUpTo(Font.java:2244)
        at java.desktop/java.awt.font.TextLayout.singleFont(TextLayout.java:469)
        at java.desktop/java.awt.font.TextLayout.<init>(TextLayout.java:530)
        at org.apache.poi.ss.util.SheetUtil.getDefaultCharWidth(SheetUtil.java:273)
        at org.apache.poi.ss.util.SheetUtil.getColumnWidth(SheetUtil.java:248)
        at org.apache.poi.ss.util.SheetUtil.getColumnWidth(SheetUtil.java:233)
        at org.apache.poi.xssf.usermodel.XSSFSheet.autoSizeColumn(XSSFSheet.java:555)
        at org.apache.poi.xssf.usermodel.XSSFSheet.autoSizeColumn(XSSFSheet.java:537)
        at org.bihealth.mi.easysmpc.dataexport.ExportExcel.exportData(ExportExcel.java:70)
        at org.bihealth.mi.easysmpc.cli.User.exportResult(User.java:414)
        at org.bihealth.mi.easysmpc.cli.User.performCommonSteps(User.java:382)
        at org.bihealth.mi.easysmpc.cli.UserCreating$1.run(UserCreating.java:101)
        at java.base/java.lang.Thread.run(Thread.java:832)
        ...

To resolve this, please install the package libfontconfig1 on your system.

Command-line version: Error message "Already existing first message for scope X and receiver Y"

When using the micro-services mode, the following error message can appear during the creation process:

Already existing first message for scope X and receiver Y

This indicates that for at least one participant an initial message with the same project name already exists and therefore no new initial message can be added. Please proceed with the already started project, delete the already started project or start a new project with a new project name.

Contact

If you have questions or encounter any problems, we invite you to open an issue on GitHub. This allows other users to collaborate and (hopefully) answer your question in a timely manner. If your request contains confidential information or is not suited for a public issue, send us an e-mail.

License

This software is licensed under the Apache License 2.0. The full text is accessible in the LICENSE file.

Acknowledgments

This project is partly financed by the "Collaboration on Rare Diseases" of the Medical Informatics Initiative, funded by the German Federal Ministry of Education and Research (BMBF).

Cite as

If you want to cite our software, you can use the following citation:

Wirth, F.N., Kussel, T., Müller, A. et al. EasySMPC: a simple but powerful no-code tool for practical secure multiparty computation. BMC Bioinformatics 23, 531 (2022). https://doi.org/10.1186/s12859-022-05044-8
