
Reproducing Federated Learning in TensorFlow

Open in Colab

An unofficial TensorFlow implementation of federated learning from Communication-Efficient Learning of Deep Networks from Decentralized Data (AISTATS 2017). I reproduced some of the paper's experimental results around the beginning of 2021; this repository provides the implementation along with the reproduced results.

image

Image source: https://www.netapp.com/blog/future-of-AI-federated-learning/

Paper: McMahan, H. B. et al. “Communication-Efficient Learning of Deep Networks from Decentralized Data.” International Conference on Artificial Intelligence and Statistics (2017) > https://proceedings.mlr.press/v54/mcmahan17a/mcmahan17a.pdf
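
The core of the paper is the FedAvg algorithm: in each communication round the server sends the current global weights to a random fraction C of the clients, each selected client trains locally for E epochs with minibatch size B on its own data, and the server replaces the global model with an average of the returned weights, weighted by each client's number of examples. A minimal NumPy sketch of that server-side aggregation step (illustrative only; the function and variable names are not taken from this repository):

import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg server step).

    client_weights: list of per-client weight lists, one np.ndarray per layer
                    (the format returned by Keras model.get_weights())
    client_sizes:   number of training examples held by each client
    """
    total = float(sum(client_sizes))
    num_layers = len(client_weights[0])
    averaged = []
    for layer in range(num_layers):
        # Sum each client's layer weights, scaled by its share of the data.
        layer_avg = sum(
            (n_k / total) * weights[layer]
            for weights, n_k in zip(client_weights, client_sizes)
        )
        averaged.append(layer_avg)
    return averaged

# Example: two clients holding unequal amounts of data.
w_client_a = [np.ones((2, 2)), np.zeros(2)]
w_client_b = [np.zeros((2, 2)), np.ones(2)]
global_weights = federated_average([w_client_a, w_client_b], [600, 300])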

Summary of the paper

I summarize the paper in the following slides:

https://github.com/willyfh/federated-learning-tensorflow/blob/main/doc/Federated%20Learning%20-%20Summary.pdf

Requirements

  • matplotlib==3.3.4
  • tensorflow==2.4.1
  • keras==2.4.3
  • scikit-learn==0.24.1
  • tqdm==4.57.0
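
A quick sanity check that the installed versions match the pinned requirements (just a check, not part of the repository):

import keras
import matplotlib
import sklearn
import tensorflow as tf
import tqdm

print("tensorflow   :", tf.__version__)          # expected 2.4.1
print("keras        :", keras.__version__)        # expected 2.4.3
print("matplotlib   :", matplotlib.__version__)   # expected 3.3.4
print("scikit-learn :", sklearn.__version__)      # expected 0.24.1
print("tqdm         :", tqdm.__version__)         # expected 4.57.0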

Python Files (.py)

The Python files (federated_train.py & plot_multi_cases.py) are provided for simpler execution. The implementation is essentially the same as in the provided Jupyter Notebook file (which was actually used to obtain the results shown in this project).

  1. Install the required libraries:

pip install -r requirements.txt

  2. Execute federated_train.py to train the model. Example for the 2NN (MLP), non-IID case with C=0, E=1, B=∞ (a single batch), learning rates {0.1, 0.01}, target accuracy 0.93, and 200 communication rounds:

python federated_train.py --model mlp --data_dist noniid --n_clients 100 --c 0 --e 1 --b -1 --c_rounds 200 --lr 0.1 0.01 --target_acc 0.93

  3. After the execution completes, the training result file (.pickle) and the plots (train-test plot) are stored in the same directory as the Python file.
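
The exact structure stored in the .pickle file is defined by federated_train.py, so no specific keys are assumed here; a hedged way to open and inspect a result file (the filename below only follows the naming pattern visible in the plotting example further down and is purely illustrative):

import pickle

# Hypothetical filename matching the command above; the actual name depends on
# the chosen batch size, learning rate, epochs, model, and data distribution.
with open("train_result_all_0.1_1_mlp_noniid.pickle", "rb") as f:
    result = pickle.load(f)

# Print the loaded object to see what federated_train.py actually stores
# before relying on any specific keys.
print(type(result))
print(result)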

Arguments info:

  --model {mlp,cnn}        mlp or cnn
  --data_dist {iid,noniid} iid or noniid
  --n_clients              number of clients
  --c                      client fraction in [0.0, 1.0]
  --e                      number of local epochs per client
  --b                      batch size; pass -1 to use a single batch (B=∞)
  --c_rounds               number of communication rounds
  --lr                     learning rate(s); separate multiple values with spaces, e.g. --lr 0.1 0.001
  --target_acc             target test accuracy in [0.0, 1.0]
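
For readers unfamiliar with these flags, here is a minimal argparse sketch consistent with the list above (an illustration of how the CLI behaves, not the repository's actual parser; the defaults are arbitrary):

import argparse

parser = argparse.ArgumentParser(description="Federated training (FedAvg) on MNIST")
parser.add_argument("--model", choices=["mlp", "cnn"], required=True)
parser.add_argument("--data_dist", choices=["iid", "noniid"], required=True)
parser.add_argument("--n_clients", type=int, default=100, help="number of clients")
parser.add_argument("--c", type=float, default=0.1, help="client fraction in [0.0, 1.0]")
parser.add_argument("--e", type=int, default=1, help="local epochs per client")
parser.add_argument("--b", type=int, default=10, help="batch size; -1 means a single batch")
parser.add_argument("--c_rounds", type=int, default=200, help="communication rounds")
parser.add_argument("--lr", type=float, nargs="+", default=[0.1], help="one or more learning rates")
parser.add_argument("--target_acc", type=float, default=0.99, help="target test accuracy in [0.0, 1.0]")

args = parser.parse_args()
print(args)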

To generate the multi-case plot, you need the training result files for all cases, created with the steps above. Example for the CNN IID case:

python plot_multi_cases.py --model cnn --data_dist iid --result_files train_result_all_0.1_1_cnn_iid.pickle train_result_50_0.1_1_cnn_iid.pickle train_result_10_0.1_1_cnn_iid.pickle train_result_all_0.1_20_cnn_iid.pickle train_result_50_0.1_20_cnn_iid.pickle train_result_10_0.1_20_cnn_iid.pickle

Arguments info:

  --model {mlp,cnn}        mlp or cnn
  --data_dist {iid,noniid} iid or noniid
  --result_files           training result files; separate multiple files with spaces, e.g. --result_files file1.pickle file2.pickle
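
plot_multi_cases.py reads those pickle files and overlays the curves from all cases. As a rough illustration of what such a multi-case plot involves (the key name "test_accuracy" and the figure layout are assumptions, not the repository's actual format):

import pickle
import matplotlib.pyplot as plt

result_files = [
    "train_result_all_0.1_1_cnn_iid.pickle",
    "train_result_10_0.1_20_cnn_iid.pickle",
]

plt.figure()
for path in result_files:
    with open(path, "rb") as f:
        result = pickle.load(f)
    # Hypothetical key: a list of test accuracies, one per communication round.
    test_acc = result["test_accuracy"]
    plt.plot(range(1, len(test_acc) + 1), test_acc, label=path)

plt.xlabel("Communication rounds")
plt.ylabel("Test accuracy")
plt.legend()
plt.savefig("multi_cases_cnn_iid.png")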

Jupyter Notebook File (.ipynb)

The results shown in this project were obtained using the provided Jupyter Notebook file (federated_learning.ipynb). It was executed and tested in the Google Colab environment.

  1. Upload federated_learning.ipynb to Google Colab / Jupyter Notebook
  2. Change the parameters as needed in the "Parameters" section.
  3. Click Runtime > Run all
  4. The training result file (.pickle) will be saved and the plots (train-test plot) will be displayed in the notebook.

To generate the multi-case plot, upload all of the training result files (for all cases) created in the steps above, then run the last section of the notebook (Generate Multi Cases Plots) to generate the plots.

Reproduced Results

Reproducing Table 1 in the paper for C = 0.0, 0.1, 1.0.

image

Due to limited computational power, I ran only 200-500 rounds with a single learning rate to produce the following results. Consequently, I adjusted the target test accuracy to 93% and 97% for 2NN and CNN, respectively.

Generally, the results are similar to the paper's. With B=∞, there is only a small advantage to increasing C. Using the smaller batch size B=10 shows a significant improvement for C ≥ 0.1, especially in the non-IID case.

Please see the appendix below for the accuracy and loss plots of each case in the table above.
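
For context, the IID and non-IID splits in the paper are constructed as follows: in the IID case the shuffled MNIST training set is divided evenly over 100 clients (600 examples each), while in the non-IID case the examples are sorted by digit label, cut into 200 shards of 300 examples, and each client receives 2 shards, so most clients see only two digits. A rough sketch of that non-IID partitioning (the repository may implement it differently):

import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()

n_clients, n_shards, shard_size = 100, 200, 300

# Sort training examples by digit label, then cut them into shards.
order = np.argsort(y_train)
shard_indices = order[: n_shards * shard_size].reshape(n_shards, shard_size)

# Deal 2 shards to each client, so most clients hold examples of only two digits.
rng = np.random.default_rng(0)
shard_ids = rng.permutation(n_shards).reshape(n_clients, 2)
client_indices = {
    client: np.concatenate([shard_indices[s] for s in shards])
    for client, shards in enumerate(shard_ids)
}

print(len(client_indices[0]), "examples on client 0, labels:",
      np.unique(y_train[client_indices[0]]))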

Reproducing Figure 2 in the paper for MNIST CNN with (B, E) = (10, 1), (10, 20), (50, 1), (50, 20), (∞, 1), (∞, 20)

For this case, I ran only 200 rounds with a single learning rate due to limited computational power.

Again, the results are generally similar to the paper's. With C=0.1, adding more local updates per round (increasing E and decreasing B) produces a significant decrease in communication cost.
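
To see why increasing E and decreasing B helps, note how much local computation each (B, E) choice packs into a single round. With 100 clients and 60,000 MNIST training examples, each client holds about 600 examples, so it performs roughly E × ceil(600 / B) local gradient steps between two communications (with B=∞ counted as one full-batch step per epoch). A quick back-of-the-envelope comparison:

import math

n_per_client = 600  # 60,000 MNIST examples spread over 100 clients

def local_updates(e, b):
    """Approximate number of local gradient steps per client per round."""
    batches = 1 if b is None else math.ceil(n_per_client / b)  # b=None models B=inf
    return e * batches

for e, b in [(1, None), (20, None), (1, 50), (20, 50), (1, 10), (20, 10)]:
    label = "inf" if b is None else b
    print(f"E={e:>2}, B={label:>3}: {local_updates(e, b):>4} local updates per round")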

Appendix

2NN, IID, E=1, B=∞, C=0

image

2NN, IID, E=1, B=∞, C=0.1

image

2NN, IID, E=1, B=∞, C=1

image

2NN, IID, E=1, B=10, C=0

image

2NN, IID, E=1, B=10, C=0.1

image

2NN, IID, E=1, B=10, C=1

image

2NN, Non-IID, E=1, B=∞, C=0

image

2NN, Non-IID, E=1, B=∞, C=0.1

image

2NN, Non-IID, E=1, B=∞, C=1

image

2NN, Non-IID, E=1, B=10, C=0

image

2NN, Non-IID, E=1, B=10, C=0.1

image

2NN, Non-IID, E=1, B=10, C=1

image

CNN, IID, E=5, B=∞, C=0

image

CNN, IID, E=5, B=∞, C=0.1

image

CNN, IID, E=5, B=∞, C=1

image

CNN, IID, E=5, B=10, C=0

image

CNN, IID, E=5, B=10, C=0.1

image

CNN, IID, E=5, B=10, C=1

image

CNN, Non-IID, E=5, B=∞, C=0

image

CNN, Non-IID, E=5, B=∞, C=0.1

image

CNN, Non-IID, E=5, B=∞, C=1

image

CNN, Non-IID, E=5, B=10, C=0

image

CNN, Non-IID, E=5, B=10, C=0.1

image

CNN, Non-IID, E=5, B=10, C=1

image
