Comments (10)
Format: (before local fine-tuning) -> (after local fine-tuning) So if finetune_epoch = 0, x.xx% -> 0.00% is normal.
☝ finetune_epoch is set to 0 in template.yml (line 24, commit b19d935).
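For context, the two numbers in that format come from evaluating the received global model, then (only if finetune_epoch > 0) fine-tuning it locally and evaluating again. A minimal sketch of that flow, assuming hypothetical evaluate/finetune_one_epoch helpers; this is not FL-bench's actual code:

# finetune_flow.py — illustrative sketch of where the two numbers come from;
# `evaluate` and `finetune_one_epoch` are hypothetical stand-ins, not FL-bench's API
from typing import Callable, Tuple

def test_client(
    evaluate: Callable[[], float],           # returns test accuracy (%) of the current model
    finetune_one_epoch: Callable[[], None],  # one epoch of local training on the client
    finetune_epoch: int,
) -> Tuple[float, float]:
    acc_before = evaluate()                  # "before local fine-tuning"
    acc_after = 0.0                          # stays 0.00% when finetune_epoch == 0
    for _ in range(finetune_epoch):
        finetune_one_epoch()                 # local fine-tuning on the client's own data
    if finetune_epoch > 0:
        acc_after = evaluate()               # "after local fine-tuning"
    return acc_before, acc_after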
This issue has been closed due to a long time with no response.
I changed it as you recommended but got the same results. It seems the fine-tuning is still not running.
Sorry for my late response. What's your run command? If you set finetune_epoch, you need to specify the config file in the command, like python main.py fedavg your_config.yml.
I used the same command you mentioned. My config is:
# Full explanations are listed in README.md
mode: parallel # [serial, parallel]
parallel: # It's fine to keep these configs.
  # Go check the doc of `https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html` for more details.
  ray_cluster_addr: null # [null, auto, local]
  # `null` implies that all cpus/gpus are included.
  num_cpus: null
  num_gpus: null
  # Should be set larger than 1, or training mode falls back to `serial`.
  # Setting a larger `num_workers` can further boost efficiency, at the cost of giving each worker fewer computational resources.
  num_workers: 2
common:
  dataset: mnist
  seed: 42
  model: lenet5
  join_ratio: 0.1
  global_epoch: 100
  local_epoch: 5
  finetune_epoch: 20
  batch_size: 32
  test_interval: 100
  straggler_ratio: 0
  straggler_min_local_epoch: 0
  external_model_params_file: ""
  optimizer:
    name: sgd # [sgd, adam, adamw, rmsprop, adagrad]
    lr: 0.01
    dampening: 0 # SGD
    weight_decay: 0
    momentum: 0 # [SGD, RMSprop]
    alpha: 0.99 # RMSprop
    nesterov: false # SGD
    betas: [0.9, 0.999] # [Adam, AdamW]
    amsgrad: false # [Adam, AdamW]
  lr_scheduler:
    name: step # null for deactivating
    step_size: 10
  eval_test: true
  eval_val: false
  eval_train: false
  verbose_gap: 10
  visible: false
  use_cuda: true
  save_log: true
  save_model: false
  save_fig: true
  save_metrics: true
  check_convergence: true
# You can also set specific arguments for FL methods here.
# FL-bench accesses FL method arguments via `args.<method>.<arg>`,
# e.g.
fedprox:
  mu: 0.01
pfedsim:
  warmup_round: 0.7
# ...
# NOTE: For those unmentioned arguments, the default values are set in `get_<method>_args()` in `src/server/<method>.py`.
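As a rough picture of the `args.<method>.<arg>` note above: each method's defaults get merged with whatever the corresponding YAML section provides. A hedged sketch of that lookup, illustrative only; the default value shown is hypothetical and the real logic lives in `get_<method>_args()`:

# method_args.py — illustration of the defaults-plus-overrides idea, not FL-bench's code
from argparse import Namespace
from typing import Optional

def get_fedprox_args(overrides: Optional[dict] = None) -> Namespace:
    defaults = {"mu": 1.0}            # hypothetical default for an unmentioned argument
    defaults.update(overrides or {})  # the `fedprox:` section of the YAML wins
    return Namespace(**defaults)

config = {"fedprox": {"mu": 0.01}}    # as parsed from the config above
args = get_fedprox_args(config.get("fedprox"))
print(args.mu)                        # 0.01 — the YAML override, not the default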
I tested it in my workspace and everything is fine.
Here are the result, config, and commands to reproduce it:
Result
==================== FedAvg Experiment Results: ====================
Format: (before local fine-tuning) -> (after local fine-tuning) So if finetune_epoch = 0, x.xx% -> 0.00% is normal.
{100: {'all_clients': {'test': {'loss': '0.3364 -> 0.3116', 'accuracy': '91.44% -> 92.18%'}}}}
========== FedAvg Convergence on train clients ==========
test (before local training):
10.0%(11.65%) at epoch: 0
20.0%(27.31%) at epoch: 3
30.0%(35.33%) at epoch: 4
40.0%(47.46%) at epoch: 5
60.0%(63.21%) at epoch: 7
70.0%(75.43%) at epoch: 9
80.0%(86.50%) at epoch: 18
90.0%(90.34%) at epoch: 37
test (after local training):
80.0%(82.13%) at epoch: 0
90.0%(91.06%) at epoch: 1
==================== FedAvg Max Accuracy ====================
all_clients:
(test) before fine-tuning: 91.44% at epoch 100
(test) after fine-tuning: 92.18% at epoch 100
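The convergence table above lists, for each 10% accuracy milestone, the first global epoch whose evaluation reached it. A minimal sketch of how such a table can be computed from an accuracy-per-epoch curve; this is illustrative, not FL-bench's own code:

# convergence.py — first epoch at which each accuracy threshold is reached
def convergence_table(acc_per_epoch):
    """Return {threshold: first epoch whose accuracy reaches it}."""
    table = {}
    for threshold in range(10, 100, 10):          # 10%, 20%, ..., 90%
        for epoch, acc in enumerate(acc_per_epoch):
            if acc >= threshold:
                table[threshold] = epoch          # e.g. "90.0%(90.34%) at epoch: 37"
                break
    return table

# Toy curve: 90% is first reached at epoch 3
print(convergence_table([11.65, 27.31, 85.50, 90.34, 91.44]))
# {10: 0, 20: 1, 30: 2, 40: 2, 50: 2, 60: 2, 70: 2, 80: 2, 90: 3}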
Config
# cfg.yml
mode: parallel # [serial, parallel]
parallel: # It's fine to keep these configs.
  # Go check the doc of `https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html` for more details.
  ray_cluster_addr: null # [null, auto, local]
  # `null` implies that all cpus/gpus are included.
  num_cpus: null
  num_gpus: null
  # Should be set larger than 1, or training mode falls back to `serial`.
  # Setting a larger `num_workers` can further boost efficiency, at the cost of giving each worker fewer computational resources.
  num_workers: 2
common:
  dataset: mnist
  seed: 42
  model: lenet5
  join_ratio: 0.1
  global_epoch: 100
  local_epoch: 5
  finetune_epoch: 5
  batch_size: 32
  test_interval: 100
  straggler_ratio: 0
  straggler_min_local_epoch: 0
  external_model_params_file: ""
  buffers: local # [local, global, drop]
  optimizer:
    name: sgd # [sgd, adam, adamw, rmsprop, adagrad]
    lr: 0.01
    dampening: 0 # SGD
    weight_decay: 0
    momentum: 0 # [SGD, RMSprop]
    alpha: 0.99 # RMSprop
    nesterov: false # SGD
    betas: [0.9, 0.999] # [Adam, AdamW]
    amsgrad: false # [Adam, AdamW]
  lr_scheduler:
    name: step # null for deactivating
    step_size: 10
  eval_test: true
  eval_val: false
  eval_train: false
  verbose_gap: 10
  visible: false
  use_cuda: true
  save_log: true
  save_model: false
  save_fig: true
  save_metrics: true
  check_convergence: true
# You can also set specific arguments for FL methods here.
# FL-bench accesses FL method arguments via `args.<method>.<arg>`,
# e.g.
fedprox:
  mu: 0.01
pfedsim:
  warmup_round: 0.7
# ...
# NOTE: For those unmentioned arguments, the default values are set in `get_<method>_args()` in `src/server/<method>.py`.
Commands
python generate_data.py -d mnist -a 0.1 -cn 100
python main.py fedavg cfg.yml
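For reference, `-a 0.1` here is presumably the Dirichlet concentration parameter controlling how non-IID the 100 client splits are (smaller alpha means more skewed per-client label distributions). A hedged sketch of Dirichlet label partitioning under that assumption; this is not generate_data.py itself:

# dirichlet_split.py — toy Dirichlet label partition, not generate_data.py itself
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=42):
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # One share of this class per client, drawn from Dirichlet(alpha, ..., alpha)
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices

labels = np.random.randint(0, 10, size=60_000)           # MNIST-sized toy label array
parts = dirichlet_partition(labels, num_clients=100, alpha=0.1)
print(len(parts), sum(len(p) for p in parts))            # 100 clients, 60000 samples total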
Thanks for your response. Could I ask what config I can use for resnet18 and cifar10 to get the best accuracy?
There are tons of variables that can affect the final accuracy. Sorry, I can't tell you the optimal config.
Is there a config that you used that gave reasonable results? Thanks.
Just try it yourself.