Comments (10)

KarhouTam commented on July 17, 2024

Format: (before local fine-tuning) -> (after local fine-tuning) So if finetune_epoch = 0, x.xx% -> 0.00% is normal.

`finetune_epoch` is set to 0 in template.yml:

finetune_epoch: 0
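
To get non-zero numbers after the arrow, copy the template into your own config and raise the value. A minimal sketch, showing only the relevant key:

common:
  finetune_epoch: 5 # any positive value runs local fine-tuning before the final evaluation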


KarhouTam commented on July 17, 2024

This issue is closed due to a long period of no response.


Elhamnazari1372 commented on July 17, 2024

I changed it as you recommended but got the same results. It seems the fine-tuning is still not running.

[screenshot: 20240531_181540]


KarhouTam commented on July 17, 2024

Sorry for my late response. What's your run command? If you set finetune_epoch, you need to specify the config file in the command, like python main.py fedavg your_config.yml
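
For example, a minimal sketch of the full sequence (the paths and filename here are assumptions; adjust them to your checkout):

cp template.yml your_config.yml # copy the template, then raise finetune_epoch in the copy
python main.py fedavg your_config.yml # pass the edited config explicitly; otherwise the template defaults apply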


Elhamnazari1372 commented on July 17, 2024

I use the same command as you mentioned. My config is:

# Full explanation is listed on README.md
mode: parallel # [serial, parallel]

parallel: # It's fine to keep these configs.
  # Go check doc of `https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html` for more details.
  ray_cluster_addr: null # [null, auto, local]

  # `null` implies that all cpus/gpus are included.
  num_cpus: null
  num_gpus: null

  # should be set larger than 1, or training mode falls back to `serial`
  # Setting a larger `num_workers` can further boost efficiency, but also leaves each worker fewer computational resources.
  num_workers: 2

common:
  dataset: mnist
  seed: 42
  model: lenet5
  join_ratio: 0.1
  global_epoch: 100
  local_epoch: 5
  finetune_epoch: 20
  batch_size: 32
  test_interval: 100
  straggler_ratio: 0
  straggler_min_local_epoch: 0
  external_model_params_file: ""
  optimizer:
    name: sgd # [sgd, adam, adamw, rmsprop, adagrad]
    lr: 0.01
    dampening: 0 # SGD
    weight_decay: 0
    momentum: 0 # [SGD, RMSprop]
    alpha: 0.99 # RMSprop
    nesterov: false # SGD
    betas: [0.9, 0.999] # [Adam, AdamW]
    amsgrad: false # [Adam, AdamW]

  lr_scheduler:
    name: step # null for deactivating
    step_size: 10

  eval_test: true
  eval_val: false
  eval_train: false

  verbose_gap: 10
  visible: false
  use_cuda: true
  save_log: true
  save_model: false
  save_fig: true
  save_metrics: true
  check_convergence: true

# You can also set specific arguments for FL methods
# FL-bench passes FL method arguments via args.<method>.<arg>
# e.g.
fedprox:
  mu: 0.01
pfedsim:
  warmup_round: 0.7
# ...

# NOTE: For those unmentioned arguments, the default values are set in `get_<method>_args()` in `src/server/<method>.py`


KarhouTam commented on July 17, 2024

I tested it in my workspace and everything is fine.

Here are the result, config, and commands to reproduce it:

Result

==================== FedAvg Experiment Results: ====================
Format: (before local fine-tuning) -> (after local fine-tuning) So if finetune_epoch = 0, x.xx% -> 0.00% is normal.
{100: {'all_clients': {'test': {'loss': '0.3364 -> 0.3116', 'accuracy': '91.44% -> 92.18%'}}}}
========== FedAvg Convergence on train clients ==========
test (before local training):
10.0%(11.65%) at epoch: 0
20.0%(27.31%) at epoch: 3
30.0%(35.33%) at epoch: 4
40.0%(47.46%) at epoch: 5
60.0%(63.21%) at epoch: 7
70.0%(75.43%) at epoch: 9
80.0%(86.50%) at epoch: 18
90.0%(90.34%) at epoch: 37
test (after local training):
80.0%(82.13%) at epoch: 0
90.0%(91.06%) at epoch: 1
==================== FedAvg Max Accuracy ====================
all_clients:
(test) before fine-tuning: 91.44% at epoch 100
(test) after fine-tuning: 92.18% at epoch 100

Config

# cfg.yml
mode: parallel # [serial, parallel]

parallel: # It's fine to keep these configs.
  # Go check doc of `https://docs.ray.io/en/latest/ray-core/api/doc/ray.init.html` for more details.
  ray_cluster_addr: null # [null, auto, local]
  
  # `null` implies that all cpus/gpus are included.
  num_cpus: null
  num_gpus: null

  # should be set larger than 1, or training mode falls back to `serial`
  # Setting a larger `num_workers` can further boost efficiency, but also leaves each worker fewer computational resources.
  num_workers: 2

common:
  dataset: mnist
  seed: 42
  model: lenet5
  join_ratio: 0.1
  global_epoch: 100
  local_epoch: 5
  finetune_epoch: 5
  batch_size: 32
  test_interval: 100
  straggler_ratio: 0
  straggler_min_local_epoch: 0
  external_model_params_file: ""
  buffers: local # [local, global, drop]
  optimizer:
    name: sgd # [sgd, adam, adamw, rmsprop, adagrad]
    lr: 0.01
    dampening: 0 # SGD
    weight_decay: 0
    momentum: 0 # [SGD, RMSprop]
    alpha: 0.99 # RMSprop
    nesterov: false # SGD
    betas: [0.9, 0.999] # [Adam, AdamW]
    amsgrad: false # [Adam, AdamW]

  lr_scheduler:
    name: step # null for deactivating
    step_size: 10

  eval_test: true
  eval_val: false
  eval_train: false

  verbose_gap: 10
  visible: false
  use_cuda: true
  save_log: true
  save_model: false
  save_fig: true
  save_metrics: true
  check_convergence: true

# You can also set specific arguments for FL methods
# FL-bench passes FL method arguments via args.<method>.<arg>
# e.g.
fedprox:
  mu: 0.01
pfedsim:
  warmup_round: 0.7
# ...

# NOTE: For those unmentioned arguments, the default values are set in `get_<method>_args()` in `src/server/<method>.py`

Commands

python generate_data.py -d mnist -a 0.1 -cn 100
python main.py fedavg cfg.yml
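
(For reference: -d selects the dataset, -a should be the Dirichlet alpha controlling how non-IID the client split is, with smaller values giving more skewed label distributions, and -cn the number of clients. This reading of the flags is my assumption; double-check with python generate_data.py --help.)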


Elhamnazari1372 commented on July 17, 2024

Thanks for your response. Could I ask what config I can use for resnet18 and cifar10 to get the best accuracy?


KarhouTam commented on July 17, 2024

There are tons of variables that can affect the final accuracy. Sorry, I can't tell you the optimal config.
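
That said, purely as an illustrative starting point (not tuned, and res18 is an assumed identifier for ResNet-18, so verify it against the models registered in the repo), you could take the cfg.yml above and change only:

common:
  dataset: cifar10
  model: res18 # assumed model identifier; check the repo's model list
  global_epoch: 200 # CIFAR-10 typically needs more communication rounds than MNIST

and regenerate the data with python generate_data.py -d cifar10 -a 0.1 -cn 100.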


Elhamnazari1372 commented on July 17, 2024

Is there a config that you used that gave reasonable results?
Thanks.


KarhouTam commented on July 17, 2024

Just try it yourself.

