
jayzhang42 / federatedgpt-shepherd

191 stars · 6 watchers · 30 forks · 7.3 MB

Shepherd: A foundational framework enabling federated instruction tuning for large language models

Home Page: https://arxiv.org/pdf/2305.05644.pdf

License: Apache License 2.0

Language: Python 100.00%
Topics: federated-learning, large-language-model

federatedgpt-shepherd's People

Contributors

jayzhang42


federatedgpt-shepherd's Issues

Training Time

Hi,
I was wondering what setting the 2-hour training time you report in the paper refers to:

[screenshot of the reported training time from the paper]

Is this the time to fine-tune lora-shepherd-7b for a single client only? If not, which setting does it correspond to?

With the provided implementation (5 clients per round, i.e. 0.05 participation over 100 clients in total, on Databricks-dolly-15k) and 20 communication rounds on a single NVIDIA A100 GPU, training takes around 13.5 hours on my side.
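A quick breakdown of the numbers reported in this issue (taken from the issue itself, not from the paper) may help frame the question: the stated participation rate implies 100 local fine-tuning runs in total, so the observed 13.5 hours works out to roughly 8 minutes per local run.

```python
# Back-of-envelope check using only the figures quoted in this issue.
clients_per_round = int(0.05 * 100)        # 5% participation over 100 clients
rounds = 20
local_runs = clients_per_round * rounds    # total local fine-tuning runs
minutes_per_run = 13.5 * 60 / local_runs   # observed wall-clock per local run
print(local_runs, round(minutes_per_run, 1))  # 100 8.1
```

If the paper's 2-hour figure referred to a single client's full fine-tuning, the federated totals would not match, which is presumably why the original poster is asking which setting the 2 hours covers.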

Please, can you release the GPT-4 auto-evaluation?

Hello,
I would like to run the GPT-4 assessment to evaluate your model and reproduce your results.
Did you use 1) the evaluation with easy questions [LINK], or 2) the one that addresses the GPT-4 limitations [LINK]?
Could you please describe the full assessment process you followed so that I can replicate it?
Thank you very much :)

Why is the training time so long?

I use the command below with two NVIDIA TITAN RTX GPUs, and it takes 20+ hours to train the model.

python main.py --global_model 'chavinlo/alpaca-native' \
    --data_path "./data" \
    --output_dir './lora-shepherd-7b/' \
    --num_communication_rounds 10 \
    --num_clients 10 \
    --train_on_inputs \
    --group_by_length
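One thing worth ruling out (an assumption on my part, not something confirmed by the repo): if the 7B base model is naively split across the two TITAN RTX cards, cross-GPU transfers can dominate the wall-clock time. Pinning the run to a single GPU via the standard `CUDA_VISIBLE_DEVICES` environment variable, assuming the LoRA setup fits in 24 GB, is a quick experiment:

```shell
# Hypothetical experiment: restrict the run to one GPU to rule out
# cross-GPU transfer overhead. CUDA_VISIBLE_DEVICES is a standard CUDA
# environment variable; all other flags are copied from the command above.
CUDA_VISIBLE_DEVICES=0 python main.py --global_model 'chavinlo/alpaca-native' \
    --data_path "./data" \
    --output_dir './lora-shepherd-7b/' \
    --num_communication_rounds 10 \
    --num_clients 10 \
    --train_on_inputs \
    --group_by_length
```

If a single-GPU run is faster per round, the slowdown is in device placement rather than in the federated loop itself.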

bad Q&A

Thank you for your excellent work. After training the llama-7b model with settings similar to yours, I found that the resulting 7B model couldn't even complete question-answering tasks. Have you experienced a similar situation?

[screenshot of the model's degraded question-answering output]

The uploaded model appears to be an untrained version

In initiate_local_training, self.params_dict_new is recorded, but these parameters have already been detach()-ed, so only a snapshot of the parameter state at that moment is captured. As a result, new_adapter_weight, which terminate_local_training saves to the output path for aggregation, contains untrained parameters.
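Whether a detach()-ed reference actually goes stale depends on how training updates the parameters: a detached tensor shares storage with the parameter, so in-place updates remain visible through it, whereas rebinding the parameter to a new tensor leaves the reference pointing at old storage. A minimal plain-Python sketch of that distinction (lists stand in for tensor storage; no torch required):

```python
# Plain-Python analogue of the aliasing question raised in this issue.
# "detach" here returns a shared reference to the underlying storage,
# mimicking how torch.Tensor.detach() shares memory with the parameter.

class Param:
    def __init__(self, data):
        self.data = data  # list standing in for tensor storage

def detach(p):
    return p.data  # shared reference, like tensor.detach()

p = Param([1.0])
snapshot = detach(p)

p.data[0] += 0.5          # in-place update: visible through the alias
assert snapshot[0] == 1.5

p.data = [99.0]           # rebinding: alias still sees the old storage
assert snapshot[0] == 1.5
```

So the reported bug would manifest only if the training path replaces the adapter tensors rather than updating them in place.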

def initiate_local_training(self):
    self.model.config.use_cache = False
    # Deep copy of the adapter ("default") parameters, restored after training.
    self.params_dict_old = copy.deepcopy(
        OrderedDict((name, param.detach()) for name, param in self.model.named_parameters()
                    if "default" in name))
    # Detached references to the adapter parameters, taken before training starts.
    self.params_dict_new = OrderedDict((name, param.detach()) for name, param in self.model.named_parameters()
                                       if "default" in name)
    # Patch state_dict so it returns only the adapter weights tracked above.
    self.model.state_dict = (
        lambda instance, *_, **__: get_peft_model_state_dict(
            instance, self.params_dict_new, "default"
        )
    ).__get__(self.model, type(self.model))

def terminate_local_training(self, epoch, local_dataset_len_dict, previously_selected_clients_set):
    local_dataset_len_dict[self.client_id] = len(self.local_train_dataset)
    # Saved for server-side aggregation; per this issue, it may hold the
    # pre-training snapshot rather than the trained adapter weights.
    new_adapter_weight = self.model.state_dict()
    single_output_dir = os.path.join(self.output_dir, str(epoch), "local_output_{}".format(self.client_id))
    os.makedirs(single_output_dir, exist_ok=True)
    torch.save(new_adapter_weight, single_output_dir + "/pytorch_model.bin")

    # Restore the pre-training adapter weights before handing the model back.
    older_adapter_weight = get_peft_model_state_dict(self.model, self.params_dict_old, "default")
    set_peft_model_state_dict(self.model, older_adapter_weight, "default")
    previously_selected_clients_set = previously_selected_clients_set | set({self.client_id})
    last_client_id = self.client_id

    return self.model, local_dataset_len_dict, previously_selected_clients_set, last_client_id
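For context on why the saved adapters matter: the per-client files and local_dataset_len_dict feed a server-side aggregation step. A hedged sketch of a FedAvg-style aggregation weighted by local dataset size (plain Python floats stand in for the saved tensors; `fedavg` is an illustrative name, not a function from the repo):

```python
# Sketch of FedAvg-style aggregation over the per-client adapter state dicts,
# weighted by local dataset size as suggested by local_dataset_len_dict above.
# Assumption: this mirrors the intent of the repo's aggregation, not its code.

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client state dicts (floats stand in for tensors)."""
    total = sum(client_sizes.values())
    keys = next(iter(client_weights.values())).keys()
    return {
        k: sum(client_weights[c][k] * client_sizes[c] / total for c in client_weights)
        for k in keys
    }

w = {0: {"lora_A": 1.0}, 1: {"lora_A": 3.0}}
n = {0: 100, 1: 300}
assert fedavg(w, n) == {"lora_A": 2.5}
```

If the saved adapters really are pre-training snapshots, this average would reproduce the untrained weights, which would explain the "untrained version" symptom in the issue title.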
