Comments (10)
Please provide a YAML file to help us reproduce the issue, thanks.
Also, please try to set cfg.eval.count_flops
to False
. When enabling this, Torch's garbage collection mechanism may not be timely resulting in OOM. If you wan to get an exact GPU memory usage, please use torch.cuda.empty_cache()
.
from federatedscope.
Please provide a YAML file to help us reproduce the issue, thanks. Also, please try to set
cfg.eval.count_flops
toFalse
. When enabling this, Torch's garbage collection mechanism may not be timely resulting in OOM.
Thank you for your response, this is my YAML configuration:
use_gpu: True
device: 0
early_stop:
patience: 10
federate:
mode: standalone
client_num: 100
sample_client_num: 5
total_round_num: 10
save_to: "/root/autodl-tmp/model/finetuned_codellama13b/codellama13b.ckpt"
share_local_model: True
online_aggr: False
data:
root: /root/autodl-tmp/data/TutorCode
type: 'TutorCode.json@llm'
splits: [0.98,0.01,0.01]
splitter: 'iid'
dataloader:
batch_size: 1
llm:
tok_len: 2048
chat:
max_len: 2048
adapter:
use: True
args: [ { 'adapter_package': 'peft', 'adapter_method': 'qlora', 'r': 32, 'lora_alpha': 32, 'lora_dropout': 0.05, 'load_in_4bit': True, 'bnb_4bit_quant_type': 'nf4', 'bnb_4bit_compute_dtype': True, 'bnb_4bit_use_double_quant': True, 'module_int4': True, } ] # Quantized LoRA hyperparameter
mv_to_cpu: True
model:
type: 'codellama/CodeLlama-13b-Instruct-hf@huggingface_llm'
train:
local_update_steps: 30
batch_or_epoch: batch
is_enable_half: False
optimizer:
type: 'AdamW'
lr: 0.0001
weight_decay: 0.00
criterion:
type: CrossEntropyLoss
trainer:
type: llmtrainer
eval:
freq: 20
metrics: ['loss']
count_flops: False
This is the federated/llm/model/model_builder.py and adapter_builder.py that I modified to enable quantized LoRA:
from federatedscope.llm.model.adapter_builder import AdapterModel
def get_model_from_huggingface(model_name, config):
"""
Load a causal language model from HuggingFace transformers library.
Args:
model_name (str): The name of the pre-trained model to load.
config (Config): The configuration object that contains the model
parameters.
Returns:
AutoModelForCausalLM: A causal language model object.
"""
from transformers import AutoModelForCausalLM
kwargs = {}
if len(config.llm.cache.model):
kwargs['cache_dir'] = config.llm.cache.model
args = config.llm.adapter.args[0]
if args.get('adapter_method', 'lora') in ['qlora', 'qlora-pissa']:
from transformers import BitsAndBytesConfig
import torch
bnb_config = BitsAndBytesConfig(
load_in_4bit=args.pop('load_in_4bit', True),
bnb_4bit_quant_type=args.pop('bnb_4bit_quant_type', 'nf4'),
bnb_4bit_compute_dtype=torch.bfloat16 if args.pop('bnb_4bit_compute_dtype', True) else torch.float32,
bnb_4bit_use_double_quant=args.pop('bnb_4bit_use_double_quant', True),
)
kwargs['quantization_config'] = bnb_config
if args.get('bnb_4bit_compute_dtype', True):
kwargs['torch_dtype'] = torch.bfloat16
kwargs['device_map'] = f'cuda:{config.device}'
return AutoModelForCausalLM.from_pretrained(model_name, **kwargs)
def get_model_from_modelscope(model_name, config):
"""
Load a causal language model from ModelScope models library.
Args:
model_name (str): The name of the pre-trained model to load.
config (Config): The configuration object that contains the model
parameters.
Returns:
Model: A causal language model object.
"""
from modelscope import AutoModelForCausalLM
kwargs = {}
if len(config.llm.cache.model):
kwargs['cache_dir'] = config.llm.cache.model
return AutoModelForCausalLM.from_pretrained(model_name, **kwargs)
def get_llm(config):
"""
Get a causal language model based on the configuration.
Args:
config (Config): The configuration object that contains the model
parameters.
Returns:
AdapterModel: A causal language model object with optional adapter
layers.
"""
from federatedscope.llm.dataloader import get_tokenizer
model_config = config.model
model_name, model_hub = model_config.type.split('@')
if model_hub == 'huggingface_llm':
model = get_model_from_huggingface(model_name=model_name,
config=config)
elif model_hub == 'modelscope_llm':
model = get_model_from_modelscope(model_name=model_name, config=config)
else:
raise NotImplementedError(f'Not support LLM {model_name} in'
f' {model_hub}.')
# Resize LLM model based on settings
tokenizer, num_new_tokens = \
get_tokenizer(model_name, config.data.root, config.llm.tok_len,
model_hub)
model.resize_token_embeddings(len(tokenizer))
if num_new_tokens > 0:
input_embeddings = model.get_input_embeddings().weight.data
output_embeddings = model.get_output_embeddings().weight.data
input_embeddings_avg = input_embeddings[:-num_new_tokens].mean(
dim=0, keepdim=True)
output_embeddings_avg = output_embeddings[:-num_new_tokens].mean(
dim=0, keepdim=True)
input_embeddings[-num_new_tokens:] = input_embeddings_avg
output_embeddings[-num_new_tokens:] = output_embeddings_avg
args = config.llm.adapter.args[0] if len(
config.llm.adapter.args[0]) > 0 else {}
model = AdapterModel(model, use_adapter=config.llm.adapter.use, **args)
return model
import torch
import torch.nn as nn
from collections import OrderedDict
from federatedscope.llm.MyUtils.qlora_util import find_all_linear_names
def enable_adapter(model, package, adapter, **kwargs):
"""
Enables an adapter for a given model and package.
Args:
model: A pre-trained model from HuggingFace Transformers library.
package: A string indicating the name of the package that provides
the adapter. Currently, only 'peft' and 'adapterhub' is supported.
adapter: A string indicating the name of the adapter to enable. The
available adapters depend on the package.
**kwargs: Additional keyword arguments that are passed to the
adapter configuration.
Returns:
A model object that has the adapter enabled.
Raises:
NotImplementedError: If the package or the adapter is not supported.
"""
adapter = adapter.lower()
if package == 'peft':
"""
PEFT: https://github.com/huggingface/peft
Support methods:
LoRA
Prefix Tuning
P-Tuning
Prompt Tuning
AdaLoRA
"""
from peft import get_peft_model, TaskType
if adapter == 'lora':
from peft import LoraConfig
peft_config = LoraConfig(task_type=TaskType.CAUSAL_LM, **kwargs)
model = get_peft_model(model, peft_config)
elif adapter == 'prefix':
from peft import PrefixTuningConfig
peft_config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM,
**kwargs)
model = get_peft_model(model, peft_config)
elif adapter == 'prompt':
from peft import PromptTuningConfig
peft_config = PromptTuningConfig(task_type=TaskType.CAUSAL_LM,
**kwargs)
model = get_peft_model(model, peft_config)
elif adapter == 'p-tuning':
from peft import PromptEncoderConfig
peft_config = PromptEncoderConfig(task_type=TaskType.CAUSAL_LM,
**kwargs)
model = get_peft_model(model, peft_config)
elif adapter == 'qlora':
from peft import prepare_model_for_kbit_training, LoraConfig
model = prepare_model_for_kbit_training(model)
peft_config = LoraConfig(
r=kwargs.pop('r', 32),
lora_alpha=kwargs.pop('lora_alpha', 16),
target_modules=find_all_linear_names(model, int4=kwargs.pop('module_int4', True)),
lora_dropout=kwargs.pop('lora_dropout', 0.05),
bias="none",
task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, peft_config)
elif adapter == 'qlora-pissa':
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training
model = prepare_model_for_kbit_training(model)
model = PeftModel.from_pretrained(model, kwargs.pop('adapter_path', ''), subfolder=kwargs.pop('subfolder', 'pissa_init'), is_trainable=True)
for name, params in model.named_parameters():
if "embed_tokens" in name or "lm_head" in name:
params.requires_grad=False
else:
raise NotImplementedError
model.print_trainable_parameters()
elif package == 'adapterhub':
"""
AdapterHub: https://docs.adapterhub.ml/model_overview.html
Support methods:
Bottleneck Adapters
Prefix Tuning
LoRA
Compacter
Adapter Fusion
Invertible Adapters
Parallel block
"""
# TODO: After supporting adapterhub, we will move the following
# parameters in yaml file for users' convenient
if adapter == 'lora':
from transformers.adapters import LoRAConfig
config = LoRAConfig(r=8, alpha=16)
model.add_adapter("lora_adapter", config=config)
model.train_adapter(['lora_adapter'])
elif adapter == 'bottleneck':
from transformers.adapters import AdapterConfig
config = AdapterConfig(mh_adapter=True,
output_adapter=True,
reduction_factor=16,
non_linearity="relu")
model.add_adapter("bottleneck_adapter", config=config)
model.train_adapter(['bottleneck_adapter'])
elif adapter == 'lang':
from transformers.adapters import PfeifferInvConfig
config = PfeifferInvConfig()
model.add_adapter("lang_adapter", config=config)
model.train_adapter(['lang_adapter'])
elif adapter == 'prefix':
from transformers.adapters import PrefixTuningConfig
config = PrefixTuningConfig(flat=False, prefix_length=30)
model.add_adapter("prefix_tuning", config=config)
model.train_adapter(['prefix_tuning'])
elif adapter == 'compacter':
from transformers.adapters import CompacterConfig
config = CompacterConfig()
model.add_adapter("dummy", config=config)
model.train_adapter(['dummy'])
elif adapter == 'ia_3':
from transformers.adapters import IA3Config
config = IA3Config()
model.add_adapter("ia3_adapter", config=config)
model.train_adapter(['ia3_adapter'])
elif adapter == 'union':
from transformers.adapters import AdapterConfig, ConfigUnion
# TODO: configure these args in cfg
config = ConfigUnion(
AdapterConfig(mh_adapter=True,
output_adapter=False,
reduction_factor=16,
non_linearity="relu"),
AdapterConfig(mh_adapter=False,
output_adapter=True,
reduction_factor=2,
non_linearity="relu"),
)
model.add_adapter("union_adapter", config=config)
model.train_adapter(['union_adapter'])
elif adapter == 'mam':
from transformers.adapters import \
ConfigUnion, ParallelConfig, PrefixTuningConfig
config = ConfigUnion(
PrefixTuningConfig(bottleneck_size=800),
ParallelConfig(),
)
model.add_adapter("mam_adapter", config=config)
model.train_adapter(['mam_adapter'])
else:
raise NameError(
f"There is no adapter named {adapter} in {package}")
else:
raise NotImplementedError
return model
class AdapterModel(nn.Module):
"""
A wrapper class for a model that can use adapters for fine-tuning.
This class inherits from torch.nn.Module and implements a wrapper for a
model that can optionally use adapters for fine-tuning. Adapters are small
modules that can be inserted between the layers of a pretrained model and
trained on a specific task, while keeping the original parameters frozen.
This class can use different adapter packages and methods, such as PEFT
and LoRA. It also provides methods for saving and loading the model state
dict, as well as generating text using the model.
Attributes:
model: A torch.nn.Module object that represents the original or
adapted model.
"""
def __init__(self, model, use_adapter=False, *args, **kwargs):
"""
Initializes the wrapper with the given model and arguments.
Args:
model: A torch.nn.Module object that represents the original model.
use_adapter: A boolean indicating whether to use adapters for
fine-tuning. Default is False.
*args: Additional positional arguments to pass to the adapter
package or method.
**kwargs: Additional keyword arguments to pass to the adapter
package or method. These may include adapter_package,
adapter_method, etc.
"""
super().__init__()
self.model = None
if use_adapter:
adapter_package = kwargs.pop('adapter_package', 'peft')
adapter_method = kwargs.pop('adapter_method', 'lora')
self.model = enable_adapter(model, adapter_package, adapter_method,
**kwargs)
else:
self.model = model
def forward(self, *args, **kwargs):
"""
Calls the forward method of the wrapped model.
Args:
*args: Positional arguments to pass to the model's forward method.
**kwargs: Keyword arguments to pass to the model's forward method.
Returns:
The output of the model's forward method.
"""
return self.model.forward(*args, **kwargs)
def generate(self, *args, **kwargs):
"""
Calls the generate method of the wrapped model.
Args:
*args: Positional arguments to pass to the model's generate method.
**kwargs: Keyword arguments to pass to the model's generate method.
Returns:
The output of the model's generate method.
"""
try:
res = self.model.generate(*args, **kwargs)
except RuntimeError as e:
# When does evaluation in HELM,
# half precision will cause RuntimeError,
# the following solves it
if 'do_sample' in kwargs.keys():
del kwargs['do_sample']
res = self.model.generate(*args, **kwargs)
else:
raise RuntimeError(e)
return res
def state_dict(self, return_trainable=True, *args, **kwargs):
"""
Returns the state dict of the wrapped model.
Args:
return_trainable: A boolean indicating whether to return only the
trainable parameters of the model. Default is True.
*args: Additional positional arguments to pass to the model's
state_dict method.
**kwargs: Additional keyword arguments to pass to the model's
state_dict method.
Returns:
A dictionary containing the state dict of the model. If
return_trainable is True, only the parameters that require grad are
included. Otherwise, all parameters are included.
"""
if return_trainable:
return self.get_trainable_state_dict()
else:
return self.model.state_dict(*args, **kwargs)
def load_state_dict(self, state_dict, strict=False):
"""
Loads the state dict into the wrapped model.
Args:
state_dict: A dictionary containing the state dict to load into
the model.
strict: A boolean indicating whether to strictly enforce that the
keys in state_dict match the keys returned by this module’s
state_dict() function. Default is False.
"""
return self.model.load_state_dict(state_dict, strict=False)
def get_trainable_state_dict(self):
"""
Returns only the trainable parameters of the wrapped model.
This method can be used to get only the parameters that require grad,
such as adapters or task-specific layers.
Returns:
A dictionary containing the state dict of the trainable parameters
of the model.
"""
grad_params = []
for name, param in self.model.named_parameters():
if param.requires_grad:
grad_params.append(name)
model_state_dict = self.model.state_dict()
new_state_dict = OrderedDict()
for k, v in model_state_dict.items():
if k in grad_params:
new_state_dict[k] = v
return new_state_dict
def save_model(self, path, state=0):
"""
Saves the model state dict and the current round to a file.
Args:
path: A string representing the file path to save the model to.
state: An integer representing the current round of training or
evaluation. Default is 0.
"""
ckpt = {'cur_round': state, 'model': self.model.state_dict()}
torch.save(ckpt, path)
# TODO: Fix `__getattr__`
# def __getattr__(self, item):
# return getattr(self.model, item)
def save_pretrained(self, path):
"""
Saves the pretrained model to a directory.
Args:
path: A string representing the directory path to save the model to.
"""
self.model.save_pretrained(path)
By the way, the "save_to" attribute seems not able to really save the model when using adapter, the save_model function in federated/core/aggregators/aggregator.py only saves the weights:
def save_model(self, path, cur_round=-1):
assert self.model is not None
ckpt = {'cur_round': cur_round, 'model': self.model.state_dict()}
torch.save(ckpt, path)
probably it is better to use self.model.save_pretrained in this case since it saves the finetuned adapter weights and the configuration files.
Thank you very much!
from federatedscope.
sorry, just accidentally clicked the close issue button...
from federatedscope.
Related Issues (20)
- Offsite tuning code with multigpu setting throws error HOT 3
- Smaller test/val loss but lower evaluation accuracy HOT 3
- How to use multi GPU to finetune Llama2 HOT 2
- Unable to run demo in hyperparameter optimization HOT 2
- Server global evaluation total number HOT 1
- Error with 4 bit quantized LLM HOT 3
- TypeError: call_file_data() missing 1 required positional argument: 'client_cfgs' HOT 6
- Some questions about Backdoor Bench HOT 2
- 训练得到的total_flops是负数 HOT 2
- LDA splitter:ValueError: too many values to unpack (expected 2) HOT 2
- how can i use my saving model HOT 1
- how can i use my saving model HOT 3
- How can i get the output result? HOT 1
- Soft prompt Tuning
- one bug in federatedscope/gfl/fedsageplus/trainer.py
- 关于图学习中链接任务的样例使用错误 HOT 1
- Issue with federate.method set to global HOT 1
- 运行代码FederatedScope/tree/FSreal,得不到想要的结果,请问可能是什么原因?
- too many values to unpack (expected 2) in model_builder.py HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from federatedscope.