imp,milvlg

Quantized model latency issue.

Hi, I am trying to load a quantized version of this model.

When I load it in 4-bit, the model size is smaller, but the latency increases significantly.

I am not sure whether any changes are needed to support quantization.

Please let me know.

I can also help by creating an MR to improve support for the quantized model.

Thanks
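
To make the latency report above easier to compare between the 4-bit and full-precision models, a small timing helper could be used. This is only a sketch: `measure_latency` is a hypothetical helper, not part of the Imp repository, and the callable passed to it (for example a wrapper around the model's generation call) is left to the reader.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in seconds.

    Note: for GPU workloads the callable itself should wait for the device
    to finish (e.g. by calling torch.cuda.synchronize() at its end),
    otherwise only the kernel launch time is measured.
    """
    # Warm-up calls are executed but not timed (caches, lazy initialization).
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a call into the model under test.
    stats = measure_latency(lambda: sum(range(100_000)), warmup=1, runs=5)
    print({name: f"{value * 1000:.3f} ms" for name, value in stats.items()})
```

Running the same callable once against the 4-bit model and once against the unquantized model, with identical inputs and generation settings, gives numbers that can be posted alongside the report.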

The training environment

Hi,

I am trying to reproduce the training of Imp, but there is a problem with the training environment, as shown below.

My transformers version is 4.31.0, as required.

finetune_lora_custom.sh

I only changed

IMP_MODEL='./checkpoints/imp-v1-3b'

--data_path 
--image_folder

but I get this output in my terminal:

You are using a model of type imp to instantiate a model of type llava. This is not supported for all configurations of models and can yield errors.
You are using a model of type imp to instantiate a model of type llava. This is not supported for all configurations of models and can yield errors.
You are using a model of type imp to instantiate a model of type llava. This is not supported for all configurations of models and can yield errors.
You are using a model of type imp to instantiate a model of type llava. This is not supported for all configurations of models and can yield errors.
Downloading config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 576/576 [00:00<00:00, 1.89MB/s]
[2024-02-22 16:33:49,885] [WARNING] [partition_parameters.py:836:_post_init_method] param `probe` in SiglipMultiheadAttentionPoolingHead not on GPU so was not broadcasted from rank 0
[2024-02-22 16:33:53,686] [INFO] [partition_parameters.py:453:__exit__] finished initializing model with 7.77B parameters
Traceback (most recent call last):
  File "/data1/*** /imp/imp_llava/train/train_mem.py", line 15, in <module>
    train()
  File "/data1/***/imp/./imp_llava/train/train.py", line 827, in train
    model = LlavaLlamaForCausalLM.from_pretrained(
  File "/data1/***/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/data1/***/site-packages/transformers/modeling_utils.py", line 3125, in _load_pretrained_model
    model.apply(model._initialize_weights)
  File "/data1/***/site-packages/torch/nn/modules/module.py", line 884, in apply
    module.apply(fn)
  File "/data1/***/site-packages/torch/nn/modules/module.py", line 884, in apply
    module.apply(fn)
  File "/data1/***/site-packages/torch/nn/modules/module.py", line 885, in apply
    fn(self)
  File "/data1/***/site-packages/transformers/modeling_utils.py", line 1261, in _initialize_weights
    self._init_weights(module)
  File "/data1/***/site-packages/transformers/models/llama/modeling_llama.py", line 472, in _init_weights
    module.weight.data[module.padding_idx].zero_()
IndexError: index 50256 is out of bounds for dimension 0 with size 0
[2024-02-22 16:33:55,511] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 2275311
[2024-02-22 16:33:55,524] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 2275312
[2024-02-22 16:33:55,535] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 2275313
[2024-02-22 16:33:55,545] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 2275314
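
One thing that can be ruled out quickly from the log above is a mismatch in the checkpoint's `config.json`: the warning mentions model types `imp` and `llava`, and the error mentions padding index 50256. The sketch below checks those fields. It assumes the config uses the usual Hugging Face key names `model_type`, `pad_token_id`, and `vocab_size`; `check_config` and `check_config_file` are hypothetical helpers, not part of the Imp code. It does not explain the size-0 embedding dimension in the traceback, which may have a different cause.

```python
import json
from typing import Any, Dict, List


def check_config(config: Dict[str, Any], expected_model_type: str) -> List[str]:
    """Return a list of human-readable problems found in a model config dict."""
    problems = []

    # Compare the declared model type with the class being instantiated.
    model_type = config.get("model_type")
    if model_type != expected_model_type:
        problems.append(
            f"model_type is {model_type!r}, expected {expected_model_type!r}"
        )

    # The padding index must address a valid row of the embedding table.
    pad_token_id = config.get("pad_token_id")
    vocab_size = config.get("vocab_size")
    if pad_token_id is not None and vocab_size is not None:
        if not 0 <= pad_token_id < vocab_size:
            problems.append(
                f"pad_token_id {pad_token_id} is outside vocab_size {vocab_size}"
            )

    return problems


def check_config_file(path: str, expected_model_type: str) -> List[str]:
    """Load a config.json file and run check_config on its contents."""
    with open(path, encoding="utf-8") as f:
        return check_config(json.load(f), expected_model_type)
```

For example, `check_config_file("./checkpoints/imp-v1-3b/config.json", "imp")` would return an empty list if both checks pass; the expected model type passed here is an assumption and should match the class used in the training script.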

The demo link is wrong; change the comma to a period.

Model Evaluation

Hi,

I have trained Imp with LoRA. However, it does not process the reference when I run the evaluation scripts.
The following is the output when I evaluate on POPE.
