
fine-tuning-bert's Introduction

Hi there 👋

I am a data scientist. I solve real-world problems using data science and AI. I have a passion for learning and sharing my knowledge with others. You can read my blogs here and feel free to reach out to me to share your thoughts.

My skills 📊📈

  • Python
  • R Programming
  • Stable Diffusion
  • Generative AI
  • Natural Language Processing (NLP)
  • Streamlit
  • Anvil

If you found value in something I have created, please feel free to follow me on LinkedIn!

fine-tuning-bert's People

Contributors

prateekjoshi565


fine-tuning-bert's Issues

Unfreezing BERT parameters and training on BERT only

Hello, first of all, thanks for the useful repo.
I experimented with your code on my task, but the results were not great, so I wanted to play around with it. I ran into two issues:

1] I don't want to freeze the BERT parameters, so I commented out these two lines, as you suggested:

for param in bert.parameters():
    param.requires_grad = False

However, when I did that, all my sentences were classified as 0 (or sometimes all as 1). Any idea why that happened?

Alternatively, I set param.requires_grad = True instead of False, yet I saw the same behavior: a single label is assigned to all sentences; in some runs it's 0, in others 1.
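A likely culprit here is the learning rate: once all of BERT's weights are trainable, a learning rate tuned for training only the small classification head can destabilize the pretrained weights and collapse every prediction onto one class. Below is a minimal sketch of full fine-tuning with a smaller learning rate; BERT_Arch is the classifier wrapper defined in this repo's notebook, and the value 2e-5 is an assumption (a common starting point), not something the repo prescribes.

import torch
from torch.optim import AdamW
from transformers import AutoModel

bert = AutoModel.from_pretrained('bert-base-uncased')

# leave every parameter trainable: no requires_grad = False loop
model = BERT_Arch(bert)  # the classifier wrapper from this repo's notebook

# full fine-tuning usually needs a much smaller learning rate than
# training only the head; 2e-5 is a typical starting point (assumed)
optimizer = AdamW(model.parameters(), lr=2e-5)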

2] Another thing I tried is to classify using the original BERT directly, so I set model = bert instead of model = BERT_Arch(bert). I get the following error while training:

TypeError: nll_loss_nd(): argument 'input' (position 1) must be Tensor, not BaseModelOutputWithPoolingAndCrossAttentions

The stack trace:

 Epoch 1 / 4
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-789-c5138ddf6b25> in <module>()
     12 
     13     #train model
---> 14     train_loss, _ = train()
     15 
     16     #evaluate model

3 frames
<ipython-input-787-a8875e82e2a3> in train()
     28 
     29     # compute the loss between actual and predicted values
---> 30     loss = cross_entropy(preds, labels)
     31 
     32     # add on to the total loss

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
    209 
    210     def forward(self, input: Tensor, target: Tensor) -> Tensor:
--> 211         return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
    212 
    213 

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2530     if size_average is not None or reduce is not None:
   2531         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2532     return torch._C._nn.nll_loss_nd(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2533 
   2534 

TypeError: nll_loss_nd(): argument 'input' (position 1) must be Tensor, not BaseModelOutputWithPoolingAndCrossAttentions

I added return_dict=False when loading the model (bert = AutoModel.from_pretrained('bert-base-uncased', return_dict=False)), but the error just changed to TypeError: nll_loss_nd(): argument 'input' (position 1) must be Tensor, not tuple, with a stack trace similar to the one shown above.
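For context, the base AutoModel only produces hidden states, not class scores, so its output cannot be fed to a loss function directly; adding class scores is exactly what BERT_Arch does. A minimal sketch of one alternative, which is an assumption rather than what the repo does: use the sequence-classification variant, which ships with its own classification head.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased', num_labels=2)

batch = tokenizer(["an example sentence"], return_tensors='pt')
labels = torch.tensor([1])

# the class scores live in out.logits and the loss in out.loss
out = model(**batch, labels=labels)
out.loss.backward()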

'str' object has no attribute 'dim'

I got 'str' object has no attribute 'dim' simply by running the notebook. Any idea where the error is?

Epoch 1 / 10
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-c5138ddf6b25> in <module>
     12 
     13     #train model
---> 14     train_loss, _ = train()
     15 
     16     #evaluate model

<ipython-input-21-a8875e82e2a3> in train()
     25 
     26     # get model predictions for the current batch
---> 27     preds = model(sent_id, mask)
     28 
     29     # compute the loss between actual and predicted values

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

<ipython-input-16-9ebdcf410f97> in forward(self, sent_id, mask)
     28       _, cls_hs = self.bert(sent_id, attention_mask=mask)
     29 
---> 30       x = self.fc1(cls_hs)
     31 
     32       x = self.relu(x)

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py in forward(self, input)
     91 
     92     def forward(self, input: Tensor) -> Tensor:
---> 93         return F.linear(input, self.weight, self.bias)
     94 
     95     def extra_repr(self) -> str:

~/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1686         if any([type(t) is not Tensor for t in tens_ops]) and has_torch_function(tens_ops):
   1687             return handle_torch_function(linear, tens_ops, input, weight, bias=bias)
-> 1688     if input.dim() == 2 and bias is not None:
   1689         # fused op is marginally faster
   1690         ret = torch.addmm(bias, input, weight.t())

AttributeError: 'str' object has no attribute 'dim'
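A plausible explanation, judging from the trace: on newer transformers versions the model returns a dict-like ModelOutput, and tuple-unpacking it with _, cls_hs = self.bert(...) iterates over its keys, so cls_hs ends up as the string 'pooler_output' rather than a tensor. A minimal sketch of the two usual fixes (assumptions, not confirmed by the repo):

from transformers import AutoModel

# fix 1: ask for tuple outputs, so the existing unpacking keeps working
bert = AutoModel.from_pretrained('bert-base-uncased', return_dict=False)

# fix 2: keep the default output and read the pooled output by name
# inside BERT_Arch.forward:
#   out = self.bert(sent_id, attention_mask=mask)
#   cls_hs = out.pooler_output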

Change the underlying model

Is it possible to change the underlying model from BERT to another one, such as RoBERTa or Longformer?
I tried but got an error.
[screenshot of the error]

Any thoughts, please?
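In principle the backbone can be swapped as long as the tokenizer matches the checkpoint and the classifier's first linear layer matches the new model's hidden size. A minimal sketch under those assumptions (the checkpoint names are examples, not repo choices):

from transformers import AutoModel, AutoTokenizer

checkpoint = 'roberta-base'          # or 'allenai/longformer-base-4096'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
backbone = AutoModel.from_pretrained(checkpoint, return_dict=False)

# the classifier's first layer must accept the new hidden size,
# e.g. nn.Linear(backbone.config.hidden_size, 512)
print(backbone.config.hidden_size)   # 768 for roberta-base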

Multi-label classification issue

Hi,

thanks for this great tutorial.
I want to apply this to a multi-label text classification problem. My labels are in this format:
tensor([[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]])
I changed the softmax function in the BERT model to the sigmoid function, but when I tried to train the model I got this error:
multi-target not supported at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:18

Could you help, please?
Thank you.
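For reference, NLLLoss only accepts a single class index per example, which is why it rejects multi-hot targets. A minimal sketch of the usual multi-label setup (an assumption, not part of the repo): raw logits plus BCEWithLogitsLoss, with float targets and no softmax or log_softmax.

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# (batch, num_labels) raw scores from the classifier's last layer
logits = torch.randn(4, 6, requires_grad=True)
labels = torch.randint(0, 2, (4, 6)).float()   # multi-hot targets as floats

loss = criterion(logits, labels)
loss.backward()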

Add AUC score for AUC-ROC curve

How do I generate an AUC score for the AUC-ROC curve?
I tried predict_proba, but it gives me an error: "'BERT_Arch' object has no attribute 'predict_proba'"

Furthermore, is it possible to add other neural network layers on top of BERT, such as an LSTM or BiLSTM?
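On the AUC question: predict_proba is a scikit-learn API, which PyTorch modules don't have. A minimal sketch, assuming the model's forward returns log-softmax scores as in the tutorial, and that test_seq, test_mask, and test_y are the test tensors built there:

import torch
from sklearn.metrics import roc_auc_score

model.eval()
with torch.no_grad():
    # move tensors to the model's device first if you train on GPU
    log_probs = model(test_seq, test_mask)
    probs = torch.exp(log_probs)     # log-probabilities -> probabilities

# use the probability of the positive class (column 1)
auc = roc_auc_score(test_y.numpy(), probs[:, 1].numpy())
print(f"ROC AUC: {auc:.3f}")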

ValueError: too many dimensions 'str'

Hello,

For the train set:

train_seq = torch.tensor(tokens_train['input_ids'])
train_mask = torch.tensor(tokens_train['attention_mask'])
train_y = torch.tensor(train_labels.tolist())

at train_y the following error occurs: ValueError: too many dimensions 'str'

Here are my train labels:

0

0 positive
1 negative
2 positive
3 notr
4 positive
... ...
4002 notr
4003 positive
4004 positive
4005 notr
4006 negative

Can you help me with this issue?
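The likely cause, judging from the error, is that torch.tensor cannot build a tensor from strings like 'positive' and 'notr'; the labels need to be integer-encoded first. A minimal sketch using scikit-learn's LabelEncoder (an assumed fix, not from the repo):

import torch
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
train_y = torch.tensor(le.fit_transform(train_labels))
# le.classes_ records the mapping, e.g. ['negative', 'notr', 'positive']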

Code

Is there a way to cite the code you used for BERT fine-tuning?

Fine-tune BERT with Classification

Hello, I'm running the code on my dataset. With frozen weights the results are promising.

However, when I fully unfreeze BERT, the model predicts only one label.

Warning: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior. warn_prf(average, modifier, msg_start, len(result))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        42
           1       0.40      1.00      0.57        28

    accuracy                           0.40        70
   macro avg       0.20      0.50      0.29        70
weighted avg       0.16      0.40      0.23        70

Are any additional steps needed for the model to learn the labels correctly? I'm using 470 examples.
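This looks like the same one-label collapse reported in the first issue above, and the usual suspects are the learning rate and schedule once everything is unfrozen. A minimal sketch (the values are assumptions, and train_dataloader is the loader from the tutorial):

from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

optimizer = AdamW(model.parameters(), lr=2e-5)

epochs = 3
num_steps = len(train_dataloader) * epochs
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_steps),   # 10% warmup (assumed)
    num_training_steps=num_steps)

# call scheduler.step() after each optimizer.step() in the train loop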

axis 1 is out of bounds for array of dimension 1

I used my own dataset to train a BERT model, but an issue occurs when I try to evaluate the model on the test set (see Image 1). The shape of the array is (1909, 2), which is a two-dimensional array. One more issue: the original size of the test data should be 1910, and I'm not sure why the shape of preds has become 1909.

[Image 1: screenshot of the error]

[Image 2: screenshot]
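The screenshots aren't visible here, so this is an assumption: the error message suggests the collected predictions form a 1-D object array rather than a (1909, 2) matrix, which happens when per-batch outputs of unequal size are wrapped with np.array instead of concatenated; the off-by-one also suggests one example is being dropped somewhere (for instance a DataLoader with drop_last=True, or a slicing bug). A minimal numpy sketch of the concatenation fix:

import numpy as np

# per-batch outputs of shape (batch_size, 2); the last batch is smaller
batches = [np.random.rand(32, 2) for _ in range(59)] + [np.random.rand(22, 2)]

# np.array(batches) would give a ragged 1-D object array, so
# argmax(..., axis=1) fails; concatenate along the batch axis instead
preds = np.concatenate(batches, axis=0)        # shape (1910, 2)
pred_labels = np.argmax(preds, axis=1)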

Preds returned is a tuple of float tensors

# get model predictions for the current batch
preds = model(sent_id, mask)

# compute the loss between actual and predicted values
loss = cross_entropy(preds, labels)

When train is invoked, I see the following error:
AttributeError: 'tuple' object has no attribute 'dim'
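A minimal sketch of the usual workaround (an assumption based on the error; the variable names come from the repo's train() loop): when the model call returns a tuple, the loss function needs the logits tensor, not the tuple itself.

preds = model(sent_id, mask)
if isinstance(preds, tuple):
    preds = preds[0]          # keep only the logits tensor
loss = cross_entropy(preds, labels)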
