
fine-tuning-bert's Introduction

Hi there 👋

I am a data scientist. I solve real-world problems using data science and AI. I have a passion for learning and sharing my knowledge with others. You can read my blogs here and feel free to reach out to me to share your thoughts.

My skills 📊📈

  • Python
  • R Programming
  • Stable Diffusion
  • Generative AI
  • Natural Language Processing (NLP)
  • Streamlit
  • Anvil

If you found value in something I have created, please feel free to follow me on LinkedIn!

fine-tuning-bert's People

Contributors

prateekjoshi565


fine-tuning-bert's Issues

Unfreezing BERT parameters and training on BERT only

Hello, first of all, thanks for the useful repo.
I experimented with your code on my task, but the results were not great, so I wanted to play around with it. I ran into two issues:

1] I don't want to freeze the BERT parameters, so I commented out these two lines, as you suggested:

for param in bert.parameters():
    param.requires_grad = False

However, when I did that, all my sentences were classified as 0 (or sometimes all as 1). Any idea why that happened?

Alternatively, I set param.requires_grad = True instead of False, yet I saw the same behavior: a single label is assigned to all sentences; in some runs it's 0, in others 1.
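A likely culprit here is the learning rate: once all of BERT's weights are trainable, a learning rate tuned for training only the small classification head can destabilize the pretrained weights and collapse every prediction onto one class. Below is a minimal sketch of full fine-tuning with a smaller learning rate; BERT_Arch is the classifier wrapper defined in this repo's notebook, and the value 2e-5 is an assumption (a common starting point), not something the repo prescribes.

import torch
from torch.optim import AdamW
from transformers import AutoModel

bert = AutoModel.from_pretrained('bert-base-uncased')

# leave every parameter trainable: no requires_grad = False loop
model = BERT_Arch(bert)  # the classifier wrapper from this repo's notebook

# full fine-tuning usually needs a much smaller learning rate than
# training only the head; 2e-5 is a typical starting point (assumed)
optimizer = AdamW(model.parameters(), lr=2e-5)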

2] Another thing I tried is to classify using the original BERT directly, so I set model = bert instead of model = BERT_Arch(bert). I get the following error while training:

TypeError: nll_loss_nd(): argument 'input' (position 1) must be Tensor, not BaseModelOutputWithPoolingAndCrossAttentions

The stack trace:

 Epoch 1 / 4
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-789-c5138ddf6b25> in <module>()
     12 
     13     #train model
---> 14     train_loss, _ = train()
     15 
     16     #evaluate model

3 frames
<ipython-input-787-a8875e82e2a3> in train()
     28 
     29     # compute the loss between actual and predicted values
---> 30     loss = cross_entropy(preds, labels)
     31 
     32     # add on to the total loss

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
    209 
    210     def forward(self, input: Tensor, target: Tensor) -> Tensor:
--> 211         return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
    212 
    213 

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2530     if size_average is not None or reduce is not None:
   2531         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2532     return torch._C._nn.nll_loss_nd(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2533 
   2534 

TypeError: nll_loss_nd(): argument 'input' (position 1) must be Tensor, not BaseModelOutputWithPoolingAndCrossAttentions

I added return_dict=False when loading the model (bert = AutoModel.from_pretrained('bert-base-uncased', return_dict=False)), but the error just changed to TypeError: nll_loss_nd(): argument 'input' (position 1) must be Tensor, not tuple, with a stack trace similar to the one shown above.
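For context, the base AutoModel only produces hidden states, not class scores, so its output cannot be fed to a loss function directly; adding class scores is exactly what BERT_Arch does. A minimal sketch of one alternative, which is an assumption rather than what the repo does: use the sequence-classification variant, which ships with its own classification head.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased', num_labels=2)

batch = tokenizer(["an example sentence"], return_tensors='pt')
labels = torch.tensor([1])

# the class scores live in out.logits and the loss in out.loss
out = model(**batch, labels=labels)
out.loss.backward()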

'str' object has no attribute 'dim'

I got 'str' object has no attribute 'dim' simply by running the notebook. Any idea where the error is?

Epoch 1 / 10
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-c5138ddf6b25> in <module>
     12 
     13     #train model
---> 14     train_loss, _ = train()
     15 
     16     #evaluate model

<ipython-input-21-a8875e82e2a3> in train()
     25 
     26     # get model predictions for the current batch
---> 27     preds = model(sent_id, mask)
     28 
     29     # compute the loss between actual and predicted values

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

<ipython-input-16-9ebdcf410f97> in forward(self, sent_id, mask)
     28       _, cls_hs = self.bert(sent_id, attention_mask=mask)
     29 
---> 30       x = self.fc1(cls_hs)
     31 
     32       x = self.relu(x)

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py in forward(self, input)
     91 
     92     def forward(self, input: Tensor) -> Tensor:
---> 93         return F.linear(input, self.weight, self.bias)
     94 
     95     def extra_repr(self) -> str:

~/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1686         if any([type(t) is not Tensor for t in tens_ops]) and has_torch_function(tens_ops):
   1687             return handle_torch_function(linear, tens_ops, input, weight, bias=bias)
-> 1688     if input.dim() == 2 and bias is not None:
   1689         # fused op is marginally faster
   1690         ret = torch.addmm(bias, input, weight.t())

AttributeError: 'str' object has no attribute 'dim'
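A plausible explanation, judging from the trace: on newer transformers versions the model returns a dict-like ModelOutput, and tuple-unpacking it with _, cls_hs = self.bert(...) iterates over its keys, so cls_hs ends up as the string 'pooler_output' rather than a tensor. A minimal sketch of the two usual fixes (assumptions, not confirmed by the repo):

from transformers import AutoModel

# fix 1: ask for tuple outputs, so the existing unpacking keeps working
bert = AutoModel.from_pretrained('bert-base-uncased', return_dict=False)

# fix 2: keep the default output and read the pooled output by name
# inside BERT_Arch.forward:
#   out = self.bert(sent_id, attention_mask=mask)
#   cls_hs = out.pooler_output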

Change the underlying model

Is it possible to change the underlying model from BERT to another one, such as RoBERTa or Longformer?
I tried but got an error.
[screenshot of the error]

Any thoughts, please?
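In principle the backbone can be swapped as long as the tokenizer matches the checkpoint and the classifier's first linear layer matches the new model's hidden size. A minimal sketch under those assumptions (the checkpoint names are examples, not repo choices):

from transformers import AutoModel, AutoTokenizer

checkpoint = 'roberta-base'          # or 'allenai/longformer-base-4096'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
backbone = AutoModel.from_pretrained(checkpoint, return_dict=False)

# the classifier's first layer must accept the new hidden size,
# e.g. nn.Linear(backbone.config.hidden_size, 512)
print(backbone.config.hidden_size)   # 768 for roberta-base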

Multi-label classification issue

Hi,

thanks for this great tutorial.
I want to apply this to a multi-label text classification problem. My labels are in this format:
tensor([[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]])
I changed the softmax function in the BERT model to the sigmoid function, but when I tried to train the model I got this error:
multi-target not supported at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:18

Could you help, please?
Thank you.
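For reference, NLLLoss only accepts a single class index per example, which is why it rejects multi-hot targets. A minimal sketch of the usual multi-label setup (an assumption, not part of the repo): raw logits plus BCEWithLogitsLoss, with float targets and no softmax or log_softmax.

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# (batch, num_labels) raw scores from the classifier's last layer
logits = torch.randn(4, 6, requires_grad=True)
labels = torch.randint(0, 2, (4, 6)).float()   # multi-hot targets as floats

loss = criterion(logits, labels)
loss.backward()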

Add AUC score for AUC-ROC curve

How do I generate an AUC score for the AUC-ROC curve?
I tried predict_proba, but it gives me an error: "'BERT_Arch' object has no attribute 'predict_proba'"

Furthermore, is it possible to add other neural network layers on top of BERT, such as an LSTM or BiLSTM?
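On the AUC question: predict_proba is a scikit-learn API, which PyTorch modules don't have. A minimal sketch, assuming the model's forward returns log-softmax scores as in the tutorial, and that test_seq, test_mask, and test_y are the test tensors built there:

import torch
from sklearn.metrics import roc_auc_score

model.eval()
with torch.no_grad():
    # move tensors to the model's device first if you train on GPU
    log_probs = model(test_seq, test_mask)
    probs = torch.exp(log_probs)     # log-probabilities -> probabilities

# use the probability of the positive class (column 1)
auc = roc_auc_score(test_y.numpy(), probs[:, 1].numpy())
print(f"ROC AUC: {auc:.3f}")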

ValueError: too many dimensions 'str'

Hello,

For the train set:

train_seq = torch.tensor(tokens_train['input_ids'])
train_mask = torch.tensor(tokens_train['attention_mask'])
train_y = torch.tensor(train_labels.tolist())

at train_y the following error occurs: ValueError: too many dimensions 'str'

Here are my train labels:

0

0 positive
1 negative
2 positive
3 notr
4 positive
... ...
4002 notr
4003 positive
4004 positive
4005 notr
4006 negative

Can you help me with this issue?
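The likely cause, judging from the error, is that torch.tensor cannot build a tensor from strings like 'positive' and 'notr'; the labels need to be integer-encoded first. A minimal sketch using scikit-learn's LabelEncoder (an assumed fix, not from the repo):

import torch
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
train_y = torch.tensor(le.fit_transform(train_labels))
# le.classes_ records the mapping, e.g. ['negative', 'notr', 'positive']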

Code

Is there a way to cite the code you used for BERT fine-tuning?

Fine-tune BERT with Classification

Hello, I'm running the code on my dataset. With frozen weights the results are promising.

However, when I fully unfreeze BERT, the model predicts only one label.

Warning: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior. warn_prf(average, modifier, msg_start, len(result))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        42
           1       0.40      1.00      0.57        28

    accuracy                           0.40        70
   macro avg       0.20      0.50      0.29        70
weighted avg       0.16      0.40      0.23        70

Are any additional steps needed for the model to learn the labels correctly? I'm using 470 examples.
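This looks like the same one-label collapse reported in the first issue above, and the usual suspects are the learning rate and schedule once everything is unfrozen. A minimal sketch (the values are assumptions, and train_dataloader is the loader from the tutorial):

from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

optimizer = AdamW(model.parameters(), lr=2e-5)

epochs = 3
num_steps = len(train_dataloader) * epochs
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_steps),   # 10% warmup (assumed)
    num_training_steps=num_steps)

# call scheduler.step() after each optimizer.step() in the train loop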

axis 1 is out of bounds for array of dimension 1

I used my own dataset to train a BERT model, but an issue occurs when I try to evaluate the model on the test set (see Image 1). The shape of the array is (1909, 2), which is a two-dimensional array. One more issue: the original size of the test data should be 1910, and I'm not sure why the shape of preds has become 1909.

[Image 1: screenshot of the error]

[Image 2: screenshot]
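The screenshots aren't visible here, so this is an assumption: the error message suggests the collected predictions form a 1-D object array rather than a (1909, 2) matrix, which happens when per-batch outputs of unequal size are wrapped with np.array instead of concatenated; the off-by-one also suggests one example is being dropped somewhere (for instance a DataLoader with drop_last=True, or a slicing bug). A minimal numpy sketch of the concatenation fix:

import numpy as np

# per-batch outputs of shape (batch_size, 2); the last batch is smaller
batches = [np.random.rand(32, 2) for _ in range(59)] + [np.random.rand(22, 2)]

# np.array(batches) would give a ragged 1-D object array, so
# argmax(..., axis=1) fails; concatenate along the batch axis instead
preds = np.concatenate(batches, axis=0)        # shape (1910, 2)
pred_labels = np.argmax(preds, axis=1)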

Preds returned is a tuple of float tensors

# get model predictions for the current batch
preds = model(sent_id, mask)

# compute the loss between actual and predicted values
loss = cross_entropy(preds, labels)

When train is invoked, I see the following error:
AttributeError: 'tuple' object has no attribute 'dim'
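A minimal sketch of the usual workaround (an assumption based on the error; the variable names come from the repo's train() loop): when the model call returns a tuple, the loss function needs the logits tensor, not the tuple itself.

preds = model(sent_id, mask)
if isinstance(preds, tuple):
    preds = preds[0]          # keep only the logits tensor
loss = cross_entropy(preds, labels)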
