Comments (6)
@spranjal25 are you fine with @uakarsh suggestion?
Yes, that's very helpful. I think I can get started on it. I will pick this up as soon as I get some free time. Are we looking at a timeline here, though, @Borda?
from lightning-transformers.
Hi @SeanNaren, I'm looking to contribute to ML software projects and have been using pytorch-lightning myself (read: I'm a fan!). Can you tell me where to get started on this issue? I'd like to gauge whether I can devote some of my time to fixing this one.
@SeanNaren would you have some points on how/where to start? 🐰
Hi @SeanNaren, @Borda, I think this is what is being asked to be modified.
I referred to this example. In it, we use `TranslationTransformer` for training, and it inherits from `Seq2SeqTransformer`. If we look at this line, we see that the output is loss, logits; however, the loss here is calculated taking the padding token into account.
I found an answer on how to solve it; it is described by the Hugging Face community here.
So, I guess the change to be made (in simple terms), in the same line, i.e. here, is:
- Obtain the loss and logits from the common step.
- Initialize the cross-entropy loss with `ignore_index = -100`.
- Set the target token indices that are `0` (the padding token) to `-100`.
- Calculate the final loss and then perform the steps as usual.
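The steps above can be sketched in plain PyTorch roughly as follows. This is only an illustration, not the actual lightning-transformers code: the function name `masked_cross_entropy`, the tensor shapes, and the padding ID of `0` are assumptions made for the example, and the logits are random stand-ins for what the common step would return.

```python
import torch
import torch.nn as nn

PAD_TOKEN_ID = 0     # assumption: the padding token ID is 0, as in the steps above
IGNORE_INDEX = -100  # targets with this value are skipped by CrossEntropyLoss


def masked_cross_entropy(logits, labels, pad_token_id=PAD_TOKEN_ID):
    """Hypothetical helper: cross-entropy that ignores padded target positions.

    logits: float tensor of shape (batch, seq_len, vocab_size)
    labels: long tensor of shape (batch, seq_len)
    """
    # Replace padding targets with the ignore index (returns a new tensor,
    # so the caller's labels are left untouched).
    masked_labels = labels.masked_fill(labels == pad_token_id, IGNORE_INDEX)
    loss_fn = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)
    # Flatten to (batch * seq_len, vocab_size) and (batch * seq_len,).
    return loss_fn(logits.reshape(-1, logits.size(-1)), masked_labels.reshape(-1))


# Toy batch: two target sequences padded with 0 on the right.
torch.manual_seed(0)
logits = torch.randn(2, 4, 10)  # stand-in for the model's output logits
labels = torch.tensor([[5, 3, 0, 0],
                       [7, 2, 9, 0]])
loss = masked_cross_entropy(logits, labels)
```

With the default mean reduction, the loss is averaged over the non-padded positions only. One caveat: if a model reuses the padding ID for a real token, masking by ID would also drop those positions, so the mask may need to come from the attention mask instead.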
Hope this helps in solving the issue.
@spranjal25 are you fine with @uakarsh suggestion?
not a special rush :)