renovamen / text-classification
PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text Classification
License: MIT License
:~/Text-Classification$ python classify.py
Traceback (most recent call last):
File "classify.py", line 6, in <module>
from datasets.preprocess import get_clean_text
ImportError: cannot import name 'get_clean_text' from 'datasets.preprocess' (unknown location)
Hi @Renovamen,
When I run classify.py in its original location it works fine, but after moving classify.py anywhere else it fails with: ModuleNotFoundError: No module named 'models'.
As far as I know, the model, epoch, optimizer, and so on have all been saved in the .pth.tar checkpoint.
So why does this happen?
Thanks
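A possible explanation, assuming the checkpoint pickles the model object itself rather than only its weights: torch.save relies on pickle, and pickle records a class by its import path, not its source code. The sketch below reproduces the same kind of error with plain pickle. The names TinyModel and fake_models are hypothetical stand-ins, not names from this repository.

```python
import pickle
import sys
import types


class TinyModel:
    """Stand-in for a network class that lives in a project package."""

    def __init__(self):
        self.weights = [0.1, 0.2]


# Pretend the class was defined in a package called "fake_models" (hypothetical).
fake_models = types.ModuleType("fake_models")
TinyModel.__module__ = "fake_models"
TinyModel.__qualname__ = "TinyModel"
fake_models.TinyModel = TinyModel
sys.modules["fake_models"] = fake_models

# The pickle holds a reference to "fake_models.TinyModel" plus the instance data.
blob = pickle.dumps(TinyModel())

# Simulate running from a directory where the package cannot be imported.
del sys.modules["fake_models"]
try:
    pickle.loads(blob)
    error_name = None
except ModuleNotFoundError as exc:
    error_name = type(exc).__name__
    print(exc)

# Once the package is importable again, the same bytes load without trouble.
sys.modules["fake_models"] = fake_models
restored = pickle.loads(blob)
```

If this is the cause, the script has to be able to import the models package wherever it runs, for example by putting the project root on sys.path or PYTHONPATH. Saving only model.state_dict() avoids pickling the class reference, although the code that rebuilds the model still needs to import the class.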
Hello,
I'm studying the TextCNN source code. In ./datasets/preprocess/sentence.py there is this loop:
    for text in row[1:]:
        text = get_clean_text(text)
        s = s + text
As a result, the last word of the title and the first word of the description are fused into a single new word. I think this should be s = s + ' ' + text.
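A small sketch of the effect described here. The get_clean_text below is a simplified, hypothetical stand-in (the repository's real cleaner may behave differently), and the sample row is made up for illustration.

```python
import re


def get_clean_text(text):
    """Hypothetical stand-in cleaner: lowercase and collapse non-alphanumerics."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


# Made-up row in (label, title, description) order.
row = ["3", "Oil prices rise", "Crude futures climbed"]

# Concatenating without a separator glues "rise" and "crude" together.
glued = ""
for text in row[1:]:
    glued = glued + get_clean_text(text)

# Joining with a space keeps the field boundary as a word boundary.
spaced = " ".join(get_clean_text(text) for text in row[1:])

print(glued.split())
print(spaced.split())
```

Using " ".join also avoids the leading space that s = s + ' ' + text would add when s starts out empty.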
Loading embeddings: 400000it [00:55, 7234.47it/s]
Saving vectors to /home/a/Text-Classification/data/outputs/ag_news/sents/glove.6B.300d.txt.pth.tar
Traceback (most recent call last):
File "train.py", line 73, in <module>
trainer = set_trainer(config)
File "train.py", line 28, in set_trainer
model = models.setup(
File "/home/a/Text-Classification/models/__init__.py", line 83, in setup
model = Transformer(
File "/home/a/Text-Classification/models/Transformer/transformer.py", line 53, in __init__
self.encoder = EncoderLayer(d_model, n_heads, hidden_size, dropout)
File "/home/a/Text-Classification/models/Transformer/encoder_layer.py", line 22, in __init__
self.attention = MultiHeadAttention(d_model, n_heads, dropout)
File "/home/a/Text-Classification/models/Transformer/attention.py", line 56, in __init__
assert d_model % n_heads == 0
AssertionError
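The assertion requires d_model to be divisible by n_heads, so that each attention head gets an equal slice of the model dimension. Assuming d_model follows the 300-dimensional GloVe vectors named in the log (an assumption, since the config is not shown), a head count such as 8 would fail because 300 % 8 == 4. The helper below is a hypothetical utility, not part of the repository, that lists head counts passing the check.

```python
def valid_head_counts(d_model, max_heads=16):
    """Return every head count up to max_heads that divides d_model evenly."""
    return [h for h in range(1, max_heads + 1) if d_model % h == 0]


# 300 is assumed from the glove.6B.300d file name in the log above.
print(valid_head_counts(300))
print(300 % 8)
```

Picking n_heads from that list, or switching to embeddings whose size is a multiple of the desired head count, should satisfy the assertion.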
The accuracy and loss reported during training in trainer.py seem to be computed on the training data.
Is there an easy way to also compute test-set accuracy as training progresses within the current framework?
Thank you~
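One generic way to do this is to run an evaluation pass over held-out batches at the end of every epoch. The sketch below is framework-agnostic and does not use the repository's trainer API: accuracy, train_with_eval, train_one_epoch, and predict are all hypothetical names. In PyTorch, predict would typically switch the model to model.eval() and run under torch.no_grad().

```python
def accuracy(predict, batches):
    """Fraction of examples whose predicted label equals the gold label."""
    correct = 0
    total = 0
    for inputs, labels in batches:
        preds = predict(inputs)
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total if total else 0.0


def train_with_eval(train_one_epoch, predict, eval_batches, n_epochs):
    """Run training epochs and record held-out accuracy after each one."""
    history = []
    for epoch in range(n_epochs):
        train_one_epoch(epoch)  # one pass over the training data
        history.append(accuracy(predict, eval_batches))
    return history


# Toy usage: a fixed "model" that predicts the parity of each input.
eval_batches = [([1, 2, 3], [1, 0, 0]), ([4], [0])]
history = train_with_eval(
    train_one_epoch=lambda epoch: None,  # no-op placeholder for real training
    predict=lambda xs: [x % 2 for x in xs],
    eval_batches=eval_batches,
    n_epochs=2,
)
print(history)
```

If the numbers are used to choose hyperparameters or decide when to stop, a separate validation split is the more conventional choice, with the test set kept for the final report.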
result = self.forward(*input, **kwargs)
File "/home/ahsadmin/anaconda3/envs/san/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 124, in forward
return F.embedding(
File "/home/ahsadmin/anaconda3/envs/san/lib/python3.8/site-packages/torch/nn/functional.py", line 1852, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
How can I solve this error?
Also, this is what I see in the batch:
(Pdb) batch
[tensor([[ 1, 2, 3, ..., 0, 0, 0],
[ 1, 23959, 3, ..., 0, 0, 0],
[ 1, 23959, 3, ..., 0, 0, 0],
...,
[ 1, 23959, 3, ..., 3, 6, 3],
[ 1, 23959, 3, ..., 6, 3, 0],
[ 1, 2, 3, ..., 3, 6, 3]]), tensor([[188],
[184],
[190],
[197],
[197],
[200],
[200],
[193],
[200],
[200],
[194],
[184],
[191],
[200],
[200],
[195],
[181],
[180],
[198],
[196],
[190],
[188],
[197],
[196],
[195],
[183],
[185],
[192],
[197],
[200],
[199],
[200]]), tensor([[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[-1],
[ 0],
[ 0],
[-1],
[ 0],
[ 0],
[ 0],
[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[-1],
[ 0],
[ 0],
[ 0]])]
(Pdb) n
I'm not sure about the -1 values in the labels...
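An embedding lookup raises "index out of range in self" when an index is negative or is not smaller than the number of rows in the embedding table. Class labels passed to a cross-entropy loss are likewise expected to lie in 0..n_classes-1 unless an ignore index is configured. The sketch below checks both on plain nested lists (a tensor can be converted with .tolist()). It assumes the first tensor in the batch holds token ids and the third holds labels; vocab_size and n_classes are made-up values for illustration.

```python
def find_out_of_range(values, upper):
    """Return the sorted distinct entries of a nested list outside [0, upper)."""
    bad = set()
    stack = [values]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)
        elif not 0 <= item < upper:
            bad.add(item)
    return sorted(bad)


# Small excerpts shaped like the batch printed above.
token_ids = [[1, 2, 3, 0], [1, 23959, 3, 0]]
labels = [[-1], [0], [0]]

vocab_size = 20000  # hypothetical: number of rows in the embedding table
n_classes = 2  # hypothetical: number of target classes

bad_tokens = find_out_of_range(token_ids, vocab_size)
bad_labels = find_out_of_range(labels, n_classes)
print(bad_tokens, bad_labels)
```

If the token check reports anything, the vocabulary used to encode the data and the embedding size used to build the model are out of sync. If the label check reports -1, the label mapping probably needs to be shifted or rebuilt so that classes start at 0.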