renovamen / text-classification

PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text classification

License: MIT License

Python 100.00%

text-classification han nlp hierarchical-attention-networks fasttext bilstm-attention textcnn lstm cnn document-classification

text-classification's Issues

get_clean_text function is inside datasets.preprocess.sentence

:~/Text-Classification$ python classify.py
Traceback (most recent call last):
  File "classify.py", line 6, in <module>
    from datasets.preprocess import get_clean_text
ImportError: cannot import name 'get_clean_text' from 'datasets.preprocess' (unknown location)

ModuleNotFoundError: No module named 'models'

Hi @Renovamen,
Running classify.py from the repository works fine, but after moving classify.py anywhere else it fails with: ModuleNotFoundError: No module named 'models'.
I understand that the model, epoch, optimizer, and so on have been saved in the .pth.tar file.
So why does this happen?

Thanks
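
A likely explanation, offered as a sketch rather than a confirmed diagnosis: if the checkpoint stores the whole model object instead of only its weights, Python's pickle records each class by its module path, so loading requires the `models` package to be importable from wherever the script runs. The example below reproduces that effect with the standard `pickle` module and a hypothetical throwaway module named `toy_models`, which is not part of the repository.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the repository's `models` package.
toy = types.ModuleType("toy_models")
exec(
    "class Net:\n"
    "    def __init__(self):\n"
    "        self.weights = [1, 2, 3]\n",
    toy.__dict__,
)
sys.modules["toy_models"] = toy

# Pickle stores a reference ("toy_models", "Net"), not the class source code.
payload = pickle.dumps(toy.Net())

# Simulate running from a location where the package cannot be imported.
del sys.modules["toy_models"]
try:
    pickle.loads(payload)
    error_name = None
except ModuleNotFoundError as exc:
    error_name = type(exc).__name__

# Once the module is importable again, the same bytes load without error.
sys.modules["toy_models"] = toy
restored = pickle.loads(payload)
print(error_name, restored.weights)
```

Under this assumption, two common workarounds are to add the repository root to `sys.path` before loading the checkpoint, or to save only the model's `state_dict` and rebuild the model from code.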

A little bug in processing data

Hello,
I'm studying the TextCNN source code. In ./datasets/preprocess/sentence.py there is this loop:

for text in row[1:]:
    text = get_clean_text(text)
    s = s + text

As a result, the last word of the title and the first word of the description are merged into a single new word. I think this should be s = s + ' ' + text.
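
The effect described above can be reproduced in isolation. This is a minimal sketch: `concat_fields` is a hypothetical helper, the row contents are made up, and the repository's `get_clean_text` step is left out.

```python
def concat_fields(fields, sep=" "):
    """Join text fields with a separator, skipping empty ones."""
    cleaned = [f.strip() for f in fields]
    return sep.join(f for f in cleaned if f)


# Made-up row: label, title, description.
row = ["3", "Wall St. Bears", "Claw Back"]

# Plain concatenation, as in the reported loop: the fields run together.
buggy = ""
for text in row[1:]:
    buggy = buggy + text

# Joining with a space keeps the word boundary between the fields.
fixed = concat_fields(row[1:])
print(repr(buggy), repr(fixed))
```

Using `join` rather than `s + ' ' + text` also avoids a stray leading or trailing space when a field is empty.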

Parameters of experimental results

python train.py --config ./configs/transformer.yaml

Loading embeddings: 400000it [00:55, 7234.47it/s]
Saving vectors to /home/a/Text-Classification/data/outputs/ag_news/sents/glove.6B.300d.txt.pth.tar
Traceback (most recent call last):
  File "train.py", line 73, in <module>
    trainer = set_trainer(config)
  File "train.py", line 28, in set_trainer
    model = models.setup(
  File "/home/a/Text-Classification/models/__init__.py", line 83, in setup
    model = Transformer(
  File "/home/a/Text-Classification/models/Transformer/transformer.py", line 53, in __init__
    self.encoder = EncoderLayer(d_model, n_heads, hidden_size, dropout)
  File "/home/a/Text-Classification/models/Transformer/encoder_layer.py", line 22, in __init__
    self.attention = MultiHeadAttention(d_model, n_heads, dropout)
  File "/home/a/Text-Classification/models/Transformer/attention.py", line 56, in __init__
    assert d_model % n_heads == 0
AssertionError
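
The failing assertion requires `d_model` to be divisible by `n_heads`, because multi-head attention splits the model dimension evenly across heads. The sketch below lists head counts that would pass the check. It assumes `d_model` equals the 300-dimensional GloVe embedding size shown in the log, which the traceback itself does not confirm, and `valid_head_counts` is a hypothetical helper.

```python
def valid_head_counts(d_model, max_heads=16):
    """Return every head count up to max_heads that divides d_model evenly."""
    return [h for h in range(1, max_heads + 1) if d_model % h == 0]


# Assumed value: 300 matches glove.6B.300d from the log above.
heads_for_300 = valid_head_counts(300)
print(heads_for_300)
```

Under that assumption, a common default such as 8 heads would fail the check, since 300 is not divisible by 8, while 5, 6, 10, or 12 heads would pass.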

Support to perform test set validation during the training evolution

The accuracy and loss reported during training in trainer.py seem to be computed on the training data.
Is there an easy way to compute test set accuracy during training within the current framework?

Thank you!
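
One way the request could be approached is to evaluate on held-out batches at the end of every epoch. The sketch below is framework-agnostic and does not reflect the repository's trainer: `accuracy`, `train_with_eval`, `train_one_epoch`, and `predict` are all hypothetical names, and the data is a toy example.

```python
def accuracy(predict, batches):
    """Fraction of correct predictions over an iterable of (inputs, labels)."""
    correct = 0
    total = 0
    for inputs, labels in batches:
        preds = predict(inputs)
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total if total else 0.0


def train_with_eval(train_one_epoch, predict, test_batches, num_epochs):
    """Run training and record held-out accuracy after each epoch."""
    history = []
    for epoch in range(num_epochs):
        train_one_epoch(epoch)
        history.append(accuracy(predict, test_batches))
    return history


# Toy stand-ins: the "model" predicts the parity of each input.
toy_predict = lambda xs: [x % 2 for x in xs]
toy_batches = [([1, 2, 3], [1, 0, 0]), ([4], [0])]
history = train_with_eval(lambda epoch: None, toy_predict, toy_batches, 2)
print(history)
```

In a PyTorch setup, the evaluation pass would typically also switch the model to `model.eval()` and run under `torch.no_grad()`, then return to `model.train()` before the next epoch.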

Getting an index out of bound error with han.yaml

python train.py --config ./configs/han.yaml

    result = self.forward(*input, **kwargs)
  File "/home/ahsadmin/anaconda3/envs/san/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 124, in forward
    return F.embedding(
  File "/home/ahsadmin/anaconda3/envs/san/lib/python3.8/site-packages/torch/nn/functional.py", line 1852, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

How can I solve this error?

I am also seeing the following in the batch:

(Pdb) batch
[tensor([[ 1, 2, 3, ..., 0, 0, 0],
[ 1, 23959, 3, ..., 0, 0, 0],
[ 1, 23959, 3, ..., 0, 0, 0],
...,
[ 1, 23959, 3, ..., 3, 6, 3],
[ 1, 23959, 3, ..., 6, 3, 0],
[ 1, 2, 3, ..., 3, 6, 3]]), tensor([[188],
[184],
[190],
[197],
[197],
[200],
[200],
[193],
[200],
[200],
[194],
[184],
[191],
[200],
[200],
[195],
[181],
[180],
[198],
[196],
[190],
[188],
[197],
[196],
[195],
[183],
[185],
[192],
[197],
[200],
[199],
[200]]), tensor([[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[-1],
[ 0],
[ 0],
[-1],
[ 0],
[ 0],
[ 0],
[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[-1],
[ 0],
[ 0],
[ 0]])]
(Pdb) n

I'm not sure about the -1 values in the labels.
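
An embedding lookup raises this IndexError when a token index falls outside the range [0, vocabulary size), and labels of -1 are also outside the range a classification loss normally expects. A quick range check on one batch can show which of the two is wrong. This is a minimal sketch on plain Python lists: `check_batch` is a hypothetical helper, and the vocabulary size and class count below are made-up values, not taken from han.yaml.

```python
def check_batch(token_ids, labels, vocab_size, n_classes):
    """Return a list of human-readable problems found in one batch."""
    problems = []
    flat = [t for row in token_ids for t in row]
    if flat and (min(flat) < 0 or max(flat) >= vocab_size):
        problems.append(
            f"token ids span [{min(flat)}, {max(flat)}], "
            f"expected [0, {vocab_size - 1}]"
        )
    bad_labels = sorted({y for y in labels if not 0 <= y < n_classes})
    if bad_labels:
        problems.append(f"labels outside [0, {n_classes - 1}]: {bad_labels}")
    return problems


# Values loosely modelled on the dump above; vocab_size and n_classes are assumed.
problems = check_batch(
    token_ids=[[1, 23959, 3], [1, 2, 0]],
    labels=[-1, 0, 0],
    vocab_size=20000,
    n_classes=2,
)
for p in problems:
    print(p)
```

One possibility, stated as a guess, is that the dataset's labels already start at 0 and the preprocessing subtracts 1 from them, which would produce the -1 values seen in the batch.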
