songyouwei / absa-pytorch
Aspect Based Sentiment Analysis, PyTorch Implementations.
License: MIT License
Try this input first:
trump is good, but obama is bad
with trump as the target. Then try it a second time with obama as the target.
None of the models achieves the result it claims, because they output the same sentiment in both cases.
(Sorry if I tested this wrongly, but I don't think I did.)
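Is positional encoding not taken into account when computing Multi-Head Attention? I could not find it in the code, and positional encoding should be quite important, right?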
Hi,
I am getting the error below when running infer_example.py for bert_spc. Can someone help me solve it?
Below is the code for infer_example.py that I am using.
import torch
import torch.nn.functional as F
import torch.nn as nn
import argparse

from data_utils import build_tokenizer, build_embedding_matrix
from models import IAN, MemNet, ATAE_LSTM, AOA
from models.bert_spc import BERT_SPC


class Inferer:
    """A simple inference example"""
    def __init__(self, opt):
        self.opt = opt
        self.tokenizer = build_tokenizer(
            fnames=[opt.dataset_file['train'], opt.dataset_file['test']],
            max_seq_len=opt.max_seq_len,
            dat_fname='{0}_tokenizer.dat'.format(opt.dataset))
        embedding_matrix = build_embedding_matrix(
            word2idx=self.tokenizer.word2idx,
            embed_dim=opt.embed_dim,
            dat_fname='{0}_{1}_embedding_matrix.dat'.format(str(opt.embed_dim), opt.dataset))
        self.model = opt.model_class(embedding_matrix, opt)
        print('loading model {0} ...'.format(opt.model_name))
        self.model.load_state_dict(torch.load(opt.state_dict_path))
        self.model = self.model.to(opt.device)
        # switch model to evaluation mode
        self.model.eval()
        torch.autograd.set_grad_enabled(False)

    def evaluate(self, raw_texts):
        context_seqs = [self.tokenizer.text_to_sequence(raw_text.lower().strip()) for raw_text in raw_texts]
        aspect_seqs = [self.tokenizer.text_to_sequence('null')] * len(raw_texts)
        context_indices = torch.tensor(context_seqs, dtype=torch.int64).to(self.opt.device)
        aspect_indices = torch.tensor(aspect_seqs, dtype=torch.int64).to(self.opt.device)
        t_inputs = [context_indices, aspect_indices]
        t_outputs = self.model(t_inputs)
        t_probs = F.softmax(t_outputs, dim=-1).cpu().numpy()
        return t_probs


if __name__ == '__main__':
    model_classes = {
        'atae_lstm': ATAE_LSTM,
        'ian': IAN,
        'memnet': MemNet,
        'aoa': AOA,
        'bert_spc': BERT_SPC,
    }
    # set your trained models here
    model_state_dict_paths = {
        'atae_lstm': 'state_dict/atae_lstm_restaurant_acc0.7786',
        'ian': 'state_dict/ian_restaurant_acc0.7911',
        'memnet': 'state_dict/memnet_restaurant_acc0.7911',
        'aoa': 'state_dict/aoa_restaurant_acc0.8063',
        'bert_spc': 'state_dict/bert_spc_restaurant_val_acc0.8196',
    }

    class Option(object):
        pass

    opt = Option()
    opt.model_name = 'bert_spc'
    opt.model_class = model_classes[opt.model_name]
    opt.dataset = 'restaurant'
    opt.dataset_file = {
        'train': './datasets/semeval14/Restaurants_Train.xml.seg',
        'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'
    }
    opt.state_dict_path = model_state_dict_paths[opt.model_name]
    opt.embed_dim = 300
    opt.hidden_dim = 300
    opt.max_seq_len = 80
    opt.polarities_dim = 3
    opt.dropout = 0.1
    opt.bert_dim = 768
    opt.hops = 3
    opt.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    inf = Inferer(opt)
    t_probs = inf.evaluate(['happy memory', 'the service is terrible', 'just normal food'])
    print(t_probs.argmax(axis=-1) - 1)
Thanks
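Hello, I have recently been running your BERT-based AEN code. I noticed that during training you evaluate on the test set every 5 steps, for 20 epochs in total, so the model is evaluated a great many times, which seems rather time-consuming. Is this necessary? Could evaluation be run once after each epoch instead? Also, I am running your AEN-BERT model on an RTX 2080 Ti, and the largest batch_size I can set is 16. I see that your code mentions three batch sizes (16, 32 and 64), and I wonder which batch_size was used for the results in the paper.
I hope you can clarify when you have time. Thanks! :)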
I am confused about how to run significance tests with different metrics, e.g. accuracy and macro-F1.
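One option that works for any metric computed on a shared test set is a paired approximate randomization test. The sketch below is a generic illustration and not part of this repository; the function names (`accuracy`, `macro_f1`, `randomization_test`) and the toy labels are hypothetical.

```python
import random

def accuracy(gold, pred):
    """Fraction of examples where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, over the classes present in gold."""
    scores = []
    for c in sorted(set(gold)):
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def randomization_test(gold, pred_a, pred_b, metric, trials=2000, seed=0):
    """Estimate a two-sided p-value for the metric difference between A and B."""
    rng = random.Random(seed)
    observed = abs(metric(gold, pred_a) - metric(gold, pred_b))
    hits = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(pred_a, pred_b):
            # Swap the two systems' outputs for this example with probability 0.5
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(metric(gold, shuffled_a) - metric(gold, shuffled_b)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero
    return (hits + 1) / (trials + 1)

# Toy labels in the -1 / 0 / 1 polarity convention (made-up data)
gold = [1, 0, -1, 1, 1, 0, -1, 1, 0, 1]
pred_a = [1, 0, -1, 1, 1, 0, -1, 1, 1, 1]
pred_b = [1, 1, -1, 0, 1, 0, 1, 1, 1, 1]

p_acc = randomization_test(gold, pred_a, pred_b, accuracy)
p_f1 = randomization_test(gold, pred_a, pred_b, macro_f1)
```

Because the test only needs a function from (gold, predictions) to a score, the same routine serves both accuracy and macro-F1. With test sets as small as the toy one here the p-values are not meaningful; they are shown only to demonstrate the call.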
Could someone please help me resolve this issue?
I am using Miniconda and tried to run the sentiment analysis, but all I see is a "Killed" message.
Here are more details:
(base) renuk@renuk-ThinkPad-X1-Carbon-6th:~/ABSA-PyTorch$ python -c "import torch; print(torch.__version__)"
1.0.1.post2
(base) renuk@renuk-ThinkPad-X1-Carbon-6th:~/ABSA-PyTorch$ python
Python 3.6.2 |Anaconda, Inc.| (default, Oct 5 2017, 07:59:26)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
[4]+ Stopped python
(base) renuk@renuk-ThinkPad-X1-Carbon-6th:~/ABSA-PyTorch$ python train.py --model_name bert_spc --dataset restaurant --logdir bert_spc_logs
n_trainable_params: 109484547, n_nontrainable_params: 0
training arguments:
model_name: bert_spc
dataset: restaurant
optimizer: <class 'torch.optim.adam.Adam'>
initializer: <function xavier_uniform_ at 0x7f65dd9719d8>
learning_rate: 2e-05
dropout: 0.1
l2reg: 0.01
num_epoch: 20
batch_size: 64
log_step: 5
logdir: bert_spc_logs
embed_dim: 300
hidden_dim: 300
bert_dim: 768
pretrained_bert_name: bert-base-uncased
max_seq_len: 80
polarities_dim: 3
hops: 3
device: cpu
model_class: <class 'models.bert_spc.BERT_SPC'>
dataset_file: {'train': './datasets/semeval14/Restaurants_Train.xml.seg', 'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'}
inputs_cols: ['text_bert_indices', 'bert_segments_ids']
epoch: 0
Killed
When I run
python3 train.py --model_name ram --dataset restaurant --learning_rate 1e-3 --num_epoch 200
I get
RuntimeError: Expected object of scalar type Float but got scalar type Long for sequence element 1 in sequence argument at position #1 'tensors'
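This message is what `torch.cat` or `torch.stack` reports on older PyTorch releases when the tensors in the list have different dtypes. Without the full traceback it is not certain which call in the RAM model triggers it, so the sketch below only shows the general pattern with made-up tensors: cast the integer tensor to the float dtype before concatenating. The names `memory` and `position` and their shapes are hypothetical.

```python
import torch

# Hypothetical float features: (batch, seq_len, hidden)
memory = torch.zeros(2, 4, 6)
# Hypothetical integer position indices: (batch, seq_len), dtype int64
position = torch.tensor([[0, 1, 2, 3], [3, 2, 1, 0]])

# Match the float dtype and add a trailing feature axis before concatenating
position_feature = position.to(memory.dtype).unsqueeze(-1)
merged = torch.cat([memory, position_feature], dim=-1)
```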
Hi,
Thanks for the great work.
I just noticed that CrossEntropyLoss_LSR is not referenced anywhere. So does that mean I need to manually replace this line?
Line 152 in e4da01e
BTW, I noticed there are some reproducibility issues, as mentioned in #38. I think it would be great if you could provide the actual commands (random seeds, hyperparameters, etc.) for the experiments.
Regarding the implementation of the IAN model:
aspect_len = torch.tensor(aspect_len, dtype=torch.float).to(self.opt.device)
aspect = torch.sum(aspect, dim=1)
aspect = torch.div(aspect, aspect_len.view(aspect_len.size(0), 1))
text_raw_len = torch.tensor(text_raw_len, dtype=torch.float).to(self.opt.device)
context = torch.sum(context, dim=1)
context = torch.div(context, text_raw_len.view(text_raw_len.size(0), 1))
aspect_final = self.attention_aspect(aspect, context).squeeze(dim=1)
context_final = self.attention_context(context, aspect).squeeze(dim=1)
In the implementation above, both context and aspect end up as averages. Doesn't this deviate from the model in the original paper?
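What are the learning rate and the number of epochs when training the AEN-glove MHA model?
Hello, may I ask what learning rate and number of epochs were used to train AEN-glove MHA? It seems to converge very slowly; after 100 epochs it still has not reached the 0.7178 reported in the paper.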
I'm somewhat confused about why the squeeze_embedding module packs the padded sequences and then unpacks them again.
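One plausible reading is that the pack/unpack round trip trims every batch to the length of its longest real sequence and resets the padded positions to zero, so later layers do not process columns that contain only padding. The sketch below illustrates that effect with plain PyTorch calls; it is not the repository's module, `squeeze_to_batch_max` is a hypothetical name, and `enforce_sorted=False` assumes PyTorch 1.1 or newer.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

def squeeze_to_batch_max(x, lengths):
    """Trim a padded (batch, seq, dim) tensor to the longest real length in the batch."""
    packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
    out, _ = pad_packed_sequence(packed, batch_first=True)
    return out

# Two sequences padded to 5 steps; the real lengths are 3 and 1
x = torch.ones(2, 5, 4)
lengths = torch.tensor([3, 1])  # must live on the CPU
out = squeeze_to_batch_max(x, lengths)
# out has 3 time steps instead of 5, and steps past each real length are zero
```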
I tried to modify data_util.py, but some parts of it confuse me and I am stuck. How should I modify data_util.py to add position embeddings?
Thanks
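A common way to feed position information is to compute, for every token, its distance to the aspect span and look that index up in an embedding table. The sketch below shows only the index computation in plain Python; `relative_position_ids` is a hypothetical helper, and how it would be wired into the dataset class and the model is left open.

```python
def relative_position_ids(n_tokens, aspect_start, aspect_len, max_seq_len):
    """Distance of each token to the aspect span, shifted by one so 0 can mean padding."""
    aspect_end = aspect_start + aspect_len - 1
    ids = []
    for i in range(min(n_tokens, max_seq_len)):
        if i < aspect_start:
            dist = aspect_start - i
        elif i > aspect_end:
            dist = i - aspect_end
        else:
            dist = 0  # token is part of the aspect
        ids.append(dist + 1)
    # Pad to the fixed sequence length with the reserved index 0
    ids += [0] * (max_seq_len - len(ids))
    return ids

tokens = "the battery life is great".split()
# Aspect "battery life" starts at token 1 and spans 2 tokens
ids = relative_position_ids(len(tokens), aspect_start=1, aspect_len=2, max_seq_len=8)
# ids == [2, 1, 1, 2, 3, 0, 0, 0]
```

These ids could then index something like `torch.nn.Embedding(max_seq_len + 1, dim, padding_idx=0)`, whose output is added to or concatenated with the word embeddings.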
Thank you so much for your efforts. I have a question: do you think these models could work with datasets in other languages, such as Arabic?
Regards,
Saja
I tried to use CrossEntropyLoss_LSR, but I got this error:
AttributeError: 'CrossEntropyLoss_LSR' object has no attribute 'backward'
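One possible cause, which is only a guess because the calling code is not shown, is that `.backward()` was called on the loss module itself instead of on the tensor it returns. The sketch below shows the usual pattern with a generic label-smoothing loss; it is not the repository's `CrossEntropyLoss_LSR`, and the class name `LabelSmoothingLoss` and the smoothing value 0.2 are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelSmoothingLoss(nn.Module):
    """Cross entropy against a smoothed target distribution (generic sketch)."""
    def __init__(self, smoothing=0.2):
        super().__init__()
        self.smoothing = smoothing

    def forward(self, logits, targets):
        n_classes = logits.size(-1)
        log_probs = F.log_softmax(logits, dim=-1)
        # Spread `smoothing` uniformly over all classes, the rest goes to the gold class
        soft = torch.full_like(log_probs, self.smoothing / n_classes)
        soft.scatter_(1, targets.unsqueeze(1),
                      1.0 - self.smoothing + self.smoothing / n_classes)
        return -(soft * log_probs).sum(dim=-1).mean()

criterion = LabelSmoothingLoss(smoothing=0.2)
logits = torch.randn(4, 3, requires_grad=True)
targets = torch.tensor([0, 2, 1, 1])

loss = criterion(logits, targets)  # calling the module returns a scalar tensor
loss.backward()                    # backward() belongs to the tensor, not the module
```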
Hi,
Thank you for fixing the error. I tried to run the aen model and got 0.7642... after 50 epochs, and I am wondering why.
I would be grateful for a reply.
Please look at this
ImportError: /home/linux/anaconda3/envs/absa/lib/python3.7/site-packages/torch/lib/libtorch.so.1: undefined symbol: nvrtcGetProgramLogSize
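Hello author.
After many experiments, I have hit a bottleneck in reproducing the aen-bert model.
I tried many learning rates, including 2e-5, 3e-5 and 5e-5, and also increased the number of epochs (to 20, 30 and more).
My repeated experiments show:
On the res dataset, the best accuracy I can reach is 82.4 (the paper reports 83.12).
On the lap dataset, the best accuracy I can reach is 78.4 (the paper reports 79.93), which is a fairly large gap.
I would like to ask whether I am doing something wrong.
Could you please describe the training hyperparameters in more detail?
Thank you.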
Hi,
When I try to run the aen model, I get the error "can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first." I don't know why this happens or how to fix it. I would be grateful for a reply.
Hello authors, thank you for the nice, clean, running repo!
If I want to apply the models to my own text classification dataset, what's the best way to prepare the data?
Specifically,
What is the best way to mask the aspects with $T$?
Currently all I have is plain text and labels.
Is there any way to use AEN_bert or BERT_spc without having to preprocess the data with $T$?
Thank you for this nice repo. I have a question: could you share the details of the preprocessing of the SemEval2014 dataset (e.g. removing the conflict label from the dataset)? Thanks a lot.
After installing pytorch-pretrained-bert 0.6.1 and loading the vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt , it shows:
https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz not found in cache, downloading to /tmp/tmpzqs03pcr
I want to know whether I can download the tar.gz in advance, and if so, how?
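Hello, I ran the aen-Bert source code directly and found that macro-f1 peaks at only 0.68 and never exceeds 0.7. The same thing happens with the other models. Is something wrong with my settings, or can the experiments simply not reach the results in the paper (or is there some other explanation)? Thanks.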
Hello,
When I run these models on the provided datasets, almost all of the results are lower than those reported in the original papers by 2 or 3. Do you have any explanation for that?
Thanks a lot,
Traceback (most recent call last):
  File "/Users/wei/Desktop/pythonDemo/nlpdazuoye/nlp_absa/demo2.py", line 479, in <module>
    train()
  File "/Users/wei/Desktop/pythonDemo/nlpdazuoye/nlp_absa/demo2.py", line 475, in train
    ins.run()
  File "/Users/wei/Desktop/pythonDemo/nlpdazuoye/nlp_absa/demo2.py", line 403, in run
    best_model_path = self._train(criterion, optimizer, train_data_loader, val_data_loader)  # save the best model
  File "/Users/wei/Desktop/pythonDemo/nlpdazuoye/nlp_absa/demo2.py", line 327, in _train
    outputs = self.model(inputs)
  File "/Users/wei/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/wei/Desktop/pythonDemo/nlpdazuoye/nlp_absa/demo2.py", line 218, in forward
    target = self.squeeze_embedding(target, target_len)
  File "/Users/wei/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/wei/Desktop/pythonDemo/nlpdazuoye/nlp_absa/demo2.py", line 70, in forward
    x_emb_p = torch.nn.utils.rnn.pack_padded_sequence(x, x_len, batch_first=self.batch_first)
  File "/Users/wei/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/utils/rnn.py", line 268, in pack_padded_sequence
    torch._C._VariableFunctions._pack_padded_sequence(input, lengths, batch_first)
RuntimeError: Length of all samples has to be greater than 0, but found an element in 'lengths' that is <= 0
I don't know what is causing this error.
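The traceback fails inside `self.squeeze_embedding(target, target_len)`, which suggests that at least one target sequence in the batch has length 0, for example an empty aspect or an aspect whose words all map to the padding index. A quick check like the one below can locate such samples before training. It is a sketch that assumes the padding id is 0 and that a length is the count of non-padding ids; `find_empty_samples` is a hypothetical helper.

```python
def find_empty_samples(batch_indices, pad_id=0):
    """Return the positions of samples whose row contains only padding."""
    empty = []
    for i, row in enumerate(batch_indices):
        length = sum(1 for token_id in row if token_id != pad_id)
        if length == 0:
            empty.append(i)
    return empty

# Made-up padded aspect indices; the second sample has no real tokens
aspect_batch = [[12, 7, 0, 0], [0, 0, 0, 0], [5, 0, 0, 0]]
bad = find_empty_samples(aspect_batch)
# bad == [1]
```

Samples flagged this way can be dropped or repaired in the data file so that every length passed to `pack_padded_sequence` is at least 1.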
BERT in your application is used to test whether sentence B is the actual next sentence that follows sentence A, but how can I use BERT for aspect-level sentiment analysis?
Is it because the intent in the MGAN paper is to increase the weights of the aspect words and of the words close to the aspect?
Thanks for sharing and making all ABSA baselines together.
Is there a validation mechanism in the code? I am not sure how to decide which model to test (model selection).
Thank you very much for implementing the models, but I have a question. Have you tried to reproduce the results on the datasets from the articles with your models?
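My hands-on skills are too weak: I have ideas but cannot implement them. Do you plan to do more in-depth research on ABSA in the future?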
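Suppose each sentence has a varying number of targets (1 to 3), and the sentiment of every target needs to be predicted.
If the number of targets differs from sentence to sentence, how should they be aligned?
I see that many papers end with a softmax. Does that mean only the sentiment of a single target can be predicted?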
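A common approach in aspect-level sentiment analysis is to create one classification instance per (sentence, target) pair, so the softmax predicts one polarity per instance and sentences with different numbers of targets need no alignment. The sketch below shows that flattening and how predictions can be grouped back by sentence. `flatten`, `regroup` and the stand-in predictions are hypothetical; a real model call would replace the fixed list.

```python
def flatten(samples):
    """Turn (sentence, [aspect, ...]) pairs into one instance per aspect."""
    return [(sid, sentence, aspect)
            for sid, (sentence, aspects) in enumerate(samples)
            for aspect in aspects]

def regroup(instances, predictions):
    """Collect per-instance predictions into {sentence_id: {aspect: polarity}}."""
    grouped = {}
    for (sid, _, aspect), pred in zip(instances, predictions):
        grouped.setdefault(sid, {})[aspect] = pred
    return grouped

samples = [
    ("the food was great but the service was slow", ["food", "service"]),
    ("nice screen", ["screen"]),
]
instances = flatten(samples)       # 3 instances from 2 sentences
fake_predictions = [1, -1, 1]      # stand-in for one model output per instance
by_sentence = regroup(instances, fake_predictions)
# by_sentence == {0: {"food": 1, "service": -1}, 1: {"screen": 1}}
```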
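I ran your aen model directly and kept tuning the parameters, but f1 always stays around 0.6, at most 0.65, and I really cannot reach your numbers. Could you share the parameter values you changed, for reference? Thanks.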
Please mention it in the README.
Best regards
Hello, how can I measure the running time of a single model and compare it with other models? Thanks~
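A simple way to compare running times is to time the same call several times with `time.perf_counter` and keep the best run, using identical inputs and hardware for every model. The sketch below uses a dummy function in place of a model; `time_call` and `dummy_inference` are hypothetical names.

```python
import time

def time_call(fn, *args, repeats=5, **kwargs):
    """Return the best wall-clock time in seconds over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best

def dummy_inference(n):
    """Stand-in workload; replace with a model's forward pass on a fixed batch."""
    return sum(i * i for i in range(n))

elapsed = time_call(dummy_inference, 10000)
print("best of 5 runs: {:.6f} s".format(elapsed))
```

When timing a model on a GPU, calling `torch.cuda.synchronize()` before reading the clock avoids measuring only the kernel launch, since CUDA calls are asynchronous.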
After cloning, I ran the following command
python train.py --model_name bert_spc --dataset restaurant --learning_rate 2e-5 --logdir bert_spc_logs
epoch: 10
loss: 0.9447, acc: 0.6562, test_acc: 0.6500, f1: 0.2626
loss: 0.9312, acc: 0.6562, test_acc: 0.6500, f1: 0.2626
loss: 1.0388, acc: 0.6042, test_acc: 0.6500, f1: 0.2626
loss: 0.9838, acc: 0.5977, test_acc: 0.6500, f1: 0.2626
loss: 0.9509, acc: 0.6000, test_acc: 0.6500, f1: 0.2626
loss: 0.9240, acc: 0.6068, test_acc: 0.6500, f1: 0.2626
loss: 0.8728, acc: 0.6205, test_acc: 0.6500, f1: 0.2626
loss: 1.0066, acc: 0.6113, test_acc: 0.6500, f1: 0.2626
loss: 0.9481, acc: 0.6111, test_acc: 0.6500, f1: 0.2626
loss: 0.9346, acc: 0.6125, test_acc: 0.6500, f1: 0.2626
loss: 0.9435, acc: 0.6122, test_acc: 0.6500, f1: 0.2626
The test accuracy remains at 0.65 and does not change.
Hello, how were the visualization figures at the end of the paper made?
If I want to use both of them, how to modify code in aen.py? Thanks a lot.
I want to ask about the .seg files: how can I convert my dataset to this format?
Thanks,
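As far as the shipped SemEval files suggest, a `.seg` file is plain text with three lines per example: the sentence with the aspect replaced by `$T$`, the aspect term, and the polarity as -1, 0 or 1. That layout is an assumption to verify against the files under `./datasets/`. Under that assumption, a conversion could be sketched as follows; `to_seg_lines` and `dump_seg` are hypothetical helpers.

```python
import io

def to_seg_lines(text, aspect, polarity):
    """Build the three lines for one example, masking the first aspect occurrence."""
    left, found, right = text.partition(aspect)
    if not found:
        raise ValueError("aspect {!r} not found in text".format(aspect))
    return [left + "$T$" + right, aspect, str(polarity)]

def dump_seg(examples, fh):
    """Write (text, aspect, polarity) triples to an open text file handle."""
    for text, aspect, polarity in examples:
        fh.write("\n".join(to_seg_lines(text, aspect, polarity)) + "\n")

examples = [
    ("the battery life is great", "battery life", 1),
    ("the screen is dim", "screen", -1),
]
buffer = io.StringIO()  # use open("my_data.seg", "w", encoding="utf-8") for a real file
dump_seg(examples, buffer)
seg_text = buffer.getvalue()
```

A sentence with several aspects yields one three-line block per aspect, each with a different word masked.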
The .py file does not run... I mean, have you not run it yourself?
I use a 12 GB Titan Xp, and it runs out of memory.
How can I avoid this problem? Is there a bug, or is my GPU memory too small? What GPU do you use?
Thank you
Hi, I ran a 10-fold cross-validation test as shown in repo60,62, but got lower results:
twitter:
learning rate 2e-5, mean_test_acc: 0.7095, mean_test_f1: 0.6925
learning rate 5e-5, mean_test_acc: 0.6259, mean_test_f1: 0.5539
restaurant:
learning rate 2e-5, mean_test_acc: 0.7095, mean_test_f1: 0.6925
learning rate 5e-5, mean_test_acc: 0.7407, mean_test_f1: 0.5457
How can I solve this problem?
The other parameters are:
parser.add_argument('--model_name', default='aen_bert', type=str)
parser.add_argument('--dataset', default='laptop', type=str, help='twitter, restaurant, laptop')
parser.add_argument('--optimizer', default='adam', type=str)
parser.add_argument('--initializer', default='xavier_uniform_', type=str)
parser.add_argument('--learning_rate', default=2e-5, type=float, help='try 5e-5, 2e-5 for BERT, 1e-3 for others')
parser.add_argument('--dropout', default=0.1, type=float)
parser.add_argument('--l2reg', default=0.01, type=float)
parser.add_argument('--num_epoch', default=10, type=int, help='try larger number for non-BERT models')
parser.add_argument('--batch_size', default=16, type=int, help='try 16, 32, 64 for BERT models')
parser.add_argument('--log_step', default=10, type=int)
parser.add_argument('--embed_dim', default=300, type=int)
parser.add_argument('--hidden_dim', default=300, type=int)
parser.add_argument('--bert_dim', default=768, type=int)
parser.add_argument('--pretrained_bert_name', default='bert-base-uncased', type=str)
parser.add_argument('--max_seq_len', default=80, type=int)
parser.add_argument('--polarities_dim', default=3, type=int)
parser.add_argument('--hops', default=3, type=int)
parser.add_argument('--device', default=None, type=str, help='e.g. cuda:0')
parser.add_argument('--seed', default=None, type=int, help='set seed for reproducibility')
parser.add_argument('--cross_val_fold', default=10, type=int, help='k-fold cross validation')
Is it possible to add K-fold cross-validation support to the code?
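As a generic illustration of what such support involves, the sketch below splits sample indices into k folds with the standard library only. It is not the repository's implementation; `k_fold_indices` is a hypothetical helper, and the training and evaluation calls are left out.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the folds reproducible
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

splits = list(k_fold_indices(10, k=5))
for fold_id, (train_idx, val_idx) in enumerate(splits):
    # Train a fresh model on train_idx and evaluate on val_idx here
    print(fold_id, len(train_idx), len(val_idx))
```

Averaging the metric over the folds gives the cross-validated score; the model should be re-initialized for each fold so that no fold sees weights trained on its own validation data.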
Hi,
In the file infer_example.py, I have the following code:
def evaluate(self, raw_texts):
    ......
    aspect_seqs = [self.tokenizer.text_to_sequence('battery')] * len(raw_texts)
    ......
t_probs = inf.evaluate(['laptop is good but battery is bad'])
print(t_probs.argmax(axis=-1) - 1)
Why is the sentiment score 1?
I tested the sentences 'laptop is good' and 'battery is bad', and the outputs are 1 and -1 respectively. But when I combine the two sentences, the output is always 1 regardless of the aspect term.
The model I am using is AOA. It was trained on the laptop reviews, and its accuracy is 0.7304.
Hello, in the MGAN model the loss function is different. Is that reflected in the code?
I ran the bert_spc.py model; the accuracy was 65.8% and F1 was 36.5%. Why is the accuracy so low? I used the restaurant data. Is it because of the dataset?