yzhangcs / crfpar

[ACL'20, IJCAI'20] Code for "Efficient Second-Order TreeCRF for Neural Dependency Parsing" and "Fast and Accurate Neural CRF Constituency Parsing".

Home Page: https://www.aclweb.org/anthology/2020.acl-main.302

License: MIT License

Language: Python (100.00%)

Topics: pytorch, treecrf, dependency-parsing, constituency-parsing, acl2020, ijcai2020

crfpar's Introduction

CʀꜰPᴀʀ


Source code for ACL'20 paper "Efficient Second-Order TreeCRF for Neural Dependency Parsing" and IJCAI'20 paper "Fast and Accurate Neural CRF Constituency Parsing".

The code of the ACL'20 paper (Cʀꜰ2o is not ported yet) and the IJCAI'20 paper is available at the crf-dependency and crf-constituency branches, respectively.

Currently I'm working on releasing a Python package named supar, including pretrained models for my papers. The code is unstable and not merged into this repo yet. If you would like to try it out in advance, please refer to my other repository, parser.

Citation

If you are interested in our research, please cite:

@inproceedings{zhang-etal-2020-efficient,
  title     = {Efficient Second-Order {T}ree{CRF} for Neural Dependency Parsing},
  author    = {Zhang, Yu and Li, Zhenghua and Zhang, Min},
  booktitle = {Proceedings of ACL},
  year      = {2020},
  url       = {https://www.aclweb.org/anthology/2020.acl-main.302},
  pages     = {3295--3305}
}

@inproceedings{zhang-etal-2020-fast,
  title     = {Fast and Accurate Neural {CRF} Constituency Parsing},
  author    = {Zhang, Yu and Zhou, Houquan and Li, Zhenghua},
  booktitle = {Proceedings of IJCAI},
  year      = {2020},
  doi       = {10.24963/ijcai.2020/560},
  url       = {https://doi.org/10.24963/ijcai.2020/560},
  pages     = {4046--4053}
}

Please feel free to email me if you have any issues.


crfpar's Issues

bug: dim[1] mismatch with DataParallel

background

--feat char with two GPUs

error

  File "/data/user/jupyter-ws/sematic/crfpar/parser/model.py", line 98, in forward
    word_embed, feat_embed = self.embed_dropout(word_embed, feat_embed)
  File "/home/user/anaconda3/envs/crfpar/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/user/jupyter-ws/sematic/crfpar/parser/modules/dropout.py", line 56, in forward
    total = sum(masks)
RuntimeError: The size of tensor a (126) must match the size of tensor b (125) at non-singleton dimension 1

analysis

self.model = nn.DataParallel(self.model)

see the PyTorch 1.7.0 DataParallel documentation

lens = mask.sum(dim=1)

DataParallel scatters inputs across GPUs, which means that the max length of the words slice seen by a given replica may not match words.shape[1]. Thus, calculating lens from mask leads to a mismatched max sequence length.

feat_embed = pad_sequence(feat_embed.split(lens.tolist()), True)

Rebuilding feat_embed from lens can therefore produce a mismatched dim 1, namely the max-sequence-length dimension of feat_embed, resulting in the error above.
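
To make the failure mode concrete, here is a minimal, self-contained sketch (hypothetical shapes, not the project's actual code) of how mask-derived lengths on a scattered replica can re-pad a tensor to a shorter dim 1 than the scattered input:

import torch
from torch.nn.utils.rnn import pad_sequence

# Full batch padded to length 5; the second sentence is short.
words = torch.tensor([[1, 2, 3, 4, 5],
                      [1, 2, 0, 0, 0]])  # 0 = <pad>
mask = words.ne(0)

# DataParallel scatters along dim 0: one replica gets only the short
# sentence, still padded to the full batch's length 5.
replica, replica_mask = words[1:], mask[1:]
lens = replica_mask.sum(dim=1)  # tensor([2])

# Rebuilding a feature tensor from lens re-pads to max(lens) = 2,
# which no longer matches replica.shape[1] = 5.
feat = torch.randn(int(lens.sum()), 8)
feat_embed = pad_sequence(feat.split(lens.tolist()), True)
print(replica.shape[1], feat_embed.shape[1])  # 5 2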

possible solutions

  1. Use only one GPU.

  2. Pass lens as a model input instead of calculating it from words. (This won't work; use #6 (comment) instead.)

What's next for the state of the art?

Firstly I would like to thank you for this fantastic work!

I am not an expert; I am more a user of dependency parsing than a researcher, but I need highly accurate dependency parsing (I am trying to build true semantic parsing).
As you know, the current number 1 SOTA, Label Attention Layer + HPSG + XLNet (Mrini et al., 2019), has a LAS of 96.26.
While this is good, it is not accurate enough for many semantic downstream tasks!

So I'm looking for the future state of the art: what do you think would be the most promising direction?

I'm really interested in merging the best ideas from other SOTA systems into a new state of the art that is the best of all.
But some techniques are incompatible with others.

So let me ask some noob questions:
Could your CRF parser benefit from using XLNet? From using HPSG? And/or from using a label attention layer?

Reproduce results with pretrained model

Hi,
Your work is exciting.
I can reproduce the non-pretrained parsing results quite quickly and efficiently.
However, I cannot reproduce the constituency parsing results with BERT.
For BERT constituency parsing, I run: python run.py train --device 0 --feat bert --file exp/ptb.bert
And get the result:

max score of dev is 94.18% at epoch 183
the score of test at epoch 183 is 94.09%

With the pretrained language model, we should get something around 95.59.
Can you guide me to reproduce the results?

Reproducing results from the paper (Universal Dependencies)

Hi, interesting paper and thank you for making this well structured code available.

I'm having trouble figuring out what to run to reproduce your results. More specifically, I've managed to load the Universal Dependencies dataset (v2.2) after making some changes to the corpus loading code, but I still have the following question:

  • What configuration do I need to change to reproduce the comparison of LOC, CRF, and CRF2O? Are second-order features part of the released code?

Thanks in advance!

About the equations in the paper "Efficient Second-Order TreeCRF for Neural Dependency Parsing"

Hello, I would like to ask about the shapes of the tensors involved in the triaffine scoring structure in the paper.
[screenshot: Snipaste_2023-10-26_22-34-38]
Specifically, the paper says that W (triaffine) has shape (d x d x d). I would like to know the shapes of the remaining three parts, i.e., how the three representations r(i), r(k), r(j) are multiplied with the third-order weight tensor W.
Looking forward to your answer, thank you!
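
For anyone reading along, here is a minimal sketch (my own notation, not the authors' code) of a triaffine score computed with torch.einsum, assuming shared d-dimensional representations for the head i, sibling k, and dependent j, and a third-order weight tensor W of shape (d, d, d):

import torch

batch, seq_len, d = 2, 5, 16
r = torch.randn(batch, seq_len, d)  # representations r(i), r(k), r(j)
W = torch.randn(d, d, d)            # third-order weight tensor

# s[b, i, k, j] = sum over x, y, z of r[b,i,x] * W[x,y,z] * r[b,k,y] * r[b,j,z]
s = torch.einsum('bix,xyz,bky,bjz->bikj', r, W, r, r)
print(s.shape)  # torch.Size([2, 5, 5, 5])

(In the paper, i, k, and j would typically come from separate MLPs, and a bias dimension may be appended, so the real shapes can differ slightly; the contraction pattern is the same.)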

potential problem with inside function

Testing the result of the inside function; inside_ng is the inside function with all register_hook calls removed:

import torch
from torch_struct import DependencyCRF
from alg import inside_ng

score = torch.zeros(1, 3, 3)
score1 = torch.zeros(1, 2, 2)  # equivalent score used by torch_struct

mask = torch.tensor([False, True, True]).unsqueeze(0)
lens = torch.tensor([2]).long()

# partition by inside
s_i, s_c = inside_ng(score, mask)
partition = s_c[0].gather(0, lens.unsqueeze(0)).sum()
print(partition)

# partition by torch_struct
deptree = DependencyCRF(score1, lens)
print(deptree.partition)

The results are 0.6931 (by inside) and 1.0986 (by torch_struct).

By hand calculation, the true logZ should be ln(3) = 1.0986, not ln(2) = 0.6931: with all scores zero, logZ counts the valid dependency trees over 2 words, and there are 3 of them when the root may take multiple dependents (ln(2) would correspond to counting only 2 trees, e.g. under a single-root constraint).
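
As a quick sanity check on the expected value (my own brute-force sketch, not part of the repo): with all scores zero, logZ simply counts the head assignments that form a valid tree over 2 words, where head 0 denotes the root.

import math
from itertools import product

def is_tree(heads):
    # every word must reach the root (0) without revisiting a node
    for d in range(1, len(heads) + 1):
        seen, h = set(), d
        while h != 0:
            if h in seen:
                return False
            seen.add(h)
            h = heads[h - 1]
    return True

# heads[i] is the head of word i+1; disallow self-heads, then keep trees
trees = [h for h in product([0, 1, 2], repeat=2)
         if all(head != dep + 1 for dep, head in enumerate(h)) and is_tree(h)]
print(len(trees), math.log(len(trees)))  # 3 1.0986...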

Where is the triaffine module?

The crf-dependency branch is mentioned twice in your README.md; I think it's a typo.

from README.md: The code of ACL'20 paper (Cʀꜰ2o is not ported yet) and IJCAI'20 paper is available at the crf-dependency branch and crf-dependency branch respectively.

I can't find the triaffine module in either the crf-dependency or the crf-constituency branch.
https://github.com/yzhangcs/crfpar/blob/crf-dependency/parser/model.py
https://github.com/yzhangcs/crfpar/blob/crf-constituency/parser/model.py

Why was [0] omitted in fn isprojective?

Thanks for releasing the code.

I found that [0] is omitted in

arcs = [(h, d) for d, h in enumerate(sequence[1:], 1) if h >= 0]
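
For concreteness, a small illustration (with a hypothetical head list) of what this comprehension computes; the index-0 slot is treated as a placeholder and skipped:

# Hypothetical CoNLL-style head list: index 0 is a placeholder,
# indices 1..3 hold the heads of words 1..3 (head 0 = root).
sequence = [-1, 2, 0, 2]
arcs = [(h, d) for d, h in enumerate(sequence[1:], 1) if h >= 0]
print(arcs)  # [(2, 1), (0, 2), (2, 3)]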

However, in yzhangcs/parser, [0] is included in the calculation as far as I can tell. Was this intended behavior? Can you explain the difference between the two cases? Below are the yzhangcs/parser references.

https://github.com/yzhangcs/parser/blob/d4559e521ca88a3997337465a3c6d2bce4165d11/supar/utils/transform.py#L259

https://parser.readthedocs.io/en/latest/_modules/supar/utils/transform.html#CoNLL.isprojective

https://github.com/yzhangcs/parser/blob/d4559e521ca88a3997337465a3c6d2bce4165d11/tests/test_transform.py#L43

Thanks in advance

easy use

Thanks for publishing this work. I wonder, is there any way to use this method easily without deploying the whole project, such as a Python package, an API, or an online demo?
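
For later readers: the supar package mentioned in the README is the intended answer to this. A hedged sketch of what usage looks like (the model identifier below is illustrative, not a guaranteed name; check the supar documentation for the published identifiers):

# Sketch of off-the-shelf usage via the supar package; 'crf2o-dep-en'
# is an illustrative model name, not a guaranteed identifier.
from supar import Parser

parser = Parser.load('crf2o-dep-en')
dataset = parser.predict([['She', 'enjoys', 'playing', 'tennis', '.']],
                         verbose=False)
print(dataset.sentences[0])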
