Adjectival verbs in Japanese lambeq [CLOSED]

cqcl commented on July 17, 2024
Adjectival verbs in Japanese lambeq

from lambeq.

Comments (8)

dimkart commented on July 17, 2024

Hi @masakiowari, thanks for this. The first thing to check is whether the parser creates a valid CCG tree; you can do this with the following code:

tree = parser.sentence2tree("your sentence") 
if tree is None:
    print("Failure")
else:
    print(tree.deriv())

If the sentence has a CCG derivation, then this is probably a lambeq problem. However, if the sentence fails to parse, then this is a problem of the DepCCG parser. Let us know the result.

masakiowari commented on July 17, 2024

Hello! Thank you very much.
For the "bad" sentence, the result is as follows.
The code

import sys
sys.path.append("/lambeq")
sys.path.append("/depccg")
from lambeq import DepCCGParser
from discopy import grammar

# Constructor call as it appears in the traceback below.
parser = DepCCGParser(verbose='suppress')
tree = parser.sentence2tree("上品な 表現 を する")
if tree is None:
    print("Failure")
else:
    print(tree.deriv())

gives the following output:

Traceback (most recent call last):
  File "c:\Users\bi21008\Downloads\depccg-master\import sys.py", line 16, in <module>
    parser = DepCCGParser(verbose='suppress')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\bi21008\Downloads\depccg-master\lambeq\text2diagram\depccg_parser.py", line 155, in __init__
    raise ValueError('DepCCGParser only supports
ValueError: DepCCGParser only supports "progress" level of verbosity. suppress was given.

For a "good" sentence such as "これはテストです",
similar code gives the following output:
これ >> Id(n) @ は >> Id(n @ n.r) @ テスト >> Id(n @ n.r) @ です >> Id(n @ n.r @ n.r @ s) @ Cup (n.l, n) >> Id(n @ n.r @ s) @ Cup(n.l, n) >> Cup(n, n.r) @ Id(s)

As you say, this result may suggest that the problem is with the parser.
However, when we use the depccg parser directly, as described on the following page:
https://github.com/masashi-y/depccg
the "bad" sentence "上品な 表現 を する" gets a proper output like
ID=2, Prob=-53.02713191278883
{< S[mod=nm,form=base,fin=t] {< S[mod=nm,form=base,fin=f] {< NP[case=nc,mod=nm,fin=f] {NP[case=nc,mod=nm,fin=f] 上品な/上品な/} {NP[case=nc,mod=nm,fin=f]\NP[case=nc,mod=nm,fin=f] 表現/表現/}} {< S[mod=nm,form=base,fin=f]\NP[case=nc,mod=nm,fin=f] {< NP[case=nc,mod=nm,fin=f] {< NP[case=nc,mod=nm,fin=f] {NP[case=nc,mod=nm,fin=f] を/を/} {NP[case=nc,mod=nm,fin=f]\NP[case=nc,mod=nm,fin=f] する/する/}} {S[mod=nm,form=base,fin=t]\S[mod=nm,form=base,fin=f] 。/。/**}}

Here, we should emphasize that a "bad sentence" means a sentence for which depccg + lambeq does not give an output.
These bad sentences are grammatically correct in Japanese.

masakiowari commented on July 17, 2024

By the way, we have modified "ja.py" of depccg so that depccg + lambeq work well together.
The above error is one we get even after this modification.
The modification of ja.py is as follows.

For example,

def generalized_backward_composition1(x: Category, y: Category) -> Optional[CombinatorResult]:
    # Backslashes are escaped so that "a\\b" is not read as a backspace character.
    uni = Unification("b\\c", "a\\b")
    if uni(x, y):
        result = x if _is_modifier(y) else uni['a'] | uni['c']
        return CombinatorResult(
            cat=result,
            op_string="bx",
            op_symbol="<B1",
            head_is_left=False,
        )
    return None

is modified as

def generalized_backward_composition1(x: Category, y: Category) -> Optional[CombinatorResult]:
    # Backslashes are escaped so that "a\\b" is not read as a backspace character.
    uni = Unification("b\\c", "a\\b")
    if uni(x, y):
        result = x if _is_modifier(y) else uni['a'] | uni['c']
        return CombinatorResult(
            cat=result,
            op_string="bc",
            op_symbol="<B",
            head_is_left=False,
        )
    return None

So here we applied the change: op_string bx -> bc, op_symbol <B1 -> <BC.

Similarly, we applied the following modifications to ja.py:
on "def generalized_backward_composition2"
#op_string bx -> gbc, op_symbol <B2 -> <Bⁿ

on "def generalized_backward_composition3"
#op_string bx -> gbc, op_symbol <B3 -> <Bⁿ

on "def generalized_backward_composition4"
#op_string bx -> gbc, op_symbol <B4 -> <Bⁿ

on "def generalized_forward_composition1"
#op_string fx -> gfc, op_symbol >Bx1 -> >Bⁿ

on "def generalized_forward_composition2"
#op_string fx -> gfc, op_symbol >Bx2 -> >Bⁿ

on "def generalized_forward_composition3"
#op_string fx -> gfc, op_symbol >Bx3 -> >Bⁿ

That is all.
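
The six relabellings listed after the first example can also be written down as a single lookup table, which makes them easier to compare at a glance. This is only a restatement of the list above as data; the names `RELABELLED`, `Label` and `new_label` are made up for this sketch and do not exist in depccg. `generalized_backward_composition1` is left out because it is shown in full above.

```python
from typing import Dict, Tuple

Label = Tuple[str, str]  # (op_string, op_symbol)

# function name in ja.py -> (labels before the change, labels after the change)
RELABELLED: Dict[str, Tuple[Label, Label]] = {
    "generalized_backward_composition2": (("bx", "<B2"), ("gbc", "<Bⁿ")),
    "generalized_backward_composition3": (("bx", "<B3"), ("gbc", "<Bⁿ")),
    "generalized_backward_composition4": (("bx", "<B4"), ("gbc", "<Bⁿ")),
    "generalized_forward_composition1": (("fx", ">Bx1"), ("gfc", ">Bⁿ")),
    "generalized_forward_composition2": (("fx", ">Bx2"), ("gfc", ">Bⁿ")),
    "generalized_forward_composition3": (("fx", ">Bx3"), ("gfc", ">Bⁿ")),
}


def new_label(function_name: str) -> Label:
    """Return the (op_string, op_symbol) pair used after the modification."""
    return RELABELLED[function_name][1]


for name, (old, new) in RELABELLED.items():
    print(f"{name}: {old} -> {new}")
```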

Before this modification, many more errors occurred.
The problem with adjectival verbs is one that remains even after this modification.

ianyfan commented on July 17, 2024

@masakiowari Hi, I can't seem to replicate this issue. Could you run this code fragment on your system and show us the full output please?

from lambeq import DepCCGParser
parser = DepCCGParser(lang='ja')

sentences = [
    '感動的な映画を見る',
    '曖昧な表現をする',
    '静かな海を見る',
    '健康な男性が歩く',
    '親切な男性がいる',
    '元気な男性が歩く',
    '上品な表現をする',
    'きれいな海を見る',
    '健やかな男性が歩く',
    '和やかな雰囲気を感じる',
    '穏やかな笑顔を浮かべる',
    '正直な男性がいる',
    '有名な男性がいる',
    'にぎやかな雰囲気を感じる',
    '特別な表現をする',
    '複雑な表現をする',
    'まじめな男性がいる',
    '下手な表現をする',
    '便利な本を買う',
    '朗らかな笑顔を浮かべる',
    '幸せな笑顔を浮かべる',
    '好きなスープを食べる',
    '無理な計画を立てる',
    '暇な男性がいる',
    '必要な計画を立てる',
    '邪魔なものをどかす',
    '変な表現をする',
    '自由な表現をする'
]

for sentence in sentences:
    print(parser.sentence2tree(sentence))

Thank you.

masakiowari commented on July 17, 2024

@ianyfan Thank you very much for your great suggestion!
We have now rebuilt our lambeq and depccg environment without using the modified ja.py file.
Now we can process sentences like
'感動的な映画を見る',
'曖昧な表現をする',
'静かな海を見る',
without errors.

[Screenshots: 2023-06-09 (1), 2023-06-09 (2), 2023-06-09 (3)]

Now we can handle adjectival verbs.

Unfortunately, however, it seems that there still exist sentences that cannot be handled,
e.g. "ボブはおいしくないカレーが嫌いではない"

[Screenshots: 2023-06-09 (4), 2023-06-09 (5), 2023-06-09 (6)]

This sentence can be handled in the old environment with the modified ja.py file.

ianyfan commented on July 17, 2024

Hi, I've had a look and the issue seems to be due to depccg returning a parse that cannot be drawn under standard CCG rules.

From your initial list of sentences, there are 4 that lambeq cannot draw:

  • 親切な男性がいる
  • 正直な男性がいる
  • まじめな男性がいる
  • 暇な男性がいる

They all have the same issue. For example, for the first sentence, depccg returns a parse that contains this problematic sub-parse:

 親切   な
----- -----
  S    S\S       男性
-----------(BA) -----
     S            N
---------------------(UNK)
           N

depccg tells us which rule it uses at each step, e.g. BA for backward application. For the bottom rule, depccg reports the rule "other", which clearly isn't a standard CCG rule. Therefore, we cannot draw this tree as a diagram.
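
To locate such steps without reading a whole derivation by eye, one could walk the tree and collect every node whose rule label is outside an allowed set. The sketch below uses a small stand-in `Node` class rather than lambeq's own tree type, and the label set `STANDARD_RULES` is an illustrative assumption, not lambeq's actual list of supported rules.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative set of rule labels treated as standard here (an assumption).
STANDARD_RULES = {"L", "FA", "BA", "FC", "BC", "FX", "BX"}


@dataclass
class Node:
    """Hypothetical stand-in for one node of a CCG derivation."""
    cat: str
    rule: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def nonstandard_steps(node: Node) -> List[Node]:
    """Collect, depth first, every node whose rule is outside STANDARD_RULES."""
    found = [node] if node.rule not in STANDARD_RULES else []
    for child in node.children:
        found.extend(nonstandard_steps(child))
    return found


# The sub-parse from the comment above: (親切 な) combined with 男性 by "UNK".
adjective = Node("S", "BA", children=[Node("S", "L", "親切"), Node("S\\S", "L", "な")])
sub_parse = Node("N", "UNK", children=[adjective, Node("N", "L", "男性")])

for step in nonstandard_steps(sub_parse):
    print(step.cat, step.rule)
```

On this hand-built example only the bottom step is reported, which matches the diagnosis that the backward application inside the adjective is fine and the final combination is the problem.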

The example in your comment, "ボブはおいしくないカレーが嫌いではない", has a different issue: depccg tries to perform backward crossed composition (BX) on the types S\N + S\S -> S\N, which are not valid types for backward crossed composition, and this results in an error when trying to draw the diagram.
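
The mismatch can be checked by hand against the textbook schemas, where backward composition is Y\Z X\Y => X\Z and backward crossed composition is Y/Z X\Y => X/Z. The sketch below encodes only these two schemas over a tiny hypothetical `Cat` type, with atomic categories written as plain strings; it is not how depccg or lambeq represent categories.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Cat:
    """Hypothetical complex category: result, slash ('/' or '\\'), argument."""
    result: Union["Cat", str]
    slash: str
    arg: Union["Cat", str]

    def __str__(self) -> str:
        # No bracketing is added, so this is only readable for simple categories.
        return f"{self.result}{self.slash}{self.arg}"


def backward_compose(left, right, crossed: bool) -> Optional[Cat]:
    """Apply  Y|Z  X\\Y  =>  X|Z, where | is '/' if crossed else '\\'."""
    if not (isinstance(left, Cat) and isinstance(right, Cat)):
        return None
    wanted = "/" if crossed else "\\"
    if left.slash != wanted or right.slash != "\\" or right.arg != left.result:
        return None
    return Cat(right.result, wanted, left.arg)


s_n = Cat("S", "\\", "N")  # S\N
s_s = Cat("S", "\\", "S")  # S\S

print(backward_compose(s_n, s_s, crossed=False))  # harmonic schema
print(backward_compose(s_n, s_s, crossed=True))   # crossed schema
```

Under these schemas the pair S\N, S\S fits plain backward composition and yields S\N, while the crossed variant does not apply because the left category carries a backward slash rather than a forward one.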

So I'm afraid I'm not sure if we can help you on the lambeq side; this seems to be an issue with how depccg parses these sentences.

I hope that helps. Let me know if you have any more questions.

masakiowari commented on July 17, 2024

@ianyfan, thank you very much for your detailed explanation.
Now I fully understand the reason for this problem.
So lambeq can only handle standard CCG, and depccg-ja sometimes outputs something that does not obey standard CCG.
A possible solution may be to modify depccg so that it only outputs standard CCG.
I will try to solve the problem in this direction.

dimkart commented on July 17, 2024

We'll convert this to a Discussion since it might be useful for other users as well.
