Comments (7)
With the latest stanza, I had to make these changes to get it working (see the lines marked with ###). I started the CoreNLP server separately, outside the script (see stanfordnlp/stanza#245 (comment)).
#!/usr/bin/env python3
from argparse import ArgumentDefaultsHelpFormatter, ArgumentParser
from asyncio import start_server
import os
import records
import ujson as json
from stanza.server.client import CoreNLPClient ###
from tqdm import tqdm
import copy
from lib.common import count_lines, detokenize
from lib.query import Query
import stanza.server as corenlp ###
client = None
if client is None:
    client = CoreNLPClient(annotators='tokenize,ssplit,pos,lemma,ner,depparse',
                           start_server=corenlp.StartServer.DONT_START) ###
words, gloss, after = [], [], []
objs = client.annotate(sentence) ###
for s in objs.sentence: ###
    for t in s.token: ###
        words.append(t.word)
        gloss.append(t.originalText)
        after.append(t.after)
from sqlova.
I am facing the same issue. Did you find a solution to this problem?
from sqlova.
@Daljeetka Not yet...
from sqlova.
When running annotate_wa.py, I got an error: ModuleNotFoundError: No module named 'stanza.nlp'. But I have installed stanza. Which other package should I install?
from sqlova.
I figured it out: change line 8 to from stanza.server import CoreNLPClient. Now I am also facing the same issue, TypeError: 'Document' object is not iterable.
from sqlova.
Try this:
import stanza
nlp = stanza.Pipeline('en')

def annotate(sentence, lower=True, nlp=nlp):
    """
    Input: Question
    Output: Tokenized input question
    {
        'gloss': original question,
        'words': list of tokens,
        'after': " " for tokens through last 2; last 2 tokens = ""
    }
    """
    doc = nlp(sentence)
    words, gloss, after = [], [], []
    # Use a separate loop name so the `sentence` argument is not shadowed.
    for sent in doc.sentences:
        for token in sent.tokens:
            word, originalText = token.text, token.text
            after_ = " "
            words.append(word)
            gloss.append(originalText)
            after.append(after_)
    after[-2:] = ["", ""]
    if lower:
        words = [w.lower() for w in words]
    return {
        'gloss': gloss,
        'words': words,
        'after': after,
    }
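The snippet above hardcodes a single space after every token except the last two, which is only an approximation of the real spacing. If exact spacing matters, one option is to derive 'after' from character offsets instead. The sketch below is not part of the original answer: trailing_whitespace is a hypothetical helper, and it assumes each token provides start and end character offsets into the question (Stanza tokens appear to expose these as start_char and end_char, but check that against your installed version).

```python
def trailing_whitespace(text, spans):
    """For each (start, end) token span, return the text between it and the next token."""
    after = []
    for i, (_, end) in enumerate(spans):
        # The gap runs up to the next token's start, or to the end of the text.
        next_start = spans[i + 1][0] if i + 1 < len(spans) else len(text)
        after.append(text[end:next_start])
    return after


# Hand-written offsets for illustration; with Stanza they would come from the tokens.
text = "How many wins?"
spans = [(0, 3), (4, 8), (9, 13), (13, 14)]
after = trailing_whitespace(text, spans)
```

For a question that ends in a word followed directly by punctuation, this gives the same result as the after[-2:] = ["", ""] shortcut, but it also handles questions that do not end that way.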
from sqlova.
Yes, the code by @dsivakumar seems to be correct. The return value of client.annotate(sentence) is not an actual Document object, no matter what the error message says. It's something called a Protobuf, as explained (sort of) here. These objects' fields are named in the singular (sentence, token) even though they refer to iterables of multiple sentences and tokens.
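To make the singular field names concrete, here is a minimal sketch of the traversal. It runs without a CoreNLP server, so it uses types.SimpleNamespace objects as a hypothetical stand-in for the value returned by client.annotate(sentence). Only the field names sentence, token, word, originalText and after come from the code earlier in this thread; extract_tokens and fake_doc are made up for illustration.

```python
from types import SimpleNamespace


def extract_tokens(doc):
    """Collect words, original text and trailing whitespace from a CoreNLP-style document."""
    words, gloss, after = [], [], []
    # `sentence` and `token` are singular names, but each holds a sequence.
    for s in doc.sentence:
        for t in s.token:
            words.append(t.word)
            gloss.append(t.originalText)
            after.append(t.after)
    return {"gloss": gloss, "words": words, "after": after}


# Stand-in object with the same shape as the annotated document (assumption).
fake_doc = SimpleNamespace(sentence=[
    SimpleNamespace(token=[
        SimpleNamespace(word="Hello", originalText="Hello", after=" "),
        SimpleNamespace(word="world", originalText="world", after=""),
    ])
])
result = extract_tokens(fake_doc)
```

Iterating over doc directly, as the old code did, is what raises TypeError: 'Document' object is not iterable. Going through doc.sentence and s.token avoids it.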
from sqlova.