Comments (2)

lukehsiao commented on June 15, 2024

Output from CoreNLP on the simple documents:

---------------------------------------- Captured log call -----------------------------------------
[DEBUG] Starting new HTTP connection (1): 127.0.0.1
[DEBUG] http://127.0.0.1:12345 "POST /?properties=%7B%22annotators%22:%20%22tokenize,ssplit,pos,lemma,depparse,ner%22,%22outputFormat%22:%20%22json%22,%22tokenize.options%22:%22escapeForwardSlashAsterisk=false,asciiQuotes=false,unicodeQuotes=false,normalizeOtherBrackets=false,ptb3Ellipsis=false,normalizeParentheses=false,normalizeCurrency=false,unicodeEllipsis=false,latexQuotes=false,normalizeSpace=false,strictTreebank3=true,ptb3Dashes=false,normalizeFractions=false%22,%22ssplit.htmlBoundariesToDiscard%22:%20%22NB%22%7D HTTP/1.1" 200 31713
[DEBUG] http://127.0.0.1:12345 "POST /?properties=%7B%22annotators%22:%20%22tokenize,ssplit,pos,lemma,depparse,ner%22,%22outputFormat%22:%20%22json%22,%22tokenize.options%22:%22escapeForwardSlashAsterisk=false,asciiQuotes=false,unicodeQuotes=false,normalizeOtherBrackets=false,ptb3Ellipsis=false,normalizeParentheses=false,normalizeCurrency=false,unicodeEllipsis=false,latexQuotes=false,normalizeSpace=false,strictTreebank3=true,ptb3Dashes=false,normalizeFractions=false%22,%22ssplit.htmlBoundariesToDiscard%22:%20%22NB%22%7D HTTP/1.1" 200 62272
[DEBUG] Doc: diseases
[DEBUG]   Phrase: Types of viruses, coughs, and colds
[DEBUG]   Phrase: Here isa line break
[DEBUG]   Phrase: I don't have Brain Canceror the hiccups
[DEBUG]   Phrase: See Table 1 Below.
[DEBUG]   Phrase: Common Ailments
[DEBUG]   Phrase: In between the tables there is a nasty case of heart attack
[DEBUG]   Phrase: And here is a final sentence with warts.
[DEBUG]   Phrase: Table 1: Infectious diseases and where to find them.
[DEBUG]   Phrase: Table 2: Three ways to get Pneumonia and how much they cost.
[DEBUG]   Phrase: Disease
[DEBUG]   Phrase: Location
[DEBUG]   Phrase: Year
[DEBUG]   Phrase: Polio and BC546 is -55OC cold.
[DEBUG]   Phrase: -Dublin to Milwaukee
[DEBUG]   Phrase: 2001
[DEBUG]   Phrase: I don't like TIPL761 or Chicken Pox or pizza.
[DEBUG]   Phrase: Shingles is also bad.
[DEBUG]   Phrase: whooping cough
[DEBUG]   Phrase: 2009
[DEBUG]   Phrase: Scurvy
[DEBUG]   Phrase: Annapolis
[DEBUG]   Phrase: Junction and Storage Temperature -55 to 150 o ?
[DEBUG]   Phrase: C
[DEBUG]   Phrase: Problem
[DEBUG]   Phrase: Cause
[DEBUG]   Phrase: Cost
[DEBUG]   Phrase: Arthritis
[DEBUG]   Phrase: Pokemon Go
[DEBUG]   Phrase: Free
[DEBUG]   Phrase: Yellow
[DEBUG]   Phrase: Fever
[DEBUG]   Phrase: Unicorns
[DEBUG]   Phrase: $17.75
[DEBUG]   Phrase: Hypochondria
[DEBUG]   Phrase: Fear
[DEBUG]   Phrase: $100
[DEBUG] Doc: md
[DEBUG]   Phrase: Sample Markdown
[DEBUG]   Phrase: This is some basic, sample markdown.
[DEBUG]   Phrase: Second Heading
[DEBUG]   Phrase: Unordered lists, and:
[DEBUG]   Phrase: One
[DEBUG]   Phrase: Two
[DEBUG]   Phrase: Three
[DEBUG]   Phrase: More
[DEBUG]   Phrase: Blockquote
[DEBUG]   Phrase: And
[DEBUG]   Phrase: bold
[DEBUG]   Phrase: ,
[DEBUG]   Phrase: italics
[DEBUG]   Phrase: , and even
[DEBUG]   Phrase: italics and later
[DEBUG]   Phrase: .
[DEBUG]   Phrase: Even
[DEBUG]   Phrase: bold
[DEBUG]   Phrase: strikethrough
[DEBUG]   Phrase: .
[DEBUG]   Phrase: A link
[DEBUG]   Phrase: to somewhere.
[DEBUG]   Phrase: Here is a table
[DEBUG]   Phrase: Or inline code like
[DEBUG]   Phrase: var foo = 'bar';
[DEBUG]   Phrase: .
[DEBUG]   Phrase: Or an image of bears
[DEBUG]   Phrase: The end ...
[DEBUG]   Phrase: Name
[DEBUG]   Phrase: Lunch order
[DEBUG]   Phrase: Spicy
[DEBUG]   Phrase: Owes
[DEBUG]   Phrase: Joan
[DEBUG]   Phrase: saag paneer
[DEBUG]   Phrase: medium
[DEBUG]   Phrase: $11
[DEBUG]   Phrase: Sally
[DEBUG]   Phrase: vindaloo
[DEBUG]   Phrase: mild
[DEBUG]   Phrase: $14
[DEBUG]   Phrase: Erin
[DEBUG]   Phrase: lamb madras
[DEBUG]   Phrase: HOT
[DEBUG]   Phrase: $5

CoreNLP is splitting differently formatted text (e.g., italics, bold) into separate phrases, as the fragments "And", "bold", ",", "italics" in the md document show.
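
The request settings in the log are URL-encoded, which makes them hard to read. As a minimal sketch using only the Python standard library, the `properties` query parameter can be decoded back into a dictionary. The `query` string below is an abbreviated copy of the logged one (the `tokenize.options` value is shortened for illustration), and the helper name `decode_properties` is hypothetical.

```python
import json
from urllib.parse import parse_qs


def decode_properties(query):
    """Decode the URL-encoded `properties` query parameter into a dict."""
    # parse_qs percent-decodes each value and splits each pair on the first '='.
    params = parse_qs(query)
    return json.loads(params["properties"][0])


# Abbreviated copy of the logged query string; tokenize.options is shortened here.
query = (
    "properties=%7B%22annotators%22:%20%22tokenize,ssplit,pos,lemma,depparse,ner%22,"
    "%22outputFormat%22:%20%22json%22,"
    "%22tokenize.options%22:%22normalizeSpace=false,strictTreebank3=true%22,"
    "%22ssplit.htmlBoundariesToDiscard%22:%20%22NB%22%7D"
)

props = decode_properties(query)
for key, value in props.items():
    print("{}: {}".format(key, value))
```

Printing the decoded keys side by side makes it easier to compare the sentence-splitting and tokenizer settings against the phrases in the log.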

lukehsiao commented on June 15, 2024

Inspecting 5 candidates using the code:

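from fonduer.features import features

# Collect the candidates whose first span starts with 'BC856' and whose
# second span is exactly '150'.
cand = []
for i, c in enumerate(train_cands):
    if c[0].get_span().startswith('BC856') and c[1].get_span() == '150':
        print("###", i)
        cand.append(c)

print("Candidates: {}".format(len(cand)))

# Write every feature of each selected candidate to a log file; the
# with-statement closes the file even if feature extraction raises.
with open('scapy_log_features.txt', 'w') as log:
    for c in cand:
        log.write("Candidate: {}\n".format(c))
        for f in features.get_all_feats([c]):
            log.write("    Feature: {}\n".format(f))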

at the end of the stg_temp_max tutorial.
