ppwb's People

Contributors

asylumcs, cpeel, okrick, rtonsing, windymilla

ppwb's Issues

"Suppress pipe characters" option for ppcomp

It's common DP practice to represent table borders with | in the text version of a book. In HTML, borders can be created with CSS, so the pipe characters are usually removed. When comparing the two versions with ppcomp, this leads to sometimes hundreds of lines of diffs to look through, and finding any genuine diffs that need fixing is needle-in-haystack level of difficulty.

The example that broke me today: https://www.pgdp.net/d/ppwb/r63aebbc4c881a/result.html

There are at least two genuine diffs in that report. Good luck. ;)

So: I'd really like to be able to run ppcomp with an option to ignore the pipe character, to make the output on table-heavy books like this one much shorter and much easier to review.

(Bonus points for also excluding sequences that look like --+----+-- or ==+====+==, but those occur much less frequently and aren't so much trouble to scroll past.)
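
A minimal sketch of what such an option might do, assuming it runs as a preprocessing step on the text version before the comparison. The helper name `suppress_table_borders` and the whitespace collapsing are illustrative assumptions, not part of comp_pp.py:

```python
import re

# Hypothetical pattern: a line built only from -, =, +, | and spaces that
# contains at least one "-+" or "=+" junction, e.g. --+----+-- or ==+====+==.
RULE_LINE = re.compile(r"^[-=+| ]*[-=]+\+[-=+| ]*$")


def suppress_table_borders(text: str) -> str:
    """Drop table rule lines and remove pipe characters from the rest."""
    cleaned = []
    for line in text.splitlines():
        if RULE_LINE.match(line):
            continue  # skip horizontal table rules entirely
        line = line.replace("|", " ")
        # Collapse the runs of spaces left behind by the removed pipes.
        cleaned.append(re.sub(r"[ \t]+", " ", line).strip())
    return "\n".join(cleaned)
```

A plain row of dashes with no `+` is deliberately left alone here, since it may be a thought break rather than a table rule.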

ppcomp fails to parse html file

The attached html and text files cause the following error log:

Traceback (most recent call last):
  File "./bin/comp_pp.py", line 1621, in <module>
    main()
  File "./bin/comp_pp.py", line 1616, in main
    _, html_content, fn1, fn2 = x.do_process()
  File "./bin/comp_pp.py", line 1338, in do_process
    f.load(fname)
  File "./bin/comp_pp.py", line 694, in load
    self.myfile.load_xhtml(filename, relax=True)
  File "./bin/comp_pp.py", line 223, in load_xhtml
    os.path.basename(name))
SyntaxError: Parsing errors in document: lettersfromandoldtimesalesman.html

The HTML file validates OK at W3C.

Brackets around footnote anchors

The old standalone ppcomp had a way of recognizing that square brackets around footnote anchors in plain text should not be reported (when there aren't corresponding brackets around superscripted anchors in HTML). Can an option for that be added to this version?
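
One way such an option could work is sketched below, assuming anchors in the text version look like `[1]`, `[12]`, or `[A]`. The pattern and the function name are hypothetical and do not describe how the old standalone ppcomp did it:

```python
import re

# Assumption: a footnote anchor is a bracketed number or single capital letter.
FOOTNOTE_ANCHOR = re.compile(r"\[(\d+|[A-Z])\]")


def unbracket_footnote_anchors(text: str) -> str:
    """Remove the square brackets around footnote anchors, keeping the label."""
    return FOOTNOTE_ANCHOR.sub(r"\1", text)
```

Longer bracketed text such as `[Illustration]` does not match the pattern and is left unchanged.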

ppcomp not showing error list on failure

I'm not sure where the problem is: the current ppcomp.py on GitHub is correct, and the PHP code looks good to me. When I locally change a </p> to a </div> and redirect the output to a file, I get:
((5731, 44), 'end-tag-too-early', {'name': 'div'}) .
But on the web site I just get:
`Whoops! Something went wrong and no output was generated. The error message was

For more assistance, ask in the discussion topic and include this identifier: r6268af6feae7d`.
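
For illustration only, here is a sketch of how error tuples of the shape shown above could be turned into a readable message for the web page. The function name is hypothetical, and whether the site reads this output at all is exactly what this issue leaves open:

```python
def format_parse_errors(errors, filename):
    """Build a multi-line message from ((line, col), code, details) tuples."""
    lines = [f"Parsing errors in document: {filename}"]
    for (line, col), code, details in errors:
        # Render the details dict (e.g. {'name': 'div'}) as key=value pairs.
        extra = ", ".join(f"{k}={v!r}" for k, v in sorted(details.items()))
        suffix = f" ({extra})" if extra else ""
        lines.append(f"  line {line}, column {col}: {code}{suffix}")
    return "\n".join(lines)
```

With the tuple from this report, the second line of the message would read `line 5731, column 44: end-tag-too-early (name='div')`.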

Allow no language to be selected for pptext

Request
Change the pptext validation to allow no language to be selected as long as spellcheck is also not selected. Bonus: allow pptext to dynamically load the list of available dictionary languages and populate the checkboxes.

Discussion
The following discussion was pulled from #9

@srjfoo:

[U]nder "Select wordlist language(s)," is the implication that pptext should not be run if a text is in a language not listed? One of my test projects is in Dutch, which is not listed. Even if spellcheck is not possible because of a missing dictionary, I would think that some of the other checks (excluding jeebies) would be useful. Could an "Other non-English" option be added that would disable spellcheck unless a good words file is provided?

@asylumcs:

I think it's fine to run pptext if it's not a language on the list. Perhaps a user would not want to tick the "run spellcheck" box in some cases. And it's not really right to me to have a list anyway because there are perfectly good dictionaries loaded that do not have a checkbox. Wouldn't surprise me if Dutch were actually available.
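
The requested rule is small enough to sketch. The Workbench form handling is PHP, so the Python below only illustrates the logic; the function and parameter names are assumptions:

```python
def validate_pptext_options(languages, run_spellcheck):
    """Return a list of error messages; an empty list means the form is valid."""
    errors = []
    # A language is only required when spellcheck is actually requested.
    if run_spellcheck and not languages:
        errors.append("Select at least one wordlist language to run spellcheck.")
    return errors
```

Under this rule, a Dutch project could be submitted with no language ticked as long as the "run spellcheck" box is also left unticked.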

Portuguese spell check only works past 'a' if a secondary language is also selected

There is an issue with the spell check for Portuguese, at a minimum, and possibly other non-English languages. When you run it with Portuguese checked as your only language, the spell check only returns words starting with 'a'. If you also add English, it will spell check the other letters. Laura Natal reported on the DP forums that doing this with Spanish as a secondary language will also resolve the issue, although I have not personally tried that.

There are some more details here from my original question about it and her replies: https://www.pgdp.net/phpBB3/viewtopic.php?p=1293475#p1293475

Since I'm working on another book today and realized it still wasn't fixed, I figured I'd log an official ticket, including the files I'm using right now.
good_words.txt
projectID62c58d2177033.txt

I can provide some more sample files if needed.

Should a Right Single Quotation Mark trigger "query: unexpected paragraph end"?

Why are paragraphs that end in a period, question mark, or exclamation mark followed by a Right Single Quotation Mark considered unexpected?

For example these 3 on my current output:

...“‘Well, youngster, what are you looking for here?’

...got yourself into a scrape with your meddlesome disposition.’

...“‘I am no beggar!’

Not an urgent issue.
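
A sketch of a paragraph-end test that would accept the three examples above, assuming the check looks only at the last characters of the paragraph. How pptext actually implements this check is not shown in this report, and the names here are hypothetical:

```python
TERMINAL = ".?!"
# Closing characters allowed after the terminal punctuation, including the
# Right Single Quotation Mark (U+2019) and Right Double Quotation Mark (U+201D).
CLOSERS = "\u2019\u201d'\")]"


def ends_as_expected(paragraph: str) -> bool:
    """True if the paragraph ends in . ? or ! once closing quotes are removed."""
    stripped = paragraph.rstrip().rstrip(CLOSERS)
    return bool(stripped) and stripped[-1] in TERMINAL
```

Stripping every trailing closer first means nested quotes such as `!’”` are handled the same way as a single closing quote.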

Minor update to ppcomp web page

The option "Type of text cleaning:" can be removed; there is no code to support it.

I'm working on a major upgrade to ppcomp to handle HTML5 & other changes, and noticed this.

PPComp error message when using "Extract and process footnotes separately"

When the "Extract and process footnotes separately" option is selected, the Workbench's ppcomp returns this error message:

`Whoops! Something went wrong and no output was generated. The error message was

For more assistance, ask in the discussion topic and include this identifier: r625483e56d486`

(There is no error message.) The attached files will demonstrate this when these options are selected:

Ignore case when comparing
Extract and process footnotes separately
Suppress "[Illustration:" marks

ppcomp_html5_limitation.zip
