snudown's Introduction

Snudown

Snudown is a reddit-specific fork of the Sundown Markdown parser used by GitHub, with Python integration added.

Setup for development on Mac OS X

  1. From ~/src/snudown run $ python setup.py build
  2. If this is successful, there will now be a snudown.so file in the /snudown/build/lib.< os info >-< python version number> directory
  3. From within the /lib.< os info >-< python version number> directory, start a python interpreter
<!-- Make sure you can import snudown -->
>>> import snudown
<!-- verify that the build you just made is being used -->
>>> print(snudown.__file__)
snudown.so
<!-- Test the functionality of the build -->
>>> snudown.markdown('[hi](http://www.reddit.com)')
'<p><a href="http://www.reddit.com">hi</a></p>\n'
<!-- Great! You can exit now. -->
>>> quit()
  4. Verify that the tests pass
$ PYTHONPATH="$(pwd)" python ../../test_snudown.py
  5. Verify that all the previous steps work for both Python 2 AND Python 3

Install for general use

Run setup.py install to install the module.

For Mac OS X:

  1. Install afl-fuzz via homebrew: brew install afl-fuzz
  2. You can now install the module via python setup.py install
  3. You may also compile snudown using the Makefile directly if you so wish

Thanks

Many thanks to @vmg for implementing the initial version of this fork!

License

Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

snudown's People

Contributors

andre-d, bnoordhuis, brandonc, chobie, cweider, fsx, gregleaver, jbergstroem, jjallaire, jnovinger, jordanmilne, kjk, mattsta, mcansky, mlburgos, nandhp, nono, rram, sakjur, samb, sp3nx0r, spladug, srombauts, txdv, vmg

snudown's Issues

Strikethrough code spans sometimes break depending on text afterwards

~~`a`~~`b`

should render as

ab

but on Reddit renders as

ab

Some more examples; the ones on the left are rendered incorrectly, while the ones on the right work fine:

Broken Working
~~a~~b`` ~~a`~~b``
~~a~~b` ~~a`~~``
~~a~~\b` ~~a`~~``

The broken examples remain broken if you insert a line break after the last tilde.

Code Formatting

It would be great if RFM (reddit-flavored Markdown) supported formatted code snippets the way GFM (GitHub-flavored Markdown) does, so that

const buildselecthtml = (options, selected) => 
    $.map(options, option => 
        '<option ' + ((option.id === selected) ? ' selected="selected" ' : '') + ' value="' + option.id + '">' + option.name + '</option>')
    .join('');

could become

const buildselecthtml = (options, selected) => 
    $.map(options, option => 
        '<option ' + ((option.id === selected) ? ' selected="selected" ' : '') + ' value="' + option.id + '">' + option.name + '</option>')
    .join('');

which is a great deal easier to read.

*<text>* appears to be mishandled

The old Markdown parser would show the whole text and parens in italics. The new one removes the whole paren clause and displays it as **, omitting the entire text between the asterisks.

Reintroduce houdini_unescape_html

houdini.h declares a number of methods which are not implemented anywhere:

extern void houdini_escape_html(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_escape_html0(struct buf *ob, const uint8_t *src, size_t size, int secure);
extern void houdini_unescape_html(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_escape_xml(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_escape_uri(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_escape_url(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_escape_href(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_unescape_uri(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_unescape_url(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_escape_js(struct buf *ob, const uint8_t *src, size_t size);
extern void houdini_unescape_js(struct buf *ob, const uint8_t *src, size_t size);

I notice that houdini does have an implementation, but snudown does not. Could this method be re-added to snudown?
https://github.com/vmg/houdini/blob/master/houdini_html_u.c

Autolinking occurs within links

Writing [Text http://www.snoogle.com Text](http://www.google.com) results in <a href="http://www.google.com">Text <a href="http://www.snoogle.com">http://www.snoogle.com</a> Text</a> which is clearly wrong. Autolinking of all forms should be disabled within the text of a link.

(I hope that made up domain isn't something bad)
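
A regression check for this could look for an anchor opened inside another anchor in the rendered HTML. The following is only a sketch using Python's standard html.parser module; the class and function names are made up for illustration and are not part of snudown.

```python
from html.parser import HTMLParser


class NestedAnchorFinder(HTMLParser):
    """Records whether any <a> tag is opened while another <a> is still open."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # number of currently open <a> tags
        self.nested = False  # set once an <a> is opened inside another <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
            if self.depth > 1:
                self.nested = True

    def handle_endtag(self, tag):
        if tag == "a" and self.depth > 0:
            self.depth -= 1


def has_nested_anchor(html):
    """Return True if the HTML fragment contains a link nested inside a link."""
    finder = NestedAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.nested
```

Feeding it the output quoted above should flag the nested link, while a plain link with no URL in its text should pass.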

Incorrect 'mailto' link scheme

Bug description

When linking to an email address the URI has to be written as mailto://[email protected] instead of mailto:[email protected]. Only the second form is correct according to the standard.

Expected behaviour

The second form should work as expected and generate the link. GitHub's Markdown does it correctly.

Steps to reproduce

Write a comment on reddit containing the code

[test](mailto:[email protected])

which should be equivalent to the current

[test](mailto://[email protected])

Proposed fix

Remove the slashes in src/autolink.c to accept 'mailto:'. This will also keep backwards compatibility with the existing 'mailto://' links.

The code over at reddit is already correct.
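
The backwards-compatibility claim can be illustrated with a plain prefix check. This is a Python sketch of the idea only, not the C code in src/autolink.c; the function name and the address user@example.com are placeholders chosen for illustration.

```python
def has_scheme_prefix(link, prefix):
    """Case-insensitive check that a link starts with the given scheme prefix."""
    return link[:len(prefix)].lower() == prefix


OLD_PREFIX = "mailto://"  # what the issue says is currently required
NEW_PREFIX = "mailto:"    # the proposed, standard form

standard_form = "mailto:user@example.com"   # hypothetical address
legacy_form = "mailto://user@example.com"   # hypothetical address

# The old prefix rejects the standard form...
old_accepts_standard = has_scheme_prefix(standard_form, OLD_PREFIX)
# ...while the shorter prefix accepts both, because "mailto://" starts with "mailto:".
new_accepts_standard = has_scheme_prefix(standard_form, NEW_PREFIX)
new_accepts_legacy = has_scheme_prefix(legacy_form, NEW_PREFIX)
```

Because the proposed prefix is itself a prefix of the old one, every link accepted before is still accepted after the change.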

Corruption in the header tags.

The following input

http://aaaaaaaaaaaaaaaa.aaa/aaaaaaa#aaaaa

=

Generates the following output

<p><a href="http://aaaaaaaaaaaaaaaa.aaa/aaaaaaa#aaaaa">http:&#47;&#47;aaaaaaaaaaaaaaaa.aaa&#47;aaaaaaa#aaaaa</a></p>

<h1></h118>

when MKDEXT_AUTOLINK is enabled.

I haven't dug in much here yet, but it is important to note that the number and position of the 'a's in the input is significant, but not what letter they actually are. Some slight variations of input text resulted in </h120>.
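
While investigating, a small checker that reports closing tags with no matching open tag may help catch this kind of corruption across many inputs. The following is a sketch built on Python's standard html.parser; the names are invented here, and the list of void elements is deliberately minimal.

```python
from html.parser import HTMLParser


class TagBalanceChecker(HTMLParser):
    """Collects closing tags that do not match the most recently opened tag."""

    VOID = {"br", "hr", "img"}  # elements that never take a closing tag

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open tags
        self.mismatched = []  # closing tags that matched nothing

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.mismatched.append(tag)


def mismatched_closing_tags(html):
    """Return the names of closing tags that do not close the innermost open tag."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.mismatched
```

Run against the output shown above, it should report the corrupted `h118` closing tag and nothing else.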

Domain names are incorrectly encoded as percent Hex characters

A domain name containing non-ASCII characters will not resolve properly. RFC 5890 (IDNA) allows internationalized domain names that contain non-ASCII Unicode characters. snudown incorrectly percent-escapes these characters when they are used in a domain name, and DNS servers then cannot resolve the result. For instance, if I type http://domaintest.みんな/ (note: this is an issue in GitHub too) it is parsed into http://domaintest.%E3%81%BF%E3%82%93%E3%81%AA/. The valid link should either be punycode encoded, such as http://domaintest.xn--q9jyb4c/, or not changed at all. The second option is not ideal as it potentially leaves the door open for XSS hacks (repeating the Great Reddipocalypse of October 28th, 2009).

Example here

This bug will not show up in Chrome as Chrome automatically translates hex encoded Unicode in domain names to punycode.
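
The difference between the two encodings can be reproduced with Python's standard library. This is an illustrative sketch only, not snudown code; the helper name is made up, and Python's built-in "idna" codec follows the older IDNA 2003 rules, which is enough for this example.

```python
from urllib.parse import quote


def to_ascii_hostname(hostname):
    """Encode each label of a hostname as punycode where needed (IDNA)."""
    return hostname.encode("idna").decode("ascii")


# What the issue describes snudown as producing: percent-encoded UTF-8 bytes.
percent_encoded = quote("みんな")

# What a resolvable link needs instead: the punycode form of the hostname.
punycode_host = to_ascii_hostname("domaintest.みんな")
```

Percent-encoding is appropriate for the path and query of a URL, but the host part needs the punycode form to be resolvable through DNS.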

Previously functional link stopped being accepted in v1.0.3

Between 1.0.2 and 1.0.3 the following link from /r/juggling's sidebar stopped being rendered as a link:

[Message me](/message/compose?to=raerth&subject=Please%20add%20to%20%2fr%2fJuggling's%20sidebar!&message=%7eremember%20to%20spell%20your%20subreddit%20name%20accurately!%7e) if you have a subreddit you want added!

can you please provide a sample example?

Hi, I'm trying to write a Java binding for snudown, but I'm having trouble understanding how snudown works. For example, using hoedown (which is also a Sundown fork) I can do this:

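#include <string.h>
#include <jni.h>
#include "hoedown/html.h"

JNIEXPORT jstring JNICALL
Java_com_elmanahil_rdown_Rdown_render(JNIEnv* env, jobject that, jstring rdown) {
    /* Borrow the Java string as a C string; it must be released below. */
    const char *cdown = (*env)->GetStringUTFChars(env, rdown, NULL);
    if (cdown == NULL)
        return NULL; /* an exception is already pending in the JVM */

    hoedown_renderer *renderer = hoedown_html_renderer_new(0, 0);
    hoedown_document *document = hoedown_document_new(renderer, 0, 16);
    hoedown_buffer *html = hoedown_buffer_new(16);
    hoedown_document_render(document, html, (const uint8_t *) cdown, strlen(cdown));

    /* Copy the output into a Java string before the buffer is freed.
       hoedown_buffer_cstr() NUL-terminates the buffer contents. */
    jstring result = (*env)->NewStringUTF(env, hoedown_buffer_cstr(html));

    hoedown_buffer_free(html);
    hoedown_document_free(document);
    hoedown_html_renderer_free(renderer);
    (*env)->ReleaseStringUTFChars(env, rdown, cdown);

    return result;
}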

What would be the equivalent of the above code in snudown?
Thanks

snudown fails its own test suite

running test
running egg_info
creating snudown.egg-info
writing snudown.egg-info/PKG-INFO
writing top-level names to snudown.egg-info/top_level.txt
writing dependency_links to snudown.egg-info/dependency_links.txt
writing manifest file 'snudown.egg-info/SOURCES.txt'
reading manifest file 'snudown.egg-info/SOURCES.txt'
writing manifest file 'snudown.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-i686-2.7/snudown.so ->
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... FAIL
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok
runTest (test_snudown.SnudownTestCase) ... ok

======================================================================
FAIL: runTest (test_snudown.SnudownTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/aidanhs/Desktop/emscript/emsnudown/snudown/test_snudown.py", line 180, in runTest
    self.fail(io.getvalue())
AssertionError: TEST FAILED:
       input: '[Test](#test)'
    expected: '<p><a href="#test">Test</a></p>\n'
      actual: '<p>[Test](#test)</p>\n'
                  ^


----------------------------------------------------------------------
Ran 52 tests in 0.005s

FAILED (failures=1)

Nested block quotes behave too strictly

> Blah
>> Blah blah

renders as <blockquote>Blah >>Blah blah</blockquote> rather than as nested block quotes. This is likely because those two lines are adjacent and therefore not technically separate paragraphs. This is not the current behaviour on reddit and is frankly awkward.
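
For reference, the expected nesting can be described as the number of leading `>` markers on each line. The sketch below is a hypothetical Python helper that illustrates that reading of the input; it is not how snudown's parser is implemented.

```python
def quote_depth(line):
    """Count the leading '>' markers on a line, allowing spaces between them."""
    depth = 0
    for ch in line:
        if ch == ">":
            depth += 1
        elif ch != " ":
            break  # first character of the quoted text
    return depth


lines = ["> Blah", ">> Blah blah"]
depths = [quote_depth(line) for line in lines]
```

Under this reading the second line sits one level deeper than the first, even though the two lines are adjacent.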

Support Parentheses in URLs

Bug description

The URL http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451(v=vs.85).aspx must be written as http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451%28v=vs.85%29.aspx for Snudown not to parse the second parenthesis as end of link.

Expected behaviour

The link http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451(v=vs.85).aspx should work in Markdown as is. GitHub does it.

Steps to reproduce

Write a comment on Reddit containing the code

[GitHub does it](http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451(v=vs.85).aspx)

should be equivalent to

[GitHub does it](http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451%28v=vs.85%29.aspx)

Proposed fix

When an opening parenthesis occurs within a URL in the [link](url) schema, a counter ticks up by one. When a closing parenthesis occurs, the counter ticks down by one. When the counter drops below zero, that last parenthesis is not counted as part of the URL and the URL is terminated.

Bug introduced by proposed fix

When an opening parenthesis has no matching closing parenthesis, the following code will produce a malformed URL:
[GitHub does it](http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451(v=vs.85.aspx)

I propose that if a closing parenthesis is immediately followed by a non-URL-safe character (e.g. a space), it is assumed to end the link even though an opening parenthesis is still unmatched.

Thus, the following test cases, which should succeed, need to be introduced:

1. [link](http://example.com/parenthe(sis).html)
1. [link](http://example.com/parenthe(sis.html) something else
1. [link](http://example.com/parenthe(sis.html))

The following testcases will probably fail:

1. [link](http://example.com/parenthesis).html)
1. [link](http://example.com/parenthe(sis.html). Something else.

This should still work:

1. [link](http://example.com/parenthe%28sis%29.html)
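
The counting rule and the proposed heuristic can be sketched in a few lines of Python. This is an illustration of the proposal only, not the snudown implementation; the function name is made up, and treating end of input like a space is an extra assumption of this sketch.

```python
def link_target(rest):
    """Return the URL of a [link](url) construct, or None if it never closes.

    `rest` is the text that follows the opening '(' of the link.
    """
    depth = 0  # number of '(' seen inside the URL that are still unmatched
    for i, ch in enumerate(rest):
        if ch == "(":
            depth += 1
        elif ch == ")":
            following = rest[i + 1:i + 2]
            # Close the link when nothing is left to match (the counter would
            # drop below zero), or when the parenthesis is followed by
            # whitespace or end of input (assumption: both count as
            # "non-URL-safe").
            if depth == 0 or following == "" or following.isspace():
                return rest[:i]
            depth -= 1
    return None
```

With this sketch the three test cases in the first list yield a URL, the percent-encoded form keeps working, and the two cases in the second list behave as predicted: the first is cut short at the unmatched parenthesis and the second never terminates.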
