dikmax / mdown

Fast CommonMark-compliant Markdown parser.

Home Page: https://pub.dartlang.org/packages/md_proc

License: BSD 3-Clause "New" or "Revised" License

Dart 99.78% Shell 0.22%
dart commonmark markdown markdown-parser

mdown's Introduction

mdown

mdown is a fast, CommonMark-compliant Markdown parser.

Basic usage:

print(markdownToHtml('# Hello world!'));

The project's main goal is to create a processing library for Markdown.

Performance

Since there are few Markdown parsers written in Dart, parsing speed is compared with the markdown package. Progit was used as a source of Markdown files in different languages. In these benchmarks mdown appears to be 3 times faster in the VM, 11 times faster in Chrome, 2.2 times faster in Safari and 3.7 times faster in Firefox.

Run benchmarks yourself or see details.

mdown makes extensive use of String.codeUnitAt instead of RegExp, so you can see a noticeable gain for non-Latin languages (up to ×38 in Chrome for Japanese).
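To illustrate the difference between the two approaches, here is a minimal sketch that counts the leading # characters of a line once with String.codeUnitAt and once with RegExp. The helper names are hypothetical and are not taken from mdown's source; the sketch only shows the general technique and assumes a Dart 2 SDK.

```dart
// Hypothetical helpers for illustration only; not part of mdown.
const int _hash = 0x23; // code unit of '#'

/// Counts leading '#' characters by comparing code units directly.
int headingLevelByCodeUnit(String line) {
  int i = 0;
  while (i < line.length && line.codeUnitAt(i) == _hash) {
    i++;
  }
  return i;
}

final RegExp _hashes = RegExp(r'^#+');

/// Produces the same count with a regular expression, for comparison.
int headingLevelByRegExp(String line) {
  final Match? match = _hashes.firstMatch(line);
  return match == null ? 0 : match.end;
}

void main() {
  print(headingLevelByCodeUnit('## Title')); // 2
  print(headingLevelByRegExp('## Title')); // 2
  print(headingLevelByCodeUnit('No heading')); // 0
}
```

The code-unit version works on integers only and never allocates a match object, which is one plausible reason such a scanner can outperform regular expressions.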

Extensions

mdown supports some language extensions. You can specify which extensions are enabled with the options parameter of markdownToHtml.

Options options = const Options(superscript: true);
String res = markdownToHtml('Hello world!\n===', options);

There are four predefined sets of options:

  • Options.strict: all extensions except rawHtml are disabled.
  • Options.commonmark: only the smartPunctuation and rawHtml extensions are enabled.
  • Options.gfm: rawHtml, tagFilter and pipeTables are enabled.
  • Options.defaults: smartPunctuation, strikeout, subscript, superscript, pipeTables, texMathDollars, rawTex and rawHtml are enabled.

To use one of these sets, pass the corresponding static getter of the Options class:

String res = markdownToHtml('Hello world!\n===', Options.strict);

If the second parameter is not provided, Options.defaults is used.
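As a rough sketch of how the predefined sets differ, the same input can be rendered with several of them. The import path is the one used in the resolver example of this README; the input string is made up for illustration, and no output is shown because it depends on the installed version.

```dart
import 'package:mdown/mdown.dart';

void main() {
  // Illustrative input that uses the strikeout extension.
  const String source = 'Hello ~~world~~';

  // Options.strict disables strikeout, so the tildes are not treated as markup.
  print(markdownToHtml(source, Options.strict));

  // Options.defaults enables strikeout.
  print(markdownToHtml(source, Options.defaults));

  // Omitting the second argument is documented to behave like Options.defaults.
  print(markdownToHtml(source));
}
```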

Raw HTML (Options.rawHtml)

Allows raw HTML blocks to be included. Official CommonMark extension.

Tag filter (Options.tagFilter)

Filters <textarea>, <style>, <xmp>, <iframe>, <noembed>, <noframes>, <script>, <plaintext> from HTML output. Works together with Options.rawHtml. Part of GitHub Flavored Markdown.

Smart punctuation (Options.smartPunctuation)

Smart punctuation is the automatic replacement of ..., ---, --, " and ' with "…", "—", "–" and curly quote marks, respectively. It is the only official extension to date.

NOTE: This extension uses Unicode characters. Make sure that your code supports them.
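The following sketch shows why the encoding matters. It does not call mdown; the HTML string is a hand-written stand-in for smart punctuation output, and the lowercase utf8 and latin1 constants assume a Dart 2 SDK.

```dart
import 'dart:convert';

void main() {
  // Stand-in for smart punctuation output: curly quotes and an ellipsis.
  const String html = '<p>“Hello”…</p>';

  // UTF-8 can represent every character, so the text survives a round trip.
  final List<int> bytes = utf8.encode(html);
  print(utf8.decode(bytes) == html); // true

  // Latin-1 has no curly quotes, so encoding fails with an ArgumentError.
  try {
    latin1.encode(html);
    print('encoded as Latin-1');
  } on ArgumentError {
    print('Latin-1 cannot encode this output');
  }
}
```

When the output is sent over HTTP, setting the charset of the response explicitly to utf-8 avoids this kind of failure.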

Extended attributes for fenced code (Options.fencedCodeAttributes)

Allows fenced code blocks to have arbitrary extended attributes.

``` {#someId .class1 .class2 key=value}
code
```

This will be rendered in HTML as

<pre id="someId" class="class1 class2" key="value"><code>code
</code></pre>

Extended attributes for headings (Options.headingAttributes)

Allows headings to have arbitrary extended attributes.

# Heading 1 {#someId}

Heading 2 {.someClass}
-------------------

This will be rendered in HTML as

<h1 id="someId">Heading 1</h1>
<h2 class="someClass">Heading 2</h2>

Extended attributes for inline code (Options.inlineCodeAttributes)

Adds extended attributes support to inline code.

`code`{#id .class key='value'}

Extended attributes for links and images (Options.linkAttributes)

Extended attributes for links and images. Both inline and reference links are supported.

![](image.jpg){width="800" height="600"}

[test][ref]

[ref]: http://test.com/ {#id}

This will be transformed into:

<p><img src="image.jpg" alt="" width="800" height="600" /></p>
<p><a href="http://test.com/" id="id">test</a></p>

Strikeout (Options.strikeout)

Strikes out text (like this). Just wrap the text in double tildes (~~).

Strikeouts text (~~like this~~).

Subscript (Options.subscript)

Support for subscript (H₂O). Wrap text with tildes (~).

H~2~O

A subscript cannot contain spaces. If you need a space inside the subscript, escape it (\ ).

subscript~with\ spaces~

Superscript (Options.superscript)

Support for superscript (2²=4). Wrap text with carets (^).

2^2^=4

A superscript cannot contain spaces. If you need a space inside the superscript, escape it (\ ).

superscript^with\ spaces^

Pipe tables (Options.pipeTables)

Allows parsing of tables whose cells are separated by vertical bars (|). Compatible with GitHub table syntax.

head | cells
-----|------
body | cells
more | cells

Cell alignment is also supported.

:----|:-----:|----:
left aligned | center aligned | right aligned
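A minimal sketch of rendering a pipe table from Dart. It uses Options.gfm because that predefined set is documented above as enabling pipeTables; the import path is the one from the resolver example in this README, and the output is not shown because it depends on the installed version.

```dart
import 'package:mdown/mdown.dart';

void main() {
  // The basic table from this section, as a multi-line string.
  const String table = '''
head | cells
-----|------
body | cells
more | cells
''';

  // Options.gfm enables rawHtml, tagFilter and pipeTables.
  print(markdownToHtml(table, Options.gfm));
}
```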

TeX Math between dollars (Options.texMathDollars)

Anything between two $ characters will be treated as inline TeX math. The opening $ must have a non-space character immediately to its right, while the closing $ must have a non-space character immediately to its left, and must not be followed immediately by a digit. Thus, $20,000 and $30,000 won’t parse as math. If for some reason you need to enclose text in literal $ characters, backslash-escape them and they won’t be treated as math delimiters.
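The delimiter rules above can be expressed as two small predicates. This is a sketch of the stated rules only, with hypothetical function names; it is not mdown's implementation and it ignores backslash escapes.

```dart
// Hypothetical helpers that restate the inline math delimiter rules.
const int _dollar = 0x24; // code unit of '$'

bool _isSpace(int unit) =>
    unit == 0x20 || unit == 0x09 || unit == 0x0A || unit == 0x0D;

bool _isDigit(int unit) => unit >= 0x30 && unit <= 0x39;

/// True if the '$' at index [i] may open inline math:
/// the character to its right must exist and must not be whitespace.
bool canOpenMath(String s, int i) =>
    s.codeUnitAt(i) == _dollar &&
    i + 1 < s.length &&
    !_isSpace(s.codeUnitAt(i + 1));

/// True if the '$' at index [i] may close inline math:
/// the character to its left must not be whitespace and
/// the character to its right must not be a digit.
bool canCloseMath(String s, int i) =>
    s.codeUnitAt(i) == _dollar &&
    i > 0 &&
    !_isSpace(s.codeUnitAt(i - 1)) &&
    (i + 1 >= s.length || !_isDigit(s.codeUnitAt(i + 1)));

void main() {
  const String prices = r'$20,000 and $30,000';
  print(canOpenMath(prices, 0)); // true
  print(canCloseMath(prices, 12)); // false: preceded by a space

  const String math = r'$x+y$';
  print(canOpenMath(math, 0)); // true
  print(canCloseMath(math, 4)); // true
}
```

In the price example the second $ cannot close the span, so nothing is treated as math.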

Anything between two $$ will be treated as display TeX math.

The HTML writer generates markup for the MathJax library, i.e. it wraps content in \(...\) or \[...\] and additionally wraps it in <span class="math inline"> or <span class="math display">. If you need custom classes for the span, you can override them with Options.inlineTexMathClasses and Options.displayTexMathClasses.

TeX Math between backslashed () or [] (Options.texMathSingleBackslash)

Causes anything between \( and \) to be interpreted as inline TeX math and anything between \[ and \] to be interpreted as display TeX math.

NOTE 1: This extension breaks escaping of ( and [.

NOTE 2: This extension is disabled by default.

TeX Math between double backslashed () or [] (Options.texMathDoubleBackslash)

Causes anything between \\( and \\) to be interpreted as inline TeX math and anything between \\[ and \\] to be interpreted as display TeX math.

NOTE: This extension is disabled by default.

Raw TeX (Options.rawTex)

Allows raw TeX blocks to be included in documents. Right now only environment blocks are supported. Everything between \begin{...} and \end{...} is treated as TeX and passed to the resulting HTML as is.

Custom reference resolver

A custom reference resolver may be required when parsing a document whose references are not explicitly defined, for example Dartdoc comments.

/**
 * Throws a [StateError] if ...
 * similar to [anotherMethod], but ...
 */

In that case, you can supply the parser with a resolver that provides all missing links.

import 'package:mdown/mdown.dart';
import 'package:mdown/ast/standard_ast_factory.dart';

String library = "mdown";
String version = "0.11.0";
Target linkResolver(String normalizedReference, String reference) {
  if (reference.startsWith("new ")) {
    String className = reference.substring(4);
    return astFactory.target(
        "http://www.dartdocs.org/documentation/$library/$version/index.html#$library/$library.$className@id_$className-",
        null);
  } else {
    return null;
  }
}

String res = markdownToHtml('Hello world!\n===', new Options(linkResolver: linkResolver));

mdown's People

Contributors

dikmax, enyo, kevmoo

mdown's Issues

how to enable table transformer

Hi, I use md_proc following the basic usage:
print(markdownToHtml('# Hello world!'));
but it does not work for tables. Could you please explain how to enable table support?
Thanks

Note on UTF-8 and smart punctuation

Hi Maxim and thanks for implementing this package, it seems very nice.

I started using it now and ran into some issues with "Smart punctuation" and Dart's default encoding (Latin 1). The issue was hard to find because you only get a somewhat random error, but after some debugging I realized that Latin 1 is the default encoding (for dart:io HttpClientRequest) and it simply does not have "smart quotes", see comment 4 here: http://stackoverflow.com/a/9282615

What I had to do to get things working was to explicitly set the charset like this on the headers:

res.headers.contentType = new ContentType("text", "html", charset: "utf-8");

Now it works without errors and I can render commonmark with md_proc server side and then send the output directly to a client.

This is not a bug per se in md_proc, but I think this might be common enough that it's worth documenting somewhere. Not sure exactly where, but maybe a small note in the readme on smart punctuation: if you plan to write that output to a HttpClientRequest, you need to set the encoding explicitly to utf-8 or some other charset that supports smart quotes.

Cheers, Robert

extension for \begin{...}, \end{...} latex math blocks

@dikmax I'm using your math extension now on my website, works great 👍

I have one other feature request. Many mathematicians also use \begin{...} \end{...} kind of code blocks. For example \begin{matrix}, \begin{align} etc. Those are supported by some markdown flavours, for example the one used at math.stackexchange.com.

Would be great if md_proc could also escape markdown for those blocks (and mark as <span class="math display">).

A workaround that is now possible, is to use $$...$$ around the equation, see:
http://kasperpeulen.github.io/mathedit/#/gist/097f6d551598a3a8487f

But I think people are so used to not having to do that from standard latex, math.stackexchange etc., so an extension would be great.

don't process math blocks

Would it be possible to have an extension that doesn't process math blocks (that could be rendered by MathJax afterwards)?

Use package:collection for collection comparison

Hi,

I was curious about what you're using package:parsers for and found this project (nice project!). I remarked that you implemented maps and iterable comparison functions, which are available in this (google-maintained) package already: http://pub.dartlang.org/packages/collection.

Also if you're porting Haskell code you might be interested in some of my other projects: persistent efficient immutable lists, maps and sets, propcheck a quickcheck/smallcheck test library which uses a port of testing-feat for generating enumerations, adts for generating Dart classes from algebraic datatype declarations, and pretty a port of wl-pprint.

possibility to define an extension?

This seems like a very nice markdown parser. Is it possible to define an extension? I have some trouble parsing documentation from Dart (dartdoc) with this parser, as dartdoc uses some non-standard markdown. For example the [myArg] notation.

But anyway, seems like excellent work ! 👍

Use a prefix for css class names

As the title suggests, this issue is about using a css prefix in md_proc to avoid any potential bugs that can come from using common names such as display or math.

My first idea for a css class name prefix would be md- or md-proc- to avoid any potential collisions.

See issue #13 for more discussion.
