russross / blackfriday

Blackfriday: a markdown processor for Go

License: Other

Go 100.00%

blackfriday's Introduction

Blackfriday

Blackfriday is a Markdown processor implemented in Go. It is paranoid about its input (so you can safely feed it user-supplied data), it is fast, it supports common extensions (tables, smart punctuation substitutions, etc.), and it is safe for all utf-8 (unicode) input.

HTML output is currently supported, along with Smartypants extensions.

It started as a translation from C of Sundown.

Installation

Blackfriday is compatible with modern Go releases in module mode. With Go installed:

go get github.com/russross/blackfriday

will resolve and add the package to the current development module, then build and install it. Alternatively, you can achieve the same if you import it in a package:

import "github.com/russross/blackfriday"

and go get without parameters.

Old versions of Go and legacy GOPATH mode might work, but no effort is made to keep them working.

Versions

The currently maintained and recommended version of Blackfriday is v2. It's being developed on its own branch: https://github.com/russross/blackfriday/tree/v2 and the documentation is available at https://pkg.go.dev/github.com/russross/blackfriday/v2.

It is go get-able in module mode at github.com/russross/blackfriday/v2.

Version 2 offers a number of improvements over v1:

Cleaned up API
A separate call to Parse, which produces an abstract syntax tree for the document
Latest bug fixes
Flexibility to easily add your own rendering extensions

Potential drawbacks:

Our benchmarks show v2 to be slightly slower than v1, currently by around 15%.
API breakage. If you can't afford modifying your code to adhere to the new API and don't care too much about the new features, v2 is probably not for you.
Several bug fixes are trailing behind and still need to be forward-ported to v2. See issue #348 for tracking.

If you are still interested in the legacy v1, you can import it from github.com/russross/blackfriday. Documentation for the legacy v1 can be found here: https://pkg.go.dev/github.com/russross/blackfriday.

Usage

v1

For basic usage, it is as simple as getting your input into a byte slice and calling:

output := blackfriday.MarkdownBasic(input)

This renders it with no extensions enabled. To get a more useful feature set, use this instead:

output := blackfriday.MarkdownCommon(input)

v2

For the most sensible markdown processing, it is as simple as getting your input into a byte slice and calling:

output := blackfriday.Run(input)

Your input will be parsed and the output rendered with a set of the most popular extensions enabled. If you want the most basic feature set, corresponding to the bare Markdown specification, use:

output := blackfriday.Run(input, blackfriday.WithNoExtensions())

Sanitize untrusted content

Blackfriday itself does nothing to protect against malicious content. If you are dealing with user-supplied markdown, we recommend running Blackfriday's output through an HTML sanitizer such as Bluemonday.

Here's an example of simple usage of Blackfriday together with Bluemonday:

import (
    "github.com/microcosm-cc/bluemonday"
    "github.com/russross/blackfriday/v2" // Run is part of the v2 API
)

// ...
unsafe := blackfriday.Run(input)
html := bluemonday.UGCPolicy().SanitizeBytes(unsafe)

Custom options, v1

If you want to customize the set of options, first get a renderer (currently only the HTML output engine), then use it to call the more general Markdown function. For examples, see the implementations of MarkdownBasic and MarkdownCommon in markdown.go.

Custom options, v2

If you want to customize the set of options, use blackfriday.WithExtensions, blackfriday.WithRenderer and blackfriday.WithRefOverride.

`blackfriday-tool`

You can also check out blackfriday-tool for a more complete example of how to use it. Download and install it using:

go get github.com/russross/blackfriday-tool

This is a simple command-line tool that allows you to process a markdown file using a standalone program. You can also browse the source directly on github if you are just looking for some example code:

https://github.com/russross/blackfriday-tool

Note that if you have not already done so, installing blackfriday-tool will be sufficient to download and install blackfriday in addition to the tool itself. The tool binary will be installed in $GOPATH/bin. This is a statically-linked binary that can be copied to wherever you need it without worrying about dependencies and library versions.

Sanitized anchor names

Blackfriday includes an algorithm for creating sanitized anchor names corresponding to a given input text. This algorithm is used to create anchors for headings when EXTENSION_AUTO_HEADER_IDS is enabled. The algorithm has a specification, so that other packages can create compatible anchor names and links to those anchors.

The specification is located at https://pkg.go.dev/github.com/russross/blackfriday#hdr-Sanitized_Anchor_Names.

SanitizedAnchorName exposes this functionality, and can be used to create compatible links to the anchor names generated by blackfriday. This algorithm is also implemented in a small standalone package at github.com/shurcooL/sanitized_anchor_name. It can be useful for clients that want a small package and don't need full functionality of blackfriday.
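
As a rough illustration of what an algorithm of this kind can look like, here is a minimal sketch. It is not the package's implementation and not a substitute for the specification linked above. It assumes the rule is: keep letters and digits, lowercase them, and collapse every run of other characters into a single dash, with no leading or trailing dash. The name `anchorName` is hypothetical; for real compatibility, use SanitizedAnchorName.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// anchorName is a hypothetical helper, written for illustration only.
// It keeps letters and digits (lowercased) and replaces each run of any
// other characters with a single '-', never at the start or the end.
func anchorName(text string) string {
	var b strings.Builder
	pendingDash := false
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsNumber(r) {
			// Emit a dash only between two kept characters.
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(unicode.ToLower(r))
		} else {
			pendingDash = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(anchorName("This is a header")) // this-is-a-header
}
```

A link to that heading would then use `#` followed by the returned string.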

Features

All features of Sundown are supported, including:

Compatibility. The Markdown v1.0.3 test suite passes with the --tidy option. Without --tidy, the differences are mostly in whitespace and entity escaping, where blackfriday is more consistent and cleaner.
Common extensions, including table support, fenced code blocks, autolinks, strikethroughs, non-strict emphasis, etc.
Safety. Blackfriday is paranoid when parsing, making it safe to feed untrusted user input without fear of bad things happening. The test suite stress tests this and there are no known inputs that make it crash. If you find one, please let me know and send me the input that does it.

NOTE: "safety" in this context means runtime safety only. In order to protect yourself against JavaScript injection in untrusted content, see the example under "Sanitize untrusted content" above.
Fast processing. It is fast enough to render on-demand in most web applications without having to cache the output.
Thread safety. You can run multiple parsers in different goroutines without ill effect. There is no dependence on global shared state.
Minimal dependencies. Blackfriday only depends on standard library packages in Go. The source code is pretty self-contained, so it is easy to add to any project, including Google App Engine projects.
Standards compliant. Output successfully validates using the W3C validation tool for HTML 4.01 and XHTML 1.0 Transitional.

Extensions

In addition to the standard markdown syntax, this package implements the following extensions:

Intra-word emphasis suppression. The _ character is commonly used inside words when discussing code, so having markdown interpret it as an emphasis command is usually the wrong thing. Blackfriday lets you treat all emphasis markers as normal characters when they occur inside a word.
Tables. Tables can be created by drawing them in the input using a simple syntax:
```
Name    | Age
--------|------
Bob     | 27
Alice   | 23
```
Fenced code blocks. In addition to the normal 4-space indentation to mark code blocks, you can explicitly mark them and supply a language (to make syntax highlighting simple). Just mark it like this:
```
```go
func getTrue() bool {
    return true
}
```
```
You can use 3 or more backticks to mark the beginning of the block, and the same number to mark the end of the block.

To preserve classes of fenced code blocks while using the bluemonday HTML sanitizer, use the following policy:
```
p := bluemonday.UGCPolicy()
p.AllowAttrs("class").Matching(regexp.MustCompile("^language-[a-zA-Z0-9]+$")).OnElements("code")
html := p.SanitizeBytes(unsafe)
```
Definition lists. A simple definition list is made of a single-line term followed by a colon and the definition for that term.
```
Cat
: Fluffy animal everyone likes

Internet
: Vector of transmission for pictures of cats
```
Terms must be separated from the previous definition by a blank line.
Footnotes. A marker in the text that will become a superscript number; a footnote definition that will be placed in a list of footnotes at the end of the document. A footnote looks like this:
```
This is a footnote.[^1]

[^1]: the footnote text.
```
Autolinking. Blackfriday can find URLs that have not been explicitly marked as links and turn them into links.
Strikethrough. Use two tildes (~~) to mark text that should be crossed out.
Hard line breaks. With this extension enabled (it is off by default in the MarkdownBasic and MarkdownCommon convenience functions), newlines in the input translate into line breaks in the output.
Smart quotes. Smartypants-style punctuation substitution is supported, turning normal double- and single-quote marks into curly quotes, etc.
LaTeX-style dash parsing is an additional option, where -- is translated into –, and --- is translated into —. This differs from most smartypants processors, which turn a single hyphen into an ndash and a double hyphen into an mdash.
Smart fractions, where anything that looks like a fraction is translated into suitable HTML (instead of just a few special cases like most smartypants processors). For example, 4/5 becomes <sup>4</sup>&frasl;<sub>5</sub>, which renders as ⁴⁄₅.
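
The LaTeX-style dash and smart fraction substitutions in the list above can be approximated with the standard library. This is only a sketch of the described behavior, not how Blackfriday's Smartypants code is written; `latexDashes`, `fractionRE`, and `smartFractions` are hypothetical names, and the pattern is deliberately naive (it would also rewrite things like dates written with slashes).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// latexDashes lists "---" before "--" so that, at any position, the
// longer pattern is tried first.
var latexDashes = strings.NewReplacer("---", "&mdash;", "--", "&ndash;")

// fractionRE is an assumed pattern: digits, a slash, digits.
var fractionRE = regexp.MustCompile(`\b(\d+)/(\d+)\b`)

// smartFractions rewrites anything matching fractionRE into the
// <sup>/<sub> form shown in the text above.
func smartFractions(s string) string {
	return fractionRE.ReplaceAllString(s, "<sup>${1}</sup>&frasl;<sub>${2}</sub>")
}

func main() {
	fmt.Println(latexDashes.Replace("pages 3--5 --- roughly"))
	fmt.Println(smartFractions("about 4/5 done"))
}
```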

Other renderers

Blackfriday is structured to allow alternative rendering engines. Here are a few of note:

github_flavored_markdown: provides a GitHub Flavored Markdown renderer with fenced code block highlighting, clickable heading anchor links.

It's not customizable, and its goal is to produce HTML output equivalent to the GitHub Markdown API endpoint, except the rendering is performed locally.
markdownfmt: like gofmt, but for markdown.
LaTeX output: renders output as LaTeX.
bfchroma: provides convenience integration with the Chroma code highlighting library. bfchroma is only compatible with v2 of Blackfriday and provides a drop-in renderer ready to use with Blackfriday, as well as options and means for further customization.
Blackfriday-Confluence: provides a Confluence Wiki Markup renderer.
Blackfriday-Slack: converts markdown to slack message style

TODO

More unit testing
Improve Unicode support. It does not understand all Unicode rules (about what constitutes a letter, a punctuation symbol, etc.), so it may fail to detect word boundaries correctly in some instances. It is safe on all UTF-8 input.

License

Blackfriday is distributed under the Simplified BSD License

blackfriday's Issues

Smartypants removes solo backticks

We were trying to explain how to use a backtick to create a code block.

But for the input: Just use a ` backtick

The output is: Just use a backtick

This occurs when blackfriday.HTML_USE_SMARTYPANTS is enabled

I suspect (from glancing at the code) that it requires a pairing.

In instances where a pair does not exist, the punctuation should ideally be left as-is.

Create implicit header ids when no explicit ones are provided

Similar to what GitHub does, e.g.

# This is a header

becomes

<h1 id="this-is-a-header">This is a header</h1>

Even better if there is an option to generate clickable anchor tags for each header that can be used to copy full URLs to specific parts of the document (again, like GitHub).

I'm happy to work on this if there are no objections.

HTML_SAFELINK should trust relative links

Links such as:

[foo](/my/foo)

Are of a known/trusted protocol (http).
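
One way to express the distinction this issue asks for is to treat a link as relative when it carries neither a scheme nor a host. The sketch below uses only `net/url`; `isRelative` is a hypothetical name and this is not necessarily how the HTML_SAFELINK check is implemented.

```go
package main

import (
	"fmt"
	"net/url"
)

// isRelative is a hypothetical helper: it reports whether link has no
// scheme and no host, so it resolves against the page that contains it.
// Links that fail to parse are treated as not relative.
func isRelative(link string) bool {
	u, err := url.Parse(link)
	if err != nil {
		return false
	}
	return u.Scheme == "" && u.Host == ""
}

func main() {
	for _, l := range []string{"/my/foo", "http://example.com/", "javascript:alert(1)", "//example.com/x"} {
		fmt.Printf("%-22s relative=%v\n", l, isRelative(l))
	}
}
```

Under this definition a protocol-relative link such as `//example.com/x` is not considered relative, because it names a host.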

blackfriday produces invalidly nested HTML.

I'm noticing more and more that for various input blackfriday can produce HTML that is invalidly nested and that may break the layout of the page on which it appears.

Input:

<blockquote>
A list:
1. Foo</blockquote>
1. Bar

Which is a valid HTML blockquote containing a list with Foo, and outside of the blockquote a list with Bar.

Expected (something like):

<blockquote>
A list:
<ol>
<li>Foo<br/></li>
</ol>
</blockquote>
<ol>
<li>Bar</li>
</ol>

Output:

<blockquote><br/>
A list:</p>

<ol>
<li>Foo</blockquote><br/></li>
<li>Bar<br/></li>
</ol>

Note that the blockquote is now terminated within the list item, forcing the browser to close tags wherever it feels suitable, which then introduces phantom tags which will change the page layout.

Question: Is it the job of blackfriday to produce valid HTML?

On rendering GitHub Flavored Markdown and code organization/separation of responsibilities.

Hello,

I have some uncommitted code in my GOPATH that performs a task that I think is generally useful, and I want to discuss the most appropriate way to move it "upstream" (so I can finally start making some pull requests).

Specifically, the task is very narrow and well defined:

Render an arbitrary GitHub Flavored Markdown document, but:
- Do so using native Go code that is easy to go get, import and start using.
- Do so locally and fast (i.e., without internet access).

Effectively, the output should be the same HTML (or equivalent HTML that produces the same visual result) as what GitHub Markdown API produces. See https://developer.github.com/v3/markdown/#render-a-markdown-document-in-raw-mode for reference.

I want to provide a Go function that is very direct and doesn't require configuration:

// Best effort at generating GitHub Flavored Markdown-like HTML output locally.
func MarkdownGfm(input []byte) []byte
func WriteMarkdownGfm(w io.Writer, input []byte)

The closest I'm able to come to solving that task with existing Go code is by using blackfriday with custom extensions and html flags for blackfriday.HtmlRenderer, see here.

However, the main missing feature is code highlighting for fenced code blocks. It is possible to rely on client-side JavaScript code to apply that in post-processing, but the GitHub Markdown API does this as part of Markdown generation and that's what I want too.

To make that possible, blackfriday would need to be modified. I see three approaches:

Add func MarkdownGfm(input []byte) []byte to blackfriday directly, and import other packages that are required for it to work.
- The disadvantage is that it'd make blackfriday have more imports.
Make as few modifications to blackfriday as possible, and then I can create a new package which imports blackfriday, other packages, allowing for func MarkdownGfm(input []byte) []byte to exist.
- I think this is the best way to go. It would keep blackfriday lightweight and focused on providing the existing "highly customizable general purpose Markdown parser and renderer" functionality.
Make no modifications to blackfriday. Create a new package that largely duplicates the existing blackfriday.HtmlRenderer type and makes the required changes there.
- I don't like this because there'd be a lot of code duplication in the HtmlRenderer; this is a fallback for me if my change proposal is not accepted.

Given the trend that I'm seeing from the discussion in #90, IMO it's best to go with option 2.

The change to blackfriday would be an addition of an exported interface and one new field added to HtmlRendererParameters struct. Best visualized with a diff:

+type BlockCodeHighlighter interface {
+   // Highlights text using lang syntax and returns highlighted HTML output.
+   BlockCodeHighlight(text []byte, lang string) []byte
+}
+
type HtmlRendererParameters struct {
    // Prepend this text to each relative URL.
    AbsolutePrefix string
    // Add this text to each footnote anchor, to ensure uniqueness.
    FootnoteAnchorPrefix string
    // Show this text inside the <a> tag for a footnote return link, if the
    // HTML_FOOTNOTE_RETURN_LINKS flag is enabled. If blank, the string
    // <sup>[return]</sup> is used.
    FootnoteReturnLinkContents string
+   // If not nil, this is used to highlight contents of code blocks.
+   BlockCodeHighlighter BlockCodeHighlighter
}

Update: On second thought, I think I'd just use a BlockCodeHighlighter func (text []byte, lang string) []byte instead of an interface. Nothing to gain from it being an interface.

That's a rough draft (typed by hand based on my hacky uncommitted code), feedback is welcome.

Then, to fully implement MarkdownGfm(), it's just a matter of creating a bunch of highlighters for various languages. I imagine a high level highlighter that switches based on the lang parameter (and maybe this part can be added to blackfriday), and then uses lower level language-specific highlighters. syntaxhighlight package can be used (/cc @sqs), or similar highlighters that are language specific (I've made one for diff, and can use go/scanner for highlighting Go code specifically). These probably best live in their own packages.
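
To make the "high level highlighter that switches based on the lang parameter" idea concrete, here is a small sketch using the func form from the update above. Everything except the `BlockCodeHighlighter` signature is an assumption: `byLang` is a hypothetical helper, the toy "diff" highlighter is a placeholder, and falling back to plain HTML escaping for unknown languages is just one possible choice.

```go
package main

import (
	"fmt"
	"html"
)

// BlockCodeHighlighter mirrors the func type proposed in the update above.
type BlockCodeHighlighter func(text []byte, lang string) []byte

// byLang is a hypothetical helper. It returns a highlighter that
// dispatches to a language-specific highlighter when one is registered
// and otherwise just HTML-escapes the text.
func byLang(highlighters map[string]BlockCodeHighlighter) BlockCodeHighlighter {
	return func(text []byte, lang string) []byte {
		if h, ok := highlighters[lang]; ok {
			return h(text, lang)
		}
		return []byte(html.EscapeString(string(text)))
	}
}

func main() {
	h := byLang(map[string]BlockCodeHighlighter{
		// Toy highlighter: wraps the whole escaped block in one span.
		"diff": func(text []byte, _ string) []byte {
			return []byte(`<span class="diff">` + html.EscapeString(string(text)) + `</span>`)
		},
	})
	fmt.Printf("%s\n", h([]byte("+a <b>"), "diff"))
	fmt.Printf("%s\n", h([]byte("x < y"), "go"))
}
```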

Looking forward to hearing feedback on this proposal, thanks!

Edit: I forgot to mention there will be a css component that contains the styles for classes. The highlighters rely on those. It's just one more detail to consider for the final design/implementation.

Two remaining issues with the new HTML_SANITIZE_OUTPUT.

Hi, there are still two issues with the new HTML-parser based sanitization (#69) remaining after #71 and #70. /cc @mprobst One major, one minor.

First, the minor. When HTML_SANITIZE_OUTPUT is on, self-closing tags like <hr /> get rewritten as <hr>. See: https://github.com/russross/blackfriday/pull/70/files#r12218729.

Next, the major. When HTML_SANITIZE_OUTPUT is off, the following Markdown,

Here are some "quotes".

is converted to this HTML,

<p>Here are some &quot;quotes&quot;.</p>

Or, with HTML_USE_SMARTYPANTS on, then,

<p>Here are some &ldquo;quotes&rdquo;.</p>

However, when HTML_SANITIZE_OUTPUT is turned on, the html escaped quotes are replaced with output that doesn't render correctly as HTML (update: unless I explicitly set the charset to "utf-8", see the edits below).

From my limited testing, it seemed that replacing this code with the following fixed the problem.

wr.Write(tokenizer.Raw())

But that should be carefully validated.

I'm quite surprised the tests didn't catch this. Are there any tests for HTML_SANITIZE_OUTPUT with these symbols?

Edit: The 2nd "major" issue may be a non-issue, the escaped html is replaced by valid unicode characters that may get displayed correctly under the right conditions. But, is this an intended part of the sanitization process, and what is the motivation for it? I'm pretty sure it wasn't the case before.

Edit 2: Yeah, it turns out it's valid utf-8. The curly quotes show up correctly as “ if I explicitly set the charset to "utf-8" in the HTML, otherwise I see something else like â€œ in Chrome when viewing the generated HTML.

Add LaTeX Options for skipping header and footer

As for now this can be circumvented by shadowing the header and footer methods with stubs, but it'd be more convenient if there was an option similar to the "generate complete html document" thing for html output.

PDF output

Any thoughts on supporting PDF output as well?

panic

blackfriday.MarkdownCommon([]byte("[[t]](/t)"))

Calling the above code caused a panic.

panic: runtime error: index out of range

at go/src/github.com/russross/blackfriday/inline.go:184

LaTeX output is stubbed for EXTENSION_FOOTNOTES

XSS BY LINK

    [FUCKLINK][1]

    [1]: javascript:alert(window.document.cookie);

Blackfriday does not check that a link target is strictly a hypertext link, so a javascript: URL like the one above gets through and the script can be executed.
0.0

Not sure if invalid HTML, or bug in blackfriday...

package main

import (
    "os"

    "github.com/russross/blackfriday"
)

func main() {
    text := []byte(`Hello <span title="<">there</span> world.`)
    os.Stdout.Write(blackfriday.MarkdownBasic(text))

    text = []byte(`Hello <span title=">">there</span> world.`)
    os.Stdout.Write(blackfriday.MarkdownBasic(text))
}

Output:

<p>Hello <span title="<">there</span> world.</p>
<p>Hello <span title=">&quot;&gt;there</span> world.</p>

I'm having a hard time figuring out if <span title=">"> is valid HTML, or if the title attribute value should be escaped a la <span title="&gt;">. Browsers obviously accept both, and I think both are valid, but I'm not sure.

If it is valid HTML, then perhaps blackfriday output for 2nd line should be:

<p>Hello <span title=">">there</span> world.</p>

But, this may be hard to fix and not worth fixing? As far as I can tell, even GitHub's internal Markdown renderer has the same bug/behavior.

# in <pre><code> misinterpreted

If you put "#" in <pre><code>, then the symbol is interpreted as a header.
See this
Thanks,

Human-readable anchor values for headings

It would be really nice to have the anchor URLs be generated as a lowercase, hyphenated version of the title text. Something like:

# Welcome to my new blog article
--> website.com/posts/page#welcome-to-my-new-blog-article

I'm currently using your library as a part of Hugo.

Here is my original post:

gohugoio/hugo#387

TOC header ids

Currently, html TOC header ids are of the form #toc_<num>, which is not great for linking. They're not illustrative or resistant to reordering. I propose instead doing what sites like github do: use the header text (replacing special characters with '-').

Protect against script injection

In the interest of "safety against malicious user input", shouldn't there be an option to prevent the passthrough of script tags?

Code highlighting

It would be nice to be able to highlight code on server-side instead of doing that on client-side (right now I'm thinking about HTML and not Latex output). I see two possibilities here:

get blocks of code from the resulting HTML (by parsing it), highlight them and inject back,
provide (optional) highlighting function to blackfriday which will be called during rendering parts of the code.

I believe the latter is the better option both API- and performance-wise. Do you have any thoughts/opinion about this?

Alternatives to changing HtmlRenderer prototype?

I'm putting together a change to automatically convert all relative links in a Markdown file into absolute links during the render process. Currently this involves adding a new flag HTML_ABSOLUTE_LINKS and changing the prototype of HtmlRenderer to func HtmlRenderer(flags int, title string, css string, absolutePrefix string) Renderer. My particular application is generating RSS feeds from Markdown, where relative links won't correctly reference other files on the original server.

This change is fine for my purposes, but if I submit a pull request for it, then I'm concerned about changing the API and causing compile errors, albeit easily fixable ones, for all existing users.

One possibility is to create an HtmlRendererWithAbsolutePrefix function which would have the new prototype. But I worry that it would encourage the proliferation of a bunch of HtmlRendererWithSomeExtraArgument functions, which isn't too clean.

Any thoughts on this? Of course, if you don't want this change at all, then I can just keep it in my fork and the issue is moot :)
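
One alternative that avoids both breaking the existing prototype and adding a new constructor per argument is a single extra constructor that takes a parameters struct, similar in spirit to the HtmlRendererParameters struct that appears in another issue on this page. The sketch below is self-contained and uses hypothetical names throughout (`RendererParams`, `NewRenderer`, `NewRendererWithParams`, `link`); it is not blackfriday's API.

```go
package main

import (
	"fmt"
	"strings"
)

// RendererParams is a hypothetical bag of optional settings. New fields
// can be added later without changing any function signature.
type RendererParams struct {
	AbsolutePrefix string // prepended to each relative URL, if non-empty
}

// Renderer is a stand-in for an HTML renderer.
type Renderer struct {
	flags      int
	title, css string
	params     RendererParams
}

// NewRenderer keeps the original prototype, so existing callers compile.
func NewRenderer(flags int, title, css string) *Renderer {
	return NewRendererWithParams(flags, title, css, RendererParams{})
}

// NewRendererWithParams is the only additional constructor ever needed.
func NewRendererWithParams(flags int, title, css string, p RendererParams) *Renderer {
	return &Renderer{flags: flags, title: title, css: css, params: p}
}

// link rewrites root-relative hrefs when a prefix is configured.
func (r *Renderer) link(href string) string {
	if r.params.AbsolutePrefix != "" && strings.HasPrefix(href, "/") {
		return r.params.AbsolutePrefix + href
	}
	return href
}

func main() {
	r := NewRendererWithParams(0, "", "", RendererParams{AbsolutePrefix: "http://example.com"})
	fmt.Println(r.link("/posts/1"))
	fmt.Println(NewRenderer(0, "", "").link("/posts/1"))
}
```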

Block quote not supported?

Is it just me, or is the conversion of > He said, she said style block quotes not supported?

Am I just doing something wrong? I'm very new to Go, but I might attempt to implement it?

Metadata extension

Hello, I really like this package; however, a big headache for me is the inability to add metadata to documents. Apparently conventions are emerging for metadata, and I was wondering if you could add support for one. Here's an article that discusses one simple convention: http://hiltmon.com/blog/2012/06/18/markdown-metadata/

This would simplify development considerably.

Bug handling HTML blockquote if no markdown outside of the blockquote

Given this input:

<blockquote>Got this today... Leg hurts

![](http://farm6.static.flickr.com/5083/5258310683_f7c87edbc4_z.jpg)</blockquote>

The image inside the blockquote remains as markdown after processing with blackfriday.

Out:

<blockquote>Got this today... Leg hurts

![](http://farm6.static.flickr.com/5083/5258310683_f7c87edbc4_z.jpg)</blockquote>

Expected... the markdown image tag to become an HTML IMG element.

But, if you add any markdown outside of the blockquote, then the markdown image tag is converted to HTML though the HTML is invalid (nesting of paragraphs is wrong):

Input:

**bold**
<blockquote>Got this today... Leg hurts

![](http://farm6.static.flickr.com/5083/5258310683_f7c87edbc4_z.jpg)</blockquote>

Out:

<p><strong>bold</strong><br />
<blockquote>Got this today... Leg hurts</p>

<p><img src="http://farm6.static.flickr.com/5083/5258310683_f7c87edbc4_z.jpg" alt="" />
</blockquote></p>

Expected:

<p><strong>bold</strong></p>
<blockquote>Got this today... Leg hurts

<img src="http://farm6.static.flickr.com/5083/5258310683_f7c87edbc4_z.jpg" alt="" />
</blockquote>

Bug parsing emphasized links?

*A[B](C)* [D](E)
gives
<p>*A<a href="C">B</a>* <a href="E">D</a></p>
but I was expecting
<p><em>A<a href="C">B</a></em> <a href="E">D</a></p>

New version tag?

Would it be possible to get a new version tag? ie, v1.2 or even v2.0 depending on how many changes have gone in since that 2011 v1.1 tag? 😄

Smart quotes break when enclosing punctuation.

"what what"! -> “what what”!
"what what!" -> “what what!“

HTML Sanitize doesn't allow relative links

With HTML sanitize turned on, relative URLs are filtered out. I think this is because protocolAllowed is called on relative URLs, so adding !isRelativeLink(val) && to sanitize.go line 95 should fix this.

I haven't tested this, but it seems like the correct fix. I have verified that relative links are present in the output with sanitize disabled. Note that I've only tried this with image URLs, but it probably applies to others as well.

HTML_SANITIZE + HTML_NOFOLLOW_LINKS = no links

<a href="..." rel="nofollow"> does not pass the anchor regex.

Locally, I've worked around it by killing half the sanitizer, making it only check tags (as opposed to tags, alignments, attributes and attribute ordering.) I'm not certain that that's the best way to fix the issue.

Add ids to headings

It would be nice for headings (h1, h2, ...) to receive an id attribute, making it possible to refer to them in the generated HTML. The id could be generated from the content of the heading, or maybe with some callback...

support python markdown codehilite syntax for code block

I used Python's markdown package with the codehilite extension. Its syntax is quite similar to regular markdown, but it adds one line to specify the language of the code (the documentation is here). I wonder if blackfriday could also support this syntax?

I forked the project and did some changes, the commit is wangbin@056f292

Consider other days of the week?

I've recently stumbled upon https://github.com/microcosm-cc/bluemonday, which seems to be a Go library for HTML sanitizing.

Would it be a good idea or a bad idea to use it?

I haven't really looked at it closely yet, but I just wanted to start the discussion here.

Header #-tags without trailing whitespace after the # not recognized

Headings are not correctly parsed when using the #syntax without trailing whitespace.
Example:

# correct

#incorrect

This should generate (checked against the Daring Fireball Markdown processor):
<h1>correct</h1>
<h1>incorrect</h1>

but creates:
<h1>correct</h1>
<p>#incorrect</p>

Footnotes extension

This is a feature request for footnotes in the style of pandoc

http://johnmacfarlane.net/pandoc/README.html#footnotes

A Text Renderer

It would be useful to provide a Renderer that renders markdown to raw text which can be embedded safely in the page.

deck renderer

I've written a renderer for deck markup [1] [2] [3] [4] and I have a couple of questions:

Is it possible to add a flag so that the list renderer can distinguish between a list beginning with '-' and one beginning with '*'? I want to render the '-' without bullets, and the '*' with bullets.
For example:

  - item 1
  - item 2

should generate

  <list xp="10" yp="90" sp="2">
      <li>item 1</li>
      <li>item 2</li>
  </list>

but

  * item 1
  * item 2

generates:

 <list xp="10" yp="90" sp="2" type="bullet">
      <li>item 1</li>
      <li>item 2</li>
  </list>

The image parser places paragraph tags around an image. Is there a way to just get the plain image? For example:

   ![50,50,960,540](/Users/ajstarks/Images/desert960.jpg "The desert")

I'd like to render it in deck without the paragraph tags, as:

<image name="/Users/ajstarks/Images/desert960.jpg" xp="50" yp="50" width="960" height="540" caption="The desert" />

Finally, I've updated blackdown-tool to use a -format flag so that you can say:

$ blackdown-tool -format html ...
$ blackdown-tool -format latex ...
$ blackdown-tool -format deck ...

[1] https://github.com/ajstarks/deck
[2] http://godoc.org/github.com/ajstarks/deck
[3] https://github.com/ajstarks/deck/blob/master/examples/deck.xml
[4] https://github.com/ajstarks/deck/blob/master/examples/deck.pdf?raw=true

Compatibility with go weekly 2011-11-09 and newer

Just rename utf8 to unicode/utf8 in markdown.go.
Both gofix and goinstall -fix=true... work nicely.

All tabs are converted to spaces, even inside fenced code blocks.

Blackfriday currently converts all tabs to spaces (4 or 8, depending on config) as part of the pre-processing step. This irreversibly converts tabs even inside fenced code blocks.

This is bad for 2 reasons:

Primarily, it's frustrating that it changes the code you've pasted into fenced code blocks, making it not identical if you try to copy it from the Markdown output.
Additionally, in rare circumstances, it may change the behavior of code. As an example, consider a program that tries to count the number of bytes in a string literal that contains tabs. Additionally, it turns valid diff (.patch) blocks into invalid ones.
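
Both points can be seen with a few lines of Go. The function below is a hypothetical stand-in for a tab-expanding pre-processing step (it is not Blackfriday's code, and it counts columns in runes, which is only accurate for simple text). It shows that the expanded line is no longer byte-identical to the input.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTabs is a hypothetical helper: it replaces each tab with enough
// spaces to reach the next multiple of width.
func expandTabs(line string, width int) string {
	var b strings.Builder
	col := 0
	for _, r := range line {
		if r == '\t' {
			n := width - col%width
			b.WriteString(strings.Repeat(" ", n))
			col += n
			continue
		}
		b.WriteRune(r)
		col++
	}
	return b.String()
}

func main() {
	src := "x\ty" // what the author pasted into the fenced block
	out := expandTabs(src, 4)
	fmt.Printf("%q has %d bytes, %q has %d bytes\n", src, len(src), out, len(out))
}
```

Once the tab is gone there is no way to tell, from the output alone, whether the author typed a tab or spaces.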

Links in lists disable list

If links are left in a list as such:

blackfriday
blah

it renders as a paragraph with links and dashes (or whatever list item marker was used).

Is it possible to access the document for analysis rather than generating output?

For instance, iterating through all the parts of the document and see what format it is, etc.

Angle brackets in a bookmarklet link breaks link

This input:

<a href="javascript:(function(h){var i=h.indexOf('&');if(i>=0)
{url=h.substring(0,i);}else{url=h;}
resp=prompt('This is the address to use (Hit Ctrl+C or Cmd+C to copy)',url)})
(window.location.href);">YouTube Link</a>

produces this output:

<p><a href="javascript:(function(h){var i=h.indexOf('&');if(i>=0)
{url=h.substring(0,i);}else{url=h;}
resp=prompt(&lsquo;This is the address to use (Hit Ctrl+C or Cmd+C to copy)&rsquo;,url)})
(window.location.href);&ldquo;&gt;YouTube Link</a></p>

I expected to have it render thusly

<p><a href="javascript:(function(h){var i=h.indexOf('&');if(i&gt;=0)
{url=h.substring(0,i);}else{url=h;}
resp=prompt('This is the address to use (Hit Ctrl+C or Cmd+C to copy)',url)})
(window.location.href);">YouTube Link</a></p>

Escaping the angle bracket with a backslash made no difference.

A workaround is to preemptively convert the angle bracket (in this case the greater-than symbol) to an HTML entity, like so:

<a href="javascript:(function(h){var i=h.indexOf('&');if(i&gt;=0)
{url=h.substring(0,i);}else{url=h;}
resp=prompt('This is the address to use (Hit Ctrl+C or Cmd+C to copy)',url)})
(window.location.href);">YouTube Link</a>
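
When the link is generated by a program, the same workaround can be applied automatically. The sketch below is only an illustration: `anchor` is a hypothetical helper, the bookmarklet is abbreviated, and it relies on the standard library's `html.EscapeString`, which also escapes `&` and quote characters (browsers decode these entities before evaluating the attribute value).

```go
package main

import (
	"fmt"
	"html"
)

// anchor builds an <a> element and entity-escapes the href value so that
// characters such as > and " cannot end the tag or the attribute early.
// It is a hypothetical helper, not part of blackfriday.
func anchor(href, text string) string {
	return fmt.Sprintf(`<a href="%s">%s</a>`, html.EscapeString(href), html.EscapeString(text))
}

func main() {
	// Abbreviated version of the bookmarklet from the report.
	js := "javascript:(function(h){var i=h.indexOf('&');if(i>=0){url=h.substring(0,i);}else{url=h;}})(window.location.href);"
	fmt.Println(anchor(js, "YouTube Link"))
}
```

The resulting tag contains `i&gt;=0` instead of a raw `>` inside the attribute, matching the manual workaround above.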

Textile support?

Any plans to add textile support or willingness to include one if written by someone else?

In my project I need textile support, so I decided first to port upskirt to Go, to learn how it works, and then implement textile in a similar way.

Only after finishing the Go port (https://github.com/kjk/go-markup) did I find your project, which is slightly ahead.

There's little point in having 2 almost identical codebases but I really want to complete the phase 2 i.e. textile support. I would be happy to drop my port and just contribute that (and possibly other improvements) to blackfriday.

Are you at all interested in extending blackfriday that way?

github flavored Markdown configuration in core

Hi,

there are two out-of-the-box configurations for how markdown may be rendered: Basic and Common.

GitHub does some things differently from standard markdown - for example, a line break in Markdown is a line break in HTML.

blackfriday has everything that is needed to make a markdownGHF configuration.
As this configuration is quite popular, it would be nice to have it as a third out-of-the-box configuration.

The documentation on how to implement it on my own is not clear enough for me - maybe clearer documentation would be another good solution.

This would be the extension set for it:

EXTENSION_NO_INTRA_EMPHASIS
EXTENSION_HARD_LINE_BREAK
EXTENSION_AUTOLINK
EXTENSION_STRIKETHROUGH
EXTENSION_FENCED_CODE

see: https://help.github.com/articles/github-flavored-markdown

MarkdownCommon() renders table tags and then strips them

As near as I can tell, blackfriday.MarkdownCommon() renders table elements (if present) and then strips them via the HTML_SANITIZE_OUTPUT HtmlRenderer flag. Possibly related to #64.

Test Case

package main

import (
    "fmt"
    "github.com/russross/blackfriday"
)

func main() {
    input := `Name    | Age
--------|------
Bob     | 27
Alice   | 23
`
    output := blackfriday.MarkdownCommon([]byte(input))
    fmt.Println(string(output))
}

Expected Output

<table>
<thead>
<tr>
<th>Name</th>
<th>Age</th>
</tr>
</thead>

<tbody>
<tr>
<td>Bob</td>
<td>27</td>
</tr>

<tr>
<td>Alice</td>
<td>23</td>
</tr>
</tbody>
</table>

Actual Output

Name
Age





Bob
27



Alice
23

I noticed that if you comment out htmlFlags |= HTML_SANITIZE_OUTPUT on line 239 in markdown.go, you get the expected result.

Implement HTML description lists

There is one syntax extension I had when using a PHP Markdown parser: the ability to write HTML description lists with the following syntax:

Cat
: Fluffy animal everyone likes
Internet
: Vector of transmission for pictures of cats

The corresponding HTML output would be the following:

<dl>
<dt>Cat</dt>
<dd>Fluffy animal everyone likes</dd>
<dt>Internet</dt>
<dd>Vector of transmission for pictures of cats</dd>
</dl>

Could you implement such functionality? I would have done it myself, only I have no idea how your markdown parser works or how to modify it.

Keep up the good work!

render markdown to template include ##

I have a struct like this:

type Page struct {
  Content  string
}

then I read a markdown file and assign to a variable:

data, err := ioutil.ReadFile("a.md")
if err != nil {
  log.Fatal(err) // stop early if the file cannot be read
}
page.Content = markdownRender(data)

The markdown file looks like this:

##Hello World

###Holo Go

Then I pass it to a markdown render function, which returns a string value:

func markdownRender(content []byte) string {
  htmlFlags := 0
  htmlFlags |= blackfriday.HTML_USE_SMARTYPANTS
  htmlFlags |= blackfriday.HTML_SMARTYPANTS_FRACTIONS

  renderer := blackfriday.HtmlRenderer(htmlFlags, "", "")

  extensions := 0
  extensions |= blackfriday.EXTENSION_NO_INTRA_EMPHASIS
  extensions |= blackfriday.EXTENSION_TABLES
  extensions |= blackfriday.EXTENSION_FENCED_CODE
  extensions |= blackfriday.EXTENSION_AUTOLINK
  extensions |= blackfriday.EXTENSION_STRIKETHROUGH
  extensions |= blackfriday.EXTENSION_SPACE_HEADERS

  return string(blackfriday.Markdown(content, renderer, extensions))
}

Finally, I reference page.Content in an HTML template and generate a static HTML file:

{{.Content}}

But in the generated HTML, the browser (I tried Chrome and Safari) shows the following on the page itself, not just in the source:

<p>##Hello World ###Holo Go </p>

but I want it like this

Hello World

Holo Go

So, how can I do this?
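
Two separate things are probably going on here. The literal `<p>` tags on the page are what `html/template` produces when a plain `string` field is inserted, because it escapes the value. The unconverted `##Hello World` is likely due to `EXTENSION_SPACE_HEADERS`, which expects a space after the hashes. The sketch below addresses only the first point, under the assumption that the template is executed with `html/template`. The `renderPage` helper and the one-line template are hypothetical, and the `rendered` argument stands in for the output of `markdownRender`.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// Page mirrors the struct from the question, but Content is template.HTML
// so that html/template inserts it verbatim instead of escaping it.
type Page struct {
	Content template.HTML
}

// renderPage executes a small template against already-rendered HTML.
// The rendered argument stands in for the output of markdownRender.
func renderPage(rendered string) (string, error) {
	tmpl, err := template.New("page").Parse("<body>{{.Content}}</body>")
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	// Only convert to template.HTML when the HTML is trusted or sanitized.
	if err := tmpl.Execute(&buf, Page{Content: template.HTML(rendered)}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderPage("<h2>Hello World</h2>")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints: <body><h2>Hello World</h2></body>
}
```

Because `template.HTML` bypasses escaping, it should only wrap output that comes from trusted or sanitized input.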

Header #-tags without trailing whitespace after the # not recognized

Headings are not correctly parsed when using the # syntax without whitespace after the #.
Example:

# correct

#incorrect

This should generate (checked against the Daring Fireball Markdown processor):
<h1>correct</h1>
<h1>incorrect</h1>

but creates:
<h1>correct</h1>
<p>#incorrect</p>

How can I render markdown to a golang template(html or tmpl) with blackfriday?

I use the Martini framework. I have some markdown files and I want to render them as HTML in a tmpl/html template.

The markdown file looks like this:

title: A Test Demo

---
##ABC
> 123

And the template file looks like this:

<head>
  <title>{{name}}</title>
</head>

<body>
  <h2>{{abc}}</h2>
  <blockquote>
    <p>{{xyz}}</p>
  </blockquote>
</body>

I use blackfriday to parse the markdown, and it returns a []byte. As the next step, I want to render the markdown file into this template so that each block lands in the right place. How can I do this the right way? Or is there a better way to do it?
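
One possible first step, sketched below, is to separate the `title:` metadata from the markdown body before handing the body to blackfriday. `splitFrontMatter` is a hypothetical helper. It assumes the metadata consists of simple `key: value` lines above the first `---` line, as in the sample file.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates "key: value" lines above the first "---" line
// from the markdown body below it. If there is no "---" line, the whole
// input is treated as body.
func splitFrontMatter(src string) (meta map[string]string, body string) {
	meta = make(map[string]string)
	lines := strings.Split(src, "\n")
	for i, line := range lines {
		if strings.TrimSpace(line) != "---" {
			continue
		}
		// Everything before the separator is metadata.
		for _, m := range lines[:i] {
			parts := strings.SplitN(m, ":", 2)
			if len(parts) == 2 {
				meta[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
			}
		}
		return meta, strings.Join(lines[i+1:], "\n")
	}
	return meta, src
}

func main() {
	src := "title: A Test Demo\n\n---\n##ABC\n> 123\n"
	meta, body := splitFrontMatter(src)
	fmt.Printf("%q\n", meta["title"]) // prints: "A Test Demo"
	fmt.Printf("%q\n", body)          // prints: "##ABC\n> 123\n"
}
```

The title can then fill the template's title placeholder, while the body is rendered to HTML as a whole and inserted in one place. Mapping individual blocks such as the heading and the blockquote to separate placeholders would require walking the parsed document instead. Note that a file without metadata but with a `---` horizontal rule would be split incorrectly by this sketch.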

Fenced Code Blocks without a blank line before.

GitHub Flavored Markdown has a section on fenced code blocks, where it says "Keep in mind that both types of code blocks need to have a blank line before them".

github.com, however, ignores that statement, and renders markdown without such a blank line correctly (i.e. as if the blank line were present). blackfriday currently does not follow that behavior (tested with blackfriday.MarkdownCommon) and renders such markdown in an odd manner. Should it be changed to match that of GitHub?

The following markdown reproduces the issue:

some text without a blank line afterwards
```Go
someCode()
```

For a larger example, see this Markdown source [1], how github.com displays it [2], and how blackfriday.MarkdownCommon renders it [3].

[1] - https://raw.github.com/shurcooL/go-goon/8ddcefebec68d2dbcbac5225bf8760fbd4598c47/README.md
[2] - https://github.com/shurcooL/go-goon/blob/8ddcefebec68d2dbcbac5225bf8760fbd4598c47/README.md
[3] - http://dl.dropboxusercontent.com/u/8554242/available-for-2-weeks/fenced_code_blocks_blackfriday.html
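
Until the parser handles this case, one possible workaround is to preprocess the input so that every opening fence is preceded by a blank line. The sketch below is an assumption-laden illustration, not part of blackfriday: `padFences` is a hypothetical helper, it only recognizes backtick fences, and it assumes fences are properly paired.

```go
package main

import (
	"fmt"
	"strings"
)

// fence is the three-backtick marker that opens or closes a code block.
var fence = strings.Repeat("`", 3)

// padFences inserts a blank line before an opening code fence when the
// preceding line is not blank. Closing fences are left untouched.
func padFences(src string) string {
	var out []string
	inFence := false
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), fence) {
			if !inFence && len(out) > 0 && strings.TrimSpace(out[len(out)-1]) != "" {
				out = append(out, "")
			}
			inFence = !inFence
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	src := "some text without a blank line afterwards\n" +
		fence + "Go\nsomeCode()\n" + fence + "\n"
	fmt.Print(padFences(src))
}
```

The padded text can then be passed to blackfriday.MarkdownCommon as usual.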

Support image hyperlinks

blackfriday currently won't parse an image tag inside a hyperlink tag, like so:

[![alt text](image.png)](http://hyperli.nk)

It treats the image markdown as text and displays it.

README implies "go build" would fetch remote packages

With Go 1 and git installed:

    go get github.com/russross/blackfriday

will download, compile, and install the package into your `$GOPATH`
directory hierarchy. Alternatively, you can import it into a
project:

    import "github.com/russross/blackfriday"

and when you build that project with `go build`, blackfriday will be
downloaded and installed automatically.

The "alternatively" path just disappointed a newbie. Of course, go build doesn't download the missing remote packages.

Relative links broken in MarkdownCommon

The code in this gist used to render relative links and images correctly:

https://gist.github.com/ancientlore/fa1a084def32e0828a33

Now it skips them. It doesn't seem like it would be by design.

Escaping asterisks does not work

Typing
What is A all about?*
should be rendered in all italics, but isn't
Github also messes it up, cool. GFM != Markdown