island205 / h2m

Tool for converting HTML to Markdown, like html2markdown.

Home Page: http://island205.github.io/h2m/

JavaScript 48.01% HTML 51.18% CSS 0.81%

h2m's Introduction

h2m

Tool for converting HTML to Markdown, like html2markdown.

online converter: http://island205.github.io/h2m/

Install

$ npm install h2m

How to use

h2m(html[, options])

example

var h2m = require('h2m')

var md = h2m('<h1>Hello World</h1>')
// md = '# Hello World'

options

  • converter: the converter to use. CommonMark (default) and MarkdownExtra are currently supported.
  • overides: customize how individual tags are converted:
h2m('<a href="http://island205.github.io/h2m/">h2m</a>', {
    overides: {
        a: function(node) {
          /**
          node is an object describing the a tag:
          {
            name: "a",
            attrs: {
              href: 'http://island205.github.io/h2m/'
            },
            md: 'h2m'
          }
          */
          return `[This is a link element](${node.attrs.href})`
        }
    }
})

// output: [This is a link element](http://island205.github.io/h2m/)
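
Since an override is just a function from a node to a Markdown string, it can be written and checked on its own before being handed to h2m. The sketch below assumes the node shape documented above (name, attrs, md). The helper name linkWithTitle and the use of a title attribute are illustrative assumptions, not part of h2m.

```javascript
// Hypothetical override: keep the link text h2m already converted (node.md)
// and append the title attribute when the tag has one.
function linkWithTitle(node) {
  // Guard against a missing attrs object (assumption: it may be absent).
  const attrs = node.attrs || {}
  const href = attrs.href || ''
  return attrs.title
    ? `[${node.md}](${href} "${attrs.title}")`
    : `[${node.md}](${href})`
}

// The option name is spelled `overides`, as in the README above.
const options = { overides: { a: linkWithTitle } }

// Exercise the override directly with a node shaped like the documented one.
const sample = {
  name: 'a',
  attrs: { href: 'http://island205.github.io/h2m/', title: 'h2m' },
  md: 'h2m'
}
console.log(options.overides.a(sample))
```

The options object can then be passed as the second argument, as in h2m(html, options).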

Command Line Tool

install

$ npm install h2m -g

usage

$ h2m -h

Usage: h2m [options]

Options:

  -V, --version      output the version number
  -f, --file <file>  HTML file path or an url adress (default: )
  -c, --clipboard    read HTML from clipboard
  -h, --help         output usage information

Convert a local file:

$ h2m -f index.html

converting HTML to Markdown

made by [@island205](https://github.com/island205)

Can't be convert? welcome to submit an [issue](https://github.com/island205/h2m/issues/new).

Convert an online url:

$ h2m -f https://baidu.com

Convert from clipboard:

$ h2m -c

Save result:

$ h2m -f https://google.com > google.md

Support

h2m currently supports two Markdown syntaxes: CommonMark and Markdown Extra.

CommonMark

  • ✅ br
  • ✅ em
  • ✅ strong
  • ✅ code
  • ✅ a
  • ✅ img
  • ✅ hr
  • ✅ ul, ol
  • ✅ pre
  • ✅ div
  • ✅ p
  • ✅ blockquote
  • ✅ h1 ~ h6

Markdown Extra

  • ✅ Special Attributes for headers link and image
  • ✅ Fenced Code Blocks
  • ✅ dl, dt, dd Definition Lists
  • ✅ abbr Abbreviations
  • ✅ table (thanks to @天凉's PR)

Contribution

PRs that implement other extended Markdown flavors, such as Markdown Extra or GFM, are welcome.

h2m's People

Contributors

daycool, gilbertsun, island205, nandenjin, voischev

h2m's Issues

<pre> breaks following inline <code>

require('h2m')(
    `
        <p>Text <code>code</code> text</p>
        <pre>code</pre>
        <p>Text <code>code</code> text</p>
        <pre><code>code</code></pre>
    `,
    { converter: 'MarkdownExtra' }
)

converts to

Text `code` text

```
code
```

Text code text

```
code
```

stdin and options for command line tool

Hi,

thanks for h2m - seems exactly what I was looking for to convert HTML in the clipboard to Markdown - and the best feature for me is that styles are not conserved, therefore the resulting Markdown becomes useable! Cool!

For a command line workflow, could you add the options also to the command line tool and also add stdin support, so that a one-liner without temp files could become possible?

Thanks, and keep the good work up!
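
Until the CLI reads stdin itself, a small wrapper script could cover this workflow. The following is a minimal sketch, not h2m's own code: readAll and convertStream are hypothetical helper names, and the converter is passed in as a plain (html, options) function so the same code works with require('h2m').

```javascript
// Collect an entire readable stream (for example process.stdin) into a string.
async function readAll(stream) {
  const chunks = []
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
  }
  return Buffer.concat(chunks).toString('utf8')
}

// Read all HTML from `stream`, then run it through `convert`.
// `convert` is any (html, options) => markdown function, such as require('h2m').
async function convertStream(stream, convert, options) {
  const html = await readAll(stream)
  return convert(html, options)
}

// Usage in a hypothetical wrapper script (left commented out so that this
// file does not block waiting for stdin):
//
//   convertStream(process.stdin, require('h2m'), { converter: 'MarkdownExtra' })
//     .then(md => process.stdout.write(md + '\n'))
```

With such a wrapper saved as, say, h2m-stdin.js (a hypothetical file name), a one-liner like `cat page.html | node h2m-stdin.js > page.md` avoids temporary files.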

Error when using an absolute path

chengang-MacBook:blog chen$ h2m /Volumes/M320/Downloads/tmp/blog/92.html
fs.js:663
return binding.open(pathModule.toNamespacedPath(path),
^

Error: ENOENT: no such file or directory, open '/Volumes/M320/Downloads/tmp/blog/Volumes/M320/Downloads/tmp/blog/92.html'
at Object.fs.openSync (fs.js:663:18)
at Object.fs.readFileSync (fs.js:568:33)
at loadHTMLFromFile (/usr/local/lib/node_modules/h2m/bin/h2m-cli.js:23:13)
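
The doubled path in the error message suggests that the working directory is being prepended to the argument even when the argument is already absolute. The actual code in h2m-cli.js is not shown here, so this is only a sketch of one common fix: Node's path.resolve leaves an absolute path untouched and prefixes the base directory only for relative paths. The function name resolveInputPath is hypothetical.

```javascript
const path = require('path')

// Resolve a CLI file argument against the working directory.
// path.resolve processes its arguments right to left and stops at the first
// absolute one, so an absolute `file` is returned as-is (normalized), while
// a relative `file` is resolved under `cwd`.
function resolveInputPath(cwd, file) {
  return path.resolve(cwd, file)
}

// By contrast, path.join(cwd, file) always concatenates, which reproduces the
// doubled path from the report when `file` is absolute.
console.log(resolveInputPath(process.cwd(), '92.html'))
console.log(resolveInputPath(process.cwd(), path.resolve('/tmp/blog/92.html')))
```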

Can't convert:Cannot read property 'src' of undefined

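Suggestion: support wildcards for batch conversion

I happen to have a batch of old blog posts to convert to Markdown, and h2m works well for that. I have a few suggestions that would make it more convenient to use:

  • h2m abc.html should write abc.md by default
  • h2m *.html should support wildcards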
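
Both suggestions mostly come down to deriving an output path from each input path. In POSIX shells a pattern such as *.html is expanded by the shell before the program starts, so a CLI would simply receive a list of file names. The sketch below shows only the naming step; markdownPathFor and planBatch are hypothetical helpers, not existing h2m functions.

```javascript
const path = require('path')

// Map an input HTML path to its default Markdown output path:
// same directory, same base name, ".md" extension.
function markdownPathFor(htmlPath) {
  const { dir, name } = path.parse(htmlPath)
  return path.join(dir, name + '.md')
}

// Pair every input file with its output file, e.g. for the arguments a shell
// passes after expanding *.html.
function planBatch(htmlPaths) {
  return htmlPaths.map(input => ({ input, output: markdownPathFor(input) }))
}

console.log(planBatch(['abc.html', path.join('posts', 'old.post.html')]))
```

Each pair could then be processed by reading input, converting it with h2m, and writing the result to output.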

Unsupported HTML tags?

Hi, I'm trying to implement converters for the <b> and <i> tags in my fork, but they fail the original tests, and I found that these tags are ignored intentionally.

I know that <b> and <strong>, and likewise <i> and <em>, have different purposes, and that the CommonMark spec maps emphasis to <strong> and <em>. That definition covers converting Markdown to HTML, though, so I wonder whether <b> and <i> could be supported in this package for convenience.

If you have an opinion on this, please let me know here.

Ref: My forked repo
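
One workaround that needs no change to h2m is to rewrite <b> and <i> into the <strong> and <em> tags listed as supported before converting. The following is a rough sketch under that assumption: normalizeEmphasisTags is a hypothetical helper, and because it is regex-based it will also rewrite matching tags inside <pre> blocks, comments, or scripts.

```javascript
// Presentational tags and the semantic tags they are rewritten to.
const TAG_ALIASES = { b: 'strong', i: 'em' }

// Rewrite <b>/<i> opening and closing tags, keeping any attributes.
// The pattern requires whitespace or ">" right after the tag name, so
// <br>, <body>, <blockquote>, and <img> are left alone.
function normalizeEmphasisTags(html) {
  return html.replace(/<(\/?)(b|i)(\s[^>]*)?>/gi, (match, slash, tag, attrs) =>
    `<${slash}${TAG_ALIASES[tag.toLowerCase()]}${attrs || ''}>`
  )
}

console.log(normalizeEmphasisTags('<b>bold</b> and <i class="note">italic</i><br>'))
```

The normalized string can then be passed to h2m in place of the original HTML.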

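Large files cannot be converted? [h2m]

Hello,

I installed h2m with npm on a PC (Windows 10, x64) and found that it cannot convert a large HTML file to Markdown, although a small HTML file cut out of that same large file converts fine.

Is this because the file is too large? Is there any way around it other than splitting the file? (I don't really know how to split HTML; the only tools I have are things like Notepad++.)

  1. Before conversion, both files were renamed to remove spaces and Chinese characters.
  2. Both files contain syntax problems, because I don't intend to publish them. I only want to extract the Markdown for each paragraph as input data (and then send it to my Trello email address to import it into Trello... whatever).

Thanks,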

Cyan

Tables are not handled correctly

Update, 2017-09-10 22:22:00:
Markdown supports HTML syntax, so a table can be used as-is without conversion. I suggest leaving tables unconverted.

meta The meta object matches the HTTP response message:
  • status: the 3-digit HTTP Status-Code (e.g., 200)
  • msg: the HTTP Reason-Phrase (e.g., OK)
response API-specific results

The following input is not handled correctly:

<table>
    <tbody>
    <tr>
     <th>meta</th>
     <td> The <code>meta</code> object matches the HTTP response message: 
      <ul class="tight">
       <li><code>status</code>: the 3-digit HTTP Status-Code (e.g., <code>200</code>)</li>
       <li><code>msg</code>: the HTTP Reason-Phrase (e.g., <code>OK</code>)</li>
      </ul></td>
    </tr>
    <tr>
     <th>response</th>
     <td>API-specific results</td>
    </tr>
    </tbody>
</table>

After conversion it becomes:

meta The `meta` object matches the HTTP response message: 
      
- `status`: the 3-digit HTTP Status-Code (e.g., `200`)
- `msg`: the HTTP Reason-Phrase (e.g., `OK`)

responseAPI-specific results

Spaces between bold and anchor not preserved

Input: <strong>Getting Started Documentation</strong> <a href="https://api.centro.rocks/v1/url/?u=AxhWpwHn">here</a>
Output: **Getting Started Documentation**[here](https://api.centro.rocks/v1/url/?u=AxhWpwHn)

Notice the space between the text and the link is lost.
