Lightweight Markdown Parser in Java

License: Other


JParsedown

Lightweight Markdown Parser in Java: library or command line tool

JParsedown Library

JParsedown is a lightweight single-file library for converting Markdown to HTML. It is a translation of the Parsedown PHP library (version 1.8.0-beta-5) and preserves its features:

The library is compatible with Java 7 and later.

Additional features of JParsedown that are not (yet) available in the original Parsedown:

Download

Source file: JParsedown.java

JAR file: jparsedown-1.0.4.jar (50.8 KB)

Usage

JParsedown parsedown = new JParsedown();
System.out.println(parsedown.text("Hello _Parsedown_!")); // prints: <p>Hello <em>Parsedown</em>!</p>

You can also parse inline markdown only:

System.out.println(parsedown.line("Hello _Parsedown_!")); // prints: Hello <em>Parsedown</em>!

Security

See the Parsedown Security page.

Header IDs

GitHub automatically generates an anchor ID for each header in a Markdown file, which makes it easier to reference individual sections and build a table of contents. JParsedown attempts to generate the same IDs, so that intra-page links in the rendered HTML page work as they do on GitHub.

For example, ## Header IDs creates the following HTML:

<h2 id="header-ids">Header IDs</h2>

and can be referenced as follows:

[Header IDs](#header-ids)

ID generation in JParsedown follows these rules:

  1. The header text is converted to lower case.
  2. HTML entities such as &ndash; are removed.
  3. All characters other than letters, numbers, underscores, and whitespace are removed.
  4. Each whitespace character is replaced with a dash (-).
  5. The ID is URL-encoded to handle Unicode letters.
  6. Duplicate IDs have a dash and a number appended: header-ids, header-ids-1, header-ids-2, etc.
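
The rules above can be sketched in a few lines of Java. This is an illustration written from the rule list only, not JParsedown's actual code; the class name HeaderIdSketch, the method nextId, and the exact regular expressions are assumptions made for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the ID rules; not taken from JParsedown's source.
class HeaderIdSketch {
    // How many times each base ID has been produced so far.
    private final Map<String, Integer> seen = new HashMap<String, Integer>();

    String nextId(String headerText) {
        String id = headerText.toLowerCase(Locale.ROOT);   // rule 1: lower case
        id = id.replaceAll("&[a-z0-9#]+;", "");            // rule 2: drop HTML entities (assumed pattern)
        id = id.replaceAll("[^\\p{L}\\p{N}_\\s]", "");     // rule 3: keep letters, numbers, _, whitespace
        id = id.replaceAll("\\s", "-");                    // rule 4: whitespace becomes a dash
        try {
            id = URLEncoder.encode(id, "UTF-8");           // rule 5: URL-encode Unicode letters
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);            // UTF-8 is always supported
        }
        Integer count = seen.get(id);                      // rule 6: number the duplicates
        seen.put(id, count == null ? 1 : count + 1);
        return count == null ? id : id + "-" + count;
    }

    public static void main(String[] args) {
        HeaderIdSketch ids = new HeaderIdSketch();
        System.out.println(ids.nextId("Header IDs")); // prints: header-ids
        System.out.println(ids.nextId("Header IDs")); // prints: header-ids-1
    }
}
```

One instance of the sketch corresponds to one page, because the duplicate counter has to be shared by all headers of that page.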

Page Title Detection

After a call to text(), JParsedown exposes the detected page title in the title field:

JParsedown parsedown = new JParsedown();
parsedown.text("# My Title\n\nMore text...");
System.out.println(parsedown.title); // prints: My Title

The string contains the best candidate for the HTML page title, which is the first header of the highest level present on the page. For example, if the page has no level-1 header but has several level-2 headers, the first of them becomes the title.

If the page does not contain any headers, title will be null.

Note: The Markdown in the title is not stripped or processed.
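
The selection rule described above (first header of the highest level present) can be illustrated with a small standalone sketch. It is a simplification and not JParsedown's implementation: it recognises only #-style headers and does not skip code blocks. The names TitleSketch and detectTitle are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the title rule; handles only "#"-style headers.
class TitleSketch {
    // One to six '#' characters, whitespace, then the header text.
    private static final Pattern ATX_HEADER = Pattern.compile("(#{1,6})\\s+(.+?)\\s*");

    static String detectTitle(String markdown) {
        String best = null;
        int bestLevel = 7; // larger than any real header level
        for (String line : markdown.split("\\r?\\n")) {
            Matcher m = ATX_HEADER.matcher(line);
            // Strict comparison keeps the first header among those of equal level.
            if (m.matches() && m.group(1).length() < bestLevel) {
                bestLevel = m.group(1).length();
                best = m.group(2); // Markdown inside the title is left as-is
            }
        }
        return best; // null when the page has no headers
    }

    public static void main(String[] args) {
        System.out.println(detectTitle("# My Title\n\nMore text...")); // prints: My Title
    }
}
```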

MD Links Conversion

GitHub documentation may contain links between MD files, such as [see other file](file.md#anchor). When documentation is converted to static HTML pages, it is often desirable to point these links at the corresponding HTML files, i.e. [see other file](file.html#anchor).

JParsedown provides a method setMdUrlReplacement(String) that specifies the replacement for .md extensions. For example, setMdUrlReplacement(".html") replaces .md in link URLs with .html.

The conversion is applied only to relative URLs, i.e. those that do not contain a colon (:).

Use setMdUrlReplacement(null) to disable the conversion (default behaviour).
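
The behaviour described in this section can be approximated as follows. This is a sketch under assumptions, not the library's code: the class MdLinkSketch and the method convert are hypothetical, and treating .md as an extension only at the end of the URL or right before # or ? is a guess about the matching rule.

```java
import java.util.regex.Matcher;

// Hypothetical sketch of the .md link rewrite; not JParsedown's source.
class MdLinkSketch {
    static String convert(String url, String replacement) {
        // null disables the conversion; a colon marks a non-relative URL.
        if (replacement == null || url.indexOf(':') >= 0) {
            return url;
        }
        // Replace ".md" only at the end of the path, before an optional #anchor or ?query.
        return url.replaceFirst("\\.md(?=$|[#?])", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(convert("file.md#anchor", ".html"));              // prints: file.html#anchor
        System.out.println(convert("https://example.com/file.md", ".html")); // prints: https://example.com/file.md
    }
}
```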

Performance

Benchmark results:

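| test file     | repeat | JParsedown      | Parsedown (PHP)         | flexmark-java           |
|---------------|--------|-----------------|-------------------------|-------------------------|
| cheatsheet.md | ×100   | 4.4 ms per item | 5.5 ms per item (×1.25) | 6.2 ms per item (×1.41) |
| cheatsheet.md | ×1000  | 2.4 ms per item | 5.4 ms per item (×2.25) | 2.4 ms per item (×1.00) |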

The benchmark excludes file loading and saving times; only the text() call is measured.

At the moment, JParsedown is not specifically optimised for performance. The speedup over the original Parsedown comes from the performance difference between Java and PHP. Note also how much the JIT helps Java on larger batches: the time per item drops from 4.4 ms at ×100 to 2.4 ms at ×1000.
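
For readers who want to take a comparable measurement, a per-item timing loop could look like the sketch below. It is not the harness used for the table above. The Converter interface and all names are hypothetical; in a real run the converter would wrap a call to text() on an already loaded string, so that file I/O stays outside the timed region.

```java
// Hypothetical timing sketch; not the benchmark that produced the published numbers.
class BenchmarkSketch {
    // Stand-in for any Markdown-to-HTML call, e.g. a wrapper around text().
    interface Converter {
        String convert(String markdown);
    }

    // Average wall-clock time per conversion, in milliseconds.
    static double millisPerItem(Converter converter, String markdown, int repeat) {
        long sink = 0; // consume the results so the loop is not optimised away
        long start = System.nanoTime();
        for (int i = 0; i < repeat; i++) {
            sink += converter.convert(markdown).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unreachable: lengths are non-negative");
        }
        return elapsed / 1e6 / repeat;
    }

    public static void main(String[] args) {
        Converter stub = new Converter() {
            public String convert(String markdown) {
                return "<p>" + markdown + "</p>"; // placeholder for a real parser
            }
        };
        System.out.println(millisPerItem(stub, "Hello", 1000) + " ms per item");
    }
}
```

Running the same loop with a small and a large repeat count shows the warm-up effect: early iterations run before the JIT has compiled the hot code, which raises the average for short runs.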

MD Tool

MD Tool is a JParsedown-based command line tool for converting Markdown files into HTML pages.

See MD Tool Readme
