runt18 / html5ever

This project forked from rreverser/html5ever

High-performance browser-grade HTML5 parser

License: Other

Rust 94.24% Objective-C 0.28% HTML 4.39% Python 0.95% Shell 0.13%

html5ever

API Documentation

html5ever is an HTML parser developed as part of the Servo project.

It can parse and serialize HTML according to the WHATWG specs (aka "HTML5"). There are some omissions at present, most of which are documented in the bug tracker. html5ever passes all tokenizer tests from html5lib-tests, and most tree builder tests outside of the unimplemented features. The goal is to pass all html5lib tests, and also provide all hooks needed by a production web browser, e.g. document.write.

Note that the HTML syntax is a language almost, but not quite, entirely unlike XML. For correct parsing of XHTML, use an XML parser. (That said, many XHTML documents in the wild are serialized in an HTML-compatible form.)

html5ever is written in Rust, so it avoids the most notorious security problems from C, but has performance similar to a parser written in C. You can call html5ever as if it were a C library, without pulling in a garbage collector or other heavy runtime requirements.

Getting started in Rust

Add html5ever as a dependency in your Cargo.toml file:

[dependencies]
html5ever = "*"

Then take a look at examples/print-rcdom.rs and the API documentation.

Getting started in other languages

The C API is not yet complete, but it's already possible to do tokenization.

Bindings for Python and other languages are much desired.

Working on html5ever

To fetch the test suite, run:

git submodule update --init

Run cargo doc in the repository root to build local documentation under target/doc/.

Details

html5ever uses callbacks to manipulate the DOM, so it works with your choice of DOM representation. A simple reference-counted DOM is included.
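
The callback approach described above can be illustrated with a small, self-contained sketch. Everything here is hypothetical and chosen for illustration: the `DomSink` trait, its method names, the `RcSink` type, and the `build_sample` driver are not html5ever's actual API (see the API documentation for the real interface). The sketch only shows the general idea: the parser calls into a trait, and the trait's implementor decides how nodes are stored, here as a reference-counted tree.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical callback interface (NOT html5ever's real trait).
/// A parser would call these methods instead of building nodes itself.
trait DomSink {
    type Handle: Clone;
    fn create_element(&mut self, name: &str) -> Self::Handle;
    fn append_child(&mut self, parent: &Self::Handle, child: Self::Handle);
    fn append_text(&mut self, parent: &Self::Handle, text: &str);
}

/// One possible DOM representation: a reference-counted tree.
enum Node {
    Element { name: String, children: Vec<Handle> },
    Text(String),
}

type Handle = Rc<RefCell<Node>>;

/// Sink that stores nodes in the reference-counted tree above.
struct RcSink;

impl DomSink for RcSink {
    type Handle = Handle;

    fn create_element(&mut self, name: &str) -> Handle {
        Rc::new(RefCell::new(Node::Element {
            name: name.to_string(),
            children: Vec::new(),
        }))
    }

    fn append_child(&mut self, parent: &Handle, child: Handle) {
        // Only elements can have children; text parents are ignored.
        if let Node::Element { children, .. } = &mut *parent.borrow_mut() {
            children.push(child);
        }
    }

    fn append_text(&mut self, parent: &Handle, text: &str) {
        let node = Rc::new(RefCell::new(Node::Text(text.to_string())));
        self.append_child(parent, node);
    }
}

/// Stand-in for a parser: issues the callbacks a parser might make
/// for `<body><p>hello</p></body>`. It works with any `DomSink`.
fn build_sample<S: DomSink>(sink: &mut S) -> S::Handle {
    let body = sink.create_element("body");
    let p = sink.create_element("p");
    sink.append_text(&p, "hello");
    sink.append_child(&body, p);
    body
}

/// Naive serializer for the sketch (no escaping of text content).
fn serialize(node: &Handle) -> String {
    match &*node.borrow() {
        Node::Text(text) => text.clone(),
        Node::Element { name, children } => {
            let inner: String = children.iter().map(serialize).collect();
            format!("<{0}>{1}</{0}>", name, inner)
        }
    }
}

fn main() {
    let mut sink = RcSink;
    let root = build_sample(&mut sink);
    println!("{}", serialize(&root)); // prints "<body><p>hello</p></body>"
}
```

Because `build_sample` is generic over the sink, swapping in a different DOM representation (an arena, a browser's native nodes) would only require another `DomSink` implementation.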

html5ever exclusively uses UTF-8 to represent strings. In the future it will support other document encodings (and UCS-2 document.write) by converting input.
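
Until such conversion is built in, a caller could transcode input to UTF-8 before handing it to the parser. The following is a minimal sketch using only the Rust standard library; the helper names `utf16_to_utf8` and `latin1_to_utf8` are hypothetical and not part of html5ever.

```rust
/// Hypothetical helper: convert UTF-16 code units (as a UCS-2-style
/// source such as a script engine might supply) to a UTF-8 `String`.
/// Unpaired surrogates become U+FFFD instead of causing an error.
fn utf16_to_utf8(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

/// Hypothetical helper: decode strict ISO-8859-1 bytes to UTF-8.
/// Each byte maps directly to the Unicode code point of the same value.
/// Note: this is a simplification. Web content labeled ISO-8859-1 is
/// decoded as windows-1252 under the WHATWG Encoding Standard.
fn latin1_to_utf8(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // "café" as UTF-16 code units.
    let units: Vec<u16> = "café".encode_utf16().collect();
    let from_utf16 = utf16_to_utf8(&units);

    // "café" as ISO-8859-1 bytes (0xE9 is 'é').
    let from_latin1 = latin1_to_utf8(&[0x63, 0x61, 0x66, 0xE9]);

    // Both are now UTF-8 strings suitable for a UTF-8-only parser.
    println!("{} {}", from_utf16, from_latin1);
}
```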

The code is cross-referenced with the WHATWG syntax spec, and eventually we will have a way to present code and spec side-by-side.

html5ever builds against the official stable releases of Rust, though some optimizations are only supported on nightly releases.

html5ever's People

Contributors

akosthekiss, aroben, chris-morgan, chrisparis, frewsxcv, glennw, guillaumegomez, huonw, jdm, kichjang, kmcallister, kroisse, kstep, larsbergstrom, manishearth, mbrubeck, metajack, michaelwu, mmatyas, ms2ger, notriddle, nox, o01eg, ogeon, pcwalton, renato-zannon, sfackler, simonsapin, ygg01, zarazek
