
html-differ's Introduction

html-differ

Compares two HTML documents.

The comparison algorithm

html-differ compares HTML using the following criteria:

  • <!DOCTYPE> declarations are case-insensitive, so the following two code samples will be considered to be equivalent:
<!DOCTYPE HTML PUBLIC "_PUBLIC" "_SYSTEM">
<!doctype html public "_PUBLIC" "_SYSTEM">
  • Whitespace (spaces, tabs, new lines, etc.) inside start and end tags is ignored during the comparison.

For example, the following two code samples will be considered to be equivalent:

<span id="1"></span>
<span id=
    "1"    ></span   >
  • Two respective lists of attributes are considered to be equivalent even if they are specified in different order.

For example, the following two code samples will be considered to be equivalent:

<span id="blah" class="ololo" tabIndex="1">Text</span>
<span tabIndex="1" id="blah" class="ololo">Text</span>
  • Two corresponding class attributes are considered to be equivalent if they contain the same set of class names, regardless of order, repetition, or extra whitespace.

For example, the following two code samples will be considered to be equivalent:

<span class="ab bc cd">Text</span>
<span class=" cd  ab bc bc">Text</span>
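
The attribute-order and class rules above can be pictured as a normalization step applied before comparison. The following is a minimal sketch of that idea, not html-differ's actual implementation: normalizeClass, normalizeAttributes and attributesEqual are hypothetical helper names, and the attributes of a tag are assumed to be available as a plain object of name/value strings.

```javascript
// Illustrative sketch only: hypothetical helpers, not part of html-differ's API.

// Turn a class string into a sorted list of unique class names.
function normalizeClass(value) {
    var names = value.split(/\s+/).filter(Boolean);
    return Array.from(new Set(names)).sort().join(' ');
}

// Produce an order-independent list of [name, value] pairs.
function normalizeAttributes(attrs) {
    return Object.keys(attrs).sort().map(function (name) {
        var value = name === 'class' ? normalizeClass(attrs[name]) : attrs[name];
        return [name, value];
    });
}

// Two attribute sets are equal when their normalized forms match.
function attributesEqual(a, b) {
    return JSON.stringify(normalizeAttributes(a)) ===
        JSON.stringify(normalizeAttributes(b));
}

console.log(attributesEqual(
    { id: 'blah', 'class': 'ab bc cd', tabIndex: '1' },
    { tabIndex: '1', 'class': ' cd  ab bc bc', id: 'blah' }
)); // true
```

Normalizing both sides first keeps the comparison itself a plain equality check.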

CAUTION!
html-differ does not check the validity of HTML; it only compares the two inputs using the criteria shown above and the specified options (see the list of possible options).

Install

$ npm install html-differ

API

HtmlDiffer

var HtmlDiffer = require('html-differ').HtmlDiffer,
    htmlDiffer = new HtmlDiffer(options);

where options is an object.

Options

ignoreAttributes: [Array]

Lists the attributes whose values will be ignored during the comparison (default: []).

Example: ['id', 'for']
The following two code samples will be considered to be equivalent:

<label for="random">Text</label>
<input id="random">

<label for="sfsdfksdf">Text</label>
<input id="sfsdfksdf">

compareAttributesAsJSON: [Array]

Lists the attributes whose values will be compared as JSON objects rather than as strings (default: []).
If the value of an attribute is invalid JSON or cannot be wrapped into a function, it will be compared as undefined.

Example: [{ name: 'data', isFunction: false }, { name: 'onclick', isFunction: true }]
The following two code samples will be considered to be equivalent:

<div data='{"bla":{"first":"ololo","second":"trololo"}}'></div>
<span onclick='return {"aaa":"bbb","bbb":"aaa"}'></span>

<button data='REALLY BAD JSON'></button>
<button onclick='REALLY BAD FUNCTION'></button>
<div data='{"bla":{"second":"trololo","first":"ololo"}}'></div>
<span onclick='return {"bbb":"aaa","aaa":"bbb"}'></span>

<button data='undefined'></button>
<button onclick='undefined'></button>

REMARK!
The first element of the array could be written in a short form as string:
['data', { name: 'onclick', isFunction: true }].
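
One way to picture this option is as "parse, then compare structurally". The sketch below is an assumption about how such a comparison could work, not the library's own code: parseAttribute, canonicalize and attributesEqualAsJSON are hypothetical names. Values that fail to parse become undefined, as described above, and object keys are sorted so that key order does not matter.

```javascript
// Illustrative sketch only: hypothetical helpers, not part of html-differ's API.

// Parse an attribute value as JSON, or evaluate it as a function body.
// Anything that cannot be parsed or evaluated becomes undefined.
function parseAttribute(value, isFunction) {
    try {
        return isFunction ? new Function(value)() : JSON.parse(value);
    } catch (e) {
        return undefined;
    }
}

// Recursively rebuild objects with their keys in sorted order.
function canonicalize(value) {
    if (Array.isArray(value)) {
        return value.map(canonicalize);
    }
    if (value && typeof value === 'object') {
        return Object.keys(value).sort().reduce(function (acc, key) {
            acc[key] = canonicalize(value[key]);
            return acc;
        }, {});
    }
    return value;
}

function attributesEqualAsJSON(a, b, isFunction) {
    return JSON.stringify(canonicalize(parseAttribute(a, isFunction))) ===
        JSON.stringify(canonicalize(parseAttribute(b, isFunction)));
}

console.log(attributesEqualAsJSON(
    '{"bla":{"first":"ololo","second":"trololo"}}',
    '{"bla":{"second":"trololo","first":"ololo"}}',
    false
)); // true
console.log(attributesEqualAsJSON('REALLY BAD JSON', 'undefined', false)); // true
```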

ignoreWhitespaces: Boolean

Makes html-differ ignore whitespaces (spaces, tabs, new lines etc.) during the comparison (default: true).

Example: true
The following two code samples will be considered to be equivalent:

<html>Text Text<head lang="en"><title></title></head><body>Text</body></html>
 <html>
 Text   Text
<head lang="en">
    <title>               </title>


            </head>

<body>
     Text

    </body>




</html>
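
For text content, the effect shown in the two samples above can be approximated by collapsing every run of whitespace to a single space and trimming the ends. This is only a sketch under that assumption, and normalizeText is a hypothetical helper, not a function exported by html-differ.

```javascript
// Illustrative sketch only: a hypothetical helper, not html-differ's implementation.

// Collapse runs of whitespace and trim leading and trailing whitespace.
function normalizeText(text) {
    return text.replace(/\s+/g, ' ').trim();
}

console.log(normalizeText('\n Text   Text\n') === 'Text Text'); // true
console.log(normalizeText('               ') === '');          // true
```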

ignoreComments: Boolean

Makes html-differ ignore HTML comments during the comparison (default: true).

REMARK!
Does not ignore conditional comments.

Example: true
The following two code samples will be considered to be equivalent:

<!DOCTYPE html>
<!-- comments1 -->
<html>
<head lang="en">
    <meta charset="UTF-8">
    <!--[if IE]>
        <link rel="stylesheet" type="text/css" href="all-ie-only.css" />
    <![endif]-->
    <!--[if !IE]><!-->
        <link href="non-ie.css" rel="stylesheet">
    <!--<![endif]-->
</head>
<body>
Text<!-- comments2 -->
</body>
</html>

<!DOCTYPE html>

<html>
<head lang="en">
    <meta charset="UTF-8">
    <!--[if IE]>
        <link href="all-ie-only.css" type="text/css" rel="stylesheet"/>
    <![endif]-->
    <!--[if !IE]><!-->
        <link href="non-ie.css" rel="stylesheet">
    <!--<![endif]-->
</head>
<body>
Text
</body>
</html>

ignoreEndTags: Boolean

Makes html-differ ignore end tags during the comparison (default: false).

Example: true
The following two code samples will be considered to be equivalent:

<span>Text</span>
<span>Text</spane>

ignoreDuplicateAttributes: Boolean

Makes html-differ ignore duplicate attributes of a tag during the comparison.
When a tag has several attributes with the same name, only the first one is taken for comparison; the others are ignored (default: false).

Example: true
The following two code samples will be considered to be equivalent:

<span id="blah" id="ololo">Text</span>
<span id="blah">Text</span>
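
The "first one wins" rule can be sketched as a simple filter over a tag's attributes in source order. The code below only illustrates the rule; dropDuplicateAttributes is a hypothetical helper, and representing attributes as an array of [name, value] pairs is an assumption made for the example.

```javascript
// Illustrative sketch only: a hypothetical helper, not html-differ's implementation.

// Keep the first occurrence of each attribute name and drop later ones.
function dropDuplicateAttributes(pairs) {
    var seen = new Set();
    return pairs.filter(function (pair) {
        if (seen.has(pair[0])) {
            return false;
        }
        seen.add(pair[0]);
        return true;
    });
}

console.log(dropDuplicateAttributes([['id', 'blah'], ['id', 'ololo']]));
// [ [ 'id', 'blah' ] ]
```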

Presets

  • bem - sets predefined options for BEM.

Usage

Passing of a preset via the constructor:

var HtmlDiffer = require('html-differ').HtmlDiffer,
    htmlDiffer = new HtmlDiffer('bem');

Redefinition of a preset via the constructor:

var HtmlDiffer = require('html-differ').HtmlDiffer,
    htmlDiffer = new HtmlDiffer({ preset: 'bem', ignoreAttributes: [] });

Methods

htmlDiffer.diffHtml

@param {String} - the first HTML code
@param {String} - the second HTML code
@returns {Array of objects} - array of diffs between the two HTML inputs

htmlDiffer.isEqual

@param {String} - the first HTML code
@param {String} - the second HTML code
@returns {Boolean}

Logger

var logger = require('html-differ/lib/logger');

Methods

logger.getDiffText

@param {Array of objects} - the result returned by htmlDiffer.diffHtml
@param {Object} - options:

  • charsAroundDiff: Number - the number of characters shown around each diff between the two HTML inputs (default: 40).

@returns {String}

logger.logDiffText

@param {Array of objects} - the result returned by htmlDiffer.diffHtml
@param {Object} - options:

  • charsAroundDiff: Number - the number of characters shown around each diff between the two HTML inputs (default: 40).

@returns - pretty logging of diffs

Example

var fs = require('fs'),
    HtmlDiffer = require('html-differ').HtmlDiffer,
    logger = require('html-differ/lib/logger');

var html1 = fs.readFileSync('1.html', 'utf-8'),
    html2 = fs.readFileSync('2.html', 'utf-8');

var options = {
        ignoreAttributes: [],
        compareAttributesAsJSON: [],
        ignoreWhitespaces: true,
        ignoreComments: true,
        ignoreEndTags: false,
        ignoreDuplicateAttributes: false
    };

var htmlDiffer = new HtmlDiffer(options);

var diff = htmlDiffer.diffHtml(html1, html2),
    isEqual = htmlDiffer.isEqual(html1, html2),
    res = logger.getDiffText(diff, { charsAroundDiff: 40 });

logger.logDiffText(diff, { charsAroundDiff: 40 });

Usage as a program

$ html-differ --help
Compares two HTML

Usage:
  html-differ [OPTIONS] [ARGS]

Options:
  -h, --help : Help
  -v, --version : Shows the version number
  --config=CONFIG : Path to a configuration JSON file
  --bem : Uses predefined options for BEM (deprecated)
  -p PRESET, --preset=PRESET : Name of a preset
  --chars-around-diff=CHARSAROUNDDIFF : The number of characters around the diff (default: 40)

Arguments:
  PATH1 : Path to the 1-st HTML file (required)
  PATH2 : Path to the 2-nd HTML file (required)

Example

$ html-differ path/to/html1 path/to/html2

$ html-differ --config=path/to/config --chars-around-diff=40 path/to/html1 path/to/html2

$ html-differ --preset=bem path/to/html1 path/to/html2

Configuration file

Consider the following config.json file:

{
    "ignoreAttributes": [],
    "compareAttributesAsJSON": [],
    "ignoreWhitespaces": true,
    "ignoreComments": true,
    "ignoreEndTags": false,
    "ignoreDuplicateAttributes": false
}

Masks

html-differ supports handling of masks in HTML.

For example, the following two code samples will be considered to be equivalent:

<div id="{{[a-z]*\s\d+}}">
<div id="text 12345">

Syntax

Masks in html-differ have the following syntax:

{{RegExp}}

where:

  • {{ – opening identifier of the mask.

  • RegExp – regular expression for matching with the corresponding value in another HTML. The syntax is similar to regular expressions in JavaScript written in a literal notation.

  • }} – closing identifier of the mask.

Escaping

The rules for escaping characters are the same as those used in JavaScript regular expressions written in literal notation.

For example, the following two code samples will be considered to be equivalent:

<div id="{{\d\.\d}}">
<div id="1.1">

If you want to use {{ or }} inside a mask, you should escape both curly braces, i.e. \{\{ or \}\}.

For example, the following two code samples will be considered to be equivalent:

<div class="{{a\{\{b\}\}c}}">
<div class="a{{b}}c">
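
To make the mask syntax concrete, the sketch below turns a masked value into a single anchored regular expression: text outside {{ }} is matched literally, and text inside is used as a regular expression source. This is one possible reading of the rules above, not html-differ's own matching code; escapeRegExp, maskToRegExp and matchesMask are hypothetical names.

```javascript
// Illustrative sketch only: hypothetical helpers, not part of html-differ's API.

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build an anchored RegExp from a value that may contain {{RegExp}} masks.
function maskToRegExp(masked) {
    var source = '';
    var pos = 0;
    var open;
    while ((open = masked.indexOf('{{', pos)) !== -1) {
        var close = masked.indexOf('}}', open + 2);
        if (close === -1) {
            break; // unclosed mask: treat the rest as literal text
        }
        source += escapeRegExp(masked.slice(pos, open)) +
            '(?:' + masked.slice(open + 2, close) + ')';
        pos = close + 2;
    }
    return new RegExp('^' + source + escapeRegExp(masked.slice(pos)) + '$');
}

function matchesMask(masked, actual) {
    return maskToRegExp(masked).test(actual);
}

console.log(matchesMask('{{[a-z]*\\s\\d+}}', 'text 12345'));    // true
console.log(matchesMask('{{a\\{\\{b\\}\\}c}}', 'a{{b}}c'));     // true
```

Because escaped braces such as \}\} never form the two-character sequence }}, a plain search for the closing identifier is enough in this sketch.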

html-differ's People

Contributors

baweaver, egavr, golyshevd, marcbachmann, shuhrat, tadatuta

html-differ's Issues

Update option 'ignoreWhitespace'

By default it is more correct to parse with whitespace characters taken into account, and to compare for equivalence without them.

Refactor 'diff-logger'

  1. Use

var showCharacters = options && options.showCharacters || 20;

instead of

options = options || { showCharacters: 20 };
var showCharacters = options.showCharacters;
showCharacters = showCharacters > 0 ? showCharacters : 20;

  2. Use `return '';` instead of `return output`, because output is always an empty string at that point.

  3.

if (color !== 'grey' || part.value.length < showCharacters * 2) {
    output += (!indexOfPart ? '\n' : '') + part.value[color];
    return;
}

  4. Rename the option from showCharacters to charsAroundDiff.

onclick

The content of this attribute should be compared as JS-objects

We need an option which sets what respective attributes' content will be compared as JS-objects, not as strings or JSONs

data-bem attribute

should be sorted

data-bem="{"menu-item":{"checkedText":"tw","text":"Twitter","val":{"id":1}}}"
data-bem="{"menu-item":{"checkedText":"tw","val":{"id":1},"text":"Twitter"}}"

Use `differ` wrap instead of `compare` method

Add a diffHtml method to the diff object and export this object instead of the compare method.

var diff = require('diff');

diff.diffHtml = function(oldStr, newStr) {
    return HtmlDiff.diff(oldStr, newStr);
};

module.exports = diff;

Add new tests

3 tests

  1. function isEqual without ignore parameters -> return true

  2. function isEqual without ignore parameters -> return false

  3. function isEqual with ignore parameters [ 'id', 'for' ] -> return true

Fix 'sortObj' function

The current implementation of this function is dangerous, because sorting the keys of an object is an unstable operation. This function can fail!

diff-logger

Please add an option to the log function that makes it return the log instead of printing it to the console.

Comparison bug!

Assume that you have two chunks of HTML

<span class="copyright link">Copyright content</spane>
<span class="copyright link">Copyright content</span>

Calling isEqual(html1, html2) currently returns true, while false is expected.

Modify bemDiff method

I want to make the bemDiff method return true or false depending on the comparison result, OR create a new method isEqualForBem which returns true or false.

Provide a shortcut for the BEM use case

I think we need a shortcut for

diffLogger.log(htmlDiffer.diffHtml(html1, html2, { ignoreHtmlAttrs: ['id', 'for'] } ));

so that we do not have to require 2 modules each time, as this is the most common use case for us.

Let's export something like bemDiff?

Add options for using as a program from a command line

$ bin/html-differ [options] path-to-html1 path-to-html2

  1. The ability to set what attributes should be always considered to be equal
  2. The ability to set the number of characters which will be logged before the diff and after it

Incorrect comparison

<tag sameparam="one" sameparam="two">

and

<tag sameparam="two">

are now reported as equal which is not correct.

Rename options

  1. ignoreHtmlAttrs --> ignoreAttributes
  2. compareHtmlAttrsAsJSON --> compareAttributesAsJSON
  3. ignoreHtmlComments --> ignoreComments

New version of parser

Recently I was looking for an alternative to the parser module used in html-differ.
I found htmlparser2 (https://github.com/fb55/htmlparser2), which is a fork of the module we use and has a compatible API. I strongly suggest switching the parser to htmlparser2, because it has an active maintainer who accepts pull requests, whereas htmlparser's last commit is two years old!

Mind blowing exports

module.exports = {
    HtmlDiff: HtmlDiff,
    HtmlDiffer: HtmlDiffer,
    diffHtml: htmlDiffer.diffHtml.bind(htmlDiffer),
    isEqual: htmlDiffer.isEqual.bind(htmlDiffer),

    bemDiff: bemDiff
};

Sorry, but I don't understand what HtmlDiff is and how it differs from HtmlDiffer. Maybe we should rethink the API and publish a major release?

Plugin

Could you please make a plugin for chai assertion library?

Add `strictMode`

I want to sum up these issues

  1. #73
<tag sameparam="one" sameparam="two">
<tag sameparam="one">

should be considered to be equivalent, because they are identical for browsers (according to HTML5 spec)

  2. #57
<span class="copyright link">Copyright content</span>
<span class="copyright link">Copyright content</spane>

should be considered to be equivalent, because they are identical for browsers too (according to HTML5 spec)

I propose to create an option strictMode. If it is set to true, the two cases shown above will be considered to be not equivalent.

Unfortunately, I can create this option only after the authors of htmlparser2 fix my issues #95 and #97.

Add more tests

  1. Unit tests for utils.js
  2. Raise coverage as high as possible!
