rehanhaider / pyhtmlproofer

Test internal & external links in your website or rendered HTML

Home Page: https://pypi.org/project/pyhtmlproofer/

License: GNU Affero General Public License v3.0


pyhtmlproofer

Check websites and static HTML pages for link rot.

Features

pyhtmlproofer can be used on:

  1. Static HTML pages (typically generated by an SSG). You can specify either files or directories to be checked.
  2. Web pages. You can specify a URL to be checked.

At the moment, pyhtmlproofer does the following:

  1. Checks for broken internal links in HTML files
  2. Checks whether external links in HTML files or on a website are valid
  3. Checks scripts and stylesheets referenced in HTML files
  4. Checks images referenced in HTML files

You can read more details in the "What's tested?" section below.

Roadmap

The following features are under development:

  1. Check for images and alt-text in HTML files
  2. Check Favicons
  3. Check optimal SEO meta tags
  4. Caching results
  5. Config file

Installation

Install pyhtmlproofer with pip:

pip install pyhtmlproofer

What's tested?

You can configure pyhtmlproofer to check:

  • a file
  • a directory or list of directories
  • a URL / Link

Links / Hyperlinks

a, link elements: pyhtmlproofer checks

  • whether the internal links are valid
  • whether the internal references (#in-page-links) are valid
  • whether the external links are valid
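
To make the in-page reference check concrete, here is a minimal sketch of the idea using only the Python standard library. It is not pyhtmlproofer's implementation; the `AnchorCollector` class and `broken_fragments` function are hypothetical names introduced for illustration. The sketch assumes a fragment link is valid when some element on the same page carries a matching `id` (or a matching `name` on an `a` element).

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor targets and the fragments that in-page links point to."""

    def __init__(self):
        super().__init__()
        self.targets = set()   # values of id= (and name= on <a>) attributes
        self.fragments = []    # fragments taken from href="#..." on <a> elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.targets.add(attrs["id"])
        if tag == "a":
            if attrs.get("name"):
                self.targets.add(attrs["name"])
            href = attrs.get("href") or ""
            # Only pure in-page links such as "#section" are considered here.
            if href.startswith("#") and len(href) > 1:
                self.fragments.append(href[1:])


def broken_fragments(html):
    """Return the in-page fragments that have no matching id or name."""
    collector = AnchorCollector()
    collector.feed(html)
    return [f for f in collector.fragments if f not in collector.targets]


page = '<h1 id="intro">Intro</h1><a href="#intro">ok</a><a href="#missing">bad</a>'
print(broken_fragments(page))  # ['missing']
```

Targets are collected for the whole page before fragments are compared, so a link that appears above the element it points to is still resolved correctly.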

Images

img elements: pyhtmlproofer checks

  • whether the internal image references are valid
  • whether the external image references are valid

Scripts

script elements: pyhtmlproofer checks

  • whether the internal script references are valid
  • whether the external script references are reachable
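
The checks for links, images, and scripts all rely on telling internal references from external ones. The sketch below shows one plausible way to draw that line with `urllib.parse` from the standard library. The `classify_reference` function and its four labels are assumptions made for illustration and may not match how pyhtmlproofer classifies references.

```python
from urllib.parse import urlparse


def classify_reference(ref):
    """Label an href/src value as 'external', 'other', 'fragment', or 'internal'."""
    parsed = urlparse(ref)
    # Absolute http(s) URLs and protocol-relative URLs (//host/path) leave the site.
    if parsed.scheme in ("http", "https") or parsed.netloc:
        return "external"
    # Any other scheme (mailto:, tel:, data:, ...) is not a file or page reference.
    if parsed.scheme:
        return "other"
    # A bare "#section" points at an element on the same page.
    if not parsed.path and parsed.fragment:
        return "fragment"
    # Everything else is a path relative to the site or the current file.
    return "internal"


for ref in ["https://example.com/app.js", "//cdn.example.com/app.js",
            "#top", "js/app.js", "mailto:me@example.com"]:
    print(ref, "->", classify_reference(ref))
```

An internal reference can then be validated against the file system, while an external one needs an HTTP request, which is why the two cases are listed separately above.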

Usage

a) To check a file:

import pyhtmlproofer as proofer
file = "path/to/file1.html"
proofer.file(file).check()

b) To check directories:

import pyhtmlproofer as proofer
directory_paths = ["path/to/1", "path/to/2"]
proofer.directories(directory_paths).check()

c) To validate URL(s):

import pyhtmlproofer as proofer
links = ["https://example.com", "https://cloudbytes.dev"]
proofer.links(links).check()

CLI

There is also a CLI that can be used:

$ pyhtmlproofer check -F <file_name>

Available Config Options

PROOFER_DEFAULTS = {
    "assume_extension": ".html",
    "directory_index_file": "index.html",
    "disable_external": False,
    "ignore_files": [],
    "ignore_urls": [],
    "enforce_https": True,
    "extensions": [".html"],
    "log_level": "ERROR",
    "report_to_file": True,
    "report_filename": "proofer_report",
}

You can override the default configuration options by passing a dictionary of options.

import pyhtmlproofer as proofer

options = {"log_level": "ERROR", "disable_external": True}
directory_paths = ["path/to/1", "path/to/2"]

proofer.directories(directory_paths, options=options).check()
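
As an illustration of what overriding defaults usually means, the sketch below merges a user dictionary over the default values listed above, so that any key the caller does not set keeps its default. The `resolve_options` helper is hypothetical and is not part of pyhtmlproofer's documented API; the library may merge options differently.

```python
# Defaults copied from the list of available config options above.
PROOFER_DEFAULTS = {
    "assume_extension": ".html",
    "directory_index_file": "index.html",
    "disable_external": False,
    "ignore_files": [],
    "ignore_urls": [],
    "enforce_https": True,
    "extensions": [".html"],
    "log_level": "ERROR",
    "report_to_file": True,
    "report_filename": "proofer_report",
}


def resolve_options(overrides=None):
    """Return a new dict in which user overrides take precedence over defaults."""
    # Hypothetical helper: builds a fresh dict, so PROOFER_DEFAULTS is not mutated.
    return {**PROOFER_DEFAULTS, **(overrides or {})}


merged = resolve_options({"disable_external": True})
print(merged["disable_external"])  # True (overridden)
print(merged["log_level"])         # ERROR (default kept)
```

Under this reading, passing `{"log_level": "ERROR"}` has no effect because it repeats the default; only `disable_external` changes behavior in the example above.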

Credits

pyhtmlproofer was inspired by the Ruby-based HTMLProofer and by the lack of Python-based alternatives. It is not a Python rewrite of HTMLProofer, however; it focuses on solving problems that I encountered while maintaining the CloudBytes/Dev> website.
