tkman59 / sitemap-generator

This project forked from lgraubner/sitemap-generator

Easily create XML sitemaps for your website.

License: MIT License

Sitemap Generator

Installation

$ npm install -S sitemap-generator

Usage

var SitemapGenerator = require('sitemap-generator');

// create generator
var generator = new SitemapGenerator('http://example.com');

// register event listeners
generator.on('done', function (sitemap) {
  console.log(sitemap); // => prints xml sitemap
});

// start the crawler
generator.start();

The crawler fetches all folder URL pages and file types parsed by Google. If a robots.txt file is present, its rules are applied to each URL to decide whether it should be added to the sitemap. The crawler also does not fetch URLs from a page if the robots meta tag with the value nofollow is present, and ignores them completely if the noindex rule is present. The crawler is able to apply the base value to the links it finds.
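The robots meta tag handling can be pictured with a small sketch. The helper `robotsMetaDecision` is hypothetical and not part of sitemap-generator; it assumes that noindex keeps the page itself out of the sitemap and that nofollow stops the crawler from fetching links found on the page.

```javascript
// Hypothetical helper, not part of sitemap-generator's API.
// Takes the content attribute of a robots meta tag, e.g. "noindex, nofollow".
function robotsMetaDecision(content) {
  var rules = String(content || '')
    .toLowerCase()
    .split(',')
    .map(function (rule) { return rule.trim(); });

  return {
    // Assumption: noindex means the page is left out of the sitemap
    addToSitemap: rules.indexOf('noindex') === -1,
    // Assumption: nofollow means links on the page are not fetched
    followLinks: rules.indexOf('nofollow') === -1
  };
}

console.log(robotsMetaDecision('noindex, nofollow'));
console.log(robotsMetaDecision('index'));
```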

Options

You can provide some options to alter the behaviour of the crawler.

var generator = new SitemapGenerator('http://example.com', {
  restrictToBasepath: false,
  stripQuerystring: true,
});

Since version 5, port is no longer an option. If you are using the default ports for http/https, you are fine. If you are using a custom port, just append it to the URL.
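As a rough illustration of appending a port, the sketch below builds the start URL with Node's built-in URL class. The helper `buildStartUrl` and the port 8080 are made up for this example; the resulting string is what would be passed as the first argument to `new SitemapGenerator(...)`.

```javascript
// Hypothetical helper: returns a start URL with the given port appended.
// Uses Node's global WHATWG URL class (available in current Node versions).
function buildStartUrl(base, port) {
  var url = new URL(base);
  url.port = String(port);
  return url.href;
}

// Custom port: it becomes part of the URL.
console.log(buildStartUrl('http://example.com', 8080)); // http://example.com:8080/

// Default port for http: the URL class drops it, so nothing needs to be appended.
console.log(buildStartUrl('http://example.com', 80)); // http://example.com/

// With the package installed, the result would be used like this:
// var generator = new SitemapGenerator(buildStartUrl('http://example.com', 8080));
```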

restrictToBasepath

Type: boolean
Default: false

If you specify a URL with a path (e.g. example.com/foo/) and this option is set to true, the crawler will only fetch URLs matching example.com/foo/*. Otherwise it could also fetch example.com if a link to that URL is found.
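The matching rule can be sketched as a simple prefix check. This is only an illustration of the idea, not the library's implementation; `isWithinBasepath` is a hypothetical name.

```javascript
// Hypothetical check, not the library's implementation.
// Returns true if candidateUrl is on the same host and under the base path.
function isWithinBasepath(baseUrl, candidateUrl) {
  var base = new URL(baseUrl);
  var candidate = new URL(candidateUrl);
  return candidate.host === base.host &&
    candidate.pathname.indexOf(base.pathname) === 0;
}

console.log(isWithinBasepath('http://example.com/foo/', 'http://example.com/foo/bar')); // true
console.log(isWithinBasepath('http://example.com/foo/', 'http://example.com/')); // false
```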

stripQuerystring

Type: boolean
Default: true

Whether to strip query strings from URLs. If set to false, URLs with query strings like http://www.example.com/?foo=bar are treated as individual pages and added to the sitemap.
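To make the effect of stripping concrete, here is a minimal sketch using Node's URL class. The function `stripQuery` is a hypothetical stand-in for whatever the crawler does internally; it shows how two URLs that differ only in their query string collapse into one entry.

```javascript
// Hypothetical helper, shown only to illustrate the effect of stripping.
function stripQuery(url) {
  var parsed = new URL(url);
  parsed.search = ''; // drop everything after the "?"
  return parsed.href;
}

var found = [
  'http://www.example.com/?foo=bar',
  'http://www.example.com/?foo=baz'
];

// After stripping, both URLs are identical, so only one unique entry remains.
var unique = found.map(stripQuery).filter(function (url, index, all) {
  return all.indexOf(url) === index;
});

console.log(unique); // [ 'http://www.example.com/' ]
```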

Events

The Sitemap Generator emits several events using Node's EventEmitter.

fetch

Triggered when the crawler tries to fetch a resource. Passes the status and the URL as arguments. The status can be any HTTP status.

generator.on('fetch', function (status, url) {
  // log url
});
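One possible use of this event is to tally responses per status. The sketch below uses a plain EventEmitter as a stand-in so it runs without the package; with the real generator, the same listener would be attached via generator.on('fetch', ...). The status values 200 and 404 are illustrative, and the exact type of the status argument is an assumption.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for the generator so the sketch is self-contained.
var generator = new EventEmitter();
var statusCounts = {};

generator.on('fetch', function (status, url) {
  // count how often each status was seen
  statusCounts[status] = (statusCounts[status] || 0) + 1;
  console.log(status + ' ' + url);
});

// Simulated events; the real crawler would emit these while fetching.
generator.emit('fetch', 200, 'http://example.com/');
generator.emit('fetch', 200, 'http://example.com/about');
generator.emit('fetch', 404, 'http://example.com/missing');

console.log(statusCounts);
```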

ignore

Triggered if a URL matches a disallow rule in the robots.txt file. The URL will not be added to the sitemap. Passes the ignored URL as argument.

generator.on('ignore', function (url) {
  // log ignored url
});

clienterror

Triggered if there was a client-side error while fetching a URL. Passes the crawler error and additional error data as arguments.

generator.on('clienterror', function (queueError, errorData) {
  // log error
});

done

Triggered when the crawler has finished and the sitemap is created. Passes the created XML markup as callback argument. The second argument provides an object containing found URLs, ignored URLs and faulty URLs.

generator.on('done', function (sitemap, store) {
  // do something with the sitemap, e.g. save as file
});
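Saving the sitemap as a file could look roughly like the following. The helper `saveSitemap`, the file name and the sample XML string are assumptions made for this sketch; only Node's built-in fs, os and path modules are used.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: writes the XML string passed to the 'done' listener to disk.
function saveSitemap(sitemap, filePath) {
  fs.writeFileSync(filePath, sitemap, 'utf8');
  return filePath;
}

// With the real generator this would be wired up as:
// generator.on('done', function (sitemap) { saveSitemap(sitemap, 'sitemap.xml'); });

// Stand-alone demonstration with a placeholder XML string and a temp file.
var sampleXml = '<?xml version="1.0" encoding="UTF-8"?><urlset></urlset>';
var target = path.join(os.tmpdir(), 'sitemap-example.xml');
saveSitemap(sampleXml, target);
console.log(fs.readFileSync(target, 'utf8'));
```

A synchronous write keeps the sketch short; in a longer-running process, fs.writeFile with a callback may be the better fit.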
