slevithan / parseuri

Mighty but tiny URI parser

Home Page: https://slevithan.github.io/parseuri/demo/

License: MIT License

JavaScript 97.75% HTML 2.25%
uri uri-parser url url-parser

parseUri

parseUri is a mighty but tiny JavaScript URI/URN/URL parser that splits any URI into its parts (all of which are optional). Its combination of accuracy, comprehensiveness, and brevity is unrivaled (1KB min/gzip, with no dependencies).

Breaking changes

Version 2 was a major, breaking change that might require updating URI part names in your code and/or providing 'friendly' as a second argument to preserve the previous default handling of relative paths. See details in the v2 release notes, and compare results with v1.2.2 on the demo page. Version 3 was a minor update published on npm as pure ESM.

Compared to the URL constructor

parseUri has several advantages over the built-in URL constructor:

  • It gives you many additional properties (authority, userinfo, subdomain, domain, tld, resource, directory, filename, suffix) that aren’t available from URL.
  • URL throws if, for example, it is not given a protocol, and in many other cases of valid (but unsupported) and invalid URIs. parseUri makes a best-effort attempt even with partial or invalid URIs and is extremely good with edge cases.
  • URL’s rules don’t allow correctly handling many non-web protocols. For example, URL doesn’t throw on any of 'git://localhost:1234', 'ssh://myid@192.168.1.101', or 't2ab:///path/entry', but it also doesn’t get their details correct since it treats everything after <non-web-protocol>: up to ? or # as part of the pathname.
  • parseUri includes a “friendly” parsing mode (in addition to its default mode) that handles human-friendly URLs like 'example.com/file.html' as expected.
  • parseUri supports providing a list of second-level domains that should be treated as part of the top-level domain (ex: co.uk).

Conversely, parseUri is single-purpose and doesn’t apply normalization.

You can compare with URL’s results on the demo page.
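
As a small illustration of the throwing behavior described above, the following sketch uses only the built-in URL constructor. The tryUrl helper is a hypothetical name introduced here; it is not part of parseUri or of the URL API.

```javascript
// Hypothetical helper: returns a URL object, or null if the constructor throws.
function tryUrl(input) {
  try {
    return new URL(input);
  } catch (err) {
    return null;
  }
}

const withProtocol = tryUrl('https://example.com/file.html');
const withoutProtocol = tryUrl('example.com/file.html');
const relativePath = tryUrl('dir/file.html');

console.log(withProtocol !== null);    // true
console.log(withoutProtocol === null); // true (no protocol, so URL throws)
console.log(relativePath === null);    // true (relative path without a base)
```

A wrapper like this only hides the exception; it does not recover any parts from the input, which is the gap parseUri is meant to fill.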

Results / URI parts

parseUri returns an object with 20 URI parts as properties, plus queryParams, a URLSearchParams object that provides methods such as get(key) and getAll(key).
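
Because queryParams is a standard URLSearchParams object, it can be explored without parseUri at all. The sketch below builds one directly from a query string as a stand-in for uri.queryParams; groupParams is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: collect every key into an array of all its values.
function groupParams(params) {
  const grouped = {};
  for (const key of new Set(params.keys())) {
    grouped[key] = params.getAll(key);
  }
  return grouped;
}

// Stand-in for uri.queryParams, built directly from a query string.
const params = new URLSearchParams('q=x&q=&body=Sign%20me%20up!');

console.log(params.get('q'));       // first value for the key: 'x'
console.log(params.getAll('q'));    // every value for the key: ['x', '']
console.log(params.get('missing')); // null when the key is absent
console.log(groupParams(params));   // { q: ['x', ''], body: ['Sign me up!'] }
```

Note that values come back percent-decoded, so 'Sign%20me%20up!' is read as 'Sign me up!'.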

Here’s an example of what each part contains:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  href                                                    │
├────────────────────────────────────────────────────────────────┬─────────────────────────────────────────┤
│                             origin                             │                resource                 │
├──────────┬─┬───────────────────────────────────────────────────┼──────────────────────┬───────┬──────────┤
│ protocol │ │                     authority                     │       pathname       │ query │ fragment │
│          │ ├─────────────────────┬─────────────────────────────┼───────────┬──────────┤       │          │
│          │ │      userinfo       │            host             │ directory │ filename │       │          │
│          │ ├──────────┬──────────┼──────────────────────┬──────┤           ├─┬────────┤       │          │
│          │ │ username │ password │       hostname       │ port │           │ │ suffix │       │          │
│          │ │          │          ├───────────┬──────────┤      │           │ ├────────┤       │          │
│          │ │          │          │ subdomain │  domain  │      │           │ │        │       │          │
│          │ │          │          │           ├────┬─────┤      │           │ │        │       │          │
│          │ │          │          │           │    │ tld │      │           │ │        │       │          │
"  https   ://   user   :   pass   @ sub1.sub2 . dom.com  : 8080   /p/a/t/h/  a.html    ?  q=1  #   hash   "
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘

If this chart isn’t appearing correctly, view it on GitHub.

parseUri additionally supports IPv4 and IPv6 addresses, URNs, and many edge cases not shown here. See the extensive tests. References include RFC 3986 and WHATWG URL.
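
One way to read the chart is as a set of string compositions, from the innermost parts outward. The sketch below rebuilds the chart's example URI from its leaf values. The composeParts function is a hypothetical illustration of how the boxes nest; it is not how parseUri works internally, and it assumes every part is present, which real URIs do not guarantee.

```javascript
// Hypothetical illustration: compose the larger parts from the leaf parts,
// following the nesting of the boxes in the chart.
function composeParts(p) {
  const hostname = `${p.subdomain}.${p.domain}`;
  const host = `${hostname}:${p.port}`;
  const userinfo = `${p.username}:${p.password}`;
  const authority = `${userinfo}@${host}`;
  const origin = `${p.protocol}://${authority}`;
  const pathname = p.directory + p.filename;
  const resource = `${pathname}?${p.query}#${p.fragment}`;
  const href = origin + resource;
  return { hostname, host, userinfo, authority, origin, pathname, resource, href };
}

// Leaf values copied from the chart.
const composed = composeParts({
  protocol: 'https', username: 'user', password: 'pass',
  subdomain: 'sub1.sub2', domain: 'dom.com', port: '8080',
  directory: '/p/a/t/h/', filename: 'a.html', query: 'q=1', fragment: 'hash',
});

console.log(composed.authority); // user:pass@sub1.sub2.dom.com:8080
console.log(composed.resource);  // /p/a/t/h/a.html?q=1#hash
console.log(composed.href);
// https://user:pass@sub1.sub2.dom.com:8080/p/a/t/h/a.html?q=1#hash
```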

Parsing modes

parseUri has two parsing modes (default and friendly), specified via an optional second argument:

// Default mode
parseUri(uri);
// Also default mode
parseUri(uri, 'default');
// Friendly mode
parseUri(uri, 'friendly');

The default mode follows official URI standards, whereas friendly mode handles human-friendly URLs like 'example.com/file.html' as expected. Results are identical for any URI that starts with <protocol>://, <web-protocol>:, :, //, /, \, ?, or #.

To be precise, the only difference is that friendly mode doesn’t require <protocol>:, :, //, or other repeating slashes to signal the start of an authority. This has the following effects:

  • It allows starting a URI with an authority (as noted).
  • It therefore precludes proper handling for relative paths (without a leading / or \) such as 'dir/file.html'. Friendly mode considers it to start with hostname dir.
  • It avoids requiring // after a non-web protocol.
    • Note: The web protocols http, https, ws, wss, and ftp never require //; friendly mode extends this to non-web protocols.

You can compare results from default and friendly mode on the demo page.
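
The list of prefixes for which both modes give identical results can be written as a quick check. This is a minimal sketch based only on the description above, not parseUri's own code. The function name is hypothetical, and the protocol pattern assumes RFC 3986 scheme syntax matched case-insensitively.

```javascript
// Web protocols named in the text above.
const WEB_PROTOCOLS = ['http', 'https', 'ws', 'wss', 'ftp'];

// Assumption: a protocol follows RFC 3986 scheme syntax.
const PROTOCOL_WITH_SLASHES = /^[a-z][a-z0-9+.-]*:\/\//i;
const WEB_PROTOCOL = new RegExp(`^(?:${WEB_PROTOCOLS.join('|')}):`, 'i');
// Leading :, /, \, ?, or # (a leading // is covered by the single / case).
const LEADING_DELIMITER = /^[:/\\?#]/;

// Hypothetical helper: true when the text above says default and friendly
// mode produce identical results for this input.
function sameInBothModes(uri) {
  return (
    PROTOCOL_WITH_SLASHES.test(uri) ||
    WEB_PROTOCOL.test(uri) ||
    LEADING_DELIMITER.test(uri)
  );
}

console.log(sameInBothModes('git://localhost:1234'));  // true
console.log(sameInBothModes('http:example.com'));      // true
console.log(sameInBothModes('/dir/file.html'));        // true
console.log(sameInBothModes('example.com/file.html')); // false
console.log(sameInBothModes('dir/file.html'));         // false
```

A false result only means the input falls outside the listed prefixes, so the choice of mode may matter; it does not by itself prove that the two modes disagree.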

Examples

let uri = parseUri('https://a.b.example.com:80/@user/a/my.img.jpg?q=x&q=#hash');
uri.protocol // → 'https'
uri.host // → 'a.b.example.com:80'
uri.hostname // → 'a.b.example.com'
uri.subdomain // → 'a.b'
uri.domain // → 'example.com'
uri.port // → '80'
uri.resource // → '/@user/a/my.img.jpg?q=x&q=#hash'
uri.pathname // → '/@user/a/my.img.jpg'
uri.directory // → '/@user/a/'
uri.filename // → 'my.img.jpg'
uri.suffix // → 'jpg'
uri.query // → 'q=x&q='
uri.fragment // → 'hash'
uri.queryParams.get('q') // → 'x'
uri.queryParams.getAll('q') // → ['x', '']
uri.queryParams.get('not-present') // → null
uri.queryParams.getAll('not-present') // → []
// Also available: href, origin, authority, userinfo, username, password, tld

// Relative path
uri = parseUri('dir/file.html?q=x');
uri.hostname // → ''
uri.directory // → 'dir/'
uri.filename // → 'file.html'
uri.query // → 'q=x'

// Friendly mode allows starting with an authority
uri = parseUri('example.com/file.html', 'friendly');
uri.hostname // → 'example.com'
uri.directory // → '/'
uri.filename // → 'file.html'

// IPv4 address
uri = parseUri('ssh://myid@192.168.1.101');
uri.protocol // → 'ssh'
uri.username // → 'myid'
uri.hostname // → '192.168.1.101'
uri.domain // → ''

// IPv6 address
uri = parseUri('https://[2001:db8:85a3::7334]:80?q=x');
uri.hostname // → '[2001:db8:85a3::7334]'
uri.port // → '80'
uri.domain // → ''
uri.query // → 'q=x'

// Mailto
uri = parseUri('mailto:me@my.com?subject=Hey&body=Sign%20me%20up!');
uri.protocol // → 'mailto'
uri.authority // → ''
uri.username // → ''
uri.hostname // → ''
uri.pathname // → 'me@my.com'
uri.query // → 'subject=Hey&body=Sign%20me%20up!'
uri.queryParams.get('body') // → 'Sign me up!'

// Mailto in friendly mode
uri = parseUri('mailto:me@my.com', 'friendly');
uri.protocol // → 'mailto'
uri.authority // → 'me@my.com'
uri.username // → 'me'
uri.hostname // → 'my.com'
uri.pathname // → ''

/* Also supports e.g.:
- https://[2001:db8:85a3::7334%en1]/ipv6-with-zone-identifier
- git://localhost:1234
- file:///path/file
- tel:+1-800-555-1212
- urn:uuid:c5542ab6-3d96-403e-8e6b-b8bb52f48d9a?q=x
*/

Test and compare results on the demo page.
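
To make the relationship between pathname, directory, filename, and suffix concrete, here is a minimal sketch that reproduces the two pathname examples above. The splitPathname function is hypothetical and is not parseUri's implementation; it ignores backslashes and other edge cases that the library's tests cover.

```javascript
// Hypothetical sketch: split a pathname at its last slash and last dot.
function splitPathname(pathname) {
  const lastSlash = pathname.lastIndexOf('/');
  const directory = pathname.slice(0, lastSlash + 1);
  const filename = pathname.slice(lastSlash + 1);
  const lastDot = filename.lastIndexOf('.');
  // Assumption: no dot in the filename means an empty suffix.
  const suffix = lastDot === -1 ? '' : filename.slice(lastDot + 1);
  return { directory, filename, suffix };
}

console.log(splitPathname('/@user/a/my.img.jpg'));
// { directory: '/@user/a/', filename: 'my.img.jpg', suffix: 'jpg' }
console.log(splitPathname('dir/file.html'));
// { directory: 'dir/', filename: 'file.html', suffix: 'html' }
```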

Install

npm install parseuri

Use

import { parseUri, setTlds } from 'parseuri';

In browsers:

<script src="https://cdn.jsdelivr.net/npm/parseuri/dist/parseuri.min.js"></script>
<script>
  console.log(parseUri('https://example.com/'));
  // If needed, use `parseUri.setTlds`
</script>
