This project forked from ada-url/ada

WHATWG-compliant and fast URL parser written in modern C++

Home Page: https://ada-url.com

License: Apache License 2.0

Ada

Ada is a fast and spec-compliant URL parser written in C++. The URL parsing specification is available on the WHATWG website.

The Ada library passes the full range of tests from the specification, across a wide range of platforms (e.g., Windows, Linux, macOS). It fully supports the relevant Unicode Technical Standard.

Requirements

Ada requires a recent C++ compiler supporting C++17. We test GCC 9 or better, LLVM 10 or better and Microsoft Visual Studio 2022. The project is otherwise self-contained and has no dependencies.

Ada is fast.

On a benchmark where we need to validate and normalize thousands of URLs found on popular websites, we find that ada can be several times faster than popular competitors (system: Apple MacBook 2022 with LLVM 14). In the run below, ada is about 3.5 times faster than servo url and about 7.8 times faster than CURL.

      ada ▏  188 ns/URL ███▏
servo url ▏  664 ns/URL ███████████▎
     CURL ▏ 1471 ns/URL █████████████████████████

Ada has improved the performance of the popular JavaScript environment Node.js:

Since Node.js 18, a new URL parser dependency was added to Node.js — Ada. This addition bumped the Node.js performance when parsing URLs to a new level. Some results could reach up to an improvement of 400%. (State of Node.js Performance 2023)

Bindings of Ada

We provide clients for different programming languages through our C API.

  • Rust: Rust bindings for Ada
  • Go: Go bindings for Ada
  • Python: Python bindings for Ada

Usage

Ada supports two types of URL instances, ada::url and ada::url_aggregator. The usage is the same in either case: we have a parsing function template, ada::parse, which can return either a result of type ada::result<ada::url> or of type ada::result<ada::url_aggregator> depending on your needs. The ada::url_aggregator class is smaller and is backed by a precomputed serialized URL string. The ada::url class is made of several separate strings for the various components (path, host, and so forth).
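
To make the difference between the two representations concrete, here is a minimal sketch of the two storage strategies using only the standard library. The types separate_url and aggregated_url, their members, and make_aggregated_example are hypothetical names invented for this illustration; they are not Ada's classes and do not reflect its actual internals.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical layout 1: every component owns its own string
// (in the spirit of ada::url, not its real definition).
struct separate_url {
  std::string protocol;  // e.g. "https:"
  std::string host;      // e.g. "www.google.com"
  std::string pathname;  // e.g. "/"

  // The serialized form has to be assembled on demand.
  std::string href() const { return protocol + "//" + host + pathname; }
};

// Hypothetical layout 2: one serialized buffer plus offsets into it
// (in the spirit of ada::url_aggregator, not its real definition).
struct aggregated_url {
  std::string buffer;        // the full serialized URL
  std::size_t protocol_end;  // one past the ':' that ends the protocol
  std::size_t host_start;    // index of the first character of the host
  std::size_t host_end;      // one past the last character of the host

  // Components are views into the buffer; nothing is copied.
  std::string_view href() const { return buffer; }
  std::string_view protocol() const {
    return std::string_view(buffer).substr(0, protocol_end);
  }
  std::string_view host() const {
    return std::string_view(buffer).substr(host_start, host_end - host_start);
  }
  std::string_view pathname() const {
    return std::string_view(buffer).substr(host_end);
  }
};

// Builds the aggregated form of "https://www.google.com/" by hand.
// A real parser would compute these offsets while parsing.
inline aggregated_url make_aggregated_example() {
  return aggregated_url{"https://www.google.com/", 6, 8, 22};
}
```

In this sketch, the getters of aggregated_url return views into the single buffer without allocating, while separate_url builds a new string each time href() is called. The views are only valid while the aggregated_url object they came from is alive and unmodified.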

Parsing & Validation

  • Parse and validate a URL from an ASCII or UTF-8 string
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
if (url) { /* URL is valid */ }

After calling 'parse', you must check that the result is valid before accessing it when you are not sure that it will succeed. The following code is unsafe:

ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("some bad url");
url->get_href();

Instead, check the result before using it:

ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("some bad url");
if(url) {
  // next line is now safe:
  url->get_href();
} else {
  // report a parsing failure
}

For simplicity, in the examples below, we skip the check because we know that parsing succeeds.

Examples

  • Get/Update credentials
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_username("username");
url->set_password("password");
// url->get_href() will return "https://username:password@www.google.com/"
  • Get/Update Protocol
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_protocol("wss");
// url->get_protocol() will return "wss:"
// url->get_href() will return "wss://www.google.com/"
  • Get/Update host
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_host("github.com");
// url->get_host() will return "github.com"
// you can use `url->set_hostname` depending on your usage.
  • Get/Update port
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_port("8080");
// url->get_port() will return "8080"
  • Get/Update pathname
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_pathname("/my-super-long-path");
// url->get_pathname() will return "/my-super-long-path"
  • Get/Update search/query
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_search("target=self");
// url->get_search() will return "?target=self"
  • Get/Update hash/fragment
ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://www.google.com");
url->set_hash("is-this-the-real-life");
// url->get_hash() will return "#is-this-the-real-life"

For more information about command-line options, please refer to the CLI documentation.

C wrapper

See the file include/ada_c.h for our C interface. We expect ASCII or UTF-8 strings.

#include "ada_c.h"
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

static void ada_print(ada_string string) {
  printf("%.*s\n", (int)string.length, string.data);
}

int main(int c, char *arg[] ) {
  ada_url url = ada_parse("https://username:password@host:8080/"
      "pathname?query=true#hash-exists");
  if(!ada_is_valid(url)) { puts("failure"); return EXIT_FAILURE; }
  ada_print(ada_get_href(url)); // prints https://username:password@host:8080/pathname?query=true#hash-exists
  ada_print(ada_get_protocol(url)); // prints https:
  ada_print(ada_get_username(url)); // prints username
  ada_set_href(url, "https://www.yagiz.co");
  if(!ada_is_valid(url)) { puts("failure"); return EXIT_FAILURE; }
  ada_set_hash(url, "new-hash");
  ada_set_hostname(url, "new-host");
  ada_set_host(url, "changed-host:9090");
  ada_set_pathname(url, "new-pathname");
  ada_set_search(url, "new-search");
  ada_set_protocol(url, "wss");
  ada_print(ada_get_href(url)); // will print wss://changed-host:9090/new-pathname?new-search#new-hash
  ada_free(url);
  return EXIT_SUCCESS;
}

When linking against the ada library, be mindful that ada requires access to the standard C++ library. For example, you may link with the C++ compiler.

If you grab our single-header C++ files (ada.cpp and ada.h), as well as the C header (ada_c.h), you can often compile a C program (demo.c) as follows under Linux/macOS systems:

c++ -c ada.cpp -std=c++17
cc -c demo.c
c++ demo.o ada.o -o cdemo
./cdemo
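
The final link step uses the C++ compiler because the C functions are thin wrappers around C++ objects. The following minimal sketch shows how such a wrapper is commonly structured. All names in it (demo_url, demo_string, demo_parse, demo_get_href, demo_free) are hypothetical and only mirror the general shape of a C interface; this is not the code behind ada_c.h, and it performs no URL parsing.

```cpp
#include <stddef.h>
#include <string>

// What a C caller sees: an opaque handle and a non-owning string.
extern "C" {
typedef void* demo_url;
typedef struct {
  const char* data;
  size_t length;
} demo_string;

demo_url demo_parse(const char* input);
demo_string demo_get_href(demo_url handle);
void demo_free(demo_url handle);
}

// What lives behind the handle: an ordinary C++ object.
// It uses std::string, which is why the C++ standard library
// must be available when the final program is linked.
namespace {
struct demo_url_impl {
  std::string href;
};
}  // namespace

// Allocates the C++ object and hands it back as an opaque pointer.
// (This sketch stores the input verbatim instead of parsing it.)
extern "C" demo_url demo_parse(const char* input) {
  return new demo_url_impl{std::string(input)};
}

// Returns a pointer/length pair referring to memory owned by the handle.
extern "C" demo_string demo_get_href(demo_url handle) {
  const auto* impl = static_cast<const demo_url_impl*>(handle);
  return demo_string{impl->href.data(), impl->href.size()};
}

// Destroys the C++ object; the handle must not be used afterwards.
extern "C" void demo_free(demo_url handle) {
  delete static_cast<demo_url_impl*>(handle);
}
```

A C program only needs the declarations inside the extern "C" block, but the object file that defines them refers to std::string and operator new, so the final link step has to pull in the C++ runtime.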

CMake dependency

See the file tests/installation/CMakeLists.txt for an example of how you might use ada from your own CMake project, after having installed ada on your system.

Installation

Homebrew

Ada is available through Homebrew. You can install Ada using brew install ada-url.

Contributing

Building

Ada uses CMake as its build system. We recommend running the following commands to build it locally.

  • Build: cmake -B build && cmake --build build
  • Test: ctest --output-on-failure --test-dir build

Windows users need additional flags to specify the build configuration, e.g. --config Release.

The project can also be built with Docker, using the default Dockerfile in the repository, with the following command.

docker build -t ada-builder . && docker run --rm -it -v ${PWD}:/repo ada-builder

Amalgamation

You may amalgamate all source files into only two files (ada.h and ada.cpp) by executing the Python 3 script singleheader/amalgamate.py. By default, the files are created in the singleheader directory.

License

This code is made available under the Apache License 2.0 as well as the MIT license.

Our tests include third-party code and data. The benchmarking code includes third-party code: it is provided for research purposes only and not part of the library.
