one0day / rust-headless-chrome

This project forked from rust-headless-chrome/rust-headless-chrome

A high-level API to control headless Chrome or Chromium over the DevTools Protocol. It is the Rust equivalent of Puppeteer, a Node library maintained by the Chrome DevTools team.

License: MIT License

rust-headless-chrome's Introduction

Headless Chrome

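It is not 100% feature-compatible with Puppeteer, but there's enough here to satisfy most browser testing and web crawling use cases, and there are several 'advanced' features, such as taking screenshots of elements or the entire page, running JavaScript in the page, JS coverage, and automatically fetching a Chrome binary.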

Quick Start

use std::error::Error;

use headless_chrome::Browser;
use headless_chrome::protocol::cdp::Page;

fn browse_wikipedia() -> Result<(), Box<dyn Error>> {
    let browser = Browser::default()?;

    let tab = browser.wait_for_initial_tab()?;

    // Navigate to Wikipedia
    tab.navigate_to("https://www.wikipedia.org")?;

    // Wait for network/JavaScript/DOM to make the search box available
    // and click it.
    tab.wait_for_element("input#searchInput")?.click()?;

    // Type in a query and press `Enter`
    tab.type_str("WebKit")?.press_key("Enter")?;

    // We should end up on the WebKit page once navigated
    let elem = tab.wait_for_element("#firstHeading")?;
    assert!(tab.get_url().ends_with("WebKit"));

    // Take a screenshot of the entire browser window
    let _jpeg_data = tab.capture_screenshot(
        Page::CaptureScreenshotFormatOption::Jpeg,
        None,
        None,
        true)?;

    // Take a screenshot of just the WebKit infobox
    let _png_data = tab
        .wait_for_element("#mw-content-text > div > table.infobox.vevent")?
        .capture_screenshot(Page::CaptureScreenshotFormatOption::Png)?;

    // Run JavaScript in the page
    let remote_object = elem.call_js_fn(r#"
        function getIdTwice () {
            // `this` is always the element that you called `call_js_fn` on
            const id = this.id;
            return id + id;
        }
    "#, vec![], false)?;
    match remote_object.value {
        Some(returned_string) => {
            dbg!(&returned_string);
            assert_eq!(returned_string, "firstHeadingfirstHeading".to_string());
        }
        _ => unreachable!()
    };

    Ok(())
}
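
The Quick Start relies on waiting until an element becomes available. As a rough illustration of that idea only, here is a minimal polling sketch using the standard library. `wait_until` is a hypothetical helper, not part of headless_chrome, and the crate's own waiting logic may work differently.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `check` until it returns `Some(value)` or `timeout` elapses.
/// Hypothetical helper for illustration; not part of headless_chrome.
fn wait_until<T>(
    timeout: Duration,
    interval: Duration,
    mut check: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        // Try first, so an already-satisfied condition returns immediately.
        if let Some(value) = check() {
            return Some(value);
        }
        // Give up once the deadline has passed.
        if Instant::now() >= deadline {
            return None;
        }
        thread::sleep(interval);
    }
}
```

A caller would pass a closure that returns `Some(..)` once the condition holds, and treat `None` as a timeout.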

Auto fetching chrome binary

[dependencies]
headless_chrome = {git = "https://github.com/atroche/rust-headless-chrome", features = ["fetch"]}

For fuller examples, take a look at tests/simple.rs and examples.

Before running the examples, make sure to add the failure crate to the dependencies in your project's Cargo.toml.

What can't it do?

The Chrome DevTools Protocol is huge. Currently, Puppeteer supports way more of it than we do. Some of the missing features include:

  • Dealing with frames
  • Handling file picker / chooser interactions
  • Tapping touchscreens
  • Emulating different network conditions (DevTools can alter latency, throughput, offline status, 'connection type')
  • Viewing timing information about network requests
  • Reading the SSL certificate
  • Replaying XHRs
  • HTTP Basic Auth
  • Inspecting EventSources (aka server-sent events or SSEs)
  • WebSocket inspection

If you're interested in adding one of these features but would like some advice about how to start, please reach out by creating an issue or sending me an email at [email protected].

Related crates

  • fantoccini uses WebDriver, so it works with browsers other than Chrome. It's also asynchronous and based on Tokio, unlike headless_chrome, which has a synchronous API and is just implemented using plain old threads. Fantoccini has also been around longer and is more battle-tested. It doesn't support Chrome DevTools-specific functionality like JS Coverage.
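
To illustrate what a synchronous, thread-based style can look like in practice, the following sketch runs several blocking jobs concurrently with plain `std::thread`. `blocking_job` and `run_in_threads` are hypothetical stand-ins; a real job would call the headless_chrome API, which is not used here.

```rust
use std::thread;

/// Stand-in for a blocking call such as driving one browser tab.
/// Hypothetical; a real job would use the headless_chrome API instead.
fn blocking_job(url: &str) -> usize {
    url.len()
}

/// Runs one blocking job per URL on its own OS thread and collects
/// the results in input order.
fn run_in_threads(urls: Vec<String>) -> Vec<usize> {
    // Spawn every worker first so the jobs overlap in time.
    let handles: Vec<_> = urls
        .into_iter()
        .map(|url| thread::spawn(move || blocking_job(&url)))
        .collect();
    // Join in spawn order, which preserves the input order of results.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

No async runtime is involved: each job blocks its own thread, and the caller blocks on `join`.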

Testing

For debug output, set these environment variables before running cargo test:

RUST_BACKTRACE=1 RUST_LOG=headless_chrome=trace
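
The value `headless_chrome=trace` follows the `target=level` filter convention used by env_logger-style loggers. Which logger the test suite actually uses is an assumption here. The sketch below is a simplified parser for that convention, meant only to show how the value is read; `Directive` and `parse_filter` are hypothetical names.

```rust
/// One `RUST_LOG` directive: an optional target (module path) and a level.
#[derive(Debug, PartialEq)]
struct Directive {
    target: Option<String>,
    level: String,
}

/// Parses a comma-separated filter such as `headless_chrome=trace,warn`.
/// Simplified: a bare word is treated as a global level, whereas real
/// loggers may also accept a bare target name.
fn parse_filter(spec: &str) -> Vec<Directive> {
    spec.split(',')
        .map(str::trim)
        .filter(|part| !part.is_empty())
        .map(|part| match part.split_once('=') {
            Some((target, level)) => Directive {
                target: Some(target.to_string()),
                level: level.to_lowercase(),
            },
            None => Directive {
                target: None,
                level: part.to_lowercase(),
            },
        })
        .collect()
}
```

Under this reading, `headless_chrome=trace` enables trace-level output for the `headless_chrome` target only.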

Version numbers

Starting with v0.2.0, we're trying to follow SemVer strictly.
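
For dependents, this matters because Cargo's default (caret) requirements decide which upgrades count as compatible. The sketch below models that rule for plain `MAJOR.MINOR.PATCH` versions. `parse_version` and `is_compatible_upgrade` are hypothetical helpers, and pre-release or build metadata is deliberately not handled.

```rust
/// Parses "MAJOR.MINOR.PATCH", with an optional leading `v`.
/// Pre-release and build metadata are not supported in this sketch.
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.trim_start_matches('v').split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Returns whether `to` is an upgrade that a caret requirement on `from`
/// would accept: the leftmost non-zero component must stay the same.
fn is_compatible_upgrade(from: &str, to: &str) -> Option<bool> {
    let from = parse_version(from)?;
    let to = parse_version(to)?;
    if to < from {
        return Some(false); // downgrades never satisfy the requirement
    }
    let same_series = match from {
        (0, 0, _) => from == to,
        (0, minor, _) => to.0 == 0 && to.1 == minor,
        (major, _, _) => to.0 == major,
    };
    Some(same_series)
}
```

For example, a project depending on 0.2.0 would pick up 0.2.5 automatically but not 0.3.0.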

Troubleshooting

If you get errors related to timeouts, you likely need to enable sandboxing, either in the kernel or as a setuid sandbox. Puppeteer's documentation has some information about how to do that.

By default, headless_chrome will download a compatible version of Chrome to XDG_DATA_HOME (or the equivalent on Windows/Mac). This behaviour can be turned off so that the system version of Chrome is used instead (assuming Chrome is installed) by disabling the default features in your Cargo.toml:

[dependencies.headless_chrome]
default-features = false
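
To see where a download might land on Linux, the sketch below resolves the data directory the way the XDG Base Directory convention describes: use `XDG_DATA_HOME` when it is set and non-empty, otherwise fall back to `$HOME/.local/share`. `data_home` is a hypothetical helper; the crate's real lookup, and any subdirectory it creates, may differ.

```rust
use std::path::PathBuf;

/// Resolves the XDG data directory from explicit inputs, which keeps the
/// function easy to test. Hypothetical helper; not the crate's own code.
fn data_home(xdg_data_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match xdg_data_home {
        // An explicitly set, non-empty value wins.
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        // Otherwise fall back to the conventional default under $HOME.
        _ => home
            .filter(|h| !h.is_empty())
            .map(|h| PathBuf::from(h).join(".local").join("share")),
    }
}
```

In a real program the inputs would come from `std::env::var("XDG_DATA_HOME").ok()` and `std::env::var("HOME").ok()`, passed in via `as_deref()`.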

Contributing

Pull requests and issues are most welcome, even if they're just experience reports. If you find anything frustrating or confusing, let me know!

rust-headless-chrome's People

Contributors

atroche, awikman, billy-sheppard, chinedufn, danielpikilidis, djfm, djozis, dowwie, dvc94ch, elpiel, hollowman6, iwatakeshi, jeprojects, k-bx, leshow, lukaslueg, mdrokz, mlleprudhomme, nebnes, octaltree, qiaoruntao, r-darwish, ravi0213, ryuugan, sdimkov, shandanjay, striezel, terry90, thedan64, theduke
