mafredri / cdp
Package cdp provides type-safe bindings for the Chrome DevTools Protocol (CDP), written in the Go programming language.
License: MIT License
Hi @mafredri, can you provide an example of injecting a CSS rule, for example:
body { height: auto !important; }
This would help with taking proper screenshots :)
Hi @mafredri ,
Trying to use this package for page automation.
The login page has an input of type text, but I get an rpc error when trying to set the input value. The input can be found on the page, though (I can print all of its attributes). Below are the error string and code. What am I missing here? It may be something really basic. Please help if you can.
Error :
cdp.DOM: SetNodeValue: rpc error: Can only set value of text nodes (code = -32000)
Code Snippet:
email, err := c.DOM.QuerySelector(ctx, cdpcmd.NewDOMQuerySelectorArgs(doc.Root.NodeID, "input#email"))
if err != nil {
	fmt.Println(err)
}
err = c.DOM.SetNodeValue(ctx, cdpcmd.NewDOMSetNodeValueArgs(email.NodeID, "[email protected]"))
if err != nil {
	fmt.Println(err)
}
Please provide an example of taking a screenshot of the viewport with a custom size.
Hi all, I want to get the console output, but I can't find a way to get it. Has anyone solved this?
Using this code I can properly screenshot most web pages:
https://gist.github.com/s3rj1k/0fc459f9c6aa7db42283b805c509e007
but it fails with pikabu.ru, where the output is somehow stacked.
A proper screenshot, taken using phantomjs, can be seen at https://dehtml.com/https://pikabu.ru
Any ideas on how to fix this?
Hi, I'm trying to understand the purpose and use of methods like devtool.New(...) and Get(), Create() etc. There doesn't seem to be any information either in the CDP GoDoc or the official DevTools Protocol Viewer. Where can I get more information on how to use these methods? Thanks.
I'm trying to use cdp to monitor network requests to follow a chain of redirects. Ideally I would manage to get all the visited URLs, but in the worst case I need the very last opened URL.
I understand the best way to do this with Chrome Devtools protocol is to monitor for Page.frameNavigated events and extract the frame URL. Quickly searching through the code it seems this event is not supported? What would be the best way to implement this using cdp?
Thanks for your work!
Hi.
Does this implementation have ways to capture all the network events, including the performance of each resource loaded?
Hello again)
I'm trying to run a script with Runtime.RunScript. The script itself is about 13,000 lines of code.
First I call Runtime.CompileScript and then Runtime.RunScript, but it always fails on Runtime.CompileScript, and the error message differs from time to time.
I tried running Runtime.CompileScript and Runtime.RunScript with a short script that returns a promise, and it works fine.
Is there any restriction on the script size? Or maybe there is another way to run such a long script?
This is how it looks in code:
exp, err := ioutil.ReadFile("./content.js")
if err != nil {
	panic(err)
}
if err := c.Runtime.Enable(ctx); err != nil {
	panic(err)
}
compileReply, err := c.Runtime.CompileScript(context.Background(), &cdpcmd.RuntimeCompileScriptArgs{Expression: string(exp), PersistScript: true})
if err != nil {
	// it panics all the time for a 13000loc script
	panic(err)
}
awaitPromise := true
runReply, err := c.Runtime.RunScript(ctx, &cdpcmd.RuntimeRunScriptArgs{ScriptID: *compileReply.ScriptID, AwaitPromise: &awaitPromise})
if err != nil {
	panic(err)
}
Sincerely, yarik
Some types, commands and events contain unnamed enums. Unnamed meaning they do not reference any domain type.
Take debugger.PausedReply for example, it contains a Reason of type string. This is actually an enum without a name, defined in the protocol by:
{ "name": "reason", "type": "string", "enum": [ "XHR", "DOM", "EventListener", "exception", "assert", "debugCommand", "promiseRejection", "OOM", "other", "ambiguous" ], "description": "Pause reason." }
This could easily be translated into:
// PausedReason Pause reason.
type PausedReason int

// PausedReason as enums.
const (
	PausedReasonNotSet PausedReason = iota
	PausedReasonXHR
	PausedReasonDOM
	PausedReasonEventListener
	PausedReasonException
	PausedReasonAssert
	PausedReasonDebugCommand
	PausedReasonPromiseRejection
	PausedReasonOOM
	PausedReasonOther
	PausedReasonAmbiguous
)

// ...
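A minimal sketch of how such an enum could round-trip with the protocol's string values. The name table, the String and UnmarshalJSON methods, and the decodeReason helper are assumptions made for illustration; they are not what the generator emits.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PausedReason mirrors the unnamed "reason" enum shown above.
type PausedReason int

const (
	PausedReasonNotSet PausedReason = iota
	PausedReasonXHR
	PausedReasonDOM
	PausedReasonEventListener
	PausedReasonException
	PausedReasonAssert
	PausedReasonDebugCommand
	PausedReasonPromiseRejection
	PausedReasonOOM
	PausedReasonOther
	PausedReasonAmbiguous
)

// pausedReasonNames maps each constant to its protocol string
// (hypothetical table; index 0 is the "not set" zero value).
var pausedReasonNames = [...]string{
	"", "XHR", "DOM", "EventListener", "exception", "assert",
	"debugCommand", "promiseRejection", "OOM", "other", "ambiguous",
}

// String returns the protocol string for r.
func (r PausedReason) String() string {
	if r < 0 || int(r) >= len(pausedReasonNames) {
		return fmt.Sprintf("PausedReason(%d)", int(r))
	}
	return pausedReasonNames[r]
}

// UnmarshalJSON converts the protocol string into the enum value.
func (r *PausedReason) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	for i, name := range pausedReasonNames {
		if name == s {
			*r = PausedReason(i)
			return nil
		}
	}
	return fmt.Errorf("unknown pause reason %q", s)
}

// decodeReason is a hypothetical helper that decodes an event payload.
func decodeReason(payload string) (PausedReason, error) {
	var ev struct {
		Reason PausedReason `json:"reason"`
	}
	err := json.Unmarshal([]byte(payload), &ev)
	return ev.Reason, err
}

func main() {
	r, err := decodeReason(`{"reason":"promiseRejection"}`)
	fmt.Println(r, err)
}
```

An int-backed enum needs this kind of lookup in both directions; a string-backed type would avoid the table at the cost of losing the iota zero value for "not set".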
One problem here is naming. Since the struct (or event) is called paused, and the parameter is called reason, I think it makes sense to name it PausedReason. This works well for the most part, but can result in some really long names, such as EmulateTouchFromMouseEventButtonNone, or in the worst case, names that do not work at all: ForcePseudoStateForcedPseudoClassesActive.
If we were to name them differently, e.g. just Reason, it would not work with e.g. ObjectPreview.Subtype (name Subtype) since this name is shared between ObjectPreview, PropertyPreview and RemoteObject. Also, the enum for RemoteObject differs from ObjectPreview and PropertyPreview, which seems like a bug.
Since some props/params share enums it might be nice to have these types as part of the domain, and it would also ease the naming of things.
I think the next step is to report/request this to the chrome-debugging-protocol mailing list.
Looks like the writes to the websocket stream are happening concurrently which isn't allowed in gorilla. More info here: https://godoc.org/github.com/gorilla/websocket#hdr-Concurrency
panic: concurrent write to websocket connection
goroutine 52 [running]:
github.com/gorilla/websocket.(*messageWriter).flushFrame(0xc4202231d0, 0xc42018f901, 0x0, 0x0, 0x0, 0x199, 0xc420200100)
/Users/matt/go/src/github.com/gorilla/websocket/conn.go:585 +0x57b
github.com/gorilla/websocket.(*messageWriter).Close(0xc4202231d0, 0xc420024650, 0xc420024600)
/Users/matt/go/src/github.com/gorilla/websocket/conn.go:699 +0x60
github.com/gorilla/websocket.(*Conn).prepWrite(0xc42015adc0, 0x1, 0x12a752f, 0xc420053ea8)
/Users/matt/go/src/github.com/gorilla/websocket/conn.go:468 +0x10c
github.com/gorilla/websocket.(*Conn).NextWriter(0xc42015adc0, 0x1, 0x40, 0x50, 0xc42013ee60, 0x0)
/Users/matt/go/src/github.com/gorilla/websocket/conn.go:488 +0x39
github.com/gorilla/websocket.(*Conn).WriteJSON(0xc42015adc0, 0x1409ca0, 0xc42013ee60, 0x55, 0x81)
/Users/matt/go/src/github.com/gorilla/websocket/json.go:22 +0x49
exit status 2
FAIL command-line-arguments 2.806s
FWIW, I'm not sure exactly how rpcc works yet internally, but I had to deal with this once before, solving it with a write channel and a write event loop. It looked like this, though I'd probably want to update it to use context over additional channels.
writeLoop function:
func (w *WS) writeLoop() {
	conn := w.conn
	for {
		select {
		case outgoing := <-w.writeCh:
			err := conn.WriteJSON(outgoing.Message)
			if err != nil {
				outgoing.Response <- err
			} else {
				close(outgoing.Response)
			}
		case <-w.closedCh:
			// if we've closed end the loop
			log.Info("ending outgoing loop")
			return
		}
	}
}
write function:
// Write to the websocket
func (w *WS) Write(v interface{}) error {
	errorCh := make(chan error)
	req := &writeRequest{
		Message:  v,
		Response: errorCh,
	}
	// write request
	select {
	case w.writeCh <- req:
	case <-w.closedCh:
		return ErrClosed
	}
	// ensure we wrote to the socket
	select {
	case err := <-errorCh:
		return err
	case <-w.closedCh:
		return ErrClosed
	}
}
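A rough sketch of the same write loop using context instead of the extra closed channel, as suggested above. It is not how rpcc is implemented: serialWriter, writeConcurrently and errClosed are hypothetical names, and an io.Writer with a JSON encoder stands in for the websocket connection.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"sync"
)

var errClosed = errors.New("writer closed")

// writeRequest pairs a message with a channel that reports the result.
type writeRequest struct {
	msg  interface{}
	done chan error
}

// serialWriter funnels every write through one goroutine so the
// underlying writer is never written to concurrently.
type serialWriter struct {
	reqs chan writeRequest
	stop <-chan struct{}
}

func newSerialWriter(ctx context.Context, w io.Writer) *serialWriter {
	s := &serialWriter{reqs: make(chan writeRequest), stop: ctx.Done()}
	enc := json.NewEncoder(w)
	go func() {
		for {
			select {
			case req := <-s.reqs:
				req.done <- enc.Encode(req.msg) // done is buffered, never blocks
			case <-s.stop:
				return
			}
		}
	}()
	return s
}

// Write blocks until v has been written, the writer stops, or ctx ends.
func (s *serialWriter) Write(ctx context.Context, v interface{}) error {
	req := writeRequest{msg: v, done: make(chan error, 1)}
	select {
	case s.reqs <- req:
	case <-s.stop:
		return errClosed
	case <-ctx.Done():
		return ctx.Err()
	}
	select {
	case err := <-req.done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// writeConcurrently issues n writes from n goroutines and returns
// the number of JSON lines that reached the buffer.
func writeConcurrently(n int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	var buf bytes.Buffer
	w := newSerialWriter(ctx, &buf)
	var wg sync.WaitGroup
	errs := make(chan error, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			errs <- w.Write(ctx, map[string]int{"id": id})
		}(i)
	}
	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			return 0, err
		}
	}
	return bytes.Count(buf.Bytes(), []byte("\n")), nil
}

func main() {
	fmt.Println(writeConcurrently(10))
}
```

A sync.Mutex around the write call would be a simpler alternative; the channel form mainly helps when callers need cancellation while waiting for their turn.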
When monitoring network events, this error occurs:
cdp.Network: GetResponseBody: rpc error: No resource with given identifier found (code = -32000)
code:
_, err = c.Network.GetResponseBody(ctx, &network.GetResponseBodyArgs{RequestID: xxs.RequestID})
//fmt.Println("resp", resp)
fmt.Println("xxs.RequestID: ", xxs.RequestID)
if err != nil {
	fmt.Println("------------------------sss", err, xxs.RequestID)
	return cret, err
}
I noticed that recently some references for network event types have been changed and moved around, but I can't quite figure out how to resolve this.
I noticed that page.ResourceTypeDocument was moved to network.ResourceTypeDocument and I fixed it like this:
event, err := requestWillBeSent.Recv()
if err == nil {
if event.Initiator.Type == "other" && *event.Type == network.ResourceTypeDocument { ... }
Yet this error is still raised:
../browser.go:261:44: invalid indirect of event.Type (type "github.com/mafredri/cdp/protocol/network".ResourceType)
make: *** [Makefile:24: linux] Error 2
Although looking at the code, it seems that Type is still present in the RequestWillBeSentReply. Could you clarify? Thanks!
Hi,
Sometimes headless Chrome produces an Invalid message: VALIDATION_ERROR_MESSAGE_HEADER_UNKNOWN_METHOD error. I noticed that this error happens when my laptop goes into standby and wakes up again several times while I keep headless Chrome running.
I tried listening to the events produced by c.Runtime.ExceptionThrown, but the error above did not show up there.
Is there any way to listen for this type of error?
Thanks!
Is there any way to terminate the debug protocol process without sending SIGTERM to the PID?
For example, by closing all targets gracefully and letting the Chrome process manager shut down all children and close the rpc connection.
Hi sir, below is the output for the URL I am scraping. The interception is repeated; could you check whether this will affect my results? Thanks @mafredri @rtomayko @fd0:
2017/12/07 13:46:10 http://m.elongstatic.com/promotions/wireless/uploadImages/images149370993339413.jpg
2017/12/07 13:46:10 http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150
2017/12/07 13:46:10 {"interceptionId":"id-26","request":{"url":"http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"High","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false,"redirectUrl":"http://m.elongstatic.com/wwwelongstatic/common/js/expire/s_code.js?20171129181150"}
2017/12/07 13:46:10 {"interceptionId":"id-87","request":{"url":"http://ihotel.elong.com/HotDataWindow_Region.html?callback=jQuery1111024597974242530696_1512625570090\u0026_=1512625570096","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"Low","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false}
2017/12/07 13:46:10 http://ihotel.elong.com/HotDataWindow_Region.html?callback=jQuery1111024597974242530696_1512625570090&_=1512625570096
2017/12/07 13:46:10 http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150
2017/12/07 13:46:10 {"interceptionId":"id-26","request":{"url":"http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"High","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false,"redirectUrl":"http://m.elongstatic.com/wwwelongstatic/common/js/expire/s_code.js?20171129181150"}
2017/12/07 13:46:10 {"interceptionId":"id-26","request":{"url":"http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"High","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false,"redirectUrl":"http://m.elongstatic.com/wwwelongstatic/common/js/expire/s_code.js?20171129181150"}
2017/12/07 13:46:10 http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150
2017/12/07 13:46:10 {"interceptionId":"id-26","request":{"url":"http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"High","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false,"redirectUrl":"http://m.elongstatic.com/wwwelongstatic/common/js/expire/s_code.js?20171129181150"}
2017/12/07 13:46:10 http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150
2017/12/07 13:46:10 {"interceptionId":"id-26","request":{"url":"http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"High","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false,"redirectUrl":"http://m.elongstatic.com/wwwelongstatic/common/js/expire/s_code.js?20171129181150"}
2017/12/07 13:46:10 http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150
2017/12/07 13:46:10 {"interceptionId":"id-26","request":{"url":"http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150","method":"GET","headers":{"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/64.0.3264.0 Safari/537.36","X-DevTools-Emulate-Network-Conditions-Client-Id":"(DE0A908A06D7D765113ED0BF265D0147), (DE0A908A06D7D765113ED0BF265D0147)","Accept":"*/*"},"initialPriority":"High","referrerPolicy":"no-referrer-when-downgrade"},"frameId":"(DE0A908A06D7D765113ED0BF265D0147)","resourceType":"Script","isNavigationRequest":false,"redirectUrl":"http://m.elongstatic.com/wwwelongstatic/common/js/expire/s_code.js?20171129181150"}
2017/12/07 13:46:10 http://www.elongstatic.com/common/js/noexpire/s_code.js?20171129181150
I am looking for an example of using SetExtraHTTPHeaders to set the Accept-Language header.
How do I open multiple browser tabs?
Which function should I call?
It would be great to have access to the raw channel behind the events, to support select statements.
This is slightly contrived, but you should get the point:
loadEventFired, err := c.Page.LoadEventFired(ctx)
if err != nil {
	return err
}
_, err = c.Page.Navigate(ctx, &cdpcmd.PageNavigateArgs{
	URL: "https://github.com",
})
if err != nil {
	return err
}
select {
case <-loadEventFired.Channel():
	return nil
case <-time.After(300 * time.Millisecond):
	return errors.New("timeout")
}
Alternatively, if we could pass an additional context (i.e. loadEventFired.RecvWithContext(ctx)), that could work too.
The use case I need this for is after click events, where you may not know if the page is navigating or not, so you need to wait a couple hundred milliseconds to see if a Page.navigationRequested has come in or not.
For example, clicking a checkbox and submitting a form.
I didn't find an example. Is it already implemented, or is it not possible to do?
Thanks.
Hi, I'm trying to make a POST request and as far as I know it's possible only via interception.
Here is my code:
package main
import (
"context"
"fmt"
"io/ioutil"
"log"
"time"
"github.com/mafredri/cdp/protocol/network"
"github.com/mafredri/cdp"
"github.com/mafredri/cdp/devtool"
"github.com/mafredri/cdp/protocol/dom"
"github.com/mafredri/cdp/protocol/page"
"github.com/mafredri/cdp/rpcc"
)
func main() {
err := run(5 * time.Second)
if err != nil {
log.Fatal(err)
}
}
func run(timeout time.Duration) error {
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
// Use the DevTools HTTP/JSON API to manage targets (e.g. pages, webworkers).
devt := devtool.New("http://127.0.0.1:9222")
pt, err := devt.Get(ctx, devtool.Page)
if err != nil {
pt, err = devt.Create(ctx)
if err != nil {
return err
}
}
// Initiate a new RPC connection to the Chrome Debugging Protocol target.
conn, err := rpcc.DialContext(ctx, pt.WebSocketDebuggerURL)
if err != nil {
return err
}
defer conn.Close() // Leaving connections open will leak memory.
c := cdp.NewClient(conn)
// Open a DOMContentEventFired client to buffer this event.
domContent, err := c.Page.DOMContentEventFired(ctx)
if err != nil {
return err
}
defer domContent.Close()
// Enable events on the Page domain, it's often preferable to create
// event clients before enabling events so that we don't miss any.
if err = c.Page.Enable(ctx); err != nil {
return err
}
if err = c.Network.Enable(ctx, nil); err != nil {
return err
}
siteUrl := "https://requestb.in/1hmd8l51"
pattern := network.RequestPattern{URLPattern: &siteUrl}
patterns := []network.RequestPattern{pattern}
interArgs := network.NewSetRequestInterceptionArgs(patterns)
err = c.Network.SetRequestInterception(ctx, interArgs)
if err != nil {
return err
}
cl, err := c.Network.RequestIntercepted(ctx)
if err != nil {
return err
}
r, err := cl.Recv()
if err != nil {
return err
}
interceptedArgs := network.NewContinueInterceptedRequestArgs(r.InterceptionID)
interceptedArgs.SetMethod("POST")
interceptedArgs.SetPostData("a=b&c=d")
if err = c.Network.ContinueInterceptedRequest(ctx, interceptedArgs); err != nil {
return err
}
// Create the Navigate arguments with the optional Referrer field set.
navArgs := page.NewNavigateArgs("https://requestb.in/1hmd8l51").
SetReferrer("https://www.amazon.co.uk/")
nav, err := c.Page.Navigate(ctx, navArgs)
if err != nil {
return err
}
// Wait until we have a DOMContentEventFired event.
if _, err = domContent.Recv(); err != nil {
return err
}
fmt.Printf("Page loaded with frame ID: %s\n", nav.FrameID)
// Fetch the document root node. We can pass nil here
// since this method only takes optional arguments.
doc, err := c.DOM.GetDocument(ctx, nil)
if err != nil {
return err
}
// Get the outer HTML for the page.
_, err = c.DOM.GetOuterHTML(ctx, &dom.GetOuterHTMLArgs{
NodeID: &doc.Root.NodeID,
})
if err != nil {
return err
}
// fmt.Printf("HTML: %s\n", result.OuterHTML)
// Capture a screenshot of the current page.
screenshotName := "screenshot.jpg"
screenshotArgs := page.NewCaptureScreenshotArgs().
SetFormat("jpeg").
SetQuality(80)
screenshot, err := c.Page.CaptureScreenshot(ctx, screenshotArgs)
if err != nil {
return err
}
if err = ioutil.WriteFile(screenshotName, screenshot.Data, 0644); err != nil {
return err
}
fmt.Printf("Saved screenshot: %s\n", screenshotName)
return nil
}
I get the error on r, err := cl.Recv():
cdp.Network: RequestIntercepted Recv: context deadline exceeded
Please help
Hi,
I'm trying to build a small tool that takes screenshots of a number of URLs. For that I'd like to use a headless chromium running on a server (without a display). I'd also like to do that with ~10 URLs in parallel, so I'd like to create 10 tabs in the browser and use them from different goroutines in parallel.
With a GUI chromium that works great by creating a new "Page" for every goroutine:
page, err := devt.Create(ctx)
However on a headless chromium that fails:
panic: target CreateURL: Could not create new page
Turns out that the code just requests /json/new, which does not work on headless chromium. The way to go is to use createTarget() to create a new browsing environment, reference: https://groups.google.com/a/chromium.org/forum/#!topic/headless-dev/HLeIEn5V0DA
Can I do that already with cdp? I'm also willing to help implement this, if you guide me through the code base a bit :)
Hello!
I'm trying to emulate touch event.
Please tell me where I'm wrong.
ts := input.TouchPoint{X: 100, Y: 200}
touchDispatchStart := c.Input.DispatchTouchEvent(ctx, &input.DispatchTouchEventArgs{
	Type:        "touchStart",
	TouchPoints: []input.TouchPoint{ts},
})
if touchDispatchStart != nil {
	panic(touchDispatchStart)
}
tm := input.TouchPoint{X: 200, Y: 300}
touchDispatchMove := c.Input.DispatchTouchEvent(ctx, &input.DispatchTouchEventArgs{
	Type:        "touchMove",
	TouchPoints: []input.TouchPoint{tm},
})
if touchDispatchMove != nil {
	panic(touchDispatchMove)
}
cdp.Input: DispatchTouchEvent: rpc error: Must have no prior active touch points to start a new touch. (code = -32602)
2018/03/14 13:39:48 cdp.Security: SetIgnoreCertificateErrors: rpc error: 'Security.setIgnoreCertificateErrors' wasn't found (code = -32601)
using this code:
err = c.Security.SetIgnoreCertificateErrors(ctx, &security.SetIgnoreCertificateErrorsArgs{Ignore: true})
if err != nil {
return err
}
What am I doing wrong here?
Right now Target.SendMessageToTarget accepts Message as a string, which is what the protocol expects, but that string is actually a subcommand, like this:
bctx, err := c.Target.CreateBrowserContext(ctx)
if err != nil {
	panic(err)
}
width := 940
height := 500
tar, err := c.Target.CreateTarget(ctx, &cdpcmd.TargetCreateTargetArgs{
	BrowserContextID: &bctx.BrowserContextID,
	URL:              "about:blank",
	Width:            &width,
	Height:           &height,
})
if err != nil {
	panic(err)
}
attached, err := c.Target.AttachToTarget(ctx, &cdpcmd.TargetAttachToTargetArgs{
	TargetID: tar.TargetID,
})
if err != nil {
	panic(err)
} else if !attached.Success {
	panic("could not attach")
}
err = c.Target.SendMessageToTarget(ctx, &cdpcmd.TargetSendMessageToTargetArgs{
	TargetID: tar.TargetID,
	Message:  `{"id":0,"method":"Page.enable","params":{}}`,
})
if err != nil {
	panic(err)
}
I'm not exactly sure how this will work in Go, but it'd be nice to somehow treat Message as a recursive command that runs json.Marshal on the command before sending it off.
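As a rough illustration of the idea, the nested message can be built from a struct with json.Marshal instead of a hand-written string. The subMessage type and encodeMessage helper are hypothetical and not part of cdp; only the JSON shape is taken from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// subMessage is a hypothetical envelope for the command embedded
// in the Message string.
type subMessage struct {
	ID     int         `json:"id"`
	Method string      `json:"method"`
	Params interface{} `json:"params"`
}

// encodeMessage marshals the nested command into the string form
// that the Message field expects.
func encodeMessage(id int, method string, params interface{}) (string, error) {
	b, err := json.Marshal(subMessage{ID: id, Method: method, Params: params})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// An empty struct marshals to {}, matching the params above.
	msg, err := encodeMessage(0, "Page.enable", struct{}{})
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
}
```

The resulting string could then be assigned to Message in place of the literal, with params replaced by whichever argument struct the subcommand takes.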
Hey @mafredri, I'm really digging this implementation. I've also noticed you on a couple common issues across github related to headless chrome – awesome stuff :-)
For my use case, I'm just using the generated types in this library, so I was wondering if those are meant to be public interfaces, or if you anticipate changing that up in the future?
Thanks!
Hello!
I'm trying to override Platform, but I get an error.
em := emulation.NewClient(conn)
emErr := em.SetNavigatorOverrides(ctx, &emulation.SetNavigatorOverridesArgs{Platform: "Win32"})
if emErr != nil {
	panic(emErr)
}
panic: cdp.Emulation: SetNavigatorOverrides: rpc error: 'Emulation.setNavigatorOverrides' wasn't found (code = -32601)
Google Chrome 62.0.3202.89
Hello,
In attempting to create new targets in the browser (i.e. multiple tabs), I've run into what I believe is a bug: when I create a new target and activate it using Target.ActivateTarget, the debugger is still attached to the old target. Strangely, calling Target.CloseTarget will close the newly-created target, but I still cannot seem to attach to it no matter what I do. I've attempted to close all targets when initializing the cdp.Client, but this returns an error and causes any subsequent target creation to fail with an error like rpcc: the connection is closing.
Here's a gist exhibiting this behaviour (relies on Chrome listening on localhost:9222).
Hello, I am working on my own project. One of its features uses your awesome library, and I need it to get an HTTP response (including headers). I have read the official DevTools protocol documentation and your project documentation, but it seems that no related API is provided. Is there any way to do that?
Hello! I'm using this package for some Headless Chrome automation, and I'm trying to inject some utility scripts into the browser context (jquery in particular). Everything is working great, except that when I use Runtime.Evaluate to inject jquery into the page, I get an error saying websocket: close 1006 (abnormal closure): unexpected EOF.
Page.AddScriptToEvaluateOnLoad appears to succeed, but $ won't be defined in the page context even after loading a new page to trigger the evaluation.
I'm using https://code.jquery.com/jquery-3.2.1.min.js, but I've also tried v3.2.0-min, which exhibits the same bug.
Hi!
First of all, I want to thank you for your incredible work of bringing CDP to go land. That's fantastic.
Second, I'm working on a scraping tool where I'm trying to use the package to access JS-based web pages. While using it, I've noticed that DescribeNode sometimes returns a node with an empty NodeId field (although BackendId is not empty), and as a result I have to keep the id returned from the QuerySelector/QuerySelectorAll methods.
I'm wondering whether this behavior is by design?
If I print to PDF from desktop Chrome, the PDF automatically sizes to fit the content being printed. When using cdp to print from headless Chrome, PrintToPDFArgs.PaperWidth and PrintToPDFArgs.PaperHeight are set by default. This can cause large white margins to be added to pages that don't fit this dimension or aspect ratio. I'm sure it could be fixed by setting these values correctly, but it would be great if they could somehow not be used at all, as is done on desktop. Is there a way around this? Thanks.
Running go version go1.9rc1 darwin/amd64
cdpgen -dest-pkg github.com/mafredri/cdp -browser-proto $GOPATH/src/github.com/mafredri/cdp/cmd/cdpgen/protodef/browser_protocol.json -js-proto $GOPATH/src/github.com/mafredri/cdp/cmd/cdpgen/protodef/js_protocol.json
...writing...
2017/07/31 11:59:14 /Users/matt/go/src/github.com/mafredri/cdp/protocol/page/types19.go:56:19: expected type, found '='
/Users/matt/go/src/github.com/mafredri/cdp/protocol/page/types19.go:82:14: expected type, found '='
/Users/matt/go/src/github.com/mafredri/cdp/protocol/page19.go:10:23: expected type, found '='
/Users/matt/go/src/github.com/mafredri/cdp/protocol/page19.go:15:18: expected type, found '='
2017/07/31 11:59:14 exit status 2
Any ideas? :-)
Can someone point me to some examples of using the cdp package with an interactive web page? I see a lot of examples of loading a page and querying the DOM, but nothing covering how to move the mouse, click elements, tap elements, fill a form, attach a file to a form, interact with dialogs and subwindows, etc.
Hey there,
I'd like to be able to write in order synchronously without waiting for the replies.
The use case being able to speed up typing in keyboard events, where you need them to run in order (keyDown, char, keyUp), so you want the writes to go through in order, but you don't care about waiting for the result ids to come back.
It would probably just be a simple addition to the API, rpcc.InvokeAsync maybe. Would this be something you'd like to see in here?
Do you have any advice how to do geolocation on headless chrome?
On regular chrome I can run this in the console:
navigator.geolocation.getCurrentPosition(function (position) { console.log(position); });
And it will pop up a dialog asking for permission, then print the coordinates in the console.
On headless chrome I don't get any indication of a dialog box opening and I never receive a response.
I tried adding an emulated position thinking that maybe by default headless chrome doesn't have a position:
err = c.Emulation.SetGeolocationOverride(ctx, emulation.NewSetGeolocationOverrideArgs().SetLatitude(11.0).SetLongitude(12.0).SetAccuracy(13.0))
if err != nil {
log.Fatal(err)
}
And I don't receive any errors, however again when I try to query the geolocation I don't get any indication of a dialog box, and there is never a response to my query.
Hi,
First of all, thanks for a great library!
Why is response status a float64 and not int like it is in the native http.Response.StatusCode?
If it is not going to change to int, would you mind adding a comment before the field explaining the reason?
Thanks
How can I get a website's wss frame data via c.Network? Could you give me an example? Thanks!
Just curious - I have a working request interception function:
url := "*"
err = c.Network.SetRequestInterception(ctx, network.NewSetRequestInterceptionArgs([]network.RequestPattern{network.RequestPattern{URLPattern: &url}}))
if err != nil {
log.Fatal("SetRequestInterception error:", err.Error())
}
go func() {
requestIntercepted, err := c.Network.RequestIntercepted(ctx)
if err != nil {
log.Fatal(err)
}
defer requestIntercepted.Close()
for {
select {
case <-requestIntercepted.Ready():
reply, err := requestIntercepted.Recv()
if err != nil {
log.Fatalf("requestIntercepted failed: %s", err)
continue
}
data, err := json.Marshal(reply)
if err != nil {
log.Fatalf("Error marshaling response: %s", err)
continue
}
log.Printf("requestIntercepted: %s", data)
err = c.Network.ContinueInterceptedRequest(ctx, network.NewContinueInterceptedRequestArgs(reply.InterceptionID))
if err != nil {
log.Fatalf("ContinueInterceptedRequest: %s", err)
continue
}
continue
}
}
}()
And I am loading a minimal html page:
https://www.sitepoint.com/a-minimal-html-document-html5-edition/
When I load this page I only receive requestIntercepted for the html file and favicon.ico - the intercept event for style.css never fires that I can see. I see the style.css request hit my test server and I see a ResponseReceived event for it.
Even more odd, if there is a problem with the stylesheet, like it not existing, then I see neither the requestIntercepted nor the ResponseReceived events.
I thought that maybe I was dropping or consuming events inadvertently, so I set up a second test file with a handful of css files listed, and I see the same behavior for all of them: never a requestIntercepted event, every request hits the backend, but I only receive ResponseReceived events for the ones that exist.
If I open chrome devtools and reload the page, I see full details of the requests for all the stylesheets regardless of their existence.
If I enable RequestWillBeSent and LoadFailed/LoadFinished events then I see those for all stylesheets.
Any idea what is the cause of this behavior? I see it with both Chrome 64 and Chrome 66
While replying to #11 I realized the behavior of the Ready channel might be unexpected.
In the code sample:
select {
case <-loadEventFired.Ready():
	return nil
case <-time.After(300 * time.Millisecond):
	return errors.New("timeout")
}
Not calling loadEventFired.Recv() after <-loadEventFired.Ready() can result in unexpected bugs if the Ready channel is used again after this.
Calling <-loadEventFired.Ready() will deplete the Ready channel and a future call will block unless Recv() is called in between.
In a previous implementation, I always closed the Ready channel when an event could be received, which made it non-blocking until Recv is called. I refactored this because I was also considering concurrent calls to <-loadEventFired.Ready(): in this implementation only one will resolve; in the previous, both would resolve and one call to Recv would block.
One option here is to try to populate the Ready channel when loadEventFired.Ready() is invoked, assuming there is an event in the pipeline.
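The last option could look roughly like the sketch below. It is a toy model and not the rpcc implementation: eventQueue, Push, arm and isReady are hypothetical names, events are plain strings, and Recv here is non-blocking for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// eventQueue is a toy stand-in for an event client.
type eventQueue struct {
	mu     sync.Mutex
	events []string
	ready  chan struct{} // capacity 1: at most one pending token
}

func newEventQueue() *eventQueue {
	return &eventQueue{ready: make(chan struct{}, 1)}
}

// arm puts a token on ready unless one is already there.
// The caller must hold mu.
func (q *eventQueue) arm() {
	select {
	case q.ready <- struct{}{}:
	default:
	}
}

// Push queues an event and signals readiness.
func (q *eventQueue) Push(ev string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.events = append(q.events, ev)
	q.arm()
}

// Ready re-arms the channel whenever events are pending, so a caller
// that drained it earlier without calling Recv does not block.
func (q *eventQueue) Ready() <-chan struct{} {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.events) > 0 {
		q.arm()
	}
	return q.ready
}

// Recv pops the oldest event; ok is false when the queue is empty.
func (q *eventQueue) Recv() (ev string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.events) == 0 {
		return "", false
	}
	ev, q.events = q.events[0], q.events[1:]
	if len(q.events) == 0 {
		// Drop a stale token so Ready does not fire with nothing queued.
		select {
		case <-q.ready:
		default:
		}
	}
	return ev, true
}

// isReady polls Ready without blocking.
func isReady(q *eventQueue) bool {
	select {
	case <-q.Ready():
		return true
	default:
		return false
	}
}

func main() {
	q := newEventQueue()
	q.Push("load")
	fmt.Println(isReady(q), isReady(q)) // second poll still succeeds
	ev, _ := q.Recv()
	fmt.Println(ev, isReady(q))
}
```

With concurrent waiters, still only one of them receives the token per call to Ready, so this keeps the single-resolver property described above.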
Hi! Thanks for this awesome package!
I am trying to create a concurrent web crawler that processes all JS on the pages and gets the final HTML document.
This is how I do it now:
1. Start Chrome (Google Chrome 69.0.3493.3 dev) with the --headless and --remote-debugging-port=9222 flags.
2. Create a devtool.DevTools instance.
3. Call devt.Get(ctx, devtool.Page).
4. Call rpcc.DialContext(ctx, page.WebSocketDebuggerURL).
5. Create cdp.NewClient(conn) with the rpcc connection opened before.
6. Start goroutines that listen for LoadEventFired, etc., and wait for jobs from the input channels.
7. In each goroutine, call client.Page.Navigate(ctx, page.NewNavigateArgs(job)), wait for the fired event, and then call client.DOM.GetDocument(ctx, nil).
The problem is that when a goroutine receives a job and sends the response to the responses channel, I receive the same response for different targets. For example, I send two URLs into the channel, http://example.com/ and http://awesomesite.net/, and then in the responses channel I get the HTML of the first target, http://example.com, twice. It looks like when a worker receives a new target it cancels the previous job, and then two of my workers react to the same event and get the same HTML.
What am I doing wrong, and how can I make concurrent requests that do not cancel each other?
Thank you!
The IO domain is used for streaming data (currently this is limited to tracing). In the future we might see streams in other parts as well (e.g. printing PDFs, ChromeDevTools/devtools-protocol#41).
It's possible to read until EOF and close streams, so we should be able to implement an io.ReadCloser quite painlessly.
Here's some pseudo-code (of io.Reader) to illustrate:
func (r *ioReadCloser) Read(b []byte) (n int, err error) {
	if len(r.data) > 0 {
		// Copy remaining data into b.
		return n, err
	}
	if r.eof {
		return 0, io.EOF
	}
	reply, err := r.c.IO.Read(ctx, &io.ReadArgs{
		Handle: &r.handle,
		Offset: &r.pos,
		Size:   len(b),
	})
	if err != nil {
		return 0, err
	}
	r.eof = reply.EOF
	if reply.Base64Encoded != nil && *reply.Base64Encoded {
		// Handle base64 decoding.
	}
	// Copy reply.Data into b.
	// Store surplus in r.data?
	return n, err
}
Regarding Size: len(b), this could also be defined when creating the reader, e.g. io.NewStreamReader(bufsize int).
The reader could also be made to optionally prefetch the next chunk without waiting for the next call to Read. This would mandate a predefined buffer size.
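As a rough, self-contained illustration of the idea above, here is a sketch that uses only the standard library. The chunk type, the chunkSource interface, fakeSource and readAll are all hypothetical stand-ins for the IO domain client and its replies, not part of cdp. The sketch fixes the chunk size when the reader is created, decodes base64 chunks, and keeps any surplus for the next Read call.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
)

// chunk loosely mirrors one reply from a chunked read: some data, a flag
// saying whether it is base64 encoded, and an end-of-stream marker.
type chunk struct {
	Data          string
	Base64Encoded bool
	EOF           bool
}

// chunkSource is a hypothetical stand-in for the IO domain client.
type chunkSource interface {
	ReadChunk(size int) (chunk, error)
	Close() error
}

// streamReader adapts a chunkSource to io.ReadCloser.
type streamReader struct {
	src  chunkSource
	size int    // chunk size requested from the source, fixed at creation
	buf  []byte // decoded surplus not yet handed to the caller
	eof  bool
}

func newStreamReader(src chunkSource, bufsize int) *streamReader {
	return &streamReader{src: src, size: bufsize}
}

func (r *streamReader) Read(b []byte) (int, error) {
	if len(b) == 0 {
		return 0, nil
	}
	// Fetch chunks until there is data to hand out or the stream ends.
	for len(r.buf) == 0 {
		if r.eof {
			return 0, io.EOF
		}
		c, err := r.src.ReadChunk(r.size)
		if err != nil {
			return 0, err
		}
		r.eof = c.EOF
		data := []byte(c.Data)
		if c.Base64Encoded {
			if data, err = base64.StdEncoding.DecodeString(c.Data); err != nil {
				return 0, err
			}
		}
		r.buf = data
	}
	n := copy(b, r.buf)
	r.buf = r.buf[n:] // keep the surplus for the next call
	return n, nil
}

func (r *streamReader) Close() error { return r.src.Close() }

// fakeSource serves a fixed string in base64-encoded chunks.
type fakeSource struct {
	data string
	pos  int
}

func (f *fakeSource) ReadChunk(size int) (chunk, error) {
	end := f.pos + size
	if end > len(f.data) {
		end = len(f.data)
	}
	part := f.data[f.pos:end]
	f.pos = end
	return chunk{
		Data:          base64.StdEncoding.EncodeToString([]byte(part)),
		Base64Encoded: true,
		EOF:           f.pos == len(f.data),
	}, nil
}

func (f *fakeSource) Close() error { return nil }

// readAll drains a source through the adapter and returns its contents.
func readAll(src chunkSource, chunkSize int) (string, error) {
	r := newStreamReader(src, chunkSize)
	defer r.Close()
	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	out, err := readAll(&fakeSource{data: "hello, devtools stream"}, 8)
	fmt.Println(out, err)
}
```

Because the adapter satisfies io.ReadCloser, callers can pass it to io.Copy, bufio.NewReader and similar helpers without knowing about chunking.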
Sidenote: It's a bit unfortunate that the IO domain (io package) conflicts name-wise with the standard library. Should we rename it, and to what? Maybe cio or cdpio? The same goes for e.g. log.
Hello, Mathias.
This may be another stupid question, but I'm not ashamed to ask.
I'm trying to order defer statements:
ctx, cancel := context.WithTimeout(context.Background(), 55*time.Second)
devt := devtool.New("http://127.0.0.1:9222")
page, err := devt.Create(ctx)
if err != nil {
	fmt.Print(err)
	return
}
conn, err := rpcc.DialContext(ctx, page.WebSocketDebuggerURL, rpcc.WithBufferSize(2048562), rpcc.WithCompression())
if err != nil {
	fmt.Print(err)
	return
}
// wrong order
defer cancel()
defer devt.Close(nil, page)
defer conn.Close()
c := cdp.NewClient(conn)
abort := make(chan error, 2)
go func() {
	select {
	case <-ctx.Done():
	case err := <-abort:
		fmt.Printf("aborted: %s\n", err.Error())
		cancel()
	}
}()
if err = catchExceptionThrown(ctx, c.Runtime, abort); err != nil {
	fmt.Print(err)
	return
}
if err = catchLoadingFailed(ctx, c.Network, abort); err != nil {
	fmt.Print(err)
	return
}
...
but I always get an error, even when I also get a response.
So I think the problem is in the order of the defer statements.
The error messages differ from time to time:
aborted: cdp.Network: LoadingFailed Recv: rpcc: the connection is closing
cdp.Runtime: Evaluate: context canceled
aborted: cdp.Network: LoadingFailed Recv: context canceled
What is the right order of defer statements?
P.S. I get no errors if I just remove devt.Close(), but I'm not sure it is right to leave the target unclosed. What do you think?
Thanks a lot.
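For reference, Go runs deferred calls in last-in, first-out order, so in the snippet above conn.Close() runs first, then devt.Close(nil, page), and cancel() runs last. A small standard-library sketch of that ordering, with strings standing in for the real calls (teardownOrder is a made-up helper):

```go
package main

import "fmt"

// teardownOrder registers three deferred steps in the same order as the
// question's snippet and records the order in which they actually run.
func teardownOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "cancel") }()
		defer func() { order = append(order, "devt.Close") }()
		defer func() { order = append(order, "conn.Close") }()
	}()
	return order
}

func main() {
	fmt.Println(teardownOrder()) // [conn.Close devt.Close cancel]
}
```

Whatever must be torn down last therefore has to be deferred first.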
Hello!
Please provide an example of keystrokes.
My code does not work:
key := "Space"
c.Input.DispatchKeyEvent(ctx, &input.DispatchKeyEventArgs{
	Type: "keyDown",
	Key:  &key,
})
c.Input.DispatchKeyEvent(ctx, &input.DispatchKeyEventArgs{
	Type: "keyUp",
	Key:  &key,
})
Is it possible to use a socket instead of TCP ports?
Is it currently possible to distinguish between response errors and network errors in RPCC?
For example, the DOM.getNodeForLocation method can sometimes return an error: No node found at given location. I don't want to bail out every time this error occurs, but for things like the RPCC websocket disconnecting, I would like to error out immediately.
Is there any way to distinguish between these types of errors? Here's some (incorrect) pseudocode of what I'm trying to accomplish:
var args cdpcmd.DOMGetNodeForLocationArgs
if e := json.Unmarshal(req.Params, &args); e != nil {
	return raw, e
}
reply, err := c.DOM.GetNodeForLocation(ctx, &args)
if err != nil {
	if _, ok := err.(*ResponseError); ok {
		// handle differently
	}
	return raw, err
}
return json.Marshal(reply)
Thanks!
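One general Go pattern for this kind of split is to give protocol-level errors their own type and test for it with errors.As, which also sees through wrapped errors. The sketch below is self-contained and uses local stand-ins: responseError, errConnClosed, wrap and isResponseError are hypothetical names for illustration, not the types or functions that cdp or rpcc expose.

```go
package main

import (
	"errors"
	"fmt"
)

// responseError is a local stand-in for an error reported by the browser
// in an RPC response (hypothetical; not the library's own type).
type responseError struct {
	Code    int
	Message string
}

func (e *responseError) Error() string {
	return fmt.Sprintf("rpc error: %s (code = %d)", e.Message, e.Code)
}

// errConnClosed stands in for a transport-level failure.
var errConnClosed = errors.New("connection closed")

// wrap adds call-site context while keeping the cause reachable via
// errors.As and errors.Is.
func wrap(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// isResponseError reports whether err, or anything it wraps, is a
// *responseError.
func isResponseError(err error) bool {
	var re *responseError
	return errors.As(err, &re)
}

func main() {
	notFound := wrap("GetNodeForLocation", &responseError{
		Code:    -32000,
		Message: "No node found at given location",
	})
	closed := wrap("GetNodeForLocation", errConnClosed)

	for _, err := range []error{notFound, closed} {
		if isResponseError(err) {
			fmt.Println("recoverable:", err)
			continue
		}
		fmt.Println("fatal:", err)
	}
}
```

This only works if the wrapping layer preserves the cause (here via %w), so the same check against a real library depends on how that library wraps its errors.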
I am trying to set up interception of network requests:
url := "*"
rii := network.NewSetRequestInterceptionArgs([]network.RequestPattern{{URLPattern: &url}})
err = c.Network.SetRequestInterception(ctx, rii)
if err != nil {
	log.Fatal("SetRequestInterception error:", err.Error())
}
However, I receive the following error:
SetRequestInterception error:cdp.Network: SetRequestInterception: rpc error: 'Network.setRequestInterception' wasn't found (code = -32601)
This is with go1.8.3 and Chrome 63. Chrome is the latest version; I can update Go if that will resolve the issue.
The current API is well suited for making sure all errors are handled promptly, and as it is synchronous, it does not force users into an async pattern by default. As a result, handling a lot of events can be quite tedious. It takes ~15 lines of code to properly initiate, close and receive one type of event concurrently.
The following example is OK for one event, but not very nice when repeated 10 times or more:
frameNavigated, err := c.Page.FrameNavigated(ctx)
if err != nil {
	return err
}
go func() {
	defer frameNavigated.Close()
	for {
		ev, err := frameNavigated.Recv()
		if err != nil {
			errCh <- err
			return
		}
		ch <- ev
	}
}()
To improve this, I propose an API that allows us to handle multiple events from one event client.
I imagine the usage of such an API could look something like this:
events := []cdp.Event{
	page.FrameNavigatedEvent,
	domain.SomeOtherEvent,
}
ec, err := cdp.CombineEvents(ctx, c, events...)
if err != nil {
	return err
}
go func() {
	defer ec.Close()
	for {
		ev, err := ec.Recv()
		if err != nil {
			errCh <- err
			return
		}
		switch ev := ev.(type) {
		case *page.FrameNavigatedReply:
			log.Println(ev.Frame.URL)
		case *domain.SomeOtherReply:
			// Do stuff...
		default:
			panic("unhandled event reply")
		}
	}
}()
The logic behind CombineEvents would be handled by generated code, getting rid of the boilerplate we are forced to write currently.
For those interested in what this might look like when implemented:
package cdp

type Event interface {
	Event() string
}

type EventsClient struct {
	closers []func() error
	events  chan interface{}
	err     chan error
}

// Close closes every underlying event client and returns the first error.
func (ec *EventsClient) Close() error {
	var err error
	for _, c := range ec.closers {
		err2 := c()
		if err == nil {
			err = err2
		}
	}
	return err
}

// Recv returns the next event from any of the combined clients, or the
// first error reported by one of them.
func (ec *EventsClient) Recv() (interface{}, error) {
	select {
	case ev := <-ec.events:
		return ev, nil
	case err := <-ec.err:
		return nil, err
	}
}

// CombineEvents combines multiple cdp events into a client that can receive any
// of those events via Recv.
//
// A goroutine is launched for each event.
func CombineEvents(ctx context.Context, c *Client, ev ...Event) (ec *EventsClient, err error) {
	ec = &EventsClient{
		events: make(chan interface{}, 1),
		err:    make(chan error, 1),
	}
	defer func() {
		if err != nil {
			ec.Close()
		}
	}()
	for _, e := range ev {
		switch e {
		// Generate code for each event...
		case page.FrameNavigatedEvent:
			cl, err := c.Page.FrameNavigated(ctx)
			if err != nil {
				return nil, err
			}
			// Register the closer so that ec.Close releases this client.
			ec.closers = append(ec.closers, cl.Close)
			go func() {
				// Forward events until the client fails or is closed.
				for {
					clEv, err := cl.Recv()
					if err != nil {
						ec.err <- err
						return
					}
					ec.events <- clEv
				}
			}()
		}
	}
	return ec, nil
}
rpcc to avoid launching a goroutine for each event (e.g. subscribing to multiple events?).
c.Page.FrameNavigated(ctx, cdp.Sync(eventSyncChannel)))?
@matthewmueller since you've been working with events, I would really appreciate your input/thoughts here 😄!
Hi everybody,
I'm trying to show a rectangle in a chromium browser.
color := cdptype.DOMRGBA{R: 255, G: 0, B: 0}
cerr := c.Overlay.HighlightRect(ctx, cdpcmd.NewOverlayHighlightRectArgs(5, 5, 20, 20).SetColor(color))
if cerr != nil {
	panic(cerr)
}
I get the following error:
'Overlay.highlightRect' wasn't found (code = -32601)
There seems to be a mismatch between domains (is there a Highlight in the DOM domain?).
Thank you! Keep up the great work!
I would like to set a custom locale. Is there any way to achieve this?