w3c / webdriver

Remote control interface that enables introspection and control of user agents.

Home Page: https://w3c.github.io/webdriver/

License: Other

HTML 98.93% JavaScript 0.92% CSS 0.14%
w3c-specification webdriver standard browser automation remote-control

webdriver's Introduction

WebDriver Standard

WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of web browsers.

Provided is a set of interfaces to discover and manipulate DOM elements in web documents and to control the behavior of a user agent. It is primarily intended to allow web authors to write tests that automate a user agent from a separate controlling process, but may also be used in such a way as to allow in-browser scripts to control a — possibly separate — browser.

The standard is authored by the W3C [Browser Testing and Tools Working Group], and has produced the following documents:

Contribute

In short, change index.html and submit a pull request (PR) with a good commit message. Changes that affect behaviour must be accompanied by corresponding test changes to the Web Platform Tests repository.

We use ReSpec to help us maintain referential integrity, bibliographical data, and perform other mundane tasks such as styling. To preview your changes, just load index.html from disk in a browser. To verify the integrity of the document you can run make test.

You may add your name to the Acknowledgements section in your first PR, even for trivial fixes. The names are sorted lexicographically.

See CONTRIBUTING.md for more guidelines.

Vendor status documents

webdriver's People

Contributors

andreastt, automatedtester, christian-bromann, darobin, deniak, dontcallmedom, drmarcii, eranmes, foolip, illicitonion, jgraham, jimevans, jimevanssfdc, jlipps, johnchen0, juangj, jugglinmike, lanwei22, manoj9788, marcoscaceres, mjzffr, plehegar, sevaseva, shekyan, shs96c, sideshowbarker, thejohnjansen, titusfortner, vkatsikaros, whimboo

webdriver's Issues

Order of error checks in spec incompatible with proxy implementations

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29218

James Graham:

"1. If the current browsing context is no longer open, return error with error code no such window.

2. Handle any user prompts, and return its value if it is an error.
3. Let cookie be the result of getting a property named "cookie" from the parameters argument."

This pattern doesn't work with a proxy implementation since it must read the full request before communicating with the backend that can know things like whether the browsing context is still open. Also the browsing context may close whilst the request is being read. So generally it seems better to delay these checks until after the request is fully processed.
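
The ordering concern can be sketched in a few lines of Java. This is only an illustration under assumptions: the class `AddCookieSketch`, the `Backend` interface and the use of plain strings for error codes are hypothetical, and "invalid argument" is used here as the error for a malformed request. The point is that checks needing only the request body run first, and the check that needs the backend's state runs last.

```java
import java.util.Map;

/** Hypothetical sketch: validate the request first, then consult the backend. */
final class AddCookieSketch {
    /** Stand-in for browser-side state that only the backend can know. */
    interface Backend {
        boolean isBrowsingContextOpen();
    }

    /** Returns an error code string, or "success". */
    static String handle(Map<String, Object> parameters, Backend backend) {
        // Checks that need only the (fully read) request body come first.
        Object cookie = parameters.get("cookie");
        if (!(cookie instanceof Map)) {
            return "invalid argument";
        }
        // Checks that need the backend's current state are deferred.
        if (!backend.isBrowsingContextOpen()) {
            return "no such window";
        }
        return "success";
    }
}
```

With this order a proxy can reject a malformed request without contacting the backend, and the browsing-context check happens as late as possible.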

Describe data structures coming over the wire with a JSON schema instead of WebIDL

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26707

Andreas Tolfsen:

The WindowSize dictionary [1] uses the data type double for height/width, but this isn't supported in JSON, which only has a number type [2] (which may be extracted to an integer or float in the local end).

(Furthermore ElementRect may use floats for the positioning of elements in the DOM, e.g. .5px, but WindowSize uses integers as no window managers support half pixels.)

  1. https://dvcs.w3.org/hg/webdriver/raw-file/tip/webdriver-spec.html#dictionary-windowsize-members
  2. http://www.ietf.org/rfc/rfc4627.txt

Extend the find-by-link-text location strategy to apply to all elements

https://www.w3.org/Bugs/Public/show_bug.cgi?id=24847

Wilhelm Joys Andersen:

Many of the interactive elements in web applications these days are not links, but arbitrary elements with event handlers. To test interaction with such elements, test authors often resort to using XPath.

XPath is an undesirable anti-pattern that should be killed with fire. To be able to fry it, we must first cater to its use cases and make it obsolete.

The first step can be to allow any element to be selected by its (visible) text:

https://dvcs.w3.org/hg/webdriver/raw-file/default/webdriver-spec.html#link-text

Switch to Window doesn't specify what to do with the current browsing context

It sets the current top-level browsing context, but not the current browsing context.

The only real ambiguity comes when "Switch to Window" is called with the handle of the current top-level browsing context (i.e., is a no-op):

  • Marionette does not change the current browsing context.
  • All other driver implementations I'm aware of (including Microsoft WebDriver and Apple SafariDriver) set the current browsing context to the top-level context, which seems to me like the right approach.
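
A minimal sketch of the second behaviour, assuming a hypothetical `SessionContexts` class that tracks both contexts as plain string identifiers (none of these names come from the specification):

```java
/** Hypothetical session state for the "Switch to Window" discussion. */
final class SessionContexts {
    private String currentTopLevel; // handle of the current top-level browsing context
    private String currentContext;  // id of the current (possibly nested) browsing context

    SessionContexts(String topLevelHandle) {
        this.currentTopLevel = topLevelHandle;
        this.currentContext = topLevelHandle;
    }

    /** Models switching into a frame of the current window. */
    void switchToFrame(String frameId) {
        currentContext = frameId;
    }

    /** Sets both contexts, even when the handle is already the current one. */
    void switchToWindow(String handle) {
        currentTopLevel = handle;
        currentContext = handle; // reset to the top-level context
    }

    String currentTopLevel() { return currentTopLevel; }

    String currentContext() { return currentContext; }
}
```

Under this model, switching to the already-current window after entering a frame is not a no-op: it moves the current browsing context back to the top level.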

Define what response should be sent when an alert is open

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26962

Andreas Tolfsen:

Section 5.2.1 says that “If any modal dialog box, such as those opened by on window.onbeforeunload or window.alert, is opened at any point in the page load, a response MUST be sent.”

Because we need to do this as a precondition for almost all commands I suggest we make it a definition that each command's algorithm can refer to.

The language also needs to be cleaned up, and I suggest something along the lines of:

  • Define a global state that signifies whether an alert dialogue is open.
  • Create a definition of what steps to take when the previous state is true, including the steps to populate the response with the correct status.
  • Add this as a precondition to all commands where we need to check for this.

I imagine we can use a language like this for the POST /session/{session_id}/url command:

“All alert dialogs created during beforeunload are subject to unexpected alert handling.”

And a definition of alert dialogs:

“Window.alert, Window.confirm, and Window.prompt are considered alert dialogs.”

Then some text on how to handle the dialogs:

“Alert dialogs block document script execution and WebDriver behaves the same way. When alert dialogs are created commands are free to choose if they should affect their response. The following steps may be run when a command requests unexpected alert handling on request:

  • If the current alert is defined:
    • Let response's status be unexpected alert open.
    • Return response and abort the remaining steps.
  • Otherwise, return.”

Then the definition of “current alert”:

“The remote end must keep a global state, current alert, that is initially undefined. When any of the alert dialogs appear, this state must be updated with a reference to that alert.”

And then we need a definition of an “alert” struct which we can use in the algorithm for interacting with the alert.
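
The proposed state and steps might look roughly like the following. This is a sketch under assumptions: `AlertState` is a hypothetical name, the alert is reduced to its text, and `null` stands in for "undefined".

```java
import java.util.Optional;

/** Hypothetical sketch of the proposed "current alert" state. */
final class AlertState {
    private String currentAlert; // null models "undefined"

    /** Called when Window.alert, Window.confirm or Window.prompt opens a dialog. */
    void alertOpened(String text) {
        currentAlert = text;
    }

    /** Called when the dialog is accepted or dismissed. */
    void alertClosed() {
        currentAlert = null;
    }

    /**
     * Unexpected alert handling: returns the error status if an alert is
     * open, or an empty Optional so the calling command can continue.
     */
    Optional<String> handleUnexpectedAlert() {
        if (currentAlert != null) {
            return Optional.of("unexpected alert open");
        }
        return Optional.empty();
    }
}
```

A command that requests unexpected alert handling would call `handleUnexpectedAlert()` first and return the status immediately if one is present.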

Use lower-case for screen orientation arguments

The open source wire protocol found in the Selenium project defines
permitted screen orientation arguments to be PORTRAIT and LANDSCAPE
(in upper casing).

My suggestion is that we specify this section to allow
case-insensitive arguments to this command.

Setting orientation to secondary view angles

https://www.w3.org/Bugs/Public/show_bug.cgi?id=23949

Andreas Tolfsen:

WebDriver currently supports setting the screen orientation to either
portrait or landscape mode on devices that support this configuration.
Many devices support further rotation by 90° so that the top of the
device aligns with the bottom border of the viewport.

In Android this is referred to as secondary orientations. My
suggestion is to use *-primary and *-secondary as optional additions
to specify the type of orientation. This would recognize:

PORTRAIT
LANDSCAPE
PORTRAIT-PRIMARY
LANDSCAPE-PRIMARY
PORTRAIT-SECONDARY
LANDSCAPE-SECONDARY

The PORTRAIT and LANDSCAPE orientations would default to
PORTRAIT-PRIMARY and LANDSCAPE-PRIMARY.
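
As a rough sketch, the six values and the defaulting rule could be normalized as below. The class and method names are hypothetical, and treating the argument case-insensitively is an assumption taken from the related suggestion about case, not something this proposal states.

```java
import java.util.Locale;

/** Hypothetical normalizer for screen orientation arguments. */
final class OrientationSketch {
    /** Returns the canonical orientation, or throws for unknown values. */
    static String normalize(String argument) {
        // Assumption: arguments are matched case-insensitively.
        String value = argument.toUpperCase(Locale.ROOT);
        switch (value) {
            case "PORTRAIT":
                return "PORTRAIT-PRIMARY";   // bare value defaults to primary
            case "LANDSCAPE":
                return "LANDSCAPE-PRIMARY";  // bare value defaults to primary
            case "PORTRAIT-PRIMARY":
            case "LANDSCAPE-PRIMARY":
            case "PORTRAIT-SECONDARY":
            case "LANDSCAPE-SECONDARY":
                return value;
            default:
                throw new IllegalArgumentException("Unknown orientation: " + argument);
        }
    }
}
```

For example, `OrientationSketch.normalize("portrait")` yields `PORTRAIT-PRIMARY`, while an unrecognized value is rejected.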

Investigate defining what a response is

We define what a command is but we are only loosely talking about responses. In fact, we talk about two different types of responses: those that are results of executing WebDriver commands, and those that are returned as part of running the navigate algorithm from HTML.

Order window handles

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29003

Andreas Tolfsen:

Quoting thephilwells from #39:

Using the Java API, one should be able to do this

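// getWindowHandles() returns a Set<String>; copy it into a List to index it.
List<String> handles = new ArrayList<>(driver.getWindowHandles());
driver.switchTo().window(handles.get(1));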

...and be able to reliably expect that they are viewing the second-oldest open window. This matters most when more than two windows are open, as when one needs to open more than one new window from the original window.
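
One way a remote end could provide this guarantee is to remember handles in the order the windows were opened. The following is a minimal sketch with a hypothetical `WindowRegistry` class; it does not describe how any existing driver stores its handles.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical registry that remembers windows in the order they were opened. */
final class WindowRegistry {
    // LinkedHashSet iterates in insertion order, so older windows come first.
    private final Set<String> handles = new LinkedHashSet<>();

    void windowOpened(String handle) {
        handles.add(handle);
    }

    void windowClosed(String handle) {
        handles.remove(handle);
    }

    /** Returns the open window handles, oldest first. */
    List<String> getWindowHandles() {
        return new ArrayList<>(handles);
    }
}
```

With such an ordering, index 1 of the returned list is always the second-oldest window that is still open.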

Need a way to enable / disable networking from webdriver

https://www.w3.org/Bugs/Public/show_bug.cgi?id=25179

James Graham:

The new generation of web apps are expected to continue to function when networking is not available for whatever reason. Obviously this is particularly important on mobile devices to close the gap between native apps and web apps.

At present it isn't possible to test the behaviour of an app when it is offline, or the transition between online and offline or vice-versa. This substantially decreases the utility of WebDriver for testing contemporary web applications. Neither is it possible to write testsuites for the features underlying offline support (AppCache, Service Worker), substantially increasing the chance of buggy or non-interoperable implementations.

The most obvious way to provide this would be to expose an API to webdriver that would allow disabling "content" networking i.e. from the point of view of the webpage it would look like the browser was offline, but privileged code (in particular webdriver itself) would still be able to perform network operations.

Does it make sense to require the use of "unknown error"?

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29168

juangj:

From 8. Invalid SSL Certificates:

"In this case, implementations may choose to make accessing a site with bad HTTPS configurations cause a WebDriverException to be thrown. Remote end implementations must return an unknown error error code in this case."

It seems odd to say that the remote end MUST return an "unknown error" in a case where we know exactly what the error is. There doesn't seem to be any other fitting error code, though.

Is "unknown error" intended to just be the catch-all for errors that don't fit any of the other error codes?

Element location strategies for link text and partial link text references example

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26487

Andreas Tolfsen:

The element location strategies for link text [1] and partial link text [2] reference examples in pseudocode that don't make any sense. Besides, this uses the very misleading term “visible text”, which is not further defined.

It clearly borrows the algorithm used for getting an element's text [3], so perhaps this algorithm should be generalized?

  1. https://dvcs.w3.org/hg/webdriver/raw-file/tip/webdriver-spec.html#link-text
  2. https://dvcs.w3.org/hg/webdriver/raw-file/tip/webdriver-spec.html#partial-link-text
  3. https://dvcs.w3.org/hg/webdriver/raw-file/tip/webdriver-spec.html#widl-WebElement-getElementText-DOMString

maximizeWindow inaccurately talks about resizing a window

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26711

Andreas Tolfsen:

The prose on maximizing the window talks about resizing the window, which doesn't make sense in this context. There is already a precondition that the window manager understands the concept of maximizing the window for the command to succeed; if it does, it's implicitly understood that the window will be resized to whatever the window manager considers a maximized window.

Namespacing in capabilities

We support namespacing with extension commands, where UAs are encouraged to use a vendor-specific prefix to separate their additional endpoints from those of the specification. The specification in return makes the guarantee never to specify anything that will conflict with that path namespace.

Similarly we should do this for the capabilities object. For keys such as firefoxOptions and chromeOptions it mostly does not matter, since they are unlikely ever to conflict with or intrude on a future reserved keyword. But one can imagine a scenario where different intermediary nodes accept authentication credentials, and their username and password fields could conflict.

Clarify whether sessionId is optional from the perspective of the local end before newSession

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27766

Andreas Tolfsen:

Section 2.1 says that sessionId by default is null, but the WebIDL marks it as optional.

It's called out explicitly on the other parameters to the command object which are to be provided by the local end, but it's unclear if this applies to sessionId.

The question is whether sessionId should be undefined or null when calling newSession.

Since the spec allows passing a sessionId in to newSession, it follows that local ends shouldn't send null to mean “undefined”, since section 4.1.1 talks about whether the field is “set” or not.

Should be possible to return errors with executeAsyncScript

https://www.w3.org/Bugs/Public/show_bug.cgi?id=28060

jleyba:

The callback provided to the executeAsyncScript function only accepts a single argument that is always treated as a successful completion. It should be possible for users to call this function with an error to indicate their script failed.

Node.js has popularized the "Error-first" callback approach: errors are passed as the first argument, successful values the second.

Another option would be to standardize on the Error-type. If the callback is invoked with an instanceof Error, the script is marked as a failure.
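
The two options can be contrasted with a small sketch of how a remote end might interpret the callback's arguments. Everything here is hypothetical: `AsyncScriptResult` is an invented name, and Java's `Throwable` merely stands in for a JavaScript `Error` instance.

```java
/** Hypothetical result of an async script, built from the callback's arguments. */
final class AsyncScriptResult {
    final boolean success;
    final Object payload; // the value on success, the error on failure

    private AsyncScriptResult(boolean success, Object payload) {
        this.success = success;
        this.payload = payload;
    }

    /** Error-first convention: callback(error, value). */
    static AsyncScriptResult fromErrorFirst(Object error, Object value) {
        return error != null
                ? new AsyncScriptResult(false, error)
                : new AsyncScriptResult(true, value);
    }

    /** Error-type convention: a single argument that fails if it is an error. */
    static AsyncScriptResult fromSingleArgument(Object argument) {
        // Throwable stands in for "instanceof Error" on the JavaScript side.
        return argument instanceof Throwable
                ? new AsyncScriptResult(false, argument)
                : new AsyncScriptResult(true, argument);
    }
}
```

The error-first form changes the callback's signature, whereas the error-type form keeps the single argument and only changes how it is classified.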

Unclear what happens if maximizeWindow maximizes window and is called again

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26710

Andreas Tolfsen:

If maximizing the browser window is supported and the window gets maximized upon calling maximizeWindow or the window is somehow already maximized and maximizing the window is supported, it's unclear from reading the spec what happens if it's called again.

Should calling it again de-maximize it, that is defer to the window manager what position and dimensions to set the window to? In this case, should the command really be called “maximize” when what it does is toggling?

Proposal: Shadow DOM support in WebDriver

Marc Fisher:

The current proposal for dealing with Web Components and their corresponding Shadow DOMs in WebDriver treats them similarly to frames; at any point in time the WebDriver session is operating within a particular DOM, either the top-level DOM or one of the Shadow DOMs of a particular Web Component. However, this makes interacting with pages that use Web Components extremely taxing, as the current DOM will have to be switched on a regular basis, and it is unclear over what time frame an element id found within a particular DOM will be considered valid. Instead I propose that the WebDriver wire protocol be extended with commands to get the list of attached Shadow DOMs for an element as opaque IDs, and to support new element and elements commands that are scoped to a particular DOM. Additionally, element IDs from Shadow DOMs should remain fully accessible as long as the corresponding element is attached to the Shadow DOM and the Shadow DOM is attached to the page.

For a more thorough description see:
https://docs.google.com/document/d/1qP7Se3MDUac5P0V1Kfm2yaj3fFhBOFCyWyXLcsVwkTA/edit?usp=sharing

getWindowHandles() should return a list of windows ordered by window age

Using the Java API, one should be able to do this

// getWindowHandles() returns a Set<String>; copy it into a List to index it.
List<String> handles = new ArrayList<>(driver.getWindowHandles());
driver.switchTo().window(handles.get(1));

...and be able to reliably expect that they are viewing the second-oldest open window. This matters most when more than two windows are open, as when one needs to open more than one new window from the original window.

implicit timeout does not return a timeout response

https://www.w3.org/Bugs/Public/show_bug.cgi?id=28756

John Jansen:

CURRENTLY:
"implicit - Set the amount of time the driver should wait when searching for elements. When searching for a single element, the driver should poll the page until an element is found or the timeout expires, whichever occurs first. When searching for multiple elements, the driver should poll the page until at least one element is found or the timeout expires, at which point it should return an empty list."

EXPECTED:
implicit - Set the amount of time the driver SHOULD wait when searching for elements. When searching for a single element, the driver should poll the page until an element is found or the timeout expires, whichever occurs first. When searching for multiple elements, the driver should poll the page until at least one element is found or the timeout expires. If the timeout expires before the driver has finished polling the page, the driver MUST return a "timeout" response.
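
The difference between the two wordings can be sketched as a polling loop with a flag. This is an illustration only: `ImplicitWaitSketch`, `TimeoutResponse`, the `failOnTimeout` flag and the 10 ms polling interval are all assumptions, and a `Supplier` stands in for the driver's element search.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical polling loop for the multi-element implicit wait. */
final class ImplicitWaitSketch {
    /** Unchecked stand-in for a "timeout" response. */
    static final class TimeoutResponse extends RuntimeException {
        TimeoutResponse(String message) {
            super(message);
        }
    }

    /**
     * Polls finder until it yields at least one element or timeoutMillis
     * elapses. On expiry it returns an empty list (current wording) or
     * throws TimeoutResponse (expected wording), depending on failOnTimeout.
     */
    static <T> List<T> findAll(Supplier<List<T>> finder, long timeoutMillis, boolean failOnTimeout) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            List<T> found = finder.get();
            if (!found.isEmpty()) {
                return found;
            }
            if (System.nanoTime() - deadline >= 0) {
                if (failOnTimeout) {
                    throw new TimeoutResponse("timeout");
                }
                return Collections.emptyList();
            }
            try {
                Thread.sleep(10); // polling interval, an arbitrary choice
            } catch (InterruptedException e) {
                // Simplification: an interrupted wait ends with an empty result.
                Thread.currentThread().interrupt();
                return Collections.emptyList();
            }
        }
    }
}
```

The finder is always called at least once, so a zero timeout still performs a single search before the timeout branch is taken.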

Need clarification on JavaScript execution when Content Security Policy is in place

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27223

Jim Evans:

If a page has a Content Security Policy applied (spec: https://w3c.github.io/webappsec/specs/content-security-policy/), it may prevent the execution of user-supplied JavaScript via the executeScript command. This is because the injected JavaScript would have no source which could be validated by the policy. The WebDriver spec should have language describing how a driver should behave in this event.

Missing text/selection manipulation primitives

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29135

[email protected]:

As far as I could see, the WebDriver spec currently provides very little in terms of emulating textual manipulations.

NOTE: I will use the term "insertion point" to refer to the textual cursor within e.g. a text box, to differentiate it from the "pointer" cursor

Current Provisions

  • the entire textual content of an element can be retrieved
  • it is possible to [clear] an element or [sendKeys] to it (emulating keyboard input)
  • implicitly, the insertion point and selection can be manipulated using actions (click and pointerDown/pointerMove/pointerUp).

Primary Issues

Pointer actions work in terms of offsets, but as far as I could tell

  • the specification provides no way to perform textual matching and transform that into bounding boxes, thus no way to easily position the insertion point or draw selections
  • the specification provides no way to query the insertion point or selection for position or bounding box, thus no way to get simple feedback while probing blindly

Use case

Test/demonstrate RTEs or other contenteditable elements, allow cross-platform text insertion within existing textual nodes rather than just around them

Possible solutions?

Rect textRect(needle[, element][, skip])

  • would return the same thing as Element Rect ({x, y, width, height} relative to the document element).
    • would only match visible text (so text contained in a visible element)
    • would generate an error if no matching visible text is found?
  • needle would be the text to look for, possibly a regex? The specification does not currently use regex anywhere so that might be a bit much.
  • skip would probably be necessary as the reference text could occur multiple times in the source.
  • a WebElement "root reference" would probably allow easier precise matching and less skipping.
  • Testing Chrome, Firefox and Safari on OSX, selecting a glyph requires going through the majority of the glyph so selecting from a textual boundary won't risk selecting the preceding glyph.
  • It's somewhat inconvenient for single-letter boundary selections though as there might be need for lots of skipping.
  • It doesn't try to count characters/glyphs and thus might help avoid possible confusion issues with respect to code units, normalisation (maybe?), codepoints and glyphs at the interface-level (these concerns may have to be handled at the spec level though).
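
The role of the skip parameter can be illustrated on plain strings. The sketch below is hypothetical and covers only the matching step: it finds the start of the occurrence that remains after skipping a number of earlier matches, and does not compute any rectangle or consider visibility.

```java
/** Hypothetical helper showing how "skip" could select among repeated matches. */
final class TextMatchSketch {
    /**
     * Returns the start index of the occurrence of needle in haystack after
     * skipping the first `skip` matches, or -1 if there is no such occurrence.
     */
    static int indexAfterSkipping(String haystack, String needle, int skip) {
        if (needle.isEmpty() || skip < 0) {
            return -1;
        }
        int from = 0;
        while (true) {
            int index = haystack.indexOf(needle, from);
            if (index < 0) {
                return -1;
            }
            if (skip == 0) {
                return index;
            }
            skip--;
            from = index + needle.length(); // matches are counted without overlap
        }
    }
}
```

Whether matches should be allowed to overlap, and how indices map to glyphs, are exactly the kind of questions the proposal leaves open.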

Unknowns for this possible solution

  • would/should it be possible to match text across multiple elements? This is possible for users e.g. my browser's in-page search will find a match for "requests | preferences" on the current page even though that spans two links and a span in two separate list elements.
  • would/should the rect be augmented with the text's container element(s) in the style of a DOM Range? It doesn't seem to make much sense from a user-interaction perspective.

Unsolved

Should it be possible to query the current selection's span/rect as well, independently from arbitrary text? I don't have a use-case for that right now but a "living" user would see the current text selection displayed in the UI so it could make sense.

Webdriver command batches suggestion

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26266

Anton Zhuravsky:

Hi guys,

I have recently encountered a challenge while implementing automated tests and think we can design a pretty useful generic feature out of it.

The idea is pretty simple: create an interface to be able to create a batch of commands and get a notification upon their completion. So, basically the approach is:

  1. An API consumer requests WebDriver to notice batch start
  2. WebDriver replies with a unique batch id
  3. All the commands executed later on are associated with this batch
  4. An API consumer requests WebDriver to notice batch finish
  5. An API consumer is able to ask if the batched commands have been completed
  6. (Optional) WebDriver can notify API consumer about batch finalization

The motivation behind this is pretty simple: executing some actions can produce delayed side effects (sending an AJAX request and handling a callback once it finishes; submitting a form into an iframe and handling the onload event; etc.). Unfortunately, there is currently no way to know that a command (or a set of commands) has finished working completely (including all side effects produced directly or indirectly by it).

Why do we need this in the real world? Well, a number of examples can be provided: the simplest is writing automated tests for JavaScript code, which requires asserting some values only after all asynchronous activity has completed.

Please note this is different from page loading modes, as the effects are not limited to network / parser / DOM – one can schedule delayed execution (via setTimeout) and aim to check that it worked properly (of course, only after the setTimeout callback has fired), which the proposed functionality handles.

If anyone could provide his thoughts on the suggestion it would be great – I am keen to discuss and polish the design of this feature :)

maximizeWindow needs an algorithm

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26712

Andreas Tolfsen:

The maximize window section needs an algorithm as it's currently very confusing how it should be implemented by the driver.

Specifically the first paragraph talks about whether the window manager understands the concept of maximization; presumably the driver should return with an error if this isn't supported by the WM.

(It also talks about the return type void which isn't accurate. Also bug 26711.)

Ordering of array of web elements returned is undefined

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26706

Andreas Tolfsen:

The only element location strategy that mentions ordering of the returned web elements is the CSS selector in section 9.2.1, and it uses a reference to querySelectorAll. If drivers choose to implement this differently it feels to me that we should define the sorting order explicitly or reference the ECMAScript specification's.

This is further complicated by the other location strategies not having ordering defined.

Link to relevant section:

https://dvcs.w3.org/hg/webdriver/raw-file/tip/webdriver-spec.html#element-location-strategies

Treatment of DOM Text Nodes in mixed content nodes

Terminology: "Mixed content nodes" is used in its meaning defined by XML spec 1.0 5th edition, "DOM text nodes" has the meaning defined by DOM Core spec


Problem description

The entire WebDriver specification addresses retrieving, checking the visibility of, accessing and interacting with 'WebElement's (the concept/representation of DOM element nodes), while all of these traits remain undefined for DOM text nodes in mixed content.

Some examples where the need to access text nodes in mixed content may arise. Problem: get the text of <div id="one"> without including the content of the child blockquotes:

  This will result in plenty of white-space Text nodes, highly probable 
  without special meaning for testing
  <div id='one'>
    <blockquote id='two'>dolore ipsum</blockquote>
    Ah, the pain itself
    <blockquote id='three'>Ah, the pain</blockquote>
 </div>

or

  The non-breakable spaces may carry meaning relevant for testing
  <div id='one'>
    <blockquote id='two'>dolore ipsum</blockquote>&nbsp;&nbsp;Ah, the pain itself
 </div>

The fact that retrieving/accessing the content of text nodes in mixed content is conflated into WebElement.innerText as the sole solution makes it hard to separate the text nodes' content from the content of child elements. Various coping strategies may be found, ranging from:

  1. easy to implement but not exact - e.g. iterate and subtract the inner text of children WebElements from the inner text of the parent (may lead to a great number of whitespace-only text nodes with content that cannot be separated by spaces with relevance for testing - see second example),
  2. more effective but harder to implement and maybe not supported by all browsers - e.g. via scripts able to access the actual Web Page DOM, "injected" via Execute/Execute Async, keeping in mind that XPath document.evaluate is not supported by a wide range of Internet Explorer versions.
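
The first strategy, and why it is inexact, can be shown on plain strings. This is a sketch under assumptions: `OwnTextSketch` is a hypothetical helper, and the parent's text is assumed to be rendered with one line per block.

```java
import java.util.List;

/** Hypothetical helper for the "subtract the children's text" strategy. */
final class OwnTextSketch {
    /**
     * Approximates an element's own text by removing the first occurrence of
     * each child's text from the parent's text, in order.
     */
    static String subtractChildren(String parentText, List<String> childTexts) {
        String remaining = parentText;
        for (String child : childTexts) {
            int index = remaining.indexOf(child);
            if (index >= 0) {
                remaining = remaining.substring(0, index)
                        + remaining.substring(index + child.length());
            }
        }
        return remaining.trim();
    }
}
```

Applied to the first example, with parent text "dolore ipsum\nAh, the pain itself\nAh, the pain" and child texts "dolore ipsum" and "Ah, the pain", the second subtraction matches inside the parent's own text node, so the result is "itself\nAh, the pain" rather than the intended "Ah, the pain itself".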

Furthermore, the inability to reference individual text nodes in mixed content also restricts the XPath expressions that a WebDriver may accept to only those able to return a WebElement.

The current specification also fails to address the expected reaction of the WebDriver when presented with a valid XPath expression that does not result in a reference to a WebElement (e.g. an Element attribute or Text node).

In the context of the examples above, the following is the result of using the Selenium WebDriver (Java API):

driver.findElement(By.xpath("//*[@id='one']/text()"));

org.openqa.selenium.InvalidSelectorException: invalid selector: 
The result of the xpath expression "//*[@id='one']/text()" is: [object Text].
It should be an element.

The need to access text nodes in mixed content seems to exist in the industry with indications that it is not an uncommon need and all work-around approaches mentioned above are taken (including the Script execution - which extends the range of cross-browser support of document.evaluate by using a TreeWalker).

An enhancement request raised against the Selenium WebDriver got a response indicating that the same request would need to be lodged with all the WebDriver providers, the missing functionality being traced to the lack of clarity in the WebDriver specification for handling such cases.


A minimal-change suggestion

The following is thought to be a possible solution to the need without introducing new concepts/interfaces into the specification, but only by enhancing the behaviour of existing ones.

  1. ~~~WebElement should implement a way of accessing *text values* of any of the children nodes individually, no matter if text-type or element-type children (maybe supported by `GET /session/{session id}/element/{element id}/child-text/{child-index}` ???) - at least with this functionality in place, applying set differences operations between the texts of all children and the inner-text of element children may yield the set of text nodes' content on individual basis (unpleasant as it may be to do it a every time one needs to individualize text nodes content).~~~
    It is somewhat hard to find a solution to accessing the content of the text nodes in mixed content without introducing new concepts/interfaces, especially because
    • the interleaving order of text nodes with element nodes (or indeed, other types of nodes) may be significant
    • there is no guarantee that text content and element content can be distinguished by their textual content only
    As the author of the present issue doesn't have deep enough knowledge of the WebDriver specification, the suggested approach is formulated in terms of Java method specs (assuming the Selenium WebDriver as a reference implementation):
    interface WebElement {
      // Already specified by Selenium WebDriver.
      // Allows obtaining all the WebElement children by, for example, using By.xpath("./*")
      java.util.List<WebElement> findElements(By by);

      /** Proposed extension: if the parent WebElement is of a mixed-content type and
      * there are sibling nodes of text type preceding this node, the method will return
      * the content of these text type nodes in the natural order of appearance in the
      * document (that is, the last element in this list is the closest to this WebElement).
      * Otherwise returns an empty list.
      * The method should have the same effect as applying an XPath selection of
      * 'preceding-sibling::text()' except for the lack of preceding-sibling::
      * axis inversion.
      */
      java.util.List<String> getPrecedingTextNodes();

      /** Proposed extension: if the parent WebElement is of a mixed-content type and
      * there are sibling nodes of text type following this node, the method will return
      * the content of these text type nodes. Otherwise returns an empty list.
      * The method should have the same effect as applying an XPath selection of
      * 'following-sibling::text()'
      */
      java.util.List<String> getFollowingTextNodes();

      /** Proposed extension: if the (assumed common) WebElement parent of the two
      * parameters is of a mixed-content type, the method will return the content of
      * the text nodes occurring between the two, in the natural order of
      * appearance in the document.
      * The value in the first position of the returned list will be the content of
      * the closest text node to the element represented by the first parameter, the last
      * value of the returned list is the closest to the second one.
      * If the two nodes are presented in the reversed order from the order established
      * by their natural position in the context of their parent, the return of the method
      * is an empty list.
      * If the first parameter is null, the result of this method is the same as calling
      * the getPrecedingTextNodes method for the second parameter.
      * If the second parameter is null, the result of this method is the same as calling
      * the getFollowingTextNodes method for the first parameter.
      * If the two nodes represented by the parameters are not children of this WebElement,
      * the method throws.
      *
      * Note: except for testing of direct parent-ship of this, the result of this method should
      * be equivalent to
      * child1.getFollowingTextNodes().retainAll(child2.getPrecedingTextNodes())
      */
      java.util.List<String> getTextNodesBetween(WebElement child1, WebElement child2);

    }

    Note: even if two consecutive sibling WebElement nodes are passed to
    the getTextNodesBetween method, the method can return a list with more than
    one entry when other node types present in the document (comment and
    processing-instruction nodes) break the flow of the text.

    Note: of course, a WebDriver API that introduced specific representations for text(), comment() and processing-instruction() nodes would be an exact DOM model of the represented Web page and would thus open up opportunities for richer automation logic (not based solely on artefacts producing a visual representation on the screen). But this would take the present proposal beyond the *minimal-change suggestion* scope announced in this section.

  2. The behaviour of WebDriver when referencing non-WebElements through XPath/XPointer should be changed to return the closest parent WebElement rather than signalling an error. Examples of XPath selectors for which the parent element would be returned instead: //*[@id=]/@name or //*[@id=]/text() or //*[@id=]/text()[] .
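
The semantics of the three text-node methods proposed in item 1 can be sketched on a plain in-memory list of child nodes. This is only an illustration under stated assumptions: `MixedContentSketch`, `Node`, `Kind` and the index-based method signatures are hypothetical stand-ins invented here, not part of Selenium or the WebDriver specification, and children are addressed by position rather than by WebElement reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of one parent's child nodes; not a Selenium or WebDriver API. */
public class MixedContentSketch {
    public enum Kind { ELEMENT, TEXT, COMMENT }

    /** One child node: its kind plus its text content (or tag name for elements). */
    public static final class Node {
        public final Kind kind;
        public final String value;
        public Node(Kind kind, String value) { this.kind = kind; this.value = value; }
    }

    /** Text siblings before children[index], in document order (closest one last). */
    public static List<String> precedingTextNodes(List<Node> children, int index) {
        return textNodesInRange(children, 0, index);
    }

    /** Text siblings after children[index], in document order. */
    public static List<String> followingTextNodes(List<Node> children, int index) {
        return textNodesInRange(children, index + 1, children.size());
    }

    /** Text nodes strictly between two children; empty if they are given in reversed order. */
    public static List<String> textNodesBetween(List<Node> children, int first, int second) {
        if (first >= second) {
            return new ArrayList<>();
        }
        return textNodesInRange(children, first + 1, second);
    }

    private static List<String> textNodesInRange(List<Node> children, int from, int to) {
        List<String> out = new ArrayList<>();
        for (int i = from; i < to; i++) {
            if (children.get(i).kind == Kind.TEXT) {
                out.add(children.get(i).value);
            }
        }
        return out;
    }

    /** Children of: <p>Hello <b>big</b> wide<!-- note --> world <i>!</i></p> */
    public static List<Node> sample() {
        return Arrays.asList(
            new Node(Kind.TEXT, "Hello "),
            new Node(Kind.ELEMENT, "b"),
            new Node(Kind.TEXT, " wide"),
            new Node(Kind.COMMENT, " note "),
            new Node(Kind.TEXT, " world "),
            new Node(Kind.ELEMENT, "i"));
    }
}
```

With the sample children, `textNodesBetween(sample(), 1, 5)` yields two entries even though `<b>` and `<i>` are consecutive sibling elements, because the comment node splits the text between them.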

POST /session/{id}/timeouts should take an array of timeouts

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26613

Andreas Tolfsen:

Currently POST /session/{id}/timeouts takes a hash map of {"type": TYPE, "ms": N} which allows setting individual timeouts.

If we consider a local end client binding that wants to set them all at once (pseudocode):

driver.timeouts = [{type: "page load", ms: 123},
                   {type: "implicit", ms: 456},
                   {type: "script", ms: 789}]

This will currently require them to make three individual calls to the endpoint.

An optimization is to allow the endpoint to take an array of dicts instead:

[{"type": TYPE, "ms": N}, …]
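
To make the difference concrete, here is a minimal sketch of the request bodies a client binding might build in each case: one body per timeout for the current endpoint, and a single array body for the proposed one. The class and method names (`TimeoutPayloadSketch`, `individualBodies`, `batchedBody`) are hypothetical, no HTTP request is sent, and the JSON is assembled by hand on the assumption that timeout type names need no escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that only builds request bodies; it does not talk to a driver. */
public class TimeoutPayloadSketch {

    /** One body per timeout, i.e. one POST per entry with the current endpoint. */
    public static List<String> individualBodies(Map<String, Integer> timeouts) {
        List<String> bodies = new ArrayList<>();
        for (Map.Entry<String, Integer> e : timeouts.entrySet()) {
            bodies.add(entry(e.getKey(), e.getValue()));
        }
        return bodies;
    }

    /** A single array body carrying every timeout, as the issue proposes. */
    public static String batchedBody(Map<String, Integer> timeouts) {
        return "[" + String.join(", ", individualBodies(timeouts)) + "]";
    }

    // Assumes the type name contains no characters that need JSON escaping.
    private static String entry(String type, int ms) {
        return "{\"type\": \"" + type + "\", \"ms\": " + ms + "}";
    }

    /** The three timeouts from the pseudocode above, in insertion order. */
    public static Map<String, Integer> sample() {
        Map<String, Integer> timeouts = new LinkedHashMap<>();
        timeouts.put("page load", 123);
        timeouts.put("implicit", 456);
        timeouts.put("script", 789);
        return timeouts;
    }
}
```

For the sample map, `individualBodies` returns three bodies (three round trips), while `batchedBody` returns one array that could be sent in a single call.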
