wicg / direct-sockets
Direct Sockets API for the web platform
License: Other
If the API is denied in private browsing modes, a site can easily detect whether a user is in a private browsing mode by simply attempting a connection: if it's denied, the site knows the user is in some private mode.
This could be used as a way to track users of private modes, or potentially as a 'browser fingerprinting' signal.
The specification right now contains:
To write a full specification, you'll need normative algorithm steps, paralleling implementation code, for all public APIs.
A good example of this for a streams-based spec is https://wicg.github.io/serial/. Notable places:
The requestPort() method shows how to normatively specify method steps, including the processing of dictionary members, checking for transient activation, checking for permissions policy, etc.
The readable getter steps show how to lazily construct a ReadableStream instance, with the appropriate pullAlgorithm and cancelAlgorithm. You might not want to do this lazily, but the stream construction example there is the right amount of detail.
Note how it's very precise about error conditions, task queuing, state management, etc. But it's appropriately vague about the actual getting-bytes parts, saying "Invoke the operating system to read up to desiredSize bytes from the port, putting the result in the byte sequence bytes." I.e. you don't need to explain to people how to use OS TCP libraries.
Similarly the writable getter steps for the WritableStream construction.
the TCP constructor defines keepAliveDelay as so:
https://wicg.github.io/direct-sockets/#tcpsocketoptions-dictionary
keepAliveDelay member
Enables TCP Keep-Alive by setting SO_KEEPALIVE option on the socket equal to the specified Keep-Alive ping delay in milliseconds.
i don't understand this as SO_KEEPALIVE is a boolean in the UNIX & Windows worlds, it isn't a ping delay in milliseconds.
so what system/API is this targeting, and how are implementers expected to support it ?
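To make the mismatch concrete, here is a minimal sketch of how an implementation might split a millisecond keepAliveDelay into the boolean SO_KEEPALIVE flag plus a separate idle-time setting (for example TCP_KEEPIDLE on Linux, which takes whole seconds). The KeepAlivePlan type and planKeepAlive function are hypothetical names for illustration and are not taken from the spec; rounding up to at least one second is also an assumption.

```typescript
// Hypothetical shape, not part of the Direct Sockets spec.
interface KeepAlivePlan {
  soKeepAlive: boolean;       // SO_KEEPALIVE is an on/off flag
  idleSeconds: number | null; // idle time before the first probe, e.g. TCP_KEEPIDLE on Linux
}

// Assumption: an absent keepAliveDelay means "leave keep-alive off".
function planKeepAlive(keepAliveDelayMs?: number): KeepAlivePlan {
  if (keepAliveDelayMs === undefined) {
    return { soKeepAlive: false, idleSeconds: null };
  }
  if (!Number.isFinite(keepAliveDelayMs) || keepAliveDelayMs <= 0) {
    throw new TypeError("keepAliveDelay must be a positive number of milliseconds");
  }
  // Assumption: round up so that a sub-second delay still yields a valid 1 s idle time.
  const idleSeconds = Math.max(1, Math.ceil(keepAliveDelayMs / 1000));
  return { soKeepAlive: true, idleSeconds };
}
```

Under this reading the delay never reaches SO_KEEPALIVE itself; it only decides whether the flag is set and what the platform-specific idle timer should be.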
Hello, I'm very enthusiastic about this API and was wondering what is the current approach for getting it to run?
I did try to read some discussion here: https://groups.google.com/a/chromium.org/g/blink-reviews/search?q=direct%20sockets but I can honestly say I don't understand the code or C++ well enough to follow what's being said.
Also, conservatively, when would the team expect to have something they could ship in the main Chromium? Even as a conditionally disabled feature? Very interested in this work, thank you.
There are basically two models for programming against UDP sockets:
This spec seems to have chosen (2), since it uses async iterables (which have a built-in buffer of that sort)... but it did so without using the streams API. If (2) is indeed the desired design, then I suggest using ReadableStream / WritableStream.
FWIW, we had a similar discussion in the early days of the streams API around https://github.com/sysapps/tcp-udp-sockets, which ended up using (2).
The current specification only supports UDP sockets which are bound to a remote address and port. To listen for incoming UDP packets from any host the remoteAddress and remotePort parameters must be made optional. This will also require adding support for specifying a remoteAddress and remotePort when constructing individual UDP messages.
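A rough sketch of what the two modes could mean for destination handling is below. The UdpSocketOptions and UdpMessage shapes and the resolveDestination helper are hypothetical, and rejecting a per-message destination on a socket that already has a remote address is an assumption of this sketch rather than anything the spec says.

```typescript
// Hypothetical option and message shapes for illustration only.
interface UdpSocketOptions {
  remoteAddress?: string;
  remotePort?: number;
}

interface UdpMessage {
  data: Uint8Array;
  remoteAddress?: string;
  remotePort?: number;
}

// Decide where a message should be sent, given how the socket was opened.
function resolveDestination(
  socket: UdpSocketOptions,
  message: UdpMessage
): { address: string; port: number } {
  const { remoteAddress, remotePort } = socket;
  if (remoteAddress !== undefined && remotePort !== undefined) {
    // Socket has a fixed peer. Assumption: per-message destinations are rejected here.
    if (message.remoteAddress !== undefined || message.remotePort !== undefined) {
      throw new TypeError("socket already has a remote address; message must not set one");
    }
    return { address: remoteAddress, port: remotePort };
  }
  // Socket has no fixed peer: every message has to name its own destination.
  if (message.remoteAddress === undefined || message.remotePort === undefined) {
    throw new TypeError("message needs remoteAddress and remotePort on a socket without a fixed peer");
  }
  return { address: message.remoteAddress, port: message.remotePort };
}
```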
https://github.com/WICG/raw-sockets/blob/master/docs/explainer.md lists "Distributed Hash Tables for P2P systems" as a use case, but the "security considerations" section seems to suggest that user consent would be required for each host. This combination seems infeasible.
Define a TCPServerSocket interface which provides a single ReadableStream where each "chunk" is a TCPSocket. Similar to WebTransport.incomingBidirectionalStreams.
E.g.:
Node.js' socket API allows for an allowHalfOpen option whose behavior is simply defined as:
If allowHalfOpen=false, when a socket receives a FIN from the peer, the socket will be automatically transitioned into a draining state where the writable side is closed, existing writes in the queue are permitted to drain, followed by a FIN sent immediately when the queue is drained.
If allowHalfOpen=true, when a socket receives a FIN from the peer, only the readable side of the socket is closed. The writable side remains open until the user code explicitly closes the stream.
At all times, the stream allows a writable-closed-readable-open state.
For Node.js sockets, the default is allowHalfOpen=false.
The key question here is whether this is something that y'all considered for direct sockets and ruled out, didn't consider at all, decided Node.js' behavior is wrong, etc.
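For reference, the behavior described above can be modeled as a small state machine. This is only a sketch of the description in this issue, not Node.js' actual implementation; the SideState and SocketState types and both function names are made up for illustration.

```typescript
// Hypothetical model of the two sides of a socket.
type SideState = "open" | "draining" | "closed";

interface SocketState {
  readable: SideState;
  writable: SideState;
}

// What happens when the peer's FIN arrives.
function onPeerFin(state: SocketState, allowHalfOpen: boolean): SocketState {
  if (allowHalfOpen || state.writable !== "open") {
    // Only the readable side closes; the writable side is left as it was.
    return { readable: "closed", writable: state.writable };
  }
  // allowHalfOpen=false: queued writes may still drain, then a FIN is sent.
  return { readable: "closed", writable: "draining" };
}

// What happens once the write queue is empty.
function onWriteQueueDrained(state: SocketState): SocketState {
  if (state.writable === "draining") {
    return { readable: state.readable, writable: "closed" };
  }
  return state;
}
```

With allowHalfOpen=false an open socket goes to readable-closed / writable-draining on FIN and ends fully closed after the drain; with allowHalfOpen=true it stays writable until user code closes it.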
Threat
Attackers may use the API to by-pass third parties' CORS policies.
Mitigation
We could forbid the API from being used for TCP with the well known HTTPS port, whenever the destination host supports CORS.
I don't get it. Isn't the whole point of CORS "Resource Sharing" (i.e. indirectly using resources like cookies belonging to another domain)? With direct sockets, no shared resource is being accessed; all the browser does is open a TCP connection (e.g. no cookie is accessed or sent anywhere).
I can understand that TCP connections could be abused by some websites (e.g. using your browser for spamming, accessing unsecured local services, etc.) but this can be solved with a permission style popup just like with the geolocation or webcam APIs.
But I don't get how it has anything to do with CORS or why some arbitrary ports should be blocked.
sadly, IPv6 / IPv4 dual stack reliability is still not a thing everywhere. as such, it can be helpful to let applications force a particular IP version when connecting via a hostname. the currently proposed API lacks such a knob. and since there's no DNS type of API available currently either, it's (practically speaking) impossible to work around.
can we add another property to TCPSocketOptions & UDPSocketOptions to control this ? would want to support at least 3 values -- any (the default), ipv4, and ipv6. i have no opinions as to how best to encode that in the standard.
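As a sketch of what such a knob might do once a hostname has been resolved, assuming the three values named above: the IpVersionPreference type, the ResolvedAddress shape, and filterByIpVersion are all hypothetical, and the real encoding would be up to the spec.

```typescript
// Hypothetical preference values, taken from the suggestion above.
type IpVersionPreference = "any" | "ipv4" | "ipv6";

// Hypothetical shape for one resolved candidate address.
interface ResolvedAddress {
  address: string;
  family: 4 | 6;
}

// Keep only the candidates that match the requested IP version.
function filterByIpVersion(
  candidates: readonly ResolvedAddress[],
  preference: IpVersionPreference = "any"
): ResolvedAddress[] {
  if (preference === "any") {
    return [...candidates];
  }
  const wanted = preference === "ipv4" ? 4 : 6;
  return candidates.filter((candidate) => candidate.family === wanted);
}
```

An empty result would then presumably surface as a connection error rather than a silent fallback to the other family.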
TCP is a byte stream transport so it should create a ReadableByteStream rather than a regular (non-byte) stream. This will also allow developers to create a "byob" stream reader and avoid allocating memory for receive buffers.
It seems the API would allow IP multicast, and if so it would help to describe some use cases for it. I'm not so closely involved any more but I did write a white paper, describing BBC R&D's work experimenting with multicast in the browser. This was based on a multicast profile of the QUIC transport protocol, available as an Internet-Draft: https://tools.ietf.org/html/draft-pardue-quic-http-mcast-06
@GrumpyOldTroll is pursuing browser multicast from a different angle, the MulticastReceiverAPI: https://github.com/GrumpyOldTroll/wicg-multicast-receiver-api/blob/master/explainer.md. So I'm pinging him here.
What is the current status of this proposal? Is there still an intent on moving it forward?
how open are we to exposing more socket properties ? i'd really like to be able to set IP_TOS for interactive connections.
On Windows versions >= Windows XP SP2 raw sockets not only require administrator privileges but are extremely limited to the point that sending TCP data over raw sockets is impossible. The standard way to work around that is to install a third party driver such as WinPcap which for obvious reasons is a potentially serious security risk, especially if the functionality of such a driver is exposed to ordinary users instead of being restricted to administrators only.
In addition, such drivers are typically detected and blocked by most antivirus/antimalware software and even if Chrome/Chromium came with its own such driver rather than just installing WinPcap it would almost certainly be flagged as a "Hacktool" by such security software as well.
APIs that allow attackers to connect to attacker-owned servers on arbitrary low ports are dangerous because such connections may be interpreted by routers/firewalls that attempt to enable reverse connections on dynamic, separate ports by sniffing application traffic.
This is why Flash banned traffic to low ports to fix CVE-2017-2938.
For an illustration of how such an attack might look, see e.g. https://samy.pl/slipstream/.
In order to be able to use raw sockets on Linux (and presumably BSD) a program must be run with root privileges which means making Chrome/Chromium suid root. Even if said privileges are dropped immediately after setting CAP_NET_RAW (and presumably CAP_NET_ADMIN for setting promiscuous mode and/or MAC spoofing) the security risks are still significantly higher than not having it run as root at all.
The security considerations section says that user consent would be host-specific, but also that DNS rebinding protection would be limited to preventing connections to "private network addresses". There are two big problems with this:
The most common method of having a page run JavaScript the page doesn't know about is to insert it in an ad (see, for instance, Geoff Huston's IPv6 penetration measurements, which are mainly done via Google ads).
The security implications of this form of code injection need to be explored.
Hi.
There is also a very promising use case: Chrome Headless on the backend.
IBM browser functions
currently the various socket settings can only be set once at construction (connect) time. i don't understand why. *NIX platforms allow all of these (send buf size, recv buf size, keep alive, nodelay) to be modified on the fly. Windows does too afaik. is there a platform that doesn't allow this that the API is catering to ?
having this limitation makes it difficult to map existing POSIX code onto the web platform without rewriting/forking things.
I did a read-through of the explainer and in addition to some more specific issues I will file, here are some minor ones. Especially-important ones are in bold. The ordering is roughly top-of-document to bottom-of-document.
The frequent mention of XMLHttpRequest is a bit strange and distracting, since that API has been superseded by the fetch API. I'd suggest just replacing mentions of it with fetch.
The "Initial Focus" section could be clearer on which bullet points from "Use cases" are in scope vs. out of scope for the initial focus. It's clear DHTs are out of scope but I'm unclear if avoiding "IP multicast and UDP Broadcast" removes other possibilities. I'd suggest a summary paragraph like: "In terms of the [use cases] identified above, this means our initial focus will be on solving: X, Y, Z, and W. Whereas, A, B, and C will not be solvable with the current proposal."
The "Permissions Policy integration" section could benefit from some background explanation of why permissions policy should be integrated at all. Something like, "This means that the direct sockets APIs will not be available to cross-origin subframes, unless the outer page explicitly delegates that ability."
"Security Considerations" mentions "high-trust mode" but does not give any further detail. This should link somewhere, probably.
The mitigation for CORS bypasses should mention this is not a complete mitigation as it means you could attack servers using anything besides "the well known HTTPS port" (which I assume means 443). E.g. you could attack port 80 or port 8080 or port 3000.
"User agents should reject connection attempts when Content Security Policy allows the unsafe-eval source expression": is this a "should" or a "must"/"will be required to"? In general any use of "should" is a bit suspicious.
The explainer gestures at a user-facing security model several times but never explains it. E.g. it talks about users typing things, or about an option to permit future connections from an origin to specific hosts. I think you need an up-front section, probably right after "Initial Focus", that explains this. Even if the goal is not for the spec to constrain user agent UI, you should explain what Chromium is planning to implement, as an example of the kind of UI that this API and your security analysis are being designed around.
The current name might be associated with IPPROTO_RAW / SOCK_RAW, as documented in https://linux.die.net/man/7/raw
The name native sockets has been suggested.
WebSocketStream (shipped in Chromium) is an example of a duplex stream of the type used here for TCP sockets. However, the design differs from yours in how the stream is created and opened. It would probably be worth aligning.
See WebSocketStream API design and the explainer. I can't actually find the spec... @ricea?
Example: Many organizations permit submitting mail from a user's PC unencrypted and unauthenticated.
This extension would allow any Web page or service to connect to port 25 of the mailserver and inject mail.
I just want to mention how much I'm in favour of this API. Current in-browser p2p is a lame duck because of NAT. In order to bootstrap a mesh, we have to rely on centralized websockets servers for signaling. The introduction of raw sockets opens up the possibility for in-browser servers, which I believe to be essential for the next generation of the web. Of course we do want a permission prompt when the user visits.
might be getting a little ahead of things, but can we make sure that zone indexes are handled by the spec, even if as a non-normative section ?
https://en.wikipedia.org/wiki/IPv6_address#Scoped_literal_IPv6_addresses_(with_zone_index)
i mention this as it's easy to overlook, and would be nice to have it as part of the initial spec when projects start to implement it to help mitigate.
i don't know if there are other web standards that expose interface level information that could be used as precedent. WebRTC maybe?
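For anyone unfamiliar with the syntax, a scoped literal looks like fe80::1%eth0, with the zone index after the percent sign. A minimal sketch of separating the two parts is below; splitZoneIndex is a hypothetical helper, it does not validate the address itself, and treating an empty zone as an error is an assumption.

```typescript
// Split a scoped IPv6 literal such as "fe80::1%eth0" into address and zone index.
// Hypothetical helper: the address part is not validated here.
function splitZoneIndex(literal: string): { address: string; zone: string | null } {
  const percentAt = literal.indexOf("%");
  if (percentAt === -1) {
    return { address: literal, zone: null };
  }
  const zone = literal.slice(percentAt + 1);
  if (zone.length === 0) {
    // Assumption: a trailing "%" with nothing after it is rejected.
    throw new TypeError("zone index must not be empty");
  }
  return { address: literal.slice(0, percentAt), zone };
}
```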
Threat
Attackers may use the API to by-pass third parties' CORS policies.
Mitigation
We could forbid the API from being used for TCP with the well known HTTPS port, whenever the destination host supports CORS.
A lot of services (Reddit and Twitter for instance) are available over the web yet have a CORS policy that forbids third-party websites from accessing them. That is so because they don’t want malicious scripts on third-party sites accessing their data.
Third-party clients for such services are in demand. The only way to do one at the moment is with a native app, which are plentiful for the aforementioned Reddit and Twitter.
I’ve personally made a third-party client for jeuxvideo.com’s forums in the past (biggest forums in Europe, and my app was eventually responsible for 10% of the messages posted there), which was loved both by users and the staff as it resulted in much more engagement. It was a web app though, with a back-end proxying the site, and it was delicate to do as the developers had to make special exceptions in their infrastructure specifically for my service when they had a lot of other priorities, resulting in my service often rendered unavailable.
Making a third-party client as a web app makes it available on all platforms, without requiring a download, while requiring less effort to implement it than on just one platform natively, and there are more developers available for the web platforms that can create such solutions. It’s an absolutely wonderful thing.
As long as the user is made aware clearly of what allowing Raw Sockets on an HTTPS service entails, I believe this should be made possible as it would unlock a ton of useful and very accessible services, without more harm done than is possible with non-HTTPS services accessed through Raw Sockets.
Hi! The proposal says “The api will only be available in high-trust mode.”
Is this high trust mode specified/described somewhere? If so, can you link it, please? If not, can you describe it here?
I was reading through #1 and w3ctag/design-reviews#548, and I feel that the security concerns could be mostly addressed by doing the following in concert:
* and *:port, the prompt could ask if the user would like to give the site permission to search their network and connect to local devices on their network (using port whatever if such a port is given) instead, possibly with an additional snippet noting the obvious privacy ramifications. This would appear to the user to be a lot more dangerous than simply "can this website connect to anything on the internet", and so it'd be a little harder to simply socially engineer around.
I won't say it would address the social engineering concerns from the design review in full (virtually anything can be socially engineered around at least in theory - just ask your average scam boss or physical pentester), but I do feel this would at least be a step in the right direction.
So, there are clearly going to be massive security concerns with this API... as there have been with every attempt to do this in the past (e.g., https://github.com/sysapps/tcp-udp-sockets ... to Web Sockets themselves). Before we even start on this, it might be worth having a call with various folks around how feasible it is to even attempt this API.
the spec currently says:
If options["sendBufferSize"] is equal to 0, throw a TypeError.
If options["receiveBufferSize"] is equal to 0, throw a TypeError.
shouldn't those be less than or equal to 0?
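A sketch of the stricter check being asked about, treating the option as a plain JavaScript number and ignoring whatever range enforcement the IDL type conversion may already provide. checkBufferSize is a hypothetical helper, and requiring an integer is an extra assumption of this sketch.

```typescript
// Reject zero, negative, and non-integer buffer sizes; an absent value is allowed.
function checkBufferSize(
  name: "sendBufferSize" | "receiveBufferSize",
  value: number | undefined
): void {
  if (value === undefined) {
    return;
  }
  if (!Number.isInteger(value) || value <= 0) {
    throw new TypeError(`${name} must be a positive integer, got ${value}`);
  }
}
```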