saltyrtc / saltyrtc-meta
Protocol description and organisational information for SaltyRTC implementations.
License: MIT License
This is not directly related to the protocol, but it should be implemented consistently across all clients. Hence the issue in the meta repo.
How should we design the public API?
Right now, it works as follows in the Java version:
// create new instance
KeyStore permanentKey = new KeyStore();
SaltyRTC salty = new SaltyRTC(permanentKey, host, port, sslContext);
salty.connect();
// Wait until connected
salty.events.signalingStateChanged.register(new EventHandler<SignalingStateChangedEvent>() {
    @Override
    public boolean handle(SignalingStateChangedEvent event) {
        if (event.getState() == SignalingState.CONNECTED) {
            LOG.info("Connected!");
        }
        // Assumption: returning false keeps this handler registered
        return false;
    }
});
// Retrieve public permanent key and auth token
byte[] key = salty.getPublicPermanentKey();
byte[] token = salty.getAuthToken();
// Get current state
SignalingState state = salty.getSignalingState();
// Get current signaling channel
SignalingChannel channel = salty.getSignalingChannel();
// Send signaling data (e.g. offer, answer, ICE candidates)
salty.sendSignalingData(new Data("offer", "offer-string"));
// Wait for data (e.g. offer, answer, ICE candidates)
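salty.events.data.register(new EventHandler<DataEvent>() {
    @Override
    public boolean handle(DataEvent event) {
        // Compare strings with equals(); == would compare object references
        if ("answer".equals(event.getData().getType())) {
            // handle
        }
        // Assumption: returning false keeps this handler registered
        return false;
    }
});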
// Do handover
salty.handover(peerConnection);
Now the question is how to send user data. One approach is to wrap a data channel.
SecureDataChannel sc = salty.wrap(dataChannel);
The other approach would be to send data through the SaltyRTC instance.
salty.sendData(dataChannel, data);
I'd favor the second approach because it does not require wrapping. Wrapping means more code and more complexity, and more complexity decreases security. With a wrapper, the presence of encryption is also not directly visible, so users might mix up secure and non-secure data channels.
(Note that this does not yet consider the tasks system. But tasks need to be implementable outside of SaltyRTC, so it would not change much about the basics.)
In order to detect sub-protocol downgrade attacks, the client must send the same list of sub-protocols it has sent before the WebSocket connection has been established in the client-auth message.
The server uses the same algorithm to calculate the best shared sub-protocol as before and checks that the chosen sub-protocol matches the initially chosen sub-protocol. Otherwise, it is possible that a downgrade attack took place. In this case, the server must close the connection with the close code for protocol error.
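To make the check concrete, here is a minimal sketch of the server side. The sub-protocol names and the "first server preference that the client offered" selection rule are assumptions for illustration only; the actual algorithm is whatever the server used during the WebSocket handshake.

```python
from typing import Optional, Sequence

# Hypothetical server preference list, best first (placeholder names).
SERVER_SUBPROTOCOLS = ["v2.example", "v1.example"]


def choose_subprotocol(offered: Sequence[str]) -> Optional[str]:
    """Return the first server-preferred sub-protocol the client also offered."""
    for name in SERVER_SUBPROTOCOLS:
        if name in offered:
            return name
    return None


def is_downgraded(initially_chosen: Optional[str], repeated: Sequence[str]) -> bool:
    """True if the list repeated in client-auth leads to a different choice."""
    return choose_subprotocol(repeated) != initially_chosen
```

If an attacker stripped "v2.example" from the handshake, the server would have chosen "v1.example", but the list repeated in client-auth would yield "v2.example". The mismatch is the signal to close the connection with the protocol error close code.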
Although the signalling server does not need to be trusted, it is still a good idea to use transport encryption on the signalling channel. Otherwise, WebSocket paths, etc. could be intercepted which makes the signalling protocol vulnerable to DoS attacks.
Specifies which task has to be fulfilled by SaltyRTC's signalling channel. The responder sends a list of tasks sorted by priority in descending order in its auth message. The initiator also holds such a task list and determines the best task match which it then sends in its auth message.
For example, the responder sends the list ['ortc', 'webrtc'] in its auth message. However, the initiator may not have ORTC support but it does support WebRTC. Its task list would be ['webrtc']. Therefore, it would send 'webrtc' in its auth message and both clients know that their task is to set up a WebRTC peer connection.
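The matching step could look roughly like this. The text does not say whose priority order wins when both lists contain several shared tasks; this sketch assumes the responder's order decides, and the function name is made up.

```python
from typing import Optional, Sequence


def choose_task(responder_tasks: Sequence[str],
                initiator_tasks: Sequence[str]) -> Optional[str]:
    """Pick the first task in the responder's list that the initiator supports.

    Assumption: the responder's priority order decides between shared tasks.
    """
    for task in responder_tasks:
        if task in initiator_tasks:
            return task
    return None  # no shared task
```

With the example above, choose_task(['ortc', 'webrtc'], ['webrtc']) selects 'webrtc'.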
None so far.
Should the sequence number be incremented globally, per session or per endpoint (server or responders)? (From what I understand, the sequence number has the same scope as the cookie and should be incremented for every message sent to the same endpoint.)
Should probably be written down in the spec.
Instead of repeating the whole content of a message that could not be relayed, the signalling server should send a SHA-256 hash of that message.
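On the server this would be a small change. A rough sketch, where the field name "hash" and the dict layout are assumptions and not taken from the spec:

```python
import hashlib


def build_send_error(undelivered: bytes) -> dict:
    """Build a send-error payload that carries only a digest of the message.

    The "hash" field name is a placeholder for illustration.
    """
    return {
        "type": "send-error",
        "hash": hashlib.sha256(undelivered).digest(),  # 32 raw bytes
    }
```

The payload stays at a fixed size no matter how large the undelivered message was.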
The examples should be updated to describe packets in an extended JSON-like format:
{
    "integer": 1,
    "string": "hello",
    "binary": b"ff0eee"
}
'b'-denoted data is binary and is represented as hex to make it easier to read.
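For generating such examples from real packets, a small helper along these lines might do. This is only a sketch for documentation purposes; the function name and the two-space indentation are arbitrary choices.

```python
import json


def render(value, level: int = 0) -> str:
    """Render a value in the extended JSON-like notation (bytes as b"<hex>")."""
    pad = "  " * level
    inner = "  " * (level + 1)
    if isinstance(value, bytes):
        return 'b"' + value.hex() + '"'
    if isinstance(value, dict):
        items = [f"{inner}{json.dumps(k)}: {render(v, level + 1)}"
                 for k, v in value.items()]
        return "{\n" + ",\n".join(items) + "\n" + pad + "}"
    if isinstance(value, (list, tuple)):
        items = [inner + render(v, level + 1) for v in value]
        return "[\n" + ",\n".join(items) + "\n" + pad + "]"
    return json.dumps(value)  # integers, strings, booleans, None
```

Calling print(render({"integer": 1, "string": "hello", "binary": bytes.fromhex("ff0eee")})) reproduces the example packet above, apart from the indentation width.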
When a client is connected to another client via the server, the close code in case of a protocol error should be somehow sent to the other client.
We should add a short recommendation for storing and finding trusted keys.
Complete the Security Mechanisms section.
Add this to the protocol:
A client MUST repeat messages after a send-error response from the server at least once but SHOULD send it n times.
Reason: When a handover occurs in some tasks (WebRTC, ORTC) a message may be lost on the signalling channel. After a send-error the message will be repeated at least once over the now established signalling channel the original signalling channel has been handed over to.
We should specify n.
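A client could keep track of this with something like the following sketch. It assumes that send-error identifies the lost message by its SHA-256 hash, and the class and method names are invented for illustration. The still unspecified n is passed in as max_retries.

```python
import hashlib
from typing import Dict, List, Optional


class RetryTracker:
    """Remembers sent messages so they can be repeated after a send-error."""

    def __init__(self, max_retries: int) -> None:
        if max_retries < 1:
            raise ValueError("a message must be repeated at least once")
        self.max_retries = max_retries  # the "n" that is still to be specified
        self._pending: Dict[bytes, List] = {}  # digest -> [message, retries]

    def sent(self, message: bytes) -> None:
        """Record a message that was handed to the signalling channel."""
        digest = hashlib.sha256(message).digest()
        self._pending.setdefault(digest, [message, 0])

    def on_send_error(self, digest: bytes) -> Optional[bytes]:
        """Return the message to repeat, or None if unknown or out of retries."""
        entry = self._pending.get(digest)
        if entry is None:
            return None
        if entry[1] >= self.max_retries:
            del self._pending[digest]
            return None
        entry[1] += 1
        return entry[0]
```

The sketch never forgets successfully delivered messages, since the protocol has no acknowledgement for them. A real client would need some expiry for the pending entries.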
NaCl Box := Nonce || Ciphertext
When a SaltyRTC client or server receives a message, the nonce MUST be validated (e.g. the Cookie, the Channel Number and the Sequence Number). In case a sequence number would overflow, the Channel Number MUST be changed but MUST also remain unique per peer.
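Before any nonce field can be validated, the receiver has to split the box according to NaCl Box := Nonce || Ciphertext. A minimal sketch of that first step, relying only on the fixed NaCl sizes (24-byte crypto_box nonce, 16-byte authenticator); the function name is made up, and the layout of the fields inside the nonce is deliberately left out:

```python
from typing import Tuple

NONCE_LENGTH = 24  # crypto_box nonce size in NaCl
MAC_LENGTH = 16    # authenticator that NaCl includes in the ciphertext


def split_box(data: bytes) -> Tuple[bytes, bytes]:
    """Split a received NaCl Box into (nonce, ciphertext)."""
    if len(data) < NONCE_LENGTH + MAC_LENGTH:
        raise ValueError("message too short to be a NaCl Box")
    return data[:NONCE_LENGTH], data[NONCE_LENGTH:]
```

The Cookie, Channel Number and Sequence Number checks would then operate on the returned nonce before the ciphertext is decrypted.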
In case a token is being used, the responder starts sending a 'token' message. But who sends which message (first) if both peers trust each other?
For easy validation, we should provide constraints (length, value range, ....) for each field where the value is not crystal clear.
In TypeScript:
interface Message {
    type: string,
}
interface Data extends Message {
    type: 'data',
    data_type: string,
    data: any,
}
The data_type field allows us to register event handlers for specific data messages only.
Open question: Should we make the data_type field optional? If it's not specified, then the use of onData('type', function(msg) { ... }) type handlers is not possible in a generic way.
Related to #33.
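To illustrate the open question, here is a sketch of a dispatcher with per-data_type handlers. The class and method names are invented. It shows that a data message without data_type can only reach catch-all handlers.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class DataDispatcher:
    """Routes 'data' messages to handlers registered per data_type."""

    def __init__(self) -> None:
        self._by_type: Dict[str, List[Handler]] = defaultdict(list)
        self._catch_all: List[Handler] = []

    def on_data(self, data_type: str, handler: Handler) -> None:
        self._by_type[data_type].append(handler)

    def on_any_data(self, handler: Handler) -> None:
        self._catch_all.append(handler)

    def dispatch(self, message: Dict[str, Any]) -> None:
        if message.get("type") != "data":
            return
        for handler in self._catch_all:
            handler(message)
        data_type = message.get("data_type")
        if data_type is not None:
            # Only messages that carry a data_type reach specific handlers
            for handler in self._by_type.get(data_type, []):
                handler(message)
```

If data_type became optional, applications relying on on_data('type', ...) would silently miss untyped messages unless they also registered a catch-all handler.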
We already discussed this some time ago, but the discussion was probably not in this issue tracker.
When creating data channels in the browser, the id increments for each new channel. According to a few quick tests I did, once it reaches 1023 every new data channel will get the id 1023.
We should verify whether that's really true and then find out whether there's a way to prevent that behavior.
Instead they are JS RTCPeerConnection instances. However, the interface does differ from the one that is currently used for ORTC. Meh! Can somebody confirm this?
In general, the client behaviour is not well documented in the protocol. We should write a reference client implementation and add missing behaviour to the protocol.
What should happen if...
In the browser, each ICE candidate triggers an event. To prevent sending signalling messages for every single ICE candidate, browser implementations should wait a very short amount of time and bundle candidates before they are being sent.
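The bundling idea, sketched outside the browser for clarity. The class name, the callback and the default delay of 10 ms are assumptions; what "a very short amount of time" should be is not defined here.

```python
import threading
from typing import Any, Callable, List, Optional


class CandidateBundler:
    """Collects ICE candidates and sends them as one bundle after a short delay."""

    def __init__(self, send: Callable[[List[Any]], None], delay: float = 0.01) -> None:
        self._send = send
        self._delay = delay  # seconds to wait for further candidates
        self._buffer: List[Any] = []
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def add(self, candidate: Any) -> None:
        """Buffer a candidate; the first one in a bundle starts the timer."""
        with self._lock:
            self._buffer.append(candidate)
            if self._timer is None:
                self._timer = threading.Timer(self._delay, self.flush)
                self._timer.daemon = True
                self._timer.start()

    def flush(self) -> None:
        """Send everything buffered so far as a single bundle."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            bundle, self._buffer = self._buffer, []
        if bundle:
            self._send(bundle)
```

Each bundle then results in one signalling message instead of one message per candidate.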
In ORTC, there is no requirement for the packet types offer and answer. Furthermore, the packet type candidates should be updated to cover both SDP-encoded candidates (WebRTC) and JSON-encoded candidates (ORTC).
We would like to introduce negotiable parameters that are being sent in client-auth and server-auth:
client-auth
As described in Terminology, the NaCl Box is not at the end of the message but at the start. Furthermore, we should remove MAC from the figure to avoid confusion. (Yes, it's authenticated but we don't need to use or pass the MAC as an argument anywhere, NaCl does it for us).
Right now the mechanism to differentiate between encrypted and unencrypted packets is trial-and-error, since there is no prefix that allows the server to differentiate between the two packet types.
First of all, trial-and-error might be inefficient in some implementations. But the other aspect is that you cannot stream the data directly into a parser, because of the possible need for re-parsing. That means that you need to allocate memory to store the message, then you can try to decrypt it, and if that fails, parse it directly as msgpack.
But if we added a type flag (e.g. a "flags" byte at the beginning), the implementation could differentiate the two packet types without any lookahead. That allows you to "stream" the bytes directly into the parser/decoder, without any allocation.
In Python terms, you could still follow the EAFP principle by trying to decrypt data[1:] and falling back to parsing the data as msgpack.
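A sketch of what the flag-based variant could look like. The flag values 0x00 and 0x01 and the function name are placeholders and not part of any spec.

```python
from typing import Tuple

# Hypothetical flag values for the proposed leading "flags" byte
FLAG_PLAIN = 0x00
FLAG_ENCRYPTED = 0x01


def classify(packet: bytes) -> Tuple[str, bytes]:
    """Decide how to handle a packet by looking at its first byte only."""
    if not packet:
        raise ValueError("empty packet")
    flag, payload = packet[0], packet[1:]
    if flag == FLAG_ENCRYPTED:
        return "encrypted", payload  # hand over to the NaCl decryption step
    if flag == FLAG_PLAIN:
        return "plain", payload      # hand over to the msgpack parser
    raise ValueError(f"unknown flags byte: {flag:#04x}")
```

The decision needs exactly one byte, so the remaining bytes can be streamed into either the decryption step or the msgpack decoder without buffering the whole message first.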
Currently, the 'send-error' message contains a SHA-256 hash of the message that could not be relayed by the server. I'm not completely sure if it would make more sense to send the Channel Number and the Sequence Number instead. Please, comment!
The leading receiver byte is not covered by the MAC of the NaCl box, which means the server could potentially modify it and the clients would not notice. An alternative would be to make the receiver byte part of the channel number, which has no proper use case for the signalling channel anyway.
For long preservation of the protocol.
Should the 'restart' packet type trigger an ICE restart or a completely new WebRTC session?
Currently, the scope of the nonce fields is awkward after the handover. Basically, the normal human being would transition the peer scope over to a data channel instance scope.
This is really complicated and awkward. The spec of the task also only barely depicts the scopes above.
Is this correct?
The following things need to be validated on each incoming message:
Only applicable to signaling nonces
0x00 (server)
0x00 (server), or...
0x02-0xff (responder)
0x01 (initiator)
Only applicable to signaling nonces
server-auth message: 0x00 (undefined)
server-auth message:
server-auth message: 0x01 (initiator)
You should consider signing git commits & releases.
Of course this issue applies for all repos of SaltyRTC, I just posted it here once.
A WebSocket client can provide a list of subprotocols when connecting. We should choose a name for the protocol (and probably a version number, too).
Note: We do not need such a mechanism for #3 because the subprotocol will remain the same when switching over the communication channel.
The negotiation (to use ORTC or WebRTC) needs to be done in the handshake of the peers. In addition, we need to provide the subprotocol during that handshake. The responder sends a list of subprotocols and the initiator chooses which subprotocol to use.
Although chunked-dc is quite a simple format, we should probably write a short specification for it.
The resulting spec should be referenced in the WebRTC and ORTC task.
The nonce composition is clear in context with the Data Channel. However, the nonce is not mentioned in context with the Signalling Channel.
This issue is a result of the discussion in #30
I think you need an icon for SaltyRTC. E.g. for using it in the GitHub organisation...
And for making SaltyRTC popular.
The WebSocket protocol allows us to close a connection with a specific close code. The code range [3000..3999] can be used for libraries and frameworks.
Proposed close codes:
Because we are using MsgPack, no data needs to be hex-encoded anymore (apart from the public key in the WebSocket Path). The protocol and the examples should be updated.
Checklist with entries like the following:
Then we can "rate" client libraries using that checklist and create a support matrix. It's also useful for people that want to write new libraries.
The spec only says that the path should be the hex encoded public key of the initiator.
It does not specify whether the letters in the hex string should be upper- or lowercase though.
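Until the spec settles this, a server can sidestep the ambiguity by decoding the path to raw bytes and comparing those. A sketch, assuming a path of the form /<hex key> and the usual 32-byte NaCl public key; the function name is made up:

```python
KEY_LENGTH = 32  # size of a NaCl public key in bytes


def initiator_key_from_path(path: str) -> bytes:
    """Decode the hex-encoded public key from a WebSocket path like '/<hex>'."""
    hex_key = path.lstrip("/")
    key = bytes.fromhex(hex_key)  # accepts upper- and lowercase digits
    if len(key) != KEY_LENGTH:
        raise ValueError("path does not contain a 32 byte public key")
    return key
```

If a canonical string form is needed (e.g. as a lookup key), key.hex() always yields lowercase. Whether the spec should mandate lowercase remains open.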
The spec for the candidate message is as follows:
{
    "type": "candidates",
    "session": b"bb938a583e67923c6e9f2185a6a03773",
    "sdp": [
        "...",
        "..."
    ]
}
That assumes that the candidates are SDP strings. In the WebRTC RTCIceCandidate structure, that is contained in the candidate string attribute. On the other hand, ORTC does not provide such an attribute: http://ortc.org/wp-content/uploads/2015/06/ortc.html#h-rtcicecandidate
Can we simply send the entire RTCIceCandidate structure instead? Or is that too implementation-specific? Maybe we can define a key mapping?
There's a possible MITM attack between client and server that I discovered some time ago. To be honest, it's not much of a problem because messages between clients are still end-to-end encrypted and cannot be decrypted by the server at all. It only works on insecure WebSocket servers (no TLS) or WebSocket servers with broken TLS. Still, it makes our transport encryption look rather pointless.
C = Client
S = Server
A = Attacker
A generates two key pairs, a1 and a2. pk_a1 is the public key of a1 and sk_a1 is the secret key of a1.
A replaces the path the initiator provided with pk_a2.
S <-- Path: pk_a2 ---------- A <-- Path: pk_c ------------ C
A replaces the key of the server in server-hello with pk_a1.
S --- server-hello: pk_s --> A --- server-hello: pk_a1 --> C
A decrypts client-auth by box(sk_a1, pk_c), then encrypts and sends it to the server by box(sk_a2, pk_s).
S <-- client-auth ---------- A <-- client-auth ----------- C
A decrypts server-auth by box(sk_a2, pk_s), then encrypts and sends it to the initiator by box(sk_a1, pk_c). Any further messages between initiator and server can be decrypted and modified by A.
S --- server-auth ---------> A --- server-auth ----------> C
The benefit for A is questionable at best because A needs to change the path. Therefore, no responder would be able to connect to the initiator. Still, A can take the role of the server for the initiator.
A generates two key pairs, a1 and a2. pk_a1 is the public key of a1 and sk_a1 is the secret key of a1.
A does not need to change the path for responders as the path does not take part during authentication of responders.
S <-- Path: pk_i ----------- A <-- Path: pk_i ------------ C
A replaces the key of the server in server-hello with pk_a1.
S --- server-hello: pk_s --> A --- server-hello: pk_a1 --> C
A replaces the key of the responder in client-hello with pk_a2.
S <-- client-hello: pk_a2 -- A <-- client-hello: pk_c ---- C
A decrypts client-auth by box(sk_a1, pk_c), then encrypts and sends it to the server by box(sk_a2, pk_s).
S <-- client-auth ---------- A <-- client-auth ----------- C
A decrypts server-auth by box(sk_a2, pk_s), then encrypts and sends it to the responder by box(sk_a1, pk_c). Any further messages between responder and server can be decrypted and modified by A.
S --- server-auth ---------> A --- server-auth ----------> C
In this case, the benefit for A is much better than with an initiator as the path does not need to be changed. Therefore, clients that communicate with one another via the server would not even know that their communication is completely visible to A. Still, the benefits are marginal: A would be able to announce new initiators, drop responders and let responders repeat messages.
So, my idea is to add an OPTIONAL but RECOMMENDED section to the server.
The server has an OPTIONAL permanent key pair (sk_sp and pk_sp). Clients know the public key of the server's permanent key pair. The key has no lifetime or any other fancy certificate stuff, it should be as simple as possible.
In the server-auth, we add another field that is REQUIRED to be set by the server if it has a permanent key pair. That field, named signed_keys, contains sign(pk_s || pk_c, sk_sp). The client verifies that signature. (Note: I've edited my previous stupid idea here)
And that's it. An attacker would need to have the permanent key pair to sign its own keys.
Have I missed something? Is signing the concatenation of the public keys enough to mitigate replay attacks? Or should we add an x-byte random field as well and add that to the concatenated data? @dbrgn
... for the WebRTC and ORTC tasks until the peer-to-peer connection is completely set up.
As soon as the signalling channel has done its job, there is no good reason to leave the signalling connection open. Further ICE candidates and other signalling messages can be sent over a dedicated signalling data channel.
Instead of JSON, we use MessagePack to pack/unpack binary data. Consequently, all former hex-encoded values must be sent as binary blobs.