enbility / ship-go

EEBUS SHIP protocol implementation in go

Home Page: https://enbility.net

License: MIT License

Go 99.75% Shell 0.25%
eebus ship

ship-go's People

Contributors

derandereandi, kr0llx

ship-go's Issues

Deadlock on WS Connection Close

Problem Description

Assume the following use case: two SHIP nodes, an EVSE and a HEMS. The EVSE attempts to connect to the HEMS, which refuses the connection. The shortened log of the EVSE looks like this:

13:06:03 INFO  Local SKI:  41c98b1bbe5fc7657ce311981951f12d304ab419
...
13:06:04 DEBUG ski: 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 name: Demo-HEMS-123456789 brand: Demo model: HEMS typ: EnergyManagementSystem identifier: Demo-HEMS-123456789 register: false host: partex.local port: 4815 addresses: [127.0.0.1]
13:06:04 DEBUG delaying connection to 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 by 138ms to minimize double connection probability
13:06:04 DEBUG trying to connect to 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 at xxx.local
13:06:04 DEBUG initiating connection to 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 at xxx.local:4815/ship/
...
13:06:04 TRACE Recv: 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 {"connectionHello":[{"phase":"pending"},{"waiting":60000}]}
13:06:04 TRACE Send: 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 {"connectionHello":[{"phase":"ready"},{"waiting":60000}]}
13:06:04 TRACE Recv: 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 {"connectionHello":[{"phase":"aborted"}]}
13:06:04 TRACE 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 SHIP state changed to: 16
13:06:05 DEBUG 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 websocket read error:  websocket: close 4452: Node rejected by application
10:06:05 DEBUG 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6 error writing to websocket:  websocket: close sent

It can be seen that the connection request has been denied as expected. Later the HEMS initiates a connection to the EVSE. The log shows the following:

10:17:54 DEBUG incoming connection request from 23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6
10:17:54 DEBUG closing incoming double connection, as the existing connection will be used

An incoming connection can be seen. The connection is then closed because there is supposedly already another connection. However, this should not be the case, as that other connection is the one that was closed during the first connection attempt.

As a result, the HEMS is not able to connect to the EVSE, because the EVSE thinks there is already a connection.

Possible Reason

The reason could be a deadlock in the function CloseConnection(...).

func (c *ShipConnection) CloseConnection(safe bool, code int, reason string) {

How?!

  1. The connection request of the EVSE to the HEMS is aborted by the HEMS.
  2. This results in the function CloseConnection(...) being invoked.
  3. In this function the function CloseDataConnection(...) is called:

    c.dataWriter.CloseDataConnection(closeCode, reason)

  4. CloseDataConnection(...) checks whether the connection is already closed and, if not, sends the close reason.

    ship-go/ws/websocket.go

    Lines 287 to 293 in 14685b6

    if w.isConnClosed() {
        return
    }
    if reason != "" {
        _ = w.writeMessage(websocket.CloseMessage, websocket.FormatCloseMessage(closeCode, reason))
    }

  5. writeMessage(...) then checks again whether the connection is closed and, if it is not, sends the message.

    ship-go/ws/websocket.go

    Lines 268 to 280 in 14685b6

    if w.isConnClosed() {
        return false
    }
    w.muxConWrite.Lock()
    defer w.muxConWrite.Unlock()
    err := w.conn.WriteMessage(messageType, data)
    if err != nil {
        // ignore write errors if the connection got closed
        w.closeWithError(err, "error writing to websocket: ")
        return false
    }

  6. However, the send can fail, e.g. if the connection has been closed between the check and the send. In that case an error is returned and the function closeWithError(...) is called.
    https://github.com/enbility/ship-go/blob/14685b670c3d84c05d6d1f940b214a0ca7dcc810/ws/websocket.go#L143C1-L147C2
  7. This function calls ReportConnectionError(err):
    c.CloseConnection(false, 0, "")
  8. This again calls CloseConnection(...) from step 2.
  9. Here lies the problem: CloseConnection(...) uses a sync.Once.Do(...) call, which blocks until the first call returns. That can never happen here because the second call is made recursively from within the first, resulting in a deadlock.

    ship-go/ship/connection.go

    Lines 151 to 153 in 14685b6

    func (c *ShipConnection) CloseConnection(safe bool, code int, reason string) {
        c.shutdownOnce.Do(func() {
            c.stopHandshakeTimer()

Solution

No apparent solution.

  • Checking whether CloseConnection(...) has already run would be possible, but this defeats the purpose of using sync.Once.Do(...) in the first place.
  • Maybe find a better separation of the CloseConnection(...) content in order to prevent this from happening.
  • Don't send in connection close.
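
To illustrate the first option, here is a minimal sketch of a close routine guarded by an atomic flag instead of sync.Once. The types and method names (conn, Close, writeCloseFrame, reportError) are hypothetical stand-ins and are not the ship-go API. The simulated write always fails, so that the error path re-enters Close the same way the steps above describe. One trade-off to keep in mind: with sync.Once a concurrent second caller waits until shutdown has finished, whereas here it returns immediately.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// conn is a hypothetical stand-in for a connection type; it is NOT the
// ship-go ShipConnection.
type conn struct {
	closed    atomic.Bool // set by the first caller of Close
	closeRuns int         // counts how often the shutdown body actually ran
}

// Close runs the shutdown body at most once. A nested (re-entrant) or
// concurrent second call returns immediately instead of blocking.
func (c *conn) Close(reason string) {
	if !c.closed.CompareAndSwap(false, true) {
		return // somebody else is already closing or has closed
	}
	c.closeRuns++
	c.writeCloseFrame(reason)
}

// writeCloseFrame simulates a write that fails because the peer already
// closed the socket. The error handler then reports the error.
func (c *conn) writeCloseFrame(reason string) {
	err := fmt.Errorf("websocket: close sent")
	c.reportError(err)
}

// reportError mimics an error callback that asks the connection to close,
// which re-enters Close while the first call is still running.
func (c *conn) reportError(err error) {
	c.Close("")
}

func main() {
	c := &conn{}
	c.Close("Node rejected by application")
	fmt.Println("close body ran", c.closeRuns, "time(s)")
}
```

Running the sketch terminates and reports that the close body ran once; the same call chain wrapped in sync.Once.Do would never return.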

Add option to block incoming requests from being accepted

Right now every incoming connection request is handled and can trigger a pairing attempt. If pairing has been denied, it should be possible to reject further connection attempts from that remote service right away.

It might be useful to block the remote service only for a limited time, because if both sides block attempts completely, the devices might not be able to pair again if that is required later on.

Improve SHIP message handling

To improve testing and harden the code, SHIP message handling should be improved:

  • Make sure messages are only processed when the local state is ready for them
  • ...

Process when one service removed pairing/registration

Assume two services have successfully paired/registered with each other. Now service 2 removes this pairing/registration. As service 1 still has that information, it will immediately reconnect. Service 2 is still able to accept the pairing/registration request and will stay in state Hello_Pending_Listen.

Service 1 can't know if trust was removed or if there was some other kind of error, e.g. WIFI connection loss.

Is this the desired behavior, or should it be different? I can't find any reference to this scenario in the spec.

Refactor `AllowWaitingForTrust` pairing process

Right now AllowWaitingForTrust is used to check whether the service is in a state of user interaction and is therefore able to respond to a pairing request. If it returns false, the handshake process is aborted.

The requirement of having a direct return value is not optimal here. Better solutions should be investigated.

Handling of simultaneous connection attempts according to spec is too complex

The SHIP Spec defines in 12.2.2 how to "Prevent Double Connections with SKI Comparison":

If a SHIP node recognizes that there are two or more simultaneous connections to another SHIP node, the SHIP node with the bigger 160 bit SKI value SHALL only keep the most recent connection open and close all other connections to the same SHIP node (a previous release of this SHIP specification may permit a different preference). If an older connection is already in the SME data exchange phase, the SHIP node with the bigger SKI value SHOULD initiate a connection termination as described in section 13.4.7.

This is quite complex and could still run into issues as both devices may consider both connections "as old as the other".

This stack instead does the following: The connection initiated by the higher SKI will be kept.
This is a lot simpler and always clear to both ends.
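
The simplified rule can be sketched as a single comparison. The function name and signature below are hypothetical and not taken from ship-go, and the sketch assumes both SKIs are hex strings of equal length (40 characters for 160 bits), so that comparing the lowercased strings matches comparing the numeric values.

```go
package main

import (
	"fmt"
	"strings"
)

// keepConnection is a hypothetical helper, not the ship-go implementation.
// It reports whether a connection should be kept when a double connection
// is detected. initiatedLocally is true if the local node opened it.
// Assumption: both SKIs are hex strings of equal length.
func keepConnection(localSKI, remoteSKI string, initiatedLocally bool) bool {
	localIsHigher := strings.ToLower(localSKI) > strings.ToLower(remoteSKI)
	// Keep the connection that was initiated by the node with the higher SKI.
	return initiatedLocally == localIsHigher
}

func main() {
	evse := "41c98b1bbe5fc7657ce311981951f12d304ab419"
	hems := "23ef43d7a6487fb9a44c7fbb0d2b840514b8faa6"

	// Seen from the EVSE, which has the higher SKI:
	fmt.Println(keepConnection(evse, hems, true))  // true: its own outgoing connection stays
	fmt.Println(keepConnection(evse, hems, false)) // false: the incoming one is closed

	// Seen from the HEMS, both ends reach the same decision:
	fmt.Println(keepConnection(hems, evse, false)) // true: incoming from the EVSE stays
	fmt.Println(keepConnection(hems, evse, true))  // false: its own outgoing one is closed
}
```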

There is no guideline on when mDNS records should be announced and removed

The SHIP specification does not detail when or how mDNS records should be announced or removed. The Elli Connect wallbox never removes the announcement, even when it is paired and connected to a HEMS.

Suggestion: When a device can only connect to a single remote device, it should only announce its mDNS record when there is either no pairing or the paired device is not connected.

Provide SHIP ID, Certificate and SKI for a connection

For each connection, and for the check that approves a connection, the SKI, SHIP ID and actual certificate in use should be provided.

The implementing service should be able to persist all of these for later use and for verifying connection attempts.

Deadlock on DisconnectSKI

Description

Assume the following use case: two SHIP nodes, an EVSE and a HEMS. The HEMS successfully connects to the EVSE. Now the HEMS wants to disconnect and uses the Service.DisconnectSKI(...) method.

Observation

The EVSE gets the disconnect and successfully removes the connection. The HEMS however never shows the successful disconnect, and is then also not able to connect to the same EVSE again.

Problem

After invoking the Service.DisconnectSKI(...) method the call tree will eventually end at the following point:

ship-go/hub/hub_pairing.go

Lines 100 to 111 in 14685b6

func (h *Hub) DisconnectSKI(ski string, reason string) {
    h.muxCon.Lock()
    defer h.muxCon.Unlock()
    // The connection with the higher SKI should retain the connection
    con, ok := h.connections[ski]
    if !ok {
        return
    }
    con.CloseConnection(true, 0, reason)
}

As can be seen, the connection for the given SKI is looked up while holding the mutex muxCon. Note that muxCon.Unlock() is deferred, which means the mutex is only released when the above function returns.

However, inside the subsequent con.CloseConnection(...) call muxCon is locked again. Since a Go sync.Mutex is not reentrant, this results in a deadlock. The corresponding positions are listed below.

// only remove this connection if it is the registered one for the ski!
// as we can have double connections but only one can be registered
if existingC := h.connectionForSKI(remoteSki); existingC != nil {
    if existingC.DataHandler() == connection.DataHandler() {

// return the connection for a specific SKI
func (h *Hub) connectionForSKI(ski string) api.ShipConnectionInterface {
    h.muxCon.Lock()
    defer h.muxCon.Unlock()
    con, ok := h.connections[ski]
    if !ok {
        return nil
    }
    return con
}
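
One possible way to avoid this pattern is to hold the mutex only for the map lookup and release it before calling into the connection. The sketch below uses simplified, hypothetical types (hub, connection, Disconnect, remove) that only mimic the call pattern shown above; it is not the ship-go code and not necessarily the fix the maintainers chose.

```go
package main

import (
	"fmt"
	"sync"
)

// connection and hub are hypothetical, simplified stand-ins for the
// ship-go types; they only reproduce the locking pattern.
type connection struct {
	ski string
	hub *hub
}

// Close calls back into the hub, which takes the hub mutex again.
func (c *connection) Close(reason string) {
	c.hub.remove(c.ski)
}

type hub struct {
	mux         sync.Mutex
	connections map[string]*connection
}

// remove deletes the connection for ski while holding the mutex.
func (h *hub) remove(ski string) {
	h.mux.Lock()
	defer h.mux.Unlock()
	delete(h.connections, ski)
}

// Disconnect holds the mutex only for the lookup. It is released before
// Close runs, so the callback into remove can lock it without deadlocking.
func (h *hub) Disconnect(ski, reason string) {
	h.mux.Lock()
	con, ok := h.connections[ski]
	h.mux.Unlock()
	if !ok {
		return
	}
	con.Close(reason)
}

func main() {
	h := &hub{connections: make(map[string]*connection)}
	h.connections["abc"] = &connection{ski: "abc", hub: h}
	h.Disconnect("abc", "user request")
	fmt.Println("remaining connections:", len(h.connections))
}
```

With a deferred Unlock in Disconnect, as in the excerpt above, the same program would block forever inside remove.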

Lack of Notification for Removed Ship Nodes via Avahi mDNS Integration

Description

When a ship node is added, the Avahi mDNS picks it up and notifies the application through the api.ServiceReaderInterface about all currently visible services. However, when a ship node is removed, no update is sent. This poses a challenge for the implementer of the interface as there's no indication that the node is no longer available.

Version

Tested with CEMd (577b756) and eebus-go (45c4e2d)

Expected Behavior

The implementer of the api.ServiceReaderInterface is also notified when a ship node is removed.

SHIP ID and accessMethods.id meant to be identical?

The ShipID is defined in SHIP Spec 3. as:

Each SHIP node has a globally unique SHIP ID. The SHIP ID is used to uniquely identify a SHIP node,
e.g. in its service discovery. This ID is present in the mDNS/DNS-SD local service discovery;

In SHIP 13.4.6.2 the accessMethods.id is defined as:

The originator's unique ID

I assume those two to mean the same. Is this correct?
