drogue-iot / drogue-cloud
Cloud Native IoT
Home Page: https://drogue.io
License: Apache License 2.0
The release archive promotes the use of ./scripts/drogue.sh in the generated assets (install-*.zip), while the source code contains ./hack/drogue.sh.
This may lead to inconsistencies in the documentation.
Originally, "make test" used docker/podman to run the build and the tests.
However, the tests now start containers as well, which means that we start containers inside of containers. Unfortunately, that broke "make test". "make container-test" still works, but requires the user to have all kinds of development tools installed, which can be tricky on Windows and macOS.
API tokens currently allow full access to a project. Allow users to limit this to specific resources and operations. Not too fine-grained, but offering some basics like read, write, and admin.
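A coarse-grained scope model like this could be sketched as an ordered enum, where a higher scope implies the lower ones (the names and ordering here are assumptions for illustration, not existing drogue-cloud code):

```rust
// Hypothetical coarse-grained scope model for API tokens:
// Admin implies Write, and Write implies Read.
// Deriving Ord gives us Read < Write < Admin by declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Scope {
    Read,
    Write,
    Admin,
}

impl Scope {
    /// Returns true if a token with scope `self` may perform an
    /// operation that requires at least `required`.
    pub fn allows(self, required: Scope) -> bool {
        self >= required
    }
}

fn main() {
    assert!(Scope::Admin.allows(Scope::Write));
    assert!(Scope::Write.allows(Scope::Read));
    assert!(!Scope::Read.allows(Scope::Write));
    println!("scope checks passed");
}
```

A per-resource restriction could then be a pair of resource pattern and scope, checked against each incoming request.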
1- start consuming events with the MQTT endpoint
2- delete the app
=> The MQTT session should be ended, I suppose.
The mqtt-integration is stuck in a while loop:
[2021-08-11T10:55:38Z INFO rdkafka::client] librdkafka: PARTCNT [thrd:main]: Topic events-example-app partition count changed from 3 to 0
[2021-08-11T10:55:38Z ERROR rdkafka::client] librdkafka: Global error: UnknownPartition (Local: Unknown partition): events-example-app [0]: desired partition is no longer available (Local: Unknown partition)
[2021-08-11T10:55:38Z ERROR rdkafka::client] librdkafka: Global error: UnknownPartition (Local: Unknown partition): events-example-app [1]: desired partition is no longer available (Local: Unknown partition)
[2021-08-11T10:55:38Z ERROR rdkafka::client] librdkafka: Global error: UnknownPartition (Local: Unknown partition): events-example-app [2]: desired partition is no longer available (Local: Unknown partition)
I have the same issue with the websocket service.
Currently all services/endpoints expose their "health" information on the main API endpoint. That should change in a way that each endpoint/service has a dedicated "health" endpoint, which is only exposed internally.
In the past we had issues with Knative deployments having only one port to check. However, this should change as we will be using normal deployments for most services/endpoints soon.
The "authentication service" is already deployed that way, so it might be a good first candidate.
A goal of this task should be that at least the configuration for this endpoint is consistent. The current implementation of the endpoints should be kept the same; switching to some alternative way of providing Kubernetes readiness/liveness information in Rust can be done in a separate issue.
The DCO app can be installed by the maintainers to enforce sign-offs, with minimal effort.
One helpful thing might be for the landing page of http://sandbox.drogue.cloud to mention drg login http://api.sandbox.drogue.cloud/ instead of navigating to Getting Started > Register Devices.
We want to have at least one additional deployment for a public cloud provider.
The goal is not to use OpenShift and abstract away all the differences. Nor is the goal to host a bunch of additional deployments for all the different APIs and requirements, but to see what changes are required for a specific Kubernetes variant like GKE, Azure, AWS, DigitalOcean, … and to learn some things that might help generalize the deployment.
Currently we are using webpack and wasm-pack to compile/package the frontend.
However, it looks like webpack 5 has issues with WASM and webpack 4 is no longer maintained, accumulating NPM security advisories.
Also, the stability of the toolchain leaves room for improvement.
Webpack is used for the console-frontend project proper, but also for the SwaggerUI embedded in it. So a replacement would need to consider the implications of that too. Splitting these up into two different "projects" would work too, of course.
One potential replacement could be "trunk", which seems to be becoming more popular in the Rust world: https://github.com/thedodd/trunk
Currently we have an OpenIdTokenProvider and four different REST clients written in Rust.
Two of them use the token provider trait. One needs to pass in the original request token, and the third is the command line client.
We need to refactor this so that:
Maybe we should consider extracting this component into a dedicated repository and making it a dependency of the drogue-cloud backend and the command line application.
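One possible shape for such a refactoring is a single trait that all REST clients depend on, with one implementation per credential source (all names here are hypothetical, not the existing drogue-client API):

```rust
// Hypothetical unified credential source for all REST clients:
// an OpenID token provider, a forwarded request token, or none.
pub trait TokenProvider {
    fn provide_token(&self) -> Option<String>;
}

/// Forwards the bearer token of the original incoming request.
pub struct ForwardedToken(pub String);

impl TokenProvider for ForwardedToken {
    fn provide_token(&self) -> Option<String> {
        Some(self.0.clone())
    }
}

/// No credentials, e.g. for anonymous endpoints.
pub struct NoToken;

impl TokenProvider for NoToken {
    fn provide_token(&self) -> Option<String> {
        None
    }
}

/// A client that depends only on the trait, not a concrete provider.
pub struct RestClient<P: TokenProvider> {
    provider: P,
}

impl<P: TokenProvider> RestClient<P> {
    pub fn new(provider: P) -> Self {
        Self { provider }
    }

    /// Builds the Authorization header value, if any credentials exist.
    pub fn auth_header(&self) -> Option<String> {
        self.provider.provide_token().map(|t| format!("Bearer {}", t))
    }
}

fn main() {
    let client = RestClient::new(ForwardedToken("abc".into()));
    println!("{:?}", client.auth_header());
}
```

The command line client and the backend clients would then only differ in which provider they are constructed with.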
Currently we have a bunch of services, with a bunch of endpoints.
However, as we have now aligned the different APIs, we could/should offer a single API endpoint.
Provide a basic metrics setup using prometheus:
drogue-cloud-metrics
Implement an example controller/operator using the device registry events. Syncing devices from our internal registry to TTN, using the v3 API.
This should:
Trying to use an API key to authenticate a client connecting to the MQTT integration, I get:
CONNECT failed as CONNACK contained an Error Code: BAD_USER_NAME_OR_PASSWORD.
When clicking delete, nothing happens.
It'd be nice to have switches that would install separate components like knative infra, drogue infra, additional services, etc.
The installer zip for kubernetes cluster type is not included in releases.
Proposal: /api/v1/<appId>/devices/
Evaluate the different options and implement at least one Azure IoT compatible endpoint
Currently Keycloak is not set up to provide the required aud (audience) information in the token. The console backend will reject incoming requests.
To my understanding this could be implemented by adding the following to the realm config:
clientScopes:
  - name: good-service
    attributes:
      "include.in.token.scope": "true"
      "display.on.consent.screen": "true"
    protocolMappers:
      - name: app-audience
        protocol: openid-connect
        protocolMapper: protocolMapper
        consentRequired: false
        config:
          "included.client.audience": "drogue"
          "id.token.claim": "false"
          "access.token.claim": "true"
However, that isn't supported by the current version of the keycloak operator. It is only on "master" at the moment.
Maybe we can add this to the client also: https://stackoverflow.com/a/61059910
More information:
There are several packages that are needed to build the modules, like cmake, cyrus-sasl-devel, and others.
They are included in the GitHub Actions image, so it all works there, but if one wants to run the build locally it's a bit cumbersome to go through each package as the build fails.
Having a list of packages in the README would be nice :)
Currently we have a little bit of a mess going on with IoT related information (device id, model id, …), mapping them to the cloud events attributes.
We need to define how to best use the existing attributes, and fix any spec-violating constructs that we currently might have.
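As one illustration of the direction, the mapping could be centralized in a single function that only emits spec-compliant attribute names (the specific names and URI scheme below are assumptions for the sketch, not a decided mapping):

```rust
use std::collections::BTreeMap;

// Hypothetical central mapping of Drogue identifiers onto CloudEvents
// attributes: the spec-defined `source`/`subject` plus a lowercase
// alphanumeric extension attribute, as the CloudEvents spec requires
// for extension attribute names.
fn map_attributes(app_id: &str, device_id: &str) -> BTreeMap<String, String> {
    let mut attrs = BTreeMap::new();
    // `source` identifies the context producing the event (scheme assumed).
    attrs.insert("source".into(), format!("drogue://{}", app_id));
    // `subject` carries the device the event is about.
    attrs.insert("subject".into(), device_id.to_string());
    // Extension attribute: lowercase alphanumeric only, so `device`
    // rather than something like `device-id`.
    attrs.insert("device".into(), device_id.to_string());
    attrs
}

fn main() {
    let attrs = map_attributes("app1", "sensor1");
    println!("{:?}", attrs);
}
```

Having one such function would make it easy to audit every attribute we set against the spec in one place.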
Currently everything is tied to the drogue-cloud namespace:
kubectl describe ksvc command-endpoint
Name: command-endpoint
Namespace: drogue-iot
Labels: app.kubernetes.io/part-of=endpoints
image-source=build
Annotations: serving.knative.dev/creator: kubernetes-admin
serving.knative.dev/lastModifier: system:serviceaccount:knative-eventing:eventing-webhook
API Version: serving.knative.dev/v1
Kind: Service
Metadata:
Creation Timestamp: 2021-01-13T17:05:42Z
Generation: 2
Managed Fields:
API Version: serving.knative.dev/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:labels:
.:
f:app.kubernetes.io/part-of:
f:image-source:
f:spec:
.:
f:template:
.:
f:metadata:
.:
f:labels:
.:
f:bindings.knative.dev/include:
f:image-source:
f:spec:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2021-01-13T17:05:42Z
API Version: serving.knative.dev/v1
Fields Type: FieldsV1
fieldsV1:
f:spec:
f:template:
f:spec:
f:containers:
Manager: webhook
Operation: Update
Time: 2021-01-13T17:06:05Z
API Version: serving.knative.dev/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:address:
.:
f:url:
f:conditions:
f:latestCreatedRevisionName:
f:latestReadyRevisionName:
f:observedGeneration:
f:traffic:
f:url:
Manager: controller
Operation: Update
Time: 2021-01-13T17:06:15Z
Resource Version: 7959
Self Link: /apis/serving.knative.dev/v1/namespaces/drogue-iot/services/command-endpoint
UID: 541a28b8-9a16-4aad-8b84-70e48c9a9890
Spec:
Template:
Metadata:
Creation Timestamp: <nil>
Labels:
bindings.knative.dev/include: true
Image - Source: build
Spec:
Container Concurrency: 0
Containers:
Env:
Name: RUST_LOG
Value: info
Name: K_SINK
Value: http://iot-commands-kn-channel.drogue-iot.svc.cluster.local
Name: K_CE_OVERRIDES
Image: quay.io/dejanb/command-endpoint:latest
Name: user-container
Readiness Probe:
Success Threshold: 1
Tcp Socket:
Port: 0
Resources:
Enable Service Links: false
Timeout Seconds: 300
Traffic:
Latest Revision: true
Percent: 100
Status:
Address:
URL: http://command-endpoint.drogue-iot.svc.cluster.local
Conditions:
Last Transition Time: 2021-01-13T17:06:13Z
Status: True
Type: ConfigurationsReady
Last Transition Time: 2021-01-13T17:06:15Z
Status: True
Type: Ready
Last Transition Time: 2021-01-13T17:06:15Z
Status: True
Type: RoutesReady
Latest Created Revision Name: command-endpoint-00002
Latest Ready Revision Name: command-endpoint-00002
Observed Generation: 2
Traffic:
Latest Revision: true
Percent: 100
Revision Name: command-endpoint-00002
URL: http://command-endpoint.drogue-iot.172.18.0.2.nip.io
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 10m service-controller Created Configuration "command-endpoint"
Normal Created 10m service-controller Created Route "command-endpoint"
I am using the following script, script.sh, which does the following.
After running the script, the trust anchor is successfully added to the Application object; this can be verified as the app object contains the following:
"status": {
  "trustAnchors": {
    "anchors": [
      {
        "valid": {
          "certificate": "...",
          "notAfter": "2022-06-23T12:31:15Z",
          "notBefore": "2021-06-23T12:31:15Z",
          "subject": "O=Drogue IoT, OU=Cloud, CN=app12"
        }
      }
    ]
  }
}
Then I use the device certificate to authenticate using the following command.
http --cert test-certs/device-certs.pem --cert-key test-certs/app-private.key POST https://http.sandbox.drogue.cloud/v1/foo
but it returns 403.
While monitoring the server logs while running this request with @jbtrystram, we found this:
[2021-06-23T12:31:31Z DEBUG drogue_cloud_http_endpoint] Accepting client certificates: "[organizationName = \"Drogue IoT\", organizationalUnitName = \"Cloud\", commonName = \"d7\"]"
[2021-06-23T12:31:31Z DEBUG drogue_cloud_http_endpoint] Accepting client certificates: "[organizationName = \"Drogue IoT\", organizationalUnitName = \"Cloud\", commonName = \"d7\"]"
[2021-06-23T12:31:31Z DEBUG drogue_cloud_http_endpoint::x509] Try extracting client cert
[2021-06-23T12:31:32Z DEBUG actix_web::extract] Error for Option<T> extractor: UnknownError
[2021-06-23T12:31:32Z DEBUG drogue_cloud_http_endpoint::telemetry] Publish to 'foo'
[2021-06-23T12:31:32Z DEBUG actix_web::middleware::logger] Error in response: HttpEndpointError(AuthenticationError)
[2021-06-23T12:31:32Z INFO actix_web::middleware::logger] 10.130.2.1:53128 "POST /v1/foo HTTP/1.1" 403 76 "-" "HTTPie/0.9.8" 0.000071
[2021-06-23T12:31:33Z DEBUG drogue_cloud_http_endpoint::x509] Try extracting client cert
[2021-06-23T12:31:33Z DEBUG actix_web::extract] Error for Option<T> extractor: UnknownError
[2021-06-23T12:31:33Z DEBUG drogue_cloud_http_endpoint::telemetry] Publish to 'status'
[2021-06-23T12:31:33Z DEBUG drogue_client::openid::provider] Token still valid
[2021-06-23T12:31:33Z DEBUG hyper::client::pool] reuse idle connection for ("http", authentication-service)
[2021-06-23T12:31:33Z DEBUG hyper::proto::h1::io] flushed 1629 bytes
[2021-06-23T12:31:33Z DEBUG hyper::proto::h1::io] parsed 3 headers
[2021-06-23T12:31:33Z DEBUG hyper::proto::h1::conn] incoming body is content-length (464 bytes)
[2021-06-23T12:31:33Z DEBUG hyper::proto::h1::conn] incoming body completed
[2021-06-23T12:31:33Z DEBUG hyper::client::pool] pooling idle connection for ("http", authentication-service)
[2021-06-23T12:31:33Z DEBUG reqwest::async_impl::client] response '200 OK' for http://authentication-service/api/v1/auth
Similar to the MQTT Integration endpoint, we should have a web service version of this. This could also replace the current SSE based "Spy" endpoint.
Add tracing capabilities, allowing the use of Jaeger/OpenTracing with our services/endpoints.
I just had two HTTP commands with a command delay, which resulted in a panic due to an .unwrap():
[2021-03-10T15:44:37Z DEBUG drogue_cloud_endpoint_common::commands] Device Id { app_id: "app_id", device_id: "device_id" } subscribed to receive commands
[2021-03-10T15:44:37Z DEBUG drogue_cloud_endpoint_common::commands] Device Id { app_id: "app_id", device_id: "device_id" } unsubscribed from receiving commands
thread 'actix-rt|system:0|arbiter:0' panicked at 'called `Option::unwrap()` on a `None` value', http-endpoint/src/command.rs:27:39
[2021-03-10T15:44:37Z DEBUG drogue_cloud_endpoint_common::commands] Device Id { app_id: "app_id", device_id: "device_id" } unsubscribed from receiving commands
thread 'actix-rt|system:0|arbiter:2' panicked at 'called `Option::unwrap()` on a `None` value', http-endpoint/src/command.rs:27:39
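Without looking at the actual command.rs, the usual fix for this class of panic is to handle the None case explicitly instead of unwrapping; a sketch with illustrative names (not the real code):

```rust
// General pattern for the failure above: an `.unwrap()` on an Option
// that can legitimately be `None` when the device unsubscribes between
// two steps. Handling `None` explicitly avoids panicking the arbiter
// thread. All names here are illustrative, not the actual command.rs.
fn deliver_command(subscriber: Option<&str>, command: &str) -> Result<(), String> {
    match subscriber {
        Some(device) => {
            println!("delivering {:?} to {}", command, device);
            Ok(())
        }
        // Previously this path would have been the `.unwrap()` that
        // panicked; now it degrades into a handled error.
        None => Err(format!("no subscriber for command {:?}", command)),
    }
}

fn main() {
    assert!(deliver_command(Some("device_id"), "reboot").is_ok());
    assert!(deliver_command(None, "reboot").is_err());
}
```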
Although we know Kubernetes and the ecosystem well, users new to Drogue IoT may not be familiar with these technologies, and the bar for getting drogue cloud working just for evaluation is high.
The sandbox alleviates this a bit, but in the end it is just a sandbox.
If we could provide a smaller, more self-contained version of drogue cloud that could run locally, ideally in a single binary packaged the same way as the drg tool, that would lower the bar significantly.
Some goals for such a tool would be:
Example for what this could look like:
Spinning up a local server with default auth, kafka and postgresql alternatives:
$ drg server run
Starting services...done!
Console: https://localhost:8080
Running a local server but using third-party services for dependencies (probably needs more options for credentials and such):
$ drg server run --database-url myhost:5354 --kafka-bootstrap myhost:12345 --oauth-server https://cloud.google.com/...
...
Whether or not this would be baked into drg is not that important, but I think if it were, it would be extremely simple, and then you could easily switch to a 'scalable Drogue IoT' instance using that very same tool.
We still have a single Kafka topic, but have wanted to have different topics for a while.
There are three modes we could have here:
While using the sandbox, viewing the details of a device works well, but the details do not open when the device name has a space in it.
Here is a video to explain the situation.
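A plausible cause is that the device name is interpolated into the details URL without percent-encoding, so the space breaks the route. A minimal sketch of path-segment encoding per RFC 3986 (a real frontend would use the platform's URL APIs rather than this hand-rolled version):

```rust
// Percent-encode a string for use as a single URL path segment,
// keeping only the RFC 3986 unreserved characters.
fn encode_path_segment(s: &str) -> String {
    let mut out = String::new();
    for b in s.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            // Everything else (including the space) becomes %XX.
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

fn main() {
    assert_eq!(encode_path_segment("my device"), "my%20device");
    println!("{}", encode_path_segment("my device"));
}
```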
The current structure of the management API is a bit "grown". We should re-structure the current API:
When deploying the stack, the console-backend has an empty client secret at first:
CLIENT_ID=drogue
CLIENT_SECRET=
While the secret has the proper content:
data:
CLIENT_ID: ZHJvZ3Vl
CLIENT_SECRET: ZTc0ODM2Y2QtMGY3NS00ZDZkLWIzNDEtNDIyNDE5YjZjMTk3
I assume the keycloak operator updates the secret later on, so we must check this somehow.
I wasn't able to get API keys to work. I am not sure if the Either<Bearer,Basic> idea works as intended.
I added a few lines of debug output in the start_ function, but never saw them triggered.
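To sanity-check the Either<Bearer, Basic> dispatch independently of the actix extractors, the header parsing can be reproduced in isolation (an illustrative sketch, not the actual extractor code):

```rust
// Minimal stand-in for an Either<Bearer, Basic> credential dispatch,
// parsing the Authorization header scheme directly.
#[derive(Debug, PartialEq)]
enum Credentials {
    Bearer(String),
    Basic(String),
}

fn parse_authorization(header: &str) -> Option<Credentials> {
    if let Some(token) = header.strip_prefix("Bearer ") {
        Some(Credentials::Bearer(token.to_string()))
    } else if let Some(encoded) = header.strip_prefix("Basic ") {
        // Still base64-encoded here; decoding is left to the caller.
        Some(Credentials::Basic(encoded.to_string()))
    } else {
        // Unknown scheme: neither variant matches.
        None
    }
}

fn main() {
    assert_eq!(
        parse_authorization("Bearer abc"),
        Some(Credentials::Bearer("abc".into()))
    );
    assert!(parse_authorization("Digest xyz").is_none());
}
```

If this logic is sound, the problem is more likely in how the extractor is wired into the routes than in the Either idea itself.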
Just a thought I had regarding our discussion on scripts vs helm vs operator. This felt too small to be an RFC, so I just thought I'd start a discussion here. The idea is just to modify/rename drogue.sh and status.sh, or wrap them in a drgadm script.
The drgadm script could just be drogue.sh renamed, wrap it, or combine it with helm or an operator under the hood. I find it appealing that you don't really need to touch Kubernetes with such an interface. It would be similar to the kubeadm tool that exists for installing and managing Kubernetes clusters, in the same way drg is similar to kubectl.
drgadm install -c kubernetes # Does whatever drogue.sh does
drgadm status # Invokes status.sh
Maybe the first step could be just to refactor the drogue.sh and status.sh into functions that can be sourced and invoked from this tool?
Where this ties into helm I'm not sure, but if it uses helm, we introduce a dependency other than kubectl. Maybe helm would just be separate from this, I'm not sure.
Some services fail to start:
NAME URL LATESTCREATED LATESTREADY READY REASON
device-management-service http://device-management-service.drogue-iot.10.103.42.167.nip.io device-management-service-00001 False RevisionMissing
http-endpoint http://http-endpoint.drogue-iot.10.103.42.167.nip.io http-endpoint-00002 http-endpoint-00002 True
influxdb-pusher http://influxdb-pusher.drogue-iot.10.103.42.167.nip.io influxdb-pusher-00001 False RevisionMissing
It is possible to nudge them:
➜ install-minikube-0.2.0-rc3 kn -n drogue-iot service update device-management-service -e N=1
Updating Service 'device-management-service' in namespace 'drogue-iot':
0.039s unsuccessfully observed a new generation
0.089s Configuration "device-management-service" does not have any ready Revision.
0.134s Configuration "device-management-service" is waiting for a Revision to become ready.
3.416s ...
3.460s Ingress has not yet been reconciled.
3.511s Waiting for load balancer to be ready
3.695s Ready to serve.
Service 'device-management-service' updated to latest revision 'device-management-service-bswsg-2' is available at URL:
http://device-management-service.drogue-iot.10.103.42.167.nip.io
However, this should not be necessary: tracking knative/serving#10344
The "spy" in the web console has a "start" button, but no "stop" or "pause" button.
It should have one. For "pause", it would be important to define what that would mean exactly. I guess "stop" is easier to implement.
Authenticate devices using username/password.
Currently we only test a fraction of the Web UI. We need to test more, most likely in the drogue-cloud-testing repository.
However, we might need to improve the console UI a bit, so that we can more easily identify UI components (using IDs).
We also should wait for jonhoo/fantoccini#134 to be resolved.