paritytech / helm-charts
Parity Helm charts collection
License: GNU General Public License v3.0
We should create dedicated charts to provide standardized deployments of polkadot and polkadot-parachain (cumulus). These would be wrappers around the generic "node" helm-chart with values correctly set for their respective latest image tag, as well as more comprehensive docs on how to configure them correctly.
Update:
As agreed in this comment, this will now be split into a polkadot-stack chart and a polkadot-parachain-stack chart.
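For illustration, a minimal sketch of what such an umbrella chart's Chart.yaml could look like, assuming the wrapper depends on the generic node chart from this repository (the repository URL, chart names and version constraints below are placeholders, not confirmed values):

```yaml
apiVersion: v2
name: polkadot-stack
description: Wrapper around the generic "node" chart with Polkadot defaults
version: 0.1.0            # placeholder chart version
appVersion: "0.9.43"      # placeholder polkadot image tag
dependencies:
  - name: node
    version: ">=5.0.0"    # placeholder dependency range
    repository: https://paritytech.github.io/helm-charts/
```

The wrapper's values.yaml would then only need to pin the image and document chain-specific settings.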
Docker images in use to be wrapped by the umbrella chart:
Automatic node port discovery was introduced in #28; however, in some cases the operator will not want to, or be able to, open a large range of ports (30000-32767) on their Kubernetes nodes.
An option should be added to fix the assigned nodePort; however, in this case it might be impossible to support more than 1 replica for the statefulset, as fixing the port will result in a port conflict for the second replica.
Similarly, it should be possible to deploy a node which uses a fixed p2p IP by setting loadBalancerIP on LoadBalancer services, using a pre-reserved IP at the Cloud Provider (e.g. for GCP).
However, in this case the p2p service would no longer be of type NodePort but of type LoadBalancer.
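A sketch of what the corresponding values could look like; the key names below are assumptions, not the chart's current interface:

```yaml
node:
  perNodeServices:
    relayP2pService:
      # Option 1: pin the NodePort (limits the StatefulSet to a single replica)
      type: NodePort
      nodePort: 30333
      # Option 2: use a pre-reserved static IP at the cloud provider instead
      # type: LoadBalancer
      # loadBalancerIP: 34.123.45.67
```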
It would be useful to be able to independently annotate the auto-created p2p Services.
This would enable using kubernetes-sigs/external-dns to manage Bootnodes / RPC nodes DNS entries.
For example:
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/load-balancer-type: Internal
    external-dns.alpha.kubernetes.io/hostname: bootnode.testnet.parity.io.
PodDisruptionBudget support would be very useful to be resilient to Kubernetes node pool upgrades.
It should be configured as such:
node:
  disruptionBudget:
    # only one of minAvailable and maxUnavailable can be set
    minAvailable:
    maxUnavailable:
The template logic must check that minAvailable and maxUnavailable are not set at the same time.
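A possible template sketch for this, assuming the usual fullname/selectorLabels helpers of the chart (helper names are assumptions, not verified against the actual templates):

```yaml
{{- if or .Values.node.disruptionBudget.minAvailable .Values.node.disruptionBudget.maxUnavailable }}
{{- if and .Values.node.disruptionBudget.minAvailable .Values.node.disruptionBudget.maxUnavailable }}
{{- fail "node.disruptionBudget: set only one of minAvailable and maxUnavailable" }}
{{- end }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ include "node.fullname" . }}
spec:
  {{- with .Values.node.disruptionBudget.minAvailable }}
  minAvailable: {{ . }}
  {{- end }}
  {{- with .Values.node.disruptionBudget.maxUnavailable }}
  maxUnavailable: {{ . }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "node.selectorLabels" . | nindent 6 }}
{{- end }}
```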
If collator.isParachain is set to true, the node exposes 2 Prometheus ports (relaychain and parachain). If monitoring is enabled, both ports should be scraped.
Currently only parachain metrics are collected; both parachain and relaychain metrics should be collected.
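For illustration, a ServiceMonitor scraping both ports might look like this; the port names, labels and the second metrics port are assumptions about how the chart exposes them:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: collator-metrics          # placeholder name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: node   # placeholder label
  endpoints:
    - port: prometheus            # parachain metrics (9615)
    - port: prometheus-relay      # relaychain metrics, port name assumed
```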
Currently, when a new polkadot release is available, people have to change their values files to point to the new version.
I think the best approach is to release a new version of the node helm chart for each release and use the chart appVersion as the default for the image tag. People who use our chart for their deployments can then easily update their nodes by bumping the chart version instead of editing the values.yaml file manually or through a pipeline.
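This is the usual Helm idiom; a sketch of the relevant line in the StatefulSet template (the exact values keys used by the chart may differ):

```yaml
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
```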
paritytech/substrate#13384 removed support for some RPC flags as both HTTP and Websocket protocols are now served on a single port. Node Helm chart should be refactored to support this change.
When inserting keys into a substrate node, they will end up in the /data/.../chains/chain_name/keystore folder, which in our setup is stored in the "data" Kubernetes volume. In a secure setup we don't want this data volume to contain our private keys, so we should mount a tmpfs (i.e. a Kubernetes emptyDir volume) on this path.
This will prevent keys from being persisted on the data disk after having been sourced from Hashicorp Vault or other secure places.
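A sketch of the pod spec change; the chain directory below is illustrative, the real mount path depends on the chain being run:

```yaml
volumes:
  - name: keystore
    emptyDir:
      medium: Memory                 # tmpfs-backed, never written to the data disk
containers:
  - name: polkadot
    volumeMounts:
      - name: keystore
        mountPath: /data/chains/polkadot/keystore   # illustrative chain path
```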
Use the new flag name introduced in paritytech/substrate#11934.
The current --pruning flag still works but no longer shows up in polkadot --help, so let's use the new one.
When we use .extraDerivation in .Values.node.vault.keys, the node helm chart will still inject the vault key even if the vault agent failed to mount it:
cat: /vault/secrets/name: No such file or directory
Inserted key aura (type=aura, scheme=sr25519) into Keystore
The insert command will not fail, since it can derive from a well-known key (for example //Alice, //extraDerivation).
We need to add a check in the inject-vault-keys init-container: if the file does not exist, the init container should fail.
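A minimal sketch of such a guard, assuming the secret is rendered by the Vault agent at the path shown in the error above:

```sh
# fail fast if the Vault agent did not render the secret file
if [ ! -s /vault/secrets/name ]; then
  echo "Vault secret file /vault/secrets/name is missing or empty, aborting key injection" >&2
  exit 1
fi
```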
The polkadot-introspector kvdb tool can be used to monitor the database continuously. We should add support for running this exporter as a sidecar in the node helm chart.
We have successfully set it up with this configuration but it feels like a lot of boilerplate to be adding for people who would like to set this up.
extraContainers:
  - name: relaychain-kvdb-introspector
    image: paritytech/polkadot-introspector:438d3406
    command: [
      "polkadot-introspector",
      "kvdb",
      "--db",
      "/data/chains/versi_v1_9/db/full",
      "--db-type",
      "rocksdb",
      "prometheus",
      "--port",
      "9620"
    ]
    resources:
      limits:
        memory: "1Gi"
    ports:
      - containerPort: 9620
        name: relay-kvdb-prom
    volumeMounts:
      - mountPath: /data
        name: chain-data
  - name: parachain-kvdb-introspector
    image: paritytech/polkadot-introspector:438d3406
    command: [
      "polkadot-introspector",
      "kvdb",
      "--db",
      "/data/chains/versi_v1_9/db/full/parachains/db",
      "--db-type",
      "rocksdb",
      "prometheus",
      "--port",
      "9621"
    ]
    resources:
      limits:
        memory: "1Gi"
    ports:
      - containerPort: 9621
        name: para-kvdb-prom
    volumeMounts:
      - mountPath: /data
        name: chain-data
Note there should be the option to run 1 or 2 sidecars: one to monitor the main db and one for the parachain db (as an option for relay chains).
We also need to create the appropriate ServiceMonitor for loading data in Prometheus.
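For example, a built-in interface could look roughly like this (purely hypothetical values, not implemented in the chart):

```yaml
node:
  kvdbIntrospector:
    enabled: true
    image: paritytech/polkadot-introspector:438d3406
    relaychain:
      port: 9620
    parachain:
      enabled: true          # second sidecar for the parachains db
      port: 9621
    serviceMonitor:
      enabled: true
```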
In the node helm-chart we force-set node.chainData.storageClass to "default", which prevents Kubernetes from using the cluster-defined default storage class if that class is not actually named default.
The solution is to not render the storageClass line in the YAML template when the value is unset.
Reference: https://kubernetes.io/docs/concepts/storage/storage-classes/
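A sketch of the template fix (the field name is the one from the PVC spec, the values key follows the issue above):

```yaml
{{- with .Values.node.chainData.storageClass }}
storageClassName: {{ . }}
{{- end }}
```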
The following issue has been reported to me by @kogeler :
Containers:
  kusama:
    Container ID:  containerd://8a9995292c78499072b0abaf828c67bfbf27024548bf144cb9e2dd511c5d7eb1
    Image:         parity/polkadot:v0.9.16
    Image ID:      docker.io/parity/polkadot@sha256:46ec2899a865ff7640ea3eaaf7306ecfc3128609fc40d7fe587f486c7ff9eba9
    Ports:         9933/TCP, 9944/TCP, 9615/TCP, 30333/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      /bin/sh
    Args:
      -c
      RELAY_CHAIN_P2P_PORT="$(cat /data/relay_chain_p2p_port)"
      echo "RELAY_CHAIN_P2P_PORT=${RELAY_CHAIN_P2P_PORT}"
      exec polkadot \
        --name=${POD_NAME} \
        --base-path=/data/ \
        --chain=${CHAIN} \
        --pruning=archive --rpc-external --ws-external --rpc-methods=safe --rpc-cors=all --prometheus-external --telemetry-url='wss://submit.telemetry.parity-stg.parity.io/submit/ 1'
        --listen-addr=/ip4/0.0.0.0/tcp/${RELAY_CHAIN_P2P_PORT} \
        --listen-addr=/ip4/0.0.0.0/tcp/30333 \
Newline is not escaped after [flags' unwrapping](https://github.com/paritytech/helm-charts/blob/main/charts/node/templates/statefulset.yaml#L283).
If you check the pod you can see:
polkadot@kusama-public-sidecar-node-0:/$ ps -p 1 -o args
COMMAND
polkadot --name=kusama-public-sidecar-node-0 --base-path=/data/ --chain=kusama --pruning=archive --rpc-external --ws-external --rpc-methods=safe --rpc-cors=all --prometheus-external --telemetry-url=wss://submit.telemetry.parity-stg.parity.io/submit/ 1
The --listen-addr flags are in fact absent.
Create end-to-end tests that would cover some of the most common scenarios for deploying a node with the Helm chart: a full node, an RPC node, a bootnode, a validator, and a collator.
The tests should be included in the CI pipeline. All tests should pass before a PR can be merged.
It could e.g. be achieved with one tool for both clouds: s5cmd, which potentially also downloads the backup faster than gsutil. It would be interesting to run some tests.
We should be able to set the pruning option (pruned/archive) in the chart values to set the --pruning flag and add a useful pod/service label.
It's not possible to template the ingressClassName field there, so only the annotation option is possible, which has been deprecated since Kubernetes 1.22.
The node Helm chart templates are missing some of the configurable parameters available in Kubernetes (like loading environment variables from a ConfigMap using envFrom).
It is time-consuming (though rewarding at larger scale) to maintain our own standardized library of Helm templates, so instead I re-used the great template library from Bitnami during the development of the staking-miner Helm chart.
The node Helm chart templates should be refactored in the same way to be consistent with the available Kubernetes features.
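For instance, an envFrom pass-through could be exposed like this (hypothetical values key mirroring the Kubernetes field; the referenced ConfigMap/Secret names are placeholders):

```yaml
node:
  envFrom:
    - configMapRef:
        name: common-node-env      # placeholder ConfigMap
    - secretRef:
        name: common-node-secrets  # placeholder Secret
```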
The Bitnami Helm Readme Generator is very useful to maintain up-to-date README files. We should generalize the use of the tool:
values.yaml comments need to be rewritten to follow the proper syntax.
As pointed out in this discussion, the node key commands might not always be available on node binaries, so we should switch to using the parity/subkey:latest docker image.
Hello @PierreBesson and thank you for creating this helm chart.
I am trying to use it to deploy a faucet for Picasso Rococo
Steps I did before installing the helm chart
composablefi_faucet
Those are the values I used (removed the secrets data)
helm install substrate-faucet parity/substrate-faucet \
--set faucet.secret.SMF_BACKEND_FAUCET_ACCOUNT_MNEMONIC="removed" \
--set faucet.secret.SMF_BOT_MATRIX_ACCESS_TOKEN="removed" \
--set faucet.config.SMF_BACKEND_RPC_ENDPOINT="https://picasso-rococo-rpc-lb.composablenodes.tech/" \
--set faucet.config.SMF_BACKEND_INJECTED_TYPES='{}' \
--set faucet.config.SMF_BACKEND_NETWORK_DECIMALS='12' \
--set faucet.config.SMF_BOT_MATRIX_SERVER="https://matrix.org" \
--set faucet.config.SMF_BOT_MATRIX_BOT_USER_ID="@composablefi_faucet:matrix.org" \
--set faucet.config.SMF_BOT_NETWORK_UNIT="PICA" \
--set faucet.config.SMF_BOT_DRIP_AMOUNT="1"
Testing the endpoint picasso-rpc-lb.composablenodes.tech seems to be fine
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "rpc_methods"}' https://westend-rpc.polkadot.io
{"jsonrpc":"2.0","result":{"methods":["account_nextIndex","author_hasKey","author_hasSessionKeys","author_insertKey
... (content removed)
echo '{"id":1, "jsonrpc":"2.0", "method": "rpc_methods"}' | websocat wss://picasso-rpc-lb.composablenodes.tech
{"jsonrpc":"2.0","result":{"methods":["account_nextIndex","assets_balanceOf","assets_listAssets","author_hasKey","author_hasSessionKeys","author_insertKey","author_pendingExtrinsics","author_removeExtrinsic","author_rotateKeys","author_submitAndWatchExtrinsic","author_sub
... (content removed)
By the way https://rococo-rpc.polkadot.io
seems to be down
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "rpc_methods"}' https://rococo-rpc.polkadot.io
Service Unavailable
echo '{"id":1, "jsonrpc":"2.0", "method": "rpc_methods"}' | websocat wss://rococo-rpc.polkadot.io
websocat: WebSocketError: WebSocketError: Received unexpected status code (503 Service Unavailable)
websocat: error running
while https://westend-rpc.polkadot.io/
works
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "rpc_methods"}' https://westend-rpc.polkadot.io
{"jsonrpc":"2.0","result":{"methods":["account_nextIndex","author_hasKey","author_hasSessionKeys","author_insertKey","author_pendingExtrinsics","author_removeExtrinsic","author_rotateKeys","author_submitAndWatchExtrinsic","author_submitExtrinsic","author_unwatchExtrinsi
... (content removed)
echo '{"id":1, "jsonrpc":"2.0", "method": "rpc_methods"}' | websocat wss://westend-rpc.polkadot.io
{"jsonrpc":"2.0","result":{"methods":["account_nextIndex","author_hasKey","author_hasSessionKeys","author_insertKey
... (content removed)
Here is what I have in the logs
kubectl logs substrate-faucet-85b7c64b7f-vcqsn
yarn run v1.22.5
$ node ./build/src/start.js
2023-03-30 12:46:26 API/INIT: Api will be available in a limited mode since the provider does not support subscriptions
[2023-03-30T12:46:26.928] [INFO] default - 💰 Plip plop - Creating the faucets's account
[2023-03-30T12:46:26.929] [INFO] default - Ignore list: (1 entries)
[2023-03-30T12:46:26.929] [INFO] default - ''
SMF:
  📦 BOT:
    ✅ BACKEND_URL: "http://localhost:5555"
    ✅ DRIP_AMOUNT: 1
    ✅ MATRIX_ACCESS_TOKEN: *****
    ✅ MATRIX_BOT_USER_ID: "@composablefi_faucet:matrix.org"
    ✅ MATRIX_SERVER: "https://matrix.org"
    ✅ NETWORK_DECIMALS: 12
    ✅ NETWORK_UNIT: "PICA"
    ✅ FAUCET_IGNORE_LIST: ""
    ✅ DEPLOYED_REF: "unset"
    ✅ DEPLOYED_TIME: "unset"
[2023-03-30T12:46:26.965] [INFO] default - ✅ BOT config validated
SMF:
  📦 BACKEND:
    ✅ FAUCET_ACCOUNT_MNEMONIC: *****
    ✅ FAUCET_BALANCE_CAP: 100
    ✅ INJECTED_TYPES: "[]"
    ✅ NETWORK_DECIMALS: 12
    ✅ PORT: 5555
    ✅ RPC_ENDPOINT: "https://picasso-rococo-rpc-lb.composablenodes.tech/"
    ✅ DEPLOYED_REF: "paritytech/faucet:latest"
    ✅ DEPLOYED_TIME: "2023-03-30T15:45:28"
    ✅ EXTERNAL_ACCESS: false
    ✅ DRIP_AMOUNT: "0.5"
    ✅ RECAPTCHA_SECRET: *****
[2023-03-30T12:46:26.979] [INFO] default - ✅ BACKEND config validated
[2023-03-30T12:46:26.995] [INFO] default - Starting faucet v1.1.2
[2023-03-30T12:46:26.995] [INFO] default - Faucet backend listening on port 5555.
[2023-03-30T12:46:26.995] [INFO] default - Using @polkadot/api 10.0.1
Connected to the in-memory SQlite database.
Getting saved sync token...
Getting push rules...
Attempting to send queued to-device messages
Got saved sync token
Got reply from saved sync, exists? false
All queued to-device messages sent
2023-03-30 12:46:31 API/INIT: RPC methods not decorated: assets_balanceOf, assets_listAssets, crowdloanRewards_amountAvailableToClaimFor, ibc_clientUpdateTimeAndHeight, ibc_generateConnectionHandshakeProof, ibc_queryBalanceWithAddress, ibc_queryChannel, ibc_queryChannelClient, ibc_queryChannels, ibc_queryClientConsensusState, ibc_queryClientState, ibc_queryClients, ibc_queryConnection, ibc_queryConnectionChannels, ibc_queryConnectionUsingClient, ibc_queryConnections, ibc_queryDenomTrace, ibc_queryDenomTraces, ibc_queryEvents, ibc_queryLatestHeight, ibc_queryNewlyCreatedClient, ibc_queryNextSeqRecv, ibc_queryPacketAcknowledgement, ibc_queryPacketAcknowledgements, ibc_queryPacketCommitment, ibc_queryPacketCommitments, ibc_queryPacketReceipt, ibc_queryProof, ibc_queryRecvPackets, ibc_querySendPackets, ibc_queryUnreceivedAcknowledgement, ibc_queryUnreceivedPackets, ibc_queryUpgradedClient, ibc_queryUpgradedConnectionState, pablo_pricesFor, pablo_simulateAddLiquidity, pablo_simulateRemoveLiquidity
2023-03-30 12:46:32 API/INIT: picasso/10011: Not decorating unknown runtime apis: 0x9c53906fa888fe7c/1, 0x5c497be959ff24ab/1, 0xf60c4a6e7ca253cc/1, 0xa74824145d05c12a/1
Got push rules
Adding default global override for .org.matrix.msc3786.rule.room.server_acl
Checking lazy load status...
Checking whether lazy loading has changed in store...
Storing client options...
Stored client options
Getting filter...
[2023-03-30T12:46:36.615] [INFO] default - Fetched faucet balance 💰
Sending initial sync request...
Waiting for saved sync before starting sync processing...
Adding default global override for .org.matrix.msc3786.rule.room.server_acl
Caught /sync error TypeError: Cannot read properties of undefined (reading 'cryptoStore')
at /faucet/node_modules/matrix-js-sdk/lib/sync.js:1191:49
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async Object.promiseMapSeries (/faucet/node_modules/matrix-js-sdk/lib/utils.js:445:5)
at async SyncApi.processSyncResponse (/faucet/node_modules/matrix-js-sdk/lib/sync.js:1184:5)
at async SyncApi.doSync (/faucet/node_modules/matrix-js-sdk/lib/sync.js:843:9)
[2023-03-30T13:03:00.027] [INFO] default - Auto-joined !JTMeQUcNDfTSdeIIvP:matrix.org.
EventTimelineSet.addLiveEvent: ignoring duplicate event $yZzI-R0yRHKSxnsfHpGveuj3QMgs7T3Zugk7NB__mmI
[2023-03-30T13:03:06.314] [INFO] default - Auto-joined !JTMeQUcNDfTSdeIIvP:matrix.org.
2023-03-30 20:41:56 RPC-CORE: queryStorageAt(keys: Vec<StorageKey>, at?: BlockHash): Vec<StorageChangeSet>:: [502]: Bad Gateway
[2023-03-30T20:41:56.989] [ERROR] default - Error: [502]: Bad Gateway
at HttpProvider._HttpProvider_send (/faucet/node_modules/@polkadot/rpc-provider/cjs/http/index.js:162:19)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async callWithRegistry (/faucet/node_modules/@polkadot/rpc-core/cjs/bundle.js:172:28)
2023-03-30 20:44:06 RPC-CORE: queryStorageAt(keys: Vec<StorageKey>, at?: BlockHash): Vec<StorageChangeSet>:: [502]: Bad Gateway
[2023-03-30T20:44:06.311] [ERROR] default - Error: [502]: Bad Gateway
at HttpProvider._HttpProvider_send (/faucet/node_modules/@polkadot/rpc-provider/cjs/http/index.js:162:19)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async callWithRegistry (/faucet/node_modules/@polkadot/rpc-core/cjs/bundle.js:172:28)
For testing/benchmark use cases, it should be possible to set up a node with non-persistent volumes, so an option to use Kubernetes ephemeral volumes would be useful.
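A sketch of what the StatefulSet could use instead of a persistentVolumeClaim when such an option is enabled (a generic ephemeral volume is shown; an emptyDir would also work for small test chains):

```yaml
volumes:
  - name: chain-data
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi      # placeholder size
```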
Faucet docker image: paritytech/polkadot-testnet-faucet#238
Substrate-connect light clients have been officially announced as ready for the general public (https://www.youtube.com/watch?v=TDbTCrDDO2U). However, from an operational point of view, an obstacle to adoption is that, to work properly, the light client needs access to a bootnode which exposes its p2p over websocket (typically --listen-addr /ip4/0.0.0.0/tcp/30444/ws --listen-addr /ip6/::/tcp/30444/ws). Moreover, as browsers will only allow connecting to a secured websocket, you need a reverse proxy such as nginx in front to add a letsencrypt certificate.
I believe it would be valuable to offer an easy way to deploy such a bootnode to kubernetes with the following:
ping @tomaka
We want to make it easier to reason about the exposed ports for substrate nodes (especially collators, which run 2 nodes in 1). From internal discussions at Parity we brainstormed the following table. The logic is to reuse conventions that arose organically while minimizing confusion (e.g. it is very hard to differentiate 30334 and 30344 at a glance).
To achieve this we propose to shift port numbers by -1000 for the secondary chain (i.e. the relay-chain for a collator). Note that most of the time those ports don't need to be exposed.
| Type    | Primary | Secondary |
|---------|---------|-----------|
| p2p_tcp | 30333   | 29333     |
| p2p_ws  | 30444   | 29444     |
| prom    | 9615    | 8615      |
| rpc     | 9933    | 8933      |
| rpc_ws  | 9944    | 8944      |
Starting from polkadot v0.9.28, logs are full of the following lines:
2022-10-03 20:29:32 Accepting new connection 1/100
2022-10-03 20:29:33 Rejected connection: Transport(i/o error: unexpected end of file
Caused by:
unexpected end of file)
It is caused by the readinessProbe, which uses tcpSocket to check whether the port is open.
Previously the port stayed closed until the node was synced, but the substrate networking stack has been refactored several times and this is no longer the case.
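One possible direction (a sketch only, not necessarily the right fix) is to probe the RPC health endpoint instead of the raw p2p port, so the probe no longer opens bogus p2p connections; this assumes the RPC port is reachable from the kubelet:

```yaml
readinessProbe:
  httpGet:
    path: /health
    port: 9944          # unified RPC port; older releases expose 9933 for HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
```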
We want to encourage people outside of Parity to contribute to this repository.
https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/setting-guidelines-for-repository-contributors
Since helm v3.7.0 we have the option to refer to a subchart's templated helper functions. However, if the parent chart uses identically named helper templates, the child ones will be overwritten.
Additionally, chart.name is fairly close to Chart.Name, which is a helm builtin object, so we should move away from it to avoid confusion.
I suggest we change the naming of these functions from chart.blah to node.blah or some other nomenclature to avoid helper templates being overwritten when called from a parent chart.
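Concretely, the rename in _helpers.tpl would look something like this (helper bodies are only illustrative):

```yaml
{{/* before: clashes with identically named helpers in a parent chart */}}
{{- define "chart.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end }}

{{/* after: prefix helpers with the chart name */}}
{{- define "node.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end }}
```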
When iterating with {{- range $index, $key := .Values.node.keys }}, the helm chart fails to correctly insert .Values.node.command in the subsequent initContainer command. This is because using range changes the scope of .Values.
Consider the following values.yaml:
node:
  keys:
    - type: "gran"
      scheme: "ed25519"
      seed: "//Blah"
This results in the following error: nil pointer evaluating interface {}.node, as node.command is no longer in scope.
The following works but is obviously undesirable (nesting a Values map inside each key item so the lookup resolves against the range element):
node:
  keys:
    - type: "gran"
      scheme: "ed25519"
      seed: "//Blah"
      Values:
        node:
          command: "command"
An easy fix is evaluating {{ .Values.node.command }} as $COMMAND and setting that as an env var, similar to how we set {{ .Values.node.chain }} in the same init container. Alternatively, we can assign {{ .Values.node.command }} to a helm variable before the loop.
Happy to put up a fix for this if you let me know your preferred method.
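The second option is a one-line change; a sketch (the surrounding init-container template is simplified and the key-insert invocation is illustrative):

```yaml
{{- $command := .Values.node.command }}
{{- range $index, $key := .Values.node.keys }}
# inside the range "." is the list element, but $command still holds .Values.node.command
{{ $command }} key insert --key-type {{ $key.type }} --scheme {{ $key.scheme }} --suri "{{ $key.seed }}"
{{- end }}
```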
Available in k8s 1.24+, the maxUnavailable rolling update strategy for StatefulSets will make it much smoother to roll out updates, which currently can only be done 1 pod at a time.
We should be careful to use the proper helm kubernetes version checks in the templates to ensure compatibility with the user's cluster.
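A sketch of the guarded template, assuming a node.updateStrategy.maxUnavailable value is added to the chart:

```yaml
updateStrategy:
  type: RollingUpdate
  {{- if semverCompare ">=1.24-0" .Capabilities.KubeVersion.Version }}
  rollingUpdate:
    maxUnavailable: {{ .Values.node.updateStrategy.maxUnavailable | default 1 }}
  {{- end }}
```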
In our use case we use the pallet-node-authorization and we need to be able to insert the node-key.
Currently the only methods for achieving this are:
- set node.persistGeneratedNodeKey to true and then kubectl exec into the container and put our key in /data/node-key, OR
- set node.persistGeneratedNodeKey to true, take the generated node-key, recreate it in our node-authorization pallet, recreate the chainspec, then download it using node.customChainspecUrl.
A better solution would be giving us the option to provide our own node-key mounted as a read-only secret instead of reading it from the RW data volume.
We can provide more detail if requested.
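A sketch of what the secret-based mount could look like on the pod; the Secret name and mount path are hypothetical, while --node-key-file is the existing substrate flag:

```yaml
volumes:
  - name: node-key
    secret:
      secretName: my-node-key     # hypothetical pre-created Secret
      defaultMode: 0400
containers:
  - name: polkadot
    volumeMounts:
      - name: node-key
        mountPath: /keys
        readOnly: true
    # the node would then be started with --node-key-file=/keys/node-key
```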
Available in k8s 1.23+, persistentVolumeClaimRetentionPolicy allows automatic cleanup of the StatefulSet's associated PVCs when scaling down or deleting the StatefulSet.
This will be a great help for managing testnets that are frequently scaled up and down.
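The corresponding StatefulSet stanza is small; the two policies could be exposed as chart values:

```yaml
persistentVolumeClaimRetentionPolicy:
  whenDeleted: Delete    # remove PVCs when the StatefulSet is deleted
  whenScaled: Delete     # remove PVCs of Pods removed by a scale-down
```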
It would be convenient to have the node flags printed out before startup as those are built from a complex template.
When inserting keys via the values.node.keys method, the init container displays them, meaning that anyone who has read access to the statefulSet can read them.
It would probably be better to mount these as a secret and inject them using a file redirect rather than an echo.
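A sketch of the file-based injection, assuming the seed is mounted from a Secret at /keys/gran-seed (the path and Secret layout are assumptions):

```sh
# read the seed from the mounted Secret instead of echoing it into the command line
polkadot key insert \
  --base-path /data \
  --chain "${CHAIN}" \
  --key-type gran \
  --scheme ed25519 \
  --suri "$(cat /keys/gran-seed)"
```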
Ideally we should have 4 volumes:
In this case we will be able to use the PVC of any working node as the source for the next node.
Originally posted by @BulatSaif in #111 (comment)
When using the node helm chart, it is really hard to figure out whether flags should be set in the chart values or inside node.flags. The situation can be improved:
- pruning and database options should be under chainData and relayChainData.
- --name=${POD_NAME} by default and the same telemetry URLs as the main chain.
- --ws-port and --rpc-port
The parachain collator 0.9.300 supports collation via an RPC relay chain node. In this mode the collator doesn't need to run a local relay-chain node and simply needs to point to a relay-chain RPC URL.
Add support for this mode: a node.collatorRelayChain.rpcUrl value which, when set, adds --relay-chain-rpc-url ws://rpc-node-url and disables the relay-chain data and keystore volume.
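A sketch of the proposed values and the resulting flag (the URL below is a placeholder):

```yaml
node:
  collatorRelayChain:
    rpcUrl: ws://relay-rpc.example.com:9944   # placeholder URL
# would render: --relay-chain-rpc-url ws://relay-rpc.example.com:9944
# and skip the relay-chain data and keystore volumes
```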
Hi team,
I'm using the node chart and having some trouble with p2p networking between pods.
It looks like I created a fork where pods on the aks-testnet-23183882-vmss000001 node are isolated from those on the other node (peer discovery is working locally with the --allow-private-ipv4 flag within each node).
> k get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ajuna-node-0 1/1 Running 0 5m51s 10.244.1.10 aks-testnet-23183882-vmss000000 <none> <none>
ajuna-node-1 1/1 Running 0 5m5s 10.244.2.13 aks-testnet-23183882-vmss000001 <none> <none>
ajuna-validator-0-0 1/1 Running 0 5m51s 10.244.2.11 aks-testnet-23183882-vmss000001 <none> <none>
ajuna-validator-1-0 1/1 Running 0 5m51s 10.244.2.12 aks-testnet-23183882-vmss000001 <none> <none>
ajuna-validator-2-0 1/1 Running 0 5m51s 10.244.1.11 aks-testnet-23183882-vmss000000 <none> <none>
Commands I'm using:
--chain=testnet
--name=$(POD_NAME)
--base-path=/data
--rpc-cors=all
--ws-external
--rpc-methods=safe
--allow-private-ipv4
--listen-addr=/ip4/0.0.0.0/tcp/30333
Any pointers you could share to debug this?
The Horizontal Pod Autoscaler added in #120, when enabled, conditionally removes the replicas field from the StatefulSet. Creation of p2p services relies on the presence of that field, so when the replica count is scaled up or down by the HPA, the additional p2p services are not created/removed and the corresponding Pods cannot work.
We need to check whether Helm has any mechanism to rely on the current replica count set by the HPA, or, if not, implement a custom handler that monitors K8s scaling events and creates/removes p2p services accordingly.
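One Helm-side idea (an untested sketch; the node.fullname helper name and node.replicas value are assumptions) is to read the live replica count with the lookup function and fall back to the values when the StatefulSet does not exist yet:

```yaml
{{- $live := lookup "apps/v1" "StatefulSet" .Release.Namespace (include "node.fullname" .) }}
{{- $replicas := .Values.node.replicas | int }}
{{- if $live }}
{{- $replicas = $live.spec.replicas | int }}
{{- end }}
{{- range $i := until $replicas }}
# ...render one p2p Service per replica index $i...
{{- end }}
```

Note that lookup returns an empty result during helm template and dry-run, so the fallback to the values still matters.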
Add an option to the node helm chart for doing the following:
- add the --jaeger-agent 127.0.0.1:$AGENT_PORT flag to the node

Startup probes appear to be consistently failing in our testnets:
polkadot version: 0.9.36
probe config:
Startup: http-get http://:http-rpc/health delay=0s timeout=1s period=10s #success=1 #failure=30
Warning Unhealthy 18m (x123 over 27h) kubelet Startup probe failed: Get "http://10.20.142.31:9933/health": dial tcp 10.20.142.31:9933: connect: connection refused
The service account name is already defined in serviceAccount.name, so it's strange that it's possible to redefine it in node.serviceAccountName, and it's unclear what purpose that serves.
Note: apparently it is mandatory to set it to the right value for Vault auth to work:
vault.hashicorp.com/role: {{ .Values.node.vault.authRole | default (include "node.serviceAccountName" .) | squote }}
Would like to use HashiCorp Vault in the Substrate/Polkadot helm charts for secret management and protecting sensitive data, in this case the node keys, for instance.
Currently an exec to curl is used in the startup probe to work around the issue of the RPC endpoint only allowing local network access by default.
Possible solutions:
- use an httpGet probe. However, it will be the user's responsibility to set the correct flags to allow kubernetes probe traffic to reach the RPC endpoint.

As mentioned in #108 (comment), it would be nice to have examples of provisioning a Substrate node with an Ingress object (possibly for different popular Ingress Controllers). Ingress is usually required to have more control over proxying the traffic to the node. We at Parity are using Ingress to proxy traffic to boot nodes and RPC nodes. We can put examples of using it into a separate examples directory to not pollute the original Helm chart.
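A minimal example such an examples directory could contain (hostname, service name and ingress class are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: node-rpc
spec:
  ingressClassName: nginx
  rules:
    - host: rpc.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-node          # the node's RPC Service
                port:
                  number: 9944
```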
Make the following changes to the polkadot notification chart:
If I recall, the polkadot images are pretty friendly with running as read-only
as long as you use a volume for the chain data.
containers:
  - name: mynode
    image: <the_image>
    securityContext:
      runAsUser: 1000
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
For tmp containers where we do not want to mount a volume, the solution to keep running the container as read-only is to mount whatever folder the node needs to write to as tmpfs.
The node chart has an init container, backup-chain-gcs, that dumps the db to GCP on startup.
If the pod is in CrashLoopBackOff, the same dump will be uploaded to GCP on every restart.
We need to create a "lock file" or "status of the last backup" file which will be checked before the db is uploaded to GCP:
- if the last backup is younger than 1h (1h should be configurable in values.yml), skip the backup.
- if the last backup failed (we have a lock file), fail with an error message.
Since the pod is in CrashLoopBackOff it will be hard to exec into the pod and clean the lockfile, so we need to add an option to remove it.
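A sketch of such a guard at the top of the backup script; the file locations, the FORCE_CLEAN_LOCK override and the age threshold are assumptions (the threshold would come from values.yml):

```sh
LOCK_FILE=/data/.backup.lock
STAMP_FILE=/data/.last_backup_timestamp
MAX_AGE=3600   # 1h, to be made configurable

# a leftover lock file means the previous backup did not finish cleanly
if [ -f "$LOCK_FILE" ] && [ "${FORCE_CLEAN_LOCK:-false}" != "true" ]; then
  echo "previous backup failed (lock file present), refusing to upload" >&2
  exit 1
fi

# skip the upload if the last successful backup is recent enough
if [ -f "$STAMP_FILE" ] && [ $(( $(date +%s) - $(cat "$STAMP_FILE") )) -lt "$MAX_AGE" ]; then
  echo "last backup is younger than ${MAX_AGE}s, skipping"
  exit 0
fi
```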