
w3f / 1k-validators-be


Thousand Validators Program backend.

License: Apache License 2.0

TypeScript 96.18% Dockerfile 0.07% Smarty 0.02% Shell 0.49% JavaScript 0.48% HTML 0.35% CSS 2.41%

1k-validators-be's People

Contributors

alexw3f, arthurhoeke, benjwi, benwhitejam, ccris02, dependabot-preview[bot], dependabot[bot], fgimenez, florianfranzen, ironoa, itrouble, joepetrowski, krzysztof-jelski, kubaw3f, lamafab, legendnodes, leostake, lsaether, mathcryptodoc, meistermike2, michalisfr, mohamedhabas11, mutantcornholio, nexus2k, noc2, pampatzoglou, paradox-tt, stakeworld, w3fbot, wpank


1k-validators-be's Issues

Add maximum accounts per identity

In order to limit validators of a single identity to running only a specified number of validator candidates in the program, we need a new parameter and check to ensure that only a maximum of the same identity (including sub-identities) are registered in the program.
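A minimal TypeScript sketch of such a check, assuming a `Candidate` shape with a parent-identity field and a `maxCandidatesPerIdentity` parameter (both names are illustrative, not the actual backend types):

```typescript
// Hypothetical sketch: enforce a per-identity candidate cap.
interface Candidate {
  stash: string;
  identity: string | null; // parent identity, covering sub-identities
}

function exceedsIdentityCap(
  candidates: Candidate[],
  newCandidate: Candidate,
  maxCandidatesPerIdentity: number
): boolean {
  // Candidates without an identity are handled by other validity checks.
  if (!newCandidate.identity) return false;
  const sameIdentity = candidates.filter(
    (c) => c.identity === newCandidate.identity
  ).length;
  return sameIdentity >= maxCandidatesPerIdentity;
}
```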

Track if rewards are getting paid out

We need a tool to track if rewards are being paid out by all validators which are nominated as part of this programme.

We should probably expect validators to handle calling their own payouts, so if we detect that a validator does not do this, we can give a warning.

False nominations?

Hi,

In my understanding, only valid candidates should be nominated by the system. However, some invalid candidates have also been nominated. The following data was retrieved from the '/nominators' endpoint in era 1616 and may represent false nominations. Here is the API to check false nominations: https://onekv.herokuapp.com/falseNominations

[
  {
    "stash": "HhcrzHdB5iBx823XNfBUukjj4TUGzS9oXS8brwLm4ovMuVp",
    "name": "KIRA Staking",
    "elected": false,
    "nominatorAddress": "5C8ZU7zugMubgENdcyiZouHcVYSoeWbF8TpXSdWjStzYbFZW",
    "reason": "KIRA Staking has an identity but is not verified by registrar."
  },
  {
    "stash": "EtJ4HxHYEDvYWRJAdmV4hYpTbGMJCmEgnLC8zAf6u5ZyT7C",
    "name": "WolfEdge-Capital",
    "elected": false,
    "nominatorAddress": "5C8ZU7zugMubgENdcyiZouHcVYSoeWbF8TpXSdWjStzYbFZW",
    "reason": "WolfEdge-Capital does not have an identity set."
  },
  {
    "stash": "Dcw5vVBmon1PCERJXkYLvvMVmAE8xdqytUwNQLE8p1Hm33J",
    "name": "robonomics_team-01",
    "elected": false,
    "nominatorAddress": "5DZN69GLFZbm7cF65QBSHC7Ndeqwgjsq7XptnvYbSHHxe7aa",
    "reason": "robonomics_team-01 has an identity but is not verified by registrar."
  },
  {
    "stash": "J7Z1bxUB7qhxjqT5js6yAkCZoU1VYNxPvTdg9mtyNNbU845",
    "name": "Cube3-KSM-Val1-ValidatorA",
    "elected": false,
    "nominatorAddress": "5DZN69GLFZbm7cF65QBSHC7Ndeqwgjsq7XptnvYbSHHxe7aa",
    "reason": "Cube3-KSM-Val1-ValidatorA does not have an identity set."
  },
  {
    "stash": "CgpV58FSvuzGmfZXfiAQfkdDMVcFtpMq91ahk2zNYZdjdR9",
    "name": "LunaNova-KSM-Val1-ValidatorA",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "LunaNova-KSM-Val1-ValidatorA does not have an identity set."
  },
  {
    "stash": "FrQ4W8Bo6wgXzkaGHLzVFSsfbWWHvqGGNP1YkRmTPSkN17J",
    "name": "otter-sv-validator-1",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "otter-sv-validator-1 offline. Offline since 0."
  },
  {
    "stash": "HRYTEruAjwDD46kkgaTYpGHQC6uea3AkeLJg4iterSmmjo2",
    "name": "Tornado-V1",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "Tornado-V1 offline. Offline since 0."
  },
  {
    "stash": "DAexrmQxJ8TKiqpcU2QSn2QiGppGCpWZkJ9p7Nyhm7DW6nB",
    "name": "liberty-sv-validator-0",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "liberty-sv-validator-0 does not have an identity set."
  }
]

Add endpoint to fetch individual candidates

In creating a details page, it would be nice to have an endpoint to query by address that returns only the individual candidate's data.

So something like /candidates/<validator_address>.
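The lookup behind such an endpoint could be as simple as the following sketch (the `CandidateRecord` shape and function name are assumptions; wiring it into the actual HTTP router is omitted):

```typescript
// Hypothetical lookup behind GET /candidates/<validator_address>;
// the record shape is illustrative only.
interface CandidateRecord {
  stash: string;
  name: string;
  rank: number;
}

function findCandidate(
  candidates: CandidateRecord[],
  address: string
): CandidateRecord | undefined {
  // Return only the single matching candidate, or undefined for a 404.
  return candidates.find((c) => c.stash === address);
}
```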

Docs

Add some documentation to explain how everything works.

Abstract the constraints to be more modular

Right now the backend is pretty specific to the 1k-v use case. However, if we abstracted the validator requirements into their own constraints.js file and allowed this to be passed in as an option, the backend could be used by other nominator services.
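A possible shape for such a pluggable constraints module (the interface and names are purely illustrative):

```typescript
// Hypothetical pluggable constraints: each rule inspects a candidate and
// returns a reason string when the candidate is invalid, or null when valid.
interface ConstraintRule {
  name: string;
  check(candidate: Record<string, unknown>): string | null;
}

function validateCandidate(
  candidate: Record<string, unknown>,
  rules: ConstraintRule[]
): string[] {
  // Collect every failure reason; an empty array means the candidate passes.
  return rules
    .map((r) => r.check(candidate))
    .filter((reason): reason is string => reason !== null);
}
```

A nominator service would then supply its own rule set instead of hard-coded 1k-v checks.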

Enable better fault detection

Right now faults are not always given for behaviour that should induce a fault. These fault events should be more strictly enforced.

Fix Docker-Compose setup

Right now the docker-compose setup for testing things locally doesn't quite work. The docker images are a bit out of date.

These should be updated, and also the telemetry frontend should be added as well for double checking things. Perhaps it might also be helpful to include another node or two.

https://github.com/wpank/polkadot-local-network/tree/master/scripts/testing

One thing I've also added in a similar approach in the above repo is a bunch of scripts for operating on the docker containers. Having some of these might be a good way to test things out as well.

Recover from inconsistent API connection

If the API connection is inconsistent when the CronJob goes to endRound or startRound, then the transactions will not be made. The script should have a way to recover from an inconsistent API connection.

  • It should detect if the connection is inconsistent.
  • If it's inconsistent, it should wait until the API connection is good before trying to send transactions.
  • It should have reliable monitoring of transactions.
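The wait-until-healthy behaviour could be sketched like this, assuming some `checkHealth` probe against the API (the name and signature are assumptions):

```typescript
// Hypothetical guard: poll a health probe until the API connection is
// good, or give up after a bounded number of attempts.
async function waitUntilHealthy(
  checkHealth: () => Promise<boolean>,
  retryMs = 5000,
  maxAttempts = 12
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await checkHealth()) return true;
    await new Promise((resolve) => setTimeout(resolve, retryMs));
  }
  // Caller should alert rather than send transactions on a bad connection.
  return false;
}
```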

Update README

The README should be updated with any new information and the differences between the Kusama program and the Polkadot program.

Rank Reform

For a period of every 4 eras on Kusama and every era on Polkadot (i.e. eraPeriod = floor(currentEra / 4) on Kusama), we should create historical Rank events indicating that an address has gone up a rank for that period of time.

It may look something like the following:

(say the current era is 1000)

{
    address: "<address>",
    eraPeriod: 250,
    erasActive: [996, 997, 999],
    newRank: 27
}

This would make it easier to keep track of previous events, and also compensate for times when the backend misses the moment to increment rank: it can backfill previously missed ranks appropriately by looking at the last rank event.
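Under the 4-era grouping described above, the period arithmetic and backfill could look like this sketch (the helper names are illustrative):

```typescript
// Hypothetical period arithmetic for rank events, assuming 4-era
// periods on Kusama as described above.
function eraPeriod(era: number, erasPerPeriod = 4): number {
  return Math.floor(era / erasPerPeriod);
}

// Periods that elapsed between the last recorded rank event and now;
// each one can be backfilled as its own Rank event.
function missedPeriods(lastEventEra: number, currentEra: number): number[] {
  const periods: number[] = [];
  for (let p = eraPeriod(lastEventEra) + 1; p <= eraPeriod(currentEra); p++) {
    periods.push(p);
  }
  return periods;
}
```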

Revise Nominations to Efficiently Distribute Stake

At the moment a lot of the nominator accounts distribute stake unevenly.

_doNominations should be revised so that each nominator account will nominate (account_balance / lowest_staked_validator * 1.05) candidates.
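One reading of that formula, with the 1.05 factor taken as a 5% buffer over the lowest active stake (this interpretation is an assumption, not confirmed by the issue):

```typescript
// Hypothetical: how many targets one nominator account can cover if each
// target needs roughly the lowest active stake, padded by 5% so the
// nomination stays above the cutoff.
function targetsPerNominator(
  accountBalance: number,
  lowestStakedValidator: number,
  buffer = 1.05
): number {
  return Math.floor(accountBalance / (lowestStakedValidator * buffer));
}
```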

Clients aren't being tracked properly when updated

When a client updates they aren't being registered as running the latest version, and the backend requires a restart. I think the node registering logic is missing the field to update the client version.

TypeError: is not a function when running yarn docker

Not long after running yarn docker, I get the following error:

1kv_1         | (node:28) UnhandledPromiseRejectionWarning: TypeError: this.api.query.staking.activeEra is not a function
1kv_1         |     at ChainData.<anonymous> (/code/src/chaindata.ts:13:52)
1kv_1         |     at Generator.next (<anonymous>)
1kv_1         |     at /code/src/chaindata.ts:8:71
1kv_1         |     at new Promise (<anonymous>)
1kv_1         |     at __awaiter (/code/src/chaindata.ts:4:12)
1kv_1         |     at ChainData.getActiveEraIndex (/code/src/chaindata.ts:12:51)
1kv_1         |     at ScoreKeeper.<anonymous> (/code/src/scorekeeper.ts:198:27)
1kv_1         |     at Generator.next (<anonymous>)
1kv_1         |     at /code/src/scorekeeper.ts:8:71
1kv_1         |     at new Promise (<anonymous>)
1kv_1         | (node:28) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
1kv_1         | (node:28) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

This is with:
node v13.10.1 (npm v6.14.2),
yarn 1.22.0

This is on a machine running ubuntu 19.10 with a fresh clone of this repo, and ranks don't end up increasing.

Strangely I don't get the error while on another machine (also ubuntu 19.10), with similar versions of node and yarn. Not really sure what to make of it.

batch API calls

All api calls should be batched in order to reduce the amount of "over the air" calls we do, as well as reduce the room for async failures and bugs.
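polkadot-js offers batching primitives (e.g. `api.queryMulti` for storage reads and the `utility.batch` extrinsic); independent of those, the grouping itself is just a chunking step, sketched here:

```typescript
// Generic helper: collapse many single calls into batches of a fixed
// size, so N calls become ceil(N / size) "over the air" round trips.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```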

Review

  1. scorekeeper.ts
    nodes = nodes.filter((node: any) => node.offlineAccumulated / WEEK <= 0.02);
    suggestion: extract the 0.02 uptime threshold into a named constant (e.g. UP_TIME)

  2. index.ts & scorekeeper.ts

      const scorekeeperFrequency = Config.global.test? '0 0-59/3 * * * *' : '0 0 0 * * *';
      scorekeeper.begin(scorekeeperFrequency);

The cronJob will re-run every day, and the nomination transaction logic in nominator.ts could possibly fail due to the RPC node getting stuck, the tx failing, or something like that.
Say we have 5 validators we would like to nominate (A, B, C, D, E):
A - Success
B - Fail
C - Success
How could we handle validator B's nomination? (Based on the current situation, the validator might need to wait 1 day.) I suggest adding logic to handle that.
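A sketch of per-target retries, so a failed nomination (validator B above) doesn't have to wait for the next daily cron run; the `nominate` signature is an assumption:

```typescript
// Hypothetical retry wrapper: attempt each target up to maxRetries times
// and report which targets ultimately succeeded or failed.
async function nominateWithRetry(
  nominate: (target: string) => Promise<boolean>,
  targets: string[],
  maxRetries = 3
): Promise<{ succeeded: string[]; failed: string[] }> {
  const succeeded: string[] = [];
  const failed: string[] = [];
  for (const target of targets) {
    let ok = false;
    for (let attempt = 0; attempt < maxRetries && !ok; attempt++) {
      ok = await nominate(target);
    }
    (ok ? succeeded : failed).push(target);
  }
  return { succeeded, failed };
}
```

Persisting the `failed` list would also let db.newTargets record only the nominations that actually landed.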
Also

      await nominator.nominate(toNominate);
      this.db.newTargets(nominator.address, toNominate); // <--- this should only be updated when the nomination call succeeds

By making the points more useful, it would be great to design the game like:

  1. Basic nomination amount (say 3,000 KSM) at the beginning.
  2. If the validator's uptime stays consistently stable, say > 99%, increase the nomination by 5%.
  3. If the validator is not stable at all, reduce the nomination amount by a certain percentage.

A better design would also consider the era points element, to ensure the validator has done some actual work.
The above design would require multiple accounts holding different amounts, since we cannot change the amount at will immediately. So it would be like:
Basic nomination amount: 20 addresses containing 3,000 KSM each
Medium nomination amount: 20 addresses containing 6,000 KSM each
and so on.

Dockerize the "fast substrate" executable

Currently we use a custom built "fast substrate" in order to do the testing. Ideally we can use a mocked substrate to do the testing, but that's probably a whole project in itself.

The least we can do is dockerize the fast substrate so that tests can run reliably on different architectures and in CI.

Candidate Endpoint Improvements

  • Remove SentryId, SentryOnlineSince, SentryOfflineSince
  • Include an array of InvalidityReasons

For Polkadot:

  • Include Kusama 1kv Address

Telemetry connection never reconnects and leads to offline accumulated miscalculation

Transcript from Riot:

@will My validator has been up continuously since 9th May 2am BST but nominations have been inconsistent as of late i.e. on 1 day, then off 1 day, then on 1 day, then off 4 days, then on 3 days, then off since a day ago.
So I've dug through https://github.com/w3f/1k-validators-be, queried the backend URL mentioned above, and can see a seemingly erroneous "offlineAccumulated" value of 76104897 (ms), and then the text in "/invalid" that my node has been offline 1268 minutes this week.
As checkSingleCandidate() imposes a 98% weekly uptime requirement, this would explain why nominations were apparently pulled yesterday, and perhaps some of the other occasions as well.
As there happen to be 18 other nodes who also appear in the "/invalid" list, all with 1268 minutes of offline time this week, this would appear to be a problem on the 1K backend side.
i.e.

$ curl -s 'https://otv-backend.w3f.community/invalid'|grep 'has been offline 1268\.' -c
19

I did notice this in the validator logs:
2020-05-14 12:15:40 ⚠️ Disconnected from /dns4/telemetry-backend.w3f.community/tcp/443/x-parity-wss/%2Fsubmit: Sink(Custom { kind: Other, error: B(Custom { kind: Other, error: Io(Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) }) })
2020-05-14 12:22:30 Pre-sealed block for proposal at 2304220. Hash now 0x37bf872e064f3a1523dce3390a50c4a93256697106215d3c860a896ffc436b95, previously 0x4b645e02cc97fab6db108603d2c7bcff2a802405fe939e1
Also running lsof on the validator process I don't see any connections to the w3f telemetry server, so it looks like once the telemetry connection goes down, polkadot never tries to bring it up again.
shadewolf
@will Obviously it is up to W3F and Parity as to how they nominate their stake but in this instance I would suggest they consider resetting the weekly offline accumulated time of affected validators.
sebytza05
shadewolf: yup, I found it too in the validator logs: 2020-05-14 13:15:40 ⚠️ Disconnected from /dns4/telemetry-backend.w3f.community/tcp/443/x-parity-wss/%2Fsubmit%2F: Sink(Custom { kind: Other, error: B(Custom { kind: Other, error: Io(Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) }) })
offlineSince and "has been offline n minutes this week" are so useless right now

Summary

It looks like the problem is as mentioned above, that telemetry is kicking off validators and polkadot does not reconnect.

Create FaultEvent's

Right now validators will accumulate faults, however there aren't many clear indicators as to what those faults were for.

It would be nice to have an endpoint to query with the validator, time, and fault reason.

This can be listed under an individual candidate endpoint ideally.

related: #460

Make the nominating states persistent across restarts

The service will be occasionally or routinely restarted in order to add more validators to the configuration. This means that all state held in the program will be lost unless it's persisted in the database. Currently, we only persist node data in the database and keep nominator state in memory. We should add additional methods on the database to allow for saving nominator data.
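The store could expose something like the following interface; the `NominatorState` shape and method names are assumptions, and the Map here is an in-memory stand-in for the actual database collection:

```typescript
// Hypothetical nominator state to persist across restarts.
interface NominatorState {
  address: string;
  currentTargets: string[];
  lastNomination: number; // unix ms of the last nomination round
}

// In-memory stand-in illustrating the save/load interface; the real
// implementation would back this with the existing database.
class NominatorStore {
  private state = new Map<string, NominatorState>();

  save(s: NominatorState): void {
    this.state.set(s.address, s);
  }

  load(address: string): NominatorState | undefined {
    return this.state.get(address);
  }
}
```

On startup, the service would load each configured nominator's state instead of starting from an empty in-memory map.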

The ensureUpgrades procedure might be broken

Hi, I am running a validator based on the latest client code, which is 0.8.24-5cbc418a-x86_64-linux-gnu right now. However, I keep getting the "xxx is not running the latest client code" message.
I think the reason is that the networkId (i.e., Sentry Node Network ID) of these validators is null, so the ensureUpgrades process bypasses these validators:

const nodes = await this.db.allNodes();

allCandidates and allNodes should be the same lists now, since the sentry node is no longer required:

async allNodes(): Promise<any[]> {

Fix ranks inconsistently updating

By itself, ranks inconsistently update. As a stopgap, retroactive ranks have been introduced. Retroactive ranks should be removed and regular rank increases should be fixed.

Add round records and the round server endpoint

Right now we have all the data exposed via the /nodes and /nominators endpoints. One thing that would be helpful is to expose a /rounds endpoint with historical data of nominators and their targets, and whether the targets ended up performing well or poorly. It should expose this for all prior rounds.

Monitor offline reports

and alert the Riot room when a validator is reported offline. Decide on the tolerance of offline reports before docking the validator's rank.
