gnolang / tx-indexer
A minimal Tendermint2 indexer capable of serving chain data
License: Apache License 2.0
When querying block and block_result information with the indexer's batch operation, the results are not mapped correctly: when pairing a Block with its BlockResult, the mapping should use the block height, not the iteration index of the batch. This issue causes some transactions to be lost, while transactions that do not belong to a block are stored.
In the code below, the results of the batched RPC requests are not guaranteed to arrive in order, because each request has a different response time.
```go
blockResultsRaw, err := batch.Execute(context.Background())
if err != nil {
	// Try to fetch sequentially
	return getTxResultsSequentially(blocks, client)
}

// Extract the results
for resultsIndex, resultsRaw := range blockResultsRaw {
	results, ok := resultsRaw.(*core_types.ResultBlockResults)
	if !ok {
		return nil, errors.New("unable to cast batch result into ResultBlockResults")
	}

	height := results.Height
	deliverTxs := results.Results.DeliverTxs

	txResults := make([]*types.TxResult, blocks[resultsIndex].NumTxs)

	for txIndex, tx := range blocks[resultsIndex].Txs {
		result := &types.TxResult{
			Height:   height,
			Index:    uint32(txIndex),
			Tx:       tx,
			Response: deliverTxs[txIndex],
		}

		txResults[txIndex] = result
	}

	// BUG: resultsIndex is the iteration index of the batch response,
	// which is not guaranteed to correspond to blocks[resultsIndex]
	fetchedResults[resultsIndex] = txResults
}
```
Environment Details: remove the indexer-db folder, point the remote option at portal-loop or test3, and run the indexer. The block data and block_result data must be matched and stored.
The block and block result don't match, causing transaction information to be lost or reordered. The block and block result should be mapped by block height, not by the iteration index.
gnolang/gno#1653 adds data to the Events field in the transaction result.
Add functionality to tx-indexer to handle this events data: the tx-indexer should be able to filter and verify the data in events.
This task concerns adding support for Swagger / OpenAPI documentation for the RPC endpoints offered by the indexer.
The JSON RPC filters currently handle incoming data by storing all pending objects in memory, awaiting transmission to the client. This approach leads to significant duplication of elements in memory for each user, which increases the risk of encountering Out-of-Memory (OOM) errors.
To mitigate these issues, I propose the following changes to how data is managed:
We assume, as standard operating procedure, that the JSON-RPC interface will sit behind load-management services such as load balancers, DDoS protection services, etc. However, there may be some operations in this service that are so resource-expensive that they continue to pose a DoS vector, especially when the load-management software is unaware of application-level context. (For example, a common web DoS technique is the "WordPress XMLRPC flood", which targets certain expensive operations in WordPress.)
By collecting some stats on the load of various operations accessible through the JSON-RPC interface on "typical" hardware specs, we can highlight any obvious vectors for DoS that may require special security controls such as application-level rate limiting.
We are suffering from feature creep on the filtering side of the GraphQL API implementation, and we still cannot cover most of the use cases (we cannot filter blocks by the proposer, for example).
I did some quick research, and it is possible to implement a gqlgen plugin that can generate the filter objects we need, so we can filter by all the fields we want.
We can create a plugin implementing the CodeGenerator interface:
```go
type CodeGenerator interface {
	GenerateCode(cfg *codegen.Data) error
}
```
That will allow us to scan all existing objects in the schema, get their fields with their types, and generate the needed objects for applying the filtering.
A filter will contain And filters, Or filters, Not filter, and field filters. They will be composed as follows (we are using Block filter as an example):
```go
type FilterBlock struct {
	And []*FilterBlock
	Or  []*FilterBlock
	Not *FilterBlock

	hash    *FilterString
	num_txs *FilterNumber
	txs     *NestedFilterBlockTransaction
}
```
```go
type FilterString struct {
	Exists bool
	Eq     string
	Neq    string
	Like   string
	Nlike  string
	// TODO: maybe there are more useful ones?
}
```
That will allow us to create filters like:
```json
{
  "_and": [
    {
      "num_txs": {
        "gt": 10,
        "lt": 100
      }
    },
    {
      "proposer_address_raw": {
        "eq": "ADDR"
      },
      "_or": [
        {
          "tx": {
            "hash": {
              "eq": "TXHASH"
            }
          }
        }
      ]
    }
  ],
  "_or": [
    {
      "consensus_hash": {
        "eq": "HASH"
      }
    }
  ]
}
```
That will be the same as:
WHERE (((num_txs > 10 AND num_txs < 100) AND proposer_address_raw = "ADDR") OR tx_hash="TXHASH") OR consensus_hash = "HASH"
We can scan the existing graphql schema to find specific annotations, and from that, generate the needed filters. Example:
```graphql
# plugin:filter
type Block {
  # plugin:filter
  hash: String!
  # plugin:filter
  height: Int!
  version: String!
  # plugin:filter
  chain_id: String!
  # plugin:filter
  time: Time!
  [...]
```
That will allow us to create filters only from the types and fields we want/need. In the future we can be more specific, like only creating an eq filter for a field: # plugin:filter(types:[eq])
We have filters that are used outside the filtering logic, for example, to query the storage (by block height). We need to expose these values so they can be retrieved. One idea (I am still deciding whether it is the best one) would be to add a special tag to these values, exposing a method that retrieves the max and min values from the filters, something like:

```graphql
# plugin:filter(expose:[minmax])
height: Int!
```
Generating the following method in the generated filter struct:
```go
func (f *FilterBlock) GetMinMaxHeight() (min int, max int) {
	// TODO: check height filter values (eq, gt, lt and so on) recursively,
	// and return the min and max values
	return
}
```
Move storage/encode.go to storage/internal/ascenc or similar.
Add an indexer repair subcommand that would verify the integrity of all blocks / transaction results, and fetch the ones that are missing (with retries).
In the current indexer implementation, I have opted to ignore blocks or transactions that are un-fetchable (for one reason or another, usually backwards compatibility support), so this subcommand should verify the DB is up to date with the chain data.
Create an HTTP endpoint responding to GET requests, to be used as a healthcheck endpoint.
The endpoint should answer 200 when the service is healthy; no specific body or content-type is required.