crate-workbench / cratedb-toolkit

CrateDB Toolkit.

Home Page: https://cratedb-toolkit.readthedocs.io/

License: GNU Affero General Public License v3.0

Python 99.58% Dockerfile 0.37% Shell 0.05%
data-retention olap olap-database expiration data-expiration retention retention-policies retention-policy toolkit cratedb

cratedb-toolkit's Introduction

CrateDB Toolkit


» Documentation | Changelog | Community Forum | PyPI | Issues | Source code | License | CrateDB

About

This software package includes a range of modules and subsystems to work with CrateDB and CrateDB Cloud efficiently.

You can use CrateDB Toolkit to run data I/O procedures and automation tasks of different kinds around CrateDB and CrateDB Cloud. It can be used both as a standalone program, and as a library.

It aims for DWIM-like usefulness and UX, and provides CLI and HTTP interfaces, among others.

Status

Please note that the cratedb-toolkit package contains alpha-, beta-, and incubation-quality code and, as such, is considered a work in progress. Contributions of all kinds are very welcome, in order to make it more solid and to add features.

Breaking changes should be expected until a 1.0 release, so version pinning is strongly recommended, especially when using it as a library.

Install

Install package.

pip install --upgrade cratedb-toolkit

Verify installation.

ctk --version

Run with Docker.

alias ctk="docker run --rm ghcr.io/crate-workbench/cratedb-toolkit ctk"
ctk --version

Development

Contributions are very much welcome. Please visit the documentation to learn about how to spin up a sandbox environment on your workstation, or create a ticket to report a bug or share an idea about a possible feature.

cratedb-toolkit's People

Contributors

amotl, dependabot[bot], hammerhead, pilosus, seut, surister


cratedb-toolkit's Issues

Testing: Adapt "Testcontainers" implementation to `unittest`

Introduction

Over here, we reported on the state of the "Testcontainers for Python" implementation for supporting application testing with CrateDB.

About

Per the issue referenced above, we will need to resolve this backlog item in order to make the test layer usable for applications/libraries which use Python's unittest module for testing.

While a pytest-based wrapper adapter around the "Testcontainers" implementation is nice, the crate-python and crash projects use Python's builtin unittest module. Can we also grow a unittest-based wrapper adapter which is reusable by both downstream projects?

Task

Use testing infrastructure from cratedb_toolkit.testing.testcontainers.cratedb and maybe cratedb_toolkit.tests.conftest.CrateDBFixture, and adapt that to unittest instead of using the pytest-specific details.
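A minimal sketch of such a unittest adapter, assuming the container object follows the Testcontainers `start()`/`stop()` convention; the `make_container` hook and class names here are made up for illustration:

```python
class ContainerLayerMixin:
    """
    Manage a container lifecycle per test class, without pytest.

    `make_container` is a hypothetical hook: a real implementation would
    return a container object from
    `cratedb_toolkit.testing.testcontainers.cratedb`, exposing `start()`
    and `stop()` methods like the Testcontainers API does.
    """

    @classmethod
    def make_container(cls):
        raise NotImplementedError("Subclasses define which container to run")

    @classmethod
    def setUpClass(cls):
        # Spin the container up once per test class.
        cls.container = cls.make_container()
        cls.container.start()

    @classmethod
    def tearDownClass(cls):
        cls.container.stop()
```

A downstream test case would then be declared as `class MyTest(ContainerLayerMixin, unittest.TestCase)`, override `make_container`, and connect to `cls.container` inside its test methods.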

First Candidate

As a first candidate for applying this adapter, we identified the crash terminal program. This ticket over there outlines how/where to use the unittest-based adapter instead of the previous one.

Apply PyMongo-like amalgamation to AstraPy, to emulate DataStax Astra DB

Introduction

In the spirit of the PyMongo driver amalgamation, it looks like AstraPy, the Python client SDK for DataStax Astra and Stargate, based on the DataStax python-driver, has a very similar interface.

Features

According to the data sheet of DataStax Astra DB, some or all of the following features would need to be unlocked to achieve reasonable feature parity.

Supported APIs

  • REST
  • Document (JSON)
  • GraphQL
  • gRPC API with performance equivalent to the drivers
  • CQL API

Supported Languages

  • Java
  • Node.js
  • C#
  • Python
  • Go

Supported Data formats

  • Tabular (Column-family)
  • Document (JSON)
  • Key-Value

Resources

[infra] Improve `quote_table_name` to accept full-qualified table identifiers

@seut: Do you think this routine needs to be improved? You mentioned something about "quoting going south". Maybe the root cause is here, because the routine may only handle a few situations correctly?

Yes, the issue here is that it will not quote anything if the identifier contains a . like foo.bar. This looks a bit wrong, as it should normally quote an identifier which contains dots; that would be a valid table name for PostgreSQL, but not for CrateDB, which forbids using a . inside a table identifier (see https://cratedb.com/docs/crate/reference/en/latest/general/ddl/create-table.html#naming-restrictions).
But then I wonder why you're not using sqlalchemy.sql.expression.quoted_name, which works similarly, afaik.
Having it try to detect a fully-qualified identifier, split it, and quote all parts separately would be a bit custom, but is needed if the schema and table identifiers are not quoted individually.

Originally posted by @seut in #88 (comment)
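A sketch of the suggested behavior, splitting a fully-qualified identifier and quoting each part separately; the escaping rule (doubling embedded quotes) follows standard SQL, and this is not the current implementation:

```python
def quote_table_name(ident: str) -> str:
    """
    Quote a possibly fully-qualified table identifier like "foo.bar".

    Each dot-separated part is quoted on its own, since CrateDB forbids
    dots inside a single table identifier. Embedded double quotes are
    escaped by doubling them, as in standard SQL.
    """
    def quote(part: str) -> str:
        return '"' + part.replace('"', '""') + '"'

    return ".".join(quote(part) for part in ident.split("."))


print(quote_table_name("foo.bar"))  # "foo"."bar"
```

A real routine may additionally need to skip parts which are already quoted, which this sketch does not handle.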

CFR: Problem with `sys.jobs_log` table on `sys-export` operation

Problem

On a CrateDB database instance that had been up for two days or so, I received this error when running ctk cfr --debug sys-export.

polars.exceptions.ComputeError: could not append value: "line 1:25: mismatched input '-' expecting {<EOF>, ';'}" of type: str to the builder; make sure that all rows have the same schema or consider increasing `infer_schema_length`

Details

14:05:17        [cratedb_toolkit.util.cli            ] ERROR   : could not append value: "line 1:25: mismatched input '-' expecting {<EOF>, ';'}" of type: str to the builder; make sure that all rows have the same schema or consider increasing `infer_schema_length`

it might also be that a value overflows the data-type's capacity
Traceback (most recent call last):
  File "/path/to/cratedb-toolkit/cratedb_toolkit/cfr/cli.py", line 50, in sys_export
    path = stc.save()
           ^^^^^^^^^^
  File "/path/to/cratedb-toolkit/cratedb_toolkit/cfr/systable.py", line 149, in save
    df = self.read_table(tablename=tablename)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/cratedb-toolkit/cratedb_toolkit/cfr/systable.py", line 107, in read_table
    return pl.read_database(
           ^^^^^^^^^^^^^^^^^
  File "/path/to/polars/io/database/functions.py", line 267, in read_database
    ).to_polars(
      ^^^^^^^^^^
  File "/path/to/polars/io/database/_executor.py", line 462, in to_polars
    frame = frame_init(
            ^^^^^^^^^^^
  File "/path/to/polars/io/database/_executor.py", line 274, in _from_rows
    return frames if iter_batches else next(frames)  # type: ignore[arg-type]
                                       ^^^^^^^^^^^^
  File "/path/to/polars/io/database/_executor.py", line 261, in <genexpr>
    DataFrame(
  File "/path/to/polars/dataframe/frame.py", line 376, in __init__
    self._df = sequence_to_pydf(
               ^^^^^^^^^^^^^^^^^
  File "/path/to/polars/_utils/construction/dataframe.py", line 433, in sequence_to_pydf
    return _sequence_to_pydf_dispatcher(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/polars/_utils/construction/dataframe.py", line 644, in _sequence_of_tuple_to_pydf
    return _sequence_of_sequence_to_pydf(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/polars/_utils/construction/dataframe.py", line 561, in _sequence_of_sequence_to_pydf
    pydf = PyDataFrame.from_rows(
           ^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: could not append value: "line 1:25: mismatched input '-' expecting {<EOF>, ';'}" of type: str to the builder; make sure that all rows have the same schema or consider increasing `infer_schema_length`
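The error indicates that mixed-type values within one column defeat polars' schema inference. Besides increasing `infer_schema_length`, as the message itself suggests, one possible workaround (a sketch, not necessarily the fix applied in cratedb-toolkit) is to normalize all values to strings before handing rows to the DataFrame builder:

```python
def stringify_rows(rows):
    """
    Normalize heterogeneous row values to strings, so that a DataFrame
    builder with strict schema inference does not fail on mixed types.
    None is preserved to keep null semantics intact.
    """
    return [
        tuple(value if value is None else str(value) for value in row)
        for row in rows
    ]


rows = [("stmt", 200), ("line 1:25: mismatched input '-'", None)]
print(stringify_rows(rows))
# [('stmt', '200'), ("line 1:25: mismatched input '-'", None)]
```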

[wtf] Improve query library

About

The CrateDB SQL query collection in library.py, added with GH-88, needs further improvements. It has been assembled from a wave of quick-and-dirty operations, collected from different sources, without much review.

Details

While working on the code base, @seut discovered a few specific shortcomings in this area. Thank you. Maybe @WalBeh, @hlcianfagna, @hammerhead, or others have something to contribute to answering those questions.

Settings

Why only this small subset of settings? It's also not really dedicated to a concrete topic, as both rebalance and recovery settings are queried.

class Settings:
    """
    Reflect cluster settings.
    """
    info = """
        SELECT
            name,
            master_node,
            settings['cluster']['routing']['allocation']['cluster_concurrent_rebalance']
                AS cluster_concurrent_rebalance,
            settings['indices']['recovery']['max_bytes_per_sec'] AS max_bytes_per_sec
        FROM sys.cluster
        LIMIT 1;
    """

Shards

This looks unreasonably complicated just to translate the primary boolean into a string.

allocation = InfoElement(
    name="shard_allocation",
    sql="""
        SELECT
            IF(s.primary = TRUE, 'primary', 'replica') AS shard_type,
            COALESCE(shards, 0) AS shards
        FROM
            UNNEST([true, false]) s(primary)
        LEFT JOIN (
            SELECT primary, COUNT(*) AS shards
            FROM sys.allocations
            WHERE current_state != 'STARTED'
            GROUP BY 1
        ) a ON s.primary = a.primary;
    """,
    label="Shard Allocation",
    description="Support identifying issues with shard allocation.",
)

Why select the 2nd decision? This looks problematic, e.g. when only one shard exists, there is no 2nd decision.

table_allocation_special = InfoElement(
    name="table_allocation_special",
    label="Table Allocations Special",
    sql="""
        SELECT decisions[2]['node_name'] AS node_name, COUNT(*) AS table_count
        FROM sys.allocations
        GROUP BY decisions[2]['node_name'];
    """,
    description="Table allocation. Special.",
)

Isn't the query above more detailed? I think this one can be skipped...

translog_uncommitted_size = InfoElement(
    name="translog_uncommitted_size",
    label="Total uncommitted translog size",
    description="A large number of uncommitted translog operations can indicate issues with shard replication.",
    sql="""
        SELECT COALESCE(SUM(translog_stats['uncommitted_size']), 0) AS translog_uncommitted_size
        FROM sys.shards;
    """,
    transform=get_single_value("translog_uncommitted_size"),
    unit="bytes",
)

Thoughts

In general, I am happy to remove any item which should be skipped, and to improve all others which have shortcomings into a DWIM shape, based on your suggestions. Thanks already, and thanks in advance!

Prevent multiple strategies operating on the same table

About

The idea behind the composite primary key PRIMARY KEY ("strategy", "table_schema", "table_name") was to prevent duplicate strategies on the same table. Too bad we don't have UNIQUE constraints in CrateDB.

Regression?

Is there a check elsewhere in the code to prevent duplicates (i.e., for the same table, one entry with DELETE and 3 days retention, and another with DELETE and 5 days retention)?

Originally posted by @hammerhead in #20 (comment)
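Lacking UNIQUE constraints, one option is an application-side guard before inserting a new policy. A sketch, assuming a hypothetical `execute` helper and a retention table named `retention_policy`; note that without transactional guarantees this check is racy:

```python
def check_no_existing_policy(execute, table_schema: str, table_name: str):
    """
    Emulate a UNIQUE constraint in application code: refuse to add a
    retention policy when any policy already exists for the same table.

    `execute` is a hypothetical helper which runs parameterized SQL and
    returns rows; the table name "retention_policy" is an assumption.
    """
    sql = (
        "SELECT COUNT(*) FROM retention_policy "
        "WHERE table_schema = ? AND table_name = ?"
    )
    (count,) = execute(sql, (table_schema, table_name))[0]
    if count:
        raise ValueError(
            f"A retention policy already exists for {table_schema}.{table_name}"
        )
```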

Apply database schema already when connecting

In 1 and 2, we have been using SQLAlchemy's ability to specify the database schema on the connection string, using ?schema=foobar. This way, table names do not need to be addressed in fully-qualified notation "by hand"; instead, they can be addressed by basename only, since the schema is selected at connection time.

Let's also do it in the same spirit here.

Footnotes

  1. https://github.com/crate-workbench/mlflow-cratedb

  2. https://github.com/crate-workbench/langchain
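Selecting the schema at connection time boils down to appending a `?schema=` query parameter to the SQLAlchemy connection string. A small helper sketch (the helper name is made up for illustration):

```python
from urllib.parse import urlencode


def with_schema(url: str, schema: str) -> str:
    """
    Append a `?schema=` query parameter to an SQLAlchemy connection URL,
    so that tables can later be addressed by basename only.
    """
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode({"schema": schema})


print(with_schema("crate://localhost:4200/", "foobar"))
# crate://localhost:4200/?schema=foobar
```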

[I/O] InfluxDB adapter: `Failed to establish a new connection: [Errno 111] Connection refused`, when using Docker runtime

Procedure

docker run -d --name crate -p 4200:4200 crate/crate:latest

docker run -d --name influxdb -p 8086:8086 \
    --env=DOCKER_INFLUXDB_INIT_MODE=setup \
    --env=DOCKER_INFLUXDB_INIT_USERNAME=user1 \
    --env=DOCKER_INFLUXDB_INIT_PASSWORD=secret1234 \
    --env=DOCKER_INFLUXDB_INIT_ORG=example \
    --env=DOCKER_INFLUXDB_INIT_BUCKET=testdrive \
    --env=DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=token \
    influxdb:latest

alias ctk="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest ctk"
ctk load table influxdb2://example:token@localhost:8086/testdrive/demo --cratedb-sqlalchemy-url "crate://crate@localhost:4200/testdrive/demo"

Problem

urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9ebfd32450>: Failed to establish a new connection: [Errno 111] Connection refused

References

[LIB] Improve UX for ad hoc applications

About

For certain ad hoc applications, like presenting functionality in Jupyter Notebooks, accessing data from CrateDB in Python, or otherwise exploring it, querying should not be more difficult than how EasyDB, TinyDB, dataset, and Datasette demonstrate it, with or without SQLite.

EasyDB

from easydb import EasyDB

db = EasyDB("filename.db")
for record in db.query("SELECT * FROM mytable"):
  print(record)

TinyDB

from tinydb import TinyDB, Query

db = TinyDB("/path/to/db.json")
db.insert({'int': 1, 'char': 'a'})
db.insert({'int': 1, 'char': 'b'})

User = Query()
db.search((User.name == 'John') & (User.age <= 30))

dataset

import dataset

db = dataset.connect('sqlite:///:memory:')

table = db['sometable']
table.insert(dict(name='John Doe', age=37))
table.insert(dict(name='Jane Doe', age=34, gender='female'))

john = table.find_one(name='John Doe')

Datasette

datasette serve path/to/database.db
open http://localhost:8001/
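For CrateDB, the desired UX could look roughly like the `dataset` example. A sketch using the standard library's sqlite3 for illustration; a CrateDB implementation would swap in the CrateDB driver, and the class name is made up:

```python
import sqlite3


class EasyQuery:
    """
    Minimal DWIM-style query helper: connect, query, get dictionaries.
    """

    def __init__(self, dsn: str):
        self.connection = sqlite3.connect(dsn)
        # Return rows which support key-based access.
        self.connection.row_factory = sqlite3.Row

    def query(self, sql: str, parameters=()):
        return [dict(row) for row in self.connection.execute(sql, parameters)]


db = EasyQuery(":memory:")
db.query("CREATE TABLE mytable (name TEXT, age INT)")
db.query("INSERT INTO mytable VALUES ('John Doe', 37)")
print(db.query("SELECT * FROM mytable"))  # [{'name': 'John Doe', 'age': 37}]
```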

References

Share and use datasets via Python code

About

Easily consume datasets from tutorials and/or production applications, like others are doing, using Python code.

References

Standards

[I/O] Use cr8 for loading tables from PostgreSQL

About

@hlcianfagna elaborated on typical cr8 usage patterns which have not made it into the ctk load table interface yet. Thanks!

Details

Regarding copying the content from one table to a new one with different settings/partitioning options/etc., you can use the cr8 insert-from-sql utility. It accepts a --fetch-size parameter, which defaults to 100 records, and a --concurrency parameter, which defaults to 25.

This tool reads via the PostgreSQL protocol, connecting on port 5432, so the username and password for the source need to be encoded in the connection string. Writing happens through the HTTP endpoint of CrateDB on port 4200, and can go to a separate cluster.

If your passwords contain special characters, you need to encode them properly.
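Percent-encoding credentials can be done with Python's standard library, for example:

```python
from urllib.parse import quote

# Percent-encode a password containing special characters before
# embedding it in a connection string.
password = quote("rea:d/pw@d", safe="")
print(password)  # rea%3Ad%2Fpw%40d

url = f"postgresql://readuser:{password}@localhost:5432/doc"
```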

CLI Example

cr8 insert-from-sql \
  --src-uri "postgresql://readuser:readpwd@localhost:5432/doc" --query "SELECT * FROM sourcetable;" \
  --hosts writeuser:writepassword@localhost:4200 --table doc.targettable

[I/O] InfluxDB adapter: CRATEDB_SQLALCHEMY_URL not working when using the Docker-aliased command

Procedure

alias ctk="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest ctk"
export CRATEDB_SQLALCHEMY_URL=crate://crate@localhost:4200/testdrive/demo
ctk load table influxdb2://example:token@localhost:8086/testdrive/demo

Problem

KeyError: 'Either CrateDB Cloud Cluster identifier or CrateDB SQLAlchemy or HTTP URL needs to be supplied. Use --cluster-id / --cratedb-sqlalchemy-url / --cratedb-http-url CLI options or CRATEDB_CLOUD_CLUSTER_ID / CRATEDB_SQLALCHEMY_URL / CRATEDB_HTTP_URL environment variables.'

References

ValueError: max() arg is an empty sequence

@hammerhead reported this problem, which happens right away when invoking cratedb-toolkit without any command-line options.

~/ cratedb-toolkit          
Traceback (most recent call last):
  File "/usr/local/bin/cratedb-toolkit", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1054, in main
    with self.make_context(prog_name, args, **extra) as ctx:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 920, in make_context
    self.parse_args(ctx, args)
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1610, in parse_args
    echo(ctx.get_help(), color=ctx.color)
         ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 699, in get_help
    return self.command.get_help(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1298, in get_help
    self.format_help(ctx, formatter)
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1331, in format_help
    self.format_options(ctx, formatter)
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1533, in format_options
    self.format_commands(ctx, formatter)
  File "/usr/local/lib/python3.11/site-packages/click_aliases/__init__.py", line 65, in format_commands
    max_len = max(len(cmd) for cmd in sub_commands)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: max() arg is an empty sequence
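The crash originates in click_aliases computing max() over an empty command list. A candidate fix, sketched here rather than taken from any upstream patch, is to give max() a default:

```python
# click_aliases/__init__.py computes:
#     max_len = max(len(cmd) for cmd in sub_commands)
# which raises ValueError when `sub_commands` is empty. Supplying a
# default for the empty case avoids the crash:
sub_commands = []
max_len = max((len(cmd) for cmd in sub_commands), default=0)
print(max_len)  # 0
```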

Testing: Improve "Testcontainers for Python" implementation

Introduction

We are aiming to provide canonical "Testcontainers" implementations for Java and Python, per testcontainers-java and testcontainers-python.

About

At the spots enumerated below, we added the first version of a corresponding Python implementation, originally conceived at daq-tools/lorrystream#47.

Backlog

  • Add documentation
  • GH-53
  • GH-58
  • Currently, the adapter and test layer are being exercised using an SQLAlchemy connection and a corresponding test case. It makes sense to also exercise and demonstrate a pure DBAPI-based variant of the same thing.
  • It would be nice to have a modern test layer which forms a cluster, for both Java and Python. I think cr8 has this already?
  • Cherry-pick CrateDB invocation options from cr8: '-Cdiscovery.initial_state_timeout=0', '-Cnetwork.host=127.0.0.1', '-Cudc.enabled=false', '-Ccluster.name=cr8-tests'
  • Revisit downstream issues crate/cratedb-examples#72 and crate/cratedb-examples#282.
  • Upstream to testcontainers-python.

[io] Croud related IO subsystem tests may fail when a croud configuration file exists for the local user

Some croud-related IO subsystem tests may fail when croud configuration settings exist for the local user. These settings will be taken into account and passed to the croud API, causing the mocked croud calls to no longer match.

Failing tests:

  • cratedb_toolkit.io.croud.test_import_url
  • cratedb_toolkit.io.croud.test_import_file

They are failing in my setup when the stored croud.yml configuration file contains a different endpoint, for example:

current-profile: dev
profiles:
  dev:
    endpoint: https://console.cratedb-dev.cloud
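One way to make these tests hermetic is to point HOME at an empty directory while they run, so user-level configuration is not picked up. A sketch; the exact path croud consults is an assumption:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def isolated_home():
    """
    Temporarily point HOME at an empty directory, so user-level
    configuration files (e.g. ~/.config/croud/croud.yaml) are not
    picked up by the code under test.
    """
    saved = os.environ.get("HOME")
    with tempfile.TemporaryDirectory() as tmpdir:
        os.environ["HOME"] = tmpdir
        try:
            yield tmpdir
        finally:
            # Restore the previous HOME, or unset it if it was absent.
            if saved is None:
                os.environ.pop("HOME", None)
            else:
                os.environ["HOME"] = saved
```

A pytest fixture wrapping this context manager would achieve the same isolation for the tests listed above.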

AnyIO

About


AnyIO is an asynchronous networking and concurrency library that works on top of either asyncio or trio. It implements trio-like structured concurrency (SC) on top of asyncio and works in harmony with the native SC of trio itself.

References
