logicalclocks / rondb

This is RonDB, a distribution of NDB Cluster developed and used by Hopsworks AB. It also contains development branches of RonDB.

Home Page: https://www.rondb.com

License: Other

CMake 1.04% C++ 77.44% C 16.21% Batchfile 0.01% Perl 0.54% CSS 0.01% Makefile 1.18% JavaScript 0.21% HTML 0.43% Objective-C 0.70% Python 0.03% Shell 0.78% M4 0.14% Roff 0.18% Awk 0.01% Starlark 0.05% NASL 1.05% Pawn 0.01% DIGITAL Command Language 0.01% Module Management System 0.01%

rondb's Introduction

Give a star and make a developer happy


What is RonDB?

RonDB is a stable distribution of NDB Cluster, a key-value store with SQL capabilities. It is based on a release of MySQL, an SQL database server.

Quick start

Coming soon.

License

License information can be found here. In test packages where this file is renamed README-test, the license file is renamed LICENSE-test. This distribution may include materials developed by third parties. For license and attribution notices for these materials, please refer to the license file.

Who is behind RonDB?

RonDB is brought to you by the RonDB team at Hopsworks.

More information

RonDB is the fastest key-value store with SQL capabilities, available now in the cloud. It is an open source distribution of NDB Cluster, so it provides the same core technology and performance as NDB, but as a managed platform in the cloud. It also brings large data storage capabilities. RonDB has dedicated support for the features required by a high-performance online feature store, including LATS performance.

LATS: low Latency, high Availability, high Throughput, scalable Storage

You can read more about RonDB on our blog.

MySQL

MySQL is brought to you by the MySQL team at Oracle.

Collaborate with us

There are no existing managed databases available today with these attributes. Would you like to be part of it? Feel free to contribute!

The main development branches to track are currently 21.04 (stable branch) and 22.10.1 (new stable branch).


Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Copyright (c) 2021, 2023, Hopsworks and/or its affiliates.

rondb's People

Contributors

arnabray21, bjornmu, bkandasa, blaudden, dahlerlend, frazerclement, gkodinov, glebshchepa, gurusami, harinvadodaria, jdduncan, jhauglid, kahatlen, kdjakevin, lkotula, lkshminarayanan, ltangvald, marcalff, mikaelronstrom, nacarvalho, nryeng, phulakun, roylyseng, thayumanavar77, thirunarayanan, trosten, vaintroub, vasild, weigon, zmur

rondb's Issues

Seg fault in Native API

I get a segfault with the following API calls when aValue is longer than the column.

NdbOperation::equal(const char* anAttrName, const char* aValue)
NdbOperation::equal(const char* anAttrName, const char* aValue, Uint32 len)

Table def

CREATE TABLE `chartable` (
  `id` char(5) NOT NULL,
  `value` char(5) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=ndbcluster DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

int main(int argc, char **argv) {
  char connection_string[] = "localhost:1186";
  Init(connection_string);

  Ndb *ndb_object  = nullptr;
  RS_Status status = GetNDBObject(ndb_connection, &ndb_object);

  ndb_object->setCatalogName("test");
  const NdbDictionary::Dictionary *dict = ndb_object->getDictionary();
  const NdbDictionary::Table *table     = dict->getTable("chartable");
  const NdbDictionary::Column *col      = table->getColumn("id");

  NdbTransaction *transaction = ndb_object->startTransaction(table);
  if (transaction == nullptr) {
    std::cout << "Tx Start failed" << std::endl;
    return 1;  // do not continue with a null transaction
  }

  NdbOperation *operation = transaction->getNdbOperation(table);
  if (operation == nullptr) {
    std::cout << "get operation failed" << std::endl;
    return 1;  // do not continue with a null operation
  }
  operation->readTuple(NdbOperation::LM_CommittedRead);


  // char pk[col->getLength()];
  // for (int i = 0; i < col->getLength(); i++) {
    // pk[i] = 0;
  // }
  std::string pkstr = "000000000000000000000000000000000000000000";
  // std::memcpy(pk, pkstr.c_str(), pkstr.length());
    // std::cout << "mem copy workd" << std::endl;


  int ret = operation->equal("id", pkstr.c_str(), pkstr.length());
  if (ret != 0) {
    std::cout << "Op equal failed" << std::endl;
  }

  NdbRecAttr *val_rec = operation->getValue("value", NULL);

  ret = transaction->execute(NdbTransaction::Commit);
  if (ret != 0) {
    std::cout << "execute failed" << std::endl;
  }

  if (transaction->getNdbError().classification == NdbError::NoDataFound) {
    std::cout << "NOT FOUND" << std::endl;
  } else {
    std::cout << "data: " << val_rec->aRef() << std::endl;
  }
  ndb_object->closeTransaction(transaction);
  CloseNDBObject(&ndb_object);
  Shutdown();
  return 0;
}
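
Until the crash itself is fixed, a caller-side guard can avoid handing the API a value longer than the column. The sketch below is standalone C++ and does not use the NDB API. `make_fixed_key` is a hypothetical helper, and both the 20-byte width (char(5) with utf8mb4, assuming 4 bytes per character) and the space padding are assumptions. In a real client the width would presumably come from the column metadata.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of the NDB API): copies `value` into a
// buffer of exactly `column_bytes` bytes, padded with spaces.
// Returns false and leaves `out` untouched if the value does not fit.
bool make_fixed_key(const std::string &value, std::size_t column_bytes,
                    std::vector<char> &out) {
  if (value.size() > column_bytes) {
    return false;  // too long: reject instead of passing it on
  }
  std::vector<char> key(column_bytes, ' ');  // space padding is an assumption
  if (!value.empty()) {
    std::memcpy(key.data(), value.data(), value.size());
  }
  out.swap(key);
  assert(out.size() == column_bytes);
  return true;
}

// Small self-check using the values from the report.
void check_make_fixed_key() {
  std::vector<char> key;
  // The 42-character string from the reproduction does not fit in 20 bytes.
  assert(!make_fixed_key(std::string(42, '0'), 20, key));
  // A 5-character key fits and is padded to the full width.
  assert(make_fixed_key("00000", 20, key));
  assert(key.size() == 20);
}
```

With such a guard the reproduction above would report an error for the 42-character key instead of reaching `equal()`.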

Upgrade from RonDB 21.04 to 22.10 fails

Hello and thank you for your time!

I tried a rolling upgrade from RonDB-21.04.9 to RonDB-22.10.0 but it failed, even though it ought to work and is documented:

https://docs.rondb.com/rondb_upgrade/#upgrading-from-rondb-2104-to-2210

I get an error (last four lines):


2023-07-24 21:36:54 [ndbd] INFO     -- DIH reported initial start, now starting the Node Inclusion Protocol
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
stack_bottom = 0 thread_stack 0x0
ndbmtd(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x5b13ae]
ndbmtd(ndb_print_stacktrace()+0x45) [0x6371e5]
ndbmtd(ErrorReporter::handleError(int, char const*, char const*, NdbShutdownType)+0x26) [0x6f7b96]
ndbmtd(SimulatedBlock::progError(int, int, char const*, char const*) const+0xf9) [0x5f5629]
ndbmtd(Qmgr::execCM_REGCONF(Signal*)+0x599) [0x7e6b69]
ndbmtd() [0x517a25]
ndbmtd() [0x5dd293]
ndbmtd(mt_job_thread_main+0x317) [0x5cd947]
ndbmtd() [0x6372cc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f620c601609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f620ba24293]
2023-07-24 21:36:54 [ndbd] INFO     -- incompatible version own=0x160a00 other=0x150409,  shutting down
2023-07-24 21:36:54 [ndbd] INFO     -- QMGR (Line: 1715) 0x00000006
2023-07-24 21:36:54 [ndbd] INFO     -- Error handler shutting down system
2023-07-24 21:36:56 [ndbd] ALERT    -- Node 1: Forced node shutdown completed. Occurred during startphase 1. Caused by error 6304: 'Unsupported version(Restart error). Temporary error, restart node'.
(END)

First I tried to upgrade from RonDB-21.04.8 directly to RonDB-22.10.0 and got the same "incompatible version / Unsupported version" error as above. Then I figured I probably ought to upgrade to RonDB-21.04.9 first. The upgrade to 21.04.9 worked, but the upgrade to 22.10.0 then fails in the same way.
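
The two hex values in the "incompatible version" log line can be read as version numbers. The sketch below assumes the packing is (major << 16) | (minor << 8) | build, which is consistent with the release names in this report. `decode_version` and `NdbVersion` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct NdbVersion {
  unsigned ver_major;
  unsigned ver_minor;
  unsigned ver_build;
};

// Hypothetical helper: splits a packed version into its three parts,
// assuming one byte each for major, minor and build.
NdbVersion decode_version(std::uint32_t packed) {
  NdbVersion v;
  v.ver_major = (packed >> 16) & 0xFF;
  v.ver_minor = (packed >> 8) & 0xFF;
  v.ver_build = packed & 0xFF;
  return v;
}

// Self-check with the values from the log line
// "incompatible version own=0x160a00 other=0x150409".
void check_decode_version() {
  const NdbVersion own = decode_version(0x160a00);
  assert(own.ver_major == 22 && own.ver_minor == 10 && own.ver_build == 0);
  const NdbVersion other = decode_version(0x150409);
  assert(other.ver_major == 21 && other.ver_minor == 4 && other.ver_build == 9);
}
```

Under that assumption, own=0x160a00 is the starting 22.10.0 node and other=0x150409 is the running 21.04.9 node.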

I followed the documentation, that is, I start from this state:

Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0)
id=2    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0, *)

[ndb_mgmd(MGM)] 2 node(s)
id=65   @94.75.251.146  (RonDB-21.04.9)
id=66   @94.75.251.146  (RonDB-21.04.9)

[mysqld(API)]   5 node(s)
id=67   @94.75.251.146  (RonDB-21.04.9)
id=68   @94.75.251.146  (RonDB-21.04.9)
id=231  @94.75.251.146  (RonDB-21.04.9)
id=232 (not connected, accepting connect from any host)
id=233 (not connected, accepting connect from any host)

and first upgrade the management nodes and arrive at:

Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0)
id=2    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0, *)

[ndb_mgmd(MGM)] 2 node(s)
id=65   @94.75.251.146  (RonDB-22.10.0)
id=66   @94.75.251.146  (RonDB-22.10.0)

[mysqld(API)]   5 node(s)
id=67 (not connected, accepting connect from any host)
id=68 (not connected, accepting connect from any host)
id=231 (not connected, accepting connect from any host)
id=232 (not connected, accepting connect from any host)

then I stop data node 1 and start a new RonDB-22.10.0 data node with the --initial flag, but that fails as shown above.

I also noticed that when I have the two new management nodes I get a lot of warnings and alerts of missed heartbeats:

2023-07-24 21:42:15 [MgmtSrvr] WARNING  -- Node 2: Node 66 missed heartbeat 3
2023-07-24 21:42:15 [MgmtSrvr] WARNING  -- Node 1: Node 65 missed heartbeat 3
2023-07-24 21:42:15 [MgmtSrvr] WARNING  -- Node 1: Node 66 missed heartbeat 3
2023-07-24 21:42:16 [MgmtSrvr] WARNING  -- Node 2: Node 65 missed heartbeat 4
2023-07-24 21:42:16 [MgmtSrvr] ALERT    -- Node 2: Node 65 declared dead due to missed heartbeat
2023-07-24 21:42:16 [MgmtSrvr] INFO     -- Node 1: Communication to Node 65 closed
2023-07-24 21:42:16 [MgmtSrvr] INFO     -- Node 2: Communication to Node 65 closed
2023-07-24 21:42:16 [MgmtSrvr] INFO     -- Node 2: Lost arbitrator node 65 - process failure [state=6]
2023-07-24 21:42:16 [MgmtSrvr] INFO     -- Node 2: President restarts arbitration thread [state=1]

Hm, I tried to repeat the process while writing this issue, and after I stopped data node 1, node 2 died as well:

2023-07-25 11:09:44 [ndbd] INFO     -- Node 1 disconnected in state: 0
2023-07-25 11:09:45 [ndbd] INFO     -- Node 1 disconnected in state: 0 - Repeated 2 times
2023-07-25 11:09:45 [ndbd] INFO     -- findNeighbours from: 6718 old (left: 1 right: 1) new (65535 65535)
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
2023-07-25 11:09:45 [ndbd] ALERT    -- Network partitioning - no arbitrator available
2023-07-25 11:09:45 [ndbd] INFO     -- President restarts arbitration thread [state=8]
stack_bottom = 0 thread_stack 0x0
ndbmtd(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x5ae73e]
ndbmtd(ndb_print_stacktrace()+0x45) [0x65a835]
ndbmtd(ErrorReporter::handleError(int, char const*, char const*, NdbShutdownType)+0x26) [0x6eda06]
ndbmtd(SimulatedBlock::progError(int, int, char const*, char const*) const+0xf9) [0x61bb69]
ndbmtd(Qmgr::startArbitThread(Signal*)+0x2ab) [0x7b2d9b]
ndbmtd(Qmgr::handleArbitCheck(Signal*)+0x46e) [0x7b344e]
ndbmtd() [0x5fedf4]
ndbmtd() [0x6010b8]
ndbmtd(mt_job_thread_main+0x1e6) [0x6028f6]
ndbmtd() [0x65a93c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f69efb23609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f69eef46293]
2023-07-25 11:09:45 [ndbd] INFO     -- Arbitrator decided to shutdown this node
2023-07-25 11:09:45 [ndbd] INFO     -- QMGR (Line: 8009) 0x00000002
2023-07-25 11:09:45 [ndbd] INFO     -- Error handler shutting down system
2023-07-25 11:09:47 [ndbd] ALERT    -- Node 2: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
2023-07-25 11:10:53 [ndbd] INFO     -- Angel pid: 653061 started child: 653062

I start both data nodes up again with RonDB-21.04.9 and try again:

-- RonDB -- Management Client --
ndb_mgm> show
Connected to Management Server at: nl3:1187
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0)
id=2    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0, *)

[ndb_mgmd(MGM)] 2 node(s)
id=65   @94.75.251.146  (RonDB-22.10.0)
id=66   @94.75.251.146  (RonDB-22.10.0)

[mysqld(API)]   5 node(s)
id=67 (not connected, accepting connect from any host)
id=68 (not connected, accepting connect from any host)
id=231 (not connected, accepting connect from any host)
id=232 (not connected, accepting connect from any host)
id=233 (not connected, accepting connect from any host)

ndb_mgm> 1 stop

This time the data node 2 didn't die:

Connected to Management Server at: nl3:1187
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1 (not connected, accepting connect from nl3)
id=2    @94.75.251.146  (RonDB-21.04.9, Nodegroup: 0, *)

[ndb_mgmd(MGM)] 2 node(s)
id=65   @94.75.251.146  (RonDB-22.10.0)
id=66   @94.75.251.146  (RonDB-22.10.0)

[mysqld(API)]   5 node(s)
id=67 (not connected, accepting connect from any host)
id=68 (not connected, accepting connect from any host)
id=231 (not connected, accepting connect from any host)
id=232 (not connected, accepting connect from any host)
id=233 (not connected, accepting connect from any host)

ndb_mgm> 

and I try to start a new data node 1 with RonDB-22.10.0:

ndbmtd --ndb-connectstring=nl3:1186,nl3:1187 --ndb-nodeid=1 --initial

But same result:

2023-07-25 11:47:42 [ndbd] INFO     -- We are running with 16 LDM workers and 4 REDO log parts. This means that we need to use a mutex to access REDO log parts
2023-07-25 11:47:42 [ndbd] INFO     -- Watchdog KillSwitch off.
2023-07-25 11:47:42 [ndbd] INFO     -- Starting QMGR phase 1
2023-07-25 11:47:42 [ndbd] INFO     -- DIH reported initial start, now starting the Node Inclusion Protocol
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
stack_bottom = 0 thread_stack 0x0
ndbmtd(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x5b13ae]
ndbmtd(ndb_print_stacktrace()+0x45) [0x6371e5]
ndbmtd(ErrorReporter::handleError(int, char const*, char const*, NdbShutdownType)+0x26) [0x6f7b96]
ndbmtd(SimulatedBlock::progError(int, int, char const*, char const*) const+0xf9) [0x5f5629]
ndbmtd(Qmgr::execCM_REGCONF(Signal*)+0x599) [0x7e6b69]
ndbmtd() [0x517a25]
ndbmtd() [0x5dd293]
ndbmtd(mt_job_thread_main+0x317) [0x5cd947]
ndbmtd() [0x6372cc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f15c0910609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f15bfd33293]
2023-07-25 11:47:42 [ndbd] INFO     -- incompatible version own=0x160a00 other=0x150409,  shutting down
2023-07-25 11:47:42 [ndbd] INFO     -- QMGR (Line: 1715) 0x00000006
2023-07-25 11:47:42 [ndbd] INFO     -- Error handler shutting down system
2023-07-25 11:47:44 [ndbd] ALERT    -- Node 1: Forced node shutdown completed. Occurred during startphase 1. Caused by error 6304: 'Unsupported version(Restart error). Temporary error, restart node'.

Please tell me if I should add more information to the ticket. I'd be glad if you pointed out what I am doing wrong.

Thanks again, Max

Evaluate Profile-Guided Optimization

Hi!

Recently I tested a lot of software with PGO and measured the resulting performance improvements; the results are here. Since they show interesting improvements for many databases (including MySQL, MariaDB, and PostgreSQL), I think it would be a good idea to measure the effect of PGO on RonDB as well. If the results show an improvement, it would be great to see a note about PGO in the documentation.

Rondb scalability?

I cannot find details about RonDB's scalability in its documentation. If I remember correctly, the book "MySQL Cluster 7.5 inside and out" says that an NDB data node cluster can scale up to 48 nodes, but the message address size is 16 bits. Could you give me some more details about this?

I also created a topic here: https://community.rondb.com/t/rondb-scalability/18

Bug in storage/ndb/tools/NdbImportCsv.cpp date functions

For the time "1111-11-11 11:11:11.123" with precision 3, packing and then unpacking corrupts the value:

    const char *date_str = "1111-11-11 11:11:11.123";
    size_t date_str_len  = strlen(date_str);
    MYSQL_TIME l_time;
    MYSQL_TIME_STATUS status;
    bool ret = str_to_datetime(date_str, date_str_len, &l_time, 0, &status);

    size_t packed_len = col->getSizeInBytes();
    int precision     = col->getPrecision();
    std::cout << date_str << " Precision is " << precision << " size : " << packed_len << "   sign "
              << l_time.neg << " table: " << table_dic->getName() << std::endl;
    Datetime2 d2;
    d2.day      = l_time.day;
    d2.month    = l_time.month;
    d2.year     = l_time.year;
    d2.hour     = l_time.hour;
    d2.minute   = l_time.minute;
    d2.second   = l_time.second;
    d2.fraction = l_time.second_part;
    d2.sign     = 1;
    std::cout << "mysql " << l_time.year << " " << l_time.month << " " << l_time.day << " "
              << l_time.hour << " " << l_time.minute << " " << l_time.second << " "
              << l_time.second_part << std::endl;

    unsigned char packed[packed_len];
    pack_datetime2(d2, packed, precision);

    Datetime2 d3;
    unpack_datetime2(d3, packed, precision);
    std::cout << "unpacked  " << d3.year << " " << d3.month << " " << d3.day << " " << d3.hour << " "
              << d3.minute << " " << d3.second << " " << d3.fraction << std::endl;

output

1111-11-11 11:11:11.123 Precision is 3 size : 7   sign 0 table: date_table3
mysql 1111 11 11 11 11 11 123000
unpacked 1111 11 11 11 11 27 5035
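
The numbers in this output are consistent with the fraction being passed in microseconds (123000) rather than in units of the column precision (123). The sketch below models only the low bits of the packed value. It assumes the packer multiplies the fraction by 10 for odd precisions and ORs it, without masking, into the low 8 * ((1 + precision) / 2) bits directly below a 6-bit seconds field. That layout is an assumption made for illustration, and `pack_unpack_low_bits` and `scale_fraction` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct LowBits {
  unsigned second;
  unsigned fraction;
};

// Hypothetical model of the seconds and fraction fields only.
LowBits pack_unpack_low_bits(unsigned second, unsigned fraction, unsigned prec) {
  const unsigned fbit = 8 * ((1 + prec) / 2);  // bits reserved for the fraction
  std::uint64_t f = fraction;
  if (prec % 2 != 0) f *= 10;                  // odd precision is stored with one extra digit
  // No masking: a fraction wider than fbit bits spills into the seconds field.
  const std::uint64_t w = (std::uint64_t(second) << fbit) | f;

  std::uint64_t unpacked_f = w & ((std::uint64_t(1) << fbit) - 1);
  if (prec % 2 != 0) unpacked_f /= 10;
  LowBits r;
  r.second = unsigned((w >> fbit) & 0x3F);
  r.fraction = unsigned(unpacked_f);
  return r;
}

// Converts microseconds to units of the given precision (0..6).
unsigned scale_fraction(unsigned second_part_usec, unsigned prec) {
  unsigned f = second_part_usec;
  for (unsigned i = prec; i < 6; i++) f /= 10;
  return f;
}

void check_datetime2_fraction() {
  // Microseconds passed straight through: reproduces "27 5035".
  const LowBits bad = pack_unpack_low_bits(11, 123000, 3);
  assert(bad.second == 27 && bad.fraction == 5035);
  // Fraction scaled to precision 3 first: round-trips as 11 and 123.
  const LowBits good = pack_unpack_low_bits(11, scale_fraction(123000, 3), 3);
  assert(good.second == 11 && good.fraction == 123);
}
```

If that model is right, scaling `l_time.second_part` to the column precision before assigning it to `d2.fraction` would avoid the corruption.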

An error occurred while starting the data node

Hi,
I have been trying out NDB Cluster recently. I compiled a debug build and tried to launch an NDB cluster on my computer, but got an error while starting the data node. Here is some information:

config.ini:

[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2    # Number of fragment replicas
DataMemory=98M    # How much memory to allocate for data storage

[ndb_mgmd]
# Management process options:
HostName=127.0.0.1          # Hostname or IP address of management node
DataDir=/root/runtime/ndb_mgmd/data  # Directory for management node log files

[ndbd]
# Options for data node "A":
                                # (one [ndbd] section per data node)
HostName=127.0.0.1          # Hostname or IP address
NodeId=2                        # Node ID for this data node
DataDir=/root/runtime/ndbd_1/data   # Directory for this data node's data files

[ndbd]
# Options for data node "B":
HostName=127.0.0.1          # Hostname or IP address
NodeId=3                        # Node ID for this data node
DataDir=/root/runtime/ndbd_2/data   # Directory for this data node's data files

[mysqld]
# SQL node options:
HostName=127.0.0.1          # Hostname or IP address
                                # (additional mysqld connections can be
                                # specified for this node for various
                                # purposes such as running ndb_restore)

my.cnf

[mysqld]
# Options for mysqld process:
ndbcluster                      # run NDB storage engine

[mysql_cluster]
# Options for NDB Cluster processes:
ndb-connectstring=127.0.0.1  # location of management server

data node error log

2023-07-10 15:24:05 [ndbd] INFO     -- Angel pid: 26851 started child: 26852
2023-07-10 15:24:05 [ndbd] INFO     -- Wrote data node PID: 26852 into pidfile /root/runtime/ndbd_1/data/ndb_2.pid
2023-07-10 15:24:05 [ndbd] INFO     -- Normal start of data node using checkpoint and log info if existing
2023-07-10 15:24:05 [ndbd] INFO     -- Configuration fetched from '127.0.0.1:1186', generation: 1
2023-07-10 15:24:05 [ndbd] INFO     -- Changing directory to '/root/runtime/ndbd_1/data'
2023-07-10 15:24:05 [ndbd] INFO     -- Activating node 1
2023-07-10 15:24:05 [ndbd] INFO     -- Activating node 2
2023-07-10 15:24:05 [ndbd] INFO     -- Activating node 3
2023-07-10 15:24:05 [ndbd] INFO     -- Activating node 4
2023-07-10 15:24:05 [ndbd] INFO     -- SchedulerSpinTimer = 0
2023-07-10 15:24:05 [ndbd] INFO     -- AutomaticThreadConfig = 1, NumCPUs = 0
2023-07-10 15:24:05 [ndbd] INFO     -- Use automatic thread configuration
2023-07-10 15:24:05 [ndbd] INFO     -- Auto thread config uses:
 8 LDM threads,
 8 Query threads,
 8 tc threads,
 16 Recover threads,
 1 main threads,
 1 rep threads,
 4 recv threads,
 2 send threads
2023-07-10 15:24:05 [ndbd] INFO     -- Number of RR Groups = 1
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
2023-07-10 15:24:05 [ndbd] INFO     -- MaxNoOfTriggers set to 200000
2023-07-10 15:24:05 [ndbd] INFO     -- Automatic Memory Configuration start
2023-07-10 15:24:05 [ndbd] INFO     -- SchemaMemory is 587 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- TransactionMemory is 300 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Redo log buffer size total are 0 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Undo log buffer is 0 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- LongMessageBuffer is 51539607572 MBytes   <------------------ HERE IS THE PROBLEM
2023-07-10 15:24:05 [ndbd] INFO     -- Send buffer sizes are 24 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Job buffer sizes are 0 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Static overhead is 208 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- OS overhead is 2667 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Backup Page memory is 0 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Restore memory is 0 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Packed signal memory is 0 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- NDBFS memory is 32 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- SharedGlobalMemory is 700 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Total memory is 126736 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- Used memory is 51539612090 MBytes
2023-07-10 15:24:05 [ndbd] INFO     -- AutomaticMemoryConfig mode requires at least 512 MByte of space for DataMemory and DiskPageBufferMemory
2023-07-10 15:24:05 [ndbd] ALERT    -- Not enough memory using automatic memory config, exiting, required 5050 MBytes
stack_bottom = 0 thread_stack 0x0
/tmp/build/bin/ndbd(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x8fc0ee]
/tmp/build/bin/ndbd(ErrorReporter::handleError(int, char const*, char const*, NdbShutdownType)+0x2e) [0x856dfe]
/tmp/build/bin/ndbd(Configuration::setupConfiguration()+0x99d) [0x8763dd]
/tmp/build/bin/ndbd(ndbd_run(bool, int, char const*, int, char const*, bool, bool, bool, unsigned int, int, int, unsigned long)+0x28b) [0x4fa98b]
/tmp/build/bin/ndbd(real_main(int, char**)+0x513) [0x4f90c3]
/tmp/build/bin/ndbd(angel_run(char const*, Vector<BaseString> const&, char const*, int, char const*, bool, bool, bool, int, int)+0x10b2) [0x4f8af2]
/tmp/build/bin/ndbd(real_main(int, char**)+0x434) [0x4f8fe4]
/tmp/build/bin/ndbd(main+0x3a) [0x4f51da]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fc6cc308555]
/tmp/build/bin/ndbd() [0x4f6860]
2023-07-10 15:24:05 [ndbd] ALERT    -- Node 2: Forced node shutdown completed. Occurred during startphase 0. Caused by error 2350: 'Invalid configuration received from Management Server(Configuration error). Permanent error, external action needed'.

The data node failed to start because "LongMessageBuffer" was miscalculated as a very large number that exceeds the total memory size. Reviewing the related code, I found the issue in the get_and_set_long_message_buffer() function. The get_num_threads() call computes the thread count from fields of the globalData object, but these fields are still uninitialized (set to 0) if globalData.isNdbMt==false when setupConfiguration() executes. The LongMessageBuffer calculation therefore uses a thread count of 0, which produces the excessively large buffer size.

void
Configuration::setupConfiguration(){
......
 /**
   * This is parts of get_multithreaded_config
   */
  do
  {
    globalData.isNdbMt = NdbIsMultiThreaded();
    g_eventLogger->info("globalData.isNdbMt: %u", globalData.isNdbMt);
    if (!globalData.isNdbMt)  // <-- breaks here, so the globalData thread counts are not initialized
      break;
    ......
  } while(0);
  ......
  if (automatic_memory_config)
  {
    if (!calculate_automatic_memory(it_p))  // <-- fails here
    {
      ERROR_SET(fatal, NDBD_EXIT_INVALID_CONFIG,
                "Invalid configuration fetched",
                "Could not handle automatic memory config");
      DBUG_VOID_RETURN;
    }
  }
  ......
}

ROOT CAUSE:

Uint32
Configuration::get_num_threads()
{
  Uint32 num_ldm_threads = globalData.ndbMtLqhThreads;  // <-- 0
  Uint32 num_tc_threads = globalData.ndbMtTcThreads;
  Uint32 num_query_threads = globalData.ndbMtQueryThreads;
  Uint32 num_main_threads = globalData.ndbMtMainThreads;
  Uint32 num_recv_threads = globalData.ndbMtReceiveThreads;
  return num_ldm_threads +
         num_tc_threads +
         num_query_threads +
         num_main_threads +
         num_recv_threads;
}

Uint64
Configuration::get_and_set_long_message_buffer(
                 const ndb_mgm_configuration_iterator *p)
{
  Uint32 long_signal_buffer = 0;
  ndb_mgm_get_int_parameter(p, CFG_DB_LONG_SIGNAL_BUFFER, &long_signal_buffer);
  Uint64 long_signal_buffer64 = Uint64(long_signal_buffer);
  if (long_signal_buffer64 == 0)
  {
    Uint32 num_threads = get_num_threads();  // <-- returns 0
    g_eventLogger->info("num_threads: %u", num_threads);
    long_signal_buffer64 = (Uint64(32) * MBYTE64);
    long_signal_buffer64 += (Uint64(num_threads - 1) * Uint64(12) * MBYTE64);  // <-- num_threads - 1 wraps around as Uint32 when num_threads == 0, giving a huge Uint64
  }
  globalData.theLongSignalMemory = long_signal_buffer64;
  return long_signal_buffer64;
}
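
The arithmetic can be checked in isolation. The sketch below is standalone C++ that mirrors the formula above with standard integer types, assuming MBYTE64 is 1024 * 1024. With a thread count of 0 it yields exactly the 51539607572 MBytes shown in the log. The guarded variant is only one possible fix, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMByte = 1024ull * 1024ull;  // assumed value of MBYTE64

// Mirrors the reported formula: the subtraction happens in 32 bits,
// so 0 - 1 wraps to 4294967295 before it is widened to 64 bits.
std::uint64_t long_message_buffer_as_reported(std::uint32_t num_threads) {
  std::uint64_t size = std::uint64_t(32) * kMByte;
  size += std::uint64_t(num_threads - 1) * std::uint64_t(12) * kMByte;
  return size;
}

// One possible guard: never subtract from a zero thread count.
std::uint64_t long_message_buffer_guarded(std::uint32_t num_threads) {
  const std::uint64_t extra_threads = num_threads > 0 ? num_threads - 1 : 0;
  return (std::uint64_t(32) + extra_threads * 12) * kMByte;
}

void check_long_message_buffer() {
  // Thread count 0 reproduces the value from the data node log.
  assert(long_message_buffer_as_reported(0) / kMByte == 51539607572ull);
  // The guarded variant falls back to the 32 MByte base size.
  assert(long_message_buffer_guarded(0) / kMByte == 32);
  // For a non-zero thread count both variants agree.
  assert(long_message_buffer_as_reported(29) == long_message_buffer_guarded(29));
}
```

For comparison, the 8 LDM, 8 tc, 8 query, 1 main and 4 recv threads from the log add up to 29, which would give 32 + 28 * 12 = 368 MBytes.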

Temporary solution:
Setting LongMessageBuffer explicitly in config.ini avoids this problem, but I think it would be better to fix it in the code :)

Potential bug in Dbtc

In DbtcMain.cpp line 18010:

  for (Uint32 i = 0; i < 10000; i++)

Perhaps the limit shouldn't be a constant?

Certain schemas(?) don't work with events.

It seems that certain table schemas do not work well with NDB events. That is, if you create a table with a particular schema and then create an event on that table, some of the column values will always be undefined when you receive that event.

After looking through the code a little, I think it might have to do with buffer alignment issues and type sizes? Something about when the NdbRecAttr objects are created. But I don't really know.

Reproducibility:

Using the following schema, create a table:

CREATE TABLE `example` (
    `column1` BIGINT NOT NULL,
    `column2` INT NOT NULL,
    `column3` TINYINT  NOT NULL,
    `column4` BIGINT NOT NULL,
    `column5` BIGINT NOT NULL,
    `column6` BIGINT NOT NULL,
    PRIMARY KEY (`column1`, `column4`, `column6`)
) ENGINE=NDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;

You can use this modified version of ndb_apievent.cpp to test: https://pastebin.com/jsLcffnQ

Or you can use your own program/modify the example yourself. Just specify the name of the table created above and use the same event columns. I created an event and specified "column1", "column2", "column3", "column4". For every received event (INSERT, UPDATE, DELETE), the pre- and post-values for "column2" and "column3" will be undefined.

Observations:

  • If I specify all of the columns, then they all receive values correctly.
  • If the primary key is set to just column1, then the event described above works.
  • Likewise, the event works if the primary key is set to (column1, column4).

If this is intended behavior, then maybe it could be documented a little more clearly?

Docker image / Docker-compose sample needed

Hello, it would be great to have a Docker image so that everyone can test RonDB quickly, as we can with MySQL, for example:
docker run --rm --name mysql-db -p 3306:3306 -e MYSQL_ROOT_PASSWORD=mypassword -d mysql

Rust driver?

Hello! RonDB seems pretty cool!
When will there be a Rust driver? I'd like to try RonDB out in my current project.
