Comments (17)
Thx for the detailed report. We will look into the issue; we are about to release 22.10.1, so we will make sure the issues related to this are fixed there. It is correct that one needs at least 21.04.9 to upgrade to 22.10.0.
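The upgrade constraint mentioned here can be written as a small pre-flight check. The following is only a sketch in Python: the helper names `parse_version` and `can_upgrade_to_22_10` are hypothetical and not part of any RonDB tool, and the only rule encoded is the one stated in this comment (at least 21.04.9 before moving to 22.10.0).

```python
def parse_version(text):
    """Turn a version string such as '21.04.9' into a comparable tuple (21, 4, 9)."""
    return tuple(int(part) for part in text.split("."))


# Minimum source version named in this thread for an upgrade to 22.10.0.
MIN_SOURCE_VERSION = parse_version("21.04.9")


def can_upgrade_to_22_10(current):
    """Hypothetical helper: True if `current` meets the minimum stated above."""
    return parse_version(current) >= MIN_SOURCE_VERSION


if __name__ == "__main__":
    for version in ("21.04.8", "21.04.9", "21.04.16"):
        print(version, can_upgrade_to_22_10(version))
```

Comparing integer tuples rather than raw strings avoids ordering "21.04.16" before "21.04.9".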
from rondb.
Thanks Mikael!
Btw. this is running the new RonDB backend for Dydra, which now passes our complete test suite at https://github.com/dydra/http-api-tests/. The backend is based on our Common Lisp bindings for the NDB API, which are by now complete enough to support that backend.
From Berlin, Max
from rondb.
Thx Max,
This is very interesting information. It's very interesting to hear about new use cases for RonDB.
Let us know also if there are features that could be of use for you in RonDB future development.
from rondb.
Hi Mikael, any news here? Was any work done in that direction for 22.10.1?
I noticed that 22.10.1 is not released yet but surely in an almost stable state, right? Should we try to upgrade directly to 22.10.1? It is still an experimental environment for us and not production-level, so we could live even with data loss and reload our data in that case. (Even more so as the import with the rondb backend is now much faster than with our existing lmdb based backends.)
Max
from rondb.
Hi again, that worked much better, on the second attempt.
In the first attempt it ran out of memory:
ndb_2_out.log ended with:
2024-01-09 19:36:04 [ndbd] ERROR -- Global memory manager is out of memory completely, no memory in shared global memory left and no memory in reserved memory that we can steal either.
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
stack_bottom = 0 thread_stack 0x0
ndbmtd(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0xa8f2a1]
ndbmtd(ErrorReporter::handleError(int, char const*, char const*, NdbShutdownType)+0x37) [0x955077]
ndbmtd(SimulatedBlock::progError(int, int, char const*, char const*) const+0x106) [0xa505a6]
ndbmtd(Ndbcntr::execSYSTEM_ERROR(Signal*)+0xac) [0x8169fc]
ndbmtd() [0xa7754c]
ndbmtd(mt_job_thread_main+0x37b) [0xa7994b]
ndbmtd() [0xa10842]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f14a76ec609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f14a6d74133]
2024-01-09 19:36:05 [ndbd] ERROR -- Global memory manager is out of memory completely, no memory in shared global memory left and no memory in reserved memory that we can steal either. - Repeated 45 times
2024-01-09 19:36:05 [ndbd] INFO -- Killed by node 2 as copyfrag failed, error: 827
2024-01-09 19:36:05 [ndbd] INFO -- NDBCNTR (Line: 380) 0x00000006
2024-01-09 19:36:05 [ndbd] INFO -- Error handler shutting down system
2024-01-09 19:36:08 [ndbd] ALERT -- Node 2: Forced node shutdown completed. Occurred during startphase 5. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
and in the cluster management console it said:
ndb_mgm> Node 2: Data usage increased to 80%(140284 32K pages of total 175214)
Node 2: Data usage increased to 90%(156283 32K pages of total 173448)
Node 2: Index usage increased to 80%(16761 32K pages of total 20764)
Node 2: Index usage increased to 90%(16921 32K pages of total 18673)
Node 2: Data usage increased to 99%(170917 32K pages of total 172292)
Node 2: Forced node shutdown completed. Occurred during startphase 5. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
I deleted the biggest table, got the message:
ndb_mgm> Node 1: Data usage decreased to 65%(433721 32K pages of total 660940)
and tried again.
Now it came up and I could update the whole cluster.
I still notice that I cannot load as much data as before. I tried to load a bigger file (but still smaller than the data that was loaded before) and get:
Error with code 827: Out of memory in Ndb Kernel, table data (increase DataMemory)
while the management console told me:
ndb_mgm> Node 2: Data usage increased to 80%(139381 32K pages of total 173993)
So, I guess the automatic memory configuration changed and that causes the problems...
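To compare the console figures between nodes it helps to convert the page counts into MBytes. A minimal sketch, assuming that "32K pages" means pages of 32 KiB each; the helper name `pages_to_mib` is hypothetical.

```python
PAGE_SIZE_KIB = 32  # the console reports "32K pages"; assumed to mean 32 KiB


def pages_to_mib(pages):
    """Convert a count of 32 KiB pages into MiB."""
    return pages * PAGE_SIZE_KIB / 1024


if __name__ == "__main__":
    # Totals taken from the management console messages quoted above.
    print(f"node 1 total: {pages_to_mib(660940):.0f} MiB")
    print(f"node 2 total: {pages_to_mib(173993):.0f} MiB")
```

Under that assumption, node 1 (still on the old version when it reported) had roughly 20 GiB of data pages available, while the upgraded node 2 had a little over 5 GiB, which fits the guess that the automatic memory configuration changed.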
Yes, that seems to be the problem: My config just contains:
TotalMemoryConfig=48G
SharedGlobalMemory=16G
RonDB-21.04.9 turned this into:
TransactionMemory is 1965 MBytes
SharedGlobalMemory is 16384 MBytes
Total memory is 49152 MBytes
Used memory is 25714 MBytes
Remaining memory is 23437 MBytes
Setting DataMemory to 21093 MBytes
while RonDB-22.10.1 makes:
TransactionMemory is 15439 MBytes
SharedGlobalMemory is 16384 MBytes
Total memory is 49152 MBytes
Used memory is 42581 MBytes
Remaining memory is 6570 MBytes
Setting DataMemory to 5913 MBytes
I'll reconfigure and then it should work. So the changed automatic memory configuration seems to be the only caveat so far. Nice!
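The two startup logs can be cross-checked with a little arithmetic. In both of them the logged DataMemory is about 90% of the logged remaining memory. That ratio is read off these numbers and is an assumption, not RonDB's documented formula; the helper name below is hypothetical.

```python
def estimated_data_memory_mb(remaining_mb):
    """Estimate DataMemory as 90% of remaining memory, rounded to the nearest MByte.

    The 90% factor is inferred from the log excerpts in this thread.
    """
    return (remaining_mb * 9 + 5) // 10


# (label, total, used, remaining, logged DataMemory), all in MBytes, from the logs above.
LOGGED = [
    ("RonDB-21.04.9", 49152, 25714, 23437, 21093),
    ("RonDB-22.10.1", 49152, 42581, 6570, 5913),
]

if __name__ == "__main__":
    for label, total, used, remaining, data_memory in LOGGED:
        # The log prints whole MBytes, so total - used can differ from
        # the logged remaining value by one.
        print(label, total - used, remaining,
              estimated_data_memory_mb(remaining), data_memory)
```

Seen this way, the jump in TransactionMemory from 1965 to 15439 MBytes accounts for most of the increase in used memory (25714 to 42581 MBytes), and DataMemory shrinks accordingly.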
from rondb.
Update: configuring TransactionMemory explicitly did not help, as the new RonDB still raises it to a high value of a good 12 GB.
But I figured out that I do not need to configure SharedGlobalMemory in that case, and that turned out to work well.
It is only a bit surprising that the two data nodes come up with different results, after I restarted both.
Node 1:
Total memory is 49152 MBytes
Used memory is 23807 MBytes
Remaining memory is 25344 MBytes
Setting DataMemory to 22810 MBytes
Setting DiskPageBufferMemory to 2450 MBytes
Node 2:
Total memory is 49152 MBytes
Used memory is 28747 MBytes
Remaining memory is 20404 MBytes
Setting DataMemory to 18364 MBytes
Setting DiskPageBufferMemory to 1973 MBytes
from rondb.
Looks like something I need to look into right away, could you provide the config.ini you used?
from rondb.
Also would be great to see the full node log, at least the part about the memory allocation sizes.
from rondb.
For both 21.04 and 22.10.1
from rondb.
Also presume you use AutomaticThreadConfig=1, so then also interesting to know how many CPUs the machine has.
from rondb.
The difference between the 2 nodes could happen if they have a different number of CPUs.
from rondb.
Could reproduce with a very simple test. It seems that we changed the memory configuration in a number of places to safeguard against running out of memory, but obviously we've been overcautious. However, the difference between node 1 and node 2 is harder to understand unless they have a different set of CPUs. Will look into the details of each of the differences and see what the best strategy is. The transaction memory is likely due to some configuration setting you have used that affects the TM calculations, so I need to see the config.ini to understand that part. In my test TM was equal in 21.04.16 and 22.10.1.
from rondb.
As part of this fix I will also make sure that setting MaxNoOfConcurrentOperations guarantees that we can handle transactions of that size. Each operation will consume about 1.5kB of TransactionMemory, so setting it to 2M, for example, will imply a minimum of 3G TransactionMemory.
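The sizing rule described here is simple enough to write down. A minimal sketch, assuming "about 1.5kB" means 1536 bytes per operation and "2M" means 2 × 1024 × 1024 operations; the function name is hypothetical and the actual calculation inside RonDB may differ.

```python
BYTES_PER_OPERATION = 1536  # "about 1.5kB" per operation, assumed to be 1.5 KiB


def min_transaction_memory_bytes(max_concurrent_operations):
    """Lower bound on TransactionMemory implied by MaxNoOfConcurrentOperations."""
    return max_concurrent_operations * BYTES_PER_OPERATION


if __name__ == "__main__":
    operations = 2 * 1024 * 1024  # "2M" operations
    gib = min_transaction_memory_bytes(operations) / 1024**3
    print(f"minimum TransactionMemory: {gib} GiB")
```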
from rondb.
PR for RONDB-581 created for memory configuration issue
from rondb.
Thank you! Most of this we already discussed by e-mail. While I could not move from 21.04.9 to 22.10.0, the upgrade to 22.10.1 has now worked successfully. And from your comments I conclude that we should be using 22.10.1 already.
"As part of this fix I will also ensure that MaxNoOfConcurrentOperations also ensures that we can handle transactions of this size. This means that each operation will consume about 1.5kB of TransactionMemory, so setting it to 2M for example will set a minimum of 3G TransactionMemory."
Good!
Adapting cl-ndbapi to 22.10.1 also seemed minor. I needed to activate -std=c++17 for my C/C++ wrapper, as the ndbapi now uses C++17 constructs. And there seem to be changes to NDB.set_eventbuf_max_alloc() and compare_ndbrecord, neither of which I use at the moment.
Is there some documentation about ndbapi changes between the two versions available?
from rondb.
Note: I've added NDB.set_eventbuf_max_alloc() again. Only compare_ndbrecord really changed in an incompatible fashion.
The signature changed from
int compare_ndbrecord(const NdbReceiver *r1,
                      const NdbReceiver *r2,
                      const NdbRecord *key_record,
                      const NdbRecord *result_record,
                      bool descending,
                      bool read_range_no)
in rondb 21.04.9 to
int compare_ndbrecord(const NdbReceiver *r1,
                      const NdbReceiver *r2,
                      const NdbRecord *key_record,
                      const NdbRecord *result_record,
                      const unsigned char *result_mask,
                      bool descending,
                      bool read_range_no)
in 22.10.1.
Well, probably meant to be internal. But it is in the interface and thus also included by swig.
from rondb.