Comments (12)
Which redis client are you using?
from ardb.
Clients are iOS and Android using hiredis and jedis respectively. The clients pass through a TCP load balancer and they reach two instances of a redis proxy I wrote in vert.x (java).
Unfortunately, my boss said it's not (yet) possible to publish it as open source.
The proxy is really simple: it's a transparent TCP proxy into which I hooked a Redis protocol parser. The parser reads from the stream and creates Command objects, which are passed to a security validator. The validator decides whether to forward the command or, if the command is malicious, close the connection.
from ardb.
Today I found more of these errors and my whole data set was corrupted: ardb wouldn't start. I tried removing the LOCK file, but it didn't help, so I had to rename the data folder, create a new one, and start over.
[17302] 04-09 18:03:02,281 ERROR No handler found for:
[17302] 04-09 18:03:02,281 DEBUG Process recved cmd:$5 with flags:0
[17302] 04-09 18:03:02,281 ERROR No handler found for:$5
[17302] 04-09 18:03:02,281 DEBUG Process recved cmd:lpush with flags:0
[17302] 04-09 18:03:02,281 DEBUG Process recved cmd:$19 with flags:0
[17302] 04-09 18:03:02,281 ERROR No handler found for:$19
[17302] 04-09 18:03:02,281 DEBUG Process recved cmd:mpme.search.request with flags:0
[17302] 04-09 18:03:02,281 ERROR No handler found for:mpme.search.request
[17302] 04-09 18:03:02,281 DEBUG Process recved cmd:$193 with flags:0
...
[18232] 05-14 14:47:40,104 ERROR No handler found for:$57 with size:3
[18232] 05-14 14:47:40,104 ERROR Invalid command's ascii codes:36 53 55
[18232] 05-14 14:47:40,104 ERROR No handler found for:mpme.search.response.57b2510d-3e94-491b-8b58-0ecefd64f2fa with size:57
[18232] 05-14 14:47:40,104 ERROR Invalid command's ascii codes:109 112 109 101 46 115 101 97 114 99 104 46 114 101 115 112 111 110 115 101 46 53 55 98 50 53 49 48 100 45 51 101 57 52 45 52 57 49 98 45 56 98 53 56 45 48 101 99 101 102 100 54 52 102 50 102 97
[18232] 05-14 14:47:40,104 ERROR No handler found for:$2 with size:2
[18232] 05-14 14:47:40,104 ERROR Invalid command's ascii codes:36 50
[18232] 05-14 14:47:40,104 ERROR No handler found for:15 with size:2
[18232] 05-14 14:47:40,104 ERROR Invalid command's ascii codes:49 53
[18232] 05-14 15:33:11,763 ERROR No handler found for: with size:1
[18232] 05-14 15:33:11,763 ERROR Invalid command's ascii codes:0
[18232] 05-14 15:33:11,763 ERROR No handler found for:$3 with size:2
[18232] 05-14 15:33:11,763 ERROR Invalid command's ascii codes:36 51
[18232] 05-14 15:33:11,763 ERROR No handler found for:$22 with size:3
[18232] 05-14 15:33:11,763 ERROR Invalid command's ascii codes:36 50 50
[18232] 05-14 15:33:11,763 ERROR No handler found for:mpme.cache.artist.1871 with size:22
[18232] 05-14 15:33:11,763 ERROR Invalid command's ascii codes:109 112 109 101 46 99 97 99 104 101 46 97 114 116 105 115 116 46 49 56 55 49
All this happened with ardb 0.7.0, at the stage where you enhanced the error messages for #39. No core was dumped; the server just crashed and could not be restarted because of the corrupt data.
from ardb.
The 'No handler found' error indicates that the Redis command ardb received is broken. Is the client connected to ardb fully compatible with the Redis protocol? I found 'Invalid command's ascii codes:0' in the log; an ASCII 0 followed by '\r\n' is not valid in the Redis protocol.
As for the crash, is there any information in the syslog?
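For readers following the log excerpts, the "ascii codes" lines can be turned back into text with a few lines of shell. This is only a reading aid, not part of ardb; `decode_ascii_codes` is a hypothetical helper name, and it assumes bash. Non-printable bytes (such as the 0 mentioned above) are shown as `\xNN` escapes.

```shell
# Hypothetical helper: convert space-separated decimal byte values,
# as printed in "Invalid command's ascii codes:", back into text.
decode_ascii_codes() {
    local code out=""
    for code in "$@"; do
        if [ "$code" -ge 32 ] && [ "$code" -le 126 ]; then
            # printable ASCII: emit the character via an octal escape
            out+=$(printf '%b' "\\0$(printf '%03o' "$code")")
        else
            # non-printable byte: show it as a hex escape instead
            out+=$(printf '\\x%02x' "$code")
        fi
    done
    printf '%s\n' "$out"
}

decode_ascii_codes 36 53 55
decode_ascii_codes 109 112 109 101 46 99 97 99 104 101
```

The codes `36 53 55` decode to `$57`, which has the shape of a Redis protocol bulk-string length header (`$<length>\r\n`), and the 57-byte key follows on the next log line. That is consistent with argument fragments of a multi-part request being handed to the command lookup, though the log alone does not show where the framing went wrong.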
from ardb.
Invalid commands may occur (I'm accepting data from mobile clients through a vert.x proxy), but on the other hand, how can a broken command cause data corruption?
May 14 21:13:15 ip-10-49-11-32 kernel: [24743925.343771] ardb-server[18456]: segfault at 1 ip 00000000004d017e sp 00007fcd2017f330 error 6 in ardb-server[400000+2cb000]
This is the only thing I found in syslog, but there is more in dmesg:
[22905597.903340] Out of memory: Kill process 9369 (ardb-server) score 294 or sacrifice child
[22905597.903355] Killed process 9369 (ardb-server) total-vm:6003276kB, anon-rss:2257200kB, file-rss:0kB
[22905598.013644] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[22905598.013652] java cpuset=/ mems_allowed=0
[22905598.013657] CPU: 0 PID: 996 Comm: java Not tainted 3.11.0-15-generic #25-Ubuntu
[22905598.013660] 0000000000000000 ffff8800ee5d1960 ffffffff816e7375 00000000000201da
[22905598.013664] ffff8800ee5d19d0 ffffffff816e22d0 0000000000000000 00000000001d59d6
[22905598.013667] 00000000000201da ffffffff816ef0fa ffff880042002ee0 ffffffff00000000
[22905598.013670] Call Trace:
[22905598.013683] [<ffffffff816e7375>] dump_stack+0x45/0x56
[22905598.013686] [<ffffffff816e22d0>] dump_header+0x7f/0x1c2
[22905598.013691] [<ffffffff816ef0fa>] ? error_exit+0x2a/0x60
[22905598.013697] [<ffffffff81142449>] oom_kill_process+0x1a9/0x310
[22905598.013702] [<ffffffff812dc4a5>] ? security_capable_noaudit+0x15/0x20
[22905598.013705] [<ffffffff81142b84>] out_of_memory+0x414/0x450
[22905598.013709] [<ffffffff81148360>] __alloc_pages_nodemask+0x870/0x920
[22905598.013714] [<ffffffff81183c69>] alloc_pages_current+0xa9/0x160
[22905598.013717] [<ffffffff8113f2b7>] __page_cache_alloc+0x97/0xc0
[22905598.013720] [<ffffffff81141325>] filemap_fault+0x185/0x400
[22905598.013725] [<ffffffff811637ff>] __do_fault+0x6f/0x540
[22905598.013732] [<ffffffff81006154>] ? pte_mfn_to_pfn.part.13+0x74/0x100
[22905598.013735] [<ffffffff81166a33>] handle_pte_fault+0x93/0xab0
[22905598.013738] [<ffffffff81006154>] ? pte_mfn_to_pfn.part.13+0x74/0x100
[22905598.013740] [<ffffffff81006269>] ? xen_pmd_val+0x19/0x20
[22905598.013743] [<ffffffff810051f9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[22905598.013746] [<ffffffff811681f9>] handle_mm_fault+0x299/0x670
[22905598.013753] [<ffffffff810c3501>] ? futex_wake_op+0x521/0x650
[22905598.013757] [<ffffffff816f265d>] __do_page_fault+0x14d/0x530
[22905598.013760] [<ffffffff810c4622>] ? do_futex+0x142/0x620
[22905598.013763] [<ffffffff81004e42>] ? xen_mc_flush+0x182/0x1b0
[22905598.013765] [<ffffffff816f2a6c>] do_page_fault+0x2c/0x50
[22905598.013770] [<ffffffff816eee98>] page_fault+0x28/0x30
[22905598.013772] Mem-Info:
...
from ardb.
[22905597.903340] Out of memory: Kill process 9369 (ardb-server) score 294 or sacrifice child
It seems that there was an out-of-memory error.
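A quick way to check whether the kernel's OOM killer has hit a process is to filter the kernel log for that message. The sketch below assumes the message format shown in this thread (kernel 3.11); other kernel versions may word it differently. `oom_victims` is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<process name> <pid>" for every OOM kill in the format seen above.
oom_victims() {
    sed -n 's/.*Out of memory: Kill process \([0-9][0-9]*\) (\([^)]*\)).*/\2 \1/p'
}

# Example with the line from this thread; on a live system one would
# pipe in the kernel log instead, e.g.: dmesg | oom_victims
echo '[22905597.903340] Out of memory: Kill process 9369 (ardb-server) score 294 or sacrifice child' | oom_victims
```

If ardb-server shows up in that list, the process was killed from outside rather than exiting on its own, which matters when interpreting what state the data files were left in.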
from ardb.
So it’s normal that OOM gives corrupted, unrecoverable data? Is there a way to protect data under these circumstances?
-Simone
On Saturday, 17 May 2014 at 09:18, yinqiwen wrote:
[22905597.903340] Out of memory: Kill process 9369 (ardb-server) score 294 or sacrifice child
It seems that there is a out of memory error.
from ardb.
I'm back on this issue now.
What does the log report when ardb cannot restart? In the code, ardb tries to repair the data if it is corrupted, and it logs the error while doing so.
from ardb.
I can't reproduce it anymore. It just gave me "terminated" like this:
src/ardb-server /etc/ardb.conf
[26301] 05-25 10:53:46,474 INFO Start init storage engine.
Terminated
Today I had a similar problem, but this time I understood the cause: my cron job, which runs every minute, saw that ARDB was not responding and killed and restarted it every minute.
Once I disabled that, I could restart it.
Now I have the script running every 5 minutes. Here's the code inside of it, just for reference:
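# Succeeds if the server answers INFO within 10 seconds.
function isNotHung {
    timeout 10 redis-cli info > /dev/null
}

function startARDB {
    echo "restarting at $(date)"
    cd /home/ubuntu/ardb/ && src/ardb-server ardb.conf
}

# Test pidof's exit status rather than its output, so that several PIDs
# do not break the check.
if ! pidof ardb-server > /dev/null
then
    # not running at all
    startARDB
else
    if isNotHung
    then
        # running: record the current QPS in a sorted set on another Redis server
        TPS=$(redis-cli info | grep qps | awk -F: '{print $2}' | sed 's/\r$//')
        NOW=$(date +%s)
        redis-cli -h our-other-redis-server.com zadd ardb_tps "$NOW" "$TPS"
        # remove every entry from rank 1440 onward
        redis-cli -h our-other-redis-server.com zremrangebyrank ardb_tps 1440 -1
    else
        echo "hung"
        killall ardb-server; startARDB
    fi
fi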
from ardb.
LevelDB's init method may block for a long time if there is a lot of data, possibly longer than 5 minutes.
from ardb.
And what happens if I try to start ardb-server multiple times every 60 seconds? :D
from ardb.
According to LevelDB's code, it is running a data compaction task at that point. I think the restarts just interrupt the compaction task again and again.
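One way to keep a watchdog from interrupting a long startup is to give a freshly started server a grace period before treating it as hung. The following is a minimal sketch of that idea, not something ardb provides: `GRACE_SECONDS`, `process_age_seconds`, and `watchdog_action` are hypothetical names, the 1800-second default is an arbitrary placeholder that would need tuning to the data size, and `ps -o etimes=` is assumed to be available (it is in procps-ng on Linux).

```shell
# Hypothetical setting: how long a newly started ardb-server may stay
# unresponsive before the watchdog is allowed to kill it.
GRACE_SECONDS=${GRACE_SECONDS:-1800}

# Seconds since the given PID was started (assumes procps-ng ps).
process_age_seconds() {
    ps -o etimes= -p "$1" | tr -d ' '
}

# Decide what to do.
#   $1 = PID of ardb-server (empty if it is not running)
#   $2 = age of that process in seconds
#   $3 = "yes" if the server answered a health check, anything else otherwise
# Prints one of: start, ok, wait, restart
watchdog_action() {
    local pid=$1 age=$2 responsive=$3
    if [ -z "$pid" ]; then
        echo start
    elif [ "$responsive" = yes ]; then
        echo ok
    elif [ "$age" -lt "$GRACE_SECONDS" ]; then
        # probably still initializing: leave it alone
        echo wait
    else
        echo restart
    fi
}
```

In a cron script like the one earlier in this thread, the PID could come from `pidof -s ardb-server`, the age from `process_age_seconds`, and the health check from `timeout 10 redis-cli info`; the script would then kill and restart the server only when `watchdog_action` prints `restart`.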
from ardb.