Comments (15)
1. On startup, the server writes its full configuration to info.log. Could you share it?
2. Have you tried load-testing a single node?
3. Set DemoEventLogger to debug in logback.xml and check event.log for the reason connections were dropped during the load test.
from bifromq.
1. On startup, the server writes its full configuration to info.log. Could you share it? 2. Have you tried load-testing a single node? 3. Set DemoEventLogger to debug in logback.xml and check event.log for the reason connections were dropped during the load test.
1. Full configuration dumped to info.log at startup:
---
bootstrap: true
clusterConfig:
  env: "Test"
  host: "10.89.144.26"
  port: 8899
  seedEndpoints: "10.89.144.129:8899,10.89.144.26:8899,10.89.144.62:8899,10.89.144.69:8899,10.89.144.121:8899"
mqttServerConfig:
  connTimeoutSec: 5
  maxConnPerSec: 3000
  maxDisconnPerSec: 1000
  maxMsgByteSize: 262144
  maxResendTimes: 5
  maxConnBandwidth: 524288
  defaultKeepAliveSec: 60
  qos2ConfirmWindowSec: 5
  bossELGThreads: 1
  workerELGThreads: 16
  tcpListener:
    enable: true
    host: "0.0.0.0"
    port: 1883
  tlsListener:
    enable: true
    host: "0.0.0.0"
    port: 8883
    sslConfig:
      certFile: "server.crt"
      keyFile: "server_pkcs8.key"
      trustCertsFile: "root.crt"
      clientAuth: "REQUIRE"
  wsListener:
    enable: true
    host: "0.0.0.0"
    port: 8080
    wsPath: "/mqtt"
  wssListener:
    enable: false
    host: "0.0.0.0"
    port: 8443
    wsPath: "/mqtt"
rpcClientConfig:
  workerThreads: 100
rpcServerConfig:
  host: "10.89.144.26"
  port: 0
  workerThreads: 100
baseKVRpcServerConfig:
  port: 0
stateStoreConfig:
  queryThreads: 100
  tickerThreads: 10
  bgWorkerThreads: 100
  distWorkerConfig:
    queryPipelinePerStore: 10000
    compactWALThreshold: 5000
    dataEngineConfig:
      type: "rocksdb"
      dataPathRoot: ""
      manualCompaction: false
      compactMinTombstoneKeys: 200000
      compactMinTombstoneRanges: 100000
      compactTombstoneRatio: 0.3
      asyncWALFlush: false
      fsyncWAL: false
    walEngineConfig:
      type: "rocksdb"
      dataPathRoot: ""
      manualCompaction: true
      compactMinTombstoneKeys: 2500
      compactMinTombstoneRanges: 2
      compactTombstoneRatio: 0.3
      asyncWALFlush: false
      fsyncWAL: false
    balanceConfig:
      scheduleIntervalInMs: 5000
      balancers:
        - "com.baidu.bifromq.dist.worker.balance.ReplicaCntBalancerFactory"
  inboxStoreConfig:
    queryPipelinePerStore: 10000
    compactWALThreshold: 2500
    gcIntervalSeconds: 600
    purgeDelaySeconds: 180
    dataEngineConfig:
      type: "rocksdb"
      dataPathRoot: ""
      manualCompaction: false
      compactMinTombstoneKeys: 200000
      compactMinTombstoneRanges: 100000
      compactTombstoneRatio: 0.3
      asyncWALFlush: false
      fsyncWAL: false
    walEngineConfig:
      type: "rocksdb"
      dataPathRoot: ""
      manualCompaction: true
      compactMinTombstoneKeys: 2500
      compactMinTombstoneRanges: 2
      compactTombstoneRatio: 0.3
      asyncWALFlush: false
      fsyncWAL: false
    balanceConfig:
      scheduleIntervalInMs: 5000
      balancers:
        - "com.baidu.bifromq.inbox.store.balance.ReplicaCntBalancerFactory"
        - "com.baidu.bifromq.inbox.store.balance.RangeSplitBalancerFactory"
        - "com.baidu.bifromq.inbox.store.balance.RangeLeaderBalancerFactory"
  retainStoreConfig:
    queryPipelinePerStore: 100
    compactWALThreshold: 2500
    gcIntervalSeconds: 600
    dataEngineConfig:
      type: "rocksdb"
      dataPathRoot: ""
      manualCompaction: false
      compactMinTombstoneKeys: 200000
      compactMinTombstoneRanges: 100000
      compactTombstoneRatio: 0.3
      asyncWALFlush: false
      fsyncWAL: false
    walEngineConfig:
      type: "rocksdb"
      dataPathRoot: ""
      manualCompaction: true
      compactMinTombstoneKeys: 5000
      compactMinTombstoneRanges: 2
      compactTombstoneRatio: 0.3
      asyncWALFlush: false
      fsyncWAL: false
    balanceConfig:
      scheduleIntervalInMs: 5000
      balancers:
        - "com.baidu.bifromq.retain.store.balance.ReplicaCntBalancerFactory"
apiServerConfig:
  enable: true
  httpPort: 8091
  apiBossThreads: 1
  apiWorkerThreads: 2
  httpsListenerConfig:
    enable: false
    port: 8090
2. Single-node load test: we use small payloads, and a single node meets our requirements.
from bifromq.
1. On startup, the server writes its full configuration to info.log. Could you share it? 2. Have you tried load-testing a single node? 3. Set DemoEventLogger to debug in logback.xml and check event.log for the reason connections were dropped during the load test.
We did try it once: with the dataEngine for dist and inbox set to memory, the CPU could be driven up, the setup sustained our load test to completion, and the cluster stayed healthy the whole time.
stateStoreConfig:
  distWorkerConfig:
    dataEngineConfig:
      type: memory
  inboxStoreConfig:
    dataEngineConfig:
      type: memory
from bifromq.
We tried load-testing this scenario and could not reproduce the issue after two hours. From the description, the large number of CLOSE_WAIT connections looks like the clients disconnected abnormally without completing the full TCP teardown handshake, leaving many lingering CLOSE_WAIT connections on the server.
from bifromq.
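As a side note, the CLOSE_WAIT buildup discussed here can be counted directly on the server by parsing /proc/net/tcp. This is a sketch, Linux-only and IPv4-only; the kernel encodes CLOSE_WAIT as state code 08 in that file.

```python
# Count sockets in the CLOSE_WAIT state by parsing /proc/net/tcp (Linux, IPv4).
# Field 4 ("st") holds the TCP state as hex; 08 is TCP_CLOSE_WAIT.
def count_close_wait(path="/proc/net/tcp"):
    with open(path) as f:
        next(f)  # skip the header line
        return sum(1 for line in f if line.split()[3] == "08")

print(count_close_wait())
```

The same count is available via `ss -tan state close-wait`; parsing /proc avoids depending on which CLI tools are installed.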
We tried load-testing this scenario and could not reproduce the issue after two hours. From the description, the large number of CLOSE_WAIT connections looks like the clients disconnected abnormally without completing the full TCP teardown handshake, leaving many lingering CLOSE_WAIT connections on the server.
The mass of CLOSE_WAIT connections is only a symptom. Our analysis is that the BifroMQ service hangs, so client requests get no response; that is why so many CLOSE_WAIT connections pile up, and at that point an MQTT client cannot connect to the service either. Our confusion is this: none of our resources hit a bottleneck, yet RocksDB cannot sustain our load test. We ran the test many times, and whenever RocksDB is the storage engine, the service becomes unavailable within a few minutes; the memory engine has no such problem. We even tried putting the RocksDB data on tmpfs, which did not help either. Configuring only
walEngineConfig:
  type: "memory"
does not work. Only
dataEngineConfig:
  type: "memory"
can sustain our test.
from bifromq.
With cleanSession=true and QoS 0, the message path involves no RocksDB I/O. BifroMQ also exports JVM metrics; check heap and direct-buffer usage.
from bifromq.
With cleanSession=true and QoS 0, the message path involves no RocksDB I/O. BifroMQ also exports JVM metrics; check heap and direct-buffer usage.
We checked GC and saw nothing wrong. After enabling monitoring, one RocksDB metric looks relatively expensive:
basekv_le_rocksdb_flush_time_seconds_count{env="Test",kvspace="111877626712162304_0",storeId="a79ce47a-e01f-4db5-9651-fb4b3f7b3659",type="wal",} 168.0
basekv_le_rocksdb_flush_time_seconds_sum{env="Test",kvspace="111877626712162304_0",storeId="a79ce47a-e01f-4db5-9651-fb4b3f7b3659",type="wal",} 5.7988E-5
basekv_le_rocksdb_flush_time_seconds_count{env="Test",kvspace="111877626717339649_0",storeId="79675d6a-a7e9-4a52-9f8b-b659b58b9533",type="wal",} 488610.0
basekv_le_rocksdb_flush_time_seconds_sum{env="Test",kvspace="111877626717339649_0",storeId="79675d6a-a7e9-4a52-9f8b-b659b58b9533",type="wal",} 0.029379993
basekv_le_rocksdb_flush_time_seconds_count{env="Test",kvspace="111877626699644928_0",storeId="18207f92-f591-49a1-9887-ab30549816a0",type="wal",} 7.0
basekv_le_rocksdb_flush_time_seconds_sum{env="Test",kvspace="111877626699644928_0",storeId="18207f92-f591-49a1-9887-ab30549816a0",type="wal",} 1.0189E-5
from bifromq.
The flush time for all three kvspaces averages around 1 microsecond. Why do you consider that high?
from bifromq.
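The "around 1 microsecond" figure follows directly from dividing each metric's `_sum` by its `_count`, using the values in the Prometheus output above:

```python
# Average per-flush latency = flush_time_seconds_sum / flush_time_seconds_count,
# using the three kvspace samples quoted in the metrics above.
samples = {
    "111877626712162304_0": (5.7988e-5, 168),
    "111877626717339649_0": (0.029379993, 488610),
    "111877626699644928_0": (1.0189e-5, 7),
}
for kvspace, (total_s, count) in samples.items():
    avg_us = total_s / count * 1e6
    print(f"{kvspace}: {avg_us:.2f} us per WAL flush")
```

All three averages come out at roughly 1.5 microseconds or less, so the WAL flush latency itself does not look like the bottleneck.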
The flush time for all three kvspaces averages around 1 microsecond. Why do you consider that high?
Our initial load test had no subscribers, just pure publishing to the server, and CPU load was only about 1/3. Once we enabled consumption, CPU load rose to about 2/3. And if all 350,000 requests carry 40 KB payloads, the service goes down very quickly.
So we are wondering: is there some configuration on the publish call chain that caps CPU usage?
from bifromq.
We tried load-testing this scenario and could not reproduce the issue after two hours. From the description, the large number of CLOSE_WAIT connections looks like the clients disconnected abnormally without completing the full TCP teardown handshake, leaving many lingering CLOSE_WAIT connections on the server.
What is your load-test scenario? Could you paste your BifroMQ configuration?
from bifromq.
https://bifromq.io/docs/test_report/test_report/
from bifromq.
The scenarios behind that link all use small payloads of just a few hundred bytes, and with small payloads we have no problem either. With 40 KB payloads the CPU cannot be driven up and the cluster crashes quickly. That is exactly my question: when load-testing with 40 KB payloads, BifroMQ uses only about one third of a 32-core CPU and the cluster soon collapses, and no amount of parameter tuning raises BifroMQ's CPU utilization. With the memory engine, we can drive the cluster CPU up.
from bifromq.
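One thing worth ruling out against the configuration posted earlier in this thread: if `maxConnBandwidth` (524288 in that config) is a per-connection bytes-per-second throttle (an assumption, not confirmed here), then 40 KB messages would be limited to roughly 13 messages per second per connection, which could keep CPU low regardless of offered load. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check using values from the posted config.
# ASSUMPTION: maxConnBandwidth is a per-connection bytes/second limit.
max_msg_byte_size = 262144   # mqttServerConfig.maxMsgByteSize
max_conn_bandwidth = 524288  # mqttServerConfig.maxConnBandwidth
payload = 40 * 1024          # 40 KB payload used in the load test

assert payload <= max_msg_byte_size  # the 40 KB payload fits under the size cap
print(f"{max_conn_bandwidth / payload:.1f} msgs/s per connection")  # 12.8
```

If that reading of the setting is right, small payloads would sit far below the throttle while 40 KB payloads hit it almost immediately, which would match the observed asymmetry.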
You can enable debug logging for com.baidu.bifromq.mqtt.handler and check the specific reason connections are dropped during the load test.
from bifromq.
You can enable debug logging for com.baidu.bifromq.mqtt.handler and check the specific reason connections are dropped during the load test.
from bifromq.
Not the content of event.log. In logback, configure a debug logger for com.baidu.bifromq.mqtt.handler and check whether any exceptions are thrown in its output.
from bifromq.
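For reference, the logger change suggested in this thread might look like the fragment below in logback.xml. This is a sketch: the "STDOUT" appender name is an assumption and should be replaced with an appender your logback.xml actually defines.

```xml
<!-- Hypothetical fragment: enable debug logging for the MQTT handler package.
     Replace "STDOUT" with an appender defined in your logback.xml. -->
<logger name="com.baidu.bifromq.mqtt.handler" level="DEBUG" additivity="false">
    <appender-ref ref="STDOUT"/>
</logger>
```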