Comments (16)
Hi!
It's not a good idea to use IP addresses in configuration files, as they can change. It is better to use an FQDN that can be resolved on every node.
Caught exception: ydb/core/driver_lib/cli_utils/cli_cmds_server.cpp:349: cannot detect node ID for ydb1:19001
Did you follow the on-premises deployment documentation while configuring the multinode cluster? Could you show your configuration file and the server startup command line?
from ydb.
config.txt
config.txt is my config.yaml, renamed for upload.
Command:
/opt/ydb/ydbd/bin/ydbd server --tcp --yaml-config /opt/ydb/cfg/config.yaml --node static \
--grpc-port 2135 --ic-port 19001 --mon-port 8765 \
--log-file-name logs/storage_start.log > logs/storage_start_output.log 2>logs/storage_start_err.log &
or
/opt/ydb/ydbd/bin/ydbd server --tcp --yaml-config /opt/ydb/cfg/config.yaml --node 1 \
--grpc-port 2135 --ic-port 19001 --mon-port 8765 \
--log-file-name logs/storage_start.log > logs/storage_start_output.log 2>logs/storage_start_err.log &
I am following the example at https://ydb.tech/ru/docs/getting_started/self_hosted/ydb_local, run from sh.
from ydb.
/opt/ydb/ydbd/bin/ydbd server --tcp --yaml-config /opt/ydb/cfg/config.yaml --node static \
--grpc-port 2135 --ic-port 19001 --mon-port 8765 \
--log-file-name logs/storage_start.log > logs/storage_start_output.log 2>logs/storage_start_err.log &
This should work fine.
I am following the example at https://ydb.tech/ru/docs/getting_started/self_hosted/ydb_local, run from sh.
You'd better use the example config mirror-3dc-3-nodes.yaml.
Please note: for a fault-tolerant cluster (mirror-3 erasure) you must have at least 3 nodes with 3 disks each (a block device or a file on a filesystem). You can also specify only one disk per node and use erasure: none, but then if one of the nodes fails, your database will be unavailable.
from ydb.
Please check my problem on a single node.
When I use the FQDN, the node fails to start with the error
Caught exception: ydb/core/driver_lib/cli_utils/cli_cmds_server.cpp:349: cannot detect node ID for ydb1:19001
When I use the hostname, the node starts. That is fine for a single node, but a cluster configured with hostnames does not work (see my first post).
command.txt
config_not_work.txt
config_work.txt
error.txt
from ydb.
The host name in the configuration file must be the same as the one returned by the hostname command.
So, if you want to use an FQDN (which is necessary for the nodes of a multinode cluster to communicate with each other), set your server's host name to the FQDN (sudo hostname ydb1.ru-central1.internal, or add it to /etc/hostname), then use the FQDN in the configuration file.
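The check described above can be sketched as a small shell helper. This is only an illustration, not part of YDB: the function names host_in_config and check_local_host are hypothetical, and it assumes that hosts appear in the YAML file as lines of the form "- host: <name>", as in the example configs.

```shell
#!/bin/sh
# Sketch only: verify that a host name is listed as a "host:" entry in a YAML config.
# Assumption: each host is written on its own line as "- host: <name>" or "host: <name>".

# host_in_config NAME FILE -> exit status 0 if FILE contains the line "host: NAME"
host_in_config() {
    name=$1
    file=$2
    # Strip leading spaces/dashes and trailing spaces, then require an exact line match.
    sed -e 's/^[[:space:]-]*//' -e 's/[[:space:]]*$//' "$file" | grep -Fqx "host: $name"
}

# check_local_host FILE -> compare the output of `hostname` with the entries in FILE
check_local_host() {
    file=$1
    name=$(hostname)
    if host_in_config "$name" "$file"; then
        echo "OK: $name is listed in $file"
    else
        echo "MISMATCH: $name is not listed in $file" >&2
        return 1
    fi
}
```

For example, check_local_host /opt/ydb/cfg/config.yaml could be run on each node before starting ydbd. The match is exact, so a short name such as ydb1 does not match an entry for ydb1.ru-central1.internal.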
from ydb.
Good, that fixed the interconnect, but now there is another problem.
In the config I changed only the host names (config.txt); it is based on
https://github.com/ydb-platform/ydb/blob/main/ydb/deploy/yaml_config_examples/mirror-3dc-3-nodes.yaml
Errors in the log:
2022-05-08T16:35:41.407210Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936131, type: Console, boot
2022-05-08T16:35:41.407944Z :BS_PROXY_DISCOVER ERROR: [a6cef7aa52e3541a] Status# ERROR Marker# DSPDM02
2022-05-08T16:35:41.407959Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-08T16:35:41.407970Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-08T16:35:41.407973Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1
2022-05-08T16:35:41.408397Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936131, type: Console, boot
2022-05-08T16:35:41.409057Z :BS_PROXY_DISCOVER ERROR: [02f6be02ea6f8374] Status# ERROR Marker# DSPDM02
2022-05-08T16:35:41.409075Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-08T16:35:41.409083Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-08T16:35:41.409086Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1
2022-05-08T16:35:41.409399Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936131, type: Console, boot
2022-05-08T16:35:41.410328Z :BS_PROXY_DISCOVER ERROR: [6c625560d57343e7] Status# ERROR Marker# DSPDM02
2022-05-08T16:35:41.410343Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-08T16:35:41.410347Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-08T16:35:41.410350Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, Suggested
ydb uses 1 CPU core and 10-14 Mbit/s of network traffic, but the command
/opt/ydb/ydbd/bin/ydbd -s grpc://localhost:2135 admin blobstorage config init --yaml-file cfg/config.yaml > logs/init_storage.log 2>&1
does not finish correctly; it ends with a timeout:
MP-0130 Tablet request timed out Marker# MBT4
from ydb.
This config assumes 3 raw block devices that have already been formatted as described at https://ydb.tech/en/docs/deploy/manual/deploy-ydb-on-premises#prepare-and-format-disks-on-each-server-%7B#-prepare-disks%7D
If you use a local data file instead, replace the drive: path values in the configs on all nodes.
from ydb.
I created 9 disks (for 3 machines) and used these commands to prepare them:
sudo parted /dev/nvme0n1 mklabel gpt -s
sudo parted -a optimal /dev/nvme0n1 mkpart primary 0% 100%
sudo parted /dev/nvme0n1 name 1 ydb_disk_01
sudo partx --u /dev/nvme0n1
from ydb.
You have a misconfiguration: you labeled the disk as ydb_disk_01 but used /dev/disk/by-partlabel/ydb_disk_ssd_01 in config.txt (this is a mistake in our documentation and we will fix it, thank you for the report!).
The easiest way to fix it now is to replace ydb_disk_ssd_0[1-3] in your config with the labels you actually assigned (ydb_disk_01 and so on).
Also, don't forget to format each labeled disk by running sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/lib /opt/ydb/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_01 for every disk on every server.
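A label mismatch like this one can be caught before starting the server. The following is a minimal sketch, assuming the config refers to disks by /dev/disk/by-partlabel/... paths; the function names and the optional ROOT argument are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: report by-partlabel paths that a config mentions but that do not exist.

# list_partlabel_paths FILE -> print each distinct /dev/disk/by-partlabel/... path in FILE
list_partlabel_paths() {
    grep -o '/dev/disk/by-partlabel/[A-Za-z0-9_.-]*' "$1" | sort -u
}

# missing_partlabels FILE [ROOT] -> print the paths from FILE that are not present.
# ROOT is a hypothetical prefix for trying this out in a scratch directory;
# leave it empty on a real server.
missing_partlabels() {
    file=$1
    root=${2:-}
    list_partlabel_paths "$file" | while read -r p; do
        [ -e "$root$p" ] || echo "$p"
    done
}
```

Running missing_partlabels /opt/ydb/cfg/config.yaml on each node should print nothing when every label in the config matches a partition label on that node.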
from ydb.
OK, sorry, stupid mistake, but that did not fix the error. Now the error is:
2022-05-13T13:37:20.837011Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936128, type: Cms, boot
2022-05-13T13:37:20.837829Z :BS_PROXY_DISCOVER ERROR: [69afb1ded53966b6] Status# ERROR Marker# DSPDM02
2022-05-13T13:37:20.837839Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-13T13:37:20.837847Z :TABLET_MAIN ERROR: Tablet: 72057594037936128 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-13T13:37:20.837849Z :TABLET_MAIN ERROR: Tablet: 72057594037936128 Type: Cms, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1
Full commands:
sudo parted /dev/vdb mklabel gpt -s
sudo parted -a optimal /dev/vdb mkpart primary 0% 100%
sudo parted /dev/vdb name 1 ydb_disk_01
sudo partx --u /dev/vdb
sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_01
sudo parted /dev/vdc mklabel gpt -s
sudo parted -a optimal /dev/vdc mkpart primary 0% 100%
sudo parted /dev/vdc name 1 ydb_disk_02
sudo partx --u /dev/vdc
sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_02
sudo parted /dev/vdd mklabel gpt -s
sudo parted -a optimal /dev/vdd mkpart primary 0% 100%
sudo parted /dev/vdd name 1 ydb_disk_03
sudo partx --u /dev/vdd
sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_03
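The same five steps are repeated for each device, so they can be generated in a loop. The sketch below only prints the commands and does not run them, because they wipe the disks. The function names and the YDBD_HOME variable are hypothetical; the default path is the one used in this thread.

```shell
#!/bin/sh
# Sketch only: print (not execute) the disk preparation commands for a list of devices.
# Assumption: YDBD_HOME is the ydbd install directory; the default mirrors the thread.
YDBD_HOME=${YDBD_HOME:-/opt/ydb/ydbd}

# print_disk_prep DEVICE LABEL -> print the five commands for one disk
print_disk_prep() {
    dev=$1
    label=$2
    echo "sudo parted $dev mklabel gpt -s"
    echo "sudo parted -a optimal $dev mkpart primary 0% 100%"
    echo "sudo parted $dev name 1 $label"
    echo "sudo partx --u $dev"
    echo "sudo LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:$YDBD_HOME/lib $YDBD_HOME/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/$label"
}

# print_all_disk_prep DEVICE... -> label the devices ydb_disk_01, ydb_disk_02, ... in order
print_all_disk_prep() {
    i=1
    for dev in "$@"; do
        print_disk_prep "$dev" "$(printf 'ydb_disk_%02d' "$i")"
        i=$((i + 1))
    done
}
```

Calling print_all_disk_prep /dev/vdb /dev/vdc /dev/vdd prints the same sequence as the commands listed above, which can be reviewed before running anything.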
from ydb.
Additionally, I tested a simple 1-node, 1-disk install.
Error:
2022-05-14T11:05:59.794320Z :BS_PROXY_DISCOVER ERROR: [69485e2bc1c4387b] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794326Z :BS_PROXY_DISCOVER ERROR: [16ad85e56deff3ae] StepDiscovery Die. Disintegrated. DomainRequestsSent# 1 DomainReplies# 1 DomainSuccess# 0 ParityParts# 0 Handoff# 0 Marker# BSD08
2022-05-14T11:05:59.794327Z :BS_PROXY_DISCOVER ERROR: [16ad85e56deff3ae] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794330Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794333Z :BOOTSTRAPPER NOTICE: tablet: 72057594046382081, type: Mediator, boot
2022-05-14T11:05:59.794340Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794342Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794345Z :TABLET_MAIN ERROR: Tablet: 72075186232723360 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794345Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794346Z :TABLET_MAIN ERROR: Tablet: 72075186232723360 Type: SchemeShard, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794350Z :TABLET_MAIN ERROR: Tablet: 72057594037936130 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794351Z :TABLET_MAIN ERROR: Tablet: 72057594037936130 Type: TenantSlotBroker, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794363Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794364Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794372Z :BS_PROXY_DISCOVER ERROR: [fae4ae78692b6f02] StepDiscovery Die. Disintegrated. DomainRequestsSent# 1 DomainReplies# 1 DomainSuccess# 0 ParityParts# 0 Handoff# 0 Marker# BSD08
2022-05-14T11:05:59.794374Z :BS_PROXY_DISCOVER ERROR: [fae4ae78692b6f02] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794380Z :TABLET_MAIN ERROR: Tablet: 72057594037932033 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794381Z :TABLET_MAIN ERROR: Tablet: 72057594037932033 Type: BSController, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794381Z :BS_PROXY_DISCOVER ERROR: [946392aebc310326] StepDiscovery Die. Disintegrated. DomainRequestsSent# 1 DomainReplies# 1 DomainSuccess# 0 ParityParts# 0 Handoff# 0 Marker# BSD08
2022-05-14T11:05:59.794383Z :BS_PROXY_DISCOVER ERROR: [946392aebc310326] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794386Z :BOOTSTRAPPER NOTICE: tablet: 72075186232723360, type: SchemeShard, boot
2022-05-14T11:05:59.794393Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936130, type: TenantSlotBroker, boot
from ydb.
This did not work correctly:
sudo groupadd ydb
sudo useradd ydb -g ydb
sudo usermod -aG disk ydb
After running
sudo chmod 777 /dev/disk/
sudo chmod 777 /dev/disk/by-partlabel
sudo chmod 777 /dev/disk/by-partlabel/ydb_disk_01
sudo chmod 777 /dev/disk/by-partlabel/ydb_disk_02
sudo chmod 777 /dev/disk/by-partlabel/ydb_disk_03
the cluster started correctly.
Now I will do some simple testing and close the issue.
from ydb.
It's not a good idea to give 777 permissions on /dev/disk. We tested on Debian systems, and it is enough to add the user ydb (under which the server is running) to the disk group.
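Group membership can be checked with the standard id utility instead of loosening permissions. This is a minimal sketch; the helper names in_group_list and user_in_group are hypothetical.

```shell
#!/bin/sh
# Sketch only: check whether a user belongs to a group, using `id -nG`.

# in_group_list GROUP "LIST" -> status 0 if GROUP is one of the space-separated names in LIST
in_group_list() {
    group=$1
    list=$2
    for g in $list; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# user_in_group USER GROUP -> status 0 if USER is a member of GROUP
user_in_group() {
    in_group_list "$2" "$(id -nG "$1")"
}
```

For example, user_in_group ydb disk && echo "ydb is in disk". Note that id reads the group database, while a process that is already running keeps the group list it started with, so the server has to be started in a fresh session after usermod.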
from ydb.
@aHsirG Does everything work fine? Can we close this issue?
from ydb.
All fine)
My problem was with permissions on Ubuntu. I have not tested on Debian yet, so I do not know whether the permission problem is specific to Ubuntu or is my mistake.
I will check on Debian in a few days.
from ydb.
After retesting:
sudo groupadd ydb
sudo useradd ydb -g ydb
sudo usermod -aG disk ydb
This works well on both Debian and Ubuntu, but the VM needs to be restarted.
I did not know this because I am not a Unix administrator.
Thanks a lot for your help!
from ydb.