Comments (16)

mvgorbunov commented on May 15, 2024

Hi!
It's not a good idea to use IP addresses in configuration files, since they can change. It is better to use an FQDN that resolves on every node.

Caught exception: ydb/core/driver_lib/cli_utils/cli_cmds_server.cpp:349: cannot detect node ID for ydb1:19001

Did you follow the on-premises deployment documentation when configuring the multi-node cluster? Could you show your configuration file and the server startup command line?

from ydb.

aHsirG commented on May 15, 2024

config.txt
config.txt is my "config.yaml", renamed.
Command:
/opt/ydb/ydbd/bin/ydbd server --tcp --yaml-config /opt/ydb/cfg/config.yaml --node static \
--grpc-port 2135 --ic-port 19001 --mon-port 8765 \
--log-file-name logs/storage_start.log > logs/storage_start_output.log 2>logs/storage_start_err.log &
or
/opt/ydb/ydbd/bin/ydbd server --tcp --yaml-config /opt/ydb/cfg/config.yaml --node 1 \
--grpc-port 2135 --ic-port 19001 --mon-port 8765 \
--log-file-name logs/storage_start.log > logs/storage_start_output.log 2>logs/storage_start_err.log &

I am working from the shell example at https://ydb.tech/ru/docs/getting_started/self_hosted/ydb_local

mvgorbunov commented on May 15, 2024

/opt/ydb/ydbd/bin/ydbd server --tcp --yaml-config /opt/ydb/cfg/config.yaml --node static \
--grpc-port 2135 --ic-port 19001 --mon-port 8765 \
--log-file-name logs/storage_start.log > logs/storage_start_output.log 2>logs/storage_start_err.log &

This should work fine.

I am working from the shell example at https://ydb.tech/ru/docs/getting_started/self_hosted/ydb_local

It is better to use the example config mirror-3dc-3-nodes.yaml.
Please note: a fault-tolerant cluster (mirror-3 erasure) needs at least 3 nodes with 3 disks each (block devices or files on a filesystem). You can also configure a single disk per node with erasure: none, but then the database becomes unavailable if any one node fails.

aHsirG commented on May 15, 2024

Please look at my problem on a single node.
With the FQDN in the config, the node fails to start with this error:
Caught exception: ydb/core/driver_lib/cli_utils/cli_cmds_server.cpp:349: cannot detect node ID for ydb1:19001
With the hostname in the config, the node starts. That is fine for a single node, but a cluster configured with hostnames does not work (see my first post).
command.txt
config_not_work.txt
config_work.txt
error.txt
img
info

mvgorbunov commented on May 15, 2024

The host name in the configuration file must be exactly what the hostname command returns.
So, if you want to use an FQDN (which is necessary for the nodes of a multi-node cluster to communicate with each other), set the server's host name to the FQDN (sudo hostname ydb1.ru-central1.internal, or add it to /etc/hostname), then use that FQDN in the configuration file.
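
The match between the hostname output and the config can be checked before starting the node. The following is only a sketch: host_in_config is a hypothetical helper, not part of ydbd, and it assumes the config lists nodes one per line as "host: <name>" entries, as the example YAML configs appear to do.

```shell
# Hypothetical helper (not part of ydbd): succeed if the given host name
# appears as a "host: <name>" entry in the given config file.
# Assumption: hosts are listed one per line, optionally as "- host: <name>".
host_in_config() {
  local name="$1" config="$2"
  # Strip leading whitespace/list dashes and trailing whitespace, then
  # require an exact, literal whole-line match.
  sed -E 's/^[[:space:]-]*//; s/[[:space:]]+$//' "$config" \
    | grep -Fxq "host: ${name}"
}

# Usage (path taken from the commands in this thread):
#   host_in_config "$(hostname)" /opt/ydb/cfg/config.yaml \
#     || echo "hostname output is not listed in the config"
```

The comparison is exact on purpose: ydb1 and ydb1.ru-central1.internal count as different names, which is the mismatch described above.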

aHsirG commented on May 15, 2024

Good, that fixed the interconnect, but now there is another problem.

In the config (config.txt) I changed only the host names, compared with
https://github.com/ydb-platform/ydb/blob/main/ydb/deploy/yaml_config_examples/mirror-3dc-3-nodes.yaml

Errors in the log:
2022-05-08T16:35:41.407210Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936131, type: Console, boot
2022-05-08T16:35:41.407944Z :BS_PROXY_DISCOVER ERROR: [a6cef7aa52e3541a] Status# ERROR Marker# DSPDM02
2022-05-08T16:35:41.407959Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-08T16:35:41.407970Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-08T16:35:41.407973Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1
2022-05-08T16:35:41.408397Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936131, type: Console, boot
2022-05-08T16:35:41.409057Z :BS_PROXY_DISCOVER ERROR: [02f6be02ea6f8374] Status# ERROR Marker# DSPDM02
2022-05-08T16:35:41.409075Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-08T16:35:41.409083Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-08T16:35:41.409086Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1
2022-05-08T16:35:41.409399Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936131, type: Console, boot
2022-05-08T16:35:41.410328Z :BS_PROXY_DISCOVER ERROR: [6c625560d57343e7] Status# ERROR Marker# DSPDM02
2022-05-08T16:35:41.410343Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-08T16:35:41.410347Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-08T16:35:41.410350Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, Suggested

YDB is using 1 CPU core and 10-14 Mbit/s of network traffic, but the command
/opt/ydb/ydbd/bin/ydbd -s grpc://localhost:2135 admin blobstorage config init --yaml-file cfg/config.yaml > logs/init_storage.log 2>&1
does not complete successfully; it ends with a timeout:
MP-0130 Tablet request timed out Marker# MBT4

mvgorbunov commented on May 15, 2024

This config assumes 3 raw block devices that have already been formatted as described at https://ydb.tech/en/docs/deploy/manual/deploy-ydb-on-premises#prepare-and-format-disks-on-each-server-%7B#-prepare-disks%7D
If you use a local data file instead, replace the drive path in the configs on all nodes.

aHsirG commented on May 15, 2024

I created 9 disks (for 3 machines) and prepared them with these commands:
sudo parted /dev/nvme0n1 mklabel gpt -s
sudo parted -a optimal /dev/nvme0n1 mkpart primary 0% 100%
sudo parted /dev/nvme0n1 name 1 ydb_disk_01
sudo partx --u /dev/nvme0n1

mvgorbunov commented on May 15, 2024

Your configuration is inconsistent: you labeled the disk ydb_disk_01, but config.txt uses /dev/disk/by-partlabel/ydb_disk_ssd_01 (this is a typo in our documentation and we will fix it; thank you for the report!).
The easiest fix for now is to replace ydb_disk_ssd_0[1-3] in your config with the labels you actually assigned (for example ydb_disk_01).
Also, don't forget to format each labeled disk by running sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/lib /opt/ydb/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_01 for every disk on every server.
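
A mismatch like this can be caught before starting the node by checking that every drive path named in the config exists on the machine. This is only a sketch: missing_drive_paths is a hypothetical helper, not a ydbd command, and it assumes drives appear in the config as "path: /absolute/path" lines. Any other key named path would be checked as well.

```shell
# Hypothetical helper (not part of ydbd): print every absolute "path:" value
# from the config that does not exist on this machine.
# Returns non-zero if at least one path is missing.
missing_drive_paths() {
  local config="$1" path status=0
  # Extract the value of each "path:" line (optionally prefixed by "- ").
  for path in $(sed -nE 's/^[[:space:]-]*path:[[:space:]]*(\/[^[:space:]#]+).*$/\1/p' "$config"); do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      status=1
    fi
  done
  return "$status"
}

# Usage (config path taken from the commands in this thread):
#   missing_drive_paths /opt/ydb/cfg/config.yaml
```

With the label ydb_disk_01 on disk and ydb_disk_ssd_01 in the config, this would report the by-partlabel path from the config as missing.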

aHsirG commented on May 15, 2024

OK, sorry, that was a stupid mistake on my part, but fixing it did not resolve the problem. The error is now:

2022-05-13T13:37:20.837011Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936128, type: Cms, boot
2022-05-13T13:37:20.837829Z :BS_PROXY_DISCOVER ERROR: [69afb1ded53966b6] Status# ERROR Marker# DSPDM02
2022-05-13T13:37:20.837839Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-13T13:37:20.837847Z :TABLET_MAIN ERROR: Tablet: 72057594037936128 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-13T13:37:20.837849Z :TABLET_MAIN ERROR: Tablet: 72057594037936128 Type: Cms, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1
Full commands:
sudo parted /dev/vdb mklabel gpt -s
sudo parted -a optimal /dev/vdb mkpart primary 0% 100%
sudo parted /dev/vdb name 1 ydb_disk_01
sudo partx --u /dev/vdb
sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_01

sudo parted /dev/vdc mklabel gpt -s
sudo parted -a optimal /dev/vdc mkpart primary 0% 100%
sudo parted /dev/vdc name 1 ydb_disk_02
sudo partx --u /dev/vdc
sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_02

sudo parted /dev/vdd mklabel gpt -s
sudo parted -a optimal /dev/vdd mkpart primary 0% 100%
sudo parted /dev/vdd name 1 ydb_disk_03
sudo partx --u /dev/vdd
sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/ydb_disk_03
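
The three groups of commands above differ only in the device and the label, so they can be generated by a loop. The sketch below only prints the commands for review instead of running them, because they destroy the data on the device. The function names are illustrative, and the ydbd paths are the ones used above.

```shell
# Hypothetical helper: print (do not run) the preparation commands for one disk.
print_disk_prep() {
  local dev="$1" label="$2"
  cat <<EOF
sudo parted ${dev} mklabel gpt -s
sudo parted -a optimal ${dev} mkpart primary 0% 100%
sudo parted ${dev} name 1 ${label}
sudo partx --u ${dev}
sudo LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:/opt/ydb/ydbd/lib /opt/ydb/ydbd/bin/ydbd admin bs disk obliterate /dev/disk/by-partlabel/${label}
EOF
}

# Number the labels in argument order: first device -> ydb_disk_01, and so on.
print_all_disk_prep() {
  local n=1 dev
  for dev in "$@"; do
    print_disk_prep "$dev" "$(printf 'ydb_disk_%02d' "$n")"
    n=$((n + 1))
  done
}

# Usage: print_all_disk_prep /dev/vdb /dev/vdc /dev/vdd
```

Printing first makes it easy to confirm that each label matches the path used in the config before anything is written to disk.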

aHsirG commented on May 15, 2024

In addition, I tested a simple 1-node, 1-disk installation.
Errors:
2022-05-14T11:05:59.794320Z :BS_PROXY_DISCOVER ERROR: [69485e2bc1c4387b] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794326Z :BS_PROXY_DISCOVER ERROR: [16ad85e56deff3ae] StepDiscovery Die. Disintegrated. DomainRequestsSent# 1 DomainReplies# 1 DomainSuccess# 0 ParityParts# 0 Handoff# 0 Marker# BSD08
2022-05-14T11:05:59.794327Z :BS_PROXY_DISCOVER ERROR: [16ad85e56deff3ae] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794330Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794333Z :BOOTSTRAPPER NOTICE: tablet: 72057594046382081, type: Mediator, boot
2022-05-14T11:05:59.794340Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794342Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794345Z :TABLET_MAIN ERROR: Tablet: 72075186232723360 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794345Z :TABLET_MAIN ERROR: Handle::TEvDiscoverResult, result status ERROR
2022-05-14T11:05:59.794346Z :TABLET_MAIN ERROR: Tablet: 72075186232723360 Type: SchemeShard, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794350Z :TABLET_MAIN ERROR: Tablet: 72057594037936130 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794351Z :TABLET_MAIN ERROR: Tablet: 72057594037936130 Type: TenantSlotBroker, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794363Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794364Z :TABLET_MAIN ERROR: Tablet: 72057594037936131 Type: Console, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794372Z :BS_PROXY_DISCOVER ERROR: [fae4ae78692b6f02] StepDiscovery Die. Disintegrated. DomainRequestsSent# 1 DomainReplies# 1 DomainSuccess# 0 ParityParts# 0 Handoff# 0 Marker# BSD08
2022-05-14T11:05:59.794374Z :BS_PROXY_DISCOVER ERROR: [fae4ae78692b6f02] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794380Z :TABLET_MAIN ERROR: Tablet: 72057594037932033 HandleFindLatestLogEntry, msg->Status: ERROR
2022-05-14T11:05:59.794381Z :TABLET_MAIN ERROR: Tablet: 72057594037932033 Type: BSController, EReason: ReasonBootBSError, SuggestedGeneration: 0, KnownGeneration: 1, Details: Group# 0 disintegrated, type A.
2022-05-14T11:05:59.794381Z :BS_PROXY_DISCOVER ERROR: [946392aebc310326] StepDiscovery Die. Disintegrated. DomainRequestsSent# 1 DomainReplies# 1 DomainSuccess# 0 ParityParts# 0 Handoff# 0 Marker# BSD08
2022-05-14T11:05:59.794383Z :BS_PROXY_DISCOVER ERROR: [946392aebc310326] Result# TEvDiscoverResult {Status# ERROR BlockedGeneration# 0 Id# [0:0:0:0:0:0:0] Size# 0 MinGeneration# 0 ErrorReason# "Group# 0 disintegrated, type A."} Marker# BSD01
2022-05-14T11:05:59.794386Z :BOOTSTRAPPER NOTICE: tablet: 72075186232723360, type: SchemeShard, boot
2022-05-14T11:05:59.794393Z :BOOTSTRAPPER NOTICE: tablet: 72057594037936130, type: TenantSlotBroker, boot

aHsirG commented on May 15, 2024

This alone does not work:
sudo groupadd ydb
sudo useradd ydb -g ydb
sudo usermod -aG disk ydb
After additionally running
sudo chmod 777 /dev/disk/
sudo chmod 777 /dev/disk/by-partlabel
sudo chmod 777 /dev/disk/by-partlabel/ydb_disk_01
sudo chmod 777 /dev/disk/by-partlabel/ydb_disk_02
sudo chmod 777 /dev/disk/by-partlabel/ydb_disk_03
the cluster started correctly.
I will now do some simple testing and then close the issue.

mvgorbunov commented on May 15, 2024

It's not a good idea to give 777 permissions on /dev/disk. We tested on Debian systems, and it is enough to add the ydb user (under which the server runs) to the disk group.

mvgorbunov commented on May 15, 2024

@aHsirG Is everything working fine now? Can we close this issue?

aHsirG commented on May 15, 2024

All fine :)
My permission problem occurred on Ubuntu. I have not tested on Debian, so I do not know whether the problem is specific to Ubuntu or is my own mistake.
I will check on Debian in a few days.

aHsirG commented on May 15, 2024

After retesting:
sudo groupadd ydb
sudo useradd ydb -g ydb
sudo usermod -aG disk ydb
This works on both Debian and Ubuntu, but the VM needs to be restarted afterwards.
I did not know this because I am not a Unix administrator.
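
The need for a restart is consistent with how Linux applies group membership: a process keeps the group list it was started with, so a change made by usermod only reaches sessions and services started afterwards. A small sketch for comparing the two views follows. The helper group_in_list is hypothetical; id -nG is the standard command for listing group names.

```shell
# Hypothetical helper: succeed if group "$2" occurs in the
# space-separated group list "$1" (whole-word match).
group_in_list() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Groups configured for user ydb in the account database:
#   group_in_list "$(id -nG ydb)" disk && echo "ydb is configured in group disk"
# Groups of the current session, fixed when the session started:
#   group_in_list "$(id -nG)" disk || echo "this session does not have group disk yet"
```

If the first check succeeds while the second, run in a session of the ydb user, does not, then starting a new login session or restarting the process (rebooting the VM is one way to do that) should make the new group take effect.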

Thanks a lot for your help!
