Startup error (core, closed)

csmasterpath2023 commented on July 18, 2024
Startup error

from core.

Comments (28)

csmasterpath2023 commented on July 18, 2024

@bajtos thank you very much, it works now

csmasterpath2023 commented on July 18, 2024

As you said, if the Docker service is running under the docker user, then you can change the owning user instead of the owning group.

sudo chown docker /tmp/state

csmasterpath2023 commented on July 18, 2024

@bajtos But if I use /tmp/state and then reboot the machine, the directory is deleted automatically.

So I changed the path to ~/tmp/state, and that resolved the problem.

bigjdunham commented on July 18, 2024

In my case I had to change the owner and group of the local folder to UID/GID 1000:1000, since that's what the "node" user uses inside the container. After that, the files were kept persistent.
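
A minimal sketch of that ownership change, assuming a Linux host and that the container's "node" user really is UID/GID 1000 as described above. The helper name prepare_state_dir is hypothetical, not part of Station Core or Docker. It creates the directory and hands it to 1000:1000 only when run as root; otherwise it just prints a hint.

```shell
# prepare_state_dir DIR [UID:GID]
# Create DIR and, when running as root, give it to the given numeric owner.
# The default 1000:1000 is an assumption taken from the comment above.
prepare_state_dir() {
  state_dir="$1"
  state_ids="${2:-1000:1000}"

  # Create the directory (and any missing parents).
  mkdir -p "$state_dir" || return 1

  if [ "$(id -u)" -eq 0 ]; then
    # Only root can give a directory to an arbitrary UID/GID.
    chown "$state_ids" "$state_dir"
  else
    echo "re-run as root (e.g. with sudo) to chown $state_dir to $state_ids" >&2
  fi
}
```

Choosing a directory outside /tmp also avoids losing the state on reboot, as noted in the earlier comment.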

csmasterpath2023 commented on July 18, 2024

I used the following command to create a container, but it failed.

docker run --name station --detach --env STATE_ROOT=/state --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD ghcr.io/filecoin-station/core:20.4.1 -v /tmp/state1:/state

csmasterpath2023 commented on July 18, 2024

I want to persist my station ID, but the container failed to start with the above command:

cs144@cs144:~$ sudo docker run --name station --detach --env STATE_ROOT=/state --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD ghcr.io/filecoin-station/core:20.4.1 -v /tmp/state1:/state 
[sudo] password for cs144:
db717d6522c5c02665360a842b4050b0dd30f129f684c0a87ba6c77a097177f8 
cs144@cs144:~$ sudo docker logs station
 v20.12.2 

juliangruber commented on July 18, 2024

Are you saying it failed because it doesn't produce any logs besides "v20.12.2"?

csmasterpath2023 commented on July 18, 2024

Yes. Under normal circumstances, it should output Spark's work log, but the container created with the above command terminated right after creation:

cs144@cs144:~$ sudo docker exec -it station /bin/bash
 Error response from daemon: Container db717d6522c5c02665360a842b4050b0dd30f129f684c0a87ba6c77a097177f8 is not running

juliangruber commented on July 18, 2024

I can reproduce that it exits after the version message when you pass a state root this way

juliangruber commented on July 18, 2024

Actually, I don't know why it logs v20.12.2; that's not the Core version.

juliangruber commented on July 18, 2024

@bajtos you initially looked into mounting state_root, do you have an idea here?

csmasterpath2023 commented on July 18, 2024

> I can reproduce that it exits after the version message when you pass a state root this way

I'm also curious where v20.12.2 comes from

zipiju commented on July 18, 2024

This v20.12.2 looks to be the Node.js version.

I did some poking around as well, since I would like to persist the state between upgrades (deleting and then reinstalling the container). The issue, at least on this platform, appears to be that the container can't perform any write operations on the mount: the modules can't create directories or files on the host, and any attempt results in a permissions error.
I don't fully understand how this works between the container and the host, or whether this issue is specific to this platform.
I even tried chmod 777 /usr/src/app/.local/state in the container, but even with that, Core is unable to create subdirectories there, and there is no option to change folder permissions on the host:

Error: EACCES: permission denied, mkdir '/usr/src/app/.local/state/secrets'

My solution for now is to install the container, shell into it, cat station_id, create station_id on the host, mount the file, and restart the container. Since the mounted file will be read-only (R/O), the container runs without any issues.
This will persist across updates unless the expected directory/file structure changes.
One question though - is it enough to persist station_id, or should we also persist, for example, Spark's state, whose contents look to be identical on all the nodes?

bajtos commented on July 18, 2024

> I'm also curious where v20.12.2 comes from

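AFAICT, this is caused by -v /tmp/state1:/state being added after the Docker image name. Everything after the image name is passed to the container as its command and arguments instead of being parsed as a docker run option. When I add this argument before the image name, Station Core starts.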

On fixing the permissions: Maybe we must explicitly tell Docker to mount the volume as read-write?

The following command works for me on macOS and writes state files to /tmp/state on the host computer:

docker run --name station --detach \
  --env STATE_ROOT=/state \
  --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD \
  -v /tmp/state:/state:rw \
  ghcr.io/filecoin-station/core

Can you please check whether the command above works on your machine too?
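
For context, docker run expects its arguments in the order `docker run [OPTIONS] IMAGE [COMMAND] [ARG...]`. It seems plausible (an assumption, not confirmed in this thread) that the trailing -v then reached Node.js as its version flag, which would explain the lone v20.12.2 line. The sketch below uses a hypothetical helper, build_run_cmd, that only prints a command line with every option placed before the image; it does not run Docker.

```shell
# Image name as used in this thread.
IMAGE="ghcr.io/filecoin-station/core"

# build_run_cmd HOST_STATE_DIR WALLET_ADDRESS   (hypothetical helper)
# Print a `docker run` command line with all options BEFORE the image name.
build_run_cmd() {
  # printf repeats the '%s ' format once per argument, so the options
  # are printed space-separated on a single line.
  printf '%s ' docker run --name station --detach \
    --env STATE_ROOT=/state \
    --env "FIL_WALLET_ADDRESS=$2" \
    -v "$1:/state:rw"
  # The image goes last; nothing that follows it would be treated as an option.
  printf '%s\n' "$IMAGE"
}
```

Printing the command first makes it easy to check the ordering by eye before actually running it.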

> One question though - is it enough to persist station_id, or should we also persist for example Spark's state, which contents looks to be identical on all the nodes?

Persisting station_id is crucial.

Persisting other files is not strictly required now. However, we may add more state files in the future that need to be persisted, like the recently introduced station_id file.

I highly recommend persisting the entire state directory.

It's also important to NOT share the state between Station instances; we expect each Station to have its own exclusive state storage.

csmasterpath2023 commented on July 18, 2024

I tried both as a user in the sudo group and as the root user.
I created a shell script named crest.sh:

docker run --name station --detach \
  --env STATE_ROOT=/state \
  --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD \
  -v /tmp/state:/state:rw \
  ghcr.io/filecoin-station/core:20.4.1

As a user in the sudo group, using sudo:

cs144@cs144:~$ sudo docker rm station
station
cs144@cs144:~$ sudo ./crest.sh
7d41269e8983907c4714af18addfaa7a270fc762a4ce5e20aef906c22d6c229c
cs144@cs144:~$ sudo docker logs station
Usage: station.js [options]
Options:
  -j, --json          Output JSON                                      [boolean]
      --experimental  Also run experimental modules                    [boolean]
  -v, --version       Show version number                              [boolean]
  -h, --help          Show help                                        [boolean]

[Error: EACCES: permission denied, mkdir '/state/secrets'] {
  errno: -13,
  code: 'EACCES',
  syscall: 'mkdir',
  path: '/state/secrets'
}

csmasterpath2023 commented on July 18, 2024

Under the root user

root@cs144:/home/cs144# docker rm station
station
root@cs144:/home/cs144# ./crest.sh
4badd17bc1758849e21676701a2c2fe664a053cec2cb5ad0890481c3a54e361f
root@cs144:/home/cs144# docker logs station
Usage: station.js [options]

Options:
  -j, --json          Output JSON                                      [boolean]
      --experimental  Also run experimental modules                    [boolean]
  -v, --version       Show version number                              [boolean]
  -h, --help          Show help                                        [boolean]

[Error: EACCES: permission denied, mkdir '/state/secrets'] {
  errno: -13,
  code: 'EACCES',
  syscall: 'mkdir',
  path: '/state/secrets'
}

csmasterpath2023 commented on July 18, 2024

Both attempts above failed to create a new station.

bajtos commented on July 18, 2024

Thank you, @csmasterpath2023, for testing. Maybe this issue is specific to Linux?

The following StackOverflow answer explains the problem with users & groups and permissions inside the Docker container:

https://stackoverflow.com/a/29251160/69868

It looks too complicated to me; I would hope the situation has improved since 2015, when the answer was posted. But maybe it's a place to start?

csmasterpath2023 commented on July 18, 2024

Yes, I use Ubuntu 23.10 (GNU/Linux 6.5.0-28-generic x86_64).

csmasterpath2023 commented on July 18, 2024

@bajtos You are welcome. I haven't looked at the link you provided yet, but I think the barrier is too high for the vast majority of users, and we may lose a lot of them.

bajtos commented on July 18, 2024

Can you please run the following command to list directories & permissions and post the output?

❯ docker run --name station  \
  --env STATE_ROOT=/state \
  --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD \
  -v /tmp/state:/state:rw \
  ghcr.io/filecoin-station/core /bin/sh -c "ls -l; ls -l /state"

This is the output I get on my macOS:

total 672
-rw-r--r--   1 root root    213 May 13 15:20 Dockerfile
-rw-r--r--   1 root root  12581 May 13 15:20 LICENSE.md
-rw-r--r--   1 root root   6085 May 13 15:20 README.md
drwxr-xr-x   2 root root   4096 May 13 15:20 bin
drwxr-xr-x   2 root root   4096 May 13 15:20 commands
drwxr-xr-x   2 root root   4096 May 13 15:20 lib
drwxr-xr-x   3 node node   4096 May 13 15:21 modules
drwxr-xr-x 178 node node   4096 May 13 15:21 node_modules
-rw-r--r--   1 root root 620658 May 13 15:20 package-lock.json
-rw-r--r--   1 root root   1843 May 13 15:20 package.json
drwxr-xr-x   2 root root   4096 May 13 15:20 scripts
drwxr-xr-x   2 root root   4096 May 13 15:20 test
-rw-r--r--   1 root root    562 May 13 15:20 tsconfig.json
total 0
drwxr-xr-x 4 node node 128 May 14 13:15 modules
drwxr-xr-x 3 node node  96 May 14 13:15 secrets

csmasterpath2023 commented on July 18, 2024

cs144@cs144:~$ sudo docker run --name station --env STATE_ROOT=/state --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD -v /tmp/state:/state:rw ghcr.io/filecoin-station/core:20.4.1 /bin/sh -c "ls -l; ls -l /state"
total 664
-rw-r--r--   1 root root    213 May  3 11:33 Dockerfile
-rw-r--r--   1 root root  12581 May  3 11:33 LICENSE.md
-rw-r--r--   1 root root   6085 May  3 11:33 README.md
drwxr-xr-x   2 root root   4096 May  3 11:33 bin
drwxr-xr-x   2 root root   4096 May  3 11:33 commands
drwxr-xr-x   2 root root   4096 May  3 11:33 lib
drwxr-xr-x   3 node node   4096 May  3 11:33 modules
drwxr-xr-x 173 node node   4096 May  3 11:33 node_modules
-rw-r--r--   1 root root 612898 May  3 11:33 package-lock.json
-rw-r--r--   1 root root   1843 May  3 11:33 package.json
drwxr-xr-x   2 root root   4096 May  3 11:33 scripts
drwxr-xr-x   2 root root   4096 May  3 11:33 test
-rw-r--r--   1 root root    562 May  3 11:33 tsconfig.json
total 0

bajtos commented on July 18, 2024

Ah, of course, Station Core was not able to create any state files/directories, therefore the second ls printed just "total 0".

Could you please run the following two commands?

Check permissions inside the container:

docker run --name station  \
  --env STATE_ROOT=/state \
  --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD \
  -v /tmp/state:/state:rw \
  ghcr.io/filecoin-station/core /bin/sh -c "ls -ld /state"

Check the permissions on your host computer:

ls -ld /tmp/state

bajtos commented on July 18, 2024

This is really weird. Here is what I see on my computer:

In the container:

drwxr-xr-x 4 root root 128 May 14 14:10 /state

On the host:

drwxr-xr-x  4 bajtos  wheel  128 14 May 16:10 /tmp/state

In the container, the root-owned state directory contains files created by the node user:

drwxr-xr-x 4 root root 128 May 14 14:10 /state
total 0
drwxr-xr-x 4 node node 128 May 14 14:10 modules
drwxr-xr-x 3 node node  96 May 14 14:10 secrets

csmasterpath2023 commented on July 18, 2024

In the container

cs144@cs144:~$ sudo docker run --name station    --env STATE_ROOT=/state   --env FIL_WALLET_ADDRESS=0x000000000000000000000000000000000000dEaD   -v /tmp/state:/state:rw   ghcr.io/filecoin-station/core:20.4.1 /bin/sh -c "ls -ld /state"
drwxr-xr-x 2 root root 4096 May 14 14:01 /state

csmasterpath2023 commented on July 18, 2024

On the host

cs144@cs144:~$ ls -ld /tmp/state
drwxr-xr-x 2 root root 4096 May 14 14:01 /tmp/state

bajtos commented on July 18, 2024

> On the host
>
> cs144@cs144:~$ ls -ld /tmp/state
> drwxr-xr-x 2 root root 4096 May 14 14:01 /tmp/state

OK, I think I know what the problem might be here.

  • The directory /tmp/state is owned by root.
  • What is the user under which the Docker daemon runs? Is it also root?

I think you should be able to find that user by running ps aux | grep dockerd or ps aux | grep docker.

According to https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-20-04, the Docker service runs under a user in the docker group, so maybe all you need is to change the group from root to docker and allow group members to write to the directory.

chgrp docker /tmp/state
chmod g+w /tmp/state

Alternatively, if the Docker service is running under the docker user, then you can change the owning user instead of the owning group.

chown docker /tmp/state
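
To see which of the two changes applies, it may help to check the numeric owner of the host directory before and after. A minimal sketch, assuming GNU coreutils stat (as on Ubuntu; the -c flag is not available in the macOS/BSD stat). The helper names owner_uid and owned_by are hypothetical.

```shell
# owner_uid DIR: print the numeric UID that owns DIR (GNU stat syntax).
owner_uid() {
  stat -c '%u' "$1"
}

# owned_by DIR UID: succeed only when DIR is owned by the given numeric UID.
owned_by() {
  [ "$(owner_uid "$1")" = "$2" ]
}
```

On the host shown above, where /tmp/state is owned by root (UID 0), owned_by /tmp/state 0 would succeed.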

bajtos commented on July 18, 2024

@csmasterpath2023 could you please describe what commands you executed? Other operators running on Linux may find that useful.
