
Comments (4)

halfgaar commented on May 23, 2024

Weird, neither my system nor the GitHub builders have that issue. I can even reduce it to 512.

Can you give some more info about your system, branch, compiler, etc.?

from flashmq.

quinox commented on May 23, 2024

Happy to help. I can provide shell access if that makes it easier for you (I don't mind doing the legwork though).

The details:

  • git revision: upstream master 7e5d0d0eec1312789895ae532a7c4202c5cabe90 (tagged v1.11.0) (clean state)
  • OS: Gentoo with kernel 6.6.21
  • I tried it with 2 compilers:
    • ./run-make-from-ci.sh says: The CXX compiler identification is GNU 12.3.1
    • ./run-make-from-ci.sh --compiler $CXX says: The CXX compiler identification is Clang 17.0.6
  • I use Fish as shell, but it also happens when I use Bash
  • I use a normal user account, not root. This might be important because I do see this error message in the testcase output:

[2024-05-03 16:43:07.442] [ERROR] Setting ulimit nofile failed: 'Operation not permitted'. This means the default is used.

  • Limits:
quinox@gofu ~/p/F/F/buildtests (master)> ulimit --all -S
Maximum size of core files created                              (kB, -c) 0
Maximum size of a process’s data segment                        (kB, -d) unlimited
Control of maximum nice priority                                    (-e) 0
Maximum size of files created by the shell                      (kB, -f) unlimited
Maximum number of pending signals                                   (-i) 128081
Maximum size that may be locked into memory                     (kB, -l) 8192
Maximum resident set size                                       (kB, -m) unlimited
Maximum number of open file descriptors                             (-n) 1024
Maximum bytes in POSIX message queues                           (kB, -q) 800
Maximum realtime scheduling priority                                (-r) 0
Maximum stack size                                              (kB, -s) 8192
Maximum amount of CPU time in seconds                      (seconds, -t) unlimited
Maximum number of processes available to current user               (-u) 128081
Maximum amount of virtual memory available to each process      (kB, -v) unlimited
Maximum contiguous realtime CPU time                                (-y) unlimited

quinox@gofu ~/p/F/F/buildtests (master)> ulimit --all -H
Maximum size of core files created                              (kB, -c) unlimited
Maximum size of a process’s data segment                        (kB, -d) unlimited
Control of maximum nice priority                                    (-e) 0
Maximum size of files created by the shell                      (kB, -f) unlimited
Maximum number of pending signals                                   (-i) 128081
Maximum size that may be locked into memory                     (kB, -l) 8192
Maximum resident set size                                       (kB, -m) unlimited
Maximum number of open file descriptors                             (-n) 4096
Maximum bytes in POSIX message queues                           (kB, -q) 800
Maximum realtime scheduling priority                                (-r) 0
Maximum stack size                                              (kB, -s) unlimited
Maximum amount of CPU time in seconds                      (seconds, -t) unlimited
Maximum number of processes available to current user               (-u) 128081
Maximum amount of virtual memory available to each process      (kB, -v) unlimited
Maximum contiguous realtime CPU time                                (-y) unlimited
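
The "Operation not permitted" message above is what setrlimit(2) returns when an unprivileged process asks for more than its hard limit allows (the hard limit for open files here is 4096). Whether that is what happened inside FlashMQ is an assumption, since the thread doesn't show the value it requests. A minimal sketch of raising only the soft limit, clamped to the hard limit; `raiseNofileSoftLimit` is a hypothetical helper, not a FlashMQ function:

```cpp
#include <sys/resource.h>

#include <algorithm>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not from the FlashMQ source): raise the soft
// RLIMIT_NOFILE towards `wanted`, but never beyond the hard limit, which an
// unprivileged process is not allowed to exceed.
// Returns the soft limit in effect afterwards, or 0 if a syscall failed.
rlim_t raiseNofileSoftLimit(rlim_t wanted)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
    {
        std::fprintf(stderr, "getrlimit failed: %s\n", std::strerror(errno));
        return 0;
    }

    // Clamp the request to the hard limit; leave the hard limit untouched.
    const rlim_t target = std::min(wanted, lim.rlim_max);

    // Only ever raise the soft limit, never lower it.
    if (target > lim.rlim_cur)
    {
        lim.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
        {
            std::fprintf(stderr, "setrlimit failed: %s\n", std::strerror(errno));
            return 0;
        }
    }

    return lim.rlim_cur;
}
```

With the limits listed above, calling `raiseNofileSoftLimit(1000000)` would be expected to settle on the hard limit of 4096 instead of failing.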


Does it not leak files for you, or does it not crash for you?

The grep for epoll is for no special reason except it shows the leakage nicely (note my limit is 1024):

$ strace -fF ./flashmq-tests 2>&1 | grep 'epoll_create.*= [1-9][0-9]*$'
[pid  6338] epoll_create(999)           = 4
[pid  6338] epoll_create(999)           = 5
[pid  6340] epoll_create(999)           = 9
[pid  6340] epoll_create(999)           = 11
[pid  6340] epoll_create(999)           = 13
[pid  6340] epoll_create(999)           = 15
[pid  6340] epoll_create(999)           = 17
[pid  6340] epoll_create(999)           = 19
...
[pid  6338] epoll_create(999)           = 1000
[pid  6338] epoll_create(999)           = 1002
[pid  6338] epoll_create(999)           = 1004
[pid  6338] <... epoll_create resumed>) = 1006
[pid  6338] <... epoll_create resumed>) = 1009
[pid  6338] <... epoll_create resumed>) = 1007
[pid  6338] <... epoll_create resumed>) = 1012
[pid  6338] <... epoll_create resumed>) = 1014
[pid  6338] <... epoll_create resumed>) = 1017
[pid  6338] <... epoll_create resumed>) = 1019
[pid  6338] <... epoll_create resumed>) = 1020
[pid  6338] epoll_create(999)           = 39
[pid  6338] epoll_create(999)           = 40
[pid  6884[2024-05-03 16:35:05.910] [DEBUG] Adding event 'keep-alive check' to the timer with an interval of 5000
d>) = 1023
fish: Process 6334, 'strace' from job 1, 'strace -fF ./flashmq-tests 2>&1…' terminated by signal SIGABRT (Abort)

quinox commented on May 23, 2024

Capturing the state using lsof -nPX in a second window, the biggest capture I made:

  • Open handles: 8364
    • lsof takes time to run, the 172 handles over the 8192 limit are probably handles that already disappeared before lsof was done (and that's why they are of type "unknown" below)
  • By type:
quinox@gofu ~> gawk '{ print $7 }' /tmp/lsof_1714751372.txt  | sort | uniq -c | sort -h -r
   6880 a_inode
    881 0
    190 unknown
    168 FIFO
    144 REG
...
  • Zooming in on the a_inode handles:
quinox@gofu ~> gawk '$7 == "a_inode" { print $11 }' /tmp/lsof_1714751372.txt | sed 's/:.*//' | sort | uniq -c | sort -hr
   3542 eventfd:$num
   3338 eventpoll:$num
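
A leak like this can also be watched from inside the process, without lsof. A minimal sketch, assuming Linux with /proc mounted and a C++17 compiler; `countOpenFds` is a hypothetical helper and not part of the FlashMQ test suite:

```cpp
#include <cstddef>
#include <filesystem>

// Hypothetical helper: count the entries in /proc/self/fd, i.e. the file
// descriptors currently open in this process. The directory iterator holds
// one descriptor itself while it runs, so compare two counts rather than
// trusting the absolute number. Throws std::filesystem::filesystem_error if
// /proc is not available.
std::size_t countOpenFds()
{
    std::size_t count = 0;
    for (const auto &entry : std::filesystem::directory_iterator("/proc/self/fd"))
    {
        (void)entry; // Only the number of entries matters here.
        ++count;
    }
    return count;
}
```

Taking a count before and after a test case and comparing the two would show whether that test left descriptors open.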

halfgaar commented on May 23, 2024

Thanks, that error from setrlimit made it clear. It's interesting that doesn't work for you.

Anyway, it was kind of an accident that I never ran into it. The setrlimit is just something FlashMQ does, so it also did so in the tests. Some epoll and eventfd file descriptors plainly lacked a close, or even a destructor to call close() in... I fixed it.
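
For readers who hit the same class of bug: one common way to make sure a descriptor always gets its close() is to give it an owning wrapper with a destructor. This is a generic sketch, not the actual FlashMQ fix; `UniqueFd`, `makeEpollFd` and `makeEventFd` are hypothetical names:

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#include <utility>

// Hypothetical RAII wrapper (not taken from the FlashMQ source): owns one
// file descriptor and closes it when the wrapper is destroyed.
class UniqueFd
{
    int fd = -1;

public:
    UniqueFd() = default;
    explicit UniqueFd(int fd) : fd(fd) {}

    // Copying would lead to a double close, so only moving is allowed.
    UniqueFd(const UniqueFd &) = delete;
    UniqueFd &operator=(const UniqueFd &) = delete;

    UniqueFd(UniqueFd &&other) noexcept : fd(std::exchange(other.fd, -1)) {}

    UniqueFd &operator=(UniqueFd &&other) noexcept
    {
        if (this != &other)
        {
            reset();
            fd = std::exchange(other.fd, -1);
        }
        return *this;
    }

    ~UniqueFd() { reset(); }

    // Close the descriptor now, if there is one.
    void reset()
    {
        if (fd >= 0)
            ::close(fd);
        fd = -1;
    }

    int get() const { return fd; }
    bool valid() const { return fd >= 0; }
};

// Convenience factories for the two descriptor types seen in the lsof output.
// On failure the syscalls return -1, which yields an invalid UniqueFd.
inline UniqueFd makeEpollFd() { return UniqueFd(epoll_create1(EPOLL_CLOEXEC)); }
inline UniqueFd makeEventFd() { return UniqueFd(eventfd(0, EFD_CLOEXEC)); }
```

A class that holds a `UniqueFd` member instead of a raw `int` no longer needs its own destructor just to call close().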
