Comments (11)

dturner-tw commented on April 27, 2024

https://lkml.org/lkml/2014/3/4/441

from watchman.

sunshowers commented on April 27, 2024

Are there any processes being triggered? Watchman could potentially be leaking fds to children, though I think we use O_CLOEXEC or equivalent everywhere.

sunshowers commented on April 27, 2024

The sudo ls -l /proc/*/fd/* 2>/dev/null | grep -c anon_inode:inotify you ran -- what processes actually held inotify descriptors?

FWIW, we run with a max_user_instances of 128 and a max_user_watches of 1000000.
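For comparing a machine against those limits, here is a minimal sketch that reads the inotify sysctls from procfs. It assumes a Linux host where they live under /proc/sys/fs/inotify; the function name read_inotify_limits and its base parameter are made up for illustration and are not part of watchman.

```python
from pathlib import Path

# Standard Linux inotify sysctl names.
INOTIFY_KEYS = ("max_user_instances", "max_user_watches", "max_queued_events")


def read_inotify_limits(base="/proc/sys/fs/inotify"):
    """Return {limit_name: int} for each inotify sysctl readable under base."""
    limits = {}
    for key in INOTIFY_KEYS:
        try:
            limits[key] = int((Path(base) / key).read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable (for example on a non-Linux host): skip it.
            continue
    return limits


if __name__ == "__main__":
    for name, value in read_inotify_limits().items():
        print(f"{name} = {value}")
```

The base parameter exists only so the function can be pointed at a copy of the directory; on a real system the default should be what you want.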

dturner-tw commented on April 27, 2024

(I'm going to answer your questions wrt 3.13.5 on virtualbox, because it's annoying to reboot my dev machine).

There are no triggers.

What's holding descriptors:
ps ax|egrep `sudo ls -l /proc/*/fd/* 2>/dev/null|cut -d / -f 3|sort -u |grep -v self|tr '\n' '|'|sed 's/\(.*\)./\1/'`

1519 ? Ss 0:00 init --user
1613 ? Ss 0:00 dbus-daemon --fork --session --address=unix:abstract=/tmp/dbus-EeYjp63CNa
1619 ? Ss 0:00 upstart-event-bridge
1626 ? Ss 0:00 /usr/lib/x86_64-linux-gnu/hud/window-stack-bridge
1633 ? S 0:00 upstart-file-bridge --daemon --user
1635 ? S 0:00 upstart-dbus-bridge --daemon --session --user --bus-name session
1637 ? S 0:00 upstart-dbus-bridge --daemon --system --user --bus-name system
1640 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/bamf/bamfdaemon
1641 ? Ssl 0:08 /usr/bin/ibus-daemon --daemonize --xim
1658 ? Ssl 0:00 /usr/lib/gnome-settings-daemon/gnome-settings-daemon
1664 ? Ssl 0:00 /usr/lib/x86_64-linux-gnu/hud/hud-service
1665 ? Sl 0:00 /usr/lib/gvfs/gvfsd
1667 ? Ssl 0:00 /usr/lib/at-spi2-core/at-spi-bus-launcher --launch-immediately
1668 ? Ssl 0:00 gnome-session --session=ubuntu
1673 ? Ssl 0:00 /usr/lib/unity/unity-panel-service
1678 ? S 0:00 /bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.conf --nofork --print-address 3
1685 ? Sl 0:00 /usr/lib/at-spi2-core/at-spi2-registryd --use-gnome-session
1688 ? Sl 0:00 /usr/lib/ibus/ibus-dconf
1689 ? Sl 0:01 /usr/lib/ibus/ibus-ui-gtk3
1693 ? Sl 0:00 /usr/lib/gvfs//gvfsd-fuse -f /run/user/1000/gvfs
1696 ? Sl 0:00 /usr/lib/ibus/ibus-x11 --kill-daemon
1736 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-printers-service
1745 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-power/indicator-power-service
1746 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-bluetooth/indicator-bluetooth-service
1751 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-application-service
1752 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-sound/indicator-sound-service
1753 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-sync/indicator-sync-service
1758 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-messages/indicator-messages-service
1762 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-session/indicator-session-service
1765 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-datetime-service
1770 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/indicator-keyboard-service --use-gtk --use-bamf
1776 ? S<l 0:00 /usr/bin/pulseaudio --start --log-target=syslog
1788 ? Sl 0:02 /usr/lib/ibus/ibus-engine-simple
1830 ? Sl 0:00 /usr/bin/gnome-screensaver --no-daemon
1851 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/notify-osd
1853 ? Sl 0:00 /usr/lib/dconf/dconf-service
1857 ? Sl 0:00 /usr/lib/evolution/evolution-source-registry
1858 ? Rl 1:00 compiz
1871 ? Sl 0:00 nautilus -n
1880 ? Sl 0:00 /usr/lib/gnome-settings-daemon/gnome-fallback-mount-helper
1892 ? Sl 0:00 /usr/lib/policykit-1-gnome/polkit-gnome-authentication-agent-1
1903 ? Sl 0:00 nm-applet
1940 ? Sl 0:00 /usr/lib/gvfs/gvfs-udisks2-volume-monitor
1943 ? Sl 0:00 /usr/lib/evolution/evolution-calendar-factory
1948 ? S 0:00 /usr/lib/x86_64-linux-gnu/gconf/gconfd-2
1965 ? Sl 0:00 /usr/lib/gvfs/gvfs-gphoto2-volume-monitor
1978 ? Sl 0:00 /usr/lib/gvfs/gvfs-afc-volume-monitor
1983 ? Sl 0:00 /usr/lib/gvfs/gvfs-mtp-volume-monitor
1994 ? Sl 0:00 /usr/lib/gvfs/gvfsd-trash --spawner :1.7 /org/gtk/gvfs/exec_spaw/0
2003 ? Sl 0:00 /usr/lib/gvfs/gvfsd-burn --spawner :1.7 /org/gtk/gvfs/exec_spaw/1
2007 ? Ss 0:00 /bin/sh -c /usr/bin/gtk-window-decorator
2008 ? Sl 0:00 /usr/bin/gtk-window-decorator
2012 ? Sl 0:00 telepathy-indicator
2026 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/unity-scope-home/unity-scope-home
2029 ? Sl 0:00 zeitgeist-datahub
2038 ? Sl 0:00 /usr/bin/zeitgeist-daemon
2045 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/zeitgeist-fts
2053 ? S 0:00 /bin/cat
2068 ? Sl 0:00 /usr/bin/unity-scope-loader applications/applications.scope applications/scopes.scope commands.scope
2070 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/unity-lens-files/unity-files-daemon
2095 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/unity-lens-music/unity-music-daemon
2154 ? Sl 0:00 /usr/bin/python3 /usr/share/unity-scopes/flickr/unity_flickr_daemon.py
2156 ? Sl 0:00 /usr/bin/python3 /usr/share/unity-scopes/picasa/unity_picasa_daemon.py
2157 ? Sl 0:00 /usr/bin/python3 /usr/share/unity-scopes/facebook/unity_facebook_daemon.py
2159 ? Sl 0:10 gnome-terminal
2166 pts/1 Ss 0:00 bash
2271 ? Sl 0:00 /usr/lib/gvfs/gvfsd-http --spawner :1.7 /org/gtk/gvfs/exec_spaw/2
5504 ? Sl 0:00 update-notifier
5545 pts/0 Ss+ 0:00 bash
5634 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/deja-dup/deja-dup-monitor
6058 pts/1 S+ 0:00 bash
6059 pts/1 S+ 0:00 bash

sunshowers commented on April 27, 2024

That seems like you aren't doing any filtering based on the kind of descriptor -- I'm particularly interested in processes that are holding inotify descriptors open. Basically the PIDs you get when you run sudo ls -l /proc/*/fd/* 2>/dev/null | grep anon_inode:inotify.
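The same filtering can be sketched in Python, in case the shell pipeline is awkward to adapt. This assumes Linux procfs, where an inotify descriptor shows up as a symlink whose target is anon_inode:inotify; the function name inotify_holders and the proc_root parameter are illustrative only. Like the ls version, it needs root to see other users' processes.

```python
import os


def inotify_holders(proc_root="/proc"):
    """Map PID -> number of open inotify descriptors found under proc_root."""
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skips "self" and other non-process entries
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        count = 0
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target == "anon_inode:inotify":
                count += 1
        if count:
            holders[int(entry)] = count
    return holders


if __name__ == "__main__":
    for pid, count in sorted(inotify_holders().items()):
        print(pid, count)
```

The resulting PIDs can then be passed to ps to get the command lines.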

dturner-tw commented on April 27, 2024

Sorry about that -- this is the list if I add the grep anon_inode:inotify at the appropriate place in the previous command.

1577 ? Ss 0:00 init --user
1671 ? Ss 0:00 dbus-daemon --fork --session --address=unix:abstract=/tmp/dbus-cRInMU4p2A
1693 ? S 0:00 upstart-file-bridge --daemon --user
1697 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/bamf/bamfdaemon
1699 ? Ssl 0:01 /usr/bin/ibus-daemon --daemonize --xim
1720 ? Ssl 0:00 /usr/lib/gnome-settings-daemon/gnome-settings-daemon
1727 ? S 0:00 /bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.conf --nofork --print-address 3
1731 ? Ssl 0:00 gnome-session --session=ubuntu
1750 ? Sl 0:00 /usr/lib/ibus/ibus-dconf
1754 ? Sl 0:00 /usr/lib/ibus/ibus-ui-gtk3
1768 ? Sl 0:00 /usr/lib/ibus/ibus-x11 --kill-daemon
1877 ? S<l 0:00 /usr/bin/pulseaudio --start --log-target=syslog
1892 ? Sl 0:00 /usr/lib/ibus/ibus-engine-simple
1936 ? Sl 0:00 /usr/bin/gnome-screensaver --no-daemon
1965 ? Sl 0:00 /usr/lib/evolution/evolution-source-registry
1970 ? Rl 0:15 compiz
1980 ? Sl 0:00 nautilus -n
2050 ? Sl 0:00 /usr/lib/gvfs/gvfs-udisks2-volume-monitor
2077 ? Sl 0:00 /usr/lib/gvfs/gvfs-afc-volume-monitor
2110 ? Sl 0:00 /usr/lib/gvfs/gvfsd-trash --spawner :1.5 /org/gtk/gvfs/exec_spaw/0
2123 ? Ss 0:00 /bin/sh -c /usr/bin/gtk-window-decorator
2124 ? Sl 0:00 /usr/bin/gtk-window-decorator
2141 ? Sl 0:00 /usr/bin/unity-scope-loader applications/applications.scope applications/scopes.scope commands.scope
2143 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/unity-lens-files/unity-files-daemon
2160 ? Sl 0:00 zeitgeist-datahub
2185 ? Sl 0:00 /usr/lib/x86_64-linux-gnu/unity-lens-music/unity-music-daemon
2214 ? Sl 0:01 gnome-terminal

dturner-tw commented on April 27, 2024

While running other simultaneous tests with watchman, I noticed the tests suddenly freezing up. I'm not quite sure of the precise sequence of events (because automated testing), but the tests were doing (watch, since query) one or more times, then deleting the watched directory. Now watchman is in a funny state: there appear to be 8 watchman processes running (I think usually there is only one). watchman watch-list hangs attempting to read from the socket, as does watchman watch. However, providing bad syntax (e.g. "watchman since /tmp/foo") doesn't hang, which suggests that the problem isn't the socket itself but whatever's going on inside watchman.
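For an automated test suite, a hang like this is easier to spot if every client call has a deadline. A minimal sketch, using only the Python standard library; the helper name run_with_timeout is made up here, and nothing about it is specific to watchman:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=5.0):
    """Run cmd and return (finished, returncode); finished is False on timeout."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind.
        return False, None
    return True, proc.returncode
```

Calling run_with_timeout(["watchman", "watch-list"]) would then return (False, None) instead of blocking the whole test run when the server stops answering. Note that it raises FileNotFoundError if the command is not installed.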

And it's definitely a kernel bug -- I found this in my dmesg:

[152513.914195] watchman[4963]: segfault at 7ff04ddb09d0 ip 00007ff05b831f60 sp 00007ff04c5acce8 error 4 in libpthread-2.15.so[7ff05b825000+18000]
[152516.577861] watchman[6138]: segfault at 7f1962e099d0 ip 00007f1970489f60 sp 00007f1962406ce8 error 4 in libpthread-2.15.so[7f197047d000+18000]
[153010.703990] BUG: unable to handle kernel NULL pointer dereference at (null)
[153010.704036] IP: < (null)>
[153010.704060] PGD 1b1b4e067 PUD 1cc1f1067 PMD 0
[153010.704084] Oops: 0010 [#1] SMP
[153010.704103] Modules linked in: btrfs raid6_pq zlib_deflate xor ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs usb_storage cdc_acm joydev pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF) vboxdrv(OF) bnep rfcomm bluetooth parport_pc ppdev uvcvideo videobuf2_core binfmt_misc videodev snd_hda_codec_hdmi snd_hda_codec_conexant videobuf2_vmalloc videobuf2_memops snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi arc4 snd_rawmidi iwldvm mac80211 snd_seq_midi_event snd_seq psmouse thinkpad_acpi snd_timer snd_seq_device iwlwifi nvram serio_raw snd tpm_tis cfg80211 soundcore snd_page_alloc mac_hid mei_me mei lpc_ich lp ext2 parport dm_crypt i915 drm_kms_helper e1000e wmi drm ptp pps_core ahci libahci sdhci_pci sdhci i2c_algo_bit video
[153010.704453] CPU: 1 PID: 3586 Comm: watchman Tainted: GF W O 3.11.0-17-generic #31~precise1-Ubuntu
[153010.704493] Hardware name: LENOVO 4177Q5U/4177Q5U, BIOS 83ET76WW (1.46 ) 07/05/2013
[153010.704529] task: ffff8801b6ae0000 ti: ffff88009ed30000 task.ti: ffff88009ed30000
[153010.704564] RIP: 0010:[<0000000000000000>] < (null)>
[153010.704600] RSP: 0018:ffff88009ed31dc0 EFLAGS: 00010246
[153010.704624] RAX: 00000000b98ab901 RBX: ffff88015a9b5228 RCX: 00000000000188d0
[153010.704655] RDX: 000000000000b98a RSI: ffff880100b95c00 RDI: ffff88015a9b5228
[153010.704686] RBP: ffff88009ed31dd8 R08: 0000000000000001 R09: ffffea0000d85640
[153010.704718] R10: ffffffff811f9628 R11: 0000000000000000 R12: ffff88015a9b5228
[153010.704750] R13: ffff880100b95ca0 R14: 00000000ffffffff R15: ffff880100b95c00
[153010.704783] FS: 00007f5b19087700(0000) GS:ffff88021e240000(0000) knlGS:0000000000000000
[153010.704819] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[153010.704846] CR2: 0000000000000000 CR3: 00000001dd9d1000 CR4: 00000000000427e0
[153010.704877] Stack:
[153010.704888] ffffffff811f7120 ffff88015a9b5228 ffff8801e460c7f0 ffff88009ed31e28
[153010.704926] ffffffff811f7847 ffff88009ed31e18 ffff880100b95c70 ffff88009ed31f50
[153010.704965] ffff880100b95c00 0000000000000010 ffff8802120433c0 ffff8802120433c0
[153010.705005] Call Trace:
[153010.705024] [] ? fsnotify_put_mark+0x30/0x40
[153010.705054] [] fsnotify_clear_marks_by_group_flags+0x87/0xb0
[153010.705088] [] fsnotify_clear_marks_by_group+0x13/0x20
[153010.705119] [] fsnotify_destroy_group+0x16/0x40
[153010.705150] [] inotify_release+0x26/0x50
[153010.705177] [] __fput+0xba/0x240
[153010.705201] [] ____fput+0xe/0x10
[153010.705226] [] task_work_run+0xc8/0xf0
[153010.706464] [] do_notify_resume+0xac/0xc0
[153010.707730] [] int_signal+0x12/0x17
[153010.708933] Code: Bad RIP value.
[153010.710089] RIP < (null)>
[153010.711251] RSP
[153010.712319] CR2: 0000000000000000
[153010.718669] ---[ end trace 17ed2927fe522cd1 ]---

dturner-tw commented on April 27, 2024

Killing all of the watchman processes appears to restore order to the universe, mostly, but watchman is now rather fragile -- I get a fair number of "synchronization failed: Connection timed out" errors when doing queries, which I normally don't; killing watchman makes them go away, but only for a little while.

wez commented on April 27, 2024

I don't think it is especially productive to try to sanity-check watchman behavior on a system with a broken inotify implementation :-/ Watchman relies on the filesystem notification layer of the system on which it runs; if that isn't working, then watchman is going to have a hard time telling you what you want to know.

If you can't get things running on a working kernel, then my suggestion would be to try running your tests on a different filesystem (maybe this is a filesystem-specific kernel bug?).

Regarding the blocking/stuck behavior you mentioned, it would be useful to see a gstack of the watchman server process and compare it with gstacks of the other processes that you saw (assuming that those are acting in client mode). It's possible that they are just the threads of the server process showing up in whatever tool you're using to inspect the system.
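One way to check the threads-versus-processes question without relying on a particular tool is to read procfs directly. This is a rough sketch assuming Linux, where the numeric entries directly under /proc are processes and each entry under /proc/PID/task is one thread of that process; the function name processes_named and its parameters are illustrative.

```python
import os


def processes_named(name="watchman", proc_root="/proc"):
    """Map PID -> thread count for every process whose comm equals name."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "comm")) as f:
                comm = f.read().strip()
            if comm != name:
                continue
            # Each subdirectory of task/ is one thread of this process.
            found[int(entry)] = len(os.listdir(os.path.join(pid_dir, "task")))
        except OSError:
            continue  # process exited while we were looking
    return found


if __name__ == "__main__":
    for pid, threads in sorted(processes_named().items()):
        print(f"pid {pid}: {threads} thread(s)")
```

A single PID with several threads would point to one server being displayed multiple times; several PIDs would mean there really are separate processes.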

wez commented on April 27, 2024

I'm going to close this out because it really seems to be a kernel problem and not something we can solve in watchman. Sorry!

dturner-tw commented on April 27, 2024

FWIW, this seems to be fixed on 3.14-rc5.

On Sun, Mar 30, 2014 at 10:25 PM, Wez Furlong wrote:

Closed #27.
