feh/nocache — minimize caching effects
License: BSD 2-Clause "Simplified" License
Thanks - nice tool!
A tool similar to nocache is pagecache-management - "A tool to permit the management of the pagecache usage of arbitrary applications" - Google Project Hosting
It might help to compare and contrast the two in some documentation here, or steal some code, or connect with the authors, or whatever.
As discussed here, NOCACHE_MAX_FDS is treated as an artificial cap for RLIMIT_NOFILE. Therefore it should follow the same convention for its value: it should be one greater than the maximum file descriptor we are interested in. Currently this is not the case; the value of NOCACHE_MAX_FDS is 2 greater than the maximum file descriptor that will be handled by nocache.
This can easily be proven by adding this simple test to maxfd.t:
t "env NOCACHE_MAX_FDS=4 LD_PRELOAD=../nocache.so cat testfile.$$ >/dev/null && ! ../cachestats -q testfile.$$" "file is not in cache because it has an FD < 4"
Since FDs 0, 1 and 2 are stdin, stdout and stderr, the testfile should have FD 3. Therefore, setting NOCACHE_MAX_FDS to 4 should make the test pass. It does not; setting it to 5 grants a passing test. Reverting this line also resolves the problem.
As reported in Debian, nocache FTBFS on armhf as follows:
cc -g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -Wall -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -fPIC -c -o pageinfo.o pageinfo.c
sed 's!##libdir##!$(dirname "$0")!' <nocache.in >nocache
chmod a+x nocache
/tmp/ccsSPzHW.s: Assembler messages:
/tmp/ccsSPzHW.s:3650: Error: symbol `fopen64' is already defined
Hi,
Have you had the opportunity to test nocache on an OS with SELinux installed?
When I use it on a RHEL5 (64bit) system, with SELinux installed but disabled, I get a segmentation fault shortly after libsepol (from SELinux) is loaded:
# LD_PRELOAD=/usr/lib64/nocache.so strace -f /bin/ls
execve("/bin/ls", ["/bin/ls"], [/* 28 vars */]) = 0
...
open("/lib64/libsepol.so.1", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340<\240\34<\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=247528, ...}) = 0
mmap(0x3c1ca00000, 2383168, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3c1ca00000
mprotect(0x3c1ca3b000, 2097152, PROT_NONE) = 0
mmap(0x3c1cc3b000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3b000) = 0x3c1cc3b000
mmap(0x3c1cc3c000, 40256, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3c1cc3c000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9698ba000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9698bb000
arch_prctl(ARCH_SET_FS, 0x2ae9698bb390) = 0
mprotect(0x3c1d414000, 4096, PROT_READ) = 0
mprotect(0x37fd002000, 4096, PROT_READ) = 0
mprotect(0x3c1c146000, 16384, PROT_READ) = 0
mprotect(0x3d6f607000, 4096, PROT_READ) = 0
mprotect(0x3c1bc19000, 4096, PROT_READ) = 0
munmap(0x2ae9698a9000, 61160) = 0
set_tid_address(0x2ae9698bb420) = 10941
set_robust_list(0x2ae9698bb430, 0x18) = 0
rt_sigaction(SIGRTMIN, {0x3c1d205350, [], SA_RESTORER|SA_SIGINFO, 0x3c1d20de60}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x3c1d2052a0, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x3c1d20de60}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=RLIM_INFINITY}) = 0
access("/etc/selinux/", F_OK) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
I can use nocache with other binaries not linked against libsepol, on the same machine.
Would you know why this happens, and how to fix it?
Thank you.
And thank you for nocache :)
This is the first issue I have ever opened to say that "your program works better than the alternative that you suggest"!
Test case, first without any optimizations:
- backup-process takes 7 minutes (a mksquashfs chroot chroot.squashfs command)
- user-process takes 2.5 minutes (an ls -lR /home/username command)
Then I try with nocache:
- nocache backup-process takes 9 minutes
- user-process takes 1.5 minutes
==> this is the correct optimization I'm looking for, to favor the user process.
Then I try the same thing with the systemd-run command that you mention in https://github.com/Feh/nocache#if-you-use-systemd.
I get the same results as "without any optimizations", 7 minutes and 2.5 minutes.
I do NOT notice this thing you mention:
During (notice how buff/cache only goes up by ~300MiB):
I was NOT able to make systemd-run limit the pagecache. buff/cache for me was using all the RAM, and my "free" column went down to almost zero.
The systemd.resource-control manpage mentions that:
Options for controlling the Legacy Control Group Hierarchy (Control Groups version 1) are now fully deprecated: CPUShares=weight, StartupCPUShares=weight, MemoryLimit=bytes, ...
So I tested with the recommended MemoryHigh instead, and I even tried MemoryMax, to no avail.
I was unable to make systemd-run NOT fill up the RAM with the pagecache, and as a result, the user-process couldn't keep its own pages in RAM and needed 2.5 minutes instead of 1.5.
==> So I guess the actual issue is, "could you please update README.md with a working example for the newer cgroups v2?"
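For reference, a cgroups-v2 sketch of what an updated README example might look like (the MemoryHigh/MemoryMax values are illustrative assumptions, not tested against this workload -- and whether this actually constrains pagecache growth is exactly what this report questions):

```shell
# Run the backup in a transient scope with a memory ceiling (cgroups v2,
# systemd 231+). MemoryHigh= throttles and reclaims memory, including
# pagecache, above the limit; MemoryMax= is the hard cap.
systemd-run --scope -p MemoryHigh=512M -p MemoryMax=1G \
    mksquashfs chroot chroot.squashfs
```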
Hi Team,
What I have seen is that, by default, threads are allocated based on CPU cores, but for a container in Kubernetes the specified CPU request limits are not applied by default; they have to be specified explicitly in policy.xml.
Could someone also clarify whether this thread count is based on CPU cores or something else?
If I have a file like this:
-rw-r--r-- 1 root root 23 Aug 26 10:13 test
And a script like this:
#!/usr/bin/env bash
nocache -n 2 cat test
Running the script causes the test file to have its time of last modification updated. Here's what it looks like after running the above:
-rw-r--r-- 1 root root 23 Sep 2 10:15 test
I noticed this after my backup script ran: I found myself with a whole filesystem with the same last-modified time on it :P
As reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=918464#50
nocache may fail as follows:
$ timeout 11 ./nocache apt show coreutils 1>>/dev/null
apt: nocache.c:148: init_mutexes: Assertion `fds_lock != NULL' failed.
Aborted
Error 134
Here is what Sven Joachim wrote about it:
The good news is that I seem to have found the explanation for the
failed assertion. In line 147 of nocache.c we have
fds_lock = malloc(max_fds * sizeof(*fds_lock));
and malloc obviously returned NULL. With a debug printf statement I
found out that max_fds == 1073741816, with sizeof(*fds_lock) == 40 it is
not too surprising that malloc failed.
Why is max_fds so high? In the systemd changelog I found out the
following:
,----
| systemd (240-2) unstable; urgency=medium
|
| * Don't bump fs.nr_open in PID 1.
| In v240, systemd bumped fs.nr_open in PID 1 to the highest possible
| value. Processes that are spawned directly by systemd, will have
| RLIMIT_NOFILE be set to 512K (hard).
| pam_limits in Debian defaults to "set_all", i.e. for limits which are
| not explicitly configured in /etc/security/limits.conf, the value from
| PID 1 is taken, which means for login sessions, RLIMIT_NOFILE is set to
| the highest possible value instead of 512K. Not every software is able
| to deal with such an RLIMIT_NOFILE properly.
| While this is arguably a questionable default in Debian's pam_limit,
| work around this problem by not bumping fs.nr_open in PID 1.
| (Closes: #917167)
|
| -- Michael Biebl <[email protected]> Thu, 27 Dec 2018 14:03:57 +0100
`----
And this sid system has an uptime of 13 days, so was booted with systemd
240-1 which explains the high RLIMIT_NOFILE. On a freshly booted
laptop, I get max_fds == 1048576 instead, and obviously malloc'ing 40
Megabytes rather than 40 Gigabytes of RAM is easily possible.
nocache only calls posix_fadvise(POSIX_FADV_DONTNEED) when the file descriptor is closed, and that is too late if the file is huge and has already been read.
This limitation should be mentioned in the README, because it effectively renders nocache useless for big files.
This guy was actually the one that found out about it:
https://sebastian.marsching.com/blog/archives/161-Design-flaws-of-the-Linux-page-cache.html
I made the mistake of applying nocache to a shell script, and things did not go well. Every command in the script had an extra CPU second added to its run time. I wish the documentation had warned me about this:
$ time /bin/true
real 0m0.013s
user 0m0.000s
sys 0m0.013s
$ time nocache /bin/true
real 0m1.017s
user 0m0.647s
sys 0m0.365s
$ time mkdir -p /tmp/foo
real 0m0.013s
user 0m0.001s
sys 0m0.012s
$ time nocache mkdir -p /tmp/foo
real 0m1.002s
user 0m0.645s
sys 0m0.356s
$ time rm -rf /tmp/foo
real 0m0.010s
user 0m0.000s
sys 0m0.011s
$ time nocache rm -rf /tmp/foo
real 0m1.130s
user 0m0.738s
sys 0m0.390s
$ time date
Fri Apr 1 03:46:44 EDT 2022
real 0m0.008s
user 0m0.000s
sys 0m0.008s
$ time nocache date
Fri Apr 1 03:46:47 EDT 2022
real 0m1.093s
user 0m0.779s
sys 0m0.309s
$ time /bin/echo hi
hi
real 0m0.009s
user 0m0.000s
sys 0m0.008s
$ time nocache /bin/echo hi
hi
real 0m1.022s
user 0m0.691s
sys 0m0.328s
....like libeatmydata, for example. If LD_PRELOAD variable is already defined, it should be extended, not overridden. Here's an example patch:
diff --git a/nocache b/nocache
index f6df6b1..be36b61 100755
--- a/nocache
+++ b/nocache
@@ -1,3 +1,10 @@
#!/bin/sh
-export LD_PRELOAD="./nocache.so"
+libnocache="/usr/local/lib/nocache.so"
+
+if [ -n "$LD_PRELOAD" ]; then
+ export LD_PRELOAD="$libnocache $LD_PRELOAD"
+else
+ export LD_PRELOAD="$libnocache"
+fi
+
exec "$@"
Please link to the Linux MM team's wiki, which describes the problem and why they haven't already fixed it.
Also, your alternative-methods section shows how to do it using raw cgroups.
Please mention that systemd 231+ users can easily get the same results on a per-unit basis with MemoryHigh=.
For example,
# my-update-script.service
[Service]
Type=oneshot
ExecStart=/usr/local/bin/my-update-script
MemoryHigh=128M
The exact MemoryHigh= value is not important -- it should be bigger than the script's typical peak usage, and significantly lower than your total RAM.
PS: MemoryHigh= uses the second generation of cgroup stuff. IIRC your README.md is still documenting the first-generation cgroup stuff? I think the details are buried in https://www.kernel.org/doc/Documentation/cgroup-v2.txt but I haven't gone digging for a while.
Hi,
trying following:
git clone https://github.com/Feh/nocache.git
cd nocache
nocache git pull
gives:
error: git-remote-https died of signal 11
Already up-to-date.
Replacing the clone URL "https://" with "git://" makes the error disappear. Any clue how nocache causes this?
Anybody try/test this? :)
gcc --param "l1-cache-size=0" --param "l1-cache-line-size=0" --param "l2-cache-size=0"
There's this utility pv which lets you see a nice progress display for files being read/written, and it looks like nocache has no effect on it.
Steps to reproduce:
nocache pv /dev/sda > /dev/null
Buffers/cache still grow.
nocache git annex works well, but always dies with a segfault when exiting:
$ nocache git annex version
git-annex version: 6.20160126
build flags: Assistant Webapp Pairing Testsuite S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify XMPP ConcurrentOutput TorrentParser Feeds Quvi
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt S3 bup directory rsync web bittorrent webdav tahoe glacier ddar hook external
local repository version: 5
supported repository versions: 5 6
upgrade supported from repository versions: 0 1 2 4 5
error: git-annex died of signal 11
Affects both the official binary builds and the Arch Linux community/git-annex builds.
So I am considering this to control a service I use that pollutes the read cache with data that will get no hits or a very low hit rate; sadly the service has no O_DIRECT feature. I am hoping this will do what I want.
However, my concern is whether it also disables the write cache, akin to fsync. I only want to disable read caching.
Is this possible, and if not, can it be patched in?
Looking at using NoCache to test real hardware performance for storage at the filesystem level. The test suite is managed by higher level scripts (make, bash/sh, python etc). Question is, if I have a makefile or bash script spawning subprocesses, does:
nocache bash test.sh
or
nocache make
behave as if every single subprocess call in the script had nocache in their definition?
My observations show that the method currently being used may fail to prevent cache pollution.
Maybe it's related to https://www.percona.com/blog/2010/04/02/fadvise-may-be-not-what-you-expect/
I suppose using O_DIRECT could be a more promising way to achieve the goal.
diff --git a/Makefile b/Makefile
index cfb6126..502d2ef 100644
--- a/Makefile
+++ b/Makefile
@@ -17,7 +17,7 @@ CFLAGS+= -Wall
GCC = gcc $(CFLAGS) $(CPPFLAGS) $(LDFLAGS)
.PHONY: all
-all: $(CACHE_BINS) nocache.so nocache
+all: $(CACHE_BINS) libnocache.so nocache
$(CACHE_BINS):
$(GCC) -o $@ [email protected]
@@ -32,14 +32,14 @@ nocache:
sed 's!##libdir##!$$(dirname "$$0")!' <nocache.in >$@
chmod a+x $@
-nocache.so: $(NOCACHE_BINS)
- $(GCC) -pthread -shared -Wl,-soname,nocache.so -o nocache.so $(NOCACHE_BINS) -ldl
+libnocache.so: $(NOCACHE_BINS)
+ $(GCC) -pthread -shared -Wl,-soname,libnocache.so -o libnocache.so $(NOCACHE_BINS) -ldl
$(mandir) $(libdir) $(bindir):
mkdir -v -p $@
install: all $(mandir) $(libdir) $(bindir) nocache.global
- install -m 0644 nocache.so $(libdir)
+ install -m 0644 libnocache.so $(libdir)
install -m 0755 nocache.global $(bindir)/nocache
install -m 0755 $(CACHE_BINS) $(bindir)
install -m 0644 $(MANPAGES) $(mandir)
@@ -47,11 +47,11 @@ install: all $(mandir) $(libdir) $(bindir) nocache.global
.PHONY: uninstall
uninstall:
cd $(mandir) && $(RM) -v $(notdir $(MANPAGES))
- $(RM) -v $(bindir)/nocache $(libdir)/nocache.so
+ $(RM) -v $(bindir)/nocache $(libdir)/libnocache.so
.PHONY: clean distclean
clean distclean:
- $(RM) -v $(CACHE_BINS) $(NOCACHE_BINS) nocache.so nocache nocache.global
+ $(RM) -v $(CACHE_BINS) $(NOCACHE_BINS) libnocache.so nocache nocache.global
.PHONY: test
test: all
ArchLinux, CentOS, Fedora, Mandriva, Mint, SUSE etc. packages are available on
ftp://updates.etersoft.ru/pub/Korinf/projects/nocache/
Maybe add it to README.md?
Version 1.1 from package in Ubuntu 20.04 (x86-64). Probably not a major or exploitable issue as the program isn't suid etc, but still should probably be fixed.
$ cachestats -v
open: Bad address
$ valgrind cachestats -v
==196391== Memcheck, a memory error detector
==196391== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==196391== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==196391== Command: cachestats -v
==196391==
==196391== Syscall param openat(filename) points to unaddressable byte(s)
==196391== at 0x49A6EAB: open (open64.c:48)
==196391== by 0x1091C1: open (fcntl2.h:53)
==196391== by 0x1091C1: main (cachestats.c:49)
==196391== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==196391==
open: Bad address
==196391==
==196391== HEAP SUMMARY:
==196391== in use at exit: 0 bytes in 0 blocks
==196391== total heap usage: 2 allocs, 2 frees, 1,496 bytes allocated
==196391==
==196391== All heap blocks were freed -- no leaks are possible
==196391==
==196391== For lists of detected and suppressed errors, rerun with: -s
==196391== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
It appears the issue is passing a single flag and no file. Also happens with -q.
Also while not leading to any undefined behaviour, the parsing seems a bit odd with cachedel as well:
$ cachedel
usage: cachedel [-n <n>] <file> -- call fadvise(DONTNEED) <n> times on file
$ cachedel -n
open: No such file or directory
$ cachedel -n 2
usage: cachedel [-n <n>] <file> -- call fadvise(DONTNEED) <n> times on file
That middle line doesn't seem to follow the pattern.
Finally, it appears -- works with cachedel but not cachestats. Incredibly unlikely use case, but for robustness this should probably work:
$ echo a > -v
$ cachestats -v
open: Bad address
$ cachestats -- -v
open: No such file or directory
Note, cachestats works with files starting with - as long as they are not exactly -v or -q. And cachedel seems to implement the -- option to treat the remaining command line as non-flags.
When the download of any torrent completes, the program hangs. Then sending the SIGTERM signal does not terminate the Transmission process; only SIGKILL does. "transmission-gtk" does not freeze if I run it without "nocache". I checked this bug on 2 versions of "transmission-gtk". The version of "nocache" is 1.0.
Hi,
@noushi: There is a supposed build failure with GCC 4.9: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=746888 – It seems to me only the test failed. Can we maybe simply exclude running the tests for Debian packaging, since it’s rather volatile in a way and should not block the building process? Can you re-trigger the build?
Thanks,
Julius
Recursively find in linux kernel tree:
$ time ./nocache find /media/kernel/linux/ ...
real 0m12.242s
user 0m1.219s
sys 0m0.868s
$ time ./nocache find /media/kernel/linux/
real 0m1.963s
user 0m1.015s
sys 0m0.475s
The first time it takes 12 seconds, the next time only 2. :)
With the latest glibc release, nocache appears to completely freeze within certain programs (examples: strace, git-annex). Still looking for the cause.
This is rather a support request. Can I disable the cache for specific directories? The benefit is as follows: a program transmits big files over a network and stores hosts' metadata in a metadata file. It's better not to disable caching for the metadata file.
Nocache is a great way to prevent buffer cache pollution e.g. by rsync.
Unfortunately I just found out the hard way that it also totally destroys rsync performance with small-file workloads due to calling fdatasync() (via sync_if_writable() in free_unclaimed_pages()). Since this happens for every little file (incl. temp files!) on close(), performance becomes completely unacceptable. Not only is it dog slow, it is also not helpful for drive longevity.
I have for now preloaded libeatmydata in addition to nocache, which negates this effect and restores performance; however this should not be necessary. fadvise() works just fine without prior fsync() - buffer pages are flushed out in nice batches, and still marked for reuse.
Unless there is a technical reason that I'm unaware of please consider removing the use of fdatasync().
cachedel.c:49:36: error: 'POSIX_FADV_DONTNEED' undeclared (first use in this function)
cachedel.c:49:36: note: each undeclared identifier is reported only once for each function it appears in
make[2]: *** [cachedel] Error 1
Hi
I'm using the version of nocache that comes with Ubuntu.
I am making backup of my home directory using tar, something like
nocache tar -zcvp /home -f /mnt/usbhdd/backup.tgz
but when I run
cachestats /mnt/usbhdd/backup.tgz
while running the backup command, it still reports about 3GB of pages in cache.
My understanding was that the used cache should be much smaller. Is there any way to prevent tar and gzip from trashing the cache of the system during backup?
Line 207 in 2b6ea1f
Maybe it should be unlock? ))
First, MC takes at least a few seconds to start.
Then, every directory change takes half a second.
I've no idea why it's happening.
Is this function not used anywhere? Is it reserved?
If you use cachestats or cachedel on a large file with a 32-bit build of the tool, it says:
open: Value too large for defined data type
nocache runs, but actually does nothing; the file is still cached at the end.
I tried installing the 64 bit version of the tool, and running a 32 bit command but it says:
ERROR: ld.so: object '/usr/lib/nocache/nocache.so' from
LD_PRELOAD cannot be preloaded: ignored.
(If it matters this is version 0.9-2 on Debian.)