On 2.5.4 I verified that thousands of files in chunkserver's directories are completel



chunkserver: bypass OS cache (posix_fadvise/POSIX_FADV_DONTNEED) about lizardfs HOT 8 CLOSED

lizardfs commented on May 19, 2024

chunkserver: bypass OS cache (posix_fadvise/POSIX_FADV_DONTNEED)

from lizardfs.

Comments (8)

Zorlin commented on May 19, 2024

+1. I originally read this and went "no, that's crazy, it'd wreck performance" until I realized you were talking about only caching directory and file listings instead of whole files.


onlyjob commented on May 19, 2024

Thanks. Cache displacement may be a primary reason for significant performance degradation unless you are running the chunkserver on a dedicated machine. Chunkservers can be quite active -- I observe over 10000 cached chunk files, which creates enough cache pressure to noticeably slow down everything else running on the same server...


Zorlin commented on May 19, 2024

I'm particularly interested in this patch as we have some machines serving as much as 60-80TB from a single box (via Supermicro JBOD with consumer drives). What would be the (ballpark) performance impact on that sort of dedicated machine?


onlyjob commented on May 19, 2024

I'm not qualified to prepare this patch -- I'm simply incompetent in C/C++ these days as the last time I did C coding was back in 1995...

I'm starting chunkservers using the nocache wrapper as follows:

/usr/bin/nocache -n 2 /usr/sbin/mfschunkserver -d start

and although I've been doing this only for a limited time, subjectively everything feels smoother, the cache no longer seems over-utilised, etc. I don't think we're talking about any negative performance "impact" whatsoever, even on dedicated machines. Indeed, the cache would be better used for directories, executables and whatnot rather than wasted on chunks, because if chunks are cached, everything else will eventually be displaced from the cache.
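For readers unfamiliar with the call named in the issue title, here is a minimal Python sketch of the idea: read a file, then advise the kernel that its cached pages are no longer needed. This is not chunkserver code; the function name `read_without_caching` and the 1 MiB read size are assumptions made up for illustration, and it relies on `os.posix_fadvise`, which is only available on Unix-like systems.

```python
import os

CHUNK = 1 << 20  # hypothetical read size: 1 MiB per os.read() call


def read_without_caching(path):
    """Read a file to the end, then ask the kernel to drop its cached pages.

    Returns the number of bytes read. The advice is only a hint: the
    kernel may keep some pages (for example dirty or mapped ones).
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, CHUNK)
            if not data:
                break
            total += len(data)
        # offset=0 with length=0 covers the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

Because the call is advisory, some pages can stay cached afterwards, which fits the later remark in this thread that the nocache approach is not 100% effective.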


onlyjob commented on May 19, 2024

I meant to say that on dedicated machines you will not see much of a performance improvement (it will just run as usual or slightly better), while it will be most beneficial on shared servers where other services are running as well.


Zorlin commented on May 19, 2024

Hi @onlyjob,

I ask because we have something like 100 million chunks per server and I suspect at that scale this could have more impact than you think.


Zorlin commented on May 19, 2024

I would also be interested in objective measurements. I would suggest pulling data from the CGI or probe under each of the following conditions:

Normal behavior, just after a cold start
Nocache run, just after a cold start
Normal behavior after one hour of activity
Nocache run after one hour of activity.
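Alongside the CGI or probe data, the OS-level cache size could be recorded under each of the four conditions. The following is a rough sketch that assumes a Linux host with `/proc/meminfo`; the helper names `parse_meminfo` and `cache_snapshot` are made up for illustration and are not part of any LizardFS tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are kept in the units the kernel prints (kB for most fields).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_snapshot(path="/proc/meminfo"):
    """Return the fields most relevant to page-cache usage (None if absent)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {key: info.get(key) for key in ("MemFree", "Buffers", "Cached")}
```

Calling `cache_snapshot()` once per condition and comparing the `Cached` values would show how much of the page cache is occupied in a normal run versus a nocache run.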


onlyjob commented on May 19, 2024

@Zorlin:

"I ask because we have something like 100 million chunks per server and I suspect at that scale this could have more impact than you think."

Yes, if we're talking about the impact of overusing the cache when the cache hit ratio is extremely low on a data set as large as yours... :)

You should be able to get some data yourself easily enough, although please remember that the nocache method is not 100% effective, so some files will still be partially cached, just not as much as without it...

Also I'd suggest running for at least several hours before comparing stats.
At the moment I'm not in a position to do an objective test, as I would have to generate a similar load, for which I would have to stop all clients. I reached my conclusion regarding the benefits of not caching large data sets on nodes of distributed file systems earlier. I just started the chunkserver with nocache this morning, and some hours from now I'm going to check how many chunk files are cached compared to the normal situation without nocache... In any case I expect performance benefits for other applications/services, not for the chunkservers themselves...
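One way to check how much of a chunk file is cached is the `mincore(2)` system call, which reports which pages of a mapping are resident in memory. The sketch below calls it through `ctypes`; it assumes Linux with a libc that exports `mincore`, the function name `resident_fraction` is hypothetical, and recent kernels only report accurate residency for files the caller owns or can write to.

```python
import ctypes
import mmap
import os


def resident_fraction(path):
    """Return the fraction (0.0 to 1.0) of a file's pages resident in memory."""
    size = os.path.getsize(path)
    if size == 0:
        return 0.0
    # CDLL(None) exposes the symbols of the running process, including libc.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                             ctypes.POINTER(ctypes.c_ubyte)]
    libc.mincore.restype = ctypes.c_int
    pages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    vec = (ctypes.c_ubyte * pages)()  # one status byte per page
    with open(path, "rb") as f:
        # A private copy-on-write mapping; mapping alone reads no pages.
        mm = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)
    try:
        buf = ctypes.c_char.from_buffer(mm)
        try:
            if libc.mincore(ctypes.addressof(buf), size, vec) != 0:
                err = ctypes.get_errno()
                raise OSError(err, os.strerror(err))
        finally:
            del buf  # release the buffer export so the mapping can be closed
    finally:
        mm.close()
    # The lowest bit of each status byte is set when the page is resident.
    return sum(b & 1 for b in vec) / pages
```

Summing this over the files in a chunk directory, with and without nocache, would give a rough count of cached chunk data for the comparison described above.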
