Comments (8)
+1. I originally read this and went "no, that's crazy, it'd wreck performance" until I realized you were talking about only caching directory and file listings instead of whole files.
from lizardfs.
Thanks. Cache displacement may be a primary reason for significant performance degradation unless you are running the chunkserver on a dedicated machine. Chunkservers may be quite active -- I observe over 10000 cached chunk files, creating enough pressure to notice a slowdown in everything else running on the same server...
from lizardfs.
I'm particularly interested in this patch as we have some machines serving as much as 60-80TB from a single box (via Supermicro JBOD with consumer drives). What would be the (ballpark) performance impact on that sort of dedicated machine?
from lizardfs.
I'm not qualified to prepare this patch -- I'm simply incompetent in C/C++ these days as the last time I did C coding was back in 1995...
I'm starting chunkservers using the nocache wrapper as follows:
/usr/bin/nocache -n 2 /usr/sbin/mfschunkserver -d start
and although I've been doing it for only a limited time, subjectively it feels like everything runs smoother, the cache no longer seems over-utilised, etc. I think we're not talking about any performance "impact" whatsoever, even on dedicated machines. Indeed, the cache would be better used for directories, executables and whatnot rather than wasted on chunks, because if chunks are cached, everything else will eventually be displaced from the cache.
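For anyone wondering what the wrapper roughly does: as far as I understand it, nocache preloads a library that advises the kernel (via `posix_fadvise` with `POSIX_FADV_DONTNEED`) to drop a file's pages from the page cache once the program is done with them. Below is a minimal Python sketch of that idea, not the actual nocache code or anything from the chunkserver. The function name `read_without_caching` and the block size are made up for illustration, and the call is only a hint that the kernel may ignore.

```python
import os


def read_without_caching(path, block_size=1 << 20):
    """Read a file sequentially, then advise the kernel to drop its cached pages.

    Hypothetical helper for illustration only. Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            block = os.read(fd, block_size)
            if not block:
                break
            total += len(block)
        # offset=0 and len=0 cover the whole file. This is advisory:
        # dirty or otherwise busy pages may stay in the cache.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

Because the call is advisory, some pages can remain cached, which would be consistent with the wrapper not being 100% effective.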
from lizardfs.
I meant to say that on dedicated machines you will not see a performance improvement (it will just run as usual or slightly better), while it will be most beneficial on shared servers where other services are running as well.
from lizardfs.
Hi @onlyjob,
I ask because we have something like 100 million chunks per server and I suspect at that scale this could have more impact than you think.
from lizardfs.
I would also be interested in objective measurements. I would suggest pulling data from the CGI or probe under each of the following conditions:
- Normal behavior, just after a cold start
- Nocache run, just after a cold start
- Normal behavior after one hour of activity
- Nocache run after one hour of activity.
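Alongside the CGI or probe data, it might help to record the host's page cache size under each of those four conditions. Here is a rough Python sketch that reads the standard Linux `/proc/meminfo` fields. The function names `parse_meminfo` and `cache_snapshot` are made up for illustration, and the numbers are host-wide, so they include cache used by everything else on the box, not just the chunkserver.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_snapshot(path="/proc/meminfo"):
    """Return the memory fields most relevant to page cache pressure."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    wanted = ("MemTotal", "MemFree", "Buffers", "Cached")
    return {key: info.get(key) for key in wanted}


if __name__ == "__main__":
    # Run once per test condition and keep the output with the other stats.
    print(cache_snapshot())
```

Taking one snapshot per condition gives four sets of numbers that can be compared side by side with the chunkserver statistics.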
from lizardfs.
> I ask because we have something like 100 million chunks per server and I suspect at that scale this could have more impact than you think.
Yes, if we're talking about the impact of overusing the cache when the cache hit ratio is extremely low on a data set as large as yours... :)
It should be easy enough to get some data yourself, although please remember that the nocache method is not 100% effective, so some files will be partially cached, but not as much as without it...
Also, I'd suggest running for at least several hours before comparing stats.
At the moment I'm not in a position to do an objective test, as I would have to generate a similar load, for which I would have to stop all clients. I reached my conclusion regarding the benefits of not caching large data sets on nodes of distributed file systems earlier. I've just started the chunkserver with nocache this morning, and some hours later I'm going to check how many chunk files are cached compared to the normal situation without nocache... In any case, I expect performance benefits for other applications/services, not for the chunkservers themselves...
from lizardfs.