Hello, I am using HybridCache to extend the size of the cache to fit

HybridCache Output about cachelib HOT 9 CLOSED

facebook commented on April 21, 2024

HybridCache Output

from cachelib.

Comments (9)

sathyaphoenix commented on April 21, 2024 1

Is the key size also included in the allocation size class measurement? I thought it was only the data size.

allocSizes for DRAM is the size that includes key + data + Item overhead. If you don't specify one, cachebench has defaults, which I believe use a 1.25 factor. For Navy, this is similar for LOC if you don't use stack allocation. SOC does not need an alloc size since everything is stack allocated. If you don't care about size classes for Navy, the best option is to use stack allocation mode (i.e. specify an empty array for "navySizeClasses") and set "truncateItemToOriginalAllocSizeInNvm" so that any unused space in the DRAM Item is truncated when writing to SSD. See https://cachelib.org/docs/Cache_Library_User_Guides/Configuring_cachebench_parameters#large-item-engine-parameters for more info.
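
As a rough illustration of the stack allocation suggestion, the Python sketch below takes the cache_config posted in this thread and applies the two settings named above. Placing "truncateItemToOriginalAllocSizeInNvm" inside "cache_config" as a boolean is an assumption; the linked cachebench documentation is the authority on where it belongs. The helper name use_stack_allocation is hypothetical.

```python
import json

# cache_config as posted in this thread (values and path kept verbatim).
cache_config = {
    "cacheSizeMB": 100,
    "allocSizes": [4135],
    "nvmCachePaths": ["/flash/cahce.file"],
    "nvmCacheSizeMB": 350,
    "navyBigHashSizePct": 0,
    "navySizeClasses": [4608],
}


def use_stack_allocation(config):
    """Return a copy of the config switched to Navy stack allocation mode."""
    updated = dict(config)
    # An empty size-class array selects stack allocation, per the reply above.
    updated["navySizeClasses"] = []
    # Assumption: this flag lives in cache_config and takes a boolean.
    updated["truncateItemToOriginalAllocSizeInNvm"] = True
    return updated


stack_config = use_stack_allocation(cache_config)
print(json.dumps({"cache_config": stack_config}, indent=2))
```

The original dictionary is left untouched, so the two variants can be run side by side in cachebench and compared.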

One more question about the memory usage. I ran the above configuration and the memory usage currently when around 10% of simulation is done is around 3GB? Isn't that too high? It continues to increase.

That does sound high for a cacheSizeMB of 100. Cachebench itself uses some memory when consistency checking mode is used, but for trace replay that should be relatively small. CacheLib's memory usage should mostly come from the hash table for the DRAM cache and the sparse map for Navy. I doubt these could add up to 3GB unless htBucketPower is misconfigured.

sathyaphoenix commented on April 21, 2024

If you were to use a synthetic workload generator, the number of items in the cache would be controlled by the "numKeys" that cachebench uses to generate the workload. If that is small enough to fit in the cache, then the cache will never become full. I see that you are using a trace file here, so it is likely that the total number of unique keys in the trace file does not exceed the cache size. Can you verify whether that's the case?

pbhandar2 commented on April 21, 2024

I can verify that it is not the case, since the number of items in the cache increases as I increase the size of the cache. Also, since there are cache evictions, there have to be items that didn't fit. I suspected it was because of different allocation classes, so I adjusted the allocation size manually, which worked. So now for the following configuration:

{
  "cache_config": {
    "cacheSizeMB": 100,
    "allocSizes": [4135],
    "nvmCachePaths": ["/flash/cahce.file"],
    "nvmCacheSizeMB": 350,
    "navyBigHashSizePct": 0,
    "navySizeClasses": [4608]
  },
  "test_config": {
    "enableLookaside": "true",
    "generator": "block-replay",
    "numThreads": 1,
    "traceFilePath": "/home/pranav/csv_traces/w105.csv",
    "traceBlockSize": 512,
    "diskFilePath": "/disk/disk.file",
    "pageSize": 4096,
    "minLBA": 0
  }
}

22:40:34 68453 ( 0.07M) ops completed
== Allocator Stats ==
Items in RAM : 24,335 (24,335 * 4135 = 100MB)
Items in NVM : 71,695 (71,695 * 4608 = 320MB)
Alloc Attempts: 252,306 Success: 100.00%

But I have a few questions about allocation sizes. For "allocSizes", since I am adding fixed 4KB pages, why do I have to use the value 4135 for the items to be admitted to the cache? I used a lower value and no items got admitted. The same goes for "navySizeClasses", where I have to use the smallest value larger than 4096 that I can, which is 4096 + 512, where 512 is the navyBlockSize. Is the key size also included in the allocation size class measurement? I thought it was only the data size.
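
The arithmetic behind these two values can be sketched as follows, assuming (as explained elsewhere in this thread) that an allocation must hold key + data + Item overhead, and assuming a Navy size class must be a multiple of the 512-byte block size mentioned above. The split of the extra bytes between key and Item header is not stated in the thread, so the sketch only derives their sum. The function name round_up_to_block is hypothetical.

```python
PAGE_SIZE = 4096        # data bytes per item ("pageSize" in the config)
ALLOC_SIZE = 4135       # smallest "allocSizes" value that admitted items
NAVY_BLOCK_SIZE = 512   # block size quoted in the question


def round_up_to_block(size, block_size):
    """Round size up to the next multiple of block_size."""
    return -(-size // block_size) * block_size


# Bytes needed on top of the data: key + Item overhead combined.
extra_bytes = ALLOC_SIZE - PAGE_SIZE
print(extra_bytes)  # 39

# A bare 4096-byte payload would fit a 4096-byte class exactly...
print(round_up_to_block(PAGE_SIZE, NAVY_BLOCK_SIZE))  # 4096

# ...but once the extra bytes are included, the next multiple of 512 is needed.
print(round_up_to_block(PAGE_SIZE + extra_bytes, NAVY_BLOCK_SIZE))  # 4608
```

Under these assumptions, both 4135 and 4608 follow from the item being slightly larger than its 4096 data bytes.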

One more question about the memory usage. I ran the above configuration and the memory usage currently when around 10% of simulation is done is around 3GB? Isn't that too high? It continues to increase.

MiB Mem : 122748.8 total, 110476.4 free, 2999.3 used, 9273.1 buff/cache

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM
 154419 root      20   0 2868816   2.0g  62592 S 102.7   1.6

pbhandar2 commented on April 21, 2024

I see. Thank you for the suggestions! Really helpful.

I use the default value for htBucketPower which is 22 (uint64_t htBucketPower{22}; // buckets in hash table). Could you please explain to me a little how this could relate to increased memory usage?

sathyaphoenix commented on April 21, 2024

htBucketPower of 22 means that the hashtable has 2^22 slots, 4 bytes each. That should account for about 16MB. The sparse map in Navy is about 10 bytes per entry. If you have a heap profile, it would help you chase the source of the memory usage.
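
A back-of-the-envelope version of this estimate, using only the figures quoted in this thread (4 bytes per hash table slot, about 10 bytes per Navy sparse map entry, and the 71,695 NVM items reported earlier), might look like the sketch below. It is not a full accounting of CacheLib's memory use; the function names are hypothetical.

```python
MIB = 1024 * 1024

HT_BUCKET_POWER = 22         # default htBucketPower mentioned in the thread
BYTES_PER_SLOT = 4           # per-slot size quoted above
NAVY_BYTES_PER_ENTRY = 10    # approximate sparse map cost quoted above
ITEMS_IN_NVM = 71_695        # "Items in NVM" from the earlier stats


def hash_table_bytes(bucket_power, bytes_per_slot=BYTES_PER_SLOT):
    """Memory for a hash table with 2**bucket_power slots."""
    return (1 << bucket_power) * bytes_per_slot


def navy_index_bytes(entries, bytes_per_entry=NAVY_BYTES_PER_ENTRY):
    """Approximate memory for the Navy sparse map."""
    return entries * bytes_per_entry


ht = hash_table_bytes(HT_BUCKET_POWER)
navy = navy_index_bytes(ITEMS_IN_NVM)
print(f"hash table : {ht / MIB:.2f} MiB")
print(f"navy index : {navy / MIB:.2f} MiB")
print(f"total      : {(ht + navy) / MIB:.2f} MiB")
```

Both terms together stay well under 20 MiB, which is why a heap profile is the suggested next step for explaining multi-gigabyte usage.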

pbhandar2 commented on April 21, 2024

For the following stats,

00:01:58 107561 ( 0.11M) ops completed
== Allocator Stats ==
Items in RAM : 24,335
Items in NVM : 70,217
Alloc Attempts: 485,754 Success: 100.00%
RAM Evictions : 386,763
Cache Gets : 352,541
Hit Ratio : 45.88%
RAM Hit Ratio : 35.92%
NVM Hit Ratio : 15.54%
RAM eviction rejects expiry : 0
RAM eviction rejects clean : 34,832
NVM Read Latency p50 : 523.00 us
NVM Read Latency p90 : 2330.00 us
NVM Read Latency p99 : 49953.00 us
NVM Read Latency p999 : 65187.00 us
NVM Read Latency p9999 : 65187.00 us
NVM Read Latency p99999 : 65187.00 us
NVM Read Latency p999999 : 65187.00 us
NVM Read Latency p100 : 0.00 us
NVM Write Latency p50 : 7199.00 us
NVM Write Latency p90 : 26071.00 us
NVM Write Latency p99 : 86461.00 us
NVM Write Latency p999 : 575699.00 us
NVM Write Latency p9999 : 2087190.00 us
NVM Write Latency p99999 : 2087190.00 us
NVM Write Latency p999999 : 2087190.00 us
NVM Write Latency p100 : 0.00 us
NVM bytes written (physical) : 1.50 GB
NVM bytes written (logical) : 1.35 GB
NVM bytes written (nand) : 0.00 GB
NVM app write amplification : 1.12
NVM dev write amplification : 0.00
NVM Gets : 225,903, Coalesced : 0.00%
NVM Puts : 351,932, Success : 99.63%, Clean : 0.00%, AbortsFromDel : 1,311, AbortsFromGet : 0
NVM Evicts : 279,697, Clean : 0.00%, Unclean : 57, Double : 0
NVM Deletes : 376,019 Skipped Deletes: 83.57%
== Hit Ratio Stats Since Last ==
Cache Gets : 28,905
Hit Ratio : 51.11%
RAM Hit Ratio : 50.62%
NVM Hit Ratio : 0.99%
== Throughput Stats ==
Total Ops : 0.11 million
Total sets: 462,205
get : 31/s, success : 100.00%
set : 264/s, success : 97.50%
del : 0/s, found : 0.00%
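
The three hit ratios in each stats section above appear to be related in a simple way. The sketch below assumes that "NVM Hit Ratio" is measured only over lookups that missed in RAM; that is an assumption about how cachebench reports the number, although the figures above are consistent with it.

```python
def overall_hit_ratio(ram_hit_ratio, nvm_hit_ratio):
    """Combine RAM and NVM hit ratios, with NVM consulted only on RAM misses."""
    return ram_hit_ratio + (1.0 - ram_hit_ratio) * nvm_hit_ratio


# Cumulative stats: RAM 35.92%, NVM 15.54%, reported overall 45.88%.
cumulative = overall_hit_ratio(0.3592, 0.1554)

# "Since Last" stats: RAM 50.62%, NVM 0.99%, reported overall 51.11%.
recent = overall_hit_ratio(0.5062, 0.0099)

print(f"cumulative: {cumulative * 100:.2f}%")
print(f"since last: {recent * 100:.2f}%")
```

Read this way, most of the recent hits come from RAM, and NVM contributes under one percentage point in the latest interval.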

Memory usage is 2.8GB, with 1.7G resident. The overhead from the buckets and the Navy entries seems too low to account for the memory usage, which seems to be increasing and is off by almost 1GB. I will try to look into it.

pbhandar2 commented on April 21, 2024

I forgot to ask: what does stack allocation mode do?

Also, why does the set success rate drop under high pressure? I have a workload where it has gone down as much as 54%.

== Throughput Stats ==
Total Ops : 9.82 million
Total sets: 33,731,387
get : 614/s, success : 100.00%
set : 2,769/s, success : 40.59%
del : 0/s, found : 0.00%

therealgymmy commented on April 21, 2024

@pbhandar2

what does stack allocation mode do?

This mode only applies to BlockCache (Navy's large item cache). It writes items of different sizes into the same region (a region is the granularity at which we write and evict in BlockCache). This is as opposed to size class mode, where only items of the same size class are written into the same region.
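
To make the difference concrete, here is a toy model of the two modes. It is not BlockCache's implementation: the region size, the item sizes, and the size classes are all made-up values, and the functions stack_alloc and size_class_alloc are hypothetical. It only shows which items end up sharing a region.

```python
from collections import defaultdict

REGION_SIZE = 16 * 1024  # hypothetical region size for this toy model


def stack_alloc(item_sizes, region_size=REGION_SIZE):
    """Pack items back to back, opening a new region when one fills up."""
    regions = [[]]
    used = 0
    for size in item_sizes:
        if used + size > region_size:
            regions.append([])
            used = 0
        regions[-1].append(size)
        used += size
    return regions


def size_class_alloc(item_sizes, size_classes, region_size=REGION_SIZE):
    """Group items by the smallest class that fits; each class has its own regions."""
    per_class = defaultdict(list)
    for size in item_sizes:
        # Raises ValueError if no class is large enough; fine for a toy model.
        slot = min(c for c in size_classes if c >= size)
        per_class[slot].append(size)
    layout = {}
    for slot, sizes in per_class.items():
        slots_per_region = region_size // slot
        layout[slot] = [
            sizes[i:i + slots_per_region]
            for i in range(0, len(sizes), slots_per_region)
        ]
    return layout


items = [1000, 4135, 2000, 4135, 500, 3000]
print(stack_alloc(items))                         # one region, mixed sizes
print(size_class_alloc(items, [1024, 2048, 4608]))  # one region per class
```

In this toy run, stack allocation fits all six items into a single region, while size class mode spreads them over three regions and rounds each item up to its class size.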

It is recommended that users use stack alloc + in-mem buffers, as we will be deprecating the other write modes in the near future.

why does the set success rate drop under high pressure?

This indicates allocation failures when trying to insert into the RamCache.

Can you describe your workload and cache setup in more detail? Do the object sizes span a large range? What is the ram-cache size? One possible scenario is that the ram-cache is too small and the object sizes span a vast range. This could lead to a lot of contention on allocation classes that have little memory (and thus alloc failures).

sathyaphoenix commented on April 21, 2024

@pbhandar2 marking this as resolved. Please open a new discussion or issue if there are any more unaddressed issues.
