Comments (7)
We noticed that on certain OSes the acquiring thread has to yield so that another thread gets a chance to release the lock. We made it work everywhere by transforming this loop (and similar ones): we added a maximum iteration count to the spin, followed by a second loop that yields.
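For readers following along, the two-loop transformation described above might look roughly like the sketch below. It is not the actual patch: the names `spin_yield_lock_t`, `SPIN_MAX_ITERATIONS` and the helper functions are hypothetical, it uses C11 atomics rather than rpmalloc's own atomic wrappers, and it assumes a POSIX `sched_yield()` (other platforms would need their own yield call).

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical bound on the busy spin; the real value would need tuning. */
#define SPIN_MAX_ITERATIONS 1000

/* Hypothetical lock type: state is 0 when free, 1 when held. */
typedef struct {
    atomic_int state;
} spin_yield_lock_t;

/* Single attempt to take the lock; returns nonzero on success. */
static int spin_yield_try_lock(spin_yield_lock_t *lock) {
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void spin_yield_lock_acquire(spin_yield_lock_t *lock) {
    /* First loop: bounded busy spin, cheap when the lock is held briefly. */
    for (int i = 0; i < SPIN_MAX_ITERATIONS; ++i) {
        if (spin_yield_try_lock(lock))
            return;
    }
    /* Second loop: give up the CPU between attempts so that the thread
       holding the lock gets a chance to run and release it. */
    while (!spin_yield_try_lock(lock))
        sched_yield();
}

static void spin_yield_lock_release(spin_yield_lock_t *lock) {
    assert(atomic_load_explicit(&lock->state, memory_order_relaxed) == 1);
    atomic_store_explicit(&lock->state, 0, memory_order_release);
}
```

Note that `sched_yield()` only asks the scheduler to run something else; whether the lock owner is the thread that gets picked depends on the platform's scheduling policy.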
from rpmalloc.
I will look into this tonight, but yeah, doing a busy spin and falling back to a yield sounds like a reasonable option.
I have confirmed that it is as you suspected, @ak-mdufour. I was able to reproduce the deadlock and, in the debugger, freeze the thread that was spinning (didn't know I could do that till now!). That high priority thread was blocking the lower priority thread from releasing the lock, because core affinity prevented the lower priority thread from roaming to another core. With the high pri thread frozen, the deadlock broke.

Now I just have to find a reliable way to have the scheduler yield to lower priority threads (for all platforms). I tried a 1us sleep, but that doesn't seem to be enough time for the lower pri thread to get scheduled (at least on Switch). I could also experiment with adjusting priorities to counter the inversion: bump the thread priority up when the lock is acquired and down when spinning...

For the record, I changed the deferred free list to use an explicit atomic 32-bit lock (instead of using the pointer as the lock) and then changed all the spin locks to use the following function:
    static inline void
    _rpmalloc_yield(void) {
    #if PLATFORM_PLAYSTATION || PLATFORM_SWITCH || PLATFORM_POSIX
        // Need to sleep for a period of time to allow other lower priority
        // threads a chance to run. I don't like this arbitrary time value;
        // ideally we should find a more deterministic way to schedule the
        // lower priority thread.
        usleep(1);
    #elif PLATFORM_WINDOWS
        // Windows will let lower priority threads run for the remainder
        // of this thread's time slice when Sleep(0) is used.
        Sleep(0);
    #else
    #  error "Platform not supported."
    #endif
    }

    static inline void
    _rpmalloc_acquire_lock(atomic32_t* lock) {
        // NOTE: (manderson) We could get deadlocked if a thread with a higher
        // priority preempts a lower priority thread that is holding the lock and
        // has a core affinity mask limited to the same core. This should prevent
        // that case by attempting to acquire the lock and periodically yielding
        // to give the lower priority thread a chance to release the lock.
        int64_t const attempts_per_yield = 1000;
        int64_t yields = 0;
        int64_t attempts = 1;
        while (!atomic_cas32_acquire(lock, 1, 0)) {
            _rpmalloc_spin();
            // Yield once every attempts_per_yield failed attempts.
            if ((attempts % attempts_per_yield) == 0) {
                yields++;
                _rpmalloc_yield();
            }
            attempts++;
        }
    #if ENABLE_STATISTICS
        atomic_add64(&_lock_calls, 1);
        atomic_add64(&_lock_attempts, attempts);
        atomic_add64(&_lock_yields, yields);
        // The load/store pair is not atomic as a whole, so the peak is only
        // approximate under contention.
        if (attempts > atomic_load64(&_peak_lock_attempts)) {
            atomic_store64(&_peak_lock_attempts, attempts);
        }
    #endif
    }
Yeah, I can see the issue with thread priority and core affinity ... will have a think about how this could be generalized, I like the approach with spin-then-yield.
(Also, careful about talking about certain platforms and their specific internals, you don't want to be breaking any NDAs here - which is also why a solution in the main repo would have to be platform agnostic)
Sadly yielding offers no strong guarantee that the owning thread will be scheduled; a futex-based solution may be more robust, for platforms that offer it.
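To make the futex idea concrete, here is a minimal Linux-only sketch of a lock whose waiters sleep in the kernel instead of spinning. It is an illustration under assumptions, not a proposed patch: `futex_lock_t`, `futex_call` and the acquire/release functions are hypothetical names, the three-state scheme (0 free, 1 held, 2 held with possible waiters) is the well-known simple futex mutex design rather than anything from rpmalloc, and it relies on the raw `SYS_futex` syscall, which other platforms do not provide.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock type.
   state: 0 = free, 1 = held, 2 = held and there may be sleeping waiters. */
typedef struct {
    _Atomic uint32_t state;
} futex_lock_t;

/* Thin wrapper around the raw futex syscall (glibc has no futex() function).
   The cast assumes _Atomic uint32_t has the same layout as uint32_t, which
   holds on Linux targets. */
static long futex_call(_Atomic uint32_t *addr, int op, uint32_t val) {
    return syscall(SYS_futex, (uint32_t *)addr, op, val, NULL, NULL, 0);
}

static void futex_lock_acquire(futex_lock_t *lock) {
    uint32_t expected = 0;
    /* Fast path: uncontended 0 -> 1 transition, no syscall. */
    if (atomic_compare_exchange_strong_explicit(
            &lock->state, &expected, 1,
            memory_order_acquire, memory_order_relaxed))
        return;
    /* Slow path: mark the lock as contended and sleep until it is released.
       FUTEX_WAIT returns immediately if the state is no longer 2. */
    while (atomic_exchange_explicit(&lock->state, 2,
                                    memory_order_acquire) != 0)
        futex_call(&lock->state, FUTEX_WAIT_PRIVATE, 2);
}

static void futex_lock_release(futex_lock_t *lock) {
    assert(atomic_load_explicit(&lock->state, memory_order_relaxed) != 0);
    /* Only pay for the wake syscall if a waiter may be sleeping. */
    if (atomic_exchange_explicit(&lock->state, 0,
                                 memory_order_release) == 2)
        futex_call(&lock->state, FUTEX_WAKE_PRIVATE, 1);
}
```

Because a blocked waiter no longer consumes its core, the lower priority owner can be scheduled there even with a restrictive affinity mask. Linux also offers priority-inheriting futex operations (`FUTEX_LOCK_PI`), which this sketch does not use.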
I ended up using a mutex. I didn't like the idea that spinning threads could end up taking longer to acquire the lock because they back off for an arbitrary time until the blocked thread gets an opportunity to release it. The mutex is per heap, so that still gives decent granularity. With that in place I haven't had any other problems. I also added some code to periodically cache each heap's deferred spans and trim the cache to tighten up memory usage. @mjansson, would you be interested in reviewing my changes? I could PM you.
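As a rough picture of the per-heap mutex approach, the sketch below guards a heap's deferred free list with a `pthread_mutex_t`. The actual changes were not posted in this thread, so everything here is an assumption made for illustration: the types `heap_sketch_t` and `deferred_span_t`, the function names, and the use of POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a span waiting on a deferred free list. */
typedef struct deferred_span {
    struct deferred_span *next;
} deferred_span_t;

/* Hypothetical heap: one mutex per heap guards that heap's deferred list. */
typedef struct {
    pthread_mutex_t lock;
    deferred_span_t *deferred;
} heap_sketch_t;

/* Returns 0 on success, like pthread_mutex_init. */
static int heap_sketch_init(heap_sketch_t *heap) {
    heap->deferred = NULL;
    return pthread_mutex_init(&heap->lock, NULL);
}

/* Called by a thread that frees a span owned by another thread's heap. */
static void heap_sketch_defer_free(heap_sketch_t *heap, deferred_span_t *span) {
    pthread_mutex_lock(&heap->lock);
    span->next = heap->deferred;
    heap->deferred = span;
    pthread_mutex_unlock(&heap->lock);
}

/* Called by the owning thread: detach the whole list under the lock and
   process it afterwards, so the mutex is held only briefly. */
static deferred_span_t *heap_sketch_take_deferred(heap_sketch_t *heap) {
    pthread_mutex_lock(&heap->lock);
    deferred_span_t *list = heap->deferred;
    heap->deferred = NULL;
    pthread_mutex_unlock(&heap->lock);
    return list;
}
```

A blocked `pthread_mutex_lock` call sleeps rather than spins, so a low priority owner is not starved by the waiter. Where the platform supports it, a mutex attribute configured with `pthread_mutexattr_setprotocol` and `PTHREAD_PRIO_INHERIT` would additionally boost the owner while a higher priority thread waits.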
Would love to. Whatever method works - join the discord at https://discord.gg/njzRV5Q9 or drop me an email at [email protected]