kuszmaul / supermalloc

A Super Fast Multithreaded malloc() for 64-bit Machines

License: Other

Emacs Lisp 0.01% C++ 12.52% C 4.11% Makefile 1.65% TeX 77.39% Gnuplot 4.26% Perl 0.05% Shell 0.02%

supermalloc's People

Contributors

kuszmaul, willtor

supermalloc's Issues

sysbench crashes under supermalloc

On 12/15/14:

Reported by prohaska:

crash @
#0 0x00007ffff7413504 in try_put_cached (obj=0x8000c4000980, co=0x7ffff7f6cac0, size=8, cache_size=8192) at ../src/cache.cc:557
#1 0x00007ffff741358f in try_put_cached_both (obj=0x8000c4000980, cb=0x7ffff7f6cac0, size=8, cache_size=8192) at ../src/cache.cc:578
#2 0x00007ffff7413d99 in cached_free (ptr=0x8000c4000980, bin=0) at ../src/cache.cc:768
#3 0x00007ffff740b64b in free (p=0x7fffc4000980) at ../src/malloc.cc:257

how to reproduce:

step 1: build a debug environment

mkdir percona-supermalloc

cd percona-supermalloc

bash -x ~/make.bash

step 2: run mysqld with supermalloc and tokudb

cd install

gdb bin/mysqld

gdb> set environment LD_PRELOAD=SOME_PATH/libsupermalloc_pthread.so

gdb> run --defaults-file=~/rfp.cnf --gdb --plugin-load=tokudb=ha_tokudb.so

step 3: make sure that supermalloc is loaded into mysqld

step 4: get ready for sysbench

mysql -S/tmp/rfp.sock -uroot

mysql> show engines; # make sure that tokudb is running

mysql> grant all on *.* to rfp@localhost;

mysql> create database sbtest;

step 5: use sysbench to prepare the tables

sysbench --test=$HOME/launchpad/sysbench/sysbench/tests/db/oltp.lua --mysql-user=rfp --mysql-table-engine=tokudb --oltp-tables-count=8 --oltp-table-size=1000000 --mysql-socket=/tmp/rfp.sock prepare

This should create 8 tables with 1M rows each. It fails on my machine after a couple of tables are built.

need non-overwrite tests

Need non-overwrite tests in which objects of various sizes are allocated and freed, and we make sure that no object overwrites another.

I propose to do the following repeatedly

  • Allocate an object (of various sizes). Call it p.
  • Fill the ith byte of p with h(p,i) where h is a hash function.

Later, when we free an object, check its hash values.
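A minimal sketch of the proposed check, assuming a simple mixing function for h(p,i). The names hash_byte, alloc_and_fill, check_fill and check_and_free are hypothetical and are not part of SuperMalloc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical h(p, i): mixes the object's address with the byte index. */
unsigned char hash_byte(const void *p, size_t i) {
    uint64_t x = (uint64_t)(uintptr_t)p + (uint64_t)i * 0x9E3779B97F4A7C15ULL;
    x ^= x >> 33;
    x *= 0xFF51AFD7ED558CCDULL;
    x ^= x >> 33;
    return (unsigned char)x;
}

/* Allocate `size` bytes and fill byte i with h(p, i). */
void *alloc_and_fill(size_t size) {
    unsigned char *p = malloc(size);
    if (p == NULL) return NULL;
    for (size_t i = 0; i < size; i++) p[i] = hash_byte(p, i);
    return p;
}

/* Return 1 if every byte still equals h(p, i), 0 if something overwrote it. */
int check_fill(const void *p, size_t size) {
    const unsigned char *bytes = p;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != hash_byte(p, i)) return 0;
    }
    return 1;
}

/* Verify the pattern, then free the object. Returns the check result. */
int check_and_free(void *p, size_t size) {
    int ok = check_fill(p, size);
    free(p);
    return ok;
}
```

Because the hash depends on the address, a neighbouring object that spills into p writes the wrong pattern and the check fails. A real test would keep many objects of mixed sizes alive at once and free them in random order.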

better stats

We would like the following statistics

  • How much rss has been given to the user (assuming the user touches all of what she has requested). This is not the sum of the sizes of the allocations, since, for example, we round up to the next size bin. This is not a multiple of the page size either, since we have unallocated objects not counted here.
  • How much rss is currently used for unallocated objects on pages shared with allocated objects.
  • How much rss is currently used for supermalloc data structures. If you sum this with the previous two you should get the total rss that supermalloc thinks is in use.
  • How much virtual memory is currently allocated by supermalloc. (This only grows)

make check fails on multiple Linux boxes

Just tried to build it on two separate x86 boxes. Testing with LD_PRELOAD also fails.

~/SuperMalloc-master/release> make check
SUPERMALLOC_THREADCACHE=1 ./../release/aligned_alloc
Didn't get error on alignment 0
Aborted (core dumped)
make: *** [../release/aligned_alloc.check] Error 134
~/SuperMalloc-master/release> uname -a
Linux codemonkey 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

~/SuperMalloc/release$ make check
SUPERMALLOC_THREADCACHE=1 ./../release/aligned_alloc
Illegal instruction
../Makefile.include:57: recipe for target '../release/aligned_alloc.check' failed
make: *** [../release/aligned_alloc.check] Error 132
~/SuperMalloc/release$ 
~/SuperMalloc/release$ uname -a
Linux tesla 3.14-2-amd64 #1 SMP Debian 3.14.15-2 (2014-08-09) x86_64 GNU/Linux
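The first failure says the test expected an error for alignment 0 and did not get one. As a rough illustration of that expectation (not the SuperMalloc code), here is a wrapper that rejects any alignment that is not a nonzero power of two. The names is_power_of_two and checked_aligned_alloc are made up for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* An alignment is accepted here only if it is a nonzero power of two. */
int is_power_of_two(size_t alignment) {
    return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

/* Hypothetical wrapper: fail with EINVAL instead of passing a bad alignment on. */
void *checked_aligned_alloc(size_t alignment, size_t size) {
    if (!is_power_of_two(alignment)) {
        errno = EINVAL;
        return NULL;
    }
    return aligned_alloc(alignment, size);
}
```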

posix_memalign returns NULL when size == 0

Prohaska wrote:

This is different than jemalloc, which maps size to 1 when size == 0 and always returns a non NULL pointer

Bradley wrote:

The SuperMalloc behavior appears to be correct. (So is jemalloc.) From man posix_memalign

The function posix_memalign() allocates size bytes and places the
address of the allocated memory in *memptr. The address of the
allocated memory will be a multiple of alignment, which must be a power of
two and a multiple of sizeof(void *). If size is 0, then
posix_memalign() returns either NULL, or a unique pointer value that can later be
successfully passed to free(3).

Rich writes

Yes, I read the man page and agree that your implementation is correct. However, it is different than libc malloc and jemalloc. This causes ft-index to fail since apparently it assumes behavior that is different than supermalloc. This may also cause other programs to fail.

Bradley wrote

I'm going to close this issue as "not a bug" for now. This bug may come back and bite me later...
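For callers such as ft-index that assume a non-NULL result, one caller-side workaround is to apply the mapping Prohaska describes for jemalloc (size 0 becomes size 1) before calling posix_memalign. This is only a sketch; memalign_zero_as_one is a hypothetical helper, not an API of any of the allocators mentioned.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: treat size 0 as size 1, so a NULL result always
 * means failure regardless of which allocator is loaded. */
void *memalign_zero_as_one(size_t alignment, size_t size) {
    void *p = NULL;
    if (size == 0) size = 1;
    if (posix_memalign(&p, alignment, size) != 0) return NULL;
    return p;
}
```

With this wrapper the program no longer depends on which of the two behaviours permitted by the man page the allocator chose.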

See github.mit.edu closed on 11/25/14

Error in Build!

$ cd release

release$ make

set -e; rm -f ../release/env.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/env.cc -MG -MF ../release/env.d.$$; \
              sed 's,\(env\)\.o[ :]*,../release/\1.o ../release/env.d : ,g' < ../release/env.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/env.d; \
              rm -f ../release/env.d.$$
set -e; rm -f ../release/has_tsx.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/has_tsx.cc -MG -MF ../release/has_tsx.d.$$; \
              sed 's,\(has_tsx\)\.o[ :]*,../release/\1.o ../release/has_tsx.d : ,g' < ../release/has_tsx.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/has_tsx.d; \
              rm -f ../release/has_tsx.d.$$
set -e; rm -f ../release/futex_mutex.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/futex_mutex.cc -MG -MF ../release/futex_mutex.d.$$; \
              sed 's,\(futex_mutex\)\.o[ :]*,../release/\1.o ../release/futex_mutex.d : ,g' < ../release/futex_mutex.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/futex_mutex.d; \
              rm -f ../release/futex_mutex.d.$$
set -e; rm -f ../release/stats.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/stats.cc -MG -MF ../release/stats.d.$$; \
              sed 's,\(stats\)\.o[ :]*,../release/\1.o ../release/stats.d : ,g' < ../release/stats.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/stats.d; \
              rm -f ../release/stats.d.$$
set -e; rm -f ../release/footprint.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/footprint.cc -MG -MF ../release/footprint.d.$$; \
              sed 's,\(footprint\)\.o[ :]*,../release/\1.o ../release/footprint.d : ,g' < ../release/footprint.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/footprint.d; \
              rm -f ../release/footprint.d.$$
set -e; rm -f ../release/bassert.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/bassert.cc -MG -MF ../release/bassert.d.$$; \
              sed 's,\(bassert\)\.o[ :]*,../release/\1.o ../release/bassert.d : ,g' < ../release/bassert.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/bassert.d; \
              rm -f ../release/bassert.d.$$
set -e; rm -f ../release/cache.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/cache.cc -MG -MF ../release/cache.d.$$; \
              sed 's,\(cache\)\.o[ :]*,../release/\1.o ../release/cache.d : ,g' < ../release/cache.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/cache.d; \
              rm -f ../release/cache.d.$$
set -e; rm -f ../release/small_malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/small_malloc.cc -MG -MF ../release/small_malloc.d.$$; \
              sed 's,\(small_malloc\)\.o[ :]*,../release/\1.o ../release/small_malloc.d : ,g' < ../release/small_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/small_malloc.d; \
              rm -f ../release/small_malloc.d.$$
set -e; rm -f ../release/large_malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/large_malloc.cc -MG -MF ../release/large_malloc.d.$$; \
              sed 's,\(large_malloc\)\.o[ :]*,../release/\1.o ../release/large_malloc.d : ,g' < ../release/large_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/large_malloc.d; \
              rm -f ../release/large_malloc.d.$$
set -e; rm -f ../release/huge_malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/huge_malloc.cc -MG -MF ../release/huge_malloc.d.$$; \
              sed 's,\(huge_malloc\)\.o[ :]*,../release/\1.o ../release/huge_malloc.d : ,g' < ../release/huge_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/huge_malloc.d; \
              rm -f ../release/huge_malloc.d.$$
set -e; rm -f ../release/rng.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/rng.cc -MG -MF ../release/rng.d.$$; \
              sed 's,\(rng\)\.o[ :]*,../release/\1.o ../release/rng.d : ,g' < ../release/rng.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/rng.d; \
              rm -f ../release/rng.d.$$
set -e; rm -f ../release/makechunk.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/makechunk.cc -MG -MF ../release/makechunk.d.$$; \
              sed 's,\(makechunk\)\.o[ :]*,../release/\1.o ../release/makechunk.d : ,g' < ../release/makechunk.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/makechunk.d; \
              rm -f ../release/makechunk.d.$$
set -e; rm -f ../release/malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/malloc.cc -MG -MF ../release/malloc.d.$$; \
              sed 's,\(malloc\)\.o[ :]*,../release/\1.o ../release/malloc.d : ,g' < ../release/malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/malloc.d; \
              rm -f ../release/malloc.d.$$
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -I../src  -c ../src/bassert.cc -o ../release/bassert.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11  ../src/objsizes.cc ../release/bassert.o -o ../release/objsizes
./../release/objsizes  ../release/generated_constants.cc >  ../release/generated_constants.h
set -e; rm -f ../release/cache.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/cache.cc -MG -MF ../release/cache.d.$$; \
              sed 's,\(cache\)\.o[ :]*,../release/\1.o ../release/cache.d : ,g' < ../release/cache.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/cache.d; \
              rm -f ../release/cache.d.$$
set -e; rm -f ../release/small_malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/small_malloc.cc -MG -MF ../release/small_malloc.d.$$; \
              sed 's,\(small_malloc\)\.o[ :]*,../release/\1.o ../release/small_malloc.d : ,g' < ../release/small_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/small_malloc.d; \
              rm -f ../release/small_malloc.d.$$
set -e; rm -f ../release/large_malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/large_malloc.cc -MG -MF ../release/large_malloc.d.$$; \
              sed 's,\(large_malloc\)\.o[ :]*,../release/\1.o ../release/large_malloc.d : ,g' < ../release/large_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/large_malloc.d; \
              rm -f ../release/large_malloc.d.$$
set -e; rm -f ../release/huge_malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/huge_malloc.cc -MG -MF ../release/huge_malloc.d.$$; \
              sed 's,\(huge_malloc\)\.o[ :]*,../release/\1.o ../release/huge_malloc.d : ,g' < ../release/huge_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/huge_malloc.d; \
              rm -f ../release/huge_malloc.d.$$
set -e; rm -f ../release/makechunk.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/makechunk.cc -MG -MF ../release/makechunk.d.$$; \
              sed 's,\(makechunk\)\.o[ :]*,../release/\1.o ../release/makechunk.d : ,g' < ../release/makechunk.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/makechunk.d; \
              rm -f ../release/makechunk.d.$$
set -e; rm -f ../release/malloc.d; \
              g++ -MM -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/malloc.cc -MG -MF ../release/malloc.d.$$; \
              sed 's,\(malloc\)\.o[ :]*,../release/\1.o ../release/malloc.d : ,g' < ../release/malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/malloc.d; \
              rm -f ../release/malloc.d.$$
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/malloc.cc -o ../release/malloc.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/makechunk.cc -o ../release/makechunk.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/rng.cc -o ../release/rng.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/huge_malloc.cc -o ../release/huge_malloc.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/large_malloc.cc -o ../release/large_malloc.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/small_malloc.cc -o ../release/small_malloc.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/cache.cc -o ../release/cache.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/footprint.cc -o ../release/footprint.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/stats.cc -o ../release/stats.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/futex_mutex.cc -o ../release/futex_mutex.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src   -c -o ../release/generated_constants.o ../release/generated_constants.cc
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/has_tsx.cc -o ../release/has_tsx.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/env.cc -o ../release/env.o
mkdir -p ../release/lib
gcc-ar cr ../release/lib/supermalloc.a  ../release/malloc.o  ../release/makechunk.o  ../release/rng.o  ../release/huge_malloc.o  ../release/large_malloc.o  ../release/small_malloc.o  ../release/cache.o  ../release/bassert.o  ../release/footprint.o  ../release/stats.o  ../release/futex_mutex.o  ../release/generated_constants.o  ../release/has_tsx.o  ../release/env.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11   ../release/malloc.o  ../release/makechunk.o  ../release/rng.o  ../release/huge_malloc.o  ../release/large_malloc.o  ../release/small_malloc.o  ../release/cache.o  ../release/bassert.o  ../release/footprint.o  ../release/stats.o  ../release/futex_mutex.o  ../release/generated_constants.o  ../release/has_tsx.o  ../release/env.o -shared -ldl -o ../release/lib/libsupermalloc.so
cc -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c11    -I../release   -I../src  -c ../tests/aligned_alloc.c -o ../release/aligned_alloc.o
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11  ../release/aligned_alloc.o -ldl -L../release/lib -Wl,-rpath,../release/lib -ldl ../release/lib/libsupermalloc.so -o ../release/aligned_alloc
../tests/aligned_alloc.c: In function ‘main’:
../tests/aligned_alloc.c:59:4: error: argument 2 value ‘9223372036854775808’ exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
  p = aligned_alloc(alignment, size);
    ^
/usr/include/stdlib.h:468:14: note: in a call to allocation function ‘aligned_alloc’ declared here
 extern void *aligned_alloc (size_t __alignment, size_t __size)
              ^
../tests/aligned_alloc.c:74:4: error: argument 2 value ‘9511602413006487553’ exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
  p = aligned_alloc(alignment, size);
    ^
/usr/include/stdlib.h:468:14: note: in a call to allocation function ‘aligned_alloc’ declared here
 extern void *aligned_alloc (size_t __alignment, size_t __size)
              ^
../tests/aligned_alloc.c:87:4: error: argument 2 value ‘18446744073709551600’ exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
  p = aligned_alloc(alignment, size);
    ^
/usr/include/stdlib.h:468:14: note: in a call to allocation function ‘aligned_alloc’ declared here
 extern void *aligned_alloc (size_t __alignment, size_t __size)
              ^
lto1: all warnings being treated as errors
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
../Makefile.include:122: recipe for target '../release/aligned_alloc' failed
make: *** [../release/aligned_alloc] Error 1
rm ../release/aligned_alloc.o
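The three errors come from -Werror=alloc-size-larger-than: the test passes oversized constants to aligned_alloc on purpose, and the compiler can see them. One possible workaround, sketched here under the assumption that the test only needs the call to happen at run time, is to route the size through a volatile variable so the compiler cannot treat it as a known constant. aligned_alloc_opaque is a hypothetical name and is not in the test suite.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: the volatile read hides the value of `size` from
 * the compiler's constant analysis, so no size diagnostic can be issued. */
void *aligned_alloc_opaque(size_t alignment, size_t size) {
    volatile size_t opaque_size = size;
    return aligned_alloc(alignment, opaque_size);
}
```

The allocator still sees the oversized request at run time and is expected to fail it, which is what the test is trying to exercise.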

lto1 error when running make

I tried to install SuperMalloc on my computer (Ubuntu-15.10).

When my compiler (g++-4.9.3) ran the following command,
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm -std=c++11 ../release/aligned_alloc.o -ldl -L../release/lib -Wl,-rpath,../release/lib -ldl ../release/lib/libsupermalloc.so -o ../release/aligned_alloc
it reported this error:
lto1: fatal error: bytecode stream generated with LTO version 4.0 instead of the expected 3.0

Can anyone help me deal with this problem?

gap near 128 bytes and 256 bytes

There's a gap near 128 for unaligned malloc. We don't use 128 since it's a power of two, so if you allocate 113 bytes, you get 160 bytes, which is 29.4% internal fragmentation.

If we use the 128-byte size, then we'll get reduced associativity.

One solution is to add another object size (say at 136). Doing that will change the calculation between sizes and bins (which currently works up to 320 using bit hacking).

There's another gap at 256 (since we don't like 256) which we could fix by adding a size at 272.
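The arithmetic behind the 29.4% figure, as a small sketch (wasted_fraction is just an illustrative name): 160 - 113 = 47 wasted bytes, and 47/160 = 0.29375. With the proposed 136-byte size, the same 113-byte request would waste 23/136, about 16.9%.

```c
#include <stddef.h>

/* Fraction of a bin that is wasted when `requested` bytes are served
 * from a bin of `bin_size` bytes. */
double wasted_fraction(size_t requested, size_t bin_size) {
    return (double)(bin_size - requested) / (double)bin_size;
}
```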

make check results in "Aborted (core dumped)"

Environment:
Ubuntu 15.10

uname -r
4.2.0-22-generic

uname -p
x86_64

gcc --version
gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010

Steps to reproduce:

git clone https://github.com/kuszmaul/SuperMalloc.git
cd SuperMalloc/release
make
make check

SUPERMALLOC_THREADCACHE=1 ./../release/aligned_alloc
Didn't get error on alignment 0
Aborted (core dumped)
../Makefile.include:59: recipe for target '../release/aligned_alloc.check' failed
make: *** [../release/aligned_alloc.check] Error 134

try putting the cached_objects and the lock on the same cache line

It might be faster in the locking implementation, but it might slow down the HTM implementation. In fact it might slow down the locking implementation since we spin on the lock. Spinning on the lock will then cause the cache line that we are trying to modify to get upset. Probably a bad idea...
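To make the two options concrete, here is a sketch of both layouts, assuming a 64-byte cache line. The struct and field names are invented for illustration and do not match the SuperMalloc sources.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache-line size */

/* Layout A: the lock and the cached-object pointer share one cache line. */
struct cache_shared_line {
    alignas(CACHE_LINE) int lock;
    void *cached_objects;
};

/* Layout B: the lock sits on its own line, so threads spinning on it
 * do not touch the line that holds the data being modified. */
struct cache_split_lines {
    alignas(CACHE_LINE) int lock;
    alignas(CACHE_LINE) void *cached_objects;
};
```

Layout A saves a cache miss on the uncontended path; layout B avoids the interference described above when other threads spin on the lock.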

huge aligned_malloc probably broken

Huge aligned_malloc alignments can end up returning values that are not quite aligned. We align the pointer up and return it. I doubt that the pointer can successfully be used in a call to free(), however, since we probably didn't fill in the chunk table for the returned value.

One way to fix this is to do the huge alignment, align the pointer, and then fill in the chunk table. We'll end up with useless stuff hanging off in front of and behind the aligned data. It is tempting to unmap that useless stuff, but Linux has a poor search algorithm if there are holes in the VM space. It's probably better to simply waste up to the factor of two.
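The "align the pointer up" step and the space wasted in front of it can be sketched as follows, assuming power-of-two alignments. align_up and leading_waste are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Round `addr` up to the next multiple of `alignment` (a power of two). */
uintptr_t align_up(uintptr_t addr, size_t alignment) {
    return (addr + alignment - 1) & ~((uintptr_t)alignment - 1);
}

/* Bytes left unused in front of the aligned pointer inside the raw mapping. */
size_t leading_waste(uintptr_t raw, size_t alignment) {
    return (size_t)(align_up(raw, alignment) - raw);
}
```

The leading waste is always less than the alignment, so over-allocating by the alignment is enough to guarantee an aligned block of the requested size. That over-allocation is what the waste bound above has to cover.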

Need to test error paths for when mmap() fails

We put in some nontrivial code. Need to make sure that all the error paths actually work.

Here's the call chain. We want to make sure that all these functions, at all their call sites, actually call their error-handling code.

mmap_size called by
chunk_create_slow (returns 0 on error)
mmap_chunk_aligned_block (returns 0)

chunk_create_slow called by
mmap_chunk_aligned_block (returns 0)

mmap_chunk_aligned_block called by
get_power_of_two_chunks (returns 0)
large_malloc (returns 0)
initialize_malloc (aborts on a problem)
small_malloc (returns 0)

get_power_of_two_chunks called by
huge_malloc (returns NULL)

large_malloc called by
cached_malloc (returns NULL)

initialize_malloc (aborts on problem)

small_malloc called by
cached_malloc (returns NULL)

huge_malloc called by
malloc (returns NULL)
aligned_malloc_internal (returns NULL)

cached_malloc called by
malloc (returns NULL)
aligned_malloc_internal (returns NULL)

aligned_malloc_internal called by
aligned_malloc (returns NULL)
posix_memalign (some fancy footwork on the error to return the right value)

cache-index-friendly

Make the object sizes not create associativity conflicts in the cache. See Dice's paper.
Closed 9/26/14 see github.mit.edu

provide a mode to give 8-byte alignment

Linux libc provides 8-byte-aligned data.

It would be good if there was a mode in which supermalloc provided 8-byte aligned data. Right now we have sizes such as 10, 12, 14, 20, and 28, which are not aligned.

Another useful mode would make any object of 16 bytes or more 16-byte aligned (and likewise for 32).

Another useful mode would be to make sure that values are not 16-byte aligned (for debugging codes that might use libc malloc in the future). I'm not sure this is a good idea, however.
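As a sketch of what an 8-byte mode would do to the sizes listed above, assuming objects are packed back to back from an aligned base so that rounding the size class is enough. round_up_to_multiple is an illustrative name.

```c
#include <stddef.h>

/* Round `size` up to the next multiple of `alignment`. */
size_t round_up_to_multiple(size_t size, size_t alignment) {
    return (size + alignment - 1) / alignment * alignment;
}
```

Under this rule the sizes 10, 12 and 14 all become 16, 20 becomes 24, and 28 becomes 32. In a 16-byte mode, 20 and 28 would both become 32.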

Fragmenting virtual memory

The cache-index test fails if we try to allocate more than a few terabytes of huge objects (each of which is only a few megabytes). I suspect it's some sort of memory fragmentation. Some simple experimentation indicates that we can mmap a total of almost 2^47 bytes.

supermalloc_pthread lib

Hi, Brad:
I am working on a non-transactional-memory version.
I got an error because the supermalloc_pthread library is not compiled; only the regular one is in the /lib folder.
I used the master branch, ran cd release, then make, then make check.
Any ideas on where I went wrong, please?

grab all free objects in a folio

When we go to the global data structure to get objects, we should go ahead and fetch all the free objects out of a folio, since it's the same amount of lock contention. We don't actually expect to get lots of free objects out of a folio on average for random malloc/free workloads, but there are workloads where this optimization helps, and I don't see how it can hurt.

run tokudb fractal tree tests.

git clone git@github.com:Tokutek/ft-index
mkdir ft-index-build
cd ft-index-build
cmake -DCMAKE_BUILD_TYPE=Debug ../ft-index
make -j8
cd ft/tests
LD_PRELOAD=libsupermalloc_pthread.so ./ft-test
This will hit the posix_memalign size == 0 bug (see #24).

use CPUID to find a cpu number?

CPUID is slower than a cached sched_getcpu() on my laptop:

[bradley@30-87-232 fragments]$ ./coreid 
./coreid 
cpuid0 = 0xd  vendor=GenuineIntel
coreid 0xb: regs=0x00000001 0x00000002 0x00000100 0x00000001
coreid 0x4: regs=0x1c004121 0x01c0003f 0x0000003f 0x00000000
 0 100000000 0 0 0 0 0 0
3.117ns/cpuid
 0 100000000 0 0 0 0 0 0
20.446ns/sched_getcpu()
 0 100000000 0 0 0 0 0 0
2.895ns/cached sched_getcpu()

Closed 11/23/14 see github.mit.edu

for small objects need bigger folios

For all the small objects up to 256, the folio size is one page.

That introduces associativity problems. For example for 96-byte objects (1.5 cache lines) we end up using only 43 of the cache lines because we reset the counter when we hit the end of a page. For 96-byte objects, we need 3-page folios to avoid this problem.

In general, we could take the LCM of the object size and the page size. That's pretty simple, but produces some good-sized folios. (For the newly proposed size in #27, it produces 17-page folios for size 136). That's probably OK.
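The LCM rule worked out as a sketch, assuming 4096-byte pages (gcd_size and folio_pages are illustrative names). Since lcm(s, P) = s * P / gcd(s, P), the folio length in pages is s / gcd(s, P).

```c
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
size_t gcd_size(size_t a, size_t b) {
    while (b != 0) {
        size_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Number of pages in a folio whose length is lcm(object_size, page_size). */
size_t folio_pages(size_t object_size, size_t page_size) {
    return object_size / gcd_size(object_size, page_size);
}
```

For 96-byte objects this gives 96 / 32 = 3 pages, and for 136-byte objects 136 / 8 = 17 pages, matching the numbers above. Power-of-two sizes up to the page size stay at one page.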

How to use

How do I link SuperMalloc to replace malloc()? Is this correct?

LD_PRELOAD=lib/SuperMalloc/release/lib/libsupermalloc_pthread.so ./build/hello_world

I'm using it this way and I'm not seeing any performance improvement in a benchmark of my software. How can I tell whether SuperMalloc is actually being used?
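One way to check from inside the process on Linux is to look for the library in /proc/self/maps. This is a generic sketch, not something SuperMalloc provides; library_is_mapped is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if some mapping in /proc/self/maps mentions `needle`,
 * 0 if none does, and -1 if the file cannot be opened. */
int library_is_mapped(const char *needle) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL) return -1;
    char line[4096];
    int found = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, needle) != NULL) {
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

Calling library_is_mapped("libsupermalloc") early in main should return 1 when the LD_PRELOAD took effect. If it returns 0, the path given to LD_PRELOAD is a likely suspect, since a relative path is resolved against the current directory.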

gap in primes

There are two good-sized gaps in the prime numbers: Between 7 and 11 cache lines we should use a 9-cacheline (non-prime) bin, and between 13 and 17, we should use a 15-cacheline (nonprime) bin.

Closed 11/23/14 see github.mit.edu

Unable to build on CentOS 6.5

Nothing in the documentation indicates what platforms are supported, or minimum requirements (which is an advantage of using autotools or cmake). What minimum version of G++, etc.?

Platform is running Linux 2.6.32-431.23.3.el6.x86_64, and the code fails to compile from doing a make from the release directory. The actual failure is the code in huge_malloc (huge_malloc.cc) as MADV_HUGEPAGE and MADV_NOHUGEPAGE don't exist. Since it explicitly sets one mode or the other, the assumption is the default on my system is no huge page support, so I commented out the entire block to get it to compile.

Now, I get further. But still no go. When compiling huge_malloc.cc:
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm -std=c++11 -I../release -I../src -c ../src/huge_malloc.cc -o ../release/huge_malloc.o

Gives the following error:
/tmp/ccX5VXF3.s: Assembler messages:
/tmp/ccX5VXF3.s:1614: Error: no such instruction: `xbegin .L37'
/tmp/ccX5VXF3.s:2237: Error: no such instruction: `xabort $9'
/tmp/ccX5VXF3.s:2267: Error: no such instruction: `xend'
/tmp/ccX5VXF3.s:2381: Error: no such instruction: `xbegin .L28'
/tmp/ccX5VXF3.s:2407: Error: no such instruction: `xabort $9'
make: *** [../release/huge_malloc.o] Error 1

If I comment out most of this file, we get a similar error building large_malloc.cc:
g++ -W -Wall -Werror -O3 -flto -ggdb -pthread -fPIC -mrtm -std=c++11 -I../release -I../src -c ../src/large_malloc.cc -o ../release/large_malloc.o
/tmp/ccYsHMMz.s: Assembler messages:
/tmp/ccYsHMMz.s:2681: Error: no such instruction: `xbegin .L74'
/tmp/ccYsHMMz.s:2838: Error: no such instruction: `xbegin .L66'
/tmp/ccYsHMMz.s:2869: Error: no such instruction: `xabort $9'
/tmp/ccYsHMMz.s:2881: Error: no such instruction: `xend'
make: *** [../release/large_malloc.o] Error 1

we aren't accounting for the size well for the last few large bins

It looks like we don't keep track of how many pages are actually allocated to the application in the last few "large" bins. For example, we have a bin of size 1044480 (255 pages) and one of size 520192 (127 pages). For a 128-page allocation, we don't keep track of the page count and end up thinking that there are 255 pages allocated.

We need some metadata about the actual allocated size for these intermediate sizes. Perhaps the bin numbers should go to page-count mode at 4 pages (and we should dealign the returned pointers, so if a user asks for 4 pages and it's not aligned, we allocate it in the 8-page bucket and remember that there are 5 pages actually allocated).
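The page counts in this example can be checked with a little arithmetic, assuming 4096-byte pages: 520192 bytes is 127 pages and 1044480 bytes is 255 pages, so a 128-page request falls into the larger bin and is over-counted by 127 pages. pages_in and overcounted_pages are illustrative names.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096  /* assumed page size */

/* Number of pages needed to hold `bytes` bytes. */
size_t pages_in(size_t bytes) {
    return (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/* Pages charged but never requested when `requested_pages` pages are
 * served from a bin of `bin_bytes` bytes. */
size_t overcounted_pages(size_t requested_pages, size_t bin_bytes) {
    return pages_in(bin_bytes) - requested_pages;
}
```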
