To be fixed
from zstd.
Hi Konstantin
That's a pretty good idea.
Actually, I was expecting the other way round, that is, some application being interested in the decompression code without the compression part.
That being said, both objectives imply the same capability.
I feel we should wait for v0.4 to look into this issue.
The reason is that the code structure will be changed in a way which will make this capability easier to create. It will probably not be enough, but it is at least a good step in the right direction, so it will be easier to study what remains to be done.
from zstd.
With the release of the 0.4.x series, it becomes possible to target this objective.
Zstd code is now more clearly separated between compression and decompression.
It is also possible to not generate legacy code (a logical complementary capability).
So, what's missing now is primarily compression / decompression separation within huff0 and fse.
There are mainly 2 ways to achieve this :
- Modify huff0 and fse, in order to separate compress and decompress functions
  - possible, will require some time, and increase number of files
- Make huff0 and fse integration static
  - expectation : unused static functions will simply not be generated (dead code elimination)
  - added benefit : currently public fse / huff0 symbols will no longer be present within ABI. Only zstd public symbols would remain.
  - requires a few tricky source code modifications.
The second option currently looks the more promising to me.
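To make the second option concrete, here is a minimal sketch of the idea, not the actual zstd code. All names (demo_entropy_encode, demo_entropy_decode, demo_compress, demo_decompress, DEMO_NO_DECOMPRESS) are hypothetical, and the "entropy coder" is just a plain copy. The point is only that helpers declared static are private to the file, so when the one public function using a helper is compiled out, the compiler is free to drop the helper as well.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the entropy coders; these are NOT the real
 * fse / huff0 functions. Being static, they are invisible outside this
 * translation unit and never show up among the exported symbols. */
static size_t demo_entropy_encode(unsigned char *dst, size_t dstCapacity,
                                  const unsigned char *src, size_t srcSize)
{
    if (srcSize > dstCapacity) return 0;   /* 0 signals "does not fit" */
    memcpy(dst, src, srcSize);             /* placeholder: plain copy */
    return srcSize;
}

static size_t demo_entropy_decode(unsigned char *dst, size_t dstCapacity,
                                  const unsigned char *src, size_t srcSize)
{
    if (srcSize > dstCapacity) return 0;   /* 0 signals "does not fit" */
    memcpy(dst, src, srcSize);             /* placeholder: plain copy */
    return srcSize;
}

/* Public entry points: the only symbols other files can link against. */
size_t demo_compress(void *dst, size_t dstCapacity,
                     const void *src, size_t srcSize)
{
    return demo_entropy_encode((unsigned char *)dst, dstCapacity,
                               (const unsigned char *)src, srcSize);
}

#ifndef DEMO_NO_DECOMPRESS   /* hypothetical build switch */
size_t demo_decompress(void *dst, size_t dstCapacity,
                       const void *src, size_t srcSize)
{
    return demo_entropy_decode((unsigned char *)dst, dstCapacity,
                               (const unsigned char *)src, srcSize);
}
#endif
```

Building this file with -DDEMO_NO_DECOMPRESS leaves demo_entropy_decode unreferenced; an optimizing compiler will typically omit it from the object file (and may warn that it is unused).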
from zstd.
+1 for static
from zstd.
There is also a need to define what kind of objective to target.
Presuming it's possible to compile for only compression or only decompression, what should be the result ?
- A static library ? (for example, zstd_compress.a)
- A list of files and parameters ?
Speaking of static library : how does it work today ?
Presuming libzstd.a is compiled and generated, what happens when linking a program which only uses compression or decompression code from this library ? Does the linker keep unused code out of the resulting binary, achieving the objective ?
From https://en.wikipedia.org/wiki/Static_library :
With static linking, it is enough to include those parts of the library that are directly and indirectly referenced by the target executable
from zstd.
Here are some facts that I know about ELF linking:
- ELF linker manipulates sections, not functions.
- ELF linker traverses all input files in one pass from left to right (this behavior can be modified by command line options, at least in the GNU implementation). It picks sections according to the following rules:
  - All code (.text) sections from .o files specified on the command line are linked into the final ELF object (executable or shared library).
  - Code sections from .a files are included only if they are referenced from a section which was already included.
- In GCC and Clang there is the option -ffunction-sections, which forces the compiler to create a separate section for each function. (There is also a similar option, -fdata-sections, for data, e.g. string literals.) If the GNU linker is invoked with --gc-sections, it will throw away all unused functions.
- Aforementioned options are not present in old versions of GCC and binutils, e.g. people using embedded cross-toolchains with gcc < 4.2 are probably out of luck
from zstd.
So, if you place compression and decompression functions into different source files, make a static library from them, and reference only, e.g., the decompression function from the executable, the compression functions won't be linked in. If they are not in different files, -ffunction-sections is required when building the static library, and -Wl,--gc-sections when linking the executable.
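As an illustration of the single-file case, here is a small sketch with both directions in one translation unit. The file name, library name, and the toy run-length codec (demo_rle_encode, demo_rle_decode) are hypothetical and unrelated to zstd; the compiler and linker flags in the header comment are the ones named above.

```c
/* demo_codec.c: encoder and decoder share one translation unit.
 *
 * One section per function, then let the linker drop unreferenced ones:
 *   cc -Os -ffunction-sections -fdata-sections -c demo_codec.c
 *   ar rcs libdemo.a demo_codec.o
 *   cc -Os main.c libdemo.a -Wl,--gc-sections -o demo
 */
#include <stddef.h>
#include <string.h>

/* Encode as (run length, byte) pairs; returns bytes written, 0 if dst is too small. */
size_t demo_rle_encode(unsigned char *dst, size_t dstCapacity,
                       const unsigned char *src, size_t srcSize)
{
    size_t i = 0, o = 0;
    while (i < srcSize) {
        unsigned char b = src[i];
        size_t run = 1;
        while (i + run < srcSize && src[i + run] == b && run < 255) run++;
        if (o + 2 > dstCapacity) return 0;
        dst[o++] = (unsigned char)run;
        dst[o++] = b;
        i += run;
    }
    return o;
}

/* Decode (run length, byte) pairs; returns bytes written, 0 on malformed input or overflow. */
size_t demo_rle_decode(unsigned char *dst, size_t dstCapacity,
                       const unsigned char *src, size_t srcSize)
{
    size_t i, o = 0;
    if (srcSize % 2 != 0) return 0;
    for (i = 0; i < srcSize; i += 2) {
        size_t run = src[i];
        if (o + run > dstCapacity) return 0;
        memset(dst + o, src[i + 1], run);
        o += run;
    }
    return o;
}
```

If main.c calls only demo_rle_encode, listing the symbols of the linked executable (for example with nm) is one way to check whether demo_rle_decode was discarded.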
from zstd.
So dead code elimination from static library is in theory possible, in practice not obvious, with various complexities and limitations in the way.
OK. So now let's suppose that it's possible to build 2 static libraries dedicated to compression and decompression. Note that, within the compression one, there are multiple possible methods, depending on compression level. That means I expect the compression library to remain "relatively big".
Here are, btw, the current object file sizes, compiled using -Os (optimized for size) for x64 target :
ls -ls *.o *.a
16 -rw-r--r-- 1 yann yann 13480 déc. 2 18:33 fse.o
24 -rw-r--r-- 1 yann yann 20592 déc. 2 18:33 huff0.o
52 -rw-r--r-- 1 yann yann 53064 déc. 2 18:33 zstd_compress.o
12 -rw-r--r-- 1 yann yann 10328 déc. 2 18:33 zstd_decompress.o
100 -rw-r--r-- 1 yann yann 100042 déc. 2 18:33 libzstd.a
I suspect the request to remove decompression code is tied to reducing final code size. According to the measurements above, it will indeed reduce code size, but by no more than 30 % (including parts of fse and huff0).
Is it enough ? Is there a size objective basically ?
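For reference, the arithmetic behind that bound can be sketched as follows. The sizes are copied from the listing above; the assumption that half of fse.o and huff0.o is decoder-only code is purely illustrative and not something that was measured.

```c
/* Object sizes in bytes, copied from the ls listing above. */
enum {
    DEMO_FSE_O             = 13480,
    DEMO_HUFF0_O           = 20592,
    DEMO_ZSTD_DECOMPRESS_O = 10328,
    DEMO_LIBZSTD_A         = 100042
};

/* Percentage of totalBytes that removedBytes represents. */
double demo_percent(long removedBytes, long totalBytes)
{
    return 100.0 * (double)removedBytes / (double)totalBytes;
}

/* Saving if only zstd_decompress.o is dropped. */
double demo_decompress_only_percent(void)
{
    return demo_percent(DEMO_ZSTD_DECOMPRESS_O, DEMO_LIBZSTD_A);
}

/* Saving if, in addition, half of fse.o and huff0.o could be dropped.
 * The 50 % split is an assumption made for illustration only. */
double demo_with_half_entropy_percent(void)
{
    long removed = DEMO_ZSTD_DECOMPRESS_O + (DEMO_FSE_O + DEMO_HUFF0_O) / 2;
    return demo_percent(removed, DEMO_LIBZSTD_A);
}
```

With these inputs the two figures come out at about 10.3 % and 27.4 % of libzstd.a, which is consistent with the "no more than 30 %" estimate.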
from zstd.
Currently I'm interested only in the size of the zstd application. I'm planning to use it for real-time compression of coredumps and related info on an embedded system, to transmit over the network, and this data will never be decompressed on that device. I don't have strict size requirements, so 30% would be fine. I just didn't want to bring unnecessary bloat into the firmware.
from zstd.
Currently I'm interested only in size of zstd application
When you say "application", you mean the ./zstd command line utility ?
from zstd.
Yep
from zstd.
OK. It's a better defined objective, but also quite a bit more work.
On top of separating library functions, it will be necessary to modify program files.
They aren't created with this objective in mind, so it will take some time.
from zstd.
I thought something like annulen@777033b would be enough to exclude decompressor from zstd cli. Am I wrong?
from zstd.
I've retargeted the objective to "reduce size of ./zstd utility".
According to current experiment, the proposed change at annulen/zstd@777033b will probably not be enough.
I've experimented with removing non essential capabilities, starting with the integrated benchmark suite.
Just removing any mention of benchmark from zstdcli.c doesn't seem to be enough : the final exe size doesn't change much. I suspect that's because the public symbols are still generated, even if not used, as they could be called externally, like a dll. So, to get some size benefits, it's also necessary to remove bench.c from the compilation chain.
I'm starting to slim down the binary along these principles. I suspect I'll have something to propose by tomorrow.
from zstd.
There is a new update in "dev" branch : 28e7cef
It proposes a new build option for the ./zstd command line utility : make zstd-frugal
Tested on Linux x64 with gcc 4.8.4, the resulting binary is 107KiB, down from 270KiB (default).
To decrease size, it gives up legacy support and bench functionalities.
It still does both compression and decompression, but as stated earlier, separating both will take quite some more time. So I figured this solution could be a good stopgap.
from zstd.
Thanks!
from zstd.
Here are file sizes on MIPS (stripped):
- Default options: 220K
- ZSTD_LEGACY_SUPPORT=0: 164K
- ZSTD_LEGACY_SUPPORT=0, zstd-noBench target: 136K
- zstd-frugal: 108K
from zstd.
Looks good to me, in line with x64 experience.
Is the target size good enough for you ?
from zstd.
Yep!
More numbers if you are interested:
- -O3 -DZSTD_LEGACY_SUPPORT=0 -ffunction-sections -fdata-sections -Wl,--gc-sections: 116K
- same + -DZSTDC_NO_DECOMPRESSOR (my patch): 92K
- option2 + -flto -fwhole-program: 88K
- option2 with -O2 instead of -O3: 84K
- option4 with decompressor enabled: 108K
from zstd.
Sounds good, your patch seems able to reduce size even further, even though the public symbols are still present in compiled object files. Hey, better grab the gain ....
from zstd.
The latest development release allows compilation of compression / decompression separately.
You can have a look at the "dev" branch, and try make zstd-compress and make zstd-decompress.
from zstd.
v0.6.1 makes it possible to compile only the compressor or only the decompressor
from zstd.
how is it supposed to work?
# cd zstd
# cd programs
# make zstd-decompress
cc -I../lib -I../lib/common -I../lib/dictBuilder -I../lib/legacy -O3 -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef -DZSTD_NOBENCH -DZSTD_NODICT -DZSTD_NOCOMPRESS -DZSTD_LEGACY_SUPPORT=0 ../lib/common/entropy_common.c ../lib/common/fse_decompress.c ../lib/common/xxhash.c ../lib/common/zstd_common.c ../lib/decompress/huf_decompress.c zstdcli.c fileio.c -o zstd-decompress
/tmp/ccgl3PmR.o: In function `FIO_createDResources':
fileio.c:(.text+0x961): undefined reference to `ZSTD_createDStream'
fileio.c:(.text+0x976): undefined reference to `ZSTD_DStreamInSize'
fileio.c:(.text+0x98d): undefined reference to `ZSTD_DStreamOutSize'
fileio.c:(.text+0xa80): undefined reference to `ZSTD_initDStream_usingDict'
/tmp/ccgl3PmR.o: In function `FIO_decompressFrame':
fileio.c:(.text+0x1105): undefined reference to `ZSTD_resetDStream'
fileio.c:(.text+0x11e6): undefined reference to `ZSTD_decompressStream'
/tmp/ccgl3PmR.o: In function `FIO_decompressFilename':
fileio.c:(.text+0x1c08): undefined reference to `ZSTD_freeDStream'
/tmp/ccgl3PmR.o: In function `FIO_decompressMultipleFilenames':
fileio.c:(.text+0x1e23): undefined reference to `ZSTD_freeDStream'
/tmp/cc7gc0T1.o: In function `main':
zstdcli.c:(.text.startup+0x938): undefined reference to `ZSTD_maxCLevel'
collect2: error: ld returned 1 exit status
make: *** [zstd-decompress] Error 1
this is using checkout 83543a7
from zstd.
@bittorf It's already fixed at "dev" branch:
https://github.com/facebook/zstd/commits/dev
from zstd.
Fixed by @inikep in dev branch
from zstd.