Comments (25)

Cyan4973 avatar Cyan4973 commented on April 19, 2024

To be fixed

from zstd.

Cyan4973 avatar Cyan4973 commented on April 19, 2024

Hi Konstantin

That's a pretty good idea.
Actually, I was expecting the other way round, that is, some application being interested in the decompression code without the compression part.
That being said, both objectives imply the same capability.

I feel we should wait for v0.4 to look into this issue.
The reason is that the code structure will change in a way that makes this capability easier to create. It will probably not be enough, but it is at least a good step in the right direction, so it will be easier to study what remains to be done.


Cyan4973 avatar Cyan4973 commented on April 19, 2024

With the release of the 0.4.x series, it becomes possible to target this objective.

Zstd code is now more clearly separated between compression and decompression.
It is also possible to not generate legacy code (a logical complementary capability).

So, what's missing now is primarily the separation of compression and decompression within huff0 and fse.

There are mainly 2 ways to achieve this :

  • Modify huff0 and fse, in order to separate compress and decompress functions
    • possible, will require some time, and increase number of files
  • Make huff0 and fse integration static
    • expectation : unused static functions will simply not be generated (dead code elimination)
    • added benefit : currently public fse / huff0 symbols will no longer be present within ABI. Only zstd public symbols would remain.
    • requires a few tricky source code modifications.

The second option currently looks the more promising to me.
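
A minimal sketch of the static-integration idea, under loud assumptions: every `demo_` name below is hypothetical, and the toy XOR transform merely stands in for real entropy coding. It is not fse, huff0 or zstd code. The point is only that `static` helpers have internal linkage, so they stay out of the exported symbol list, and an optimizing compiler can drop any helper that nothing in the translation unit calls.

```c
#include <stddef.h>

/* Hypothetical stand-ins for entropy-coder internals (not real fse/huff0 code).
 * 'static' gives them internal linkage: they are not exported from the object
 * file, and a helper with no remaining caller can be discarded by the compiler. */
static size_t demo_entropy_encode(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] ^ 0x5Au);   /* toy transform */
    return n;
}

static size_t demo_entropy_decode(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] ^ 0x5Au);   /* XOR is its own inverse */
    return n;
}

/* Only these two wrappers are visible to the linker. If demo_decompress were
 * removed, demo_entropy_decode would have no caller left in this file. */
size_t demo_compress(unsigned char *dst, const unsigned char *src, size_t n)
{
    return demo_entropy_encode(dst, src, n);
}

size_t demo_decompress(unsigned char *dst, const unsigned char *src, size_t n)
{
    return demo_entropy_decode(dst, src, n);
}
```

Running `nm` on the resulting object file should show the two wrappers as global symbols, while the helpers appear as local symbols or not at all, depending on inlining.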


annulen avatar annulen commented on April 19, 2024

+1 for static


Cyan4973 avatar Cyan4973 commented on April 19, 2024

There is also a need to define what kind of objective to target.
Presuming it's possible to compile for only compression or only decompression, what should be the result ?

  • A static library ? (for example, zstd_compress.a)
  • A list of files and parameters ?

Speaking of static library : how does it work today ?
Presuming libzstd.a is compiled and generated, what happens when linking a program which only uses compression or decompression code from this library ? Does the linker keep unused code out of the resulting binary, achieving the objective ?

From https://en.wikipedia.org/wiki/Static_library :

"With static linking, it is enough to include those parts of the library that are directly and indirectly referenced by the target executable"


annulen avatar annulen commented on April 19, 2024

Here are some facts that I know about ELF linking:

  • The ELF linker manipulates sections, not functions.
  • The ELF linker traverses all input files in one pass from left to right (this behavior can be modified by command line options, at least in the GNU implementation). It picks sections according to the following rules:
    • All code (.text) sections from .o files specified on the command line are linked into the final ELF object (executable or shared library).
    • Code sections from *.a files are included only if they are referenced from a section that was already included.
  • GCC and Clang have the option -ffunction-sections, which forces the compiler to create a separate section for each function. (There is a similar option, -fdata-sections, for data, e.g. string literals.) If the GNU linker is invoked with --gc-sections, it will throw away all unused functions.
  • The aforementioned options are not present in old versions of GCC and binutils, e.g. people using embedded cross-toolchains with gcc < 4.2 are probably out of luck.


annulen avatar annulen commented on April 19, 2024

So, if you place compression and decompression functions into different source files, make a static library from them, and reference only, e.g., the decompression functions from the executable, the compression functions won't be linked in. If they are not in different files, -ffunction-sections is required when building the static library, and -Wl,--gc-sections when linking the executable.
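
To make the single-file case concrete, here is a small sketch. The file name, library name and `demo_` functions are placeholders invented for illustration, and the toy delta coder is not zstd code. Both directions live in one translation unit, so without per-function sections they would be pulled in together.

```c
#include <stddef.h>

/* demo_codec.c: hypothetical file holding both directions in one translation unit.
 *
 * Typical GCC/binutils usage (file names are placeholders):
 *   gcc -Os -ffunction-sections -fdata-sections -c demo_codec.c
 *   ar rcs libdemo.a demo_codec.o
 *   gcc -Os main.c libdemo.a -Wl,--gc-sections -o app
 *   nm app | grep demo_      (lists whichever demo_ symbols survived)
 */

/* Delta-encode in place: each byte becomes its difference from the previous one.
 * Walks backwards so that every difference uses the original neighbour. */
void demo_delta_encode(unsigned char *buf, size_t n)
{
    size_t i;
    for (i = n; i > 1; i--)
        buf[i - 1] = (unsigned char)(buf[i - 1] - buf[i - 2]);
}

/* Inverse of demo_delta_encode: a running sum restores the original bytes. */
void demo_delta_decode(unsigned char *buf, size_t n)
{
    size_t i;
    for (i = 1; i < n; i++)
        buf[i] = (unsigned char)(buf[i] + buf[i - 1]);
}
```

If a program's `main.c` calls only `demo_delta_encode`, then with the flags above `demo_delta_decode` sits in its own section and the linker is expected to discard it. Without `-ffunction-sections`, both functions share one `.text` section and travel together.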


Cyan4973 avatar Cyan4973 commented on April 19, 2024

So dead code elimination from static library is in theory possible, in practice not obvious, with various complexities and limitations in the way.

OK. So now let's suppose that it's possible to build 2 static libraries, dedicated to compression and decompression respectively. Note that, within the compression one, there are multiple possible methods, depending on compression level. That means I expect the compression library to remain "relatively big".

Here are, by the way, the current object file sizes, compiled using -Os (optimized for size) for an x64 target :

ls -ls *.o *.a
 16 -rw-r--r-- 1 yann yann  13480 déc.   2 18:33 fse.o
 24 -rw-r--r-- 1 yann yann  20592 déc.   2 18:33 huff0.o
 52 -rw-r--r-- 1 yann yann  53064 déc.   2 18:33 zstd_compress.o
 12 -rw-r--r-- 1 yann yann  10328 déc.   2 18:33 zstd_decompress.o
100 -rw-r--r-- 1 yann yann 100042 déc.   2 18:33 libzstd.a

I suspect the request to remove decompression code is tied to reducing final code size. According to the measurements above, it will indeed reduce code size, but by no more than 30 % (including parts of fse and huff0).
Is it enough ? Is there a size objective basically ?
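
As a rough check of that estimate, the arithmetic can be sketched as below. The byte counts are the four `.o` sizes from the listing above. The function names are hypothetical, and the share of fse and huff0 that serves decompression only is an assumption passed in as a parameter, since the listing does not break it down. Object file sizes also include headers and symbol tables, so this is only an approximation of code size.

```c
/* Object file sizes in bytes, copied from the listing above. */
enum {
    FSE_O             = 13480,
    HUFF0_O           = 20592,
    ZSTD_COMPRESS_O   = 53064,
    ZSTD_DECOMPRESS_O = 10328
};

/* Sum of the four object files. */
unsigned long demo_total_bytes(void)
{
    return (unsigned long)FSE_O + HUFF0_O + ZSTD_COMPRESS_O + ZSTD_DECOMPRESS_O;
}

/* Bytes attributed to decompression, ASSUMING entropy_share_percent of
 * fse.o + huff0.o is decompression-only (a guess, not a measurement). */
unsigned long demo_decoder_bytes(unsigned long entropy_share_percent)
{
    return ZSTD_DECOMPRESS_O
         + (unsigned long)(FSE_O + HUFF0_O) * entropy_share_percent / 100UL;
}

/* Whole-percent share of part within total, rounded down. */
unsigned long demo_percent(unsigned long part, unsigned long total)
{
    return total ? part * 100UL / total : 0UL;
}
```

With these numbers, zstd_decompress.o alone is about 10 % of the total, and assuming half of fse and huff0 is decoder-only brings the figure to about 28 %, which is in line with the "no more than 30 %" estimate.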


annulen avatar annulen commented on April 19, 2024

Currently I'm interested only in size of zstd application. I'm planning to use it for real-time compression of coredumps and related info on an embedded system, to transmit them over the network, and this data will never be decompressed on that device. I don't have strict size requirements, so 30% would be fine. I just didn't want to bring unnecessary bloat into the firmware.


Cyan4973 avatar Cyan4973 commented on April 19, 2024

"Currently I'm interested only in size of zstd application"

When you say "application", you mean the ./zstd command line utility ?


annulen avatar annulen commented on April 19, 2024

Yep


Cyan4973 avatar Cyan4973 commented on April 19, 2024

OK. It's a better-defined objective, but also quite a bit more work.

On top of separating library functions, it will be necessary to modify program files.
They weren't created with this objective in mind, so it will take some time.


annulen avatar annulen commented on April 19, 2024

I thought something like annulen@777033b would be enough to exclude decompressor from zstd cli. Am I wrong?


Cyan4973 avatar Cyan4973 commented on April 19, 2024

I've retargeted the objective to "reduce size of ./zstd utility".

According to current experiment, the proposed change at annulen/zstd@777033b will probably not be enough.

I've experimented with removing non essential capabilities, starting with the integrated benchmark suite.

Just removing any mention of benchmark from zstdcli.c doesn't seem to be enough : the final exe size doesn't change much. I suspect that's because the public symbols are still generated, even if not used, as they could be called externally, like a dll. So, to get some size benefits, it's also necessary to remove bench.c from the compilation chain.

I'm starting to slim down the binary along these principles. I suspect I'll have something to propose by tomorrow.
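
A minimal sketch of what compiling a feature out can look like, assuming a hypothetical `DEMO_NOBENCH` macro and hypothetical `demo_` functions. This is not the actual zstdcli.c code. When the macro is defined, both the benchmark function and its call site disappear, so the benchmark source would not need to be compiled or linked at all.

```c
/* Build normally for the full tool, or with -DDEMO_NOBENCH to drop the
 * benchmark. All names here are placeholders for illustration. */

#ifndef DEMO_NOBENCH
static int demo_run_benchmark(void)
{
    return 2;   /* stand-in for the real benchmark entry point */
}
#endif

static int demo_compress_file(void)
{
    return 1;   /* stand-in for the normal compression path */
}

/* Maps a command-line mode character to an action; returns -1 when the
 * mode is unknown or the feature was compiled out. */
int demo_dispatch(char mode)
{
    switch (mode) {
    case 'z':
        return demo_compress_file();
    case 'b':
#ifndef DEMO_NOBENCH
        return demo_run_benchmark();
#else
        return -1;   /* benchmark support not built in */
#endif
    default:
        return -1;
    }
}
```

In a real build, the matching step is to also leave the benchmark's source file out of the compile command, so its public symbols are never generated.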


Cyan4973 avatar Cyan4973 commented on April 19, 2024

There is a new update in "dev" branch : 28e7cef

It proposes a new build option for ./zstd command line utility : make zstd-frugal

Tested on Linux x64 with gcc 4.8.4, the resulting binary is 107KiB, down from 270KiB (default).
To decrease size, it gives up legacy support and bench functionalities.

It still does both compression and decompression, but as stated earlier, separating the two will take quite some more time. So I figured this solution could be a good stopgap.


annulen avatar annulen commented on April 19, 2024

Thanks!


annulen avatar annulen commented on April 19, 2024

Here are file sizes on MIPS (stripped):

  • Default options: 220K
  • ZSTD_LEGACY_SUPPORT=0: 164K
  • ZSTD_LEGACY_SUPPORT=0, zstd-noBench target: 136K
  • zstd-frugal: 108K


Cyan4973 avatar Cyan4973 commented on April 19, 2024

Looks good to me, in line with x64 experience.

Is the target size good enough for you ?


annulen avatar annulen commented on April 19, 2024

Yep!

More numbers if you are interested:

  • -O3 -DZSTD_LEGACY_SUPPORT=0 -ffunction-sections -fdata-sections -Wl,--gc-sections: 116K
  • same + -DZSTDC_NO_DECOMPRESSOR (my patch): 92K
  • option2 + -flto -fwhole-program: 88K
  • option2 with -O2 instead of -O3: 84K
  • option4 with decompressor enabled: 108K


Cyan4973 avatar Cyan4973 commented on April 19, 2024

Sounds good, your patch seems able to reduce size even further, even though the public symbols are still present in compiled object files. Hey, better grab the gain ....


Cyan4973 avatar Cyan4973 commented on April 19, 2024

The latest development release allows compilation of compression / decompression separately.
You can have a look at the "dev" branch, and try make zstd-compress and make zstd-decompress.


Cyan4973 avatar Cyan4973 commented on April 19, 2024

v0.6.1 makes it possible to compile only the compressor or only the decompressor


bittorf avatar bittorf commented on April 19, 2024

how is it supposed to work?

# cd zstd
# cd programs
# make zstd-decompress

cc      -I../lib -I../lib/common -I../lib/dictBuilder -I../lib/legacy -O3 -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef   -DZSTD_NOBENCH -DZSTD_NODICT -DZSTD_NOCOMPRESS -DZSTD_LEGACY_SUPPORT=0 ../lib/common/entropy_common.c ../lib/common/fse_decompress.c ../lib/common/xxhash.c ../lib/common/zstd_common.c ../lib/decompress/huf_decompress.c zstdcli.c fileio.c -o zstd-decompress
/tmp/ccgl3PmR.o: In function `FIO_createDResources':
fileio.c:(.text+0x961): undefined reference to `ZSTD_createDStream'
fileio.c:(.text+0x976): undefined reference to `ZSTD_DStreamInSize'
fileio.c:(.text+0x98d): undefined reference to `ZSTD_DStreamOutSize'
fileio.c:(.text+0xa80): undefined reference to `ZSTD_initDStream_usingDict'
/tmp/ccgl3PmR.o: In function `FIO_decompressFrame':
fileio.c:(.text+0x1105): undefined reference to `ZSTD_resetDStream'
fileio.c:(.text+0x11e6): undefined reference to `ZSTD_decompressStream'
/tmp/ccgl3PmR.o: In function `FIO_decompressFilename':
fileio.c:(.text+0x1c08): undefined reference to `ZSTD_freeDStream'
/tmp/ccgl3PmR.o: In function `FIO_decompressMultipleFilenames':
fileio.c:(.text+0x1e23): undefined reference to `ZSTD_freeDStream'
/tmp/cc7gc0T1.o: In function `main':
zstdcli.c:(.text.startup+0x938): undefined reference to `ZSTD_maxCLevel'
collect2: error: ld returned 1 exit status
make: *** [zstd-decompress] Error 1

this is using checkout 83543a7


inikep avatar inikep commented on April 19, 2024

@bittorf It's already fixed in the "dev" branch:
https://github.com/facebook/zstd/commits/dev


Cyan4973 avatar Cyan4973 commented on April 19, 2024

Fixed by @inikep in dev branch
