gildor2 / fast_zlib Goto Github PK
View Code? Open in Web Editor NEWHeavily optimized zlib compression algorithm
Home Page: http://www.gildor.org/en/projects/zlib
License: BSD 3-Clause "New" or "Revised" License
Heavily optimized zlib compression algorithm
Home Page: http://www.gildor.org/en/projects/zlib
License: BSD 3-Clause "New" or "Revised" License
Hi Konstantin,
I have tested your fast-longest-match with the data set I need compress, and fast-longest-match does outperforms the original zlib in terms of speed. I want to understand how it works and so I read your brief description of the algorithm and checked the source code. Now I still have a few questions about your algorithm, could you help me better understand it?
When algorithm finds a match (intermediate one), it computes hashes for all matching bytes, and selects a different hash chain - that one which points farther from current position. ...
Q1: Do you mean when finding a match, the algorithm computes hashes for every 3 bytes in the matching string? Then I think when these hash values are inserted into the hash table head[]
(or the prev[]
if head[x]
already exists, they will fall into different positions. Is my understanding correct?
Q2: Regarding to "select a different hash chain - that one which points farther from current position. If we have 2 hash chains, one points at distance 10, and another one at distance 1000, it is not worth checking distance 10 because other bytes will not match ..."
prev[]
, eg. checking prev[1000]
betters checking prev[100]
?current position 1100, string: abcdabcd......
position 1000, string: abcdabcd......
position 100, string: abcdabcd.......
Here we can check the string starting on position 100, because the possible match length is 1000+. If checking position 1000, the possible match length is 100+.
Is this what you mean?
Q3: To this piece of code:
cur_match -= offset;
offset = 0;
next_pos = cur_match;
for (i = 0; i <= len - MIN_MATCH; i++) {
pos = prev[(cur_match + i) & wmask];
if (pos < next_pos) {
/* this hash chain is more distant, use it */
next_pos = pos;
offset = i;
}
}
/* Switch cur_match to next_pos chain */
cur_match = next_pos;
I think this piece of code implements the magic "jumping to the hash chain which points to farther distance". I don't quite understand what it tries to do though. Looks to me it does something like this:
when a matching string "abcdefg" at position 1000 is found, try to find a longest match, such as "abcdefg123456"
- if prev[1000] exists, record prev[1000] as p1 (a farther position with the same match, using hash value of "abc")
- if prev[1001] ("bcd") exists, record prev[1001] as p2
- if p2 < p1, record next_pos = p2
- do similar thing for prev[1002] ("cde"), prev[1003] ("def"), prev[1004] ("efg"), find the *farthest* next_pos
Is this process the one you mentioned as "jumping to the hash chain which points to farther distance"? What's the fundamental difference between this one and the zlib behavior, which is keep checking p = prev[1000], p1 = prev[p], p2 = prev[p1]
?
My questions might be too long, but I really want to understand this algorithm better, and look forward to getting some feedback from you.
Thank you.
Euccas
Would you support the version of 64-bit assembly?
Привет, не знаю куда копать дальше...
Написал managed-обертку над zlib (очень нужна скорость в одном месте). С версией 1.2.3 c сайта WinImage работает. С твоими dll (что с cdecl что c WinApi) работает но "неустойчиво".
Я написал цикл 100 раз чтобы скорость померить, вот где-то на 12 шаге все ломается...
Т.е. изначально несколько раз полный цикл
deflateInit2_ -> deflate(Z_NO_FLUSH)->deflate(Z_FINISH)->deflateEnd() отрабатывает...
а иногда (итерации на 12 например) выбрасывает Z_STREAM_ERROR (-2)
чаще всего на deflate(Z_FINISH) но бывает и на deflateEnd и на deflate(Z_NO_FLUSH)
Может быть сборочку сделаешь чтобы диагностику в консольку писало что этой скотине не нравится? или могу свою тестовую прогу дать...
Hey, @gildor2 thanks for the awesome code. I'm using zlib(boost iostreams) in a small game server to compress(deflate) the game packets and i want to know if this flags("CFLAGS="-march=native -O2" ./configure") are good enough for optimize zlib together with your C code.
Hi Konstantin,
your work is a great improvement to compression.
Can you port the match32.asm to 64 bit?
Yours Sincerly
Ed
This repo is currently active and NOT compaiable with your patch:)
https://github.com/cloudflare/zlib
it's very usefull this lib, and your blog articles.
i want use this lib in mac os,but i don't konw how.
did you have the solutaion? very thanks.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.