Comments (4)
I'm not sure I understand the score part. If the goal is for there to be "no simple way for a pattern of differences to cancel out", why would scoring more cancellations be better? (I'm just asking about the wording here, not the code).
As for larger block sizes, the main issue there is usually what to do about residuals.
- Casey
from meow_hash.
The score counts the internal cancellations needed to cancel the entire change, so basically what it says is that something unlikely has to happen 11 times for a full internal collision to occur.
There is going to be some waste with larger blocks; that is unavoidable. But there is definitely also an advantage to larger blocks that will more than offset this once the input gets large enough.
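To make the "11 times" reading concrete, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the 11 cancellations are independent and that each one occurs with the same probability 2^-k. Neither the independence nor any particular value of k comes from this thread.

```latex
\Pr[\text{full internal collision}]
  = \prod_{i=1}^{11} p_i
  = \left(2^{-k}\right)^{11}
  = 2^{-11k}
  \qquad \text{(assuming independent events with } p_i = 2^{-k}\text{)}
```

Under those assumptions, every additional required cancellation multiplies the probability by another factor of 2^-k, which is why a higher score is the better one.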
I found a cool trick while playing around with a new function: with 3-register instructions you can switch which registers hold which lanes during the computation, without using any additional instructions. For instance, this naive implementation of 4 rounds with lane shifts can be rewritten unrolled, swapping the registers around as it goes:
//naive
for(a=0;a<4;a++){
    xmm6=meow_paddq(xmm6,meow_load(rax+0x10+a*0x20));
    xmm4=meow_aesenc(xmm4,meow_load(rax+0x00+a*0x20),xmm1);
    xmm2=meow_paddq(xmm2,meow_load(rax+0x00+a*0x20));
    xmm1=meow_aesenc(xmm1,meow_load(rax+0x10+a*0x20),xmm5);
    //rotate the lanes down by one register
    ttt0=xmm0;
    xmm0=xmm1;
    xmm1=xmm2;
    xmm2=xmm3;
    xmm3=xmm4;
    xmm4=xmm5;
    xmm5=xmm6;
    xmm6=xmm7;
    xmm7=ttt0;
}
//unrolled
ttt0=meow_paddq(xmm6,meow_load(rax+0x10));
ttt1=meow_aesenc(xmm4,meow_load(rax+0x00),xmm1);
xmm6=meow_paddq(xmm2,meow_load(rax+0x00));
ttt2=meow_aesenc(xmm1,meow_load(rax+0x10),xmm5);
ttt3=meow_paddq(xmm7,meow_load(rax+0x30));
xmm1=meow_aesenc(xmm5,meow_load(rax+0x20),xmm6);
xmm7=meow_paddq(xmm3,meow_load(rax+0x20));
xmm6=meow_aesenc(xmm6,meow_load(rax+0x30),ttt0);
xmm4=meow_paddq(xmm0,meow_load(rax+0x50));
xmm2=meow_aesenc(ttt0,meow_load(rax+0x40),xmm7);
xmm0=meow_paddq(ttt1,meow_load(rax+0x40));
xmm7=meow_aesenc(xmm7,meow_load(rax+0x50),ttt3);
xmm5=meow_paddq(ttt2,meow_load(rax+0x70));
xmm3=meow_aesenc(ttt3,meow_load(rax+0x60),xmm0);
xmm1=meow_paddq(xmm1,meow_load(rax+0x60));
xmm0=meow_aesenc(xmm0,meow_load(rax+0x70),xmm4);
This way we can operate on 128-byte blocks instead of 256-byte blocks. The same trick could also be used to make functions with a non-power-of-2 state size that still operate on power-of-2 blocks.
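As a sanity check on the register-swapping idea, the sketch below runs both versions on the same input and compares the resulting state. It is only a model: `lane` is a 64-bit integer standing in for an xmm register, `paddq` is a plain add, and `mix3` is a made-up, non-cryptographic stand-in for the 3-operand `meow_aesenc` used above. All of these names, and `states_match`, are hypothetical and not part of meow_hash.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t lane; /* stand-in for one 128-bit xmm register */

/* Stand-in for meow_paddq: wrapping 64-bit add. */
static lane paddq(lane a, lane b) { return a + b; }

/* Hypothetical stand-in for the 3-operand meow_aesenc(a, b, c).
   Not AES; it only has to be sensitive to which operand goes where. */
static lane mix3(lane a, lane b, lane c) {
    lane x = (a ^ b) * 0x9E3779B97F4A7C15ull;
    return ((x << 13) | (x >> 51)) ^ c;
}

/* Four rounds, physically rotating the eight lanes after each round. */
void rounds_naive(lane s[8], const lane m[8]) {
    for (int a = 0; a < 4; a++) {
        lane m0 = m[2 * a];     /* load(rax+0x00+a*0x20) */
        lane m1 = m[2 * a + 1]; /* load(rax+0x10+a*0x20) */
        s[6] = paddq(s[6], m1);
        s[4] = mix3(s[4], m0, s[1]);
        s[2] = paddq(s[2], m0);
        s[1] = mix3(s[1], m1, s[5]);
        lane t = s[0];          /* rotate lanes down by one */
        memmove(&s[0], &s[1], 7 * sizeof(lane));
        s[7] = t;
    }
}

/* Same four rounds, unrolled; destinations are chosen so no moves are needed. */
void rounds_unrolled(lane s[8], const lane m[8]) {
    lane x0 = s[0], x1 = s[1], x2 = s[2], x3 = s[3];
    lane x4 = s[4], x5 = s[5], x6 = s[6], x7 = s[7];
    lane t0, t1, t2, t3;
    t0 = paddq(x6, m[1]);  t1 = mix3(x4, m[0], x1);
    x6 = paddq(x2, m[0]);  t2 = mix3(x1, m[1], x5);
    t3 = paddq(x7, m[3]);  x1 = mix3(x5, m[2], x6);
    x7 = paddq(x3, m[2]);  x6 = mix3(x6, m[3], t0);
    x4 = paddq(x0, m[5]);  x2 = mix3(t0, m[4], x7);
    x0 = paddq(t1, m[4]);  x7 = mix3(x7, m[5], t3);
    x5 = paddq(t2, m[7]);  x3 = mix3(t3, m[6], x0);
    x1 = paddq(x1, m[6]);  x0 = mix3(x0, m[7], x4);
    s[0] = x0; s[1] = x1; s[2] = x2; s[3] = x3;
    s[4] = x4; s[5] = x5; s[6] = x6; s[7] = x7;
}

/* Returns 1 if both versions leave the same value in every lane. */
int states_match(void) {
    lane m[8], a[8], b[8];
    for (int i = 0; i < 8; i++) {
        m[i] = 0x0123456789ABCDEFull * (lane)(i + 1);         /* distinct inputs */
        a[i] = b[i] = 0xF0E1D2C3B4A59687ull ^ ((lane)i << 40); /* distinct lanes */
    }
    rounds_naive(a, m);
    rounds_unrolled(b, m);
    return memcmp(a, b, sizeof a) == 0;
}

/* Convenience wrapper for callers that just want a hard failure on mismatch. */
void self_check(void) { assert(states_match()); }
```

Calling `self_check()` from a `main` is enough to exercise it. In this model, after the four unrolled rounds every lane ends up back in its same-numbered register, which is what lets the block size stay at 4 x 0x20 = 128 bytes.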
That does seem good - I have never profiled register renaming in the front end, btw, but it's not an ALU op, so there's also the fact that at least some number of registers can be renamed every cycle "for free" anyway... meaning that even without ternary ops, you can still do this, because
a = op b c
is basically the same cost as
b = op b c
a = b
because register renaming is a front-end op only.
- Casey