Comments (5)
Sure @JobLeonard, you can start and I can join a bit later.
For benchmarking I often take cloud instances (c7g and r7iz) to gain access to more hardware. Might be useful to you too 🤗
One thing to watch out for: on very short strings (well under 64 bytes) we are optimizing the for-loop. On longer strings, if we take the first and the last character, we end up fetching 2 cache lines from each string instead of just one.
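To make the cache-line concern concrete, here is a minimal sketch of an equality check that probes the tail byte only for short inputs. The name `equal_probe_ends` and the 64-byte threshold are assumptions for illustration and are not part of StringZilla.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 64-byte cache lines, typical for current x86-64 and Arm cores. */
#define CACHE_LINE_BYTES 64

/* Hypothetical helper, not StringZilla's sz_equal. */
static int equal_probe_ends(const char *a, const char *b, size_t length) {
    if (length == 0) return 1;
    /* The first byte is cheap: the full comparison needs this line anyway. */
    if (a[0] != b[0]) return 0;
    /* Probe the last byte only when it likely shares a cache line with the
     * first one; on longer strings this probe would pull in a second line
     * from each string before the main loop even starts. */
    if (length <= CACHE_LINE_BYTES && a[length - 1] != b[length - 1]) return 0;
    return memcmp(a, b, length) == 0;
}
```

Whether the tail probe pays off at all depends on the data, so it would need to be measured on the benchmark datasets.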
from stringzilla.
Hi @JobLeonard,
I have a big update! I've generalized the substring search methods to be able to match different characters within the needle. The method that infers the best targets is called `_sz_locate_needle_anomalies`. For long needles and small alphabets, updating it may have a noticeable impact. Here is the code:
StringZilla/include/stringzilla/stringzilla.h
Lines 1586 to 1657 in cbdae26
I'm not sure what the best dataset for such benchmarks would be; this seems related to #91.
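The referenced lines are not reproduced in this thread, so the following is only a rough sketch of the general idea of picking probe positions in the needle whose bytes differ. The function `pick_probe_offsets` and its search strategy are hypothetical and should not be read as the actual `_sz_locate_needle_anomalies` logic.

```c
#include <stddef.h>

/* Hypothetical simplification: choose three needle offsets to compare first,
 * preferring positions whose bytes differ from each other. */
static void pick_probe_offsets(const char *needle, size_t length,
                               size_t *first, size_t *second, size_t *third) {
    *first = 0;
    *second = length / 2;
    *third = length ? length - 1 : 0;
    if (length < 4) return; /* too short to move any probe */
    /* Slide the middle probe right until its byte differs from the first. */
    while (*second + 1 < *third && needle[*second] == needle[*first]) ++*second;
    /* Slide the last probe left until its byte differs from both others. */
    while (*third > *second + 1 &&
           (needle[*third] == needle[*first] || needle[*third] == needle[*second]))
        --*third;
}
```

With a needle like `aaaaaabaac`, the default middle position holds another `a`, so the sketch moves it onto the `b`; distinct probe bytes reject more candidate positions early, which matters most for small alphabets.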
Thanks for the heads-up, looks like a worthwhile change for the examples given in the doc comment.
I was about to write that I'm still interested in giving this a go, but I have been very busy at work for the last month. I'm not entirely sure when I'll manage to free up some time again, but I just wanted to reassure you that I haven't forgotten about it!
Hi @JobLeonard! Sorry for the late response, I didn't see the issue.
That’s a good suggestion for the serial version! I haven’t spent much time optimizing it.
On most modern CPUs the forward and backward passes over memory are equally fast, AFAIK. It might be a good idea to also add a version that uses 64-bit integers, if misaligned reads are allowed. Would you be interested in trying a couple of such approaches and submitting a PR?
In case you do, the `CONTRIBUTING.md` file references several datasets for benchmarks 🤗
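As a starting point for the 64-bit idea, here is a minimal sketch that compares eight bytes at a time and finishes with a byte loop. The name `equal_u64` is hypothetical; `memcpy` is used for the loads so the code stays valid C on targets that do not allow misaligned reads, and compilers typically lower it to a single load where they are allowed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not StringZilla's sz_equal. */
static int equal_u64(const char *a, const char *b, size_t length) {
    size_t i = 0;
    /* Compare full 8-byte words; memcpy avoids undefined behavior on
     * unaligned addresses. */
    for (; i + 8 <= length; i += 8) {
        uint64_t word_a, word_b;
        memcpy(&word_a, a + i, 8);
        memcpy(&word_b, b + i, 8);
        if (word_a != word_b) return 0;
    }
    /* Compare the remaining 0 to 7 bytes one at a time. */
    for (; i < length; ++i)
        if (a[i] != b[i]) return 0;
    return 1;
}
```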
I'll take a shot at it, with the following caveats:
- I only have one laptop to benchmark it with (a six-year-old Lenovo P51 with an Intel® Core™ i7-7820HQ Processor running KDE Neon (Ubuntu LTS 22.04 based))
- I don't have much spare time, so the GCC 12 I already have installed will have to do.
- "read string as 64 bit integers" sounds like it's great when everything aligns neatly, but wouldn't that require a lot of special casing? Checks for whether `sz_size_t length` is less than eight characters, and for whether `sz_string_start_t a` and/or `sz_string_start_t b` start misaligned, or end misaligned. Start or end within the same 8-byte word or not. That's a lot of variations to consider (unless there's an obvious bitmasking + aligned reads trick that I'm too tired to work out in my head right now). So I think I'll skip that for now.
So my benchmark numbers will be limited to a few simple variations of `sz_equal` on x86-64, with AVX2 thrown in as a point of comparison too, and therefore only useful as a first sanity check for whether this idea is worth investigating further. Is that ok?
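One way to reduce the special casing listed above, assuming misaligned reads are acceptable, is to let the last 8-byte load overlap the previous one instead of handling a partial tail. This is a sketch under that assumption; `load_u64` and `equal_u64_overlapping` are hypothetical names, not StringZilla functions, and it is not the aligned-reads bitmasking trick mentioned in the comment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unaligned 8-byte load expressed through memcpy. */
static uint64_t load_u64(const char *p) {
    uint64_t word;
    memcpy(&word, p, 8);
    return word;
}

/* Hypothetical sketch: only two cases remain, shorter than 8 bytes or not. */
static int equal_u64_overlapping(const char *a, const char *b, size_t length) {
    if (length < 8) {
        for (size_t i = 0; i < length; ++i)
            if (a[i] != b[i]) return 0;
        return 1;
    }
    /* Compare every full word that ends strictly before the end of the string. */
    for (size_t i = 0; i + 8 < length; i += 8)
        if (load_u64(a + i) != load_u64(b + i)) return 0;
    /* The final word covers the last 8 bytes and may overlap the previous one. */
    return load_u64(a + length - 8) == load_u64(b + length - 8);
}
```

Re-reading a few bytes in the overlapping load is redundant work, but it removes the checks for where the strings start and end relative to 8-byte boundaries.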