Comments (15)
from stringzilla.
That would work.
Sure, let's do it right now.
@ashvardanian definitely, let's do it!
As mentioned in #79, I am not sure about the right course of action here. Other operations, like #82 or random string generation, might be more relevant. We should also benchmark against memchr and other native Rust string projects.
@ashvardanian regarding the "fingerprints" in the table that you've shared in the PR, is it the same as sz_hash?
Not the same, but related. Fingerprints are rolling hashes, which are used to populate a bitset.
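To illustrate the idea of rolling hashes populating a bitset, here is a minimal sketch. It is not how sz_fingerprint_rolling is implemented: the function name fingerprint, the base 257, and the 64-bit bitset are all assumptions chosen for illustration, using a generic Rabin-Karp style polynomial hash.

```rust
/// Hypothetical sketch: hash every `window`-byte substring with a
/// polynomial rolling hash (arithmetic modulo 2^64) and set one bit
/// per hash in a 64-bit bitset. Not StringZilla's actual algorithm.
pub fn fingerprint(text: &[u8], window: usize) -> u64 {
    const BASE: u64 = 257; // assumed base, any odd constant works here
    if window == 0 || text.len() < window {
        return 0; // no complete window, so no bits are set
    }

    // BASE^(window - 1), needed to remove the outgoing byte.
    let mut high = 1u64;
    for _ in 1..window {
        high = high.wrapping_mul(BASE);
    }

    // Hash of the first window.
    let mut hash = 0u64;
    for &byte in &text[..window] {
        hash = hash.wrapping_mul(BASE).wrapping_add(byte as u64);
    }

    // The top 6 bits of the hash select one of the 64 bitset slots.
    let mut bits = 1u64 << (hash >> 58);

    // Slide the window: drop the oldest byte, append the newest one.
    for i in window..text.len() {
        let outgoing = (text[i - window] as u64).wrapping_mul(high);
        hash = hash.wrapping_sub(outgoing);
        hash = hash.wrapping_mul(BASE).wrapping_add(text[i] as u64);
        bits |= 1u64 << (hash >> 58);
    }
    bits
}
```

Two texts that share a window-sized substring will share at least one set bit, so comparing bitsets gives a cheap, approximate similarity check.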
In that case, which function generates fingerprints in StringZilla?
@michaelgrigoryan25, it's called sz_fingerprint_rolling 🤗
I am not sure what the best Rust interface for it should look like, so let's keep it for the end.
These are the most commonly used string types in Rust:
&str
String
&String
Cow<'_, str>
Cow<'_, String>
I can implement a macro which implements a common trait for all these types, so that methods like sz_find can be accessed directly, by only importing the trait via use.
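A rough sketch of what such a macro could look like. The trait name StringZillaExt, the helper sz_bytes, and the macro name impl_sz_ext are hypothetical, and the sz_find body is a naive pure-Rust stand-in rather than a call into the StringZilla C library.

```rust
use std::borrow::Cow;

/// Hypothetical extension trait; bring it into scope with `use`.
pub trait StringZillaExt {
    /// Raw bytes of the haystack.
    fn sz_bytes(&self) -> &[u8];

    /// Stand-in for the real `sz_find`: a naive substring search that
    /// returns the byte offset of the first match.
    fn sz_find(&self, needle: &str) -> Option<usize> {
        let haystack = self.sz_bytes();
        let needle = needle.as_bytes();
        if needle.is_empty() {
            return Some(0); // same convention as `str::find("")`
        }
        haystack.windows(needle.len()).position(|w| w == needle)
    }
}

/// Implements the trait for every listed type that has `as_bytes()`.
macro_rules! impl_sz_ext {
    ($($t:ty),* $(,)?) => {
        $(
            impl StringZillaExt for $t {
                fn sz_bytes(&self) -> &[u8] {
                    self.as_bytes()
                }
            }
        )*
    };
}

// `&str`, `&String`, and `Cow<'_, String>` reach these impls through
// auto-deref at the call site, so three impls cover all five types.
impl_sz_ext!(str, String, Cow<'_, str>);
```

With the trait in scope, "hello world".sz_find("world") and String::from("hello").sz_find("l") both resolve to the same default method.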
Sure. How about the AsRef<[u8]> I currently use?
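For comparison, a blanket implementation over AsRef<[u8]> might look like the sketch below. The trait name SzFind is an assumption, and the search itself is again a naive pure-Rust placeholder instead of the real StringZilla call.

```rust
/// Hypothetical trait, implemented once for anything byte-like.
pub trait SzFind {
    /// Byte offset of the first occurrence of `needle`, if any.
    fn sz_find<N: AsRef<[u8]>>(&self, needle: N) -> Option<usize>;
}

// One blanket impl covers `str`, `String`, `Vec<u8>`, `[u8]`, and any
// other type that can be viewed as a byte slice; no macro is needed.
impl<T: AsRef<[u8]> + ?Sized> SzFind for T {
    fn sz_find<N: AsRef<[u8]>>(&self, needle: N) -> Option<usize> {
        let haystack: &[u8] = self.as_ref();
        let needle: &[u8] = needle.as_ref();
        if needle.is_empty() {
            return Some(0); // same convention as `str::find("")`
        }
        // Placeholder for the real search routine.
        haystack.windows(needle.len()).position(|w| w == needle)
    }
}
```

The trade-off is that a blanket impl also accepts non-text byte containers such as Vec<u8>, whereas the macro approach limits the methods to an explicit list of string types.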
@ashvardanian michaelgrigoryan25@4f4ace3
@michaelgrigoryan25 this looks good! Want to open a PR or want to add a few more things before that?
Thanks a lot, great patches, @michaelgrigoryan25! In C++ I've implemented lazy-evaluated convenience functions, like find_all, rfind_all, split_all, rsplit_all, and so on. It took around 400 lines of code. I think it might be a great idea to implement them in Rust as well. What do you think? Would you be interested in adding those and the Levenshtein / Needleman-Wunsch alignment scores?
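In Rust, a lazily evaluated find_all maps naturally onto an Iterator. The sketch below is only an illustration: the struct FindAll and its fields are hypothetical, the inner search is a naive placeholder, and reporting overlapping matches is an assumption rather than a statement about the C++ behavior.

```rust
/// Hypothetical lazy iterator over the offsets of all matches.
pub struct FindAll<'h, 'n> {
    haystack: &'h [u8],
    needle: &'n [u8],
    offset: usize, // where the next search starts
}

/// Creates the iterator; no searching happens until `next()` is called.
pub fn find_all<'h, 'n>(haystack: &'h [u8], needle: &'n [u8]) -> FindAll<'h, 'n> {
    FindAll { haystack, needle, offset: 0 }
}

impl Iterator for FindAll<'_, '_> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        let (haystack, needle) = (self.haystack, self.needle);
        if needle.is_empty() || self.offset > haystack.len() {
            return None;
        }
        // Naive placeholder search over the unscanned tail.
        let tail = &haystack[self.offset..];
        let pos = tail.windows(needle.len()).position(|w| w == needle)?;
        let found = self.offset + pos;
        // Advance by one byte so overlapping matches are reported too.
        self.offset = found + 1;
        Some(found)
    }
}
```

Because it is a plain Iterator, adapters such as take(1) or count() stop or continue the scan on demand, and an rfind_all variant could follow the same shape while searching from the end.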