Comments (4)
Looking back at this, I've come up with some ideas:
- Switching from JSON files to a binary format like CBOR or MessagePack
- Compressing the JSON files
- Switching to SQLite
Of these, I'm currently most intrigued by the third option, as it would cut the amount of data to a third right away and provide an extensible base for future needs. I'll give it a try.
from jisho.
It's quite cumbersome to try to read an SQLite database that's embedded in the binary. I might come back to this approach later, but for now I'll just compress the JSON files with flate2. This is already a marked improvement.
from jisho.
Compressing the JSON files shrinks the binary from 121 MB to 32 MB. However, it also causes a hefty performance degradation:
~/jisho (compress-dictionaries) % ./bench
Finished release [optimized] target(s) in 0.02s
Benchmark 1: cargo run --release 緑
Time (mean ± σ): 285.1 ms ± 4.3 ms [User: 205.4 ms, System: 79.0 ms]
Range (min … max): 279.6 ms … 294.7 ms 10 runs
Benchmark 1: cargo run --release みどり
Time (mean ± σ): 326.9 ms ± 6.6 ms [User: 234.9 ms, System: 91.2 ms]
Range (min … max): 321.6 ms … 344.8 ms 10 runs
Benchmark 1: cargo run --release green
Time (mean ± σ): 641.5 ms ± 4.5 ms [User: 496.0 ms, System: 144.1 ms]
Range (min … max): 635.1 ms … 648.4 ms 10 runs
compared to
~/jisho (main) % ./bench
Compiling jisho v0.1.7 (/home/aku/jisho)
Finished release [optimized] target(s) in 22.98s
Benchmark 1: cargo run --release 緑
Time (mean ± σ): 204.7 ms ± 1.9 ms [User: 137.3 ms, System: 66.8 ms]
Range (min … max): 201.6 ms … 207.8 ms 14 runs
Benchmark 1: cargo run --release みどり
Time (mean ± σ): 232.6 ms ± 2.7 ms [User: 147.5 ms, System: 84.2 ms]
Range (min … max): 229.0 ms … 237.6 ms 12 runs
Benchmark 1: cargo run --release green
Time (mean ± σ): 448.2 ms ± 4.7 ms [User: 295.3 ms, System: 151.7 ms]
Range (min … max): 441.0 ms … 454.0 ms 10 runs
Slowing down the quick CLI lookup use case by roughly 40% is a dealbreaker. I'll figure out something else.
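For reference, here is a small sketch that works out the relative changes from the mean times and binary sizes quoted above. The helper name `percent_change` is made up for illustration and is not part of jisho; the numbers are copied by hand from the benchmark output.

```rust
/// Relative change from `before` to `after`, in percent.
/// (Hypothetical helper, only for this illustration.)
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Mean wall-clock times in ms from the two benchmark runs above:
    // (query, main branch, compress-dictionaries branch)
    let runs = [
        ("緑", 204.7, 285.1),
        ("みどり", 232.6, 326.9),
        ("green", 448.2, 641.5),
    ];

    for (query, main_ms, compressed_ms) in runs {
        println!(
            "{}: {:+.1}% run time",
            query,
            percent_change(main_ms, compressed_ms)
        );
    }

    // Binary size in MB before and after compressing the JSON files.
    println!("binary size: {:+.1}%", percent_change(121.0, 32.0));
}
```

With these inputs the lookups come out about 39% to 43% slower, in exchange for a binary that is about 74% smaller.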
from jisho.