cncases / cases
Local search for China Judgments Online (裁判文书网)
License: Mozilla Public License 2.0
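A second run spends a long time skipping before it resumes processing
Could you add support for inserting new data from a specified archive, and for updating the index when data is inserted? Thanks.
On a second run the program keeps skipping, and it takes a long wait before processing resumes.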
Running ./main directly
The command-line output is:
$ ./main
2024-01-22T06:56:34.400725Z INFO main: listening on http://127.0.0.1:8081
However, requests to port 8081 get no response at all.
Running ./main with administrator privileges
The result is:
$ sudo ./main
2024-01-22T06:56:38.123740Z INFO main: listening on http://127.0.0.1:8081
thread 'main' panicked at src/bin/main.rs:27:84:
called `Result::unwrap()` on an `Err` value: Error { message: "IO error: While open a file for random read: /data/cases/rocksdb/009237.sst: Too many open files" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[1] 19883 IOT instruction sudo ./main
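After running ./main, the web page can be opened and searched from the same machine, but other devices on the LAN cannot reach it.
The firewall port is already open.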
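The log line earlier on this page shows the server listening on http://127.0.0.1:8081. 127.0.0.1 is the loopback address, so a socket bound to it only accepts connections from the same machine, whatever the firewall allows. The sketch below illustrates the difference between a loopback bind and a bind to 0.0.0.0 (all IPv4 interfaces). It is not the project's code: the helper names are made up for illustration, and whether the project lets you configure the listen address is not stated here.

```python
import ipaddress
import socket


def reachable_beyond_localhost(bind_host):
    """Return True if a listener bound to `bind_host` can accept connections
    from other machines (ignoring firewalls and routing)."""
    return not ipaddress.ip_address(bind_host).is_loopback


def open_listener(bind_host, port=0):
    """Open a TCP listener on `bind_host`; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_host, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    for host in ("127.0.0.1", "0.0.0.0"):
        with open_listener(host) as sock:
            print(sock.getsockname(), "reachable from LAN:",
                  reachable_beyond_localhost(host))
```

If the listen address cannot be changed, a reverse proxy on the same machine that listens on the LAN interface and forwards to 127.0.0.1:8081 is another option.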
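When searching by indexed case content, the "next page" link on the results page is wrong: clicking it only leads to the keyword search results.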
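The cause of the wrong link is not shown in this report. One common source of this kind of bug is a pagination link that is rebuilt from only some of the query parameters, so the chosen search mode is lost on the next page. The sketch below builds the next-page URL by carrying over every existing parameter and changing only the page number. The parameter names `q`, `field`, and `page` are hypothetical and not taken from the project.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_page_url(current_url, page_param="page"):
    """Return `current_url` with its page number increased by one.

    All other query parameters are preserved in their original order, so a
    search-mode parameter survives the jump to the next page.
    """
    parts = urlsplit(current_url)
    page = 1  # a URL without a page parameter is treated as page 1
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == page_param:
            page = int(value) if value.isdigit() else 1
        else:
            kept.append((key, value))
    kept.append((page_param, str(page + 1)))
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, calling it on a URL that carries `q=abc&field=content` yields the same URL with `page=2` appended, and `field=content` is still present.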
When running the convert command, it fails after running for a while:
2024-01-19T11:57:18.865094Z INFO convert: inserting 15324160, time: 875
2024-01-19T11:57:18.937344Z INFO convert: inserting 15325184, time: 875
2024-01-19T11:57:19.012542Z INFO convert: inserting 15326208, time: 876
2024-01-19T11:57:19.128853Z INFO convert: inserting 15327232, time: 876
2024-01-19T11:57:19.198966Z INFO convert: inserting 15328256, time: 876
2024-01-19T11:57:19.267122Z INFO convert: inserting 15329280, time: 876
2024-01-19T11:57:19.390479Z INFO convert: inserting 15330304, time: 876
thread 'main' panicked at src/bin/convert.rs:61:45:
called `Result::unwrap()` on an `Err` value: Error { message: "IO error: While open a file for appending: /data/cases/rocksdb/002043.log: Too many open files" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
System information:
$ uname -a
Linux ubuntu 5.15.0-84-generic #93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 124695
-n: file descriptors 1024
-l: locked-in-memory size (kbytes) 4005456
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 124695
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 0
-N 15: unlimited
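After re-running, it fails with the same error at the same point. It looks like a limit on the number of open files. Is there a way to fix this?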
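The `ulimit -a` output above reports `-n: file descriptors 1024`, and the panic says `Too many open files`, so the process is most likely hitting that per-process limit. In a shell, `ulimit -n 65536` before starting the program raises it for that session, provided the hard limit allows it. The value 65536 is an assumption, not a figure from the project. The sketch below does the same thing programmatically and then launches a command that inherits the raised limit. The script name used in the usage note is hypothetical.

```python
import resource
import subprocess
import sys


def raise_nofile_limit(target=65536):
    """Raise this process's soft open-file limit toward `target`.

    Without extra privileges the soft limit can only go up to the hard
    limit, so the value is capped there. Returns the (soft, hard) pair in
    effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    ceiling = target if hard == resource.RLIM_INFINITY else min(target, hard)
    # Only ever raise the limit; leave an unlimited or higher value alone.
    if soft != resource.RLIM_INFINITY and soft < ceiling:
        resource.setrlimit(resource.RLIMIT_NOFILE, (ceiling, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = raise_nofile_limit()
    print(f"open-file limit: soft={soft} hard={hard}", file=sys.stderr)
    if len(sys.argv) > 1:
        # Child processes inherit the raised limit.
        sys.exit(subprocess.call(sys.argv[1:]))
```

Saved as, say, `raise_nofile.py`, it could be used as `python3 raise_nofile.py ./convert config.toml`. If the hard limit itself is too low, it has to be raised by an administrator (for example in `/etc/security/limits.conf`). RocksDB also has a `max_open_files` option that caps how many files it keeps open; whether this project exposes it is not stated here.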
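Run the convert config.toml program. This step loads the raw data into a rocksdb database whose path is given by the db variable in config.toml. The converted data is about 200G, and the conversion may take several hours. If it is interrupted, running it again continues from where it stopped.
When I run it again, it has to skip files for a long time before it continues processing; about 96G has been processed so far. Do I need to pass an extra argument when running it?
convert was built from the source obtained with git clone.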
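How convert decides what to skip is not shown in this thread. As a generic illustration only, one way a resumable importer can avoid re-reading finished input is to record each completed archive in a small checkpoint file and filter the work list against it at startup. Everything in the sketch below (the `Checkpoint` class, the JSON file, the archive names) is hypothetical and not the project's implementation.

```python
import json
import os


class Checkpoint:
    """Tracks which input archives have been fully imported."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.done = set(json.load(f))

    def is_done(self, archive_name):
        return archive_name in self.done

    def mark_done(self, archive_name):
        """Record a finished archive and persist the list to disk."""
        self.done.add(archive_name)
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(sorted(self.done), f)
        # Replace the old file in one step so a crash cannot leave it half-written.
        os.replace(tmp, self.path)


def pending_archives(all_archives, checkpoint):
    """Return the archives that still need processing, in their original order."""
    return [name for name in all_archives if not checkpoint.is_done(name)]
```

With this shape, a restart only opens the archives returned by `pending_archives`, and importing a single new archive amounts to passing a one-element list.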
As the title says.