Comments (6)
UnicodeJsonItemExporter, written by @x4base
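Because we need to be able to read the Chinese characters, I switched back to Scrapy's original output,
which is then run through https://github.com/g0v/councilor-voter-guide/blob/master/data/reformat_json.py
to produce https://github.com/g0v/councilor-voter-guide/blob/master/data/pretty_format/tccc/councilors.json
In other words:
/data holds the raw JSON for each county and city
/data/pretty_format/ holds the converted, human-readable JSON for each county and city (handy for debugging, etc.)
Perhaps Scrapy can meet this need directly?
Thanks for the feedback! I think I need some time to improve the README.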
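On the question of whether Scrapy can do this directly: as far as I know, Scrapy releases newer than the one discussed in this thread expose feed settings for both the output encoding and the indentation. The fragment below is a sketch of a project `settings.py`, not this repository's actual configuration; check the documentation for the installed Scrapy version before relying on it.

```python
# settings.py (fragment; assumes a Scrapy version that supports these feed settings)
FEED_EXPORT_ENCODING = "utf-8"  # write non-ASCII characters directly instead of \uXXXX escapes
FEED_EXPORT_INDENT = 2          # indent exported JSON by two spaces per level
```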
from councilor-voter-guide.
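I suggest writing the raw JSON as UTF-8 text, which is easier to debug; this can be done with scrapy + UnicodeJsonItemExporter. For the readable version, I have also added a small tool, prettyjson.py, for formatting and inspecting the output. It differs from reformat_json.py in that it makes it easy to test and inspect individual JSON files.
Feedback is welcome on the comparison g0v:master...y12studio:tccc-utf8 (y12studio/councilor-voter-guide); if it is acceptable, I will submit a PR.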
export the json file
$ cd crawler/tccc
$ scrapy crawl councilors -o /tmp/test.json
pretty json stdout
$ cat /tmp/test.json | python ../../utils/prettyjson.py
save the pretty json file
$ python ../../utils/prettyjson.py /tmp/test.json /tmp/pretty.json
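For readers without the branch checked out, the core of such a pretty-printer can be sketched with the standard library alone. This is not the actual prettyjson.py from the branch; the function names `prettify` and `prettify_file` are hypothetical, and the sketch assumes Python 3 and UTF-8 input.

```python
import json


def prettify(raw_text, indent=2):
    """Re-serialize JSON text with indentation, keeping non-ASCII characters readable."""
    data = json.loads(raw_text)
    # ensure_ascii=False writes characters such as Chinese text as-is
    # instead of converting them to \uXXXX escapes.
    return json.dumps(data, ensure_ascii=False, indent=indent)


def prettify_file(src_path, dst_path, indent=2):
    """Read JSON from src_path and write the pretty-printed form to dst_path (hypothetical helper)."""
    with open(src_path, encoding="utf-8") as src:
        pretty = prettify(src.read(), indent=indent)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(pretty + "\n")
```

Calling `prettify_file("/tmp/test.json", "/tmp/pretty.json")` would correspond to the "save the pretty json file" step, and printing `prettify(sys.stdin.read())` to the stdout variant.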
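@x4base ping! You will probably want to take a look.
Sorry for changing the tccc JSON exporter earlier without telling you; I needed indentation for debugging, so I went ahead and changed it.
@y12studio
This looks very handy; Scrapy really ought to have it built in.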
PR please
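I just tried it out. Nice! The output file is also about half the size (4.5 MB down to 2.6 MB). Why is that? I have confirmed that the data is the same.
I will switch every county and city over to this later.
The README has also been updated; additions are welcome!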
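That is most likely because escapes such as \u1234 were being stored literally rather than as the UTF-8 bytes of the character. Each \uXXXX escape takes 6 bytes, while a typical Chinese character takes 3 bytes in UTF-8, so the difference is roughly a factor of two.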
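The size difference can be checked with a small experiment. The record below is a made-up sample, not data from the crawler; the sketch compares the byte size of the same JSON serialized with and without ASCII escaping.

```python
import json

record = {"name": "王小明"}  # hypothetical sample record with three Chinese characters

escaped = json.dumps(record)                        # default ensure_ascii=True: \uXXXX escapes
readable = json.dumps(record, ensure_ascii=False)   # characters written as-is

# Measure both as UTF-8 bytes, which is what ends up on disk.
escaped_size = len(escaped.encode("utf-8"))
readable_size = len(readable.encode("utf-8"))

print(escaped_size, readable_size)
```

The ASCII keys and punctuation cost the same in both forms, so a whole file shrinks by somewhat less than half, which fits the 4.5 MB to 2.6 MB observation for mostly Chinese content.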
Cool! Sorry, the earlier ping never reached me, haha.