This project is intended only for personal study and technical exchange and must not be used for commercial purposes.
This is a spider for **裁判文书网** (China Judgments Online).
- Supports IP proxies
- Supports multiple processes
- Supports full crawling
- Partitions data by decision date, region, and court

`python spider.py -num_processes 1 -start_time 2016-1-2 -end_time 2016-1-2`

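
To illustrate how the three flags in the command above could fit together, here is a minimal sketch. It is not taken from `spider.py`: the helper names (`parse_day`, `build_parser`, `days_between`, `split_round_robin`), the assumption that both dates are inclusive, and the round-robin assignment of days to processes are all hypothetical choices made for the example.

```python
import argparse
from datetime import date, datetime, timedelta


def parse_day(text):
    """Parse a date such as 2016-1-2 (zero padding is optional)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    """Build a parser for the flags shown in the command (hypothetical)."""
    parser = argparse.ArgumentParser(description="Sketch of the spider CLI")
    parser.add_argument("-num_processes", type=int, default=1)
    parser.add_argument("-start_time", type=parse_day, required=True)
    parser.add_argument("-end_time", type=parse_day, required=True)
    return parser


def days_between(start, end):
    """Return every day from start to end, assuming both are inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def split_round_robin(days, num_processes):
    """Assign days to workers in round-robin order (an assumed strategy)."""
    return [days[i::num_processes] for i in range(num_processes)]


def main(argv=None):
    args = build_parser().parse_args(argv)
    days = days_between(args.start_time, args.end_time)
    chunks = split_round_robin(days, args.num_processes)
    for worker_id, chunk in enumerate(chunks):
        print(worker_id, [d.isoformat() for d in chunk])
    return chunks


if __name__ == "__main__":
    # Explicit arguments keep the demo independent of the real command line.
    main(["-num_processes", "2", "-start_time", "2016-1-2", "-end_time", "2016-1-5"])
```

Splitting by day keeps each worker's task small and maps naturally onto dividing the data by decision date; a real crawler might instead split by region or court, or pull tasks from a shared queue.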
- raw data
- processed data