Data can be downloaded from here.
Each data entry contains the following information:
- A monotonically increasing counter (useful for sorting the trace in chronological order)
- The timestamp of the request in Unix time with millisecond precision
- The requested URL
- A flag to indicate if the request resulted in a database update or not
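
As a rough illustration of how these four fields might be loaded, here is a minimal sketch. It assumes each entry is one line of whitespace-separated fields in the order listed above, and that the update flag is the literal string `save`. Neither assumption is confirmed by the description here, so check the downloaded files and adjust. The names `TraceEntry`, `parse_line`, and `update_marker` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class TraceEntry:
    counter: int         # monotonically increasing request counter
    timestamp: datetime  # request time in UTC
    url: str             # requested URL
    is_update: bool      # True if the request caused a database update


def parse_line(line: str, update_marker: str = "save") -> TraceEntry:
    """Parse one trace line.

    Assumed layout (hypothetical, verify against the real files):
    "<counter> <unix_seconds.millis> <url> <flag>", whitespace-separated.
    """
    counter, ts, url, flag = line.split()
    return TraceEntry(
        counter=int(counter),
        timestamp=datetime.fromtimestamp(float(ts), tz=timezone.utc),
        url=url,
        is_update=(flag == update_marker),
    )


# Made-up sample lines in the assumed layout, deliberately out of order.
sample = [
    "2 1190146243.341 http://en.wikipedia.org/wiki/Main_Page save",
    "1 1190146243.326 http://en.wikipedia.org/wiki/Special:Search -",
]

# Sorting by the counter restores chronological order.
entries = sorted((parse_line(s) for s in sample), key=lambda e: e.counter)
```

Sorting on the counter rather than the timestamp avoids ties between requests that share the same millisecond.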
Store all data in the ./data folder. Feel free to put big files in this folder because they will be ignored.
- Xuan Li
- Xiaofei Sun
- Xuan Xu
- Yunqing Yang
- Hongyi Duanmu
- Keyi Chen
- Yicheng Lin
- Metadata: What kinds of metadata are correlated with visit frequency?
- Revisions: What is hidden in the revision records?
- Time Series: Is there any visit pattern over time?
- Language: Let's be international: different languages.
- Graph: Relationships between entries as graphs.
- Clustering: We cluster entries using both text and link structure.
- Category: From entry to category: what will change?