This project forked from moment-of-peace/eventforecast

Time series prediction and text analysis using Keras LSTM, plus clustering, association rules mining

License: GNU Lesser General Public License v3.0

Python 91.04% PHP 0.67% JavaScript 6.45% CSS 1.84%

eventforecast's Introduction

-----------------------

website: https://github.com/moment-of-peace/EventForecast

EventForecast

A group data mining project that uses deep learning (LSTM) to predict the probability of occurrence of particular events and the popularity of events. This project was awarded "The Best Data Mining Project Award" of the University of Queensland in 2017.


dataset

Source data: https://www.gdeltproject.org/data.html

Crawled news data on google drive: https://drive.google.com/file/d/0B--MjMVnQr09SmJ2VGh5VkstR2c/view?usp=sharing

Processed data: https://drive.google.com/drive/folders/0B_Qs_6HNIHS9Qk1hMzZ1c3VWOE0?usp=sharing


Occurrence Prediction

Environment: numpy, scipy, tensorflow, keras, h5py, matplotlib

Requires at least one "attr-country" folder from the processed data link above (unzip it first).

Use the following command to run (-p is compulsory, the others are optional):

python3 rnn_model.py [option parameter]

-p: the path of the processed event records for a certain country (any "attr-country" folder in the processed data link)

-s: step size, i.e. how many days of history to consider

-a: how many days to look ahead (which day in the future to predict)

-e: training epochs
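
The relationship between the step size (-s) and the look-ahead (-a) can be illustrated with a small sliding-window sketch. This is not taken from rnn_model.py: the function name make_windows, its parameters, and the sample counts are hypothetical, and the script itself may build its training samples differently.

```python
def make_windows(series, step, ahead):
    """Pair each `step`-day history with the value `ahead` days after it."""
    inputs, targets = [], []
    # The last usable window must leave room for the look-ahead target.
    for start in range(len(series) - step - ahead + 1):
        end = start + step              # index just past the history window
        inputs.append(series[start:end])
        targets.append(series[end + ahead - 1])
    return inputs, targets


# Hypothetical daily event counts, for illustration only.
daily_counts = [3, 5, 2, 8, 6, 7, 4]
X, y = make_windows(daily_counts, step=3, ahead=2)
```

With step=3 and ahead=2, each sample holds three consecutive days and its target is the count two days after the window ends, so a longer look-ahead leaves fewer samples for the same series.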

Popularity (hot events) Prediction

Environment: numpy, scipy, tensorflow, keras, nltk, h5py, gensim.

Requires the folder "news_50_num" (unzip it first) and the word embedding files "vocab_event100.pkl" and "weights_event100.npy" from the processed data link above.

Use the following command to run (all options are optional):

python3 rnn_text.py [option parameter]

-b: batch size

-l: cutting length, i.e. how many words in the news to consider

-e: training epochs
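
The cutting length (-l) fixes how many words of each news item reach the model. A minimal sketch of that idea follows, assuming that longer texts are truncated and shorter ones are padded on the right. The helper cut_to_length and the "<pad>" token are hypothetical, and rnn_text.py may handle short texts differently.

```python
def cut_to_length(tokens, length, pad_token="<pad>"):
    """Truncate a token list to `length` words, padding short ones on the right."""
    clipped = tokens[:length]
    return clipped + [pad_token] * (length - len(clipped))


# Hypothetical news snippets, for illustration only.
news = "protesters gathered outside the parliament building on monday".split()
short = cut_to_length(news, 5)
padded = cut_to_length(["markets", "rallied"], 5)
```

Every sequence then has the same length, which is what allows the news items to be stacked into fixed-size batches.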


Cluster

Environment: R language:

1. Copy the data in data/test.xlsx (columns X to AP) to the clipboard.

2. z=read.table("clipboard",header=T)  # import the data into z

3. km <- kmeans(z[,1:20], 3)  # use the built-in kmeans function with 3 clusters

4. km  # print the k-means results
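
For readers who want to see what the clustering step does, here is a rough Python sketch of k-means using Lloyd's algorithm. It is an illustration only and is not part of the project: R's kmeans uses the Hartigan-Wong algorithm by default, so results can differ, and the function name, seed, and sample rows below are all hypothetical.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate an assignment step and a centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # pick k distinct rows as initial centres
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                 # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-D rows standing in for the 20 spreadsheet columns.
rows = [[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9], [9.0, 0.1], [9.2, 0.0]]
labels, centroids = kmeans(rows, k=3)
```

The outcome depends on the random initial centres, so different seeds can produce different groupings.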


Other files

preprocessing.py: general preprocessing, including removing useless data, extracting and counting events, and so on

nlp_preprocessing.py: special preprocessing for text analysis, including removing common words, stemming, and so on

hot_news_php/ : files for the web application for hot news recommendation

association_rule/ : code for association rule mining

cluster/ : files and results for clustering

eventforecast's People

Contributors

moment-of-peace, charlie1994, jadexin, zsjdddhr
