This project forked from junlulocky/airticketpredicting

Machine Learning modeling for Air Ticket Predicting

License: MIT License

Python 100.00%

Machine Learning for Air Ticket Predicting

Instructions on the codes

For the theory behind this project, please refer to my report. To keep track of the performance results, see the "Performance Record.xlsx" file.

I implemented many kinds of classifiers and regressors for this project in Python.

The features used for classification and regression are described in the report.

The packages used in the project are listed in requirements.txt.

This project defines two problems: the specific problem and the generalized problem. You can find their definitions in the report (or in the presentation draft).

I also provide the presentation draft and the slides (ppt). They are a condensed version of my report that does not focus on mathematical formulas but gives the intuition behind the project. Many details are omitted from the presentation draft, so please refer to the report if you want to know more.

Code Structure

Classification

  • Use classification to predict.

Specific Problem

|-inputClf_small                  # the input for classification method
|-inputClf_GMMOutlierRemoval      # the input for classification method with outlier removal by EM
|-inputClf_KMeansOutlierRemoval   # the input for classification method with outlier removal by K-Means

# Classification methods
|-ClassificationBase.py           # The base class of the classification objects
	|-ClassificationAdaBoost.py     # AdaBoost class  
	|-ClassificationDecisionTree.py # Decision Tree class
	|-ClassificationKNN.py          # K nearest neighbors class
	|-ClassificationLinearBlend.py  # linear blending class
	|-ClassificationLogReg.py       # logistic regression class
	|-ClssificationNN.py            # neural networks class
	|-ClassificationPLA.py          # perceptron learning algorithm class
	|-ClassificationRandomForest.py # random forest algorithm class
	|-ClassificationSVM.py          # SVM class
	|-ClassificationUniformBlending.py # uniform blending algorithm class
# Classification test
|-mainAdaBoostClf.py
|-mainDecisionTreeClf.py
|-mainGeneralizeClf.py
|-mainKNNClf.py
|-mainLinearBlendClf.py
|-mainLogisticReg.py
|-mainNNClf.py
|-mainPLA.py
|-mainRandomForestClf.py
|-mainSVMClf.py
|-mainUniformBlendClf.py
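
The layout above, a shared base class with one subclass per algorithm, can be illustrated with a minimal sketch of uniform blending, in which every base classifier casts one equally weighted vote. The names used here (`BaseClassifier`, `ThresholdClassifier`, `UniformBlendClassifier`, `predict`) are hypothetical and are not taken from the repository; the real classes may expose a different interface.

```python
from abc import ABC, abstractmethod
from collections import Counter


class BaseClassifier(ABC):
    """Hypothetical common interface; not the repository's ClassificationBase."""

    @abstractmethod
    def predict(self, samples):
        """Return one label per sample."""


class ThresholdClassifier(BaseClassifier):
    """Toy base learner: label 1 if one feature is below a threshold, else 0."""

    def __init__(self, feature_index, threshold):
        self.feature_index = feature_index
        self.threshold = threshold

    def predict(self, samples):
        return [1 if s[self.feature_index] < self.threshold else 0 for s in samples]


class UniformBlendClassifier(BaseClassifier):
    """Majority vote over base classifiers, each with equal weight."""

    def __init__(self, classifiers):
        self.classifiers = classifiers

    def predict(self, samples):
        all_votes = [clf.predict(samples) for clf in self.classifiers]
        # zip(*all_votes) groups the votes cast for each sample.
        return [Counter(votes).most_common(1)[0][0] for votes in zip(*all_votes)]


# Illustrative samples: (current price, days to departure); values are made up.
samples = [(90.0, 30), (150.0, 5), (110.0, 12)]
blend = UniformBlendClassifier([
    ThresholdClassifier(0, 100.0),
    ThresholdClassifier(0, 120.0),
    ThresholdClassifier(1, 10),
])
labels = blend.predict(samples)
```

Using an odd number of voters avoids ties in the two-class case; with an even number, a tie-breaking rule would have to be chosen explicitly.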

Generalized Problem

# methods
|-inputGeneralClf_small              # the input for the UniformGeneralize method
|-inputGeneralClf_HmmParsed          # the input patterns parsed from HMM sequence classification, used by the HmmGeneralizeClf method
|-ClassificationHmmGeneralize.py     # use hmm to do the generalized problem
|-ClassificationUniformGeneralize.py # use uniform blending to do the generalized problem
# test files
|-mainHmmGeneralizeClf.py
|-mainUniformGeneralize.py

Regression

  • Use regression to predict.

Specific Problem

|-inputReg # input for regression methods
# Regression methods
|-RegressionBase.py # The base class of the regression objects
	|-RegressionAdaBoost.py     # AdaBoost class
	|-RegressionDecisionTree.py # Decision Tree class
	|-RegressionGaussianProcess.py # gaussian process class
	|-RegressionKNN.py          # K nearest neighbors class
	|-RegressionLinReg.py       # linear regression class
	|-RegressionNN.py           # neural networks class
	|-RegressionRandomForest.py # random forest class
	|-RegressionRidgeReg.py     # ridge regression class
	|-RegressionUniformBlend.py # Uniform Blending class
# Regression test
|-mainAdaBoostReg.py
|-mainDecision.py
|-mainGaussianProcess.py
|-mainLinReg.py
|-mainNNReg.py
|-mainRandomForestReg.py
|-mainRidgeReg.py
|-mainUniformBlendReg.py
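
As an illustration of one of the regressors listed above, here is a minimal sketch of k-nearest-neighbors regression in plain Python. It is not the code in `RegressionKNN.py`; the function name `knn_regress`, the choice of Euclidean distance, and the toy data are assumptions made for this example.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict the mean target of the k training points closest to `query`."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training target with its Euclidean distance to the query.
    distances = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(y for _, y in nearest) / k


# Toy data: feature = days before departure, target = price (made-up values).
train_x = [(1,), (2,), (3,), (10,), (11,)]
train_y = [200.0, 190.0, 180.0, 100.0, 90.0]
prediction = knn_regress(train_x, train_y, (2,), k=3)
```

Here the three nearest neighbors of the query are the first three training points, so the prediction is the mean of their targets.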

Generalized Problem

No generalized-problem method is implemented for regression, because the final preferred algorithm is AdaBoost-DecisionTree classification.

AI (Artificial Intelligence)

Use artificial intelligence to predict; here this mainly means Q-learning.

# Artificial Intelligence methods
|-inputQLearning   # input for qlearning method
|-qlearn.py        # q learning class
|-mainQLearning.py # test for qlearning
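
The tabular Q-learning update behind this kind of approach can be sketched as follows. This is the generic textbook form, not the contents of `qlearn.py`; the state encoding (days left before departure), the two actions `"buy"` and `"wait"`, the reward values, and the default `alpha` and `gamma` are all assumptions for illustration.

```python
from collections import defaultdict

ACTIONS = ("buy", "wait")  # assumed action set for this sketch


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a)).
    Pass next_state=None for a terminal transition.
    """
    best_next = 0.0 if next_state is None else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state):
    """Pick the action with the highest current Q-value in `state`."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
# Buying with 1 day left ends the episode; the reward (money saved) is made up.
q_update(q, state=1, action="buy", reward=20.0, next_state=None)
# Waiting with 2 days left leads to the state with 1 day left.
q_update(q, state=2, action="wait", reward=0.0, next_state=1)
best = greedy_action(q, 2)
```

After these two updates, the value of waiting at two days left reflects the discounted value of buying one day later, which is the mechanism that lets the learner postpone a purchase when a better price is expected.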

HmmGeneralizeModel

It is used to generalize the patterns for the new routes.

|-HmmClassifier.py
|-mainHMM.py

Others

# utils
|-data_small   # input data (JSON files) crawled from an airline company, covering a 103-day period
|-inputGeneralRaw   # the generalized problem input matrices parsed from data_small; prices are not normalized (i.e. not converted to euros)
|-inputSpecificRaw  # the specific problem input matrices parsed from data_small; prices are not normalized (i.e. not converted to euros)
|-util.py      # util functions
|-load_data.py # load input from the raw json data
|-log.py       # log function; to hide the log output, set the DEBUG variable in this file to 'False'
|-priceBehaviorAnalysis.py # analyze the price behavior of several routes
|-plotOutlierRemoval.py    # plot the figure to illustrate outlier removal
|-plotNNUpdate.py          # plot the effect of different update method in NN
|-computeMoneySave.py      # compute the average amount of money saved per client

# infos
|-requirements.txt         # package requirements
|-Performance Record.xlsx  # record the performance of various parameters
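
The DEBUG switch described for `log.py` might look roughly like the sketch below. Only the existence of a `DEBUG` variable comes from the description above; the function name `log`, its signature, and the message prefix are assumptions.

```python
DEBUG = True  # set to False to silence the log output


def log(message):
    """Print `message` only while DEBUG is enabled; return whether it was printed."""
    if DEBUG:
        print("[DEBUG] " + str(message))
        return True
    return False


printed = log("loading data")
```

Because the flag is read at call time, changing it in one place affects every module that imports the function.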

Citation

The repo is based on the following research article:

  • Lu, Jun. "Machine learning modeling for time series problem: Predicting flight ticket prices." arXiv preprint arXiv:1705.07205 (2017).
