LOB

Benchmark Dataset of Limit Order Book in China Markets

FinAI Laboratory

Hong Kong Graduate School of Advanced Studies

[email protected]

Table of Contents

  1. Introduction
  2. Abstract
  3. Keywords
  4. Models
  5. Data Format
  6. Installation and Usage
  7. Results

Introduction

This repository contains the dataset and code described in the paper "Benchmark Dataset for Short-Term Market Prediction of Limit Order Book in China Markets". Five baseline models are tested on the proposed benchmark dataset: linear regression (LR), multilayer perceptron (MLP), convolutional neural network (CNN), long short-term memory (LSTM), and CNN-LSTM.

Note

  1. All algorithms are implemented with the deep learning framework PyTorch.
  2. Our PyTorch version is 1.7.0. If you use an older version, please modify the code accordingly.

Abstract

Limit order books (LOBs) generate “big financial data” that both the academic community and industry practitioners use for analysis and prediction. This paper presents a benchmark LOB dataset of the China stock market, covering a few thousand stocks for the period from June to September 2020. Experiment protocols are designed for model performance evaluation: at the end of every second, the task is to forecast the upcoming volume-weighted average price (VWAP) change and volume over 12 horizons ranging from 1 second to 300 seconds. Results based on a linear regression model and state-of-the-art deep learning models are compared. A practical short-term trading strategy framework based on the generated alpha signal is presented.
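To make the prediction target concrete, the following is a minimal sketch of how a forward VWAP change and forward volume could be computed for a single horizon. It is not taken from the repository. It assumes per-second traded price and volume series, a caller-supplied reference price, and a relative (percentage) definition of "change"; the function name `forward_vwap_targets` and all parameters are hypothetical, and the paper may define the targets differently.

```python
from typing import Optional, Sequence, Tuple


def forward_vwap_targets(
    prices: Sequence[float],
    volumes: Sequence[float],
    t: int,
    horizon: int,
    ref_price: float,
) -> Tuple[Optional[float], float]:
    """Return (relative VWAP change, total volume) over seconds t+1 .. t+horizon.

    Assumption: prices[i] and volumes[i] describe trading in second i, and
    ref_price is whatever reference the caller chooses (e.g. a current price).
    """
    window_p = prices[t + 1 : t + 1 + horizon]
    window_v = volumes[t + 1 : t + 1 + horizon]
    if len(window_p) < horizon:
        raise ValueError("not enough future data for this horizon")

    total_volume = float(sum(window_v))
    if total_volume == 0:
        # No trades in the window: VWAP is undefined, volume target is zero.
        return None, 0.0

    # VWAP = sum(price * volume) / sum(volume) over the forward window.
    vwap = sum(p * v for p, v in zip(window_p, window_v)) / total_volume
    return vwap / ref_price - 1.0, total_volume
```

Calling this once per horizon (for example 1, 60, and 300 seconds, chosen here only for illustration) at the end of each second would yield one pair of targets per horizon.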

Keywords

High-Frequency Trading, Limit Order Book, Artificial Intelligence, Machine Learning, Deep Neural Network, Short-Term Price Prediction, Alpha Signal, Trading Strategies, China Stock Market

Models

  1. Configuration of the linear regression model: Linear Regression

  2. Configuration of the multilayer perceptron model: Multilayer Perceptron

  3. Configuration of the shallow LSTM model: Long Short Term Memory

  4. Configuration of the CNN model: Convolutional Neural Network

  5. Configuration of the CNN-LSTM model: CNN-LSTM

Data Format

The folder structure of the LOB dataset is like the following.

   .\LOB_data
         .\2020.6
         .\2020.7
         .\2020.8
         .\2020.9
         lob_sz_6789_train_val.txt
         lob_sz_678_train.txt
         lob_sz_9_val.txt

"lob_sz_678_train.txt" is the file list used to train the machine learning models, and "lob_sz_9_val.txt" is the file list used for validation, that is, to measure model accuracy. Each folder under ".\LOB_data" holds one month of LOB features in ".csv" format for many different stocks. These ".csv" files store all the LOB features of the stocks row by row, in consecutive order. A detailed explanation of these LOB features can be found here (in English and in Chinese).
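The file lists and monthly ".csv" files described above could be read along the following lines. This is a sketch under stated assumptions, not the repository's loader: it assumes each file list contains one ".csv" path per line, relative to the dataset root, possibly written with Windows-style separators. The helper names `read_file_list` and `iter_rows` are hypothetical, and rows are returned as raw strings because the column layout is documented separately.

```python
import csv
from pathlib import Path
from typing import Iterator, List


def read_file_list(list_path: Path) -> List[str]:
    """Return the non-empty lines of a file list (assumed: one path per line)."""
    with open(list_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def iter_rows(root: Path, list_name: str) -> Iterator[List[str]]:
    """Yield every CSV row, file by file, in the order given by the file list."""
    for rel in read_file_list(root / list_name):
        # Assumption: entries may use ".\2020.6\..." style separators.
        csv_path = root / Path(rel.replace("\\", "/"))
        with open(csv_path, newline="", encoding="utf-8") as f:
            yield from csv.reader(f)
```

For example, `iter_rows(Path("LOB_data"), "lob_sz_678_train.txt")` would stream the training rows, and the same call with "lob_sz_9_val.txt" would stream the validation rows.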

Installation and Usage

Please refer to the ReadMe.txt in ./lob_modeling to install and run experiments.

Results

  1. Model performance metrics for different horizons computed on the test folds Results
