g0bel1n / ddql-optimal-execution

Double Deep Q-Learning for Optimal Execution implementation

License: MIT License

Python 100.00%
deep-learning double-dqn finance rl

ddql-optimal-execution's Introduction

Lucas Saban

Quant Researcher @QRT

Previously a double-degree Master's student at ENSAE Paris and MVA (ENS Paris-Saclay). Interested in machine learning, deep learning, and swarm intelligence.

I open source some of my projects here.

Currently working on:

  • Quantools: [Financial Engineering | Visualization | High Performance Python] A toolbox for quants (ML-oriented)
  • DDQL: [RL | Optimal Execution | Trading] Implementation of a paper.

Contact: lucas[dot]saban[at]ensae[dot]fr

Awards and Competitions

Some projects

Academy

  • Leveraging latent representations for efficient textual OOD detection. Paper, Repo (not published)

  • Large-scale joint hyperparameter and feature selection using multi-objective heuristic optimization. Report, Presentation and Repo

  • Implementation of Fast Shapelet discovery for time series classification. Original Paper, Report and Repo

  • Implementation of Double Deep Q-Learning for Optimal Trading Execution. Original Paper, Report and Repo

Industry

  • Spatio-temporal spectral clustering algorithm based on HDBSCAN for dynamic event recognition. The source is private, but I am happy to talk about it.

  • A full-stack image classification project using TensorFlow, transfer learning and CNNs. The front end is an iOS app that I wrote in Swift. The project was built with Augustin Cramer.

Demo

  • A scikit-learn plugin for binary classification tasks, delivered as a PyPI package. The source is available here.

TAM_logo

Demo

The ants adapt to changes in their environment. They quickly find a new path by following the pheromone trails.

gRavlaw

ddql-optimal-execution's Issues

load episodes from csv ?

https://api.github.com/g0bel1n/DDQL-optimal-execution/blob/8fb5bdca42a467e52167801c55360b4aa2c26fba/src/environnement/_env.py#L11

import pandas as pd 
from ._state import State

class MarketEnvironnement:
    
    def __init__(self, initial_inventory : float = 100.) -> None:

        self.historical_data = pd.read_csv("data/historical_data.csv")
        self.historical_data = self.historical_data.set_index("Date")

        # TODO: load episodes from csv ?
        # assignees: g0bel1n



        self.horizon = self.historical_data.shape[0]

        self.current_step = 0
        self.cols = self.historical_data.columns

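        # Append inventory and step to the market features (list concatenation, not element-wise addition)
        initial_state = list(self.historical_data.iloc[self.current_step, :].values) + [initial_inventory, self.current_step]
        states_elements = list(self.cols) + ['inventory', 'step']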

        self.done = False

        self.state = State(states_elements, initial_state)    

    def step(self, action: int) -> tuple:
        # Execute one time step within the environment
        self.current_step += 1

        reward = self._get_reward(action)

        self.done = self.current_step == self.horizon - 1

        self.state['inventory'] = self.state['inventory'] + action
        self.state['step'] = self.state['step'] + 1


        if not self.done:
            # ** unpacking needs a mapping (column name -> value), not a NumPy array
            self.state.update_state(**self.historical_data.iloc[self.current_step, :].to_dict())

        return self.state, reward, self.done
    
    def get_trading_episodes(self) -> tuple:
        # Return the trading episodes
        return None
    
    
    def _get_reward(self, action: int) -> float:
        return self.state[-1] if action == 0 else -self.state[-1]
    
    def reset(self) -> None:
        # Reset the state of the environment to an initial state
        self.current_step = 0
        self.state = self.historical_data.iloc[self.current_step, :].values
        return self.state
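
One possible reading of the "load episodes from csv" TODO is to split the historical data into one episode per trading day. The snippet below is a minimal sketch under that assumption, not the repository's implementation: the function name `load_episodes`, the `price` and `volume` columns, and the in-memory CSV are all hypothetical. Only the `Date` column name comes from the code above.

```python
import io

import pandas as pd


def load_episodes(csv_source, date_col="Date"):
    """Split a historical-data CSV into one DataFrame per calendar day.

    Hypothetical helper: treating each calendar day as one trading
    episode is an assumption, not something the repository specifies.
    """
    data = pd.read_csv(csv_source, parse_dates=[date_col])
    data = data.sort_values(date_col).set_index(date_col)
    # Group rows by calendar day; each group becomes one episode.
    return [day_df for _, day_df in data.groupby(data.index.date)]


# Tiny in-memory CSV standing in for data/historical_data.csv (made-up values).
fake_csv = io.StringIO(
    "Date,price,volume\n"
    "2023-01-02 09:30:00,100.0,10\n"
    "2023-01-02 09:31:00,100.5,12\n"
    "2023-01-03 09:30:00,101.0,8\n"
)
episodes = load_episodes(fake_csv)
episode_lengths = [len(ep) for ep in episodes]
```

With episodes held as a list of DataFrames, `reset` could pick one episode and `self.horizon` could be set per episode instead of spanning the whole file.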
    
    

check if this is correct and clarify numpy/torch/State object

https://api.github.com/g0bel1n/DDQL-optimal-execution/blob/8fb5bdca42a467e52167801c55360b4aa2c26fba/src/agent/_agent.py#L70

from ._neural_net import QNet
from ._utils import get_device
from ._state import State

import torch
import torch.nn as nn
import torch.optim as optim

import numpy as np
import scipy


from typing import Optional

class Agent:

    def __init__(
        self,
        state_dict: Optional[dict] = None,
        greedy_decay_rate: float = .1,
        target_update_rate: int = 100,
        initial_greediness: float = .2,
        mode: str = 'train',
        lr: float = 1e-3,
    ) -> None:

        self.device = get_device()
        print(f"Using {self.device} device")

        self.main_net = QNet().to(self.device)
        self.target_net = QNet().to(self.device)

    
        if state_dict is not None:
            self.main_net.load_state_dict(state_dict)
            self.target_net.load_state_dict(state_dict)

        self.greedy_decay_rate = greedy_decay_rate
        self.target_update_rate = target_update_rate
        self.greediness= initial_greediness

        self.mode = mode

        self.learning_step = 0

        if self.mode == 'train':
            self.optimizer = optim.RMSprop(self.main_net.parameters(), lr=lr)
            self.loss_fn = nn.MSELoss()



    def train(self) -> None:
        self.main_net.train()
        self.mode = 'train'

    def eval(self) -> None:
        self.main_net.eval()
        self.mode = 'eval'
    

    def _get_action(self, state) -> torch.Tensor:

        if np.random.rand() < self.greediness and self.mode == 'train':
            action = np.random.binomial(state['inventory'], 1/state['inventory'])
        else:
            action = self.main_net(state, action).argmax().item() #clarify
        
        return action
    
    def _update_target_net(self) -> None:
        self.target_net.load_state_dict(self.main_net.state_dict())


    def _complete_target(self, experience_batch : torch.Tensor) -> torch.Tensor:
        # torch.cat expects a sequence of tensors, so the two index tensors are wrapped in a tuple
        ids = torch.cat((torch.where(experience_batch['done'] == 0)[0], torch.where(experience_batch['predone'] == 0)[0]))
        for experience in experience_batch[ids]:
            constraints = ({'type': 'ineq', 'fun': lambda x: x}, {'type': 'ineq', 'fun': lambda x: experience['inventory'] - x}) 
            # TODO: check if this is correct and clarify numpy/torch/State object
            # assignees: g0bel1n
            best_action = scipy.optimize.minimize(lambda x: -self.main_net(experience['next_state'], x), np.array([0.]), constraints=constraints).x
            target_complement = experience['gamma'] * self.target_net(experience['next_state'],best_action)
            experience['target'] += target_complement

        return experience_batch

    
    def learn(self, experience_batch : torch.Tensor) -> None:
        experience_batch = self._complete_target(experience_batch)
        dataloader = torch.utils.data.DataLoader(experience_batch, batch_size=32, shuffle=True)
        for batch in dataloader:
            target = batch['target'].unsqueeze(1)
            pred = self.main_net(batch['state'], batch['action'])
            loss = self.loss_fn(pred, target)
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()

        self.learning_step += 1
        self.greediness = max(0.01, self.greediness * self.greedy_decay_rate)
        if self.learning_step % self.target_update_rate == 0:
            self._update_target_net()
            print(f"Target network updated at step {self.learning_step} with greediness {self.greediness:.2f}")

        

    def __call__(self, state) -> torch.Tensor:
        return self._get_action(state)
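
To help with the "check if this is correct" TODO, the Double Q-learning target that `_complete_target` appears to build can be written out for a single transition: the main network selects the best next action, and the target network evaluates it. The sketch below uses NumPy only and swaps the `scipy.optimize.minimize` call for a plain grid search over `[0, inventory]`. The Q-functions are hypothetical callables `(state, action) -> float` standing in for `QNet`, whose real signature is not shown here.

```python
import numpy as np


def double_q_target(reward, next_state, inventory, gamma,
                    main_q, target_q, done, n_grid=101):
    """Double Q-learning target for one transition.

    main_q and target_q are hypothetical stand-ins for the two networks.
    A grid search replaces the constrained optimizer used in the snippet above.
    """
    if done:
        # Terminal transition: no bootstrapped term is added.
        return reward
    # Candidate actions: evenly spaced quantities between 0 and the remaining inventory.
    candidates = np.linspace(0.0, inventory, n_grid)
    # Step 1: the main network selects the action.
    main_values = np.array([main_q(next_state, a) for a in candidates])
    best_action = candidates[int(main_values.argmax())]
    # Step 2: the target network evaluates the selected action.
    return reward + gamma * target_q(next_state, best_action)


# Toy Q-functions (made up for illustration only).
def toy_main_q(state, action):
    return -(action - 3.0) ** 2  # highest value at action = 3


def toy_target_q(state, action):
    return 2.0 * action


target = double_q_target(reward=1.0, next_state=None, inventory=10.0, gamma=0.5,
                         main_q=toy_main_q, target_q=toy_target_q, done=False)
terminal_target = double_q_target(reward=1.0, next_state=None, inventory=10.0, gamma=0.5,
                                  main_q=toy_main_q, target_q=toy_target_q, done=True)
```

Keeping selection and evaluation in separate networks is what distinguishes Double Q-learning from plain Q-learning, where a single network does both.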
       
        
        

implement create_fake_LOB_data

following the same output data structure as in the paper

# TODO: implement create_fake_LOB_data

        raise ValueError("return_type must be either 'numpy' or 'torch'")


# TODO: implement create_fake_LOB_data
# following the same data structure outputs than in the paper
def create_fake_LOB_data(
    n_samples: int = 1000, n_features: int = 10, n_classes: int = 2
) -> tuple:
    """Creates fake LOB data for testing purposes."""
    
    pass

implement inventory_action_transformer according to Appendix A.1

and add tests for it in tests/test_data_utils.py (if possible) (might need to implement a function to compute the inverse of the transformation)

# TODO: implement inventory_action_transformer according to Appendix A.1

    :param inv_act_pairs: a tensor of shape (batch_size, 2) where the first column is the inventory and
    the second column is the action
    """
    # TODO: implement inventory_action_transformer according to Appendix A.1 
    # and add tests for it in tests/test_data_utils.py (if possible) (might need to implement a function to compute the inverse of the transformation)
    pass
