vincent-picaud / deep_partial_least_squares

This project forked from kem975/deep_partial_least_squares

Source code for Deep Partial Least Squares for Empirical Asset Pricing.

Home Page: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4137647

R 100.00%

Deep Partial Least Squares

Matthew Dixon, Nick Polson, and Kemen Goicoechea, Deep Partial Least Squares for Empirical Asset Pricing, 2022. This code accompanies the paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4137647

This is a working repository: the code is still being tested and is subject to further change.

Configuration

Anaconda

  • Install conda 4.12 or a later version
  • Anaconda environment configuration is in DPLS.yml
  • Create the conda virtual environment by typing conda env create -f DPLS.yml

R Packages

  • Link the Anaconda environment to R. In RStudio, go to Tools > Global Options > Python > Select > Conda Environment
  • Run the first 3 lines of src/main.r to obtain the R packages
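
The RStudio menu step can also be done from an R session. The following is a minimal sketch, assuming the reticulate package is installed and that the environment created from DPLS.yml is named DPLS. The actual name is set by the `name:` field of that file and is not stated here, so treat `env_name` as a placeholder.

```r
# Assumption: the conda environment is called "DPLS"; check the `name:` field
# in DPLS.yml and adjust this value if it differs.
env_name <- "DPLS"

if (requireNamespace("reticulate", quietly = TRUE)) {
  # List the conda environments that reticulate can find on this machine.
  envs <- tryCatch(reticulate::conda_list(), error = function(e) NULL)
  if (!is.null(envs) && env_name %in% envs$name) {
    # Bind this R session to the environment before any Python code runs.
    reticulate::use_condaenv(env_name, required = TRUE)
    print(reticulate::py_config())
  } else {
    message("Conda environment '", env_name, "' not found.")
  }
} else {
  message("Install reticulate first: install.packages('reticulate')")
}
```

Calling `use_condaenv()` must happen before Python is initialized in the session; otherwise restart R and run it first.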

Source Code

The repository contains one main file, main.R, which calls functions in other files. The code implements four types of models: LASSO (glmnet package), PLS (pls package), Neural Network (keras), and Deep Partial Least Squares (pls + keras).
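
To give a feel for the two linear baselines, here is a minimal sketch on simulated data, assuming the glmnet and pls packages are installed. The simulated matrix `X`, the coefficient vector `beta`, and all variable names are illustrative and do not come from main.R. The NN and DPLS models are left out because they need a working keras backend.

```r
library(glmnet)  # LASSO
library(pls)     # partial least squares

set.seed(42)
n <- 200
p <- 20
X <- matrix(rnorm(n * p), nrow = n, ncol = p)     # simulated factor exposures
beta <- c(1, -0.5, 0.25, rep(0, p - 3))            # only three factors matter
y <- as.numeric(X %*% beta + rnorm(n, sd = 0.1))   # simulated returns

# LASSO: alpha = 1 selects the L1 penalty; lambda is chosen by cross-validation.
lasso_fit <- cv.glmnet(X, y, alpha = 1)
lasso_coef <- as.matrix(coef(lasso_fit, s = "lambda.min"))  # (p + 1) x 1, with intercept

# PLS: project X onto 3 latent components chosen to covary with y.
dat <- data.frame(y = y, X = I(X))
pls_fit <- plsr(y ~ X, ncomp = 3, data = dat)
pls_scores <- unclass(scores(pls_fit))             # n x 3 matrix of latent scores
pls_pred <- drop(predict(pls_fit, newdata = dat, ncomp = 3))

print(dim(pls_scores))
print(cor(y, pls_pred))
```

The score matrix is the low-dimensional representation produced by PLS. One plausible reading of "pls + keras" is that such scores are passed on to a network, but that is an assumption here, not something this README confirms.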

Trained models for NN and DPLS are in the data folder. However, these are TensorFlow Python saved models and may not work across different CPU architectures. TensorFlow saved models are needed to use automatic differentiation. If you do not wish to use this functionality, you can train the models with the R interface to Keras (RKeras) instead. Training across all 330 periods takes approximately one hour.

Data

The factor data has been collected from a financial data vendor and sanitized so that it does not violate the data licensing agreement and has no commercial utility. The data, for non-commercial use only, can be downloaded from: https://www.dropbox.com/s/4o86a3p2n7kawst/ScaledData.RData?dl=0

Then, move ScaledData.RData to the data folder.
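
Once the file is in place, its contents can be checked before running main.R. This is a small sketch using base R only; the helper name `inspect_rdata` is hypothetical, and the relative path assumes the working directory is the repository root.

```r
# Hypothetical helper: list the objects stored in an .RData file without
# touching the global workspace.
inspect_rdata <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(character(0)))
  }
  env <- new.env()
  loaded <- load(path, envir = env)   # load() returns the names it restored
  for (nm in loaded) {
    cat(nm, ":", class(get(nm, envir = env)), "\n")
  }
  invisible(loaded)
}

# Assumption: the working directory is the repository root.
inspect_rdata(file.path("data", "ScaledData.RData"))
```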

The actual symbols have been remapped and the factors have been normalized in each period. The stocks are classified by GICS, and dummy variables represent the four different categories:

industry=[10, 20, 30, 40, 50, 60, 70]

subindustry=[10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80]

sector=[10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]

indgroup=[10, 20, 30, 40, 50]

Note that the first element in each list is dummy-encoded as 1 0 0 0 ..., the next as 0 1 0 0 ..., and so on.
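
The encoding can be reproduced in base R as follows. The level codes are copied from the industry list above; the five stock codes in `stock_industry` are made up for illustration and are not taken from the data set.

```r
# Category codes copied from the 'industry' list.
industry_levels <- c(10, 20, 30, 40, 50, 60, 70)

# Hypothetical industry codes for five stocks (illustrative only).
stock_industry <- c(10, 30, 20, 70, 10)

# One column per level, in list order: the entry is 1 where the stock's code
# equals the column's level and 0 elsewhere.
dummies <- outer(stock_industry, industry_levels, "==") + 0L
colnames(dummies) <- paste0("industry_", industry_levels)

print(dummies)
```

A stock with code 10 therefore gets the row 1 0 0 0 0 0 0, and a stock with code 30 gets 0 0 1 0 0 0 0.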

| ID | Symbol | Description |
| --- | --- | --- |
| | | Value Factors |
| 1 | B/P | Book to Price |
| 2 | CF/P | Cash Flow to Price |
| 3 | E/P | Earnings to Price |
| 4 | S/EV | Sales to Enterprise Value (EV). EV is given by EV = Market Cap + LT Debt + max(ST Debt - Cash, 0), where LT (ST) stands for long (short) term. |
| 5 | EB/EV | EBITDA to EV |
| 6 | FE/P | Forecasted E/P. Forecast earnings are calculated from Bloomberg earnings consensus estimates data. For coverage reasons, Bloomberg uses the 1-year and 2-year forward earnings. |
| 17 | DIV | Dividend yield. The exposure to this factor is just the most recently announced annual net dividends. Stocks with high dividend yields have high exposures to this factor. |
| | | Size Factors |
| 8 | MC | Log (Market Capitalization) |
| 9 | S | Log (Sales) |
| 10 | TA | Log (Total Assets) |
| | | Trading Activity Factors |
| 11 | TrA | Trading Activity is a turnover-based measure. Bloomberg focuses on turnover, which is trading volume normalized by shares outstanding. This indirectly controls for the Size effect. The measure is based on the exponentially weighted moving average (EWMA) of the ratio of shares traded to shares outstanding. In addition, to mitigate the impact of sharp, short-lived spikes in trading volume, Bloomberg winsorizes the data: first, daily trading volume is compared to the long-term EWMA volume (180-day half-life); then the data is capped at 3 standard deviations away from the EWMA average. |
| | | Earnings Variability Factors |
| 12 | EaV/TA | Earnings Volatility to Total Assets: earnings volatility measured over the last 5 years, divided by median Total Assets over the last 5 years. |
| 13 | CFV/TA | Cash Flow Volatility to Total Assets: cash flow volatility measured over the last 5 years, divided by median Total Assets over the last 5 years. |
| 14 | SV/TA | Sales Volatility to Total Assets: sales volatility over the last 5 years, divided by median Total Assets over the last 5 years. |
| | | Volatility Factors |
| 15 | RV | Rolling Volatility, the return volatility over the latest 252 trading days. |
| 16 | CB | Rolling CAPM Beta, the regression coefficient from the rolling-window regression of stock returns on local index returns. |
| | | Growth Factors |
| 7 | TAG | Total Asset Growth, the 5-year average growth in Total Assets. |
| 18 | EG | Earnings Growth, the 5-year average growth in Earnings. |
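
Two of the definitions in the table can be made concrete with a short base R sketch. The EV function follows the formula given for factor 4. The spike-capping function is only one possible reading of the trading activity description: the recursive EWMA form and the use of the full-sample standard deviation are assumptions, and the function names and input numbers are illustrative, not the vendor's implementation.

```r
# EV = Market Cap + LT Debt + max(ST Debt - Cash, 0), as in the S/EV row.
enterprise_value <- function(market_cap, lt_debt, st_debt, cash) {
  market_cap + lt_debt + pmax(st_debt - cash, 0)
}

# Recursive EWMA whose weights halve every `half_life` observations.
# Assumption: the series is seeded with its first value.
ewma <- function(x, half_life) {
  lambda <- 0.5^(1 / half_life)
  out <- numeric(length(x))
  out[1] <- x[1]
  for (t in seq_along(x)[-1]) {
    out[t] <- lambda * out[t - 1] + (1 - lambda) * x[t]
  }
  out
}

# Cap each observation at n_sd standard deviations around the EWMA.
# Assumption: the standard deviation is taken over the whole series.
cap_spikes <- function(x, half_life = 180, n_sd = 3) {
  avg <- ewma(x, half_life)
  s <- sd(x)
  pmin(pmax(x, avg - n_sd * s), avg + n_sd * s)
}

# Cash exceeds short-term debt, so the max() term contributes nothing.
print(enterprise_value(market_cap = 1000, lt_debt = 200, st_debt = 50, cash = 80))

# A flat turnover series with one spike; only the spike is pulled back.
turnover <- c(rep(1, 100), 50, rep(1, 100))
capped <- cap_spikes(turnover)
print(range(capped))
```

With these inputs the first call returns 1200, because ST Debt - Cash is negative and is floored at zero; with cash of 20 instead of 80 the result would be 1230.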

Contributors: kem975
