Fama French Factor Finder

Background

After reading The Incredible Shrinking Alpha 2nd edition: How to be a successful investor without picking winners and Your Complete Guide to Factor-Based Investing: The Way Smart Money Invests Today by Larry Swedroe and Andrew Berkin, I decided that I needed a way to find investments with exposure to "factors". The factors I am searching for are the combined set of factors found in the Fama French five factor model and the Carhart four factor model: (1) market risk, (2) small caps, (3) stocks with a high book-to-market ratio, i.e. value stocks, (4) firms with high operating profitability, and (5) firms that invest conservatively. These factors are also known as market minus risk free (Mkt-RF), small minus big (SMB), high minus low (HML), robust minus weak (RMW), and conservative minus aggressive (CMA), respectively. The Carhart model contributes one more factor, momentum, also known as winners minus losers (WML), which shows up in the results below.

This application combines the following data sources:

- Factor return data from Ken French's data library.
- Investment metadata and return data from the Yahoo Finance API.
  - With some help on the metadata from the Seeking Alpha API.
- Ticker symbols by market type (US, Developed ex US, and Emerging) found with the help of Fidelity's ETF screener.
  - I look for ETFs of equity funds that are not leveraged or inverse, are not thematic, and have the country exposure appropriate for their market type.

I am looking for funds whose returns have a statistically significant relationship with the factor returns in the Ken French data library.
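The coef, tvalue, and pvalue columns shown later in this README are the usual outputs of a linear regression of fund returns on factor returns. The following is a minimal sketch of such a regression, not necessarily how this application computes them. The function name `factor_regression` and the synthetic data are assumptions made up for illustration.

```python
import numpy as np
from scipy import stats


def factor_regression(fund_excess, factors):
    """Regress fund excess returns on factor returns with ordinary least squares.

    fund_excess: shape (n_periods,), fund return minus the risk-free rate.
    factors: shape (n_periods, n_factors), one column per factor.
    Returns (coef, tvalue, pvalue) arrays; index 0 is the intercept.
    """
    n = len(fund_excess)
    X = np.column_stack([np.ones(n), factors])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
    resid = fund_excess - X @ coef
    dof = n - X.shape[1]  # residual degrees of freedom
    sigma2 = resid @ resid / dof  # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    tvalue = coef / se
    pvalue = 2 * stats.t.sf(np.abs(tvalue), dof)  # two-sided test
    return coef, tvalue, pvalue


# Synthetic data: a made-up fund that loads 0.5 on the first factor only.
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.03, size=(120, 2))
fund_excess = 0.5 * factors[:, 0] + rng.normal(0.0, 0.01, size=120)

coef, tvalue, pvalue = factor_regression(fund_excess, factors)
for name, c, t, p in zip(["alpha", "factor_1", "factor_2"], coef, tvalue, pvalue):
    print(f"{name:>8}  coef={c:5.2f}  t={t:6.2f}  p={p:.2f}")
```

A fund is a candidate for a factor when its coefficient on that factor is positive and the p-value is small.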

I try to find the best combination of funds for each market type. My criterion resembles the Sharpe ratio (mean / standard deviation): I want maximum exposure to all five factors, and I want that exposure divided as equally as possible across them. Inspired by this blog post on how to optimize a portfolio's Sharpe ratio, I employ scipy's SLSQP optimizer to do this. I have almost no understanding of the math behind this optimizer. (A gentle intro for the layman, a cookbook for the expert.)
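As a rough sketch of the idea, assuming the objective is the mean of the portfolio's factor exposures divided by their standard deviation, the optimization could look like the code below. The `loadings` matrix is hypothetical, three factors are used instead of five to keep it short, and the small `eps` smoothing term is my own addition to avoid dividing by zero. The application's actual objective may differ.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical factor loadings: one row per fund, one column per factor.
loadings = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.7],
])


def negative_score(weights, loadings, eps=0.01):
    """Negative of mean exposure over (smoothed) standard deviation of exposure."""
    exposure = weights @ loadings  # portfolio exposure to each factor
    return -exposure.mean() / np.sqrt(exposure.var() + eps**2)


n_funds = loadings.shape[0]
x0 = np.full(n_funds, 1.0 / n_funds)  # start from equal weights
result = minimize(
    negative_score,
    x0=x0,
    args=(loadings,),
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n_funds,  # long-only weights
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # fully invested
)
weights = result.x
print("weights:", weights.round(2))
print("exposure per factor:", (weights @ loadings).round(2))
```

Minimizing the negative score maximizes the score. The bounds and the equality constraint keep the weights between 0 and 1 and summing to 1.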

The optimizer can easily get stuck in local optima, so I run it 100 times, each time on a random 80% sample of the data. This seems to mitigate the problem.
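The repeated runs can be sketched as follows. This assumes "the data" means the candidate funds, which may not be what the application samples (it could equally be return periods). The helper names, the random starting weights, and the random `loadings` matrix are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def score(weights, loadings, eps=0.01):
    """Mean factor exposure over its (smoothed) standard deviation."""
    exposure = weights @ loadings
    return exposure.mean() / np.sqrt(exposure.var() + eps**2)


def optimize(loadings, rng):
    """One SLSQP run from a random starting point on the given funds."""
    n = loadings.shape[0]
    result = minimize(
        lambda w: -score(w, loadings),
        x0=rng.dirichlet(np.ones(n)),  # random weights that sum to 1
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    )
    return result.x


def best_of_subsamples(loadings, runs=100, frac=0.8, seed=0):
    """Optimize on random subsets of funds and keep the best-scoring weights."""
    rng = np.random.default_rng(seed)
    n = loadings.shape[0]
    size = max(1, int(round(frac * n)))
    best_score, best_weights = -np.inf, None
    for _ in range(runs):
        keep = rng.choice(n, size=size, replace=False)  # 80% of the funds
        weights = np.zeros(n)  # funds left out of the sample get zero weight
        weights[keep] = optimize(loadings[keep], rng)
        run_score = score(weights, loadings)
        if run_score > best_score:
            best_score, best_weights = run_score, weights
    return best_score, best_weights


# Hypothetical loadings for 10 funds on 5 factors.
loadings = np.random.default_rng(1).uniform(-0.2, 1.0, size=(10, 5))
best_score, best_weights = best_of_subsamples(loadings)
print(round(float(best_score), 2), best_weights.round(2))
```

Each run sees a different subset and a different starting point, so a run that lands in a poor local optimum is simply outscored by a better one.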

Getting started

This application is built and run with docker and docker-compose.

Make a secrets.env file based on the template secrets.env.example
```
cp secrets.env.example secrets.env
```
Fill out the secrets.env file
Build the application's image
```
docker-compose build
```
Start the database
```
docker-compose up -d db
```
Step into the application development environment
```
docker-compose run --rm app bash
```
Once inside, run the application
```
python .
```

Developing

Install a new package

Add the package name to requirements.in. Then, while exec-ed into the container, but not while running the application, run:
```
pip-compile --output-file=- > requirements.txt
```
This will write the pinned dependencies to requirements.txt. For more details, see this Stack Overflow answer.
This new package will be gone once you exit the container. But since it's still listed in requirements.txt, you can bake it into all future containers by rebuilding the image
```
docker-compose build
```

Play around with the data frame

Set a break point at the end of __main__.py, and run python . to catch it. Then play around with the data frame.

import pdb; pdb.set_trace() # set a break point
(Pdb) df # look at the data frame
(Pdb) df.sort_values(by=['coef'], ascending=False) # sort it
(Pdb) df.head(20) # look at the first twenty rows

For ideas about how to further manipulate the data frame, google "pandas cheat sheet".
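For a starting point, here are a few common manipulations on a small stand-in data frame. The rows are copied from the tables in the Conclusions section, and the column names match those tables. The real data frame in the debugger may have more columns.

```python
import pandas as pd

# A few rows copied from the Conclusions tables, in the same long format.
df = pd.DataFrame(
    [
        (0.30, 2.16, 0.03, "high_minus_low", "RODM"),
        (0.46, 2.77, 0.01, "robust_minus_weak", "RODM"),
        (0.18, 2.05, 0.04, "small_minus_big", "RODM"),
        (0.63, 1.91, 0.06, "robust_minus_weak", "IMOM"),
        (0.36, 1.84, 0.07, "small_minus_big", "IMOM"),
        (0.54, 4.23, 0.00, "winners_minus_losers", "IMOM"),
    ],
    columns=["coef", "tvalue", "pvalue", "factor", "ticker"],
)

# Keep only the loadings that are significant at the 5% level.
significant = df[df.pvalue <= 0.05]

# Pivot to one row per ticker and one column per factor.
wide = df.pivot(index="ticker", columns="factor", values="coef")

# Rank tickers by their strongest single loading.
strongest = df.groupby("ticker").coef.max().sort_values(ascending=False)

print(significant, wide, strongest, sep="\n\n")
```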

Conclusions

The world market is roughly divided four-eighths US, three-eighths Developed ex US, and one-eighth Emerging. I intend to do the same with my equity allocation.
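As a quick illustration of that split, the arithmetic looks like this. The 100,000 total is an arbitrary example amount, and `allocate` is a hypothetical helper, not part of the application.

```python
from fractions import Fraction

# Market-type shares from the text, expressed in eighths.
split = {
    "US": Fraction(4, 8),
    "Developed ex US": Fraction(3, 8),
    "Emerging": Fraction(1, 8),
}


def allocate(equity_total, split):
    """Divide an equity allocation across market types by their shares."""
    return {market: float(equity_total * share) for market, share in split.items()}


allocation = allocate(100_000, split)  # arbitrary example amount
for market, amount in allocation.items():
    print(f"{market:>16}: {amount:,.0f}")
```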

Developed markets ex US

Ken French's data library defines Developed market ex US countries as, "Australia, Austria, Belgium, Canada, Switzerland, Germany, Denmark, Spain, Finland, France, Great Britain, Greece, Hong Kong, Ireland, Italy, Japan, Netherlands, Norway, New Zealand, Portugal, Sweden, and Singapore".

For the size and value factors, my choice is RODM. My data frame contained four choices, but after some manual research I excluded three of them: all three hold non-Developed countries, and one of those is also actively managed. The remaining choice happens to have the lowest expense ratio (0.29%) and nearly the lowest yield (2.78%, against 2.76% for FNDC). A lower yield is better for non-tax-advantaged accounts.

smb = df[(df.coef >= 0) & (df.factor == 'small_minus_big')]
hml = df[(df.coef >= 0) & (df.factor == 'high_minus_low')]
neg = df[df.coef <= 0]

df[~df.ticker.isin(neg.ticker) & df.ticker.isin(smb.ticker) & df.ticker.isin(hml.ticker)].sort_values(by=['coef_sum', 'ticker', 'factor'])

coef  tvalue  pvalue             factor ticker  coef_sum   yield   expense ratio   actively managed  Non-Developed Countries
0.30    2.16    0.03     high_minus_low   RODM      0.94   2.78%        0.29%            False
0.46    2.77    0.01  robust_minus_weak   RODM      0.94   2.78%        0.29%            False
0.18    2.05    0.04    small_minus_big   RODM      0.94   2.78%        0.29%            False
0.23    2.21    0.03     high_minus_low   FNDC      1.23   2.76%        0.39%            False       South Korea (7.12%)
0.57    4.71    0.00  robust_minus_weak   FNDC      1.23   2.76%        0.39%            False       South Korea (7.12%)
0.43    6.94    0.00    small_minus_big   FNDC      1.23   2.76%        0.39%            False       South Korea (7.12%)
0.43    2.82    0.01     high_minus_low   FYLD      1.41   3.72%        0.59%            True        China (9.06%)
0.71    3.97    0.00  robust_minus_weak   FYLD      1.41   3.72%        0.59%            True        China (9.06%)
0.27    2.99    0.00    small_minus_big   FYLD      1.41   3.72%        0.59%            True        China (9.06%)
0.59    2.52    0.01     high_minus_low    FID      1.85   3.81%        0.60%            False       China (4.25%), South Korea (3.46%)
1.01    3.67    0.00  robust_minus_weak    FID      1.85   3.81%        0.60%            False       China (4.25%), South Korea (3.46%)
0.25    1.75    0.08    small_minus_big    FID      1.85   3.81%        0.60%            False       China (4.25%), South Korea (3.46%)
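The coef_sum values in this table match the per-ticker sum of the listed coefficients (for RODM, 0.30 + 0.46 + 0.18 = 0.94). Assuming that is how the column is derived, which the table suggests but which I have not confirmed against the application code, it can be reproduced like this:

```python
import pandas as pd

# Coefficients copied from the table above for three of the tickers.
df = pd.DataFrame(
    [
        (0.30, "high_minus_low", "RODM"),
        (0.46, "robust_minus_weak", "RODM"),
        (0.18, "small_minus_big", "RODM"),
        (0.23, "high_minus_low", "FNDC"),
        (0.57, "robust_minus_weak", "FNDC"),
        (0.43, "small_minus_big", "FNDC"),
        (0.59, "high_minus_low", "FID"),
        (1.01, "robust_minus_weak", "FID"),
        (0.25, "small_minus_big", "FID"),
    ],
    columns=["coef", "factor", "ticker"],
)

# Assumption: coef_sum is the sum of a ticker's coefficients, repeated on each row.
df["coef_sum"] = df.groupby("ticker").coef.transform("sum").round(2)

# Same ordering as the query above: by coef_sum, then ticker, then factor.
ranked = df.sort_values(by=["coef_sum", "ticker", "factor"])
print(ranked.to_string(index=False))
```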

For the momentum factor, my choice is IMOM. It has the strongest momentum coefficient of any of the funds, and it also shows positive, marginally significant coefficients on the profitability and size factors.

df[df.ticker == 'IMOM']

coef  tvalue  pvalue                factor ticker
0.63    1.91    0.06     robust_minus_weak   IMOM
0.36    1.84    0.07       small_minus_big   IMOM
0.54    4.23    0.00  winners_minus_losers   IMOM

Emerging markets

Ken French's data library defines Emerging market countries as, "Argentina, Brazil, Chile, China, Colombia, Czech Republic, Egypt, Greece, Hungary, India, Indonesia, Malaysia, Mexico, Pakistan, Peru, Philippines, Poland, Qatar, Russia, Saudi Arabia, South Africa, South Korea, Taiwan, Thailand, Turkey, United Arab Emirates."

For the size and value factors, my choice is EEMS, with a 0.71% expense ratio. It's a "classic" choice since it's the only one that exhibits small cap and value tilts at the same time, and it avoids the common problem of having a negative momentum tilt.

smb = df[(df.coef >= 0) & (df.factor == 'small_minus_big')]
hml = df[(df.coef >= 0) & (df.factor == 'high_minus_low')]

df[(df.ticker.isin(smb.ticker)) & (df.ticker.isin(hml.ticker))].sort_values(by=['coef_sum', 'ticker', 'factor'])

                               coef  tvalue  pvalue                         factor ticker
small_minus_big                0.62    8.07    0.00                small_minus_big   EEMS
high_minus_low                 0.14    1.47    0.14                 high_minus_low   EEMS
conservative_minus_aggressive -0.24   -1.96    0.05  conservative_minus_aggressive   EEMS
winners_minus_losers           0.08    1.34    0.18           winners_minus_losers   EEMS

For the momentum factor, my choice is PIE, with a 0.90% expense ratio. It is the fund with the strongest momentum tilt.

df[df.ticker == 'PIE']

                      coef  tvalue  pvalue                factor ticker
high_minus_low       -0.32   -2.05    0.04        high_minus_low    PIE
winners_minus_losers  0.39    5.22    0.00  winners_minus_losers    PIE

Pre-Emerging Markets

For the sake of diversification, I'm considering FM, with a 0.79% expense ratio. It calls itself a "Frontier" market fund, with exposure to countries that are even less developed than the Emerging markets. As the Seeking Alpha article notes, this market type has relatively low correlation to other market types. That is a positive in its own right, but since it is its own market type, I can't use any of Ken French's data to analyze the fund for desirable tilts.

johnnymo87 / ff-factor
