
R[eplications/eproductions] and Explorations Made using ARK (REMARK)

REMARKs are self-contained and complete projects, whose content here should be executable by anyone with a suitably configured computer or using nbreproduce.

Types of content include (see below for elaboration):

  1. Explorations
    • Use the Econ-ARK/HARK toolkit to demonstrate some set of modeling ideas
  2. Replications
    • Attempts to replicate important results of published papers written using other tools
  3. Reproductions
    • Code that reproduces ALL of the results of some paper that was originally written using the toolkit

For Authors

Each project lives in its own repository. To make a new REMARK, you can start with the skeleton REMARK-starter-example and add to it, or start from an example of a complete project using the toolkit, BufferStockTheory, whose content (code, text, figures, etc.) you can replace with your own.

REMARKs should adhere to the REMARK Standard.

For Editors

The REMARK catalog and Econ-ARK website configuration will be maintained by Editors.

Editorial guidelines are here.

REMARK Catalog

A catalog of all REMARKs is available under the REMARK tab at econ-ark.org.

The ballpark is a place for the set of papers that we would be delighted to have replicated in the Econ-ARK.

In cases where the replication's author is satisfied that the main results of the paper have been successfully replicated, we expect to approve pull requests for new REMARKs with minimal review, as long as they satisfy the criteria described in the Standard.

We also expect to approve with little review cases where the author has a clear explanation of discrepancies between the paper's published results and the results in the replication attempt.

We are NOT intending this resource to be viewed as an endorsement of the replication; instead, it is a place for it to be posted publicly for other people to see and form judgments on. Nevertheless, pull requests for attempted replications that are unsuccessful for unknown reasons will require a bit more attention from the Econ-ARK staff, which may include contacting the original author(s) to see if they can explain the discrepancies, or consulting with experts in the particular area in question.

Replication archives should contain two kinds of content (along with explanatory material):

  1. Code that attempts to replicate key results of the paper
  2. A Jupyter notebook that presents at least a minimal set of examples of the use of the code.

This material will all be stored in a directory with some short pithy name (a bibtex citekey might make a good directory name).

Code archives should contain:

  • All information required to get the replication code to run
    • Including a requirements.txt file specifying the software requirements
  • An indication of how long it takes to run the reproduce.sh script
    • On some particular machine whose characteristics should be described

Jupyter notebook(s) should:

  • Explain their own content (e.g., "This notebook uses the associated replication archive to demonstrate three central results from the paper of [original author]: the consumption function and the distribution of wealth.")
  • Be usable for someone wanting to explore the replication interactively (so, no cell should take more than a minute or two to execute on a laptop)

Differences with DemARK

The key difference from the contents of the DemARK repo is that REMARKs are allowed to rely on the existence of local files and subdirectories (figures, data) at a predictable filepath relative to the repository root.
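
For example, code inside a REMARK can construct such paths relative to its own location, so they resolve correctly wherever the repository is cloned; a minimal sketch (the figures/ and data/ directory names are illustrative, not mandated by the Standard):

from pathlib import Path

ROOT = Path(__file__).resolve().parent  # directory containing this script
FIGURES_DIR = ROOT / "figures"          # e.g. where generated figures are written
DATA_DIR = ROOT / "data"                # e.g. where local data files live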

For Maintainers

Command Line Interface cli.py

cli.py is an automated tool that facilitates:

  • cloning of REMARK repositories
  • linting (detection of missing files from a given REMARK)
  • building conda environments/docker images
    • uses conda/repo2docker under the hood
  • executing reproduce.sh scripts within the built environments.

All artifacts generated by cli.py are stored in a newly created _REMARK folder.

  1. Once you clone a REMARK you'll be able to find its contents inside of _REMARK/repos/…
  2. Once you build/execute a REMARK you'll be able to find a corresponding log file from that process inside of _REMARK/logs/…

cli.py has built-in parallelization specified by the -J flag for many actions.

Requirements

  • python 3.9 or newer
  • the packages listed in requirements.txt

Action

Clone/Pull

Pull REMARKs (these are cloned into the _REMARK/repos/ folder).

python cli.py pull --all         # git clone all REMARKs
python cli.py pull {remark_name} # git clone one or more REMARK(s)

Lint

Shows which files are missing from given REMARK(s). The linter uses the file-tree printout from STANDARD.md and compares it to the files found in the currently cloned REMARK(s).

python cli.py lint --all # detect missing files from all REMARKs
python cli.py lint {remark_name} # detect missing files from one or more REMARK(s)

Build

Building conda environments and/or docker images.

python cli.py build conda --all          # build conda environments for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo)
python cli.py build docker --all         # build docker images for all REMARKs
python cli.py build conda {remark_name}  # build conda environments for one or more REMARK(s)
python cli.py build docker {remark_name} # build docker image(s) for one or more REMARK(s)

The primary difference between conda and docker builds is that docker is more flexible for multi-language REMARKs. It leverages repo2docker (the same tool that mybinder uses) to create docker images from repositories.

Execute

Automated execution within built conda environments/docker containers.

python cli.py execute conda --all          # execute reproduce.sh via conda for all REMARKs
python cli.py execute docker --all         # execute reproduce.sh via docker for all REMARKs
python cli.py execute conda {remark_name}  # execute reproduce.sh via conda for one or more REMARK(s)
python cli.py execute docker {remark_name} # execute reproduce.sh via docker for one or more REMARK(s)

Both the build and execute subcommands have an optional --jobs argument to specify the number of jobs to run in parallel when building/executing.

Logs/Summarize

python cli.py logs # view most recent logs for all previous building/executing commands

Clean/Remove

python cli.py clean conda --all          # remove all built conda environments
python cli.py clean docker --all         # remove all built docker images
python cli.py clean conda {remark_name}  # remove conda environment(s) from specified REMARK(s)
python cli.py clean docker {remark_name} # remove docker images built from specified REMARK(s)

Contributors

a1177568, alanlujan91, amonninger, camriddell, ccarrollatjhuecon, dedwar65, derinaksit, drdrij, ganong123, iworld1991, jacalin1, llorracc, mriduls, mv77, npalmer-professional, pkofod, sbenthall, sonjeongwon621, zixuanhuangecon


Issues

requirements.txt requires matplotlib 2.1.2 but that doesn't work with python 3.8

I looked again at the transcript from the code that creates my VM, and the error occurred in the installation of the REMARKs rather than the DemARKs.

+ pip install -r requirements.txt
Collecting matplotlib==2.1.2
  Downloading matplotlib-2.1.2.tar.gz (36.2 MB)
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-c6nqlpuv/matplotlib/setup.py'"'"'; __file__='"'"'/tmp/pip-install-c6nqlpuv/matplotlib/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-c6nqlpuv/matplotlib/pip-egg-info
         cwd: /tmp/pip-install-c6nqlpuv/matplotlib/
    Complete output (71 lines):
    ============================================================================
    Edit setup.cfg to change the build options
    
    BUILDING MATPLOTLIB
                matplotlib: yes [2.1.2]
                    python: yes [3.8.2 (default, Jul 16 2020, 14:00:26)  [GCC
                            9.3.0]]
                  platform: yes [linux]
    
    REQUIRED DEPENDENCIES AND EXTENSIONS
                     numpy: yes [version 1.19.1]
                       six: yes [using six version 1.14.0]
                  dateutil: yes [using dateutil version 2.7.3]
    backports.functools_lru_cache: yes [Not required]
              subprocess32: yes [Not required]
                      pytz: yes [using pytz version 2020.1]
                    cycler: yes [using cycler version 0.10.0]
                   tornado: yes [using tornado version 6.0.4]
                 pyparsing: yes [using pyparsing version 2.4.7]
                    libagg: yes [pkg-config information for 'libagg' could not
                            be found. Using local copy.]
                  freetype: no  [The C/C++ header for freetype2 (ft2build.h)
                            could not be found.  You may need to install the
                            development package.]
                       png: no  [pkg-config information for 'libpng' could not
                            be found.]
                     qhull: yes [pkg-config information for 'libqhull' could not
                            be found. Using local copy.]
    
    OPTIONAL SUBPACKAGES
               sample_data: yes [installing]
                  toolkits: yes [installing]
                     tests: no  [skipping due to configuration]
            toolkits_tests: no  [skipping due to configuration]
    
    OPTIONAL BACKEND EXTENSIONS
                    macosx: no  [Mac OS-X only]
                    qt5agg: no  [PySide2 not found; PyQt5 not found]
                    qt4agg: no  [PySide not found; PyQt4 not found]
    Unable to init server: Could not connect: Connection refused
    Unable to init server: Could not connect: Connection refused
                   gtk3agg: yes [installing, version 3.20.24]
    Unable to init server: Could not connect: Connection refused
    Unable to init server: Could not connect: Connection refused
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-c6nqlpuv/matplotlib/setup.py", line 216, in <module>
        pkg_help = pkg.install_help_msg()
      File "/tmp/pip-install-c6nqlpuv/matplotlib/setupext.py", line 595, in install_help_msg
        release = platform.linux_distribution()[0].lower()
    AttributeError: module 'platform' has no attribute 'linux_distribution'
                 gtk3cairo: yes [installing, version 3.20.24]
                    gtkagg: no  [Requires pygtk]
                     tkagg: yes [installing; run-time loading from Python Tcl /
                            Tk]
                     wxagg: no  [requires wxPython]
                       gtk: no  [Requires pygtk]
                       agg: yes [installing]
                     cairo: yes [installing, pycairo version 1.16.2]
                 windowing: no  [Microsoft Windows only]
    
    OPTIONAL LATEX DEPENDENCIES
                    dvipng: no
               ghostscript: yes [version 9.50]
                     latex: no
                   pdftops: yes [version 0.86.1]
    
    OPTIONAL PACKAGE DATA
                      dlls: no  [skipping due to configuration]
    
    ============================================================================
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Linear interpolation error in SolvingMicroDSOP

I've run across an error in the linear interpolator when I'm running the SolvingMicroDSOP replication.

I've uploaded a branch here that has a "clean demonstration" of the error: https://github.com/npalmer-professional/REMARK/

... and specifically see this "slice across the objective surface:" https://github.com/npalmer-professional/REMARK/blob/interpolation_error_example/REMARKs/SolvingMicroDSOPs/Figures/SMMslice_cubicbool-false.pdf

You can create this image by running the function "make_contour_slice()" in /REMARK/REMARKs/SolvingMicroDSOPs/Code/StructEstimation.py
(the default parameters will create the figure).

Experimenting on the command line, I found that if I pull the parameters at one of those spikes and look at the consumption function using that parameter set, you can see how a couple of the consumption functions got a little crazy. (Not included here because I haven't cleaned up that code for demonstration.)

Funnily enough, I ran across this same error 2 or 3 years ago, but reversed -- the cubic spline gave the error while the linear spline didn't. Now it appears to be the other way around; if I run the same "slice" for the cubic spline I don't get the problem (but I haven't explored the cubic spline over the whole space to see if there are errors elsewhere). @mnwhite I think you worked on that back then and fixed the cubic spline issue, and both cubic and linear were working at the time. I think this was all well before the 2.7 -> 3.x conversion, for what that's worth as a very rough timeline.

Edit: Also potentially important, this only shows up when you allow agents to borrow.

Scaffolding ideal REMARK cross-environment workflow

Assuming the ideal case for REMARKs, there are at least two ways to execute the code:

  • With the do_all.py script
  • As a Jupyter notebook

Ideally, the Jupyter notebook is also available as a .py file via Jupytext.
(It is not clear to me whether the standard is to have the .py file, or the notebook, checked into the repository.)

Ideally, both of these ways of running the code work in a properly prepared environment.
"Working" means:

  • For the do_all.py:

    • Print statements go to stdout, displayed in the command line
    • Pyplot figures are displayed to the user in an OS-native window
    • Pyplot figures are also saved to a Figures/ directory as .png
  • For the Jupyter notebook

    • Print statements go to the notebook
    • Pyplot figures are displayed inline in the notebook
    • Pyplot figures are also saved to the Figures/ directory.

This notebook-detection and alternative display logic is now intended to live in something like a remark.py module that can be imported by all REMARKs.

One issue with implementing this:

  • It looks like there's no way to detect the path of a Jupyter notebook file from within the notebook, so getting the right relative path to the Figures/ directory may not be possible.

Seems like not saving pngs when the code is executed in a notebook isn't a huge loss.
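
For concreteness, here is a rough sketch of what such a shared remark.py helper could look like; the function names (in_ipynb, show_fig) and the choice to skip saving .pngs inside a notebook are illustrative assumptions, not an agreed design:

from pathlib import Path
import matplotlib.pyplot as plt

def in_ipynb():
    # Best-effort check: Jupyter notebooks run a ZMQInteractiveShell kernel,
    # while plain scripts and terminal ipython sessions do not.
    try:
        from IPython import get_ipython
        shell = get_ipython()
        return shell is not None and type(shell).__name__ == "ZMQInteractiveShell"
    except ImportError:
        return False

def show_fig(fig, name, figures_dir="Figures"):
    # From a script: save a .png copy into Figures/ (relative to the current
    # working directory) and open an OS-native window.
    # From a notebook: just display inline and skip the save.
    if not in_ipynb():
        out = Path(figures_dir)
        out.mkdir(exist_ok=True)
        fig.savefig(out / (name + ".png"))
    plt.show()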

Release cycle and dependency management: design and documentation

Some questions have been raised about the REMARK release cycle and dependency management.
This is a ticket to get clarity on those issues and document them.

This is my proposal:

  • REMARKs are analogous to publications.
  • As static artifacts, their dependencies should be locked to specific versions. (This includes HARK)
  • They may be locked to different versions.
  • REMARKs should have release numbers, because their authors might want to go back and change them.

My understanding is that there is an obstacle to this proposal:

  • Most of the REMARKs are not yet "completed", or in a "publishable" form

As a consequence of the latter issue, the REMARK repository is currently being developed in an ad hoc way, with no releases or dependency locking at all.

SolvingMicroDSOPs SCF data construction will fail with new approach to data

I was just looking through SolvingMicroDSOPs and noticed that there is a file

./Calibration/SetupSCFData.py

which certainly will not work with the new way of doing things. Looks like it should be an easy fix for @MridulS

Relatedly, in principle the SCF data file that we have put in the datasets storage should be replicable from the raw SCF data using the relevant code in cstwMPC:

Code/Stata/SCF

but in fact there does not seem to be any clear mapping between the code in that directory and the SCFwealthDataReduced.txt file. This is not worth the effort to fix; I'm writing this more to record for posterity the fact that I've looked and been unable to find any code that creates it.

standardize REMARKs

llorracc repositories

Lower priority:

econ-ark-tools

  1. Make an econ-ark-tools package
  2. Put remark.py code into it, see #38
  3. Install/import this in econ-ark package (see HARK setup.py)
  4. Use this code in a REMARK

Clean up SolvingMicroDSOPs REMARK

The SolvingMicroDSOPs REMARK seems to have both:

  • Its own Code/StructEstimation.py and Calibration/EstimationParameters.py
  • Optional imports (in a try: block) of the corresponding code from HARK, i.e.:

https://github.com/econ-ark/REMARK/blob/master/REMARKs/SolvingMicroDSOPs/do_all.py#L68

and...

https://github.com/econ-ark/REMARK/blob/master/REMARKs/SolvingMicroDSOPs/Code/StructEstimation.py#L44

From here, it looks like the imports from HARK are redundant, creating an unnecessary dependence on HARK, and should be removed. @llorracc is that correct?

This relates to HARK #440. SolvingMicroDSOPs is an application of HARK that could be moved into an example/, or could exist as a REMARK, but maybe should not be both.

If there is some aspect of the SolvingMicroDSOPs application that is especially reproducible, I would argue it should be split out into library classes and methods.
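
For reference, the dual-import pattern being discussed looks roughly like this (the module paths and the direction of the fallback are assumptions for illustration, not the exact code at the lines linked above):

try:
    # prefer the copy bundled with the REMARK itself
    from Calibration import EstimationParameters as Params
except ImportError:
    # fall back to the corresponding code shipped with HARK (assumed module path)
    from HARK.SolvingMicroDSOPs.Calibration import EstimationParameters as Params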

Enumerate all REMARK requirements

I'm running into some complications trying to build out a REMARK (#17)

Because different REMARKs implement different feature sets, it's hard for me to infer what the necessary requirements of a REMARK are.

In particular, it looks like REMARKs have scaffolding that allow them to be executed in several different ways:

  • Within a Jupyter notebook
  • Standalone with ipython but not in a notebook
  • Sometimes 'generating' saved figures, sometimes just displaying them.

I'm a little lost as to which of these cases needs to be accommodated for a REMARK.

To the extent that there is a standard set of requirements, it would make sense to encapsulate this into a standard module that can be imported into all REMARKs.

That would make it easier to separate the details of the presentation, the details of the environment, and the logic of the economic simulation.

metadata template is unclear

  • What is cff-version?
  • Do we need a field for commit if we have a field for version that points to a tag?
  • The difference between remark-name and title should be explained.
  • Papers are often published with a year but not a day or even a month; how should the date-published-original-paper field be filled in these cases?
  • Are the authors listed the authors of the original paper, or of the REMARK?

docker tool script for each REMARK

@MridulS, I've been thinking about what to do next with your docker tool (what should I call it?), and concluded that a joint test of the tool and our existing REMARKs would be for you to construct a command, for each REMARK, to execute

  1. A do_min_code.sh command if one exists;
  2. A do_all_code.sh command if one exists.

If instead what exists is a do_min.py or do_all.py command, but no do_min_code.sh or do_all_code.sh, then you should first create the do_min_code.sh and/or do_all_code.sh commands (which in this case would be bash scripts with one line: ipython do_min.py or ipython do_all.py).

BufferStock REMARK do_all.py crashes loading IPython shell

Running this do_all.py from the command line:
https://github.com/llorracc/BufferStockTheory/blob/5ccb22b72cb9fd46245b8954459ce5f44ab3c7ea/Code/Python/do_all.py

gets me the following error in the terminal:

Traceback (most recent call last):
  File "do_all.py", line 3, in <module>
    import BufferStockTheory
  File "/home/sb/projects/econ-ark/REMARK/REMARKs/BufferStockTheory/Code/Python/BufferStockTheory.py", line 154, in <module>
    get_ipython().run_line_magic('matplotlib', 'inline')
AttributeError: 'NoneType' object has no attribute 'run_line_magic'

This looks like it's because the get_ipython() method returns None when no InteractiveShell instance is registered:
https://ipython.readthedocs.io/en/stable/api/generated/IPython.core.getipython.html
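
One way to make the module import cleanly outside IPython would be to guard that call, along these lines (a sketch of a possible workaround, not the repo's actual fix):

from IPython import get_ipython

shell = get_ipython()
if shell is not None:  # get_ipython() returns None outside an IPython session
    shell.run_line_magic('matplotlib', 'inline')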

(Please let me know if these issues are better tracked to the BufferStockTheory repo in llorracc.)

BufferStock REMARK notebook hangs on first code cell

The first executable code cell in the BufferStock REMARK notebook (linked here from the git repo submodule):
https://github.com/llorracc/BufferStockTheory/blob/5ccb22b72cb9fd46245b8954459ce5f44ab3c7ea/Code/Python/BufferStockTheory.ipynb

hangs upon execution.

Command line output from JupyterLab shows repeated 404 responses from the local /metrics endpoint.

[I 11:13:19.120 LabApp] Saving file at /BufferStockTheory.ipynb
[W 11:13:19.120 LabApp] Saving untrusted notebook BufferStockTheory.ipynb
[W 11:13:22.667 LabApp] 404 GET /metrics?1572275602658 (127.0.0.1) 3.85ms referer=http://localhost:8888/lab
[W 11:13:27.679 LabApp] 404 GET /metrics?1572275607671 (127.0.0.1) 3.81ms referer=http://localhost:8888/lab
[W 11:13:32.689 LabApp] 404 GET /metrics?1572275612681 (127.0.0.1) 3.58ms referer=http://localhost:8888/lab
[W 11:13:37.700 LabApp] 404 GET /metrics?1572275617692 (127.0.0.1) 3.82ms referer=http://localhost:8888/lab
[W 11:13:42.711 LabApp] 404 GET /metrics?1572275622702 (127.0.0.1) 3.84ms referer=http://localhost:8888/lab

layout of a REMARK markdown - initial draft

  • URL to the repo [button]
  • Title of the REMARK
  • One line summary of the REMARK
  • Abstract of the REMARK
  • URL for the notebook [if exists] [button]
  • Authors of the Paper [if different from REMARK author]
  • Authors of the REMARK
  • bib [if exists]
  • DOI [if exists] [button]
  • link to one figure [optional]
  • link to a downloadable pdf [if exists] [button]
  • any other link [optional]

Indices for different functions

  • An index for listing REMARKs to be ingested into the website
  • A remarXive index for those REMARKs that have reached a standard of "scholarly publication"
  • An index of REMARKs that we trust and want to run with HARK 'master' to catch errors
