
R[eplications/eproductions] and Explorations Made using ARK (REMARK)

REMARKs are self-contained and complete projects, whose content here should be executable by anyone with a suitably configured computer or using nbreproduce.

Types of content include (see below for elaboration):

  1. Explorations
    • Use the Econ-ARK/HARK toolkit to demonstrate some set of modeling ideas
  2. Replications
    • Attempts to replicate important results of published papers written using other tools
  3. Reproductions
    • Code that reproduces ALL of the results of some paper that was originally written using the toolkit

For Authors

Each project lives in its own repository. To make a new REMARK, you can start with the skeleton REMARK-starter-example and add to it, or start from an example of a complete project using the toolkit, BufferStockTheory, whose content (code, text, figures, etc.) you can replace with your own.

REMARKs should adhere to the REMARK Standard.

For Editors

The REMARK catalog and Econ-ARK website configuration will be maintained by Editors.

Editorial guidelines are here.

REMARK Catalog

A catalog of all REMARKs is available under the REMARK tab at econ-ark.org.

The ballpark is a place for the set of papers that we would be delighted to have replicated in the Econ-ARK.

In cases where the replication's author is satisfied that the main results of the paper have been successfully replicated, we expect to approve pull requests for new REMARKs with minimal review, as long as they satisfy the criteria described in the Standard.

We also expect to approve with little review cases where the author has a clear explanation of discrepancies between the paper's published results and the results in the replication attempt.

We are NOT intending this resource to be viewed as an endorsement of the replication; instead, it is a place for it to be posted publicly for other people to see and form judgments on. Nevertheless, pull requests for attempted replications that are unsuccessful for unknown reasons will require a bit more attention from the Econ-ARK staff, which may include contacting the original author(s) to see if they can explain the discrepancies, or consulting with experts in the particular area in question.

Replication archives should contain two kinds of content (along with explanatory material):

  1. Code that attempts to replicate key results of the paper
  2. A Jupyter notebook that presents at least a minimal set of examples of the use of the code.

This material will all be stored in a directory with some short pithy name (a bibtex citekey might make a good directory name).

Code archives should contain:

  • All information required to get the replication code to run
    • Including a requirements.txt file specifying the software requirements
  • An indication of how long it takes to run the reproduce.sh script
    • On some particular machine whose characteristics should be described

Jupyter notebook(s) should:

  • Explain their own content (e.g., "This notebook uses the associated replication archive to demonstrate three central results from the paper of [original author]: the consumption function and the distribution of wealth.")
  • Be usable for someone wanting to explore the replication interactively (so, no cell should take more than a minute or two to execute on a laptop)

Differences with DemARK

The key difference from the contents of the DemARK repo is that REMARKs are allowed to rely on the existence of local files and subdirectories (figures, data) at a predictable filepath relative to the repository root.
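
For example, code inside a REMARK can construct such paths relative to its own location, so they resolve correctly wherever the repository is cloned; a minimal sketch (the figures/ and data/ directory names are illustrative, not mandated by the Standard):

from pathlib import Path

ROOT = Path(__file__).resolve().parent  # directory containing this script
FIGURES_DIR = ROOT / "figures"          # e.g. where generated figures are written
DATA_DIR = ROOT / "data"                # e.g. where local data files live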

For Maintainers

Command Line Interface cli.py

cli.py is an automated tool that facilitates:

  • cloning of REMARK repositories
  • linting (detection of missing files from a given REMARK)
  • building conda environments/docker images
    • uses conda/repo2docker under the hood
  • executing reproduce.sh scripts within the built environments.

All artifacts generated by cli.py are stored in a newly created _REMARK folder.

  1. Once you clone a REMARK you'll be able to find its contents inside of _REMARK/repos/…
  2. Once you build/execute a REMARK you'll be able to find a corresponding log file from that process inside of _REMARK/logs/…

cli.py has built-in parallelization specified by the -J flag for many actions.

Requirements

  • python 3.9 or newer
  • the packages listed in requirements.txt

Action

Clone/Pull

Pull REMARKs (these are cloned into the _REMARK/repos/ folder).

python cli.py pull --all         # git clone all REMARKs
python cli.py pull {remark_name} # git clone one or more REMARK(s)

Lint

Shows which files are missing from given REMARK(s). The linter uses the file-tree printout from STANDARD.md and compares it to the files found in the currently cloned REMARK(s).

python cli.py lint --all # detect missing files from all REMARKs
python cli.py lint {remark_name} # detect missing files from one or more REMARK(s)

Build

Building conda environments and/or docker images.

python cli.py build conda --all          # build conda environments for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo)
python cli.py build docker --all         # build docker images for all REMARKs
python cli.py build conda {remark_name}  # build conda environments for one or more REMARK(s)
python cli.py build docker {remark_name} # build docker image(s) for one or more REMARK(s)

The primary difference between conda and docker builds is that docker is more flexible for multi-language REMARKs. It leverages repo2docker (the same tool that mybinder uses) to create docker images from repositories.

Execute

Automated execution within built conda environments/docker containers.

python cli.py execute conda --all          # execute reproduce.sh via conda for all REMARKs
python cli.py execute docker --all         # execute reproduce.sh via docker for all REMARKs
python cli.py execute conda {remark_name}  # execute reproduce.sh via conda for one or more REMARK(s)
python cli.py execute docker {remark_name} # execute reproduce.sh via docker for one or more REMARK(s)

Both the build and execute subcommands have an optional --jobs argument to specify the number of jobs to run in parallel when building/executing.

Logs/Summarize

python cli.py logs # view most recent logs for all previous building/executing commands

Clean/Remove

python cli.py clean conda --all          # remove all built conda environments
python cli.py clean docker --all         # remove all built docker images
python cli.py clean conda {remark_name}  # remove conda environment(s) from specified REMARK(s)
python cli.py clean docker {remark_name} # remove docker images built from specified REMARK(s)

Contributors

a1177568, alanlujan91, amonninger, camriddell, ccarrollatjhuecon, dedwar65, derinaksit, drdrij, ganong123, iworld1991, jacalin1, llorracc, mriduls, mv77, npalmer-professional, pkofod, sbenthall, sonjeongwon621, zixuanhuangecon


Issues

requirements.txt requires matplotlib 2.1.2 but that doesn't work with python 3.8

I looked again at the transcript from the code that creates my VM, and the error occurred in the installation of the REMARKs rather than the DemARKs.

+ pip install -r requirements.txt
Collecting matplotlib==2.1.2
  Downloading matplotlib-2.1.2.tar.gz (36.2 MB)
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-c6nqlpuv/matplotlib/setup.py'"'"'; __file__='"'"'/tmp/pip-install-c6nqlpuv/matplotlib/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-c6nqlpuv/matplotlib/pip-egg-info
         cwd: /tmp/pip-install-c6nqlpuv/matplotlib/
    Complete output (71 lines):
    ============================================================================
    Edit setup.cfg to change the build options
    
    BUILDING MATPLOTLIB
                matplotlib: yes [2.1.2]
                    python: yes [3.8.2 (default, Jul 16 2020, 14:00:26)  [GCC
                            9.3.0]]
                  platform: yes [linux]
    
    REQUIRED DEPENDENCIES AND EXTENSIONS
                     numpy: yes [version 1.19.1]
                       six: yes [using six version 1.14.0]
                  dateutil: yes [using dateutil version 2.7.3]
    backports.functools_lru_cache: yes [Not required]
              subprocess32: yes [Not required]
                      pytz: yes [using pytz version 2020.1]
                    cycler: yes [using cycler version 0.10.0]
                   tornado: yes [using tornado version 6.0.4]
                 pyparsing: yes [using pyparsing version 2.4.7]
                    libagg: yes [pkg-config information for 'libagg' could not
                            be found. Using local copy.]
                  freetype: no  [The C/C++ header for freetype2 (ft2build.h)
                            could not be found.  You may need to install the
                            development package.]
                       png: no  [pkg-config information for 'libpng' could not
                            be found.]
                     qhull: yes [pkg-config information for 'libqhull' could not
                            be found. Using local copy.]
    
    OPTIONAL SUBPACKAGES
               sample_data: yes [installing]
                  toolkits: yes [installing]
                     tests: no  [skipping due to configuration]
            toolkits_tests: no  [skipping due to configuration]
    
    OPTIONAL BACKEND EXTENSIONS
                    macosx: no  [Mac OS-X only]
                    qt5agg: no  [PySide2 not found; PyQt5 not found]
                    qt4agg: no  [PySide not found; PyQt4 not found]
    Unable to init server: Could not connect: Connection refused
    Unable to init server: Could not connect: Connection refused
                   gtk3agg: yes [installing, version 3.20.24]
    Unable to init server: Could not connect: Connection refused
    Unable to init server: Could not connect: Connection refused
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-c6nqlpuv/matplotlib/setup.py", line 216, in <module>
        pkg_help = pkg.install_help_msg()
      File "/tmp/pip-install-c6nqlpuv/matplotlib/setupext.py", line 595, in install_help_msg
        release = platform.linux_distribution()[0].lower()
    AttributeError: module 'platform' has no attribute 'linux_distribution'
                 gtk3cairo: yes [installing, version 3.20.24]
                    gtkagg: no  [Requires pygtk]
                     tkagg: yes [installing; run-time loading from Python Tcl /
                            Tk]
                     wxagg: no  [requires wxPython]
                       gtk: no  [Requires pygtk]
                       agg: yes [installing]
                     cairo: yes [installing, pycairo version 1.16.2]
                 windowing: no  [Microsoft Windows only]
    
    OPTIONAL LATEX DEPENDENCIES
                    dvipng: no
               ghostscript: yes [version 9.50]
                     latex: no
                   pdftops: yes [version 0.86.1]
    
    OPTIONAL PACKAGE DATA
                      dlls: no  [skipping due to configuration]
    
    ============================================================================
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Linear interpolation error in SolvingMicroDSOP

I've run across an error in the linear interpolator when I'm running the SolvingMicroDSOP replication.

I've uploaded a branch here that has a "clean demonstration" of the error: https://github.com/npalmer-professional/REMARK/

... and specifically see this "slice across the objective surface:" https://github.com/npalmer-professional/REMARK/blob/interpolation_error_example/REMARKs/SolvingMicroDSOPs/Figures/SMMslice_cubicbool-false.pdf

You can create this image by running the function "make_contour_slice()" in /REMARK/REMARKs/SolvingMicroDSOPs/Code/StructEstimation.py
(the default parameters will create the figure).

Experimenting on the command line, I found that if I pull the parameters at one of those spikes and look at the consumption function using that parameter set, you can see how a couple of the consumption functions got a little crazy. (Not included here because I haven't cleaned up that code for demonstration.)

Funnily enough, I ran across this same error 2 or 3 years ago, but reversed -- the cubic spline gave the error while the linear spline didn't. Now it appears to be the other way around; if I run the same "slice" for the cubic spline I don't get the problem (but I haven't explored the cubic spline over the whole space to see if there are errors elsewhere). @mnwhite I think you worked on that back then and fixed the cubic spline issue, and both cubic and linear were working at the time. I think this was all well before the 2.7 -> 3.x conversion, for what that's worth as a very rough timeline.

Edit: Also potentially important, this only shows up when you allow agents to borrow.

Scaffolding ideal REMARK cross-environment workflow

Assuming the ideal case for REMARKs, there are at least two ways to execute the code:

  • With the do_all.py script
  • As a Jupyter notebook

Ideally, the Jupyter notebook is also available as a .py file via Jupytext.
(It is not clear to me whether the standard is to have the .py file, or the notebook, checked into the repository.)

Ideally, both of these ways of running the code work in a properly prepared environment.
"Working" means:

  • For the do_all.py:

    • Print statements go to stdout, displayed in the command line
    • Pyplot figures are displayed to the user in an OS-native window
    • Pyplot figures are also saved to a Figures/ directory as .png
  • For the Jupyter notebook

    • Print statements go to the notebook
    • Pyplot figures are displayed inline in the notebook
    • Pyplot figures are also saved to the Figures/ directory.

This notebook-detection and alternative display logic is now intended to live in something like a remark.py module that can be imported by all REMARKs.

One issue with implementing this:

  • It looks like there's no way to detect the path of a Jupyter notebook file from within the notebook, so getting the right relative path to the Figures/ directory may not be possible.

Seems like not saving pngs when the code is executed in a notebook isn't a huge loss.
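
For concreteness, here is a rough sketch of what such a shared remark.py helper could look like; the function names (in_ipynb, show_fig) and the choice to skip saving .pngs inside a notebook are illustrative assumptions, not an agreed design:

from pathlib import Path
import matplotlib.pyplot as plt

def in_ipynb():
    # Best-effort check: Jupyter notebooks run a ZMQInteractiveShell kernel,
    # while plain scripts and terminal ipython sessions do not.
    try:
        from IPython import get_ipython
        shell = get_ipython()
        return shell is not None and type(shell).__name__ == "ZMQInteractiveShell"
    except ImportError:
        return False

def show_fig(fig, name, figures_dir="Figures"):
    # From a script: save a .png copy into Figures/ (relative to the current
    # working directory) and open an OS-native window.
    # From a notebook: just display inline and skip the save.
    if not in_ipynb():
        out = Path(figures_dir)
        out.mkdir(exist_ok=True)
        fig.savefig(out / (name + ".png"))
    plt.show()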

Release cycle and dependency management: design and documentation

Some questions have been raised about the REMARK release cycle and dependency management.
This is a ticket to get clarity on those issues and document them.

This is my proposal:

  • REMARKs are analogous to publications.
  • As static artifacts, their dependencies should be locked to specific versions. (This includes HARK)
  • They may be locked to different versions.
  • REMARKs should have release numbers, because their authors might want to go back and change them.

My understanding is that there is an obstacle to this proposal:

  • Most of the REMARKs are not yet "completed", or in a "publishable" form

As a consequence of the latter issue, the REMARK repository is currently being developed in an ad hoc way, with no releases or dependency locking at all.

SolvingMicroDSOPs SCF data construction will fail with new approach to data

I was just looking through SolvingMicroDSOPs and noticed that there is a file

./Calibration/SetupSCFData.py

which certainly will not work with the new way of doing things. Looks like it should be an easy fix for @MridulS

Relatedly, in principle the SCF data file that we have put in the datasets storage should be replicable from the raw SCF data using the relevant code in cstwMPC:

Code/Stata/SCF

but in fact there does not seem to be any clear mapping between the code in that directory and the SCFwealthDataReduced.txt file. This is not worth the effort to fix; I'm writing this more to record for posterity the fact that I've looked and been unable to find any code that creates it.

standardize REMARKs

llorracc repositories

Lower priority:

econ-ark-tools

  1. Make an econ-ark-tools package
  2. Put remark.py code into it, see #38
  3. Install/import this in econ-ark package (see HARK setup.py)
  4. Use this code in a REMARK

Clean up SolvingMicroDSOPs REMARK

The SolvingMicroDSOPs REMARK seems to have both:

  • Its own Code/StructEstimation.py and Calibration/EstimationParameters.py
  • Optional imports (in a try: block) of the corresponding code from HARK, i.e.:

https://github.com/econ-ark/REMARK/blob/master/REMARKs/SolvingMicroDSOPs/do_all.py#L68

and...

https://github.com/econ-ark/REMARK/blob/master/REMARKs/SolvingMicroDSOPs/Code/StructEstimation.py#L44

From here, it looks like the imports from HARK are redundant, creating an unnecessary dependence on HARK, and should be removed. @llorracc is that correct?

This relates to HARK #440. SolvingMicroDSOPs is an application of HARK that could be moved into an example/, or could exist as a REMARK, but maybe should not be both.

If there is some aspect of the SolvingMicroDSOPs application that is especially reproducible, I would argue it should be split out into library classes and methods.
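
For reference, the dual-import pattern being discussed looks roughly like this (the module paths and the direction of the fallback are assumptions for illustration, not the exact code at the lines linked above):

try:
    # prefer the copy bundled with the REMARK itself
    from Calibration import EstimationParameters as Params
except ImportError:
    # fall back to the corresponding code shipped with HARK (assumed module path)
    from HARK.SolvingMicroDSOPs.Calibration import EstimationParameters as Params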

Enumerate all REMARK requirements

I'm running into some complications trying to build out a REMARK (#17)

Because different REMARKs implement different feature sets, it's hard for me to infer what the necessary requirements of a REMARK are.

In particular, it looks like REMARKs have scaffolding that allow them to be executed in several different ways:

  • Within a Jupyter notebook
  • Standalone with ipython but not in a notebook
  • Sometimes 'generating' saved figures, sometimes just displaying them.

I'm a little lost as to which of these cases needs to be accommodated for a REMARK.

To the extent that there is a standard set of requirements, it would make sense to encapsulate this into a standard module that can be imported into all REMARKs.

That would make it easier to separate the details of the presentation, the details of the environment, and the logic of the economic simulation.

metadata template is unclear

  • What is cff-version?
  • Do we need a field for commit if we have a field for version that points to a tag?
  • The difference between remark-name and title should be explained.
  • Papers are often published with a year but not a day or even a month; how should the date-published-original-paper field be filled in these cases?
  • Are the authors listed the authors of the original paper, or of the REMARK?

docker tool script for each REMARK

@MridulS, I've been thinking about what to do next with your docker tool (what should I call it?), and concluded that a joint test of the tool and our existing REMARKs would be for you to construct a command, for each REMARK, to execute

  1. A do_min_code.sh command if one exists;
  2. A do_all_code.sh command if one exists.

If instead what exists is a do_min.py or do_all.py command, but no do_min_code.sh or do_all_code.sh, then you should first create the do_min_code.sh and/or do_all_code.sh commands (which in this case would be bash scripts with one line: ipython do_min.py or ipython do_all.py).

BufferStock REMARK do_all.py crashes loading IPython shell

Running this do_all.py from the command line:
https://github.com/llorracc/BufferStockTheory/blob/5ccb22b72cb9fd46245b8954459ce5f44ab3c7ea/Code/Python/do_all.py

gets me the following error in the terminal:

Traceback (most recent call last):
  File "do_all.py", line 3, in <module>
    import BufferStockTheory
  File "/home/sb/projects/econ-ark/REMARK/REMARKs/BufferStockTheory/Code/Python/BufferStockTheory.py", line 154, in <module>
    get_ipython().run_line_magic('matplotlib', 'inline')
AttributeError: 'NoneType' object has no attribute 'run_line_magic'

This looks like it's because the get_ipython() method returns None when no InteractiveShell instance is registered:
https://ipython.readthedocs.io/en/stable/api/generated/IPython.core.getipython.html
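
One way to make the module import cleanly outside IPython would be to guard that call, along these lines (a sketch of a possible workaround, not the repo's actual fix):

from IPython import get_ipython

shell = get_ipython()
if shell is not None:  # get_ipython() returns None outside an IPython session
    shell.run_line_magic('matplotlib', 'inline')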

(Please let me know if these issues are better tracked to the BufferStockTheory repo in llorracc.)

BufferStock REMARK notebook hangs on first code cell

The first executable code cell in the BufferStock REMARK notebook (linked here from the git repo submodule):
https://github.com/llorracc/BufferStockTheory/blob/5ccb22b72cb9fd46245b8954459ce5f44ab3c7ea/Code/Python/BufferStockTheory.ipynb

hangs upon execution.

Command line output from JupyterLab shows repeated 404 responses from the local /metrics endpoint.

[I 11:13:19.120 LabApp] Saving file at /BufferStockTheory.ipynb
[W 11:13:19.120 LabApp] Saving untrusted notebook BufferStockTheory.ipynb
[W 11:13:22.667 LabApp] 404 GET /metrics?1572275602658 (127.0.0.1) 3.85ms referer=http://localhost:8888/lab
[W 11:13:27.679 LabApp] 404 GET /metrics?1572275607671 (127.0.0.1) 3.81ms referer=http://localhost:8888/lab
[W 11:13:32.689 LabApp] 404 GET /metrics?1572275612681 (127.0.0.1) 3.58ms referer=http://localhost:8888/lab
[W 11:13:37.700 LabApp] 404 GET /metrics?1572275617692 (127.0.0.1) 3.82ms referer=http://localhost:8888/lab
[W 11:13:42.711 LabApp] 404 GET /metrics?1572275622702 (127.0.0.1) 3.84ms referer=http://localhost:8888/lab

layout of a REMARK markdown - initial draft

  • URL to the repo [button]
  • Title of the REMARK
  • One line summary of the REMARK
  • Abstract of the REMARK
  • URL for the notebook [if exists] [button]
  • Authors of the Paper [if different from REMARK author]
  • Authors of the REMARK
  • bib [if exists]
  • DOI [if exists] [button]
  • link to one figure [optional]
  • link to a downloadable pdf [if exists] [button]
  • any other link [optional]

Indices for different functions

  • An index for listing REMARKs to be ingested into the website
  • A remarXive index for those REMARKs that have reached a standard of "scholarly publication"
  • An index of REMARKs that we trust and want to run with HARK 'master' to catch errors
