astroml / astroml_figures

Figures from the astroML book and paper

License: BSD 2-Clause "Simplified" License

Python 100.00%

astroml_figures's Introduction

astroML Figures

Figures from the astroML book and paper

astroml_figures's People

Contributors

Stargazers

Watchers

Forkers

cicerolneto rbiswas4 bsipocz macicco connolly stgilhool mikistli stkyr davidrrice lxlofpku

astroml_figures's Issues

Avoid hacky way of setting up GaussianMixture dataset

Some of the current examples hack GaussianMixture() to set up the input dataset. In more recent versions of scikit-learn, sampling from a non-fitted GaussianMixture is not really a supported feature (see the discussion around scikit-learn/scikit-learn#7822 (comment)).

So while it is possible to hack around this, we should look into other ways to generate the input dataset for these user-facing examples.

Affected examples include, e.g., book_figures/chapter6/fig_GMM_nclusters.py
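One possible alternative, sketched here under assumptions, is to draw the dataset directly with NumPy from explicitly chosen mixture parameters, so that no estimator object needs to be touched. The function name sample_gaussian_mixture and the weights, means, and covariances below are hypothetical; they are not the values used in the book figures.

```python
import numpy as np


def sample_gaussian_mixture(weights, means, covariances, n_samples, random_state=0):
    """Draw points from a Gaussian mixture given explicit parameters.

    Hypothetical helper for illustration; not part of astroML or scikit-learn.
    """
    rng = np.random.default_rng(random_state)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    # Decide how many points each component contributes.
    counts = rng.multinomial(n_samples, weights)

    # Sample each component separately and stack the results.
    X = np.vstack([
        rng.multivariate_normal(mean, cov, size=n)
        for mean, cov, n in zip(means, covariances, counts)
    ])
    labels = np.repeat(np.arange(len(weights)), counts)

    # Shuffle so that the points are not ordered by component.
    order = rng.permutation(n_samples)
    return X[order], labels[order]


# Illustrative parameters only (assumed, not taken from the figure script).
X, labels = sample_gaussian_mixture(
    weights=[0.3, 0.5, 0.2],
    means=[[-1, 1], [1, 1], [3, -1]],
    covariances=[0.3 * np.eye(2), 0.2 * np.eye(2), 0.5 * np.eye(2)],
    n_samples=1000,
)
```

The resulting X can then be passed to GaussianMixture(...).fit(X) as ordinary input data, which keeps the example on the supported side of the scikit-learn API.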

Figure 4.2

Hi,
upon running the code for Book Figure 4.2 on Ubuntu, Python returned the error 'GMM' object has no attribute 'eval' for the line logprob, responsibilities = M_best.eval(x).
To solve the problem, I replaced M_best.eval(x) (line 85) with:
M_best.score_samples(x.reshape((-1, 1)))
and M_best.predict_proba(x) (line 110) with:
p = M_best.predict_proba(x.reshape((-1, 1)))

I'm using scikit-learn 0.17

(was astroML/astroML#82, more discussion is on that issue)
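For readers on current scikit-learn, where the old GMM class has been replaced by GaussianMixture, the same idea might look like the sketch below. It assumes a recent scikit-learn in which score_samples returns only the per-sample log-likelihood and the responsibilities come from predict_proba; the synthetic 1-D data are made up for illustration and are not the data of Figure 4.2.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1-D sample (assumed for illustration only).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(1, 1.0, 700)])

# scikit-learn estimators expect a 2-D array of shape (n_samples, n_features).
X = x.reshape(-1, 1)

M_best = GaussianMixture(n_components=2, random_state=0).fit(X)

# Per-sample log-likelihood under the fitted mixture.
logprob = M_best.score_samples(X)

# Posterior probability of each component for each sample.
responsibilities = M_best.predict_proba(X)

# Mixture density evaluated at the sample points.
pdf = np.exp(logprob)
```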

Plot regression with newer sklearn.decomposition.PCA

The PCA projection in book_figures/chapter7/fig_S_manifold_PCA.py has changed depending on the sklearn version being used (y range should be flipped).

Investigate the cause of it, and report upstream if it looks like a bug.
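One candidate cause worth ruling out is sign ambiguity: a principal component and its negative describe the same axis, so a different sign choice mirrors the projection without being wrong. The sketch below uses a plain NumPy SVD rather than sklearn.decomposition.PCA, and the sign convention (largest-magnitude loading made positive) is an assumption chosen for illustration, not necessarily the one scikit-learn applies.

```python
import numpy as np


def pca_project(X, n_components=2, fix_signs=True):
    """Project X onto its leading principal components via SVD.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]

    if fix_signs:
        # Flip each component so its largest-magnitude loading is positive.
        idx = np.argmax(np.abs(components), axis=1)
        signs = np.sign(components[np.arange(n_components), idx])
        components = components * signs[:, None]

    return Xc @ components.T, components


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.5, 0.5])

proj, components = pca_project(X)

# Negating the second component mirrors the second projected axis.
flipped = components.copy()
flipped[1] *= -1
proj_flipped = (X - X.mean(axis=0)) @ flipped.T
```

If the two sklearn versions differ only by such a mirror, pinning a sign convention in the figure script would make the plot reproducible.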

Fix title 1.12

This is a duplicate of astroML/astroML#96, as "SDSS Stripe 82 Moving Object Catalog" has somehow sneaked back into the HTML pages.

Add new second edition baselines

Add source code for figure 9.20

cc @connolly

Figure 3.19 y axis is off by a factor of 2

We should fix this figure.

See astroML/text_errata#29

(was: astroML/astroML#62)

CNN cartoon issues due to M51 picture

The M51 picture for the CNN cartoon brings up two low priority issues, one needs documentation only, the other a solution:

The jpeg file requires Pillow as a dependency. Maybe the best solution is to convert it to png (the only issue is to make sure the resulting image is the same as what went into the book).
The current test and pdf generating mechanism (which relies on extracting the code into a temporary "somefile.py") does not work with the current solution for the file path of the image. Running the script directly works, so users shouldn't be affected by it.
Copying a workaround from the astroML pickle_results mechanism is probably the easiest solution here.

Figure 10.3 inconsistencies

I just noticed that, somehow, the top 4 and the bottom 4 panels in the 2nd edition of the web version of fig. 10.3 are swapped (and now the caption is wrong). The two printed versions and the 1st edition of the web version are fine. See https://www.astroml.org/book_figures/chapter10/fig_FFT_aliasing.html

Note: we should also check the notebook version of this figure.

Figure 6.17

In figure 6.17, we should use the correlation from the full data rather than the mean of bootstrap samples as the best estimate.

(was: astroML/astroML#76)
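A minimal sketch of the suggested approach: take the correlation coefficient of the full sample as the best estimate and use the bootstrap only for its uncertainty. The synthetic data and the choice of the Pearson coefficient are assumptions for illustration; the figure script may use different data and estimators.

```python
import numpy as np

# Synthetic correlated sample (assumed for illustration only).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)


def pearson_r(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


# Best estimate: correlation computed from the full data set.
r_full = pearson_r(x, y)

# Bootstrap: resample (x, y) pairs with replacement to estimate the scatter.
n_boot = 1000
indices = rng.integers(0, n, size=(n_boot, n))
r_boot = np.array([pearson_r(x[i], y[i]) for i in indices])

# Report r_full with the bootstrap standard deviation as its uncertainty,
# rather than using r_boot.mean() as the point estimate.
r_err = r_boot.std(ddof=1)
print(f"r = {r_full:.3f} +/- {r_err:.3f}")
```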

Remove note from Figure 10.16

There is no need to point out the error in the first print edition; the old figure and the note should be removed.

chapter 9 "Star/Quasar Classification ROC Curves" example trains classifiers on the whole data set rather than the train split

In fig_star_quasar_ROC.py, inside compute_results(), classifiers are trained on X rather than X_train. This means that the classifiers have already seen the test set during training, which is bad practice because it biases the reported ROC curves.

The way to fix this would be to change line 90 from

model.fit(X, y)

to

model.fit(X_train, y_train)

Additionally, the figure fig_star_quasar_ROC_1.png needs to be updated as it is the result of the execution of the script.
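The pattern can be sketched as follows. This is not the code of fig_star_quasar_ROC.py: the synthetic two-class data, the GaussianNB classifier, and the 25% test fraction are assumptions chosen only to show the split being respected.

```python
import numpy as np
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic two-class data (assumed for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (500, 2)), rng.normal(1.5, 1.0, (500, 2))])
y = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GaussianNB()

# Fit on the training split only, so the test split stays unseen.
model.fit(X_train, y_train)

# Evaluate the ROC curve on the held-out test split.
y_prob = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
```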

Figure 9.12: sklearn.tree.DecisionTreeClassifier incompatibility

I am using:

sklearn.__version__: 0.16.1
astroML.__version__: 0.3

File "fig_rrlyrae_treevis.py", line 242, in <module>
random_state=0, criterion='entropy')
TypeError: __init__() got an unexpected keyword argument 'compute_importances'

I added these lines to fix my fork:

from sklearn.tree import DecisionTreeClassifier

# In scikit-learn 0.14+ setting compute_importances=True is no longer required.
try:
    # scikit-learn < 0.14
    clf = DecisionTreeClassifier(compute_importances=True,
                                 random_state=0, criterion='entropy')
except TypeError:
    # scikit-learn 0.14+: the keyword argument is not accepted
    clf = DecisionTreeClassifier(random_state=0, criterion='entropy')

see also: astroML/astroML#77

(was astroML/astroML#78)

Missing 1.15 baseline

Baseline figure for 1.15 is missing

Add caption for new figures

Once they are finalized, update the caption for the new figures.

Wrong equation reference in Figure 5.9

In the comment in Figure 5.9, there is an incorrect reference in the equations.
For the probability p(b), instead of "eqn. 5.70", it should be "eqn. 5.71".
For the Gaussian approximation, the equation is not "eqn. 5.71".

RuntimeError triggered by pymc3 for figure 5.24

At the time of opening this issue I suspect this is a local problem on my laptop, but either way, having the issue on record doesn't hurt.

I have now run into pymc3 issues a few times with PyCharm, mostly when examples are embedded in notebooks, but this one now consistently appears on the command line, too. I only see the error using python3.8, while it works as expected with identical numpy and pymc3 versions on python3.7.

python book_figures/chapter5/fig_model_comparison_mcmc.py 
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [M1_log_sigma, M1_mu]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [M1_log_sigma, M1_mu]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 262, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 95, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/bsipocz/munka/devel/worktrees/astroML_figures/giant_figure_generating_branch_ed2/book_figures/chapter5/fig_model_comparison_mcmc.py", line 87, in <module>
    trace1 = pm.sample(draws=2500, tune=100)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/sampling.py", line 469, in sample
    trace = _mp_sample(**sample_args)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/sampling.py", line 1053, in _mp_sample
    sampler = ps.ParallelSampler(
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 355, in __init__
    self._samplers = [
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 356, in <listcomp>
    ProcessAdapter(
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 242, in __init__
    self._process.start()
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
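The idiom named in the error message can be sketched as below. A likely explanation for the 3.7 versus 3.8 difference is that Python 3.8 changed the default multiprocessing start method on macOS from fork to spawn, which re-imports the main module in every child process. The sketch uses a standard-library Pool as a stand-in for pm.sample(), and the function names are hypothetical; in the figure script the equivalent change would be to move the module-level sampling call inside a guarded main().

```python
import multiprocessing as mp


def square(value):
    """Toy worker function; a stand-in for the real sampling work."""
    return value * value


def main():
    # Request the "spawn" start method explicitly to mimic the situation
    # in the traceback above.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        results = pool.map(square, range(5))
    print(results)


# With "spawn", child processes re-import this module. Without the guard,
# each child would try to start its own pool and raise the RuntimeError.
if __name__ == "__main__":
    main()
```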

Move PRs over from astroML

Python 3.7 compatibility: issues with pymc (at least 10 figures)

pymc has a method called await. Since async and await are reserved keywords as of Python 3.7, pymc cannot even be imported there, which makes at least the following figures incompatible with Python 3.7 as well:

book_figures/chapter5/fig_cauchy_mcmc.py
book_figures/chapter5/fig_signal_background.py
book_figures/chapter5/fig_model_comparison_mcmc.py
book_figures/chapter1/fig_moving_objects_multicolor.py (struck from the list: this was never problematic, not sure how it ended up on this list)
book_figures/chapter10/fig_matchedfilt_chirp2.py
book_figures/chapter10/fig_matchedfilt_chirp.py
book_figures/chapter10/fig_arrival_time.py
book_figures/chapter10/fig_matchedfilt_burst.py
book_figures/chapter5/fig_gaussgauss_mcmc.py
book_figures/chapter8/fig_outlier_rejection.py
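The failure mode can be reproduced without pymc installed. The snippet below is an illustration of the keyword clash, not pymc source code: it tries to compile a function named await and reports whether the running interpreter accepts it.

```python
import keyword
import sys

# A definition that uses "await" as an ordinary identifier.
source = "def await(): pass"

try:
    compile(source, "<example>", "exec")
    compiles = True
except SyntaxError:
    # On Python 3.7+ "await" is a reserved keyword, so this branch is taken.
    compiles = False

print(sys.version_info[:2], keyword.iskeyword("await"), compiles)
```

Because the error is raised at compile time, any module containing such a definition fails on import, before any figure code runs.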

Page 104: Figure 3.19 shows the positive part of a double Weibull distribution, not a Weibull distribution. In this case it means that the values on the y axis are half of what they should be. To get a Weibull distribution in scipy, use exponweib with a=1 rather than dweibull.
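The factor of 2 can be checked numerically. The sketch below compares scipy.stats.dweibull with scipy.stats.exponweib at a=1 on the positive axis; the shape parameter k=2 and the grid are arbitrary choices for illustration, and scipy.stats.weibull_min is included only as a cross-check of the one-sided Weibull density.

```python
import numpy as np
from scipy import stats

k = 2.0  # shape parameter, chosen arbitrarily for illustration
x = np.linspace(0.01, 3.0, 300)

# Double Weibull: symmetric about zero, so only half of its
# probability mass lies on the positive axis.
pdf_double = stats.dweibull.pdf(x, k)

# One-sided Weibull via the exponentiated Weibull with a=1.
pdf_weibull = stats.exponweib.pdf(x, 1, k)

# Cross-check with scipy's dedicated one-sided Weibull distribution.
pdf_weibull_min = stats.weibull_min.pdf(x, k)

ratio = pdf_weibull / pdf_double
print(ratio.min(), ratio.max())
```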

Add new baselines

for the 2nd edition.
