ekaterinailin / altaipony
Find flares in Kepler and TESS light curves. Notebooks for quickstart inside.
Home Page: https://altaipony.readthedocs.io
License: MIT License
Implement synthetic flare injection on raw light curves, then de-trend, then recover and analyse performance.
Analogous to #15
This checks if the de-trending (i.e., GP K2SC) swallows flare signatures of a certain energy range.
Inject fake flares into a FlareLightCurve n times with a given flare frequency in a physically motivated grid, and return recovered energies and recovery probabilities as a function of ED.
def inject_flares(lc, flare_frequency=0.1):
    [...]
    return new_lc

def synthetic_flare_analysis(lc, n=1000):
    [...]
    return recovery_probability, ed_fraction
The module includes everything to characterize a given light curve with respect to its flare detection performance.
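The injection step described above could be sketched as follows. This is only an illustrative sketch: the function names inject_flares and the simple rise-and-decay flare shape are assumptions, not AltaiPony's actual implementation, which would use an empirical flare template and the package's own parameter grid.

```python
import numpy as np

def flare_model(t, tpeak, fwhm, ampl):
    """Toy flare shape: instantaneous rise, exponential decay.
    (A stand-in; real injection would use an empirical flare template.)"""
    flux = np.zeros_like(t)
    decay = t >= tpeak
    flux[decay] = ampl * np.exp(-(t[decay] - tpeak) / fwhm)
    return flux

def inject_flares(time, flux, flare_frequency=0.1, rng=None):
    """Inject fake flares at a given rate (flares per day) into a flux array.
    Returns the modified flux and the injected (tpeak, ampl, fwhm) parameters,
    which a recovery step can later compare against detected flares."""
    if rng is None:
        rng = np.random.default_rng(42)
    duration = time[-1] - time[0]
    n_flares = rng.poisson(flare_frequency * duration)
    new_flux = flux.copy()
    injected = []
    for _ in range(n_flares):
        tpeak = rng.uniform(time[0], time[-1])
        ampl = 10 ** rng.uniform(-2, -0.5) * np.median(flux)  # relative-amplitude grid
        fwhm = rng.uniform(0.001, 0.01)  # decay timescale in days
        new_flux = new_flux + flare_model(time, tpeak, fwhm, ampl)
        injected.append((tpeak, ampl, fwhm))
    return new_flux, injected

# Demo on a flat synthetic light curve:
time = np.linspace(0, 10, 1000)
flux = np.full(1000, 100.0)
new_flux, injected = inject_flares(time, flux, flare_frequency=0.5)
```

Comparing the injected parameter list against the flares recovered after de-trending yields the recovery probability and ED correction as a function of flare energy.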
In test_flarelc: test the attributes of a FlareLightCurve (detrended_flux, flux, quality, etc.). K2SC is used as a black box, so we might want to be able to catch and deal with exceptions later.
You can find smaller flares if you demand only two (or even one) consecutive data points above 3 sigma, followed by several more at maybe 2 sigma.
Find a flare with 2 data points above 3 sigma, then go check the next data point:
If it is above 2 sigma: add to the flare candidate list and look at the next
Else: drop candidate
Find flares that have the typical flare shape (gradual decay) but are small or do not spread their flux among three data points so that they all are above 3 sigma.
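The two-threshold extension rule above could look like the sketch below. The function name and signature are hypothetical; the point is only to show how a decay tail below 3 sigma gets attached to a candidate.

```python
import numpy as np

def extend_candidate(flux, sigma, start, stop, threshold=2.0):
    """Given a flare candidate spanning indices [start, stop) found at 3 sigma,
    keep appending subsequent data points while they stay above a lower
    threshold (2 sigma by default); this recovers the gradual decay phase."""
    n = len(flux)
    while stop < n and flux[stop] > threshold * sigma:
        stop += 1
    return start, stop

# Two points above 3 sigma at indices 2-3, decay tail at 2-2.5 sigma:
flux = np.array([0., 0., 4., 4., 2.5, 2.1, 1.0, 0.])
start, stop = extend_candidate(flux, sigma=1.0, start=2, stop=4)
# -> (2, 6): the two decay points are attached to the candidate
```

If the point after the candidate falls below the lower threshold, the loop simply stops, which corresponds to dropping the extension rather than the candidate itself.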
When version 1.0 is up, create a Zenodo link. This makes it easy to cite a given version of a code base that is under development.
Hi,
I'm trying to use this package with TESS target pixel files but have been having some trouble. Specifically, from_KeplerLightCurve doesn't seem to like some of the TESS FITS headers (probably because it's not a Kepler light curve...):
Traceback (most recent call last):
File "flare_counting.py", line 6, in <module>
flc = from_TargetPixel_source("TIC 219152539", mission="TESS", sector=5)
File "//anaconda/lib/python3.5/site-packages/altaipony-0.0.1-py3.5.egg/altaipony/lcio.py", line 59, in from_TargetPixel_source
lc = from_KeplerLightCurve(lc, origin = 'TPF', **keys)
File "//anaconda/lib/python3.5/site-packages/altaipony-0.0.1-py3.5.egg/altaipony/lcio.py", line 122, in from_KeplerLightCurve
flux_unit = u.electron/u.s, **z)
TypeError: __init__() got an unexpected keyword argument 'ccd'
Is there a way around this, or perhaps a short tutorial on using altaipony with TESS data?
Thanks!
The ability to run the fake flare injection procedure even for very unlikely flares. This would be useful for detailed studies of a single star or a handful of stars (but less so for large samples).
Default use is the same, but keywords allow more choices around fake flare procedure for small flares.
Write tests that explicitly check units of outputs of functions if the input units are known.
Question: Would this be a good use case for decorators? (@gully)
def test_units():
    lc = get_k2sc_lc()
    flares = lc.find_flares(*args, **kwargs)
    test_find_flares(lc, flares)

# OR

@testunits
def find_flares(*args, **kwargs):
    [...]
    return flares
Time is given in days, but ED is typically measured in seconds, frequencies can be per hour or per year, and amplitudes can be relative or in e-/s.
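A decorator for such unit checks could be sketched like this. Everything here is hypothetical (decorator name, the string-based units, the Quantity stand-in); in practice astropy Quantities would carry the `unit` attribute.

```python
import functools

def expects_units(**unit_map):
    """Decorator that checks keyword arguments carry an expected `unit`
    attribute (works with astropy Quantities or any duck-typed equivalent)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for name, expected in unit_map.items():
                if name in kwargs:
                    actual = getattr(kwargs[name], "unit", None)
                    if actual != expected:
                        raise ValueError(
                            f"{name}: expected unit {expected!r}, got {actual!r}")
            return func(*args, **kwargs)
        return wrapper
    return decorator

class Quantity:
    """Minimal stand-in for an astropy Quantity."""
    def __init__(self, value, unit):
        self.value, self.unit = value, unit

@expects_units(ed="s", duration="day")
def flare_frequency(ed=None, duration=None):
    # hypothetical function: flares per day above the given ED
    return 1.0 / duration.value

print(flare_frequency(ed=Quantity(350.0, "s"), duration=Quantity(50.0, "day")))  # 0.02
```

Passing an ED in the wrong unit (say, minutes) raises immediately instead of silently producing frequencies that are off by a constant factor.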
lightkurve search functions work differently, and the old ones will be deprecated soon.
from lightkurve import search_targetpixelfile
search_result = search_targetpixelfile(EPIC)
tpf = search_result.download()
Make LCIO compatible with future versions of lightkurve and also cleaner than before.
in test_fakeflares.py
Docstrings exist but a human-friendlier documentation page is missing.
Make a human-friendly docs page that you can use yourself. It will also expose redundancies and faults.
We could use the splits given for K2SC de-trending as additional input to FLC.find_gaps() to improve the definition of continuous observing periods. The light curve is split according to the K2SC splits; additional splits are then added from function parameters, i.e., minimum duration of continuous observation and minimum cadence.
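Combining the two sources of splits might look like the sketch below. The function name matches FLC.find_gaps() from the issue, but the signature and the 0.09-day gap threshold are assumptions for illustration.

```python
import numpy as np

def find_gaps(time, splits=None, max_gap=0.09):
    """Return sorted boundary indices of continuous observing periods.
    Cadence-based gaps (time differences above max_gap days) are combined
    with externally supplied split indices, e.g. the K2SC breakpoints."""
    gaps = set(np.where(np.diff(time) > max_gap)[0] + 1)
    if splits is not None:
        gaps.update(splits)
    return sorted(gaps | {0, len(time)})

# 100 cadences with a one-day hole in the middle, plus one K2SC split at 25:
time = np.concatenate([np.arange(0, 1, 0.02), np.arange(2, 3, 0.02)])
print(find_gaps(time, splits=[25]))  # [0, 25, 50, 100]
```

Consecutive boundary pairs then delimit the observing periods, so a K2SC breakpoint produces the same kind of segment boundary as a genuine data gap.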
A moderately simple way to install AltaiPony is to clone the repository and run its setup script.
git clone https://github.com/ekaterinailin/AltaiPony.git
cd AltaiPony
python setup.py install
Three lines of commands make AltaiPony run.
Tasks to complete:
Set up AltaiPony as a PyPI package that you can install via pip:
pip install altaipony
A simplified way for quick use of AltaiPony, even simpler than setup.py install.
Since characterize_one_flare works with flare amplitudes, it would be nice to see what was fed to the algorithm there. The DataFrame that contains the recovered flares now features an amplitude attribute in relative units.
Write a quickstart guide that runs a limited number of use cases for AltaiPony.
import altaipony
lc = get_k2sc_lc(some_id)
lc = lc.k2sc_detrend()
flares = find_flares(lc)
plot(flares)
Instructions will help make the structure of the code more transparent for users and will expose counter-intuitive implementations, such as unclear object and method names.
Testing the de-trending part is computationally intensive. We still need to test these parts of the code.
Write tests that assert that de-trending is triggered when necessary, and leads to similar results each time.
We should return helpful error messages when the attributes of a given FlareLightCurve lack the de-trended components that are needed, or catch these exceptions on the fly if that is what users would expect.
Hi,
I was trying to find flares with TESS data and the find_flares() function didn't work right out of the box. This is because the detrended_flux_err is a requirement (or else it's defaulted to a list of NaNs), which I initially did not input as I wasn't using a lightkurve.lightcurve.LightCurve object. It would be wonderful if you could add this to the documentation :)
Additionally, for characterize_one_flare() (and please correct me if I'm wrong), the input should be a Pandas.DataFrame row. However, there were a lot of syntax errors when trying to manipulate the values in the rows in the function.
I've forked a version of AltaiPony that now works for me with these fixes. Can you confirm that this is the appropriate behavior of the code here? If it is, I will submit it as a pull request.
Thanks!
We need a list of split indices as an attribute of FlareLightCurve. This is then fed directly to detrend(). append() needs a feature that adds another split index = len(first_FLC).
We have a dict of default splits that talks to FlareLightCurve.__init__(). Appending FlareLightCurve objects introduces one additional split value and updates the indices of the appended object with += len(FlareLightCurve.flux).
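The split-index bookkeeping could be sketched as below. The helper name and the SimpleNamespace stand-ins are assumptions; only the offset logic mirrors the description.

```python
import copy
from types import SimpleNamespace

def append_with_splits(flc1, flc2):
    """Concatenate two light curves and merge their split indices:
    flc2's splits are shifted by len(flc1.flux), and the seam between
    the two curves becomes an additional split."""
    merged = copy.deepcopy(flc1)
    offset = len(flc1.flux)
    merged.flux = list(flc1.flux) + list(flc2.flux)
    merged.splits = sorted(set(flc1.splits)
                           | {offset}
                           | {s + offset for s in flc2.splits})
    return merged

# Stand-ins for FlareLightCurve objects:
flc1 = SimpleNamespace(flux=[1.0] * 100, splits=[40])
flc2 = SimpleNamespace(flux=[1.0] * 50, splits=[20])
merged = append_with_splits(flc1, flc2)
print(merged.splits)  # [40, 100, 120]
```

Returning a new object keeps the original light curves untouched, which also matches the append/concatenate naming discussion elsewhere in this tracker.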
... in particular, in the characterize_flares, find_flares, and sample_flare_recovery methods.
... also in the attributes of FlareLightCurve.
Alternative way of loading data also needs testing.
Replace EPIC ID with some local path of an example light curve in the repo.
lightkurve will resolve the path automatically.
lcio.from_mast should return a list of FLCs if c is not specified, or at least give a list of the available cs. Currently, only the first LC is returned.
flc = from_mast(3863594, mission="Kepler", mode="LC", c=None)
Warning: 3863594 may refer to a different Kepler or TESS target. Please add the prefix 'KIC' or 'TIC' to disambiguate.
/home/ekaterina/Documents/001_Science/AltaiPony/appaloosa_for_tess/lib/python3.6/site-packages/lightkurve-1.1.1-py3.6.egg/lightkurve/search.py:185: LightkurveWarning: Warning: 15 files available to download. Only the first file has been downloaded. Please use `download_all()` or specify additional criteria (e.g. quarter, campaign, or sector) to limit your search.
LightkurveWarning)
1% (3/476) of the cadences will be ignored due to the quality mask (quality_bitmask=1130799).
/home/ekaterina/Documents/001_Science/AltaiPony/appaloosa_for_tess/lib/python3.6/site-packages/altaipony-0.0.1-py3.6.egg/altaipony/lcio.py:56: ResourceWarning: unclosed file <_io.FileIO name='/home/ekaterina/.lightkurve-cache/mastDownload/Kepler/kplr003863594_lc_Q111111011101110111/kplr003863594-2009131105131_llc.fits' mode='rb' closefd=True>
return _from_mast_Kepler(targetid, c, **kwargs)
download_all should either be callable from AltaiPony, or a warning and the list of cs should be returned instead of an LC, or both.
You do not see at first glance whether you can use the code, and if so, how to credit the authors.
Implement:
Clickable buttons from shields.io that direct you to the respective information for that button.
See k2sc or trappist-1 for an example.
in test_flarelc
I'm having trouble opening a fits file downloaded from the Kepler database. However, the example fits file that was provided does work.
from altaipony.flarelc import FlareLightCurve
from altaipony.lcio import from_fits_file
path = "/home/homanj/Documents/Flares/AltaiPony/Flare Data/2/kplr010002792-2009259160929_llc.fits"
flc = from_fits_file(path)
I get an error message that flc.detrended_flux is not a proper input for np.isfinite.
I expected that the program would run smoothly. I also copied the code from from_fits_file in altaipony.lcio and ran each part directly, which resulted in the improper-input error message for both FITS files (the example one and the downloaded one), consistent with the error above.
Installing AltaiPony with setup.py throws a number of errors for uninstalled dependencies. Ideally, these would be installed automatically.
Running python3 setup.py install
will truly be the only line of code to call in order to install AltaiPony.
In Python, a method named append usually modifies the object in place rather than returning a modified copy. For copy-returning behavior, the names concatenate or merge are used.
a = [1, 2, 3]
a.append(4)
print(a)
>>> [1, 2, 3, 4]
a is modified.
lc1.append(lc2)
lc1 is not modified.
import copy

class A():
    def __init__(self, data):
        self.data = data

    def concatenate(self, other):
        '''Concatenate self.data and other.data into a new object.'''
        merged = copy.deepcopy(self)
        merged.data += other.data
        return merged

    def append(self, other):
        '''Append other.data to the end of self.data in place.'''
        self.data += other.data
        return None
As the IO functions are all fully functional, they now need to be kept up to date while the code matures. If IO needs to be extended in the future, no other code should break or become redundant.
Create class that inherits from lightkurve
or even k2sc
.
Add flare attributes and flare finding functions and methods to the new class.
from lightkurve import KeplerLightCurve

class FlareLightCurve(KeplerLightCurve):
    [...]
    self.flares = np.array([], dtype=int)
    [...]
    def find_flares(self, **kwargs):
        [...]
Build on provided, well-maintained and popular tools.
Exercise object-oriented programming.
Improve code structure (script -> package)
Build on TPF.interact() so that it will show the de-trended light curve, the flares, the TPF, and the raw light curve.
class FlareTargetPixelFile(KeplerTargetPixelFile):
    [...]
    def interact(self, **kwargs):
        [...]
        # bokeh stuff
        return
The goal is to quickly analyse suspicious candidates by eye with all the ancillary information at hand, e.g., the TPF.
No need to download K2SC light curves manually if you can just pass the EPIC ID to some function that queries MAST.
Something similar to or building on:
from lightkurve import KeplerLightCurveFile
def get_k2sc_lc(id_):
    lc = KeplerLightCurveFile.from_archive(id_)
Speed up and simplify use of AltaiPony
using existing and well-maintained tools.
Users who find AltaiPony useful should cite this package (DOI needed), lightkurve, and k2sc.
If you want to compute FFDs, you need to know how long the target on which you found a flare was observed. But you do not want to compute a frequency immediately, because you do not know a priori how many flares you will end up with in a light curve. The number of cadences is saved and then converted to the total observation time of the target. This number is stored in FlareLightCurve.flares['tot_obs_cad', 'tot_obs_time'].
For the original light curve:
Additionally, for the synthetic flares:
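The cadence-to-time conversion is a one-liner; the sketch below assumes the Kepler long-cadence interval of roughly 29.4 minutes (the function name and defaults are illustrative, not AltaiPony's API).

```python
def total_obs_time(n_cadences, cadence_minutes=29.4):
    """Convert a cadence count into total observation time in days.
    29.4 min is the Kepler/K2 long-cadence interval; pass ~1 for
    short cadence or ~2 for TESS 2-min data."""
    return n_cadences * cadence_minutes / 60.0 / 24.0

print(round(total_obs_time(4000), 2))  # 81.67 days
```

Storing the cadence count rather than a precomputed frequency means the flare rate can be recomputed after flares are added or vetted away.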
Think of a better structure than this:
def find_flares(new_lc, complex=False, **kwargs):
    if complex:
        return find_complex_flares(new_lc, **kwargs)
    else:
        return find_simple_flares(new_lc, **kwargs)
new_lc = inject_flares(lc)
recovery_probabilities, ed_corrections = find_flares(new_lc, complex=True)
Complex flares make up a significant part of true flare events and occur in artificially flare-infested light curves. Finding them by sensible criteria would improve analysis of the underlying physics, e.g., true flare frequencies, sympathetic flares etc.
Set up logging and return useful comments/warnings/logs. Return an online or, optionally, an offline protocol with different levels of verbosity that does not clutter up the code and is easy to silence.
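The standard-library logging module covers all of these requirements. A minimal sketch, assuming an "altaipony" logger namespace (the actual name the package adopts may differ):

```python
import logging

# Module-level logger; applications configure handlers and levels themselves.
logger = logging.getLogger("altaipony")

def find_flares_verbose(flux):
    """Sketch of a function that reports progress through the logger
    instead of print(), so verbosity is controlled outside the code."""
    logger.info("Searching %d cadences for flares.", len(flux))
    if len(flux) == 0:
        logger.warning("Empty light curve, nothing to do.")
        return []
    # ... actual detection would go here ...
    return []

# Users pick their verbosity without touching library code:
logging.basicConfig(level=logging.INFO)                    # verbose console output
logging.getLogger("altaipony").setLevel(logging.CRITICAL)  # silence the package
```

An offline protocol is then just a FileHandler attached to the same logger, so the code itself never branches on verbosity.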
from_TargetPixel_source
seems to be the only way to get a lightcurve that is suitable for detrending, and it errors out on at least some short cadence inputs:
I think this is because even with a given KID and quarter, there can still be multiple results from the call to lightkurve.search_targetpixelfile().
from altaipony.lcio import from_TargetPixel_source
flc = from_TargetPixel_source("9592705", quarter=9, cadence='short')
I guess there needs to be a way to either load more than one target pixel file, or to allow the user to choose which one when there is more than one in a sector. Alternatively, there could be a target pixel file counterpart to from_KeplerLightCurve_file, from_KeplerTargetPixel_file or something.
Even with verb-y, descriptive module names, the code structure can remain opaque to users or to my future self. To prevent this, write down the current idea that the code's structure represents.
Some function may go back to the TPF, get the aperture mask, and see if the pixels inside the mask are saturated (>10094 or so). Then add a column to FlareLightCurve.flares with the average saturation in the mask.
def measure_saturation(self):
    TPF = from_archive(self.targetid)
    relevant_flux = TPF.flux[aperture_mask]  # pseudocode: pixels inside some aperture
    for i, row in self.flares.iterrows():
        [...]
We want to control for saturation to isolate possible departures from a single power law.
in test_flarelc
Flag (and optionally, drop) flares that have flagged cadences.
flares.flags = check_for_flags(flares)
flares = np.delete(flares, flares.flags)
It becomes more transparent how often artifacts occur and if flags really trace any of them.
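A minimal flag check along the lines of the pseudocode above might look like this; check_for_flags and its per-flare istart/istop arrays are assumptions based on how the flare table is described elsewhere.

```python
import numpy as np

def check_for_flags(quality, istart, istop):
    """Return a boolean per flare: True if any cadence in the flare's
    [istart, istop) range carries a nonzero quality flag."""
    return np.array([np.any(quality[i0:i1] != 0)
                     for i0, i1 in zip(istart, istop)])

quality = np.array([0, 0, 16, 0, 0, 0, 0, 128, 0, 0])
istart = np.array([1, 4])
istop = np.array([3, 6])
flags = check_for_flags(quality, istart, istop)
print(flags)  # [ True False] -- first flare overlaps flag 16, second is clean
```

Keeping the boolean column instead of deleting rows immediately lets us measure how often flagged cadences coincide with detections before deciding to drop them.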
The amplitude should be determined from the de-trended flux, not from the raw flux that still contains variability and so on.
# line 472, change
ampl = np.max(flc.flux[f.istart:f.istop])/flc.it_med[f.istart]-1.
# to
ampl = np.max(flc.detrended_flux[f.istart:f.istop])/flc.it_med[f.istart]-1.
Replace test_wrapper() in test_altai.py with tests that do the same on the different parts of the refurbished wrapper() function, once it takes FlareLightCurve objects instead of numpy.recarrays.
from ..flarelc import FlareLightCurve
def test_wrapper():
lc = FlareLightCurve(EPIC)
assert lc == 'something'
lc = lc.detrend()
assert lc == [...]
Run a full integration test to make sure we can push a light curve through the pipeline from A to Z.
Light curve quality varies between breakpoints sometimes. Also the maximum flare duration that can be recovered depends on the gap's size.
Fake flares are only injected into the gap where the real flare is found.
Not every K2 TPF has k2sc de-trended light curves. You need to grab a raw TPF and de-trend it using k2sc.
def get_k2sc_lc(file, EPIC_ID=None):
    if EPIC_ID is not None and grab_from_k2sc_archive() == Error:
        lc = grab_from_mast()
        lc = lc.k2sc()
AltaiPony should be able to treat any observed K2 target.
Implement a function that creates a cumulative FFD object given a set of true or fake flares detected in one or multiple light curves.
@test_units
def construct_ffd(flares, **kwargs):
    hist, bins = np.histogram(flares.ed)
    cum_hist = np.cumsum(hist[::-1])[::-1]  # cumulative counts above each ED bin
    [...]
    return ffd
flares = lc.find_flares()
ffd = construct_ffd(flares)
Create a simple FFD with frequency over ED for a quick view of flare analysis results; it can later be extended to include real energies (given a quiescent luminosity), log-log representation, power-law fits, and their uncertainties.
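A self-contained cumulative FFD needs no histogram at all: sorting the EDs in descending order and dividing the rank by the total observation time gives the rate of flares at or above each ED. This sketch assumes only the function name from the issue; the signature is illustrative.

```python
import numpy as np

def construct_ffd(ed, tot_obs_time):
    """Cumulative flare frequency distribution: for each observed ED,
    the rate (per day) of flares with that ED or larger."""
    eds = np.sort(np.asarray(ed, dtype=float))[::-1]  # descending EDs
    freq = np.arange(1, len(eds) + 1) / tot_obs_time  # cumulative counts / time
    return eds, freq

eds, freq = construct_ffd([120., 40., 300., 80.], tot_obs_time=50.0)
# eds  -> [300., 120., 80., 40.]
# freq -> [0.02, 0.04, 0.06, 0.08]
```

Plotting freq against eds on log-log axes gives the usual FFD view, ready for a power-law fit.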
Create an integration test that runs the MVP, and run it using pytest (https://docs.pytest.org/en/latest/).
lc = get_k2sc_lc('examples/hlsp_k2sc_k2_llc_211119999-c04_kepler_v2_lc.fits')
start, stop = wrapper(lc)
print(start,stop)
Test should return the cadence number of flares detected in that particular light curve.