gwastro / pycbc

Core package to analyze gravitational-wave data, find signals, and study their parameters. This package was used in the first direct detection of gravitational waves (GW150914), and is used in the ongoing analysis of LIGO/Virgo data.

Home Page: http://pycbc.org

License: GNU General Public License v3.0

Python 97.21% Shell 0.67% HTML 0.48% CSS 0.07% JavaScript 0.07% Dockerfile 0.05% C++ 0.55% Cython 0.91%
astronomy physics gravity ligo gravitational-waves pycbc analysis python black-hole neutron-star gwastro virgo signal-processing open-science cosmic-explorer einstein-telescope lisa

pycbc's Introduction

GW150914

PyCBC is a software package used to explore astrophysical sources of gravitational waves. It contains algorithms to analyze gravitational-wave data, detect coalescing compact binaries, and perform Bayesian inference on gravitational-wave data. PyCBC was used in the first direct detection of gravitational waves and is used in flagship analyses of LIGO and Virgo data.

PyCBC is collaboratively developed by the community and is led by a team of GW astronomers with the aim of building accessible tools for gravitational-wave data analysis.

The PyCBC home page is at http://pycbc.org

Documentation is automatically built from the latest master version.

Detailed installation instructions for PyCBC are available in the documentation.

Want to get going using PyCBC?

Quick Installation

pip install pycbc

To test the code on your machine:

pip install pytest "tox<4.0.0"
tox
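
As a quick sanity check that an installation works, the following minimal sketch generates an example signal and matched-filters its two polarizations against each other (the approximant and parameters here are arbitrary choices, and lalsimulation is assumed to be installed):

from pycbc.waveform import get_td_waveform
from pycbc.filter import match
from pycbc.psd import aLIGOZeroDetHighPower

# Generate the two polarizations of an example binary-black-hole signal.
hp, hc = get_td_waveform(approximant="SEOBNRv4_opt", mass1=30, mass2=30,
                         delta_t=1.0/4096, f_lower=20)

# Match them against each other over an analytic aLIGO design-sensitivity PSD.
psd = aLIGOZeroDetHighPower(len(hp) // 2 + 1, 1.0 / hp.duration, 20)
m, idx = match(hp, hc, psd=psd, low_frequency_cutoff=20)
print("match between h+ and hx: %.3f" % m)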

If you use any code from PyCBC in a scientific publication, then please see our citation guidelines for more details on how to cite PyCBC algorithms and programs.

For the citation of the pycbc library, please use a BibTeX entry and DOI for the appropriate release of the PyCBC software (or the latest available release). A BibTeX entry and DOI for each release are available from Zenodo.

pycbc's People

Contributors

a-r-williamson, ahnitz, arthurtolley, bema-aei, bhooshan-gadre, cdcapano, cmbiwer, cyberface, dfinstad, duncan-brown, duncanmmacleod, garethcabourndavies, jakeb245, josh-willis, lppekows, lpsinger, maxtrevor, micamu, pannarale, praveen-mnl, prayush, soumide1102, spxiwh, stevereyes01, sum33it, tapaimarton, tdent, titodalcanton, veronica-villa, wushichao

pycbc's Issues

Changing value of a readonly variable

Bumped into this using an old ini file: once I cleaned up and fixed the ini file, I did not hit this problem again. Having said that, it looks like line 106 of timeseries.py tries to change the value of a read-only variable, which should not be happening.

Traceback (most recent call last):
File "/home/francesco.pannarale/opt/pycbc//bin/pycbc_inspiral" line 120, in
segments = strain_segments.fourier_segments()
File "/home/francesco.pannarale/opt/pycbc/lib/python2.6/site-packages/pycbc/strain.py", line 587, in fourier_segments
freq_seg = make_frequency_series(self.strain[seg_slice])
File "/home/francesco.pannarale/opt/pycbc/lib/python2.6/site-packages/pycbc/types/timeseries.py", line 106, in getitem
index.start += len(self)
TypeError: readonly attribute
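
For reference, Python slice objects are immutable, so the assignment to index.start will always raise this error. A minimal sketch of one possible fix (not the actual PyCBC patch) is to build a new slice instead of mutating the old one:

# Hypothetical helper for __getitem__: slice attributes are read-only, so
# construct a new slice with the wrapped start/stop values rather than
# assigning to index.start in place.
def _wrap_slice(index, length):
    start, stop = index.start, index.stop
    if start is not None and start < 0:
        start += length
    if stop is not None and stop < 0:
        stop += length
    return slice(start, stop, index.step)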

hdf results page task list

The hdf based coinc result pages are severely lacking at the moment. Let's come together and count the ways (and hopefully fix some of them too!). Please comment with what should be added to this list, if you are working on some new plot, or are interested in helping out.

  • check for missing titles, axes, captions
  • include minifollowups
  • include range plots
  • meta information is missing (Ian?)
  • ifar plots
  • tables of loudest background
  • tables of loudest missed injections
  • use UTC time in addition to GPS wherever it is used
  • cross links to alog and detchar summary pages
  • omega scans????
  • site map
  • single detector histograms
More tasks to add to the list:
  • Units on all plots
  • segment and veto information plots
  • Use green for L1 and red for H1 on all plots
  • Add an overflow bin for extreme outliers on histograms
  • Fix x-axis limit on histograms
  • Change "without little dogs" label in legend to "closed-box background"; explain in caption
  • Define injected decisive distance in caption
  • Show linear and log scale for found/missed plot
  • is there any template bank information we can show on the summary page?
  • Show summary page at a commissioner meeting/telecon
  • Caption explanation that walks an LSC user who only knows the Sensemon range through how it relates to the sensitive distance plots
  • Add 1.4-1.4 line to sensitive distance plot on summary page
  • Figure out the scaling factor between L1 and H1 SNRs, since the detectors are not co-located
  • Add an expand all accordion button
  • Put H1,L1 ASD/range on the same plot. Use the assigned IFO colors on the plots
  • plot p-values for sigma values on cumulative rate vs. stat plot
  • Include a mini-followup of the loudest events; the loudest events should be in a dropdown under the results tab

Last example version:
https://sugar-jobs.phy.syr.edu/~ahnitz/projects/paper/testing/w1/w1_test3/html/

The current example ini files are broken in pycbc workflow

The current example ini files seem to be broken in the current master of pycbc. Running them gives the error shown below. I'm not really sure what this means ... I worry that the bitwise operation performed between two segment lists here may not be compatible with the lal.LIGOTimeGPS type. If so, that would be bad. Thoughts?

2015-06-16 18:04:53,187:INFO : Leaving split output files module.
2015-06-16 18:04:53,187:INFO : Entering injection module.
2015-06-16 18:04:53,196:INFO : Leaving injection module.
2015-06-16 18:04:53,196:INFO : Entering time slides setup module.
2015-06-16 18:04:53,604:INFO : Entering matched-filtering setup module.
2015-06-16 18:04:53,605:INFO : Adding matched-filter jobs to workflow.
2015-06-16 18:04:53,605:INFO : Setting up matched-filtering for H1.
Traceback (most recent call last):
File "./pycbc_make_coinc_workflow", line 123, in
tags = [tag])
File "/home/spxiwh/lscsoft_git/executables_master/lib/python2.7/site-packages/PyCBC-65a9dd-py2.7.egg/pycbc/workflow/matched_filter.py", line 119, in setup_matchedfltr_workflow
compatibility_mode=compatibility_mode)
File "/home/spxiwh/lscsoft_git/executables_master/lib/python2.7/site-packages/PyCBC-65a9dd-py2.7.egg/pycbc/workflow/matched_filter.py", line 221, in setup_matchedfltr_dax_generated
compatibility_mode=compatibility_mode)
File "/home/spxiwh/lscsoft_git/executables_master/lib/python2.7/site-packages/PyCBC-65a9dd-py2.7.egg/pycbc/workflow/jobsetup.py", line 263, in sngl_ifo_job_setup
useSplitLists=True)
File "/home/spxiwh/lscsoft_git/executables_master/lib/python2.7/site-packages/PyCBC-65a9dd-py2.7.egg/pycbc/workflow/core.py", line 867, in find_outputs_in_range
overlap_windows = [abs(i.segment_list & currsegment_list) for i in overlap_files]
TypeError: in method 'LIGOTimeGPS___gt__', argument 2 of type 'LIGOTimeGPS *'
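
For context, the failing line is computing the coincident livetime between two segment lists. A minimal sketch of that operation with plain float boundaries (using the ligo.segments package, previously glue.segments) works fine, which supports the suspicion that the failure only appears when the boundaries mix types such as lal.LIGOTimeGPS:

from ligo.segments import segment, segmentlist

# Two toy segment lists with plain float boundaries.
a = segmentlist([segment(0.0, 100.0), segment(200.0, 300.0)])
b = segmentlist([segment(50.0, 250.0)])

# The same operation as core.py line 867: total length of the intersection.
print(abs(a & b))  # -> 100.0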

Add file names (and checksum?) to the attributes of hdf files

Currently, STATMAP, TRIGGERMERGE, and HDFINJFIND files make references to other files without actually recording those files' names anywhere. For example, STATMAP files have 'template_id' in their foreground and background fields, which points to a template in a BANKHDF file, but the name/location of the BANKHDF file is not stored anywhere in the file. Ditto 'trigger_idX' (which point to single-detector ids) and TRIGGERMERGE files. Likewise, HDFINJFIND files include an injection_id which points to an injection in an INJECTION xml file, but don't record which file. (These are particularly annoying, since there are multiple INJECTION files in a run, and the HDFINJFIND file names don't exactly match the INJECTION file names.) This means that if a user wants to do something with both coincident results and single-detector results, they just have to know which file goes with what. Also, if you want to write a program that uses information from multiple files, you have to have the user provide all the files on the command line, e.g., 'coinc-files STATMAP --bank-file BANKFILE', etc.

This also isn't very safe from a review standpoint. There's no way to ensure that, for instance, the TRIGGERMERGE file that a STATMAP file used on creation is the same TRIGGERMERGE file currently sitting in the run directory.

The request is that the files on which each hdf file depends get recorded in the attributes of that file, along with a checksum. In particular: STATMAP files record the BANKHDF and TRIGGERMERGE files that they use; HDFINJFIND files record the BANKHDF, TRIGGERMERGE, INJECTION*xml, and STATMAP files; TRIGGERMERGE files record the BANKHDF file.

As I'm not too familiar with what codes generate each of these files, I'm just throwing this out there right now to see if anyone wants to take this on. If not, I'll come back to it and try to do it myself.
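
A minimal sketch of what the requested bookkeeping could look like (the helper names below are hypothetical, not existing PyCBC functions):

import hashlib
import h5py

def md5sum(path, blocksize=1 << 20):
    # Stream the file so large TRIGGERMERGE files need not fit in memory.
    md5 = hashlib.md5()
    with open(path, 'rb') as fileobj:
        for block in iter(lambda: fileobj.read(blocksize), b''):
            md5.update(block)
    return md5.hexdigest()

def record_dependencies(output_hdf, input_paths):
    # Store dependency file names and checksums as attributes of the output file.
    with h5py.File(output_hdf, 'a') as hdf:
        hdf.attrs['input_files'] = [path.encode() for path in input_paths]
        hdf.attrs['input_md5'] = [md5sum(path).encode() for path in input_paths]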

Accounting tags in staging/cleanup jobs

Jobs without accounting_group tags are now blocked on Atlas. I found out the hard way that some .sub files in PyCBC workflows don't yet have the tag:

[tito@atlas6 ~/er7/runs/week1/local]$ find . -type f -name \*.sub -exec grep -L accounting_group "{}" \;
./create_dir_er7_uberbank_week1_0_local.sub
./subdax_main_ID0000001.sub
./subdax_finalization_ID0000002.sub
./cleanup_er7_uberbank_week1_0_local.sub
./er7_uberbank_week1-0.dag.condor.sub
./main_ID0000001.000/create_dir_main_0_local.sub
./main_ID0000001.000/stage_in_remote_local_0_1.sub
./main_ID0000001.000/stage_in_remote_local_1_1.sub
./main_ID0000001.000/stage_in_remote_local_0_0.sub
./main_ID0000001.000/stage_in_remote_local_1_0.sub
./main_ID0000001.000/stage_out_local_local_0_1.sub
./main_ID0000001.000/stage_out_local_local_0_0.sub
./main_ID0000001.000/stage_out_local_local_1_1.sub
./main_ID0000001.000/stage_out_local_local_1_0.sub
./main_ID0000001.000/stage_out_local_local_2_0.sub
./main_ID0000001.000/stage_out_local_local_2_1.sub
./main_ID0000001.000/stage_out_local_local_3_0.sub
./main_ID0000001.000/stage_out_local_local_3_1.sub
./main_ID0000001.000/main-0.dag.condor.sub
./main_ID0000001.000/stage_out_local_local_4_0.sub
./main_ID0000001.000/stage_out_local_local_5_0.sub
./main_ID0000001.000/stage_out_local_local_5_1.sub
./main_ID0000001.000/stage_out_local_local_6_1.sub
./main_ID0000001.000/stage_out_local_local_6_0.sub
./main_ID0000001.000/stage_out_local_local_7_1.sub
./main_ID0000001.000/stage_out_local_local_7_0.sub

I don't know Pegasus enough to fix this quickly, but I hope it's an easy fix for Alex or Ian.

Datafind AT_RUNTIME_MULTIPLE_CACHES and AT_RUNTIME_SINGLE_CACHES methods are broken

I think my recent change to support a backup datafind server has broken the two datafind methods that return cache files. This is not used by the all-sky workflow any more, but the GRB pipeline might be using it. I think the fix will be quick, but I may not be able to fix and test it while at the meeting. Just noting that this issue probably exists, so be aware of it for now if you are using these methods. Let me know if fixing this becomes urgent; otherwise I will fix it next week.

correlate parallel broken when used in simple usage

I have reverted the correlate function (not the Correlator, so pycbc_inspiral still uses it), so the match function will use the old correlate_inline.

(from Riccardo)

Hi All,

I'm trying to compute overlaps in pycbc.
While in the past it has been as straightforward as

from pycbc.waveform import get_td_waveform
from pycbc.filter import match

hp, hc = get_td_waveform(coa_phase=phiRef, delta_t=dT, mass1=m1,
                         mass2=m2, f_lower=fLow, distance=dist, inclination=inc,
                         approximant=apprxStr)
overlap, idxM = match(hp, hc, low_frequency_cutoff=fLow)

using present master version I hit the following error:

/tmp/1000_python27_compiled/sc_ce6a0f08bc653d0518ec8e7d5816a62311.cpp:
In function 'PyObject* compiled_func(PyObject*, PyObject*)':
/tmp/1000_python27_compiled/sc_ce6a0f08bc653d0518ec8e7d5816a62311.cpp:1070:82:
error: cannot convert 'std::complex' to
'std::complex' for argument '1' to 'void
ccorrf_parallel(std::complex, std::complex,
std::complex*, uint32_t, uint32_t)'
ccorrf_parallel(htilde, stilde, qtilde, (uint32_t) arrlen,
(uint32_t) segsize);

       ^

/tmp/1000_python27_compiled/sc_ce6a0f08bc653d0518ec8e7d5816a62311.cpp:
In function 'PyObject* compiled_func(PyObject*, PyObject*)':
/tmp/1000_python27_compiled/sc_ce6a0f08bc653d0518ec8e7d5816a62311.cpp:1070:82:
error: cannot convert 'std::complex' to
'std::complex' for argument '1' to 'void
ccorrf_parallel(std::complex, std::complex,
std::complex*, uint32_t, uint32_t)'
ccorrf_parallel(htilde, stilde, qtilde, (uint32_t) arrlen,
(uint32_t) segsize);

       ^

Traceback (most recent call last):
File "checkEOBNRv2.py", line 54, in
overlap, idxM = match(hpM_tilde, hpM_tilde, low_frequency_cutoff=fLow)
File "/home/riccardo/lvc/lscsoft/opt/EOBNRv2review/pycbc/lib/python2.7/site-packages/pycbc/filter/matchedfilter.py",
line 677, in match
high_frequency_cutoff, v1_norm, out=_snr)
File "/home/riccardo/lvc/lscsoft/opt/EOBNRv2review/pycbc/lib/python2.7/site-packages/pycbc/filter/matchedfilter.py",
line 557, in matched_filter_core
correlate(htilde[kmin:kmax], stilde[kmin:kmax], _qtilde[kmin:kmax])
File "", line 2, in correlate
File "/home/riccardo/lvc/lscsoft/opt/EOBNRv2review/pycbc/lib/python2.7/site-packages/pycbc/scheme.py",
line 172, in scheming_function
return schemed_fn(*args, **kwds)
File "/home/riccardo/lvc/lscsoft/opt/EOBNRv2review/pycbc/lib/python2.7/site-packages/pycbc/filter/simd_correlate.py",
line 400, in correlate_parallel
support_code = corr_support, auto_downcast = 1)
File "/usr/lib/python2.7/dist-packages/scipy/weave/inline_tools.py",
line 361, in inline
**kw)
File "/usr/lib/python2.7/dist-packages/scipy/weave/inline_tools.py",
line 491, in compile_function
verbose=verbose, **kw)
File "/usr/lib/python2.7/dist-packages/scipy/weave/ext_tools.py",
line 373, in compile
verbose=verbose, **kw)
File "/usr/lib/python2.7/dist-packages/scipy/weave/build_tools.py",
line 297, in build_extension
raise e
scipy.weave.build_tools.CompileError: error: Command "c++ -pthread
-fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -fPIC
-I/usr/lib/python2.7/dist-packages/scipy/weave
-I/usr/lib/python2.7/dist-packages/scipy/weave/scxx
-I/usr/lib/python2.7/dist-packages/numpy/core/include
-I/usr/include/python2.7 -c
/tmp/1000_python27_compiled/sc_ce6a0f08bc653d0518ec8e7d5816a62311.cpp
-o /tmp/scipy-riccardo-XzpoDg/python27_intermediate/compiler_8855277b295f576c423c618665ded9a0/tmp/1000_python27_compiled/sc_ce6a0f08bc653d0518ec8e7d5816a62311.o

-march=native -O3 -w -fopenmp" failed with exit status 1

Can anyone tell me what I am doing wrong?
Cheers,

Riccardo

P.S. I'm sorry I can't retrieve the git hash of the pycbc that worked
for me, but it should be master from last April or so.

Need to think about how to handle pegasus executable entries for different tags

This issue only affects MPI running on Stampede, so it's not urgent, but I'm putting it here so as not to lose the issue.

Larne, Duncan,

After some further thought on the request to use the same executable entry for each injection run, I realized that my simple suggestion to merge injection executable was wrong, and will not work in general for the injection inspiral jobs.

I am cc'ing Ian, as this may involve significant changes to the way the pycbc workflow modules conceptually work.

The model that we have used for pycbc workflow is that it would be built from a top-level set of components that are essentially procedural, but modular, where each would add some useful amount of work to the workflow in an independent manner, and only share information by explicit lists of result files, ensuring that the data dependencies are clearly visible at the highest level.

Executables are only instantiated and used within the context of a single workflow function. The model we have used is that they are independent generators of jobs. As such, in principle (through ini file configuration), although not, I believe, in active use, one could use a different physical executable for each call to a setup function.

To recap, the main problems we are having, as discovered by running on Stampede, are the following.

  1. Each ifo/injection combination gets a different entry in the transformation catalog.
    a) An executable is staged for each entry even if the PFN is the same (it is renamed on the remote site)
    b) Horizontal clustering is the main mode of analysis and operates at the level of the transformation catalog, meaning that it can only cluster jobs made by a single executable instance at the moment, which is a problem in the case of a large number of injection sets. With label-based clustering we can easily reduce to a few jobs (1 per ifo is a trivial ini file option), but we can't granularly control the number of clustered jobs with a single option.

  2. There is a general conceptual mismatch between the Dax3.Executable class and the pycbc Executable class. The worry is that this could cause us to move out of step with Pegasus development and cause further issues down the line.

Larne, Duncan, is that a fair understanding of what the problems are? Please point out any other issues.

The current model has a number of consequences that make it difficult to quickly implement the model you are requesting, though certainly not unfeasible.

These are the main technical issues.

  1. Setup functions do not share executable instances. Currently, they only share files, file lists, data products, etc. The idea was that this is something that should not be exposed at the top-level as it has no bearing on the actual connection of data driven plumbing.

  2. Executable instances are viewed as generators. This means they keep track of the common options for the executable (derived from the ini file), and of information such as which output folder the files they generate will be stored in, etc., which would need to be separated out.

We could explicitly instantiate executables in the top-level workflow and pass them to the setup functions. We would also need to move the common-option, output-folder, etc. logic into the node instantiation functions. This would require changes to nearly all parts of the code: information we would normally pass to the executable would need to be passed to the node creation instead. These are mostly local changes, however, as typically the executable is generated once and then called many times to generate jobs. We would have to think very carefully about how this would affect the configuration file hooks; while I think this change could be made without changing the way the configuration file addresses different tasks, that requires some thought and verification. What I am struggling with is the idea that one passes the executable instance around between setup function calls. To me the current agreement about what goes into and comes out of a workflow setup function is very clear, and this would break that agreement. Do we do this for all setup functions, or do we treat the inspiral ones as a special case?

If this is a change worth making, then we would need to plan for it, and it certainly can't be done until after O1. In the meantime, are there any show-stoppers that prevent us from using label-based clustering for now?

-Alex

I'm missing the context here (this appears to be the result of some
previous runs which I haven't heard about!), so my answer probably
doesn't make sense. But ....

Is the problem that you want to cluster over different injection runs? As Alex says, this could be done with label clustering. However, why would you want to do that? Even in our largest workflows we have never run more than 200 injection sets. You could use horizontal clustering to cluster all inspiral jobs in each injection run together, giving 200 jobs. What is the problem with that?

Maybe the issue is gwf file staging? Maybe you want a node to copy a bunch of data files and then have a bunch of inspiral jobs from different inspiral runs analyse the same data files? If that is the goal then you need label clustering; horizontal clustering will just cluster jobs at random.

Cheers
Ian

coinc_time option doesn't analyze some coincident time

The COINC_TIME option is implemented by taking the intersection of the two detectors' science time and treating that as the science time instead, so filtering jobs only get tiled during this time. This is fine, except in the case where a coincident segment is shorter than the analysis length, but the science segments that made it up were longer. In this case, the data would not be analyzed.

If anyone has a chance to fix this, please let me know, and go ahead. Otherwise, simply take note.
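
A toy illustration of the corner case (using the ligo.segments package; the numbers are made up): each detector's science segment is long, but the coincident piece is shorter than the analysis length and so nothing would be analyzed.

from ligo.segments import segment, segmentlist

analysis_length = 256  # seconds per filtering job (example value)

h1_science = segmentlist([segment(0, 1000)])
l1_science = segmentlist([segment(800, 2000)])

coinc = h1_science & l1_science                 # [segment(800, 1000)] -> 200 s
usable = segmentlist(s for s in coinc if abs(s) >= analysis_length)
print(coinc, usable)                            # usable is empty, so no jobs get tiled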

Sec_to_year function in pycbc_coinc_statmap

This function hardcodes a value of 3.15569e7 in the executable as the transformation between seconds and years.

Constants like this should be defined in one place (use the values in lal), and it should be clear whether we are defining a year as:

lal.YRSID_SI
lal.YRTROP_SI

or

lal.YRJUL_SI

Cheers
Ian
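
A sketch of the suggested change, taking the constant from lal and making the choice of year an explicit, visible default (the Julian year is shown, but that choice is exactly what needs to be decided):

import lal

def sec_to_year(seconds, seconds_per_year=lal.YRJUL_SI):
    # lal also provides YRSID_SI (sidereal) and YRTROP_SI (tropical);
    # whichever is chosen should be used consistently everywhere.
    return seconds / seconds_per_year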

pycbc.events.veto.indices_within_times can blow up RAM usage

The function pycbc.events.veto.indices_within_times can explode in RAM usage in cases where large numbers of triggers are present.

Specifically, the issue is that it (in pycbc_coinc_statmap) is testing for trigger idxes around a set of foreground triggers. It does not ensure that the idxes returned are unique, so if you have a large number of foreground triggers you will get a lot of idxes, but each will show up many times, leading to an N**2 blow-up of RAM. Somehow this function should include a uniqueness test when generating the list of integer idxes. Locally I am using something like:

chunk = 100000  # chunk size; good for my use-case, but maybe not always
stacked = numpy.array([], dtype=int)
nchunks = (len(left) + chunk - 1) // chunk
for i in range(nchunks):
    start_idx = i * chunk
    end_idx = min((i + 1) * chunk, len(left))
    curr_left = left[start_idx:end_idx]
    curr_right = right[start_idx:end_idx]
    curr_stacked = numpy.hstack([numpy.r_[s:e] for s, e in zip(curr_left, curr_right)])
    stacked = numpy.unique(numpy.hstack([stacked, curr_stacked]))
return tsort[stacked]
# The original code
# return tsort[numpy.hstack(numpy.r_[s:e] for s, e in zip(left, right))]

But this hardcodes a chunk size of 100000 that is good for my use-case, but maybe not always. I don't know a magic numpy way of including uniqueness in the list comprehension.

..... And a gripe. We are doing a lot of optimizing of the inspiral code to make it run faster. But I have never really been hit by run time issues since the hard work to make the chisq really fast and optimize the main FFT computation. We are all managing to complete ER7 runs quickly on different clusters. However, I have hit memory issues on a number of occasions. I think our optimization tests need to also consider and balance RAM usage. For example a clustering algorithm that is really fast, but keeps more triggers than the algorithm used in coh_PTF is probably going to be an issue for RAM usage before the speedup is useful.

Show config files in HTML reports from HDF workflows

Hi Chris,

it would be useful to have the merged .ini file visible from your pycbc_make_html_page reports from HDF workflows. Is that something you can do easily? If not, what's the best/quickest/easiest way to implement it?

Add a mechanism for validating ini file versions

Incompatibilities between ini file versions and code versions have been the bane of our existence for years. I suggest adding a section like

[version]
pycbc = 1.1.1

that pycbc_make_*_workflow can check against its own version information and tell the user whether or not the ini file is valid for the version of the workflow generator.
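
A minimal sketch of the proposed check (the section name is the one suggested above; the comparison policy, exact match versus minimum version, would still need to be agreed):

from pycbc import version as pycbc_version

def check_ini_version(cp):
    """cp is the ConfigParser that has already read the workflow ini file."""
    if not cp.has_option('version', 'pycbc'):
        raise ValueError("ini file has no [version] section with a 'pycbc' entry")
    required = cp.get('version', 'pycbc')
    if required != pycbc_version.version:
        raise ValueError("ini file was written for pycbc %s but this is %s"
                         % (required, pycbc_version.version))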

Inspiral will not produce triggers if cluster-window > injection-window

I found that if the cluster-window argument is larger than the injection-window argument, pycbc_inspiral will produce no triggers. You can find an example command to make this happen on atlas8, at: /home/cdcapano/overlapStudies/early_aligo/flow30Hz/M_4_50_bbh/ians_new_banks/geom_bank_6_1100_stoch_filled_2step/pycbc_workflow/run_mkl/test/run_inspiral.sh. In that script, you'll see that cluster-window is set to 4, while injection-window is set to 1. That command will produce no triggers. If I change injection-window to 3.9, I still get no triggers; if I set it to 4.1, I get triggers. To ensure that it is something to do with the cluster window, if I set the injection-window to 3.9, but change the cluster window to 3.5, I get triggers.

I'm not sure what's going on, but it appears to be due to something in threshold_and_cluster.

allow tarballed / zipped files in results pages

In the case of an indeterminate number of output files from a plotting / result page generation code, where it is not reasonable to specify the files individually, we would like to standardize on generating a single tar'd file. Note that this would also simplify Pegasus file handling.

This request is to ensure that these files are properly handled in the pycbc_make_html_page script. It will untar these files and render the resulting output as normal. If the tar'd file contains its own index.html, this will also be respected and not overwritten.

PyCBC install instructions are confusing

When running

[dbrown@seaview ~]$ pip install pycbc --user

I get the error:

Downloading/unpacking mpld3>=0.3git (from pycbc)
Could not find a version that satisfies the requirement mpld3>=0.3git (from pycbc) (from versions: 0.0.1, 0.1, 0.2)
Some insecure and unverifiable files were ignored (use --allow-unverified mpld3 to allow).

I looked at pypa/pip#1423 which discusses this issue, but I couldn't resolve it with any combination of --allow-unverified and --allow-all-external that I could figure out.

pip 1.5.6

Using empty lists as default arguments

Looking at our landscape.io page, there are a lot of functions that use an empty list as a default argument, e.g. tags=[].

I don't think we've run into a problem with this before, but it is not best practice. The default list [] is created once, when the function is defined, not each time it is called. A small experiment in ipython shows this:

In [1]: def print_tags(tags=[]):
   ...:     print tags
   ...:     tags.append(1)
In [2]: print_tags()
[]
In [3]: print_tags()
[1]
In [4]: print_tags()
[1, 1]

A page that describes the gotcha is here: https://pythonconquerstheuniverse.wordpress.com/2012/02/15/mutable-default-arguments/

A simple alternative is to pass None as the default and then, inside the function, create an empty list if the argument is None.
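
Spelled out, that alternative looks like this:

def print_tags(tags=None):
    # A fresh list is created on every call, so successive calls no longer
    # see each other's appended values.
    if tags is None:
        tags = []
    print(tags)
    tags.append(1)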

Handling of tags in SplitXXX codes

The handling of tags in the splitinspinj and splittmpltbank codes is inconsistent:

https://github.com/ligo-cbc/pycbc/blob/master/pycbc/workflow/jobsetup.py#L1117

https://github.com/ligo-cbc/pycbc/blob/master/pycbc/workflow/jobsetup.py#L1461

These codes are a little different from others as they inherit tags from the parent jobs and then add a split-number tag. But in some cases you may want to split the parent file more than once using different numbers of split files (i.e. for injection analysis the bank could be denser than for full data; this is most relevant in GRB where sliding is done in full data).

Alex, what do you think is the best way to handle this? Should we add an extra_tags option (or something) in the __init__ of these classes and append it to the parent's tags and the split-number tag?

Having these two share function code (and be adjacent in the code) would also be good ... Maybe the executables could even be combined?

Single detector trigger info is missing from results page

It was suggested that it would be nice to have single detector SNRs in the page_foreground output alongside coincident information. I'm just opening this issue to let people know I am working on this as part of dumping the loudest events to XML format for upload to gracedb at box opening.

merge segment handling functions

Ok, Ian, here is the todo list. Please let me know if there is something else I am missing.

Task List

  • move science segments to workflow-science
  • move veto creation to workflow-vetoes
  • add min-segment-length back to science time generation
  • make veto function return segment files, and the relevant segment names (still multiple segment files)
  • update exes to take SEGMENT_NAME on the command line (for the ones that don't)
    • pycbc_coinc_findtrigs
    • pycbc_plot_params_vs_singles
    • pycbc_page_snrchi
    • pycbc_page_coinc_snrchi
    • Any others ?????
  • make veto function return single file with all the cumulative lists
  • update and test pycbc_make_hdf_coinc_workflow for this model + downstream setup functions
  • update and test pycbc_make_coinc_workflow for this model + downstream setup functions
  • add new option for pregenerated science time
  • add new option for pregenerated veto files
  • update documentation and example ini files

Out of order dependencies in PyCBC build

It looks like the dependencies in the PyCBC build are out of order. If I run:

[dbrown@sugar-dev3 ~]$ virtualenv /home/dbrown/projects/pycbc
[dbrown@sugar-dev3 ~]$ source /home/dbrown/projects/pycbc/bin/activate
(pycbc)[dbrown@sugar-dev3 ~]$ pip install "numpy>=1.6.4" unittest2
(pycbc)[dbrown@sugar-dev3 ~]$ pip install -e git+https://github.com/ligo-cbc/pycbc#egg=pycbc --process-dependency-links

In the build I see:

Running setup.py bdist_wheel for h5py
running build_ext
File "setup_build.py", line 140, in run
from Cython.Build import cythonize
ImportError: No module named Cython.Build


Failed building wheel for h5py

Later I see:

Running setup.py bdist_wheel for Cython

Then later I see:

Running setup.py install for h5py

and my guess is that the install cleans up anything that failed in the earlier build.
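
Until the ordering is fixed, a plausible workaround (an assumption based on the traceback above, not a documented instruction) is to install Cython into the virtualenv by hand before installing PyCBC, so the h5py wheel can build on the first pass:

(pycbc)[dbrown@sugar-dev3 ~]$ pip install Cython
(pycbc)[dbrown@sugar-dev3 ~]$ pip install -e git+https://github.com/ligo-cbc/pycbc#egg=pycbc --process-dependency-links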

PyCBC needs a mini-followup page

This page is to keep track of what is needed and what is implemented for the mini-followup page in PyCBC. So that it reproduces what we had in ihope, it should at least have:

  • Table showing the summary of the combined statistics of the trigger (FAR, combined NewSNR, time-delay between single-detector triggers)
  • Table showing single-detector trigger information (SNR, NewSNR, chisq/dof, m1, m2, mchirp, eta, mtotal, mass ratio, template duration, end time in GPS seconds, UTC, and site local time, effective distance). Also show the injection ID, if it is an injection.
  • Time series (+/- 10 second) of the single detector triggers SNR after applying CAT1 + gating and CAT2 vetoes.
  • Time series (+/- 10 second) of the single detector triggers NewSNR after applying CAT1 + gating and CAT2 vetoes.
  • Time series (+/- 10 second) of the single detector SNR for the loudest trigger in each detector.
  • Time series (+/- 10 second) of the single detector NewSNR for the loudest trigger in each detector.
  • Version information for the code used to produce the mini-followup.

Each event should have

  • Link to the alog for the day in which the event occurred.
  • Link to the daily summary page for which the event occurred.
  • Link to the CBC daily pages for which the event occurred.
  • Links to the glitch grams for the summary page plots for the hour in which the event occurred.

Add any other items that you can think of below:

  • Spectrograms of the data around the trigger.

pycbc_inspiral does not work with --psd-file option

First run: run pycbc_inspiral with the --psd-estimation option and output the PSD.
Second run: use the PSD generated in the first run to refilter the data with the --psd-file option.

We expect the results from both runs to be very similar to each other. However, in the second case no triggers are produced.

The necessary tarball (containing the test data set and the template bank, both pycbc_inspiral scripts with the command-line options used, the output PSD, and the output triggers in both cases) can be found on Atlas at miram.cabero/PyCBC/SNRLoss/test/temp/testpsd.tar.gz

software injection into a TimeSeries of zeroes is not tapered as expected

I was doing software injections into a TimeSeries of zeroes.

I'm using pycbc.inject.InjectionSet.apply (ie. what we use to do software injections in pycbc_inspiral). But the EOBNRv2 waveform that is supposed to be tapered looks like: https://sugar-jobs.phy.syr.edu/~cbiwer/tmp/nottapered.png

I did a little digging and here's what I've looked at so far. The function call that actually generates the waveform timeseries is lalsimulation.SimDetectorStrainREAL8TimeSeries (called via pycbc.detector.project_wave).

lalsimulation.SimDetectorStrainREAL8TimeSeries adds these little bits of noise at the start of the waveform timeseries. I saved the output in inject.py and it is coming from this function call. There is some padding and interpolation that happens here, which maybe needs a closer look.

Then this waveform timeseries is tapered with SimInspiralREAL8WaveTaper (called via pycbc.waveform.utils.taper_timeseries). The algorithm in SimInspiralREAL8WaveTaper is to taper up to the second peak in the waveform, so these little bits of noise add a bunch of little peaks and SimInspiralREAL8WaveTaper tapers over just a couple of noise peaks.

Note that if the waveform just abruptly ends then there is no problem with SimInspiralREAL8WaveTaper; it works as it's supposed to. However, these little noise bits cause a problem.

For the plot above the following injection file on sugar was used: /home/cbiwer/projects/pycbc_test_inject/issue/injection.xml.gz
And the following script on sugar to make the plot: /home/cbiwer/projects/pycbc_test_inject/issue/pycbc_test_inject

Maybe I'm doing something weird here but I'm a bit suspicious about the taper of our injections now.

CPU matched filter ignores number of threads argument

When I use the matched filter engine with the processing scheme set to CPU on Atlas, it ignores the num threads argument that is set in scheme.from_cli. For instance, if I run pycbc_inspiral with:

pycbc_inspiral {some arguments}  --processing-scheme cpu

Several parallel threads will get launched when it hits the filtering part (observed by running htop while the job was running). This is true both on the head node and on the cluster nodes. If I manually try to set the number of threads with:

pycbc_inspiral {some arguments}  --processing-scheme cpu:1

It still ignores it. Oddly, however, if I manually export OMP_NUM_THREADS prior to running, then the number of threads does respect what I set it to (though it still ignores whatever from_cli sets). For instance, if I do:

export OMP_NUM_THREADS=3
pycbc_inspiral {some arguments}  --processing-scheme cpu:1

The number of parallel threads that are launched is 3. If I repeat the same thing but manually set OMP_NUM_THREADS to 1, the number of threads is 1.

I don't know why this is happening. If I set the processing scheme to mkl, it does respect the number of threads that are set in from_cli. Looking at scheme.py, I see that MKLScheme inherits from CPUScheme, so the code that sets the environment must be working properly. Is it possible that OpenMP somehow doesn't respect Python's os.environ?

In any event, until this is fixed, 'cpu' should not be used (at least on atlas; maybe it's ok on another cluster), as it can cause a job to hog resources when running in a dag. I realize mkl is probably better anyway, but cpu is the default that is used if processing-scheme is not specified. This can cause issues for new users.
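
One possible explanation (an assumption, not verified): OpenMP typically reads OMP_NUM_THREADS only when its runtime first initializes, so a value written into os.environ after the compiled extensions have loaded may simply be ignored. A quick way to test that hypothesis:

import os
# Set the variable before any OpenMP-using pycbc extension is imported ...
os.environ['OMP_NUM_THREADS'] = '1'

# ... and only then import the filtering code. If the job now stays on one
# thread, the problem is the ordering of the environment setting rather than
# OpenMP ignoring os.environ as such.
import pycbc.filter  # noqa: E402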

Fallback return value of get_waveform_filter_length_in_time() and time clustering

pycbc.waveform.get_waveform_filter_length_in_time() currently returns None if an approximant does not have an entry in _filter_time_lengths. This breaks --cluster-method template in pycbc_inspiral, effectively disabling clustering. Tom and I discovered this after erroneously using TaylorF2 rather than SPAtmplt templates.

What's a sensible behavior here? Should an error be raised when using --cluster-method template with an approximant which cannot report its duration? Or should we have a meaningful fallback duration in get_waveform_filter_length_in_time()?
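
A sketch of the first option, failing loudly rather than silently disabling clustering (whether a fallback duration would be preferable is exactly the open question):

from pycbc import waveform

def filter_length_or_raise(**params):
    # Wrapper sketch: refuse to continue if the approximant cannot report
    # its duration, instead of returning None and disabling clustering.
    length = waveform.get_waveform_filter_length_in_time(**params)
    if length is None:
        raise ValueError("approximant %s cannot report its duration, so "
                         "--cluster-method template cannot be used"
                         % params.get('approximant'))
    return length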

missing documentation

I've opened this issue to record the major sections of missing documentation. Please post additional items (or write the documentation).

  • hdf file format (Alex)
  • hdf workflow instructions (Alex)
  • examples of calculating waveform matches

Need exact version of dependencies for O1 release

Most of our PyPI dependencies say e.g. numpy >= 1.6.4 so we get at least that version, but may get newer versions (containing changes or bugs). For the O1 release and review, we should specify exact versions of all of the dependencies.

pycbc_inspiral psd-output option gives zero strain values

When running pycbc_inspiral with the psd-output option, the resulting PSD has many zeros for the strain value. Luckily these are all at the start of the file produced, so you can go into the file and delete them all before using this PSD. I was getting issues when trying to run pycbc_inspiral using a fixed PSD with these zeros in place (since the code needs to take the log of these values).
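
A sketch of the manual workaround described above, assuming the PSD was written as a two-column (frequency, strain) ASCII file:

import numpy

freq, psd = numpy.loadtxt('psd.txt', unpack=True)
keep = psd > 0  # drop the zero bins at the start of the file
numpy.savetxt('psd_nonzero.txt', numpy.column_stack([freq[keep], psd[keep]]))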

cannot use the --frame-type command with pycbc_inspiral

I'm trying to use the --frame-type option with pycbc_inspiral and I'm receiving an error.

The command I ran on sugar was:

cd /home/cbiwer/src/cbccalibration_exe_dev/examples/workflow/er6_t1/work/main_ID0000001
pycbc_inspiral  --segment-end-pad 64 --segment-length 256 --pad-data 8 --sample-rate 4096 --segment-start-pad 64 --psd-segment-stride 128 --psd-inverse-length 16 --filter-inj-only  --psd-segment-length 256 --processing-scheme cpu --snr-threshold 5.5 --cluster-method template --approximant SPAtmplt --psd-estimation median --maximization-interval 30 --strain-high-pass 30 --order 7 --chisq-bins 16 --channel-name L1:OAF-CAL_DARM_DQ --low-frequency-cutoff 40 --gps-start-time 1102660099 --gps-end-time 1102662147 --trig-start-time 1102660091 --trig-end-time 1102662155 --output  110266/L1-INSPIRAL_KAPPA_A_0.975-1102660091-2064.xml.gz  --frame-type L1_RDS  --bank-file  L1-TMPLTBANK_SNGL-1102660163-1920.xml.gz  --user-tag KAPPA_A_0.975

The error was:

Traceback (most recent call last):
  File "/home/cbiwer/pycbc/taper_fix_20150811/bin/pycbc_inspiral", line 4, in <module>
    __import__('pkg_resources').run_script('PyCBC===ee0cef', 'pycbc_inspiral')
  File "/home/cbiwer/.local/lib/python2.6/site-packages/pkg_resources/__init__.py", line 735, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/cbiwer/.local/lib/python2.6/site-packages/pkg_resources/__init__.py", line 1652, in run_script
    exec(code, namespace, namespace)
  File "/home/cbiwer/pycbc/taper_fix_20150811/lib/python2.6/site-packages/PyCBC-ee0cef-py2.6.egg/EGG-INFO/scripts/pycbc_inspiral", line 186, in <module>
    gwstrain = strain.from_cli(opt, DYN_RANGE_FAC)
  File "/home/cbiwer/pycbc/taper_fix_20150811/lib/python2.6/site-packages/PyCBC-ee0cef-py2.6.egg/pycbc/strain.py", line 64, in from_cli
    end_time=opt.gps_end_time+opt.pad_data)
  File "/home/cbiwer/pycbc/taper_fix_20150811/lib/python2.6/site-packages/PyCBC-ee0cef-py2.6.egg/pycbc/frame.py", line 287, in query_and_read_frame
    paths = frame_paths(frame_type, start_time, end_time)
  File "/home/cbiwer/pycbc/taper_fix_20150811/lib/python2.6/site-packages/PyCBC-ee0cef-py2.6.egg/pycbc/frame.py", line 251, in frame_paths
    gpsend=end_time)
  File "build/bdist.linux-x86_64/egg/glue/datafind.py", line 232, in find_times
  File "build/bdist.linux-x86_64/egg/glue/datafind.py", line 124, in _requestresponse
RuntimeError: Server returned code 404: Not Found404 Not Found

The server has not found anything matching the Request-URI.

Opening an issue to try and figure this out.

Incompatibility between different LIGOTimeGPS

I hit this error trying to analyze ER7 data. For some reason, a call to pycbc.events.veto.segments_to_file() is given a segment list made of segments with a variety of time types, even inside a single segment. It looks like this is the result of arithmetic operations between segment lists of different types. One of the segments happens to have a lal.LIGOTimeGPS type. This raises a ValueError when the time is cast to a glue.ligolw.lsctables.LIGOTimeGPS in segments_to_file() (apparently the cast is not allowed). I suppose the proper solution would be for the cast to succeed (fixing glue?). But maybe PyCBC can be more robust about such problems. What do people think? Can we just switch to using lal.LIGOTimeGPS everywhere?
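
Until the underlying cast is fixed, a defensive sketch of the suggested normalization (assuming lal and the ligo.segments package) is to coerce every boundary to a single time type before the list is written out:

from lal import LIGOTimeGPS
from ligo.segments import segment, segmentlist

def to_lal_gps(seglist):
    # Rebuild the list so every boundary is a lal.LIGOTimeGPS, whatever
    # mixture of time types the upstream arithmetic produced.
    return segmentlist(segment(LIGOTimeGPS(float(s[0])), LIGOTimeGPS(float(s[1])))
                       for s in seglist)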

The --psd-model in pycbc_inspiral is broken

There is currently a type mismatch in pycbc_inspiral if using the --psd-model option. Specifically the strain and templates will be single precision, but the PSD will be double precision. This will then cause a TypeError when calculating the sigma for a template.

I have made a simple fix locally for this (forcing the psd module to return float32), but I'm not sure of the right place to fix this in master. Should the PSD module take a "precision" option and convert to that precision if needed, as the strain module does? That seems the best solution to me. Let me know if that's the right thing to do and I can prepare a patch.

Add PSD option explanation for hwinj instructions

Ben has brought to my attention that I didn't document all the PSD options on the hardware injection instructions page for pycbc_generate_hwinj, e.g. --psd-estimation, etc. At the moment they're just listed under pycbc_generate_hwinj --help.

I assign myself this issue to add them to the docs page.

Executable to determine coincident livetime

One of the things we'll have to do fairly regularly in O1 (and beyond) is to wait for enough coincident livetime to be accumulated before cutting the next analysis block. Can we quickly put together a simple script to calculate this given a veto definer and the segment names, etc?

Basic features

  • take gps start/end time
  • account for applying vetoes
  • account for minimum science segment length

Bonus features

  • Report the maximum FAR we can estimate from the available time (include number of background bins?)
  • handle more than two detectors (needed for beyond O1)

Please respond if you can put this together, or if there is already a good way to get this information.
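
A rough sketch of the core calculation (using the ligo.segments package; the science and veto segment lists are assumed to be loaded already, and the function below is hypothetical):

from ligo.segments import segmentlist

def coincident_livetime(science, vetoes, min_seg_length=0):
    """science and vetoes both map an ifo name to a segmentlist."""
    analysable = {}
    for ifo, segs in science.items():
        segs = (segs - vetoes.get(ifo, segmentlist())).coalesce()
        analysable[ifo] = segmentlist(s for s in segs if abs(s) >= min_seg_length)
    # Intersect across however many detectors are given (two or more).
    ifos = list(analysable)
    coinc = analysable[ifos[0]]
    for ifo in ifos[1:]:
        coinc = coinc & analysable[ifo]
    return abs(coinc)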

add useful mkl error message

Currently, if you try to run with MKL support and it isn't available, most executables don't give a useful error message, instead failing with some cryptic error. I've seen this come up multiple times now, so this is my reminder to fix it.

--processing-scheme mkl is not working

Using pycbc_inspiral --processing-scheme mkl does not appear to work on master.

On sugar-dev3:

source ~/lalsuite/heads/lalsuite-v6.29_20150520_dfc6a/etc/lscsoftrc
source ~/pycbc/master_20150626/etc/pycbc-user-env.sh
cd /home/cbiwer/projects/pycbc_test_arg_fix/gw/work/main_ID0000001
pycbc_inspiral  --segment-end-pad 16 --cluster-method window --low-frequency-cutoff 40 --pad-data 8 --cluster-window 1.0 --sample-rate 4096 --injection-window 1.0 --segment-start-pad 64 --psd-segment-stride 8 --psd-inverse-length 16 --filter-inj-only  --psd-segment-length 16 --processing-scheme mkl --snr-threshold 5.0 --segment-length 512 --approximant SPAtmplt --newsnr-threshold 5.0 --psd-estimation median --keep-loudest-num 100 --strain-high-pass 30 --keep-loudest-interval 2 --order 7 --chisq-bins 128 --channel-name L1:LDAS-STRAIN --gps-start-time 966388166 --gps-end-time 966390406 --trig-start-time 966388230 --trig-end-time 966389982 --output  96638/L1-INSPIRAL_FULL_DATA-966388230-1752.hdf  --frame-files  96638/L-T1200307_V4_EARLY_RECOLORED_V2-966385664-4096.gwf   96638/L-T1200307_V4_EARLY_RECOLORED_V2-966389760-4096.gwf  --bank-file  BNS_NonSpin_30Hz_earlyaLIGO.xml --verbose

Gives:

Traceback (most recent call last):
  File "/home/cbiwer/pycbc/master_20150626/bin/pycbc_inspiral", line 4, in <module>
    __import__('pkg_resources').run_script('PyCBC===e95439', 'pycbc_inspiral')
  File "/home/cbiwer/.local/lib/python2.6/site-packages/pkg_resources/__init__.py", line 735, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/cbiwer/.local/lib/python2.6/site-packages/pkg_resources/__init__.py", line 1652, in run_script
    exec(code, namespace, namespace)
  File "/home/cbiwer/pycbc/master_20150626/lib/python2.6/site-packages/PyCBC-e95439-py2.6.egg/EGG-INFO/scripts/pycbc_inspiral", line 190, in <module>
    segments = strain_segments.fourier_segments()
  File "/home/cbiwer/pycbc/master_20150626/lib/python2.6/site-packages/PyCBC-e95439-py2.6.egg/pycbc/strain.py", line 596, in fourier_segments
    freq_seg = make_frequency_series(self.strain[seg_slice])
  File "/home/cbiwer/pycbc/master_20150626/lib/python2.6/site-packages/PyCBC-e95439-py2.6.egg/pycbc/filter/matchedfilter.py", line 274, in make_frequency_series
    fft(vec, vectilde)   
  File "/home/cbiwer/pycbc/master_20150626/lib/python2.6/site-packages/PyCBC-e95439-py2.6.egg/pycbc/fft/__init__.py", line 182, in fft
    thebackend = backends_dict[backends_list[0]]
IndexError: list index out of range

If I remove the --processing-scheme it keeps running.

Workflow/pegasus has no way to supply multi-ifo input arguments

At the moment, as can be seen on

https://github.com/ligo-cbc/pycbc/blob/master/pycbc/workflow/psd.py#L52

there is no neat way to supply input for the multiple-detector input argument formatting. I looked into this a while ago for pycbc_multi_inspiral but found an underlying issue with Pegasus that prevented this. Just posting this here to remind me (or anyone else), and I'll try to update with an example of how I think this should be fixed (but which may not actually work).
