tardis-sn / carsus

Atomic Database for Astronomy

Home Page: https://tardis-sn.github.io/carsus

Python 95.07% Jupyter Notebook 4.93%
astronomy atomic-database python

carsus's People

Contributors

afloers, andrewfullard, atharva-2001, atul107, ayushidaksh, chvogl, epassaro, gitter-badger, jselsing, jvshields, laureanomartinez, lukeshingles, mishinma, s-rathi, shreyas3156, ssim, wkerzendorf, yeganer, youssef15015


carsus's Issues

Output format for photoionization data

Is your feature request related to a problem? Please describe.
TARDIS needs photoionization data to perform NLTE calculations. This data will come mainly from the CMFGEN atomic database. This issue describes the desired output format for this data.

Describe the solution you'd like
The goal is to export a table with frequency (in units of Hz) and cross-section values (in units of cm^2) for each level. The frequencies start at the ionization threshold and should follow a logarithmic grid. The image below shows an example of the desired output DataFrame:
[Screenshot (2021-06-21): example of the desired output DataFrame]
The DataFrame has a MultiIndex with the atomic_number, ion_number and level_number of the level from which photoionization occurs. The destination level is not specified, since we assume all photoionizations go to the ground state of the next higher ionization state.
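A minimal sketch of how such a table could be built (the column names `nu` and `x_sect`, the threshold frequency, and the nu^-3 fall-off are all illustrative assumptions, not Carsus's actual output):

```python
import numpy as np
import pandas as pd

# Hypothetical threshold frequency [Hz] and threshold cross section [cm^2]
# for a single level (atomic_number=14, ion_number=1, level_number=0).
nu_0, sigma_0 = 3.28e15, 6.3e-18

# Logarithmic frequency grid starting at the ionization threshold.
nu = np.logspace(np.log10(nu_0), np.log10(nu_0) + 3, num=50)

# Toy nu^-3 fall-off, just to fill the column with plausible numbers.
sigma = sigma_0 * (nu / nu_0) ** -3

index = pd.MultiIndex.from_product(
    [[14], [1], [0], range(len(nu))],
    names=["atomic_number", "ion_number", "level_number", "point"],
)
cross_sections = pd.DataFrame({"nu": nu, "x_sect": sigma}, index=index)
```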

Describe alternatives you've considered
None. The format is perfect as is ;-)

Cannot reproduce results of 'Creating a TARDIS atomic data file'

My code

from carsus.io.nist import NISTWeightsComp, NISTIonizationEnergies
from carsus.io.kurucz import GFALLReader
from carsus.io.zeta import KnoxLongZeta
from carsus.io.output import TARDISAtomData

output_file = "kurucz_As-U_ALL.h5"

atomic_weights = NISTWeightsComp('As-U') # weights from NIST
ionization_energies = NISTIonizationEnergies('As-U') # ionization energies from NIST
atomic_weights.to_hdf(output_file)
ionization_energies.to_hdf(output_file)

zeta = KnoxLongZeta('data_for_carsus/knox_long_recombination_zeta.dat') # zeta from Knox Long
zeta.to_hdf(output_file)

gfall_reader = GFALLReader('data_for_carsus/gfall.dat') # Kurucz gfall data

atom_data = TARDISAtomData(gfall_reader, ionization_energies, ions='As-U')
atom_data.to_hdf(output_file)

Problem description

Hello,

I am following the tutorial given here to create a TARDIS atomic data file. Everything works until the line atom_data = ..., at which point I get the following:

Downloading data from the NIST Atomic Weights and Isotopic Compositions database.
Downloading ionization energies from the NIST Atomic Spectra Database
[carsus.io.kurucz.gfall][WARNING]  A specific combination to identify unique levels from the gfall data has not been given. Defaulting to ["energy", "j"]. (gfall.py:68)
Traceback (most recent call last):
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/series.py", line 987, in setitem
    self._set_with_engine(key, value)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/series.py", line 1046, in _set_with_engine
    self.index._engine.set_value(values, key, value)
  File "pandas/_libs/index.pyx", line 95, in pandas._libs.index.IndexEngine.set_value
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.set_value
ValueError: setting an array element with a sequence.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvieira/Documents/rprocess_2020/make_TARDISAtomData.py", line 112, in <module>
    ions='As-U')
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/carsus-0.1.dev623-py3.6.egg/carsus/io/output/base.py", line 50, in __init__
    self.ground_levels = ionization_energies.get_ground_levels()
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/carsus-0.1.dev623-py3.6.egg/carsus/io/nist/ionization.py", line 333, in get_ground_levels
    levels = self.parser.prepare_ground_levels()
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/carsus-0.1.dev623-py3.6.egg/carsus/io/nist/ionization.py", line 174, in prepare_ground_levels
    "L", "parity", "J"]] = ground_levels.apply(parse_ground_level, axis=1)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/frame.py", line 6487, in apply
    return op.get_result()
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/apply.py", line 151, in get_result
    return self.apply_standard()
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/apply.py", line 257, in apply_standard
    self.apply_series_generator()
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/apply.py", line 286, in apply_series_generator
    results[i] = self.f(v)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/carsus-0.1.dev623-py3.6.egg/carsus/io/nist/ionization.py", line 153, in parse_ground_level
    lvl["J"] = lvl_tokens["J"]
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/series.py", line 1039, in __setitem__
    setitem(key, value)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/series.py", line 1015, in setitem
    self.loc[key] = value
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/indexing.py", line 190, in __setitem__
    self._setitem_with_indexer(indexer, value)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/indexing.py", line 656, in _setitem_with_indexer
    value=value)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 510, in setitem
    return self.apply('setitem', **kwargs)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 395, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 827, in setitem
    values, value = self._try_coerce_args(values, value)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 712, in _try_coerce_args
    if np.any(notna(other)) and not self._can_hold_element(other):
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/dtypes/missing.py", line 333, in notna
    res = isna(obj)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/dtypes/missing.py", line 99, in isna
    return _isna(obj)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/dtypes/missing.py", line 114, in _isna_new
    return _isna_ndarraylike(obj)
  File "/home/nvieira/anaconda3/envs/tardis/lib/python3.6/site-packages/pandas/core/dtypes/missing.py", line 193, in _isna_ndarraylike
    dtype = values.dtype
AttributeError: ("'function' object has no attribute 'dtype'", 'occurred at index 0')

I am working within an environment specifically for tardis, as suggested in the tardis installation guide. My install of tardis was cloned directly from GitHub and has successfully produced the spectra from the quickstart guide. I have verified that the version of pandas in this environment is 0.24.2, as requested in the tardis environment definition file. The version of carsus I am using was also cloned directly from GitHub.

I have tried taking the other route of using the carsus quickstart guide, but have had no success there either. I'm at a bit of a loss.

Any help would be much appreciated! (Additionally, if anyone happens to have a usable .h5 file with all the elements from As-U in the meantime, that would also be very helpful.)

Handling of multiple values for one Quantity

Currently we can associate multiple Quantities with one object, for example a line. In practice this might be useful when there is both a theoretical and a measured value. However, the current model makes interacting with the database very difficult.

I propose to change all relationships with quantities to one-to-one relationships as part of the upcoming Quantity rework. One could then add both a measured and a theoretical value to the same quantity. The logic deciding which value is used by default would then live in one place and would not have to be duplicated several times.
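As a sketch of the proposed one-to-one model (all class and attribute names here are hypothetical, and plain dataclasses stand in for the real database mapping), the default-value logic can then live in a single property:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch: each Line owns exactly one WavelengthQuantity,
# which carries both values, and the "which value wins by default"
# rule lives in a single place.
@dataclass
class WavelengthQuantity:
    measured: Optional[float] = None
    theoretical: Optional[float] = None

    @property
    def value(self) -> Optional[float]:
        # Prefer the measured value; fall back to the theoretical one.
        return self.measured if self.measured is not None else self.theoretical

@dataclass
class Line:
    wavelength: WavelengthQuantity
```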

Duplicate levels appearing in atomic dataset generated with carsus

Note that this is copied from:

tardis-sn/carsus-db#16

...it is likely relevant to both repositories so I posted it on both for now!


I have generated an atomic dataset following the procedure outlined at:

https://github.com/tardis-sn/carsus/blob/master/docs/notebooks/quickstart.ipynb

It worked cleanly and smoothly. However, there seem to be some suspicious levels and some duplicate levels appearing in the database, the origins of which I don't know - in some places this might be a matching issue?

Specifically, here's the listing I get of carbon 1 energy levels from a TARDIS run with the dataset made following the recipe above:

In [31]: simulation.plasma.atomic_data.levels.loc[(6,0)]
Out[31]:
energy g metastable
level_number
0 0.000000e+00 1.0 True
1 3.257771e-15 3.0 True
2 8.621174e-15 5.0 True
3 2.024711e-12 5.0 True
4 2.024711e-12 5.0 True
5 4.300260e-12 1.0 True
6 4.300260e-12 1.0 True
7 4.300260e-12 1.0 True
8 6.701314e-12 5.0 True
9 1.198491e-11 1.0 False
10 1.198872e-11 3.0 False
11 1.199677e-11 5.0 False
12 1.231235e-11 3.0 False
13 1.273052e-11 7.0 False
14 1.273110e-11 3.0 False
15 1.273132e-11 5.0 False
16 1.367794e-11 3.0 False
17 1.384344e-11 3.0 False
18 1.384764e-11 5.0 False
19 1.385427e-11 7.0 False
20 1.405290e-11 3.0 False
21 1.417379e-11 1.0 False
22 1.417625e-11 3.0 False
23 1.418032e-11 5.0 False
24 1.442373e-11 5.0 False
25 1.469491e-11 1.0 False
26 1.494879e-11 3.0 False
27 1.494905e-11 5.0 False
28 1.494922e-11 1.0 False
29 1.543067e-11 5.0 False
... ... ... ...
803 1.803847e-11 3.0 True
804 1.803848e-11 1.0 True
805 1.803849e-11 3.0 True
806 1.803918e-11 5.0 True
807 1.803920e-11 5.0 True
808 1.803920e-11 7.0 True
809 1.803921e-11 3.0 True
810 1.803922e-11 7.0 True
811 1.803923e-11 5.0 True
812 1.803924e-11 3.0 True
813 1.803925e-11 1.0 True
814 1.803925e-11 3.0 True
815 1.803989e-11 5.0 True
816 1.803991e-11 5.0 True
817 1.803991e-11 7.0 True
818 1.803992e-11 3.0 True
819 1.803993e-11 7.0 True
820 1.803994e-11 5.0 True
821 1.803995e-11 3.0 True
822 1.803996e-11 1.0 True
823 1.803996e-11 3.0 True
824 1.804056e-11 5.0 True
825 1.804057e-11 5.0 True
826 1.804057e-11 7.0 True
827 1.804058e-11 3.0 True
828 1.804059e-11 7.0 True
829 1.804060e-11 5.0 True
830 1.804060e-11 3.0 True
831 1.804061e-11 1.0 True
832 1.804061e-11 3.0 True

[833 rows x 3 columns]

There are duplicate levels, evident from the start: e.g. levels 3 and 4 I think are both the same, as are 5, 6 and 7. (I believe that 3 and 4 are both supposed to be 2s^2 2p^2 ^1D_2, while 5, 6 and 7 are all supposed to be 2s^2 2p^2 ^1S_0.)

I don't know whether this issue existed pre-CARSUS in the atomic data or is new.

There also seem to be a few oddities in the CII energy level list, but I've not quite understood that yet.
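For what it's worth, suspected duplicates like these can be flagged quickly with pandas (a sketch; the toy table below just mimics the first rows of the listing, and the rounding precision is an arbitrary choice):

```python
import pandas as pd

# Toy levels table mimicking the first rows of the listing (abridged).
levels = pd.DataFrame(
    {
        "energy": [0.0, 3.257771e-15, 8.621174e-15, 2.024711e-12,
                   2.024711e-12, 4.300260e-12, 4.300260e-12, 4.300260e-12],
        "g": [1.0, 3.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0],
    }
)

# Round energies so near-identical values compare equal, then flag
# every member of a repeated (energy, g) group.
key = levels.assign(energy_r=levels["energy"].round(18))
duplicates = key[key.duplicated(subset=["energy_r", "g"], keep=False)]
```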

Explicit logging when ingesting levels and lines

The SQL ingesters had nice logging; it would be nice to have the same feature in the new code:

  • Inform user about what ions are being ingested.
  • Inform user about what ions are not available on that source.
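A minimal sketch of what that could look like (the function and logger names are assumptions, not existing Carsus code):

```python
import logging

logger = logging.getLogger("carsus.io")  # logger name is an assumption

def ingest_levels(requested_ions, available_ions):
    # Tell the user which ions are being ingested and which are
    # missing from this source, then return the ones actually ingested.
    requested, available = set(requested_ions), set(available_ions)
    for ion in sorted(requested & available):
        logger.info("Ingesting levels for %s.", ion)
    for ion in sorted(requested - available):
        logger.warning("Levels for %s not found in this source.", ion)
    return sorted(requested & available)
```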

Add CMFGEN photoionization cross-sections

Description

After merging PR #191 we can start to discuss how to add cross sections to Carsus. The first step would be creating a new method for the CMFGENReader class in carsus/io/cmfgen/base.py. This class should set an attribute called cross_sections or similar. This attribute is a table, not necessarily the "final" table we want to store in the HDF5 file.

For example, for levels and lines the three classes GFALLReader, ChiantiReader and CMFGENReader return tables with identical columns. Then, in the TARDISAtomData class we perform several other calculations until we get the desired results to store.

It's important to say that the CMFGENReader does not have a nice interface like the Chianti or GFALL counterparts. The current way to select levels and lines is: a) parsing the files, and b) creating a dict of dicts:

si0_lvl = CMFGENEnergyLevelsParser('/home/epassaro/Desktop/tardis-sn/atomic_data/CMFGEN/atomic/SIL/I/23nov11/SiI_OSC')
si0_osc = CMFGENOscillatorStrengthsParser('/home/epassaro/Desktop/tardis-sn/atomic_data/CMFGEN/atomic/SIL/I/23nov11/SiI_OSC')

si1_lvl = CMFGENEnergyLevelsParser('/home/epassaro/Desktop/tardis-sn/atomic_data/CMFGEN/atomic/SIL/II/16sep15/si2_osc_kurucz')
si1_osc = CMFGENOscillatorStrengthsParser('/home/epassaro/Desktop/tardis-sn/atomic_data/CMFGEN/atomic/SIL/II/16sep15/si2_osc_kurucz')

cmfgen_data = {'Si 0': {'levels': si0_lvl, 'lines': si0_osc},
               'Si 1': {'levels': si1_lvl, 'lines': si1_osc}}

cmfgen_reader = CMFGENReader(cmfgen_data)

Then the cmfgen_reader object is passed to the TARDISAtomData class:

atom_data = TARDISAtomData(nist_weights, nist_energies, ... , cmfgen_reader)

A quick note on pho files

According to the GSoC 2019 work, we found three types of cross section files:

si1_cross_sections = CMFGENPhotoionizationCrossSectionParser(path_to_file)

if df.shape[1] == 2:
    df.columns = ['Energy', 'Sigma']

elif df.shape[1] == 1:
    df.columns = ['Fit coefficients']

elif df.shape[1] == 8:  # Verner ground state fits. TODO: add units
    df.columns = ['n', 'l', 'E', 'E_0',
                  'sigma_0', 'y(a)', 'P', 'y(w)']

So the new method for the CMFGENReader class should set a consistent cross_sections attribute (a DataFrame) for these three kinds of files.

Also, the base attribute of this parser is a bit confusing: it returns a list of "small" DataFrames.
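As a starting point, the three layouts could be normalized onto named columns before building a single cross_sections table (a sketch; only the column names and shape checks come from the snippet above, the function itself is an assumption):

```python
import pandas as pd

def normalize_pho_table(df):
    # Map each of the three known pho-file layouts onto named columns,
    # mirroring the shape checks shown above.
    if df.shape[1] == 2:
        df.columns = ["Energy", "Sigma"]
    elif df.shape[1] == 1:
        df.columns = ["Fit coefficients"]
    elif df.shape[1] == 8:  # Verner ground state fits
        df.columns = ["n", "l", "E", "E_0", "sigma_0", "y(a)", "P", "y(w)"]
    else:
        raise ValueError(f"Unexpected number of columns: {df.shape[1]}")
    return df

# A consistent cross_sections attribute could then be, for example,
# the concatenation of the normalized per-level tables.
```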

Match with levels

lines and collisions are connected to levels through the get_lvl_index2id method on the TARDISAtomData class.

How data sources "priorities" work

To be discussed later...

Suggested roadmap

  1. Start exploring the output of the PHO parser.
  2. Take a look at the levels and lines attributes of the GFALL, Chianti and CMFGEN readers and see how consistent they are.
  3. Write a method for CMFGENReader to produce a table for cross sections.

Probably before this we need to dig more into how the TARDISAtomData class works.

  4. Finally, write methods on TARDISAtomData called create_cross_sections and cross_sections_prepared.

Update `ChiantiPy` version

The last version of ChiantiPy that works with carsus is 0.8.4.

Since ChiantiPy has only a few dependencies, maybe it will work for years... or not.

The following dependencies are required to run ChiantiPy,

Python3 (Python 3 is now required as of version 0.8.0)
Numpy
Scipy
Matplotlib
ipyparallel

Review `to_hdf` method in CMFGEN parsers

Description

The to_hdf method in the CMFGEN parsers does not accept a fname parameter. It just "converts" the plain text files to HDF5, saving them in the same folder. I think this is good default behavior, but it would be nice to let users save the file to an arbitrary location.

Code sample

def to_hdf(self, key='/energy_levels', fname=None):

    if fname is None:
        fname = self.fname
...

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) too many SQL variables

When trying to export a database to TARDIS, the SQL query seems to get too long. This happens when querying for all levels whose level id is in the list of selected levels.
Solutions would be to batch these queries, or to query for everything and then perform the selection in Python instead of at the database level.
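The batching option could look roughly like this (a sketch; run_query stands in for whatever SQLAlchemy query builds the IN (...) clause, and 900 is chosen to stay under the default limit of 999 variables in older SQLite versions):

```python
def chunks(seq, size):
    # Yield successive batches so that no single IN (...) clause
    # exceeds SQLite's variable limit.
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def select_levels(level_ids, run_query, batch_size=900):
    # run_query stands in for the real query, e.g. something like
    # session.query(Level).filter(Level.level_id.in_(batch)).all();
    # the per-batch results are simply concatenated.
    results = []
    for batch in chunks(level_ids, batch_size):
        results.extend(run_query(batch))
    return results
```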

@mishinma How did you create the database inside carsus-db? How did you circumvent this problem or did it not appear back then?

Document Units in Carsus

From the upcoming docs:

In recent years, much effort has been made to make Carsus easier to use and develop. A large piece of code was rewritten to avoid using an intermediate SQL database between the readers and the final atomic data.

This resulted in a simpler, faster and more maintainable Carsus codebase. But we lost some features in the process, the most important being the tracking of physical quantities. So far we have not found a sensible way to manage units.

We need to design and implement a new unit tracking system as soon as possible. In the meantime, it should be enough to document the units used by the GFALL/CHIANTI/CMFGEN readers and the output module.

We agreed to document units until we design a new unit tracking system.
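Until such a system exists, the documented units could at least live in one importable place (a sketch; the unit assignments below are illustrative placeholders, not Carsus's actual conventions):

```python
# Single importable source of truth for the units each reader emits.
# NOTE: the assignments below are illustrative placeholders, not
# Carsus's actual conventions.
READER_UNITS = {
    "gfall": {"energy": "cm-1", "wavelength": "nm"},
    "chianti": {"energy": "cm-1", "wavelength": "angstrom"},
    "cmfgen": {"energy": "cm-1", "wavelength": "angstrom"},
}

def unit_of(reader, column):
    # Raise a KeyError instead of silently assuming a unit.
    return READER_UNITS[reader][column]
```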

Re-implement the signature and hash system

Currently carsus does not sign the atomic data in a coherent way. For example, building two identical files leads to different MD5 sums.

Reproducibility of the atomic files should be one of the key features of Carsus, so I propose to re-think the hash/sign method.

Some ideas

  • Using the to_msgpack method seems to be the most efficient way to hash Pandas objects (we tried other methods).
  • Since to_msgpack is deprecated as of Pandas 0.25, we should now use pyarrow (see example below).
  • We should store a DataFrame with key and md5sum columns, one row per key available in the HDF5 file.
  • We should also store the dataset versions (all of them are available somehow).
  • We should also store the versions of important packages used, like Pandas.

Examples

I. pyarrow serialization

Tested; works as expected. Consecutive builds of the same atomic file return the same MD5.

import hashlib
import uuid

import pandas as pd
import pyarrow as pa

    def to_hdf(self, fname):
        """Dump the `base` attribute into an HDF5 file
        Parameters
        ----------
        fname : path
           Path to the HDF5 output file
        """

        self.atomic_weights.to_hdf(fname)
        self.ionization_energies.to_hdf(fname)
        self.zeta_data.to_hdf(fname)

        with pd.HDFStore(fname, 'a') as f:
            f.put('/levels', self.levels_prepared)
            f.put('/lines', self.lines_prepared)
            f.put('/macro_atom_data', self.macro_atom_prepared)
            f.put('/macro_atom_references',
                  self.macro_atom_references_prepared)

            md5_hash = hashlib.md5()
            for key in f.keys():
                context = pa.default_serialization_context()
                serialized_df = context.serialize(f[key])
                md5_hash.update(serialized_df.to_buffer())

            uuid1 = uuid.uuid1().hex

            print("Signing AtomData: \nMD5: {}\nUUID1: {}".format(
                md5_hash.hexdigest(), uuid1))

            f.root._v_attrs['md5'] = md5_hash.hexdigest().encode('ascii')
            f.root._v_attrs['uuid1'] = uuid1.encode('ascii')

Possible bug in `ChiantiIngester`

@livnehra:

I managed to add Si III and O I by creating a new hdf5 from the existing database 'kurucz_cd23_chianti_all' using the scripts in carsus-db. For some reason, using chianti_ions = 'H-He; Si II; Ca II; Mg II; S II; Si III; O I' didn't work and gave only H-He, Si II. But when I tried chianti_ions = 'H-He ; Si ; Ca ; Mg ; S ; O ' it created a file with all the collision data.

Status: unconfirmed

Negative upper levels for CMFGEN oscillator strengths

Description

The regular expression used by CMFGENOscillatorStrengthsParser is not working perfectly. Since lower and upper levels are tabulated as i-j, sometimes the upper level is saved as -j in the DataFrame.

I was wrong, the energy levels file contains negative IDs (for example -86):

https://github.com/tardis-sn/carsus-refdata/blob/master/cmfgen/energy_levels/si2_osc_kurucz

Solution 1: update the regular expression to skip the dash.
Solution 2: just take .abs() for j-column.

Mark these levels as theoretical.
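Both solutions are small; a sketch, assuming the level pairs are parsed from tokens like 12-86 (the function names are hypothetical):

```python
import re

def parse_level_pair(token):
    # Solution 1: anchor the regex on the separating dash so a leading
    # minus sign on the upper level cannot end up in the captures.
    match = re.fullmatch(r"(\d+)-(\d+)", token)
    if match is None:
        raise ValueError(f"Cannot parse level pair: {token!r}")
    return int(match.group(1)), int(match.group(2))

def fix_upper_level(j):
    # Solution 2: repair an already-parsed value by taking the
    # absolute value (df['j'].abs() does this column-wise).
    return abs(j)
```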

fiasco as a Chianti reader?

Additional context
Fiasco is an interface to the Chianti database. Maybe it would be easier for Carsus to depend on fiasco than on ChiantiPy.

Maybe @wtbarnes and @wkerzendorf can have a chat 😉 to see which features are needed and whether fiasco makes it easier to maintain.

Extending the atomic database

Hi all

I'm keen on trying to run TARDIS with elements heavier than the ones included in the default distribution (up to Z = 30).

I have installed TARDIS with "pip install git+https://github.com/tardis-sn/tardis" and have set up the tardis conda environment, as suggested, using the dependency file from "https://raw.githubusercontent.com/tardis-sn/tardis/master/tardis_env27.yml". However, scipy and pyne had dependency clashes, and only after updating all the required packages with conda update --all could TARDIS successfully do the spectral synthesis.

Similarly, I have installed carsus with "pip install git+https://github.com/tardis-sn/carsus" and all required packages. I have installed ChiantiPy and downloaded the Chianti database and carsus-db.

Running the notebook, https://github.com/tardis-sn/carsus/blob/master/docs/notebooks/kurucz_chianti_h_he.ipynb, throws this error when running the Ionization Energies box:

Downloading ionization energies from the NIST Atomic Spectra Database
/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/nist/ionization.py:89: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support skipfooter; you can avoid this warning by specifying engine='python'.
  usecols=range(5), names=column_names, skiprows=3, skipfooter=1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-b697c495f097> in <module>()
      5 ioniz_energies_ingester.ingest(
      6             ionization_energies=True,
----> 7             ground_levels=True
      8             )
      9 session.commit()

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/nist/ionization.pyc in ingest(self, ionization_energies, ground_levels)
    287         # Download data if needed
    288         if self.parser.base is None:
--> 289             self.download()
    290 
    291         if ionization_energies:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/nist/ionization.pyc in download(self)
    221     def download(self):
    222         data = self.downloader(spectra=self.spectra)
--> 223         self.parser(data)
    224 
    225     def ingest_ionization_energies(self, ioniz_energies=None):

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/base.pyc in __call__(self, input_data)
     42 
     43     def __call__(self, input_data):
---> 44         self.load(input_data)
     45 
     46 

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/nist/ionization.pyc in load(self, input_data)
     89                          usecols=range(5), names=column_names, skiprows=3, skipfooter=1)
     90         for column in ['ground_shells', 'ground_level', 'ionization_energy_str']:
---> 91                 base[column] = base[column].map(lambda x: x.strip())
     92         self.base = base
     93         # column_names = ['atomic_number', 'ion_charge', 'ground_shells', 'ground_level', 'ionization_energy_str', 'ionization_energy_uncertainty_str']

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/pandas/core/series.pyc in map(self, arg, na_action)
   2352         else:
   2353             # arg is a function
-> 2354             new_values = map_f(values, arg)
   2355 
   2356         return self._constructor(new_values,

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/nist/ionization.pyc in <lambda>(x)
     89                          usecols=range(5), names=column_names, skiprows=3, skipfooter=1)
     90         for column in ['ground_shells', 'ground_level', 'ionization_energy_str']:
---> 91                 base[column] = base[column].map(lambda x: x.strip())
     92         self.base = base
     93         # column_names = ['atomic_number', 'ion_charge', 'ground_shells', 'ground_level', 'ionization_energy_str', 'ionization_energy_uncertainty_str']

AttributeError: 'NoneType' object has no attribute 'strip'

I think this is due to the variable number of footer lines that are returned from the NIST query.

If I change the pandas.read_csv call in:

https://github.com/tardis-sn/carsus/blob/master/carsus/io/nist/ionization.py

from

base = pd.read_csv(StringIO(text_data), sep='|', header=None, usecols=range(5), names=column_names, skiprows=3, skipfooter=1)

to:

base = pd.read_csv(StringIO(text_data), sep='|', header=None, usecols=range(5), names=column_names, skiprows=3).dropna(how = "any")

this correctly chops away the rows containing NaN values, and the NIST part can be completed. However, when ingesting the Kurucz database (GFALLIngester), the line lists are not correctly connected and it throws a SQL-related error that my SQL-fu is not strong enough to clear up.

Ingesting levels from ku_latest
Ingesting levels for He 0
Ingesting levels for He 1
Ingesting levels for Li 0
Ingesting levels for Li 1
Ingesting levels for Be 0
Ingesting levels for Be 1
Ingesting levels for Be 2
Ingesting lines from ku_latest
Ingesting lines for He 0
Ingesting lines for He 1
---------------------------------------------------------------------------
InterfaceError                            Traceback (most recent call last)
<ipython-input-4-3d924feb90ac> in <module>()
      1 gfall_ingester = GFALLIngester(session, gfall_fname, ions='H-Be')
----> 2 gfall_ingester.ingest(levels=True, lines=True)
      3 session.commit()

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/kurucz/gfall.py in ingest(self, levels, lines)
    435 
    436         if lines:
--> 437             self.ingest_lines()
    438             self.session.flush()
    439 

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/kurucz/gfall.py in ingest_lines(self, lines)
    394             print("Ingesting lines for {} {}".format(convert_atomic_number2symbol(atomic_number), ion_charge))
    395             # print(ion)
--> 396             lvl_index2id = self.get_lvl_index2id(ion)
    397 
    398             for index, row in ion_lines.iterrows():

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/carsus/io/kurucz/gfall.py in get_lvl_index2id(self, ion)
    328 
    329         lvl_index2id = list()
--> 330         for id, index in q_ion_lvls:
    331             lvl_index2id.append((index, id))
    332 

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/query.pyc in __iter__(self)
   2875         context.statement.use_labels = True
   2876         if self._autoflush and not self._populate_existing:
-> 2877             self.session._autoflush()
   2878         return self._execute_and_instances(context)
   2879 

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/session.pyc in _autoflush(self)
   1442                     "consider using a session.no_autoflush block if this "
   1443                     "flush is occurring prematurely")
-> 1444                 util.raise_from_cause(e)
   1445 
   1446     def refresh(

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/util/compat.pyc in raise_from_cause(exception, exc_info)
    201     exc_type, exc_value, exc_tb = exc_info
    202     cause = exc_value if exc_value is not exception else None
--> 203     reraise(type(exception), exception, tb=exc_tb, cause=cause)
    204 
    205 if py3k:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/session.pyc in _autoflush(self)
   1432         if self.autoflush and not self._flushing:
   1433             try:
-> 1434                 self.flush()
   1435             except sa_exc.StatementError as e:
   1436                 # note we are reraising StatementError as opposed to

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/session.pyc in flush(self, objects)
   2241         try:
   2242             self._flushing = True
-> 2243             self._flush(objects)
   2244         finally:
   2245             self._flushing = False

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/session.pyc in _flush(self, objects)
   2367         except:
   2368             with util.safe_reraise():
-> 2369                 transaction.rollback(_capture_exception=True)
   2370 
   2371     def bulk_save_objects(

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.pyc in __exit__(self, type_, value, traceback)
     64             self._exc_info = None   # remove potential circular references
     65             if not self.warn_only:
---> 66                 compat.reraise(exc_type, exc_value, exc_tb)
     67         else:
     68             if not compat.py3k and self._exc_info and self._exc_info[1]:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/session.pyc in _flush(self, objects)
   2331             self._warn_on_events = True
   2332             try:
-> 2333                 flush_context.execute()
   2334             finally:
   2335                 self._warn_on_events = False

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.pyc in execute(self)
    389                     self.dependencies,
    390                     postsort_actions):
--> 391                 rec.execute(self)
    392 
    393     def finalize_flush_changes(self):

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.pyc in execute(self, uow)
    554                              uow.states_for_mapper_hierarchy(
    555                                  self.mapper, False, False),
--> 556                              uow
    557                              )
    558 

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/persistence.pyc in save_obj(base_mapper, states, uowtransaction, single)
    179         _emit_insert_statements(base_mapper, uowtransaction,
    180                                 cached_connections,
--> 181                                 mapper, table, insert)
    182 
    183     _finalize_insert_update_commands(

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/orm/persistence.pyc in _emit_insert_statements(base_mapper, uowtransaction, cached_connections, mapper, table, insert, bookkeeping)
    864                 else:
    865                     result = cached_connections[connection].\
--> 866                         execute(statement, params)
    867 
    868                 primary_key = result.context.inserted_primary_key

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/engine/base.pyc in execute(self, object, *multiparams, **params)
    946             raise exc.ObjectNotExecutableError(object)
    947         else:
--> 948             return meth(self, multiparams, params)
    949 
    950     def _execute_function(self, func, multiparams, params):

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/sql/elements.pyc in _execute_on_connection(self, connection, multiparams, params)
    267     def _execute_on_connection(self, connection, multiparams, params):
    268         if self.supports_execution:
--> 269             return connection._execute_clauseelement(self, multiparams, params)
    270         else:
    271             raise exc.ObjectNotExecutableError(self)

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/engine/base.pyc in _execute_clauseelement(self, elem, multiparams, params)
   1058             compiled_sql,
   1059             distilled_params,
-> 1060             compiled_sql, distilled_params
   1061         )
   1062         if self._has_events or self.engine._has_events:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/engine/base.pyc in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1198                 parameters,
   1199                 cursor,
-> 1200                 context)
   1201 
   1202         if self._has_events or self.engine._has_events:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/engine/base.pyc in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
   1411                 util.raise_from_cause(
   1412                     sqlalchemy_exception,
-> 1413                     exc_info
   1414                 )
   1415             else:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/util/compat.pyc in raise_from_cause(exception, exc_info)
    201     exc_type, exc_value, exc_tb = exc_info
    202     cause = exc_value if exc_value is not exception else None
--> 203     reraise(type(exception), exception, tb=exc_tb, cause=cause)
    204 
    205 if py3k:

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/engine/base.pyc in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1191                         statement,
   1192                         parameters,
-> 1193                         context)
   1194         except BaseException as e:
   1195             self._handle_dbapi_exception(

/usr/local/anaconda3/envs/tardis/lib/python2.7/site-packages/sqlalchemy/engine/default.pyc in do_execute(self, cursor, statement, parameters, context)
    505 
    506     def do_execute(self, cursor, statement, parameters, context=None):
--> 507         cursor.execute(statement, parameters)
    508 
    509     def do_execute_no_params(self, cursor, statement, context=None):

InterfaceError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (sqlite3.InterfaceError) Error binding parameter 1 - probably unsupported type. [SQL: u'INSERT INTO transition (type, lower_level_id, upper_level_id, data_source_id) VALUES (?, ?, ?, ?)'] [parameters: ('line',           id
index       
0      12253
0      53998,           id
index       
764    13017
764    54762, 3)] (Background on this error at: http://sqlalche.me/e/rvf5)

It looks like it correctly gets the lines for the neutral species but fails on the singly ionised ones. Maybe it is not correctly separating the ions by charge? Note that the bound parameters in the failing INSERT are whole DataFrames rather than scalar level ids, which sqlite3 cannot bind.

Remove TARDIS from Carsus dependencies

It seems the only piece of TARDIS code required by Carsus is:

from tardis.util.colored_logger import ColoredFormatter, formatter_message

This bloats the Carsus environment with unneeded dependencies such as Cython and tqdm.
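That single import could be replaced by a few lines of stdlib code. Below is a minimal sketch of a colored formatter, not the actual TARDIS implementation; the class name and color codes are assumptions chosen to mirror common ANSI conventions:

```python
import logging

# ANSI color codes per level (assumed palette, not TARDIS's)
COLORS = {"DEBUG": 34, "INFO": 32, "WARNING": 33, "ERROR": 31, "CRITICAL": 35}

class ColoredFormatter(logging.Formatter):
    """Drop-in stand-in for tardis.util.colored_logger.ColoredFormatter."""

    def format(self, record):
        message = super().format(record)
        code = COLORS.get(record.levelname)
        # wrap the formatted message in ANSI escape sequences when we
        # recognize the level, otherwise return it unchanged
        return f"\033[{code}m{message}\033[0m" if code else message
```

Vendoring something like this would let Carsus drop TARDIS from its dependency list entirely.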

Chianti database

The purpose of this issue is to gather all open issues regarding the Chianti database:

  • Ingestion of Fe 8 takes over 18 GB of memory
  • Noting database version in the datasource

Pin packages to higher versions

To do:

  • Try to pin pytest=5
  • Try to pin pytest=4
  • Try to pin pandas=1.0
  • Try to pin astropy=3
  • Try to pin pyparsing=2.2
  • Try to pin SQLAlchemy=1.0
  • Unpin ChiantiPy
  • Remove tardis
  • Remove numexpr
  • Remove channel default

Type error when numpy.int64 is passed as a parameter

Got this error when trying to ingest data:

sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 0 - probably 
unsupported type. [SQL: u'SELECT atom.atomic_number AS atom_atomic_number, atom.symbol AS 
atom_symbol, atom.name AS atom_name, atom."group" AS atom_group, atom.period AS 
atom_period \nFROM atom \nWHERE atom.atomic_number = ?'] [parameters: (1,)]

The full traceback http://pastebin.com/4VYMYuGS
A very similar question at stackoverflow http://stackoverflow.com/questions/22734671/weird-type-error-arising-when-i-add-to-a-database-using-sql-alchemy
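The usual fix is either to cast to a built-in int at the call site or to register an adapter once with sqlite3 (e.g. sqlite3.register_adapter(numpy.int64, int)). The sketch below demonstrates both with a stand-in class so numpy is not required; the FakeInt64 class is purely illustrative:

```python
import sqlite3

class FakeInt64:
    """Stand-in for numpy.int64: an int-like object sqlite3 cannot bind."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atom (atomic_number INTEGER)")

# Binding the raw object fails (InterfaceError on older Pythons,
# ProgrammingError on newer ones)
try:
    conn.execute("INSERT INTO atom VALUES (?)", (FakeInt64(1),))
except (sqlite3.InterfaceError, sqlite3.ProgrammingError):
    pass

# Fix 1: cast to a built-in int before binding
conn.execute("INSERT INTO atom VALUES (?)", (int(FakeInt64(1)),))

# Fix 2: register an adapter once, then raw objects bind transparently
sqlite3.register_adapter(FakeInt64, int)
conn.execute("INSERT INTO atom VALUES (?)", (FakeInt64(1),))
```

With the adapter registered globally, every query in the ingestion pipeline accepts the numpy integer type without touching each call site.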

Write doc section about Einstein coeff

Description

This was done during GSoC; it's important to have it documented because those calculations are not trivial. Then put a link in the _create_einstein_coeff docstring.

EDIT: also include the description in the docstring.
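For reference, the core relation is A_ul = 8 pi^2 e^2 / (m_e c lambda^2) * gf / g_u, with gf = g_l f_lu. A plain-Python sketch (CGS constants hard-coded; the function name and exact conventions used in Carsus are assumptions):

```python
import math

# CGS constants (CODATA values)
E_ESU = 4.80320425e-10   # electron charge [esu]
M_E = 9.1093837015e-28   # electron mass [g]
C = 2.99792458e10        # speed of light [cm/s]

def einstein_a(gf, g_upper, wavelength_angstrom):
    """Einstein A coefficient [1/s] from gf, the upper-level statistical
    weight and the transition wavelength:
        A_ul = 8 pi^2 e^2 / (m_e c lambda^2) * gf / g_u
    """
    lam_cm = wavelength_angstrom * 1e-8  # Angstrom -> cm
    return 8 * math.pi**2 * E_ESU**2 / (M_E * C * lam_cm**2) * gf / g_upper
```

As a sanity check, gf = 1 and g_u = 2 at 5000 Å give roughly 1.3e8 s^-1, the expected order of magnitude for a strong optical line.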

Consistent units in GFALL, Chianti and CMFGEN readers

Description

This is not a bug and the results are correct, but the current behaviour could be confusing.

Wavelength units are not consistent between GFALLReader and ChiantiReader.

In carsus/io/output/base.py:

        lines.loc[gfall_mask, 'wavelength'] = lines.loc[
            gfall_mask, 'wavelength'].apply(lambda x: x*u.nm)
        lines.loc[chianti_mask, 'wavelength'] = lines.loc[
            chianti_mask, 'wavelength'].apply(lambda x: x*u.angstrom)
        lines['wavelength'] = lines['wavelength'].apply(
            lambda x: x.to('angstrom'))
        lines['wavelength'] = lines['wavelength'].apply(lambda x: x.value)

... we can see that GFALLReader stores wavelengths in nm while ChiantiReader uses Å; every line is then stored in Å in the TARDISAtomData object.

Since ChiantiReader (and later CMFGENReader) are tailored to mimic the GFALLReader class, they should store data in the same units for simplicity.
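Whatever unit each reader uses internally, a single explicit conversion point would replace the per-element astropy apply chain above. A minimal sketch; the unit names and conversion table are illustrative, not the Carsus API:

```python
# conversion factors to Angstrom; extend as new readers appear
UNIT_TO_ANGSTROM = {"angstrom": 1.0, "nm": 10.0, "um": 1e4, "cm": 1e8}

def to_angstrom(wavelength, unit):
    """Convert a wavelength value from `unit` to Angstrom."""
    return wavelength * UNIT_TO_ANGSTROM[unit]
```

Applied per data source, this is one scalar multiplication over the whole column instead of wrapping and unwrapping an astropy Quantity for every row.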

Make a nicer README.md

The Carsus README.md is almost nonexistent.

  • Add azure-pipelines badge
  • Description and links to TARDIS package
  • Installation section (link from the documentation, as @jaladh-singhal did for starkit/wsynphot)
  • License.
  • more...

[Output] Aggregate data per Ion

I propose to iterate over all ions and gather the data individually. That makes it possible to select the data source on a per-ion basis.

More ideas to come...
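The per-ion iteration could look like the sketch below; the record layout and the source-priority rule are assumptions, not an existing Carsus interface:

```python
from collections import defaultdict

def group_by_ion(records):
    """Group (atomic_number, ion_number, source, payload) records per ion,
    so a data source can later be chosen for each ion individually."""
    per_ion = defaultdict(list)
    for atomic_number, ion_number, source, payload in records:
        per_ion[(atomic_number, ion_number)].append((source, payload))
    return dict(per_ion)

def pick_source(per_ion, priority):
    """For each ion, keep only the entry from the highest-priority
    source present (priority is an ordered list of source names)."""
    chosen = {}
    for ion, entries in per_ion.items():
        chosen[ion] = min(entries, key=lambda e: priority.index(e[0]))
    return chosen
```

For example, grouping GFALL and Chianti records and passing priority=["chianti", "gfall"] keeps the Chianti data for ions covered by both sources and falls back to GFALL elsewhere.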
