ish_parser's People

Contributors

aablakely, coeusite, haydenth, jwm, k-nut, mrmucox, travc, vtoupet

ish_parser's Issues

Please add a license

Providing a license will make it much easier for other projects to include this nice little library.

Please upload v0.0.5 to PyPI

Hi Tom, thanks for writing ish_parser. It was exactly what I was looking for.

Would you please upload v0.0.5 to PyPI? It looks like the most recent version there is 0.0.4.

get_inches() of Distance is incorrect

This conversion fails for Distances in METERS: it always returns None or 'Missing'. Millimeters are also converted incorrectly, giving ten times the correct number of inches. I propose the fixes below:

# Inches per centimeter
INCH_CONVERSION_FACTOR = 1/2.54

def get_inches(self):
    ''' convert the measurement to inches '''
    if self._obs_value in self.MISSING:
        return 'MISSING'
    if self._obs_units == self.MILLIMETERS:
        # millimeters -> centimeters -> inches
        return round(self.INCH_CONVERSION_FACTOR * self._obs_value / 10, 4)
    if self._obs_units == self.METERS:
        # meters -> centimeters -> inches
        return round(self.INCH_CONVERSION_FACTOR * self._obs_value * 100, 4)
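The conversion factors in the proposed fix can be sanity-checked outside the library. The sketch below is a standalone function, not the library's Distance class; the name to_inches and the unit strings "mm" and "m" are made up for illustration.

```python
# Standalone check of the arithmetic in the proposed fix.
# to_inches and the unit strings are hypothetical, not part of ish_parser.
INCHES_PER_CM = 1 / 2.54


def to_inches(value, units):
    """Convert a length in millimeters ("mm") or meters ("m") to inches."""
    if units == "mm":
        # millimeters -> centimeters -> inches
        return round(INCHES_PER_CM * value / 10, 4)
    if units == "m":
        # meters -> centimeters -> inches
        return round(INCHES_PER_CM * value * 100, 4)
    raise ValueError("unsupported units: %r" % (units,))


print(to_inches(25.4, "mm"))  # 25.4 mm is exactly one inch
print(to_inches(1, "m"))      # one meter is about 39.37 inches
```

Raising on unknown units is a choice made for this sketch only; the snippet above returns None in that case.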

Performance improvement

I am using your library with Pandas. Performance is not that good (it takes 1-2 seconds to process a full year).
The reasons for this are:

  • operations are performed sequentially, although they could be partially vectorised.
  • everything is decoded, even the fields you don't need.

The way I see things:

  • use pandas.read_fwf for the mandatory sections
  • use the apply method for the remaining part of the string (additional fields + remarks).

Usually, you know which information you are trying to get (and it is probably not every field that is present).
The idea would be to provide a list of desired fields. Based on that list, we could perform only the necessary decoding and return a pandas DataFrame (or a list of records).
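A minimal pure-Python sketch of the "list of desired fields" idea follows. It is not ish_parser's API: FIELD_SLICES, decode_fields and decode_lines are hypothetical names, and the column offsets are assumptions taken from the ISD fixed-width layout that should be checked against the format document before use. Values are returned as raw strings, with no scaling or missing-value handling.

```python
# Hypothetical field table: name -> (start, end) half-open offsets in a record.
# Offsets are assumptions based on the ISD fixed-width layout; verify before use.
FIELD_SLICES = {
    "usaf": (4, 10),
    "wban": (10, 15),
    "date": (15, 23),
    "time": (23, 27),
    "air_temperature": (87, 92),
    "dew_point": (93, 98),
}


def decode_fields(line, wanted):
    """Slice only the requested fields out of one record, as raw strings."""
    return {name: line[slice(*FIELD_SLICES[name])] for name in wanted}


def decode_lines(lines, wanted):
    """Decode an iterable of records into a list of dicts, one per record."""
    return [decode_fields(line, wanted) for line in lines]


# Synthetic record built for illustration only; this is not real station data.
sample = "0000" + "123456" + "99999" + "20200101" + "1230" + "X" * 60 + "+0123" + "1" + "-0045"
records = decode_lines([sample], ["date", "air_temperature"])
print(records)
```

The same half-open (start, end) pairs could be passed as colspecs, with the field names as names, to pandas.read_fwf to load the selected columns straight into a DataFrame.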

That would increase speed a lot.

Would you be interested in taking the library in this direction?

Thanks,
Vincent

What does get_numeric() return for sky cover summation coverage?

I have a question about interpreting a report's sky cover summation observation. If I am looking at the data

coverage                                      OVERCAST - 8/8 coverage
secondary_coverage                    Missing
height                                          300
characteristic                               Missing
Name: (1981-01-28 03:22:00+00:00, sky_cover, 1), dtype: object

and I use get_numeric to convert the coverage data to float

a.coverage.get_numeric()
Out[363]: 4.0

What does 4.0 mean?

Thanks for the great package!

Cleaner python3 example for use with gzip

Not much of an 'issue', but...
The usage example with gzip could be better.

import ish_parser
import gzip

ish_filename = 'path/to/a/ish/file.gz'

# Read content
parser = ish_parser.ish_parser()
with gzip.open(ish_filename, 'rb') as gzstream:
    for line in gzstream:
        parser.loads(line.decode('utf-8'))

# get the list of all reports
reports = parser.get_reports()
print(len(reports))

I'm currently working on code to fetch the data for a particular station over a particular range.
Also converting some results (just air_temperature for now) into pandas dataframes.
If you're interested, I can submit that code to you if it ever gets to a 'clean enough' standard.

Oh, and thanks for writing a parser for the insane mess... I was not looking forward to that.

Remarks are not supported

Currently, this doesn't support parsing any of the remarks sections of the file. It basically stops after the additional data section and does nothing with remarks. This is a feature we should add at some point, especially for extracting things like METARs.
