
Comments (7)

ssuffian commented on June 15, 2024

The weather data is sourced from NOAA, which has access to stations globally, although the NOAA FTP site (ftp://ftp.ncdc.noaa.gov/pub/data/noaa) is unfortunately down at the moment. We build a database that contains only certain weather stations, using the eeweather rebuild_db command. That command limits the database to US and Australian stations, as you can see here: https://github.com/openeemeter/eeweather/blob/master/eeweather/database.py#L186

The `eeweather rebuild_db` command stores a sqlite3 db file that contains only the stations that pass the filter on the line linked above. If you would like to expand the set of stations, you could try adjusting that line, or possibly add a CLI parameter to the rebuild_db call so that countries can be passed in. It may be tough to do right now while NOAA is down, but it should hopefully be back up in the next few days. If you get it working, please consider submitting it as a pull request!
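As a rough illustration of the "countries as a parameter" idea, here is a minimal sketch in plain Python. It is not the eeweather code: `filter_stations`, `DEFAULT_COUNTRIES`, and the sample rows are all made up for illustration, and only the `CTRY` and `USAF` column names and the "AS" code for Australia come from this thread. In the real pandas code the equivalent mask would presumably be something like `isd_history.CTRY.isin(countries)`.

```python
import csv
import io

# Hypothetical default: the two countries this thread says are kept today.
DEFAULT_COUNTRIES = ("US", "AS")  # "AS" is the ISD country code used for Australia


def filter_stations(rows, countries=DEFAULT_COUNTRIES):
    """Keep rows that have a USAF id, coordinates, and a wanted country code."""
    wanted = {code.upper() for code in countries}
    return [
        row
        for row in rows
        if row["CTRY"] in wanted and row["USAF"] and row["LAT"] and row["LON"]
    ]


# Invented sample rows, only to show the filter; these are not real stations.
SAMPLE = """USAF,CTRY,LAT,LON
000001,US,34.2,-118.4
000002,AS,-33.9,151.2
000003,NZ,-41.3,174.8
000004,CA,53.3,-113.6
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
default_ids = [row["USAF"] for row in filter_stations(rows)]
expanded_ids = [row["USAF"] for row in filter_stations(rows, ["US", "AS", "CA", "NZ"])]
print(default_ids)
print(expanded_ids)
```

A CLI option could then pass its list of country codes straight through as the `countries` argument, leaving the default behaviour unchanged.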

Let us know if you need any help navigating the code.

from eeweather.

philngo commented on June 15, 2024

@DLDonaldson @ssuffian I did a quick experiment to see how big the metadata database gets when it includes all of the weather stations in the world. It looks like it grows from 11.7 MB to 25 MB, which is probably reasonable and still well below the PyPI package size limit, which would be our upper bound. It's a little big for a Python package, and we could probably do some work to slim it down or separate it from the library itself, but I think I could be convinced to move to worldwide support. When I get a chance I will create a branch or tag with that change so that we can test it out in practice.


bhough199 commented on June 15, 2024

@philngo I'm trying to expand the list of weather stations to include NZ and Canada (worldwide as mentioned in this issue would be great but I wanted to just start with what I need). I managed to get it almost working with the following changes:

  1. delete eeweather/eeweather/resources/metadata.db
  2. Change database.py as shown in this diff:
diff --git a/eeweather/database.py b/eeweather/database.py
index 8466f9f..68e406b 100644
--- a/eeweather/database.py
+++ b/eeweather/database.py
@@ -181,9 +181,11 @@ def _load_isd_station_metadata(download_path):
     )
 
     isAus = isd_history.CTRY == "AS"
+    isCan = isd_history.CTRY == "CA"
+    isNZ = isd_history.CTRY == "NZ"
 
     metadata = {}
-    for usaf_station, group in isd_history[hasGEO & hasUSAF & (isUS | isAus)].groupby("USAF"):
+    for usaf_station, group in isd_history[hasGEO & hasUSAF & (isUS | isAus | isCan | isNZ)].groupby("USAF"):
         # find most recent
         recent = group.loc[group.END.idxmax()]
         wban_stations = list(group.WBAN)
  3. Find a new source for the CA_Building_Standards_Climate_Zones.zip file, which is missing from the official ca.gov site. I know the place I found it probably isn't a long term solution to this problem, but I couldn't rebuild the database without this file.
diff --git a/scripts/create_ca_climate_zone_geojson.sh b/scripts/create_ca_climate_zone_geojson.sh
index 145711f..732408b 100755
--- a/scripts/create_ca_climate_zone_geojson.sh
+++ b/scripts/create_ca_climate_zone_geojson.sh
@@ -5,7 +5,7 @@ DATA_DIR=${1:-data}
 mkdir -p $DATA_DIR
 
 # download and install CA climate zone raw data
-wget -N http://ww2.energy.ca.gov/maps/renewable/CA_Building_Standards_Climate_Zones.zip -P $DATA_DIR -q --show-progress
+wget -N https://community.esri.com/servlet/JiveServlet/download/176380-1-158805/CA_Building_Standards_Climate_Zones.zip -P $DATA_DIR -q --show-progress
 unzip -q -o $DATA_DIR/CA_Building_Standards_Climate_Zones.zip -d $DATA_DIR
 
 # reproject to ESRI Shapefile
  4. After those changes I ran the eeweather rebuild-db command from inside the shell of my docker image and it worked. I am able to get weather stations and weather data for NZ and CA.

In making these changes I somehow broke the is_tmy3=True parameter of eeweather.rank_stations(), even when I am looking for stations in the US. If I pass is_tmy3=True, the response is (None, [EEWeatherWarning(qualified_name=eeweather.no_weather_station_selected)]) regardless of where I look for a weather station. Is there some step that I overlooked when rebuilding the database that might have broken this?
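One way to narrow this down is to check whether the rebuilt database actually contains any TMY3 rows. The sketch below uses only the standard `sqlite3` module to list every table in a database with its row count, so an empty or missing table stands out. It makes no assumption about eeweather's real schema: the table names in the demo are hypothetical and the demo runs against a throwaway in-memory database.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every table in an open SQLite connection."""
    names = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Demo on an in-memory database; both table names are hypothetical.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE stations (id TEXT)")
demo.execute("CREATE TABLE tmy3_stations (id TEXT)")
demo.executemany("INSERT INTO stations VALUES (?)", [("a",), ("b",)])
counts = table_row_counts(demo)
print(counts)
demo.close()
```

To inspect the rebuilt file instead, open it with `sqlite3.connect(...)` pointing at the metadata.db produced by the rebuild and pass that connection in. A TMY3-related table with zero rows would suggest the TMY3 metadata step failed silently during the rebuild.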


DLDonaldson commented on June 15, 2024

@philngo I did some work earlier this year to slim down the number of stations worldwide based on the duration of the history and the amount of data available for each station. That might be a good way to reduce the overall number of stations in moving to worldwide support if you want to filter it down somewhat. If we were to expand the worldwide coverage that might simultaneously address the issue raised by @bhough199.
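To make the "filter by duration of history" idea concrete, here is a minimal sketch. It is not the filtering described in this comment, whose exact criteria are not given in this thread. It assumes each station record carries `BEGIN` and `END` dates as `YYYYMMDD` strings (the `END` column appears in the diff above; `BEGIN` and the date format are assumptions), and `years_of_record`, `long_running`, and the sample records are invented for illustration. It covers only record length, not the amount of data available per station.

```python
from datetime import datetime


def years_of_record(begin, end):
    """Approximate length of a station's record in years, from YYYYMMDD strings."""
    start = datetime.strptime(begin, "%Y%m%d").date()
    stop = datetime.strptime(end, "%Y%m%d").date()
    return (stop - start).days / 365.25


def long_running(stations, min_years=10.0):
    """Keep only stations whose record spans at least `min_years` years."""
    return [
        station
        for station in stations
        if years_of_record(station["BEGIN"], station["END"]) >= min_years
    ]


# Invented sample records; these are not real stations.
stations = [
    {"USAF": "000001", "BEGIN": "19730101", "END": "20200101"},
    {"USAF": "000002", "BEGIN": "20180601", "END": "20200101"},
]
kept = [station["USAF"] for station in long_running(stations)]
print(kept)
```

The threshold is arbitrary here; a real cut-off would need to be chosen by looking at how database size and coverage trade off.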

Also perhaps the TMY3 problem is a result of #63.


philngo commented on June 15, 2024

@DLDonaldson Worldwide coverage is definitely something I am interested in pursuing, but I'll need some support to move it forward. At one point I had considered adding a download step that fetches the whole database, or whichever part of it is necessary for your task, which would decouple the database from the PyPI release schedule. Filtering things down also seems like a pretty reasonable approach.
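For what it's worth, a download-on-first-use step could look roughly like the sketch below. This is only an illustration of the idea using the standard library, not anything that exists in eeweather: `ensure_database`, its parameters, and the URL in the usage comment are all hypothetical.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def ensure_database(url, cache_dir, filename="metadata.db"):
    """Return a local path to the database, downloading it once if it is missing."""
    target = Path(cache_dir) / filename
    if target.exists():
        return target  # already cached, no network access needed

    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so an interrupted download
    # never leaves a partial database at the final path.
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(dir=target.parent, delete=False) as tmp:
            shutil.copyfileobj(response, tmp)
    Path(tmp.name).replace(target)
    return target


# Hypothetical usage (the URL is a placeholder, not a real hosting location):
# db_path = ensure_database("https://example.com/metadata.db", Path.home() / ".eeweather")
```

A regional variant could take a different `url` and `filename` per region, so that users fetch only the part of the database they need.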

@bhough199 Thanks for sharing what you did to get the rebuild working again. That will help other power users figure out how to rebuild from source. I think there was a step in the database build that scraped the old NREL site for the TMY3 station metadata, and it's possible that it is now broken as well. Let us know if #65 fixes your issue.


bhough199 commented on June 15, 2024

@philngo I tested with the newest release after you fixed #65 but I am unfortunately still getting the same problem with eeweather.rank_stations().

Also, in case anyone else is trying my approach, the link to the CA_Building_Standards_Climate_Zones.zip file that I showed in my earlier comment has changed. I knew it wasn't a reliable link. Since the file is no longer available from the official source, maybe you could host it in the same place you put the TMY3 weather data? The new link I found today is https://community.esri.com/ccqpr47374/attachments/ccqpr47374/coordinate-reference-systemsforum-board/1814/1/CA_Building_Standards_Climate_Zones.zip


philngo commented on June 15, 2024

@bhough199 Would you mind opening a new issue for this out-of-date source problem so we can track it separately from weather station expansion? I think it is a good idea to host our own copies of the source files to guard against this happening again, and perhaps we can track that work in the new issue. I would appreciate help tracking down any other sources that are out of date, if you find any. One that may help solve the current rank_stations issue is this one (untested), which I believe you should be able to use from archive.org in the rebuilding step: http://web.archive.org/web/20181119091712/https://rredc.nrel.gov/solar/old_data/nsrdb/1991-2005/tmy3/by_USAFN.html

