
Automated Measurement & Verification tool for energy data

License: GNU General Public License v3.0



Mave

Mave is a tool for automated measurement and verification (M&V). At its simplest, the aim is to read energy consumption data from before and after a retrofit (pre-retrofit and post-retrofit data) and to predict how much energy the retrofit saved. Mave does this by training multiple models on the data and using the best model to predict what the energy consumption would have been in the post-retrofit period, had the retrofit not happened.

Mave automatically resolves common problems with input files (missing data, irregular timestamps, outliers, etc.), builds input features from the data (including federal holidays, downloading weather data for the location, etc.), and normalizes the results to a Typical Meteorological Year (TMY) for a given physical address.

Installation

Assuming all of the dependencies are installed, install mave from source using:

    python setup.py install

Alternatively, install mave from pypi using:

    pip install mave --no-deps

If you run into trouble with the dependencies, see the Installation page of the wiki.

Usage

Try mave out on an example file such as example.csv, using any of the example methods below. This file contains 8 months of pre-retrofit data and 8 months of post-retrofit data, with the retrofit occurring at 6/29/2013 20:15.

Each of the commands below will build a model on the example file data and predict the savings. The difference between these approaches is primarily in how the post-retrofit period is defined, and whether or not mave normalizes the results to a typical year dataset.

    mave example.csv 

This assumes the last 25% of the file represents the post-retrofit period, as the default value of the 'test_size' argument is 0.25.

    mave example.csv -ts 0.5

This uses the 'ts' or 'test_size' argument to explicitly specify the fraction of the file to use as the post-retrofit period. In this example, the last 50% of the file represents the post-retrofit period (which is approximately correct for example.csv).

    mave example.csv -cp "2013/6/30 02:30"

This example uses the 'cp' or 'changepoint' argument to explicitly define the date at which the post-retrofit period begins. This overrides the 'test_size' value. In this case, all data on or after June 30, 2013 at 02:30 represents the post-retrofit period.

Note that this is the actual datetime at which the post-retrofit period begins for example.csv. If you are wondering about mave's accuracy: for the hypothetical scenario in this file, the savings over the post-retrofit period are a constant value of 6 units for each measured data point (or 15% NMBE).

    mave example.csv -cp "2013/6/30 02:30" -ad "berkeley, california"

This example uses the 'ad' or 'address' argument to include a physical address in sunny Berkeley, California. Mave will use the Google Maps API to resolve that to a latitude and longitude, which will then be used to look up the nearest available historical weather data if none is provided in the input file (it is in the case of example.csv) and a Typical Meteorological Year dataset for that location.

Mave has many configurable options, some of which can be passed as command line arguments (run mave -h for details) and many more of which can be passed using a separate configuration file. Command line arguments override those in the config file. The configuration file also allows many other advanced modeling options, such as specifying multiple different periods to use as the pre- or post-retrofit period, or periods to ignore entirely. The advanced modeling options also allow the user to control which input features are used for the model. For example, if the input file has a lot of data, and seasonal production or occupancy, the user may want to include month as an input feature. Review the wiki documentation, or the [default config file](https://github.com/CenterForTheBuiltEnvironment/mave/blob/master/mave/config/default.cfg), for detailed descriptions of the various options.
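As an illustration only, a minimal config file overriding a few of the options that appear in mave's log output might look like the following. The section header is a guess; consult default.cfg for the authoritative layout and the full list of option names.

```ini
; Hypothetical section name - check default.cfg for the real layout
[Settings]
test_size = 0.25
use_holidays = True
use_month = True
remove_outliers = SingleValue
save = True
```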

Results

The results of the analysis are contained in the log file. The 's' (or 'save') argument will serialize and save the model(s), along with the .csv files for the measured and predicted datasets.

A future version of mave will also include an option to plot figures and results in a pdf file.

Auxiliary scripts

mave also comes with two additional scripts, mave-weather and mave-tmy, for downloading weather data and TMY data for a given location, respectively. See the examples below; running either command with the 'h' (or 'help') argument will describe each of its arguments in more detail.

    mave-weather 'berkeley, ca' -s '2010-01-01' -e '2010-01-05' -i 15

The above command will download and save the nearest available historical weather data for Berkeley, CA, USA, from Jan 1, 2010 to Jan 5 2010, and interpolate it to 15 minute intervals.
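The interpolation step can be sketched with pandas. This is not mave's internal code; the series name and values are hypothetical, and only the resample-then-interpolate idea is what the -i argument does conceptually.

```python
import pandas as pd

# Hourly weather observations (hypothetical values).
hourly = pd.Series(
    [10.0, 12.0, 11.0],
    index=pd.date_range("2010-01-01 00:00", periods=3, freq="h"),
    name="temperature",
)

# Upsample to 15-minute intervals; the new in-between timestamps are
# NaN, so fill them by linear interpolation.
quarter_hourly = hourly.resample("15min").interpolate()
```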

    mave-tmy 'berkeley, ca' -i 15 -y 2015

The above command will download and save the nearest available TMY data for Berkeley, CA, USA. It will also create a separate file containing interpolated data, and overwrite the year provided in the TMY file to 2015 (as it is useful to have a continuous year of data for modeling purposes).

To cite this tool: Paul Raftery & Tyler Hoyt, 2016, Mave: software automated Measurement and Verification. Center for the Built Environment, University of California Berkeley, https://github.com/CenterForTheBuiltEnvironment/mave


mave's Issues

Comparer breaks with 0 values in the baseline array

If the baseline measured value is a zero (e.g. a net zero energy building), then normalizing against it throws a div by zero error. Not sure what to do with this. Skip this error metric for such a dataset?
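One possible fix, sketched below rather than taken from mave's implementation, is to return a sentinel instead of dividing when the baseline sums to zero:

```python
import numpy as np

def nmbe_percent(measured, predicted):
    """Normalized Mean Bias Error, guarding against a zero baseline.

    Returns None when the baseline sums to zero (e.g. a net zero
    energy building) rather than raising a division-by-zero error.
    Negative indicates prediction > baseline, matching mave's sign
    convention.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    total = measured.sum()
    if total == 0:
        return None  # metric is meaningless for a zero baseline
    return 100.0 * (measured - predicted).sum() / total
```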

Output

Save model, result, and preprocessor objects as pickle files. Output csv files of predictions.

handle oat and dpt

Allow the user to specify column names. Download data if none is available. If only OAT is provided, only extract that parameter from the TMY data.

higher level abstraction

A higher level abstraction for all the pre-processed objects, such as X_s, y_s, X_post_s, X_post, X_pre_s, and X_pre, would be useful. @praftery

Cross validation

The cross-validation datasets are randomly split from the training data, but this occurs before each grid search. So while the grid search for a particular model uses the same dataset, each model uses a different randomly split dataset. This should be rearranged so the same cross validation split is used for all models - so we are comparing like with like when looking at R2 scores.
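The fix amounts to building one fixed splitter and passing it to every grid search. A sketch using the modern scikit-learn API (which differs from the version mave was written against; the models and parameter grid here are stand-ins):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV, KFold

X, y = make_regression(n_samples=200, n_features=4, random_state=0)

# One fixed split object shared by every grid search, so each model's
# R2 score is computed on exactly the same folds.
cv = KFold(n_splits=5, shuffle=True, random_state=42)

scores = {}
for name, est in [("rf", RandomForestRegressor(random_state=0)),
                  ("et", ExtraTreesRegressor(random_state=0))]:
    gs = GridSearchCV(est, {"n_estimators": [10, 30]}, cv=cv, scoring="r2")
    gs.fit(X, y)
    scores[name] = gs.best_score_
```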

Example files and unit tests

Create new example files (and unit tests for them):
Integers instead of floats
Missing OAT data
One more input feature - outside wetbulb
Shorter input file
Different datetime formats
Poor csv formatting (trailing commas...)
Bad quality energy/temp data (zeros, nans, strings)

get_weather unit test

@taoning
within tests/tests.py
TODO: manually calculate the correct interpolated data and compare
to .interp_data over a (very) short timeperiod to make sure
this is working correctly

Weather/TMY unit mismatch

Weather in I-P, TMY in SI.

Also, need to include a test for user-entered weather data to ensure it is in, or convert it into, SI units.

Optimize parameter ranges for randomized grid search

This relates to the randomized grid search to find the optimal set of parameters for each regressor (or at least, a reasonably good set anyway). The bounds on possible parameter values may be too broad, or too narrow, or may not really matter too much at all for our application (!), so it's probably worth looking into.

We could use the EnerNOC data for testing this out:
http://www.datadrivenbuilding.org/100-EnerNOC-Commercial-Buildings
http://open.enernoc.com/data/

Build a kind of ensemble method using all of the models

Kudos to Samir for this idea.

Take the different models (mean week, random forests, extra trees) that the tool currently trains, and instead of using the best one (evaluated using R2 value using k-fold cross validation), use a combination of the three.
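The simplest version of this is to average the predictions of the trained models instead of discarding all but the best one. A sketch, with generic regressors standing in for mave's mean-week, random forest, and extra trees models:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=120, n_features=3, noise=5.0, random_state=1)

# Train the candidate models, then combine their predictions by a
# simple unweighted mean rather than keeping only the single best one.
models = [RandomForestRegressor(random_state=0).fit(X, y),
          ExtraTreesRegressor(random_state=0).fit(X, y),
          LinearRegression().fit(X, y)]

ensemble_pred = np.mean([m.predict(X) for m in models], axis=0)
```

A natural refinement would be to weight each model's contribution by its cross-validated R2 score.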

Holidays

Extend the holidays set to include more years!

Outlier detection and removal

M&V data often contains missing energy consumption values. These missing values can be represented as either empty entries (which mave currently excludes from the learning process) or as an 'unusual' value (which mave does not exclude, but should). Examples include using values of -1, 999, or 99999, or strings (like 'None' or 'blank').

It would be great to include some outlier detection and removal. Even very simple approaches would capture most glaring issues (e.g. discard data X standard deviations from the mean of the data), though there are lots of other advanced methods: http://research.microsoft.com/pubs/217054/gupta14_tkde.pdf
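The simple standard-deviation approach mentioned above could be sketched as follows (an illustration, not mave's implementation):

```python
import numpy as np

def drop_outliers(values, n_std=3.0):
    """Discard points more than n_std standard deviations from the mean.

    Deliberately simple; this would catch sentinel values such as 999
    or 99999 embedded in otherwise normal-looking consumption data.
    """
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    if std == 0:
        return values  # constant data has no outliers
    return values[np.abs(values - mean) <= n_std * std]
```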

New peak error metrics

Include an error metric that looks at the peak month prediction. Utility bills for large buildings often charge based on an overall $/kWh consumed (which we cover well with existing error metrics) as well as a $ per peak kW consumed in a month. These peak demands often make up half the total cost.

As yet, none of the error metrics evaluate model accuracy for this - and it can have a very significant impact on the end result.
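A candidate metric could compare the peak value in each month and report the error at the overall peak. This is a sketch of one possible demand-charge-oriented metric, not an agreed definition:

```python
import pandas as pd

def peak_month_error_percent(measured, predicted):
    """Percent error in the peak monthly value.

    Both inputs are pandas Series indexed by timestamp. Positive
    indicates the model over-predicts the peak.
    """
    peak_measured = measured.resample("MS").max().max()
    peak_predicted = predicted.resample("MS").max().max()
    return 100.0 * (peak_predicted - peak_measured) / peak_measured
```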

Dual model

Implement dual-model measurement and verification - one model trained on the baseline period, one on the post-retrofit period. Predict using each model on TMY data, and compare the results (to estimate energy savings).

Weight recent data as more important using sample weighting methods (for multi year datasets)

The idea is to basically weight more recent data as more important in the regression. e.g. if you have 3 years of building data, and want to make a prediction, surely the most recent year is the most relevant for that prediction? All the regressors have sample weighting built in as an option, so this should be relatively easy to test out.

Looks like this functionality already exists. We can pass a sample_weights array as an argument to the fit method, e.g.: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor.fit
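A quick sketch of what this could look like. The data and the linear-ramp weighting scheme are assumptions for illustration; only the sample_weight argument itself is the real scikit-learn API:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((300, 3))
y = X.sum(axis=1) + rng.normal(0, 0.1, 300)

# Pretend the rows are ordered oldest-to-newest and weight recent
# samples more heavily via a linear ramp.
weights = np.linspace(0.2, 1.0, len(y))

model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(X, y, sample_weight=weights)
```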

Location/TMY feature

Weather data:
A) remove RH column

TMY data:

  1. timestamps should have a consistent year
  2. remove columns containing data that does not match those in the weather data (dbt and dpt)
  3. interpolate timestamps using an interval from start to end

(3) above is pretty much identical to what is done in weather.py, so it makes sense to reuse that code. One way is to create a new method that just does the datetime parsing and interpolation part. While you are making this change, you could also consolidate all of the location/weather/tmy classes in one .py file.

Include additional public domain estimators that perform well

... based on the LBL report: http://eetd.lbl.gov/sites/all/files/lbnl-187225.pdf

M6. Weighted Time-of-Week-and-Temperature [Piette et al. 2013]: Piette, M.A., Brown R.E., Price P.N., Page, J., Granderson, J., Riess, D., et al. (2013). Automated measurement and signaling systems for the transactional network. Lawrence Berkeley National Laboratory, December 2013. LBNL-6611E.

M5. Time-of-Week-and-Temperature [Mathieu et al. 2011]: Mathieu, J.L., Price, P.N., Kiliccote, S., Piette, M.A. (2011). Quantifying changes in building electricity use, with application to Demand Response. IEEE Transactions on Smart Grid 2:507-518, 2011.

Bug: Invalid value encountered in median

test files 197.csv, 186.csv - seems to occur when there are any missing intervals in the data.

    mave ./csv/197.csv -ts 0.7

/usr/local/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
INFO Starting mave
INFO Assessing input file: ./csv/197.csv
INFO No config file entered - Using default settings
INFO Configuration used for this analysis:
{'address': '',
'changepoints': None,
'datetime_column_name': 'dttm_utc',
'dayfirst': False,
'end_frac': 1.0,
'holiday_keys': ['USFederal'],
'ignored_column_names': ['anomaly', 'estimated', 'timestamp'],
'k': 10,
'n_jobs': -1,
'outside_db_column_name': 'OutsideDryBulbTemperature',
'outside_dp_column_name': 'OutsideDewPointTemperature',
'plot': False,
'print_console': True,
'remove_outliers': 'SingleValue',
'save': True,
'search_iterations': 20,
'start_frac': 0.0,
'target_column_name': 'value',
'test_size': 0.7,
'timestamp_format': '%Y-%m-%d%T%H%M',
'use_holidays': True,
'use_month': True,
'use_tmy': True,
'yearfirst': True}
INFO No location provided
INFO Ignoring the following columns named:['anomaly', 'estimated', 'timestamp']
INFO Preprocessing the input file
INFO Preprocessing started
INFO Creating input features from datetimes
INFO Creating other (non datetime related) input features
value
INFO Cleaning up data - removing outliers, missing data, etc.
WARNING Removed the following 1 outlier value(s):
[('2012-03-23 16:10:00', 50.509099999999997)]
INFO Splitting data into pre- and post-retrofit datasets
/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/data.py:583: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/data.py:646: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
INFO Fitting models to the preretrofit data
INFO Training hour and weekday binning models
INFO Training random forest regressor models
INFO Best HourWeekdayBinModel model R2 score: 0.713385516195, with parameters: {'strategy': 'mean'}
INFO Best RandomForestRegressor model R2 score: 0.813320339418, with parameters: {'max_features': 3, 'min_samples_split': 267, 'bootstrap': False, 'max_depth': 9, 'min_samples_leaf': 52}
/usr/local/lib/python2.7/dist-packages/numpy/lib/function_base.py:3142: RuntimeWarning: Invalid value encountered in median
RuntimeWarning)
/usr/local/lib/python2.7/dist-packages/numpy/lib/function_base.py:3403: RuntimeWarning: Invalid value encountered in median
RuntimeWarning)
INFO Outputting analysis results:

===== Pre-retrofit model training summary =====

=== Selected model ===
Best cross validation score on training data: 0.813320339418
Best model:
RandomForestRegressor(bootstrap=False, criterion='mse', max_depth=9,
max_features=3, max_leaf_nodes=None, min_samples_leaf=52,
min_samples_split=267, min_weight_fraction_leaf=0.0,
n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
verbose=0, warm_start=False)
The relative importances of input features are:
['Minute: 0.00670161894804',
'Hour: 0.581716661949',
'DayOfWeek: 0.315205710804',
'Month: 0.0585278722051',
'Holiday: 0.0378481360934']

=== Fit to the training data ===
These error metrics represent the match between the pre-retrofit data used to train the model and the model prediction:

The model does not meet the ASHRAE Guideline 14:2002 criteria.

Negative indicates prediction > baseline (i.e. savings)
Total Biased Error: -0.000 [in original units]
Normalized Mean Bias Error: -0.000%
Mean Absolute Percent Error: 19.965%
CVRMSE: 35.767%
R2: 0.819

Distribution of normalized errors:
minimum: -472.156%
10th %ile: -32.337%
25th %ile: -12.932%
median: -1.852%
75th %ile: +6.923%
90th %ile: +18.443%
maximum: +75.598%

mean: -9.093%
std. dev.: 45.045%
count: 31622

===== Results =====
These results quantify the difference between the measured post-retrofit data and the predicted consumption:

Negative indicates prediction > baseline (i.e. savings)
Total Biased Error: 323136.091 [in original units]

Note: There are zero values in the baseline data rendering some typical comparison metrics meaningless.

Normalized Mean Bias Error: 34.629%
Mean Absolute Percent Error: nan%
CVRMSE: 65.305%
R2: 0.216

Distribution of normalized errors:
minimum: +nan%
10th %ile: +nan%
25th %ile: +nan%
median: +nan%
75th %ile: +nan%
90th %ile: +nan%
maximum: +nan%

mean: +nan%
std. dev.: nan%
count: 73785

Test using nans for values in the regressors

Some of the regressors (the ones that are based on decision trees) can predict even if some of the input features do not have data. E.g. if you train a model using the datetime, OAT, and occupancy as the input features, but need to predict a value for a datetime for which you have no occupancy measurement, they can still give a reasonable result. Likewise, the same applies on the training side - at the moment mave drops any training instance that is missing even one datapoint. As we also use recurrent data (e.g. OAT values from the previous timestamps), a single missing datapoint percolates through and excludes more than one training instance.

This would be particularly relevant for datasets with a lot of missing data.

Visualization

Extend Comparer.py to include a visualization of the results (or write a new class).

Visualize:
Measured vs predicted (for pre-retrofit baseline period)
Pre-retrofit model vs post-retrofit model (when normalized to TMY)

Initial ideas to consider:

  1. Timeseries snapshot (1 week?) with OAT
  2. Scatter plot (measured vs predicted)
  3. Box plots of error (broken out by hour, day of week, temperature?)
  4. Color plot of error by hour of day, day of week

Month feature

Should be disabled by default. Longer term, include it automatically if there is more than 12 months of training data.

Datetime parsing and related input features

Datetime parsing is very slow: it is single-core, and using a vectorized function is probably not the best approach.

Add example files to test a couple of scenarios that people are likely to encounter with input files:
Duplicate datetimes - this might break the historical outside air temp interpolation.
Different datetime formats - so far we have only tested one. It shouldn't be an issue, but better to test it.
Irregular datetime intervals (15 minute, 14:59, etc.) - this does not work in the current implementation, and it is a pretty common problem. The simplest approach would be to 'round' the datetime to the nearest interval value.
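The rounding approach could be sketched like this (a stdlib-only illustration, not mave's implementation):

```python
from datetime import datetime, timedelta

def round_to_interval(dt, minutes=15):
    """Round a datetime to the nearest interval, e.g. 14:59 -> 15:00.

    Works by counting whole seconds from datetime.min, which is
    always aligned to any interval that evenly divides a day.
    """
    interval = minutes * 60
    seconds = (dt - dt.min).total_seconds()
    rounded = round(seconds / interval) * interval
    return dt.min + timedelta(seconds=rounded)
```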
