
This project forked from dekeijzer/kb-74-opschaler


Energy usage prediction of houses for the minor Applied Data Science at The Hague University of Applied Sciences.

Home Page: http://www.opschaler.nl/

Python 0.02% HTML 3.80% Batchfile 0.01% Jupyter Notebook 96.18%


All up-to-date models are found here.

The research paper is found here.
The (final) presentation given at the symposium is found here.

KB-74-OPSCHALER

This repository contains code for the KB-74-OPSCHALER project. KB-74 stands for the minor Applied Data Science at The Hague University of Applied Sciences, with Opschaler being the project name. The goal of this project is to predict the energy usage of houses, 1 week ahead with a 10 second resolution. More information about Opschaler can be found at their website.

Personal portfolios

Links to the personal portfolios of the KB-74-OPSCHALER group members are listed below.

About the data

Sensor data from within the dwellings (occupancy, CO2 values, humidity, temperature and more) is also available, but it has not been added to this file.

Smart meter data

| Parameter | Unit | Sample rate | Description |
|---|---|---|---|
| Timestamp | - | 10 s | Timestamp of data telegram (set by smart meter) in local time |
| eMeter | kWh | 10 s | Meter reading electricity delivered to client, normal tariff |
| eMeterReturn | kWh | 10 s | Meter reading electricity delivered by client, normal tariff |
| eMeterLow | kWh | 10 s | Meter reading electricity delivered to client, low tariff |
| eMeterLowReturn | kWh | 10 s | Meter reading electricity delivered by client, low tariff |
| ePower | kWh | 10 s | Actual electricity power delivered to client |
| ePowerReturn | kWh | 10 s | Actual electricity power delivered by client |
| gasTimestamp | - | 1 h | Timestamp of the gasMeter reading (set by smart meter) in local time |
| gasMeter | m3 | 1 h | Last hourly value (temperature converted), gas delivered to client |

Weather data

This is weather data from the KNMI weather station in Rotterdam with a sample rate of 10 minutes.
According to a representative from OPSCHALER, this is the weather station closest to all the dwellings. The exact dwelling locations are unknown.
They are probably within a 25 km radius of this weather station.

| Parameter | Unit | Description |
|---|---|---|
| DD | degrees | Wind direction |
| DR | s | Precipitation time |
| FX | m/s | Maximum gust of wind at 10 m |
| FF | m/s | Wind speed at 10 m |
| N | okta | Cloud coverage |
| P | hPa | Outside pressure |
| Q | W/m2 | Global radiation |
| RG | mm/h | Rain intensity |
| SQ | min | Sunshine duration (in minutes) |
| T | deg C | Temperature at 1.5 m (1 minute mean) |
| T10 | deg C | Minimum temperature at 10 cm |
| TD | deg C | Dew point temperature |
| U | % | Relative humidity at 1.5 m |
| VV | m | Horizontal sight |
| WW | - | Weather- and station-code |

----- Notes for the group members are listed below -----

All (sub)chapters below are meant for the KB74-Opschaler group members.

Setting up GitHub on JupyterHub

  1. Login to JupyterHub on the datascience server.
  2. In the top right press 'New -> Terminal'. An SSH terminal should pop up in a new window.
  3. Next follow this tutorial: link.
  4. When you have done this you will need to add the SSH key to your GitHub account: link. Note that step 1 of that tutorial will not work because 'clip' is not recognized. Work around this by using FileZilla to browse to ~/.ssh/id_rsa.pub (where ~ is your home folder) and download the file. Then open the file with a text editor, copy the contents and continue with the tutorial.
  5. Test your connection: link
  6. You are ready to clone repositories.

Basic SSH commands

  • ls Lists directory contents
  • cd directory_name Moves into directory_name
  • cd .. Moves up one directory
  • cp Copies a file or directory to another location
  • Press tab to finish a word automatically.

Note that ~ represents your home folder. More info on Linux commands: link

Cloning the KB-74-OPSCHALER repository

  1. Once GitHub has been set up correctly you can clone this repository: press the green Clone or download button and copy the link (https://github.com/deKeijzer/KB-74-OPSCHALER.git).
  2. In the Jupyter terminal window you should see the line studentnumber@datascience:~$. Move to the 'notebooks' folder by typing cd notebooks. The directory you are in now should be ~/notebooks.
  3. While in here type git clone <the link you copied, from this repository>.
  4. Once this is done, move to the 'KB-74-OPSCHALER' folder by typing cd KB-74-OPSCHALER.
  5. Once in here type git status. This will give you additional information and show you that you have cloned successfully.

Git push & pull

Before you start working on code in Jupyter, make sure that you have the latest version of this repository by typing git pull. Once you have written some code and want to upload it to this repository, do so as follows.

  • git add . (this selects all files)
  • git commit -m 'commit message, for example the changes that you made to the code'
  • git push

More push & pull information can be found in this notebook.

Important data locations

Below is a list of the most important data locations for the Opschaler project. Make sure not to modify or add any files in the folders listed below. Some notebooks have been programmed in such a way that they expect all files in a folder to have a certain file structure. For example: the smartmeter_data folder should only contain smartmeter files in the format dwelling_id.csv. Any other file in there will crash the notebook which uses this folder to process the files.
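
Before running a notebook that processes a whole folder, it can help to check for stray files. The sketch below is only an illustration: the helper name unexpected_files and the file name pattern (derived from the example id P01S01W0373) are assumptions, and the demo runs on a temporary folder instead of the real data locations.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming pattern, guessed from the example dwelling id 'P01S01W0373'.
DWELLING_FILE = re.compile(r"^P\d+S\d+W\d+\.csv$")


def unexpected_files(folder, pattern=DWELLING_FILE):
    """Return the names of all entries in `folder` that do not match `pattern`."""
    return sorted(p.name for p in Path(folder).iterdir()
                  if not pattern.match(p.name))


# Demo on a temporary folder, so the real data folders are never touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("P01S01W0373.csv", "P01S01W0001.csv", "notes.txt"):
        (Path(tmp) / name).touch()
    stray = unexpected_files(tmp)

print(stray)  # ['notes.txt']
```

An empty result means the folder only contains files with the expected name format.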

KNMI

The KNMI data consists of two dataframes. One is in the raw format, as provided by the KNMI. The other is the processed dataset, which has been cleaned and prepared in such a way that it can be used for EDA.

KNMI Raw data

Location: /datc/opschaler/weather_data/knmi_10_min_raw_data
This is the raw 10 minute interval data from 2015 till 2018 as provided by the KNMI (by mail).

KNMI preprocessed dataframe

Location: //datc//opschaler//weather_data//weather.csv
The KNMI dataframe (1.82 GB) contains weather data from 2015 to 2018, with a 10 minute resolution. More information can be found in this notebook.
Reading in the data is done as follows:

  • import pandas as pd
  • weather = pd.read_csv('//datc//opschaler//weather_data//weather.csv', delimiter='\t', comment='#', parse_dates=['datetime'])
  • weather = weather.set_index(['datetime'])
  • weather.head()

Smartmeter data (from the TU Delft server)

This is the smartmeter data as downloaded from the TU Delft server.

Raw smartmeter data (from the TU Delft server)

Location: /datc/opschaler/smartmeter_data
These are the raw smartmeter dataframes from the TU Delft server.
They should be in the format export_dwelling_id.csv.
These files contain the raw electricity and raw gas data.

Preprocessed dwelling_id dataframes

Location: //datc//opschaler//combined_gas_smart_weather_dfs//unprocessed
The smartmeter, gasmeter and weather dataframes merged into one dataframe.
Files ending in _hour have a one hour sample rate, files ending in _10s have a 10 second sample rate.
NaNs are not removed. The following has been done (in order):
For _hour files:

    1. gasPower calculated by using .diff() on the gas column.
    2. smartmeter and weather data downsampled to 1 hour, using the mean.
    3. smartmeter, gas and weather data merged.
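
The three _hour steps can be sketched with pandas as follows. This is a minimal illustration on synthetic data, not the project's notebook code: the values are made up, and the use of pd.concat for the merge is an assumption (the original notebooks may merge differently).

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for one dwelling (two hours of data, made-up values).
smart = pd.DataFrame(
    {"ePower": np.arange(720, dtype=float)},
    index=pd.date_range("2018-01-01", periods=720, freq="10s"),
)
weather = pd.DataFrame(
    {"T": np.linspace(2.0, 4.2, 12)},
    index=pd.date_range("2018-01-01", periods=12, freq="10min"),
)
gas = pd.DataFrame(
    {"gasMeter": [100.0, 100.5]},
    index=pd.date_range("2018-01-01", periods=2, freq="1h"),
)

# Step 1: gasPower is the difference between consecutive hourly meter readings.
gas["gasPower"] = gas["gasMeter"].diff()

# Step 2: downsample smartmeter and weather data to 1 hour, using the mean.
smart_hour = smart.resample("1h").mean()
weather_hour = weather.resample("1h").mean()

# Step 3: merge on the shared hourly datetime index (outer join keeps NaNs).
merged_hour = pd.concat([smart_hour, gas, weather_hour], axis=1)
print(merged_hour)
```

The first gasPower value is NaN because .diff() has no earlier reading to subtract, which matches the statement that NaNs are not removed in these files.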

For _10s files:

    1. gas upsampled to 10 s by using forward fill (.ffill()).
    2. gasPower calculated by using .diff() on the gas column.
    3. weather upsampled to 10 s by using forward fill.
    4. smartmeter, gas and weather data merged.
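
A comparable sketch for the _10s steps, again on synthetic data with made-up values. The merge via pd.concat is an assumption. Because the gas column is forward filled before .diff() is applied, the resulting gasPower is zero everywhere except at the moment a new hourly reading arrives.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for one dwelling (two hours of data, made-up values).
smart = pd.DataFrame(
    {"ePower": np.ones(720)},
    index=pd.date_range("2018-01-01", periods=720, freq="10s"),
)
weather = pd.DataFrame(
    {"T": np.linspace(2.0, 4.2, 12)},
    index=pd.date_range("2018-01-01", periods=12, freq="10min"),
)
gas = pd.DataFrame(
    {"gasMeter": [100.0, 100.5]},
    index=pd.date_range("2018-01-01", periods=2, freq="1h"),
)

# Step 1: upsample the hourly gas readings to 10 s with forward fill.
gas_10s = gas.resample("10s").ffill()

# Step 2: gasPower from .diff() on the (now 10 s) gas column.
gas_10s["gasPower"] = gas_10s["gasMeter"].diff()

# Step 3: upsample the 10 minute weather data to 10 s with forward fill.
weather_10s = weather.resample("10s").ffill()

# Step 4: merge on the shared 10 s datetime index (outer join keeps NaNs).
merged_10s = pd.concat([smart, gas_10s, weather_10s], axis=1)

# The only non-zero gasPower value sits where the second reading arrives.
print(merged_10s.loc[merged_10s["gasPower"] > 0, ["gasMeter", "gasPower"]])
```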

Processed dwelling_id dataframes (Use these for analysis)

Location: /datc/opschaler/combined_gas_smart_weather_dfs/processed
The smartmeter, gasmeter and weather dataframes merged into one dataframe. Rows that belong to a NaN streak longer than the accepted threshold have been dropped. NaNs in the weather data have been forward filled. NaNs in 'eMeter', 'eMeterReturn', 'eMeterLowReturn', 'gasMeter' have been interpolated. ePower, ePowerReturn and gasPower might still contain NaNs; drop these after reading in the files (if required). More information can be found here.

  • import pandas as pd
  • data_dir = '//datc//opschaler//combined_gas_smart_weather_dfs//processed//'
  • dwelling_id = 'P01S01W0373'  # for example
  • df = pd.read_csv(data_dir + dwelling_id + '.csv', delimiter='\t', parse_dates=['datetime'])
  • df = df.set_index(['datetime'])
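
Since ePower, ePowerReturn and gasPower might still contain NaNs, one way to drop those rows after reading a file is sketched below. The small dataframe is a synthetic stand-in for a processed dwelling file, and whether dropping is appropriate depends on the analysis.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a processed dwelling dataframe (made-up values).
df = pd.DataFrame(
    {
        "eMeter": [1.0, 1.1, 1.2, 1.3],
        "ePower": [0.4, np.nan, 0.5, 0.6],
        "ePowerReturn": [0.0, 0.0, 0.0, np.nan],
        "gasPower": [0.1, 0.1, 0.2, 0.2],
    },
    index=pd.date_range("2018-01-01", periods=4, freq="10s", name="datetime"),
)

# Drop only the rows in which one of the power columns is NaN.
power_columns = ["ePower", "ePowerReturn", "gasPower"]
df_clean = df.dropna(subset=power_columns)

print(len(df), "rows before,", len(df_clean), "rows after")  # 4 rows before, 2 rows after
```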

Honeywell sensor data

Location: /datc/opschaler/honeywell_sensors_per_dwelling_combined/honeywell_all_dwellings_combined.csv
Processed Honeywell sensor data.
All sensor data in one dataframe with dwelling labels.
Note that the serial data in this file has not yet been converted to room labels. The serial to room datafile honeywell_serial_to_room.xlsx can be found in the same folder.
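
Converting serials to room labels could look roughly like the sketch below. The column names (dwelling_id, serial, temperature, room) and all values are hypothetical, since the actual layout of the csv and xlsx files is not described here. The mapping is built in memory instead of being read with pd.read_excel.

```python
import pandas as pd

# Hypothetical sensor readings; the real column names may differ.
sensors = pd.DataFrame(
    {
        "dwelling_id": ["P01S01W0373", "P01S01W0373", "P01S01W0373"],
        "serial": ["A1", "B2", "C3"],
        "temperature": [20.5, 18.0, 19.2],
    }
)

# Hypothetical stand-in for the contents of honeywell_serial_to_room.xlsx.
serial_to_room = pd.DataFrame(
    {"serial": ["A1", "B2"], "room": ["living room", "bedroom"]}
)

# A left join keeps every reading; serials without a mapping get NaN as room.
labelled = sensors.merge(serial_to_room, on="serial", how="left")
print(labelled)
```

The left join makes unmapped serials visible as NaN instead of silently dropping their readings.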

NaN information of unprocessed dataframes

Location: /datc/opschaler/nan_information
This folder contains dwelling_id_threshold_percentage.csv files together with corresponding plots, which give in-depth knowledge about the NaNs in all used data. The notebook in which dwelling_id_threshold_percentage.csv is created can be found here.
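
To illustrate what a NaN streak is, the sketch below computes, for every sample, the length of the consecutive NaN run it belongs to. The helper nan_streak_lengths and the threshold of 2 are assumptions for illustration only, not the code or settings used in the project notebook.

```python
import numpy as np
import pandas as pd


def nan_streak_lengths(series: pd.Series) -> pd.Series:
    """Length of the consecutive NaN run each sample belongs to (0 if not NaN)."""
    is_nan = series.isna()
    # A new run starts whenever the NaN flag differs from the previous sample.
    run_id = (is_nan != is_nan.shift()).cumsum()
    # Summing the boolean flag per run gives the run length for NaN runs, else 0.
    return is_nan.groupby(run_id).transform("sum").astype(int)


s = pd.Series([1.0, np.nan, np.nan, 3.0, np.nan, 5.0, np.nan, np.nan, np.nan])
streaks = nan_streak_lengths(s)
print(streaks.tolist())  # [0, 2, 2, 0, 1, 0, 3, 3, 3]

# Keep only samples whose NaN streak does not exceed the (assumed) threshold.
max_streak = 2
kept = s[streaks <= max_streak]
nan_percentage = 100 * s.isna().mean()
```

Short streaks stay in the series so they can still be filled later, while samples in longer streaks are removed.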

EDA results locations

Location: //datc//opschaler//EDA//
The EDA results, saved per dwelling.
For example, correlation coefficient matrices are saved in //datc//opschaler//EDA//correlation_matrices
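
As an illustration of such a result, the sketch below computes a Pearson correlation matrix with pandas and saves it as a csv. The data is synthetic, the file name is hypothetical, and the file is written to a temporary folder instead of the EDA location. The project notebooks may use a different method or file format.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Synthetic data: gasPower is constructed to fall as the temperature T rises.
rng = np.random.default_rng(0)
temperature = rng.normal(5.0, 3.0, 200)
df = pd.DataFrame(
    {
        "T": temperature,
        "gasPower": 2.0 - 0.1 * temperature + rng.normal(0.0, 0.05, 200),
        "ePower": rng.normal(0.4, 0.1, 200),
    }
)

# Pearson correlation coefficients between all column pairs.
corr = df.corr()
print(corr.round(2))

# Save per dwelling; the file name below is a hypothetical example.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "P01S01W0373_correlation_matrix.csv"
    corr.to_csv(target, sep="\t")
    reloaded = pd.read_csv(target, sep="\t", index_col=0)
```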

Useful terminal commands

In Linux:

  • top to see CPU & RAM.
  • nvidia-smi -l 1 to see GPU usage and refresh this information every second.

On Windows:
To use nvidia-smi, first move to its folder:

  • cd C:\Program Files\NVIDIA Corporation\NVSMI

Then run nvidia-smi:

  • .\nvidia-smi -l 1

To see the CPU usage:

  • wmic cpu get loadpercentage

kb-74-opschaler's People

Contributors

dekeijzer, 16021665, poldevisser, victorgomezgithub, victorgrp
