Several tasks related to the smart meters project, including:
- accessing data from a remote database
- visualizing load data statistics as figures
- training an anomaly detection model for each resident
- showing the reconstruction error between the model output and the input data
Package | Version |
---|---|
python | 3.9.7 |
pytorch | 1.10.2 |
cudatoolkit | 11.3.1 |
numpy | 1.21.2 |
pandas | 1.4.1 |
matplotlib | 3.5.1 |
tensorboard | 2.8.0 |
requests | 2.27.1 |
xlrd | 2.0.1 |
If you manage packages with conda on Linux, run
source check.sh
to check the installed dependency packages and their versions.
Or, in Windows cmd, run
conda list | findstr /rc:"^python " /rc:"pytorch " /rc:"cudatoolkit" /rc:"numpy " /rc:"pandas" /rc:"matplotlib " /rc:"tensorboard " /rc:"requests " /rc:"xlrd"
If a package does not appear in the output, it has not been installed yet, or it is installed in the base environment rather than the active one.
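
For a check that does not depend on the shell, the same comparison can be sketched in Python with the standard library. This is only an illustration and not part of the repository: `REQUIRED` and `check_versions` are hypothetical names, and `importlib.metadata` looks up pip-style distribution names, so PyTorch is assumed to appear as `torch`, and conda-only packages such as `cudatoolkit` are not visible to it.

```python
from importlib import metadata

# Expected versions, copied from the table above. Keys are distribution
# names as seen by importlib.metadata (assumption: PyTorch is "torch").
REQUIRED = {
    "torch": "1.10.2",
    "numpy": "1.21.2",
    "pandas": "1.4.1",
    "matplotlib": "3.5.1",
    "tensorboard": "2.8.0",
    "requests": "2.27.1",
    "xlrd": "2.0.1",
}


def check_versions(required):
    """Return {name: (expected, installed)}; installed is None if missing."""
    report = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(REQUIRED).items():
        status = "ok" if installed == expected else "MISMATCH"
        print(f"{name:12s} expected {expected:8s} installed {installed} {status}")
```

A package reported as `None` is missing from the active environment, which matches the two cases described above.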
There are several functions in `main.py`, each corresponding to one task.
To run a task, uncomment the related line in the `if __name__ == '__main__':` block, comment out the other function calls, then run `main.py` with
python main.py
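
The layout described above might look like the following minimal sketch. The function bodies are placeholders, and `access_data` and `visualize_statistics` are hypothetical names used only for illustration. `train_model` and `detect_compute_ratio` are named in this README, but their real signatures may differ.

```python
# Hypothetical skeleton of main.py: one function per task.

def access_data():
    # Placeholder: the real function would fetch data from the remote database.
    return "access_data"


def visualize_statistics():
    # Placeholder: the real function would plot load data statistics.
    return "visualize_statistics"


def train_model():
    # Placeholder: the real function would train one model per resident.
    return "train_model"


def detect_compute_ratio():
    # Placeholder: the real function would compute anomaly ratios.
    return "detect_compute_ratio"


if __name__ == '__main__':
    # Keep only the task you want to run uncommented.
    # access_data()
    # visualize_statistics()
    print("running:", train_model())
    # detect_compute_ratio()
```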
- access data

  Debug or interactive mode works better here, since the data may need to be fetched several times when the network is unstable.
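
Because fetching may need to be repeated on an unstable network, a small retry helper can save some manual re-runs. The following is a minimal sketch and is not taken from the repository: `fetch_with_retry` and its parameters are hypothetical, and `fetch` stands for whatever call actually downloads the data.

```python
import time


def fetch_with_retry(fetch, attempts=5, delay=1.0):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable that raises an exception on failure.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:  # deliberately broad for a sketch
            last_error = error
            if attempt < attempts:
                time.sleep(delay * attempt)  # wait longer after each failure
    raise RuntimeError(f"all {attempts} attempts failed") from last_error
```

With `requests`, `fetch` could be something like `lambda: requests.get(url, timeout=10)`, where `url` is the database endpoint, which this README does not specify.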
- visualize load data statistics
- train reconstruction model

  We modify the N-BEATS model to output the backcast, so that it fits the reconstruction-based anomaly detection task. We also use a CNN structure with a down-sampling factor to control the relation between the input and the representation, instead of `backcast_length` and `forecast_length`. The original fully-connected N-BEATS structure may therefore not work well and should be treated as deprecated.

  - make sure there are `data_xxx.csv` files in the `dataset/TMbase` folder
  - adjust the parameters of the `train_model` function in the main script. `make_argv` actually returns a two-level list: the outer level has one entry per resident from `cond1`, and the inner level holds the arguments used to train that resident's model.
  - choose the GPU device index
  - logging `context_visualization` may consume a lot of disk space
  - run `main.py`
  - information from the training process is logged in `exp/{expname}/log/{name}.csv`
  - the model weights are saved in `exp/{expname}/model/{name}.mdl`
  - to visualize the information, run TensorBoard with `tensorboard --logdir=exp/{expname}/run [--port]`. Replace `{expname}` with the experiment name, and specify a port number if the default port is already in use.
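
The two-level list returned by `make_argv` can be pictured with the sketch below. It is an illustration built on assumptions, not the repository's code: `make_argv_sketch`, the resident names, and the argument strings are all made up, and the real `make_argv` may take different inputs and produce different arguments.

```python
def make_argv_sketch(residents, expname, gpu=0):
    """Build one argument list per resident (hypothetical signature)."""
    argv = []
    for resident in residents:
        argv.append([
            f"--name={resident}",    # would end up in {name}.csv and {name}.mdl
            f"--expname={expname}",  # would end up in exp/{expname}/
            f"--gpu={gpu}",          # GPU device index
        ])
    return argv


# Outer level: one entry per resident. Inner level: that resident's arguments.
all_args = make_argv_sketch(["resident_a", "resident_b"], expname="demo")
for resident_args in all_args:
    print(resident_args)
```

Keeping one inner list per resident means each model can be trained by iterating over the outer list, without the residents sharing any state.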
- detection

  - make sure there are `data_xxx.csv` files in the `dataset/TMbase` folder and model weights under `exp/{expname}/model`
  - adjust the parameters for each task:
    - to compute the anomaly ratio for a given list of thresholds, run the `detect_compute_ratio` function
    - to apply other residents' data to each model, run the `detect_apply_on_other_data` function
    - to show the reconstruction result and error, run the `detect_user_period` function
    - the `output_place` parameter of `detect_compute_ratio` and `detect_apply_on_other_data` sets where figures are output: `None` outputs directly to a window or to the plot panel in the editor, while a string `{str}` logs them to TensorBoard under the path `runs/detect/{str}`
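
As an illustration of what an anomaly ratio over a threshold list means, here is a minimal sketch in plain Python. It assumes that the reconstruction error is the point-wise absolute difference between input and reconstruction, and that a point counts as anomalous when its error exceeds the threshold. `detect_compute_ratio` itself may define both differently, and all names and numbers below are made up.

```python
def reconstruction_error(inputs, reconstructed):
    """Point-wise absolute error (assumed error definition)."""
    return [abs(x - y) for x, y in zip(inputs, reconstructed)]


def anomaly_ratio(errors, thresholds):
    """Fraction of points whose error exceeds each threshold."""
    total = len(errors)
    return {t: sum(e > t for e in errors) / total for t in thresholds}


# Toy load values and a toy reconstruction of them.
load = [1.0, 1.5, 4.0, 1.25]
recon = [1.0, 1.0, 2.0, 1.0]
errors = reconstruction_error(load, recon)  # [0.0, 0.5, 2.0, 0.25]
print(anomaly_ratio(errors, thresholds=[0.1, 1.0]))  # {0.1: 0.75, 1.0: 0.25}
```

A lower threshold flags more points, so scanning a list of thresholds shows how sensitive the ratio is to that choice.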
- other minor tasks
  - `plot_user_data`
  - `plot_model_basis`
  - `plot_model_result`
[1] nbeats paper
[2] nbeats source code
[3] IHEPC dataset