revenol / droo
Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks

License: MIT License

Python 100.00%

droo's Introduction

DROO

Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks

Python code to reproduce our DROO algorithm for wireless-powered mobile-edge computing [1], which takes the time-varying wireless channel gains as input and generates the binary offloading decisions.

Cite this work

  1. L. Huang, S. Bi, and Y. J. Zhang, “Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks,” IEEE Trans. Mobile Comput., vol. 19, no. 11, pp. 2581-2593, November 2020.
@ARTICLE{huang2020DROO,  
author={Huang, Liang and Bi, Suzhi and Zhang, Ying-Jun Angela},  
journal={IEEE Transactions on Mobile Computing},   
title={Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks},   
year={2020},
month={November},
volume={19},  
number={11},  
pages={2581-2593},  
doi={10.1109/TMC.2019.2928811}
}

About authors

Required packages

  • TensorFlow

  • numpy

  • scipy

How the code works

  • For the DROO algorithm, run main.py. If you code with TensorFlow 2 or PyTorch, run mainTF2.py or mainPyTorch.py, respectively. The original DROO algorithm is implemented in TensorFlow 1.x. If you are new to deep learning, please start with TensorFlow 2 or PyTorch, whose code is much cleaner and easier to follow.

  • For more DROO demos:

    • Alternating-weight WDs: run the file demo_alternate_weights.py
    • ON-OFF WDs: run the file demo_on_off.py
    • Remember to change the MemoryDNN import from
        from memory import MemoryDNN
      
      to
        from memoryTF2 import MemoryDNN
      
      or
        from memoryPyTorch import MemoryDNN
      
      if you are using TensorFlow 2 or PyTorch, respectively.

DROO is illustrated here for single-slot optimization. If you intend to apply DROO to multi-slot continuous control problems, please refer to our LyDROO project.

droo's People

Contributors

revenol


droo's Issues

Replay method

I want to use prioritized experience replay as the replay and batch-selection method, instead of random selection. As mentioned in this link, to compute a TD error one needs Q(s,a) and a target value derived from the reward. In your method, Q(s,a) is the same as Q*(s,x*), which we obtain in the action-selection part (Fig. 3). But for the target, we are not working with the reward directly. Do you have any idea how the target value can be calculated? Is it the same as Q*(h,x') in Eq. (11), which we calculate manually?
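One possible reading of the question, as a sketch: a proportional prioritized replay buffer whose priority is the gap |Q*(h, x*) − Q(h, x)| between the best candidate's objective and the DNN's scored value, used in place of a reward-based TD error. This is a hypothetical design, not part of the DROO code; `PrioritizedMemory` and its fields are invented names:

```python
import numpy as np

class PrioritizedMemory:
    """Minimal proportional prioritized replay sketch (hypothetical,
    not part of the DROO code). The 'TD error' here is the gap
    |Q*(h, x*) - Q(h, x)| between the best candidate offloading
    action's objective and the DNN's predicted value."""

    def __init__(self, size, alpha=0.6):
        self.size, self.alpha = size, alpha
        self.data, self.prio = [], []

    def store(self, sample, td_error):
        if len(self.data) >= self.size:      # overwrite the oldest entry
            self.data.pop(0)
            self.prio.pop(0)
        self.data.append(sample)
        # small epsilon keeps zero-error samples selectable
        self.prio.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size, rng=np.random):
        p = np.array(self.prio) / sum(self.prio)
        idx = rng.choice(len(self.data), batch_size, p=p)
        return [self.data[i] for i in idx]
```

High-error samples are drawn more often, which is the usual motivation for replacing DROO's uniform random batch selection.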

Thanks a lot for giving your time.

Maybe I didn't get it all...

But I think this is actually supervised learning, right? You are using output_obj as the ground truth, and the neural network tries to imitate that ground truth via the cross-entropy loss. In a practical situation, output_obj cannot be obtained.

Datasets

Where do the datasets come from?

I have some questions about the resource allocation module.

Sorry, my background in digital signal transmission is a bit weak. There are three things I currently do not understand.
1. In Eq. (3) of Section 3.3, what is the meaning of dividing the Shannon formula by the variable Vu?
2. Regarding the per-device weights, the paper says a weight is positively correlated with the device's computation rate. What is the concrete purpose of setting different weights? Is it to model differences in the devices' own hardware capabilities, and the different amounts of computing resources their tasks receive after being offloaded to the edge server?
3. optimization.py has very few comments. In this file, which formulas or principles do the p1 and phi functions, used to compute a and tau, correspond to?

AttributeError: module 'tensorflow._api.v1.losses' has no attribute 'binary_crossentropy'

#user = 10, #channel=30000, K=10, decoder = OP, Memory = 1024, Delta = 32

AttributeError Traceback (most recent call last)
in ()
96 training_interval=10,
97 batch_size=128,
---> 98 memory_size=Memory
99 )
100

2 frames
/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/module_wrapper.py in getattr(self, name)
191 def getattr(self, name):
192 try:
--> 193 attr = getattr(self._tfmw_wrapped_module, name)
194 except AttributeError:
195 if not self._tfmw_public_apis:

AttributeError: module 'tensorflow._api.v1.losses' has no attribute 'binary_crossentropy'
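One likely fix, judging only from the traceback (an assumption, not tested against this repo): the TF1-style `tf.losses` namespace has no `binary_crossentropy`, but the Keras namespace does in both TF 1.15 and TF 2.x, so replacing `tf.losses.binary_crossentropy` with `tf.keras.losses.binary_crossentropy` in the MemoryDNN loss definition should resolve the AttributeError. As a sanity check, the quantity that loss computes is plain binary cross-entropy, sketched here in NumPy:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Element-wise binary cross-entropy averaged over the last axis,
    matching what tf.keras.losses.binary_crossentropy computes."""
    y_pred = np.clip(y_pred, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred), axis=-1)
```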

how to calculate channel gain(input_h)

Hi. I found that this paper generates the channel gain from a Rayleigh fading channel model, with the independent random channel fading factor α following an exponential distribution with unit mean.

I'd like to ask: in the simulation experiment, how is the value of the channel gain calculated when a random variable is involved? Since the random variable follows an exponential distribution, it is hard for me to understand how to get the value of α in different time slots. Will it be a specific numerical value, or a value chosen randomly from the distribution? Thanks!

Application scenarios

Could you give some concrete examples of the WDs in the paper, i.e., concrete application scenarios?

kind of reinforcement learning

Hello,
Which kind of reinforcement learning is this algorithm: off-policy or offline RL?
Thank you for the nice paper and your clean code.

question about fmax and the assurances for task

Hi Dr. Huang, I have read your paper recently and I appreciate your work on DROO.

I'm curious how you assign f_max in DROO. I saw the constraint f_i <= f_max in the paper "Computation Rate Maximization..."; did you also apply the same constraint in DROO's dataset, and what is its exact value?

Also, the models in DROO and the previous paper mentioned above do not consider specific tasks, right? I noticed that the resource-allocation results for τ_i are very uneven: some τ_i can be smaller than 1e-5. Would additional constraints be valuable if I care about specific tasks in an MEC system?

Hoping for your reply!

Regarding the lambertw function calling part in the optimization.py file

Hello teacher, I have encountered difficulties in reading the code. If it is convenient, please help me with the following:

  1. There is a variable v in the bisection function in the optimization.py file. You repeatedly call the function Q(v) through binary search. What does this v mean?
  2. There is a sum1 variable in the Q(v) function. The p1 function is called in the expression that computes sum1, and the v value is passed in. The p1 function calls the phi function, and the lambertw function is used in phi. This chain of calls confused me: the lambertw function and the related expressions do not seem to appear in your paper.

I'd be very grateful if you could give me an answer.
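On the lambertw part of the question above: W(x) is defined as the inverse of w ↦ w·e^w, and closed-form solutions of rate-maximization problems that pair a logarithmic (Shannon-rate) term with a linear time constraint frequently reduce to it. That is presumably why phi calls scipy.special.lambertw even though the paper only states the final closed form; this reading is an inference, not confirmed by the authors. A quick check of the defining identity:

```python
import numpy as np
from scipy.special import lambertw

# Lambert W inverts w -> w * exp(w): for x >= 0 the principal branch
# is real-valued, and W(x) * exp(W(x)) == x by definition.
x = 2.5
w = lambertw(x).real
assert np.isclose(w * np.exp(w), x)
```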

bi-section algorithm details

Hi, I'm confused by the bisection algorithm applied in optimization.py. Could you give a detailed explanation or derivation? It is very important to me. Thanks a lot!
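For what it's worth, the bisection in optimization.py appears to search over a scalar dual variable v for which a monotone function Q(v) crosses zero (that tie to the repo's code is an inference from the issues above, not a confirmed description). The search itself is textbook bisection, sketched generically here:

```python
def bisect(Q, lo, hi, tol=1e-9):
    """Generic bisection sketch: find the root of Q on [lo, hi],
    assuming Q is continuous and changes sign on the interval.
    Each iteration halves the bracket, so the error shrinks
    geometrically regardless of Q's shape."""
    assert Q(lo) * Q(hi) <= 0, "Q must change sign on [lo, hi]"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if Q(lo) * Q(mid) <= 0:   # root lies in the left half
            hi = mid
        else:                     # root lies in the right half
            lo = mid
    return (lo + hi) / 2
```

Solving Q(v) = v² − 2 on [0, 2], for instance, converges to √2 in about thirty iterations.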

package version

Dear Professor,
I would like to ask about the version of TensorFlow used in this project. I encountered some problems when running the code.
