The deep_learning_nikolenko_and_co from iusaspb

Possible Numerical Bug at ch10_04_03_Pic_10_05.py, ch10_04_04_Pic_10_06.py, ch10_04_05_Pic_10_07.py, and ch10_04_06_Pic_10_08.py

Thanks for sharing this great repository! We ran a tool that automatically detected a possible numerical bug in it.

deep_learning_Nikolenko_and_Co/ch10_04_03_Pic_10_05.py

Line 50 in 4c57dfb

z = tf.add(z_mean, tf.multiply(tf.sqrt(tf.exp(z_log_sigma_sq)), eps))

Our tool shows that z_log_sigma_sq may become large enough that tf.exp overflows, or small enough that tf.exp underflows to 0, where the gradient of tf.sqrt becomes infinite and the gradients through tf.sqrt(tf.exp(.)) blow up.

To fix this, maybe we can replace

z_log_sigma_sq = tf.add( tf.matmul(enc_layer_2, w["w_recog"]['out_log_sigma']), w["b_recog"]['out_log_sigma'])

at Line 46 by

z_log_sigma_sq = tf.add( tf.matmul(enc_layer_2, w["w_recog"]['out_log_sigma']), w["b_recog"]['out_log_sigma'])
z_log_sigma_sq = tf.clip_by_value(z_log_sigma_sq, -87, 87)
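The effect of the proposed clip can be checked outside TensorFlow. The following is a minimal NumPy sketch in float32, not the repository's code: the helper name sample_z and the test values -200, 0 and 200 are made up for illustration, and it assumes the model runs in float32, where exp overflows just above 88.

```python
import numpy as np

def sample_z(z_mean, z_log_sigma_sq, eps, clip=87.0):
    """Reparameterization step with the log-variance clipped before exp.

    Hypothetical helper mirroring the proposed fix; not part of the repository.
    """
    z_log_sigma_sq = np.clip(z_log_sigma_sq, -clip, clip)
    sigma = np.sqrt(np.exp(z_log_sigma_sq))
    return z_mean + sigma * eps

# Illustrative float32 inputs: extreme negative, zero, extreme positive log-variance.
log_var = np.array([-200.0, 0.0, 200.0], dtype=np.float32)
z_mean = np.zeros(3, dtype=np.float32)
eps = np.ones(3, dtype=np.float32)

with np.errstate(over="ignore", under="ignore"):
    # Without clipping: exp(-200) underflows to 0 and exp(200) overflows to inf.
    unclipped = z_mean + np.sqrt(np.exp(log_var)) * eps
    # With clipping: every entry stays finite and strictly positive.
    clipped = sample_z(z_mean, log_var, eps)

print(unclipped)
print(clipped)
```

Another option, mathematically equivalent, is to compute the standard deviation as exp(0.5 * z_log_sigma_sq), which avoids the square root and its unbounded gradient at 0. Whether that suits these scripts is a judgment call for the maintainer.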

Similar possible issues were also found at ch10_04_04_Pic_10_06.py:

deep_learning_Nikolenko_and_Co/ch10_04_04_Pic_10_06.py

Line 50 in 4c57dfb

z = tf.add(z_mean, tf.multiply(tf.sqrt(tf.exp(z_log_sigma_sq)), eps))

at ch10_04_05_Pic_10_07.py:

deep_learning_Nikolenko_and_Co/ch10_04_05_Pic_10_07.py

Line 54 in 4c57dfb

z = tf.add(z_mean, tf.multiply(tf.sqrt(tf.exp(z_log_sigma_sq)), eps))

and at ch10_04_06_Pic_10_08.py:

deep_learning_Nikolenko_and_Co/ch10_04_06_Pic_10_08.py

Line 54 in 4c57dfb

z = tf.add(z_mean, tf.multiply(tf.sqrt(tf.exp(z_log_sigma_sq)), eps))

If these are valid issues, the same fix should apply.

Thanks!

[Potential NAN bug] Loss may become NAN during training even if 1e-10 was added in log()

Hello~

Thank you very much for sharing your code.

I ran ch10_04_01.py on my computer. Unfortunately, the loss became NaN.

On first inspection I couldn't find the root cause, which confused me for a long time. After checking the program carefully with tfdebug, I found the problem!

In the code, 1e-10 is added inside the log to prevent log(0):

reconstr_loss = -tf.reduce_sum(x * tf.log(1e-10 + x_reconstr_mean) +(1-x) * tf.log(1e-10 + 1-x_reconstr_mean), 1)

However, 1e-10 may be too small to keep 1e-10 + 1-x_reconstr_mean from becoming 0: I found that x_reconstr_mean can sometimes reach about 1.0000000001 because of floating-point rounding, which leads to log(0).

Increasing 1e-10 to 1e-7 can solve this problem.
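One likely explanation, assuming the graph runs in float32, is that 1e-10 is below float32 resolution near 1, so adding it to 1 changes nothing, while 1e-7 is large enough to survive the rounding. The NumPy sketch below illustrates this. The helper name bernoulli_log_term is hypothetical, and it assumes the sigmoid output saturates to exactly 1.0.

```python
import numpy as np

def bernoulli_log_term(x_reconstr_mean, eps):
    """Compute log(eps + 1 - x_reconstr_mean) entirely in float32.

    Hypothetical helper for illustration; it mimics the second log term
    of the reconstruction loss.
    """
    one = np.float32(1.0)
    return np.log(np.float32(eps) + one - x_reconstr_mean)

x_hat = np.float32(1.0)  # assumed: sigmoid output saturated to exactly 1.0
x = np.float32(1.0)      # target pixel value

with np.errstate(divide="ignore", invalid="ignore"):
    term_small = bernoulli_log_term(x_hat, 1e-10)  # 1e-10 is absorbed: log(0) = -inf
    term_large = bernoulli_log_term(x_hat, 1e-7)   # 1e-7 survives rounding: finite
    # In the loss, (1 - x) multiplies the log term; 0 * -inf gives NaN.
    contrib_small = (np.float32(1.0) - x) * term_small

print(term_small, term_large, contrib_small)
```

Clipping x_reconstr_mean into a range such as [1e-7, 1 - 1e-7] before taking the logs would be another way to get the same protection.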
