masazi / cnn_depth_tensorflow
Depth estimation using TensorFlow.
Hi,
I cannot train the model with this code. I checked issues #1 and #3 and applied their fixes to the code.
However, the model still fails to train and I don't know why.
Has anyone managed to train the model with this code?
Finally, thank you for providing readable code.
Hello,
Can anybody please tell me how to run the trained model on new images?
Regards,
Arnab Banerjee
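Roughly, running the trained model on a new image means resizing it to the network's input resolution, restoring the latest checkpoint, and evaluating the output tensor. Here is a minimal preprocessing sketch (assuming the 228x304 input size used for training; the checkpoint-restoring lines in the trailing comment are illustrative, not taken verbatim from the repo):

```python
import numpy as np

IMAGE_H, IMAGE_W = 228, 304  # network input size (assumed from the training setup)

def preprocess(image):
    """Nearest-neighbour resize to the network input size and cast to float32.

    `image` is an H x W x 3 uint8 array; the trained graph expects a batch of
    float32 images of shape (1, 228, 304, 3).
    """
    h, w = image.shape[:2]
    rows = np.arange(IMAGE_H) * h // IMAGE_H  # source row for each output row
    cols = np.arange(IMAGE_W) * w // IMAGE_W  # source column for each output column
    resized = image[rows][:, cols].astype(np.float32)
    return resized[np.newaxis]  # add batch dimension

# Restoring and running the checkpoint would then look roughly like:
#   saver = tf.train.Saver()
#   saver.restore(sess, tf.train.latest_checkpoint('checkpoints/'))
#   depth = sess.run(logits, feed_dict={images: preprocess(img),
#                                       keep_conv: 1.0, keep_hidden: 1.0})
```

Note that both dropout keep probabilities should be 1.0 at test time.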
Where is the coarse data during training? The coarse folder is empty.
Can I test on video after training?
Thank you.
Hi again,
In the file model.py at line 46, shouldn't:
cost = tf.reduce_mean(sum_square_d / 55.0*74.0 - 0.5*sqare_sum_d / math.pow(55*74, 2))
be replaced by
cost = tf.reduce_mean(sum_square_d / (55.0*74.0) - 0.5*sqare_sum_d / math.pow(55*74, 2))
to match equation (4) from the paper? (I think you just forgot the parentheses.)
Best,
Clement
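For anyone skimming: Python's `/` and `*` have equal precedence and associate left to right, so the two expressions differ by a factor of 74². A quick check with a placeholder value:

```python
# Demonstrating the precedence issue in the loss term: without parentheses,
# d / 55.0 * 74.0 means (d / 55.0) * 74.0, not d / (55.0 * 74.0).
sum_square_d = 1.0  # placeholder scalar; in model.py this is a tensor

wrong = sum_square_d / 55.0 * 74.0    # (1 / 55) * 74  -> ~1.345
right = sum_square_d / (55.0 * 74.0)  # 1 / 4070       -> ~0.000246

print(wrong, right)
```

The ratio wrong/right is 74*74 = 5476, which inflates the first loss term dramatically.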
Dear sir,
I wonder whether you have a model pretrained on KITTI?
Hi
I'm reading your code. I have a few questions regarding the depth map.
In the .mat file of the dataset, the depth is in meters and ranges from 0 to 10. When you convert the distances to pixel values in convert_mat_to_img.py, you normalize each depth image by its own maximum depth value and then multiply the normalized distance by 255. You then train the model with labels equal to the PNG pixel values divided by 255, which is not the distance but the per-image normalized distance. Therefore the model is not regressing on distances but on normalized distances. Shouldn't it regress on the true distance?
I think you should normalize the distance by 10, the maximum depth (I checked with Python that the maximum depth in the NYU2 dataset is 9.99547 meters). Then the PNG image could be converted back to the true depth value in meters for use as labels.
Meanwhile, is invalid_depth needed in the code? From my understanding it indicates the sign of the depth, but can the depth values be negative?
By the way, for the scale-invariant loss, is the 0.5 in the following code extraneous?
cost = tf.reduce_mean(sum_square_d / 55.0*74.0 - 0.5*sqare_sum_d / math.pow(55*74, 2))
There is no 0.5 in formula (3) of the paper.
Is my understanding right?
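Under the fixed-maximum scheme proposed above, the PNG labels become invertible back to metric depth. A sketch, assuming the depth PNGs were written as round(depth / 10 * 255) with a fixed 10 m maximum (the 9.99547 m figure is the thread's own measurement):

```python
import numpy as np

MAX_DEPTH_M = 10.0  # assumed NYU2 maximum; the thread measures 9.99547 m

def png_to_meters(png):
    # Convert an 8-bit depth PNG back to metric depth in meters, assuming it
    # was encoded with a fixed global maximum rather than a per-image maximum.
    return png.astype(np.float32) / 255.0 * MAX_DEPTH_M

png = np.array([0, 128, 255], dtype=np.uint8)
print(png_to_meters(png))  # 0 m, ~5.02 m, 10 m
```

With per-image normalization, this inversion is impossible without storing each image's maximum.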
In prepare_data.py at line 40:
if not os.path.exists('train.csv'):
    os.remove('train.csv')
Is this correct? When I run the program for the first time, it tells me that no such file 'train.csv' exists.
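The check does look inverted: os.remove should only run when the file actually exists, otherwise it raises FileNotFoundError. A corrected sketch of the guard, wrapped in a function for clarity:

```python
import os

def remove_stale_csv(path='train.csv'):
    # The intent in prepare_data.py is presumably to delete a stale train.csv
    # before regenerating it; deleting must be conditional on existence,
    # i.e. `if os.path.exists(...)`, not `if not os.path.exists(...)`.
    if os.path.exists(path):
        os.remove(path)
```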
if REFINE_TRAIN:
    print("refine train.")
    coarse = model.inference(images, keep_conv, trainable=False)
    logits = model.inference_refine(images, coarse, keep_conv, keep_hidden)
else:
    print("coarse train.")
    logits = model.inference(images, keep_conv, keep_hidden)
In this code, why are keep_conv and keep_hidden used in inference_refine and inference?
Why could reuse and trainable be replaced by keep_conv and keep_hidden?
And why keep_conv = 0.8 and keep_hidden = 0.5?
thanks,
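For context (an interpretation, not from the repo's docs): keep_conv and keep_hidden look like dropout keep probabilities passed to tf.nn.dropout for the convolutional and fully connected layers respectively, where each activation survives with probability keep_prob and is scaled by 1/keep_prob so the expected value is unchanged. A numpy sketch of that behavior:

```python
import numpy as np

def dropout(x, keep_prob, rng):
    # Inverted dropout as implemented by tf.nn.dropout: keep each activation
    # with probability keep_prob, scale the kept ones by 1/keep_prob so the
    # expectation is preserved. keep_conv=0.8 / keep_hidden=0.5 would then be
    # these keep probabilities (0.5 for fully connected layers is a common
    # heuristic, not something derived in the paper).
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
out = dropout(np.ones(100000), 0.5, rng)
print(out.mean())  # close to 1.0: the expectation is unchanged
```

At test time one would feed 1.0 for both placeholders so no units are dropped.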
According to the paper, we need to train the coarse layers first, then fix them and train the refine part. I think in this code we only need to set the flag REFINE_TRAIN to False for the first step and then REFINE_TRAIN=True for the second step, right?
However, after I set REFINE_TRAIN=True, I found that all variables from the coarse network were still trainable. I think this is because the trainable flag in the function '_variable_on_gpu' in model_part.py is ignored.
Another problem is the learning rate. According to the paper, the learning rates are 0.001 for coarse convolutional layers 1-5, 0.1 for coarse fully connected layers 6 and 7, 0.001 for fine layers 1 and 3, and 0.01 for fine layer 2. But the initial learning rate is set to 0.0001 for all layers in the code. With this learning rate I cannot get a good result even after training for more than two days, compared with the performance reported in the paper. So I'm wondering: has anyone gotten a good result with this code, and how did you train the network to obtain it?
At last, thanks for providing such a clean and readable implementation :)
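One possible workaround, if trainable=False really is dropped in _variable_on_gpu, is to restrict the optimizer's var_list to the refine variables by name prefix instead of relying on the trainable flag. A toy illustration of the filtering (the variable names here are hypothetical, and the TensorFlow call in the trailing comment is a sketch, not repo code):

```python
def refine_var_filter(var_names):
    # Keep only variables belonging to the refinement network, identified by
    # a name prefix; everything under 'coarse*' is then effectively frozen
    # because the optimizer never receives it.
    return [v for v in var_names if v.startswith('fine')]

all_vars = ['coarse1/weights', 'coarse7/biases', 'fine1/weights', 'fine3/biases']
print(refine_var_filter(all_vars))  # only the fine-network variables remain

# In TensorFlow this would look roughly like:
#   refine_vars = [v for v in tf.trainable_variables()
#                  if v.op.name.startswith('fine')]
#   train_op = optimizer.minimize(loss, var_list=refine_vars)
```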
Hi,
First of all, thanks for this really clean and easy-to-read code. I tried to train the network using task.py, but I have several questions:
Are the pictures output in the "predict_refine_..." folders really validation pictures? I tested my trained net on other unseen pictures and the results are not as good (actually they are pretty bad).
If I understood correctly (also checking issue #1), the script trains everything together, while in the paper the coarse network is trained first before the fine network (with the coarse predictions frozen). Is that a big issue?
I haven't seen the data augmentation part (section 3.4 in the paper) in the code; will its absence impact the results?
Finally, in task.py MAX_STEPS is set to 10000000, but I couldn't run that number of epochs (I reached 4140). How many are needed to get good results?
Best,
Clement
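On the augmentation question: section 3.4's transforms (scale, rotation, color, flips) have to be applied jointly to the RGB image and its depth map so the two stay aligned. A minimal numpy sketch of just the horizontal flip, as an illustration (not code from this repo):

```python
import numpy as np

def random_flip(image, depth, rng):
    # One of the section 3.4 augmentations: mirror the RGB image and its
    # depth map together with probability 0.5. Scale and rotation are
    # analogous but also require rescaling/masking the depth values.
    if rng.random() < 0.5:
        return image[:, ::-1], depth[:, ::-1]
    return image, depth
```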
Hi, I'm trying to train this model, but I don't know how to create the train.csv file. Can you provide an example of 'train.csv'? Best regards!
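For what it's worth, train.csv appears to be a two-column CSV pairing each RGB image with its depth PNG, one pair per line. A small generator sketch (the paths below are made up for illustration; check the actual filenames against what convert_mat_to_img.py writes out):

```python
import csv
import os
import tempfile

def write_train_csv(pairs, out_path):
    # Each line pairs an RGB image with its depth PNG, e.g.:
    #   data/nyu_datasets/00000.jpg,data/nyu_datasets/00000.png
    # (format inferred from the input pipeline, not documented in the repo)
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f)
        for rgb, depth in pairs:
            writer.writerow([rgb, depth])

path = os.path.join(tempfile.mkdtemp(), 'train.csv')
write_train_csv([('data/nyu_datasets/00000.jpg', 'data/nyu_datasets/00000.png')], path)
```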
I get an error while running the task.py file:
variables_averages_op = variable_averages.apply(tf.trainable_variables())
File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/moving_averages.py", line 393, in apply
var.name)
TypeError: The variables must be half, float, or double: Variable:0
Thank you for your code.
I am testing your code, but the training loss doesn't decrease.
Perhaps my experiment is wrong (I changed a few lines of code to work with TensorFlow v1.0).
About how many epochs does this training require?
In my run, the training loss stays around 1600-2500 in the first 0-10 epochs.
Would you give me some advice?
In TensorFlow v1.0 we can't use the functions below:
tf.mul() -> tf.multiply()
tf.sub() -> tf.subtract()
tf.concat(3, [fine1_dropout, coarse7_output]) -> tf.concat([fine1_dropout, coarse7_output], 3)
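A throwaway helper that applies the first two renames to source text; tf.concat also swapped its argument order in 1.0, which a plain string replace cannot fix safely, so that one is only noted in a comment:

```python
# TF 0.x -> 1.0 renames from the list above; applied to a source string.
RENAMES = [
    ('tf.mul(', 'tf.multiply('),
    ('tf.sub(', 'tf.subtract('),
]

def port_to_tf1(src):
    for old, new in RENAMES:
        src = src.replace(old, new)
    # tf.concat(axis, values) became tf.concat(values, axis) in 1.0; the
    # argument order change has to be edited by hand, e.g.
    #   tf.concat(3, [a, b])  ->  tf.concat([a, b], 3)
    return src

print(port_to_tf1('d = tf.sub(x, y); m = tf.mul(d, d)'))
```

Running this over each .py file in the repo is one quick way to do the port.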
Hi, how can I run the trained model directly?