Hi,
Many thanks for the incredible work at genforce and for making it available to the world.
I got an unexpected exception when trying to train the encoder. Training ran fine for 2000 iterations and then failed in what looks like the code that creates the image snapshots (training/training_loop_encoder.py, line 212).
If I had to guess, perhaps something is wrong with the .tfrecord files I'm using? I got no errors training the generator, though. Is it a requirement to use dataset_tool from the idinvert repo to generate the .tfrecord files?
Here's log.txt and the traceback:
dnnlib: Running training.training_loop_encoder.training_loop() on localhost...
Constructing networks...
E_gpu0 Params OutputShape WeightShape
Input_img - (?, 3, 256, 256) -
FromImg 4992 (?, 64, 256, 256) (5, 5, 3, 64)
Downscale2D - (?, 64, 128, 128) -
E_block_0 119616 (?, 128, 128, 128) (3, 3, 64, 64)
Downscale2D_1 - (?, 128, 64, 64) -
E_block_1 476800 (?, 256, 64, 64) (3, 3, 128, 128)
Downscale2D_2 - (?, 256, 32, 32) -
E_block_2 1903872 (?, 512, 32, 32) (3, 3, 256, 256)
Downscale2D_3 - (?, 512, 16, 16) -
E_block_3 4721664 (?, 512, 16, 16) (3, 3, 512, 512)
Downscale2D_4 - (?, 512, 8, 8) -
E_block_4 4721664 (?, 512, 8, 8) (3, 3, 512, 512)
Downscale2D_5 - (?, 512, 4, 4) -
E_block_5 4721664 (?, 512, 4, 4) (3, 3, 512, 512)
Latent_out 58741760 (?, 7168) (8192, 7168)
Total 75412032
Gs Params OutputShape WeightShape
latents_in - (?, 512) -
labels_in - (?, 0) -
lod - () -
dlatent_avg - (512,) -
G_mapping/latents_in - (?, 512) -
G_mapping/labels_in - (?, 0) -
G_mapping/PixelNorm - (?, 512) -
G_mapping/Dense0 262656 (?, 512) (512, 512)
G_mapping/Dense1 262656 (?, 512) (512, 512)
G_mapping/Dense2 262656 (?, 512) (512, 512)
G_mapping/Dense3 262656 (?, 512) (512, 512)
G_mapping/Dense4 262656 (?, 512) (512, 512)
G_mapping/Dense5 262656 (?, 512) (512, 512)
G_mapping/Dense6 262656 (?, 512) (512, 512)
G_mapping/Dense7 3677184 (?, 7168) (512, 7168)
G_mapping/Reshape - (?, 14, 512) -
G_mapping/dlatents_out - (?, 14, 512) -
Truncation - (?, 14, 512) -
G_synthesis/dlatents_in - (?, 14, 512) -
G_synthesis/4x4/Const 534528 (?, 512, 4, 4) (512,)
G_synthesis/4x4/Conv 2885632 (?, 512, 4, 4) (3, 3, 512, 512)
G_synthesis/ToRGB_lod6 1539 (?, 3, 4, 4) (1, 1, 512, 3)
G_synthesis/8x8/Conv0_up 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/8x8/Conv1 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/ToRGB_lod5 1539 (?, 3, 8, 8) (1, 1, 512, 3)
G_synthesis/Upscale2D - (?, 3, 8, 8) -
G_synthesis/Grow_lod5 - (?, 3, 8, 8) -
G_synthesis/16x16/Conv0_up 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/16x16/Conv1 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/ToRGB_lod4 1539 (?, 3, 16, 16) (1, 1, 512, 3)
G_synthesis/Upscale2D_1 - (?, 3, 16, 16) -
G_synthesis/Grow_lod4 - (?, 3, 16, 16) -
G_synthesis/32x32/Conv0_up 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/32x32/Conv1 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/ToRGB_lod3 1539 (?, 3, 32, 32) (1, 1, 512, 3)
G_synthesis/Upscale2D_2 - (?, 3, 32, 32) -
G_synthesis/Grow_lod3 - (?, 3, 32, 32) -
G_synthesis/64x64/Conv0_up 1442816 (?, 256, 64, 64) (3, 3, 512, 256)
G_synthesis/64x64/Conv1 852992 (?, 256, 64, 64) (3, 3, 256, 256)
G_synthesis/ToRGB_lod2 771 (?, 3, 64, 64) (1, 1, 256, 3)
G_synthesis/Upscale2D_3 - (?, 3, 64, 64) -
G_synthesis/Grow_lod2 - (?, 3, 64, 64) -
G_synthesis/128x128/Conv0_up 426496 (?, 128, 128, 128) (3, 3, 256, 128)
G_synthesis/128x128/Conv1 279040 (?, 128, 128, 128) (3, 3, 128, 128)
G_synthesis/ToRGB_lod1 387 (?, 3, 128, 128) (1, 1, 128, 3)
G_synthesis/Upscale2D_4 - (?, 3, 128, 128) -
G_synthesis/Grow_lod1 - (?, 3, 128, 128) -
G_synthesis/256x256/Conv0_up 139520 (?, 64, 256, 256) (3, 3, 128, 64)
G_synthesis/256x256/Conv1 102656 (?, 64, 256, 256) (3, 3, 64, 64)
G_synthesis/ToRGB_lod0 195 (?, 3, 256, 256) (1, 1, 64, 3)
G_synthesis/Upscale2D_5 - (?, 3, 256, 256) -
G_synthesis/Grow_lod0 - (?, 3, 256, 256) -
G_synthesis/images_out - (?, 3, 256, 256) -
G_synthesis/lod - () -
G_synthesis/noise0 - (1, 1, 4, 4) -
G_synthesis/noise1 - (1, 1, 4, 4) -
G_synthesis/noise2 - (1, 1, 8, 8) -
G_synthesis/noise3 - (1, 1, 8, 8) -
G_synthesis/noise4 - (1, 1, 16, 16) -
G_synthesis/noise5 - (1, 1, 16, 16) -
G_synthesis/noise6 - (1, 1, 32, 32) -
G_synthesis/noise7 - (1, 1, 32, 32) -
G_synthesis/noise8 - (1, 1, 64, 64) -
G_synthesis/noise9 - (1, 1, 64, 64) -
G_synthesis/noise10 - (1, 1, 128, 128) -
G_synthesis/noise11 - (1, 1, 128, 128) -
G_synthesis/noise12 - (1, 1, 256, 256) -
G_synthesis/noise13 - (1, 1, 256, 256) -
images_out - (?, 3, 256, 256) -
Total 29500757
D Params OutputShape WeightShape
images_in - (?, 3, 256, 256) -
labels_in - (?, 0) -
lod - () -
FromRGB_lod0 256 (?, 64, 256, 256) (1, 1, 3, 64)
256x256/Conv0 36928 (?, 64, 256, 256) (3, 3, 64, 64)
256x256/Conv1_down 73856 (?, 128, 128, 128) (3, 3, 64, 128)
Downscale2D - (?, 3, 128, 128) -
FromRGB_lod1 512 (?, 128, 128, 128) (1, 1, 3, 128)
Grow_lod0 - (?, 128, 128, 128) -
128x128/Conv0 147584 (?, 128, 128, 128) (3, 3, 128, 128)
128x128/Conv1_down 295168 (?, 256, 64, 64) (3, 3, 128, 256)
Downscale2D_1 - (?, 3, 64, 64) -
FromRGB_lod2 1024 (?, 256, 64, 64) (1, 1, 3, 256)
Grow_lod1 - (?, 256, 64, 64) -
64x64/Conv0 590080 (?, 256, 64, 64) (3, 3, 256, 256)
64x64/Conv1_down 1180160 (?, 512, 32, 32) (3, 3, 256, 512)
Downscale2D_2 - (?, 3, 32, 32) -
FromRGB_lod3 2048 (?, 512, 32, 32) (1, 1, 3, 512)
Grow_lod2 - (?, 512, 32, 32) -
32x32/Conv0 2359808 (?, 512, 32, 32) (3, 3, 512, 512)
32x32/Conv1_down 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
Downscale2D_3 - (?, 3, 16, 16) -
FromRGB_lod4 2048 (?, 512, 16, 16) (1, 1, 3, 512)
Grow_lod3 - (?, 512, 16, 16) -
16x16/Conv0 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
16x16/Conv1_down 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
Downscale2D_4 - (?, 3, 8, 8) -
FromRGB_lod5 2048 (?, 512, 8, 8) (1, 1, 3, 512)
Grow_lod4 - (?, 512, 8, 8) -
8x8/Conv0 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
8x8/Conv1_down 2359808 (?, 512, 4, 4) (3, 3, 512, 512)
Downscale2D_5 - (?, 3, 4, 4) -
FromRGB_lod6 2048 (?, 512, 4, 4) (1, 1, 3, 512)
Grow_lod5 - (?, 512, 4, 4) -
4x4/MinibatchStddev - (?, 513, 4, 4) -
4x4/Conv 2364416 (?, 512, 4, 4) (3, 3, 513, 512)
4x4/Dense0 4194816 (?, 512) (8192, 512)
4x4/Dense1 513 (?, 1) (512, 1)
scores_out - (?, 1) -
Total 23052353
Building Graph on GPU 0
Building Graph on GPU 1
Building testing graph...
Getting training data...
Optimization starts!!!
Iter: 000000 recon_loss: 0.6908 adv_loss: 0.1358 d_r_loss: 0.3838 d_f_loss: 0.4184 d_reg: 0.0683 time:21s
Iter: 000050 recon_loss: 0.4536 adv_loss: 0.1509 d_r_loss: 0.3033 d_f_loss: 0.2664 d_reg: 0.1203 time:2m 03s
Iter: 000100 recon_loss: 0.3741 adv_loss: 0.1978 d_r_loss: 0.1764 d_f_loss: 0.1475 d_reg: 0.1859 time:3m 42s
Iter: 000150 recon_loss: 0.4159 adv_loss: 0.1959 d_r_loss: 0.2112 d_f_loss: 0.1248 d_reg: 0.2401 time:5m 21s
Iter: 000200 recon_loss: 0.4182 adv_loss: 0.1677 d_r_loss: 0.1778 d_f_loss: 0.2468 d_reg: 0.2434 time:7m 00s
Iter: 000250 recon_loss: 0.4088 adv_loss: 0.1463 d_r_loss: 0.2826 d_f_loss: 0.2760 d_reg: 0.2503 time:8m 39s
Iter: 000300 recon_loss: 0.4374 adv_loss: 0.1626 d_r_loss: 0.2324 d_f_loss: 0.2215 d_reg: 0.2145 time:10m 18s
Iter: 000350 recon_loss: 0.4204 adv_loss: 0.1349 d_r_loss: 0.3165 d_f_loss: 0.2742 d_reg: 0.2216 time:11m 57s
Iter: 000400 recon_loss: 0.4152 adv_loss: 0.1598 d_r_loss: 0.2791 d_f_loss: 0.2067 d_reg: 0.2191 time:13m 36s
Iter: 000450 recon_loss: 0.4083 adv_loss: 0.1477 d_r_loss: 0.3626 d_f_loss: 0.2404 d_reg: 0.2136 time:15m 15s
Iter: 000500 recon_loss: 0.4392 adv_loss: 0.1666 d_r_loss: 0.3399 d_f_loss: 0.2035 d_reg: 0.2288 time:16m 54s
Iter: 000550 recon_loss: 0.4462 adv_loss: 0.1416 d_r_loss: 0.1786 d_f_loss: 0.2402 d_reg: 0.2195 time:18m 33s
Iter: 000600 recon_loss: 0.3907 adv_loss: 0.1528 d_r_loss: 0.2858 d_f_loss: 0.2014 d_reg: 0.2031 time:20m 12s
Iter: 000650 recon_loss: 0.4291 adv_loss: 0.1333 d_r_loss: 0.1841 d_f_loss: 0.2675 d_reg: 0.2020 time:21m 51s
Iter: 000700 recon_loss: 0.4557 adv_loss: 0.1238 d_r_loss: 0.2788 d_f_loss: 0.3539 d_reg: 0.2099 time:23m 30s
Iter: 000750 recon_loss: 0.4234 adv_loss: 0.1384 d_r_loss: 0.2545 d_f_loss: 0.2659 d_reg: 0.2164 time:25m 09s
Iter: 000800 recon_loss: 0.4068 adv_loss: 0.1595 d_r_loss: 0.3547 d_f_loss: 0.1850 d_reg: 0.2159 time:26m 51s
Iter: 000850 recon_loss: 0.4170 adv_loss: 0.1254 d_r_loss: 0.3539 d_f_loss: 0.3013 d_reg: 0.2199 time:28m 30s
Iter: 000900 recon_loss: 0.4169 adv_loss: 0.1431 d_r_loss: 0.3396 d_f_loss: 0.2748 d_reg: 0.2101 time:30m 11s
Iter: 000950 recon_loss: 0.4328 adv_loss: 0.1403 d_r_loss: 0.1860 d_f_loss: 0.2507 d_reg: 0.2057 time:31m 50s
Iter: 001000 recon_loss: 0.4246 adv_loss: 0.1395 d_r_loss: 0.2495 d_f_loss: 0.2478 d_reg: 0.2162 time:33m 29s
Iter: 001050 recon_loss: 0.4207 adv_loss: 0.1147 d_r_loss: 0.2124 d_f_loss: 0.3612 d_reg: 0.2228 time:35m 08s
Iter: 001100 recon_loss: 0.3931 adv_loss: 0.1415 d_r_loss: 0.4792 d_f_loss: 0.2361 d_reg: 0.1948 time:36m 47s
Iter: 001150 recon_loss: 0.4295 adv_loss: 0.1231 d_r_loss: 0.2721 d_f_loss: 0.3117 d_reg: 0.2000 time:38m 26s
Iter: 001200 recon_loss: 0.3966 adv_loss: 0.1348 d_r_loss: 0.3569 d_f_loss: 0.2381 d_reg: 0.2061 time:40m 02s
Iter: 001250 recon_loss: 0.4007 adv_loss: 0.1097 d_r_loss: 0.2767 d_f_loss: 0.3597 d_reg: 0.2027 time:41m 41s
Iter: 001300 recon_loss: 0.4392 adv_loss: 0.1316 d_r_loss: 0.2217 d_f_loss: 0.2693 d_reg: 0.2128 time:43m 20s
Iter: 001350 recon_loss: 0.4290 adv_loss: 0.1236 d_r_loss: 0.3017 d_f_loss: 0.3025 d_reg: 0.2032 time:44m 59s
Iter: 001400 recon_loss: 0.4203 adv_loss: 0.1271 d_r_loss: 0.3310 d_f_loss: 0.2832 d_reg: 0.2099 time:46m 38s
Iter: 001450 recon_loss: 0.4115 adv_loss: 0.1171 d_r_loss: 0.3016 d_f_loss: 0.3156 d_reg: 0.2031 time:48m 17s
Iter: 001500 recon_loss: 0.3993 adv_loss: 0.1201 d_r_loss: 0.2723 d_f_loss: 0.3320 d_reg: 0.1909 time:49m 56s
Iter: 001550 recon_loss: 0.3839 adv_loss: 0.1343 d_r_loss: 0.3669 d_f_loss: 0.2735 d_reg: 0.1945 time:51m 35s
Iter: 001600 recon_loss: 0.4208 adv_loss: 0.1104 d_r_loss: 0.2530 d_f_loss: 0.3461 d_reg: 0.1930 time:53m 17s
Iter: 001650 recon_loss: 0.4352 adv_loss: 0.1229 d_r_loss: 0.2545 d_f_loss: 0.3369 d_reg: 0.1977 time:54m 56s
Iter: 001700 recon_loss: 0.4403 adv_loss: 0.1192 d_r_loss: 0.3191 d_f_loss: 0.3290 d_reg: 0.1944 time:56m 35s
Iter: 001750 recon_loss: 0.3838 adv_loss: 0.1265 d_r_loss: 0.2571 d_f_loss: 0.2577 d_reg: 0.2040 time:58m 14s
Iter: 001800 recon_loss: 0.4028 adv_loss: 0.1437 d_r_loss: 0.3599 d_f_loss: 0.2610 d_reg: 0.1905 time:59m 53s
Iter: 001850 recon_loss: 0.4085 adv_loss: 0.1089 d_r_loss: 0.2759 d_f_loss: 0.3384 d_reg: 0.1958 time:1h 01m 32s
Iter: 001900 recon_loss: 0.3930 adv_loss: 0.1114 d_r_loss: 0.2733 d_f_loss: 0.3467 d_reg: 0.2007 time:1h 03m 11s
Iter: 001950 recon_loss: 0.4068 adv_loss: 0.1112 d_r_loss: 0.2926 d_f_loss: 0.3603 d_reg: 0.1914 time:1h 04m 50s
Iter: 002000 recon_loss: 0.3942 adv_loss: 0.1057 d_r_loss: 0.3776 d_f_loss: 0.3608 d_reg: 0.1978 time:1h 06m 29s
Traceback (most recent call last):
File "train_encoder.py", line 69, in <module>
main()
File "train_encoder.py", line 65, in main
dnnlib.submit_run(**kwargs)
File "/root/idinvert/dnnlib/submission/submit.py", line 290, in submit_run
run_wrapper(submit_config)
File "/root/idinvert/dnnlib/submission/submit.py", line 242, in run_wrapper
util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs)
File "/root/idinvert/dnnlib/util.py", line 257, in call_func_by_name
return func_obj(*args, **kwargs)
File "/root/idinvert/training/training_loop_encoder.py", line 212, in training_loop
recon = sess.run(fake_X_val, feed_dict={real_test: batch_images_test})
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (2, 3, 128, 128) for Tensor 'Input/real_image_test:0', which has shape '(2, 3, 256, 256)'
Here's submit_config.txt:
{ 'ask_confirmation': False,
'batch_size': 32,
'batch_size_test': 2,
'host_name': 'localhost',
'image_size': 256,
'num_gpus': 2,
'print_info': False,
'run_desc': 'stylegan-encoder-2gpu-256x256-ffhq',
'run_dir': 'results/00006-stylegan-encoder-2gpu-256x256-ffhq',
'run_dir_extra_files': None,
'run_dir_ignore': ['__pycache__', '*.pyproj', '*.sln', '*.suo', '.cache', '.idea', '.vs', '.vscode', 'results', 'datasets', 'cache'],
'run_dir_root': 'results',
'run_func_kwargs': { 'D_loss_args': {'func_name': 'training.loss_encoder.D_logistic_simplegp', 'r1_gamma': 10.0},
'D_opt_args': {'beta1': 0.9, 'beta2': 0.99, 'epsilon': 1e-08},
'E_loss_args': {'D_scale': 0.08, 'feature_scale': 5e-05, 'func_name': 'training.loss_encoder.E_loss', 'perceptual_img_size': 256},
'E_opt_args': {'beta1': 0.9, 'beta2': 0.99, 'epsilon': 1e-08},
'Encoder_args': {'func_name': 'training.networks_encoder.Encoder'},
'dataset_args': {'data_test': '/root/datasets/custom/custom-r07.tfrecords', 'data_train': '/root/datasets/custom/custom-r08.tfrecords'},
'decoder_pkl': {'decoder_pkl': '/root/network-snapshot-015800.pkl'},
'lr_args': {'decay_rate': 0.8, 'decay_step': 3390, 'learning_rate': 0.0001, 'stair': False},
'max_iters': 13562,
'mirror_augment': True,
'tf_config': {'rnd.np_random_seed': 1000}},
'run_func_name': 'training.training_loop_encoder.training_loop',
'run_id': 6,
'run_name': '00006-stylegan-encoder-2gpu-256x256-ffhq',
'submit_target': <SubmitTarget.LOCAL: 1>,
'task_name': 'root-00006-stylegan-encoder-2gpu-256x256-ffhq',
'user_name': 'root'}
The command line parameters I used are:
python train_encoder.py ~/datasets/custom/custom-r08.tfrecords ~/datasets/custom/custom-r07.tfrecords ~/network-snapshot-015800.pkl --num_gpus 2
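A side note on the shapes in the error, offered as a rough sanity check rather than a confirmed diagnosis. Assuming the files follow the StyleGAN dataset_tool naming, where a -rNN suffix means images of size 2^NN, custom-r07.tfrecords would hold 128x128 images. That matches the (2, 3, 128, 128) in the ValueError, while image_size is 256. The helper names below are hypothetical, and the sketch only parses file names; it does not read the records themselves.

```python
import re


def resolution_from_tfrecord_name(path):
    """Guess the image resolution from a '-rNN.tfrecords' suffix.

    Assumption: the file was written with the StyleGAN-style naming,
    where NN is log2 of the image side length. Returns None otherwise.
    """
    match = re.search(r"-r(\d+)\.tfrecords$", path)
    if match is None:
        return None
    return 2 ** int(match.group(1))


def check_dataset_resolutions(image_size, **paths):
    """Map each named path to (guessed_resolution, matches_image_size)."""
    report = {}
    for name, path in paths.items():
        resolution = resolution_from_tfrecord_name(path)
        report[name] = (resolution, resolution == image_size)
    return report


if __name__ == "__main__":
    # Paths and image_size copied from submit_config.txt above.
    report = check_dataset_resolutions(
        256,
        data_train="/root/datasets/custom/custom-r08.tfrecords",
        data_test="/root/datasets/custom/custom-r07.tfrecords",
    )
    for name, (resolution, ok) in report.items():
        print(name, resolution, "OK" if ok else "MISMATCH")
```

If that naming assumption holds, the test set I passed would be the 128x128 level of the dataset, so passing a 256x256 file as the second argument might be enough to avoid the shape mismatch.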