
xmcgan_image_generation's Introduction

Cross-Modal Contrastive Learning for Text-to-Image Generation

This repository hosts the open source JAX implementation of XMC-GAN.

Setup instructions

Environment

Set up and activate a virtualenv (the required libraries are installed in the steps below):

virtualenv venv
source venv/bin/activate

Add the XMC-GAN library to PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:/home/path/to/xmcgan/root/

JAX Installation

Note: Please follow the official JAX instructions for installing a GPU compatible version of JAX.

Other Dependencies

After installing JAX, install the remaining dependencies with:

pip install -r requirements.txt

Preprocess COCO-2014

To create the training and eval data, first create a directory for it. By default, the training scripts expect to save results in data/ in the base directory.

mkdir data/

The TFRecords required for training and validation on COCO-2014 can be created by running a preprocessing script over the TFDS coco_captions dataset:

python preprocess_data.py

This may take a while to complete, as it runs a pretrained BERT model over the captions and stores the embeddings. With a GPU, it takes about 2.5 hours for the train split and 1 hour for the validation split. Once it is done, the train and validation TFRecord files will be saved in the data/ directory. The train files require around 58 GB of disk space, and the validation files require around 29 GB.
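Since preprocessing writes a large amount of data, it may be worth checking free disk space before starting. The following is a minimal sketch, not part of the repository: the helper name has_room_for_tfrecords is hypothetical, the sizes are the approximate figures quoted above, and the path "." is an assumption that stands in for wherever data/ lives.

```python
import shutil

# Approximate sizes quoted above, in GB: train and validation TFRecords.
TRAIN_GB = 58
VAL_GB = 29


def has_room_for_tfrecords(path=".", required_gb=TRAIN_GB + VAL_GB):
    """Return True if the filesystem holding `path` has enough free space.

    `path` is an assumption: point it at the directory that will hold data/.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


needed = TRAIN_GB + VAL_GB
print(f"Need roughly {needed} GB; enough free space: {has_room_for_tfrecords()}")
```

The check is only a rough guard, because the quoted sizes are approximate and temporary files may need additional room.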

Note: If you run into an error related to TensorFlow gfile, one workaround is to edit site-packages/bert/tokenization.py and change tf.gfile.GFile to tf.io.gfile.GFile. For more details, refer to the following link.

If you run into a tensorflow.python.framework.errors_impl.ResourceExhaustedError about having too many open files, you may have to increase the machine's open file limits. To do so, open the limit configuration file for editing:

vi /etc/security/limits.conf

and append the following lines to the end of the file:

*         hard    nofile      500000
*         soft    nofile      500000
root      hard    nofile      500000
root      soft    nofile      500000

You may have to adjust the limit values depending on your machine. You will need to log out and log back in for these values to take effect.
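To confirm which limits a process actually sees after logging back in, the values can be read from Python. This is a small sketch using the standard-library resource module, which is available on Unix-like systems only; the helper name open_file_limits is hypothetical.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
```

If the soft limit is still low, the new settings have probably not been picked up by the current session.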

Download Pretrained ResNet

To train XMC-GAN, we need a network pretrained on ImageNet to extract features. For our purposes, we train a ResNet-50 network for this. To download the weights, run:

gsutil cp gs://gresearch/xmcgan/resnet_pretrained.npy data/

If you would like to pretrain your own network on ImageNet, please refer to the official Flax ImageNet example.

Training

To start a training run, first edit train.sh to specify an appropriate work directory. By default, the script assumes that 8 GPUs are available and runs training on the first 7 GPUs, while test.sh assumes testing will run on the last GPU. After configuring the training job, start an experiment by running the script with bash:

mkdir exp
bash train.sh exp_name &> train.txt

Checkpoints and Tensorboard logs will be saved in /path/to/exp/exp_name. By default, the configs/coco_xmc.py config is used, which runs an experiment for 128px images. This is able to accommodate a batch size of 8 on each GPU, and achieves an FID of around 10.5 - 11.0 with the EMA weights. To reproduce the full results on 256px images in our paper, the full model needs to be run using a 32-core Pod slice of Google Cloud TPU v3 devices.

Evaluation

To run an evaluation job, update test.sh with the correct settings used in the training script. Then, execute

bash test.sh exp_name &> eval.txt

to start an evaluation job. All checkpoints in workdir will be evaluated for FID and Inception Score. If you can spare the GPUs, you can also run train.sh and test.sh in parallel, which will continuously evaluate new checkpoints saved into the work directory. Scores will be written to Tensorboard and output to eval.txt.

Tensorboard

To start a Tensorboard for monitoring training progress, run:

tensorboard --logdir /path/to/exp/exp_name

Citation

If you find this work useful, please consider citing:

@inproceedings{zhang2021cross,
  title={Cross-Modal Contrastive Learning for Text-to-Image Generation},
  author={Zhang, Han and Koh, Jing Yu and Baldridge, Jason and Lee, Honglak and Yang, Yinfei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Disclaimer

Not an official Google product.

xmcgan_image_generation's People

Contributors

kohjingyu

xmcgan_image_generation's Issues

Other version?

How can I convert this to TensorFlow, or save the model as .pb, .h5, or TFLite?

Library version

When I tried to install tensorflow==2.5.0rc0, the following message appeared:

ERROR: Could not find a version that satisfies the requirement tensorflow==2.5.0rc0 (from versions: 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0)
ERROR: No matching distribution found for tensorflow==2.5.0rc0

What can I do?
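The version list in the error message can be inspected directly. The sketch below copies a few entries from that list (abbreviated to the 2.4 to 2.6 range) and shows that the release candidate is absent while stable 2.5.x releases are present. The helper name stable_releases_in_series is hypothetical, and whether the repository works with one of those stable releases is an assumption that nothing on this page confirms.

```python
# A few versions copied from the pip error message above (abbreviated).
available = ["2.4.4", "2.5.0", "2.5.1", "2.5.2", "2.5.3", "2.6.0rc0", "2.6.0"]
requested = "2.5.0rc0"


def stable_releases_in_series(versions, series):
    """Return versions in `series` (e.g. "2.5") that are not release candidates."""
    prefix = series + "."
    return [v for v in versions if v.startswith(prefix) and "rc" not in v]


print(requested in available)                        # False
print(stable_releases_in_series(available, "2.5"))   # ['2.5.0', '2.5.1', '2.5.2', '2.5.3']
```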

Would any generated image samples be released?

I understand that you probably cannot release the pretrained model for now. But for a fair comparison between your method and future work, would you please release the generated samples on the validation sets of LN-COCO, LN-OpenImages and COCO-14?

How to calculate the IoU scores without ground truth bounding box input?

I found out that you used the official code to compute the SOA scores.
But in OP-GAN (https://github.com/ppjh8263/semantic-object-accuracy-for-generative-text-to-image-synthesis/tree/1d07bf250aedf9e1b0c55505eb76c49d60ce0055/SOA) it is described as: "In order to calculate the IoU scores you need to save the "ground truth" information, i.e. the bounding boxes you give your model as input, so we can compare them with the bounding boxes from the detection network." And there is no bounding box input in your model (XMC-GAN).
Could you share the details of your SOA score calculation? Thank you.

Question about the sentence embedding.

Excuse me, in your code the sentence embedding is calculated by averaging the word embeddings along the word_num dimension. In BERT, however, the encoding of the '[CLS]' token can represent the whole sentence. Why not use that as the sentence embedding? Which one performs better?
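The two pooling options contrasted in this question can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical, the toy vectors are made up, and whether the repository's averaging includes the [CLS] position or skips padding is an assumption that the question does not settle.

```python
def mean_pool(token_embeddings, num_tokens):
    """Average the first `num_tokens` token vectors, dimension by dimension."""
    valid = token_embeddings[:num_tokens]
    return [sum(dim) / len(valid) for dim in zip(*valid)]


def cls_pool(token_embeddings):
    """Use the vector at position 0, where BERT places the [CLS] token."""
    return token_embeddings[0]


# Toy example: 3 real tokens plus 1 padding row, embedding size 2.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
print(mean_pool(tokens, 3))  # [3.0, 4.0]
print(cls_pool(tokens))      # [1.0, 2.0]
```

Which of the two gives better generation quality is an empirical question that this sketch does not answer.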

Inference script to generate images from list of raw captions

Hi again :)),

Could you provide the inference code to allow me to generate images from a list of raw captions?
I'm not yet familiar with Flax code, so I think having this script would save me and other people a lot of time compared with re-coding it ourselves.

Thanks.

About jax.PRNGKey: error when running the code

Excuse me. When I tried to run this code, I ran into a problem with this line:

generator_variables = generator(train=False).init(g_rng, (inputs, z))

The error is flax.errors.InvalidRngError: rngs should be a dictionary mapping strings to jax.PRNGKey. In fact, g_rng is an array of shape [2,].
Can anyone help me solve this problem?

By the way, I have configured CUDA, but it still tells me that CUDA was not found:
xla_bridge.py:232] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I even tested the GPU with TensorFlow, and it works there. I don't know what the problem is.

The implementation details in the code.

Excuse me, line 216 in xmc_net.py, "x = dense_fn(self.gf_dim * 16 * 4 * 4)(z)", feeds the noise "z" into the linear layer. However, Table 7 in the supplementary material shows that the input is the concatenation of the noise and the reshaped condition. I wonder which one is right.

error: train.sh: line 24: 45523 Segmentation fault (core dumped)

error:
I1216 05:03:33.303731 140638140207296 utils.py:31] Checkpoint.restore_or_initialize() ...
I1216 05:03:33.304307 140638140207296 checkpoint.py:301] No checkpoint specified. Restore the latest checkpoint.
I1216 05:03:33.304460 140638140207296 utils.py:31] MultihostCheckpoint.get_latest_checkpoint_to_restore_from() ...
I1216 05:03:33.312287 140638140207296 checkpoint.py:430] Checked checkpoint base_directories: ['path/to/exp/exp_name/checkpoints-0'] - common_numbers={1} - exclusive_numbers=set()
I1216 05:03:33.312516 140638140207296 utils.py:41] MultihostCheckpoint.get_latest_checkpoint_to_restore_from() finished after 0.01s.
I1216 05:03:33.312650 140638140207296 checkpoint.py:307] Restoring checkpoint: path/to/exp/exp_name/checkpoints-0/ckpt-1
2021-12-16 05:03:33.316385: W ./tensorflow/core/framework/dataset.h:550] Failed precondition: StatelessRandomGetKeyCounter is stateful.
I1216 05:03:45.659061 140638140207296 checkpoint.py:312] Restored save_counter=1 restored_checkpoint=path/to/exp/exp_name/checkpoints-0/ckpt-1
I1216 05:03:45.659443 140638140207296 utils.py:41] Checkpoint.restore_or_initialize() finished after 12.36s.
I1216 05:03:47.525738 140590360545024 logging_writer.py:56] Hyperparameters: {'architecture': 'xmc_net', 'batch_norm_group_size': -1, 'batch_size': 8, 'beta1': 0.5, 'beta2': 0.999, 'checkpoint_every_steps': 5000, 'coco_version': '2014', 'cond_size': 16, 'd_lr': 0.0004, 'd_spectral_norm': True, 'd_step_per_g_step': 14, 'data_dir': 'data/', 'dataset': 'mscoco', 'df_dim': 96, 'dtype': 'bfloat16', 'eval_avg_num': 3, 'eval_batch_size': 4, 'eval_every_steps': 1000, 'eval_num': 30000, 'g_lr': 0.0001, 'g_spectral_norm': False, 'gamma_for_g': 15, 'gf_dim': 96, 'image_contrastive': True, 'image_size': 128, 'log_loss_every_steps': 1000, 'model_name': 'xmc', 'num_epochs': 500, 'num_train_steps': -1, 'polyak_decay': 0.999, 'pretrained_image_contrastive': True, 'return_filename': False, 'return_text': False, 'seed': 42, 'sentence_contrastive': True, 'show_num': 64, 'shuffle_buffer_size': 1000, 'train_shuffle': True, 'trial': 0, 'word_contrastive': True, 'z_dim': 128}
I1216 05:03:47.528530 140638140207296 train_utils.py:404] Starting training loop at step 1.
/root/yes/envs/py39/lib/python3.9/site-packages/jax/_src/profiler.py:166: UserWarning: StepTraceContext has been renamed to StepTraceAnnotation. This alias will eventually be removed; please update your code.
warnings.warn(
Fatal Python error: Segmentation fault

Thread 0x00007fddbdffb700 (most recent call first):
File "/root/yes/envs/py39/lib/python3.9/concurrent/futures/thread.py", line 75 in _worker
File "/root/yes/envs/py39/lib/python3.9/threading.py", line 910 in run
File "/root/yes/envs/py39/lib/python3.9/threading.py", line 973 in _bootstrap_inner
File "/root/yes/envs/py39/lib/python3.9/threading.py", line 930 in _bootstrap

Thread 0x00007fddbe7fc700 (most recent call first):
File "/root/yes/envs/py39/lib/python3.9/concurrent/futures/thread.py", line 75 in _worker
File "/root/yes/envs/py39/lib/python3.9/threading.py", line 910 in run
File "/root/yes/envs/py39/lib/python3.9/threading.py", line 973 in _bootstrap_inner
File "/root/yes/envs/py39/lib/python3.9/threading.py", line 930 in _bootstrap

Current thread 0x00007fe8de6390c0 (most recent call first):
File "/root/yes/envs/py39/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 1955 in shape
File "<array_function internals>", line 5 in shape
File "/root/yes/envs/py39/lib/python3.9/site-packages/jax/_src/api.py", line 1307 in
File "/root/yes/envs/py39/lib/python3.9/site-packages/jax/_src/api.py", line 1307 in _mapped_axis_size
File "/root/yes/envs/py39/lib/python3.9/site-packages/jax/_src/api.py", line 1633 in f_pmapped
File "/root/yes/envs/py39/lib/python3.9/site-packages/jax/_src/api.py", line 1725 in f_pmapped
File "/root/yes/envs/py39/lib/python3.9/site-packages/jax/_src/traceback_util.py", line 162 in reraise_with_filtered_traceback
File "/xmc_gan/xmcgan/train_utils.py", line 424 in train
File "/xmc_gan/xmcgan/main.py", line 62 in main
File "/root/yes/envs/py39/lib/python3.9/site-packages/absl/app.py", line 251 in _run_main
File "/root/yes/envs/py39/lib/python3.9/site-packages/absl/app.py", line 303 in run
File "/xmc_gan/xmcgan/main.py", line 70 in
File "/root/yes/envs/py39/lib/python3.9/runpy.py", line 87 in _run_code
File "/root/yes/envs/py39/lib/python3.9/runpy.py", line 197 in _run_module_as_main
train.sh: line 24: 45523 Segmentation fault (core dumped) CUDA_VISIBLE_DEVICES="0,1,2,3" python -m xmcgan.main --config="$CONFIG" --mode="train" --workdir="$WORKDIR"

details:
config.batch_size = 8
config.d_step_per_g_step = 14

Have you ever come across this error?

Resume training.

Good day, there!

I think this is for resume guidelines.

contains checkpoint training will be resumed from the latest checkpoint.

In my directory, there are checkpoint folders such as 'checkpoints' and 'checkpoints-0', and inside the 'checkpoints' folder there is another 'checkpoints' folder. Please refer to the screenshot.
image

I can't figure out which folder I should load. Which command should I type? Is this right? bash train.sh exp_name/checkpoints &> train.txt

Thank you :)

Error when training on multi-GPU

I got the following error message when training on multiple GPUs...

Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/u7801832/xmcenv/xmcgan_image_generation/xmcgan/main.py", line 70, in
app.run(main)
File "/home/u7801832/xmcenv/lib/python3.8/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/u7801832/xmcenv/lib/python3.8/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/home/u7801832/xmcenv/xmcgan_image_generation/xmcgan/main.py", line 62, in main
train_utils.train(FLAGS.config, FLAGS.workdir)
File "/home/u7801832/xmcenv/xmcgan_image_generation/xmcgan/train_utils.py", line 421, in train
batch = jax.tree_map(np.asarray, next(train_iter))
File "/home/u7801832/xmcenv/lib/python3.8/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 761, in next
return self._next_internal()
File "/home/u7801832/xmcenv/lib/python3.8/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 744, in _next_internal
ret = gen_dataset_ops.iterator_get_next(
File "/home/u7801832/xmcenv/lib/python3.8/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2728, in iterator_get_next
_ops.raise_from_not_ok_status(e, name)
File "/home/u7801832/xmcenv/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6897, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes at component 0: expected [7,16,17,768] but got [1,112,17,768]. [Op:IteratorGetNext]

The training script is as follows...

#!/bin/bash
CONFIG="xmcgan/configs/coco_xmc.py"
EXP_NAME=$1
WORKDIR="/work/u7801832/data2/" # CHANGEME

CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6" python -m xmcgan.main \
  --config="$CONFIG" \
  --mode="train" \
  --workdir="$WORKDIR"

Please help.

Thanks
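As a quick sanity check on the error message above, the two shapes can be compared numerically. The sketch below assumes that the first two dimensions are (device count, per-device batch); that reading is an interpretation of the log, not something this page confirms, and the helper name examples_in_batch is hypothetical.

```python
from math import prod


def examples_in_batch(shape, num_leading_dims=2):
    """Multiply the leading (device, per-device batch) dims of a batch shape."""
    return prod(shape[:num_leading_dims])


expected = (7, 16, 17, 768)   # shape the iterator expected
got = (1, 112, 17, 768)       # shape the iterator received

print(examples_in_batch(expected), examples_in_batch(got))  # 112 112
```

Both shapes hold the same number of examples, so under this assumption the two sides would disagree on how the batch is split across devices rather than on the total batch size. The actual cause is not established here.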

Changing the batch size and using multiple GPUs causes an "Incompatible shapes" error.

I have a memory issue.
So would it be better to change the batch size?

I changed only two things in the configuration.

batch_size = 56 -> 4
eval_batch_size = 7 -> 4.

But this causes a shape error, shown below.

tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes at component 0: expected [1,8,17,768] but got [1,32,17,768]. [Op:IteratorGetNext]

  1. What causes this error, and what should I do in this case?
  2. Or could you give me better tips for solving the out-of-memory problem?

My setup is 2 x GTX 1080 Ti (11 GB each).

FailedPreconditionError

How can I solve 'tensorflow.python.framework.errors_impl.FailedPreconditionError: StatelessRandomGetKeyCounter is stateful. [Op:SerializeIterator]'?

image

Below is my checkpoint directory:
image

Are the training steps related to the batch size?

Hi,

Thanks for your open-source code!

I find that the number of training steps per epoch does not depend on the batch size. In line 343:

steps_per_epoch = num_train_examples // (jax.local_device_count() * config.d_step_per_g_step)

Maybe it should be:

steps_per_epoch = num_train_examples // (jax.local_device_count() * config.d_step_per_g_step * config.batch_size)
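To make the difference between the two formulas concrete, here is a small worked comparison. The values are illustrative assumptions: num_train_examples is a round hypothetical number, local_device_count assumes 7 GPUs, and d_step_per_g_step and batch_size are taken from the hyperparameter log in an earlier issue. The function names are hypothetical, and the sketch does not claim which formula is the intended one.

```python
def steps_per_epoch_current(num_train_examples, local_device_count, d_step_per_g_step):
    """Formula quoted from line 343 in the question."""
    return num_train_examples // (local_device_count * d_step_per_g_step)


def steps_per_epoch_proposed(num_train_examples, local_device_count,
                             d_step_per_g_step, batch_size):
    """Formula proposed in the question, which also divides by the batch size."""
    return num_train_examples // (
        local_device_count * d_step_per_g_step * batch_size)


# Illustrative values only (see the assumptions above).
num_train_examples = 80_000
local_device_count = 7
d_step_per_g_step = 14
batch_size = 8

print(steps_per_epoch_current(
    num_train_examples, local_device_count, d_step_per_g_step))               # 816
print(steps_per_epoch_proposed(
    num_train_examples, local_device_count, d_step_per_g_step, batch_size))   # 102
```

With these numbers the two formulas differ by exactly the batch-size factor of 8, which is the point the question raises.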

About the package libml

Excuse me. I'm trying to run the code, but I have encountered a problem: libml is not in the requirements.txt file. When I installed it using pip, I got the error <cannot import name 'input_pipeline' from 'libml' >. What is the problem, and what should I do?

CUDA and cuDNN version?

My CUDA and cuDNN versions are 11.4 and 8.2.
flax 0.3.6
jax 0.2.27
jaxlib 0.1.76+cuda11.cudnn82

The error messages are below.

I0222 12:38:33.308843 140599999719232 xmc_gan.py:119] train_step(batch={'embedding': Traced<ShapedArray(bfloat16[14,17,768])>with<DynamicJaxprTrace(level=0/1)>, 'image': Traced<ShapedArray(bfloat16[14,256,256,3])>with<DynamicJaxprTrace(level=0/1)>, 'image_aug': Traced<ShapedArray(bfloat16[14,256,256,3])>with<DynamicJaxprTrace(level=0/1)>, 'max_len': Traced<ShapedArray(bfloat16[14,1])>with<DynamicJaxprTrace(level=0/1)>, 'sentence_embedding': Traced<ShapedArray(bfloat16[14,768])>with<DynamicJaxprTrace(level=0/1)>, 'z': Traced<ShapedArray(bfloat16[14,128])>with<DynamicJaxprTrace(level=0/1)>})
2023-02-22 12:39:19.873214: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 64 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 64 input_feature_map_count: 64 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.877356: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 64 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 64 input_feature_map_count: 64 layout: OutputInputYX shape: 3 3 }
{zero_padding: 1 1 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.884515: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 64 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 64 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.891635: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 64 input_feature_map_count: 256 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.899718: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 128 input_feature_map_count: 256 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.907872: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 256 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.912539: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 128 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 128 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.917105: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 128 input_feature_map_count: 512 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.920219: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 128 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 128 input_feature_map_count: 128 layout: OutputInputYX shape: 3 3 }
{zero_padding: 1 1 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.925633: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 512 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.930975: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 1024 input_feature_map_count: 512 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.934477: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 1024 input_feature_map_count: 256 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.937853: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 1024 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 1024 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.940752: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 256 layout: OutputInputYX shape: 3 3 }
{zero_padding: 1 1 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.944959: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 1024 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 1024 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.949473: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 1024 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 2048 input_feature_map_count: 1024 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.952659: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 7 7 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 2048 input_feature_map_count: 512 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.955657: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 2048 spatial: 7 7 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 2048 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.958881: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 7 7 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 512 layout: OutputInputYX shape: 3 3 }
{zero_padding: 1 1 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.964249: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 7 7 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 2048 input_feature_map_count: 512 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.969219: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 1024 input_feature_map_count: 512 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.973978: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 1024 input_feature_map_count: 256 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.981359: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 256 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:19.988299: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 128 spatial: 28 28 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 128 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.000546: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 128 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 128 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.012110: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 64 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 64 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.017571: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 64 spatial: 56 56 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 64 input_feature_map_count: 64 layout: OutputInputYX shape: 1 1 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 1 1 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.221487: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 3 spatial: 229 229 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 64 input_feature_map_count: 3 layout: OutputInputYX shape: 7 7 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.226297: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 128 spatial: 57 57 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 128 input_feature_map_count: 128 layout: OutputInputYX shape: 3 3 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.229980: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 256 spatial: 29 29 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 256 input_feature_map_count: 256 layout: OutputInputYX shape: 3 3 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
2023-02-22 12:39:20.233720: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc:5205] Disabling cuDNN frontend for the following convolution:
input: {count: 14 feature_map_count: 512 spatial: 15 15 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
filter: {output_feature_map_count: 512 input_feature_map_count: 512 layout: OutputInputYX shape: 3 3 }
{zero_padding: 0 0 pad_alignment: default filter_strides: 2 2 dilation_rates: 1 1 }
... because it uses an identity activation.
I0222 12:39:35.479767 140599999719232 train_utils.py:436] Finished training step 1.
I0222 12:39:37.334168 140599999719232 train_utils.py:436] Finished training step 2.
I0222 12:39:38.369166 140599999719232 train_utils.py:436] Finished training step 3.
I0222 12:39:39.788250 140599999719232 train_utils.py:436] Finished training step 4.
I0222 12:39:41.206766 140599999719232 train_utils.py:436] Finished training step 5.
....

I also tried cuDNN 8.6, and it didn't work either.
