triq's People

Contributors

junyongyou

triq's Issues

About the tensorflow version

Hi,

The requirement.txt file says that this project uses TensorFlow 2.2.0. However, when I tried to run train_triq.py, I got an error saying "Tensorflow Addons supports using Python ops for all Tensorflow versions above or equal to 2.3.0 and strictly below 2.5.0 (nightly versions are not supported)". It seems that the function tensorflow_addons.activations.gelu does not support TensorFlow 2.2.0.

I'm not familiar with TensorFlow, so I would like to confirm which TensorFlow version is required and understand why this error occurs.
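
The range quoted in the error message can be checked before launching training. The following is a minimal sketch, not part of the triq code base: the helper names parse_version and in_supported_range are hypothetical, and the bounds 2.3.0 and 2.5.0 are copied from the message above, so they only apply to the TensorFlow Addons release that printed it.

```python
def parse_version(version):
    """Turn a version string such as '2.4.0-rc1' into a tuple like (2, 4, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as '-rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad '2.3' to (2, 3, 0) so tuples compare correctly
    return tuple(parts)


def in_supported_range(tf_version, minimum="2.3.0", maximum_exclusive="2.5.0"):
    """Return True if minimum <= tf_version < maximum_exclusive.

    The default bounds are an assumption taken from the quoted error message.
    """
    low = parse_version(minimum)
    high = parse_version(maximum_exclusive)
    return low <= parse_version(tf_version) < high


for candidate in ("2.2.0", "2.3.0", "2.4.1", "2.5.0"):
    print(candidate, in_supported_range(candidate))
```

With TensorFlow installed, calling in_supported_range(tf.__version__) would run the same check against the active environment.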

OOM

Hello, I have a question. I want to predict quality scores for all the KonIQ images using the trained model, so I loop over all the pictures in the folder, but I run into the following problem. Can you help me?

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64,386,514] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node model_10/pool1_pad/Pad (defined at E:/Graudate/Code/triq-master-play/src/examples/image_quality_prediction.py:21) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[Op:__inference_predict_function_106843]

Handling different size inputs during training

Hi,

Could you please explain how you handled different image sizes as input during the training phase? Let's say we have three images of size (1080x1080), (1608x1608) and (2000x2000). If we feed these images to the network during training, how is the size difference handled? Were the images zero-padded to the resolution of the largest image?
Thanks.

Does transformer really help?

Hi @junyongyou, I noticed that your triq model has a total of 23M parameters, most of which are from ResNet50. In this sense, Transformer layers are just like an FC head. The transformer layers you used (with parameters (2, 32, 8, 64)) even have fewer parameters than the projection head used in Koncept512.

So I am wondering how much the Transformer layers actually help compared with an FC head. Do you have standard train-test results on the CLIVE and KonIQ datasets so that I can easily compare with other SoTA methods? Thank you very much.

Same output for every input image

def create_triq_model(n_quality_levels,
                      input_shape=(None, None, 3),
                      backbone='resnet50',
                      transformer_params=(2, 32, 8, 64),
                      maximum_position_encoding=193,
                      vis=False):
    chanDim = -1
    # define the model input
    inputs = Input(shape=input_shape)
    filters = (32, 64, 128)
    # loop over the number of filters
    for (i, f) in enumerate(filters):
        # if this is the first CONV layer then set the input
        # appropriately
        if i == 0:
            x = Rescaling(1./255)(inputs)

        # CONV => RELU => BN => POOL
        x = Conv2D(f, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=chanDim)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
    
    x = Conv2D(256, (3, 3), padding="same")(x)
    x = Activation("relu")(x)
    x = BatchNormalization(axis=chanDim)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    
    x = ZeroPadding2D(padding=(1, 1))(x)
    x = Conv2D(2048, (3, 3), padding="same")(x)
    x = Activation("relu")(x)
    x = BatchNormalization(axis=chanDim)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    dropout_rate = 0.1
    
    transformer = TriQImageQualityTransformer(
        num_layers=transformer_params[0],
        d_model=transformer_params[1],
        num_heads=transformer_params[2],
        mlp_dim=transformer_params[3],
        dropout=dropout_rate,
        n_quality_levels=n_quality_levels,
        maximum_position_encoding=maximum_position_encoding,
        vis=vis
    )
    outputs = transformer(x)
  
    model = Model(inputs=inputs, outputs=outputs)
    model.summary()
    return model

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
input_shape = (564, 504, 3)
#model = create_triq_model(n_quality_levels=5, input_shape=input_shape, backbone='vgg16')
model = create_triq_model(n_quality_levels=1, input_shape=input_shape, backbone='resnet50')

from tensorflow.keras.optimizers import Adam
opt = Adam(learning_rate=0.001, decay=1e-3 / 200)
model.compile(loss="mean_squared_error", optimizer=opt)
model.fit(trainImagesX, trainY, validation_data=(valImagesX, valY),
          epochs=108, batch_size=16)

In the above code, I have modified the create_triq_model function so that it uses a custom CNN instead of ResNet or VGGNet. The output shape of the custom CNN is (18, 16, 2048), and this output is fed to TriQImageQualityTransformer.

The issue is that, after training, the model predicts the same value for every input. I have experimented with various hyperparameters. Different hyperparameter settings may give different values, but for any one setting the model produces the same output for every input image. One more thing to note: if I replace the transformer with an ordinary artificial neural network, the network trains well.

Can you please suggest what I am doing wrong here?

About dataset

Hello, I have a question: the image shapes in the KonIQ-10k dataset are not consistent. Some are (224, 224), while others are (224, 224, 3), but I cannot find where the code handles this difference. Can you tell me more about the details? Thanks a lot.
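
A shape of (224, 224) with no channel axis usually means the image was stored as single-channel grayscale. Whether the repository handles this case is not confirmed here; the sketch below only shows one common way to bring such arrays to three channels with NumPy. The helper name ensure_three_channels is hypothetical.

```python
import numpy as np


def ensure_three_channels(image):
    """Return an (H, W, 3) array for grayscale, single-channel, RGB or RGBA input."""
    image = np.asarray(image)
    if image.ndim == 2:
        # Grayscale without a channel axis: copy the plane into three channels.
        return np.stack([image] * 3, axis=-1)
    if image.ndim == 3 and image.shape[-1] == 1:
        # Grayscale with an explicit single channel.
        return np.repeat(image, 3, axis=-1)
    if image.ndim == 3 and image.shape[-1] >= 3:
        # RGB, or RGBA with the alpha channel dropped.
        return image[..., :3]
    raise ValueError("unsupported image shape: %s" % (image.shape,))


gray = np.zeros((224, 224), dtype=np.float32)
rgb = np.zeros((224, 224, 3), dtype=np.float32)
print(ensure_three_channels(gray).shape, ensure_three_channels(rgb).shape)
```

When loading with Pillow, Image.open(path).convert('RGB') achieves the same result at load time.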

dataset

mos_scale = [1, 2, 3, 4, 5]
image_files = {}
with open(mos_file, 'r+') as f:  # 'r+' opens for reading and writing; the file pointer starts at the beginning
    lines = f.readlines()  # read the file line by line
    for line in lines:
        content = line.split(',')  # split each line into a list of fields
        image_file = content[0].replace('"', '').lower()  # first field: the image file name

        if using_single_mos:
            score = float(content[-1]) if mos_format == 'mos' else float(content[1]) / 25. + 1

Hello, we are puzzled by one thing. When "single_mos" is used, you rescale the MOS in the LIVE Challenge data to the range [1, 5], but the MOS column for the LIVE dataset should be content[-1] instead of content[1]. We think the code should be:
score = float(content[-1]) if mos_format == 'mos' else float(content[-1]) / 25. + 1
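
The "/ 25. + 1" term in this line is a linear map from a 0-100 score onto the 1-5 scale listed in mos_scale. A small sketch of the arithmetic follows; the function name rescale_mos is hypothetical, and the question of which CSV column holds the MOS is left open.

```python
def rescale_mos(score_0_100):
    """Map a score in [0, 100] linearly onto [1, 5]."""
    return score_0_100 / 25.0 + 1.0


for raw in (0, 50, 100):
    print(raw, "->", rescale_mos(raw))
```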

save model architecture

Hey, I wanted to ask - is it possible to save the whole model architecture in a json file?

paper link

Hi Junyong, could you provide the paper link? Thank you very much!

Cannot run image_quality_prediction.py

When I run image_quality_prediction.py, I get the error "You are trying to load a weight file containing 13 layers into a model with 14 layers." Could you help me find out what is going wrong?

Training

Hello, I want to reproduce your work and reimplement it in PyTorch. Can you tell me more about the training details? Regarding "A base learning rate 5e-5 was used for pretraining": do you mean pretraining on the same datasets (KonIQ-10k and LIVE-C)?

TRIQ failure on images of particular size range

Hello,

Thank you for the great implementation of TRIQ.

I am able to run TRIQ successfully on most images; however, a particular range of resolutions seems to cause failures.

First, I load the TRIQ model:

args = {}

args['n_quality_levels'] = 5
args['backbone'] = 'resnet50'
args['weights'] = 'path/TRIQ.h5'

model = create_triq_model(n_quality_levels=args['n_quality_levels'],
                          backbone=args['backbone'])

model.load_weights(args['weights'])

An example image link is below,

https://hpmlawatl.com/wp-content/uploads/2013/07/640x4802.gif

test_image = "/path/640x4802.gif"
image = Image.open(test_image).convert('RGB')
image = np.asarray(image, dtype=np.float32)
image = image[:,:,:3]
image /= 127.5
image -= 1.
prediction = model.predict(np.expand_dims(image, axis=0))

This shows an error,

InvalidArgumentError:  required broadcastable shapes
	 [[node model/tri_q_image_quality_transformer/add_1
 (defined at /home/ubuntu/production/triq/src/models/transformer_iqa.py:197)
]] [Op:__inference_predict_function_10094]


However, I can then resize the same image, to sizes both LARGER OR SMALLER, and the image will run successfully. As an example, this image can be set to either 512 X 384 OR 1024 X 768 and TRIQ will run fine.

test_image = "/path/640x4802.gif"
image = Image.open(test_image).convert('RGB')
img_sizes = image.size
print("Original Image size is, " + str(img_sizes[0])+  " " + str(img_sizes[1]))

size_cutoff = 1024 # This sets to 1024 X 768
size_cutoff = 512 # This sets to 512 X 384

if img_sizes[0] != size_cutoff and img_sizes[1] != size_cutoff:
    max_size = max(img_sizes)
    scale_factor = size_cutoff / max_size
    x_dim = round(img_sizes[0]*scale_factor)
    y_dim = round(img_sizes[1]*scale_factor)
    image = image.resize((x_dim,y_dim),Image.ANTIALIAS)

image = np.asarray(image, dtype=np.float32)
image = image[:,:,:3]
image /= 127.5
image -= 1.
prediction = model.predict(np.expand_dims(image, axis=0))

In order to pin this down I did a bit of empirical testing, and:

Values of size_cutoff = 513 will fail, while size_cutoff = 512 is okay.

Similarly, size_cutoff = 1057 will fail while size_cutoff = 1056 is okay.

I understand that TRIQ may fail if an image is too small or too large. What I do not understand is why an image of one particular size (640 X 480) fails, while the same image resized to be smaller (512 X 384) or larger (1024 X 768) runs successfully.

Any insight you have would be helpful.

Issue with Training - Generator error

Hello!
I followed all the instructions for training and prepared the data and labels accordingly. When I run the training script, it runs for a few steps (say 170/2135) and then stops with exceptions.

I then changed return np.array(images_aug), np.array(y_scores) to return np.array(images_aug, dtype='object'), np.array(y_scores, dtype='object'), but now the script just hangs and, after a while, uses very little GPU memory (700MB/16GB). I even tried training from scratch (without loading the ImageNet-pretrained weights), but still no luck.

My conda env details:
tensorflow-gpu==2.1.0
tensorflow_addons==0.8.3
h5py==2.10.0

The test set

Hi, I would like to ask how large the KonIQ test set is when testing. Also, when using the LIVE test set, its quality scores are in the range 0-100, whereas the predictions are in the range 0-5. How do I calculate SROCC and PLCC?
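
On the scale mismatch: SROCC depends only on rank order, and PLCC is unchanged by a positive linear rescaling of either variable, so both can be computed directly on 0-100 labels and lower-range predictions. The sketch below illustrates this in plain Python. The function names are hypothetical and the numbers are made-up toy values, not results from any dataset.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values for illustration only.
labels_0_100 = [12.0, 35.5, 48.0, 71.2, 90.0]
predictions = [1.6, 2.1, 3.2, 3.0, 4.7]
labels_rescaled = [s / 25.0 + 1.0 for s in labels_0_100]

print("PLCC :", pearson(labels_0_100, predictions), pearson(labels_rescaled, predictions))
print("SROCC:", spearman(labels_0_100, predictions), spearman(labels_rescaled, predictions))
```

If SciPy is available, scipy.stats.pearsonr and scipy.stats.spearmanr compute the same two coefficients.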

Shape error when running image_quality_prediction.py

Thanks for your work, it is very cool. I tested a JPG image of size 1919 × 1440, and it gives me this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [1,661,32] vs. [1,193,32]
	 [[node model/tri_q_image_quality_transformer/add_1 (defined at /data2/zhx3/triq/src/models/transformer_iqa.py:198) ]] [Op:__inference_predict_function_9663]

Errors may have originated from an input operation.
Input Source operations connected to node model/tri_q_image_quality_transformer/add_1:
 model/tri_q_image_quality_transformer/concat (defined at /data2/zhx3/triq/src/models/transformer_iqa.py:194)

Function call stack:
predict_function
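
The two sequence lengths in this message can be reproduced with a little arithmetic, under assumptions that are inferred from the numbers rather than confirmed from transformer_iqa.py: a backbone with an overall stride of 32 (as for ResNet50), a 2×2 pooling of the feature map before it is flattened, and one extra token added to the sequence. Under these assumptions a 1919 × 1440 image yields 661 positions, while the limit of 193 (the default maximum_position_encoding in the create_triq_model code quoted in an earlier issue) corresponds to a 1024 × 768 input. The helper name sequence_length is hypothetical.

```python
import math


def sequence_length(width, height, stride=32, pool=2, extra_tokens=1):
    """Estimate the transformer sequence length for an input image.

    Assumptions (not confirmed from the model code): the backbone reduces
    each side by `stride` (rounding up), the feature map is then pooled by
    `pool` (rounding down), and `extra_tokens` tokens are added.
    """
    feat_w = math.ceil(width / stride)
    feat_h = math.ceil(height / stride)
    return (feat_w // pool) * (feat_h // pool) + extra_tokens


print(sequence_length(1919, 1440))  # size reported in this issue
print(sequence_length(1024, 768))
```

If this reading is right, an input whose pooled feature map has more than 192 positions would need to be resized first, or the model built with a larger maximum_position_encoding. The pooling actually applied by the model may differ by resolution.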

Input

Hello, I have been reproducing this project recently, and it works great. Now I have run into a problem: I want to input two pictures at a time (or input one picture and split it inside the model). I have read the code for a long time, but I could not find where to modify it, and I do not know how to handle the input shape with None dimensions. Looking forward to your comments and guidance. Thank you very much. - a beginner

training

Have you encountered such a problem?

  File "E:/Graudate/Code/triq-master-play/src/train/train_triq.py", line 77, in train_main
    model.compile(loss=loss, optimizer=optimizer, metrics=[metrics])
  File "D:\tools\Anaconda\set\envs\python37tf\lib\site-packages\tensorflow\python\keras\engine\training.py", line 324, in compile
    with self.distribute_strategy.scope():
  File "D:\tools\Anaconda\set\envs\python37tf\lib\site-packages\tensorflow\python\keras\engine\training.py", line 455, in distribute_strategy
    return self._distribution_strategy or ds_context.get_strategy()
AttributeError: 'Model' object has no attribute '_distribution_strategy'

plcc

Hello, I would like to ask what PLCC you get on the training set after 120 epochs of training. I think the result I get is a bit off.

request for trained model

Hi author,

I read your paper and think the ideas in it are very ingenious.
But I don't have enough computing power to train this model. Can you provide a trained weight file?

thank you very much :)

Error when run image_quality_prediction.py

I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Segmentation fault (core dumped)

I get the error above. Changing the stack size did not help, and I do not know what the cause is.

Save model config data

Hey, I wanted to ask what the best way to save the model config data is. After training, only the weights are saved, not the model config itself. I tried setting "save_weights_only=False" in callbacks.py, but it did not work. Is there any other way to deal with this? Thank you in advance!
