facebookresearch / clevr-dataset-gen

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

License: Other

Python 100.00%

clevr-dataset-gen's Introduction

CLEVR Dataset Generation

This is the code used to generate the CLEVR dataset as described in the paper:

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Fei-Fei Li, Larry Zitnick, Ross Girshick
Presented at CVPR 2017

Code and pretrained models for the baselines used in the paper can be found here.

You can use this code to render synthetic images and compositional questions for those images, like this:

Q: How many small spheres are there?
A: 2

Q: What number of cubes are small things or red metal objects?
A: 2

Q: Does the metal sphere have the same color as the metal cylinder?
A: Yes

Q: Are there more small cylinders than metal things?
A: No

Q: There is a cylinder that is on the right side of the large yellow object behind the blue ball; is there a shiny cube in front of it?
A: Yes

If you find this code useful in your research then please cite

@inproceedings{johnson2017clevr,
  title={CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning},
  author={Johnson, Justin and Hariharan, Bharath and van der Maaten, Laurens
          and Fei-Fei, Li and Zitnick, C Lawrence and Girshick, Ross},
  booktitle={CVPR},
  year={2017}
}

All code was developed and tested on OSX and Ubuntu 16.04.

Step 1: Generating Images

First we render synthetic images using Blender, outputting both rendered images as well as a JSON file containing ground-truth scene information for each image.

Blender ships with its own installation of Python, which is used to execute scripts that interact with Blender; you'll need to add the image_generation directory to the Python path of Blender's bundled Python. The easiest way to do this is to add a .pth file to the site-packages directory of Blender's Python, like this:

echo $PWD/image_generation >> $BLENDER/$VERSION/python/lib/python3.5/site-packages/clevr.pth

where $BLENDER is the directory where Blender is installed and $VERSION is your Blender version; for example on OSX you might run:

echo $PWD/image_generation >> /Applications/blender/blender.app/Contents/Resources/2.78/python/lib/python3.5/site-packages/clevr.pth

You can then render some images like this:

cd image_generation
blender --background --python render_images.py -- --num_images 10

On OSX the blender binary is located inside the blender.app directory; for convenience you may want to add the following alias to your ~/.bash_profile file:

alias blender='/Applications/blender/blender.app/Contents/MacOS/blender'

If you have an NVIDIA GPU with CUDA installed then you can use the GPU to accelerate rendering like this:

blender --background --python render_images.py -- --num_images 10 --use_gpu 1

After this command terminates you should have ten freshly rendered images stored in output/images like these:


The file output/CLEVR_scenes.json will contain ground-truth scene information for all newly rendered images.

You can find more details about image rendering here.
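The ground-truth record for each image looks roughly like the sketch below. The field names used here (3d_coords, pixel_coords, rotation) are the ones referenced in the issues further down this page, but the exact schema is an assumption best confirmed against your own output file:

```python
# Illustrative sketch of one entry in output/CLEVR_scenes.json; field names
# are taken from the issue reports below, the exact schema is an assumption.
scene = {
    "image_filename": "CLEVR_new_000000.png",
    "objects": [
        {"shape": "sphere", "size": "small", "color": "gray",
         "material": "metal",
         "3d_coords": [1.0, 1.0, 0.35],
         "pixel_coords": [200, 150, 12.3],  # x, y, and a third value
         "rotation": 123.4},
        {"shape": "cube", "size": "large", "color": "red",
         "material": "rubber",
         "3d_coords": [-2.0, 1.5, 0.7],
         "pixel_coords": [90, 120, 9.8],
         "rotation": 45.0},
    ],
}

# Ground-truth answers can be computed directly from the record, e.g.
# "How many small spheres are there?":
small_spheres = [o for o in scene["objects"]
                 if o["size"] == "small" and o["shape"] == "sphere"]
print(len(small_spheres))
```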

Step 2: Generating Questions

Next we generate questions, functional programs, and answers for the rendered images from the previous step. This step takes as input the single JSON file containing all ground-truth scene information, and outputs a single JSON file containing questions, answers, and functional programs for those questions.

You can generate questions like this:

cd question_generation
python generate_questions.py

The file output/CLEVR_questions.json will then contain questions for the generated images.

You can find more details about question generation here.
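Each entry in the output pairs a question and answer with its functional program. A sketch of the layout, using field names that appear in the issue reports below (question, answer, and program nodes with type/inputs/value_inputs); treat the exact schema as an assumption until you inspect your own file:

```python
import json

# Illustrative sketch of one record in output/CLEVR_questions.json;
# the field names follow the examples quoted in the issues below.
record = {
    "question": "How many small spheres are there?",
    "answer": "2",
    "image_filename": "CLEVR_new_000000.png",
    "program": [
        {"type": "scene", "inputs": [], "value_inputs": []},
        {"type": "filter_size", "inputs": [0], "value_inputs": ["small"]},
        {"type": "filter_shape", "inputs": [1], "value_inputs": ["sphere"]},
        {"type": "count", "inputs": [2], "value_inputs": []},
    ],
}

# Round-trip through JSON as it would be stored on disk.
q = json.loads(json.dumps(record))
print(q["question"], "->", q["answer"])
```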


clevr-dataset-gen's Issues

Missing value-inputs in question generation

Hello

I have generated a series of images and questions using the scripts in this repository. For the question generation, I have only used the one hop template. I think that some of the functional programs in the output JSON file are missing some information.

Here is an example:

{"question": "There is a cube; what number of tiny green objects are right of it?",
 "program": [
    {
      "type": "scene",
      "inputs": [],
      "_output": [0, 1, 2, 3, 4, 5],
      "value_inputs": []
    },
    {
      "type": "filter_shape",
      "inputs": [0],
      "_output": [5],
      "value_inputs": []
    },
    {
      "type": "unique",
      "inputs": [1],
      "_output": 5,
      "value_inputs": []
    },
    {
      "type": "relate",
      "inputs": [2],
      "_output": [0, 1, 2, 3, 4],
      "value_inputs": ["right"]
    },
    {
      "type": "filter_size",
      "inputs": [3],
      "_output": [2],
      "value_inputs": ["small"]
    },
    {
      "type": "filter_color",
      "inputs": [4],
      "_output": [],
      "value_inputs": ["green"]
    },
    {
      "type": "count",
      "inputs": [5],
      "_output": 0,
      "value_inputs": []
    }
  ],

Shouldn't the first filter_shape function have ["cube"] as value_inputs?

This issue is not only restricted to shape, it also applies to size, color and material. Also, the issue is not limited to questions that start with There is a.... Here is a second example:

{"question": "Are there any large spheres on the left side of the tiny brown object?",
 "program": [
    {
      "type": "scene",
      "inputs": [],
      "_output": [0, 1, 2, 3, 4, 5],
      "value_inputs": []
    },
    {
      "type": "filter_size",
      "inputs": [0],
      "_output": [2, 5],
      "value_inputs": []
    },
    {
      "type": "filter_color",
      "inputs": [1],
      "_output": [2],
      "value_inputs": []
    },
    {
      "type": "unique",
      "inputs": [2],
      "_output": 2,
      "value_inputs": []
    },
    {
      "type": "relate",
      "inputs": [3],
      "_output": [1, 3, 4, 5],
      "value_inputs": ["left"]
    },
    {
      "type": "filter_size",
      "inputs": [4],
      "_output": [1, 3, 4],
      "value_inputs": ["large"]
    },
    {
      "type": "filter_shape",
      "inputs": [5],
      "_output": [3, 4],
      "value_inputs": ["sphere"]
    },
    {
      "type": "exist",
      "inputs": [6],
      "_output": true,
      "value_inputs": []
    }
  ],

In this example, the first filter_size is missing ["tiny"] as value_inputs and the first filter_color is missing ["brown"] as value_inputs.

In both examples, the filter functions later in the program do contain the correct value_inputs.
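One quick way to audit a generated questions file for this bug is to flag filter nodes whose value_inputs is empty. A minimal standalone sketch, using the first program from this report as sample data:

```python
# Sample data: the first program from this issue (outputs omitted).
program = [
    {"type": "scene", "inputs": [], "value_inputs": []},
    {"type": "filter_shape", "inputs": [0], "value_inputs": []},   # should be ["cube"]
    {"type": "unique", "inputs": [1], "value_inputs": []},
    {"type": "relate", "inputs": [2], "value_inputs": ["right"]},
    {"type": "filter_size", "inputs": [3], "value_inputs": ["small"]},
    {"type": "filter_color", "inputs": [4], "value_inputs": []},   # should be ["green"]
    {"type": "count", "inputs": [5], "value_inputs": []},
]

def missing_value_inputs(program):
    """Return indices of filter_* nodes that carry no value."""
    return [i for i, node in enumerate(program)
            if node["type"].startswith("filter_") and not node["value_inputs"]]

print(missing_value_inputs(program))
```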

CLEVR with other background

Have you thought about adding backgrounds using HDRI data? I tried this, but something seems wrong — perhaaps the camera is too close to the HDRI world scene. Is there any example of replacing the plain background with other images, such as those from COCO?

Determining what is to the right of a given object

I want to determine which objects are to the right of a given object.
Every object has a list of objects to its right, but I am not sure how to pick the correct answer from it.
Thank you!

generate_questions produces a dataset that is more balanced than the original one

There must be another discrepancy between generate_questions.py and the original script that was used to generate CLEVR. I have noticed that in CLEVR the answer distribution for counting questions is very skewed. For example, for one of the question families I have the following answer counts:

{'1': 2658, '0': 2555, '2': 1911, '5': 52, '3': 579, '6': 17, '4': 136, '7': 2, '9': 1}

Here the 6th most popular answer is "6" with a count of 17. This could not have happened if the current version of generate_questions.py had been used, since it has a heuristic that forces every answer to occur at most 5 times as often as the 6th most popular answer:

https://github.com/facebookresearch/clevr-dataset-gen/blob/master/question_generation/generate_questions.py#L322

I am mainly filing this for the record, since it is unclear how the issue can be addressed, but people using the code should be aware of it.
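For reference, the heuristic can be sketched in a few lines. This is a simplified reading of the rebalancing logic, applied to the answer counts quoted above; the exact indexing convention in the actual script may differ:

```python
from collections import Counter

# Answer counts for one CLEVR v1.0 counting family, quoted in this issue.
counts = Counter({'1': 2658, '0': 2555, '2': 1911, '5': 52,
                  '3': 579, '6': 17, '4': 136, '7': 2, '9': 1})

# Simplified sketch of the heuristic: cap every answer's count at 5x the
# count of the 6th-most-frequent answer.
sixth = counts.most_common(6)[-1][1]
cap = 5 * sixth
violations = {a: c for a, c in counts.items() if c > cap}
print(sixth, cap, sorted(violations))
```

Under this reading several answers violate the cap, which supports the claim that a different (or differently tuned) script generated the released dataset.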

Question classification

Is there a way to know which category a question belongs to, i.e. count, exist, compare numbers, etc.?
Thanks.
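One workable heuristic (an assumption, not an official API of this repo) is to classify a question by the final function in its program; the category names below are illustrative:

```python
# Hypothetical mapping from the last program node to a question category.
CATEGORY_BY_LAST_NODE = {
    "count": "count",
    "exist": "exist",
    "equal_integer": "compare_numbers",
    "greater_than": "compare_numbers",
    "less_than": "compare_numbers",
    "equal_size": "compare_attribute",
    "equal_color": "compare_attribute",
    "equal_material": "compare_attribute",
    "equal_shape": "compare_attribute",
    "query_size": "query_attribute",
    "query_color": "query_attribute",
    "query_material": "query_attribute",
    "query_shape": "query_attribute",
}

def classify(program):
    return CATEGORY_BY_LAST_NODE.get(program[-1]["type"], "unknown")

program = [{"type": "scene", "inputs": [], "value_inputs": []},
           {"type": "count", "inputs": [0], "value_inputs": []}]
print(classify(program))
```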

Zeroeth template in compare_integer.json can never return true

First and foremost, it goes without saying, but great repo and dataset!

Moving on to the issue, I believe there's a slight error in the zeroeth template of compare_integer.json

Specifically, the zeroeth template (with indenting for legibility) is as follows:

    "constraints": [
      {
        "params": [
          1,
          3
        ],
        "type": "OUT_NEQ"
      }
    ],
    "nodes": [
      {
        "inputs": [],
        "type": "scene"
      },
      {
        "inputs": [
          0
        ],
        "side_inputs": [
          "<Z>",
          "<C>",
          "<M>",
          "<S>"
        ],
        "type": "filter_count"
      },
      {
        "inputs": [],
        "type": "scene"
      },
      {
        "inputs": [
          2
        ],
        "side_inputs": [
          "<Z2>",
          "<C2>",
          "<M2>",
          "<S2>"
        ],
        "type": "filter_count"
      },
      {
        "inputs": [
          1,
          3
        ],
        "type": "equal_integer"
      }
    ],
    "params": [
      {
        "name": "<Z>",
        "type": "Size"
      },
      {
        "name": "<C>",
        "type": "Color"
      },
      {
        "name": "<M>",
        "type": "Material"
      },
      {
        "name": "<S>",
        "type": "Shape"
      },
      {
        "name": "<Z2>",
        "type": "Size"
      },
      {
        "name": "<C2>",
        "type": "Color"
      },
      {
        "name": "<M2>",
        "type": "Material"
      },
      {
        "name": "<S2>",
        "type": "Shape"
      }
    ],
    "text": [
      "Are there an equal number of <Z> <C> <M> <S>s and <Z2> <C2> <M2> <S2>s?",
      "Are there the same number of <Z> <C> <M> <S>s and <Z2> <C2> <M2> <S2>s?",
      "Is the number of <Z> <C> <M> <S>s the same as the number of <Z2> <C2> <M2> <S2>s?"
    ]
  },

Note the constraint is OUT_NEQ between nodes 1 and 3. However, nodes 1 and 3 are filter_count nodes rather than filter nodes, so the constraint is that the counts (rather than the objects found) must differ. Consequently, the answer is always false (there cannot be an equal number of objects, because our constraint forces the object count to differ). I threw together a quick script to try and check the answers in the CLEVR training set, and I believe it supports my conclusions.

It's not a serious issue, but it confused me and took me a while to figure out what was going on -- I hope this spares someone else some debugging time. Cheers!
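The argument can be demonstrated in a few lines (a standalone sketch, not code from the repo): once OUT_NEQ forces the two counts apart, equal_integer has no satisfying instance.

```python
# If the OUT_NEQ constraint holds between the two filter_count outputs,
# the final equal_integer node can never evaluate to True.
def equal_integer(a, b):
    return a == b

# Enumerate count pairs satisfying the OUT_NEQ constraint (a != b).
pairs = [(a, b) for a in range(11) for b in range(11) if a != b]
print(any(equal_integer(a, b) for a, b in pairs))
```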

/python/lib/python3.7/site-packages/clevr.pth: Read-only file system

Hi, I am on Ubuntu 20.04 and I installed Blender, but I cannot add the image_generation path to Blender's Python:

echo $PWD/image_generation >> $BLENDER/$VERSION/python/lib/python3.5/site-packages/clevr.pth

I located blender like this:

whereis blender
/snap/bin/blender

When I look inside /snap/bin/blender there is no Python, but there is one under /snap/blender,

so the command that I eventually run is

$ sudo echo $PWD >> /snap/blender/43/2.83/python/lib/python3.7/site-packages/clevr.pth
bash: /snap/blender/43/2.83/python/lib/python3.7/site-packages/clevr.pth: Read-only file system

I also followed the suggestion to run sudo fsck -n -f and restart, but it didn't help.

Can anyone advise?
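One possible workaround, untested with the snap build so treat it as an assumption: skip the .pth file entirely and export PYTHONPATH, which Blender's bundled CPython should pick up like any embedded interpreter with default site initialization:

```shell
# Export the path instead of writing into the read-only snap mount.
export PYTHONPATH="$PWD/image_generation${PYTHONPATH:+:$PYTHONPATH}"
echo "$PYTHONPATH"
# then render as usual:
# blender --background --python render_images.py -- --num_images 10
```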

Rotation in radians

Hi,
I think the rotation used in Blender (random.random() * 360.0), and thus stored in the scene JSON, is interpreted as radians rather than degrees. I verified this with some preliminary testing.

Reference: blender stackexchange

Blender compatible version

  • Is this repo compatible with Blender 2.8? If not, which version should I install? I recommend adding this to the README, since the information is currently missing.

  • Another question: in the JSON file for a generated image, the "pixel_coords" key holds a list of three values. What does the last value represent?

image rendering issue

After I run the commands for image generation, the program prints some "invalid context" and "BROKEN MARGIN" messages and gets stuck, as shown below. Could anybody show me how to resolve this? Thanks.
Environment: OSX 10.14.5, Blender 2.78c.
image_generation chen$ blender --background --python render_images.py -- --num_images 10
found bundled python: /Applications/blender.app/Contents/MacOS/../Resources/2.78/python
read blend: data/base_scene.blend
convertViewVec: called in an invalid context
convertViewVec: called in an invalid context
0.21743071058848407 0.4 left
BROKEN MARGIN!
0.25797991560249245 0.4 right
BROKEN MARGIN!
convertViewVec: called in an invalid context
convertViewVec: called in an invalid context
0.3010987159059719 0.4 right
BROKEN MARGIN!
convertViewVec: called in an invalid context
convertViewVec: called in an invalid context
Fra:1 Mem:13.74M (0.00M, Peak 13.74M) | Time:00:00.00 | Preparing Scene data
Fra:1 Mem:20.39M (0.00M, Peak 20.69M) | Time:00:00.00 | Preparing Scene data
Fra:1 Mem:20.39M (0.00M, Peak 20.69M) | Time:00:00.00 | Creating Shadowbuffers
Fra:1 Mem:20.39M (0.00M, Peak 20.69M) | Time:00:00.00 | Raytree.. preparing
Fra:1 Mem:27.20M (0.00M, Peak 27.20M) | Time:00:00.01 | Raytree.. building
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Raytree finished
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Creating Environment maps
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Caching Point Densities
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Sce: Scene Ve:45732 Fa:49578 La:1
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Loading voxel datasets
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Sce: Scene Ve:45732 Fa:49578 La:1
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Sce: Scene Ve:45732 Fa:49578 La:1
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Volume preprocessing
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Sce: Scene Ve:45732 Fa:49578 La:1
Fra:1 Mem:26.83M (0.00M, Peak 37.47M) | Time:00:00.09 | Sce: Scene Ve:45732 Fa:49578 La:1

Duplicate question templates in comparison.json

I've found that in the first set of templates in comparison.json, the question template "Do the <Z> <C> <M> <S> and the <Z2> <C2> <M2> <S2> have the same size?" is duplicated. Is this intended, so that this template is sampled more frequently when generating questions?

Inconsistency in 3d coordinates

Hi,

I am noticing an inconsistency between the placement of the 3d objects in the scene and their corresponding 3d coordinates in the scene. For instance, to show some examples:

(1):

(it looks like the z-axis in this plot corresponds to how close/far you are from the camera, e.g. 'small-cylinder-brown' and 'small-sphere-gray' are at the top of the z-axis, and they are closest to the camera)

(2):

(this contradicts (1) now, because the small red cylinder lies around the midpoint of the z-axis yet it's actually the furthest from the camera)

(3):

(again, contradicts the previous images, because the gold sphere is closest to the camera but lies about the mid-point of the z-axis)

I have not generated the dataset myself from source, so a discrepancy between the code and the dataset available online may explain it. Alternatively, the 3d coordinates may not be relative to the camera (e.g. the camera may have been randomly displaced prior to rendering). I looked at the code but found nothing that seemed odd, though I am really not sure at this stage.

Thanks!
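If the raw 3d_coords are world coordinates, camera-relative depth would have to be recovered by projecting them onto the scene's camera direction vectors rather than reading the raw z value. A hedged sketch — both the presence of per-scene direction vectors in the JSON and the 'front' vector used here are assumptions worth checking:

```python
# Project an object's world-space 3d_coords onto an assumed per-scene
# 'front' direction vector to get a camera-relative depth ordering.
def depth_along(coords, direction):
    return sum(c * d for c, d in zip(coords, direction))

front = (0.754, 0.686, 0.0)                  # illustrative unit vector
near = depth_along((1.0, 1.0, 0.35), front)  # object toward the camera-facing side
far = depth_along((-2.0, -2.0, 0.35), front)
print(near > far)
```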

Test Answers in CLEVR

I sent an email with my test file to @jcjohnson; how long will it take to get a reply?
Thanks for your attention!

Error after around 700 images rendered

I was able to use the code to generate new images and scenes. However, the following error occurs after rendering about 700 images.

CUDA error at cuModuleLoad: File not found

I tried this many times and it always fails after rendering 700 to 800 images.
System info: Ubuntu 18.04, rendering with a GTX Titan GPU.
I also found someone with the same issue on StackOverflow:
https://stackoverflow.com/questions/57097832/cuda-error-at-cumoduleload-file-not-found-while-rendering-files-using-blender-o

original dataset and generation script output have different formats

In the dataset, "question_family_index" field takes values from 0 to 89. When I generate a new dataset with the generation script, "question_family_index" takes smaller values as it refers to the index within a template file. In this regard, I have two questions:

  • Are there any other differences between the code that was originally used to generate CLEVR and the code that is currently hosted on GitHub?
  • In the version of the CLEVR that is currently available for download, is there a way to resolve "question_family_index" into the actual template? I guess I have to know in which order the template files were loaded for this purpose, and I am not 100% sure if this is deterministic.
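If template loading is deterministic (e.g. sorted filename order — an assumption, not something the repo guarantees), a global index could be reconstructed by offsetting each per-file index. A sketch with made-up filenames and template counts:

```python
# Hypothetical reconstruction of a global question_family_index:
# global index = (offset of the template file) + (index within the file).
def build_family_offsets(counts_per_file):
    """counts_per_file: {filename: number of templates}; files are
    processed in sorted filename order (the assumption being tested)."""
    offsets, total = {}, 0
    for name in sorted(counts_per_file):
        offsets[name] = total
        total += counts_per_file[name]
    return offsets

# Illustrative counts, not the real per-file template counts.
offsets = build_family_offsets({'compare_integer.json': 5,
                                'comparison.json': 12,
                                'one_hop.json': 9})
print(offsets)
```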

Rendering scenes with fixed objects locations

For my project I need to render images with a fixed number of objects (max_objects = min_objects = 2 for now) and with the objects at fixed locations in every rendered scene.
Any thoughts on how to fix the location parameter?
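One lightweight approach (a sketch of an assumed modification, not code from the repo) is to replace the random (x, y) sampling in add_random_objects with a fixed list:

```python
import random

# Illustrative fixed positions; in render_images.py you would substitute
# these for the randomly sampled (x, y) of each object.
FIXED_POSITIONS = [(-1.5, 0.0), (1.5, 0.0)]

def place_objects(num_objects):
    assert num_objects == len(FIXED_POSITIONS), "set min/max objects to match"
    return [{"x": x, "y": y, "theta": 360.0 * random.random()}
            for x, y in FIXED_POSITIONS]

placed = place_objects(2)
print([(o["x"], o["y"]) for o in placed])
```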

Image rendering fails in the function check_visibility()

Hi, I am trying to generate images, but rendering fails with the following output from the check_visibility() function in render_images.py. Any help is appreciated.

Traceback (most recent call last):
File "", line 2, in
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 568, in
main(args)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 187, in main
output_blendfile=blend_path,
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 310, in render_scene
objects, blender_objects = add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 368, in add_random_objects
return add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 368, in add_random_objects
return add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 368, in add_random_objects
return add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 436, in add_random_objects
all_visible = check_visibility(blender_objects, args.min_pixels_per_object)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 492, in check_visibility
os.remove(path)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\basit\AppData\Local\Temp\tmpmxvc4f2e.png'

Blender quit

#####################################################################

This is my full output after running the command "blender --background --python render_images.py -- --num_images 10"

AL lib: (EE) UpdateDeviceParams: Failed to set 44100hz, got 48000hz instead
found bundled python: C:\Program Files\Blender Foundation\Blender\2.78\python
read blend: data/base_scene.blend
convertViewVec: called in an invalid context
convertViewVec: called in an invalid context
convertViewVec: called in an invalid context
0.08877431084461573 0.4 front
BROKEN MARGIN!
0.04820750292907583 0.4 behind
BROKEN MARGIN!
convertViewVec: called in an invalid context
[... roughly a hundred further "BROKEN MARGIN!" and "convertViewVec: called in an invalid context" lines omitted ...]
convertViewVec: called in an invalid context
Fra:1 Mem:59.51M (0.00M, Peak 59.51M) | Time:00:00.00 | Preparing Scene data
Fra:1 Mem:79.70M (0.00M, Peak 80.68M) | Time:00:00.13 | Preparing Scene data
Fra:1 Mem:79.70M (0.00M, Peak 80.68M) | Time:00:00.13 | Creating Shadowbuffers
Fra:1 Mem:79.70M (0.00M, Peak 80.68M) | Time:00:00.14 | Raytree.. preparing
Fra:1 Mem:101.67M (0.00M, Peak 101.67M) | Time:00:00.18 | Raytree.. building
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:00.99 | Raytree finished
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:00.99 | Creating Environment maps
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:00.99 | Caching Point Densities
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.00 | Sce: Scene Ve:163332 Fa:159977 La:1
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.00 | Loading voxel datasets
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.00 | Sce: Scene Ve:163332 Fa:159977 La:1
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.00 | Sce: Scene Ve:163332 Fa:159977 La:1
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.00 | Volume preprocessing
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.01 | Sce: Scene Ve:163332 Fa:159977 La:1
Fra:1 Mem:100.49M (0.00M, Peak 134.78M) | Time:00:01.01 | Sce: Scene Ve:163332 Fa:159977 La:1
Fra:1 Mem:102.66M (0.00M, Peak 134.78M) | Time:00:01.02 | Scene, Part 2-2
Fra:1 Mem:101.66M (0.00M, Peak 134.78M) | Time:00:01.06 | Scene, Part 1-2
Fra:1 Mem:58.51M (0.00M, Peak 134.78M) | Time:00:01.07 | Sce: Scene Ve:163332 Fa:159977 La:1
Saved: 'C:\Users\basit\AppData\Local\Temp\tmp0js12txa.png'
Time: 00:01.19 (Saving: 00:00.11)

Traceback (most recent call last):
File "", line 2, in
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 568, in
main(args)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 187, in main
output_blendfile=blend_path,
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 310, in render_scene
objects, blender_objects = add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 368, in add_random_objects
return add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 368, in add_random_objects
return add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 368, in add_random_objects
return add_random_objects(scene_struct, num_objects, args, camera)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 436, in add_random_objects
all_visible = check_visibility(blender_objects, args.min_pixels_per_object)
File "C:\Koc\CLEVR_v1.0_no_images\clevr-dataset-gen\image_generation\render_images.py", line 492, in check_visibility
os.remove(path)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\basit\AppData\Local\Temp\tmp0js12txa.png'

Blender quit
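A plausible fix for the WinError 32 above (an assumption, since the repo was developed and tested on OSX and Ubuntu): make sure the temporary file's handle is closed before Blender renders into it and before os.remove() runs. Sketched with tempfile.mkstemp:

```python
import os
import tempfile

# On Windows a file cannot be removed while any handle to it is open.
# Create the temp file, close the OS-level handle immediately, use the
# path, then remove it.
fd, path = tempfile.mkstemp(suffix=".png")
os.close(fd)             # release the handle before rendering to `path`
# ... the visibility pass would be rendered to `path` here ...
os.remove(path)          # safe now that no handle is open
print(os.path.exists(path))
```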

Object rotation angle problem

theta = 360.0 * random.random()

bpy.context.object.rotation_euler[2] = theta

I think there is a mistake here. When I used object rotation, I found that the objects barely rotated. After debugging, I found that the angle here is incorrect: rotation_euler expects radians, not degrees.

I hope everyone is aware of this problem; it caused me a lot of trouble.
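The fix described above amounts to converting before assignment. bpy is only available inside Blender, so the assignment is left as a comment:

```python
import math
import random

theta_deg = 360.0 * random.random()   # what the script samples
theta_rad = math.radians(theta_deg)   # what rotation_euler expects
# bpy.context.object.rotation_euler[2] = theta_rad

print(0.0 <= theta_rad < 2.0 * math.pi)
```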

Issues downloading the generated dataset

Dear maintainers, I am trying to download the generated dataset. When I run

wget https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip

I encounter the following error:

--2020-04-22 20:48:27--  https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 104.22.75.142, 104.22.74.142, 2606:4700:10::6816:4a8e, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.75.142|:443... connected.
HTTP request sent, awaiting response... 429 Too Many Requests
2020-04-22 20:48:27 ERROR 429: Too Many Requests.

This is the first time I have run this command from this IP address.

Question templates scrambling

Did you scramble the question templates somehow when generating CLEVR? From what I see, running your code on consecutive images yields a batch of questions from the same template file.

Add different shapes

I tried adding new shapes that I created to the dataset in order to render some new images, but I get an error: bpy_prop_collection[key]: Key "Polygon" not found.
I was able to add a torus and a cone with no issues, but nothing else (like a cube that isn't smooth). Could you explain why this is an issue?

Cannot run render_images.py on Blender 2.81

With Blender 2.81, on macOS Mojave I get the following error when I run
blender --background --python render_images.py -- --num_images 10

Blender 2.81 (sub 16) (hash f1aa4d18d49d built 2019-12-04 14:33:18)
found bundled python: /Applications/Blender.app/Contents/Resources/2.81/python
Read blend: data/base_scene.blend
Traceback (most recent call last):
  File "/Users/tommaso/CSEM_repos/clevr-dataset-gen/image_generation/render_images.py", line 568, in <module>
    main(args)
  File "/Users/tommaso/CSEM_repos/clevr-dataset-gen/image_generation/render_images.py", line 187, in main
    output_blendfile=blend_path,
  File "/Users/tommaso/CSEM_repos/clevr-dataset-gen/image_generation/render_images.py", line 264, in render_scene
    bpy.ops.mesh.primitive_plane_add(radius=5)
  File "/Applications/Blender.app/Contents/Resources/2.81/scripts/modules/bpy/ops.py", line 201, in __call__
    ret = _op_call(self.idname_py(), None, kw)
TypeError: Converting py args to operator properties: : keyword "radius" unrecognized

Blender quit
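The radius keyword of primitive_plane_add was removed in Blender 2.80 and replaced by size, where size is the full side length (roughly 2 x the old radius). A version-agnostic wrapper can paper over the rename; the sketch below is tested against a stand-in operator rather than bpy itself, which only exists inside Blender:

```python
def plane_add(op, radius):
    """Call a plane-add operator across the Blender 2.8 API change.

    Blender < 2.80 accepted radius=<r>; 2.80+ renamed it to size, with
    size being the full side length (about 2 * radius). 'op' stands in
    for bpy.ops.mesh.primitive_plane_add.
    """
    try:
        return op(size=2 * radius)    # Blender >= 2.80
    except TypeError:                 # older API: 'size' unrecognized
        return op(radius=radius)
```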

No key 'function' in the list of questions

I am referring to the CLEVR IEP repository for training a model. I followed the steps in the TRAINING.md file, but unfortunately the script breaks. In programs.py, the functions function_to_str, list_to_tree, tree_to_prefix, and tree_to_postfix all try to access the key cur['function'], which does not exist in any of the generated questions.

Does someone have updated question-generation code for clevr-dataset-gen, and if not, how could this be tackled?
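One workaround, assuming (as several forks report) that this repo emits program nodes under the key 'type' while clevr-iep expects 'function', is to rename the key in the generated JSON before training:

```python
import json

def rename_program_keys(path_in, path_out, old="type", new="function"):
    """Rewrite a generated questions JSON so program nodes use 'function'."""
    with open(path_in) as f:
        data = json.load(f)
    for q in data["questions"]:
        for node in q.get("program", []):
            if old in node:
                node[new] = node.pop(old)
    with open(path_out, "w") as f:
        json.dump(data, f)
```

Inspect one generated question first to confirm which key your version actually produces.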

Training GQNs on CLEVR dataset

I have done some limited experiments training Generative Query Networks (GQNs) on the CLEVR dataset so I can experiment with using them instead of ResNet-based embeddings for CLEVR VQA.

I modified the CLEVR dataset generator to create additional images and metadata that allow GQNs to be trained on the CLEVR domain: it renders multiple views of the same scene from a camera moving along a ring, and preserves the camera pose so it can be used both for GQN training and for generating embeddings for the original CLEVR image.

Here is an example from training, where the GQN objective is to predict a new, previously unseen view after being given multiple context views from different angles.

Screen Shot 2019-06-11 at 6 04 24 AM

You can see that even with limited training time it does a decent job of predicting the new view, with mostly accurate shadows. Below is a test-time example.

Screen Shot 2019-06-16 at 1 41 00 PM

I saw some promising preliminary results using the baseline models in clevr-iep but I also think that this might be an interesting area for others to investigate too. At least the intuition is that neural scene representations could improve scene understanding.

Before I clean up my code for a pull request, I wanted to ask whether there would be interest in one. Below is my branch, which I would generalize and clean up.

master...loganbruns:clevr_gqn

Thanks,
logan

degenerate questions

The degeneracy check in generate_questions.py is applied only when there is a "raw" relate node in the template. This effectively means that it is only applied to question families defined in templates/single_and.json. Was this also the case when the original dataset was generated?

On a related note, when I generate new questions, many degenerate questions go undetected because of the object uniqueness check. It is often the case that the output of intersection refers to several objects, and yet all of them have the same value for the queried attribute. Is that by design?

Bounding box set

I am able to render my own images with bounding boxes using this solution. But I'm wondering whether there is a dataset with bounding boxes available for download somewhere, like a training set of 70,000 images. Generating this myself (on my PC) would take approximately 100 hours.
