yifanjiang19 / pedestrian-synthesis-gan


Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond

License: Other

Python 99.08% Shell 0.92%

computer-vision data-augmentation deep-learning generative-adversarial-networks pytorch


pedestrian-synthesis-gan's Issues

Awesome concept which simply doesn't work for me.

Apart from some issues in the code:

networks.py: change _id to device in line 91.
networks.py: change the division (/) to integer division (//) in lines 451 and 452.
base_model.py: change device_id to device in line 47.
pix2pix_model.py: change self.XXX.data[0] to self.XXX.cpu().data in lines 189 to 196.

The model trains okay, but the results are really bad, and I am not sure what I am doing wrong. I used a dataset of about 500 roughly similar-looking 256 x 256 images of pedestrians on sidewalks. Training completes and the loss graphs look okay, but when I run test.py to generate pedestrians in images with similar backgrounds and no pedestrians, the results are really bad.

I also trained the model on only 6 images with the same background but different pedestrians. Once again the model trains really well, but when I test it on the very same images it was trained on, it cannot recreate them. I would have expected those images to be learned and recreated.

I tried experimenting with hyperparameters, learning rates, the weighting of the losses, U-Net depth, you name it, but unfortunately I still can't get it to work. I think the idea is awesome and sound and it should work, but not for me :-( Can you please point me in some direction on what I might be doing wrong?

A question about datasets

When I trained with more than ten pictures, I encountered the following problem:
RuntimeError: Given input size: (512, 5, 3). Calculated output size: (1, 2, 0). Output size is too small.
What should I do?

how to prepare dataset

Hi, I downloaded gtFine_trainvaltest and gtBbox_cityPersons_trainval from the Cityscapes dataset. However, the pictures and labels are not like the ones you show in your dataset.

Could you please upload the dataset or give me a link to download it?

What's more, I have no idea how to prepare the data in large quantities. Did you select the pictures automatically or by hand?

Thanks for your help in advance.

spp-layer

Could you please point out where the SPP layer is located in the code?

Noise box

Could the author explain exactly how noise is added inside the annotated bounding boxes of the source data?

torch.nn.MaxPool2d() RuntimeError

invalid argument 2: pad should be smaller than half of kernel size, but got padW = 1, padH = 2, kW = 14, kH = 3 at /pytorch/torch/lib/THCUNN/generic/SpatialDilatedMaxPooling.cu:39
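The numbers in this message are consistent with a spatial pyramid pooling window computed on a very small feature map. Below is a minimal sketch, assuming the kernel and padding formulas commonly used in PyTorch SPP implementations (kernel = ceil(size / bins), pad = (kernel * bins - size + 1) // 2). These formulas are not confirmed against networks.py, and the function names and the 9 x 54 feature-map size are hypothetical, chosen only because they reproduce the reported values.

```python
import math


def spp_window(size, bins):
    """Return (kernel, pad) for one axis of one pyramid level.

    Assumed formulas; not confirmed against the repository's networks.py.
    """
    kernel = math.ceil(size / bins)
    pad = (kernel * bins - size + 1) // 2  # integer division keeps pad an int
    return kernel, pad


def is_valid(kernel, pad):
    # Max pooling requires the padding to be at most half the kernel size.
    return 2 * pad <= kernel


# Hypothetical 9 x 54 feature map pooled into a 4 x 4 grid.
k_h, p_h = spp_window(9, 4)   # kernel 3, pad 2
k_w, p_w = spp_window(54, 4)  # kernel 14, pad 1
print("height:", k_h, p_h, is_valid(k_h, p_h))
print("width: ", k_w, p_w, is_valid(k_w, p_w))
```

Under these assumptions the height axis fails the check, which would suggest that the cropped region is too small for the finest pyramid level. Larger bounding boxes or fewer bins would avoid the problem.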

Pre-trained model

Hello,
I'm interested in your work and need to do a similar job. Could you share your pre-trained model or your weights (.pth file)?
I would really appreciate your help.
Thanks for your attention.

TypeError: cuda() got an unexpected keyword argument 'device_id'

I encountered a problem while training.
Traceback (most recent call last):
File "train.py", line 13, in <module>
model = create_model(opt)
File "/home/zxt/GAN Network/PS-GAN/models/models.py", line 19, in create_model
model.initialize(opt)
File "/home/zxt/GAN Network/PS-GAN/models/pix2pix_model.py", line 36, in initialize
opt.which_model_netG, opt.norm, not opt.no_dropout, self.gpu_ids)
File "/home/zxt/GAN Network/PS-GAN/models/networks.py", line 51, in define_G
netG.cuda(device_id=gpu_ids[0])
TypeError: cuda() got an unexpected keyword argument 'device_id'
How can I solve this problem? Looking forward to your reply.
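In PyTorch 0.4 and later, the `cuda()` method of a module takes `device` rather than `device_id`, and another issue on this page reports that passing the index positionally, as in `netG.cuda(gpu_ids[0])`, gets past this error. The sketch below applies that change to source text with a regular expression. The helper name `patch_cuda_calls` is hypothetical, and the pattern also covers the `_id=` variant that appears in another traceback here.

```python
import re

# Matches `.cuda(device_id=` and the `.cuda(_id=` variant, including the case
# where the keyword sits on the line after the opening parenthesis.
_CUDA_KWARG = re.compile(r"\.cuda\(\s*(?:device_id|_id)\s*=\s*")


def patch_cuda_calls(source: str) -> str:
    """Rewrite `.cuda(device_id=X)` to the positional form `.cuda(X)`."""
    return _CUDA_KWARG.sub(".cuda(", source)


print(patch_cuda_calls("netG.cuda(device_id=gpu_ids[0])"))
```

Review the resulting diff before overwriting any file; this is only a text substitution and does not check that the surrounding code is otherwise compatible with a newer PyTorch.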

Confusing bounding box key name

The bounding box key names 'h' and 'w' suggest 'height' and 'width', but they are NOT.

According to the source code, they are the bounding box's bottom-right (x, y) coordinates.
https://github.com/yueruchen/Pedestrian-Synthesis-GAN/blob/master/models/pix2pix_model.py#L93-L94

(This repository seems to be inactive, but I am leaving this note for future reference.)
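To make the distinction concrete, here is a small sketch that recovers the actual width and height from such a box. It assumes that 'x' and 'y' hold the top-left corner, 'w' the right-hand x coordinate and 'h' the bottom y coordinate. Only the 'h' and 'w' names come from the note above; the other keys, the helper name and the sample values are illustrative.

```python
def bbox_size(bbox):
    """Return (width, height) of a box stored as two corner points.

    Assumption: 'x', 'y' are the top-left corner; 'w', 'h' are the
    bottom-right x and y, not the box width and height.
    """
    width = bbox["w"] - bbox["x"]
    height = bbox["h"] - bbox["y"]
    return width, height


example = {"x": 100, "y": 60, "w": 140, "h": 150}  # made-up values
print(bbox_size(example))  # (40, 90)
```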

Models?

Sorry to bother you. Could you release some models trained on your datasets, and explain how to make the training data?

How to create a data set

How should the original images and the noise images be placed in datasets?
@yueruchen

test error

Hi, after I run "python test.py --dataroot data_path --name model_name --model pix2pix --which_model_netG unet_256 --which_direction BtoA --dataset_mode aligned --use_spp --no_lsgan --norm batch",
I get "test.py: error: unrecognized arguments: --no_lsgan".
Would you please help me with this?
Thanks a lot.

detection setup

Hi,
I would like to compare my results on the CityPersons detections with your work in your paper. Can you please tell me the exact evaluation protocol that you used?
What were the parameters? Did you evaluate on all of the validation samples or only on a subset (e.g. based on height, occlusion, etc.)? Could you possibly share the data or the script you evaluated with? I am also interested in which kind of Average Precision you report: is it an average AP (if so, over which thresholds?) or AP at a predefined overlap?
Thank you very much in advance!

Replacing pedestrians in the bounding box

In the paper you say

We replace the pedestrians in the bounding boxes with
random noise and train the generator G to synthesize new pedestrians within
that noise region.

How are you selecting the pedestrian in the first place? Are you using object detection to detect the pedestrians on the road (assume Cityscapes data for now) and then applying the noise after getting y1, y2, x1, x2?

Or are you manually drawing a bounding box around a pedestrian and then applying noise within the box?
If the latter, what if I would like to pre-process, let's say, 10,000 images?

Please shed some light on this.
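For illustration only, the noise step described in the quoted passage could look like the sketch below when the boxes are already available, for example from annotation files rather than manual drawing. This is not the authors' preprocessing code. The function name, the grayscale list-of-rows image and the uniform noise in [0, 255] are all assumptions, since the paper excerpt says only "random noise".

```python
import random


def fill_box_with_noise(image, x1, y1, x2, y2, rng=None):
    """Return a copy of `image` with the box region replaced by noise.

    `image` is a list of rows of grayscale values. Rows y1..y2-1 and
    columns x1..x2-1 are overwritten with uniform integers in [0, 255]
    (an assumed noise distribution).
    """
    rng = rng or random.Random()
    noisy = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(y1, y2):
        for x in range(x1, x2):
            noisy[y][x] = rng.randint(0, 255)
    return noisy


# Made-up 4 x 6 image and box.
image = [[0] * 6 for _ in range(4)]
noisy = fill_box_with_noise(image, x1=2, y1=1, x2=5, y2=3, rng=random.Random(0))
for row in noisy:
    print(row)
```

Because the function only needs box coordinates, it can be run in a loop over an annotation file, which would cover the 10,000-image case without manual work.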

pretrained model

Hi, can you please provide some pretrained models?
Thank you very much in advance.

Why not good performance

Hello, Mr. Jiang, I need your help again.
First, I made pedestrian photos like the one you show.
Then I trained the model with 280 photos and batchSize=2, and after 1000 epochs the outputs look like this:

What is the problem, and could you please give me some advice on improving the performance?
I hope to get performance like what you show in the paper.
Thanks in advance!

Data preprocessing code

Hello, Thank you for releasing your code. I would like to know if your data preprocessing code for the cityscapes dataset (converting the original dataset to be usable by your code) is available?

Should person discriminator's loss be back-propagated to generator?

I think the discriminator's loss shouldn't be back-propagated to the generator.
But in the source code, when the cropped-pedestrian discriminator's loss is calculated, the tensor self.person_crop_fake isn't detached. I think it should be detached. Is this a bug?

https://github.com/yueruchen/Pedestrian-Synthesis-GAN/blob/master/models/pix2pix_model.py#L131

Training on Custom data and objects besides person

Hi @oyxhust @yueruchen, thanks for the wonderful code. When I train the model on a custom object other than a person, it behaves differently. Can you share some pointers on what I need to consider when I change the object class?
Thanks in advance.

Which CUDA and PyTorch versions did you use? I would like to run your program on my own machine

Thank you very much for providing the code.

Python version problem

Which version of PyTorch did you use? I am using 1.0 and I get a lot of errors.

trained Model

Can you please share trained model?

About PatchGAN

hello, sorry to bother you again

command line

python train.py --dataroot data_path --name model_name --model pix2pix --which_model_netG unet_256 --which_direction BtoA --lambda_A 100 --dataset_mode aligned --use_spp --no_lsgan --norm batch

The image discriminator was printed as follows:

NLayerDiscriminator(
(model): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace)
(2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): LeakyReLU(negative_slope=0.2, inplace)
(5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): LeakyReLU(negative_slope=0.2, inplace)
(8): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(9): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(10): LeakyReLU(negative_slope=0.2, inplace)
(11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(12): Sigmoid()
)
)

I think it is a typical convolutional network without N*N patches. What do I need to change in the command line if I want to use PatchGAN?
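One way to read the printed network is that it is already patch-based: it is fully convolutional, so it outputs a grid of scores, each of which depends only on a limited input patch. The sketch below uses the standard receptive-field recurrence on the five Conv2d layers printed above. The 256-pixel input side is an assumption based on the unet_256 setting, and the function names are illustrative.

```python
# (kernel, stride, padding) of the five Conv2d layers printed above.
LAYERS = [(4, 2, 2), (4, 2, 2), (4, 2, 2), (4, 1, 2), (4, 1, 2)]


def receptive_field(layers):
    """Side length of the input patch seen by a single output value."""
    rf = 1
    for kernel, stride, _ in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


def output_size(size, layers):
    """Side length of the output map for a square input of side `size`."""
    for kernel, stride, pad in layers:
        size = (size + 2 * pad - kernel) // stride + 1
    return size


print(receptive_field(LAYERS))   # 70
print(output_size(256, LAYERS))  # 35
```

With these layers each output score covers a 70 x 70 input patch, and a 256 x 256 input yields a 35 x 35 grid of such scores, so no extra command-line flag appears to be needed to get patch-level discrimination.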

Updates generator every step

Hey, the code looks like this

if iter_d <= CRITIC_ITERS-1:
    only_d = False
else:
    only_d = False

Is this just a silly bug, or are there models where you want to update the generator every time? The CycleGAN model doesn't even take the parameter only_d.
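If the intent was a schedule in which the discriminator is trained for several steps per generator update, the flag would need to differ between the two branches. The sketch below shows one such schedule. It is a guess at the intended behavior rather than the authors' code, and the value of CRITIC_ITERS and the function name are hypothetical.

```python
CRITIC_ITERS = 5  # hypothetical value; the issue does not state it


def discriminator_only(step, critic_iters=CRITIC_ITERS):
    """True when only the discriminator should be updated at this step.

    Assumed intent: the generator is updated on every `critic_iters`-th step.
    """
    return (step + 1) % critic_iters != 0


schedule = [discriminator_only(step) for step in range(10)]
print(schedule)
```

With CRITIC_ITERS set to 1 this reduces to updating the generator at every step, which matches what the quoted code currently does.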

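Dataset

Hello, I have recently been reading papers on pedestrian detection and came across your article. I would like to try some experiments too, but I ran into problems downloading the tsinghua-daimler-cyclist dataset: the link always redirects elsewhere. Do you still have this dataset, and if so, could you share it with me? My email is [email protected]. Either way, many thanks.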

Thanks for your nice work. When I train with the code, I get the following error. How should I handle it? Thanks for your reply.

requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x00000229052A80B8>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it',))
create web directory ./checkpoints\model_name\web...
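The message says that nothing accepted the connection on localhost port 8097, which is the display_port value shown in the options dump elsewhere on this page. In the upstream pix2pix code this port is used for a visdom visualization server, which has to be started separately with `python -m visdom.server`. Whether this repository behaves identically is an assumption. A quick standard-library check for a listening server, with a hypothetical helper name:

```python
import socket


def port_is_open(host="localhost", port=8097, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if not port_is_open():
    print("Nothing is listening on port 8097; try: python -m visdom.server")
```

In the upstream code, passing `--display_id 0` turns the visdom display off instead; that may apply here as well, since display_id appears among the options.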

Using Spatial Pyramid Pooling to input images of different sizes

Is there a way to use your SPP code on top of pix2pix so that I can input an image of any size and avoid resizing it? I basically want to maintain the same aspect ratio.

Thanks
Gaurav

Generating the noisy data

Can you give pointers as to which script creates the noisy data?

training error

Running training as is, I get the following.

I got this issue first: #6 (comment)
I changed the code from this

netG.cuda(device_id=gpu_ids[0])

to just netG.cuda(gpu_ids[0])

Running it again got me this

Traceback (most recent call last):
  File "train.py", line 35, in <module>
    model.optimize_parameters(only_d)
  File "/home/sevak/github/Pedestrian-Synthesis-GAN/models/pix2pix_model.py", line 179, in optimize_parameters
    self.backward_D_person()
  File "/home/sevak/github/Pedestrian-Synthesis-GAN/models/pix2pix_model.py", line 131, in backward_D_person
    self.person_fake = self.netD_person.forward(self.person_crop_fake)
  File "/home/sevak/github/Pedestrian-Synthesis-GAN/models/networks.py", line 480, in forward
    spp = self.spatial_pyramid_pool(x,1,[int(x.size(2)),int(x.size(3))],self.output_num)
  File "/home/sevak/github/Pedestrian-Synthesis-GAN/models/networks.py", line 453, in spatial_pyramid_pool
    x = maxpool(previous_conv)
  File "/home/sevak/python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sevak/python3/lib/python3.5/site-packages/torch/nn/modules/pooling.py", line 142, in forward
    self.return_indices)
  File "/home/sevak/python3/lib/python3.5/site-packages/torch/nn/functional.py", line 396, in max_pool2d
    ret = torch._C._nn.max_pool2d_with_indices(input, kernel_size, stride, padding, dilation, ceil_mode)

Any ideas?

cuda() got an unexpected keyword argument '_id'

when training, the below error occurs:

python train.py \
    --dataroot /content/psgan_datasets/ \
    --name my_experiment_1 \
    --model pix2pix \
    --which_model_netG unet_256 \
    --batchSize 256 \
    --which_direction BtoA \
    --lambda_A 100 \
    --dataset_mode aligned \
    --use_spp --no_lsgan --norm batch

------------ Options -------------
batchSize: 256
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
dataroot: /content/psgan_datasets/
dataset_mode: aligned
display_freq: 100
display_id: 1
display_port: 8097
display_single_pane_ncols: 0
display_winsize: 256
fineSize: 256
gpu_ids: [0]
identity: 0.0
input_nc: 3
isTrain: True
lambda_A: 100.0
lambda_B: 10.0
loadSize: 286
lr: 0.0002
max_dataset_size: inf
model: pix2pix
nThreads: 2
n_layers_D: 3
name: my_experiment_1
ndf: 64
ngf: 64
niter: 100
niter_decay: 100
no_dropout: False
no_flip: False
no_html: False
no_lsgan: True
norm: batch
output_nc: 3
phase: train
pool_size: 50
print_freq: 100
resize_or_crop: resize_and_crop
save_epoch_freq: 5
save_latest_freq: 5000
serial_batches: False
use_spp: True
which_direction: BtoA
which_epoch: latest
which_model_netD: basic
which_model_netG: unet_256
-------------- End ----------------
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1200
pix2pix
Traceback (most recent call last):
  File "train.py", line 13, in <module>
    model = create_model(opt)
  File "/content/Pedestrian-Synthesis-GAN/models/models.py", line 19, in create_model
    model.initialize(opt)
  File "/content/Pedestrian-Synthesis-GAN/models/pix2pix_model.py", line 43, in initialize
    self.netD_person = networks.define_person_D(opt.input_nc, opt.ndf, opt, use_sigmoid, self.gpu_ids)
  File "/content/Pedestrian-Synthesis-GAN/models/networks.py", line 91, in define_person_D
    _id=gpu_ids[0])
TypeError: cuda() got an unexpected keyword argument '_id'

The symptom is similar to this issue:
KupynOrest/DeblurGAN#26

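Creating the dataset

Hello, sorry to bother you. I would like to ask about the dataset format. My questions are:
1) Are the noise boxes added manually? Does the code not include a step that adds them?
2) Judging from the format in dataset, does the image with the noise box need to be concatenated with the original image into a single image?
I hope to hear from you. Thank you!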

python test.py --dataroot ./datasets

Hi, when I run python test.py --dataroot ./datasets,
an error occurs:
CustomDatasetDataLoader
dataset [UnalignedDataset] was created
Traceback (most recent call last):
File "test.py", line 16, in <module>
data_loader = CreateDataLoader(opt)
File "/home/ben/workspace/PSGAN/data/data_loader.py", line 6, in CreateDataLoader
data_loader.initialize(opt)
File "/home/ben/workspace/PSGAN/data/custom_dataset_data_loader.py", line 30, in initialize
self.dataset = CreateDataset(opt)
File "/home/ben/workspace/PSGAN/data/custom_dataset_data_loader.py", line 20, in CreateDataset
dataset.initialize(opt)
File "/home/ben/workspace/PSGAN/data/unaligned_dataset.py", line 17, in initialize
self.A_paths = make_dataset(self.dir_A)
TypeError: make_dataset() missing 1 required positional argument: 'dir_bbox'

The problem about training

Hello, I ran into a problem during training. How can I solve it?
Traceback (most recent call last):
File "train.py", line 13, in <module>
model = create_model(opt)
File "/home/image/fly/PS-GAN/models/models.py", line 19, in create_model
model.initialize(opt)
File "/home/image/fly/PS-GAN/models/pix2pix_model.py", line 43, in initialize
self.netD_person = networks.define_person_D(opt.input_nc, opt.ndf, opt, use_sigmoid, self.gpu_ids)
File "/home/image/fly/PS-GAN/models/networks.py", line 85, in define_person_D
netD = PersonDiscriminator(input_nc, ndf, use_sigmoid, gpu_ids=gpu_ids)
TypeError: __init__() got multiple values for keyword argument 'gpu_ids'

input of size [0]

Hi,

I got this error with my custom dataset, whose images are 256x128.

File "/usr/local/lib/python2.7/dist-packages/torch-0.4.1-py2.7-linux-x86_64.egg/torch/nn/modules/conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 4, 4], but got input of size [0] instead.

Thanks in advance.
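A possible reading of this error: the weight shape [64, 3, 4, 4] belongs to a convolution with 3 input channels, while the image discriminator printed in another issue on this page takes 6, so the empty tensor may be a pedestrian crop whose bounding box falls outside the image. That is an inference, not something confirmed in the thread. Here is a sketch of a bounds check that could be run over the annotations before training, with a hypothetical helper name and made-up boxes:

```python
def clamp_box(x1, y1, x2, y2, width, height):
    """Clip a corner-format box to the image; also report if it is non-empty."""
    x1 = min(max(0, x1), width)
    y1 = min(max(0, y1), height)
    x2 = min(max(0, x2), width)
    y2 = min(max(0, y2), height)
    return (x1, y1, x2, y2), (x2 > x1 and y2 > y1)


# Assumes 256 x 128 means width x height; both boxes are made up.
print(clamp_box(300, 40, 340, 120, width=256, height=128))  # outside the image
print(clamp_box(100, 20, 140, 110, width=256, height=128))  # inside the image
```

Boxes that come back empty would produce a zero-size crop. Note also that the options dump elsewhere on this page shows the images being resized (loadSize 286, fineSize 256), so boxes may need to be expressed in the resized coordinates.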

70x70 PatchGAN

Dear Yifan,
I read your encouraging paper with great interest, but I am still confused about several implementation details of Pedestrian-Synthesis-GAN, such as where the 70x70 PatchGAN is in this code.
How can I contact you?
My email: [email protected]
Thank you very much.

Best,
Wei

GAN

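Hello, I read your master's thesis and learned a great deal from it. My current application is generating tanks in battlefield scenes, and I hope to get your guidance. Also, to run this project, do I need to download the Cityscapes data and crop it to 256*256 as described in Chapter 4 of the thesis before training?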

how to get noise picture?

Thank you. I want to try it, but how do I get the noise pictures? Can you share your code for adding noise to a picture?
@yueruchen
