jhoffman / cycada_release
Code to accompany ICML 2018 paper
License: BSD 2-Clause "Simplified" License
cycada_release/cycada/models/drn.py, line 16 in 8629c03:
Does there happen to be a mirror or a different way to download these paths?
Hi @jhoffman or other people,
I know you are busy, but I hope you have time to answer me.
I have tried to locate the code where you run the transfer of GTA to CityScapes (not the segmentation part of the code). Are you using the cycle_gan_semantic model for this? If so, did you change the structure of netCLS etc.? Did you actually release the code for GTA to CityScapes? And can it be run without pretrained models?
My goal is to use it for my own data and here I don't have any pretrained models.
I hope you can help!
Thanks!
In your code there's an identity loss, but I didn't find this part in the paper.
Is it useful in your final improvement?
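For reference, a minimal sketch of the CycleGAN-style identity term (the function name and default weights here are illustrative, not this repo's exact code; netG_A/netG_B and real_A/real_B follow the upstream CycleGAN naming):

import torch.nn as nn

def identity_loss(netG_A, netG_B, real_A, real_B,
                  lambda_A=10.0, lambda_B=10.0, lambda_idt=0.5):
    # A generator fed an image already in its output domain should
    # return it (approximately) unchanged; penalize deviation with L1.
    l1 = nn.L1Loss()
    loss_idt_A = l1(netG_A(real_B), real_B) * lambda_B * lambda_idt  # netG_A: A -> B
    loss_idt_B = l1(netG_B(real_A), real_A) * lambda_A * lambda_idt  # netG_B: B -> A
    return loss_idt_A + loss_idt_B

Setting lambda_identity to 0, as in the flags quoted later in this thread, disables the term entirely.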
When I tried to run the code, I ran into the following issue with logging. Since it's just logging, may I simply comment this line out? Thanks.
Traceback (most recent call last):
  File "train_fcn_adda.py", line 389, in <module>
    main()
  File "/home/truly/.local/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/truly/.local/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/truly/.local/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/truly/.local/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "train_fcn_adda.py", line 352, in main
    logging.info(info_str)
  File "/usr/lib/python2.7/logging/__init__.py", line 1629, in info
    root.info(msg, *args, **kwargs)
  File "/usr/lib/python2.7/logging/__init__.py", line 1167, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python2.7/logging/__init__.py", line 1286, in _log
    self.handle(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 1296, in handle
    self.callHandlers(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 1336, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 759, in handle
    self.emit(record)
  File "/home/truly/truly/CYCADA/cycada_release/cycada/util.py", line 19, in emit
    msg = self.format(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 469, in format
    s = self._fmt % record.__dict__
KeyError: 'log_color'
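The KeyError suggests the handler's format string contains a %(log_color)s placeholder (a colorlog convention) while the record is being formatted by a plain logging.Formatter, which cannot supply that field. A minimal sketch of one possible workaround, assuming the colorlog package is what should inject the field (a guess about the setup, not the authors' fix):

import logging
import colorlog

handler = logging.StreamHandler()
# ColoredFormatter fills in %(log_color)s, which plain Formatter cannot.
handler.setFormatter(colorlog.ColoredFormatter(
    '%(log_color)s%(levelname)s:%(name)s:%(message)s'))
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)
root.info('sanity check')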
First, thanks a lot for releasing the code!!!
One minor issue is that the links for the DRN models at the end of the README don't work anymore... Would you mind checking this? Thank you so much again!
I downloaded the GTA images, put them under the cycada folder, and set data_dir to 'gta/'.
I tried to run bash scripts/train_fcn_adda.sh
and got this error:
No such file or directory: 'gta/cyclegta5/split.mat'
Could you please let me know where I can find the split.mat? Thanks!
Hi,
I would like to replicate the paper results (SVHN->MNIST, for example), but I just noticed that the test_cycada.sh script and the underlying test.py produce just the converted images and do not report any accuracy at all.
Could you suggest how to proceed in order to reproduce the numerical results from the paper? Should I write my own test function, testing on the target dataset after a full training?
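In case it helps, a minimal sketch of such a test function (all names are hypothetical; model is whatever task net was trained, loader a DataLoader over the target test set):

import torch

def evaluate(model, loader, device='cuda'):
    # Top-1 accuracy of a trained task net on a target-domain test set.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total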
Did you implement the Synthia->Cityscapes procedure? I used one based on svhn->mnist, but it didn't seem to work well.
Am I missing something, or do I misunderstand the adaptation procedure?
My other settings are:
--name cycada_synthia2cityscapes_noIdentity
--resize_or_crop resize_and_crop
--loadSize=768
--fineSize=768
--which_model_netD n_layers
--n_layers_D 3
--model cycle_gan_semantic
--lambda_A 1
--lambda_B 1
--lambda_identity 0
--no_flip
--batchSize 1
--dataset_mode synthia_cityscapes
--dataroot /nfs/project/libo_i/cycada/data
--which_direction BtoA
--display_id 0
@jhoffman Same question: how do we translate GTA images to CityScapes images on our own?
I find it's different from CycleGAN's original code, especially in cycle_gan_semantic_models.py.
I set up my dataset properly using 'unaligned_datasets.py', but when I run train.py in the cyclegan module, it raises an error about no self.input_A_label found.
I find that the self.input_A_label variable is specifically used in svhn->mnist to label each image's class.
GTA->CityScapes may not need this label? But then this degenerates to the original CycleGAN.
So how do we deal with the label variable and the related netCLS network, etc.?
Originally posted by @Luodian in #11 (comment)
@jhoffman
In cycle_gan_semantic_model.py, netCLS was only implemented to work on digit datasets. How do I re-implement it for a larger image resolution, say 256x256?
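One hypothetical way to make such a classifier resolution-independent (an illustrative sketch, not the authors' design): stack strided convolutions and finish with adaptive average pooling so the fully connected head never sees a resolution-dependent size.

import torch.nn as nn

class ConvClassifier(nn.Module):
    def __init__(self, num_classes, in_channels=3, width=64):
        super(ConvClassifier, self).__init__()
        layers, ch = [], in_channels
        for out_ch in (width, width * 2, width * 4, width * 8):
            layers += [nn.Conv2d(ch, out_ch, 4, stride=2, padding=1),
                       nn.BatchNorm2d(out_ch),
                       nn.ReLU(inplace=True)]
            ch = out_ch
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)  # collapses any spatial size to 1x1
        self.fc = nn.Linear(ch, num_classes)

    def forward(self, x):
        x = self.pool(self.features(x))
        return self.fc(x.view(x.size(0), -1))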
Hi, could you please provide the svhn2mnist.zip, mnist2usps.zip, and usps2mnist.zip files since the link to those is no longer accessible?
Thanks for such an excellent contribution!
In the image segmentation task, should we first adapt between the adapted source data and the target data in feature space, or should we first train on the adapted source data with the original source labels?
The order in the text appears to be different from that in Appendix 6.1.2.
Looking forward to your reply, thank you.
Hi
I have one question. When I use my own data with the cyclegan_semantic model, I get an out-of-memory error. My images are a bit bigger, but I have resized them. When you applied cyclegan_semantic to the GTA->city images, did you use the same code as for mnist_svhn (from GitHub)? Or, when applying the model to a more complex dataset, should we adjust the code?
Thanks!
Hi, I'm trying to use CyCADA on my own custom datasets. I've set them up according to the cyclegan tutorial (https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/master/docs/tips.md) and adapted them into cycada. If I turn off the semantic consistency loss, the model trains fine. But when I turn it on, I get a size mismatch error.
However, if I set the load_size to 32 (the same size as the SVHN/MNIST demo), I don't get the size mismatch error, but it is not desirable to use images so small. If I start training with SVHN/MNIST but set the images larger than 32, the code does not crash. How do I process my datasets or change the feature extraction so I can use custom data with larger images?
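The mismatch usually comes from a fully connected layer whose input dimension was hard-coded for 32x32 feature maps. A hedged sketch of one workaround (a hypothetical wrapper, not code from this repo): pool the conv features to a fixed grid before flattening, so larger inputs no longer change the flattened size.

import torch.nn as nn

class SizeAgnosticHead(nn.Module):
    def __init__(self, trunk, trunk_channels, num_classes, grid=4):
        super(SizeAgnosticHead, self).__init__()
        self.trunk = trunk                      # any conv feature extractor
        self.pool = nn.AdaptiveAvgPool2d(grid)  # fixed grid regardless of input size
        self.fc = nn.Linear(trunk_channels * grid * grid, num_classes)

    def forward(self, x):
        x = self.pool(self.trunk(x))
        return self.fc(x.view(x.size(0), -1))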
Equation 5 in the paper makes use of f_S; however, the corresponding portion in Figure 2 (orange) only makes use of f_T. Additionally, f_T makes more sense since the input is in the target domain. Should the f_S in Equation 5 be f_T?
Hi, @jhoffman ,
Could you also provide the code for "Train Image-level Adaptation for Semantic Segmentation" as mentioned in the CyCADA paper? The code in this repo only shows the feature-level domain adaptation (ADDA) method.
THX!
Hello,
How do I run the code without a server set up?
self.parser.add_argument('--display_winsize', type=int, default=256, help='display window size')
self.parser.add_argument('--display_id', type=int, default=1, help='window id of the web display')
self.parser.add_argument('--display_server', type=str, default="http://localhost", help='visdom server of the web display')
self.parser.add_argument('--display_port', type=int, default=8097, help='visdom port of the web display')
However, it seems unreasonable to assume that most people will have an HTTP server set up for running visdom.
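For what it's worth, the upstream CycleGAN options appear to treat a non-positive display_id as "no visdom display": the settings another user quoted earlier in this thread pass --display_id 0 for exactly this reason. So a hedged workaround (assuming this repo keeps the upstream behavior) is simply:

python train.py --display_id 0 [your other flags]

With that flag, no visdom server needs to be running.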
cycada_release/cycada/data/data_loader.py, line 86 in 0a50507:
Hi, I was comparing the image preprocessing routines in the cycada code and the cyclegan code in the submodule for the SVHN to MNIST task. It seems that in the cyclegan code, the MNIST image is normalized with transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)). In contrast, in the cycada directory the input image is normalized with transforms.Normalize((0.5,), (0.5,)), which I believe only normalizes the first channel of the image. I understand that they don't have to match since the ADDA part is separate from the cyclegan part, but is this difference intentional?
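A quick check of what the one-tuple form does (a small illustrative snippet, assuming a recent torchvision): with a single-channel tensor such as grayscale MNIST, Normalize((0.5,), (0.5,)) matches the channel count and normalizes the whole image. Recent torchvision raises an error on a channel/stats length mismatch, while very old versions would indeed silently normalize only the first channel of an RGB tensor.

import torch
from torchvision import transforms

gray = torch.rand(1, 28, 28)                      # 1-channel, MNIST-style
out = transforms.Normalize((0.5,), (0.5,))(gray)  # maps [0, 1] to [-1, 1]
print(out.min().item(), out.max().item())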
Sorry, I thought CyCADA jointly adapted features and images, but it turns out to be separate adaptation.
Hello,
Firstly, thanks for the amazing work on CyCADA, and for sharing the code so we can use it in our own implementations.
I cloned the repo, and I am starting to use the network for my own problem with my own dataset.
Simply put, I want to convert synthetic images to real-world data. I have both datasets available, with labels (semantic segmentation) for the synthetic data (domain A).
My question is how to start with CyCADA: as a first step, do I have to train a semantic segmentation network on my synthetic labeled images (domain A), save this model, and then start the CycleGAN training adapted with the semantic loss (CyCADA)?
Is there a provided network that can be trained for a semantic segmentation task, or are there instructions for how to start CyCADA with my own dataset?
Thanks a lot
Hi! After reading your code, I think your feature adaptation (scripts/train_adda.py) is the ADDA method. Am I right? Besides, I found that you directly feed the output of the entire network into ADDA instead of the feature-extractor output. I am confused: is this still feature adaptation?
First of all, thanks for the great work.
Recently I implemented image-level adaptation from CARLA (a simulator) to real urban scenes using cycle_gan_model.py, but the results are suboptimal since a lot of detail is lost during the translation.
The dataset is quite unbalanced, with 9000 synthetic images and 21100 real images. The resolutions also differ severely: 792 x 272 for the synthetic images versus 4096 x 1024 for the real ones. Do you have any ideas how I can get an improvement?
Thanks a lot.
Hi,
I am interested in comparing the results of our own domain transfer algorithm with the results in your paper.
Would it be possible to get the successfully trained models for svhn2mnist, mnist2usps, and the other variants that you trained to get the results? That would enable me to run this code with the pretrained models and extract the results myself.
Please let me know.
Thanks in advance,
I downloaded the GTA-as-CityScapes images (16 GB), but I only find 22061 rather than 24966 GTA images. Is there any selection strategy?
When training feature adaptation following image adaptation, how should the dataset structure be set up?
Where do we put the [label]_[imageId]_fake_B.png images?
The given instructions don't seem to be enough to reproduce this.
I changed src=svhn2mnist in train_adda.py and put all the images in an svhn2mnist directory which I created (mkdir) in the dataroot folder.
It doesn't work that way. The function get_dataset in data_loader.py returns None and causes:
Traceback (most recent call last):
  File "scripts/train_adda.py", line 63, in <module>
    lr=src_lr, betas=betas, weight_decay=weight_decay)
  File "/home/junho/uda/cycada/cycada/tools/train_task_net.py", line 96, in train
    for epoch in range(num_epoch):
  File "/home/junho/uda/cycada/cycada/tools/train_task_net.py", line 25, in train_epoch
    for batch_idx, (data, target) in enumerate(loader):
TypeError: 'NoneType' object is not iterable
Plus, it would be nice to release your synthesized SVHN2MNIST images as a zip or tar file. I didn't know how to download the images from the web folder and had to crawl them via the index.html file.
Hello,
I would like to train a model on my own data.
How can I do this?
Thanks for sharing this good work. I have a question related to the semantic consistency loss (Eq. 4).
You first train a source task model fs (similar to training a segmentation network with inputs and labels (Xs, Ys)). Then you freeze fs. In the second phase, you feed Xt to fs to obtain fs(Xt) and take the argmax to obtain labels, and you feed the synthetic image through fs to obtain fs(G(Xt)). You use cross-entropy to compute the task loss.
My question is this: G(Xt) is a synthetic source-domain image, so fs(G(Xt)) will work because fs was trained on the source domain. However, Xt is from the target domain, so fs(Xt) will never work, or will be noisy in the segmentation result. Why do you use fs(Xt) as the label to compute the cross-entropy?
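For reference, a minimal sketch of the semantic consistency term as the question describes it (notation-level pseudocode made runnable; f_s is the frozen source task model and G_t2s the target-to-source generator, both assumed given):

import torch
import torch.nn.functional as F

def semantic_consistency_loss(f_s, G_t2s, x_t):
    with torch.no_grad():                # f_s is frozen
        pseudo = f_s(x_t).argmax(dim=1)  # noisy pseudo-labels from f_s(X_t)
    logits = f_s(G_t2s(x_t))             # predictions on the translated image
    return F.cross_entropy(logits, pseudo)

One common reading, offered here tentatively, is that even noisy fs(Xt) predictions constrain G to preserve semantic content during translation, since the loss only asks the translated image to keep the same labeling as the original, not the correct one.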
Did you resize your images to 128 x 128 to make cyclegan work for GTAV -> Cityscapes?
Hi, can you explain how we train CyCADA pixel-only, feature-only, and pixel & feature for GTA5 to CityScapes?
I am confused by the results from ./train_fcn.sh and ./train_fcn_adda.sh; what's the difference between them?
Thanks
Hi, I managed to train a model following your instructions but I'm not quite sure how to test.
Can you publish the model converting GTA from/to Cityscape?
Thank you
Could you provide a pre-trained model that can produce translated Cityscapes-style GTA images?
thanks
Sorry to disturb you.
I am trying to reproduce the results of the GTA-as-Cityscapes experiment.
My goal is to compare two models: 1) trained on gta5 data and evaluated on cityscape data; 2) trained on translated gta5 data and evaluated on cityscape data.
If my understanding serves me well, the drn26-gta5-iter115000.pth pre-trained model matches my first goal and drn26-cyclegta5-iter115000.pth matches my second goal.
My questions are as follows:
1. How should I use the scripts/train_fcn_adda.sh script? How can I specify that I am adopting a pre-trained model? (I only found the base_models specification, but the training script still starts from iter 0 and trains until max_iter, and baseiter is only used to select the .pth file.)
2. To evaluate the drn26-gta5-iter115000.pth model on the cityscape dataset for my first goal, do I need to download the original gta5 dataset (since scripts/train_fcn_adda.sh requires specifying the src dataset)?

In the file __init__.py in folder data, there is
elif opt.dataset_mode == 'unaligned_A_labeled':
    from data.unaligned_A_labeled_dataset import UnalignedALabeledDataset
    dataset = UnalignedALabeledDataset()
but there is no corresponding file to be found. How can I prepare the dataset? Thx.
Hi, I'm trying to reproduce your results on pixel-level adaptation. So I'm trying to train the FCN-8s model on the dataset you shared (cyclegta5), but after around 100 iterations the loss goes to NaN. This happens with a 768x768 crop and batch size 2. What can I do to reproduce your results using FCN-8s? Thanks in advance.
Hi, I got a bit confused about the implementation of the semantic loss in the CycleGANSemanticModel. In the CyCADA paper, the semantic consistency loss is computed using a pretrained model fs. However, in this code I found that the semantic loss is computed directly using the target model ft. I just wonder why the implementation differs and how this will influence the result.
Dear authors,
I was trying to download datasets such as "MNIST to USPS", but none of these datasets is available anymore. The links seem invalid. Would you please kindly provide us with the datasets to play around with your code?
Thanks!
In the file cyclegta5.py in folder data, there is
def __getitem__(self, index):
    id = self.ids[index]
    filename = '{:05d}.png'.format(id)
    img_path = os.path.join(self.root, 'images', filename)
    label_path = os.path.join(self.root, 'labels', filename)
but I can't find the labels folder.
Could you please tell me where I can find it? Thanks!
Hi
When I try to reproduce cycada on the cyclegta5-to-city images, it stops running early and shows "No suitable discriminator found -- returning", but at this point the mean IoU is about 26%.
Does anyone else run into the same problem, or is this normal for cycada? Did I make a mistake somewhere?
Also, for train_fcn_adda.sh, the paper says lambda influences the mIoU results; does that mean the lambda in the cyclegan part? Will the lambda_g and lambda_d in feature adaptation influence the mIoU results?
Thanks
I noticed that in your code, if we set discrim_feat = True, then we are training feature-level adaptation; otherwise, we are doing it at the pixel level?
I am not sure if I understand this procedure correctly, but if I only use the default script train_fcn_adda.sh, do I only get a pixel-level trained model? And do I then need to train again by loading the former model into the pixel-level setting?
Also, 'drn26' is not correctly implemented in the discrim_feat = True case.
Maybe I am not clearly aware of the whole procedure, but I've gotten really confused.
# extract features
if discrim_feat:
    score_s, feat_s = net_src(im_s)
    score_s = Variable(score_s.data, requires_grad=False)
    f_s = Variable(feat_s.data, requires_grad=False)
else:
    score_s = Variable(net_src(im_s).data, requires_grad=False)
    f_s = score_s
dis_score_s = discriminator(f_s)

if discrim_feat:
    score_t, feat_t = net(im_t)
    score_t = Variable(score_t.data, requires_grad=False)
    f_t = Variable(feat_t.data, requires_grad=False)
else:
    score_t = Variable(net(im_t).data, requires_grad=False)
    f_t = score_t
dis_score_t = discriminator(f_t)

dis_pred_concat = torch.cat((dis_score_s, dis_score_t))

# prepare real and fake labels
batch_t, _, h, w = dis_score_t.size()
batch_s, _, _, _ = dis_score_s.size()
dis_label_concat = make_variable(
    torch.cat(
        [torch.ones(batch_s, h, w).long(),
         torch.zeros(batch_t, h, w).long()]
    ), requires_grad=False)

# compute loss for discriminator
loss_dis = supervised_loss(dis_pred_concat, dis_label_concat)
(lambda_d * loss_dis).backward()
losses_dis.append(loss_dis.item())

# optimize discriminator
opt_dis.step()
After analyzing the code, I feel it is just a combination of CycleGAN and ADDA, and theoretically the transfer performance should depend heavily on how well the fake source-like images are generated. It remains an open question how it would perform on datasets with a somewhat larger domain gap than simple handwritten digits.