ganslate-team / ganslate
Simple and extensible GAN image-to-image translation framework. Supports natural and medical images.
Home Page: https://ganslate.readthedocs.io
License: Other
validate_after
__init__.py files should contain no code except the imports necessary to access some modules more easily.
When running on a new machine, Weights & Biases throws a login error if results are chosen not to be visualized. This needs to be fixed; maybe upgrade the wandb version.
See if metric calculation can be made general and reduced to a single call, preferably in BaseGAN, that takes care of whether the model is cycle-consistent or not. Right now the problem is that the calculation of metrics is located in different places across GANs (see CycleGAN, RevGAN), which makes it hard to read and a pain to remember and place properly in new GAN implementations.
Maybe a storage class that can be used by both the wandb and tensorboard trackers? Ideas are welcome.
http://utw10503.utweb.utexas.edu/publications/2009/cl_spie09.pdf
3-component SSIM is shown to correspond better with human perception as measured by Mean Opinion Score (a rating based on human quality scoring), and to discriminate blurry images better than standard SSIM.
The investigation here would be to see whether changing the weights to promote texture over edges and smoothness would be more valuable.
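The weight experiment above can be sketched as follows. This is a minimal illustration of the 3-component pooling idea, not the paper's implementation: it assumes a per-pixel SSIM map and a precomputed edge/texture/smooth segmentation; the default weights are placeholders to be tuned.

```python
import numpy as np

EDGE, TEXTURE, SMOOTH = 0, 1, 2

def three_component_ssim(ssim_map, labels, weights=(0.5, 0.25, 0.25)):
    """Pool a per-pixel SSIM map separately over edge, texture, and smooth
    regions, then combine with region weights. Increasing the TEXTURE weight
    is the experiment suggested above."""
    ssim_map = np.asarray(ssim_map, dtype=float)
    labels = np.asarray(labels)
    score = 0.0
    for region, w in zip((EDGE, TEXTURE, SMOOTH), weights):
        selected = labels == region
        if selected.any():
            score += w * ssim_map[selected].mean()
    return score
```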
Replace with this: https://github.com/facebookresearch/fastMRI/blob/master/fastmri/losses.py
It is not necessary to divide the D loss by 2, since this repo, unlike the original CycleGAN repo, has separate LR parameters for G and D. Reference: junyanz/pytorch-CycleGAN-and-pix2pix#720
Should it be done?
Check for conf[conf.mode].batch_size * communication.get_world_size() > len(dataloader.dataset)
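A minimal sketch of that check; the caller would pass the actual conf[conf.mode].batch_size, communication.get_world_size(), and len(dataloader.dataset) values, so the parameter names here are illustrative:

```python
import warnings

def global_batch_exceeds_dataset(batch_size, world_size, dataset_len):
    """Warn when the effective (global) batch size across all DDP processes
    is larger than the dataset, since batches would repeat or be empty."""
    exceeds = batch_size * world_size > dataset_len
    if exceeds:
        warnings.warn(
            f"Effective batch size {batch_size * world_size} exceeds the "
            f"dataset length {dataset_len}."
        )
    return exceeds
```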
The issue discussed was what can be done to account for the effect of negative covariance between image pixels.
Further reading led to an implementation of SSIM as a valid distance measure, which requires modifying it beyond a simple (1 - SSIM index) call.
The paper can be found here:
https://ece.uwaterloo.ca/~z70wang/publications/TIP_SSIM_MathProperties.pdf
Will read through, document, and implement soon.
Proposed solution:
The loss functions must be modified with reduction set to 'none' so that a per-voxel loss is obtained; the mask can then be applied before the final averaging.
This loss masking needs to go into L1, SSIM, and all the GAN losses.
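The pattern, sketched here with NumPy for brevity (the actual losses are torch-based), is the same for all of them: compute the unreduced loss, zero it where the mask is 0, and average only over valid voxels.

```python
import numpy as np

def masked_l1(pred, target, mask):
    """L1 with reduction='none', then masking: the per-voxel loss is zeroed
    where mask == 0 and averaged only over valid voxels, so masked-out
    regions (e.g. artifact slices) do not dilute the loss."""
    per_voxel = np.abs(np.asarray(pred, float) - np.asarray(target, float))
    mask = np.asarray(mask, float)
    return float((per_voxel * mask).sum() / max(mask.sum(), 1.0))
```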
This is important to handle the case when terminal slices of a CBCT scan have artifacts or incomplete anatomical representation. This can potentially corrupt training and should ideally be removed in the dataloader pre-registration.
Possible methods:
enforce that each dataset class is a child of a base dataset class
Implement an inference script, along with the necessary config, that is dataloader- and model-agnostic.
Checkpoint to be saved when training is stopped. Implement as Jonas did.
Evaluators have wandb support, the inferer doesn't; neither has tensorboard. Make sure they work in DDP.
When logging very often, the image logging can make an iteration take longer than the computation time and data-loading time would suggest.
Add a warning about this at training start when log_freq is low.
Double-check and verify that metrics are accurately computed before running the final experiments, along with @ibro45.
Reference Paper: https://openaccess.thecvf.com/content_ICCV_2019/papers/Shocher_InGAN_Capturing_and_Retargeting_the_DNA_of_a_Natural_Image_ICCV_2019_paper.pdf
Could be implemented in a different way, but the general idea is to have discrimination at different scales.
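The general idea can be sketched independently of any concrete architecture: one discriminator per scale, with the input downsampled between scales. Everything here is a placeholder, not the InGAN implementation.

```python
def multiscale_discriminate(discriminators, image, downsample):
    """Run one discriminator per scale on progressively downsampled inputs,
    collecting one output per scale."""
    outputs = []
    for d in discriminators:
        outputs.append(d(image))
        image = downsample(image)
    return outputs
```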
Hi friends, I wrote some code:
1. body_mask.py: if apply_bound is False, it outputs the image at its original size.
2. cut_lung_from_background.py: it can reliably cut the lung from the background.
But I cannot push to the origin/zhixiang branch. Maybe it's a permissions issue?
The code is here for now:
https://github.com/ZhixiangWang-CN/Radiomics/blob/master/ImageSegmentation/body_mask.py
https://github.com/ZhixiangWang-CN/Radiomics/blob/master/ImageSegmentation/cut_lung_from_background.py
@surajpaib once we're almost ready for running final experiments
Case: due to the masking of the patient body while computing metrics, many values are zero, skewing the true metric values. Decide whether, when taking the mean of the metric, the mask should be applied over the elements.
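The difference between the two options can be shown with a small sketch (names are illustrative): averaging over all elements counts the masked-out zeros, while a masked mean restricts the average to voxels inside the body.

```python
import numpy as np

def metric_mean(values, mask, apply_mask=True):
    """Mean of a per-voxel metric. With apply_mask=True the mean is taken
    only over voxels inside the body mask, so the zeros produced by masking
    do not skew the result."""
    values = np.asarray(values, float)
    if apply_mask:
        return float(values[np.asarray(mask, bool)].mean())
    return float(values.mean())
```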
When using multi-channel input, PatchGAN produces multi-channel output. Shouldn't the output be just a single channel?
In our projects we have completely separate datasets for this. The question here is whether paired and unpaired ImageDatasets should have their val/test equivalents, or whether a single dataset class should handle the different modes using conf.mode.
With the current implementation, if the batch size is 2 and there are 5 data points, the final averaging is done as:
(mean of first batch metrics over 2 data points + mean of second batch metrics over 2 data points + third batch metric with only 1 data point) / 3
For batch size 1, the implementation works as:
(first data point metric + ... + 5th data point metric) / 5
These values are not equivalent and need to be fixed so that the same result is produced regardless of this implementation detail.
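One way to make the two paths agree, sketched with illustrative names, is to weight each batch mean by its number of samples; the result then equals the plain per-sample mean for any batch size.

```python
def dataset_mean(batch_means, batch_sizes):
    """Size-weighted average of per-batch metric means, equivalent to
    averaging over all individual samples."""
    weighted = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return weighted / sum(batch_sizes)
```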
Separate the current z-score function so that :
Currently, when loading a checkpoint, the optimizers are loaded as well. Make it optional, and explain why it might be useful to have that option.
Usecases:
The Inferer can be called as Validator during training
A separate inference config might not be necessary, as most of the parameters are defined in train_config.yaml. Parameters needed for inference should be made mandatory as CLI arguments or fed through a config, but a mandatory separate config should not be required. A design decision on which parameters must be provided needs to be made.
if cli.config:
    inference_conf = OmegaConf.load(cli.pop("config"))
    inference_conf = OmegaConf.merge(inference_conf, cli)
else:
    inference_conf = cli

# Fetch the config that was used during training of this specific run
train_conf = Path(inference_conf.logging.checkpoint_dir) / "training_config.yaml"
Maybe conf here should point to train_conf itself, so that separate inference_conf and train_conf are not needed.
If the defined 3D patch size is big enough in x and y dimension to encompass the whole body, center cropping it in x and y would most likely be ideal, while z position could be selected as usual.
Add an option to StochasticFocalPatchSampler for center cropping in x and y.
Does this need to inherit from dataset in the TrainConfig?
There could be cases where you want to run evaluation with different dataset parameters: for example, with bound enabled to avoid evaluating areas outside the patient body, even though training still includes them. For now these are two separate configs, and care should be taken to define them as needed.
Is it necessary to enforce that the dataset uses image_size (instead of load_size) or patch_size to indicate what's the size of the inputs? What would benefit from it?
project_dir: "./projects/nki_cervix_cbct_to_ct"
use_cuda: True
n_iters: 200000
n_iters_decay: 0
batch_size: 1

logging:
  checkpoint_dir: ./checkpoints/cbct_ex3/
  wandb:
    project: "NKI_CBCT_CT"
  log_freq: 50
  checkpoint_freq: 10000

dataset:
  name: "CBCTtoCTDataset"
  root: /home/rt/workspace_suraj/cervix_resampled
  hounsfield_units_range: [-1024, 2048]
  num_workers: 4
  patch_size: [32, 32, 32]

gan:
  name: "PiCycleGAN"
  generator:
    name: Vnet3D
    in_channels: 1
    use_memory_saving: False
    use_inverse: True
    is_separable: False
    down_blocks: [2, 2, 3]
    up_blocks: [3, 3, 3]
  discriminator:
    name: "PatchGAN3D"
    n_layers: 2
    in_channels: 1
  optimizer:
    lambda_A: 25.0
    lambda_B: 25.0
    lambda_identity: 0.0
    lambda_inverse: 0.0
    proportion_ssim: 0.84
    lr_D: 0.0001
    lr_G: 0.0002
Reference Post can be found here: https://discourse.itk.org/t/ct-patient-bed-and-clothing-removal-from-dicom-files/3278/2
Add timestamp to it so that it doesn't overwrite. Option to not do it? Make sure it works in distributed training.
@surajpaib the ValTest metric calculation isn't compatible with batch sizes bigger than 1. The problem is that the batch dimension is squeezed, which works fine for BS=1 but not otherwise. The scikit metrics don't operate on batches, so it will be necessary to loop over the batch and average the metrics.
https://github.com/Maastro-CDS-Imaging-Group/midaGAN/blob/06d6c419fca81327008d2a2cdbf92f126cff1c19/midaGAN/utils/metrics/val_test_metrics.py#L14
https://github.com/Maastro-CDS-Imaging-Group/midaGAN/blob/06d6c419fca81327008d2a2cdbf92f126cff1c19/midaGAN/utils/metrics/val_test_metrics.py#L64
Also, is this slice number alright for regular images? @surajpaib
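The loop-and-average fix can be sketched generically; `metric_fn` stands in for any per-image metric (e.g. one of the scikit functions used in val_test_metrics.py), and the batches are assumed to have the batch dimension first.

```python
import numpy as np

def batched_metric(metric_fn, fake_batch, real_batch):
    """Apply a single-image metric to each item of a batch and average,
    instead of squeezing the batch dimension away."""
    scores = [metric_fn(f, r) for f, r in zip(fake_batch, real_batch)]
    return float(np.mean(scores))
```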
@JulianPosch and I are installing the repo, and it would be nice to have an automated CI build using GitHub Actions that runs a small snippet of the code and shows whether the tests pass.
I've done something similar for the seg pipeline, but it is broken now, so maybe we can sit together and fix/add it @ibro45
E.g.

val:
  freq: 1000
  multi_dataset:
    lungs:
      name: "CBCTtoCTValTestDataset"
      root: "/workspace/train_val/val"
    phantoms:
      name: "CBCTtoCTValTestDataset"
      root: "/workspace/train_val/val_phantom"

test:
  dataset: "${val.multi_dataset}"
  sliding_window: "${val.sliding_window}"
build_loader selects a single dataset from multi_dataset, assigns it to dataset, and sets multi_dataset to None. This breaks interpolation when it refers to multi_dataset.
Error:
omegaconf.errors.ConfigKeyError: str interpolation key 'val.multi_dataset' not found
full_key: test.dataset
reference_type=Optional[Dict[Union[str, Enum], Any]]
object_type=dict
Chinmay Rao 8:14 PM
I have some suggestions. Let me know what you think. First, regarding the naming conventions of the networks. I found them a bit confusing.
From the current code, G_A is the generator for A->B, for example. Maybe, we can change it to G_AB.
Also currently, D_A seems to be the discriminator that takes real_B and fake_B as inputs (i.e. domain B). Maybe, we can instead call it D_B, since it works in domain B. This would be similar as in the paper - domain Y and D_Y
If we can change the names, then we can have naming conventions for loggables (metrics, losses, predictions) based on them
Ibrahim Hadzic 8:16 PM
i agree. This D_A thing must have been a mistake in naming i guess, which just propagated to everything
G_AB makes perfect sense
also, do you think these names as such are good, or would you prefer generator_ab instead of G_AB?
i think that the latter, while not being great regarding the python conventions, is still more easily readable in this case
Chinmay Rao 8:17 PM
I agree. IMO, G_AB is good

Chinmay Rao 8:21 PM
For the loggable stuff: maybe have a naming format like (ignore the spaces) ==> mode - type - specific_component
For example, I'd suggest the following changes:
1. Loggable losses:
loss_G_A --> train-loss-adv_G_AB
loss_D_A --> train-loss-adv_D_B
loss_cycle_A --> train-loss-cycle_ABA
2. Loggable predictions:
D_A_fake --> train-pred-D_A_fake
3. Loggable metrics:
mse --> val-mse-A (i.e. MSE between fake A and reference A)
Ibrahim Hadzic 8:28 PM
i would actually leave out "train" from train-loss-adv_G_AB but otherwise yes
Ibrahim Hadzic 8:30 PM
for point 3, i agree with "val" but I'm wondering about "A". We're calculating only A->B, and a lot of stuff in the framework is done only for A->B (testing, let's say), because B->A is only going to be the case for cycle-consistency models
Ibrahim Hadzic 8:31 PM
so you have a problem with generalizability
but if you really need to do B->A, you could manually take care of it, let's say, but that's not elegant
not sure why only for 3D training