Comments (6)

Masaaki-75 commented on June 14, 2024

Is there any document/description on the supported Pytorch/CUDA versions?

With python==3.8.18, I found that the conda package can be installed with pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1. With pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7, however, conda never finishes solving the environment and prints the following messages:

Collecting package metadata (current_repodata.json): / WARNING conda.models.version:get_matcher(542): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): | WARNING conda.models.version:get_matcher(542): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
WARNING conda.models.version:get_matcher(542): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
WARNING conda.models.version:get_matcher(542): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
done
Solving environment: /

from torch-radon.

carterbox commented on June 14, 2024

The conda packages are built from this repository: https://github.com/conda-forge/carterbox-torch-radon-feedstock

pytorch-cuda is not a package on the channel; Conda autodetects which CUDA version is appropriate for the host machine. Otherwise, use the cuda-version package. conda-forge is currently building with toolkits 11.2, 11.8, and 12.x

(base) bash-5.1$ conda create -n test pytorch torchvision torchaudio pytorch-cuda
Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - pytorch-cuda
  - torchaudio

As the solver will tell you, if you try to create that environment unconstrained, torchaudio is not available on the channel. You can submit a new conda recipe to build on the channel or assist with an existing PR in the conda-forge/staged-recipes repo. It seems like a few attempts have been made to build torchaudio using conda, but I'm not sure why none have been merged.

Masaaki-75 commented on June 14, 2024

Thank you for the hint! And sorry for my late reply.

I feel like I didn't make myself clear. I had trouble installing your releases of torch-radon, probably because of a version incompatibility. I was trying to install your release on my RTX 3090 machine, which has an older CUDA version (11.5) that unfortunately cannot be updated for some time, so it does not support pytorch >2.0, which requires CUDA > 11.7.

And with conda install --channel conda-forge carterbox-torch-radon, the installed carterbox-torch-radon package seems to pull in pytorch-2.1.0-cuda120py38h1932296_301 and cuda-version-12.2-he2b69de_2 as dependencies. As a result, I got messages like "The NVIDIA driver on your system is too old" (my conda environment has torch==1.11.0+cu113, torchvision==0.12.0+cu113, and torchaudio==0.11.0 installed).

So, I am wondering: is there any way to install a carterbox-torch-radon build that supports older versions of PyTorch and CUDA? 🤔

Masaaki-75 commented on June 14, 2024

Update:

I found that python setup.py install works with python==3.9 and torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0.

However, I found another problem related to gradient computation. Concretely, I wrote a test script that trains a dual-domain dummy network, but found that grad_x remains non-contiguous.

The error messages are like:

(carter) clma@my_server:~/projects/mar/RIL$ python radon_v2_example3.py
Batch: 0
>>> Forwarding sino net.
>>> Forwarding img net.
Traceback (most recent call last):
  File "/home/clma/projects/mar/RIL/radon_v2_example3.py", line 119, in <module>
    loss.backward()
  File "/home/clma/miniconda3/envs/carter/lib/python3.9/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/clma/miniconda3/envs/carter/lib/python3.9/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/clma/miniconda3/envs/carter/lib/python3.9/site-packages/torch/autograd/function.py", line 253, in apply
    return user_fn(self, *args)
  File "/home/clma/miniconda3/envs/carter/lib/python3.9/site-packages/torch_radon-0.0.0-py3.9-linux-x86_64.egg/torch_radon/differentiable_functions.py", line 46, in backward
    grad = cuda_backend.forward(grad_x, angles, ctx.tex_cache, ctx.vol_cfg, ctx.proj_cfg, exec_cfg)
RuntimeError: x must be contiguous

And the test script radon_v2_example3.py is as follows:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset
from torch_radon import FanBeam, Volume2D


class TestDataset(Dataset):
    def __init__(self):
        super().__init__()
        img = np.zeros((512, 512), dtype=np.float32)
        img[:, 255] = 1.
        img[255, :] = 1.
        
        imgs = [
            img,
            np.fliplr(img).copy(),
            np.rot90(img, k=1).copy(),
            np.rot90(img, k=2).copy(),
            np.rot90(img, k=3).copy(),
        ]
        self.imgs = [_ for _ in imgs] + [(1.0 - _.copy()) for _ in imgs]
    
    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, index):
        img = self.imgs[index]
        x = torch.from_numpy(img).unsqueeze(0).float().contiguous()
        return x


class TestNet(nn.Module):
    def __init__(self, channels=3) -> None:
        super().__init__()
        self.sino_model = nn.Sequential(
            nn.Conv2d(1, channels, 5, padding=2),
            nn.Conv2d(channels, 1, 5, padding=2)
        )
        # No .to(device) here: the whole module is moved with TestNet().to(device) below.
        self.img_model = nn.Conv2d(2, 1, 3, padding=1)
        self.det_count = 672
    
    def forward_sino(self, sino):
        print('>>> Forwarding sino net.')
        return self.sino_model(sino)
    
    def forward_img(self, img1, img2):
        print('>>> Forwarding img net.')
        img = torch.cat([img1, img2], dim=1)
        return self.img_model(img)
    
    def projection(self, img, angles=None, filter=True):
        if angles is None:
            angles = np.linspace(0, np.pi * 2, 360, endpoint=False)
        
        volume = Volume2D()
        volume.set_size(img.shape[-2], img.shape[-1])  # [B, C, H, W]
        radon = FanBeam(self.det_count, angles, volume=volume)
        sino = radon.forward(img)
        if filter:
            sino = radon.filter_sinogram(sino)
        return sino
    
    def backprojection(self, sino, img_shape, angles=None):
        if angles is None:
            angles = np.linspace(0, np.pi * 2, 360, endpoint=False)

        volume = Volume2D()
        volume.set_size(img_shape[-2], img_shape[-1])  # [H, W]
        radon = FanBeam(self.det_count, angles, volume=volume)
        img = radon.backward(sino)
        return img
    
    def forward(self, sino, img_shape, angles=None):
        img = self.backprojection(sino, img_shape, angles=angles)
        sino_pred = self.forward_sino(sino)
        img_sino = self.backprojection(sino_pred, img_shape, angles=angles)
        img_pred = self.forward_img(img, img_sino)
        return sino_pred, img_sino, img_pred
    
    # ==== The following forward (adding contiguous()) does not work ====
    # def forward(self, sino, img_shape, angles=None):
    #     img = self.backprojection(sino, img_shape, angles=angles)
    #     img = img.contiguous()
    #     sino_pred = self.forward_sino(sino)
    #     img_sino = self.backprojection(sino_pred, img_shape, angles=angles)
    #     img_sino = img_sino.contiguous()
    #     img_pred = self.forward_img(img, img_sino)
    #     return sino_pred, img_sino, img_pred


if __name__ == '__main__':
    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    device = torch.device('cuda')
    dataset = TestDataset()
    loader = DataLoader(dataset, batch_size=2, num_workers=4, pin_memory=True, shuffle=True)
    net = TestNet().to(device)
    optimizer = optim.Adam(net.parameters(), lr=1e-4)
    angles = torch.linspace(0, np.pi * 2, 360, requires_grad=False).float()
    
    for i, data in enumerate(loader):
        print(f'Batch: {i}')
        img_shape = data.shape[2:]
        with torch.no_grad():
            sino = net.projection(data.to(device).detach(), angles=angles).detach().contiguous()
            img = net.backprojection(sino, img_shape=img_shape, angles=angles).detach().contiguous()
        
        sino_pred, _, img_pred = net(sino, img_shape=img_shape, angles=angles)
        
        loss1 = F.l1_loss(sino, sino_pred)
        loss2 = F.l1_loss(img, img_pred)
        loss = loss1 + loss2
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Masaaki-75 commented on June 14, 2024

Update 2:

The problems above are solved by editing src/python/torch_radon/differentiable_functions.py. I have opened a PR that addresses this and other minor issues: https://github.com/carterbox/torch-radon/pull/11.
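
The contents of the PR are not reproduced in this thread. As a rough sketch of the kind of change that resolves an "x must be contiguous" error raised inside backward, the gradient can be copied into a dense layout before it reaches the backend. Adding .contiguous() in the model's forward likely does not help because the non-contiguous tensor is the incoming gradient, which may be a strided slice (for example, one produced by the backward pass of torch.cat). The names strict_backend and StrictOp are hypothetical stand-ins, not torch-radon code, and the example runs on the CPU:

```python
import torch


def strict_backend(x):
    # Hypothetical stand-in for a compiled kernel that rejects strided input.
    if not x.is_contiguous():
        raise RuntimeError("x must be contiguous")
    return x * 2.0


class StrictOp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return strict_backend(x.contiguous())

    @staticmethod
    def backward(ctx, grad_out):
        # The incoming gradient may be a strided view, so copy it into a
        # dense layout before handing it to the backend.
        return strict_backend(grad_out.contiguous())


# A channel slice of a larger tensor is a typical non-contiguous view.
view = torch.ones(2, 2, 4, 4)[:, :1]
print(view.is_contiguous())  # False

# Mimic the test script: the op's output is concatenated along dim=1.
a = torch.ones(2, 1, 4, 4, requires_grad=True)
b = torch.ones(2, 1, 4, 4)
weights = torch.arange(64, dtype=torch.float32).reshape(2, 2, 4, 4)
out = torch.cat([StrictOp.apply(a), b], dim=1)
(out * weights).sum().backward()
```

Calling .contiguous() on a tensor that is already contiguous returns the tensor itself, so the extra call in backward costs nothing in the common case.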

carterbox commented on June 14, 2024

The minimum driver version is the same for all CUDA 11.x versions, so if you have a driver from the CUDA 11.5 era, you should be able to run a package built with CUDA 11.8 (CUDA 11.2?).
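
The rule above can be sketched as a small check. This is a simplification under stated assumptions, not NVIDIA's actual compatibility logic: it compares only major.minor CUDA versions, and the function name driver_can_run is hypothetical.

```python
def parse(version):
    # "11.5" -> (11, 5); any patch component is ignored.
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def driver_can_run(driver_cuda, build_cuda):
    """Rough guess at whether a driver from the `driver_cuda` era can run
    a package built with the `build_cuda` toolkit."""
    driver, build = parse(driver_cuda), parse(build_cuda)
    if build <= driver:
        # A driver at least as new as the toolkit is fine.
        return True
    # A newer toolkit on an older driver: assumed to work only within the
    # same major version, starting with CUDA 11.
    return build[0] == driver[0] and driver[0] >= 11


print(driver_can_run("11.5", "11.8"))  # True
print(driver_can_run("11.5", "12.2"))  # False
```

Under this sketch, a CUDA 11.5 era driver accepts an 11.8 build but not the 12.2 build reported earlier in the thread, which matches the "driver is too old" message.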

cuda-version is a package which you can constrain. If conda cannot detect your driver version correctly, then you can set (environment variable) CONDA_OVERRIDE_CUDA=11.5 and/or (package) cuda-version=11.5. You will get pytorch=2.1 and torch-radon=2 built against CUDA 11.2 in your environment.
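
A minimal sketch of combining the two settings above, assuming the package name and channel quoted earlier in this thread. The helper conda_create_command and the environment name "radon" are hypothetical; the function only builds the command and environment, it does not run conda.

```python
import os


def conda_create_command(env_name, cuda_version):
    """Return (argv, env) for creating an environment pinned to a CUDA version."""
    argv = [
        "conda", "create", "--name", env_name,
        "--channel", "conda-forge",
        "carterbox-torch-radon",
        f"cuda-version={cuda_version}",  # package constraint
    ]
    # Environment variable that overrides conda's CUDA detection.
    env = dict(os.environ, CONDA_OVERRIDE_CUDA=cuda_version)
    return argv, env


argv, env = conda_create_command("radon", "11.5")
print(" ".join(argv))
```

To actually create the environment, both values can be passed to subprocess.run(argv, env=env, check=True).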

The pytorch version that I build against is chosen by the conda-forge channel, which is already transitioning from 2.0 to 2.1. pytorch 1.13.* is already 2 years old, so I'm not willing to publish pre-built releases for it.

Please keep this issue on topic by only discussing the pre-compiled releases or issues related to forking.
