
contiguous_pytorch_params's People

Contributors

philjd


contiguous_pytorch_params's Issues

Made a PyPI release of your repo

Hey there @PhilJd,

Terrific work with this neat trick 👌
I've been using it for a while and it's really helpful. As I have to make a release of another project that depends on your package, a non-direct-URL dependency is required, so I made a PyPI release of your work here: https://pypi.org/project/contiguous-params/1.0.0/

I only updated the classifiers and the requirements, but the rest is identical to your current master branch!

I figured I'd let you know :)

How to use it with Apex

When I try using contiguous params with Apex O2 mode, the loss becomes NaN.
Here is my code:

import torch
from apex import amp
from contiguous_params import ContiguousParams

parameters = ContiguousParams(network.parameters())
optimizer = torch.optim.SGD(parameters.contiguous(), lr=1e-3)  # lr omitted in the original snippet; SGD requires it
network, optimizer = amp.initialize(network, optimizer, opt_level='O2')
network = torch.nn.parallel.DistributedDataParallel(network, device_ids=device_ids)
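
For reference, a small diagnostic sketch (not part of the original issue): amp.initialize with O2 casts the model's parameters to fp16, which may leave the contiguous buffer created above pointing at stale tensors. The buffer check used in benchmark.py can flag this:

# Diagnostic sketch: run after amp.initialize to see whether the contiguous
# buffer still backs the (now fp16) model parameters. If O2 replaced the
# parameter tensors, this check should flag the buffer as invalid.
parameters.assert_buffer_is_valid()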

The time reduction is not obvious

Hi, thank you for this useful work. I changed the model in benchmark.py and set the batch size to 64.

device = "cuda"
# model = nn.Sequential(*[nn.Linear(128, 128) for i in range(100)]).to(device)
model = LResNet18E().to(device)
print("Number of parameters: ", sum(p.numel() for p in model.parameters()))
x = torch.randn(64, 3, 224, 224).to(device)
y = torch.ones(64).to(device)
y = y.long()
model_copies = [deepcopy(model) for _ in range(2)]
# Benchmark original.
parameters = list(model_copies[0].parameters())
optimizer = torch.optim.SGD(parameters, lr=1e-3)
benchmark_model(model_copies[0], optimizer, parameters, "original_params")
# Benchmark contiguous.
parameters = ContiguousParams(model_copies[1].parameters())
optimizer = torch.optim.SGD(parameters.contiguous(), lr=1e-3)
benchmark_model(model_copies[1], optimizer, parameters.contiguous(),
                "contiguous_params")
# Ensure the parameter buffers are still valid.
parameters.assert_buffer_is_valid()

The printed results are disappointing; in each block below, the first two lines are original_params and the last two are contiguous_params:
Number of parameters: 11055816
Mean step time: 2.763813018798828 seconds. (Autograd profiler enabled: False)
Mean step time: 2.8434643745422363 seconds. (Autograd profiler enabled: True)
Mean step time: 2.057171106338501 seconds. (Autograd profiler enabled: False)
Mean step time: 2.271756172180176 seconds. (Autograd profiler enabled: True)

With batch size 128:
Number of parameters: 11055816
Mean step time: 4.793098592758179 seconds. (Autograd profiler enabled: False)
Mean step time: 4.904996871948242 seconds. (Autograd profiler enabled: True)
Mean step time: 4.080202102661133 seconds. (Autograd profiler enabled: False)
Mean step time: 4.198964834213257 seconds. (Autograd profiler enabled: True)

What's wrong with my code? Thanks for your answer.

How to handle the parameter groups when defining the optimizer?

Sometimes we define the optimizer using dicts of parameter groups instead of passing parameters() directly. How should this case be handled? For example:
train_params = [{'params': self.net.get_train_params(), 'lr': cfg.lr}]
self.optimizer = torch.optim.Adam(train_params, lr=cfg.lr, betas=(0.5, 0.999))

Very much appreciate it.
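
One possible approach (a sketch, not something documented in this repo): give each parameter group its own ContiguousParams instance and build the optimizer dicts from the contiguous views. self.net, get_train_params(), and cfg.lr are taken from the snippet above.

import torch
from contiguous_params import ContiguousParams

# One ContiguousParams instance per parameter group, so every group keeps
# its own contiguous buffer (and its own hyperparameters in the dict).
group = ContiguousParams(self.net.get_train_params())
train_params = [{'params': group.contiguous(), 'lr': cfg.lr}]
self.optimizer = torch.optim.Adam(train_params, lr=cfg.lr, betas=(0.5, 0.999))

# Keep each ContiguousParams instance around so the buffers can be checked
# during training, e.g. group.assert_buffer_is_valid().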

Training with DDP is slower

When I just add params.py to my code, it runs slower on a Titan XP. I found that if I define the optimizer after DDP, the step time stays almost the same, but when I follow the README and define the optimizer before DDP, it runs slower. What's wrong with my code? Thanks for your answer.
Attachment: pp_imagenet.txt
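
For reference, a minimal sketch of the two orderings described above (SGD, the lr value, and the function names are placeholders; the real training setup is in the attached script):

import torch
from contiguous_params import ContiguousParams


def optimizer_before_ddp(model, device_ids, lr=1e-3):
    """Ordering the poster reports from the README (observed to run slower)."""
    parameters = ContiguousParams(model.parameters())
    optimizer = torch.optim.SGD(parameters.contiguous(), lr=lr)
    model = torch.nn.parallel.DistributedDataParallel(model, device_ids=device_ids)
    return model, optimizer, parameters


def optimizer_after_ddp(model, device_ids, lr=1e-3):
    """Ordering the poster reports keeps step time roughly unchanged."""
    model = torch.nn.parallel.DistributedDataParallel(model, device_ids=device_ids)
    parameters = ContiguousParams(model.parameters())
    optimizer = torch.optim.SGD(parameters.contiguous(), lr=lr)
    return model, optimizer, parameters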
