chenyuntc / pytorch-book
PyTorch tutorials and fun projects including neural talk, neural style, poem writing, anime generation (《深度学习框架PyTorch:入门与实战》)
License: MIT License
imgs is the directory where images generated during training are saved.
Fix: create a new empty folder named imgs.
python main.py train --plot-every=150 --batch-size=128 --pickle-path='tang.npz' --lr=1e-3 --env='poetry3' --epoch=50
Traceback (most recent call last):
File "main.py", line 225, in
fire.Fire()
File "/home/lyerox/.virtualenvs/RNN-fXB3zvlI/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/lyerox/.virtualenvs/RNN-fXB3zvlI/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/lyerox/.virtualenvs/RNN-fXB3zvlI/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "main.py", line 131, in train
data = t.from_numpy(data)
RuntimeError: the given numpy array has zero-sized dimensions. Zero-sized dimensions are not supported in PyTorch
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
Error while setting up the environment:
pip3 install -r requirements.txt
Collecting git+https://github.com/pytorch/tnt.git@master (from -r requiments.txt (line 4))
Cloning https://github.com/pytorch/tnt.git (to master) to /private/var/folders/pt/7_xdqqdj6l1_xrmp5022g3_r0000gn/T/pip-kt8yy0lq-build
Requirement already satisfied (use --upgrade to upgrade): torchnet==0.0.1 from git+https://github.com/pytorch/tnt.git@master in /usr/local/lib/python3.6/site-packages (from -r requiments.txt (line 4))
Requirement already satisfied: visdom in /usr/local/lib/python3.6/site-packages (from -r requiments.txt (line 1))
Requirement already satisfied: fire in /usr/local/lib/python3.6/site-packages (from -r requiments.txt (line 2))
Collecting torchvison (from -r requiments.txt (line 3))
error:
Could not find a version that satisfies the requirement torchvison (from -r requiments.txt (line 3)) (from versions: )
No matching distribution found for torchvison (from -r requiments.txt (line 3))
This file name cannot be created on Windows because it contains an ASCII colon ":":
chapter2-快速入门/chapter2: PyTorch快速入门.ipynb
As the title says: why is an empty script file provided?
Please add the download command to it,
or remove the file altogether.
Chapter 2, In [31]:
output = net(input)
target = Variable(t.arange(0,10))
criterion = nn.MSELoss()
loss = criterion(output, target)
loss
loss = criterion(output, target) raises:
RuntimeError: input and target shapes do not match: input [1 x 10], target [10]
After changing it to loss = criterion(output[0], target), the loss comes out at a little over 28. Environment: PyCharm, Python 3.6, PyTorch 0.4.0.
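The error comes from the output having shape [1, 10] while the target has shape [10]. The following is a minimal sketch of one way to align them, assuming PyTorch 0.4 or later, where tensors no longer need a Variable wrapper. The random output is only a stand-in for net(input).

```python
import torch
import torch.nn as nn

# Stand-in for net(input): a batch of one sample with 10 outputs.
output = torch.randn(1, 10)

# Give the target the same [1, 10] shape and a floating-point dtype,
# since MSELoss compares the two tensors element by element.
target = torch.arange(0, 10, dtype=torch.float32).view(1, 10)

criterion = nn.MSELoss()
loss = criterion(output, target)  # 0-dim (scalar) tensor
print(loss.item())
```

Indexing output[0], as in the report, makes the shapes agree as well. A loss a little over 28 is plausible for an untrained network: the mean of the squared targets 0..9 is 28.5, which is what the loss would be if the outputs were all zero.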
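I came across this project a while ago and like it a lot. However, registering on AI Challenger does not let me download the dataset, and images collected with a crawler have no captions, so they are hard to train on. Any suggestions would be much appreciated!
My code below runs in Jupyter Notebook but fails when run from PyCharm: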
dataiter = iter(trainloader)
images, labels = dataiter.next() # returns 4 images and their labels
print(' '.join('%11s' % classes[labels[j]] for j in range(4)))
show(tv.utils.make_grid((images + 1) / 2)).resize((100, 100))
I would appreciate any help. Thanks!
Some files (e.g. .ipynb_checkpoints) are generated after running the notebooks, so please consider adding a .gitignore to the repo. Thanks!
From the code it is not clear what torchnet's meter is for. Can it be removed from training?
I tried to train chapter 9 with gpu=True and seem to have hit a CUDA problem:
File "main.py", line 188, in train
model.cuda()
return self._apply(lambda t: t.cuda(device))
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py", line 123, in _apply
self.flatten_parameters()
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py", line 85, in flatten_parameters
handle = cudnn.get_handle()
File "/usr/local/lib/python2.7/dist-packages/torch/backends/cudnn/init.py", line 296, in get_handle
handle = CuDNNHandle()
File "/usr/local/lib/python2.7/dist-packages/torch/backends/cudnn/init.py", line 110, in init
check_error(lib.cudnnCreate(ctypes.byref(ptr)))
CuDNNError: 1: CUDNN_STATUS_NOT_INITIALIZED
Hi,
I tried to run the chapter 9 main.py code and got the error below:
File "main.py", line 212, in gen
start_words = opt.start_words.encode('ascii', 'surrogateescape').decode('utf8')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
My environment is Python 3.5 on macOS, with PyTorch 0.3.0.post4.
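The encode('ascii', 'surrogateescape') round trip only works when the string holds surrogate-escaped bytes, which is what Python produces when the command-line bytes could not be decoded. When the string already contains real non-ASCII characters, the ASCII encode step fails as shown above. A minimal sketch of a tolerant variant follows; normalize_start_words is a hypothetical helper name, not a function from the repository.

```python
def normalize_start_words(text):
    """Return text as proper Unicode (hypothetical helper, for illustration)."""
    try:
        # Undo surrogate escapes left by decoding UTF-8 bytes as ASCII.
        return text.encode('ascii', 'surrogateescape').decode('utf8')
    except UnicodeEncodeError:
        # The string already holds real non-ASCII characters; use it as is.
        return text


# Already-decoded input passes through unchanged.
start_words = normalize_start_words('春江花月夜')
```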
While training the cats-vs-dogs model in Chapter 6:
File "main.py", line 123, in train
if loss_meter.value()[0] > previous_loss:
RuntimeError: value cannot be converted to type float without overflow: 10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104.000000
PyTorch 0.4 removed volatile, but the code uses volatile in many places. Could the code be upgraded to PyTorch 0.4?
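For reference, the replacement for volatile in PyTorch 0.4 and later is the torch.no_grad() context manager. The sketch below uses a small nn.Linear as a stand-in for a trained model; it is not the repository's code.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2).eval()   # stand-in for a trained model
inputs = torch.randn(3, 4)

# PyTorch 0.3 style: output = model(Variable(inputs, volatile=True))
# PyTorch 0.4+ style: disable graph construction with a context manager.
with torch.no_grad():
    output = model(inputs)

print(output.requires_grad)  # False
```

Because no graph is recorded inside the block, intermediate activations are not kept for a backward pass, which also keeps memory use down during inference.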
File "e:\simulator\testRNN\data.py", line 48, in handleJson
data = json.loads(open(file, 'r').read().encode('utf-8').decode('utf-8', 'ignore'))
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaa in position 23: illegal multibyte sequence
Email: [email protected]
I bought your book.
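The 'gbk' in the message suggests that open(file, 'r') fell back to the platform's default encoding, which is GBK on a Chinese-language Windows system, so the failure happens while reading, before .encode/.decode ever run. A minimal sketch of reading the file as UTF-8 explicitly follows, assuming the JSON files are UTF-8 encoded; load_json_utf8 is a hypothetical helper name.

```python
import json


def load_json_utf8(path):
    """Read a JSON file as UTF-8 regardless of the platform default encoding."""
    # Without encoding=..., open() uses the locale encoding, which can
    # fail on bytes that are valid UTF-8 but not valid in that encoding.
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)
```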
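In Chapter 3 of the printed book, in the part that implements linear regression with Variable, there is a typo in a comment:
# backward: compute gradients manually
From the context, it should read
# backward: compute gradients automatically
Also, thanks to the author: this repository and the accompanying book have helped me a lot.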
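About the last sentence of the paragraph that begins "They all have a .cuda method":
is it meant to say that the two approaches have the same effect?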
Traceback (most recent call last):
File "D:/softWare/Python/pythonrun/Machine-Learning/dataClass/test.py", line 9, in
data = t.load(opt.caption_data_path)
File "D:\softWare\Python\pytorch\Anaconda3-5\anaconda\lib\site-packages\torch\serialization.py", line 227, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'caption.pth'
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
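Hello Mr. Chen! I have learned a lot from your book. While reading the section on Variable, I wanted to try making two fully connected layers share weights, with one layer's weight being the transpose of the other's. This is what I did: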
import torch
import torch.nn as nn
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear1 = nn.Linear(5, 10)
        self.linear2 = nn.Linear(10, 5)
        self.linear2.weight = self.linear1.weight.t()

    def forward(self, x):
        x = self.linear1(x)
        x = self.linear2(x)
        return x
net = Net()
But the program fails at run time with:
TypeError: cannot assign 'torch.autograd.variable.Variable' as parameter 'weight' (torch.nn.Parameter or None expected)
What am I doing wrong?
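The TypeError arises because nn.Module only accepts an nn.Parameter (or None) for an attribute name that is already registered as a parameter, and weight.t() returns an ordinary tensor (a Variable in 0.3). One possible way to tie the weights, sketched below, is to keep a single weight and apply its transpose in forward with F.linear. The class name TiedNet and the separate bias2 parameter are assumptions made for this illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TiedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(5, 10)             # weight shape: (10, 5)
        self.bias2 = nn.Parameter(torch.zeros(5))   # bias of the tied second layer

    def forward(self, x):
        x = self.linear1(x)
        # F.linear expects a weight of shape (out_features, in_features),
        # here (5, 10), which is exactly the transpose of linear1.weight.
        return F.linear(x, self.linear1.weight.t(), self.bias2)


net = TiedNet()
y = net(torch.randn(3, 5))   # shape (3, 5)
```

Since the transpose is taken inside forward, gradients from both uses flow into the same linear1.weight parameter.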
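Can you translate this book into English?
I am reading your pytorch-book tutorial. While working through Chapter 3 (Autograd), I ran into a problem that I could not find on Google or in the GitHub issues, so I would like to ask you about it. In cell 25, running y.grad_fn.saved_variables produces the following error: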
-----------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-25-3ba15276330c> in <module>()
----> 1 y.grad_fn.saved_variables
AttributeError: 'MulBackward1' object has no attribute 'saved_variables'
instead of showing the expected result:
(Variable containing:
0.3356
[torch.FloatTensor of size 1], Variable containing:
1
[torch.FloatTensor of size 1])
The same happens in the cells that follow; in other words, I cannot get the value of saved_variables.
torch==0.3.0.post4
Fails under both Python 2.7.10(14) and 3.5.2
Tried on both Ubuntu and Mac OS X 10.12.6, without success
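Hello, I ran into some problems when using the DataLoader class as described in the book:
dataiter = iter(dataloader)
batch_datas, batch_labels = next(dataiter)
The resulting iterator can traverse the data only once; after that it raises StopIteration
and stops. It would be good to note this in the documentation, and to provide a way to iterate over the data repeatedly.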
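Raising StopIteration after one pass is standard Python iterator behaviour, and calling iter(dataloader) again starts a new pass. A minimal sketch of an endless stream of batches built on that follows; cycle is a hypothetical helper name, and a plain list stands in for the DataLoader so that the example is self-contained.

```python
def cycle(loader):
    """Yield batches forever, restarting the loader after each full pass."""
    while True:
        # A fresh iterator is created on every pass, so a DataLoader with
        # shuffle=True would reshuffle each time.
        for batch in loader:
            yield batch


# A plain list stands in for a DataLoader here; any re-iterable object works.
batches = [("data0", "label0"), ("data1", "label1")]
stream = cycle(batches)
first_five = [next(stream) for _ in range(5)]
```

For ordinary epoch-based training, an outer loop over epochs with an inner for loop over the dataloader achieves the same thing without a helper. Note that this sketch never terminates on its own, and would spin forever on an empty loader.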
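When running
dataiter = iter(dataloader)
the program executes dataset.__getitem__ over the entire dataset. For example, if the dataset holds 100 items, __getitem__ is called on each of the 100 items in turn. What is the reason for this? If my dataset has 100,000 items, does that mean __getitem__ runs on all 100,000 of them?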
Following the suggestion in #23, I switched to ResNet and got an out-of-memory error. My GPU has 6 GB of memory.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "/home/ly/src/pytorch-examples/chenyun-pytorch/ch6/main.py", line 182, in
fire.Fire()
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "/home/ly/src/pytorch-examples/chenyun-pytorch/ch6/main.py", line 85, in train
score = model(input)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/ly/src/pytorch-examples/chenyun-pytorch/ch6/models/ResNet34.py", line 71, in forward
x = self.layer2(x)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/ly/src/pytorch-examples/chenyun-pytorch/ch6/models/ResNet34.py", line 24, in forward
return F.relu(out)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/torch/nn/functional.py", line 583, in relu
return threshold(input, 0, 0, inplace)
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58
Traceback (most recent call last):
File "feature_extract.py", line 76, in
features = resnet50(imgs)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/models/resnet.py", line 144, in forward
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/models/resnet.py", line 88, in forward
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 49, in forward
self.training or not self.track_running_stats, self.momentum, self.eps)
File "/home/dand/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1194, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu:58
python main.py test \
    --test-data-root=data/test1 \
    --load-model-path='checkpoints/resnet34.pth' \
    --use-gpu=False \
    --batch-size=30 \
    --num-workers=12
Running this dropped me into the ipdb prompt, which left me rather confused.
user config:
env default
model ResNet34
...
result_file result.csv
max_epoch 10
lr 0.1
lr_decay 0.95
weight_decay 0.0001
parse <bound method parse of <config.DefaultConfig object at 0x104083c50>>
> /opt/git/pytorch-book/chapter6-实战指南/main.py(20)test()
19 # configure model
---> 20 model = getattr(models, opt.model)().eval()
21 if opt.load_model_path:
ipdb>
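Hello, our team recently translated the official PyTorch documentation, and we think your tutorial project is very good. To help more people learn PyTorch, may I create a tutorial based on your project? (The source will be credited in the project.)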
Traceback (most recent call last):
File "main.py", line 231, in
fire.Fire()
File "/Users/zhangxindong/Desktop/experiment/pytorch-book/pytorch/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/Users/zhangxindong/Desktop/experiment/pytorch-book/pytorch/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/Users/zhangxindong/Desktop/experiment/pytorch-book/pytorch/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "main.py", line 224, in gen
result = gen_poetry(model, start_words, ix2word, word2ix, prefix_words)
File "main.py", line 105, in gen_acrostic
w = ix2word[top_index]
KeyError: tensor(6522)
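A likely cause, assuming PyTorch 0.4 or later: indexing a tensor now returns a 0-dim tensor rather than a Python number, and a tensor used as a dictionary key does not match the integer keys of ix2word. A minimal sketch of the conversion with .item() follows; the toy vocabulary and scores are made up for illustration.

```python
import torch

ix2word = {0: '<START>', 1: '春', 2: '江'}   # toy vocabulary, for illustration only
scores = torch.tensor([0.1, 0.2, 0.7])

top_index = scores.topk(1)[1][0]      # 0-dim tensor holding the index 2
# ix2word[top_index] would raise KeyError: tensor(2)
word = ix2word[top_index.item()]      # .item() returns the plain Python int 2
```

The same conversion applies wherever a value taken from a tensor is used as a key, including values read from a tensor that lives on the GPU.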
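This may just be a problem on my side. With the FC layer removed, my resnet50 outputs a tensor of size (1, 8192) instead of (1, 2048), and I cannot figure out why. The printed values are not zero; they hold real data. I tried a different approach with much the same result: the size became (1, 2048, 2, 2). Indexing with (:,:,0,0) gives the same output as the reference answer, while (:,:,0,1), (:,:,1,1) and (:,:,1,0) all deviate somewhat. The modified code is: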
resnet50 = tv.models.resnet50(pretrained = True).eval()
# del resnet50.fc
modules = list(resnet50.children())[:-1]
resnet50 = t.nn.Sequential(*modules)
# resnet50.fc = lambda x:x
if opt.use_gpu:
    resnet50.cuda()
    img = img.cuda()
img_feats = resnet50(Variable(img,volatile=True))
img_feats = img_feats[:,:,0,0]
print(img_feats.data.squeeze(0).size())
print(img_feats)
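One possible explanation, which the report does not confirm: the ResNet in torchvision 0.2.x ends with a fixed nn.AvgPool2d(7, stride=1), which reduces the feature map to 1x1 only for 224x224 inputs. The sketch below works through the arithmetic under the assumption that the images were 256x256; the layer hyperparameters are written from memory of the standard ResNet-50 layout, and both helper names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet_pooled_size(input_size):
    """Side length of the feature map after a fixed 7x7 average pool."""
    s = conv_out(input_size, kernel=7, stride=2, padding=3)  # conv1
    s = conv_out(s, kernel=3, stride=2, padding=1)           # max pool
    for _ in range(3):                                       # layer2..layer4 each halve it
        s = conv_out(s, kernel=3, stride=2, padding=1)
    return conv_out(s, kernel=7, stride=1, padding=0)        # AvgPool2d(7, stride=1)


for size in (224, 256):
    side = resnet_pooled_size(size)
    print(size, side, 2048 * side * side)
# 224 1 2048
# 256 2 8192
```

Under that assumption, a 256x256 input leaves a 2x2 map, that is 2048 * 2 * 2 = 8192 values, which matches the sizes reported. Resizing or cropping to 224x224, or averaging over the last two dimensions, would bring the feature back to 2048 values.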
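This is probably a typo.
For example: five-character quatrains (五绝), seven-character quatrains (七绝), five-character regulated verse (五律) and seven-character regulated verse (七律).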
Hello, I installed visdom with pip install visdom and pip install --upgrade visdom.
When I run python -m visdom.server and open http://localhost:8097 in the browser, the page is blank and blue.
Do you know how to fix this? Thanks.
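First, the folder contains two duplicate files; I suggest deleting one of them.
It would be better if the usage instructions in the README matched, or at least resembled, section 6.1.8 (Usage) of the practical guide, since the usage described there is more reliable. In practice the --load-model-path=None parameter cannot be omitted. It would be even better if a trained .pth file were provided; training the model without a GPU is really painful.
Please consider using tqdm in the train, test and val functions of main.py so that progress is visible; otherwise it feels as if the program has hung.
Also, parts of the visdom section in Chapter 5 differ from the expected results, and in Chapter 6 neither the loss plot nor val_accuracy produces a curve. I am not sure whether this is due to version differences.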
I am sorry to open this issue. While running the feature_extract.py file from Chapter 10, I ran out of memory. My RAM usage grew from 10 GB to 25 GB, and then an error was raised: RuntimeError: Couldn't open shared file mapping: <torch_17120_2057656553>, error code: <1455> at C:\Anaconda2\conda-bld\pytorch_1519496000060\work\torch\lib\TH\THAllocator.c:157
I have tried calling gc.collect and deleting "imgs" and "features", but memory use keeps increasing.
I am using the PyTorch 0.3.1 Windows build from @peterjc123.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "F:\Anaconda\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "F:\Anaconda\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "F:\Anaconda\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "F:\Anaconda\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "F:\Anaconda\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "F:\Anaconda\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "F:\Anaconda\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "G:\图像描述(Image Caption)\feature_extract.py", line 68, in <module>
for ii,(imgs, indexs) in tqdm.tqdm(enumerate(dataloader)):
File "F:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 417, in __iter__
return DataLoaderIter(self)
File "F:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 234, in __init__
w.start()
File "F:\Anaconda\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "F:\Anaconda\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "F:\Anaconda\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "F:\Anaconda\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "F:\Anaconda\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "F:\Anaconda\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Traceback (most recent call last):
File "feature_extract.py", line 68, in <module>
for ii,(imgs, indexs) in tqdm.tqdm(enumerate(dataloader)):
File "F:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 417, in __iter__
return DataLoaderIter(self)
File "F:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 234, in __init__
w.start()
File "F:\Anaconda\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "F:\Anaconda\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "F:\Anaconda\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "F:\Anaconda\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "F:\Anaconda\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
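My computer has no Nvidia GPU, so running python feature_extract.py initially failed with: Torch not compiled with CUDA enabled. I then installed a CUDA build of PyTorch, which failed to import, so I removed every PyTorch installation and installed the latest one (although the version still seems to be 0.3.1). Running python feature_extract.py again produced the errors above. As for the if __name__ == '__main__': idiom mentioned in the error, I added that check to feature_extract.py, but the same error persists. Searching for BrokenPipeError: [Errno 32] Broken pipe mostly turns up socket errors, which did not help much, so I am asking here. I am new to deep learning and would appreciate any help.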
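The traceback shows the DataLoader loop running at module level (feature_extract.py, line 68, in <module>). On Windows, each worker process re-imports the main script, so the guard only helps if the code that creates and iterates the DataLoader sits inside it. The following is a minimal sketch of that layout, with a toy TensorDataset standing in for the real image dataset; the function name extract is an assumption for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def extract(num_workers=0):
    """Iterate over a toy dataset; stands in for the feature-extraction loop."""
    dataset = TensorDataset(torch.arange(8, dtype=torch.float32).view(8, 1))
    loader = DataLoader(dataset, batch_size=4, num_workers=num_workers)
    return [batch[0].sum().item() for batch in loader]


if __name__ == '__main__':
    # On Windows, worker processes re-import this module, so the code that
    # creates and iterates the DataLoader must only run under this guard.
    # num_workers=0 loads data in the main process and spawns no workers.
    print(extract(num_workers=0))
```

Setting num_workers=0 is also a quick way to check whether the failure is tied to worker processes at all, at the cost of slower data loading.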
Running python main.py train fails with: NameError: name 'opt' is not defined
Running feature_extract.py directly produces the following error:
results[ii * batch_size:(ii + 1) * batch_size] = features.data.cpu()
RuntimeError: The expanded size of the tensor (2048) must match the existing size (8192) at non-singleton dimension 1
terminate called without an active exception
Page 213 shows the results after epochs 1, 10, 20, 30 and so on, but I do not see any epoch indication in visdom. How do you know which epoch a result shown in visdom corresponds to?
pytorch version: 0.4.0
While debugging I saw that the _word variable is a tensor with no dimensions, and I cannot get at the value inside it (_word.data is still the same tensor).
python main.py train --data_path=xxx
Traceback (most recent call last):
File "main.py", line 232, in
fire.Fire()
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/ly/anaconda3/envs/learning/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "main.py", line 180, in train
for iii in range(data.size(1))][:16]
File "main.py", line 180, in
for iii in range(data.size(1))][:16]
File "main.py", line 179, in
poetrys = [[ix2word[_word] for word in data[:, _iii]]
KeyError: tensor(8542, device='cuda:0')