
ml5js / training-styletransfer


Style Transfer training and using the model in ml5js

License: Other

Python 96.25% Shell 1.47% Dockerfile 2.27%

Contributors

b2renger, cvalenzuela, shiffman

training-styletransfer's Issues

UnicodeDecodeError

Getting the following message while training:

  File "train.py", line 179, in <module>
    main()
  File "train.py", line 76, in main
    train(args)
  File "train.py", line 91, in train
    data_loader = TextLoader(args.data_dir, args.batch_size, args.seq_length)
  File "/spell/training-lstm/utils.py", line 21, in __init__
    self.preprocess(input_file, vocab_file, tensor_file)
  File "/spell/training-lstm/utils.py", line 30, in preprocess
    data = f.read()
  File "/usr/lib/python3.5/codecs.py", line 698, in read
    return self.reader.read(size)
  File "/usr/lib/python3.5/codecs.py", line 501, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 7047: invalid start byte
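
Byte 0x92 is not a valid UTF-8 start byte, but in the Windows-1252 encoding it is the right single quotation mark, so the input text may have been saved in that encoding. The following is a minimal sketch of reading such a file with a fallback, assuming the data is either UTF-8 or Windows-1252. The helper names `decode_with_fallback` and `read_text` are hypothetical and are not part of the repository.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Decode bytes by trying each encoding in order."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the text, replacing undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


def read_text(path):
    """Read a whole file as text without assuming it is valid UTF-8."""
    with open(path, "rb") as f:
        return decode_with_fallback(f.read())


# 0x92 fails as UTF-8 but decodes as a curly apostrophe in Windows-1252.
print(decode_with_fallback(b"it\x92s"))
```

Another option is to convert the input file to UTF-8 once, in a text editor, and leave the training script unchanged.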

-r requirements no tensorflow?

I installed everything on an AWS image.

When installing the requirements I ran into wheel issues, but eventually got the install to complete.

When running the script, I get these error messages:

no module numpy
no module tensorflow

Would it be possible to update the repo so that it works for enthusiasts who are trying to learn how to make a model?
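
A "no module" error after a successful install often means the script is run by a different Python interpreter than the one the packages were installed into. That is only an assumption about this report. The sketch below prints which interpreter is running and which of the named modules it cannot find; the function name `missing_modules` is hypothetical.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Run this with the same command used to start the training script.
print("interpreter:", sys.executable)
print("missing:", missing_modules(["numpy", "tensorflow"]))
```

If the interpreter path is not the one inside the environment where the requirements were installed, activating that environment before running the script is the first thing to try.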

Updated code for Paperspace

Hi,

I know this is somewhat outside the scope of this repo, but I can't install the NVIDIA Docker tools on macOS.
So I wanted to go for the Paperspace option, but their repo is outdated.

I have no other way of changing my physical setup. Could somebody help me set this up with Paperspace, or offer me an alternative where I don't need the NVIDIA Docker tools and don't have to train on CPU for months?

Any help would be much appreciated!

Kindest regards,

Update repo please??

I'm trying to get this to work on current platforms and versions, but it's very hard for a newbie like me.

Any help with this?

disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
  File "style.py", line 6, in <module>
    from optimize import optimize
  File "src/optimize.py", line 6, in <module>
    from utils import get_img
  File "src/utils.py", line 23
    return img
         ^
IndentationError: expected an indented block
(mlquinten) ubuntu@ip-172-31-51-152:~/training_styletransfer$ sudo nano src/utils.py
(mlquinten) ubuntu@ip-172-31-51-152:~/training_styletransfer$ bash run.sh
WARNING:tensorflow:From /home/ubuntu/mlquinten/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
  File "style.py", line 6, in <module>
    from optimize import optimize
  File "src/optimize.py", line 6, in <module>
    from utils import get_img
  File "src/utils.py", line 22
    return img
         ^
IndentationError: expected an indented block
(mlquinten) ubuntu@ip-172-31-51-152:~/training_styletransfer$ sudo nano src/utils.py
(mlquinten) ubuntu@ip-172-31-51-152:~/training_styletransfer$ bash run.sh
WARNING:tensorflow:From /home/ubuntu/mlquinten/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
ml5.js Style Transfer Training!
Note: This traning will take a couple of hours.
Training is starting!...
Train set has been trimmed slightly..
(1, 512, 362, 3)
UID: 25
Traceback (most recent call last):
  File "style.py", line 179, in <module>
    main()
  File "style.py", line 156, in main
    for preds, losses, i, epoch in optimize(*args, **kwargs):
  File "src/optimize.py", line 107, in optimize
    X_batch[j] = get_img(img_p, (256,256,3)).astype(np.float32)
ValueError: could not broadcast input array from shape (427,640,3) into shape (256,256,3)
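
The ValueError above says that `get_img` returned an image of shape (427,640,3) where the batch slot expects (256,256,3), so the image was not resized before assignment. Why the resize did not happen cannot be determined from this log, since the contents of src/utils.py are not shown. The following is a minimal NumPy-only sketch of a nearest-neighbour resize that yields the expected shape; `resize_nearest` is a hypothetical helper, not the repository's function, and a proper image library would give smoother results.

```python
import numpy as np


def resize_nearest(img, size):
    """Nearest-neighbour resize of an H x W x C array to size = (height, width)."""
    src_h, src_w = img.shape[:2]
    dst_h, dst_w = size
    # For every output row and column, pick the closest source index.
    rows = np.arange(dst_h) * src_h // dst_h
    cols = np.arange(dst_w) * src_w // dst_w
    return img[rows[:, None], cols]


# Same shapes as in the traceback: a 427 x 640 image into a 256 x 256 slot.
img = np.zeros((427, 640, 3), dtype=np.uint8)
batch = np.zeros((1, 256, 256, 3), dtype=np.float32)
batch[0] = resize_nearest(img, (256, 256)).astype(np.float32)
print(batch.shape)
```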

training with tensorflow-gpu

Hello !

@cvalenzuela
I tried to run the training process on Windows 10 with tensorflow-gpu and it failed.

I followed these instructions to get a working install of TensorFlow up and running:
https://www.pugetsystems.com/labs/hpc/The-Best-Way-to-Install-TensorFlow-with-GPU-Support-on-Windows-10-Without-Installing-CUDA-1187/

but when I tried to run the style.py script from my activated environment I got these errors:

(tf-gpu) D:\ml\style_transfer\training-styletransfer-master>python style.py --style matildeperez.jpg --checkpoint-dir checkpoints/ --model-dir models/ --test matta.jpg --test-dir tests/ --content-weight 1.5e1 --checkpoint-iterations 1000 --batch-size 20
ml5.js Style Transfer Training!
Note: This traning will take a couple of hours.
Training is starting!...
Train set has been trimmed slightly..
(1, 940, 1190, 3)
UID: 88
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1278, in _do_call
return fn(*args)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[20,256,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: Conv2D_35 = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Relu_30, Const_5)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "style.py", line 179, in <module>
main()
File "style.py", line 156, in main
for preds, losses, i, epoch in optimize(*args, **kwargs):
File "src\optimize.py", line 114, in optimize
train_step.run(feed_dict=feed_dict)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 2241, in run
_run_using_default_session(self, feed_dict, self.graph, session)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 4986, in _run_using_default_session
session.run(operation, feed_dict)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 877, in run
run_metadata_ptr)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1272, in _do_run
run_metadata)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[20,256,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: Conv2D_35 = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Relu_30, Const_5)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'Conv2D_35', defined at:
File "style.py", line 179, in <module>
main()
File "style.py", line 156, in main
for preds, losses, i, epoch in optimize(*args, **kwargs):
File "src\optimize.py", line 60, in optimize
net = vgg.net(vgg_path, preds_pre)
File "src\vgg.py", line 41, in net
current = _conv_layer(current, kernels, bias)
File "src\vgg.py", line 54, in _conv_layer
padding='SAME')
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1042, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\util\deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 3155, in create_op
op_def=op_def)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[20,256,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: Conv2D_35 = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Relu_30, Const_5)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

What should I do?

I tried running this on Ubuntu 18.04 with the same kind of Anaconda setup and got a similar error.

I was wondering whether the Docker container runs the GPU version of TensorFlow or only the CPU version?
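
The OOM message names a float tensor of shape [20,256,64,64], whose first dimension matches the `--batch-size 20` argument in the command. As a rough illustration, the sketch below computes the size of that single tensor and shows how it shrinks with a smaller batch. It assumes 4 bytes per float element; the function name `tensor_mib` is hypothetical, and the total GPU memory needed during training is much larger than any one tensor.

```python
def tensor_mib(shape, bytes_per_element=4):
    """Size in MiB of a dense tensor with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element / 2**20


print(tensor_mib([20, 256, 64, 64]))  # 80.0 (batch size 20, as in the error)
print(tensor_mib([4, 256, 64, 64]))   # 16.0 (batch size 4)
```

Since the size scales linearly with the first dimension, retrying with a lower `--batch-size` is a reasonable first step when the GPU runs out of memory.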

Training time / GPU usage

I am running the training and everything seems to be working fine, except that I am suspicious of the training time. I have an NVIDIA GeForce GTX 1080, and it takes around 30 seconds to run 10 iterations. An epoch has 60,000 iterations, yes? That means 2 epochs will take about 100 hours in total. Is this a normal training time for this model on my hardware?

I was afraid that maybe I am using only my CPU and not my GPU for some reason. style.py defines the device as gpu:0, but I am not sure whether it would give me an error if it couldn't access it.
(Not a programmer, so sorry if I am not specific enough! Thanks a lot for the help!)
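
The 100-hour estimate follows directly from the figures in the question. The sketch below reproduces the arithmetic; the 60,000 iterations per epoch is the poster's own figure and is not verified here, and the function name `training_hours` is hypothetical.

```python
def training_hours(seconds_measured, iterations_measured, iterations_per_epoch, epochs):
    """Extrapolate total training time in hours from a short timing sample."""
    seconds_per_iteration = seconds_measured / iterations_measured
    total_seconds = seconds_per_iteration * iterations_per_epoch * epochs
    return total_seconds / 3600


# 30 seconds per 10 iterations, 60,000 iterations per epoch, 2 epochs.
print(training_hours(30, 10, 60000, 2))  # 100.0
```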

Nvidia-docker for style transfer on aws help needed

As I'm unable to run this at home, I thought I would try it out on the AWS platform.

No luck so far. I'm stuck at this step:

sudo nvidia-docker run -e USER=$USER -e USERID=$UID -v $PWD:$PWD -w=$PWD -it -p 8888:8888 -p 6006:6006 -v ~/home/YourUserName/:/home cvalenzuelab/styletransfer bash

I don't know what I should do to run this on AWS. The above command doesn't give me any errors, but it also doesn't give me a folder when I go back to step 3 and need the /images folder.

Can anybody help me with this, please? I don't want to make too many mistakes on the AWS platform because it's expensive.
