hughperkins / deepcl

OpenCL library to train deep convolutional neural networks

License: Mozilla Public License 2.0

C 7.41% Python 7.40% JavaScript 0.91% C++ 81.67% Shell 0.35% CMake 1.89% Batchfile 0.37%

deepcl's Introduction

DeepCL

OpenCL library to train deep convolutional networks

C++
OpenCL
Deep convolutional
Python wrappers
Lua wrappers
Q-learning

APIs:

Layer types:

convolutional
max-pooling
normalization
activation
dropout
random translations
random patches
loss

Loss layer types:

softmax
cross-entropy (synonymous with multinomial logistic, etc)
square loss

Trainers:

SGD
Anneal
Nesterov
Adagrad
Rmsprop
Adadelta

Activations:

tanh
scaled tanh (1.7159 * tanh((2/3) * x))
linear
sigmoid
relu
elu (new!)
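
The activations listed above can be written out as plain scalar functions. The following is a minimal Python sketch for illustration only, not DeepCL's OpenCL kernels. The scaled tanh constants (1.7159 and 2/3) are the commonly cited values and alpha=1 for ELU is an assumption; check both against the DeepCL source before relying on them.

```python
import math

# Assumed constants for scaled tanh (commonly cited values, not read from DeepCL).
SCALED_TANH_A = 1.7159
SCALED_TANH_B = 2.0 / 3.0


def tanh(x):
    return math.tanh(x)


def scaled_tanh(x):
    return SCALED_TANH_A * math.tanh(SCALED_TANH_B * x)


def linear(x):
    return x


def sigmoid(x):
    # Branch on the sign so that math.exp never receives a large positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    return max(0.0, x)


def elu(x, alpha=1.0):
    # alpha=1.0 is an assumption made for this sketch.
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)
```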

Loader formats:

jpegs
mnist
kgsv2
norb

Weight initializers:

original
uniform
more possible...

Multicolumn net also possible, as in McDnn

Example usages

obtained 37.2% test accuracy, on next move prediction task, using 33.6 million training examples from kgsgo v2 dataset
- commandline used ./deepcl_train dataset=kgsgoall netdef=12*(32c5z-relu)-500n-tanh-361n numepochs=15 learningrate=0.0001
- 2 epochs, 2 days per epoch, on an Amazon GPU instance, comprising half an NVidia GRID K520 GPU (about half as powerful as a GTX780)
obtained 99.5% test accuracy on MNIST, using netdef=rt2-8c5z-relu-mp2-16c5z-relu-mp3-150n-tanh-10n numepochs=20 multinet=6 learningrate=0.002
- epoch time 99.8 seconds, using an Amazon GPU instance, i.e. half an NVidia GRID K520 GPU (6 nets are trained in parallel, so about 16.6 seconds per epoch per net)
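
The netdef strings above use a compact notation in which a prefix such as 12*(32c5z-relu) repeats a group of layers twelve times, joined by hyphens. The following is a minimal Python sketch of that expansion, written only from the examples on this page. It is not DeepCL's own parser, and expand_netdef is a hypothetical name.

```python
import re

# Matches "<count>*(<group>)" where the group contains no nested parentheses.
_REPEAT = re.compile(r"(\d+)\*\(([^()]*)\)")


def expand_netdef(netdef):
    """Expand every "N*(group)" in a netdef string into N hyphen-joined copies."""
    while True:
        match = _REPEAT.search(netdef)
        if match is None:
            return netdef
        count = int(match.group(1))
        group = match.group(2)
        repeated = "-".join([group] * count)
        netdef = netdef[:match.start()] + repeated + netdef[match.end():]


print(expand_netdef("2*(32c5z-relu)-500n-tanh-361n"))
```

A netdef without a repeat prefix, such as the MNIST one above, passes through unchanged.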

Installation

Native library installation

This section installs the native libraries, and the command-line tools. You always need to do this part, even if you will use the Python wrappers.

Windows

Pre-requisites:

OpenCL-enabled GPU or APU, along with appropriate OpenCL driver installed
Tested using Windows Server 2012 R2 and (New!) Visual Studio 2015; this is how the CI builds run

Procedure:

Download latest binary zip file from http://deepcl.hughperkins.com/Downloads/ (eg from v8.0.0rc8)
unzip it, which creates the dist folder
To test it:
- open a cmd
- run call dist\bin\activate.bat (adjusting the path appropriately for wherever you downloaded deepcl binaries to)
- now, eg try deepcl_unittests
- (New!), you can choose which gpu to run tests on now, eg: deepcl_unittests gpuindex=1

Note that you need to "activate" the installation each time you open a new cmd prompt (or you could add appropriate environment variables permanently, using Control Panel | System | Advanced System Settings | Environment Variables)

Linux

Pre-requisites:

OpenCL-enabled GPU or APU, along with appropriate OpenCL driver installed (can check by running clinfo, which should show your desired GPU device)
Tested using Ubuntu 14.04 32-bit/64-bit

Procedure:

Download latest tar file from http://deepcl.hughperkins.com/Downloads/ (eg from v8.0.0rc8)
untar it, which creates the dist sub-folder
in a bash prompt, run source dist/bin/activate.sh (adjusting the path appropriately for wherever you untarred the binaries tar file to)
test by doing, from the same bash prompt, eg deepcl_unittests
- (New!), you can choose which gpu to run tests on now, eg: deepcl_unittests gpuindex=1

Note that you need to "activate" the installation each time you open a new bash prompt (or you can call activate.sh from your .bashrc file)

Python wrappers

make sure you already installed the native library, and "activate"d it, by doing call dist\bin\activate.bat, or source dist/bin/activate.sh
run pip install --pre DeepCL
test by doing python -c "import PyDeepCL; cl = PyDeepCL.DeepCL()"

To build from source

Building from source is only needed if installing from binaries doesn't work for your configuration, or if you want to modify DeepCL.

See Build.md

What if it doesn't run?

Check if you have an OpenCL-enabled device on your system
- ideally a GPU, or accelerator, since there is no attempt to optimize DeepCL for CPUs (at least, not currently, could change, feel free to submit a pull request :-) )
Try running gpuinfo (from EasyCL, but built as part of this project too, for ease of use )
- it should output at least one OpenCL-enabled device
- if it doesn't, then you need to make sure you have an OpenCL-enabled device, and that appropriate drivers are installed, and that the ICD is configured appropriately (registry in Windows, and /etc/OpenCL/vendors in linux)

What if I need a new feature?

Please raise an issue, let me know you're interested.

If it's on my list of things I was going to do sooner or later anyway (see below), I might do it sooner rather than later.
If it's to do with usability, I will try to make that a priority

What if I want to contribute myself?

please feel free to fork this repository, tweak things, send a pull request. Or get in contact. Or both :-)

Third-party libraries

EasyCL
clew
libpng++
lua
cogapp

Hardware/driver specific issues

If you're using Clover, you might want to look at:
- this thread #35
- this branch https://github.com/hughperkins/DeepCL/tree/clover-compatibility
- Note that Clover is NOT supported, these are just provided as "starting-points", in case someone wants to dabble in this :)

Related projects

kgsgo-dataset-preprocessor Dataset based on kgsgo games; 33 million data points
cltorch
clnn

License

Mozilla Public License 2.0

Recent changes

2017 May 2nd:
- branch update-easycl-mac updated to latest EasyCL, and unit-tests tested on Mac Sierra against:
  - Intel HD Graphics 530 GPU
  - Radeon Pro 450 GPU
- This latest EasyCL lets you use environment variable CL_GPUOFFSET to select gpus, eg set to 1 for second GPU, or 2 for third
- Thank you to my employer ASAPP for providing me use of said Mac Sierra :-)
7th August 2016:
- "standard" version of windows compiler changed from msvc2010 to msvc2015 update 3 (no change to linux/mac)
- "standard" version of python 3.x on windows changed from 3.4 to 3.5 (no change to linux/mac)
- (note: python2.7 continues to work as before on all of Windows 32/64, linux, Mac)
- standard c++ version on linux/mac changed from c++0x to c++11
29th July 2016:
- python fixes:
  - CHANGE: must use numpy tensors now, array.array no longer accepted
  - New feature: can provide numpy tensors as 4d tensors now, no longer have to be 1d tensors
  - Bug fix: q-learning working again now (hopefully)
26th July 2016:
- fixed some bugs in manifest loader
- no longer need to specify the number of images in the first line of the manifest file
- added gpuindex= option to deepcl_unittests (quite beta for now...)
4th January 2016:
- fixed a number of build warnings on Mac, both in OpenCL build, and C++ build
3rd January 2016:
- create Mac OS X build on Travis, and fix the build, https://travis-ci.org/hughperkins/DeepCL
27th November:
- added ELU
Week of 26th October:
- created branch clblas-2.8.0, which works with Visual Studio 2015. It uses the latest 2.8.x release of clBLAS. Thank you to jakakonda for helping to test this and get it working.
Aug 28th:
- merged 8.x branch to master, will release first version of 8.x shortly
- installation of 8.x from binaries on Windows works now, by doing, eg on 32-bit Windows 7, and assuming you already activated an appropriate python environment (assumes 7-zip is installed, in default location, otherwise do the unzip by hand):

powershell Set-ExecutionPolicy unrestricted
rem following command is like `wget` in linux:
powershell.exe -Command (new-object System.Net.WebClient).DownloadFile('http://deepcl.hughperkins.com/Downloads/deepcl-win32-v8.0.0rc8.zip', 'deepcl-win32-v8.0.0rc8.zip')
rem following command is like `tar -xf` in linux:
"c:\program files\7-Zip\7z.exe" x deepcl-win32-v8.0.0rc8.zip
call dist\bin\activate.bat
pip install --pre DeepCL
python -c "import PyDeepCL; cl = PyDeepCL.DeepCL()"
rem (last line is just to check it works ok)

Aug 26th: installation of 8.x from binaries on linux works now, by doing, eg on 64-bit Ubuntu 14.04:

mkdir 8.0.0rc4
cd 8.0.0rc4
wget http://deepcl.hughperkins.com/Downloads/deepcl-linux64-v8.0.0rc4.tar.bz2
tar -xf deepcl-linux64-v8.0.0rc4.tar.bz2
virtualenv env
source env/bin/activate
source dist/bin/activate.sh
pip install --pre DeepCL
python -c "import PyDeepCL; cl = PyDeepCL.DeepCL()"

(last line is just to check it works ok)

Aug 21st-24th:
- 8.x finally builds again on all CI tested configurations!
  - ubuntu 14.04 32-bit Python 2.7
  - ubuntu 14.04 32-bit Python 3.4
  - ubuntu 14.04 64-bit Python 2.7
  - ubuntu 14.04 64-bit Python 3.4
  - visual studio 2010 32-bit python 2.7
  - visual studio 2010 32-bit python 3.4
  - visual studio 2010 64-bit python 2.7
  - visual studio 2010 64-bit python 3.4
Aug 19th-20th:
- Python wrappers now built using a very thin setup.py layer, on top of the standard native DeepCL build
Aug 18th:
- added BackwardIm2Col layer, which uses im2col for backward propagation
- added BackpropWeightsIm2Col layer, which uses im2col for weight update
- added BackwardAuto layer, which automatically selects fastest Backward layer
- added BackpropWeightsAuto layer, which automatically selects the fastest weight update layer
- under the covers:
  - created ClBlasHelper, to handle Gemm and Gemv
  - factorized im2col into Im2Col class
week up to Aug 17th:
- added forward and backward im2col layer
- forward im2col automatically used during forward propagation, where appropriate
- backwards has yet to be integrated
- under the covers:
  - added clBLAS
  - migrated the Python build process to use cmake, rather than setup.py (whether this turns out to be good or bad is a bit up in the air for now)
June 22nd:
- removed lua wrappers
- if you want to use lua with OpenCL, please consider using cltorch and clnn

To get in contact

Just create an issue on GitHub, using the link in the top right of this page. Don't worry about whether you think the issue sounds silly or anything. The more feedback the better!

Note that I'm currently focused 100.000% on cuda-on-cl, so please be patient during this period.


deepcl's Issues

Factorize kernels

Factorize kernels:

eg move activation to separate layers
need to check this doesn't affect performance too much, hence need the end-to-end benchmarking in place first.

Get swig python wrappers working with numpy.i

the cython python wrappers, in python directory, can accept numpy arrays (I think...)
however, the swig-based ones, in python_swig directory, cannot yet
numpy provides numpy.i, in the tools\swig directory of their .tar.gz distribution
could be good to integrate numpy.i into the python swig wrappers, so can directly pass numpy arrays into the python swig wrappers

elu activation

Hi,
can it be that your derivative of the elu activation function is wrong?

elu: (alpha=1 is left out of the equation)
forward: x >= 0 ? x : exp(x) - 1 correct
backward: x >= 0 ? 1 : exp(x) instead of x >= 0 ? 1 : x + 1

thanks,
filip
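
One way to check the question above is to compare both candidate backward formulas against a finite-difference estimate. The sketch below assumes alpha=1 and is not taken from DeepCL's kernels; all function names are hypothetical. It illustrates that exp(x) is the derivative with respect to the layer input, and that y + 1 gives the same value when y is the forward output, so an "x + 1" expression is only correct if that variable holds the output rather than the input.

```python
import math


def elu(x):
    # Forward pass with alpha = 1 (assumption for this sketch).
    return x if x >= 0 else math.exp(x) - 1.0


def elu_grad_from_input(x):
    # Derivative expressed in terms of the layer input x.
    return 1.0 if x >= 0 else math.exp(x)


def elu_grad_from_output(y):
    # Same derivative expressed in terms of the forward output y = elu(x).
    return 1.0 if y >= 0 else y + 1.0


def numeric_grad(f, x, eps=1e-6):
    # Central finite difference.
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


for x in (-2.0, -0.5, 0.5, 2.0):
    reference = numeric_grad(elu, x)
    print(x, reference, elu_grad_from_input(x), elu_grad_from_output(elu(x)))
```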

Add DropoutLayer to lua, cython and python_swig wrappers

Unit tests fail

Hi, I ran unit tests and ran into some errors.
Compiled with Visual C++ 2015 x64 on Windows 7 with Radeon 7970.

Different errors:
"unknown file: error: C++ exception with description "memallocsize too small to use this kernel on this device. Need: 0MB, but only have: -1984MB max alloc size" thrown in the test body.
[ FAILED ] testforward.compare_1_n_biased_nopad (2230 ms)"

"error: Expected: (0.1f) >= (loss), actual: 0.1 vs 2.72727
clblas teardown
[ FAILED ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n18 (11310 ms)"

"ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical"

"Something went wrong, code -55" thrown in the test body.
[ FAILED ] testbackward.compare_1_n_kgsgo_32c5 (843 ms)"

"ForwardAuto: kernel 2: this instance cant be used: cannot use forward2, since outputimagesize * outputimagesize > maxworkgroupsize"

Full log http://pastebin.com/v9d7ZMFu

Gpuinfo output:

platform index: 0:
platform id: 000007FEDD2AF180
platform vendor: Advanced Micro Devices, Inc.
platform name: AMD Accelerated Parallel Processing
platform num devices: 2

device index: 0
device id: 0000000000379A20
device type: 4
global memory size: 3072MB
local memory size: 32KB
global cache size: 16KB
global cacheline size: 64
max memory alloc size: 2112MB
max compute units: 32
max workgroup size: 256
max workitem dimensions: 3
max workitem sizes: 256 256 256
device name: Tahiti
opencl c version: OpenCL C 1.2
opencl device version: OpenCL 1.2 AMD-APP (1800.8)
frequency MHz: 925
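
The negative figure in the first error quoted above ("only have: -1984MB max alloc size") is consistent with the device's 2112MB max memory alloc size, as reported by gpuinfo, being held in a signed 32-bit integer. This is an inference from the numbers in the report, not a confirmed diagnosis of DeepCL's code, and the helper names below are hypothetical.

```python
MB = 1024 * 1024


def to_int32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement wrap)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def wrapped_megabytes(megabytes):
    """Size in MB after the byte count has passed through a signed 32-bit int."""
    # Whole-megabyte inputs stay whole multiples of MB after wrapping,
    # so the integer division below is exact.
    return to_int32(megabytes * MB) // MB


print(wrapped_megabytes(2112))
```

Sizes below 2048MB are unaffected by this wrap, which would explain why the problem only shows up on devices with a larger max alloc size.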

PyDeepCL: ActivationMaker does not know scaledtanh

As a Python user, I am not able to use
net.addLayer(PyDeepCL.ActivationMaker().scaledtanh())
because scaledtanh is not yet defined on ActivationMaker.

Make benchmarks display prettier / better formatted

Make benchmarks display prettier / better formatted, eg http://hughperkins.github.io/DeepCL/benchmarking/4.0.0rc8-33c637c.txt is not very readable.

Install on mac (el capitan)

Hello,
Is there any tutorial on how to install it on Mac and run it with Xcode? It would be very helpful.
If I manage to install it correctly, I can write a tutorial, but for now I can't help very much.
Thank you.

Where/How to start reading csv data?

Hi,
my problem data is available in CSV format. I have found no documentation on how to load this data. I was also surprised that no one else has requested this feature.
As a developer, I would simply add a custom Loader class.
It may be useful to someone else too, so it should follow your guidelines. I would then send it as a pull request.

Add doxygen documentation to each class and method marked PUBLICAPI

all stable methods and classes are now tagged as PUBLICAPI, in 4.x.x
Doxygen documentation for the stable classes and methods, in 4.x.x, is available at http://hughperkins.github.io/DeepCL/4.x.x/html/annotated.html
Needs someone to go through each of these, and add a short description

Get ctrl-c working in python swig wrappers

the cython wrappers, in python subdirectory, currently abort correctly when ctrl-c is pressed (tested on linux at least)
the swig ones, in python_swig subdirectory, do not
would be good to make them do so :-)

Multiple CPU/APU/GPU examples

Do you have any examples with multiple CPU/APU/GPU?

Add python swig wrappers to pypi

plausible name 'DeepCLSwig'

Adapt DeepCL to be able for replacing cuDNN in caffe

There are at least two attempts to make caffe use OpenCL:
https://github.com/naibaf7/caffe
https://github.com/lunochod/caffe

At present caffe is focused on CUDA as its GPU backend. In addition to its own *.cu code, it uses NVIDIA's proprietary cuDNN library, which is similar to DeepCL (at least in purpose and in the layers it implements). Would you consider writing some kind of wrapper that could serve as a cuDNN replacement for caffe?

migrate python, lua, python_swig wrappers to use new ActivationLayer, and propagate/backward naming convention

DeepCL and optical flow computation

Hi there,

I'm sorry if this is not the best way to contact you. I'm not really reporting an issue, but rather I'd like to ask a question. Do you believe that DeepCL could be suitable for estimating optical flow from video frames, as done, for example, in http://damienteney.info/cnnFlow.htm? That project is very good, but it is based on MatConvNet, which only supports CUDA computations, while I need an OpenCL-based solution.

Thank you for your time.

Best regards,

Davide

VS2015 Build Errors

Hi!

I'm trying to build the DeepCL libraries from scratch with Visual Studio 2015, MSVC 14.0 compiler. I do know that this is not officially supported, but I love the way this library is written (and that it supports OpenCL), so I decided to give it a shot.

The error below is shown ~4500 times, on a couple of different lines, and I have no idea where to start dealing with it.

I also tried building clBLAS as a standalone cloned from clMathLibraries/clBlas git repository and it works without any problems.
Same problem occurs on 32 and 64bit compiler settings, tested under Win 10 x64.

Severity	Code	Description	Project	File	Line
Error	C2719	'alpha': formal parameter with requested alignment of 16 won't be aligned	DeepCL	D:\DeepCL-Source\clMathLibraries\clBLAS\src\clBLAS.h	4149
Error	C2719	'beta': formal parameter with requested alignment of 16 won't be aligned	DeepCL	D:\DeepCL-Source\clMathLibraries\clBLAS\src\clBLAS.h	5663

Hook for model serialization

I've modified your qlearning example to play tetris and all is fine and dandy, but it would be nice to save the net somehow so that I can resume training or use the trained net in another application. Are there any entry points in the current API for doing this, or do I need to write something myself?

Supervised Classification Example

Is there any example for supervised classification of feature vectors? Can you please share some pointers?

Add end to end timing benchmark to benchmarks

since I'm pondering factorizing activation layers and so on out, I want to check this doesn't affect timings too much

Predicting use kgsgo filter

Hello,
I'm new to this domain. After training a filter 'weights.dat' with the kgsgov2 dataset, I'm wondering how to predict (generate) a move from a given board state (maybe I'm missing some documentation that explains how it works). I tried using testLoadSgf.py to make a binary file as the input file for prediction, and got the following:

ManifestLoaderv1 checking format for move1.dat
matched: 0
GenericLoader::getDimensions
trainFilepath: move1.dat
headstringGO
Something went wrong: Filetype of move1.dat not recognised

please help, thanks!
Deryk.

PyDeepCL: NetLearner not callable with new trainer types

PyDeepCL.NetLearner(adadelta, net, N, images, labels, testN, testImages, testLabels, batchSize)
results in
TypeError: Argument 'sgd' has incorrect type (expected PyDeepCL.SGD, got PyDeepCL.Adadelta)

Same error with Anneal, Nesterov, Adagrad and Rmsprop.

Doc from NeuralNetAPI gives trainer:

NetLearner netLearner(
    trainer, net,   // <<
    Ntrain, trainData, trainLabels,
    Ntest, testData, testLabels );

Float values as output

Is there a way to get float values as "labels" data?
The idea would be to use the network as a regressor, of course.

Replacing InputMaker with InputLayerMaker

This line from the NN API:

net->addLayer(InputMaker::instance()->numPlanes(1)->imageSize(28));

Causes the compilation error:

nn.cpp: In function 'int main()':
nn.cpp:11:19: error: incomplete type 'InputMaker' used in nested name specifier
  net->addLayer(InputMaker::instance()->numPlanes(1)->imageSize(28));
               ^

Following the code in the test files though (testsimpleconvolvenet.cpp#L236), and using InputLayerMaker, compilation succeeds

net->addLayer(InputLayerMaker::instance()->numPlanes(1)->imageSize(28));

Should the docs be changed to use InputLayerMaker instead of InputMaker, or is my library linking incorrect?

Thanks

Error when run "deepcl_unittests.exe" on Win7 x64

error code:-1

C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.

It seems that these tests can't find my GPU, although other tests can find it.
You can see the last part of test log as follows:

[ RUN      ] testsgd.basic
Using NVIDIA Corporation , OpenCL platform: NVIDIA CUDA
Using OpenCL device: GeForce GTX 760
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilter
s=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:SquareLossLayer{}

inputtotalsize=50 outputTotalSize=18
forward try kernel 0
  ... not plausibly optimal, skipping
forward try kernel 1
   ... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
  ... not plausibly optimal, skipping
calcGradWeights try kernel 1
   ... seems valid
BackpropWeightsAuto: kernel 1 0ms
[       OK ] testsgd.basic (234 ms)
[----------] 1 test from testsgd (234 ms total)

[----------] 9 tests from testCLMathWrapper
[ RUN      ] testCLMathWrapper.assign
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.assign (0 ms)
[ RUN      ] testCLMathWrapper.assignScalar
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.assignScalar (0 ms)
[ RUN      ] testCLMathWrapper.addinplace
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.addinplace (0 ms)
[ RUN      ] testCLMathWrapper.multiplyinplace
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.multiplyinplace (0 ms)
[ RUN      ] testCLMathWrapper.addscalar
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.addscalar (0 ms)
[ RUN      ] testCLMathWrapper.sqrt
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.sqrt (0 ms)
[ RUN      ] testCLMathWrapper.squared
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.squared (0 ms)
[ RUN      ] testCLMathWrapper.inverse
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.inverse (0 ms)
[ RUN      ] testCLMathWrapper.perelementmult
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testCLMathWrapper.perelementmult (0 ms)
[----------] 9 tests from testCLMathWrapper (15 ms total)

[----------] 1 test from testreducesegments
[ RUN      ] testreducesegments.basic
Using NVIDIA Corporation , OpenCL platform: NVIDIA CUDA
Using OpenCL device: GeForce GTX 760
[       OK ] testreducesegments.basic (94 ms)
[----------] 1 test from testreducesegments (94 ms total)

[----------] 4 tests from testGpuOp
[ RUN      ] testGpuOp.addinplace
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testGpuOp.addinplace (0 ms)
[ RUN      ] testGpuOp.addoutofplace
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testGpuOp.addoutofplace (0 ms)
[ RUN      ] testGpuOp.inverse
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testGpuOp.inverse (0 ms)
[ RUN      ] testGpuOp.addscalarinplace
unknown file: error: C++ exception with description "Error getting OpenCL device
 ids: -1" thrown in the test body.
[  FAILED  ] testGpuOp.addscalarinplace (0 ms)
[----------] 4 tests from testGpuOp (15 ms total)

[----------] 1 test from testjpeghelper
[ RUN      ] testjpeghelper.writeread
[       OK ] testjpeghelper.writeread (0 ms)
[----------] 1 test from testjpeghelper (0 ms total)

[----------] Global test environment tear-down
[==========] 158 tests from 29 test cases ran. (63525 ms total)
[  PASSED  ] 145 tests.
[  FAILED  ] 13 tests, listed below:
[  FAILED  ] testCLMathWrapper.assign
[  FAILED  ] testCLMathWrapper.assignScalar
[  FAILED  ] testCLMathWrapper.addinplace
[  FAILED  ] testCLMathWrapper.multiplyinplace
[  FAILED  ] testCLMathWrapper.addscalar
[  FAILED  ] testCLMathWrapper.sqrt
[  FAILED  ] testCLMathWrapper.squared
[  FAILED  ] testCLMathWrapper.inverse
[  FAILED  ] testCLMathWrapper.perelementmult
[  FAILED  ] testGpuOp.addinplace
[  FAILED  ] testGpuOp.addoutofplace
[  FAILED  ] testGpuOp.inverse
[  FAILED  ] testGpuOp.addscalarinplace

13 FAILED TESTS
  YOU HAVE 2 DISABLED TESTS


E:\dist\bin>

Output from network

In a multi class problem like MNIST, how can I get the output of a softmax layer for each class (plane) using the python wrapper? getOutput returns a list of what seems to be probabilities, but what these probabilities represent is unclear to me!
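
For a softmax layer there is one output value per class for each example, and the values for one example sum to 1. The following is a minimal sketch in plain Python, assuming the flat list returned by getOutput is laid out example by example with one value per class. That layout is an assumption to verify against your build, and split_per_example and predicted_classes are hypothetical helpers, not part of PyDeepCL.

```python
def split_per_example(flat_output, num_classes):
    """Split a flat output list into one row of class probabilities per example."""
    if len(flat_output) % num_classes != 0:
        raise ValueError("output length is not a multiple of num_classes")
    return [
        flat_output[start:start + num_classes]
        for start in range(0, len(flat_output), num_classes)
    ]


def predicted_classes(flat_output, num_classes):
    """Return the index of the largest probability for each example."""
    rows = split_per_example(flat_output, num_classes)
    return [max(range(num_classes), key=lambda c, row=row: row[c]) for row in rows]


# Two examples, three classes each (made-up numbers for illustration).
example_output = [0.1, 0.7, 0.2, 0.6, 0.3, 0.1]
print(predicted_classes(example_output, 3))
```

For MNIST, num_classes would be 10 and each row would hold the probabilities for digits 0 to 9.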

huge RAM used

Maybe this is more a question than an issue, but is it normal that the RAM used keeps increasing as training proceeds?

With the following network, I have ~70 GB (!) of used RAM after 14 epochs (input and output are float arrays):

N = 12312
batchSize = 171
cl = PyDeepCL.DeepCL()
net = PyDeepCL.NeuralNet(cl)
net.addLayer(PyDeepCL.InputLayerMaker().numPlanes(49).imageSize(3))
net.addLayer(PyDeepCL.ConvolutionalMaker().numFilters(20).filterSize(3).biased())
net.addLayer(PyDeepCL.ActivationMaker().relu())
net.addLayer(PyDeepCL.ConvolutionalMaker().numFilters(49).filterSize(1).biased())
net.addLayer(PyDeepCL.SquareLossMaker())
sgd = PyDeepCL.SGD(cl, 0.00002, 0.0001)

Get the soumith benchmark working

fatal error: 'png++/png.hpp' file not found

/Users/userone/Documents/workspace/DeepCL/src/util/ImagePng.h:10:10: fatal error: 'png++/png.hpp' file not found
In file included from /Users/userone/Documents/workspace/DeepCL/test/testPatchExtractor.cpp:5:
/Users/userone/Documents/workspace/DeepCL/src/util/ImagePng.h:10:10: fatal error: 'png++/png.hpp' file not found
#include "png++/png.hpp"#include "png++/png.hpp"

         ^         ^

1 error generated.
1 error generated.

Run time error report

Hi Hugh,

I have successfully compiled DeepCL on my PC (Windows 7 64Bit, Visual Studio 2010 x86). But when I run the mnist example, it reports an error:

ForwardAuto: kernel 1 6ms
clblas teardown
Something went wrong: Label 28 exceeds number of softmax planes 10

The script I used to run this demo is:

deepcl_train.exe datadir=. trainfile=train-images.idx3-ubyte validatefile=t10k-images.idx3-ubyte

When I traced the error, I found it is in the learning loop while(!netLearner->isLearningDone()) {...}.

Could you give me some clue about how to fix this problem? Thanks.

Add weight regularization

add weight regularization

Windows 7 installation failed

I ran into an error during installation on Windows 7, Python 2.7 (from Anaconda).
pip install deepcl or python setup.py install resulting...

Is it a bug in installation script?

Replace the #defines in some of the opencl cl files with inline method calls

currently, some opencl cl implementations use #define macros for speed
it is plausible that all opencl implementations will inline function calls
therefore, we can plausibly replace these #define macros with standard function calls

Whoever does this would ideally need to benchmark before and after, on a fairly standard gpu, to check that the change doesn't in fact cause a speed reduction.

MNIST test fails to run on Radeon 8750M

Radeon 8750M on HP Probook 470 G1

clinfo:

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 MESA 11.0.4
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD OLAND (DRM 2.42.0, LLVM 3.7.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 MESA 11.0.4
  Driver Version                                  11.0.4
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE

Error:

initializing clblas
cl/activate.cl build log: 
input.cl:28:23: warning: implicit declaration of function 'tanh' is invalid in C99
input.cl:11:42: note: expanded from macro 'ACTIVATION_FUNCTION'
unsupported call to function tanh in activate

Remove unnecessary null pointer checks

Extra null pointer checks are not needed in functions like the following.

Add dropout

dropout is a vital function, not currently implemented. pretty vital for any self-respecting convolutional neural network really ;-)
it doesn't need to be added to all propagate and backprop implementations, but probably at least to:
- Propagate1: generic gpu-based forward propagation layer
- BackpropErrorsv2Naive: generic gpu-based backprop of error gradient to previous layer
- BackpropWeights2Naive: generic gpu-based backprop of error gradient onto weights of same layer
you'd need to also add the Dropout option to ConvolutionalMaker class
and you'd need to make all other implementations (Propagate2 etc...) throw a runtime_error, in their constructor, if dropout is requested in the passed-in maker object
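
As an illustration of what such a layer computes, here is a minimal sketch of the common "inverted dropout" formulation in plain Python. It describes the general technique only, not DeepCL's implementation; the function names and the choice to rescale at training time are assumptions.

```python
import random


def dropout_forward(inputs, drop_ratio, rng):
    """Zero each input with probability drop_ratio and rescale the survivors."""
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    keep = 1.0 - drop_ratio
    mask = [1 if rng.random() < keep else 0 for _ in inputs]
    outputs = [x * m / keep for x, m in zip(inputs, mask)]
    return outputs, mask


def dropout_backward(grad_output, mask, drop_ratio):
    """Pass gradients only through the units kept in the forward pass."""
    keep = 1.0 - drop_ratio
    return [g * m / keep for g, m in zip(grad_output, mask)]


rng = random.Random(0)  # fixed seed so the example is repeatable
outputs, mask = dropout_forward([1.0, 2.0, 3.0, 4.0], 0.5, rng)
grads = dropout_backward([1.0, 1.0, 1.0, 1.0], mask, 0.5)
print(outputs, mask, grads)
```

The same mask must be reused in the backward pass, which is why the forward function returns it. At prediction time the layer would simply pass its input through unchanged.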

Support for 1D and Non-square convolutional kernels

Currently only 2D square convolutional kernels are supported.
Thx.

Mac OSX install

How do you get this to work on Mac OSX?

Try using unroll+clblas GEMM

Following this article, http://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/ http://www.reddit.com/r/MachineLearning/comments/338lfs/why_gemm_is_at_the_heart_of_deep_learning/ , I decided to try this, in case it gives an easy way to speed up DeepCL for large image sizes.

My verdict? Not useful :-(

Tried on my laptop, and on a K520, and the results were:

unroll + matmult on cpu is a bit faster than direct cpu convolution. I suppose this is because memory access patterns are better
unroll + clblas was faster again
the most naive convolutional opencl kernel, ie not using any type of unroll or gemm, was the fastest

For batchsize=128, inputplanes=32, inputsize=128, numfilters=32, filtersize=5, on a K520 got:

convolution + cpu: 318s
unrolled + cpu: 218s
unrolled +clblas: invalid command queue
no unrolling, propagate1: 2s

Matrices are apparently a bit too big for unroll + clblas, so tried using a smaller batchsize:
batchsize=16, inputplanes=32, inputsize=128, numfilters=32, filtersize=5:

convolution + cpu: 39s
unroll + cpu: 26s
unroll + clblas GEMM: 2.2s
propagate1: 0.27s

Note that propagate1 is DeepCL's most generic, least optimized kernel. It doesn't use local memory (which is why it's generic, and works on anything really, unless it runs out of gpu global memory). Kernels using local memory are around 3-10 times faster than propagate1.

Overall, the current conclusion is that unroll + clblas GEMM doesn't seem promising.

=> closing issue.
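
For readers unfamiliar with the "unroll + GEMM" approach benchmarked above, the idea is to copy every filter-sized patch of the input into one row of a matrix (often called im2col), so that the convolution becomes a single matrix multiplication. The following is a minimal pure-Python sketch for one square image, stride 1, no padding. The function names are hypothetical and nothing here reflects DeepCL's or clBLAS's actual code.

```python
def im2col(image, filter_size):
    """Unroll image[plane][row][col] into one flattened patch per output pixel."""
    planes = len(image)
    out_size = len(image[0]) - filter_size + 1
    patches = []
    for out_row in range(out_size):
        for out_col in range(out_size):
            patch = []
            for plane in range(planes):
                for dr in range(filter_size):
                    for dc in range(filter_size):
                        patch.append(image[plane][out_row + dr][out_col + dc])
            patches.append(patch)
    return patches


def conv_via_gemm(image, filters):
    """Convolve using the unrolled matrix: result[filter][output pixel]."""
    filter_size = len(filters[0][0])
    patches = im2col(image, filter_size)
    flat_filters = [[w for plane in f for row in plane for w in row] for f in filters]
    # This double loop is the matrix product (filters x patches^T).
    return [[sum(w * v for w, v in zip(ff, patch)) for patch in patches]
            for ff in flat_filters]


def conv_direct(image, filters):
    """Reference: direct convolution (no filter flipping, as is usual in CNNs)."""
    filter_size = len(filters[0][0])
    out_size = len(image[0]) - filter_size + 1
    result = []
    for f in filters:
        plane_out = []
        for r in range(out_size):
            for c in range(out_size):
                plane_out.append(sum(
                    f[p][dr][dc] * image[p][r + dr][c + dc]
                    for p in range(len(image))
                    for dr in range(filter_size)
                    for dc in range(filter_size)))
        result.append(plane_out)
    return result


image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]  # 1 plane, 3x3
filters = [[[[1, 0], [0, 1]]]]               # 1 filter, 1 plane, 2x2
print(conv_via_gemm(image, filters))
```

The unrolled matrix repeats each input value up to filtersize * filtersize times, so it is much larger than the input itself, which is in line with the observation above that the matrices became too big at batchsize=128.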

Stuck, but GPU still running

While running the example, it has been stuck here for almost 24 hours,

and I checked that the GPU is still working!

ubgpu@ubgpu:~/github/DeepCL_Kgsgo/DeepCL/build$ ./deepclrun dataset=kgsgoall netdef=12_(32c5z-relu)-500n-tanh-361n numepochs=15 learningrate=0.0001
Using dataset kgsgoall:
datadir: ../data/kgsgo:
trainfile: kgsgo-trainall-v2.dat:
validatefile: kgsgo-test-v2.dat:
Ntrain 33630595 numPlanes 7 imageSize 19
Ntest 18860 Ntest
after load images 759 ms
image stats mean 12.3638 stdDev 54.7709
image norm translate -12.3638 scale 0.00912893
after getting stats 96 ms
Using NVIDIA Corporation platform: NVIDIA CUDA
Using device: GeForce GTX 970
netDefLower [12_(32c5z-relu)-500n-tanh-361n]
nnString: [12]
repeatNum 12
remainderString [(32c5z-relu)-500n-tanh-361n]
inner [32c5z-relu]
newRemainder [-500n-tanh-361n]
postfix [500n-tanh-361n]
multiplied string: 32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-32c5z-relu-500n-tanh-361n
GpuAdd: building kernel
CopyBuffer: building kernel
Using trainer SGD{ learningRate=0.0001, momentum=0 }
layer 0:InputLayer{ outputPlanes=7 outputImageSize=19 }
layer 1:NormalizationLayer{ outputPlanes=7 outputImageSize=19 translate=-12.3638 scale=0.00912893 }
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=7 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 3:ActivationLayer{ RELU }
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 5:ActivationLayer{ RELU }
layer 6:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 7:ActivationLayer{ RELU }
layer 8:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 9:ActivationLayer{ RELU }
layer 10:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 11:ActivationLayer{ RELU }
layer 12:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 13:ActivationLayer{ RELU }
layer 14:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 15:ActivationLayer{ RELU }
layer 16:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 17:ActivationLayer{ RELU }
layer 18:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 19:ActivationLayer{ RELU }
layer 20:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 21:ActivationLayer{ RELU }
layer 22:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 23:ActivationLayer{ RELU }
layer 24:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputImageSize=19 numFilters=32 filterSize=5 outputImageSize=19 padZeros=1 biased=1 skip=0} }
layer 25:ActivationLayer{ RELU }
layer 26:FullyConnectedLayer{ numPlanes=500 imageSize=1 }
layer 27:ActivationLayer{ TANH }
layer 28:FullyConnectedLayer{ numPlanes=361 imageSize=1 }
layer 29:SoftMaxLayer{ perPlane=0 numPlanes=361 imageSize=1 }
Parameters overview: (skipping 16 layers with 0 params)
layer 2: params=5632 0.1%
layer 4: params=25632 0.4%
layer 6: params=25632 0.4%
layer 8: params=25632 0.4%
layer 10: params=25632 0.4%
layer 12: params=25632 0.4%
layer 14: params=25632 0.4%
layer 16: params=25632 0.4%
layer 18: params=25632 0.4%
layer 20: params=25632 0.4%
layer 22: params=25632 0.4%
layer 24: params=25632 0.4%
layer 26: params=5776500 92.5%
layer 28: params=180861 2.9%
TOTAL : params=6244945
before learning start 46587 ms
MultiplyInPlace: building kernel
sqrt: building kernel
squared: building kernel
PerElementMultInPlace: building kernel
kernelAddScalar: building kernel
kernelInv: building kernel
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=361 -D gPixelsPerThread=1
options -D gWorkgroupSize=32 -D gPixelsPerThread=1
options -D gWorkgroupSize=32 -D gPixelsPerThread=1
layer2 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer4 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer6 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer8 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer10 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer12 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer14 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer16 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer18 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer20 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer22 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer24 ForwardAuto: instance 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
layer2 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 6ms
instance 2: 3ms
instance 3: 2ms
instance 4: 2ms
instance 5: cannot be used
instance 6: 4ms
selected: instance 3
layer4 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 29ms
selected: instance 4
layer6 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 29ms
selected: instance 4
layer8 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 29ms
selected: instance 4
layer10 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 29ms
selected: instance 4
layer12 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 23ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 29ms
selected: instance 4
layer14 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 29ms
selected: instance 4
layer16 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 32ms
selected: instance 4
layer18 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 31ms
selected: instance 4
layer20 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 30ms
selected: instance 4
layer22 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 23ms
instance 2: 13ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 30ms
selected: instance 4
layer24 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 22ms
instance 2: 14ms
instance 3: 10ms
instance 4: 6ms
instance 5: cannot be used
instance 6: 31ms
selected: instance 4
layer26 ForwardAuto: instance 6 this instance cant be used: Out of resources, code -5
layer26 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 153ms
instance 2: 378ms
instance 3: 767ms
instance 4: 93ms
instance 5: 27ms
instance 6: cannot be used
selected: instance 5
layer28 ForwardAuto::forward choosing best instance:
instance 0: cannot be used
instance 1: 2ms
instance 2: 16ms
instance 3: 15ms
instance 4: 15ms
instance 5: 13ms
instance 6: 11ms
selected: instance 1

my GPU info:

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 C+G Not Supported |
+-----------------------------------------------------------------------------+
ubgpu@ubgpu:~/big_data$
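
The "Parameters overview" figures in the log above can be reproduced by hand. The sketch below assumes each biased convolutional layer has inputPlanes * numFilters * filterSize^2 weights plus one bias per filter, and each fully connected layer has one weight per input value per output plus one bias per output. These formulas are an assumption inferred from the logged numbers, not taken from DeepCL's source, and the helper names are made up for this example.

```python
def conv_params(input_planes, num_filters, filter_size, biased=True):
    """Weights plus optional biases for one convolutional layer (assumed formula)."""
    weights = input_planes * num_filters * filter_size * filter_size
    return weights + (num_filters if biased else 0)


def fc_params(input_planes, input_image_size, num_outputs, biased=True):
    """Weights plus optional biases for one fully connected layer (assumed formula)."""
    weights = input_planes * input_image_size * input_image_size * num_outputs
    return weights + (num_outputs if biased else 0)


# Layer sizes read from the log: 7 input planes, 19x19 images,
# 12 x (32 filters of size 5), then 500 and 361 fully connected outputs.
layers = [conv_params(7, 32, 5)] + [conv_params(32, 32, 5)] * 11
layers.append(fc_params(32, 19, 500))
layers.append(fc_params(500, 1, 361))
total = sum(layers)

for index, count in enumerate(layers):
    print(f"param layer {index}: {count} ({100.0 * count / total:.1f}%)")
print("total:", total)
```

Under these assumptions the first fully connected layer dominates the count, matching the 92.5% share reported in the log.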

Add momentum to SGD trainer

Currently, SGD training is implemented with a learning rate and an annealed learning rate.
It would be good to add momentum.
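
As a point of reference for this request, here is a minimal sketch of the classical momentum update, v = momentum * v - learningRate * gradient followed by w = w + v. This is the textbook rule, not DeepCL's implementation, and the function name `sgd_momentum_step` and the toy objective are made up for the example.

```python
def sgd_momentum_step(weights, grads, velocity, learning_rate, momentum):
    """One in-place update: v = momentum * v - lr * grad, then w = w + v."""
    for i, grad in enumerate(grads):
        velocity[i] = momentum * velocity[i] - learning_rate * grad
        weights[i] += velocity[i]


# Toy problem: minimise f(w) = w^2, whose gradient is 2w, starting from w = 1.0.
weights, velocity = [1.0], [0.0]
for _ in range(200):
    sgd_momentum_step(weights, [2.0 * weights[0]], velocity,
                      learning_rate=0.1, momentum=0.9)
print(weights[0])
```

With `momentum=0` the update reduces to plain SGD, so an existing learning-rate-only configuration would behave as before.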

OpenCL build error on Activation kernel

I'm getting an OpenCL kernel build error when compiling activate.cl. I'm using PyDeepCL's NetdefToNet.createNetFromNetdef with the architecture: rt2-8c5z-relu-mp2-16c5z-relu-mp3-150n-tanh-7n.

Using NVIDIA Corporation , OpenCL platform: NVIDIA CUDA
Using OpenCL device: GeForce 940M
initializing clblas
cl/activate.cl build log: 
<built-in>:13:9: error: macro names must be identifiers
#define <C8><U+000F><EB><U+0003> 1
        ^
<built-in>:23:9: error: macro names must be identifiers
#define <C8><U+000F><EB><U+0003> 1
        ^

kernel build error:

kernel source:
1: // Copyright Hugh Perkins 2015 hughperkins at gmail
2: //
3: // This Source Code Form is subject to the terms of the Mozilla Public License,
4: // v. 2.0. If a copy of the MPL was not distributed with this file, You can
5: // obtain one at http://mozilla.org/MPL/2.0/.
6: 
7: // expected defines:
8: // one of: [ TANH | RELU | LINEAR | SIGMOID | SCALEDTANH | ELU ]
9: 
10: #ifdef TANH
11:     #define ACTIVATION_FUNCTION(output) (tanh(output))
12: #elif defined SCALEDTANH
13:     #define ACTIVATION_FUNCTION(output) (1.7159f * tanh(0.66667f * output))
14: #elif SIGMOID
15:     #define ACTIVATION_FUNCTION(output) (1.0f / (1 + exp(-output)))
16: #elif defined RELU
17:     #define ACTIVATION_FUNCTION(output) (output> 0 ? output : 0)
18: #elif defined ELU
19:     #define ACTIVATION_FUNCTION(output) (output> 0 ? output : exp(output) - 1)
20: #elif defined LINEAR
21:     #define ACTIVATION_FUNCTION(output) (output)
22: #endif
23: 
24: #ifdef ACTIVATION_FUNCTION // protect against not defined
25: kernel void activate(const int N, global float *inout) {
26:     const int globalId = get_global_id(0);
27:     if (globalId >= N) {
28:         return;
29:     }
30:     inout[globalId] = ACTIVATION_FUNCTION(inout[globalId]);
31: }
32: #endif
33: 
34: #ifdef ACTIVATION_FUNCTION // protect against not defined
35: kernel void forwardNaive(const int N, global float *out, global const float *in) {
36:     const int globalId = get_global_id(0);
37:     if (globalId >= N) {
38:         return;
39:     }
40:     out[globalId] = ACTIVATION_FUNCTION(in[globalId]);
41: }
42: #endif
43: 
44:

Debugging this, it looks like it's caused by a broken options string ("" -DgOutputSize=32 -DgOutputSizeSquared=1024 -DgInputSize=32 -DgInputSizeSquared=1024 -DgNumPlanes=8 -D \230zt\002"), which is in turn caused by an ActivationFunction that has been optimized out (according to gdb).

Fix luarocks install, so it can download the source

Currently, luarocks with the rock file works, but just typing luarocks install luadeepcl fails, because it doesn't download the source file or the rock, just the rockspec, which points to a non-existent source file.

test_qlearning.py seems broken on py 3

(py 2 works ok)

No straightforward way to train on expected data rather than labels, in C++ API

Looking at the C++ API documentation at: https://github.com/hughperkins/DeepCL/blob/master/doc/NeuralNetAPI.md

I see an example using NetLearner to train on labels, but I don't see an equivalent way of training on expected data (floats)

Can you point to some example code on how to do this? Is it possible to extend NetLearner to provide support for expected data?

Create opencl kernels for large image sizes, using local memory

Currently, for large images, the only working kernel is propagate1, which is generic but doesn't use local memory. If we make a dedicated kernel that uses local memory, e.g. by blocking the input images, large images (e.g. 128x128) should run faster.
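
The blocking idea can be sketched on the CPU as follows. Each output tile copies the input block it needs, plus a halo of filterSize - 1 rows and columns, into a small buffer that plays the role of local memory, and then reads only from that buffer. This is a hypothetical illustration in Python, not an OpenCL kernel and not DeepCL code; `tiled_conv` and the `tile` parameter are names invented for the example.

```python
def direct_conv(image, kernel):
    """Reference 'valid' 2D convolution (no padding) of one plane."""
    n, k = len(image), len(kernel)
    out = n - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(out)]
            for r in range(out)]


def tiled_conv(image, kernel, tile):
    """Same result, computed one output tile at a time from a small local buffer."""
    n, k = len(image), len(kernel)
    out = n - k + 1
    result = [[0] * out for _ in range(out)]
    for r0 in range(0, out, tile):
        for c0 in range(0, out, tile):
            rows = min(tile, out - r0)  # edge tiles may be smaller
            cols = min(tile, out - c0)
            # Stand-in for local memory: the tile's input block plus a
            # halo of k - 1 rows and columns.
            local = [image[r0 + i][c0:c0 + cols + k - 1]
                     for i in range(rows + k - 1)]
            for r in range(rows):
                for c in range(cols):
                    result[r0 + r][c0 + c] = sum(
                        local[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
    return result


image = [[(r * 7 + c * 3) % 5 for c in range(6)] for r in range(6)]
kernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
print(tiled_conv(image, kernel, tile=3) == direct_conv(image, kernel))
```

On a GPU the tile size would be chosen so that the buffer, including the halo, fits in the local memory available to a workgroup.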

Some bug in ActivationLayer in current master, I think

Some bug in ActivationLayer in current master, I think. Looking into it...

CrossEntropyLoss.cpp

Hi,

line:67 gradInput[i] = (input[i] - expectedOutput[i]) / input[i] / (1.0f - input[i]);
is this correct? In https://github.com/nyanp/tiny-cnn/blob/master/tiny_cnn/lossfunctions/loss_function.h
gradInput[i] = (input[i] - expectedOutput[i]) / (input[i] * (1.0f - input[i]));
is used.

thanks,
filip
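
A quick numerical check suggests the two expressions quoted above are the same quantity: dividing by input and then by (1 - input) is algebraically identical to dividing once by input * (1 - input). Both also match the derivative of the binary cross-entropy loss -[t ln(y) + (1 - t) ln(1 - y)] with respect to y. The function names below are made up for this check and do not come from either library.

```python
import math


def grad_two_divisions(output, expected):
    """Form quoted from CrossEntropyLoss.cpp: divide twice in sequence."""
    return (output - expected) / output / (1.0 - output)


def grad_one_division(output, expected):
    """Form quoted from tiny-cnn: divide once by the product."""
    return (output - expected) / (output * (1.0 - output))


def cross_entropy(output, expected):
    """Binary cross-entropy loss for a single output value."""
    return -(expected * math.log(output)
             + (1.0 - expected) * math.log(1.0 - output))


def numeric_grad(output, expected, h=1e-6):
    """Central-difference estimate of d(loss)/d(output)."""
    return (cross_entropy(output + h, expected)
            - cross_entropy(output - h, expected)) / (2.0 * h)


for output in (0.1, 0.5, 0.9):
    for expected in (0.0, 1.0):
        a = grad_two_divisions(output, expected)
        b = grad_one_division(output, expected)
        c = numeric_grad(output, expected)
        print(output, expected, a, b, c)
```

The two forms can differ in the last floating-point bits because the rounding happens in a different order, but not in value beyond that.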

Get lua wrappers into luarocks repository

Lua wrappers exist now.
It would be good to get them published to the luarocks repository.

Run soumith's benchmarking code on a K520, so can use a K520 as a proxy

Run soumith's benchmarking code on a K520, so can use a K520 as a proxy for guesstimating results on a Titan/Titan-x in the future.
