fastai / courses

fast.ai Courses

License: Apache License 2.0

Jupyter Notebook 99.76% Python 0.20% Shell 0.04%

courses's Introduction

Welcome to fastai

Installing

You can use fastai without any installation by using Google Colab. In fact, every page of this documentation is also available as an interactive notebook - click “Open in colab” at the top of any page to open it (be sure to change the Colab runtime to “GPU” to have it run fast!) See the fast.ai documentation on Using Colab for more information.

You can install fastai on your own machines with conda (highly recommended), as long as you’re running Linux or Windows (NB: Mac is not supported). For Windows, please see the “Windows Support” section for important notes.

We recommend using miniconda (or miniforge). First install PyTorch using the conda line shown here, and then run:

conda install -c fastai fastai

To install with pip, use: pip install fastai.

If you plan to develop fastai yourself, or want to be on the cutting edge, you can use an editable install (if you do this, you should also use an editable install of fastcore to go with it.) First install PyTorch, and then:

git clone https://github.com/fastai/fastai
pip install -e "fastai[dev]"

Learning fastai

The best way to get started with fastai (and deep learning) is to read the book, and complete the free course.

To see what’s possible with fastai, take a look at the Quick Start, which shows how to use around 5 lines of code to build an image classifier, an image segmentation model, a text sentiment model, a recommendation system, and a tabular model. For each of the applications, the code is much the same.

Read through the Tutorials to learn how to train your own models on your own datasets. Use the navigation sidebar to look through the fastai documentation. Every class, function, and method is documented here.

To learn about the design and motivation of the library, read the peer reviewed paper.

About fastai

fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes:

A new type dispatch system for Python along with a semantic type hierarchy for tensors
A GPU-optimized computer vision library which can be extended in pure Python
An optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4–5 lines of code
A novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training
A new data block API
And much more…
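
The optimizer item in the list above can be illustrated with a toy, framework-free sketch. The names average_grad, momentum_step, and optimize are hypothetical and are not fastai's API; the split into state-updating "stats" and parameter-updating "steppers" is an assumption about what the "two basic pieces" look like.

```python
def average_grad(state, grad, mom=0.9):
    """Stat (hypothetical name): keep a running momentum sum of the gradient."""
    state["grad_avg"] = mom * state.get("grad_avg", 0.0) + grad
    return state


def momentum_step(param, state, lr=0.1):
    """Stepper (hypothetical name): move the parameter against the averaged gradient."""
    return param - lr * state["grad_avg"]


def optimize(param, grad_fn, stats, steppers, n_steps):
    """Generic loop: every stat updates the state, then every stepper updates the parameter."""
    state = {}
    for _ in range(n_steps):
        grad = grad_fn(param)
        for stat in stats:
            state = stat(state, grad)
        for stepper in steppers:
            param = stepper(param, state)
    return param


# Minimize f(x) = x**2, whose gradient is 2x, using SGD with momentum.
result = optimize(5.0, lambda x: 2 * x, [average_grad], [momentum_step], n_steps=300)
```

Under this split, the momentum variant is only the two short functions at the top; swapping in a different stat or stepper changes the algorithm without touching the loop.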

fastai is organized around two main design goals: to be approachable and rapidly productive, while also being deeply hackable and configurable. It is built on top of a hierarchy of lower-level APIs which provide composable building blocks. This way, a user wanting to rewrite part of the high-level API or add particular behavior to suit their needs does not have to learn how to use the lowest level.

Migrating from other libraries

It’s very easy to migrate from plain PyTorch, Ignite, or any other PyTorch-based library, or even to use fastai in conjunction with other libraries. Generally, you’ll be able to use all your existing data processing code, but will be able to reduce the amount of code you require for training, and more easily take advantage of modern best practices. Migration guides from some popular libraries are available to help you on your way.

Windows Support

Due to Python multiprocessing issues with Jupyter on Windows, the num_workers of DataLoader is reset to 0 automatically to avoid Jupyter hanging. This makes tasks such as computer vision in Jupyter on Windows many times slower than on Linux. This limitation doesn’t exist if you use fastai from a script.

See this example to fully leverage the fastai API on Windows.

We recommend using Windows Subsystem for Linux (WSL) instead – if you do that, you can use the regular Linux installation approach, and you won’t have any issues with num_workers.
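
The num_workers behaviour described in this section can be sketched as a small decision function. This is a minimal illustration, not fastai's actual check: the function name effective_num_workers and the in_notebook flag are hypothetical, and how a notebook session is detected is left out.

```python
import sys


def effective_num_workers(requested, platform=sys.platform, in_notebook=False):
    """Return the worker count to use (hypothetical helper).

    Forces 0 only for the combination described above: Jupyter on Windows.
    """
    if platform.startswith("win") and in_notebook:
        return 0
    return requested


print(effective_num_workers(4, platform="win32", in_notebook=True))   # 0
print(effective_num_workers(4, platform="win32", in_notebook=False))  # 4
print(effective_num_workers(4, platform="linux", in_notebook=True))   # 4
```

The second call corresponds to running from a script on Windows, where the requested value is kept.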

Tests

To run the tests in parallel, launch:

nbdev_test

For all the tests to pass, you’ll need to install the dependencies specified as part of dev_requirements in settings.ini:

pip install -e .[dev]

Tests are written using nbdev; for example, see the documentation for test_eq.

Contributing

After you clone this repository, make sure you have run nbdev_install_hooks in your terminal. This installs Jupyter and git hooks to automatically clean, trust, and fix merge conflicts in notebooks.

After making changes in the repo, you should run nbdev_prepare and make any additional changes necessary to pass all the tests.

Docker Containers

For those interested in official docker containers for this project, they can be found here.


courses's Issues

Potential missing dependencies in us-east-1 AMI: opencv and updated keras

Just in case it helps anyone, when setting up a p2 instance in us-east-1 (Virginia), I had to upgrade keras and install opencv:

pip install --upgrade keras
conda install -c conda-forge opencv

InvalidAMIId when running setup_p2.sh

I get the following output when running setup_p2.sh. It looks like the AMI cannot be found. Do I need to be granted access to this AMI?

{
    "AssociationId": "rtbassoc-d41bf0ac"
}
{
    "Return": true
}
./setup_p2.sh: line 14: /Users/nickrobinson/.ssh/aws-key.pem: Permission denied

A client error (InvalidAMIID.NotFound) occurred when calling the RunInstances operation: The image id '[ami-bc508adc]' does not exist
Waiting for instance start...

list index out of range
usage: aws [options] <command> <subcommand> [parameters]
aws: error: argument --instance-id: expected one argument

list index out of range
securityGroupId=sg-015d597c
subnetId=subnet-03db4758
instanceId=
instanceUrl=
Connect: ssh -i /Users/nickrobinson/.ssh/aws-key.pem ubuntu@

Syntax problem with Setup_p2.sh file...prior version works fine

I have been struggling and leaving messages in the forum, but could not get setup_p2.sh to work for me. I went into history and downloaded the prior version and it works fine. I believe when the last change was made a syntax error was created.

vgg = Vgg16() error

#import Vgg16 helper class

vgg = Vgg16()

ValueError Traceback (most recent call last)
in ()
1 #import Vgg16 helper class
----> 2 vgg = Vgg16()

/home/chenjun/Projects/courses/deeplearning1/nbs/vgg16.pyc in __init__(self)
31 def __init__(self):
32 self.FILE_PATH = 'models/'
---> 33 self.create()
34 self.get_classes()
35

/home/chenjun/Projects/courses/deeplearning1/nbs/vgg16.pyc in create(self)
69
70 self.ConvBlock(2, 64)
---> 71 self.ConvBlock(2, 128)
72 self.ConvBlock(3, 256)
73 self.ConvBlock(3, 512)

/home/chenjun/Projects/courses/deeplearning1/nbs/vgg16.pyc in ConvBlock(self, layers, filters)
55 model.add(ZeroPadding2D((1, 1)))
56 model.add(Convolution2D(filters, 3, 3, activation='relu'))
---> 57 model.add(MaxPooling2D((2, 2), strides=(2, 2)))
58
59

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/models.pyc in add(self, layer)
325 output_shapes=[self.outputs[0]._keras_shape])
326 else:
--> 327 output_tensor = layer(self.outputs[0])
328 if isinstance(output_tensor, list):
329 raise TypeError('All layers in a Sequential model '

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/engine/topology.pyc in __call__(self, x, mask)
567 if inbound_layers:
568 # This will call layer.build() if necessary.
--> 569 self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
570 # Outputs were already computed when calling self.add_inbound_node.
571 outputs = self.inbound_nodes[-1].output_tensors

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/engine/topology.pyc in add_inbound_node(self, inbound_layers, node_indices, tensor_indices)
630 # creating the node automatically updates self.inbound_nodes
631 # as well as outbound_nodes on inbound layers.
--> 632 Node.create_node(self, inbound_layers, node_indices, tensor_indices)
633
634 def get_output_shape_for(self, input_shape):

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/engine/topology.pyc in create_node(cls, outbound_layer, inbound_layers, node_indices, tensor_indices)
162
163 if len(input_tensors) == 1:
--> 164 output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
165 output_masks = to_list(outbound_layer.compute_mask(input_tensors[0], input_masks[0]))
166 # TODO: try to auto-infer shape

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/layers/pooling.pyc in call(self, x, mask)
158 strides=self.strides,
159 border_mode=self.border_mode,
--> 160 dim_ordering=self.dim_ordering)
161 return output
162

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/layers/pooling.pyc in _pooling_function(self, inputs, pool_size, strides, border_mode, dim_ordering)
208 output = K.pool2d(inputs, pool_size, strides,
209 border_mode, dim_ordering,
--> 210 pool_mode='max')
211 return output
212

/home/chenjun/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.pyc in pool2d(x, pool_size, strides, border_mode, dim_ordering, pool_mode)
2336
2337 if pool_mode == 'max':
-> 2338 x = tf.nn.max_pool(x, pool_size, strides, padding=padding)
2339 elif pool_mode == 'avg':
2340 x = tf.nn.avg_pool(x, pool_size, strides, padding=padding)

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.pyc in max_pool(value, ksize, strides, padding, data_format, name)
687 padding=padding,
688 data_format=data_format,
--> 689 name=name)
690
691

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.pyc in _max_pool(input, ksize, strides, padding, data_format, name)
1121 result = _op_def_lib.apply_op("MaxPool", input=input, ksize=ksize,
1122 strides=strides, padding=padding,
-> 1123 data_format=data_format, name=name)
1124 return result
1125

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.pyc in apply_op(self, op_type_name, name, **keywords)
701 op = g.create_op(op_type_name, inputs, output_types, name=scope,
702 input_types=input_types, attrs=attr_protos,
--> 703 op_def=op_def)
704 outputs = op.outputs
705 return _Restructure(ops.convert_n_to_tensor(outputs),

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.pyc in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
2334 original_op=self._default_original_op, op_def=op_def)
2335 if compute_shapes:
-> 2336 set_shapes_for_outputs(ret)
2337 self._add_op(ret)
2338 self._record_op_seen_by_control_dependencies(ret)

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.pyc in set_shapes_for_outputs(op)
1723 raise RuntimeError("No shape function registered for standard op: %s"
1724 % op.type)
-> 1725 shapes = shape_func(op)
1726 if shapes is None:
1727 raise RuntimeError(

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.pyc in max_pool_shape(op)
508 out_rows, out_cols = get2d_conv_output_size(in_rows, in_cols, ksize_r,
509 ksize_c, stride_r, stride_c,
--> 510 padding)
511 output_shape = [batch_size, out_rows, out_cols, depth]
512 else:

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.pyc in get2d_conv_output_size(input_height, input_width, filter_height, filter_width, row_stride, col_stride, padding_type)
187 return get_conv_output_size((input_height, input_width),
188 (filter_height, filter_width),
--> 189 (row_stride, col_stride), padding_type)
190
191

/home/chenjun/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.pyc in get_conv_output_size(input_size, filter_size, strides, padding_type)
152 zip(filter_size, input_size)):
153 raise ValueError("Filter must not be larger than the input: "
--> 154 "Filter: %r Input: %r" % (filter_size, input_size))
155
156 if padding_type == b"VALID":

ValueError: Filter must not be larger than the input: Filter: (2, 2) Input: (1, 112)
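
One hedged reading of this error: the traceback shows that each ConvBlock applies ZeroPadding2D((1, 1)), a 3x3 Convolution2D, and then a 2x2 MaxPooling2D with stride 2. If the model input is (3, 224, 224) (an assumption, since the input shape does not appear in the traceback) and the backend reads it as height 3 by width 224, plain shape arithmetic reproduces the reported (1, 112) input. The helper names below are hypothetical and only track spatial sizes; they are not Keras code.

```python
def pad(shape, p=1):
    """Zero padding of p pixels on every side."""
    h, w = shape
    return (h + 2 * p, w + 2 * p)


def conv_valid(shape, k=3):
    """k x k convolution with no implicit padding."""
    h, w = shape
    return (h - k + 1, w - k + 1)


def max_pool(shape, k=2, stride=2):
    """k x k max pooling; fails when the window is larger than the input."""
    h, w = shape
    if k > h or k > w:
        raise ValueError("Filter must not be larger than the input: "
                         "Filter: %r Input: %r" % ((k, k), shape))
    return ((h - k) // stride + 1, (w - k) // stride + 1)


def conv_block(shape, layers):
    """Spatial size after `layers` padded 3x3 convolutions and one pooling step."""
    for _ in range(layers):
        shape = conv_valid(pad(shape))
    return max_pool(shape)


# Assumption: the (3, 224, 224) input is read as height=3, width=224.
after_block1 = conv_block((3, 224), layers=2)
print(after_block1)  # (1, 112)

try:
    conv_block(after_block1, layers=2)
except ValueError as err:
    print(err)  # Filter must not be larger than the input: Filter: (2, 2) Input: (1, 112)
```

Under the same assumption, a 224 by 224 spatial input passes through both blocks and ends at (56, 56), which suggests checking how the backend orders image dimensions.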

Instantiating Vgg16 fails

I'm running notebook inside a Python 3.5 virtualenv.

vgg = Vgg16()

Output


---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, debug_python_shape_fn, require_shape_fn)
    670           graph_def_version, node_def_str, input_shapes, input_tensors,
--> 671           input_tensors_as_shapes, status)
    672   except errors.InvalidArgumentError as err:

/usr/lib/python3.5/contextlib.py in __exit__(self, type, value, traceback)
     65             try:
---> 66                 next(self.gen)
     67             except StopIteration:

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
    465           compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466           pywrap_tensorflow.TF_GetCode(status))
    467   finally:

InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool_1' (op: 'MaxPool') with input shapes: [?,1,112,128].

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-5-c073eb7a6e14> in <module>()
----> 1 vgg = Vgg16()

/media/user/code/courses/deeplearning1/nbs/vgg16.py in __init__(self)
     41     def __init__(self):
     42         self.FILE_PATH = 'http://files.fast.ai/models/'
---> 43         self.create()
     44         self.get_classes()
     45 

/media/user/code/courses/deeplearning1/nbs/vgg16.py in create(self)
    122 
    123         self.ConvBlock(2, 64)
--> 124         self.ConvBlock(2, 128)
    125         self.ConvBlock(3, 256)
    126         self.ConvBlock(3, 512)

/media/user/code/courses/deeplearning1/nbs/vgg16.py in ConvBlock(self, layers, filters)
     95             model.add(ZeroPadding2D((1, 1)))
     96             model.add(Convolution2D(filters, 3, 3, activation='relu'))
---> 97         model.add(MaxPooling2D((2, 2), strides=(2, 2)))
     98 
     99 

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/models.py in add(self, layer)
    330                  output_shapes=[self.outputs[0]._keras_shape])
    331         else:
--> 332             output_tensor = layer(self.outputs[0])
    333             if isinstance(output_tensor, list):
    334                 raise TypeError('All layers in a Sequential model '

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/engine/topology.py in __call__(self, x, mask)
    570         if inbound_layers:
    571             # This will call layer.build() if necessary.
--> 572             self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
    573             # Outputs were already computed when calling self.add_inbound_node.
    574             outputs = self.inbound_nodes[-1].output_tensors

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/engine/topology.py in add_inbound_node(self, inbound_layers, node_indices, tensor_indices)
    633         # creating the node automatically updates self.inbound_nodes
    634         # as well as outbound_nodes on inbound layers.
--> 635         Node.create_node(self, inbound_layers, node_indices, tensor_indices)
    636 
    637     def get_output_shape_for(self, input_shape):

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/engine/topology.py in create_node(cls, outbound_layer, inbound_layers, node_indices, tensor_indices)
    164 
    165         if len(input_tensors) == 1:
--> 166             output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
    167             output_masks = to_list(outbound_layer.compute_mask(input_tensors[0], input_masks[0]))
    168             # TODO: try to auto-infer shape

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/layers/pooling.py in call(self, x, mask)
    158                                         strides=self.strides,
    159                                         border_mode=self.border_mode,
--> 160                                         dim_ordering=self.dim_ordering)
    161         return output
    162 

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/layers/pooling.py in _pooling_function(self, inputs, pool_size, strides, border_mode, dim_ordering)
    208         output = K.pool2d(inputs, pool_size, strides,
    209                           border_mode, dim_ordering,
--> 210                           pool_mode='max')
    211         return output
    212 

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in pool2d(x, pool_size, strides, border_mode, dim_ordering, pool_mode)
   2864 
   2865     if pool_mode == 'max':
-> 2866         x = tf.nn.max_pool(x, pool_size, strides, padding=padding)
   2867     elif pool_mode == 'avg':
   2868         x = tf.nn.avg_pool(x, pool_size, strides, padding=padding)

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py in max_pool(value, ksize, strides, padding, data_format, name)
   1819                                 padding=padding,
   1820                                 data_format=data_format,
-> 1821                                 name=name)
   1822 
   1823 

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py in _max_pool(input, ksize, strides, padding, data_format, name)
   1636   result = _op_def_lib.apply_op("MaxPool", input=input, ksize=ksize,
   1637                                 strides=strides, padding=padding,
-> 1638                                 data_format=data_format, name=name)
   1639   return result
   1640 

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py in apply_op(self, op_type_name, name, **keywords)
    766         op = g.create_op(op_type_name, inputs, output_types, name=scope,
    767                          input_types=input_types, attrs=attr_protos,
--> 768                          op_def=op_def)
    769         if output_structure:
    770           outputs = op.outputs

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
   2336                     original_op=self._default_original_op, op_def=op_def)
   2337     if compute_shapes:
-> 2338       set_shapes_for_outputs(ret)
   2339     self._add_op(ret)
   2340     self._record_op_seen_by_control_dependencies(ret)

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in set_shapes_for_outputs(op)
   1717       shape_func = _call_cpp_shape_fn_and_require_op
   1718 
-> 1719   shapes = shape_func(op)
   1720   if shapes is None:
   1721     raise RuntimeError(

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in call_with_requiring(op)
   1667 
   1668   def call_with_requiring(op):
-> 1669     return call_cpp_shape_fn(op, require_shape_fn=True)
   1670 
   1671   _call_cpp_shape_fn_and_require_op = call_with_requiring

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in call_cpp_shape_fn(op, input_tensors_needed, input_tensors_as_shapes_needed, debug_python_shape_fn, require_shape_fn)
    608     res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
    609                                   input_tensors_as_shapes_needed,
--> 610                                   debug_python_shape_fn, require_shape_fn)
    611     if not isinstance(res, dict):
    612       # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).

/media/user/code/courses/deeplearning1/venv/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, debug_python_shape_fn, require_shape_fn)
    674       missing_shape_fn = True
    675     else:
--> 676       raise ValueError(err.message)
    677 
    678   if missing_shape_fn:

ValueError: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool_1' (op: 'MaxPool') with input shapes: [?,1,112,128].

Here is the pip freeze output

appdirs==1.4.3
bleach==2.0.0
cycler==0.10.0
decorator==4.0.11
entrypoints==0.2.2
html5lib==0.999999999
ipykernel==4.6.1
ipython==6.0.0
ipython-genutils==0.2.0
ipywidgets==6.0.0
jedi==0.10.2
Jinja2==2.9.6
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.0.1
jupyter-console==5.1.0
jupyter-core==4.3.0
Keras==1.2.2
MarkupSafe==1.0
matplotlib==2.0.2
mistune==0.7.4
nbconvert==5.2.1
nbformat==4.3.0
notebook==5.0.0
numpy==1.12.1
packaging==16.8
pandocfilters==1.4.1
pexpect==4.2.1
pickleshare==0.7.4
prompt-toolkit==1.0.14
protobuf==3.3.0
ptyprocess==0.5.1
Pygments==2.2.0
pyparsing==2.2.0
python-dateutil==2.6.0
pytz==2017.2
PyYAML==3.12
pyzmq==16.0.2
qtconsole==4.3.0
scipy==0.19.0
simplegeneric==0.8.1
six==1.10.0
tensorflow==1.1.0
terminado==0.6
testpath==0.3.1
Theano==0.9.0
tornado==4.5.1
traitlets==4.3.2
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.12.2
widgetsnbextension==2.0.0

problem with the import module in utils.py

Hi,
In utils.py there is
from keras.regularizers import l2, activity_l2, l1, activity_l1
but when I checked the keras package, there are no such functions as activity_l2 or activity_l1.

Apart from that, I also found the same problem with
from keras.utils.layer_utils import layer_from_config

I wonder what the problem might be on my end.
Thanks.
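
A quick way to narrow down a report like this is to ask the installed package which of the imported names it actually exports. The sketch below uses only the standard library; missing_names is a hypothetical helper, and the demonstration runs against the math module so that it works without keras installed.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that `module_name` does not export.

    Returns None when the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return [name for name in names if not hasattr(module, name)]


# Demonstration on a standard-library module.
print(missing_names("math", ["sqrt", "activity_l2"]))  # ['activity_l2']

# For the import line in this issue, the call would be:
# missing_names("keras.regularizers", ["l2", "activity_l2", "l1", "activity_l1"])
```

A non-empty result would point to a version mismatch between the installed package and the one utils.py was written for, rather than a problem with the local setup.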

./setup_p2.sh: line 119: syntax error: unexpected end of file error on mac

Hi, this is a great DL course, thanks very much!

I got "./line 119: syntax error: unexpected end of file" when trying to run the setup_p2 script on a Mac. I have tried dos2unix and the method described in #17, but neither fixed the error.

Have you seen a similar error? Thanks!

AMI provisioning script

Hello, thanks for the great course and resources! 🙏

Would it be possible to publish the AMI's provisioning script?

(And if a reply to this is possible, why is an AMI required? As far as I can tell the setup scripts only require Ubuntu 16.04 with Git installed, is this correct?)

Thanks again!

Setup Instructions

Hi,

I just went through the steps to set up the EC2 instance, and I'm glad it was so easy. Thanks for all the work you've done to make it so.

I did notice that the video points to www.platform.ai/files for the AWS setup files. That helpfully points to http://course.fast.ai/start.html, which says the files are now here on github. Once I get to this repo, I see the setup/ directory and can figure things out from there. It did take some investigating to see that I needed both setup/setup_p2.sh and setup/setup_instance.sh. Are there instructions with this somewhere? If not, I could submit a PR and add instructions to this repo. If you'd like me to do that, would you prefer the setup instructions in the main README.md or in a new setup/README.md?

setup_p2.sh error

After starting instance

1. Type "python" at the command prompt
2. import theano
3. Wait for a long time .....

nvidia-smi gives the following:

Failed to initialize NVML: Driver/library version mismatch


ubuntu@ip-10-0-0-4:~$ python
Python 2.7.12 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:42:40) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import theano
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available  (error: Unable to get the number of gpus available: no CUDA-capable device is detected)
>>> exit()
ubuntu@ip-10-0-0-4:~$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
ubuntu@ip-10-0-0-4:~$ exit

The cuDNN library may be causing the problem. Can you please fix this bug so we can use this script to run Keras on AWS?

issue with vgg initiating

I ran into an issue when initializing vgg:
vgg = Vgg16()

I'm not sure whether the file still exists on platform.ai.

the backtrace:
/home/sparkuser/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.pyc in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
90 if swmr and swmr_support:
91 flags |= h5f.ACC_SWMR_READ
---> 92 fid = h5f.open(name, flags, fapl=fapl)
93 elif mode == 'r+':
94 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1486190692013/work/h5py/_objects.c:2856)()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1486190692013/work/h5py/_objects.c:2814)()

h5py/h5f.pyx in h5py.h5f.open (/home/ilan/minonda/conda-bld/h5py_1486190692013/work/h5py/h5f.c:2102)()

IOError: Unable to open file (Truncated file: eof = 221519872, sblock->base_addr = 0, stored_eoa = 553482496)
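
The two numbers in this IOError allow a quick sanity check. Reading eof as the size of the file on disk and stored_eoa as the end address recorded inside the file, the download appears to have stopped partway. The sketch below does that arithmetic; truncation_report is a hypothetical helper, and on a real file the actual size could be obtained with os.path.getsize.

```python
def truncation_report(actual_size, expected_size):
    """Compare the bytes on disk with the size the file claims to need."""
    return {
        "truncated": actual_size < expected_size,
        "fraction_present": actual_size / expected_size,
    }


# Figures taken from the IOError above.
report = truncation_report(actual_size=221519872, expected_size=553482496)
print(report["truncated"])                   # True
print(round(report["fraction_present"], 2))  # 0.4
```

If this reading is right, only about 40% of the weights file was downloaded, so deleting the cached copy and downloading it again would be a reasonable first step.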

Add /usr/local/cuda/bin to PATH ?

I'm trying to get this going on Azure using the appropriate instructions.

Should the install-gpu script add /usr/local/cuda/bin to my PATH ?

See this forum message (nvcc compiler not found on $PATH) and response linking to CUDA FAQ.

TypeError in Session 2 notebook when calling get_data function

I get this traceback when trying to call the get_data helper function with val_batches as the argument. It works when I pass just the file path to the validation data though. Also, I have not been able to successfully run the training data cell because I run out of memory. Would it be possible to upload the bcolz files for the training and validation sets for those of us on weaker CPUs?

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
----> 1 val_data = get_data(val_batches)

/home/clu/Python/courses-master/deeplearning1/nbs/utils.py in get_data(path, target_size)
133
134 def get_data(path, target_size=(224,224)):
--> 135 batches = get_batches(path, shuffle=False, batch_size=1, class_mode=None, target_size=target_size)
136 return np.concatenate([batches.next() for i in range(batches.nb_sample)])
137

/home/clu/Python/courses-master/deeplearning1/nbs/utils.py in get_batches(dirname, gen, shuffle, batch_size, class_mode, target_size)
89 target_size=(224,224)):
90 return gen.flow_from_directory(dirname, target_size=target_size,
---> 91 class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)
92
93

/home/clu/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py in flow_from_directory(self, directory, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format)
288 dim_ordering=self.dim_ordering,
289 batch_size=batch_size, shuffle=shuffle, seed=seed,
--> 290 save_to_dir=save_to_dir, save_prefix=save_prefix, save_format=save_format)
291
292 def standardize(self, x):

/home/clu/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py in __init__(self, directory, image_data_generator, target_size, color_mode, dim_ordering, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format)
556 if not classes:
557 classes = []
--> 558 for subdir in sorted(os.listdir(directory)):
559 if os.path.isdir(os.path.join(directory, subdir)):
560 classes.append(subdir)

TypeError: argument should be string, bytes or integer, not DirectoryIterator

VGG batches path error - Lesson One

GitHub virgin here, but I found some errors that might be helpful to correct.
When instructing the path to the sample data sets in lesson one, "train" and "valid" are concatenated onto the path as shown below.

vgg = Vgg16()
batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)

Also, in the visual portion:

batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size)

Both instances seem to need '/train' and '/valid' to avoid the following error:

OSError: [Errno 2] No such file or directory: 'data/dogscats/sampletrain'
OSError: [Errno 2] No such file or directory: 'data/dogscats/samplevalid'

Thank y'all for the wonderful opportunity to be a part of this amazing journey! =)
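
The failing paths in this issue come from plain string concatenation when path has no trailing slash. A minimal sketch of a separator-safe alternative, using os.path.join from the standard library (data_dir is a hypothetical helper name, and the path value is taken from the error message above):

```python
import os


def data_dir(path, name):
    """Join a base path and a subdirectory, with or without a trailing slash on path."""
    return os.path.join(path, name)


path = "data/dogscats/sample"
print(path + "train")           # data/dogscats/sampletrain (the failing path)
print(data_dir(path, "train"))  # data/dogscats/sample/train on POSIX systems
```

With this approach the result is the same whether path is written as 'data/dogscats/sample' or 'data/dogscats/sample/'.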

Error: "AttributeError: 'module' object has no attribute 'tests'"

Data-sharing between t2 and p2 using EFS/EBS volume?

Maybe this is covered somewhere else, but I wonder if the AWS setup either should have a shared EFS or a shared EBS volume between the t2 and p2 instances?

can not open wordvectors.ipynb file.

Hi Jeremy,

I could not load wordvectors.ipynb from the deeplearning1 notebooks in my local notebook, and I can't open it on GitHub either. Is there anything I should do before opening it, like installing a notebook extension? Or is there an issue with the file? Help is appreciated.

change to files.fast.ai.

in https://github.com/fastai/courses/blob/master/deeplearning2/imagenet_process.ipynb

change

fpath = get_file('imagenet_class_index.json', 
             'http://www.platform.ai/models/imagenet_class_index.json', 
             cache_subdir='models')

to

fpath = get_file('imagenet_class_index.json', 
                 'http://files.fast.ai/models/imagenet_class_index.json', 
                 cache_subdir='models')

setup_p2_ireland.sh missing or wrong link

Hello

While reading the page http://course.fast.ai/lessons/aws.html, I found a link to the European setup file (setup_p2_ireland.sh)

that points to a 404:
https://github.com/fastai/courses/blob/master/setup/setup_p2_ireland.sh

I've seen that you now have more than 2 install scripts and that the correct region is found through the config.
However, my $region is empty ...
(region=aws configure get region)

There should be a variable that you can edit to manually choose the region. (I'll do a PR if you want.)

Also.. what's the difference between all the install scripts?

ps. thanks for providing these resources :)

p2 instance - nvidia-smi - mismatch versions

When running the p2 script I tried nvidia-smi and got an error.

Missing Terminator in "aws-alias.sh"

On line 7 of "aws-alias.sh" you're missing a terminating single quote.

Update scripts to work with Keras 2.0

Super slow AWS (used setup_p2_virginia.sh)

Hello,

Is it normal for this code to take about one minute to run each time the machine is booted?

import utils; reload(utils)
from utils import plots

I'm using this AMI:
ami-id: ami-31ecfb26
availability-zone: us-east-1c
block-device-mapping: ami
instance-type: p2.xlarge

When I look at 'nmon' I just see the CPU in waiting mode, so this may be disk-related, even though the volume is gp2.

lm.fit does not produce any output lesson2 deeplearning course 1

lm.fit(trn_features, trn_labels, nb_epoch=2, batch_size=batch_size,
       validation_data=(val_features, val_labels))

fast-ai-remove.sh doesn't delete key pairs

http://wiki.fast.ai/index.php/Starting_Over_with_AWS instructs us to manually delete key pairs when starting over ... but the fast-ai-remove.sh script doesn't do this.

Assuming that fast-ai-remove.sh did everything necessary to start over, when it didn't, cost me a couple of hours of setup ...

In lesson 1, OOM when using the GPU version of TensorFlow as the Keras backend, but I think I have enough memory

I think I have enough GPU memory:
the log shows "Free memory: 3.55GiB",
and the two failed allocations are 2.02GiB and 798.06MiB, but it still gives me an OOM error.
I'm using tensorflow-gpu==1.3.0 and Keras==2.0.8 on a single GeForce GTX 960 card.
This is the log:

name: GeForce GTX 960
major: 5 minor: 2 memoryClockRate (GHz) 1.2025
pciBusID 0000:01:00.0
Total memory: 3.94GiB
Free memory: 3.55GiB
2017-10-23 22:07:24.232475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2017-10-23 22:07:24.232478: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2017-10-23 22:07:24.232483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960, pci bus id: 0000:01:00.0)
Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
Epoch 1/1
2017-10-23 22:07:27.979547: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.02GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-10-23 22:07:38.053482: W tensorflow/core/common_runtime/bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 798.06MiB.  Current allocation summary follows.
2017-10-23 22:07:38.053544: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (256): 	Total Chunks: 1, Chunks in use: 0 256B allocated for chunks. 4B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
....

model.predict missing id remapping in lesson 4?

Regarding deeplearning 1, lesson 4: I am trying to understand whether my neural net model for collaborative filtering generates predictions as expected, given the RMSE that training reaches (around 0.8). I got strange results using this.

From deeplearning1/nbs/lesson4.ipynb:

We can use the model to generate predictions by passing a pair of ints - a user id and a movie id. For instance, this predicts that user #3 would really enjoy movie #6.
model.predict([np.array([3]), np.array([6])])

Should this not be

model.predict([np.array([userid2idx[3]]), np.array([movieid2idx[6]])])

since user ids and movie ids are remapped to contiguous numbers?
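The remapping in question can be sketched in plain Python. The raw ids below are hypothetical stand-ins for the ratings table, and to_model_inputs is an illustrative helper, not a function from the notebook; the sketch only shows why a raw id and its contiguous index are generally different numbers.

```python
# Hypothetical raw ids, standing in for the userId / movieId columns.
raw_user_ids = [1, 3, 7, 15]
raw_movie_ids = [2, 6, 31, 1029]

# Map each raw id to a contiguous index, as an embedding layer expects.
userid2idx = {raw: idx for idx, raw in enumerate(raw_user_ids)}
movieid2idx = {raw: idx for idx, raw in enumerate(raw_movie_ids)}


def to_model_inputs(user_id, movie_id):
    """Translate raw ids into the contiguous indices used for training."""
    return userid2idx[user_id], movieid2idx[movie_id]


# Raw user 3 and raw movie 6 do not sit at index 3 and index 6.
print(to_model_inputs(3, 6))
```

If the model was trained on remapped indices, passing raw ids to model.predict would look up the embeddings of different users and movies, which could explain odd predictions despite a reasonable RMSE.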

Changing the jupyter notebook related script

I see the following code in the setup/install-gpu-azure.sh script giving an error.

echo "c.NotebookApp.password = u'"$jupass"'" >> .jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False" >> .jupyter/jupyter_notebook_config.py

Can this be changed to the following:

echo "c.NotebookApp.password = u'"$jupass"'" >> .jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '*'" >> .jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.open_browser = False" >> .jupyter/jupyter_notebook_config.py

import compatibility issue with newest Keras 2.x

keras-team/keras#5870
Two errors: cannot import activity_l2, cannot import layer_from_config

activity_l1 is deleted and was equal to l1
activity_l2 is deleted and was equal to l2
as for layer_from_config:
#from keras.utils.layer_utils import layer_from_config
from keras.layers import deserialize as layer_from_config
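One way to keep the course scripts importable under both old and new Keras is to try each known location in turn. The helper below is a hypothetical sketch (import_first is not part of Keras or of the course code); the Keras names in the commented usage are the ones quoted in this issue, and the live demonstration uses only the standard library so it runs without Keras installed.

```python
import importlib


def import_first(candidates):
    """Return the first importable attribute from (module, name) pairs."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            continue  # not available at this location in this version
    raise ImportError('no candidate could be imported: %r' % (candidates,))


# Demonstration with the standard library: the first location is missing,
# so the helper falls back to the second one.
sqrt = import_first([('no_such_module_xyz', 'sqrt'), ('math', 'sqrt')])
print(sqrt(9.0))

# Usage with the names from this issue (requires Keras to be installed):
# layer_from_config = import_first([
#     ('keras.utils.layer_utils', 'layer_from_config'),  # older location
#     ('keras.layers', 'deserialize'),                    # Keras 2.x
# ])
# activity_l2 = import_first([
#     ('keras.regularizers', 'activity_l2'),  # removed in Keras 2.x
#     ('keras.regularizers', 'l2'),           # equivalent, per the note above
# ])
```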

setup_t2.sh error

setup_t2.sh is giving me an error like this. Why is that?

./setup_t2.sh: line 7: syntax error near unexpected token `newline'
./setup_t2.sh: line 7: `<!DOCTYPE html>'

AMI for `ap-northeast-2` (Seoul) and US-east (Ohio)

Would it be possible to set up an AMI for these new regions, which have had p2 instances since last month?

I am especially interested in the Asia Pacific region :D
Thanks!

Error: "AttributeError: 'module' object has no attribute 'tests'"

I have successfully installed the CUDA server by following the tutorial video step by step. In the end, when I typed "import theano", the error AttributeError: 'module' object has no attribute 'tests' appears. Could anyone help me with this problem? Thanks in advance!

Creating infrastructure with Terraform

hello,

Was wondering if you would be interested in a PR that could create your AWS resources with Terraform. The scripts you have are nice, but there are ways to do this in a more common way that would allow for anyone to come in and run some simple terraform commands to get started.

I would be interested in helping out with this, but was just curious what your thoughts were.

Thanks for this awesome course. I just started and had this initial thought about the infrastructure creation in AWS.

the problem of file vgg16.py

From practice, I found that vgg16.py appears to be a final version. That causes trouble for somebody who has only studied lecture 1. I think it would be better to have a different version of vgg16.py for the course, as it would be a better way to understand how the whole structure is built.

Thank you very much for reading this suggestion.

NotJSONError in wordvectors.ipynb

When trying to open the wordvectors notebook I receive the following:

Unreadable Notebook: /home/mark/projects/fastai/courses/deeplearning1/nbs/wordvectors.ipynb NotJSONError('Notebook does not appear to be JSON: \'{\\n "cells": [\\n {\\n "cell_type": "c...',)

unable to launch lesson1 jupyter notebook on python 3 and python 2 kernels

Unreadable Notebook: lesson1.ipynb NotJSONError("Notebook does not appear to be JSON: '\n\n\n\n\n\n\n<html lang...",)
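The second message shows the file beginning with '<html lang...', which suggests the saved file is a web page rather than the raw notebook JSON (for example, a file saved from the GitHub page view instead of the raw link; this is an assumption, not something the report confirms). A small hypothetical helper to tell the cases apart before opening a file in Jupyter:

```python
import json


def diagnose_download(text):
    """Classify file contents as 'html', 'json', or 'other'."""
    stripped = text.lstrip()
    if stripped.lower().startswith(('<!doctype html', '<html')):
        return 'html'  # a web page was saved instead of the raw file
    try:
        json.loads(stripped)
        return 'json'  # parses cleanly, as a notebook should
    except ValueError:
        return 'other'  # truncated, corrupted, or not a notebook at all


print(diagnose_download('\n\n\n<html lang="en">'))
print(diagnose_download('{"cells": [], "nbformat": 4}'))
print(diagnose_download('{\n "cells": [\n {'))
```

To check a real file, read it with open(...).read() and pass the contents in. An 'html' result points to re-downloading the raw file; 'other' points to a damaged or incomplete notebook.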

Theano: no GPU is available

Hi,

While following the guide to set up a P2 instance I'm running into an error when trying to import Theano in a Jupyter notebook in the very last step of the AWS tutorial.

This P2 instance is running in Ireland (eu-west-1).
I've tried it with two separate instances with the same result.

import theano

WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: CUDA driver version is insufficient for CUDA runtime version)

Here is the output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   25C    P0    71W / 149W |      0MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Have I missed a step somewhere?

setup_p2.sh fail to create instance

When I run bash setup_p2.sh as the video says, it just doesn't work, and these errors are shown:

An error occurred (InvalidKeyPair.NotFound) when calling the RunInstances operation: The key pair 'aws-key-fast-ai' does not exist

An error occurred (MissingParameter) when calling the CreateTags operation: The request must contain the parameter resourceIdSet

Could you help me? Thanks.

GCP is $0.20 less

https://cloud.google.com/gpu/

would love if GCP is used

install-gpu-azure : exception "The nvidia driver version ... does not give good results ..."

Hi, when running lesson 1 on an Azure GPU (Standard NC6 with Ubuntu 16.04) the following exception is thrown:
Exception: The nvidia driver version installed with this OS does not give good results for reduction.Installing the nvidia driver available on the same download page as the cuda package will fix the problem: http://developer.nvidia.com/cuda-downloads

I tried other nvidia drivers without luck, but installing cuda-8 instead of 9 solved it for me. In my local version of install-gpu-azure.sh I replaced:
sudo apt-get -y install cuda
with
sudo apt-get -y install cuda-8-0

It fixed the problem.

cannot import name activity_l2 from keras.regularizers

get_data() takes path as argument for lesson2

From the current version of utils.py it looks like get_data() takes a directory path as an arg, rather than a batches-like object.

The video and notebook use val_batches and batches, did the function change?

For the current version of get_data(), shouldn't it look more like this:
val_data = get_data(path + 'valid')
trn_data = get_data(path + 'train')

setup aliases that are assigned to variables have extra double quotes

This probably only occurs for people who have configured their AWS CLI with a different output format than 'text.' For example, when you set the output format to 'json,' you get a result similar to this:

[jxstanford@jxmbp2017 ~]$ aws-get-p2
"i-0dfd46605fc163c5b"

The additional double quotes cause issues with later calls that use one of the start/stop/etc. aliases, because of the " "s around the value.

I've submitted a PR that explicitly sets the output type to text, which should resolve the issue.
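The quoting itself is ordinary JSON behavior: a lone string serialized as JSON carries its double quotes, while a text rendering is the bare value. The sketch below illustrates that with Python's json module; it does not call the AWS CLI, and the instance id is the one shown above.

```python
import json

instance_id = 'i-0dfd46605fc163c5b'

# Serialized as JSON, a lone string keeps its surrounding double quotes.
as_json = json.dumps(instance_id)

# Rendered as plain text, only the bare value remains.
as_text = instance_id

print(as_json)
print(as_text)

# Parsing the JSON form recovers the bare id.
print(json.loads(as_json) == as_text)
```

A shell variable that captures the JSON form passes the quotes along as part of the id, which is why forcing text output in the aliases avoids the problem.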

An error occurred (MissingParameter) when calling the CreateTags operation: The request must contain the parameter resourceIdSet

I'm getting an error with the setup_p2.sh shell script. I don't know what this parameter refers to. Is this something new? In the text of the shell script I see many export calls that have a variable with 'Id' in the name but that specific variable is not found on searching.

Thx

Eric

Issue with missing `output_shape` in Lesson 1 "Model Creation"

The VGG_16 definition begins as

def VGG_16():
    model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape=(3,224,224)))

but this yields a warning

UserWarning: `output_shape` argument not specified for layer lambda_6 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 3, 224, 224)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.

specifying the output_shape fixes the issue

def VGG_16():
    model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape=(3,224,224), output_shape=(3,224,224)))

Also note that this is done correctly in vgg16.py https://github.com/fastai/courses/blob/master/deeplearning1/nbs/vgg16.py#L67

install-gpu.sh script uses DOS line-endings

I had some trouble running the install-gpu.sh script on an Ubuntu Linux AWS instance:
For instance, it wouldn't correctly download some of the links, and when it created the downloads folder, the name was downloads?. I found that copy-pasting the individual commands worked, which made me think that it was an issue with line endings, and indeed, the file uses DOS line endings.

I opened it in vim, executed :set ff=unix, saved it, reran it, and everything worked.
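The same conversion can be done without an editor. With DOS line endings each line ends in a carriage return plus a newline, and the shell treats the carriage return as part of the last word on the line, which is how a folder can end up named downloads followed by an unprintable character. A minimal sketch, with hypothetical script contents:

```python
def to_unix_line_endings(data):
    """Return bytes with CRLF (DOS) line endings replaced by LF (Unix)."""
    return data.replace(b'\r\n', b'\n')


# Hypothetical script contents with DOS line endings.
dos_script = b'#!/bin/bash\r\nmkdir downloads\r\n'
unix_script = to_unix_line_endings(dos_script)

# Carriage returns are present before the conversion and gone after it.
print(b'\r' in dos_script, b'\r' in unix_script)
```

To apply it to a real file, read it in binary mode ('rb'), convert, and write it back in binary mode ('wb') so Python does not translate the line endings again.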

Overwrite $instanceId in aws-alias.sh

If I understand it correctly, the instanceId variable is obtained by running setup_instance.sh, specifically line 64:

export instanceId=$(aws ec2 run-instances --image-id $ami --count 1 --instance-type $instanceType --key-name aws-key-$name --security-group-ids $securityGroupId --subnet-id $subnetId --associate-public-ip-address --block-device-mapping "[ { \"DeviceName\": \"/dev/sda1\", \"Ebs\": { \"VolumeSize\": 128, \"VolumeType\": \"gp2\" } } ]" --query 'Instances[0].InstanceId' --output text)

But at the end of aws-alias.sh, why do we hard-code and overwrite instanceId with i-9aa9c282?

Thanks!

Guys, vgg16.py has a duplicate import

In courses/deeplearning1/nbs/vgg16.py, line 9 and line 12 both import get_file:

from keras.utils.data_utils import get_file
from keras import backend as K
from keras.layers.normalization import BatchNormalization
from keras.utils.data_utils import get_file

Cheers

When will the course part 2 come out

Hi, all,

When will the course part 2 come out? According to this page, part 2 will be released online in May.
I'm eager to study it.

THX!