torchnet

torchnet is a framework for torch which provides a set of abstractions aiming to encourage code re-use and modular programming.

At the moment, torchnet provides four sets of important classes:

  • Dataset: handling and pre-processing data in various ways.
  • Engine: training/testing machine learning algorithms.
  • Meter: measuring performance or any other quantity.
  • Log: outputting performance or any other string to file/disk in a consistent manner.

For an overview of the torchnet framework, please also refer to this paper.

Installation

Please install torch first, following instructions on torch.ch. If torch is already installed, make sure you have an up-to-date version of argcheck, otherwise you will get weird errors at runtime.

Assuming torch is already installed, the torchnet core is only a set of lua files, so it is straightforward to install it with luarocks

luarocks install torchnet

To run the MNIST example from the paper, install the mnist package:

luarocks install mnist

cd into the installed torchnet package directory and run:

th example/mnist.lua

Documentation

Requiring torchnet returns a local variable containing all torchnet class constructors.

local tnt = require 'torchnet'

tnt.Dataset()

torchnet provides a variety of data containers, which can easily be plugged into each other, allowing the user to concatenate, split, batch, and resample datasets.

An instance dataset of a tnt.Dataset() implements two main methods:

  • dataset:size() which returns the size of the dataset.
  • dataset:get(idx) where idx is a number between 1 and the dataset size.

While it is easy to iterate over a dataset with a for loop, several DatasetIterator iterators are nevertheless provided, allowing the user to filter out samples on the fly, or to easily parallelize data fetching.

In torchnet, a sample returned by dataset:get() is supposed to be a Lua table. Fields of the table can be arbitrary, even though many datasets will only work with torch tensors.
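
For example, iterating manually over a dataset (a minimal sketch; dataset stands for any tnt.Dataset instance):

for i = 1, dataset:size() do
   local sample = dataset:get(i)
   -- sample is a Lua table, e.g. {input = <tensor>, target = <tensor>}
end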

tnt.utils

Torchnet provides a set of utility functions which are used all over torchnet.

tnt.utils.table.clone(table)

This function does a deep copy of a table.
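
A minimal sketch of how the deep copy behaves (the table t1 is made up for illustration):

local tnt = require 'torchnet'

local t1 = {a = 1, b = {c = 2}}
local t2 = tnt.utils.table.clone(t1)
t2.b.c = 3      -- modifies the copy only
print(t1.b.c)   -- still 2: the copy is deep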

tnt.utils.table.merge(dst, src)

({
   dst = table  --
   src = table  --
})

This function adds to the destination table dst the elements contained in the source table src.

The copy is shallow.

If a key exists in both tables, then the element in the source table is preferred.
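
For instance, under these semantics (a sketch with made-up tables):

local dst = {a = 1, b = 2}
tnt.utils.table.merge(dst, {b = 20, c = 3})
-- dst is now {a = 1, b = 20, c = 3}: for key b, the source value wins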

tnt.utils.table.foreach(tbl, closure[, recursive])

({
   tbl       = table     --
   closure   = function  --
  [recursive = boolean]  --  [default=false]
})

This function applies the function defined by closure to the table tbl.

If recursive is given and set to true, the closure will be applied recursively to the table.

tnt.utils.table.canmergetensor(tbl)

Check if a table can be merged into a tensor.

tnt.utils.table.mergetensor(tbl)

({
   tbl = table  --
})

Merge a table of tensors into a single tensor with one extra dimension.
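
As a sketch, assuming a table of same-sized tensors:

local tbl = {torch.Tensor{1, 2}, torch.Tensor{3, 4}}
if tnt.utils.table.canmergetensor(tbl) then
   local merged = tnt.utils.table.mergetensor(tbl)
   print(merged:size())  -- 2x2: the extra (first) dimension indexes the table entries
end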

tnt.transform

Torchnet provides a set of general data transformations. These transformations are either directly on the data (e.g., normalization) or on their structure. This is particularly handy when manipulating tnt.Dataset.

Most of the transformations are simple but can be composed or merged.

transform.identity(...)

The identity transform takes any input and returns it as is.

For example, this function is useful when composing transformations on data from multiple sources, and some of the sources must not be transformed.

transform.compose(transforms)

({
   transforms = table  --
})

This function takes a table of functions and composes them to return one transformation.

This function assumes that the table of transformations is indexed by contiguous ordered keys starting at 1. The transformations are composed in the ascending order.

For example, the following code:

> f = transform.compose{
        [1] = function(x) return 2*x end,
        [2] = function(x) return x + 10 end,
        foo = function(x) return x / 2 end,
        [4] = function(x) return x - x end
   }
   > f(3)
   16

is equivalent to composing the transformations stored in [1] and [2], i.e., to defining the following transformation:

> f =  function(x) return 2*x + 10 end

Note that transformations stored with keys foo and 4 are ignored.

transform.merge(transforms)

({
   transforms = table  --
})

This function takes a table of transformations and merges them into one transformation. Once applied to an input, this transformation will produce a table of outputs, containing the transformed input.

For example, the following code:

> f = transform.merge{
        [1] = function(x) return 2*x end,
        [2] = function(x) return x + 10 end,
        foo = function(x) return x / 2 end,
        [4] = function(x) return x - x end
   }

produces a function which applies a set of transformations to the same input:

> f(3)
   {
     1 : 6
     2 : 13
     foo : 1.5
     4 : 0
   }

transform.tablenew()

This function creates a new table of functions from an existing table of functions.

transform.tableapply(transform)

({
   transform = function  --
})

This function applies a transformation to a table of inputs. It returns a table of outputs of the same size as the input.

For example, the following code:

> f = transform.tableapply(function(x) return 2*x end)

produces a function which multiplies any input by 2:

> f({[1] = 1, [2] = 2, foo = 3, [4] = 4})
   {
     1 : 2
     2 : 4
     foo : 6
     4 : 8
   }

transform.tablemergekeys()

This function merges tables by key. More precisely, the input must be a table of tables, and this function will invert the nesting so that the keys of the nested tables become the top-level keys.

For example, if the input is:

> x = { sample1 = {input = 1, target = "a"}, sample2 = {input = 2, target = "b", flag = "hard"} }

Then applying this function will produce:

> transform.tablemergekeys(x)
{
   input :
         {
           sample1 : 1
           sample2 : 2
         }
   target :
          {
            sample1 : "a"
            sample2 : "b"
          }
   flag :
        {
           sample2: "hard"
        }
}

transform.makebatch([merge])

({
  [merge = function]  --
})

This function is used in many tnt.Dataset to format samples in the format used by the tnt.Engine.

This function first merges keys to produce a table of outputs. Then, it transforms this table into a tensor, either by using a merge transformation provided by the user or by simply concatenating the table into a tensor directly.

This function uses the compose transform to apply successive transformations.

transform.randperm(size)

({
   size = number  --
})

This function creates a vector containing a permutation of the indices from 1 to size. This vector is a LongTensor, and size must be a number.

Once the vector is created, the returned function can be used to query specific indices in it.

For example:

> p = transform.randperm(3)

creates a function p which contains a permutation of indices:

> p(1)
2
> p(2)
1
> p(3)
3

transform.normalize([threshold])

({
  [threshold = number]  --  [default=0]
})

This function normalizes data, i.e., it removes its mean and divides it by its standard deviation.

The input must be a Tensor.

Optionally, a threshold can be given (must be a number). In that case, the data will be divided by their standard deviation only if the deviation is greater than the threshold. This is handy when the deviation is small, as dividing by it could lead to numerical instability.

tnt.ListDataset(self, list, load[, path])

({
   self = tnt.ListDataset  --
   list = tds.Hash         --
   load = function         --
  [path = string]          --
})

Considering a list (which can be a tds.Hash, a table, or a torch.LongTensor), the i-th sample of the dataset will be returned by load(list[i]), where load() is a closure provided by the user.

If path is provided, list is assumed to be a list of strings, and each element list[i] will be prefixed by path/ when fed to load().

Purpose: many low or medium-scale datasets can be seen as a list of files (for example, each file representing an input sample). For such a list of files, a target can often be inferred in a simple manner.
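
A minimal sketch (the list and the load closure below are made up for illustration):

local tnt = require 'torchnet'

local dataset = tnt.ListDataset{
   list = torch.range(1, 100):long(),
   load = function(idx)
      -- fabricate a sample from the list element
      return {input = torch.randn(3), target = torch.LongTensor{idx}}
   end,
}
print(dataset:size())  -- 100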

tnt.ListDataset(self, filename, load[, maxload][, path])

({
   self     = tnt.ListDataset  --
   filename = string           --
   load     = function         --
  [maxload  = number]          --
  [path     = string]          --
})

The file specified by filename is interpreted as a list of strings (one string per line). The i-th sample of the dataset will be returned by load(line[i]), where load() is a closure provided by the user and line[i] is the i-th line of filename.

If path is provided, each line line[i] will be prefixed by path/ when fed to load().

tnt.TableDataset(self, data)

{
   self = tnt.TableDataset  --
   data = table             --
}

tnt.TableDataset interfaces existing data to torchnet. It is useful if you want to use torchnet on a small dataset.

The data must be contained in a tds.Hash.

tnt.TableDataset does a shallow copy of the data.

Data are loaded while constructing the tnt.TableDataset:

> a = tnt.TableDataset{data = {1,2,3}}
> print(a:size())
3

tnt.TableDataset assumes that table has contiguous keys starting at 1.

tnt.IndexedDataset(self, fields[, path][, maxload][, mmap][, mmapidx][, standalone])

{
   self       = tnt.IndexedDataset  --
   fields     = table               --
  [path       = string]             --
  [maxload    = number]             --
  [mmap       = boolean]            --  [default=false]
  [mmapidx    = boolean]            --  [default=false]
  [standalone = boolean]            --  [default=false]
}

A tnt.IndexedDataset() is a data structure built upon (possibly several) data archives containing a bunch of tensors of the same type.

See tnt.IndexedDatasetWriter and tnt.IndexedDatasetReader to see how to create and read a single archive.

Purpose: large datasets (containing a lot of files) are often not very well handled by filesystems (especially over network). tnt.IndexedDataset provides a convenient and efficient way to bundle them into a single archive file, associated with an indexed file.

If path is provided, then fields must be a Lua array (keys being numbers), where values are strings representing a filename prefix to an (index,archive) pair. In other words, path/field.{idx,bin} must exist. The i-th sample returned by this dataset will be a table containing each field as key, and a tensor found in the corresponding archive at index i.

If path is not provided, then fields must be a Lua hash. Each key represents a sample field, and the corresponding value must be a table containing the keys idx (for the index filename path) and bin (for the archive filename path).

If provided (and positive), maxload limits the dataset size to the specified size.

Archives and/or indexes can also be memory mapped with the mmap and mmapidx flags.

If standalone is true, the constructor expects only one field to be provided. The i-th sample returned by the dataset will be the item found at the archive at index i. This is particularly useful with table archives.

tnt.IndexedDatasetWriter(self, indexfilename, datafilename, type)

({
   self          = tnt.IndexedDatasetWriter  --
   indexfilename = string                    --
   datafilename  = string                    --
   type          = string                    --
})

Creates a (archive,index) file pair. The archive will contain tensors of the same specified type.

type must be a string chosen from {byte, char, short, int, long, float, double, table}.

indexfilename is the full path to the index file to be created. datafilename is the full path to the data archive file to be created.

Tensors are added to the archive with add().

Note that you must call close() to ensure all data is written on disk and to create the index file.

The type table is special: data will be stored into a CharTensor, serialized from a Lua table object. IndexedDatasetReader will then deserialize the CharTensor into a table at read time. This allows storing heterogeneous data easily into an IndexedDataset.
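
A sketch of the write path, assuming the argument names shown above (the file paths are made up):

local w = tnt.IndexedDatasetWriter{
   indexfilename = 'data.idx',
   datafilename  = 'data.bin',
   type          = 'float',
}
w:add(torch.FloatTensor{1, 2, 3})
w:add(torch.FloatTensor{4, 5, 6})
w:close()  -- mandatory: flushes the data and writes the index file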

tnt.IndexedDatasetWriter(self, indexfilename, datafilename)
({
   self          = tnt.IndexedDatasetWriter  --
   indexfilename = string                    --
   datafilename  = string                    --
})

Opens an existing (archive,index) file pair for appending. The tensor type is inferred from the provided index file.

indexfilename is the full path to the index file to be opened. datafilename is the full path to the data archive file to be opened.

tnt.IndexedDatasetWriter.add(self, tensor)

({
   self   = tnt.IndexedDatasetWriter  --
   tensor = torch.*Tensor             --
})

Adds a tensor to the archive and records its index position. The tensor must be of the same type as the one specified at the creation of the tnt.IndexedDatasetWriter.

tnt.IndexedDatasetWriter.add(self, filename)

({
   self     = tnt.IndexedDatasetWriter  --
   filename = string                    --
})

Convenience method which, given a filename, opens the corresponding file in binary mode and reads all the data in it as if it were of the type specified at the tnt.IndexedDatasetWriter construction. A corresponding tensor is then added to the archive/index pair.

tnt.IndexedDatasetWriter.add(self, table)
(
   self  = tnt.IndexedDatasetWriter  --
   table = table                     --
)

Convenience method only available for table type IndexedDataset. The table will be serialized into a CharTensor.

tnt.IndexedDatasetWriter.close(self)
({
   self = tnt.IndexedDatasetWriter  --
})

Finalizes the index and closes the archive/index file pair. This method must be called to ensure the index is written and all the archive data is flushed to disk.

tnt.IndexedDatasetReader(self, indexfilename, datafilename[, mmap][, mmapidx])
({
   self          = tnt.IndexedDatasetReader  --
   indexfilename = string                    --
   datafilename  = string                    --
  [mmap          = boolean]                  --  [default=false]
  [mmapidx       = boolean]                  --  [default=false]
})

Reads an archive/index pair previously created by tnt.IndexedDatasetWriter.

indexfilename is the full path to the index file. datafilename is the full path to the archive file.

Memory mapping can be specified for both the archive and index through the optional mmap and mmapidx flags.

tnt.IndexedDatasetReader.size(self)

Returns the number of tensors present in the archive.

tnt.IndexedDatasetReader.get(self, index)

Returns the tensor at the specified index in the archive.
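
Continuing the writer sketch above, reading the pair back (assumes the files created there):

local r = tnt.IndexedDatasetReader{
   indexfilename = 'data.idx',
   datafilename  = 'data.bin',
}
print(r:size())  -- 2
print(r:get(1))  -- the first tensor added by the writer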

tnt.TransformDataset(self, dataset, transform[, key])

({
   self      = tnt.TransformDataset  --
   dataset   = tnt.Dataset           --
   transform = function              --
  [key       = string]               --
})

Given a closure transform(), and a dataset, tnt.TransformDataset applies the closure in an on-the-fly manner when querying a sample with tnt.Dataset:get().

If key is provided, the closure is applied to the sample field specified by key (only). The closure must return the new corresponding field value.

If key is not provided, the closure is applied on the full sample. The closure must return the new sample table.

The size of the new dataset is equal to the size of the underlying dataset.

Purpose: when performing pre-processing operations, it is convenient to be able to perform on-the-fly transformations to a dataset.
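
For example, a sketch that casts the input field of every sample on the fly (dataset stands for any tnt.Dataset):

local transformed = tnt.TransformDataset{
   dataset   = dataset,
   transform = function(sample)
      sample.input = sample.input:double()  -- modify the sample ...
      return sample                         -- ... and return the new sample table
   end,
}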

tnt.TransformDataset(self, dataset, transforms)

({
   self       = tnt.TransformDataset  --
   dataset    = tnt.Dataset           --
   transforms = table                 --
})

Given a set of closures and a dataset, tnt.TransformDataset applies these closures in an on-the-fly manner when querying a sample with tnt.Dataset:get().

Closures are provided in transforms, a Lua table, where a (key,value) pair represents a (sample field name, corresponding closure to be applied to the field name).

Each closure must return the new value of the corresponding field.

tnt.BatchDataset(self, dataset, batchsize[, perm][, merge][, policy][, filter])

({
   self      = tnt.BatchDataset  --
   dataset   = tnt.Dataset       --
   batchsize = number            --
  [perm      = function]         --  [has default value]
  [merge     = function]         --
  [policy    = string]           --  [default=include-last]
  [filter    = function]         --  [has default value]
})

Given a dataset, tnt.BatchDataset merges samples from this dataset to form a new sample which can be interpreted as a batch (of size batchsize).

The merge function controls how the batch is assembled. It is a closure taking a Lua array as input containing all occurrences (for a given batch) of a field of the sample, and returning the aggregated version of these occurrences. By default the occurrences are supposed to be tensors, and they are aggregated along the first dimension.

More formally, if the i-th sample of the underlying dataset is written as:

{input=<input_i>, target=<target_i>}

assuming only two fields input and target in the sample, then merge() will be passed tables of the form:

{<input_i_1>, <input_i_2>, ... <input_i_n>}

or

{<target_i_1>, <target_i_2>, ... <target_i_n>}

with n being the batch size.

It is often important to shuffle examples while performing the batch operation. perm(idx, size) is a closure which returns the shuffled index of the sample at position idx in the underlying dataset. For convenience, the size of the underlying dataset is also passed to the closure. By default, the closure is the identity.

The underlying dataset size might or might not be divisible by batchsize. The optional policy string specifies how to handle corner cases:

  • include-last makes sure all samples of the underlying dataset will be seen; batches will be of size equal to or smaller than batchsize.
  • skip-last will skip the last examples of the underlying dataset if its size is not properly divisible; batches will always be of size equal to batchsize.
  • divisible-only will raise an error if the size of the underlying dataset is not divisible by batchsize.

Purpose: the concept of batch is problem dependent. In torchnet, it is up to the user to interpret a sample as a batch or not. When one wants to assemble samples from an existing dataset into a batch, then tnt.BatchDataset is suited for the job. Sometimes it is however more convenient to write a dataset from scratch providing "batched" samples.
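
For example, a sketch assembling batches of 32 samples (dataset stands for any tnt.Dataset):

local batched = tnt.BatchDataset{
   dataset   = dataset,
   batchsize = 32,
   policy    = 'skip-last',  -- every batch is exactly 32 samples
}
-- with the default merge, batched:get(1).input stacks 32 input tensors
-- along a new first dimension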

tnt.CoroutineBatchDataset(self, dataset, batchsize[, perm][, merge][, policy][, filter])

({
   self      = tnt.CoroutineBatchDataset  --
   dataset   = tnt.Dataset                --
   batchsize = number                     --
  [perm      = function]                  --  [has default value]
  [merge     = function]                  --
  [policy    = string]                    --  [default=include-last]
  [filter    = function]                  --  [has default value]
})

Given a dataset, tnt.CoroutineBatchDataset merges samples from this dataset to form a new sample which can be interpreted as a batch (of size batchsize).

It behaves the same and has the same arguments as tnt.BatchDataset (see the documentation there for additional details), with one important distinction: it allows the underlying dataset to postpone returning the individual samples once by doing a call to coroutine.yield() (from the underlying dataset).

This is useful when using datasets that are inefficient or slow when they need to provide the required sample immediately after a call to dataset:get(). The general pattern of code in the underlying dataset:get() would be:

FooDataset.get = function(self, idx)
   prepare(idx)  -- stores sample in self.__data[idx]
   coroutine.yield()
   return self.__data[idx]
end

Herein, the function prepare(idx) can implement, for instance, a buffering of indices before actually fetching them.

tnt.ConcatDataset(self, datasets)

{
   self     = tnt.ConcatDataset  --
   datasets = table              --
}

Given a Lua array (datasets) of tnt.Dataset, concatenates them into a single dataset. The size of the new dataset is the sum of the underlying dataset sizes.

Purpose: useful to assemble different existing datasets, possibly large-scale datasets as the concatenation operation is done in an on-the-fly manner.

tnt.ResampleDataset(self, dataset[, sampler][, size])

Given a dataset, creates a new dataset which will (re-)sample from this underlying dataset using the provided sampler(dataset, idx) closure.

If size is provided, then the newly created dataset will have the specified size, which might be different from the underlying dataset size.

If size is not provided, then the new dataset will have the same size as the underlying one.

By default, sampler(dataset, idx) is the identity, simply returning idx. dataset corresponds to the underlying dataset provided at construction, and idx may take a value between 1 and size. It must return an index in the valid range for the underlying dataset.

Purpose: shuffling data, re-weighting samples, getting a subset of the data. Note that an important sub-class is (tnt.ShuffleDataset), provided for convenience.
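
For example, a sketch that presents the underlying dataset in reverse order:

local reversed = tnt.ResampleDataset{
   dataset = dataset,
   sampler = function(dataset, idx)
      -- map idx to a valid index of the underlying dataset
      return dataset:size() - idx + 1
   end,
}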

tnt.ShuffleDataset(self, dataset[, size][, replacement])

({
   self        = tnt.ShuffleDataset  --
   dataset     = tnt.Dataset         --
  [size        = number]             --
  [replacement = boolean]            --  [default=false]
})

tnt.ShuffleDataset is a sub-class of tnt.ResampleDataset provided for convenience.

It samples uniformly from the given dataset, with or without replacement. The chosen partition can be redrawn by calling resample().

If replacement is true, then the specified size may be larger than the size of the underlying dataset.

If size is not provided, then the new dataset size will be equal to the underlying dataset size.

Purpose: the easiest way to shuffle a dataset!

tnt.ShuffleDataset.resample(self)

The permutation associated to tnt.ShuffleDataset is fixed, such that two calls to the same index will return the same sample from the underlying dataset.

Call resample() to randomly draw a new permutation.
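
A short sketch (dataset stands for any tnt.Dataset):

local shuffled = tnt.ShuffleDataset{dataset = dataset}
-- iterate over the fixed permutation ...
shuffled:resample()  -- ... then draw a fresh permutation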

tnt.SplitDataset(self, dataset, partitions[, initialpartition])

({
   self             = tnt.SplitDataset  --
   dataset          = tnt.Dataset       --
   partitions       = table             --
  [initialpartition = string]           --
})

Partition a given dataset, according to the specified partitions. Use the method select() to select the current partition in use.

The Lua hash table partitions is of the form (key, value) where key is a user-chosen string naming the partition, and value is a number representing the weight (as a number between 0 and 1) or the size (in number of samples) of the corresponding partition.

Partitioning is achieved linearly (no shuffling). See tnt.ShuffleDataset if you want to shuffle the dataset before partitioning.

The optional variable initialpartition specifies the partition that is loaded initially.

Purpose: useful in machine learning to perform validation procedures.

tnt.SplitDataset.select(self, partition)
({
   self      = tnt.SplitDataset  --
   partition = string            --
})

Switch the current partition in use to the one specified by partition, which must be a string corresponding to one of the names provided at construction.

The current dataset size changes accordingly, as well as the samples returned by the get() method.
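
For example, a sketch of a standard train/validation split (the weights below are fractions of the dataset):

local split = tnt.SplitDataset{
   dataset    = dataset,
   partitions = {train = 0.9, valid = 0.1},
}
split:select('train')   -- size is now ~90% of the underlying dataset
-- ... train ...
split:select('valid')   -- switch to the validation partition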

Dataset Iterators

It is easy to iterate over datasets using a for loop. However, sometimes one wants to filter out samples in an on-the-fly manner, or to thread sample fetching.

Iterators are here for these particular cases. In general, refrain from writing iterators to handle custom cases; write a tnt.Dataset instead.

Iterators implement two methods:

  • run() which returns a Lua iterator usable in a for loop.
  • exec(funcname, ...) which executes a given funcname on the underlying dataset.

Typical usage is achieved with a for loop:

for sample in iterator:run() do
  <do something with sample>
end

Iterators implement the __call event, so one might also use the () operator:

for sample in iterator() do
  <do something with sample>
end

tnt.DatasetIterator(self, dataset[, perm][, filter][, transform])

({
   self      = tnt.DatasetIterator  --
   dataset   = tnt.Dataset          --
  [perm      = function]            --  [has default value]
  [filter    = function]            --  [has default value]
  [transform = function]            --  [has default value]
})

The default dataset iterator.

perm(idx) is a permutation used to shuffle the examples. If shuffling is needed, one can use this closure, or (better) use tnt.ShuffleDataset on the underlying dataset.

filter(sample) is a closure which returns true if the given sample should be considered or false if not.

transform(sample) is a closure which can perform online transformation of samples. It returns a modified version of the given sample. It is the identity by default. It is often more interesting to use tnt.TransformDataset for that purpose.

tnt.DatasetIterator.exec(tnt.DatasetIterator, name, ...)

Execute the given method name on the underlying dataset, passing it the subsequent arguments, and returns what the name method returns.

tnt.ParallelDatasetIterator(self[, init], closure, nthread[, perm][, filter][, transform][, ordered])

({
   self      = tnt.ParallelDatasetIterator  --
  [init      = function]                    --  [has default value]
   closure   = function                     --
   nthread   = number                       --
  [perm      = function]                    --  [has default value]
  [filter    = function]                    --  [has default value]
  [transform = function]                    --  [has default value]
  [ordered   = boolean]                     --  [default=false]
})

Allows iterating over a dataset in a threaded manner. tnt.ParallelDatasetIterator:run() guarantees that all samples will be seen, but does not guarantee the order unless ordered is set to true.

The purpose of this class is to have zero pre-processing cost: when reading datasets on the fly from disk (not loading them fully in memory), or when performing complex pre-processing, this can be of interest.

The number of threads used to parallelize is specified by nthread.

init(threadid) (where threadid=1..nthread) is a closure which may initialize the specified thread as needed. It does nothing by default.

closure(threadid) will be called on each thread and must return a tnt.Dataset instance.

perm(idx) is a permutation used to shuffle the examples. If shuffling is needed, one can use this closure, or (better) use tnt.ShuffleDataset on the underlying dataset (returned by closure()).

filter(sample) is a closure which returns true if the given sample should be considered or false if not. Note that filter is called after fetching the data in a threaded manner.

transform(sample) is a function which maps the given sample to a new value. This transformation occurs before filtering.

When ordered is set to true the ordering of samples returned by the iterator is guaranteed. This option is particularly useful for repeatable experiments. By default ordered is false, which means that order is not guaranteed by run() (though often the ordering is similar in practice).

A common error raised by this dataset occurs when closure() is not serializable. Make sure that all upvalues of closure() are serializable; it is recommended to avoid upvalues altogether, and to require all the appropriate torch packages needed to (de-)serialize closure() in the init() function.

For more information, check out the threads package, on which tnt.ParallelDatasetIterator relies.
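
A minimal sketch, adapted from the issue example near the end of this page (the dataset built in closure() is illustrative):

local tnt = require 'torchnet'

local iterator = tnt.ParallelDatasetIterator{
   nthread = 3,
   init    = function() require 'torchnet' end,  -- make tnt available in each thread
   closure = function()
      return tnt.ListDataset{
         list = torch.range(1, 20):long(),
         load = function(idx)
            return {input = torch.LongTensor{idx}}
         end,
      }
   end,
}

for sample in iterator() do
   print(sample.input[1])
end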

tnt.ParallelDatasetIterator.execSingle(tnt.DatasetIterator, name, ...)

Execute the given method name on the dataset corresponding to the first available thread, passing it the subsequent arguments, and returns what the name method returns.

For example:

  local iterator = tnt.ParallelDatasetIterator{...}
  print(iterator:execSingle("size"))

will print the size of the dataset loaded in the first available thread.

tnt.ParallelDatasetIterator.exec(tnt.DatasetIterator, name, ...)

Execute the given method name on the underlying datasets in each thread, passing to each of them the subsequent arguments, and returns a table of what the name method returns for each thread.

For example:

  local iterator = tnt.ParallelDatasetIterator{...}
  for _, v in pairs(iterator:exec("size")) do
      print(v)
  end

will print the size of the datasets loaded in each thread.

tnt.Engine

In experimenting with different models and datasets, the underlying training procedure is often the same. The Engine module provides the boilerplate logic necessary for training and testing models. This might include conducting the interaction between the model (nn.Module), tnt.DatasetIterators, nn.Criterions, and tnt.Meters.

An instance engine of a tnt.Engine() implements two main methods:

  • engine:train(), for training the model on data (i.e. sample data, forward prop, backward prop).
  • engine:test(), for evaluating a model on data (optionally with respect to a nn.Criterion).

The Engine can be implemented for any common underlying training and testing procedure involving a model and data. It can also be designed to allow user control after certain events such as forward prop, criterion evaluation, or the end of an epoch, by using coroutines (see tnt.SGDEngine).

tnt.SGDEngine

The SGDEngine module implements the Stochastic Gradient Descent training procedure in train, including data sampling, forward prop, back prop, and parameter updates. It also operates as a coroutine, allowing user control (e.g., incrementing some sort of tnt.Meter) at events such as 'start', 'start-epoch', 'forward', 'forward-criterion', 'backward', etc. The available hooks are the following:

hooks = {
   ['onStart']             = function() end, -- Right before training
   ['onStartEpoch']        = function() end, -- Before new epoch
   ['onSample']            = function() end, -- After getting a sample
   ['onForward']           = function() end, -- After model:forward
   ['onForwardCriterion']  = function() end, -- After criterion:forward
   ['onBackwardCriterion'] = function() end, -- After criterion:backward
   ['onBackward']          = function() end, -- After model:backward
   ['onUpdate']            = function() end, -- After UpdateParameters
   ['onEndEpoch']          = function() end, -- Right before completing epoch
   ['onEnd']               = function() end, -- After training
}

To specify a new closure for a given hook, we can access it with engine.hooks.<onEvent>. For example, we could reset a Meter before every epoch:

local engine = tnt.SGDEngine()
local meter  = tnt.AverageValueMeter()
engine.hooks.onStartEpoch = function(state)
   meter:reset()
end

Accordingly, train requires a network (nn.Module), a criterion expressing the loss function (nn.Criterion), a dataset iterator (tnt.DatasetIterator), and a learning rate, at the minimum. The test function allows for simple evaluation of a model on a dataset.

A state is maintained for external access to outputs and parameters of modules as well as sampled data. The content of the state table is the following, where the passed values come from the arguments of engine:train():

state = {
   ['network']     = network,
   ['criterion']   = criterion,
   ['iterator']    = iterator,
   ['lr']          = lr,
   ['lrcriterion'] = lrcriterion,
   ['maxepoch']    = maxepoch,
   ['sample']      = {},
   ['epoch']       = 0, -- epochs done so far
   ['t']           = 0, -- samples seen so far
   ['training']    = true
}
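
Putting it together, a sketch of a minimal training call (network, criterion and iterator are assumed to be defined elsewhere):

local engine = tnt.SGDEngine()
engine.hooks.onEndEpoch = function(state)
   print('epoch ' .. state.epoch .. ' done')
end
engine:train{
   network   = network,
   criterion = criterion,
   iterator  = iterator,
   lr        = 0.1,
   maxepoch  = 5,
}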

tnt.OptimEngine

The OptimEngine module wraps the optimization functions from https://github.com/torch/optim. At the start of training, the engine will call getParameters on the provided network.

The train method requires the following parameters in addition to the SGDEngine.train parameters:

  • optimMethod the optimization function (e.g optim.sgd)
  • config a table with configuration parameters for the optimizer

Example:

  local engine = tnt.OptimEngine()
  engine:train{
     network = model,
     criterion = criterion,
     iterator = iterator,
     optimMethod = optim.sgd,
     config = {
        learningRate = 0.1,
        momentum = 0.9,
     },
  }

tnt.Meter

When training a model, you generally would like to measure how the model is performing. Specifically, you may want to measure the average processing time required per batch of data, the classification error or AUC of a classifier on a validation set, or the precision@k of a retrieval model.

Meters provide a standardized way to compute a range of different measures, which makes it easy to track a wide range of properties of your models.

Nearly all meters (except tnt.TimeMeter) implement three methods:

  • add() which adds an observation to the meter.
  • value() which returns the value of the meter, taking into account all observations.
  • reset() which removes all previously added observations, resetting the meter.

The exact input arguments to the add() method vary depending on the meter. Most meters define the method as add(output, target), where output is the output produced by the model and target is the ground-truth label of the data.

The value() method is parameterless for most meters, but for measures that have a parameter (such as the k parameter in precision@k), they may take an input argument.

An example of a typical usage of a meter is as follows:

local meter = tnt.<Measure>Meter()  -- initialize meter
for state, event in tnt.<Optimization>Engine:train{
   network   = network,
   criterion = criterion,
   iterator  = iterator,
} do
  if state == 'start-epoch' then
     meter:reset()  -- reset meter
  elseif state == 'forward-criterion' then
     meter:add(state.network.output, state.sample.target)  -- add value to meter
  elseif state == 'end-epoch' then
     print('value of meter:' .. meter:value())  -- get value of meter
  end
end

tnt.APMeter(self)

({
   self = tnt.APMeter  --
})

The tnt.APMeter measures the average precision per class.

The tnt.APMeter is designed to operate on NxK Tensors output and target, and optionally a Nx1 Tensor weight where (1) the output contains model output scores for N examples and K classes that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); (2) the target contains only values 0 (for negative examples) and 1 (for positive examples); and (3) the weight ( > 0) represents weight for each sample.

The tnt.APMeter has no parameters to be set.

tnt.AverageValueMeter(self)

({
   self = tnt.AverageValueMeter  --
})

The tnt.AverageValueMeter measures and returns the average value and the standard deviation of any collection of numbers that are added to it. It is useful, for instance, to measure the average loss over a collection of examples.

The add() function expects as input a Lua number value, which is the value that needs to be added to the list of values to average. It also takes as input an optional parameter n that assigns a weight to value in the average, in order to facilitate computing weighted averages (default = 1).

The tnt.AverageValueMeter has no parameters to be set at initialization time.
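
For instance, a sketch of averaging loss values:

local meter = tnt.AverageValueMeter()
meter:add(0.5)
meter:add(1.5)
local mean, std = meter:value()  -- mean = 1.0, plus the standard deviation
meter:reset()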

tnt.AUCMeter(self)

({
   self = tnt.AUCMeter  --
})

The tnt.AUCMeter measures the area under the receiver-operating characteristic (ROC) curve for binary classification problems. The area under the curve (AUC) can be interpreted as the probability that, given a randomly selected positive example and a randomly selected negative example, the positive example is assigned a higher score by the classification model than the negative example.

The tnt.AUCMeter is designed to operate on one-dimensional Tensors output and target, where (1) the output contains model output scores that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); and (2) the target contains only values 0 (for negative examples) and 1 (for positive examples).

The tnt.AUCMeter has no parameters to be set.

tnt.ConfusionMeter(self, k[, normalized])

{
   self       = tnt.ConfusionMeter  --
   k          = number              --
  [normalized = boolean]            --  [default=false]
}

The tnt.ConfusionMeter constructs a confusion matrix for multi-class classification problems. It does not support multi-label, multi-class problems: for such problems, please use tnt.MultiLabelConfusionMeter.

At initialization time, the k parameter that indicates the number of classes in the classification problem under consideration must be specified. Additionally, an optional parameter normalized (default = false) may be specified that determines whether the confusion matrix is normalized (that is, it contains percentages) or not (that is, it contains counts).

The add(output, target) method takes as input an NxK tensor output that contains the output scores obtained from the model for N examples and K classes, and a corresponding N-tensor or NxK-tensor target that provides the targets for the N examples. When target is an N-tensor, the targets are assumed to be integer values between 1 and K. When target is an NxK-tensor, the targets are assumed to be provided as one-hot vectors (that is, vectors that contain only zeros and a single one at the location of the target value to be encoded).

The value() method has no parameters and returns the confusion matrix in a KxK tensor. In the confusion matrix, rows correspond to ground-truth targets and columns correspond to predicted targets.
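
A sketch with N = 2 examples and K = 3 classes (the scores and targets are made up):

local meter = tnt.ConfusionMeter{k = 3}
local output = torch.Tensor{{0.8, 0.1, 0.1},
                            {0.2, 0.5, 0.3}}  -- scores for 2 examples, 3 classes
local target = torch.LongTensor{1, 3}         -- ground-truth classes
meter:add(output, target)
print(meter:value())                          -- 3x3 confusion matrix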

tnt.mAPMeter(self)

({
   self = tnt.mAPMeter  --
})

The tnt.mAPMeter measures the mean average precision over all classes.

The tnt.mAPMeter is designed to operate on NxK Tensors output and target, and optionally a Nx1 Tensor weight where (1) the output contains model output scores for N examples and K classes that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); (2) the target contains only values 0 (for negative examples) and 1 (for positive examples); and (3) the weight ( > 0) represents weight for each sample.

The tnt.mAPMeter has no parameters to be set.

tnt.MovingAverageValueMeter(self, windowsize)

({
   self       = tnt.MovingAverageValueMeter  --
   windowsize = number                       --
})

The tnt.MovingAverageValueMeter measures and returns the average value and the standard deviation of any collection of numbers that are added to it within the most recent moving average window. It is useful, for instance, to measure the average loss over a collection of examples within the most recent window.

The add() function expects as input a Lua number value, which is the value that needs to be added to the list of values to average.

The tnt.MovingAverageValueMeter needs the moving window size to be set at initialization time.

tnt.MultiLabelConfusionMeter(self, k[, normalized])

{
   self       = tnt.MultiLabelConfusionMeter  --
   k          = number                        --
  [normalized = boolean]                      --  [default=true]
}

The tnt.MultiLabelConfusionMeter constructs a confusion matrix for multi-label, multi-class classification problems. In constructing the confusion matrix, the number of positive predictions is assumed to be equal to the number of positive labels in the ground-truth. Correct predictions (that is, labels in the prediction set that are also in the ground-truth set) are added to the diagonal of the confusion matrix. Incorrect predictions (that is, labels in the prediction set that are not in the ground-truth set) are equally divided over all non-predicted labels in the ground-truth set.

At initialization time, the k parameter that indicates the number of classes in the classification problem under consideration must be specified. Additionally, an optional parameter normalized (default = true) may be specified that determines whether the confusion matrix is normalized (that is, it contains percentages) or not (that is, it contains counts).

The add(output, target) method takes as input an NxK tensor output that contains the output scores obtained from the model for N examples and K classes, and a corresponding NxK-tensor target that provides the targets for the N examples using one-hot vectors (that is, vectors that contain only zeros and a single one at the location of the target value to be encoded).

The value() method has no parameters and returns the confusion matrix in a KxK tensor. In the confusion matrix, rows correspond to ground-truth targets and columns correspond to predicted targets.

tnt.ClassErrorMeter(self[, topk][, accuracy])

{
   self     = tnt.ClassErrorMeter  --
  [topk     = table]               --  [has default value]
  [accuracy = boolean]             --  [default=false]
}

The tnt.ClassErrorMeter measures the classification error (in %) of classification models (zero-one loss). The meter can also measure the error of predicting the correct label among the top-k scoring labels (for instance, in the Imagenet competition, one generally measures classification@5 errors).

At initialization time, it takes two optional parameters: (1) a table topk that contains the values at which the classification@k errors should be measured (default = {1}); and (2) a boolean accuracy that makes the meter output accuracies instead of errors (accuracy = 1 - error).

The add(output, target) method takes as input an NxK-tensor output that contains the output scores for each of the N examples and each of the K classes, and an N-tensor target that contains the targets corresponding to each of the N examples (targets are integers between 1 and K). If only one example is added, output may also be a K-tensor and target a 1-tensor.

Please note that topk (if specified) may not contain values larger than K.

The value() method returns a table with the classification@k errors for all values of k that were specified in topk at initialization time. Alternatively, value(k) returns the classification@k error as a number; only values of k that were elements of topk are allowed. If accuracy was set to true at initialization time, the value() method returns accuracies instead of errors.
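
For example, a sketch measuring top-1 and top-5 error (output and target as described above):

local meter = tnt.ClassErrorMeter{topk = {1, 5}}
meter:add(output, target)   -- NxK scores and N integer targets
print(meter:value(1))       -- classification@1 error in percent
print(meter:value(5))       -- classification@5 error in percent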

tnt.TimeMeter(self[, unit])

({
   self = tnt.TimeMeter  --
  [unit = boolean]       --  [default=false]
})

The tnt.TimeMeter is designed to measure the time between events and can be used to measure, for instance, the average processing time per batch of data. It is different from most other meters in terms of the methods it provides:

At initialization time, an optional boolean parameter unit may be provided (default = false). When set to true, the value returned by the meter will be divided by the number of times that the incUnit() method is called. This allows the user to compute, for instance, the average processing time per batch by simply calling the incUnit() method after processing a batch.

The tnt.TimeMeter provides the following methods:

  • reset() resets the timer, setting the timer and unit counter to zero.
  • stop() stops the timer.
  • resume() resumes the timer.
  • incUnit() increments the unit counter by one.
  • value() returns the time passed since the last reset(); divided by the counter value when unit=true.
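
For instance, a sketch measuring the average time per batch:

local timer = tnt.TimeMeter{unit = true}
timer:reset()
for i = 1, 100 do
   -- process one batch here ...
   timer:incUnit()
end
print(timer:value())  -- average seconds per batch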

tnt.PrecisionAtKMeter(self[, topk][, dim][, online])

{
   self    = tnt.PrecisionAtKMeter  --
  [topk    = table]                 --  [has default value]
  [dim     = number]                --  [default=2]
  [online  = boolean]               --  [default=false]
}

The tnt.PrecisionAtKMeter measures the precision@k of ranking methods at pre-specified levels k. The precision@k is the percentage of the k front-ranked items according to the model that is in the list of correct (positive) targets.

At initialization time, a table topk may be given as input that specifies the levels k at which the precision@k will be measured (default = {10}). In addition, a number dim may be provided that specifies over which dimension the precision@k should be computed (default = 2), and a boolean online may be specified that indicates whether we see all inputs along dimension dim at once (default = false).

The add(output, target) method takes two inputs. In the default mode (dim=2 and online=false), the inputs mean:

  • A NxC tensor that for each of the N examples (queries) contains a score indicating to what extent each of the C classes (documents) is relevant to the query, according to the model.
  • A binary NxC target tensor that encodes which of the C classes (documents) are actually relevant to the N-th input (query). For instance, a row of {0, 1, 0, 1} indicates that the example is associated with classes 2 and 4.

The result of setting dim to 1 is identical to transposing the tensors output and target in the above. The result of setting online=true is that the function assumes that it is not the number of queries N that is growing with repeated calls to add(), but the number of candidate documents C. (Use this mode in scenarios where C is large but N is small.)

The value() method returns a table that contains the precision@k (that is, the percentage of targets predicted correctly) at the cutoff levels in topk that were specified at initialization time. Alternatively, the precision@k at a specific level k can be obtained by calling value(k). Note that the level k should be an element of the table topk specified at initialization time.

Please note that the maximum value in topk cannot be higher than the total number of classes (documents).

tnt.RecallMeter(self[, threshold][, perclass])

{
   self      = tnt.RecallMeter  --
  [threshold = table]           --  [has default value]
  [perclass  = boolean]         --  [default=false]
}

The tnt.RecallMeter measures the recall of ranking methods at pre-specified thresholds. The recall is the percentage of the correct (positive) targets that are in the list of positively labeled items according to the model.

At initialization time, the tnt.RecallMeter provides two optional parameters. The first parameter is a table threshold that contains all thresholds at which the recall is measured (default = {0.5}). Thresholds should be numbers between 0 and 1. The second parameter is a boolean perclass that makes the meter measure the recall per class when set to true (default = false). When perclass is set to false, the recall is simply averaged over all examples.

The add(output, target) method takes two inputs:

  • A NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. The probabilities should sum to one over all classes; that is, the row sums of output should all be one.
  • A binary NxK target tensor that encodes which of the K classes are associated with the N-th input. For instance, a row of {0, 1, 0, 1} indicates that the example is associated with classes 2 and 4.

The value() method returns a table containing the recall of the model predictions measured at the thresholds specified at initialization time. The value(t) method returns the recall at a particular threshold t. Note that this threshold t should be an element of the threshold table specified at initialization time of the meter.

tnt.PrecisionMeter(self[, threshold][, perclass])

{
   self      = tnt.PrecisionMeter  --
  [threshold = table]              --  [has default value]
  [perclass  = boolean]            --  [default=false]
}

The tnt.PrecisionMeter measures the precision of ranking methods at pre-specified thresholds. The precision is the percentage of the positively labeled items according to the model that are in the list of correct (positive) targets.

At initialization time, the tnt.PrecisionMeter provides two optional parameters. The first parameter is a table threshold that contains all thresholds at which the precision is measured (default = {0.5}). Thresholds should be numbers between 0 and 1. The second parameter is a boolean perclass that makes the meter measure the precision per class when set to true (default = false). When perclass is set to false, the precision is simply averaged over all examples.

The add(output, target) method takes two inputs:

  • A NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. The probabilities should sum to one over all classes; that is, the row sums of output should all be one.
  • A binary NxK target tensor that encodes which of the K classes are associated with the N-th input. For instance, a row of {0, 1, 0, 1} indicates that the example is associated with classes 2 and 4.

The value() method returns a table containing the precision of the model predictions measured at the thresholds specified at initialization time. The value(t) method returns the precision at a particular threshold t. Note that this threshold t should be an element of the threshold table specified at initialization time of the meter.

tnt.NDCGMeter(self[, K])

{
   self = tnt.NDCGMeter  --
  [K    = table]         --  [has default value]
}

The tnt.NDCGMeter measures the normalized discounted cumulative gain (NDCG) of a ranking produced by a model at prespecified levels k, and averages the NDCG over all examples.

The discounted cumulative gain at level k is defined as:

DCG_k = rel_1 + \sum_{i = 2}^k (rel_i / log_2(i))

Herein, rel_i is the relevance of item i as specified by an external rater. Defining ideal DCG (IDCG) as the best possible DCG for a given example, the NDCG at level k is defined as:

NDCG_k = DCG_k / IDCG_k

At initialization time, the meter takes as input a table K that contains all the levels k at which the NDCG is computed.

The add(output, relevance) method takes as input (1) an NxC tensor of model outputs, containing scores for all C possible outputs for a batch of N examples; and (2) an NxC tensor relevance that contains the corresponding relevances for these scores, as provided by an external rater. Relevances are generally obtained from human raters.

The value() method returns a table that contains the NDCG values for all levels K that were provided at initialization time. Alternatively, the NDCG at a specific level k can be obtained by calling value(k). Note that the level k should be an element of the table K specified at initialization time.

Please note that the number of outputs and relevances C should always be at least as high as the highest NDCG level k that the meter is computing.

tnt.Log

Log classes act as tables indexed by string keys. Allowed keys must be provided at construction. A special key __status__ can also be set via the convenience method log:status() to record basic messages.

Viewer closures can be attached to a Log and called at different events:

  • onSet(log, key, value): when setting a key to the Log with log:set{}.
  • onGet(log, key): when querying a key with log:get().
  • onFlush(log): when flushing out the stored data of the Log with log:flush().
  • onClose(log): when closing a Log with log:close().

Typical viewer closures are text or json, which allow writing a subset of the keys stored by the Log to disk or to the console, in a particular format. The special viewer closure status is made to be called on set() events, and will print out only status records.

A typical use case would be the following:

tnt = require 'torchnet'

-- require the viewers we want
logtext = require 'torchnet.log.view.text'
logstatus = require 'torchnet.log.view.status'

log = tnt.Log{
   keys = {"loss", "accuracy"},
   onFlush = {
      -- write out all keys in "log" file
      logtext{filename='log.txt', keys={"loss", "accuracy"}, format={"%10.5f", "%3.2f"}},
      -- write out loss in a standalone file
      logtext{filename='loss.txt', keys={"loss"}},
      -- print on screen too
      logtext{keys={"loss", "accuracy"}},
   },
   onSet = {
      -- add status to log
      logstatus{filename='log.txt'},
      -- print status to screen
      logstatus{},
   }
}

-- set values
log:set{
  loss = 0.1,
  accuracy = 97
}

-- write some info
log:status("hello world")

-- flush out log
log:flush()

tnt.Log(self, keys[, onClose][, onFlush][, onGet][, onSet])

{
   self     = tnt.Log  --
   keys     = table    --
  [onClose  = table]   --
  [onFlush  = table]   --
  [onGet    = table]   --
  [onSet    = table]   --
}

Creates a new Log with allowed keys (strings) keys. Specify event closures with tables of functions onClose, onFlush, onGet and onSet, which will be called when the close(), flush(), get(), and set{} methods are called, respectively.

tnt.Log:status(self[, message][, time])

({
   self    = tnt.Log   --
  [message = string]   --
  [time    = boolean]  --  [default=true]
})

Record a status message, with corresponding (optional) time of the event.

tnt.Log:set(self, keys)

(
   self = tnt.Log  --
   keys = table    --
)

Set a number of keys (a subset of the keys provided at construction) to their corresponding values.

Closures attached to the onSet(log, key, value) event will be called.

tnt.Log:get(self, key)

({
   self = tnt.Log  --
   key  = string   --
})

Get the value of a given key.

Closures attached to the onGet(log, key) event will be called.

tnt.Log:flush(self)

({
   self = tnt.Log  --
})

Flush (empty) the log data.

Closures attached to the onFlush(log) event will be called.

tnt.Log:close(self)

({
   self = tnt.Log  --
})

Close the log.

Closures attached to the onClose(log) event will be called.

tnt.Log:attach(self, event, closures)

({
   self     = tnt.Log  --
   event    = string   --
   closures = table    --
})

Attach a set of functions (provided in a table) to a given event.


torchnet's Issues

MNIST example

Hi,
I just want to point out a minor problem in the mnist.lua example related to the criterion and dataset iterator.

If you redefine the model and the criterion (lines 49 and 50) with the following lines, the script will fail.

local net = nn.Sequential():add(nn.Linear(784,10))
net:add(nn.LogSoftMax())

local criterion = nn.ClassNLLCriterion()

It looks like the dataset iterator is returning a 2D target tensor instead of the 1D tensor required by ClassNLLCriterion(). One simple solution is to reshape the sample during training:

engine.hooks.onSample = function(state)
    state.sample.target = state.sample.target:view(state.sample.target:nElement())
end

Breaking out of a ParallelDatasetIterator may lead to issues

The following code example produces incorrect results in the second run through the iterator iff we break out of the first run through the iterator:

local tnt = require 'torchnet'

local producebug = true

local N = 20
local iterator = tnt.ParallelDatasetIterator{
   nthread = 3,
   init    = function() require 'torchnet' end,
   closure = function()
      local list = torch.range(1, N):long()
      return tnt.ListDataset{
         list = list,
         load = function(idx)
            return {input  = torch.LongTensor{idx}}
         end,
      }
   end,
}

print('| run that we are breaking out:')
for sample in iterator() do
   print(' (1) -> ' .. sample.input[1])
   if producebug then break end
end

print('| run that may contain erroneous samples:')
for sample in iterator() do
   print(' (2) -> ' .. sample.input[1])
end

Garbage Collection

In ParallelDatasetIterator, moving collectgarbage() from Line 109-110

https://github.com/torchnet/torchnet/blob/master/dataset/paralleldatasetiterator.lua#L109-L110

to Line 114 (i.e., into the second function)
https://github.com/karandwivedi42/torchnet/blob/mem-usage/dataset/paralleldatasetiterator.lua#L112-L113

reduces the memory usage.

Without the move, memory usage peaks at around 30GB (4 threads, batchSize 256, fb.resnet.torch pre-processing); after moving it into the second function, it peaks at around 10GB.

meter.MultilabelConfusionMeter invalid argument error

Hi.
I tried to use the MultiLabelConfusionMeter with the following code:

	local tnt = require 'torchnet'
	meter = tnt.MultiLabelConfusionMeter(#classes, false)

and got this error message:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
tnt.MultiLabelConfusionMeter(self, k[, normalized])

{
   self       = tnt.MultiLabelConfusionMeter  -- 
   k          = number                        -- 
  [normalized = boolean]                      --  [default=true]
}


The tnt.MultiLabelConfusionMeter constructs a confusion matrix for multi-
label, multi-class classification problems. In constructing the confusion
matrix, the number of positive predictions is assumed to be equal to the
number of positive labels in the ground-truth. Correct predictions (that
is, labels in the prediction set that are also in the ground-truth set) are
added to the diagonal of the confusion matrix. Incorrect predictions (that
is, labels in the prediction set that are not in the ground-truth set) are
equally divided over all non-predicted labels in the ground-truth set.

At initialization time, the k parameter that indicates the number of
classes in the classification problem under consideration must be
specified. Additionally, an optional parameter normalized (default = false)
may be specified that determines whether or not the confusion matrix is
normalized (that is, it contains percentages) or not (that is, it contains
counts).

The add(output, target) method takes as input an NxK tensor output that
contains the output scores obtained from the model for N examples and K
classes, and a corresponding NxK-tensor target that provides the targets
for the N examples using one-hot vectors (that is, vectors that contain
only zeros and a single one at the location of the target value to be
encoded).

The value() method has no parameters and returns the confusion matrix in a
KxK tensor. In the confusion matrix, rows correspond to ground-truth
targets and columns correspond to predicted targets.

Got: tnt.MultiLabelConfusionMeter, number, boolean

invalid arguments!

It looks like tnt.MultiLabelConfusionMeter, number, boolean is passed, as it should be.
But then why do I see this message?

Sunwoo
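For what it's worth, a hedged workaround sketch: passing named arguments sidesteps the positional-dispatch ambiguity (classes is assumed from the snippet above):

local tnt = require 'torchnet'
-- named arguments instead of positional ones:
local meter = tnt.MultiLabelConfusionMeter{k = #classes, normalized = false}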

ClassErrorMeter throwing size mismatch error

I get the following error "qlua: ...install/share/lua/5.1/torchnet/meter/classerrormeter.lua:79: target and output do not match"

My target and network.output values are as below:

I initialize the error meter with local clerr = tnt.ClassErrorMeter{topk = {1}}
target values:
20
11
12
7
12
12
12
12
16
10
15
12
12
1
12
12
[torch.CudaTensor of size 16x1]

output values:
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
[torch.CudaTensor of size 16x1]

I checked the source code of ClassErrorMeter to see which condition caused the error, but the error message doesn't say much.
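For reference, a hedged sketch of the shapes tnt.ClassErrorMeter expects; note the 16x1 output above has only a single score column, which cannot match multi-class targets (the 20 classes here are assumed from the target dump):

local tnt = require 'torchnet'

local clerr  = tnt.ClassErrorMeter{topk = {1}}
local output = torch.randn(16, 20)                -- N x K score matrix
local target = torch.LongTensor(16):random(1, 20) -- N class indices in 1..K
clerr:add(output, target)
print(clerr:value(1))                             -- top-1 error (percent)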

Torch net install problem

After I did:
luarocks install torchnet

I created a new file "torchtest.lua" as follows:
require 'nn'
local tnt = require 'torchnet'
local mnist = require 'mnist'

run
th torchtest.lua

I got the errors below:
robysmac:lua test zhaochangkai$ th torchtest.lua
/Users/zhaochangkai/torch/install/bin/luajit: .../zhaochangkai/torch/install/share/lua/5.1/trepl/init.lua:384: .../zhaochangkai/torch/install/share/lua/5.1/trepl/init.lua:384: .../zhaochangkai/torch/install/share/lua/5.1/trepl/init.lua:384: module 'tds' not found:
    No LuaRocks module found for tds
    no field package.preload['tds']
    no file '/Users/zhaochangkai/.luarocks/share/lua/5.1/tds.lua'
    no file '/Users/zhaochangkai/.luarocks/share/lua/5.1/tds/init.lua'
    no file '/Users/zhaochangkai/torch/install/share/lua/5.1/tds.lua'
    no file '/Users/zhaochangkai/torch/install/share/lua/5.1/tds/init.lua'
    no file './tds.lua'
    no file '/Users/zhaochangkai/torch/install/share/luajit-2.1.0-beta1/tds.lua'
    no file '/usr/local/share/lua/5.1/tds.lua'
    no file '/usr/local/share/lua/5.1/tds/init.lua'
    no file '/Users/zhaochangkai/.luarocks/lib/lua/5.1/tds.so'
    no file '/Users/zhaochangkai/torch/install/lib/lua/5.1/tds.so'
    no file './tds.so'
    no file '/usr/local/lib/lua/5.1/tds.so'
    no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
    [C]: in function 'error'
    /Users/zhaochangkai/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    torchtest.lua:2: in main chunk
    [C]: in function 'dofile'
    ...gkai/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x0109718d40

I use a Mac with a CUDA-capable graphics card.

I switched clang to version 6 in order to use CUDA.
robysmac:lua test zhaochangkai$ clang -v
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin15.4.0
Thread model: posix

CUDA MNIST example is missing

In the paper, there is also an example of how to run on GPU.
Maybe it would be worth including it in the example folder.

About this example, I'm not sure why you perform a resize() on the torch.CudaTensor() on each onSample. Isn't this resize() necessary only once?

OptimEngine.test not implemented

Why is test() not implemented for the OptimEngine class? This is rather surprising to me. It seems like it should share the same functionality as SGDEngine.test(), so that function could live in Engine.test() and be inherited by both engines.

Happy to do this myself. Just wondering why something so necessary, e.g. for early stopping when the validation error stops going down, is missing.

Tracking progress in an epoch

How do we track progress within an epoch? I generally use xlua to see how many images have been processed. How can we do that here?

We can use state.t to get the number of batches/samples processed so far. But how do we calculate the total number of batches? Could an iterator or dataset expose a generic size attribute?
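A hedged sketch of one way to do this, assuming you keep a handle to the batched dataset the iterator was built from (here dataset, a hypothetical name), so its size() gives the batch count:

local xlua = require 'xlua'
local nbatches  = dataset:size() -- number of batches in one epoch
local processed = 0

engine.hooks.onStartEpoch = function(state)
   processed = 0
end

engine.hooks.onSample = function(state)
   processed = processed + 1
   xlua.progress(processed, nbatches) -- xlua's progress bar
end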

Wrapping issue when passing tables as first parameter with argcheck

Lua's bracket syntax passes everything to the function as a single table argument. There is no indicator that the function has been called as my_function{arg1 = "hello", arg2 = "world"} rather than my_function({arg1 = "hello", arg2 = "world"}). This makes it impossible for argcheck to know how to parse the first argument when it is of type = "table", as in the NDCGMeter. Here's a test case:

function test.NDCGMeter()
   local mtr = tnt.NDCGMeter{K = {6}}

   -- From: https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG
   local relevance = torch.DoubleTensor{3,2,3,0,1,2}
   local output = torch.linspace(relevance:size(1), 1, relevance:size(1)):double()
   mtr:add(output, relevance)

   local est = mtr:value()
   tester:eq(est[6], 0.932, "nDGC with K=6", 10^-3)
end

As there is only one parameter, it is easy to add a workaround (add at line 96):

         if (K.K) then
            K = K.K
         end

Unfortunately, there is no elegant Lua solution as I see it, and I've proposed that argcheck add a table wrapper class. This lets the user wrap the table in a class that argcheck can then easily identify. It has the downside of adding complexity, but we believed that for the torch-dataframe package this was the best solution, as there are plenty of instances where passing a table as the first argument makes sense.

Problem running the example

I may have missed something obvious, but I failed to run the provided examples.

If I run th example/mnist.lua from the torchnet directory, it gives the following error output:

$ th example/mnist.lua 
running on CPU  
/home/joe/torch/install/bin/luajit: /home/joe/torch/install/share/lua/5.1/threads/threads.lua:264: 
[thread 1 callback] example/mnist.lua:26: module 'mnist' not found:
    no field package.preload['mnist']
    no file [some paths in my computer]
    ...

Where can I get the mnist dataset Lua library?

putting vector as target

The MSE criterion needs a vector (1D tensor) as input.
But in the example code, I could only see cases where scalar values are used as targets.
How can I use a vector as target with the MSE criterion?
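A hedged sketch, adapted from the mnist example's load closure (the one-hot encoding is an assumption about what your MSE setup wants): return a 1D target so that batching produces the 2D targets nn.MSECriterion expects.

load = function(idx)
   local target = torch.zeros(10)
   target[dataset.label[idx] + 1] = 1 -- one-hot encoding of the class
   return {
      input  = dataset.data[idx]:view(-1):double(),
      target = target,
   }
end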

Transforms

Recently I found a very nice set of preprocessing transformations in a project by Facebook implementing ResNet (of which you are probably aware). I guess adapting and incorporating transforms.lua would be a nice addition to torchnet.

Segmentation fault (core dumped)

I have tried to use torchnet, but I got this exception when I called the getIterator function.
Could you please help me solve this issue?
Here is my code:

local function getData(fname)
    local hdf5 = require 'hdf5'
    local f = hdf5.open(fname, 'r')
    
    local X1 = f:read('X1'):all()
    local X2 = f:read('X2'):all()
    local X3 = f:read('X3'):all()
    local labels = f:read('labels'):all()
    f:close()
    return {X1 = X1, X2 = X2, X3 = X3, labels = labels}
end

 
-- function that sets up the dataset iterator:


local function getIterator(mode)
   return tnt.ParallelDatasetIterator{
      nthread = 1,
      init    = function() require 'torchnet' end,
      closure = function()

         -- load dataset:

         local dataset = getData('data/en/A/' .. mode .. '.h5')

         -- return batches of data:
         return tnt.BatchDataset{
            batchsize = 128,
            dataset = tnt.ListDataset{  -- replace this by your own dataset
               list = torch.range(1, dataset.X1:size(1)):long(),
               load = function(idx)
                  return {
                     input  = { dataset.X1[idx], dataset.X2[idx], dataset.X3[idx] },
                     target = torch.LongTensor{dataset.labels[idx] + 1},
                  }  -- sample contains input and target
               end,
            }
         }
      end,
   }
end

The AUC-meter evaluates differently from classical statistics

I've finished writing a basic test suite for the meters, and apart from issue #41 I've encountered an unexpected problem with tnt.AUCMeter. The following test case should implement the classical AUC calculation based on this paper:

function test.AUCMeter()
   local mtr = tnt.AUCMeter()

   -- From http://stats.stackexchange.com/questions/145566/how-to-calculate-area-under-the-curve-auc-or-the-c-statistic-by-hand
   local samples = torch.Tensor{
      {33,6,6,11,2}, --normal
      {3,2,2,11,33} -- abnormal
   }
   for i=1,samples:size(2) do
      local target = torch.Tensor():resize(samples:narrow(2,i,1):sum()):zero()
      target:narrow(1,1,samples[2][i]):fill(1)
      local output = torch.Tensor(target:size(1)):fill(i)
      mtr:add(output, target)
   end

   local error, tpr, fpr = mtr:value()

   tester:assert(math.abs(error - 0.8931711) < 10^-3,
      ("The AUC error does not match: %.3f is not equal to 0.893"):format(error))
end

Unfortunately, the computed AUC (0.704) is lower than the expected 0.893. I'm not familiar enough with ML to know whether the ML notion of AUC differs in some significant way, but the value 0.704 seems intuitively low (my apologies if I missed something in the coding). Looking at how the AUC is calculated, there is a zero appended that could be pulling the value down.

for ListDataset, add an onComplete argument

This would be useful for closing any underlying data streams or connections that were opened in order to fetch data for the list dataset.

For example, say you want to return a ListDataset from a method, and this ListDataset loads its data from an IndexedDatasetReader. When the ListDataset is done loading, it should be able to close the IndexedDatasetReader stream.

This would look like:

function getDataSet()

  local dataSetReader = tnt.IndexedDatasetReader('dataset.index', 'dataset.data');

  local someList = {1, 2, 3};

  local list = tnt.ListDataset{
    list = someList,
    load = function(idx) return dataSetReader:get(idx); end,
    -- ... other arguments omitted ...
    onComplete = function() dataSetReader:close() end -- proposed new argument
  }

  return list
end

Bug report: not entering the iterator to full depth.

In this case,

local function getIterator(mode)
   return tnt.ParallelDatasetIterator{
      closure = function()
         return tnt.BatchDataset{
            dataset = tnt.ListDataset{
               [place 1]

When I set batchSize to a big value (e.g. 1000),
I hit a case where [place 1] is never reached.

But strangely, the program runs without any error,
and it generates 'nan' training loss.

How to shuffle the selected partition after SplitDataset?

local tnt = require 'torchnet'

local d = tnt.TableDataset{data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}

d = d:split{
  train = 0.7,
  val   = 0.3
}

d:select('val')
print('val')
for sample in d:iterator()() do
  print(sample)
end

d:select('train')
print('train')
for sample in d:iterator()() do
  print(sample)
end

d=d:shuffle() -- gives a new dataset
print('train2')
for sample in d:iterator()() do
  print(sample)
end

d:select('val') -- no more select
print('val2')
for sample in d:iterator()() do
  print(sample)
end
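For what it's worth, a hedged sketch of one way around this: wrap the split in a tnt.ShuffleDataset and resample it between passes. Note the wrapper reflects whichever partition is currently selected, so keep the select and the iteration together.

d:select('train')
local shuffled = tnt.ShuffleDataset{dataset = d}
for sample in shuffled:iterator()() do
   print(sample)
end
shuffled:resample() -- draws a fresh permutation for the next pass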

This error is unclear - what is the problem with my code that is causing this?

qlua: ...hare/lua/5.1/torchnet/meter/multilabelconfusionmeter.lua:107: attempt to index local 'pos' (a nil value)
stack traceback:
[C]: in function '__index'
...hare/lua/5.1/torchnet/meter/multilabelconfusionmeter.lua:107: in function 'add'
main.lua:198: in function 'hooks'
...ch/install/share/lua/5.1/torchnet/engine/optimengine.lua:105: in function 'train'
main.lua:218: in main chunk

How to select different batch samples in each iteration?

The training code looks like this:

engine:train{
    network = model,
    iterator = getIterator('train'),
    criterion = criterion,
    optimMethod = optim.sgd,
    config = tablex.deepcopy(cnnopt),
    maxepoch = cnnopt.max_epoch,
}

The getIterator() function is as follows and is only called once:

local function getIterator(mode)
    return tnt.ParallelDatasetIterator{
        nthread = 1,
        init = function()
            require 'torchnet'
            require 'image'
            require 'nn'
        end,
        closure = function()
            local dataset = provider[mode..'Data']

            local list_dataset = tnt.ListDataset{
                list = torch.range(1, dataset.labels:numel()):long(),
                load = function(idx)
                    return {
                        input = dataset.data[idx]:float(),
                        target = torch.LongTensor{dataset.labels[idx]},
                    }
                end,
            }
            if mode == 'train' then
                return list_dataset
                :transform{
                    input = tnt.transform.compose{
                        cnnopt.hflip and hflip,
                        cnnopt.randomcrop > 0 and randomcrop,
                    }
                }
                :batch(cnnopt.batchSize, 'skip-last')
            elseif mode == 'test' then
                return list_dataset
                :batch(cnnopt.batchSize, 'include-last')
            end
        end
    }
end

It seems that the DatasetIterator() returns a fixed list of batches to iterate over. If I want to select different batch samples at each iteration, how should I change the code? Or are there other interfaces I should use?
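One possible direction, sketched under assumptions: insert a shuffle stage in the closure, then resample it from the main thread at each epoch. Whether exec('resample') is forwarded through the BatchDataset stage to the ShuffleDataset depends on your torchnet version, so verify before relying on it.

if mode == 'train' then
    return list_dataset
        :shuffle() -- adds a ShuffleDataset stage with a fresh permutation
        :transform{
            input = tnt.transform.compose{
                cnnopt.hflip and hflip,
                cnnopt.randomcrop > 0 and randomcrop,
            }
        }
        :batch(cnnopt.batchSize, 'skip-last')
end

-- from the main thread, ask each worker to draw a new permutation:
engine.hooks.onStartEpoch = function(state)
    state.iterator:exec('resample')
end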

Continuously load data from disk in separate thread

Hi,

After playing a bit with torchnet, it is still unclear to me how to properly tackle the following problem: suppose my training data lies in a very big file on disk (not loadable into memory at once), with each line of the file being one example. I would like to build a data iterator that runs on a separate thread (or several threads) and provides mini-batches to the main thread that trains the network. I would also like to do multiple epochs over the training data, so I need the training file to be reopened once it is exhausted.

I tried using ParallelDatasetIterator, but as far as I understand, the closure is run once per thread and the returned dataset is expected to have a finite size. Can someone please explain or give an example for this use case? Thanks a lot.
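A hedged sketch of one approach (the file name, the parsing, and the byte-offset arithmetic are assumptions; the offsets assume Unix newlines): index the line offsets once, then seek on demand, so only one line is ever in memory per get(). Built inside the closure of a ParallelDatasetIterator, this runs on the worker threads, and each epoch simply re-iterates the dataset.

local offsets = {}
do
   local f = assert(io.open('train.txt', 'r'))
   local pos = 0
   for line in f:lines() do
      table.insert(offsets, pos)
      pos = pos + #line + 1 -- +1 for the newline (Unix line endings assumed)
   end
   f:close()
end

local dataset = tnt.ListDataset{
   list = torch.range(1, #offsets):long(),
   load = function(idx)
      local f = assert(io.open('train.txt', 'r'))
      f:seek('set', offsets[idx])
      local line = f:read('*l')
      f:close()
      return {input = line} -- parse the line into tensors as needed
   end,
}

Opening the file on every get() is simple but slow; keeping one handle per thread (opened in the iterator's init) would avoid that.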

IndexedDataset using string as index for large dataset

Hi !

I would like to extract features from a large dataset of images and store them in memory.
IndexedDataset provides a really nice way to handle large datasets; however, indexes must be integers:
first_tensor = dataset:get(1)

For now, I am using a tds.Hash to store the image names as keys and the corresponding indexes as values. Thus I am doing:
first_tensor = dataset:get(tdshash[first_image_name])

Do you know a better way to handle this using torchnet?

Thank you for your precious help.

Greetings,
Remi
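A hedged alternative sketch: since tnt.ListDataset accepts a plain Lua table as its list, the string-keyed lookup can live inside the dataset itself (the names below are hypothetical; tdshash and dataset are from the snippet above):

local names = {'img_0001.jpg', 'img_0002.jpg'} -- hypothetical image names
local byname = tnt.ListDataset{
   list = names,
   load = function(name)
      return dataset:get(tdshash[name])
   end,
}
local first_tensor = byname:get(1) -- same as dataset:get(tdshash[names[1]])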

problems on the mnist example

@lvdmaaten I did not install the mnist dataset into my torch directory using the luarocks install mnist command. Rather, I downloaded mnist to a separate place path_to_mnist/, then tried to modify the path in the example code provided by the repo. I inserted the code at https://github.com/torchnet/torchnet/blob/master/example/mnist.lua#L11 like below:

local mnist_pkg_path = '/home/jack/public_datasets/?/init.lua'
-- concatenate to the package path
package.path = mnist_pkg_path .. ';' .. package.path

and left the remaining code of mnist.lua the same.
However, I always get an error and don't know why. Can you give me some tips? Besides, other methods, like directly modifying the path in the require command here, also failed.

returning vector in ListDataset problem.

In the latest torchnet, I tried to run example/mnist.lua with the criterion changed to MSECriterion.
Following that, I made the target a vector, like this:

target = torch.LongTensor(1,10) or target = torch.LongTensor(10, 1)
instead of
target = torch.LongTensor{dataset.label[idx] + 1}
(just to check whether it works, I didn't assign the label.)

but it fails.

I suspect that returning a vector in getIterator(mode) {...}
(specifically, that the 'load' closure in ListDataset cannot return a vector as target)
does not work.

Do you have a plan to fix this some day?

Problem when using multiple GPUs

Sorry for posting long code here, but it is a pretty weird problem and this is the simplest example with which I can reproduce it.

I'm trying to use torchnet's ParallelDatasetIterator, but ran into a problem when using multiple GPUs. For the following code, the output should not depend on torchnet, since the torchnet code is never actually used.

The program works normally when using a single GPU:

$ th train_debug.lua -nGPU 1
Number of parameters:   11184650    
iteration 1: loss=2.289696  
iteration 2: loss=2.209115  
iteration 3: loss=2.136883  
iteration 4: loss=2.071370  
iteration 5: loss=2.011046

However, when using 2 GPUs, the loss goes to nan after the first iteration.

$ th train_debug.lua -nGPU 2
Number of parameters:   11184650    
iteration 1: loss=2.952482  
iteration 2: loss=nan   
iteration 3: loss=nan   
iteration 4: loss=nan   
iteration 5: loss=nan

Although the code should not really depend on torchnet, when I comment out the torchnet-related code, everything works normally again.

Code:

require 'nn'
require 'cunn'
require 'cudnn'
local tnt = require 'torchnet'
local optim = require 'optim'

cmd = torch.CmdLine()
cmd:option('-nGPU', 1, 'GPU ID (only using cuda)')
cmd:option('-learning_rate', 1e-5, 'lr')
opt = cmd:parse(arg)

function dpt_model(nGPU, model)
  if nGPU > 1 then
    local gpus = torch.range(1, nGPU):totable()

    model = nn.DataParallelTable(1, true, false)
      :add(model, gpus)
      :threads(function()
        local cudnn = require 'cudnn'
      end)
    model.gradInput = nil
  end
  return model:cuda()
end


------------------------------------
-- start of torchnet related code
-- Create torchnet ParallelDatasetIterator, but not using it
local index = {}
for i=1,10000 do table.insert(index, {path='12345'}) end 
get_data_iterator_func = function ()
  return tnt.ParallelDatasetIterator{
    nthread = 8,
    init    = function() 
      require 'torchnet'
      torch.setdefaulttensortype('torch.FloatTensor')
    end,
    closure = function()
      return tnt.ListDataset{
        list = index,
        load = function (data)
          return {
            input = torch.zeros(3,224,224),
            label = torch.LongTensor{1},
          }
        end
      }:batch(16, 'skip-last')
    end
  }
end
get_data_iterator = get_data_iterator_func()
data_iterator = get_data_iterator()
-- end of torchnet related code
------------------------------------


-- modify the network a little bit (to trigger the error)
local model = torch.load('models/resnet-18.t7')
model:remove(#model)
model:add(nn.Linear(512, 10):cuda())

model = dpt_model(opt.nGPU, model)
criterion = nn.CrossEntropyCriterion():cuda()

local params, grad_params = model:getParameters()
print('Number of parameters: ', params:size(1))

-- same data in each iteration
local x = opt.nGPU > 1 and cutorch.createCudaHostTensor() or torch.CudaTensor()
local y = torch.ones(16):cuda()
x = x:resize(16,3,224,224):normal(0, 1)

function feval(xx)
  if xx ~= params then params:copy(xx) end
  grad_params:zero()
  model:training()
  local outputs = model:forward(x)
  local f = criterion:forward(outputs, y)
  local df_do = criterion:backward(outputs, y)
  model:backward(x, df_do)
  return f, grad_params
end


local optim_state = {learningRate=opt.learning_rate}
for i = 1,5 do
  local __, loss = optim.adam(feval, params, optim_state)
  cutorch.synchronize()
  print(('iteration %d: loss=%f'):format(i, loss[1]))
end

models/resnet-18.t7 is downloaded from the pretrained models of fb.resnet.torch. Very weirdly, the error is only triggered when the model file is put in the subdirectory. I'm able to reproduce the error on two different machines. Occasionally it runs without any problem on one of the machines I tested on, but the problem occurs most of the time.

It may also be that I'm using DataParallelTable incorrectly. Please let me know if this is the case.

Thanks in advance!

Error using MultiLabelConfusionMeter()

Hi,

I am running the Torch demo face detector code with a different dataset - one that has 158 classes. Here is a snippet from my train.lua file.

local tnt = require 'torchnet' 

local confusion = tnt.MultiLabelConfusionMeter{k = opt.numClasses}

       --create closure to evaluate f(X) and df/dX
       local eval_E = function(w)
       for i = 1,opt.batchSize do 
          confusion:add(y[i],yt[i])
       end

Etc.

When I run the code, the error is:

/home/uni/torch/install/bin/luajit: /home/uni/torch/install/share/lua/5.1/torch/Tensor.lua:462: Wrong size for view. Input size: 158. Output size: 1x1
stack traceback:
[C]: in function 'error'
/home/uni/torch/install/share/lua/5.1/torch/Tensor.lua:462: in function 'view'
...hare/lua/5.1/torchnet/meter/multilabelconfusionmeter.lua:80: in function 'add'
./train.lua:150: in function 'opfunc'
/home/uni/torch/install/share/lua/5.1/optim/sgd.lua:44: in function 'sgd'
./train.lua:158: in function 'train'
run.lua:76: in main chunk
[C]: in function 'dofile'
.../uni/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670

Can you please help?

I looked into the code and the problem seems to be coming from the following snippet from multilabelconfusionmeter.lua:

if output:nDimension() == 1 then
    output = output:view(1, output:size(1))
 end
if target:nDimension() == 1 then
    target = target:view(1, output:size(1))
end

Since my output and target are 158-dimensional tensors, reshaping the output changes output:size(1) from 158 to 1, so when the target is then reshaped using output:size(1), the error arises.

Thank you.
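A hedged fix sketch following that analysis: reshape each tensor using its own length, so the target's view no longer depends on the already-reshaped output.

if output:nDimension() == 1 then
   output = output:view(1, output:size(1))
end
if target:nDimension() == 1 then
   target = target:view(1, target:size(1)) -- use target's own size
end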

Error when requiring torchnet

When doing

tnt = require 'torchnet'

I'm getting the following error:

/Users/marioyc/torch/install/bin/lua: /Users/marioyc/torch/install/share/lua/5.2/trepl/init.lua:384: /Users/marioyc/torch/install/share/lua/5.2/trepl/init.lua:384: ...rs/marioyc/torch/install/share/lua/5.2/sundown/ascii.lua:227: attempt to index global 'bit' (a nil value)
stack traceback:
    [C]: in function 'error'
    /Users/marioyc/torch/install/share/lua/5.2/trepl/init.lua:384: in function 'require'
    torchnet_test.lua:3: in main chunk
    [C]: in function 'dofile'
    ...ioyc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: in ?

I've been able to reduce it to the fact that the following fails:

local sdascii

pcall(function()
         sdascii = require 'sundown.ascii'
      end)
local test_str = "- "
sdascii.render(test_str)

and this ends up producing an error when requiring for example batchdataset.

allocated memory estimation.

How can I estimate the amount of allocated memory?

I'm using a single GPU and a dataset loaded into memory from a *.t7 file.
I observed the following cases (number of threads: 1):

case 1: dataset is 13GB, datatype is double, allocated memory: 13GB
case 2: dataset is 7GB (a half-sized one of case 1), datatype is double, allocated memory: 55GB
case 3: dataset is 13GB, datatype is float, allocated memory: 13GB
case 4: dataset is 7.7GB, datatype is float, allocated memory: 24GB

I don't see the relation. (I expected it to be linear.)

FATAL THREAD PANIC after resuming training

For some reason ParallelDatasetIterator throws a "FATAL THREAD PANIC" error after I torch.load() (resume) a network from disk. It looks for a custom dataset class that is included in the init() and is found when simply running without resuming from a saved model. Any ideas?

fatal thread panic on parallelDatasetIterator

I get this message:

FATAL THREAD PANIC: (read) /Users/genovese/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <package.torchnet>

This is the relevant code:

local tnt = require 'torchnet'

function getIterator(dataset)
    return tnt.ParallelDatasetIterator{
        init    = function()
            tnt = require 'torchnet'
            optParser = require 'opts'
            opt = optParser.parse(arg)
        end,
        nthread = opt.nThreads,
        closure = function(dataset)
            return tnt.DatasetIterator{
                dataset = tnt.BatchDataset{
                    batchsize = opt.batchsize,
                    dataset = dataset
                }
            }
        end
    }
end

How to shuffle a dataset after ParallelDatasetIterator?

Thanks in advance :)

local tnt = require 'torchnet'

local d = tnt.TableDataset{data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}

local iterator = tnt.ParallelDatasetIterator{
    nthread = 3,
    init = function() require 'torchnet' end,
    closure = function()
        return d
    end,
    ordered = true
}

print('print 1')
for sample in iterator() do
    print(sample)
end

-- executed on the main thread
d = d:shuffle()

print('print 2')
for sample in iterator() do
    print(sample)
end

print('print 3')
for sample in d:iterator()() do
    print(sample)
end

-- executed on each thread, so this returns a table
-- containing the 3 per-thread datasets to the main thread.
-- but how to update d on the threads?
iterator:exec('shuffle')

print('print 4')
for sample in iterator() do
    print(sample)
end

evaluation mode for models

For models that contain layers like batch normalization or dropout, does the API take care of calling model:evaluate() in the test engine, or is it the user's responsibility to take care of this?
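As far as I can tell, the engines leave this to the user; a common hook-based sketch (state.training is assumed to be set by the train/test engines):

engine.hooks.onStart = function(state)
   if state.training then
      state.network:training()  -- enable dropout, batch-norm updates
   else
      state.network:evaluate() -- switch to inference behavior
   end
end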

Impossible to save a Dataset when using torch/image

Please try the following code.

local tnt = require 'torchnet'
local image = require 'image'

local mode = 'train'
local mnist = require 'mnist'
local dataset = mnist[mode .. 'dataset']()

local listdataset = tnt.ListDataset{ 
   list = torch.range(1, dataset.data:size(1)):long(),
   load = function(idx)
      return {
         input  = image.scale(dataset.data[idx],10,10),
         target = torch.LongTensor{dataset.label[idx] + 1},
      } 
   end,
}

torch.save('image.scale', image.scale)     -- works
torch.save('listdataset.tnt', listdataset) -- doesn't work
torch.save('image', image)                 -- doesn't work
/home/cadene/torch/install/bin/luajit: /home/cadene/torch/install/share/lua/5.1/torch/File.lua:141: Unwritable object <function> at <?>.tnt.ListDataset.load.image.float.scaleBicubic
stack traceback:
    [C]: in function 'error'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:141: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:200: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:220: in function 'writeObject'
    /home/cadene/torch/install/share/lua/5.1/torch/File.lua:388: in function 'save'
    src/bugserialize.lua:18: in main chunk
    [C]: in function 'dofile'
    ...dene/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405ce0

Improve ParallelDatasetIterator documentation

Hey,
This is not really an issue; I would like to have more details about how this iterator handles the threading. I ran into situations where the original dataset was loaded n times, where n is the number of threads. What I would like to do is load my dataset on the fly and let torchnet do the multitasking. The problem is that because I don't understand how and what torchnet will multitask, I can't write my iterator efficiently.

I tried to get my hands into the code, but it's really hard to understand how Threads operates. I hope someone will find the time to answer this question. Maybe we can improve the documentation to make it less error-prone 😃

Installation error on OS X 10.11 El Capitan with Xcode 8.0

When installing torchnet via luarocks on OS X 10.11 El Capitan, I receive the below error. This error appears to occur because the installation is searching for MacOSX10.11.sdk when only MacOSX.sdk and MacOSX10.12.sdk are installed by default with Xcode 8.0, even though I am running OS X 10.11. Supposedly, this is due to the issue of the latest Xcode only shipping with the latest SDK.

Scanning dependencies of target tds
[ 16%] Building C object CMakeFiles/tds.dir/tds_utils.c.o
[ 33%] Building C object CMakeFiles/tds.dir/tds_elem.c.o
[ 50%] Building C object CMakeFiles/tds.dir/tds_hash.c.o
[ 66%] Building C object CMakeFiles/tds.dir/tds_vec.c.o
[ 83%] Building C object CMakeFiles/tds.dir/tds_atomic_counter.c.o
make[2]: *** No rule to make target `/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Accelerate.framework', needed by `libtds.so'.  Stop.
make[1]: *** [CMakeFiles/tds.dir/all] Error 2
make: *** [all] Error 2

Error: Failed installing dependency: https://raw.githubusercontent.com/torch/rocks/master/tds-scm-1.rockspec - Build error: Failed building.

In the meantime, I will address the issue by downgrading Xcode to 7.3.1.

Saving network after epoch

I try to save my network after each epoch by doing:

engine.hooks.onEndEpoch = function(state)
        state.network:clearState()
        torch.save('model.t7', state.network)
end

When continuing with a new epoch, I get the following error:

$ Error: cuda runtime error (77) : an illegal memory access was encountered at /users/visics/dneven/torch_recent/extra/cutorch/lib/THC/generic/THCStorage.c:158
THCudaCheck FAIL file=/users/visics/dneven/torch_recent/extra/cutorch/lib/THC/generic/THCStorage.c line=158 error=77 : an illegal memory access was encountered

Should I use a different method to clear my model before saving?
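A hedged workaround sketch: serialize a cleared float copy so the live CUDA network's internal buffers are never touched mid-training (clone-then-convert is an assumption, not a confirmed fix for this crash):

engine.hooks.onEndEpoch = function(state)
   -- copy, move to CPU, and clear buffers on the copy only
   local copy = state.network:clone():float():clearState()
   torch.save('model.t7', copy)
end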

Why was the software history not kept?

Hi there,

I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like torchnet, deleted their software history.

9b7759d

Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask the torchnet developers the following four brief questions:

  1. Why did you decide not to keep the software history?
  2. Did the core developers face any kind of problems when trying to refer to the old history? If so, how did they solve them?
  3. Did newcomers face any kind of problems when trying to refer to the old history? If so, how did they solve them?
  4. How did the lack of history impact the software's evolution? Did it place any burden on understanding and evolving the software?

Thanks in advance for your collaboration,

Gustavo Pinto, PhD
http://www.gustavopinto.org

Fail to require torchnet when debugging using eclipseLDT

I am a bit new to torch. I set up eclipseLDT according to http://www.lighting-torch.com/2015/07/27/configuring-eclipse-with-torch/

I can debug code without requiring torchnet, but once I require torchnet, I get the following errors:

qlua: .../torch/install/share/lua/5.1/argcheck/usage.lua:83: bad argument #2 to 'isatty' (FILE* expected, got table)
stack traceback:
[C]: at 0x7f30ab2179c0
[C]: in function 'isatty'
.../torch/install/share/lua/5.1/argcheck/usage.lua:83: in function 'render'
.../torch/install/share/lua/5.1/argcheck/init.lua:102: in function 'argcheck'
.../torch/install/share/lua/5.1/torchnet/utils/table.lua:35: in main chunk
[C]: in function 'require'
.../torch/install/share/lua/5.1/torchnet/utils/init.lua:23: in main chunk
[C]: in function 'require'
.../torch/install/share/lua/5.1/torchnet/transform.lua:12: in main chunk
[C]: in function 'require'
...h/install/share/lua/5.1/torchnet/dataset/listdataset.lua:13: in main chunk
[C]: in function 'require'
.../torch/install/share/lua/5.1/torchnet/init.lua:70: in main chunk
[C]: in function 'require'
.../LDT_workspace/test1/src/main.lua:14: in main chunk

Main documentation not updated

tnt.SplitDataset treats partition values as fractions only when they are < 1; otherwise it treats them as absolute partition sizes.

The documentation states:
The sum of the partition weights may or may not sum to one (tnt.SplitDataset will make them sum to one!).

This is confusing: if someone gives partitions as percentages, like
{ train = 70, test = 30 }, the 70 and 30 are treated as absolute values rather than fractions.

Edit: also, the partition weights should either sum to 1 or be exact sizes.
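The behavior described above, side by side (a minimal sketch):

d:split{train = 0.7, val = 0.3} -- values < 1: treated as fractions
d:split{train = 70,  val = 30}  -- values >= 1: treated as absolute partition sizes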
