apple / turicreate

Turi Create simplifies the development of custom machine learning models.

License: BSD 3-Clause "New" or "Revised" License

CMake 0.68% Shell 0.27% Python 17.69% Dockerfile 0.04% C 0.24% C++ 63.24% Objective-C++ 1.78% Objective-C 1.26% Makefile 0.03% HTML 0.09% CSS 0.81% JavaScript 8.76% Swift 3.89% SCSS 0.10% Cython 1.12%
machine-learning deep-learning python

turicreate's Introduction

Quick Links: Installation | Documentation

Turi Create

Turi Create simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app.

  • Easy-to-use: Focus on tasks instead of algorithms
  • Visual: Built-in, streaming visualizations to explore your data
  • Flexible: Supports text, images, audio, video and sensor data
  • Fast and Scalable: Work with large datasets on a single machine
  • Ready To Deploy: Export models to Core ML for use in iOS, macOS, watchOS, and tvOS apps

With Turi Create, you can accomplish many common ML tasks:

ML Task                   | Description
--------------------------+-----------------------------------------------------------
Recommender               | Personalize choices for users
Image Classification      | Label images
Drawing Classification    | Recognize Pencil/Touch Drawings and Gestures
Sound Classification      | Classify sounds
Object Detection          | Recognize objects within images
One Shot Object Detection | Recognize 2D objects within images using a single example
Style Transfer            | Stylize images
Activity Classification   | Detect an activity using sensors
Image Similarity          | Find similar images
Classifiers               | Predict a label
Regression                | Predict numeric values
Clustering                | Group similar datapoints together
Text Classifier           | Analyze sentiment of messages

Example: Image classifier with a few lines of code

If you want your app to recognize specific objects in images, you can build your own model with just a few lines of code:

import turicreate as tc

# Load data 
data = tc.SFrame('photoLabel.sframe')

# Create a model
model = tc.image_classifier.create(data, target='photoLabel')

# Make predictions
predictions = model.predict(data)

# Export to Core ML
model.export_coreml('MyClassifier.mlmodel')

It's easy to use the resulting model in an iOS application:

Supported Platforms

Turi Create supports:

  • macOS 10.12+
  • Linux (with glibc 2.10+)
  • Windows 10 (via WSL)

System Requirements

Turi Create requires:

  • Python 2.7, 3.5, 3.6, 3.7, 3.8
  • x86_64 architecture
  • At least 4 GB of RAM

Installation

For detailed instructions for different varieties of Linux see LINUX_INSTALL.md. For common installation issues see INSTALL_ISSUES.md.

We recommend using virtualenv to use, install, or build Turi Create.

pip install virtualenv

Installing Turi Create follows the standard Python package installation steps. To create and activate a Python virtual environment called venv, follow these steps:

# Create a Python virtual environment
cd ~
virtualenv venv

# Activate your virtual environment
source ~/venv/bin/activate

Alternatively, if you are using Anaconda, you may use its virtual environment:

conda create -n virtual_environment_name anaconda
conda activate virtual_environment_name

To install Turi Create within your virtual environment:

(venv) pip install -U turicreate
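
A quick sanity check that the installation worked, run inside the activated environment:

import turicreate as tc
print(tc.__version__)  # should print the installed version without raising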

Documentation

The package User Guide and API Docs contain more details on how to use Turi Create.

GPU Support

Turi Create does not require a GPU, but certain models can be accelerated 9-13x by utilizing a GPU.

Linux                     | macOS 10.13+         | macOS 10.14+ discrete GPUs, macOS 10.15+ integrated GPUs
--------------------------+----------------------+----------------------------------------------------------
Activity Classification   | Image Classification | Activity Classification
Drawing Classification    | Image Similarity     | Object Detection
Image Classification      | Sound Classification | One Shot Object Detection
Image Similarity          | Style Transfer       |
Object Detection          |                      |
One Shot Object Detection |                      |
Sound Classification      |                      |
Style Transfer            |                      |

macOS GPU support is automatic. For Linux GPU support, see LinuxGPU.md.

Building From Source

If you want to build Turi Create from source, see BUILD.md.

Contributing

Prior to contributing, please review CONTRIBUTING.md and do not provide any contributions unless you agree with the terms and conditions set forth in CONTRIBUTING.md.

We want the Turi Create community to be as welcoming and inclusive as possible, and have adopted a Code of Conduct that we expect all community members, including contributors, to read and observe.

turicreate's People

Contributors

abhishekpratapa, afranklin, brettkoonce, brtal, delirious-lettuce, dhivyaaxim, emannueloc, esamanas, fareeha, frogg, guihao-liang, gustavla, igiloh, jakesabathia2, jamesdale, kant, karthikdash, martinjrp, mordoron, nickjong, saba9, shantanuchhabra, shreyajain17, shreyasvj25, srikris, syoutsey, tapaswenipathak, tobyroseman, yhoonkim, znation


turicreate's Issues

Appending a _1 in the classes generated by Xcode given a model name with '-'

Hi

I've created a model using turicreate named 't-shirt_detector.mlmodel', and when I drag it into Xcode it generates a class called t_shirt_detector_1, which is crashing due to the initializer:

convenience init() {
        let bundle = Bundle(for: t_shirt_detector_1.self)
        let assetPath = bundle.url(forResource: "t_shirt_detector_1", withExtension:"mlmodelc")
        try! self.init(contentsOf: assetPath!)
    }

doesn't play nice with pyodbc

Tested in Jupyter

import pyodbc
import turicreate as tc

with open('turi_auth.txt', 'r') as auth:
    conn = pyodbc.connect(auth.read())

data = tc.SFrame.from_sql(conn, "SELECT * FROM mytable", dbapi_module=pyodbc)
print data

Error message

ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 0))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-7745e517349c> in <module>()
     30 
     31 WHERE v.site_street_number != ''
---> 32 """, dbapi_module=pyodbc)
     33 data

/home/user/.venvs/env/local/lib/python2.7/site-packages/turicreate/data_structures/sframe.pyc in from_sql(cls, conn, sql_statement, params, type_inference_rows, dbapi_module, column_type_hints, cursor_arraysize)
   1814             sb = SFrameBuilder(result_types, column_names=result_names)
   1815 
-> 1816         sb.append_multiple(_force_cast_sql_types(temp_vals, result_types, cols_to_force_cast))
   1817         rows = c.fetchmany()
   1818         while len(rows) > 0:

/home/user/.venvs/env/local/lib/python2.7/site-packages/turicreate/data_structures/sframe_builder.pyc in append_multiple(self, data, segment)
    156         if hasattr(data, '__len__'):
    157             if len(data) <= self._block_size:
--> 158                 self._builder.append_multiple(data, segment)
    159                 return
    160 

turicreate/cython/cy_sframe_builder.pyx in turicreate.cython.cy_sframe_builder.UnitySFrameBuilderProxy.append_multiple()

turicreate/cython/cy_sframe_builder.pyx in turicreate.cython.cy_sframe_builder.UnitySFrameBuilderProxy.append_multiple()

turicreate/cython/cy_flexible_type.pyx in turicreate.cython.cy_flexible_type.flex_list_from_iterable()

turicreate/cython/cy_flexible_type.pyx in turicreate.cython.cy_flexible_type.common_typed_flex_list_from_iterable()

turicreate/cython/cy_flexible_type.pyx in turicreate.cython.cy_flexible_type.tr_buffer_to_flex_list()

TypeError: Could not convert python object with type Row to flex list.

SFrame head method is returning [n+1] rather than [n] rows

I'm currently using the SFrame object to load data from a csv file and it's working perfectly fine.

However, when I call the head(n) method to retrieve the first n rows (as mentioned in the API docs), the result is always n+1.

I forked this project and edited the unity_sframe::head(size_t nrows) method in the src/unity/lib/unity_sframe.cpp file, by subtracting 1 from nrows in the for loop.

But it doesn't seem to be working properly. So I wanted to ask: is this the correct file that handles the head method in the Python package?

Great work by the way!
I'm very excited to create new CoreML models using this amazing tool :)
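
A minimal repro sketch of the behavior described above (the data and the count of 3 are illustrative):

import turicreate as tc

sf = tc.SFrame({'x': list(range(10))})
print(sf.head(3).num_rows())  # expected 3; the report says one extra row comes back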

sqlalchemy connection & pandas `object` dtype

It seems turicreate doesn't like sqlalchemy connections. Fair enough. I can connect to it using pandas instead and then create an SFrame from it.

Turns out, though, that in pandas string types are represented under the umbrella of dtype(O). And turicreate doesn't recognise these as strings.

pandas and sqlalchemy play really nicely with each other, and I would wager there are a lot of people in the community who use them in conjunction to simplify fast prototyping. Is this problem something you would like to look into?
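
One possible workaround sketch while waiting for native support: read through pandas/sqlalchemy and cast the object columns to plain strings before building the SFrame. The connection string and table name below are placeholders, and note the cast also turns NaN into the string 'nan':

import pandas as pd
import turicreate as tc
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:password@host/db')
df = pd.read_sql('SELECT * FROM mytable', engine)

# Cast pandas `object` columns to str so SFrame type inference sees strings
for col in df.columns[df.dtypes == object]:
    df[col] = df[col].astype(str)

sf = tc.SFrame(data=df)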

SArray.read_json() should be able to read an array

The following code fails:

import turicreate as tc
import json

with open('/tmp/test.json', 'w') as f:
    json.dump([1,2,3], f)

tc.SArray.read_json('/tmp/test.json')

with this error message:

Parsing JSON records from /tmp/test.json
/tmp/test.json has non-dictionary elements which are ignored.
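
A workaround sketch until read_json handles top-level arrays: parse the file with the standard json module and build the SArray directly:

import json
import turicreate as tc

with open('/tmp/test.json') as f:
    sa = tc.SArray(json.load(f))

print(sa)  # [1, 2, 3]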

Visualization support on Linux

In the 4.0.0 release, native visualization support (.show(), .explore()) is only for macOS. We should support Linux as well.

unable to read from/write to a non public S3 bucket

It seems it is not possible to use the turicreate.aws.set_credentials() method described in the documentation: https://apple.github.io/turicreate/docs/api/generated/turicreate.SFrame.html

There is no aws object within turicreate.
It also seems not possible to use environment variables, whether set in the shell environment or set before calling the load_sframe method.

This is the code I'm using to get an SFrame from S3.
It works well with the sframe library, but fails with turicreate:

def getSFrame(s3SFramePath):
    import os
    import turicreate as tc
    global S3_ACCESS_KEY, S3_SECRET_KEY
    os.environ['AWS_ACCESS_KEY_ID'] = S3_ACCESS_KEY
    os.environ['AWS_SECRET_ACCESS_KEY'] = S3_SECRET_KEY
    return tc.load_sframe(s3SFramePath)
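
One possible workaround sketch, assuming boto3 is available: copy the .sframe directory out of S3 yourself and load it locally, so Turi Create's own S3 handling is never involved. Bucket, prefix, and paths below are placeholders, and the exist_ok argument needs Python 3:

import os
import boto3
import turicreate as tc

def load_sframe_from_s3(bucket, prefix, local_dir='/tmp/remote.sframe'):
    # boto3 reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment
    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            if obj['Key'].endswith('/'):
                continue  # skip directory placeholder keys
            target = os.path.join(local_dir, os.path.relpath(obj['Key'], prefix))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, obj['Key'], target)
    return tc.load_sframe(local_dir)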

error running `build.sh` on OSX 10.12.6

error message

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "turicreate/__init__.py", line 30, in <module>
    from turicreate.data_structures.sgraph import Vertex, Edge
  File "turicreate/data_structures/__init__.py", line 18, in <module>
    from . import sframe
  File "turicreate/data_structures/sframe.py", line 16, in <module>
    from ..connect import main as glconnect
  File "turicreate/connect/main.py", line 13, in <module>
    from ..cython.cy_unity import UnityGlobalProxy
ImportError: dlopen(turicreate/cython/cy_unity.so, 2): Symbol not found: _libssh2_agent_connect
  Referenced from: /turicreate/debug/src/unity/python/turicreate/cython/../libunity_shared.dylib
  Expected in: flat namespace
 in /turicreate/debug/src/unity/python/turicreate/cython/../libunity_shared.dylib

Support for audio use cases

In the release notes you mention flexible data formats including audio. I see lots of information for images, but are there any docs related to reading or clustering mp3/wav files or dealing with spectrograms?
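
For reference, later Turi Create releases (5.x and up) added a sound classifier toolkit that works on directories of WAV files (MP3 is not handled); a minimal sketch, with illustrative paths and labels:

import turicreate as tc

# load_audio walks the directory and returns an SFrame with 'path' and 'audio' columns
data = tc.load_audio('./sounds')
data['label'] = data['path'].apply(lambda p: p.split('/')[-2])  # label = parent folder name

model = tc.sound_classifier.create(data, target='label', feature='audio')
model.export_coreml('MySoundClassifier.mlmodel')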

Add support for macOS 10.11

It would be nice if Turi Create could be used on macOS 10.11. Functions that rely on coremltools would have to fail gracefully (e.g. export_coreml). Currently coremltools==0.6.3 is a hard dependency, and they do not provide wheels for 10.11.

Exporting Turi model in other ways (for example MXnet)

I was wondering if there is a way (other than Core ML) to export models out of Turi Create to other systems. As part of learning neural networks, I wanted to see if I could deploy a Turi Create model to a RESTful interface with AWS Lambda, but I have yet to see a good way to export the (I assume) underlying MXNet model, or a way to save the Turi model in a more friendly format. Am I missing something here?

Will Turi get GPU support for AMD's ROCm software platform?

I see that the AMD Vega GPU in the new iMac Pro is being marketed as excellent for machine learning with AMD's new ROCm software platform, which apparently has TensorFlow support coming.

Will Turi Create be able to take advantage of this GPU with an iMac Pro?

No way to return json ?

Hi guys, I know you can export to a JSON file, but the application I am building includes an API. It seems a bit overkill to write a JSON file and then read it back to return the response. Is there any functionality to simply receive the data as JSON, or convert it to JSON, without having to save it to a file?

TIA.
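
A sketch of one way to get JSON without touching the filesystem: SFrame rows iterate as dictionaries, so they can be serialized in memory (the data here is illustrative):

import json
import turicreate as tc

sf = tc.SFrame({'id': [1, 2], 'score': [0.9, 0.4]})

# Each row behaves like a dict, so the whole frame becomes a JSON array of objects
payload = json.dumps([dict(row) for row in sf])
print(payload)  # e.g. [{"id": 1, "score": 0.9}, {"id": 2, "score": 0.4}]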

Turi Create Visualisation often stuck on "Loading" screen

I don't know what additional information to provide here, as it's happened in a number of different situations. With a large or very small data set, Turi Create Visualisation remains stuck on the "Loading" screen and never loads. Sometimes I can run the same code again, and it will load almost instantly with all data shown. But often, it'll not move past that initial screen. Sometimes re-running it works.

Right now I'm simply following the Image Classification and Object Detection tutorials, so not doing anything particularly advanced. This occurred when calling explore() after creating the initial cats/dogs data set, and even when slimming that data set down to just 6 images.

Let me know if you need more specifics.

String formatting errors

raise ValueError("Row %d does not have the correct number of rows (%d, should be %d)"

# current ^^^
for i in range(n):
    if lcd.input_keys[0][i].size() != n_keys:
        raise ValueError("Row %d does not have the correct number of rows (%d, should be %d)"
                         % (lcd.input_keys[0][i].size(), n))

# the problem is that there are only two arguments provided but the string is expecting three
>>> raise ValueError("Row %d does not have the correct number of rows (%d, should be %d)"
...                  % (22, 44))
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: not enough arguments for format string

# if I am reading this correctly, shouldn't the first argument in the tuple be `i`?
raise ValueError("Row %d does not have the correct number of rows (%d, should be %d)"
                 % (i, lcd.input_keys[0][i].size(), n))
                    ^
                    ^
                    ^

raise RuntimeError("Cannot find tvmrpc.xcodeproj in %s," +

>>> rpc_root = 'TEST'
>>> raise RuntimeError("Cannot find tvmrpc.xcodeproj in %s," +
...                    (" please set env TVM_IOS_RPC_ROOT correctly" % rpc_root))
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: not all arguments converted during string formatting

# this error can be fixed by removing two brackets and the `+` sign
>>> raise RuntimeError("Cannot find tvmrpc.xcodeproj in %s,"
...                    " please set env TVM_IOS_RPC_ROOT correctly" % rpc_root)
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
RuntimeError: Cannot find tvmrpc.xcodeproj in TEST, please set env TVM_IOS_RPC_ROOT correctly

{} [sub_directory] [Optional output xunit xml path prefix]

# current ^^^
def print_help():
    print("""\
{} [sub_directory] [Optional output xunit xml path prefix]

This test runs the scenario test located in the sub directory.
Every python file in the sub directory will be executed against pytest.

Optionally, a setup.py file may exist inside the subdirectory in which case
setup.py is effectively "sourced" before any of the tests are run. This allows
setup.py to modify environment variables which will be picked up by the
tests.

For instance, if the PATH variable is changed in setup.py a different python
environment may be used to run pytest
""" % sys.argv[0])

# the problem above is that the new style `{}` placeholder has been used instead of `%s`
>>> '{}' % 'TEST'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
>>> '%s' % 'TEST'
'TEST'
>>> '{}'.format('TEST')
'TEST'

# since the project seems to prefer the old style VS new style,
# simply replacing `{}` with `%s` should fix this error
def print_help():
    print("""\
%s [sub_directory] [Optional output xunit xml path prefix]
^^
^^
^^

Instructions don't work. Can't create Core ML model, crashes

So I'm trying to make my first model. First, the installation instructions are incorrect: if you follow them to the T, they fail with an unhelpful error on pip install -U turicreate. It turns out this project requires you to downgrade to an ancient Python version. How about the installation instructions include that detail, or better yet, make it work with the current version that ships on all Macs?

Ok, so I worked around that, and started on replicating https://apple.github.io/turicreate/docs/userguide/image_classifier/introduction.html to make sure everything is working properly before I go through the work of gathering real training data.

I downloaded the following images into an images folder, with a subfolder each for dogs and tractors:
dog_urls.txt
tractor_urls.txt

I copy/pasted the sample code into a new Python file and just changed the paths.

import turicreate as tc

# Load images
sf = tc.image_analysis.load_images('images', with_path=True)

# From the path-name, create a label column
data['label'] = data['path'].apply(lambda path: 'dog' if 'dogs' in path else 'tractor')

# Save the data for future use
data.save('dogs-tractors.sframe')

# Explore interactively
data.explore()

Then I go to run it with python <file_just_created>.py, and it fails because sf = tc.image_analysis.load_images('train', with_path=True) writes to sf but the rest of the file expects it to be data. Another incorrect doc, but easy enough to fix. I changed sf to data and it runs, and I can see that the SFrame matches what I'd expect. Now I copy and paste the next code snippet from the guide, just changing the file names.


# Load the data
data =  tc.SFrame('dogs-tractors.sframe')

# Make a train-test split
train_data, test_data = data.random_split(0.8)

# Automatically picks the right model based on your data.
model = tc.image_classifier.create(train_data, target='label')

# Save predictions to an SArray
predictions = model.predict(test_data)

# Evaluate the model and save the results into a dictionary
metrics = model.evaluate(test_data)
print(metrics['accuracy'])

# Save the model for later use in Turi Create
model.save('mymodel.model')

# Export for use in Core ML
model.export_coreml('MyCustomImageClassifier.mlmodel')

It goes off and processes for a few minutes as expected, then fails in the export to CoreML step:

python export.py 
[09:35:58] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[09:35:58] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
Resizing images...
Performing feature extraction on resized images...
Completed 340/340
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
WARNING: Detected extremely low variance for feature(s) '__image_features__' because all entries are nearly the same.
Proceeding with model training using all features. If the model does not provide results of adequate quality, exclude the above mentioned feature(s) from the input dataset.
Logistic regression:
--------------------------------------------------------
Number of examples          : 324
Number of classes           : 2
Number of feature columns   : 1
Number of unpacked features : 2048
Number of coefficients    : 2049
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 7        | 0.000046  | 1.121025     | 0.672840          | 0.625000            |
| 2         | 9        | 1.000000  | 1.173264     | 0.944444          | 0.937500            |
| 3         | 10       | 1.000000  | 1.210695     | 0.996914          | 1.000000            |
| 4         | 11       | 1.000000  | 1.242987     | 0.996914          | 1.000000            |
| 5         | 12       | 1.000000  | 1.277638     | 0.990741          | 1.000000            |
| 6         | 14       | 1.000000  | 1.323503     | 0.996914          | 1.000000            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.
0.987654320988
Fatal Python error: PyThreadState_Get: no current thread
Abort trap: 6

I know it's getting past the model.save('mymodel.model') line fine, as I can see that file being created, and I have even been able to open it in another Python script successfully.

How do you actually get this thing to export the CoreML file?

Deployment: How to fill author, license, etc.

The documentation that I have seen doesn't describe a way to put the author, description, license, input descriptions, and so on into the model before/while exporting. Is this feature not yet included?
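
A workaround sketch using coremltools directly on the exported file; the metadata values are placeholders, and the feature names in the descriptions depend on the exported model:

import coremltools

mlmodel = coremltools.models.MLModel('MyClassifier.mlmodel')
mlmodel.author = 'Jane Appleseed'
mlmodel.license = 'BSD-3-Clause'
mlmodel.short_description = 'Classifies photos into dog vs. tractor'
mlmodel.input_description['image'] = 'Input image to classify'
mlmodel.output_description['label'] = 'Predicted class label'
mlmodel.save('MyClassifier.mlmodel')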

API docs missing `load_sgraph`

One of the 404s from #37 is actually to what looks like a correct URL; it seems the load_sgraph method is missing from the API docs. Should load_sframe be listed here as well?

Training – IOError: Fail to write. Disk may be full.: iostream error

Hi,

I am using a fairly large dataset (75 GB) that is stored on a separate server. During training, 250 GB is written to the local hard drive, completely filling it up. RAM usage is also extremely high: it uses 80-85% of 128 GB of RAM. Training ends up failing due to: IOError: Fail to write. Disk may be full.: iostream error

Something seems very inefficient with loading remote training data via an SFrame. In TensorFlow, loading training data from a local server works flawlessly. Is there some setting that needs to be enabled so the local hard drive is not filled up?
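
A possible mitigation sketch: point the SFrame cache/spill directory at a volume with more space. The key name below is an assumption; check turicreate.config.get_runtime_config() for the exact names available in your build:

import turicreate as tc

# Assumed key name; verify it appears in tc.config.get_runtime_config()
tc.config.set_runtime_config('TURI_CACHE_FILE_LOCATIONS', '/mnt/bigdisk/turi_cache')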

Activity Classifier doc improvements

I was trying to reproduce the Activity Classifier example by following along and quickly got stuck as the example appears to skip over the essential part of generating the contents of hapt_data.sframe and how to actually run the example snippet.

Apart from the missing yet crucial step from HAPT Data Set.zip to hapt_data.sframe it is also not clear to somebody new to turicreate how to invoke the example snippet.

The example falls short on a couple of questions:

  • How does one generate or get ahold of the file hapt_data.sframe?
  • Where does one have to place the hapt_data.sframe file?
  • Where does one have to place the example's snippet? Within ~/venv/, or outside?
  • How does one actually run the example? Via python ./activity_classifier.py?
  • Does one first have to cd into ~/venv/ before running the example file via python …?
  • Where does one find the produced output (mymodel.model and MyActivityClassifier.mlmodel)?

It is very nice to have such a wide range of applications and examples for most (if not all) of them for turicreate.

But an example that cannot be followed through without prior knowledge or filling in the blanks à la here be dragons at the end of the day is of little to no use for anybody new to the project/codebase. 😟

Python 3 support

In the 4.0.0 release, only Python 2.7 is supported. We should support Python 3 as well.

Open question: which versions of Python 3?

gcc 7 build failure

/src2/src2/turicreate/src/unity/toolkits/coreml_export/MLModel/src/NeuralNetworkValidator.cpp: At global scope:
/src2/src2/turicreate/src/unity/toolkits/coreml_export/MLModel/src/NeuralNetworkValidator.cpp:738:18: error: ‘function’ in namespace ‘std’ does not name a template type
typedef std::function<Result(const CoreML::Specification::NeuralNetworkLayer& specLayer)> validateSpecLayerFn;

plus lots of follow on errors

OpenCL / SYCL / HIP support?

The Apple iGPU/dGPU user base is mostly made of Intel and AMD parts. Yet you've chosen to follow that irrational trend of going for the ridiculously small minority (watch the JPR stats) that is Nvidia GPUs, because 1) they invest a ton of money in brainwashing developers, and 2) some tools like TensorFlow do not yet support the standards because of reason 1).
The second reason would be valid for a little developer team that has no choice, BUT you are APPLE: you have unlimited money and you are the creators of OpenCL!
If you weren't aware, AMD has designed an impressive tool for translating existing modern CUDA code into OpenCL code. Often it takes less than one week to convert a project with this tool:
https://github.com/ROCm-Developer-Tools/HIP
I'm not even talking about the fact that AMD hardware is far more suited for machine learning than Nvidia's because, for example, it supports full FP16...
So, will you choose the rational way that leaves the choice to the consumer and to the PC maker?

Consolidate external source drops

We have source drops of external projects in several places:

src/external
src/nnvm
subtree/xgboost
deps/src

We should consolidate these into a single external directory, and make it clear that these are external source drops and generally shouldn't be modified except to take new source drops of these projects.

Support/examples for using TuriCreate with pandas?

The example in the README shows how to use Turi Create with an SFrame. However, SFrame has not had a release since July 2016.

pandas is pretty much the de-facto DataFrame library in the Python data science ecosystem. It is under active development, has a substantial community (with tens of thousands of StackOverflow questions), and provides tools for many common steps in data analysis.

Please consider aligning with the general momentum in the Python data science community by providing examples of using pandas with Turi Create.
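
For what it's worth, the two already interoperate; a small sketch of the round trip with illustrative data:

import pandas as pd
import turicreate as tc

df = pd.DataFrame({'user': ['a', 'b'], 'item': ['x', 'y'], 'rating': [5, 3]})

sf = tc.SFrame(data=df)   # pandas DataFrame -> SFrame
back = sf.to_dataframe()  # SFrame -> pandas DataFrame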

Illegal instruction on training image similarity model from example

I am getting this error while training:

snip

I am using ubuntu 16.04 on Windows 10 WSL.

The code is as follows:

import turicreate as tc

reference_data = tc.image_analysis.load_images('./101_ObjectCategories')
reference_data = reference_data.add_row_number()

reference_data.save('caltech-101.sframe')

model = tc.image_similarity.create(reference_data)

model.save('img_sim_model.model')

Export Object Detector model to CoreML crashes

So I've created an object detection model. Saved it out. When I then try and load it in and export to CoreML I get

Fatal Python error: PyThreadState_Get: no current thread
Abort trap: 6

The code I'm using to train the model all seems fine and completes.

model = tc.object_detector.create(data, feature='image', annotations='annotations')
model.save("test.model")

I then later do

model = tc.load_model("test.model")
model.export_coreml("TestDetector.mlmodel")

And then get the fatal error.

I'm running in CPU mode on a MacBook Pro 13" 2017 macOS 10.13.2

gcc 7 crashes

I get a compiler segfault while building on Linux with g++ 7 or 8.

This patch fixes it (removing the flatten attribute from the destructor):

diff --git a/src/flexible_type/flexible_type.hpp b/src/flexible_type/flexible_type.hpp
index 4bf49c15..385951fe 100644
--- a/src/flexible_type/flexible_type.hpp
+++ b/src/flexible_type/flexible_type.hpp
@@ -1526,7 +1526,7 @@ inline FLEX_ALWAYS_INLINE_FLATTEN flexible_type::flexible_type(std::initializer_
val.vecval->second = std::move(flex_vec(list));
}

-inline FLEX_ALWAYS_INLINE_FLATTEN flexible_type::~flexible_type() {
+inline flexible_type::~flexible_type() {
reset();
}

Submitted a gcc bug here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83346

CUDA 9 and cuDNN 7 support

Since CUDA 9 and cuDNN 7 have already been released, and there are multiple API changes in these low-level libraries, something like cudnnGetConvolutionForwardAlgorithm_v7() may have better performance than the older version. We should adapt to these as well.

Consider pipenv for install

Pip is great! So is virtualenv! Pipenv (from Kenneth Reitz) brings them together to simplify virtual env management and setup. It also provides deterministic builds.

Thanks for the new library, I hope you'll evaluate pipenv and see if it's right for turicreate.

CoreML errors for some models trained on dict features

Predicting with an exported svm_classifier, logistic_regression, or linear_regression model trained on dict columns errors out if the model was used to predict on dicts with keys it hasn't seen during train time.

I am able to consistently reproduce with the following code:

import turicreate as tc

data = tc.SFrame({
    'a': [{'x': 1}, {'y': 1}],
    'b': [0, 1],
})

model = tc.logistic_classifier.create(data, target='b', features=['a'], verbose=False)

test = tc.SFrame({'a': [{'z': 1}]})
model.predict(test)

model.export_coreml('LogisticClassifier.mlmodel')

import coremltools as cml

mlmodel = cml.models.MLModel('LogisticClassifier.mlmodel')
mldata = {'a': {'z': 1}}
mlmodel.predict(mldata)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-069618668271> in <module>()
     18 mlmodel = cml.models.MLModel('LogisticClassifier.mlmodel')
     19 mldata = {'a': {'z': 1}}
---> 20 mlmodel.predict(mldata)

/Users/alonpalombo/anaconda/envs/turi/lib/python2.7/site-packages/coremltools/models/model.pyc in predict(self, data, useCPUOnly, **kwargs)
    239         """
    240         if self.__proxy__:
--> 241             return self.__proxy__.predict(data,useCPUOnly)
    242         else:
    243             if _sys.platform != 'darwin' or float('.'.join(_platform.mac_ver()[0].split('.')[:2])) < 10.13:

RuntimeError: {
    NSLocalizedDescription = "Predicted feature named 'b' was not output by pipeline";
}

Removing the line model.predict(test) results in the expected output of the program:

{u'__vectorized_features__': array([ 0.,  0.]),
 u'a': {},
 u'b': 0L,
 u'bProbability': {0L: 0.5, 1L: 0.5}}

May be related to #61

Could not find a version that satisfies the requirement turicreate

I followed the installation instructions, but ended up with this error when running the command (venv) pip install -U turicreate:
Could not find a version that satisfies the requirement turicreate (from versions: ) No matching distribution found for turicreate
Please help me fix it.
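
This error usually means pip found no wheel for the current interpreter or architecture; a quick check against the stated system requirements (Python 2.7 or 3.5-3.8, x86_64):

import platform
import sys

print(sys.version)         # must be 2.7 or 3.5-3.8 for a turicreate wheel to exist
print(platform.machine())  # must report x86_64; there are no 32-bit or ARM wheels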

ToolkitError: Unsupported PNG bit depth: 4

While creating GraphLab images, PNG images with a bit depth of 4 are not currently supported by the toolkit, causing the following error to be thrown:
ToolkitError: Unsupported PNG bit depth: 4

Complete error details:

---------------------------------------------------------------------------
ToolkitError                              Traceback (most recent call last)
/Users/corrinetoracchio/Projects/corSheetMusic/turi/read-timed-notes.py in <module>()
     11 
     12                 for imageFile in data:
---> 13                          gl.Image(imageFile.replace('\n', ''))

/Users/corrinetoracchio/anaconda/envs/gl-env/lib/python2.7/site-packages/graphlab/data_structures/image.pyc in __init__(self, path, format, **_Image__internal_kw_args)
     70             from ..util import _make_internal_url
     71             from .. import extensions as _extensions
---> 72             img = _extensions.load_image(_make_internal_url(path), format)
     73             for key, value in list(img.__dict__.items()):
     74                 setattr(self, key, value)

/Users/corrinetoracchio/anaconda/envs/gl-env/lib/python2.7/site-packages/graphlab/extensions.pyc in <lambda>(*args, **kwargs)
    166 
    167 def _make_injected_function(fn, arguments):
--> 168     return lambda *args, **kwargs: _run_toolkit_function(fn, arguments, args, kwargs)
    169 
    170 def _class_instance_from_name(class_name, *arg, **kwarg):

/Users/corrinetoracchio/anaconda/envs/gl-env/lib/python2.7/site-packages/graphlab/extensions.pyc in _run_toolkit_function(fnname, arguments, args, kwargs)
    155     if ret[0] != True:
    156         if len(ret[1]) > 0:
--> 157             raise _ToolkitError(ret[1])
    158         else:
    159             raise _ToolkitError("Toolkit failed with unknown error")

ToolkitError: Unsupported PNG bit depth: 4
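
A workaround sketch until low bit-depth PNGs are supported: re-encode them to standard RGB with Pillow before loading. The glob pattern is illustrative, and recursive globbing needs Python 3.5+:

import glob
from PIL import Image

# Re-save 4-bit palette PNGs as 8-bit RGB in place
for path in glob.glob('images/**/*.png', recursive=True):
    Image.open(path).convert('RGB').save(path)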

Export recommender model to CoreML

I would love to be able to use the recommender model on my app without having to send information to a server to be processed. Would it be possible to support exporting the recommender model to CoreML?

Unclear how to set up annotations column for Object Detection

The guide for Object Detection doesn't cover setting up the data, including how to attach the "annotations" category.

What I've done is to create a separate annotations file, which looks like this:

{
  "1.jpg": {
    "type": "rectangle",
    "coordinates": {
      "height": 97,
      "width": 243,
      "x": 4224,
      "y": 1821
    },
    "label": "cw"
  }
}

I then setup my data using load_images, and this file:

# Load images
data = tc.image_analysis.load_images('train', with_path=True)
# Open annotations file as dict
annotations = eval(open("annotations").read())
# Add annotations column to SFrame, using the annotations dict key with the same name as the file name
data["annotations"] = data["path"].apply(lambda path: bounds[os.path.split(path)[1]])

That works well, and if I print data, I get something like this:

+-------------------------------+---------------------------+
|              path             |           image           |
+-------------------------------+---------------------------+
| /Users/Andrew/Code/turi/cw... | Height: 3816 Width: 11056 |
| /Users/Andrew/Code/turi/cw... | Height: 3888 Width: 10672 |
| /Users/Andrew/Code/turi/cw... |  Height: 3656 Width: 9700 |
| /Users/Andrew/Code/turi/cw... |  Height: 3872 Width: 8280 |
+-------------------------------+---------------------------+
+-------------------------------+
|          annotations          |
+-------------------------------+
| {'type': 'rectangle', 'coo... |
| {'type': 'rectangle', 'coo... |
| {'type': 'rectangle', 'coo... |
| {'type': 'rectangle', 'coo... |
+-------------------------------+

I don't know why that's separated onto 2 lines in the console - likely just for sizing reasons.

So then I get to this line in the Object Detection guide, where it intends to visualise the annotations applied to the data:

tc.object_detector.util.draw_bounding_boxes(data["image"], data["annotations"])

When I run this, I get this error in the console:

Traceback (most recent call last):
  File "app.py", line 62, in <module>
    load_data(bounds)
  File "app.py", line 23, in load_data
    tc.object_detector.util.draw_bounding_boxes(data["image"], data["annotations"])
  File "/Users/Andrew/turi/lib/python2.7/site-packages/turicreate/toolkits/object_detector/util/_visualization.py", line 139, in draw_bounding_boxes
    .apply(draw_single_image))
  File "/Users/Andrew/turi/lib/python2.7/site-packages/turicreate/data_structures/sframe.py", line 2463, in apply
    dryrun = [fn(row) for row in test_sf]
  File "/Users/Andrew/turi/lib/python2.7/site-packages/turicreate/toolkits/object_detector/util/_visualization.py", line 124, in draw_single_image
    _annotate_image(pil_img, anns, confidence_threshold=confidence_threshold)
  File "/Users/Andrew/turi/lib/python2.7/site-packages/turicreate/toolkits/object_detector/util/_visualization.py", line 49, in _annotate_image
    for ann in reversed(anns):
TypeError: argument to reversed() must be a sequence

In addition, if I comment that out, and then go ahead and do:

model = tc.object_detector.create(data, feature="image", annotations="annotations")

I get the error:

Traceback (most recent call last):
  File "app.py", line 65, in <module>
    learn()
  File "app.py", line 37, in learn
    model = tc.object_detector.create(data, feature="image", annotations="annotations")
  File "/Users/Andrew/turi/lib/python2.7/site-packages/turicreate/toolkits/object_detector/object_detector.py", line 170, in create
    require_annotations=True)
  File "/Users/Andrew/turi/lib/python2.7/site-packages/turicreate/toolkits/object_detector/object_detector.py", line 66, in _raise_error_if_not_detection_sframe
    raise _ToolkitError("Annotations column must contain lists")
turicreate.toolkits._main.ToolkitError: Annotations column must contain lists

Presumably I'm setting up my annotations column incorrectly for what it's expecting.
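
Judging by the error text, each cell of the annotations column must be a list of bounding-box dictionaries, even when an image has only one box. A sketch reusing the reporter's field names and the annotations dict built above:

import os

# Wrap each image's single annotation dict in a list
data['annotations'] = data['path'].apply(
    lambda path: [annotations[os.path.split(path)[1]]])

# Each cell now looks like:
# [{'label': 'cw',
#   'type': 'rectangle',
#   'coordinates': {'x': 4224, 'y': 1821, 'width': 243, 'height': 97}}]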

Core ML: Predicted feature named '___' was not output by pipeline

I successfully exported a Boosted Trees Classifier model to a Core ML model and imported it into a Xcode project. All seemed well until I tried using it in Swift. Whenever I run the predict function (even with valid inputs) I receive an error: "Predicted feature named 'room' was not output by pipeline" where room is my target name. Is this a Core ML issue or did I export it wrong from Turi Create?

sf = tc.SFrame.read_csv('all.csv')
model = tc.boosted_trees_classifier.create(sf, target='room')
model.export_coreml("Model.mlmodel")

For context, the top 10 lines of the SFrame:

+-------------+-------------+-------------+-------------+-------------+------+
|      b1     |      b2     |      b3     |      b4     |      b5     | room |
+-------------+-------------+-------------+-------------+-------------+------+
| 15.40925007 | 2.448436747 | 0.875747109 | 3.116232715 |  0.55931885 |  3   |
| 12.34282797 | 2.448436745 | 1.050529028 |  2.49309418 | 0.547421866 |  3   |
| 11.35115228 | 2.593769273 | 1.314364103 | 2.327012418 |  0.56737106 |  3   |
| 10.90912658 |  2.76689536 | 1.639642453 | 2.284626409 | 0.691871329 |  3   |
| 10.96834609 | 3.053793429 | 1.953247127 | 2.252586412 | 0.722658468 |  3   |
| 10.62518757 | 3.562255003 | 2.129856143 | 1.912806725 | 0.668820263 |  3   |
| 10.34918951 | 3.569631802 |  2.27094845 | 1.708808372 | 0.798663933 |  3   |
| 10.12294684 | 3.359425223 | 2.446464643 | 1.805624742 | 0.995895032 |  3   |
| 10.65090275 | 3.216177055 | 2.655107978 |  1.88591092 | 1.264108161 |  3   |
| 10.88905096 | 3.566873863 | 2.684348503 | 1.888478765 | 1.325077843 |  3   |
+-------------+-------------+-------------+-------------+-------------+------+

Raise specific exceptions?

raise "Please set the TEMP environment variable"

>>> raise "Please set the TEMP environment variable"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: exceptions must be old-style classes or derived from BaseException, not str

>>> raise "Unknown distance"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: exceptions must be old-style classes or derived from BaseException, not str
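
A minimal fix sketch is to raise a concrete exception type with the same message, for example:

raise RuntimeError("Please set the TEMP environment variable")
raise ValueError("Unknown distance")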

Export text classifier to Core ML

I would like to export the Sentence Classifier model to Core ML. Right now I'm getting the 'SentenceClassifier' object has no attribute 'export_coreml' error.
Is there some way to export it?
