covartech / prt

Pattern Recognition Toolbox for MATLAB

Home Page: http://covartech.github.io/

License: MIT License


prt's Introduction

PRT: Pattern Recognition and Machine Learning in MATLAB

A free and permissively licensed object-oriented approach to machine learning in MATLAB.

Machine learning and pattern recognition are everywhere. MATLAB is a high-level interpreted language widely used throughout academia and engineering due to its ease of use and numerous available toolboxes. Currently available toolboxes for pattern recognition and machine learning in MATLAB are either costly or restrictively licensed. The PRT is an MIT-licensed toolbox that provides access to a wide range of pattern recognition techniques in an easy-to-use, unified framework. The PRT provides a suite of MATLAB commands and data types to help you organize, visualize, process, cluster, and classify your data. If you have data and need to make predictions based on that data, the PRT can help.

prt's People

Contributors

covarresearch, covartech, kennethmorton, newfolder, patrickkwang, petertorrione, samkeene, sudeepmandal


prt's Issues

Replace target vector with categorical arrays?

Categorical arrays seem to do a lot of what we already do, but faster and more cleanly.

Advantages:

http://www.mathworks.com/help/matlab/matlab_prog/advantages-of-using-categorical-arrays.html

Disadvantage:
Lots of code to rewrite; it also breaks code for anyone not using R2013b, which is still bleeding-edge. We would need to fork and wait. Back-burner for now, but potentially interesting.
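As a hedged sketch of what the migration would buy (plain MATLAB, not PRT code; requires R2013b+):

```matlab
% Sketch: class targets as a categorical array instead of an integer
% vector plus a separate name lookup table.
y = categorical([1; 1; 2; 3], [1 2 3], {'tnt', 'rdx', 'inert'});
categories(y)      % class names travel with the data
sum(y == 'rdx')    % count by name, no integer bookkeeping
% Concatenation merges category sets automatically, which is much of
% what prtUtilIntegerAssociativeArray does by hand today.
yMore = [y; categorical({'hme'})];
```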

k-folds breaking with R2011b

Hello,

My MATLAB version is R2011b, and I updated my working copy of the PRT last week. This broke a well-used script of mine that k-folds a dataset. Here is the output:

Error using union (line 107)
Unknown flag.
Error in prtUtilIntegerAssociativeArray/merge (line 104)
[unionKeys,ind1,ind2] = union(keys1,keys2,'R2012a'); % Bug fix 2013-06-13
Error in prtUtilIntegerAssociativeArrayClassNames/merge (line 29)
temp = merge@prtUtilIntegerAssociativeArray(self,other);
Error in prtUtilIntegerAssociativeArray/combine (line 87)
out = merge(self,in2);
Error in prtDataInterfaceCategoricalTargets/catClasses (line 318)
[self.classNamesArray,integerSwaps] =
combine(self.classNamesArray,ds.classNamesArray);
Error in prtDataSetClass/catObservations (line 170)
self = catClasses(self,varargin{:});
Error in prtDataSetBase/crossValidateCombineFoldResults (line 406)
dsOut = catObservations(dsTestCell{:});
Error in prtDataSetStandard/crossValidateCombineFoldResults (line 627)
dsOut = crossValidateCombineFoldResults@prtDataSetBase(dsTestCell_first,
dsTestCell, testIndices);
Error in prtAction/crossValidate (line 387)
dsOut = crossValidateCombineFoldResults(outputDataSetCell{1},
outputDataSetCell, testingIndiciesCell);
Error in prtAction/kfolds (line 433)
[outputs{:}] = self.crossValidate(ds,keys);
Error in svm_one_vs_one (line 36)
crossvalout = classifier.kfolds(dataset, 2);

And this is the code causing the issue:

classifier = prtClassLibSvm;
classifier.internalDecider = prtDecisionBinaryMinPe;
crossvalout = classifier.kfolds(dataset, 2);

Thanks in advance.
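A possible workaround sketch (an assumption, not a committed fix): the 'R2012a' behavior flag passed to union inside prtUtilIntegerAssociativeArray/merge does not exist before R2012a (MATLAB 7.14), so the call could be guarded on the running version:

```matlab
% Hypothetical guard for prtUtilIntegerAssociativeArray/merge, line 104:
if verLessThan('matlab', '7.14')                          % pre-R2012a
    [unionKeys, ind1, ind2] = union(keys1, keys2);        % legacy behavior
else
    [unionKeys, ind1, ind2] = union(keys1, keys2, 'R2012a');
end
```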

proposal: remove prtDataSetCellArray

I believe that prtDataSetClassReshape solves the same problem better.

Three things use prtDataSetCellArray: prtDataGenCifar1, prtDataSetTimeSeries, and prtDataGenMsrcorid. The first two appear to be replaceable with prtDataSetClassReshape with no issues. For prtDataGenMsrcorid, the observations are actually different shapes... It's an older dataset anyway and probably superseded by something like ImageNet, but I don't know that there's a good reason to allow prtDataSets with observations of arbitrary sizes.

Are there other use cases for this of which I'm not aware?

Building MEX Files

As per your instructions, I have modified the following three files:

  1. C:\Users\Administrator\Documents\MATLAB\PRT-master\doc\prtDoc.m
  2. C:\Users\Administrator\Documents\MATLAB\PRT-master\engine\dataset\prtDataSetClass.m
  3. C:\Users\Administrator\Documents\MATLAB\PRT-master\util\prtSetup.m

Running prtSetup completes and returns to the MATLAB prompt successfully, but mex -setup produces the following:

mex -setup
MEX configured to use 'lcc-win32' for C language compilation.
Warning: The MATLAB C and Fortran API has changed to support MATLAB
variables with more than 2^32-1 elements. In the near future
you will be required to update your code to utilize the
new API. You can find more information about this at:
http://www.mathworks.com/help/matlab/matlab_external/upgrading-mex-files-to-use-64-bit-api.html.

To choose a different language, select one from the following:
mex -setup C++

mex -setup FORTRAN

However, the following are already installed on my system (Control Panel\Programs\Programs and Features):

Microsoft .NET Framework 4 Client Profile
Microsoft .NET Framework 4 Extended
Microsoft Visual C++ 2005 Redistribution
Microsoft Visual C++ 2008 Express Edition-ENU
Microsoft Visual C++ 2008 Redistributable-x86 9.0.30729.17
Microsoft Visual C++ 2008 Redistributable-x86 9.0.30729.6161
Microsoft Visual C++ 2010 x86 Redistributable (x86) 10.0.30319
Microsoft Visual C++ 2012 Redistributable (x86) 11.0.60610

I need to use the classification techniques of the PRT toolbox. Is it necessary to build the MEX files?
Please advise.

Regards

Trouble with prtClassGlrt

I am trying to classify a data set based on the observations shown in the attached screenshot. However, after repeated trials using a combination of prtPreProcPca and prtClassGlrt, the AUC remains around 0.5, which is no good. Any pointers on how to improve the AUC would be appreciated.

[screenshot attached: bildschirmfoto 2014-01-20 um 12 22 39]

PCA

Hi
After using:
dataSet = prtDataSetClass(features,labels);

dataSet = setClassNames(dataSet,[{'Rest'} {'Index Extension'} {'Fourth Finger Extension'}]);
dataSet = setFeatureNames(dataSet,[{'IEMG'} {'MAV'} {'SSI'} {'VAR'} {'RMS'} {'WAMP'} {'MDF'} {'MNF'}]);

pca = prtPreProcPca;
pca = pca.train(dataSet);
dataSetNew = pca.run(dataSet);

How can I know which features from the dataSet the algorithm chose?

As you can see, I have a dataSet with 3 classes and 8 features.

Thanks
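Note that PCA does not pick a subset of the original features; each principal component is a weighted combination of all of them. A sketch of how the weights might be inspected (the property name pcaVectors is an assumption about prtPreProcPca's internals; check the trained object's properties in your version):

```matlab
pca = prtPreProcPca;
pca = pca.train(dataSet);
% Assumed property: each column holds one component's loadings over the
% 8 original features; large-magnitude entries mark influential features.
loadings = pca.pcaVectors;
[~, mostInfluential] = max(abs(loadings), [], 1);
dataSet.featureNames(mostInfluential)
```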

Missing RV Util Code

I tried using my prtClusterDpgmm that's under submissions/Ratto2012 and noticed that several of the required prtRvUtil* files are missing (e.g., prtRvUtilDirichletKld, prtRvUtilMvnWishartKld, prtRvUtilGammaKld...). Were these ever part of the repo, or did I just forget to commit them? Hopefully I can still find the original code if it's the latter...

prtScoreConfusionMatrix breaks after latest pull

Hey there!

I was getting an error using prtScoreConfusionMatrix after the latest pull:

132: prtUtilPlotConfusionMatrix(confusionMat,guessClassNames,truthClassNames);

I noticed that prtUtilPlotConfusionMatrix should be prtPlotUtilConfusionMatrix. It looks like the function was renamed recently.

Thanks!

Feature Request: Rotate prtScoreConfusionMatrix XAxisLabels

When the class names are too long, running prtScoreConfusionMatrix causes the x-axis class labels to overlap. I tried using this, but it's kind of a hack, and the text appears distorted unless rotated by exactly 90 or 270 degrees. I looked into prtUtilPlotConfusionMatrix and prtScoreConfusionMatrix, but the labels are set as XTickLabels, so no handle is ever created for them. If the x-axis class labels were set as text objects instead, they would have handles and could be rotated properly.

[example confusion matrix attached]

Add "update" methods to on-line classifiers (and write an on-line classifier)

A user asked for this, and it's

  1. A good idea
  2. Pretty easy to do (for some classifiers)
  3. Useful

The goal is to update a prtAction "on-line" with batches of data. For some actions this is easier than others.

One approach:

1)A) Define a new super-class:
prtOnline or prtStreaming or prtBatch or... something

Comes with an (Abstract? Default?) method.
action = action.update(newData,...)

or
1)B) Add the "update" method to the list of default abstract methods in prtAction

  1. "Update" enables on-line updating.
    action = action.train(dataSet);
    action = action.update(dataSetNew);

Note: the default behavior here can be to do this:

function action = update(action, newData)
    totalData = catObservations(action.dataSet, newData); % if "dataStorage" is on
    action = action.train(totalData);
end

Though:

  1. This is somewhat gross
  2. This can be quite slow
  3. It kind of violates the idea of "updating" (vs. re-training with all data)

Something to consider.
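A minimal sketch of option 1A (the class and method names are the proposal's, not existing PRT code):

```matlab
% Hypothetical mixin: subclasses override update() with a true
% incremental rule; the default falls back to re-training on all
% stored data, with the caveats listed above.
classdef prtOnline
    methods
        function action = update(action, newData)
            % Default: concatenate and re-train (gross and slow)
            totalData = catObservations(action.dataSet, newData);
            action = action.train(totalData);
        end
    end
end
```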

Proposal: retainFeatures/retainClasses by name

Enable retainFeatures and retainClasses to take cell arrays of feature names or class names.

e.g.

ds = ds.retainClasses({'explosive','inert'});

Can regexp work?

ds = ds.retainClasses({'explosive_','inert_'}); ?

(Maybe for retainObservations, too?)
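A sketch of how name and regexp matching could be layered on top of the existing integer interface (the helper name is hypothetical; classNames and uniqueClasses are assumed to line up index-for-index):

```matlab
% Hypothetical helper: map class-name patterns to class values, then
% fall through to the existing retainClasses.
function ds = retainClassesByName(ds, patterns)
    names = ds.classNames;
    keep = false(size(names));
    for iPat = 1:numel(patterns)
        keep = keep | ~cellfun(@isempty, regexp(names, patterns{iPat}, 'once'));
    end
    ds = ds.retainClasses(ds.uniqueClasses(keep));
end
```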

Slowdown in new feature name calculations.

I have two data sets of equal size. One I made by applying 3 pre-processing stages to an original data set, and one I made directly with prtDataSetClass.

I run:

tic;
rt(prtPreProcZmuv + prtPreProcEnergyNormalizeRows,dsSynthetic);
toc;

On the data set I made directly, I get:

Elapsed time is 0.544173 seconds.

When I run the same code on the data I made with pre-processing, I get:

Elapsed time is 2.422874 seconds.

I'm doing this in a loop, and this time is killing me.

All of the time is being eaten up in

function self = modifyNonDataAttributesFrom(self, action)

in prtDataSetStandard, specifically line 509:

self.featureNameModificationFunction = @(nameIn, index)modFun(self.featureNameModificationFunction(nameIn, index),index);

My hypothesis is that this will keep getting slower as more and more blocks are added together. Is that right?
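The hypothesis is easy to reproduce outside the PRT: each stage wraps the previous handle in a new anonymous function, so evaluating a feature name walks every accumulated layer. A standalone sketch:

```matlab
% Standalone demo of the wrapping pattern from line 509: every
% composition adds one layer, so evaluation cost grows with the
% number of chained blocks.
f = @(nameIn, index) nameIn;
for iBlock = 1:100
    modFun = @(nameIn, index) sprintf('stage(%s)', nameIn);
    f = @(nameIn, index) modFun(f(nameIn, index), index);  % one more layer
end
tic; f('x', 1); toc   % cost reflects all 100 accumulated layers
```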

prtClassRasmusbergpalmDeepLearningNn broken?

Recently updated, tried:

ds = prtDataGenMnist;
nn = prtClassRasmusbergpalmDeepLearningNn + prtDecisionMap;
yOut = nn.kfolds(ds,2);

Eventually got this error:

epoch 100/100. Took 0.47235 seconds. Mean squared error on training set is 0.26585
??? Reference to non-existent field 'b'.

Error in ==>
prtClassRasmusbergpalmDeepLearningNn>prtClassRasmusbergpalmDeepLearningNn.runAction
at 74
tempY = zeros(size(dataSet.X,1),size(self.nn.b{end},1));

Error in ==> prtAction>prtAction.runActionOnTrainingData at 570
dsOut = runAction(self, dsIn);

Error in ==> prtAction>prtAction.runOnTrainingData at 182
dsOut = runActionOnTrainingData(self, dsOut);

Error in ==> prtAlgorithm>prtAlgorithm.trainAction at 309
input{topoOrder(i)} =
runOnTrainingData(Obj.actionCell{topoOrder(i-1)},currentInput);

Error in ==> prtAction>prtAction.train at 217
self = trainAction(self, ds);

Error in ==> prtAction>prtAction.crossValidate at 359
trainedAction = self.train(trainDs);

Error in ==> prtAction>prtAction.kfolds at 431
[outputs{:}] = self.crossValidate(ds,keys);

Cross Validation

I want to know the accuracy of a classifier by doing cross-validation with a leave-one-out scheme, so I used the command
classifier.kfolds(dataSet,length(labels))
How can I get the result from it? I tried all of the output variables as explained in
http://covartech.github.io/prtdoc/functionReference/prtClassKnn/kfolds.html
but I couldn't find the value indicating the accuracy of the classifier. Can anyone help?

Thanks,
Ana
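For reference, kfolds returns a scored data set rather than an accuracy number; scoring is a separate step. A sketch, assuming the classifier has an internal decider so its outputs are hard labels (prtScorePercentCorrect is assumed to be available in your PRT version):

```matlab
classifier = prtClassKnn;
classifier.internalDecider = prtDecisionMap;  % emit class labels, not statistics
yOut = classifier.kfolds(dataSet, dataSet.nObservations);  % leave-one-out
pc = prtScorePercentCorrect(yOut)             % fraction classified correctly
```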

Cross-validation in Matlab R2016b

It seems odd that this throws an error in the latest version of MATLAB. I verified that it runs without issue in R2015a.

ds = prtDataGenUnimodal;
fld = prtClassFld;
dsOut = fld.kfolds(ds);

Input arguments to function include colon operator. To input the colon character, use ':' instead.

Error in prtClass/maryOutput2binaryOutput (line 412)
OutputDataSet = OutputDataSet.setObservations(OutputDataSet.getObservations(:,end));

Error in prtClass/postRunProcessing (line 360)
dsOut = maryOutput2binaryOutput(self,dsOut);

Error in prtAction/run (line 254)
dsOut = postRunProcessing(self, dsIn, dsOut);

Error in prtAction/crossValidate (line 369)
outputDataSetCell{uInd} = trainedAction.run(testDs);

Error in prtAction/kfolds (line 553)
[outputs{:}] = self.crossValidate(ds,keys);

ok... why is PCA different for regression and classification objects

consider:

data = rand(30,1000);
ds = prtDataSetRegress(data)

pca = prtPreProcPca('nComponents',100);
pca = pca.train(ds);

%%

ds1 = prtDataGenFeatureSelection
pca = prtPreProcPca('nComponents',100);
pca = pca.train(ds1);

They both warn, but the first one tells me my data dimension is 30 while the second warning says 8. For the regress object, my number of features is 1000 and my number of samples is 30; for the second, my number of features is 8 and my number of samples is 800. So the first one should not warn, no? What is different here because it's a regression object?

Find boundaries

I want to know the values that characterize each class in the classifier.
How can I extract the class boundaries from it?

prtDecisionMap

What's the difference between

classifier = prtClassKnn + prtDecisionMap;

and

classifier = prtClassKnn;
classifier.internalDecider = prtDecisionMap;

I'm getting very different classification results between these two...

Thanks

plotDensity in prtDataSetClass is broken

Breaks on line 762 of prtDataSetClass:

xLoc = sort(cat(1,xLoc(:),ds.getObservations(:,iFeature)),'ascend');

More specifically:

ds.getObservations(:,iFeature)

'Input arguments to function include colon operator. To input the colon character,
use ':' instead.'

I'm not sure whether it has something to do with using a newer version of MATLAB (it breaks in R2015b and R2016a) or with a change made in ds.getObservations.

Changing the line in question to:

xLoc = sort(cat(1,xLoc(:),ds.X(:,iFeature)),'ascend');

fixes the problem.

Feature Request: Method for obtaining the highest/lowest confidence observations from each target class

After running/training a classifier on some observations for a binary decision problem, I often like to quickly extract the most "easy" and "difficult" observations from each target class. In other words, I would like a method (or methods) that will quickly provide me with:
(1) The 'n' observations with the largest decision statistic from the positive class
(2) The 'n' observations with the lowest decision statistic from the positive class
(3) The 'n' observations with the largest decision statistic from the negative class
(4) The 'n' observations with the lowest decision statistic from the negative class

Alternatively, it would be nice to have a single method that independently sorts the observations under each target class according to their decision statistics.
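Pending a built-in method, a sketch of the extraction from a run/kfolds output (assuming yOut.X holds the decision statistic and yOut.Y the true labels, with the positive class labeled 1):

```matlab
% Sketch: n highest/lowest-statistic observations from the positive class.
n = 10;
posIdx = find(yOut.Y == 1);
[~, order] = sort(yOut.X(posIdx), 'descend');
k = min(n, numel(posIdx));
easiestPos = posIdx(order(1:k));          % n largest decision statistics
hardestPos = posIdx(order(end-k+1:end));  % n smallest decision statistics
dsEasyPos  = yOut.retainObservations(easiestPos);
% Repeat with yOut.Y == 0 for the negative class.
```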

Subtleties when catObs is used on data sets with different class labels for the same class #'s

Some weird examples.

ds1 has class 1, name "tnt"
ds2 has class 1, name "rdx"

Should I be able to catObservations these two? Right now I can't, since TNT and RDX conflict in class-integer space.
Maybe allow catObservations(ds1,ds2,'-force') to force the PRT to merge them so RDX becomes class 2?

Also, say

ds1 has classes 1 and 2, names tnt and rdx
ds2 has classes 2 and 3 names hme and un

ds1 = ds1.retainClasses('tnt');
catObservations(ds1,ds2) this also errors, since the internal understanding inside ds1 is that "2" still corresponds to "rdx".

A few questions here:

  1. Should we check the class name cache all the time and reduce it, so if there aren't any "2"'s, the object doesn't know about "rdx"? This seems weird in some ways, but correct in others.

  2. Should we default to "-force" in catObservations?

  3. More to the point, should we hide "targets" and rely more on interfaces through "class" strings? This seems like a pain, but if catObservations is messing with .targets (which it has to do to make the above work), then maybe this is all we can do?

time consuming line in prtRvMixture.set.components

I end up spending a lot of time in prtUtilIsMethodIncludeHidden looking for weightedMle when I need to make a lot of GMMs over and over again...

It's more than a third of the whole process. Can we have a new class, prtRvMixable, that defines an abstract weightedMle? Then if you want to be mixable, you inherit from both prtRv and prtRvMixable.

function R = set.components(R,CompArray)

% time(s)   calls   line
   0.20     26489   117   if ~isempty(CompArray)
   1.29     26489   118   assert(isa(CompArray(1),'prtRv'),'components must be a prtRv');
  64.10     26489   119   assert(prtUtilIsMethodIncludeHidden(CompArray(1),'weightedMle'),'The %s class is not capable of mixture modeling as it does not have a weightedMle method.',class(CompArray(1)));
   0.26     26489   120   assert(isvector(CompArray),'components must be an array of prtRv objects');
   0.01     26489   121   end
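A sketch of the proposed refactor (prtRvMixable does not exist; the method signature below is a guess):

```matlab
% Hypothetical mixin marking an RV as usable inside prtRvMixture.
classdef prtRvMixable
    methods (Abstract)
        self = weightedMle(self, x, weights)  % signature is an assumption
    end
end
```

In set.components, the expensive reflection call prtUtilIsMethodIncludeHidden(CompArray(1),'weightedMle') would then become a constant-time isa(CompArray(1),'prtRvMixable') check.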

SVM multiclass problem with probability estimates

Hello,

As far as I could dig, I could not find a way of training an M-ary classifier that uses prtClassLibSvm as a base classifier and outputs not the class, but the probability estimates for each of the classes.

What am I missing? Thanks.
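In raw libsvm, per-class probability estimates come from the '-b 1' option at both training and prediction time; whether prtClassLibSvm exposes this option is uncertain, so the sketch below uses the bundled libsvm MATLAB interface directly:

```matlab
% svmtrain/svmpredict here are libsvm's MEX functions (shipped under
% +prtExternal/+libsvm), not the Statistics Toolbox functions.
model = svmtrain(trainY, trainX, '-b 1');                    % fit probability model
[pred, acc, prob] = svmpredict(testY, testX, model, '-b 1');
% prob is nObservations x nClasses, columns ordered by model.Label
```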

Undefined function 'subDir' for input arguments of type 'char'

Hi, when I run torrione_ExampleCoatesNg_Kmeans_Mscorid, the error is:

Undefined function 'subDir' for input arguments of type 'char'.
Error in prtDataGenMsrcorid (line 103)
fList = subDir(cDir,'*.jpg');
Error in torrione_ExampleCoatesNg_Kmeans_Mscorid (line 48)
ds = prtDataGenMsrcorid;

My MATLAB version is R2012a, and I don't know why this happens.
Thank you.

catFeatures destroys feature names

ds = prtDataGenUnimodal;
ds.featureNames = {'asdf','asdf2'};
yOut = rt(prtClassFld,ds);
plot(catFeatures(yOut,ds))

All features are now named incorrectly.

Not sure what to do about this. It seems to be a pretty fundamental problem in the way feature names are handled...

Problem in Signal Pattern Recognition

Hi,
So I'm trying to distinguish between 2 movements of the hand using a signal acquired from a muscle of the arm.
I have 16 signals of 3 seconds for each movement (16 matrices of 1x3000 samples per movement). I then calculated the integral, mean absolute value, simple square integral, variance, and root mean square of each signal, ending up with two 16x5 matrices. So I have 16 signals and 5 features per movement. I've made a 16x5 matrix of labels, but now I'm not able to use the toolbox: it always says that the dimensions don't match... I'm not sure what I'm doing wrong here. If you could help me, I'd appreciate it.

Thanks,
Ana
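A likely cause (a guess from the description): prtDataSetClass expects one label per observation, i.e., an nObservations x 1 vector, not a matrix the size of the feature matrix. With 16 signals per movement and 2 movements, that would look like the following (featuresMovement1/2 are placeholder names for the two 16x5 matrices):

```matlab
features = [featuresMovement1; featuresMovement2];  % 32x5: one row per signal
labels   = [zeros(16,1); ones(16,1)];               % 32x1: one label per row
ds = prtDataSetClass(features, labels);
```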

cluster plotting is broken

k = prtClusterKmeans('nClusters',4);
k = train(k,prtDataGenBimodal);

plot(k)

That does not look correct to me

RVM should check that input data set is binary

A unary data set gives a weird error:

ds = prtDataSetClass;
ds = ds.setX(1);
ds = ds.setY(1);
cl = prtClassRvm;
cl.train(ds)

??? Attempted to access yMat(:,2); index out of bounds because
numel(yMat)=1.

Error in ==> prtClassRvm>prtClassRvm.getMinusOneOneTargets at
325
y(yMat(:,2) == 1) = 1;

Error in ==> prtClassRvm>prtClassRvm.trainAction at 201
y = Obj.getMinusOneOneTargets(DataSet);

Error in ==> prtAction>prtAction.train at 220
Obj = trainAction(Obj, DataSet);

No appropriate method, property, or field nFeatures for class prtDataSetTimeSeries

Hi,

I tried prtDataSetTimeSeries, then set up an HMM and started to train a model:

training = prtDataSetTimeSeries;
training = training.setX(features); %
...
training = training.setY(classes);
training.classNames = uniqueLabels;

And found that an error occurs in prtDataSetInMem.m, which says:

No appropriate method, property, or field nFeatures for class prtDataSetTimeSeries.

            if self.nFeatures > 0 % No data?
                self.data = self.data(indices,:);
            end

Indeed, I found that self.summarize.nFeatures in this file gives a value, so is this a bug?

This is on the master branch, Ubuntu 14.04, MATLAB 2014 and MATLAB 2012.

A possible problem for algorithms dealing with new, complicated datatypes that don't have catFeatures

Right now, algorithms use "catFeatures" to handle when there are multiple feature sets coming into a new Algorithm:

prtPreProcPca/prtClassPlsda+prtFusion...

This would really only happen in cases like the above, where a "/" is followed by a single action.

To keep the code simple, we also call catFeatures everywhere instead of trying to figure out when we should call it (which would be easy enough to do).

The problem is that for prtClassMultiplInstance, for example, there is no such thing as "catFeatures".

I see a few options.

  1. We can re-write the current algo.run code to only use catFeatures when necessary. This will still break for MIL data in the case above, but lets you write straight algorithms. This has to be a temporary fix. I'll do it now.

  2. Write catFeatures for dsMIL. There might be other data sets where catFeatures doesn't make any sense. We should either figure out the "right thing" or warn / error nicely...

Making predictions with PRT

Hello everybody, and many thanks to all the people who have developed the very useful Pattern Recognition Toolbox!
I have just started playing with the PRT and have a question about making predictions.
In more detail: once I have loaded my dataset, preprocessed the data, created and run the classifier, and performed the cross-validation, which commands should I use to make a prediction (consider, for instance, a two-class classification problem) for a new vector (say x) that does not belong to the training set?
Thank you very much and best regards,

Mauro
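The standard pattern is to wrap the new vector in an unlabeled prtDataSetClass and run the trained object on it; a sketch (the particular pipeline here is a placeholder, not a recommendation):

```matlab
algo = prtPreProcZmuv + prtClassPlsda;  % placeholder preprocessing + classifier
algo = algo.train(dsTrain);             % dsTrain: your labeled training set
dsNew = prtDataSetClass(x);             % x is 1 x nFeatures, no labels needed
yOut  = algo.run(dsNew);                % yOut.X holds the decision statistic
```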

prtRegressRvm: error due to empty variable relevantIndices

I get the error

    Error using .*
    Matrix dimensions must agree.

    Error in prtRegressRvm/trainAction (line 165)
    cG = 1 - alpha(relevantIndices).*diag(Obj.Sigma);

    Error in prtAction/train (line 194)
    self = trainAction(self, ds);

    Error in rvm_prt (line 23)
    regress = reg.train(ds_train);

which is the same as from the post "one step ahead regression with RVM/RVMSequential" (http://anewfolder.com/node/579).
The problem is that the variable relevantIndices is empty after some iterations of the for loop. This happens deterministically on some data, while it works fine with other data. If I change the DataSet.targets to some other values, it also works fine (so same dimensionality, number of features, etc.).

Any ideas on this? Is this a bug in prtRegressRvm or am I missing something about the behaviour of an RVM?

feature names can still be broken pretty easily

Code:
yOut = kfolds(prtClassPlsda,ds,3);
yOut.X = cat(2,yOut.X,randn(size(yOut.X)));
yOut.featureNames

Index exceeds matrix dimensions.

Error in prtDataSetStandard/getFeatureNames (line 209)
if ~isempty(obj.featureNameModificationFunction) &&
obj.featureNameModificationMask(iFeat)

Error in prtDataSetStandard/get.featureNames (line 58)
fn = self.getFeatureNames;

Warning: Could not find GraphViz installation

Hello, I have installed the graphviz-2.38.msi , when I run prtSetup, the warning information is as follows:

Warning: Could not find GraphViz installation. Graphviz must be installed and on the system
path to enable prtAlgorithm plotting. Go to http://www.graphviz.org/ and follow the
installation instructions for your operating system.
Does anyone know why?
Thank you!

Fix plotting methods to take varargin (e.g., to slicer or image) to easily make nicer plots.

A bunch of objects (e.g. prtRv.plotPdf, plotLogPdf, classifier plots, etc.) all call

prtPlotUtilPlotGriddedEvaledFunction

or some equivalent. It would be nice if we could pass input arguments into that method cleanly. So you could specify, e.g., parameters to slice, or imagesc.

I did this in prtRv.plot so you can do:

close all;
plot(dsPrtPreProc);
hold on;
g = prtRvGmm
g.plotPdf(axis,'slicerLocations',{4,4,[0 20]},'FaceAlpha',0.5);

But it's not ideal - e.g., the first input to plotPdf is the axis limits for some reason. That should be:

g.plotPdf('axisLimits',axis,...)

The generalization of this is not straightforward to me though. Should there be:

prtUtilDefaultImagescVarargin?

Or did we miss the boat here, and we need a

prtPlotUtilPlotGriddedEvaled OBJECT?

Proposal: add field to featureInfo or a field to prtDataSetStandard - isDiscrete

We often deal with discrete variables simply by pretending they don't exist.

But sometimes you want to do things slightly differently if things are discrete - e.g., histograms instead of ksdensities. Or even categorical (vs. discrete).

Even ksdensity is OK, but it often treats things you want to be continuous as discrete.

Couple of options:

  1. Don't do anything

  2. add isFeatureDiscrete to prtDataSetStandard; it gets modified like everything else (defaults to "true"; no automatic inference, you have to set it)

  3. add field to .featureInfo, isDiscrete.

(1) is the way things have been. It's not that bad.

(2) takes a little more behind-the-scenes effort to make sure catFeatures, removeFeatures, etc. all work (even catObservations), but is then pretty much invisible unless you want to use it (and it's easy to use - anything that subclasses prtDataSetStandard automatically gets it)

(3) takes more memory (we need an array of structs instead of an array of logicals), and is harder to enforce and query; but may be simpler in some cases.

error during prtSetup

I downloaded the PRT from https://github.com/covartech/PRT, unzipped it, and kept it in C:\Users\Administrator\Documents\MATLAB.

prtDoc
prtSetup
Warning: Removed
'C:\Users\Administrator\Documents\MATLAB\covartech-PRT-e944d00\doc\functionReference\prtAction'
from the MATLAB path for this MATLAB session.
(...many more similar warnings)

problem in "explore" for unlabeled targets

ds = catFeatures(prtDataGenUnimodal,prtDataGenUnimodal);
ds.targets(10) = nan;
plot(ds)

Error using prtDataSetClass/explore (line 928)
An unexpected error was encountered with explore(). If this error persists you may
want to try using exploreSimple().

Error in prtDataSetClass/plot (line 959)
explore(obj);

The problem has something to do with the number of things in the UITABLE, and the fact that "displayLogical" is too short when there are NaNs in the targets:

The actual error is on line 165 of prtPlotUtilDataSetExploreGuiWithNavigation

prtClassLibSVM doesn't like Singles as the input

It looks like LibSVM doesn't handle singles as its input and displays an error, but then the PRT wigs out.

TestDataSet = prtDataGenUnimodal; % Create some test and
TrainingDataSet = prtDataGenUnimodal; % training data
classifier = prtClassLibSvm; % Create a classifier
classifier = classifier.train(TrainingDataSet); % Train
classified = run(classifier, TestDataSet); % Test

Gives this error:

Error: label vector and instance matrix must be double
Error: label vector and instance matrix must be double
Index exceeds matrix dimensions.

Error in prtDataSetInMem/retainObservationData (line 612)
self.data = self.data(indices,:);

Error in prtDataSetInMem/retainObservations (line 195)
self = self.retainObservationData(indices);

Error in prtDataInterfaceCategoricalTargets/retainLabeled (line 392)
obj = obj.retainObservations(retainInd);

Error in prtClassLibSvm/trainAction (line 274)
auc = prtScoreAuc(yOut.retainLabeled);

Error in prtAction/train (line 221)
self = trainAction(self, ds);

libsvm compilation in Matlab2016a

prtSetupMex fails with

error: conflicting types for ‘mwIndex’
typedef int mwIndex;

for each .c file in +prtExternal/+libsvm

Libsvm now has a Github page: https://github.com/cjlin1/libsvm/tree/master/matlab
and adopting the latest version seems to solve this problem. Specifically,
#ifdef MX_API_VER
#if MX_API_VER < 0x07030000
typedef int mwIndex;
#endif
#endif
replaces
#if MX_API_VER < 0x07030000
typedef int mwIndex;
#endif

More testing is required to verify this solution. We should also determine if there is a better way of including dependencies on external software like this.

M-ary dataSet in feature selection

Hi,

Is there a way to use feature selection with a dataSet with 3 classes (besides prtFeatSelStatic)?

I'm using a dataSet with 7 features and 3 classes, and I would like my code to choose, from all 7 features, the ones that work best.

Thanks,
Ana

Difficulties using classifier

Hi,

I'm trying to use a classifier that I've created to see how accurate it can be.
I have 2 features of 157 independent observations from 2 classes, which I used to create a Maximum a Posteriori classifier:
features is a 157x2 matrix
labels = [zeros(79,1); ones(78,1)];
dataSet = prtDataSetClass(features,labels);
classifier = prtClassMap;
classifier = classifier.train(dataSet);

I have 1 single observation to test the classifier
features_test is a 1x2 matrix
labels_test = 0;
dataSet_test = prtDataSetClass(features_test,labels_test)

After I run the classifier
classified = run(classifier, dataSet_test)

I try to use this command
prtScoreRoc(classified,dataSet_test);

but it gives me this error
Error using prtScoreRoc (line 111)
ROC requires input labels to have 2 unique classes; unique(y(:)) = 0

How can I solve this? Thanks
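prtScoreRoc needs both classes present in the truth labels, and a single observation can only carry one, so with one test point only the hard decision can be checked. A sketch:

```matlab
% With one labeled test observation, compare the MAP decision directly.
classifier = prtClassMap;
classifier.internalDecider = prtDecisionMap;  % output class labels
classifier = classifier.train(dataSet);
classified = classifier.run(dataSet_test);
isCorrect  = (classified.X == dataSet_test.Y)
% For an ROC curve, hold out a test set that contains both classes.
```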
