gyrdym / ml_algo
Machine learning algorithms in Dart programming language
Home Page: https://gyrdym.github.io/ml_algo/
License: BSD 2-Clause "Simplified" License
Going to http://ml-algo.com, I get a Namecheap page saying the domain was recently registered. The https version of that URL is registered at https://pub.dev/publishers/ml-algo.com/packages . I'd recommend at least setting up a quick one-page website there with a link back to github - it helps with people like me, who try to look into a publisher before using a package off pub.dev :-)
Hello,
At the moment I have to retrain the model each time I use it. Fortunately, this does not take a lot of time, but because of that all sample data has to be present anywhere the model is used. One workaround that I found is to manually initialize a LinearRegressorImpl object with coefficients obtained from a trained LogisticRegressor model. However, this requires importing package private files which is not ideal. Adding a way to persist regressor models would be a great improvement!
Best regards.
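A trained linear model is fully described by its coefficients, so persisting it only requires writing those numbers out and reading them back. The sketch below illustrates the idea in plain Python and does not use ml_algo; `save_model`, `load_model` and the JSON layout are hypothetical names chosen for illustration. Later reports on this page call `saveAsJson` and `LinearRegressor.fromJson`, which suggests the library offers this kind of persistence.

```python
import json
import os
import tempfile


def save_model(path, coefficients, intercept=0.0):
    """Write the learned parameters to a JSON file (hypothetical layout)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"coefficients": list(coefficients), "intercept": intercept}, f)


def load_model(path):
    """Read the parameters back; no training data is needed."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["coefficients"], data["intercept"]


def predict(coefficients, intercept, features):
    """Linear prediction: intercept + sum(w_i * x_i)."""
    return intercept + sum(w * x for w, x in zip(coefficients, features))


def demo():
    # Save made-up parameters, restore them, and predict without retraining.
    path = os.path.join(tempfile.mkdtemp(), "model.json")
    save_model(path, [0.5, -2.0], intercept=1.0)
    coefficients, intercept = load_model(path)
    return predict(coefficients, intercept, [4.0, 1.0])
```

With this approach only the small JSON file has to travel with the application, not the sample data.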
Why doesn't this lib work on the web platform? Is there any plan to implement web support?
I see we can only choose between Gradient and Coordinate. I believe this is why the data I'm getting back isn't what I expect. If that's not the reason, please let me know, thanks!
For example, if I have the data
Grind, RoastLevel, Time
5.5, 5, 20
5.25, 5, 22
And I make RoastLevel and Time independent variables, and Grind dependent, I can get an expected prediction with Python using Sklearn out of the box.
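from sklearn.linear_model import LinearRegression

# Assumed to hold the rows [Grind, RoastLevel, Time] from the table above
floats = [[5.5, 5, 20], [5.25, 5, 22]]
data_minus_grind = []
grind_data = []
for row in floats:
    data_minus_grind.append(row[1:3])  # independent variables: RoastLevel, Time
    grind_data.append(row[0])  # dependent variable: Grind
model = LinearRegression().fit(data_minus_grind, grind_data)
prediction_data = [[5, 25]]
prediction = model.predict(prediction_data)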
This prediction, trying to predict the Grind for the Time 25, I receive 4.875, which sounds about right. Grind should go down while Time goes up.
However, if I try to use this library, my slope is always in the wrong direction, with the two variables moving in the same direction for some reason.
If I try
final samples = DataFrame.fromRawCsv(rawCsvContent, headerExists: true);
const targetColName = "Grind";
final defaultRegressor = LinearRegressor(
  samples,
  targetColName
);
final dataToPredict = [
  // Roast level, time
  [ 5, 25.0 ]
];
final dataframeToPredict = DataFrame(dataToPredict, headerExists: false);
final prediction = defaultRegressor.predict(dataframeToPredict);
Then the result for Grind with a Time of 25.0 is 6.79. As I move up the time, Grind should decrease, but instead it increases. I've tried tweaking many of the parameters but haven't found a fix.
Thanks!
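With only two rows and a constant RoastLevel, the fit reduces to a line through two points, which can be checked by hand. The sketch below is plain Python, not ml_algo; it drops the constant RoastLevel column (an assumption made here for illustration) and reproduces the 4.875 reported for scikit-learn.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# The two rows from the report: Time -> Grind (RoastLevel is constant at 5).
times = [20.0, 22.0]
grinds = [5.5, 5.25]

slope, intercept = fit_line(times, grinds)
prediction = slope * 25.0 + intercept
print(slope, intercept, prediction)  # prints -0.125 8.0 4.875
```

The negative slope confirms the expectation that Grind falls as Time rises. A gradient-based fit on unscaled features may stop far from this solution within its iteration limit, which is one possible explanation for the 6.79; that is a guess, not a diagnosis.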
See details in the report on pub.dartlang.org
How difficult would it be to implement RL algorithms or provide support for RL-based training methods? Is this on the roadmap by any chance?
Create an entry file
Hello !
I have been using flutter on my free time and found out about this interesting library.
As a side project, I would like to add an implementation of the random forest to the library via a pull request.
Would you be interested ?
Is there a way to get the path down the tree for a prediction from the decision tree? I can import the implementation, copy/paste the predict method and add nodes to a list, but that's not ideal.
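To make the request concrete: recording the path only needs the prediction loop to remember each test it takes on the way to a leaf. The sketch below uses a hypothetical `Node` structure in plain Python; it does not reflect ml_algo's internal tree representation.

```python
class Node:
    """A binary decision-tree node (hypothetical structure, not ml_algo's)."""

    def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right
        self.label = label  # set only on leaves


def predict_with_path(root, sample):
    """Return (label, path), where path lists the tests taken from root to leaf."""
    path = []
    node = root
    while node.label is None:
        go_left = sample[node.feature] <= node.threshold
        path.append((node.feature, node.threshold, "left" if go_left else "right"))
        node = node.left if go_left else node.right
    return node.label, path


# A made-up two-level tree for illustration.
tree = Node(feature=0, threshold=2.5,
            left=Node(label="a"),
            right=Node(feature=1, threshold=1.0,
                       left=Node(label="b"), right=Node(label="c")))

label, path = predict_with_path(tree, [3.0, 0.5])
```

Here `label` is `"b"` and `path` holds one `(feature, threshold, branch)` entry per visited split.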
Hi, after building a DecisionTreeClassifier, how can I persist its model so that I can reuse it in later sessions? Training a decision tree takes a lot of time.
I want to return the nearest neighbor to my data point. However, it is returning the wrong one. I tried returning multiple nearest neighbors and there also seem to be some inconsistencies with the returned list of neighbors.
In the screenshot below, if I return the nearest neighbor, it has distance 8,9. But if I return the two nearest neighbors then it actually returns to me the two nearest ones with distances 2,0 and 3,19.
The same happens when returning the 3 and the 4 nearest neighbors: for k=3, the values are wrong.
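A brute-force search is a simple reference for checking results like these: the list for k must always be a prefix of the list for k+1. The sketch below is plain Python with made-up points and assumes Euclidean distance; the report does not say which distance or data were used.

```python
import math


def k_nearest(points, query, k):
    """Brute-force k-NN: sort every point by Euclidean distance to the query."""
    ranked = sorted(points, key=lambda p: math.dist(p, query))
    return [(p, math.dist(p, query)) for p in ranked[:k]]


# Made-up sample points for illustration.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (1.0, 0.0)]
query = (0.0, 0.0)

one = k_nearest(points, query, 1)
two = k_nearest(points, query, 2)
```

If a library returns a k=1 neighbor that is farther away than the neighbors it returns for k=2, comparing against such a reference can help narrow down where the ordering goes wrong.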
Calculating derivatives is currently extremely slow because the gradient calculator accesses vector elements by index. Use the well-known formula instead.
Maybe it is worth referring to Kaggle competitions
Hello, I have a chart with many data points in my Flutter app and I am trying to draw a trend line in it as you can see in this example: https://google.github.io/charts/flutter/example/combo_charts/scatter_plot_line
To do so, I have programmed the class seen below and implemented the LinearRegressor in it. I want to display the line by determining the y-values for two values on the x-axis of my chart using my predict function and then drawing a line through those points. However, it seems that there is a bug in my predict function which you can see below and I can't quite figure it out.
I can say, that in the train function I have two columns, in the first column are the x values and in the second column are the corresponding y values of the points in the chart. I then use this data to train the LinearRegressor. The predict function takes an x-value, which is why the ySeries, the second column, is empty there. With one row and two columns, the second of which is empty, I then try to predict this empty space, which is the y-value.
I assume that something is wrong here in my implementation of the prediction, but unfortunately I don't understand what. I hope the error description is sufficient.
This is the error I get:
E/flutter ( 8937): [ERROR:flutter/lib/ui/ui_dart_state.cc(199)] Unhandled Exception: Exception: The dimension of the vector and the columns number of the matrix mismatch
E/flutter ( 8937): #0 MatrixImpl._matrixVectorMul (package:ml_linalg/src/matrix/matrix_impl.dart:500:7)
E/flutter ( 8937): #1 MatrixImpl.* (package:ml_linalg/src/matrix/matrix_impl.dart:89:14)
E/flutter ( 8937): #2 LinearRegressorImpl.predict (package:ml_algo/src/regressor/linear_regressor/linear_regressor_impl.dart:152:11)
E/flutter ( 8937): #3 AnalyticsLinearRegression.predict (package:trimlog/services/ml/analytics_linear_regression.dart:87:35)
E/flutter ( 8937): <asynchronous suspension>
E/flutter ( 8937): #4 _AnalyticGraphState.build.<anonymous closure>.<anonymous closure>.<anonymous closure>._predict.<anonymous closure> (package:trimlog/screens/analytics/analytics_graphs.dart:143:49)
E/flutter ( 8937): <asynchronous suspension>
E/flutter ( 8937):
This is the class used for the linear regression:
class AnalyticsLinearRegression {
  List<Trim> trims;
  final String xCategory;
  final String xParameter;
  final String yCategory;
  final String yParameter;

  AnalyticsLinearRegression(this.trims, this.xCategory, this.xParameter, this.yCategory, this.yParameter);

  List<Series> _prepareData() {
    // Remove trims which do not contain the parameter shown in this analytic
    List<Trim> temp = new List.from(trims);
    trims.forEach((trim) {
      Map<String, dynamic> map = trim.toMap();
      if ((!(map[xCategory] as Map).containsKey(xParameter)) || (map[xCategory][xParameter] == null) || (!(map[yCategory] as Map).containsKey(yParameter)) || (map[yCategory][yParameter] == null))
        temp.remove(trim);
    });
    trims = temp;
    // Extract the parameters shown in the analytic from the trims
    List x = [];
    List y = [];
    trims.forEach((trim) {
      Map<String, dynamic> map = trim.toMap();
      x.add((map[xCategory][xParameter] is List ? map[xCategory][xParameter].first : map[xCategory][xParameter]) * 1.0);
      y.add((map[yCategory][yParameter] is List ? map[yCategory][yParameter].first : map[yCategory][yParameter]) * 1.0);
    });
    Series xSeries = Series(xParameter, x); // First column, given parameter
    Series ySeries = Series(yParameter, y); // Second column, predicted parameter
    return [xSeries, ySeries];
  }

  Future train() async {
    final Iterable<Series> data = _prepareData();
    final dataFrame = DataFrame.fromSeries(data);
    if (dataFrame.rows.length <= 2) return; // <= 2 datapoints results in errors
    final targetColumnName = yParameter; // The second column (y) contains the parameter that I later want to predict
    final splits = splitData(dataFrame, [0.7]);
    final validationData = splits[0];
    // final testData = splits[1];
    final validator = CrossValidator.kFold(validationData, numberOfFolds: validationData.rows.length - 1);
    final createClassifier = (DataFrame samples) => LinearRegressor(
          samples,
          targetColumnName,
          optimizerType: LinearOptimizerType.gradient,
          iterationsLimit: 90,
          learningRateType: LearningRateType.decreasingAdaptive,
          batchSize: samples.rows.length,
        );
    final scores = await validator.evaluate(createClassifier, MetricType.rmse);
    final accuracy = scores.mean();
    print('Accuracy on root mean squared error (RMSE) validation: ${accuracy.toStringAsFixed(2)}');
    // final testSplits = splitData(testData, [1.00]);
    // final classifier = createClassifier(testSplits[0]);
    // final finalScore = classifier.assess(testSplits[1], MetricType.rmse);
    // print(finalScore.toStringAsFixed(2));
    // await classifier.saveAsJson(xParameter + '_' + yParameter + '_classifier.json');
    final classifier = createClassifier(dataFrame);
    await classifier.saveAsJson(await _classifierPath);
  }

  Future retrain(List<Trim> newData) async {
    final classifier = await _linearRegressor;
    trims = newData;
    final Iterable<Series> data = _prepareData();
    final dataFrame = DataFrame.fromSeries(data);
    final retrainedClassifier = classifier.retrain(dataFrame);
    await retrainedClassifier.saveAsJson(await _classifierPath);
  }

  /// Predicts the y value (second column) for a given x value (double).
  /// Used to get two points and draw a line through them as a trendline (like here: https://google.github.io/charts/flutter/example/combo_charts/scatter_plot_line)
  Future<double> predict(double x) async {
    final classifier = await _linearRegressor;
    Series xSeries = Series(xParameter, [x]); // First column value (x)
    Series ySeries = Series(yParameter, []); // Second column value (y) should get predicted and returned, therefore this is empty
    final data = DataFrame.fromSeries([xSeries, ySeries]);
    final prediction = classifier.predict(data); // Predict the corresponding y value to the given x value
    return prediction.rows.first.first; // Prediction should only contain one row and this row should contain the predicted y value
  }

  Future<String> get _classifierPath async => (await getTemporaryDirectory()).path + "/" + xParameter + '_' + yParameter + '_classifier.json'; // Path where the classifier is saved
  Future<File> get _file async => File(await _classifierPath); // File containing the classifier (JSON)
  Future<String> get _encodedModel async => (await _file).readAsString(); // Classifier as JSON
  Future<LinearRegressor> get _linearRegressor async => LinearRegressor.fromJson(await _encodedModel); // Linear regressor from file
}
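The exception says that a vector and the column count of a matrix do not match. One plausible cause, stated here as an assumption since the model internals are not shown, is that the frame passed to predict has two columns (x plus the empty y) while a model fit on a single feature holds a single coefficient. The plain-Python sketch below reproduces that kind of mismatch with a hypothetical one-coefficient model.

```python
def predict(coefficients, rows):
    """Multiply each feature row by the coefficient vector."""
    for row in rows:
        if len(row) != len(coefficients):
            raise ValueError(
                f"row has {len(row)} columns, model expects {len(coefficients)}"
            )
    return [sum(w * x for w, x in zip(coefficients, row)) for row in rows]


coefficients = [2.0]  # hypothetical model fit on a single feature x

ok = predict(coefficients, [[3.0]])  # features only: one column

try:
    predict(coefficients, [[3.0, 0.0]])  # feature plus a placeholder target column
    mismatch = None
except ValueError as error:
    mismatch = str(error)
```

If the same holds for the regressor in the report, building the prediction frame from the x series alone would avoid the mismatch.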
It should be possible to fill these values with 0, the column mean, a custom value, or the column median, or to throw an error
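The options listed above can be sketched as a single function with a strategy argument. This is a plain-Python illustration under the assumption that missing entries are represented as `None`; the function name and the strategy names are hypothetical, not part of any library.

```python
import statistics


def fill_missing(column, strategy="mean", custom=None):
    """Replace None entries according to the chosen strategy."""
    present = [v for v in column if v is not None]
    if strategy == "error":
        if len(present) != len(column):
            raise ValueError("column contains missing values")
        return list(column)
    if strategy == "zero":
        fill = 0.0
    elif strategy == "mean":
        fill = statistics.mean(present)
    elif strategy == "median":
        fill = statistics.median(present)
    elif strategy == "custom":
        fill = custom
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in column]


column = [1.0, None, 3.0, 8.0]  # made-up column with one missing entry
```

For example, `fill_missing(column, "median")` fills the gap with 3.0, while `fill_missing(column, "error")` raises instead of guessing.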
Here is my data :
(src, day, time, dest)
(0, 0, 450, 4)
(1, 0, 110, 5)
(0, 1, 450, 4)
(1, 1, 110, 5)
(0, 2, 450, 4)
(1, 2, 110, 5)
(0, 3, 450, 4)
(1, 3, 110, 5)
(0, 4, 450, 4)
(1, 4, 110, 5)
(0, 5, 450, 4)
(1, 5, 110, 5)
(2, 6, 660, 6)
(3, 6, 1170, 7)
(0, 0, 450, 4)
(1, 0, 110, 5)
(0, 1, 450, 4)
(1, 1, 110, 5)
(0, 2, 450, 4)
(1, 2, 110, 5)
(0, 3, 450, 4)
(1, 3, 110, 5)
(0, 4, 450, 4)
(1, 4, 110, 5)
(0, 5, 450, 4)
(1, 5, 110, 5)
(2, 6, 660, 6)
(3, 6, 1170, 8)
It then throws this exception while trying to create the classifier:
Unhandled exception:
Invalid argument(s)
#0 _TypedList._setFloat32 (dart:typed_data-patch/typed_data_patch.dart:2126:36)
#1 _Float32ArrayView.[]= (dart:typed_data-patch/typed_data_patch.dart:4461:16)
#2 new Float32MatrixDataManager.fromList
package:ml_linalg/…/data_manager/float32_matrix_data_manager.dart:37
#3 MatrixFactoryImpl.fromList
package:ml_linalg/…/matrix/matrix_factory_impl.dart:21
#4 new Matrix.fromList
package:ml_linalg/matrix.dart:42
#5 DataFrameImpl.toMatrix
package:ml_dataframe/…/data_frame/data_frame_impl.dart:143
#6 createLogLikelihoodOptimizer
package:ml_algo/…/_helpers/create_log_likelihood_optimizer.dart:46
#7 LogisticRegressorFactoryImpl.create
package:ml_algo/…/logistic_regressor/logistic_regressor_factory_impl.dart:58
#8 new LogisticRegressor
package:ml_algo/…/logistic_regressor/logistic_regressor.dart:153
#9 main.<anonymous closure>
bin\knn.dart:41
#10 main
bin\knn.dart:53
<asynchronous suspension>
Classifier is constructed this way :
final createClassifier = (DataFrame samples) => LogisticRegressor(
      samples,
      targetColumnName,
      optimizerType: LinearOptimizerType.gradient,
      iterationsLimit: 90,
      learningRateType: LearningRateType.decreasingAdaptive,
      batchSize: samples.rows.length,
      probabilityThreshold: 0.7,
    );
Hey,
Thanks a lot for the library. Really impressed with how much you can do with dart!
While trying to run a linear regression for a simple line y(x) = x, I found the following issues, which I suppose are due to the configuration of the regressor. Please help me configure it.
The code below gives my expected result for most of the cases, with k around 1.00. However, in some cases, i.e.
a=1 n=10 -> k (0.9994153380393982) rows ((9.994153022766113))
a=0 n=10 -> k (0.3038938045501709) rows ((3.038938045501709))
a=-10 n=10 -> k (0.5980027318000793) rows ((5.980027198791504))
a=1 n=100 -> k (NaN) rows ((0.0))
the result is different. Is this because of the configuration?
Also, is there a way to retrieve b from y(x) = kx + b?
Thank you!
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';
import 'package:xrange/xrange.dart';

main() {
  var a = 1;
  var n = 100;
  var _data = NumRange.closed(a, n).values().map((it) => [it, it]);
  final data = [['x', 'y'], ..._data];
  print(data);
  final samples = DataFrame(data, headerExists: true);
  final regressor = LinearRegressor(samples, 'y');
  var prediction = regressor.predict(DataFrame([['x', 'y'], [10.0,]],));
  print("a=$a n=$n -> k ${regressor.coefficients} rows ${prediction.rows}");
}
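The NaN for n=100 looks like a diverging gradient descent: larger x values enlarge the gradient until the updates overflow. Whether this is what happens inside LinearRegressor with its default settings is an assumption. The plain-Python sketch below only demonstrates the effect for a model y = kx without intercept, with learning rates picked for illustration.

```python
import math


def fit_slope(xs, ys, learning_rate, iterations):
    """Batch gradient descent for y = k * x (no intercept), minimizing MSE."""
    k = 0.0
    n = len(xs)
    for _ in range(iterations):
        gradient = sum(2.0 * (k * x - y) * x for x, y in zip(xs, ys)) / n
        k -= learning_rate * gradient
    return k


xs = [float(i) for i in range(1, 101)]  # a=1, n=100 as in the report
ys = xs[:]                              # y(x) = x, so the true slope is 1

# Step too large for inputs of this magnitude: the updates blow up to NaN.
diverged = fit_slope(xs, ys, learning_rate=0.01, iterations=500)
# A smaller step converges to the true slope.
converged = fit_slope(xs, ys, learning_rate=0.0001, iterations=500)
```

Scaling x into a small range or lowering the learning rate are the usual remedies for this behaviour.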
This is a great project. I started to hate Python after using Dart with Flutter. I realized that I am still googling every basic stuff when using Python whereas with Dart it just takes microseconds to find the right method after putting dot. I wonder if it is possible to integrate popular machine learning algorithms like xgboost in Dart using FFI.
Thanks a lot for creating this library.
Hi! Thanks for the great plugin!
Are you planning to make a classifier with support vector machine?
If that is the case, when is it going to be released?
Thanks.
I'm facing a problem with ml_algo. The package requires > 3.0.1 for json_annotation, but I also use build_runner, which requires > 4.0.1. This blocks my development because pub get fails with this incompatibility. If it is not a problem, please update json_annotation to 4.0.1 in all the packages.
decreasing 2 times as smaller every iteration, custom change and so on
Hi, what would be the best way to persist a knn model without recreating it every time? Is this also serializable, or does this model always need data to compute predictions lazily?
Add an entry file
There are some methods from ml_linalg that are deprecated. It is needed to replace them with actual ones.
Can this library be used for face recognition?
Sorry if my question is stupid, I'm a beginner in this field.
Hi, I have been trying to replicate the example here, but I cannot write the json classifier model due to the following error:
Running "flutter pub get" in logistic_regressor...
Launching lib\main.dart on sdk gphone x86 in debug mode...
Running Gradle task 'assembleDebug'...
√ Built build\app\outputs\flutter-apk\app-debug.apk.
Installing build\app\outputs\flutter-apk\app.apk...
Debug service listening on ws://127.0.0.1:64760/6bAEB-pabM4=/ws
Syncing files to device sdk gphone x86...
I/flutter ( 7530): accuracy on k fold validation: 0.63
I/flutter ( 7530): 0.76
E/flutter ( 7530): [ERROR:flutter/lib/ui/ui_dart_state.cc(199)] Unhandled Exception: FileSystemException: Cannot create file, path = 'diabetes_classifier.json' (OS Error: Read-only file system, errno = 30)
E/flutter ( 7530): #0 _File.create.<anonymous closure> (dart:io/file_impl.dart:255:9)
E/flutter ( 7530): #1 _rootRunUnary (dart:async/zone.dart:1362:47)
E/flutter ( 7530): #2 _CustomZone.runUnary (dart:async/zone.dart:1265:19)
E/flutter ( 7530): <asynchronous suspension>
E/flutter ( 7530): #3 SerializableMixin.saveAsJson (package:ml_algo/src/common/serializable/serializable_mixin.dart:9:18)
E/flutter ( 7530): <asynchronous suspension>
E/flutter ( 7530): #4 _MyHomePageState.trainModel (package:logistic_regressor/main.dart:99:5)
E/flutter ( 7530): <asynchronous suspension>
E/flutter ( 7530):
I have tried using permission in android/app/src/main/AndroidManifest.xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
package="com.example.logistic_regressor">
<uses-permission android:name="android.permission.MANAGE_EXTERNAL_STORAGE" />
and also requested the permission via Permission.manageExternalStorage.request() in my main.dart:
import 'dart:io';
import 'package:flutter/material.dart';
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';
import 'package:ml_preprocessing/ml_preprocessing.dart';
import 'package:flutter/services.dart' show rootBundle;
import 'package:permission_handler/permission_handler.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData(
        primarySwatch: Colors.blue,
      ),
      home: MyHomePage(title: 'Flutter Demo Home Page'),
    );
  }
}

class MyHomePage extends StatefulWidget {
  MyHomePage({Key? key, required this.title}) : super(key: key);
  final String title;
  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  void trainModel() async {
    final rawCsvContent = await rootBundle.loadString('datasets/pima_indians_diabetes_database.csv');
    final samples = DataFrame.fromRawCsv(rawCsvContent);
    // === Prepare Dataset ===
    final targetColumnName = 'class variable (0 or 1)';
    final splits = splitData(samples, [0.7]);
    final validationData = splits[0];
    final testData = splits[1];
    // === Setup model selection algorithm ===
    final validator = CrossValidator.kFold(validationData, numberOfFolds: 5);
    final createClassifier = (DataFrame samples) =>
        LogisticRegressor(
          samples,
          targetColumnName,
          optimizerType: LinearOptimizerType.gradient,
          iterationsLimit: 90,
          learningRateType: LearningRateType.decreasingAdaptive,
          batchSize: samples.rows.length,
          probabilityThreshold: 0.7,
          collectLearningData: true,
        );
    // === Evaluate model performance ===
    final scores = await validator.evaluate(createClassifier, MetricType.accuracy);
    final accuracy = scores.mean();
    print('accuracy on k fold validation: ${accuracy.toStringAsFixed(2)}');
    final testSplits = splitData(testData, [0.8]);
    final classifier = createClassifier(testSplits[0]);
    final finalScore = classifier.assess(testSplits[1], MetricType.accuracy);
    print(finalScore.toStringAsFixed(2)); // approx. 0.75
    // === Write the model to JSON file ===
    var status = await Permission.manageExternalStorage.status;
    if (status.isDenied) {
      await Permission.manageExternalStorage.request();
    }
    await classifier.saveAsJson('diabetes_classifier.json');
  }

  @override
  void initState() {
    // TODO: implement initState
    super.initState();
    trainModel();
  }

  @override
  Widget build(BuildContext context) {
    return new MaterialApp(
      home: new Scaffold(
        appBar: new AppBar(
          title: new Text('Plugin example app'),
        ),
        body: new Center(
          child: new Column(children: <Widget>[
            new Text('Running'),
          ]),
        ),
      ),
    );
  }
}
If anyone got insight on what the problem is, I would really appreciate if you could help.
Cheers!
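The path in the exception is relative, so it is resolved against the process working directory, which the OS reports as read-only here; a storage permission would not change where a relative path points. A common remedy is to join the file name to a directory the app is allowed to write to. The sketch below shows the idea in Python, with a temporary directory standing in for an app-specific directory. In Flutter that directory would typically come from the path_provider package, as the `getTemporaryDirectory()` call in an earlier report on this page does. The `save_json` helper is a hypothetical name.

```python
import json
import os
import tempfile


def save_json(directory, file_name, payload):
    """Write payload as JSON to an absolute path inside a writable directory."""
    path = os.path.join(os.path.abspath(directory), file_name)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f)
    return path


writable_dir = tempfile.mkdtemp()  # stand-in for an app-specific directory
saved_path = save_json(writable_dir, "diabetes_classifier.json", {"coefficients": [0.1, 0.2]})
```

The returned absolute path can then be stored and reused when loading the model in a later session.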