
winvector / logistic


Experimental logistic regression code supporting multiple result categories, many levels of categorical modeling variables, good optimization, L2 regularization and more.

Home Page: http://www.win-vector.com/blog/2010/12/large-data-logistic-regression-with-example-hadoop-code/

Shell 0.22% Java 99.78%

logistic's People

Contributors

johnmount


logistic's Issues

Question about the map and reduce functions

Hi, I read your logistic regression code, but I don't yet understand what the map function does. Which part of the logistic regression algorithm does it implement? Could you explain what your map and reduce code implement?
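
For readers with the same question, here is a generic sketch of how one step of logistic regression training is often split into a map and a reduce. It is not taken from this repository and may differ from what the winvector mapper and reducer actually do: the class name LogisticGradientSketch and the methods mapRow and reduce are made up for illustration. The idea is that the gradient of the log-likelihood is a sum over rows, so each row's term (y - p) * x can be computed independently (map) and the terms added together afterwards (reduce).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the repository's code.
class LogisticGradientSketch {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // "Map": one row's contribution to the log-likelihood gradient, (y - p) * x,
    // where p = sigmoid(beta . x) is the predicted probability for that row.
    static double[] mapRow(double[] x, double y, double[] beta) {
        double z = 0.0;
        for (int i = 0; i < x.length; i++) {
            z += beta[i] * x[i];
        }
        double residual = y - sigmoid(z);
        double[] g = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            g[i] = residual * x[i];
        }
        return g;
    }

    // "Reduce": add up the per-row contributions into one gradient vector.
    static double[] reduce(List<double[]> parts, int dim) {
        double[] total = new double[dim];
        for (double[] p : parts) {
            for (int i = 0; i < dim; i++) {
                total[i] += p[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy data: first column is a constant intercept term.
        double[][] xs = {{1, 0}, {1, 1}, {1, 2}};
        double[] ys = {0, 0, 1};
        double[] beta = {0, 0};

        List<double[]> parts = new ArrayList<>();
        for (int r = 0; r < xs.length; r++) {
            parts.add(mapRow(xs[r], ys[r], beta));
        }
        System.out.println(Arrays.toString(reduce(parts, beta.length))); // prints [-0.5, 0.5]
    }
}
```

A driver would then use the summed gradient to update beta and launch the next pass over the data. Second-order methods sum a per-row Hessian term in the same way.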

File does not exist: for log reg training

Hi,

I am trying to run the logistic regression training and get the error java.io.FileNotFoundException: File does not exist: /user/ameet/test.csv. I tried passing both a directory and a file, with the same result.

I have tried both plain Java and Hadoop modes and get the same issue.
In plain Java, on the local file system, I get the following result:
java -cp WinVectorLogistic.Hadoop0.20.2.jar:commons-cli-1.2.jar:hadoop-0.20.2-cdh3u2-core.jar:commons-logging-api-1.0.4.jar:commons-logging-1.0.4.jar com.winvector.logistic.LogisticTrain -trainURI test.csv -formula "col4 ~ col1 + col2 + col3"
Feb 22, 2012 9:58:44 AM com.winvector.logistic.LogisticTrain main
INFO: start LogisticTrain Wed Feb 22 09:58:44 EST 2012
Feb 22, 2012 9:58:44 AM com.winvector.logistic.LogisticTrain main
INFO: source URI: /mnt/home/ameet/bofa/ML/logistic/uciCarTrain.tsv
Feb 22, 2012 9:58:44 AM com.winvector.logistic.LogisticTrain main
INFO: trainer: class com.winvector.logistic.LogisticTrain
Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: URI is not absolute
at com.winvector.util.TrivialReader.iterator(TrivialReader.java:203)
at com.winvector.util.TrivialReader.iterator(TrivialReader.java:1)
at com.winvector.logistic.LogisticTrain.buildVariableDefs(LogisticTrain.java:48)
at com.winvector.logistic.LogisticTrain.buildAdpater(LogisticTrain.java:60)
at com.winvector.logistic.LogisticTrain.train(LogisticTrain.java:272)
at com.winvector.logistic.LogisticTrain.run(LogisticTrain.java:291)
at com.winvector.logistic.LogisticTrain.main(LogisticTrain.java:258)
Caused by: java.lang.IllegalArgumentException: URI is not absolute
at java.net.URI.toURL(URI.java:1079)
at com.winvector.util.TrivialReader.openBufferedReader(TrivialReader.java:82)
at com.winvector.util.TrivialReader.iterator(TrivialReader.java:201)
... 6 more

And in Hadoop:
java.io.FileNotFoundException: File does not exist: /user/ameet/test.csv
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1602)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1593)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:428)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
at com.winvector.logistic.mr.WritableUtils.readFirstLine(WritableUtils.java:39)
at com.winvector.logistic.demo.MapReduceLogisticTrain.run(MapReduceLogisticTrain.java:64)
at com.winvector.logistic.demo.MapReduceLogisticTrain.run(MapReduceLogisticTrain.java:51)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.winvector.logistic.demo.MapReduceLogisticTrain.main(MapReduceLogisticTrain.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at com.winvector.logistic.demo.DemoDriver.main(DemoDriver.java:21)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Exception in thread "main" java.lang.Exception: java.io.FileNotFoundException: File does not exist: /user/ameet/test.csv
at com.winvector.logistic.demo.DemoDriver.main(DemoDriver.java:25)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

What am I doing wrong?

Thanks,

ameet
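
A note on the plain-Java trace above: java.net.URI.toURL() throws IllegalArgumentException with the message "URI is not absolute" whenever the URI has no scheme, which is what a bare name like test.csv parses to. One possible workaround, assuming -trainURI accepts any absolute URI (not confirmed here), is to pass an absolute file: URI instead. The sketch below uses only the JDK; the class name AbsoluteUriDemo and the helper toAbsoluteFileUri are made up for illustration and are not part of the winvector code.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;

// Hypothetical illustration only; not the repository's code.
class AbsoluteUriDemo {
    // Turn a plain path such as "test.csv" into an absolute file: URI,
    // resolved against the current working directory.
    static URI toAbsoluteFileUri(String path) {
        return new File(path).getAbsoluteFile().toURI();
    }

    public static void main(String[] args) throws Exception {
        // A bare file name parses as a relative URI (no scheme).
        URI relative = new URI("test.csv");
        System.out.println("relative is absolute? " + relative.isAbsolute());
        try {
            relative.toURL();
        } catch (IllegalArgumentException e) {
            // Same failure as in the stack trace above.
            System.out.println("toURL failed: " + e.getMessage());
        }

        // The absolute form converts to a URL without error.
        URI absolute = toAbsoluteFileUri("test.csv");
        URL url = absolute.toURL();
        System.out.println(url);
    }
}
```

The last line printed is the kind of value that could be tried for -trainURI. The Hadoop trace is a separate matter: it reports that /user/ameet/test.csv does not exist in HDFS, so the file would need to be present at that HDFS path before the job runs.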

