Comments (6)
Yes. We only support MemoryData right now.
What is the content of your /data/mnist_memory_autoencoder_solver.prototxt? More specifically, I would like to know its value for source_class.
from caffeonspark.
name: "MNISTAutoencoder"
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.0039215684
  }
  source_class: "com.yahoo.ml.caffe.LMDB"
  memory_data_param {
    source: "mnist_train_lmdb/"
    batch_size: 64
    channels: 1
    height: 28
    width: 28
    share_in_parallel: false
  }
}
The source_class is the same for the TEST-phase layers (stages test-on-train and test-on-test).
I suspect that you missed some spaces in your CLI. Please add a space character before every \.
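To see why a missing space before a trailing \ can matter, here is a minimal Python sketch. It assumes a POSIX shell, which deletes each backslash-newline pair before splitting the line into words; `shlex` does not do that by itself, so the helper emulates it. The helper name and the file names a.prototxt and b.prototxt are made up for illustration. Note that if the continuation line starts with whitespace, the words stay separate anyway.

```python
import shlex

def shell_words(script):
    """Approximate how a POSIX shell splits a multi-line command into words."""
    # The shell removes every backslash-newline pair before word splitting;
    # shlex would keep the newline, so strip the pair first.
    return shlex.split(script.replace("\\\n", ""))

# Illustrative commands only (hypothetical file names).
no_space = "spark-submit --files a.prototxt,b.prototxt\\\n--conf spark.task.cpus=1"
with_space = "spark-submit --files a.prototxt,b.prototxt \\\n--conf spark.task.cpus=1"

print(shell_words(no_space))    # the file list and --conf fuse into one word
print(shell_words(with_space))  # --conf stays a separate option
```

With the space in place, the option that follows the line break is still seen as its own word.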
I found the problem! On the second line of my CLI there was a space between the file names:
--files mnist_memory_autoencoder.prototxt, mnist_memory_autoencoder_solver.prototxt\
With a space after the comma, the error occurs.
After I removed the space, my CLI worked.
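The effect of that stray space can be sketched in Python with `shlex.split`, which follows POSIX word-splitting rules. The helper name `files_argument` is hypothetical; the file names are the ones from the command above. The sketch only shows how the shell divides the words. What spark-submit then does with the leftover word is not shown here, though it will most likely treat it as a positional argument.

```python
import shlex

def files_argument(command_line):
    """Return the word passed to --files and the words that follow it."""
    tokens = shlex.split(command_line)
    idx = tokens.index("--files")
    return tokens[idx + 1], tokens[idx + 2:]

broken = ("spark-submit --files mnist_memory_autoencoder.prototxt, "
          "mnist_memory_autoencoder_solver.prototxt")
fixed = ("spark-submit --files mnist_memory_autoencoder.prototxt,"
         "mnist_memory_autoencoder_solver.prototxt")

# With the space, --files receives only the first name (plus a trailing comma)
# and the second name becomes a separate word on the command line.
print(files_argument(broken))
# Without the space, both names reach --files as one comma-separated value.
print(files_argument(fixed))
```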
However, we got another error:
16/03/30 12:49:12 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-172-31-2-118.eu-west-1.compute.internal, partition 0,PROCESS_LOCAL, 2216 bytes)
16/03/30 12:49:12 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, ip-172-31-2-117.eu-west-1.compute.internal, partition 1,PROCESS_LOCAL, 2216 bytes)
16/03/30 12:49:12 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-172-31-2-118.eu-west-1.compute.internal:39514 (size: 2.0 KB, free: 8.9 GB)
16/03/30 12:49:12 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-172-31-2-117.eu-west-1.compute.internal:58422 (size: 2.0 KB, free: 8.9 GB)
16/03/30 12:49:12 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, ip-172-31-2-117.eu-west-1.compute.internal): java.io.FileNotFoundException: /root/CaffeOnSpark/data/mnist_memory_autoencoder.prototxt (No such file or directory)
The question is: must the prototxt files exist on all workers (nodes)? If so, how can I copy these files to the workers?
Thanks again!
My mistake! I had used a version of mnist_memory_autoencoder_solver.prototxt with a path in the "net" parameter. After I removed the path, it worked.
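A rough Python sketch of that fix follows. spark-submit's --files option places the listed files in each executor's working directory, so a "net" value that points into a directory existing only on the driver machine can fail on the workers, as in the FileNotFoundException above. The helper names are hypothetical and not part of CaffeOnSpark, the regular expression is a naive stand-in for a real prototxt parser, and the base_lr line is only there to show that other fields are left alone.

```python
import posixpath
import re

# Naive pattern for a solver line such as: net: "some/dir/model.prototxt"
NET_RE = re.compile(r'^([ \t]*net[ \t]*:[ \t]*")([^"]+)(")', re.MULTILINE)

def net_reference(solver_text):
    """Return the value of the solver's net parameter, or None if absent."""
    match = NET_RE.search(solver_text)
    return match.group(2) if match else None

def has_directory(path):
    """True when the path carries a directory component."""
    return posixpath.dirname(path) != ""

def strip_directory(solver_text):
    """Rewrite net: "<dir>/<file>" as net: "<file>", leaving other lines alone."""
    return NET_RE.sub(
        lambda m: m.group(1) + posixpath.basename(m.group(2)) + m.group(3),
        solver_text,
    )

# Illustrative solver content; the path is the one from the log above.
solver = ('net: "/root/CaffeOnSpark/data/mnist_memory_autoencoder.prototxt"\n'
          'base_lr: 0.01\n')
if has_directory(net_reference(solver)):
    print(strip_directory(solver))
```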
I hit the same error (NullPointerException) when training another network (CaffeNet); for more detail see issue #217.
I changed the spark-submit command to use ${CAFFE_ON_SPARK}/data/, so the command is now:
spark-submit --master ${MASTER_URL}
--files
--conf spark.cores.max=${TOTAL_CORES}
--conf spark.task.cpus=${CORES_PER_WORKER}
--conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}"
--conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}"
--class com.yahoo.ml.caffe.CaffeOnSpark
${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar
-train
-features accuracy,loss -label label
-conf solver.prototxt
-clusterSize ${SPARK_WORKER_INSTANCES}
-devices 1
-connection ethernet
-model file:${CAFFE_ON_SPARK}/myself_caffenet.model
-output file:${CAFFE_ON_SPARK}/myself_result
The solver.prototxt and train_val.prototxt are located in ${CAFFE_ON_SPARK}/data/, and the error is:
17/01/11 20:49:34 ERROR caffe.DataSource$: source_class must be defined for input data layer:Data
Exception in thread "main" java.lang.NullPointerException
    at com.yahoo.ml.caffe.CaffeOnSpark.train(CaffeOnSpark.scala:103)
    at com.yahoo.ml.caffe.CaffeOnSpark$.main(CaffeOnSpark.scala:40)
    at com.yahoo.ml.caffe.CaffeOnSpark.main(CaffeOnSpark.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/11 20:49:34 INFO spark.SparkContext: Invoking stop() from shutdown hook
Where is my error?
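The ERROR line in the log says that a data layer reported as "Data" has no source_class. Whether "Data" is that layer's name or its type is not clear from the log. Earlier in this thread it was stated that only MemoryData is supported, and the working example sets source_class on its data layer. Under those assumptions, a quick way to check a train_val.prototxt is sketched below in Python. The helper names are hypothetical, and the brace matching is a naive stand-in for a real protobuf text-format parser: it would be confused by braces inside quoted strings.

```python
import re

def layer_blocks(prototxt):
    """Yield the body of each `layer { ... }` block, found by brace matching."""
    for match in re.finditer(r'\blayer\s*\{', prototxt):
        depth = 1
        start = match.end()
        for i in range(start, len(prototxt)):
            if prototxt[i] == '{':
                depth += 1
            elif prototxt[i] == '}':
                depth -= 1
                if depth == 0:
                    yield prototxt[start:i]
                    break

def data_layers_missing_source_class(prototxt):
    """Return names of Data/MemoryData layers that do not set source_class."""
    missing = []
    for block in layer_blocks(prototxt):
        # Takes the first type/name field in the block, which is the layer's
        # own in the usual field order (an assumption of this sketch).
        type_match = re.search(r'\btype\s*:\s*"([^"]+)"', block)
        name_match = re.search(r'\bname\s*:\s*"([^"]+)"', block)
        if type_match and type_match.group(1) in ("Data", "MemoryData"):
            if not re.search(r'\bsource_class\s*:', block):
                missing.append(name_match.group(1) if name_match else "<unnamed>")
    return missing
```

Running `data_layers_missing_source_class(open("train_val.prototxt").read())` would list the layers to look at first. Any layer it reports probably needs to be converted to a MemoryData layer with a source_class, like the one shown earlier in the thread.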