Comments (48)
Unfortunately I'm not a Scala guy, so I'd like to ask a few more questions regarding the equivalence of the tests, if you don't mind.
Line 144: Was the number of threads set to 1 at runtime? Or was the number of CPU threads > 1 during both tests?
Lines 153 & 168: Does that mean you've included input generation time in the tanh measurements?
Line 170: So, you do use a "cache" for your library, but you aren't using workspaces for ND4J? Nice.
Line 174: foldLeft InlineTensor
Does this mean that the operation is executed in-place? As in, "the input array gets modified and returned" once the .flatArray() method is called?
from compute.scala.
I see.
This benchmark isn't comparing apples to apples. Thanks for your time.
You can see there is a notice for Deeplearning4j in the README.md.
We have never criticized the performance of ND4J's mutable style.
You are wrong. The benchmark is comparing apples to apples: immutable operations vs. immutable operations.
A few messages above you said that 1 array was allocated for the loop. Now you say it's an immutable vs. immutable comparison. So which answer is correct?
I.e. your code that uses Nd4j does numberOfIterations x 2 allocations, because your Transform.tanh() call creates a new INDArray each time, and each INDArray has 2 buffers: 1 on the GPU side, 1 on the host side. With 5 total iterations your test basically benchmarks CUDA allocation performance, not the actual tanh.
If you call that an "apples to apples" comparison, okay, that's up to you :)
Re: Nd4j for Dl4j. Nd4j basically just mimics numpy.
No, I've just explained that you hadn't understood how to implement things efficiently with ND4J, and made claims like this:
ND4J's implementation consumes more memory
It's not about ND4J's implementation. It's about what YOU implemented. Obviously the same code could be written without allocating a new INDArray on each iteration. Just a difference of 1 argument :)
P.S. Don't get me wrong, please. Personally I don't care about your claims. You want to claim you're faster than Nd4j? I'm OK with that. If you want to claim you're faster than light, I'll be OK with that as well.
The only reason I was here is performance feedback. When I hear about Nd4j performance problems, I always try to get to the bottom of the problem and improve whatever can be improved. In this particular case I see it's a waste of time for me, due to various reasons: different approaches, a bad benchmarking setup, different goals, etc.
Thanks for your time.
Imagine Alice does some reading of documentation, and instead of:
a.mul(b).addi(c)
does something like:
a.muli(b).addi(c)
That's when it becomes interesting... :)
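The difference between the two calls can be sketched without ND4J at all. The names mul/muli below only mirror ND4J's naming convention (no i-suffix allocates a new result; the i-suffix mutates the receiver in place); the arrays are plain Scala arrays, so this is an illustration of the semantics, not actual ND4J code:

```scala
// Sketch of the out-of-place vs. in-place convention on plain arrays.
// mul returns a freshly allocated result; muli overwrites its receiver.
def mul(a: Array[Float], b: Array[Float]): Array[Float] = {
  val out = new Array[Float](a.length) // a new allocation on every call
  var i = 0
  while (i < a.length) { out(i) = a(i) * b(i); i += 1 }
  out
}

def muli(a: Array[Float], b: Array[Float]): Array[Float] = {
  var i = 0
  while (i < a.length) { a(i) *= b(i); i += 1 } // no allocation: a is reused
  a
}

val a = Array(1f, 2f, 3f)
val b = Array(4f, 5f, 6f)
val c = mul(a, b)  // a is untouched; c is a brand-new array
muli(a, b)         // a itself now holds the products
println(a.toList)  // List(4.0, 10.0, 18.0)
println(c.toList)  // List(4.0, 10.0, 18.0)
```

In a tight benchmark loop, the out-of-place form pays one allocation per iteration while the in-place form pays none, which is exactly the asymmetry being debated above.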
@raver119 The in-place version of the ND4J operation is indeed super fast. It is 1.44 times faster than ND4J's immutable version when performing a * b + c
100 times on 32x32x32 arrays, despite the fact that Compute.scala's immutable version is 43 times faster than ND4J's in-place version.
All the tests are running on a Titan X GPU.
That's already something, thank you.
Please tell me, what OS was used, and what CUDA Toolkit version?
EDIT: And which Titan X generation was used? There were two different generations sharing the same X name. Which one did you use, M or P?
Ubuntu 16.04 and CUDA 8.0 from this docker image: https://github.com/ThoughtWorksInc/scala-cuda/tree/sbt-openjdk8-cuda8.0-opencl-ubuntu16.04
What's different on your local branch? I've tried to run your nvidia-gpu branch locally but I'm having some dependency issues. All blockingAwait methods appear to be unresolved, making me think you might have a local dependency somewhere. Can you update the branch so I can import it in an IDE?
A couple more things:
- how are you running the Docker image?
- you're also running ND4J 0.8, which is out of date, but because we're testing primitive operations here it should be a negligible difference
blockingAwait is marked red in IntelliJ, which is a bug in IntelliJ's typer. The bug does not affect actual compilation.
The reason I was using 0.8 is that the CUDA backend of ND4J 0.9.x is broken in sbt, even when compiling from a clean docker image.
https://github.com/deeplearning4j/nd4j/issues/2767
sbt 'benchmarks/Jmh/run Issue137'
The first run of the command may fail due to an sbt-jmh bug, but a retry should work.
Run sbt 'benchmarks/Jmh/run -help' for more flags.
Imagine you do some reading of documentation, and instead of:
java -cp ...
do something like:
sbt benchmarks/Jmh/run ...
That's when it becomes interesting... :)
Imagine you do some reading of documentation
The burden of reproducibility falls on you. The command you gave me was sbt 'benchmarks/Jmh/bgRun Issue137'. I ran that command exactly. The output is a JAR file. Here's my output:
$ sbt 'benchmarks/Jmh/bgRun Issue137'
[info] Loading settings from plugins.sbt ...
[info] Loading project definition from /home/justin/Projects/Compute.scala/project
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt ...
[info] Loading settings from build.sbt,version.sbt ...
[info] Set current project to compute-scala (in build file:/home/justin/Projects/Compute.scala/)
[info] Packaging /home/justin/Projects/Compute.scala/NDimensionalAffineTransform/target/scala-2.12/ndimensionalaffinetransform_2.12-0.3.2-SNAPSHOT.jar ...
[info] Packaging /home/justin/Projects/Compute.scala/Memory/target/scala-2.12/memory_2.12-0.3.2-SNAPSHOT.jar ...
[info] Done packaging.
[info] Done packaging.
[info] Packaging /home/justin/Projects/Compute.scala/Expressions/target/scala-2.12/expressions_2.12-0.3.2-SNAPSHOT.jar ...
[info] Done packaging.
[info] Packaging /home/justin/Projects/Compute.scala/OpenCLKernelBuilder/target/scala-2.12/openclkernelbuilder_2.12-0.3.2-SNAPSHOT.jar ...
[info] Done packaging.
[info] Packaging /home/justin/Projects/Compute.scala/Trees/target/scala-2.12/trees_2.12-0.3.2-SNAPSHOT.jar ...
[info] Packaging /home/justin/Projects/Compute.scala/OpenCL/target/scala-2.12/opencl_2.12-0.3.2-SNAPSHOT.jar ...
[info] Done packaging.
[info] Done packaging.
[info] Packaging /home/justin/Projects/Compute.scala/Tensors/target/scala-2.12/tensors_2.12-0.3.2-SNAPSHOT.jar ...
[info] Done packaging.
[info] Packaging /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/benchmarks_2.12-0.3.2-SNAPSHOT.jar ...
[info] Packaging /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/benchmarks_2.12-0.3.2-SNAPSHOT-tests.jar ...
[info] Done packaging.
Processing 24 classes from /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/classes with "reflection" generator
[info] Done packaging.
Writing out Java source to /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/src_managed/jmh and resources to /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/resource_managed/jmh
[info] Compiling 1 Scala source and 37 Java sources to /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/classes ...
[warn] /home/justin/Projects/Compute.scala/benchmarks/src/jmh/scala/com/thoughtworks/compute/benchmarks.scala:453:24: The outer reference in this type test cannot be checked at run time.
[warn] final case class ConvolutionalLayer(weight: NonInlineTensor, bias: NonInlineTensor) {
[warn] ^
[warn] one warning found
[info] Done compiling.
[info] Packaging /home/justin/Projects/Compute.scala/benchmarks/target/scala-2.12/benchmarks_2.12-0.3.2-SNAPSHOT-jmh.jar ...
[info] Done packaging.
[success] Total time: 8 s, completed Apr 2, 2018 6:23:45 PM
If you can't give me something that's reproducible, that's very suspect. I see that you have since edited your answer to use run instead of bgRun. Next time, please give me a heads up when you make a change.
To clarify:
bgRun starts the benchmark in the background, which is suitable when you want to keep using the sbt shell while it runs.
run runs the benchmark and waits for it to finish. You should use run when it is the only command submitted to sbt batch mode.
Your Tensors have an asynchronous component. Instead of calling flatArray.blockingAwait, how can I issue a command to execute all ops in the queue?
In ND4J, we have a simple Nd4j.getExecutioner().commit() which does this for us.
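The pattern being asked about, queuing ops lazily and then forcing execution with a single call, can be sketched independently of either library. This is a toy stand-in (not ND4J or Compute.scala API); OpQueue, enqueue, and commit are invented names for illustration:

```scala
import scala.collection.mutable.ArrayBuffer

// Toy illustration: operations accumulate lazily in a queue and
// only run when commit() is called, analogous in spirit to an
// executioner's commit forcing all pending device work.
final class OpQueue {
  private val pending = ArrayBuffer.empty[() => Unit]

  def enqueue(op: () => Unit): Unit = pending += op

  // Runs everything in the queue, returns how many ops were executed.
  def commit(): Int = {
    val n = pending.size
    pending.foreach(op => op())
    pending.clear()
    n
  }
}

val q = new OpQueue
var acc = 0
q.enqueue(() => acc += 1)
q.enqueue(() => acc += 2)
// Nothing has run yet: acc is still 0 at this point.
val ran = q.commit()
println(s"$ran ops executed, acc = $acc") // prints "2 ops executed, acc = 3"
```

The benchmarking difficulty discussed in this thread is exactly that a library may expose only "commit and also read the data back" entry points, while a fair timing harness wants the bare commit.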
I'm aware that InlineTensors are lazily evaluated because you compile all of the ops until final evaluation: https://github.com/ThoughtWorksInc/Compute.scala#lazy-evaluation
What I'm looking for is an op that triggers evaluation without calling .toString or flatArray.blockingAwait. I want to isolate the execution itself without dumping the contents of the tensor. If I call something such as tensor.nonInline, does that trigger execution? Glancing at the code, it appears there's an asynchronous operation triggered by Do and it's handled on a separate thread. So if I try to call tensor.nonInline, all of the ops will be executed, possibly(?) on a separate thread, and I can't evaluate execution time.
My goal here is to break this into smaller, consumable pieces.
"Consumable pieces" was meant in a programmer's POV kind of way (informal). I'll try to explain a bit better:
- I want to benchmark only the evaluation of the tensor ops, such as addition/subtraction/etc
- I want to isolate that specific op execution time from lazy evaluation trigger to op finish
- I do not want to dump the contents of the tensor, since this is quite costly and from my experience we rarely need to do this in the wild
- Because a method like
.toString
will trigger lazy evaluation then dump the tensor contents and print them, I'm looking for the equivalent of this that does not require dumping tensor contents
So, for example, I do val future = Future { myClass.myUnitMethod() }. While this is an abuse of Scala, I have the ability to block the current thread and wait for the result of the Future by calling Await.result(). In your context, I want to do the same without calling flatArray or toString, because there's a cost associated with dumping the contents of a tensor.
Tell me if I'm wrong, but because your Tensors are lazily evaluated, if I simply define an op without invoking toString or flatArray and set iterations = 1, then your Tensor will appear to be blazing fast, because we're only benchmarking the time it takes to evaluate a line of code, not the execution of the op. Nothing has invoked lazy evaluation in that case, and if it has, then evaluation is happening on a different thread.
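The Future/Await pattern described above, spelled out as a complete snippet. This is standard Scala concurrency, not Compute.scala API; myUnitMethod and the sleep are placeholders for the asynchronous work being waited on:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// A method whose completion we want to wait for, analogous to
// forcing a lazily evaluated tensor without reading its contents back.
def myUnitMethod(): Int = {
  Thread.sleep(10) // stand-in for work happening off the calling thread
  42
}

val future = Future { myUnitMethod() }
// Block the current thread until the asynchronous work finishes.
val result = Await.result(future, 5.seconds)
println(result) // prints 42
```

Timing the span between submitting the Future and Await.result returning measures the work itself, which is the isolation the commenter is asking for from the tensor API.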
Both flatArray and flatBuffer dump content; they are not the equivalent of ND4J's commit.
I think cache is the equivalent of ND4J's commit.
I'm right now writing an isolated benchmark using the methods you suggested. I added your library to SBT as per:
libraryDependencies ++= Seq(
"com.thoughtworks.compute" %% "gpu" % "latest.release"
)
However, when I try to run this I get:
[LWJGL] Failed to load a library. Possible solutions:
a) Add the directory that contains the shared library to -Djava.library.path or -Dorg.lwjgl.librarypath.
b) Add the JAR that contains the shared library to the classpath.
[LWJGL] Enable debug mode with -Dorg.lwjgl.util.Debug=true for better diagnostics.
[LWJGL] Enable the SharedLibraryLoader debug mode with -Dorg.lwjgl.util.DebugLoader=true for better diagnostics.
Exception in thread "main" com.google.common.util.concurrent.ExecutionError: java.lang.UnsatisfiedLinkError: Failed to locate library: liblwjgl.so
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
at com.google.common.cache.LocalCache.get(LocalCache.java:3962)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4864)
at com.thoughtworks.compute.Tensors.$anonfun$enqueueClosure$1(Tensors.scala:1151)
at com.thoughtworks.raii.asynchronous$Do$.$anonfun$suspend$1(asynchronous.scala:399)
at com.thoughtworks.raii.shared$SharedStateMachine.$anonfun$acquire$2(shared.scala:94)
at com.thoughtworks.continuation$Continuation$.$anonfun$safeOnComplete$1(continuation.scala:428)
at scalaz.Free$.$anonfun$suspend$2(Free.scala:27)
at scalaz.std.FunctionInstances$$anon$1.$anonfun$map$1(Function.scala:77)
at scalaz.Free.$anonfun$run$1(Free.scala:271)
at scalaz.Free.go2$1(Free.scala:162)
at scalaz.Free.go(Free.scala:165)
at scalaz.Free.run(Free.scala:271)
at com.thoughtworks.compute.Tensors$NonInlineTensor.cache(Tensors.scala:1270)
at com.thoughtworks.compute.Tensors$NonInlineTensor.cache$(Tensors.scala:1216)
at com.thoughtworks.compute.Tensors$InlineTensor$$anon$51.cache(Tensors.scala:1166)
at benchmark.RunBenchmark$.doBenchmark(RunBenchmark.scala:26)
at benchmark.RunBenchmark$.$anonfun$main$1(RunBenchmark.scala:15)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:156)
at benchmark.RunBenchmark$.main(RunBenchmark.scala:13)
at benchmark.RunBenchmark.main(RunBenchmark.scala)
Caused by: java.lang.UnsatisfiedLinkError: Failed to locate library: liblwjgl.so
at org.lwjgl.system.Library.loadSystem(Library.java:147)
at org.lwjgl.system.Library.loadSystem(Library.java:67)
at org.lwjgl.system.Library.<clinit>(Library.java:50)
at org.lwjgl.system.MemoryUtil.<clinit>(MemoryUtil.java:61)
at org.lwjgl.system.MemoryStack.<init>(MemoryStack.java:61)
at org.lwjgl.system.MemoryStack.create(MemoryStack.java:82)
at org.lwjgl.system.MemoryStack.create(MemoryStack.java:71)
at java.lang.ThreadLocal$SuppliedThreadLocal.initialValue(ThreadLocal.java:284)
at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180)
at java.lang.ThreadLocal.get(ThreadLocal.java:170)
at org.lwjgl.system.MemoryStack.stackGet(MemoryStack.java:628)
at org.lwjgl.system.MemoryStack.stackPush(MemoryStack.java:637)
at com.thoughtworks.compute.OpenCL.createProgramWithSource(OpenCL.scala:1211)
at com.thoughtworks.compute.OpenCL.createProgramWithSource$(OpenCL.scala:1210)
at com.thoughtworks.compute.gpu$.createProgramWithSource(gpu.scala:15)
at com.thoughtworks.compute.Tensors$$anon$55.call(Tensors.scala:1111)
at com.thoughtworks.compute.Tensors$$anon$55.call(Tensors.scala:1085)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4869)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3523)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2249)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2132)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
... 20 more
Did I forget a dependency somewhere? I've been able to run your tests just fine and I have CUDA 9.1 installed on my system, making me think there's an issue with my SBT configuration.
Please read the Getting started section in the README @crockpotveggies
Quick follow-up. After spending a couple of days examining your APIs, I'm unsure how to allocate tensors outside of a benchmark loop. While I've also attempted to use the for {} yield {} syntax, I get flatMap errors in sbt due to incompatibility (flatMap is not a member of NonInlineTensor, etc.).
However, I was able to get some ops-only numbers using this code: https://gist.github.com/crockpotveggies/88a5e0f3b067b30065063790102be2fd
The results are:
[info] Benchmark (numberOfCommandQueuesPerDevice) (tensorDeviceType) Mode Cnt Score Error Units
[info] AmiBaiC.doBenchmark 5 GPU sample 5 14703552.102 ± 7159671.681 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.00 5 GPU sample 12465471.488 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.50 5 GPU sample 14612955.136 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.90 5 GPU sample 17146314.752 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.95 5 GPU sample 17146314.752 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.99 5 GPU sample 17146314.752 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.999 5 GPU sample 17146314.752 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p0.9999 5 GPU sample 17146314.752 us/op
[info] AmiBaiC.doBenchmark:doBenchmark·p1.00 5 GPU sample 17146314.752 us/op
However, if I remove .nonInline.cache() then I get nonsensical and very small numbers, which means the tensor never actually computed (confirmed by examining the output of watch -n 1 nvidia-smi).
The other issue is that if I move .nonInline.cache() outside of the loop, I get a JVM crash:
Code:
@Benchmark
@BenchmarkMode(Array(Mode.SampleTime))
@OutputTimeUnit(TimeUnit.MICROSECONDS)
def doBenchmark(state: SetupState): Unit = {
(0 until state.numberOfIterations).foreach { _i =>
state.a = state.a * state.b + state.c
}
state.a.nonInline.cache()
}
Result:
[info] # Warmup Iteration 1: 1998585.856 us/op
[info] # Warmup Iteration 2: #
[info] # A fatal error has been detected by the Java Runtime Environment:
[info] #
[info] # SIGSEGV (0xb) at pc=0x00007f02f4c844c0, pid=25998, tid=0x00007f02da6b9700
[info] #
[info] # JRE version: OpenJDK Runtime Environment (8.0_121-b15) (build 1.8.0_121-b15)
[info] # Java VM: OpenJDK 64-Bit Server VM (25.121-b15 mixed mode linux-amd64 compressed oops)
[info] # Problematic frame:
[info] # J 1610 C2 scala.collection.JavaConverters$.mapAsScalaMap(Ljava/util/Map;)Lscala/collection/mutable/Map; (6 bytes) @ 0x00007f02f4c844c0 [0x00007f02f4c844a0+0x20]
[info] #
[info] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
[info] #
[info] # An error report file with more information is saved as:
[info] # /home/justin/Projects/compute-scala-benchmark/hs_err_pid25998.log
[info] #
[info] # If you would like to submit a bug report, please visit:
[info] # http://www.azulsystems.com/support/
[info] #
[info] <forked VM failed with exit code 134>
[info] <stdout last='20 lines'>
[info] #
[info] # A fatal error has been detected by the Java Runtime Environment:
[info] #
[info] # SIGSEGV (0xb) at pc=0x00007f02f4c844c0, pid=25998, tid=0x00007f02da6b9700
[info] #
[info] # JRE version: OpenJDK Runtime Environment (8.0_121-b15) (build 1.8.0_121-b15)
[info] # Java VM: OpenJDK 64-Bit Server VM (25.121-b15 mixed mode linux-amd64 compressed oops)
[info] # Problematic frame:
[info] # J 1610 C2 scala.collection.JavaConverters$.mapAsScalaMap(Ljava/util/Map;)Lscala/collection/mutable/Map; (6 bytes) @ 0x00007f02f4c844c0 [0x00007f02f4c844a0+0x20]
[info] #
[info] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
[info] #
[info] # An error report file with more information is saved as:
[info] # /home/justin/Projects/compute-scala-benchmark/hs_err_pid25998.log
[info] #
[info] # If you would like to submit a bug report, please visit:
[info] # http://www.azulsystems.com/support/
[info] #
I'd be interested in knowing a better way to do this.
Have you tried to close the cache?
Could you please provide the source code so I could fix it?
Thanks for the tip, I'll try the close method. Here's the source code: https://github.com/crockpotveggies/compute-scala-benchmark
Added a close() call on the cache and still get the same SIGSEGV error.
[info] #
[info] # A fatal error has been detected by the Java Runtime Environment:
[info] #
[info] # SIGSEGV (0xb) at pc=0x00007ff4354ce140, pid=29355, tid=0x00007ff428e1d700
[info] #
[info] # JRE version: OpenJDK Runtime Environment (8.0_121-b15) (build 1.8.0_121-b15)
[info] # Java VM: OpenJDK 64-Bit Server VM (25.121-b15 mixed mode linux-amd64 compressed oops)
[info] # Problematic frame:
[info] # J 1590 C2 scala.collection.JavaConverters$.mapAsScalaMap(Ljava/util/Map;)Lscala/collection/mutable/Map; (6 bytes) @ 0x00007ff4354ce140 [0x00007ff4354ce120+0x20]
[info] #
[info] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
[info] #
[info] # An error report file with more information is saved as:
[info] # /home/justin/Projects/compute-scala-benchmark/hs_err_pid29355.log
[info] #
[info] # If you would like to submit a bug report, please visit:
[info] # http://www.azulsystems.com/support/
[info] #
I need the complete source code and your system configuration (OS, OpenCL runtime vendor and version, clinfo output).
You have the complete source code in the link above. Using Ubuntu 16.04; the OpenCL runtime vendor is CUDA 9.1.
The output of clinfo is here: https://gist.github.com/crockpotveggies/d86e927a13b778bd9a2b2df8bf9cfeea
Not sure why it's trying to use cuda-8.0... I only have cuda 9.1 installed.
@crockpotveggies Your first version allocated caches but never released them, resulting in running out of memory.
Your second version allocated caches and released them immediately, so those caches were never actually used, and the entire computational graph became larger and larger, resulting in a stack overflow when compiling the computational graph.
It seems that my cache API design is error-prone. I removed the cache API in version 0.4.0. Try doCache instead.
The README has been updated at https://github.com/ThoughtWorksInc/Compute.scala/blob/0.4.x/README.md#caching
Caching is designed for permanent data, for example the weights of a neural network. It's not designed for intermediate variables.