In defining the specification for the WebNN operations, we looked not only at TensorFlow and NNAPI but also at other popular frameworks such as PyTorch, ONNX, and CoreML, and found that most building-block operations are reusable and share an unusually high degree of similarity across all the frameworks; many of them are in fact identical.
When we looked at the current hardware design in the market, especially on the GPU, we found that the same building blocks are also implemented either in the hardware or at the system software level. This is not surprising because the optimizations are there to satisfy the existing software use cases.
A note on TensorFlow's operators: TensorFlow indeed supports over a thousand operators (around 1200), but only about half of them are implemented in CUDA today; the rest see little use. Of those, most are implemented as compositions of basic building-block operations, which altogether number no more than 150-200. These building-block operators are at the level of operations targeted in the WebNN design, because it's the level just high enough to be efficiently handled by today's hardware, but also low enough to be used to compose new reusable ones. The count is finite, small, and not open-ended.
As to whether this set of building-block operations will stand the test of time, only time will tell. But in my experience working on various platform technologies, I have yet to see a new piece of software technology come along and render the existing ones completely obsolete, especially ones already supported by a healthy ecosystem. Will people stop using convolutions, gemm, or the various activations in their models? I highly doubt it.
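To make the composition point concrete, here is one hypothetical illustration (softmax is not singled out above): a larger operation written entirely from the kind of building-block tensor ops being discussed, sketched in NumPy.

```python
import numpy as np

def softmax_composed(x):
    # Numerically stable softmax expressed only through building-block ops:
    # reduceMax -> sub -> exp -> reduceSum -> div.
    shifted = np.subtract(x, np.max(x, axis=-1, keepdims=True))
    e = np.exp(shifted)
    return np.divide(e, np.sum(e, axis=-1, keepdims=True))
```

If each primitive is available and efficient, composites like this come for free, which is exactly why a small closed set of primitives can cover a much larger operator surface.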
from machine-learning-charter.
The plan on Android is to replace the current NN API with a lower-level instruction set. There are multiple candidates, with no clear winner yet. We want to develop the instruction set in an open, vendor-neutral, standards-based way that would work for Android (an open-source project) as well as the Web -- including Windows, macOS, and iOS.
@jbingham do you have links to what the multiple candidates are so we can read about them and give feedback?
Some relevant links:
XLA HLO
Tensor Compute Primitives
TensorFlow RISC (no public link yet)
Tensor Operator Set Architecture
Thanks @jbingham for the pointers!
XLA HLO
We did check XLA HLO compatibility when defining the WebNN ops for the first-wave models. When defining a higher-level WebNN op, the lower-level primitives that the higher-level op can be decomposed into are also defined. So according to the table, most of the relevant XLA HLO operations are covered by corresponding WebNN operations. The exception is ReduceWindow, which is used to support pooling operations. However, in other references, e.g. the Tensor Operator Set Architecture, the pooling operations are defined as primitives. We may explore this further.
Tensor Operator Set Architecture
I did a first round of cross-checking between TOSA and WebNN. It seems to me that TOSA operations map well to WebNN ops. The gaps include bitwise ops, logical ops, and control flow. I believe we can explore these ops against the use cases and models for WebNN. More details are in the following tables.
Tensor Operators
TOSA | WebNN | Remarks |
---|---|---|
ARGMAX | N/A | |
AVG_POOL2D | averagePool2d | |
CONV2D | conv2d | |
CONV3D | N/A | considering |
DEPTHWISE_CONV2D | conv2d | supported as a variant of grouped conv2d |
FULLY_CONNECTED | gemm | |
MATMUL | matmul | |
MAX_POOL2D | maxPool2d | |
TRANSPOSE_CONV2D | N/A | WIP |
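A note on the DEPTHWISE_CONV2D row: a minimal NumPy sketch (stride 1, valid padding, channel multiplier 1, and channels-first layout are all assumptions for brevity) showing that depthwise convolution is exactly the grouped-convolution special case where the group count equals the input channel count.

```python
import numpy as np

def conv2d_single(x, w):
    # Naive single-channel valid-padding 2D cross-correlation.
    kh, kw = w.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def grouped_conv2d(x, w, groups):
    # x: (C, H, W); w: (C_out, C_in_per_group, kh, kw).
    cin_g = x.shape[0] // groups
    cout_g = w.shape[0] // groups
    outs = []
    for g in range(groups):
        xg = x[g * cin_g:(g + 1) * cin_g]
        for oc in range(cout_g):
            wk = w[g * cout_g + oc]
            outs.append(sum(conv2d_single(xg[c], wk[c]) for c in range(cin_g)))
    return np.stack(outs)

def depthwise_conv2d(x, w):
    # x: (C, H, W); w: (C, kh, kw) -- one filter per input channel.
    return np.stack([conv2d_single(x[c], w[c]) for c in range(x.shape[0])])
```

With `groups` set to the channel count, each group sees one input channel and produces one output channel, which is the depthwise case; this is why a single grouped conv2d op can subsume the depthwise variant.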
Activation Functions
TOSA | WebNN | Remarks |
---|---|---|
CLAMP | clamp | |
RELUN | N/A | partially supported by relu |
SIGMOID | sigmoid | |
TANH | tanh |
Elementwise Binary Operators
TOSA | WebNN | Remarks |
---|---|---|
ADD | add | |
ARITHMETIC_RIGHT_SHIFT | N/A | |
BITWISE_AND | N/A | |
BITWISE_OR | N/A | |
BITWISE_XOR | N/A | |
LOGICAL_AND | N/A | |
LOGICAL_LEFT_SHIFT | N/A | |
LOGICAL_RIGHT_SHIFT | N/A | |
LOGICAL_OR | N/A | |
LOGICAL_XOR | N/A | |
MAXIMUM | max | |
MINIMUM | min | |
MUL | mul | |
POW | N/A | WIP |
SUB | sub | |
TABLE | N/A |
Elementwise Unary Operators
TOSA | WebNN | Remarks |
---|---|---|
ABS | abs | |
BITWISE_NOT | N/A | |
CEIL | ceil | |
CLZ | N/A | |
EXP | exp | |
FLOOR | floor | |
LOG | log | |
LOGICAL_NOT | N/A | |
NEGATE | neg | |
RECIPROCAL | N/A | |
RSQRT | N/A |
Elementwise Ternary Operators
TOSA | WebNN | Remarks |
---|---|---|
SELECT | N/A |
Comparison Operators
TOSA | WebNN | Remarks |
---|---|---|
EQUAL | N/A | |
GREATER | N/A | |
GREATER_EQUAL | N/A |
Reduction Operators
TOSA | WebNN | Remarks |
---|---|---|
REDUCE_ALL | N/A | |
REDUCE_ANY | N/A | |
REDUCE_MAX | reduceMax | |
REDUCE_MIN | reduceMin | |
REDUCE_PRODUCT | reduceProduct | |
REDUCE_SUM | reduceSum |
Data Layout
TOSA | WebNN | Remarks |
---|---|---|
CONCAT | concat | |
PAD | N/A | WIP |
RESHAPE | reshape | |
REVERSE | N/A | |
SLICE | slice | |
TILE | N/A | |
TRANSPOSE | transpose |
Scatter/Gather Operators
TOSA | WebNN | Remarks |
---|---|---|
GATHER | N/A |
Image Operators
TOSA | WebNN | Remarks |
---|---|---|
RESIZE | N/A | WIP |
Type Conversion
TOSA | WebNN | Remarks |
---|---|---|
CAST | N/A | |
RESCALE | N/A |
Data Nodes
TOSA | WebNN | Remarks |
---|---|---|
CONST | constant | |
IDENTITY | N/A | |
IDENTITYN | N/A | |
PLACEHOLDER | input |
Custom Operators
TOSA | WebNN | Remarks |
---|---|---|
CUSTOM | N/A | custom ops |
Control Flow Operators
TOSA | WebNN | Remarks |
---|---|---|
COND_IF | N/A | |
WHILE_LOOP | N/A |
Thanks for the detailed comparison, @huningxin .
IIUC, what we're saying is this:
A graph API can accommodate both the higher level ops, like what's defined in ONNX or TF Lite, and the lower level ops, like what's defined in XLA HLO and TOSA.
If that's right, we could define a single operation set that includes the union of both sets of operations. Or we could define two operation sets, one for the higher level and one for the lower level. In either case, the graph construction, compilation, and prediction APIs could be the same.
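The union-of-op-sets idea can be sketched with a toy dispatcher: primitives execute directly, while higher-level ops lower onto primitives, so a single construction/execution path serves both levels. All names here (`linear`, the tables) are illustrative, not WebNN API.

```python
import numpy as np

# Lower-level primitive kernels, keyed by op name.
PRIMITIVES = {
    "matmul": lambda a, b: a @ b,
    "add": lambda a, b: a + b,
}

# Higher-level ops defined purely by lowering onto the primitives.
# "linear" is a made-up composite: matmul followed by add.
LOWERINGS = {
    "linear": lambda x, w, b: PRIMITIVES["add"](PRIMITIVES["matmul"](x, w), b),
}

def run_op(name, *args):
    # One dispatch path covers both op levels: primitives run
    # directly, composites are lowered first.
    if name in PRIMITIVES:
        return PRIMITIVES[name](*args)
    return LOWERINGS[name](*args)
```

A backend that implements only the primitives still runs everything; a backend with a fused `linear` kernel could intercept the composite before lowering, which mirrors the single-API / two-op-levels question.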
Is that an accurate summary?
Thanks @jbingham, your summary looks accurate to me.
Regarding operator selection, the TOSA spec defines a set of principles that may be a useful reference. I am pasting it here for convenience.
1.3. Operator Selection
TOSA defines a set of primitive operators to which higher-level operators can be lowered in a consistent way. To remain effective and efficient to implement, the set of operators must be constrained to a reasonably small set of primitive operations out of which others can be constructed. The following principles govern the selection of operators within TOSA.
Table 2. Principles
ID | Principle | Reason for this |
---|---|---|
P0 | An operator shall be a primitive operation or building block that cannot be broken down into simpler whole tensor operations | If the operator can be broken down, then we should look at the component operators. |
P1 | An operator shall be usable as a component out of which more complex operations can be constructed | Single-use operators have a high architectural cost and a more reusable version should be considered instead. |
P2 | Precision should be appropriate for the input and output data types | Precision higher than that needed to calculate the result leads to extra implementation cost |
P3 | Numerical definition of common sub operations should be consistent between operators (for example: value scaling) | Consistent sub-operation definition reduces the operator implementation cost |
P4 | The valid input and output ranges for all operands shall be specified | Ranges are required to make consistent (numerically agreeing) implementations possible |
P5 | Integer operators shall be implementable in a bit-exact form with good efficiency on CPU, GPU and hardware targets. | Reduces implementation cost and gives consistent inference result |
Thanks for sharing those Principles, @huningxin .
Do you have a sense of how many of the current NN API operators could be broken down into simpler tensor operations? Or how many of the ~120 ONNX/TF Lite operations could be?
Regarding the 47 ops of the current WebNN spec, there are 9 decomposable operations, such as `gemm` and `gruCell`, among others.
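For reference, `gemm` is one of those decomposable ops: it computes alpha * A' * B' + beta * C, which reduces entirely to transpose, matmul, mul, and add primitives. A NumPy sketch of that decomposition (parameter names follow the spec only loosely):

```python
import numpy as np

def gemm_decomposed(a, b, c=0.0, alpha=1.0, beta=1.0,
                    a_transpose=False, b_transpose=False):
    # gemm expressed purely via smaller ops:
    # transpose -> matmul -> mul (by scalars) -> add.
    if a_transpose:
        a = np.transpose(a)
    if b_transpose:
        b = np.transpose(b)
    return np.add(np.multiply(alpha, np.matmul(a, b)),
                  np.multiply(beta, c))
```

An implementation is free to fuse the whole expression into one BLAS-style kernel; the decomposition only pins down the semantics.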
ONNX also has a guideline, Proposing and submitting a new operator or function to ONNX, where a function can be composed of other ONNX operators. Ping @wchao1115 @gramalingam for more insights.
Ping @pyu10055 @miaowang14 for insights of TF Lite / NNAPI. Thanks!
I've looked at TOSA before as well, and as @huningxin thoroughly enumerated here, from a conceptual standpoint they are not that much different from WebNN or even ONNX. They share a lot of overlap because these are all basic building blocks for deep-learning neural networks.
Even when you look at XLA-HLO, you will find element-wise operators, reductions, convolutions, tensor mutations, normalizations, activations, and recurrent networks. They are all mappable to one another. And while a bigger operation, such as a normalization function like `batchnorm`, could be broken up further, in practice it makes more sense to handle the whole calculation at once to avoid unnecessary intermediate results, and that's what most native APIs normally do.
However, from a conceptual standpoint, it is still useful to also define all of the smaller operations from which new functions may be composed in the future. One of our design principles for WebNN operations is to also define the lower-level operations that, semantically, together compose the bigger operation. The most vivid example of this principle may be the way we define the `gruCell` operation, which can be broken down to the `slice` tensor mutation operation and many more, all of which are also defined as WebNN operations.
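To illustrate the batchnorm point above: the fused calculation and its chain-of-primitives decomposition compute the same result; the fused path just avoids materializing the intermediates. A NumPy sketch (inference-mode batchnorm with precomputed mean and variance assumed):

```python
import numpy as np

def batch_norm_fused(x, mean, var, scale, bias, eps=1e-5):
    # The whole calculation handled at once, as most native APIs do.
    return scale * (x - mean) / np.sqrt(var + eps) + bias

def batch_norm_decomposed(x, mean, var, scale, bias, eps=1e-5):
    # The same computation as a chain of elementwise primitives:
    # sub -> add -> sqrt -> div -> mul -> add.
    centered = np.subtract(x, mean)
    denom = np.sqrt(np.add(var, eps))
    normalized = np.divide(centered, denom)
    return np.add(np.multiply(normalized, scale), bias)
```

The decomposed form allocates four intermediate tensors that the fused form never creates, which is the practical argument for keeping the bigger op while still specifying the primitives.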
Per discussion on the WebML CG Teleconference – 10 December 2020, this issue can be closed.