Comments (3)
* generate the resolvers (already done?)
This is already done, but not tested.
Sadly, it makes no sense to hardcode the resolving in an array IMHO: some operators take four inputs with just a few types each, resulting in only a few valid combinations, but hardcoding every type combination would result in thousands of entries.
Currently the operator-specific resolver takes an operator context and returns the executer function matching the input types of that context.
* more?
I'm currently overhauling the operator_info / operator_set structure to comply with #18 (merging the sanity-check structures with the operator-set structures).
Could you provide some feedback on the info structure?
The current idea is that resolving consists of two parts:
- find the right operator (depends on version, domain, name)
- find the right executer (depends on input types)
The set structure is a list of all existing domain/onnx_version combinations.
Each combination contains the latest operator version valid for that onnx version.
I.e. if you have onnx version 4 and an operator from the default domain ('onnx'), you iterate through the list until you get to ('onnx', 4). In there is a list of operator_info structs of the latest operators valid for this version, e.g. Add__4, Conv__2 (no new version since onnx version 2), etc.
The operator_info struct contains the constraints for the sanity check and the resolver function.
It therefore identifies the actual operator. This identification is also the reason I wanted a reference inside the operator context to the info struct instead of a plain resolver function pointer.
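The two lookup steps could be sketched roughly like this in C. Note that the structure layouts here are simplified placeholders I made up for illustration, not connxr's actual definitions:

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Hypothetical, simplified versions of the structures under discussion. */
typedef struct operator_info {
    char *name;  /* identifies the operator, e.g. "Add" */
    /* constraints for the sanity check and the resolver would live here */
} operator_info;

typedef struct operator_set {
    char          *domain;   /* e.g. "onnx" */
    int            version;  /* onnx version this set is valid for */
    size_t         length;
    operator_info *entries;  /* latest operator versions for this onnx version */
} operator_set;

/* Part 1 of resolving: find the right set for (domain, onnx_version),
 * then the right operator by name inside it. */
operator_info *find_operator(operator_set *sets, size_t n_sets,
                             const char *domain, int version,
                             const char *name)
{
    for (size_t s = 0; s < n_sets; s++) {
        if (sets[s].version != version || strcmp(sets[s].domain, domain) != 0)
            continue;
        for (size_t e = 0; e < sets[s].length; e++) {
            if (strcmp(sets[s].entries[e].name, name) == 0)
                return &sets[s].entries[e];
        }
    }
    return NULL;  /* unknown domain/version or operator */
}
```

Part 2 (picking the executer by input types) would then happen inside the resolver referenced by the found entry.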
sadly, it makes no sense to hardcode the resolving in an array IMHO, because some operators take four inputs with just a few types, resulting in a few combinations, but hardcoding it would result in thousands of entries.
Yes, I agree. Too many entries.
currently the operator-specific resolver takes an operator context and returns the executer function matching the input types of the context.
[1] (Based on what we currently have on master.) I guess you are referring to, for example, the resolve_operator__onnx__add__7 function, which takes the context as input and returns the final resolved function (operator Add, operator set 7 and type double, for example), like operator__onnx__add__7__T_tensor_double. Can we agree that this is the way to do this resolving?
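A minimal sketch of what such a resolver could look like. The context layout, the type enum and the executer return values are invented for illustration; the real connxr definitions differ:

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical minimal types; the real connxr node_context differs. */
typedef enum { TENSOR_FLOAT, TENSOR_DOUBLE } tensor_type;

typedef struct node_context {
    tensor_type input_type;  /* type of the input tensors */
} node_context;

typedef int (*operator_executer)(node_context *ctx);

/* Dummy executers, one per resolved input type. */
static int operator__onnx__add__7__T_tensor_float(node_context *ctx)  { (void)ctx; return 1; }
static int operator__onnx__add__7__T_tensor_double(node_context *ctx) { (void)ctx; return 2; }

/* The resolver inspects the context's input types and returns the
 * matching executer, so no per-type-combination table is needed. */
operator_executer resolve_operator__onnx__add__7(node_context *ctx)
{
    switch (ctx->input_type) {
    case TENSOR_FLOAT:  return &operator__onnx__add__7__T_tensor_float;
    case TENSOR_DOUBLE: return &operator__onnx__add__7__T_tensor_double;
    default:            return NULL;  /* unsupported input type */
    }
}
```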
[2] On the other hand, and also based on what we have in master, this is the way of resolving the operator (taking into account that different operator sets exist). We will have one table per operator-set version, and in each one the different operators. Each table won't have more than around 150 entries (if we cover the whole of onnx):
operator_set operator_set__onnx__7 = {
    .version = 7,
    .domain  = "onnx",
    .length  = 3,
    .entries = {
        {
            .name     = "Relu",
            .resolver = (operator_resolver) &resolve_operator__onnx__relu__6
        }, {
            .name     = "Reshape",
            .resolver = (operator_resolver) &resolve_operator__onnx__reshape__5
        }, {
            .name     = "Add",
            .resolver = (operator_resolver) &resolve_operator__onnx__add__7
        }
    }
};
IMHO [1] and [2] are the way to go, and we have all the pieces of the puzzle. We just need to make some minor modifications (e.g. resolve_operator__onnx__add__7 should accept the new interface node_context *ctx). Once these modifications are done in the Python script, we can start using it and resolving the functions properly (not hardcoded as it is so far).
Regarding #18
I'm a bit confused. I understand that files like list.h will be removed, right?
Is this the operator_info that you are referring to? I don't see the need for this structure; we have everything we need in node_context. On top of that, we already had some discussions some time ago about the sanity checks. I don't mind having some as long as they don't impact the rest of the code, but I don't think we need things like operator_info_range. We are an inference runtime that runs inference on whatever model we get. If the model is wrong, we shouldn't care; it's not our problem. Of course we can have some checks (or even some Python code that verifies the model beforehand), but having all that information per layer is unnecessary, I think. The runtime should be as dumb and simple as possible.
struct operator_info
{
    char *name;
    operator_resolver resolver;
    operator_info_range range_input;
    operator_info_range range_output;
    size_t n_attribute;
    operator_info_attribute *attribute;
    size_t n_input;
    operator_info_tensor *input;
    size_t n_output;
    operator_info_tensor *output;
    size_t n_constraint;
    operator_info_constraint *constraint;
};
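For context on what the range fields could be used for, here is a tiny sanity-check sketch. The min/max layout of operator_info_range is my assumption; it is not defined anywhere in this thread:

```c
#include <stddef.h>
#include <assert.h>

/* Assumed layout for operator_info_range; the real definition is
 * not shown in this discussion. */
typedef struct {
    size_t min;
    size_t max;
} operator_info_range;

/* Sanity check: is the node's input count inside the allowed range? */
int check_input_count(const operator_info_range *range, size_t n_input)
{
    return n_input >= range->min && n_input <= range->max;
}
```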
So I would focus on implementing the operator resolution and integrating it with what we currently have. We are quite close to start implementing the operators, and I think that we have a really nice architecture.
Is this the operator_info that you are referring to? I don't see the need for this structure; we have everything we need in node_context. On top of that, we already had some discussions some time ago about the sanity checks. I don't mind having some as long as they don't impact the rest of the code, but I don't think we need things like operator_info_range. We are an inference runtime that runs inference on whatever model we get. If the model is wrong, we shouldn't care; it's not our problem. Of course we can have some checks (or even some Python code that verifies the model beforehand), but having all that information per layer is unnecessary, I think. The runtime should be as dumb and simple as possible.
I just put these structures together without thinking. How about I outsource the info struct (making it optional), so we can handle this stuff later?
The info struct must be findable, and if it's optional we should put it in the set structure, shouldn't we?
Something like this?
operator_set operator_set__onnx__7 = {
    .version = 7,
    .domain  = "onnx",
    .length  = 3,
    .entries = {
        {
            .name     = "Relu",
            .resolver = (operator_resolver) &resolve_operator__onnx__relu__6,
            .info     = <pointer to info struct>
        }, {
            .name     = "Reshape",
            .resolver = (operator_resolver) &resolve_operator__onnx__reshape__5,
            .info     = <pointer to info struct>
        }, {
            .name     = "Add",
            .resolver = (operator_resolver) &resolve_operator__onnx__add__7,
            .info     = <pointer to info struct>
        }
    }
};
IMHO [1] and [2] are the way to go, and we have all the pieces of the puzzle. We just need to make some minor modifications (e.g. resolve_operator__onnx__add__7 should accept the new interface node_context *ctx). Once these modifications are done in the Python script, we can start using it and resolving the functions properly (not hardcoded as it is so far).
I have the feeling we would make our life easier if we had a reference inside the node_context to the set_entry.
I see it this way: we have two structures regarding a node.
- The node_context, which is our data structure needed to execute a specific, unique node instance in our runtime. Everything our node instance needs as context is put there, so the runtime can work with it.
- Whatever resides inside the set_entry, which helps map the onnx node to an executable runtime node.
If we want to translate an onnx node, we ask the set structure to "build" our executable node_context. Therefore the set entry we use to create our node_context holds the runtime-specific things needed to instantiate the node_context. It is the only mapping between an onnx node and our node implementations. So everything that a node_context needs, which is not specific to its use but to its type, resides inside this set_entry (like the resolver function).
So if we want to add something to a node (i.e. the sanity checks), all nodes of the same type share this information, and the only place to put it would be this set_entry. A reference to the set entry would do two things:
- make our internal type/implementation of a node_context easily identifiable in our runtime (no need to find it anymore)
  - may be used for debugging (we can easily identify what the runtime actually uses to build/execute this node)
- make extensions possible at a single point
  - you don't need to touch the node_context to add information to a node
I see this set_entry as our node description, parallel to the onnx node description, and the node_context as the actual instance that glues these two parts together and integrates them into the runtime.
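A rough sketch of the proposed back-reference; all structure layouts here are placeholders I invented for illustration, not connxr's real code:

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

struct node_context;
typedef void *(*operator_resolver)(struct node_context *ctx);

/* Everything shared by all nodes of one type lives in the set entry:
 * the resolver, and optionally the sanity-check info struct. */
typedef struct operator_set_entry {
    char              *name;      /* e.g. "Add" */
    operator_resolver  resolver;  /* shared resolver for this node type */
    void              *info;      /* optional info struct, also shared */
} operator_set_entry;

/* Each node instance keeps a reference to the entry it was built from. */
typedef struct node_context {
    operator_set_entry *entry;    /* identifies the type; single extension point */
    /* per-instance data (inputs, outputs, attributes) would live here */
} node_context;

/* With the back-reference, the node's type is identifiable without
 * searching the set structure again. */
const char *node_type_name(const node_context *ctx)
{
    return ctx->entry->name;
}
```

Extending a node type (e.g. attaching sanity-check data) then only touches the set_entry, never the node_context itself.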
Your thoughts?
Regarding #18
I'm a bit confused. I understand that files like list.h will be removed, right?
I will cherry-pick everything needed in a new PR and close the draft PR.
So I would focus on implementing the operator resolution and integrating it with what we currently have. We are quite close to start implementing the operators, and I think that we have a really nice architecture.
I completely concur :D
I just want to finish this discussion before I touch the generator.