Our current architecture in sample-factory is just an MLP encoder; I suspect a permuta

Add examples of permutation invariant architectures

eugenevinitsky commented on August 11, 2024

Add examples of permutation invariant architectures

Comments (5)

eugenevinitsky commented on August 11, 2024

Oh hooray! There are no docs yet, sorry; I might write them this weekend. In the meantime, I've written up a description of each of the states below, along with a pointer to the chunk of code where they're computed. All the feature extraction code is in lines 142-203 of https://github.com/facebookresearch/nocturne/blob/main/nocturne/cpp/src/scenario.cc

The 500 is the fixed number of road points that are viewed, controlled by the scenario.max_visible_road_points value in the config. We pad up to this fixed value so that users always receive a fixed-size state; if you don't want padding, you can set padding=false in the call. The column values are detailed below.

For notation, I'll call the vehicle whose features are being constructed the "ego" agent. Wherever possible, everything is constructed in the ego-centric frame.
Object features (zero-indexed):
Element 0: a 1 if it's a valid feature i.e. not padding, 0 otherwise
Element 1: distance between the ego and the agent
Element 2: angle between the positions of the ego and the agent with angle increasing counter-clockwise
Element 3: length of the object
Element 4: width of the object
Element 5: angle between the headings of the agents
Element 6: angle between the velocities of the agents
Element 7: norm of the relative velocities of the agents
Elements 8-12: a one-hot vector indicating the type of the object. In order, these are unset, vehicle, pedestrian, cyclist, other.

Since road points are connected, we refer to the next road point in the road line as the "neighbor road point", i.e. the road line is a series of connected points and there's a direction in which we traverse it. Note that all of these are also in the ego-centric frame.
Road point features (zero-indexed):
Element 0: a 1 if it's a valid feature i.e. not padding, 0 otherwise
Element 1: distance between the ego and the road point
Element 2: angle between the positions of the ego and the road point increasing counter-clockwise
Element 3: distance between the road point and its neighbor point in the road.
Element 4: angle between the road point and its neighbor point in the road.
Elements 5-12: a one-hot vector indicating the type of the road point. In order, these are none, lane, road-line, road-edge (these can't be crossed without colliding with them), stop-sign, crosswalk, speedbump, other. Note that there's a tiny bug here: stop-sign is included as a possible type but shouldn't be, because stop signs are a distinct object from road points.
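To make the road point layout above concrete, here is a minimal NumPy sketch that strips padding rows and names the columns. The helper `decode_road_points` and the `ROAD_TYPES` list are hypothetical names for illustration, not part of nocturne, and the conversion to x/y assumes the angles are in radians and measured from the ego's heading, which is not stated above.

```python
import numpy as np

# Order taken from the description of elements 5-12 above.
ROAD_TYPES = ["none", "lane", "road-line", "road-edge",
              "stop-sign", "crosswalk", "speedbump", "other"]


def decode_road_points(road_points):
    """Split an (N, 13) road point array into named, unpadded columns."""
    road_points = np.asarray(road_points, dtype=np.float64)
    valid = road_points[:, 0] > 0.5         # element 0: 1 = real point, 0 = padding
    pts = road_points[valid]
    dist, angle = pts[:, 1], pts[:, 2]      # polar position relative to the ego
    # Assumption: angles are in radians, counter-clockwise from the ego heading.
    xy = np.stack([dist * np.cos(angle), dist * np.sin(angle)], axis=1)
    type_idx = pts[:, 5:13].argmax(axis=1)  # elements 5-12: one-hot type
    return {
        "xy": xy,
        "neighbor_dist": pts[:, 3],
        "neighbor_angle": pts[:, 4],
        "type": [ROAD_TYPES[i] for i in type_idx],
    }


# Synthetic example: two real points followed by one padding row.
demo = np.zeros((3, 13))
demo[0, :5] = [1.0, 10.0, 0.0, 0.5, 0.0]
demo[0, 5 + 1] = 1.0                        # lane
demo[1, :5] = [1.0, 5.0, np.pi / 2, 0.5, 0.0]
demo[1, 5 + 3] = 1.0                        # road-edge
decoded = decode_road_points(demo)
print(decoded["type"], decoded["xy"].shape)
```

The same masking on element 0 applies to the object and stop-sign arrays, since all three use it as the validity flag.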

Stop-sign features (also in ego-centric coordinates):
Element 0: a 1 if it's a valid feature i.e. not padding, 0 otherwise
Element 1: distance between the ego and the stop sign
Element 2: angle between the positions of the ego and the stop sign

Ego features (these describe the ego vehicle itself):
Element 0: length of ego vehicle
Element 1: width of ego vehicle
Element 2: speed of ego vehicle
Element 3: distance to goal
Element 4: angle to goal
Element 5: desired heading at goal
Element 6: desired speed at goal
Element 7: current acceleration
Element 8: current steering angle
Element 9: current angle of head
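Since the issue asks for permutation invariant architectures, here is one possible sketch of how these feature groups could feed a DeepSets-style encoder: a shared MLP embeds every row of a set, and a masked max-pool over the valid rows (element 0) gives an embedding that does not depend on row order. This is plain NumPy with random weights, purely to show the structure; `init_mlp`, `mlp`, `encode_set`, and the layer sizes are hypothetical choices, not nocturne or sample-factory code.

```python
import numpy as np


def init_mlp(rng, in_dim, hidden_dim, out_dim):
    """Random weights for a two-layer MLP (untrained, for illustration only)."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)), "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.1, (hidden_dim, out_dim)), "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    hidden = np.maximum(x @ params["w1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["w2"] + params["b2"]


def encode_set(params, rows):
    """Embed each row with a shared MLP, then max-pool over valid rows only."""
    valid = rows[:, 0] > 0.5                       # element 0 marks non-padding rows
    emb = mlp(params, rows)                        # (N, out_dim), same MLP for every row
    emb = np.where(valid[:, None], emb, -np.inf)   # padding can never win the max
    pooled = emb.max(axis=0)
    return np.where(np.isfinite(pooled), pooled, 0.0)  # all-padding set -> zeros


rng = np.random.default_rng(0)
obj_params = init_mlp(rng, 13, 32, 16)    # object rows have 13 elements
road_params = init_mlp(rng, 13, 32, 16)   # road point rows have 13 elements

# Synthetic inputs with the shapes described above (not real nocturne data).
objects = rng.normal(size=(8, 13))
objects[:, 0] = [1, 1, 1, 0, 0, 0, 0, 0]
road_points = rng.normal(size=(500, 13))
road_points[:, 0] = (np.arange(500) < 120).astype(float)
ego = rng.normal(size=10)

# The ego vector is a single item, so it is concatenated directly.
state = np.concatenate([
    ego,
    encode_set(obj_params, objects),
    encode_set(road_params, road_points),
])
print(state.shape)
```

Stop signs could be handled the same way with a third shared MLP over their 3-element rows. Max-pooling is only one choice; a masked mean or sum, or attention over the valid rows, would also be permutation invariant.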

Hope that helps! I'll write it up into a nice doc shortly.

eugenevinitsky commented on August 11, 2024

Btw, do feel free to message if more issues arise. I'd really like to see this feature work.

katerakelly commented on August 11, 2024

I am looking into doing this :)
However, I am struggling to understand the data format. For example, the road points (accessed through the Python wrapper via scenario.visible_state(obj, view_dist, view_angle)['road_points']) are a 2-dimensional array, for example of shape (500, 13). How should this data be interpreted? I read in the white paper that the data is in VectorNet format; can each row then be decoded into (start, end, features, idx)? Could you point me to some processing code or docs to understand this? Thanks.

katerakelly commented on August 11, 2024

Hi Eugene,

Question about the road points: my understanding is that the single road_points array contains all the road points for all the different line segments of road visible to that agent. For example, both the left and right lines of the road would have points in there. Is there any index indicating which line segment each point belongs to? I am asking because I want to embed line segments separately rather than embedding all the road points together.

Thanks for your help!

eugenevinitsky commented on August 11, 2024

Hi Kate,

Unfortunately there isn't. There are elements describing the neighboring point in the line (elements 3 and 4 of the vector), but no direct pointer to the entire line itself. I admit that does seem useful, so we could provide such functionality, but we would have to write it; it doesn't already exist.
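Until something like that exists, one heuristic workaround might be to rebuild line membership from elements 1-4: predict where each point's neighbor should be, match that to the closest actual point, and merge matched points into groups. The sketch below is not nocturne functionality. `group_into_lines` and `tol` are hypothetical, and it assumes angles are in radians and that element 4 is the direction from a point to its neighbor expressed in the ego frame, which would need to be checked against scenario.cc.

```python
import numpy as np


def group_into_lines(road_points, tol=0.1):
    """Heuristically assign a line id to each valid road point (-1 for padding)."""
    rp = np.asarray(road_points, dtype=np.float64)
    valid_idx = np.flatnonzero(rp[:, 0] > 0.5)
    labels = np.full(len(rp), -1, dtype=int)
    if valid_idx.size == 0:
        return labels
    pts = rp[valid_idx]

    def polar_to_xy(dist, angle):
        return np.stack([dist * np.cos(angle), dist * np.sin(angle)], axis=1)

    xy = polar_to_xy(pts[:, 1], pts[:, 2])
    # Assumption: element 4 is the ego-frame direction from point to neighbor.
    nbr_xy = xy + polar_to_xy(pts[:, 3], pts[:, 4])

    parent = np.arange(len(pts))  # union-find forest over valid points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Distance from every predicted neighbor position to every actual point.
    d = np.linalg.norm(nbr_xy[:, None, :] - xy[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    for i, j in enumerate(nearest):
        if i != j and d[i, j] <= tol:
            parent[find(i)] = find(j)      # the point and its neighbor share a line

    roots = np.array([find(i) for i in range(len(pts))])
    _, line_ids = np.unique(roots, return_inverse=True)
    labels[valid_idx] = line_ids
    return labels


def make_row(x, y, nbr_dist, nbr_angle):
    """Build one synthetic road point row from ego-frame x/y (type left unset)."""
    row = np.zeros(13)
    row[:5] = [1.0, np.hypot(x, y), np.arctan2(y, x), nbr_dist, nbr_angle]
    return row


# Two parallel synthetic lines, interleaved, plus one padding row.
demo = np.stack([
    make_row(1, 1, 1, 0), make_row(1, -1, 1, 0),
    make_row(2, 1, 1, 0), make_row(2, -1, 1, 0),
    make_row(3, 1, 0, 0), make_row(3, -1, 0, 0),
    np.zeros(13),
])
labels = group_into_lines(demo)
print(labels)
```

With labels like these, each line's points could be embedded as their own set before pooling across lines. The matching can fail when a point's neighbor falls outside the visible region or when points from different lines nearly coincide, so `tol` would need tuning on real scenarios.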
