nrel / routee-compass

The RouteE-Compass energy-aware routing engine

Home Page: https://nrel.github.io/routee-compass/

License: BSD 3-Clause "New" or "Revised" License

Languages: Shell 0.06%, Python 6.13%, Rust 93.81%
Topics: networks, road-networks, route-planning, routing, eco-routing

routee-compass's Introduction

RouteE Compass

RouteE Compass is an energy-aware routing engine for the RouteE ecosystem of software tools with the following key features:

  • Dynamic and extensible search objectives that allow customized blends of distance, time, cost, and energy (via RouteE Powertrain) at query-time
  • Core engine written in Rust for improved runtimes, parallel query execution, and the ability to load nation-sized road networks into memory
  • Rust, HTTP, and Python APIs for integration into different research pipelines and other software

RouteE Compass is a part of the RouteE family of mobility tools created at the National Renewable Energy Laboratory and uses RouteE Powertrain to predict vehicle energy during the search.

Installation

See the installation guide for installing RouteE Compass.

Usage

See the documentation for more information.

Contributors

RouteE Compass is currently maintained by Nick Reinicke (@nreinicke) and Rob Fitzgerald (@robfitzgerald).

If you're interested in contributing, please check out the contributing guide.

License

Copyright 2023 Alliance for Sustainable Energy, LLC

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

routee-compass's People

Contributors

bradonzhang, jhoshiko, kylecarow, nreinicke, robfitzgerald, snoopj, vbmendes, zenon18


routee-compass's Issues

grid search support for objects in arrays is limited

while looking to update the notebook test case to the new cost model, i attempted to reproduce the grid search of time vs energy-optimal routes:

{
  "ignore_me": 1234,
  "grid_search": {
    "state_variable_coefficients": [
      { "time": 1, "energy": 0 },
      { "time": 0, "energy": 1 }
    ]
  }
}

however, this simply copies the keys into the root object:

[
  { "ignore_me": 1234, "time": 1, "energy": 0 },
  { "ignore_me": 1234, "time": 0, "energy": 1 }
]

what i wanted was

[
  { "ignore_me": 1234, "state_variable_coefficients": { "time": 1, "energy": 0 }},
  { "ignore_me": 1234, "state_variable_coefficients": { "time": 0, "energy": 1 }}
]

it is implemented this way because we also want a way to copy files directly into the object (the current behavior), removing the nesting.

potential solution

perhaps we need a convention to know when to perform which behavior. for example, using a leading underscore ("_") in the key could signify that the key should be dropped, such as with _coordinates, below:

{
  "ignore_me": 1234,
  "grid_search": {
    "state_variable_coefficients": [
      { "time": 1, "energy": 0 },
      { "time": 0, "energy": 1 }
    ],
    "_coordinates": [
      { "x": -122, "y": 39 },
      { "x": -121, "y": 38 }
    ]
  }
}

one of the results of this grid search could be:

{
  "ignore_me": 1234,
  "state_variable_coefficients": {
    "time": 1,
    "energy": 0
  },
  "x": -122,
  "y": 39
}
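as a sketch of the proposed convention (a hypothetical helper, not the current implementation, assuming serde_json values): when expanding one grid search permutation, an underscore-prefixed key is dropped and its fields are merged flat, while all other keys keep their nesting.

use serde_json::{Map, Value};

// hypothetical helper applied once per grid_search key when building a permutation
fn merge_grid_entry(query: &mut Map<String, Value>, key: &str, selected: Value) {
    if key.starts_with('_') {
        // underscore-prefixed key: drop the key itself and merge its fields into the root
        if let Value::Object(inner) = selected {
            for (k, v) in inner {
                query.insert(k, v); // e.g. "x" and "y" from "_coordinates"
            }
        }
    } else {
        // regular key: keep the nesting, e.g. "state_variable_coefficients"
        query.insert(key.to_string(), selected);
    }
}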

Edge RTree resulting in no path found

When using the edge based rtree, with the following config:

Config
parallelism = 2
search_orientation = "edge"

[graph]
edge_list_input_file = "data/tomtom_metro_denver_network/edges-compass.csv.gz"
vertex_list_input_file = "data/tomtom_metro_denver_network/vertices-compass.csv.gz"
verbose = true

[traversal]
type = "distance"
distance_unit = "miles"

[plugin]
input_plugins = [
    { type = "edge_rtree", geometry_input_file = "data/tomtom_metro_denver_network/edges-geometries-enumerated.txt.gz", road_class_input_file = "data/tomtom_metro_denver_network/edges-road-class-enumerated.txt.gz" },
    { type = "grid_search" },
    { type = "load_balancer", weight_heuristic = { type = "haversine" } },
]
output_plugins = [
    { type = "summary" },
    { type = "traversal", geometry_input_file = "data/tomtom_metro_denver_network/edges-geometries-enumerated.txt.gz" },
]

I get the error:

ERROR routee_compass] Error: "no path exists between vertices 57208 and 356583"

There are no restrictions being applied here and so it's not a problem with certain edges being excluded from the search. I wonder if the edge based RTree is starting the search at a terminal vertex?

Standardize paths types

Right now we have a mix of different ways to represent our paths in the application, for example:

  • PathBuf
  • &Path
  • P: AsRef<Path>

We should spend some time to review and attempt to standardize these where it makes sense to do so.
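As one possible convention (a sketch only, not a decision), functions that only read a path could accept a generic AsRef<Path> argument, reserving PathBuf for owned storage:

use std::path::Path;

// hypothetical example signature following the AsRef<Path> convention
fn read_edge_list<P: AsRef<Path>>(path: P) -> std::io::Result<Vec<u8>> {
    // std::fs::read itself takes AsRef<Path>, so the generic passes straight through
    std::fs::read(path)
}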

Web Service API

issue imported from internal repo

Compass V2 App is itself just a Json -> Result<Json, CompassAppError>. in order to serve this via HTTP, we want a new cargo module, named something like "compass-http" or "compass-server", with the added dependency of a rust HTTP server library. some additional thoughts:

  • we run the server from the CLI, passing the TOML needed to create CompassApp
  • that TOML should also likely have a [server] section with server arguments
  • we instantiate the CompassApp
  • we wrap the CompassApp search method in a POST endpoint
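as a rough sketch of the endpoint shape (assuming the axum and tokio crates, axum 0.7-style; the issue does not commit to a particular server library), the POST handler would deserialize the query JSON, call the CompassApp search, and return the JSON result:

use axum::{routing::post, Json, Router};
use serde_json::Value;

// hypothetical handler: this is where the CompassApp search would be invoked on the query
async fn run_query(Json(query): Json<Value>) -> Json<Value> {
    Json(query) // placeholder: echoes the query instead of a real search result
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/query", post(run_query));
    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}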

over-use of cloning on things that could be dereferenced or simply passed by reference

noting that

  • there are quite a few places where something like a newtype is being cloned just because the function signature expects a type passed by value, not by reference (see the units libraries like Distance and DistanceUnit, etc)
  • some things are being passed by value that could be passed by reference, and those calls end up cloning those values (can't think of where i've seen this but i know there's a good deal of it)

this unnecessary cloning adds a time and space complexity hit when we are talking about low-level operations such as unit conversion, serialization, or anything in the loop of search, so we should go spelunking to remove them.

expose graph operations to python

in order to make an integration between RouteE Compass and HIVE, we need to be able to lookup Link data by LinkId. in general, this relates to exposing Graph methods from CompassAppWrapper so we can locally traverse the graph and inspect the graph properties for representing state in HIVE.

Make heading a unit type

#103 introduced the idea of the EdgeHeading:

pub struct EdgeHeading {
    start_heading: i16,
    end_heading: Option<i16>,
}

It would be a bit more robust to make a unit Heading type in the library rather than just referring to the headings as i16.
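A minimal sketch of what such a unit type could look like (names are illustrative):

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Heading(pub i16); // newtype wrapping the raw degree value

pub struct EdgeHeading {
    start_heading: Heading,
    end_heading: Option<Heading>,
}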

move JSON getter extensions to core crate

we have helpful utilities to grab arbitrary things from JSON values and deserialize them into other types:

config.get_config_serde_optional::<MySerdeType>(&"key", &"parent_key")

the definition for these is in the routee-compass crate, which means they cannot be used by anything upstream in the core crate. the core crate is the place where a few default traversal models are defined, and it's counter-intuitive that the services need to live in a separate place from these. and, really, pulling stuff off of JSON is a pretty generally-used pattern in this repo.

the error type for these methods also lives elsewhere: it is a CompassConfigurationError.

to migrate this to the core crate, we could create a new error type just for these methods which only covers the subset of failures related to deserializing.
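a sketch of what that deserialization-only error type and getter trait might look like in the core crate (names are hypothetical; assumes the serde and thiserror crates):

use serde::de::DeserializeOwned;

#[derive(thiserror::Error, Debug)]
pub enum JsonExtensionError {
    #[error("expected field '{0}' under '{1}' but it was missing")]
    MissingField(String, String),
    #[error("failed to deserialize field '{0}': {1}")]
    DeserializationError(String, String),
}

pub trait JsonExtensions {
    // mirrors the getter shown above, but only covers deserialization failures
    fn get_serde_optional<T: DeserializeOwned>(
        &self,
        key: &str,
        parent_key: &str,
    ) -> Result<Option<T>, JsonExtensionError>;
}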

parameterize energy model at query time

at present, the server is initialized with a single energy model. but energy models have a relatively small footprint in RAM; for example, the Camry Coupe model we used in tests is about 420kb on my computer. it seems like it would be simple to load a catalog of models and allow users to pick the model type at query time. this would allow researchers to look at how sets of queries vary across models, or allow a RouteE/HIVE integration to work with heterogeneous fleets.

if a query included a field requesting some model by name:

{
  "model_name": "2016_TOYOTA_Camry_4cyl_2WD"
}

under-the-hood, there could be something along the lines of a HashMap<String, SpeedGradeModelRecord> that would hold the catalog of in-memory models, such as the entire current catalog (.zip) of 60+ trained models we have. we would then have a SpeedGradeModelRecord that has the model and metadata:

struct SpeedGradeModelRecord {
  name: String,
  model: SpeedGradePredictionModel,
  speed_unit: SpeedUnit,
  grade_unit: GradeUnit,
  energy_rate_unit: EnergyRateUnit
}
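a sketch of the query-time lookup against such a catalog (select_model is a hypothetical helper; the record type is kept generic here but would be the SpeedGradeModelRecord above):

use std::collections::HashMap;
use std::sync::Arc;

fn select_model<M>(
    catalog: &HashMap<String, Arc<M>>,
    query: &serde_json::Value,
) -> Result<Arc<M>, String> {
    // read the "model_name" field from the query
    let name = query
        .get("model_name")
        .and_then(|v| v.as_str())
        .ok_or_else(|| String::from("query is missing a 'model_name' field"))?;
    // find the named record in the in-memory catalog
    catalog
        .get(name)
        .cloned()
        .ok_or_else(|| format!("no energy model found with name {}", name))
}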

time or distance-optimal route plans have poor runtime performance

after a monster refactor to support parameterized cost models, we were getting normal energy-optimal runtimes from the EnergyTraversalModel, but time and distance-optimal routes are still taking a long time, suggesting that the a* heuristic is not correctly calculated for those objectives. see table here.

to reproduce, use this test case on our internal tomtom-based denver scenario:

{
  "model_name": "2017_CHEVROLET_Bolt",
  "starting_soc_percent": 100,
  "destination_y": 39.62627481432341,
  "destination_x": -104.99460207519721,
  "origin_y": 39.798311884359094,
  "origin_x": -104.86796368632217,
  "state_variable_coefficients": {
    "distance": 0,
    "time": 0,
    "energy_electric": 0
  }
}

Update PHEV control logic

Our current PHEV control logic does a simple switching between electric and gasoline based on battery state of charge. A couple of things that would make this a bit more accurate:

  • Allow the vehicle to consume both gasoline and electrical energy on a link if the battery runs out of energy mid link.
  • Add a new charge sustaining routee-powertrain model that returns both gasoline and electrical energy (to capture things like regen braking)

Update optional output directory from application

In #22 we added a new output plugin that writes the results of the application to an output file and that output file is specified in the config for the output plugin. This is inconvenient if you want to run batches of queries and save them to different files.

One idea to address this would be to expose a method on the compass application that attempts to set the output directory for the to disk plugin. It would have to scan the output plugins that have been loaded into the app and then either err if the to disk output plugin isn't loaded or modify its output directory. Then, users could load the application like:

app = CompassApp.from_config_file("my/config.toml").with_output_directory("path/to/outputs.json")

app = app.with_output_directory("new/path/to/outputs.json")

Represent regen braking

This task involves representing regen braking events for a BEV or PHEV vehicle which get returned from the underlying powertrain model as a negative energy value. Since we can't have negative costs in our path search algorithm, we could potentially add an offset to each link traversal that will always be bigger than the smallest possible negative energy value returned from the powertrain model. One place to start would be to just use the battery size since there are no realistic scenarios where a powertrain model would return a negative value larger than the vehicle battery size for a single link traversal (although we should still check to make sure this is true at runtime).

Here's an example of a couple link passes for a BEV with a 40kwh battery:

flat link 1 mile: 0.3kwh + 40kwh = 40.3kwh cost
downhill link 1 mile: -0.05kwh + 40kwh = 39.95kwh cost

At the end of the search we would need to know to strip the offset back out to get the raw energy values and we would need to be careful when combining the energy cost with the time cost or with a utility based model.
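A small sketch of the offset arithmetic described above (hypothetical helper functions):

fn link_energy_cost(raw_energy_kwh: f64, battery_capacity_kwh: f64) -> f64 {
    // e.g. a downhill link returning -0.05 kWh with a 40 kWh battery costs 39.95
    raw_energy_kwh + battery_capacity_kwh
}

fn recover_raw_energy(total_cost_kwh: f64, battery_capacity_kwh: f64, n_links: usize) -> f64 {
    // each traversed link carried one copy of the offset, so strip them all back out
    total_cost_kwh - battery_capacity_kwh * n_links as f64
}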

improve error message when malformed/empty input_file is provided

while building a new TOML file, i stubbed out a required input_file:

geometry_input_file = ""

and forgot to come back and fill it in. when i ran Compass, i got a somewhat cryptic error:

Could not find incoming configuration file, tried  and configuration/. Make sure the file exists and that the config key ends with '_input_file'

this doesn't tell us what the key was, and so the resulting message is confusing. also, this message is only possible if we have already met its later requirement about naming, "that the config key ends with '_input_file'", so that part is a red herring.

crate readmes need to be github markdown

each crate's Cargo.toml currently sets its readme field to the "README.md" file at the crate root. those files are also used by rust docs, and extend github markdown with additional features such as links like crate::app::compass_app::CompassApp, which cannot be displayed on crates.io, as seen here.

let's write markdown content for each crate that is restricted to plain github markdown. let's also keep it simple, something like this, because the reader of this readme just wants high-level information.

Energy traversal service gets cloned

speed_grade_energy_model_builder.rs has a service defined inline with the builder; in total, the chain looks like this:

SpeedGradeEnergyModelBuilder (1)
|
SpeedGradeModelService (2)
|
SpeedGradeEnergyModelService (3)
|
SpeedGradeModel (4)

there's an extra layer there. in the process, we end up cloning SpeedGradeModelService (2) each time we build the SpeedGradeModel, which now means cloning the hashmap of prediction models (cloning the speed lookup table is ok, that is cloning an Arc).

the cloning is likely leading to longer runtimes as each query creates a clone. this should be rewritten so that no cloning is needed.
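a simplified sketch of the no-clone shape (hypothetical names): if the service and the built model share the catalog behind an Arc, building a model per query copies a pointer rather than the whole hashmap of prediction models.

use std::collections::HashMap;
use std::sync::Arc;

pub struct PredictionModel; // stand-in for the routee-powertrain model type

pub struct EnergyModelService {
    energy_models: Arc<HashMap<String, Arc<PredictionModel>>>,
}

pub struct EnergyModel {
    energy_models: Arc<HashMap<String, Arc<PredictionModel>>>,
}

impl EnergyModelService {
    pub fn build_model(&self) -> EnergyModel {
        EnergyModel {
            energy_models: Arc::clone(&self.energy_models), // cheap pointer copy, no hashmap clone
        }
    }
}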

parallelize input plugin operations

i'm running routee-compass with 2200 queries. each has 3 grid search values. as a result, the original JSON values are cloned 6600 times during the grid search plugin. they are also cloned during the rtree plugin operation. apart from some other usual stuff, these queries also have long WKT strings attached to them that are stored for later.

this is taking a really long time, 10+ minutes on my laptop. we can address this by parallelizing the run of input plugins in CompassApp. if kdam::Bar is thread-safe, we can also provide a progress bar for the user.
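a sketch of the parallel run (assumes the rayon crate and a simplified stand-in for the input plugin interface; the real trait in the repo may differ):

use rayon::prelude::*;
use std::sync::Arc;

// hypothetical minimal stand-in for the input plugin interface
pub trait InputPlugin: Send + Sync {
    fn process(&self, query: serde_json::Value) -> Result<Vec<serde_json::Value>, String>;
}

pub fn run_input_plugins(
    queries: Vec<serde_json::Value>,
    plugins: &[Arc<dyn InputPlugin>],
) -> Vec<Result<Vec<serde_json::Value>, String>> {
    queries
        .into_par_iter()
        .map(|query| {
            // apply each plugin in order, flattening the results
            // (e.g. grid_search expands one query into many)
            plugins.iter().try_fold(vec![query], |expanded, plugin| {
                let mut next = Vec::new();
                for q in expanded {
                    next.extend(plugin.process(q)?);
                }
                Ok(next)
            })
        })
        .collect()
}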

Support batch runs

When running a large amount of queries to one call of CompassApp.run() it would be nice to optionally batch the queries and periodically write the results to a file rather than holding them in memory.

filter by road classes doesn't sync with the rtree input plugin

this issue imported from the internal repo

r-tree pre-processing maps coordinates to vertex ids on the map. if the user chooses a frontier model that restricts some links, the r-tree will not know of these restrictions. as a result, when the road class frontier model ignores TomTom's "restricted links" (road class 7), the denver beer run test case fails, as the origin or destination must have been pegged to vertices that are only connected via restricted links:

[frontier]
type = "road_class"
road_class_file = "data/tomtom_metro_denver_network/edges-road-class-enumerated.txt.gz"
valid_road_classes = ["1", "2", "3", "4", "5", "6"]  # ignore 7, "restricted road"
{
        "error": "no path exists between vertices 57209 and 356583",
        "request": {
            "destination_name": "Comrade Brewing Company",
            "destination_vertex": 356583,
            "destination_x": -104.9009913,
            "destination_y": 39.6757025,
            "energy_cost_coefficient": 0.0,
            "origin_name": "NREL",
            "origin_vertex": 57209,
            "origin_x": -105.1710052,
            "origin_y": 39.7402804
        }
    },

some possible solutions:

  • make an edge-oriented rtree that is built based on the DirectedGraph or otherwise knows the list of valid edges
    • may require passing the DirectedGraph or the SearchApp in the InputPlugin constructor
  • promote map matching algorithms from InputPlugin to its own component of SearchApp, where we can build it based on the combination of the frontier model and the directed graph
  • modify the a star priority values so they are a tuple of (-int(cost * f), -road_class) where f is some factor such as 100
    • no longer possible since we removed road class annotations from Edge
    • requires a lower-bound on cost values, which is practical if cost is utility-based
    • cast to integers in order to make an equality check possible
    • follows the road network hierarchy as a tie breaker, using higher-priority links when the costs are equivalent

Investigate turn restrictions

When running some test queries, I noticed that our turn restrictions might not be functioning entirely right. For example, we see this u-turn behavior at the destination of one query:

[image: map view showing the u-turn behavior at the query destination]

But, when looking at this intersection on google maps, there is indeed a turn lane to go from Union to W 2nd Pl:

[image: google maps view of the turn lane from Union to W 2nd Pl]

We should take a deeper look at the turn restrictions and make sure they're accurately represented.

pip install fails with "current package believes it's in a workspace when it's not"

Problem

Attempting to pip install nrel.routee.compass from pypi results in a rust compiler error during maturin build. The error message implies that routee-compass-py is not a member of the surrounding workspace. There is also an error asking "Does your crate compile with cargo build?"

Context

(nrel.routee.compass) rfitzger-36698s:routee rfitzger$ pip install nrel.routee.compass[osm]
Looking in indexes: https://pypi.org/simple, https://github.nrel.gov/pages/MBAP/mbap-pypi/
Collecting nrel.routee.compass[osm]
  Downloading nrel.routee.compass-0.2.0.tar.gz (7.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 4.2 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      error: current package believes it's in a workspace when it's not:
      current:   /private/var/folders/bb/q3jbvk751d74rvpmrs35h4q1dfwf5g/T/pip-install-wz1dtf6z/nrel-routee-compass_f5f3968fad714f72a7f017ef26e05bcc/rust/routee-compass-py/Cargo.toml
      workspace: /private/var/folders/bb/q3jbvk751d74rvpmrs35h4q1dfwf5g/T/pip-install-wz1dtf6z/nrel-routee-compass_f5f3968fad714f72a7f017ef26e05bcc/rust/Cargo.toml
      
      this may be fixable by adding `routee-compass-py` to the `workspace.members` array of the manifest located at: /private/var/folders/bb/q3jbvk751d74rvpmrs35h4q1dfwf5g/T/pip-install-wz1dtf6z/nrel-routee-compass_f5f3968fad714f72a7f017ef26e05bcc/rust/Cargo.toml
      Alternatively, to keep it out of the workspace, add the package to the `workspace.exclude` array, or add an empty `[workspace]` table to the package's manifest.
      💥 maturin failed
        Caused by: Cargo metadata failed. Does your crate compile with `cargo build`?
        Caused by: `cargo metadata` exited with an error:
      Error running maturin: Command '['maturin', 'pep517', 'write-dist-info', '--metadata-directory', '/private/var/folders/bb/q3jbvk751d74rvpmrs35h4q1dfwf5g/T/pip-modern-metadata-jf2vqfe5', '--interpreter', '/Users/rfitzger/anaconda3/envs/nrel.routee.compass/bin/python3.11']' returned non-zero exit status 1.
      Checking for Rust toolchain....
      Running `maturin pep517 write-dist-info --metadata-directory /private/var/folders/bb/q3jbvk751d74rvpmrs35h4q1dfwf5g/T/pip-modern-metadata-jf2vqfe5 --interpreter /Users/rfitzger/anaconda3/envs/nrel.routee.compass/bin/python3.11`
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

missing termination "type" falls back to default even if other fields are present

running a configuration with the following:

[termination]
models = [
    { type = "query_runtime", limit = "00:10:00" },
    { type = "solution_size", limit = 30000000 },
]

i would expect this to fail, since the (required) type field is missing, but instead, it runs. that is because the reified final config, after merging with defaults, looks like this:

[termination]
type = "query_runtime"
limit = "00:02:00"
frequency = 100_000
models = [
    { type = "query_runtime", limit = "00:10:00" },
    { type = "solution_size", limit = 30000000 },
]

here, "models" gets ignored, since the "query_runtime" type is set, and it loads correctly since all required fields are present.

to make it fail, i need to specify an invalid type:

[termination]
type = "pizza"
models = [
    { type = "query_runtime", limit = "00:10:00" },
    { type = "solution_size", limit = 30000000 },
]

which fails as expected:

[2023-11-21T20:54:02Z ERROR routee_compass] Could not build CompassApp from config file: unknown module pizza for component termination provided by configuration
Error: CompassConfigurationError(UnknownModelNameForComponent("pizza", "termination"))

PHEV implementation

Our current energy traversal model supports conventional vehicles and battery electric vehicles but not plugin hybrid electric (PHEV) vehicles and would require a bit of a modification to do so.

One primary limitation is that we're not keeping track of vehicle state of charge. This is important for a PHEV as it generally dictates what fuel type is going to be used. For example, if the state of charge is 100% and the route can be completed with battery energy only, we would report back 0 gasoline usage and X electrical energy usage. On the other hand, if the state of charge starts at 10% and the route cannot be completed using the battery energy, we might report back some electrical energy usage and some gasoline usage when the vehicle switches over to a different source of energy.

Routee-Powertrain currently models PHEVs by creating two separate models, a charge sustaining (battery and gasoline) model and a charge depleting (just battery) model.

To implement this as a traversal model we could build a simple PHEV that would use the charge depleting RouteE Powertrain model until the battery ran out of energy and then switch to the charge sustaining model for the remainder of the trip. This would require the user to pass in a state of charge parameter in the query and would require us to keep track of battery energy in our state variable. We would also need to define the battery capacity at the config level.
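A sketch of that simple switching rule (hypothetical names; the real logic would live inside the traversal model's state update):

pub enum PhevMode {
    ChargeDepleting,  // battery only
    ChargeSustaining, // battery + gasoline
}

pub fn select_mode(battery_energy_kwh: f64, estimated_link_energy_kwh: f64) -> PhevMode {
    // use the charge depleting model while enough battery energy remains for the link,
    // otherwise fall back to the charge sustaining model
    if battery_energy_kwh >= estimated_link_energy_kwh {
        PhevMode::ChargeDepleting
    } else {
        PhevMode::ChargeSustaining
    }
}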

bi-directional a star algorithm

we are hitting some slowdowns while using energy-based routing on national trips. while we work to address this by improving our cost functions and caching, we also have lower-hanging fruit: implementing bi-directional a star. there are also other improvements to a star on the same page linked above which might apply in our problem setting.

Improve rtree distance metric

edit

this issue originally implied that the distance metric for the vertex rtree was faulty, but it turns out that's not the case. PointDistance::distance_2 expects a value that is the squared Euclidean distance between two points.

an attempt was made to swap this with haversine distance in meters, but that broke a valid test case. see that thread for discussion. the gist is that

  • it may be that the distance metric needs to be in the same space/unit as the rtree coordinates
  • to improve over lat/lon, we would need to project our vertex coordinates to a coordinate system with more consistent distances, but we ran into some pains attempting to add the proj crate which would support that kind of operation

original description

for the vertex rtree, we have a distance function

impl PointDistance for RTreeVertex {
    fn distance_2(&self, point: &Coord) -> f64 {
        let dx = self.x() - point.x;
        let dy = self.y() - point.y;
        dx * dx + dy * dy
    }
}

this looks like a quick take on Euclidean distance but missing the square root call at the end (see here).

but we can do better: we have an implementation of haversine (routee-compass-core::util::geo::haversine) that can give us this distance value in meters and avoids some skew that comes from using WGS84 values directly in a Euclidean space.

CompassApp option to write responses to file system

when batches are submitted to CompassApp, it keeps all incremental response objects in memory until the batch is completed. for very large batches, this may result in an out-of-memory error. to avoid this, we can immediately write the response to disk and then discard the object from RAM.

implementation will require adding functionality to the rust CompassApp that writes responses to disk and removes the response object. this can be done one of two ways:

  • add another run method to CompassApp that appends each response object from a chunk to some output file
  • create a "to disk" output plugin (rjf: i think i prefer this one, it doesn't require changes to the CompassApp Python or Rust API)
    • perhaps a refactor of OutputPlugins to perform a flat_map operation similar to InputPlugins, so we can flatten the result

serializing responses as JSON and appending them to a file is likely simplest if we stick to a newline-delimited JSON format, and if we serialize the JSON as a single row (no newlines) object.
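a sketch of that newline-delimited JSON idea (hypothetical helper): serialize each response as a single-line object and append it to the output file as soon as it is produced.

use std::fs::OpenOptions;
use std::io::Write;

fn append_response(path: &str, response: &serde_json::Value) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    // Value's Display impl produces compact, single-line JSON
    writeln!(file, "{}", response)?;
    Ok(())
}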

Vehicle type discussion

When building out a new PHEV vehicle model in #33, we added a new Vehicle abstraction. Right now, there are two implementations of vehicles:

  • SingleFuelVehicle: this represents vehicles that just take in a single fuel source (ICE, BEV, HEV).
  • DualFuelVehicle: this represents vehicles that take in two fuel sources (PHEV).

This distinction works for now but we might need a more granular splitting of types since each of the vehicles represented in the SingleFuelVehicle will have different energy consumption behavior (and eventually energy addition behavior). For example, a BEV can return a negative energy value from the internal routee-powertrain model and this should be added back into the battery.

Perhaps we should build out a unique Vehicle implementation for each of the following:

  • ICE (gas and diesel)
  • BEV
  • HEV
  • PHEV

If we do this, there might be some shared code/behavior between some of these and so it might involve creating a set of functions to share between these models.

bad intra-crate links in rust documentation

just took a tour of the documentation on docs.rs for routee-compass. the links to routee-compass-core and routee-compass-powertrain both work but links like CompassApp and CompassAppBuilder do not, and these should link to the doc pages for those structs. i'm guessing there are similar problems across the docs in the workspace.

Explore energy turn costs

In addition to a time penalty for taking a turn, there is also a potential energy consequence (vehicle decelerates to take turn and then accelerates after turn). We should investigate ways to represent this and apply it in the search.

Improve routee-powertrain model error message

Right now if we pass in a model_name for computing energy routes and the SpeedGradeModel can't find a model that matches that name we just get a simple error:

No energy model found with name flying_car

It would be helpful for the user if we could list out all the available energy models so they can pick one from the error message rather than having to dig into the config (especially if this query was executed on an opaque server).

Missing edge field error

When running the national beer run over the latest compass code, I get back an error:

{"error":"missing field origin_edge","request":{"destination_name":"Jack's Abby Craft Lagers","destination_vertex":20659444,"destination_vertex_uuid":"0000554d-4100-2800-ffff-ffffd044a62d","destination_x":-71.4133386,"destination_y":42.2803461,"energy_cost_coefficient":0.0,"model_name":"2016_TOYOTA_Camry_4cyl_2WD","origin_name":"NREL","origin_vertex":188668,"origin_vertex_uuid":"00004358-3100-2800-ffff-ffffd02850f1","origin_x":-105.1710052,"origin_y":39.7402804,"query_weight_estimate":2827.912636598859}}

The vertex rtree plugin is enabled and it's correctly adding the origin and destination vertex ids and so it shouldn't be attempting an edge based search.

[Discussion] Turn Cost Abstraction

In #103 we introduced a static time turn cost to the energy traversal model but acknowledged that this functionality could be useful across different traversal models. In order to share the logic we would need to come up with a clean abstraction for how to represent a turn (mapping turn degrees to turn type) and how much cost to apply for said turn. We might also consider the idea of using road class transitions as an indicator of time or energy penalty as a motivation for how to implement this abstraction.

Update fuel representations

Right now we allow conversion between petroleum-based fuel and kWh using a fixed value of 33.41 kWh per gallon. This should be exposed to the vehicle configuration since different fuels have different energy densities. While we're at it, we could also add gallons_diesel as a new energy unit and update our vehicle library for any diesel vehicle to use this new unit (and its corresponding conversion between diesel and kWh).
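A sketch of exposing the energy density in configuration (hypothetical names; 33.41 kWh/gallon is the gasoline value mentioned above, and diesel would use a different value):

pub struct FuelConfig {
    // energy density of the configured fuel, e.g. ~33.41 for gasoline
    pub kwh_per_gallon: f64,
}

pub fn gallons_to_kwh(gallons: f64, fuel: &FuelConfig) -> f64 {
    gallons * fuel.kwh_per_gallon
}

pub fn kwh_to_gallons(kwh: f64, fuel: &FuelConfig) -> f64 {
    kwh / fuel.kwh_per_gallon
}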

inject plugin should have overwrite behaviors

noting that we have this inject plugin, and some configuration may have it used like this:

[[plugin.input_plugins]]
type = "inject"
key = "road_classes"
value = '["4","5","6","7"]'
format = "json"

what if an incoming query has a "road_classes" key already defined? really, we should be able to control what we do with this by giving inject plugins an "overwrite" behavior. i'm proposing here a taxonomy for overwrite but open to anything along these lines:

[[plugin.input_plugins]]
type = "inject"
key = "road_classes"
value = '["4","5","6","7"]'
overwrite = ""              # one of "error", "overwrite", "merge", "ignore"
format = "json"

where

  • "error" - fail on key collision
  • "overwrite" - overwrite the user's "road_classes" with those coming from the inject plugin
  • "merge" - attempt to merge the values if possible (may still fail)
  • "ignore" - pass the user's "road_classes" (a noop for the plugin)
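a sketch of how these behaviors could be applied on key collision (hypothetical helper, assuming serde_json values):

use serde_json::{Map, Value};

pub enum OverwriteBehavior {
    Error,
    Overwrite,
    Merge,
    Ignore,
}

pub fn inject(
    query: &mut Map<String, Value>,
    key: &str,
    value: Value,
    behavior: &OverwriteBehavior,
) -> Result<(), String> {
    match (query.contains_key(key), behavior) {
        // no collision, or we are told to overwrite: just set the value
        (false, _) | (true, OverwriteBehavior::Overwrite) => {
            query.insert(key.to_string(), value);
            Ok(())
        }
        (true, OverwriteBehavior::Error) => Err(format!("query already has key '{}'", key)),
        (true, OverwriteBehavior::Ignore) => Ok(()), // keep the user's value (a no-op)
        (true, OverwriteBehavior::Merge) => match (query.get_mut(key), value) {
            (Some(Value::Array(existing)), Value::Array(mut incoming)) => {
                existing.append(&mut incoming); // e.g. concatenate road_classes lists
                Ok(())
            }
            _ => Err(format!("cannot merge values for key '{}'", key)),
        },
    }
}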

query-time road class filter selection

the road class FrontierModel currently is parameterized at configuration time with the set of valid road classes. in order to explore different sets of valid road classes, we can optionally apply road class filtering based on a query road_classes argument.

CostFunction: abstraction or convention

aggregating arbitrary raw state transition data into a cost function is a naive way to estimate the cost of graph traversal and doesn't correctly capture the tradeoffs. we see this in our results with a simple cost function such as

$$\alpha \cdot t + (1-\alpha) \cdot e$$

where $t$ is the time to traverse a link, $e$ is the energy to traverse a link, and $\alpha$ is an energy/time cost coefficient in the range $[0,1]$.

since the magnitudes of $t$ and $e$ are different, and energy may not increase monotonically, the composition of these values lacks the invariants of a valid cost function for a graph search.

what we want

we want to combine state transitions in a way that is properly normalized. in this task, we are talking about the software engineering task of designing how we model the function from state transition to Cost in code. this could be an abstraction or maybe just a convention, because

  1. while an abstraction feels right, it may not be helpful since any CostFunction is tightly coupled with its TraversalModel, which defines what the state transition variables are
  2. but the right abstraction would let us define in one place some aggregation logic that could be applied to arbitrary TraversalModels

some ideas

here's an idea that doesn't try to solve 100% of the problem but comes up with something reasonable to configure. we could provide cost mappings by unit type in the configuration of the traversal model. for each unit type in compass, we could have an optional cost mapping:

pub enum CostMapping {
  Raw,
  Scalar { value: f64 },
  // leaving room for extension if we need to do any fancier maths, maybe not needed
  Poly2 { x0: f64, x1: f64 }, 
  Exp { base: f64, exp_coefficient: f64 },
  Combined(Vec<CostMapping>),
  // ... other variants as needed
}

pub struct CostMappingConfiguration {
  distance: Option<CostMapping>,
  time: Option<CostMapping>,
  energy: Option<CostMapping>,
  // ... future units here
}

impl CostMappingConfiguration {
  pub fn distance_cost(&self) -> Result<Option<Cost>, Error> { todo!() }
  // ... etc
}

and users could specify that in configuration, assumed to match units in the traversal model:

[traversal.cost]
distance_cost = { type = "scalar", value = 0.0123 }  # some mileage-based cost factor
time_cost = { type = "scalar", value = 0.0456 }  # some value-of-time cost factor
energy_cost = { type = "scalar", value = 0.00789 } # some price per kWh, possibly based on charging
aggregation = "sum"

the user could provide a cost for any unit type, and then the traversal model scoops up from this whatever cost mapping it needs.

the only potential problem here is that the mapping is 1:1 with the unit type. that may be too simple for users who may want to come up with a mix of costs. or it may be fine, they may be able to pre-compute how those factors reduce to a single factor, or they may be able to make use of the Combined approach for that too.

Input Plugin: load balance weighting heuristic

issue imported from internal repo

a recent test run took 12 seconds to run 10 queries using the distance cost function. other runs take milliseconds. here is the log for the 12 second run:

[2023-08-11T19:15:52Z INFO  compass_app] running search with 2 threads
[2023-08-11T19:16:05Z INFO  compass_app] finished search with duration 0:00:12.922
[2023-08-11T19:16:05Z INFO  compass_app] (36275214) -> (13999272) had route with 17470 links, tree with 95161 links, travel time of +1.16:15:31.209
[2023-08-11T19:16:05Z INFO  compass_app] (73939598) -> (54330726) had route with 5856 links, tree with 13756 links, travel time of 15:14:13.967
[2023-08-11T19:16:05Z INFO  compass_app] (33383999) -> (80341190) had route with 5377 links, tree with 12656 links, travel time of 10:20:42.005
[2023-08-11T19:16:05Z INFO  compass_app] (3517662) -> (434217) had route with 16189 links, tree with 61483 links, travel time of +1.8:36:54.149
[2023-08-11T19:16:05Z INFO  compass_app] (113774513) -> (4199505) had route with 8318 links, tree with 58637 links, travel time of 14:01:31.077
[2023-08-11T19:16:05Z INFO  compass_app] (85443905) -> (46951417) had route with 15112 links, tree with 269583 links, travel time of +1.9:53:07.769
[2023-08-11T19:16:05Z INFO  compass_app] (96979824) -> (77114920) had route with 8627 links, tree with 37215 links, travel time of 19:48:32.075
[2023-08-11T19:16:05Z INFO  compass_app] (105942995) -> (93507484) had route with 7025 links, tree with 22425 links, travel time of 11:13:09.247
[2023-08-11T19:16:05Z INFO  compass_app] (27824952) -> (111605148) had route with 2759 links, tree with 6635 links, travel time of 5:12:45.379
[2023-08-11T19:16:05Z INFO  compass_app] (25330684) -> (82962092) had route with 9814 links, tree with 24493 links, travel time of 16:21:21.207

the guilty party here is likely the query that generated a search tree with 269583 links, which is an order of magnitude greater than any of its peers. what's probably happening is there are a few intra-regional trips, a few inter-regional trips, and one monster trip. because all trip queries are partitioned randomly, there is no strategic load balancing of trips to threads, and so we get this blocking behavior.

a smarter solution would assign queries by some weight heuristic. below is a table showing the above queries, the number of links per route, the number of links per tree, the ratio of tree to route, and a guess at how to assign them to threads. to do this, i manually attempt a solution of the 0-1 multi-choice knapsack problem targeting two sets based on tree size; one such solution is shown below in the thread a and thread b columns:

| query | route | tree | tree/route | thread a | thread b | wc a | wc b |
|---|---|---|---|---|---|---|---|
| (36275214) -> (13999272) | 17470 | 95161 | 5.45 | | 95161 | 95161 | |
| (73939598) -> (54330726) | 5856 | 13756 | 2.35 | | 13756 | | 13756 |
| (33383999) -> (80341190) | 5377 | 12656 | 2.35 | 12656 | | | 12656 |
| (3517662) -> (434217) | 16189 | 61483 | 3.80 | | 61483 | 61483 | |
| (113774513) -> (4199505) | 8318 | 58637 | 7.05 | | 58637 | 58637 | |
| (85443905) -> (46951417) | 15112 | 269583 | 17.84 | 269583 | | 269583 | |
| (96979824) -> (77114920) | 8627 | 37215 | 4.31 | | 37215 | 37215 | |
| (105942995) -> (93507484) | 7025 | 22425 | 3.19 | 22425 | | | 22425 |
| (27824952) -> (111605148) | 2759 | 6635 | 2.40 | | 6635 | | 6635 |
| (25330684) -> (82962092) | 9814 | 24493 | 2.50 | | 24493 | | 24493 |
| TOTAL | | | | 304664 | 297380 | 522079 | 79965 |

this assignment ends up distributing tree search load fairly evenly across the two threads. compare this to the worst case (assigning the 5 largest trees to one thread, cols wc a and wc b), where the imbalance leaves one thread with 6.52x the work of the other. by balancing the two, we should expect the runtime to drop to roughly 58% of that worst case (304664/522079 = 0.58). that's just in this case; in other cases, the effect may be more or less pronounced.

a real heuristic

we cannot know the resulting tree size before running the search as implied above, so we must instead use a heuristic.

one obvious choice is the same distance heuristic used by a* search, which is the haversine formula. this estimates trip distance, which could be used directly as a weight value. but this may not accurately capture the runtime effect: while the haversine distance grows linearly, the associated search tree grows roughly exponentially with search depth. for this reason, to approximate the branching of the search, the weight should instead be approximated with a function on the order of $b^d$ for haversine distance $d$ and average branching factor $b$, where $b$ can be computed directly from the graph adjacency list.

Python CompassApp run_from_file method supports CSV, newline JSON with batching

in order to run very large batches of queries, we want a method on a CompassApp which

  • takes a filename and chunksize argument (with reasonable default)
  • by testing the filename suffix, chooses pandas from_csv or from_json with lines=True
  • chunks the input, passes it to CompassApp.run

question: is it possible to do this without 1) converting each dataframe to a python List[Dict], 2) stringifying the List?

More robust path handling

Moved from our internal repository:

from @nreinicke:

When building a compass application and pointing to a configuration file that is outside of the current working directory, the building step fails as we are using relative paths in the configuration file.

We could make this more robust by first checking if the path is absolute and if that fails, constructing an asset path relative to where the configuration file lives, similar to what we do in HIVE.

This will require that we have the configuration path in scope at all build steps. It might make sense to just inject it into the config here before we pass it down the chain to the rest of the builders.

from @robfitzgerald:

right. here's an unsolicited brain dump on this (was just thinking about this yesterday when talking with tim).
a lot of these loaders are not running with CompassApp in context, so it might be tempting to do this check everywhere in the program with some helper.
i'd suggest instead that we find the correct working directory in that TryFrom for CompassApp instead of pushing the problem down to the components and fs utils.
we could test the graph file paths, first via something akin to Path("") / filepath as the path, second with conf_file.parent / filepath. it's just a "peek" operation, something cheap, like a Path().exists.

question i have then is, do we: assign the working/input directory as a field on CompassApp and then pass it along into any file loading function, or, update the paths on everything in the config file to use whichever directory path we found

2 seems easier, but its implicit semantics could confuse the user if they don't understand why the path got prepended onto their config; 1 is messier but maybe more explicit? 🤔

from @nreinicke:

Yeah these are great points and I think I would lean towards number 2 as a preferred solution. Perhaps we could try to mitigate user confusion with more info in the error messages?

FileNotFound: File data/tomtom_metro_denver_network/edges-compass.csv.gz could not be located. You can provide file path locations as an absolute path or a relative path from where your config file is located: /Users/nreinick/repos/routee-compass-tomtom.
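A sketch of the "peek" fallback described above (hypothetical helper): accept the path as given if it exists, otherwise try it relative to the config file's directory.

use std::path::{Path, PathBuf};

fn resolve_input_file(path: &Path, config_file: &Path) -> Option<PathBuf> {
    // first try the path as given (absolute, or relative to the working directory)
    if path.exists() {
        return Some(path.to_path_buf());
    }
    // then try it relative to where the config file lives
    let relative = config_file.parent()?.join(path);
    if relative.exists() {
        Some(relative)
    } else {
        None
    }
}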

use github release to trigger pypi, crates releases

taking a look at polars, they seem to have figured out how to automate releases using a tag scheme where

  • "rs-*": a rust release
  • "py-*": a python release

they pattern-match off of the release name and activate a release action based on the language type. this allows the rust and python "version" of polars to move at different speeds.

our current python build action doesn't have this sophistication, but we depend on it for pypi releases, so we cannot use the naming scheme until we sort out how to get the string pattern match working.

chunk_size must not be zero

problem

i ran a grid search test which used different destination coordinates. the configuration i used didn't include the grid search plugin. the rtree was unable to parse a destination vertex. the only error i received was:

[2023-11-06T15:36:58Z INFO  routee_compass::app::compass::compass_app] creating 2 parallel batches across 16 threads to run queries with chunk size 0
thread 'main' panicked at 'chunk_size must not be zero', routee-compass/src/app/compass/compass_app.rs:241:14

the error should say that the "destination vertex is missing", and it should not blow up the application (it should return an error JSON response instead).

normalize_file_paths will treat array of strings as array of file paths

i have a custom frontier model and it includes an array of strings:

[frontier]
type = "mep"
time_limit = 40.0
time_index = 2
road_class_input_file = "/Users/rfitzger/dev/nrel/routee/routee-compass-tomtom/data/tomtom_metro_denver_network/edges-road-class-enumerated.txt.gz"
valid_road_classes = ["1", "2", "3", "4", "5", "6"]

(this is a wrapper around two other models, one being our built-in road class model)

when i attempt to load this TOML file, i get the following error:

Could not find incoming configuration file, tried 1 and configuration/1. Make sure the file exists and that the config key ends with '_input_file'

this is because normalize_file_paths assumes all strings are paths and recurses if it sees an array.

Add turn restriction frontier model

To capture turn restrictions, we need a new frontier model that can restrict links where a turn restriction is present (like no left turn or no u-turn). The restriction can be represented as a tuple of edge ids: if an edge id tuple exists in the set of all turn restrictions, the model should return an invalid frontier.

Our current frontier model trait will need to be expanded to include the previous edge in scope:

pub trait FrontierModel: Send + Sync {
    fn valid_frontier(
        &self,
        _edge: &Edge,
        _state: &TraversalState,
        _previous_edge: Option<&Edge>,
    ) -> Result<bool, FrontierModelError> {
        Ok(true)
    }
}

In addition, it might make sense to allow chaining of frontier models so a user can easily define what restrictions are desired at configuration time. This should be as simple as making the frontier model type go from Arc<dyn FrontierModel> to &[Arc<dyn FrontierModel>] and omitting a link from the search if any of the models return an invalid frontier.
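A sketch of that chaining idea, building on the expanded trait above (Edge, TraversalState, and FrontierModelError are the existing types referenced there):

use std::sync::Arc;

pub struct CombinedFrontierModel {
    models: Vec<Arc<dyn FrontierModel>>,
}

impl FrontierModel for CombinedFrontierModel {
    fn valid_frontier(
        &self,
        edge: &Edge,
        state: &TraversalState,
        previous_edge: Option<&Edge>,
    ) -> Result<bool, FrontierModelError> {
        // the combined frontier is valid only if every inner model accepts it
        for model in self.models.iter() {
            if !model.valid_frontier(edge, state, previous_edge)? {
                return Ok(false);
            }
        }
        Ok(true)
    }
}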
