autonomousvision / tuplan_garage
[CoRL'23] Parting with Misconceptions about Learning-based Vehicle Motion Planning
License: Other
Hello,
I was inspecting the trajectory generated by the pdm_closed_planner and realized that it contains only static information and no dynamic states (I suspect the same holds for the other planners as well):
InterpolatedTrajectory with 81 states
(wrapped_fn pid=1814060) EgoState(time=1633419573.0001209), Position=(365882.03552731016, 143116.11716887038, -1.80036579096818), Velocity=(0.0, 0.0), Acceleration=(0.0, 0.0), Steering_Angle=0.0)
This is the output (showing just one state) that I get when I print the states of the trajectory with this code (in ego_state.py):
def __str__(self):
    return (
        f"EgoState(time={self.time_point.time_s}), "
        f"Position=({self.rear_axle.x}, {self.rear_axle.y}, {self.rear_axle.heading}), "
        f"Velocity=({self.dynamic_car_state.rear_axle_velocity_2d.x}, {self.dynamic_car_state.rear_axle_velocity_2d.y}), "
        f"Acceleration=({self.dynamic_car_state.rear_axle_acceleration_2d.x}, {self.dynamic_car_state.rear_axle_acceleration_2d.y}), "
        f"Steering_Angle={self.tire_steering_angle})"
    )
and this (in interpolated_trajectory.py):
def __str__(self):
    return f"InterpolatedTrajectory with {len(self._trajectory)} states"

def print_states(self):
    for state in self._trajectory:
        print(state)
and finally this (in pdm_closed_planner.py):
def compute_planner_trajectory(self, current_input):
    ...
    print(trajectory)
    trajectory.print_states()
    return trajectory
So, my question is: do you also generate a velocity profile for every state of the generated trajectory? If so, how do you compute it, and how can I access it?
Also, do you take the left and right road bounds into account when computing the trajectory? If so, how do you extract/generate them, and how can I access them?
Thank you very much in advance, much appreciated!
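As a side note, even when the stored dynamic states are all zero, an approximate velocity profile can be recovered from the pose sequence alone via finite differences. This is a minimal, hypothetical sketch (plain Python, evenly spaced timestamps assumed), not the planner's actual mechanism:

```python
import math

def velocity_profile(xs, ys, ts):
    """Approximate speed at each waypoint via central/forward differences.

    xs, ys: waypoint coordinates in meters; ts: timestamps in seconds.
    Returns one speed value (m/s) per waypoint.
    """
    n = len(xs)
    speeds = []
    for i in range(n):
        j, k = max(i - 1, 0), min(i + 1, n - 1)  # clamp indices at both ends
        dt = ts[k] - ts[j]
        speeds.append(math.hypot(xs[k] - xs[j], ys[k] - ys[j]) / dt)
    return speeds

# Straight-line motion at 2 m/s sampled every 0.5 s:
ts = [0.0, 0.5, 1.0, 1.5]
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.0, 0.0, 0.0]
print(velocity_profile(xs, ys, ts))  # → [2.0, 2.0, 2.0, 2.0]
```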
I encountered the following error when running train_gc_pgp.sh. I did not modify the repo except for essential path setup. Here is the error information:
Starting Pre-Training with gt traversals as input for decoder
Global seed set to 0
2023-12-19 12:59:32,159 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-12-19 12:59:32,159 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /home/mh/code/nuplan/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.19.12.59.31
2023-12-19 12:59:32,159 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-12-19 12:59:32,160 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-12-19 12:59:33,686 INFO worker.py:1664 -- Started a local Ray instance. View the dashboard at 35.3.215.205:8265
2023-12-19 12:59:34,153 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 32
Number of GPUs per node: 1
Number of threads across all nodes: 32
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-12-19 12:59:34,299 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-12-19 12:59:34,299 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-12-19 12:59:34,675 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-12-19 12:59:34,675 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:19} Building augmentors...
2023-12-19 12:59:34,685 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:28} Building augmentors...DONE!
2023-12-19 12:59:34,686 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-12-19 12:59:34,737 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-12-19 12:59:34,737 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-12-19 12:59:34,738 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [02:11<00:00, 4.12s/it]
2023-12-19 13:01:49,405 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 1
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.9, 0.999]
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.0001
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-12-19 13:01:49,494 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
Error executing job with overrides: ['seed=0', 'py_func=train', '+training=training_gc_pgp_model', 'job_name=training_gc_pgp_model', 'scenario_builder=nuplan', 'scenario_filter.num_scenarios_per_type=4000', 'cache.cache_path=/home/mh/code/nuplan/exp/mh/cache', 'cache.use_cache_without_dataset=False', 'callbacks.visualization_callback.pixel_size=0.25', '+callbacks.multimodal_visualization_callback.pixel_size=0.25', 'lightning.trainer.params.max_epochs=20', 'lightning.trainer.params.max_time=null', 'data_loader.params.batch_size=32', 'optimizer.lr=1e-4', 'lr_scheduler=multistep_lr', 'lr_scheduler.milestones=[40,50,55]', 'lr_scheduler.gamma=0.5', 'model.encoder.use_red_light_feature=TRUE', 'model.aggregator.use_route_mask=FALSE', 'model.aggregator.hard_masking=FALSE', 'model.aggregator.pre_train=true']
Traceback (most recent call last):
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 89, in
main()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/main.py", line 49, in decorated_main
_run_hydra(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
run_and_report(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 368, in
lambda: hydra.run(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 110, in run
_ = ret.return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 59, in main
engine = build_training_engine(cfg, worker)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py", line 60, in build_training_engine
trainer = build_trainer(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_builder.py", line 109, in build_trainer
callbacks = build_callbacks(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py", line 25, in build_callbacks
validate_type(callback, pl.Callback)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_type.py", line 32, in validate_type
assert isinstance(
AssertionError: Class to be of type <class 'pytorch_lightning.callbacks.base.Callback'>, but is <class 'omegaconf.dictconfig.DictConfig'>!
Starting Training with aggregator traversals as input for decoder
Global seed set to 0
2023-12-19 13:01:56,346 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-12-19 13:01:56,346 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /home/mh/code/nuplan/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.19.13.01.55
2023-12-19 13:01:56,347 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-12-19 13:01:56,347 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-12-19 13:01:57,778 INFO worker.py:1664 -- Started a local Ray instance. View the dashboard at 35.3.215.205:8265
2023-12-19 13:01:58,236 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 32
Number of GPUs per node: 1
Number of threads across all nodes: 32
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-12-19 13:01:58,376 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-12-19 13:01:58,376 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-12-19 13:01:58,760 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-12-19 13:01:58,760 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:19} Building augmentors...
2023-12-19 13:01:58,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:28} Building augmentors...DONE!
2023-12-19 13:01:58,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-12-19 13:01:58,821 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-12-19 13:01:58,821 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-12-19 13:01:58,821 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [02:10<00:00, 4.09s/it]
2023-12-19 13:04:12,689 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 1
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-12-19 13:04:12,693 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.9, 0.999]
2023-12-19 13:04:12,693 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.0001
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-12-19 13:04:12,771 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-12-19 13:04:12,777 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
Error executing job with overrides: ['seed=0', 'py_func=train', '+training=training_gc_pgp_model', 'job_name=training_gc_pgp_model', 'scenario_builder=nuplan', 'scenario_filter.num_scenarios_per_type=4000', 'cache.cache_path=/home/mh/code/nuplan/exp/mh/cache', 'cache.use_cache_without_dataset=False', 'callbacks.visualization_callback.pixel_size=0.25', '+callbacks.multimodal_visualization_callback.pixel_size=0.25', 'lightning.trainer.params.max_epochs=90', 'lightning.trainer.params.max_time=null', 'lightning.trainer.checkpoint.resume_training=true', 'data_loader.params.batch_size=32', 'optimizer.lr=1e-4', 'lr_scheduler=multistep_lr', 'model.encoder.use_red_light_feature=TRUE', 'model.aggregator.use_route_mask=FALSE', 'model.aggregator.hard_masking=FALSE', 'model.aggregator.pre_train=false']
Traceback (most recent call last):
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 89, in
main()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/main.py", line 49, in decorated_main
_run_hydra(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
run_and_report(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 368, in
lambda: hydra.run(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 110, in run
_ = ret.return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 59, in main
engine = build_training_engine(cfg, worker)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py", line 60, in build_training_engine
trainer = build_trainer(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_builder.py", line 109, in build_trainer
callbacks = build_callbacks(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py", line 25, in build_callbacks
validate_type(callback, pl.Callback)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_type.py", line 32, in validate_type
assert isinstance(
AssertionError: Class to be of type <class 'pytorch_lightning.callbacks.base.Callback'>, but is <class 'omegaconf.dictconfig.DictConfig'>!
I appreciate any suggestions!
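For what it's worth, the assertion fires because the callback builder expects every entry under `callbacks` to already be instantiated into a `pytorch_lightning.Callback`, but a config group without a `_target_` (e.g. one created by a `+callbacks....pixel_size=...` override) stays a plain `DictConfig`. A simplified, hypothetical reconstruction of the check with stand-in classes (not the actual nuplan code):

```python
class Callback:  # stand-in for pytorch_lightning.Callback
    pass

class DictConfig(dict):  # stand-in for omegaconf.DictConfig
    pass

def validate_type(instance, expected_type):
    # Mirrors the nuplan-style check: a config node that was never
    # instantiated into a real Callback object fails here.
    assert isinstance(instance, expected_type), (
        f"Class to be of type {expected_type}, but is {type(instance)}!"
    )

validate_type(Callback(), Callback)  # passes: real callback object
try:
    validate_type(DictConfig(pixel_size=0.25), Callback)
except AssertionError as e:
    print("failed as expected:", e)
```

If this diagnosis applies here, the fix would be to make sure the extra callback entry resolves to an instantiable class rather than a bare parameter group.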
Hello,
I run the simulation on the CPU, but it is slow.
I saw this code in pdm_open_planner.py:
self._device = "cpu"
self._model = LightningModuleWrapper.load_from_checkpoint(
    checkpoint_path,
    model=model,
    map_location=self._device,
).model
I changed the device from "cpu" to "cuda:0", but then hit a bug about tensors being on two different devices (cpu and cuda:0). So I would like to ask: do you use CUDA to accelerate the simulation process, or do you run it on the CPU? And how should I change the config to use cuda:0?
Thanks for your help!
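A common cause of such a mixed-device error is that the model is moved to the GPU while the input features stay on the CPU; both sides need an explicit move. This is a generic, hypothetical sketch of the pattern using stub objects with a torch-style `.to(device)` method (not the planner's real feature classes):

```python
class FakeTensor:
    """Minimal stand-in for an object with a torch-style .to(device)."""
    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return FakeTensor(device)

def move_to_device(obj, device):
    """Recursively move tensors nested in dicts/lists/tuples to one device."""
    if hasattr(obj, "to"):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(v, device) for v in obj)
    return obj  # non-tensor leaves are left untouched

features = {"ego": FakeTensor(), "map": [FakeTensor(), FakeTensor()]}
features = move_to_device(features, "cuda:0")
print(features["ego"].device)  # → cuda:0
```

With real torch tensors the same helper works unchanged, since it only relies on `.to(device)`; the model itself would additionally need `model.to(device)` before inference.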
Hi, while running the simulation I noticed that the centerline and the trajectory generated by the PDM-Closed planner are inconsistent in more than one scenario, as you can see in the pictures below (the red line is the centerline, the blue line is the PDM-Closed trajectory):
My question is the following:
for the PDM-Open model you take as input just the centerline (together with ego history). Instead, for the hybrid model, you fuse together the PDM-Closed trajectory with the PDM-Open one (trained on the centerline).
Since, as shown in the plots above, PDM-Closed trajectory and centerline are not consistent, is it possible that this leads to a distorted hybrid trajectory?
Indeed, as shown above, the hybrid trajectory is distorted at the point where PDM-Closed and PDM-Open are fused together.
Do you think this could be the cause of the distorted trajectory?
If that is the case, is there a reason why you trained PDM-Open taking only the centerline as input?
Thanks a lot in advance!
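For context on the fusion question: conceptually, a hybrid trajectory keeps the closed-loop waypoints for the short horizon and switches to the learned waypoints beyond it, so any disagreement between the two shows up exactly at the seam. A hypothetical sketch of such a fusion with a linear cross-fade over a few steps (illustrative only, not the repository's actual weighting):

```python
def fuse_trajectories(closed, learned, cut, blend=3):
    """Fuse two waypoint lists: `closed` before index `cut`,
    `learned` after, cross-faded linearly over `blend` steps."""
    assert len(closed) == len(learned)
    fused = []
    for i, (c, l) in enumerate(zip(closed, learned)):
        if i < cut:
            w = 0.0  # pure closed-loop waypoint
        elif i >= cut + blend:
            w = 1.0  # pure learned waypoint
        else:
            w = (i - cut + 1) / (blend + 1)  # linear cross-fade
        fused.append(tuple((1 - w) * a + w * b for a, b in zip(c, l)))
    return fused

# Two trajectories that disagree in y by 1 m everywhere:
closed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
learned = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
print(fuse_trajectories(closed, learned, cut=1, blend=2))
```

The larger the disagreement at the cut index, the more visible the kink, which is consistent with the distortion described above.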
Hello. I have set up the nuplan environment, installed tuplan_garage as a package, and followed every preparation step in the README. However, when I tried to train the model, I encountered a fatal Ray error. Every time after 'Ray objects' finishes, Ray soon fails to start the dashboard, causing the program to run 'Ray objects' again. Because the Ray instance fails to initialize, there is no log recording the error. I searched for a similar issue here, but it was of little help. Thank you for the assistance.
The bash script I run is below:
TRAIN_EPOCHS=100
TRAIN_LR=1e-4
TRAIN_LR_MILESTONES=[50,75]
TRAIN_LR_DECAY=0.1
BATCH_SIZE=64
SEED=0
JOB_NAME=training_pdm_open_model
CACHE_PATH=/mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/cache
USE_CACHE_WITHOUT_DATASET=False
source ~/.bashrc
conda activate nuplan
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_pdm_open_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
lightning.trainer.params.max_epochs=$TRAIN_EPOCHS \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$TRAIN_LR \
lr_scheduler=multistep_lr \
lr_scheduler.milestones=$TRAIN_LR_MILESTONES \
lr_scheduler.gamma=$TRAIN_LR_DECAY \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
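In case it helps debugging: nuplan's training script lets you swap the worker, so you can check whether the failure is Ray-specific by bypassing Ray entirely. To the best of my knowledge these worker overrides exist in nuplan-devkit's common configs, but please verify the names against your devkit version:

```shell
# 1) Bypass Ray entirely (single process, no dashboard) -- slow but isolates the bug:
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
    py_func=train +training=training_pdm_open_model \
    worker=sequential

# 2) Keep parallelism but avoid Ray via a thread pool:
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
    py_func=train +training=training_pdm_open_model \
    worker=single_machine_thread_pool worker.max_workers=32
```

If training succeeds with either worker, the problem is in the local Ray setup (often the dashboard port or shared-memory limits) rather than in the repo itself.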
Global seed set to 0
2023-09-16 11:08:48,865 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.08.46
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:08:48,870 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:08:52,865 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:08:58,481 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:08:58,483 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-09-16 11:08:59,487 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:08:59,488 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-09-16 11:09:00,464 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:09:00,465 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:09:00,989 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|██████████████████████████████████████████████████████████████████████████████| 96/96 [13:16<00:00, 8.29s/it]
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.7422979694372631, 0.9971741579476155]
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.000282842712474619
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-09-16 11:22:25,503 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
2023-09-16 11:22:25,538 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:37} Building callbacks...DONE!
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Using native 16bit precision.
2023-09-16 11:22:25,539 INFO {/home/linqing.zhao/nuplan-devkit//nuplan/planning/script/run_training.py:62} Starting training...
Global seed set to 0
2023-09-16 11:22:39,118 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.38
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:39,123 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
Global seed set to 0
I encountered issues when running the tuplan_garage project on my computer. Training on the nuPlan dataset was slow, and GPU memory usage was only around 20%. I also tried an A100 cluster, and GPU utilization was just as low.
I have been using the default training settings. Were any important settings overlooked in the process?
Hi,
Thank you for the code! However, when I run the evaluation command from the README, I encounter this error:
INFO:nuplan.planning.script.builders.main_callback_builder:Building MultiMainCallback...
INFO:nuplan.planning.script.builders.main_callback_builder:Building MultiMainCallback: 4...DONE!
2023-07-19 21:44:16,393 INFO {/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-07-19 21:44:16,436 INFO {/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-07-19 21:44:20,657 ERROR services.py:1207 -- Failed to start the dashboard , return code -11
2023-07-19 21:44:20,657 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-07-19 21:44:20,658 ERROR services.py:1276 --
The last 20 lines of /tmp/ray/session_2023-07-19_21-44-16_486474_454342/logs/dashboard.log (it contains the error message from the dashboard):
2023-07-19 21:44:18,280 INFO head.py:242 -- Starting dashboard metrics server on port 44227
2023-07-19 21:44:20,971 INFO worker.py:1636 -- Started a local Ray instance.
[2023-07-19 21:44:22,949 E 454342 454342] core_worker.cc:193: Failed to register worker 01000000ffffffffffffffffffffffffffffffffffffffffffffffff to Raylet. IOError: [RayletClient] Unable to register worker with raylet. No such file or directory
Do you have any hints on what went wrong?
I noticed the visualization item in the 'To Do' list. When will the corresponding visualization scripts be made available? Thank you!
In 'default_simulation': Could not find 'planner/pdm_closed_planner'
Available options in 'planner':
idm_planner
log_future_planner
ml_planner
remote_planner
simple_planner
I have already added file:///home/***/tuplan_garage/tuplan_garage/planning/script/config/common to the Hydra search path.
Hello,
I was wondering when updated versions of the checkpoints at this link (https://drive.google.com/drive/folders/1LLdunqyvQQuBuknzmf7KMIJiA2grLYB2) will be uploaded to be compatible with the renamed version of tuplan_garage.
Hi,
I am a little confused about the val14 data split. Did you sample the 178k samples of the train150 set from both the training and validation sets, with the val-set samples used for validation, while the val14 set (1,118 scenarios from the val set) is used to test performance? If so, might the validation set and the val14 test set overlap?
Hello,
Great work!
How much time is needed for training on a single 3090 GPU?
Thanks a lot!
Hi,
For your released gc_pgp code, I noticed that you pretrained the aggregator. Did you load the pretrained aggregator when training gc_pgp? I didn't find any model-loading code. Also, could you share the training configuration for Urban Driver? I trained Urban Driver on the train150k set with 4000 scenarios per type, but the performance is much lower than the numbers in the table in your README, with only a 55 open-loop score.
Best,
I want to run simulations using the models you provide at https://drive.google.com/drive/folders/1LLdunqyvQQuBuknzmf7KMIJiA2grLYB2. However, I get the error "No module named 'nuplan_garage'" when loading "pdm_offset_checkpoint.ckpt" and "gc_pgp_checkpoint.ckpt". The error does not occur when I load "pdm_open_checkpoint.ckpt".
Maybe the reason is that these checkpoints were not regenerated after you changed the repository name due to the trademark conflicts.
Thanks in advance!
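In case it helps others hitting this: pickled checkpoints store class references under the package name they were trained with, so one common workaround is to alias the old name onto the renamed package in `sys.modules` before loading. Below is a self-contained sketch of the technique using a made-up stale module name (`colxections`); for the actual checkpoints, the analogous step would presumably be aliasing `nuplan_garage` to `tuplan_garage` before calling `torch.load`.

```python
import pickle
import sys
import collections

# Stand-in for a checkpoint pickled under a repo's old package name: rewrite
# the module string to a name that no longer exists (same byte length, so
# the pickle stream stays valid).
payload = pickle.dumps(collections.OrderedDict(a=1))
stale = payload.replace(b"collections", b"colxections")

try:
    pickle.loads(stale)  # fails: No module named 'colxections'
except ModuleNotFoundError:
    pass

# The fix: alias the old name to the real (renamed) module before loading.
# For these checkpoints the analogous (hypothetical) step would be:
#   import tuplan_garage; sys.modules["nuplan_garage"] = tuplan_garage
# before torch.load(...).
sys.modules["colxections"] = collections
restored = pickle.loads(stale)
```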
Hello, I have run into some issues that left me confused. I would appreciate it if you could help me.
Firstly, what is the difference between open-loop and closed-loop (NR)? I have read the nuPlan paper, and I know both are non-reactive modes, so they simulate agents by directly replaying logs. I think the difference is that in closed-loop (NR) the planner observes the new state at each step, so it can correct its future trajectory, whereas open-loop does not consider that. Do I understand correctly? Can you tell me more about the differences between these two modes?
Secondly, I am going to try using my own RL-based controller to control the background agents, but I don't know how. Have you tried this before? And could you tell me how the background IDM agents are implemented in the nuplan-devkit code?
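To make my question concrete, this is how I currently picture the two modes, as a toy sketch with hypothetical planner/controller callables (not the nuPlan API):

```python
def open_loop(planner, logged_states):
    # The planner is conditioned on the *logged* history at every step; its
    # output is scored against the logged future but never executed.
    return [planner(logged_states[: t + 1]) for t in range(len(logged_states))]

def closed_loop_nonreactive(planner, controller, initial_state, n_steps):
    # The planner's output is executed: the resulting ego state becomes the
    # planner input at the next step, so errors can compound. Background
    # agents still replay the log, hence "non-reactive".
    state, rollout = initial_state, []
    for _ in range(n_steps):
        plan = planner([state])
        state = controller(state, plan)  # ego follows its own plan
        rollout.append(state)
    return rollout
```

Is that the right mental model?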
Hi, thanks for your great work.
I'm trying to compute the L2 error in the open-loop setting (as on nuScenes, where L2 is computed at the 0.5 s, 1 s, 1.5 s, 2 s, 2.5 s, and 3 s timestamps).
So I would like to know the following:
For example, when I try to analyze this simulation log file:
"exp/simulation/open_loop_boxes/2024.05.09.13.50.26/simulation_log/MLPlanner/following_lane_with_lead/2021.06.07.11.59.52_veh-35_00765_01072/5d91ff45ef9d568f/5d91ff45ef9d568f.msgpack.xz"
how can I use these two attributes (or others) to compute the L2 error?
data.simulation_history.data[i].ego_state.waypoint.center
data.simulation_history.data[i].trajectory._trajectory[j].center
(where i = 1,2,..., frame_num-1; j = 0,1,2,...,16, and data = SimulationLog.load_data(file_path=log_path))
Can the series [data.simulation_history.data[i].ego_state.waypoint.center, data.simulation_history.data[i+1].ego_state.waypoint.center, ..., data.simulation_history.data[i+N].ego_state.waypoint.center] be treated as the ground truth, and [data.simulation_history.data[i].trajectory._trajectory[j].center, data.simulation_history.data[i].trajectory._trajectory[j+1].center, ..., data.simulation_history.data[i].trajectory._trajectory[j+N].center] as the prediction?
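To make the question concrete, this is the computation I have in mind, with plain arrays standing in for the msgpack log (assuming, purely for illustration, that both the simulation history and the planned trajectory are sampled every 0.5 s; the real sampling rates may differ):

```python
import numpy as np

def l2_at_horizons(ego_xy, plan_xy, sim_dt=0.5, plan_dt=0.5,
                   horizons=(0.5, 1.0, 1.5, 2.0, 2.5, 3.0)):
    """Mean L2 error at fixed horizons.

    ego_xy:  (T, 2) logged ego centers, one per simulation step (the "GT")
    plan_xy: (T, H, 2) planned trajectory centers emitted at each step
    """
    errors = {}
    for h in horizons:
        k_sim = int(round(h / sim_dt))    # offset into the logged states
        k_plan = int(round(h / plan_dt))  # offset into each emitted plan
        n = len(ego_xy) - k_sim           # frames that still have a GT future
        diff = plan_xy[:n, k_plan] - ego_xy[k_sim:k_sim + n]
        errors[h] = float(np.linalg.norm(diff, axis=-1).mean())
    return errors
```

Here `ego_xy[i]` would come from `data.simulation_history.data[i].ego_state.waypoint.center` and `plan_xy[i, j]` from `data.simulation_history.data[i].trajectory._trajectory[j].center`. Is this the intended way to compute it?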
Many thanks!
Greetings,
Ryhn
Thank you for open-sourcing your work. How can I export high-resolution simulation images from nuBoard in SVG or another format? The images saved with nuBoard's save button are of very poor quality.
Hello. I have set up the nuPlan environment, installed tuplan_garage as a package, and followed every preparation step in the README. However, when I try to train the model, I encounter a fatal Ray error. Every time after 'Ray objects' finishes, Ray soon fails to start the dashboard, causing the program to run 'Ray objects' again. Because the Ray instance fails to initialize, there is no log recording the error. I searched for a similar issue here, but it was of little help. Thank you for your assistance.
The bash script I run:
TRAIN_EPOCHS=100
TRAIN_LR=1e-4
TRAIN_LR_MILESTONES=[50,75]
TRAIN_LR_DECAY=0.1
BATCH_SIZE=64
SEED=0
JOB_NAME=training_pdm_open_model
CACHE_PATH=/mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/cache
USE_CACHE_WITHOUT_DATASET=False
source ~/.bashrc
conda activate nuplan
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_pdm_open_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
lightning.trainer.params.max_epochs=$TRAIN_EPOCHS \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$TRAIN_LR \
lr_scheduler=multistep_lr \
lr_scheduler.milestones=$TRAIN_LR_MILESTONES \
lr_scheduler.gamma=$TRAIN_LR_DECAY \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
Global seed set to 0
2023-09-16 11:08:48,865 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.08.46
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:08:48,870 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:08:52,865 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:08:58,481 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:08:58,483 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-09-16 11:08:59,487 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:08:59,488 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-09-16 11:09:00,464 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:09:00,465 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:09:00,989 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|██████████████████████████████████████████████████████████████████████████████| 96/96 [13:16<00:00, 8.29s/it]
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.7422979694372631, 0.9971741579476155]
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.000282842712474619
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-09-16 11:22:25,503 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
2023-09-16 11:22:25,538 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:37} Building callbacks...DONE!
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Using native 16bit precision.
2023-09-16 11:22:25,539 INFO {/home/linqing.zhao/nuplan-devkit//nuplan/planning/script/run_training.py:62} Starting training...
Global seed set to 0
2023-09-16 11:22:39,118 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.38
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:39,123 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
Global seed set to 0
2023-09-16 11:22:41,279 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:41,281 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.40
2023-09-16 11:22:41,281 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:41,283 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:42,819 INFO worker.py:1621 -- Started a local Ray instance.
Global seed set to 0
2023-09-16 11:22:45,132 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:45,138 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.44
2023-09-16 11:22:45,138 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:45,140 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:45,659 INFO worker.py:1621 -- Started a local Ray instance.
Global seed set to 0
2023-09-16 11:22:49,560 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:22:49,560 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:22:49,561 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:22:49,561 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:22:49,561 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-09-16 11:22:49,782 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:49,784 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.49
2023-09-16 11:22:49,785 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:49,787 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:50,106 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:22:50,106 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
Global seed set to 0
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/8
2023-09-16 11:22:51,378 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:22:51,379 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:22:51,571 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:22:51,571 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:22:51,573 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 0%| | 0/96 [00:00<?, ?it/s]2023-09-16 11:22:54,601 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:22:54,601 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:22:54,602 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:22:54,602 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:22:54,602 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
Global seed set to 0
2023-09-16 11:22:55,184 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:22:55,184 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-09-16 11:22:55,567 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:22:55,599 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:55,607 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.55
2023-09-16 11:22:55,608 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:55,610 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:55,752 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:22:56,570 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:22:56,571 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:22:56,780 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:22:56,780 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:22:56,782 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 0%| | 0/96 [00:00<?, ?it/s]Global seed set to 0
2023-09-16 11:23:03,159 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:23:03,167 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.23.02
2023-09-16 11:23:03,168 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:23:03,170 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
Global seed set to 0
2023-09-16 11:23:12,023 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:23:12,030 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.23.11
2023-09-16 11:23:12,031 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:23:12,034 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:23:17,381 ERROR services.py:1207 -- Failed to start the dashboard
2023-09-16 11:23:17,382 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-09-16 11:23:17,382 ERROR services.py:1242 -- Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: '/tmp/ray/session_2023-09-16_11-22-55_720350_89715/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
2023-09-16 11:23:17,382 ERROR services.py:1276 -- Failed to read dashboard.err file: cannot mmap an empty file. It is unexpected. Please report an issue to Ray github. https://github.com/ray-project/ray/issues
2023-09-16 11:23:17,582 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:23:25,116 ERROR services.py:1207 -- Failed to start the dashboard
2023-09-16 11:23:25,116 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-09-16 11:23:25,116 ERROR services.py:1242 -- Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: '/tmp/ray/session_2023-09-16_11-23-03_304985_90490/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
2023-09-16 11:23:25,116 ERROR services.py:1276 -- Failed to read dashboard.err file: cannot mmap an empty file. It is unexpected. Please report an issue to Ray github. https://github.com/ray-project/ray/issues
2023-09-16 11:23:25,233 INFO worker.py:1621 -- Started a local Ray instance.
[2023-09-16 11:23:26,416 E 89301 89301] core_worker.cc:201: Failed to register worker 01000000ffffffffffffffffffffffffffffffffffffffffffffffff to Raylet. IOError: [RayletClient] Unable to register worker with raylet. No such file or directory
Thank you for open-sourcing your work. I am a little confused: I did not download any of the checkpoints mentioned in the README, yet the example pdm_closed_planner ran without any error. Is this because the method is entirely rule-based? It seems, however, that there are some networks in this algorithm.
Hi, I am trying to reproduce the result in CoRL'23 Table 1 with only the centerline input. In the paper, PDM-Open achieved a score of 85 on the OLS, while my replication only reached 21 on reduced_val14. I would like to know whether this is due to differences in my hyperparameter settings or whether there are other issues.
Changes to pdm_open_model.py:
self.planner_head = nn.Sequential(
    # nn.Linear(self.hidden_dim * 2, self.hidden_dim),
    nn.Linear(self.hidden_dim * 1, self.hidden_dim),
    nn.Dropout(0.1),
    nn.ReLU(),
    nn.Linear(self.hidden_dim, self.hidden_dim),
    nn.ReLU(),
    nn.Linear(self.hidden_dim, trajectory_sampling.num_poses * len(SE2Index)),
)
# planner_features = torch.cat([state_encodings, centerline_encodings], dim=-1)
planner_features = centerline_encodings
Hi,
Are there any plans to release the Val14 dataset to the public or the logic used to split the dataset? Thanks in advance.
Hi, great work!
If the train and val data are combined according to the official nuPlan data setup, is it possible that scenarios from Val14 are included in the training data, so that the model has already seen the data later used for testing?
Thanks for great coding and work!
I'm wondering if you still have plans to release a visualization script to generate videos like the teaser.mp4 file in the README. Was that file made from a (closed-loop) simulation?
Thanks beforehand for taking the time to consider my question!
Hi
Thank you for your work.
However, I have a question about the Urban Driver evaluation results on the val14_split.
When running the bash file at ../tuplan_garage/scripts/simulation/sim_urban_driver.sh, I hit an IOError from [RayletClient], so I changed the worker parameter to worker=single_machine_thread_pool in the bash file.
The full bash file is here:
SPLIT=val14_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT_PATH="/mnt/sdd/jyYun/planning/tuplan_garage/trained_weights/urban_driver.ckpt"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=ml_planner \
worker=single_machine_thread_pool \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
planner.ml_planner.model_config='\${model}' \
planner.ml_planner.checkpoint_path=$CHECKPOINT_PATH \
model=urban_driver_open_loop_model \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
It works, but I have noticed that some scenarios fail during evaluation: when the evaluation ended, it reported that 38 out of 1118 scenarios had failed. Despite the several failed scenarios, I have confirmed that the results are still successfully saved in the 'exp' folder, as configured in nuPlan. Under these circumstances, I have some questions:
Traceback (most recent call last):
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/runner/executor.py", line 28, in run_simulation
run_results = sim_runner.run()
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/runner/simulations_runner.py", line 128, in run
self.simulation.callback.on_simulation_end(self.simulation.setup, self.planner, self.simulation.history)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/callback/multi_callback.py", line 68, in on_simulation_end
callback.on_simulation_end(setup, planner, history)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 102, in on_simulation_end
run_metric_engine(
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 24, in run_metric_engine
metric_files = metric_engine.compute(history, scenario=scenario, planner_name=planner_name)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 133, in compute
all_metrics_results = self.compute_metric_results(history=history, scenario=scenario)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 119, in compute_metric_results
raise RuntimeError(f"Metric Engine failed with: {e}")
RuntimeError: Metric Engine failed with:
'
2023-12-09 09:45:38,768 WARNING {/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/runner/executor.py:125} Failed Simulation.
'Traceback (most recent call last):
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 112, in compute_metric_results
metric_results[metric.name] = metric.compute(history, scenario=scenario)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/evaluation_metrics/common/speed_limit_compliance.py", line 218, in compute
time_series = TimeSeries(
File "<string>", line 7, in __init__
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_result.py", line 127, in __post_init__
assert len(self.time_stamps) == len(self.values)
AssertionError
Hello, I'm trying to get your urban_driver and GC-PGP pretrained models working and am experiencing an error. There are two separate problems. First, it seems that the urban_driver model was not updated with the rest when you retrained GC-PGP, so it still produces the 'cannot find module nuplan_garage' error. Furthermore, when running the GC-PGP model with the type of config used in the nuPlan tutorials, I get a different error regarding multimodal_trajectories.
The config I am using and the resulting error are below. This was run in a Jupyter notebook sitting in the nuplan-devkit repo.
# Location of path with all simulation configs
CONFIG_PATH = '../nuplan/planning/script/config/simulation'
CONFIG_NAME = 'default_simulation'
CHECKPOINT_PATH='run_sim_closed_loop/pretrained_checkpoints/gc_pgp_checkpoint.ckpt'
# Select the planner and simulation challenge
PLANNER = 'ml_planner' # [simple_planner, ml_planner]
CHALLENGE = 'closed_loop_reactive_agents' # [open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents]
DATASET_PARAMS = [
'scenario_builder=nuplan_mini', # use nuplan mini database
'scenario_filter=all_scenarios', # initially select all scenarios in the database
'scenario_filter.scenario_types=[near_multiple_vehicles, on_pickup_dropoff, starting_unprotected_cross_turn, high_magnitude_jerk]', # select scenario types
'scenario_filter.num_scenarios_per_type=5', # use 5 scenarios per scenario type
]
# Name of the experiment
EXPERIMENT = 'simulation_simple_experiment'
# Initialize configuration management system
hydra.core.global_hydra.GlobalHydra.instance().clear() # reinitialize hydra if already initialized
hydra.initialize(config_path=CONFIG_PATH)
# Compose the configuration
cfg = hydra.compose(config_name=CONFIG_NAME, overrides=[
f'experiment_name={EXPERIMENT}',
f'planner={PLANNER}',
f'model=raster_model',
'planner.ml_planner.model_config=${model}', # hydra notation to select model config
f'planner.ml_planner.checkpoint_path={CHECKPOINT_PATH}', # this path can be replaced by the checkpoint of the model trained in the previous section
f'group={SAVE_DIR}',
f'+simulation={CHALLENGE}',
*DATASET_PARAMS,
'hydra.searchpath=[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]'
])
from nuplan.planning.script.run_simulation import main as main_simulation
# Run the simulation loop (real-time visualization not yet supported, see next section for visualization)
main_simulation(cfg)
# Simple simulation folder for visualization in nuBoard
simple_simulation_folder = cfg.output_dir
AssertionError Traceback (most recent call last)
Cell In[10], line 4
1 from nuplan.planning.script.run_simulation import main as main_simulation
3 # Run the simulation loop (real-time visualization not yet supported, see next section for visualization)
----> 4 main_simulation(cfg)
6 # Simple simulation folder for visualization in nuBoard
7 simple_simulation_folder = cfg.output_dir
File /opt/conda/lib/python3.9/site-packages/hydra/main.py:44, in main.<locals>.main_decorator.<locals>.decorated_main(cfg_passthrough)
41 @functools.wraps(task_function)
42 def decorated_main(cfg_passthrough: Optional[DictConfig] = None) -> Any:
43 if cfg_passthrough is not None:
---> 44 return task_function(cfg_passthrough)
45 else:
46 args = get_args_parser()
File ~/nuplan-devkit/nuplan/planning/script/run_simulation.py:110, in main(cfg)
107 assert cfg.simulation_log_main_path is None, 'Simulation_log_main_path must not be set when running simulation.'
109 # Execute simulation with preconfigured planner(s).
--> 110 run_simulation(cfg=cfg)
112 if is_s3_path(Path(cfg.output_dir)):
113 clean_up_s3_artifacts()
File ~/nuplan-devkit/nuplan/planning/script/run_simulation.py:66, in run_simulation(cfg, planners)
63 if isinstance(planners, AbstractPlanner):
64 planners = [planners]
---> 66 runners = build_simulations(
67 cfg=cfg,
68 callbacks=callbacks,
69 worker=common_builder.worker,
70 pre_built_planners=planners,
71 callbacks_worker=callbacks_worker_pool,
72 )
74 if common_builder.profiler:
75 # Stop simulation construction profiling
76 common_builder.profiler.save_profiler(profiler_name)
File ~/nuplan-devkit/nuplan/planning/script/builders/simulation_builder.py:90, in build_simulations(cfg, worker, callbacks, callbacks_worker, pre_built_planners)
87 if 'planner' not in cfg.keys():
88 raise KeyError('Planner not specified in config. Please specify a planner using "planner" field.')
---> 90 planners = build_planners(cfg.planner, scenario)
91 else:
92 planners = pre_built_planners
File ~/nuplan-devkit/nuplan/planning/script/builders/planner_builder.py:58, in build_planners(planner_cfg, scenario)
51 def build_planners(planner_cfg: DictConfig, scenario: Optional[AbstractScenario]) -> List[AbstractPlanner]:
52 """
53 Instantiate multiple planners by calling build_planner
54 :param planners_cfg: planners config
55 :param scenario: scenario
56 :return planners: List of AbstractPlanners
57 """
---> 58 return [_build_planner(planner, scenario) for planner in planner_cfg.values()]
File ~/nuplan-devkit/nuplan/planning/script/builders/planner_builder.py:58, in <listcomp>(.0)
51 def build_planners(planner_cfg: DictConfig, scenario: Optional[AbstractScenario]) -> List[AbstractPlanner]:
52 """
53 Instantiate multiple planners by calling build_planner
54 :param planners_cfg: planners config
55 :param scenario: scenario
56 :return planners: List of AbstractPlanners
57 """
---> 58 return [_build_planner(planner, scenario) for planner in planner_cfg.values()]
File ~/nuplan-devkit/nuplan/planning/script/builders/planner_builder.py:26, in _build_planner(planner_cfg, scenario)
23 if is_target_type(planner_cfg, MLPlanner):
24 # Build model and feature builders needed to run an ML model in simulation
25 torch_module_wrapper = build_torch_module_wrapper(planner_cfg.model_config)
---> 26 model = LightningModuleWrapper.load_from_checkpoint(
27 planner_cfg.checkpoint_path, model=torch_module_wrapper
28 ).model
30 # Remove config elements that are redundant to MLPlanner
31 OmegaConf.set_struct(config, False)
File /opt/conda/lib/python3.9/site-packages/pytorch_lightning/core/saving.py:157, in ModelIO.load_from_checkpoint(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)
154 # override the hparams with values that were passed in
155 checkpoint[cls.CHECKPOINT_HYPER_PARAMS_KEY].update(kwargs)
--> 157 model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
158 return model
File /opt/conda/lib/python3.9/site-packages/pytorch_lightning/core/saving.py:199, in ModelIO._load_model_state(cls, checkpoint, strict, **cls_kwargs_new)
195 if not cls_spec.varkw:
196 # filter kwargs according to class init unless it allows any argument via kwargs
197 _cls_kwargs = {k: v for k, v in _cls_kwargs.items() if k in cls_init_args_name}
--> 199 model = cls(**_cls_kwargs)
201 # give model a chance to load something
202 model.on_load_checkpoint(checkpoint)
File ~/nuplan-devkit/nuplan/planning/training/modeling/lightning_module_wrapper.py:67, in LightningModuleWrapper.__init__(self, model, objectives, metrics, batch_size, optimizer, lr_scheduler, warm_up_lr_scheduler, objective_aggregate_mode)
65 for metric in self.metrics:
66 for feature in metric.get_list_of_required_target_types():
---> 67 assert feature in model_targets, f"Metric target: \"{feature}\" is not in model computed targets!"
AssertionError: Metric target: "multimodal_trajectories" is not in model computed targets!
Thanks for open sourcing the great work!
I want to ask about the training details of PDM-open and PDM-Offset.
Thank you in advance.
Hi, I was wondering if you have tried to implement this learning-based planner on a real vehicle, or if you have tried to build a bridge to connect the simulator with a ROS environment.
If that's the case, could you please provide more information about it?
I think it would be very interesting to test this model on a real vehicle! Thanks
Hi,
When I evaluate pdm_closed_planner and pdm_open_planner in the closed-loop, non-reactive agents setting, both planners run fine without errors if I use a reduced scenario filter (3 or 5 scenarios randomly picked from the validation dataset). However, on val14_split, when running
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py +simulation=closed_loop_nonreactive_agents planner=pdm_closed_planner scenario_filter=val14_split scenario_builder=nuplan worker=single_machine_thread_pool scenario_builder.data_root=/fs/scratch/projects/proj-ai-planning/archive/nuScenes/nuplan/dataset/nuplan-v1.1/splits/val hydra.searchpath="[pkg://nuplan_garage.planning.script.config.common, pkg://nuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
I get the following error when the simulations are being executed:
Traceback (most recent call last):
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 112, in compute_metric_results
metric_results[metric.name] = metric.compute(history, scenario=scenario)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/evaluation_metrics/common/speed_limit_compliance.py", line 218, in compute
time_series = TimeSeries(
File "<string>", line 7, in __init__
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_result.py", line 127, in __post_init__
assert len(self.time_stamps) == len(self.values)
AssertionError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/runner/executor.py", line 27, in run_simulation
return sim_runner.run()
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/runner/simulations_runner.py", line 128, in run
self.simulation.callback.on_simulation_end(self.simulation.setup, self.planner, self.simulation.history)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/callback/multi_callback.py", line 68, in on_simulation_end
callback.on_simulation_end(setup, planner, history)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 102, in on_simulation_end
run_metric_engine(
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 24, in run_metric_engine
metric_files = metric_engine.compute(history, scenario=scenario, planner_name=planner_name)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 133, in compute
all_metrics_results = self.compute_metric_results(history=history, scenario=scenario)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 119, in compute_metric_results
raise RuntimeError(f"Metric Engine failed with: {e}")
RuntimeError: Metric Engine failed with:
Traceback (most recent call last):
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 112, in compute_metric_results
metric_results[metric.name] = metric.compute(history, scenario=scenario)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/evaluation_metrics/common/speed_limit_compliance.py", line 218, in compute
time_series = TimeSeries(
File "<string>", line 7, in __init__
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_result.py", line 127, in __post_init__
assert len(self.time_stamps) == len(self.values)
AssertionError
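For reference, the invariant that fails here can be reduced to a minimal sketch (the field names come from the traceback; the class body is illustrative, not nuPlan's actual metric_result.py):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TimeSeriesSketch:
    """Illustrative stand-in for nuPlan's TimeSeries (metric_result.py)."""
    unit: str
    time_stamps: List[int]
    values: List[float]

    def __post_init__(self) -> None:
        # The assertion from the traceback: every value needs a timestamp.
        assert len(self.time_stamps) == len(self.values)

# A metric that drops a value (e.g. filtering NaNs) without dropping the
# matching timestamp trips the assertion exactly as in the log above:
TimeSeriesSketch(unit="mps", time_stamps=[0, 1, 2], values=[9.8, 9.9, 10.1])
try:
    TimeSeriesSketch(unit="mps", time_stamps=[0, 1, 2], values=[9.8, 9.9])
except AssertionError:
    print("mismatched lengths rejected")
```

In other words, the speed_limit_compliance metric produced a different number of values than timestamps for the failing scenario; inspecting that metric's inputs for the failing scenario token is a reasonable first step.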
Dear Authors,
Thank you for your work.
I have built my own prediction model based on PGP and tried to combine it with PDM-Closed, but it didn't work very well. Since GC-PGP's features differ from PDM's, I would like to ask how to combine GC-PGP with PDM-Closed.
Uploading the code would be great!
Thank you for all the help you've given me before!
Best wishes,
Yifan
Hello,
I have a question regarding the encoding of the centerline and the ego states:
1) Centerline: from my understanding, you first extract the centerline using Dijkstra, from the starting position to the goal position.
Afterwards, you keep only a 120 m segment sampled at 1 m and pass it to the neural network.
My question is: how do you select this "local" 120 m centerline? Does its starting point coincide with the ego vehicle's position in each frame, or do you use some other logic? Moreover, each centerline state only includes x, y, and theta, but in which reference frame are they expressed?
2) Ego states: ego_position, ego_velocity, and ego_acceleration each have 11 elements (the last is the present state and the first 10 are past states), each with 3 components (x, y, and theta). Which reference frame is used here as well?
I guess it's the local one, but I'm asking just to make sure.
Thank you very much in advance!
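To make question 1) concrete, here is a minimal sketch of what I imagine the "local" centerline extraction could look like: window 120 m at 1 m spacing starting from the pose nearest the ego, then express each (x, y, heading) in the ego frame. All function and variable names, and the nearest-point logic, are my own assumptions, not the repo's code:

```python
import numpy as np

def local_centerline(centerline_xyh: np.ndarray,
                     ego_xyh: np.ndarray,
                     length_m: float = 120.0,
                     step_m: float = 1.0) -> np.ndarray:
    """Hypothetical sketch: centerline_xyh is (N, 3) global poses sampled at
    step_m; returns (length_m / step_m, 3) poses in the ego frame."""
    # 1) start the window at the centerline pose nearest the current ego
    dists = np.linalg.norm(centerline_xyh[:, :2] - ego_xyh[:2], axis=1)
    start = int(np.argmin(dists))
    n = int(length_m / step_m)
    window = centerline_xyh[start:start + n]

    # 2) rotate/translate into the ego frame (ego at origin, heading 0)
    c, s = np.cos(-ego_xyh[2]), np.sin(-ego_xyh[2])
    rot = np.array([[c, -s], [s, c]])
    xy_local = (window[:, :2] - ego_xyh[:2]) @ rot.T
    heading_local = window[:, 2] - ego_xyh[2]
    return np.concatenate([xy_local, heading_local[:, None]], axis=1)
```

If the repo instead anchors or frames the window differently (e.g. with an offset behind the ego), that is exactly what I would like to know.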
Dear Authors,
I would like to make a suggestion. When running the bash file at
/home/wjl/jyf/tuplan_garage/scripts/simulation/sim_gc_pgp.sh
I hit the error mismatched input '=' expecting <EOF>.
After many attempts I realized that the problem is the checkpoint path
"/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints/epoch=1.ckpt"
When I delete the '=' and rename the checkpoint file, the bash file runs correctly. The checkpoints are named epoch=<int> by default, so I suggest naming the checkpoint files epoch-<int> instead.
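As a stop-gap until the naming changes, the rename can be scripted (the directory is an example path; `${f//=/-}` is a bash-ism):

```shell
# Hydra parses '=' inside an override value as a new key=value pair, so a
# checkpoint file named epoch=1.ckpt breaks the command line. Copy each one
# to an '='-free name before passing it:
CKPT_DIR="/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints"
for f in "$CKPT_DIR"/epoch=*.ckpt; do
    [ -e "$f" ] || continue        # no matching files: do nothing
    cp "$f" "${f//=/-}"            # epoch=1.ckpt -> epoch-1.ckpt
done
```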
APPENDIX
error shell
SPLIT=val14_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT_PATH="/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints/epoch=1.ckpt"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=ml_planner \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
planner.ml_planner.model_config='\${model}' \
planner.ml_planner.checkpoint_path=$CHECKPOINT_PATH \
model=gc_pgp_model \
model.aggregator.pre_train=false \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
correct shell
SPLIT=val14_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT_PATH="/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints/epoch1.ckpt" #delete '='
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=ml_planner \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
planner.ml_planner.model_config='\${model}' \
planner.ml_planner.checkpoint_path=$CHECKPOINT_PATH \
model=gc_pgp_model \
model.aggregator.pre_train=false \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
Best wishes,
Yifan
Great work! It helped me a lot!
Could you tell me what the plan is for open-sourcing the code?
Thank you very much!
Hi Dear Authors,
Thank you for the excellent work! I am new to the field of planning and would really like to follow the path of your paper. However, the resources in my lab are quite limited, and the >1 TB of nuPlan data is really large.
Therefore, I am curious whether this data contains only trajectory data, or whether you have other suggestions for running your algorithms with some condensed trajectory data?
Best,
Ziqi
Dear Authors,
Thank you for the excellent work! I have set up the nuplan environment and installed tuplan_garage as a package. But I ran into a problem before starting to train GC-PGP: the key callbacks.multimodal_visualization_callback.pixel_size cannot be found, so I can't train the GC-PGP model. I then looked under /home/wjl/jyf/nuplan-devkit/nuplan/planning/training/callbacks, but the file multimodal_visualization_callback does not exist there. Do you have any suggestions for training the GC-PGP model?
The bash file is below:
BATCH_SIZE=32
SEED=0
NUPLAN_DEVKIT_ROOT=/home/wjl/jyf/nuplan-devkit # nuplan-devkit path
PRETRAIN_EPOCHS=20
PRETRAIN_LR=1e-4
TRAIN_EPOCHS=90
TRAIN_LR=1e-4
TRAIN_LR_MILESTONES=[40,50,55]
TRAIN_LR_DECAY=0.5
JOB_NAME=training_gc_pgp_model
CACHE_PATH=/home/wjl/jyf/tuplan_garage/nuplandata/exp/jyf/cache # cache path
USE_CACHE_WITHOUT_DATASET=False
ROUTE_FEATURE=FALSE
ROUTE_MASK=FALSE
HARD_MASK=FALSE
TRAFFIC_LIGHT=TRUE
echo "Starting Pre-Training with gt traversals as input for decoder"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_gc_pgp_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
scenario_filter.num_scenarios_per_type=4000 \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
callbacks.visualization_callback.pixel_size=0.25 \
callbacks.multimodal_visualization_callback.pixel_size=0.25 \
lightning.trainer.params.max_epochs=$PRETRAIN_EPOCHS \
lightning.trainer.params.max_time=null \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$PRETRAIN_LR \
lr_scheduler=multistep_lr \
lr_scheduler.milestones=$TRAIN_LR_MILESTONES \
lr_scheduler.gamma=$TRAIN_LR_DECAY \
model.encoder.use_red_light_feature=$TRAFFIC_LIGHT \
model.aggregator.use_route_mask=$ROUTE_MASK \
model.aggregator.hard_masking=$HARD_MASK \
model.aggregator.pre_train=true \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
echo "Starting Training with aggregator traversals as input for decoder"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_gc_pgp_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
scenario_filter.num_scenarios_per_type=4000 \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
callbacks.visualization_callback.pixel_size=0.25 \
callbacks.multimodal_visualization_callback.pixel_size=0.25 \
lightning.trainer.params.max_epochs=$TRAIN_EPOCHS \
lightning.trainer.params.max_time=null \
lightning.trainer.checkpoint.resume_training=true \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$TRAIN_LR \
lr_scheduler=multistep_lr \
model.encoder.use_red_light_feature=$TRAFFIC_LIGHT \
model.aggregator.use_route_mask=$ROUTE_MASK \
model.aggregator.hard_masking=$HARD_MASK \
model.aggregator.pre_train=false \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
output
Starting Pre-Training with gt traversals as input for decoder
Could not override 'callbacks.multimodal_visualization_callback.pixel_size'.
To append to your config use +callbacks.multimodal_visualization_callback.pixel_size=0.25
Key 'multimodal_visualization_callback' is not in struct
full_key: callbacks.multimodal_visualization_callback
object_type=dict
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Starting Training with aggregator traversals as input for decoder
Could not override 'callbacks.multimodal_visualization_callback.pixel_size'.
To append to your config use +callbacks.multimodal_visualization_callback.pixel_size=0.25
Key 'multimodal_visualization_callback' is not in struct
full_key: callbacks.multimodal_visualization_callback
object_type=dict
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Best wishes,
Yifan
While running
SPLIT=mini_test_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT=~/tuplan_garage/pdm_open_checkpoint.ckpt
taskset -c 0-30 python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=pdm_open_planner \
planner.pdm_open_planner.checkpoint_path=$CHECKPOINT \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]" \
on a cluster, I've run into the following issue:
(wrapped_fn pid=3822105) Before assert: _drivable_area_map = <tuplan_garage.planning.simulation.planner.pdm_planner.observation.pdm_occupancy_map.PDMOccupancyMap object at 0x7f996e5255b0> [repeated 27x across cluster]
(wrapped_fn pid=3822105) Drivable area map has 'intersects' method. [repeated 21x across cluster]
(wrapped_fn pid=3822105) Test intersecting lanes at (0,0): [] [repeated 21x across cluster]
(wrapped_fn pid=3822105) Intersecting lanes at ego position [ 664863.43947442 3999616.06394007]: ['63253'] [repeated 18x across cluster]
(wrapped_fn pid=3822105) [] [repeated 5x across cluster]
(wrapped_fn pid=3822445) Intersecting lanes at ego position [ 664430.74693994 3998337.4457144 ]: ['63319']
Converting detections to smart agents: 0%| | 0/17 [00:00<?, ?it/s] [repeated 5x across cluster]
[repeated 7x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor:----------- Simulation failed: with the following trace: [repeated 5x across cluster]
(wrapped_fn pid=3822105) Traceback (most recent call last): [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/nuplan-devkit/nuplan/planning/simulation/runner/executor.py", line 27, in run_simulation [repeated 5x across cluster]
(wrapped_fn pid=3822105) return sim_runner.run() [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/nuplan-devkit/nuplan/planning/simulation/runner/simulations_runner.py", line 113, in run [repeated 5x across cluster]
(wrapped_fn pid=3822105) trajectory = self.planner.compute_trajectory(planner_input) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/nuplan-devkit/nuplan/planning/simulation/planner/abstract_planner.py", line 105, in compute_trajectory [repeated 10x across cluster]
(wrapped_fn pid=3822105) raise e [repeated 5x across cluster]
(wrapped_fn pid=3822105) trajectory = self.compute_planner_trajectory(current_input) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/tuplan_garage/tuplan_garage/planning/simulation/planner/pdm_planner/pdm_open_planner.py", line 109, in compute_planner_trajectory [repeated 5x across cluster]
(wrapped_fn pid=3822105) current_lane = self._get_starting_lane(ego_state) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/tuplan_garage/tuplan_garage/planning/simulation/planner/pdm_planner/abstract_pdm_planner.py", line 123, in _get_starting_lane [repeated 5x across cluster]
(wrapped_fn pid=3822105) on_route_lanes, heading_error = self._get_intersecting_lanes(ego_state) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/tuplan_garage/tuplan_garage/planning/simulation/planner/pdm_planner/abstract_pdm_planner.py", line 158, in _get_intersecting_lanes [repeated 5x across cluster]
(wrapped_fn pid=3822105) assert ( [repeated 5x across cluster]
(wrapped_fn pid=3822105) AssertionError: AbstractPDMPlanner: Drivable area map must be initialized first! [repeated 5x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor:Simulation failed with error: [repeated 5x across cluster]
(wrapped_fn pid=3822105) AbstractPDMPlanner: Drivable area map must be initialized first! [repeated 5x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor: [repeated 5x across cluster]
(wrapped_fn pid=3822105) Failed simulation [log,token]: [repeated 5x across cluster]
(wrapped_fn pid=3822105) [repeated 13x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor:----------- Simulation failed! [repeated 5x across cluster]
Hi,
I noticed that inside the planner (get_closed_loop_trajectory) we have two types of trajectories:
proposals_array, which do not start at the ego's current state and are later extended from 4 s to 8 s (41 poses -> 81 poses, as 10 Hz is used as the sampling frequency);
simulated_proposals_array, which are the same trajectories as proposals_array but start at the current ego pose and are simulated using the bicycle model and LQR tracker; these are used for scoring (and also have 81 poses).
What I am wondering is: why return a trajectory that does not start at the ego's current pose (the extension of the best trajectory in proposals_array) instead of one that starts at the ego position, as in simulated_proposals? Is that done on purpose, as a trick for the simulation?
Thanks in advance :)
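For anyone else counting: the 41 -> 81 pose numbers follow directly from sampling each horizon at 10 Hz with both endpoints included (a quick check, not repo code):

```python
import numpy as np

def n_poses(horizon_s: float, hz: float = 10.0) -> int:
    # t = 0, 1/hz, 2/hz, ..., horizon_s  ->  horizon_s * hz + 1 samples
    return int(round(horizon_s * hz)) + 1

times_4s = np.linspace(0.0, 4.0, n_poses(4.0))  # proposal/scoring horizon
times_8s = np.linspace(0.0, 8.0, n_poses(8.0))  # extended horizon returned to nuPlan
assert len(times_4s) == 41 and len(times_8s) == 81
```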
Dear Authors,
I tried to run the simulation and found it to be very slow, with low CPU usage. I estimate it would take 30 days to run val14, so I'm asking whether there is a way to speed it up.
Device hardware:
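For reference, nuPlan's simulation throughput is mostly governed by the worker config. One thing worth trying is pinning the Ray worker's thread count explicitly and timing a handful of scenarios first (the option names below come from the nuplan-devkit worker configs; please verify them against your devkit version, as they may differ):

```shell
# Example only: Ray worker with an explicit per-node thread count.
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
    +simulation=closed_loop_nonreactive_agents \
    planner=pdm_closed_planner \
    scenario_filter=val14_split \
    scenario_builder=nuplan \
    worker=ray_distributed \
    worker.threads_per_node=32 \
    hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
```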