autonomousvision / tuplan_garage
[CoRL'23] Parting with Misconceptions about Learning-based Vehicle Motion Planning
License: Other
Hello,
I was inspecting the trajectory generated by the pdm_closed_planner and realized that it contains only static information and no dynamic states (I suspect the same holds for the other planners as well):
InterpolatedTrajectory with 81 states
(wrapped_fn pid=1814060) EgoState(time=1633419573.0001209), Position=(365882.03552731016, 143116.11716887038, -1.80036579096818), Velocity=(0.0, 0.0), Acceleration=(0.0, 0.0), Steering_Angle=0.0)
This is the output (showing just one state) that I get when I print the states of the trajectory with this code (in ego_state.py):
def __str__(self):
    return (
        f"EgoState(time={self.time_point.time_s}), "
        f"Position=({self.rear_axle.x}, {self.rear_axle.y}, {self.rear_axle.heading}), "
        f"Velocity=({self.dynamic_car_state.rear_axle_velocity_2d.x}, {self.dynamic_car_state.rear_axle_velocity_2d.y}), "
        f"Acceleration=({self.dynamic_car_state.rear_axle_acceleration_2d.x}, {self.dynamic_car_state.rear_axle_acceleration_2d.y}), "
        f"Steering_Angle={self.tire_steering_angle})"
    )
and this (in interpolated_trajectory.py):
def __str__(self):
    return f"InterpolatedTrajectory with {len(self._trajectory)} states"

def print_states(self):
    for state in self._trajectory:
        print(state)
and finally this (in pdm_closed_planner.py):
def compute_planner_trajectory(self, current_input):
    ...
    print(trajectory)
    trajectory.print_states()
    return trajectory
So, my question is: do you also generate a velocity profile for every state of the generated trajectory? If so, how do you compute it, and how can I access it?
Also, do you take the left and right road bounds into account when computing the trajectory? If so, how do you extract/generate them, and how can I access them?
Thank you very much in advance, much appreciated!
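As a side note, even when the stored dynamic states are all zero, an approximate velocity profile can be recovered from the pose sequence alone via finite differences. This is a minimal, hypothetical sketch (plain Python, evenly spaced timestamps assumed), not the planner's actual mechanism:

```python
import math

def velocity_profile(xs, ys, ts):
    """Approximate speed at each waypoint via central/forward differences.

    xs, ys: waypoint coordinates in meters; ts: timestamps in seconds.
    Returns one speed value (m/s) per waypoint.
    """
    n = len(xs)
    speeds = []
    for i in range(n):
        j, k = max(i - 1, 0), min(i + 1, n - 1)  # clamp indices at both ends
        dt = ts[k] - ts[j]
        speeds.append(math.hypot(xs[k] - xs[j], ys[k] - ys[j]) / dt)
    return speeds

# Straight-line motion at 2 m/s sampled every 0.5 s:
ts = [0.0, 0.5, 1.0, 1.5]
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.0, 0.0, 0.0]
print(velocity_profile(xs, ys, ts))  # → [2.0, 2.0, 2.0, 2.0]
```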
I encountered the following error when running train_gc_pgp.sh. I did not modify the repo except for essential path setup. Here is the error information:
Starting Pre-Training with gt traversals as input for decoder
Global seed set to 0
2023-12-19 12:59:32,159 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-12-19 12:59:32,159 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /home/mh/code/nuplan/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.19.12.59.31
2023-12-19 12:59:32,159 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-12-19 12:59:32,160 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-12-19 12:59:33,686 INFO worker.py:1664 -- Started a local Ray instance. View the dashboard at 35.3.215.205:8265
2023-12-19 12:59:34,153 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 32
Number of GPUs per node: 1
Number of threads across all nodes: 32
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-12-19 12:59:34,155 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-12-19 12:59:34,299 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-12-19 12:59:34,299 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-12-19 12:59:34,675 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-12-19 12:59:34,675 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:19} Building augmentors...
2023-12-19 12:59:34,685 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:28} Building augmentors...DONE!
2023-12-19 12:59:34,686 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-12-19 12:59:34,737 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-12-19 12:59:34,737 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-12-19 12:59:34,738 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [02:11<00:00, 4.12s/it]
2023-12-19 13:01:49,405 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 1
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.9, 0.999]
2023-12-19 13:01:49,408 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.0001
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-12-19 13:01:49,487 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-12-19 13:01:49,494 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
Error executing job with overrides: ['seed=0', 'py_func=train', '+training=training_gc_pgp_model', 'job_name=training_gc_pgp_model', 'scenario_builder=nuplan', 'scenario_filter.num_scenarios_per_type=4000', 'cache.cache_path=/home/mh/code/nuplan/exp/mh/cache', 'cache.use_cache_without_dataset=False', 'callbacks.visualization_callback.pixel_size=0.25', '+callbacks.multimodal_visualization_callback.pixel_size=0.25', 'lightning.trainer.params.max_epochs=20', 'lightning.trainer.params.max_time=null', 'data_loader.params.batch_size=32', 'optimizer.lr=1e-4', 'lr_scheduler=multistep_lr', 'lr_scheduler.milestones=[40,50,55]', 'lr_scheduler.gamma=0.5', 'model.encoder.use_red_light_feature=TRUE', 'model.aggregator.use_route_mask=FALSE', 'model.aggregator.hard_masking=FALSE', 'model.aggregator.pre_train=true']
Traceback (most recent call last):
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 89, in
main()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/main.py", line 49, in decorated_main
_run_hydra(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
run_and_report(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 368, in
lambda: hydra.run(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 110, in run
_ = ret.return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 59, in main
engine = build_training_engine(cfg, worker)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py", line 60, in build_training_engine
trainer = build_trainer(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_builder.py", line 109, in build_trainer
callbacks = build_callbacks(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py", line 25, in build_callbacks
validate_type(callback, pl.Callback)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_type.py", line 32, in validate_type
assert isinstance(
AssertionError: Class to be of type <class 'pytorch_lightning.callbacks.base.Callback'>, but is <class 'omegaconf.dictconfig.DictConfig'>!
Starting Training with aggregator traversals as input for decoder
Global seed set to 0
2023-12-19 13:01:56,346 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-12-19 13:01:56,346 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /home/mh/code/nuplan/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.19.13.01.55
2023-12-19 13:01:56,347 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-12-19 13:01:56,347 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-12-19 13:01:57,778 INFO worker.py:1664 -- Started a local Ray instance. View the dashboard at 35.3.215.205:8265
2023-12-19 13:01:58,236 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 32
Number of GPUs per node: 1
Number of threads across all nodes: 32
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-12-19 13:01:58,237 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-12-19 13:01:58,376 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-12-19 13:01:58,376 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-12-19 13:01:58,760 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-12-19 13:01:58,760 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:19} Building augmentors...
2023-12-19 13:01:58,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/data_augmentation_builder.py:28} Building augmentors...DONE!
2023-12-19 13:01:58,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-12-19 13:01:58,821 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-12-19 13:01:58,821 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-12-19 13:01:58,821 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [02:10<00:00, 4.09s/it]
2023-12-19 13:04:12,689 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 1
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-12-19 13:04:12,692 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-12-19 13:04:12,693 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.9, 0.999]
2023-12-19 13:04:12,693 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.0001
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 1
2023-12-19 13:04:12,770 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-12-19 13:04:12,771 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-12-19 13:04:12,777 INFO {/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
Error executing job with overrides: ['seed=0', 'py_func=train', '+training=training_gc_pgp_model', 'job_name=training_gc_pgp_model', 'scenario_builder=nuplan', 'scenario_filter.num_scenarios_per_type=4000', 'cache.cache_path=/home/mh/code/nuplan/exp/mh/cache', 'cache.use_cache_without_dataset=False', 'callbacks.visualization_callback.pixel_size=0.25', '+callbacks.multimodal_visualization_callback.pixel_size=0.25', 'lightning.trainer.params.max_epochs=90', 'lightning.trainer.params.max_time=null', 'lightning.trainer.checkpoint.resume_training=true', 'data_loader.params.batch_size=32', 'optimizer.lr=1e-4', 'lr_scheduler=multistep_lr', 'model.encoder.use_red_light_feature=TRUE', 'model.aggregator.use_route_mask=FALSE', 'model.aggregator.hard_masking=FALSE', 'model.aggregator.pre_train=false']
Traceback (most recent call last):
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 89, in
main()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/main.py", line 49, in decorated_main
_run_hydra(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
run_and_report(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/utils.py", line 368, in
lambda: hydra.run(
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 110, in run
_ = ret.return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/home/mh/miniconda3/envs/nuplan/lib/python3.9/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 59, in main
engine = build_training_engine(cfg, worker)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py", line 60, in build_training_engine
trainer = build_trainer(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_builder.py", line 109, in build_trainer
callbacks = build_callbacks(cfg)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py", line 25, in build_callbacks
validate_type(callback, pl.Callback)
File "/home/mh/code/nuplan/nuplan-devkit/nuplan/planning/script/builders/utils/utils_type.py", line 32, in validate_type
assert isinstance(
AssertionError: Class to be of type <class 'pytorch_lightning.callbacks.base.Callback'>, but is <class 'omegaconf.dictconfig.DictConfig'>!
I appreciate any suggestions!
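For what it's worth, the assertion fires because the callback builder expects every entry under `callbacks` to already be instantiated into a `pytorch_lightning.Callback`, but a config group without a `_target_` (e.g. one created by a `+callbacks....pixel_size=...` override) stays a plain `DictConfig`. A simplified, hypothetical reconstruction of the check with stand-in classes (not the actual nuplan code):

```python
class Callback:  # stand-in for pytorch_lightning.Callback
    pass

class DictConfig(dict):  # stand-in for omegaconf.DictConfig
    pass

def validate_type(instance, expected_type):
    # Mirrors the nuplan-style check: a config node that was never
    # instantiated into a real Callback object fails here.
    assert isinstance(instance, expected_type), (
        f"Class to be of type {expected_type}, but is {type(instance)}!"
    )

validate_type(Callback(), Callback)  # passes: real callback object
try:
    validate_type(DictConfig(pixel_size=0.25), Callback)
except AssertionError as e:
    print("failed as expected:", e)
```

If this diagnosis applies here, the fix would be to make sure the extra callback entry resolves to an instantiable class rather than a bare parameter group.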
Hello,
I run the simulation on the CPU, but it is slow.
I saw this code in pdm_open_planner.py:
self._device = "cpu"
self._model = LightningModuleWrapper.load_from_checkpoint(
    checkpoint_path,
    model=model,
    map_location=self._device,
).model
I changed the device from "cpu" to "cuda:0", but then hit a bug about tensors being on two different devices (cpu and cuda:0). So I would like to ask: do you use CUDA to accelerate the simulation process, or do you run it on the CPU? And how should I change the config to use cuda:0?
Thanks for your help!
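A common cause of such a mixed-device error is that the model is moved to the GPU while the input features stay on the CPU; both sides need an explicit move. This is a generic, hypothetical sketch of the pattern using stub objects with a torch-style `.to(device)` method (not the planner's real feature classes):

```python
class FakeTensor:
    """Minimal stand-in for an object with a torch-style .to(device)."""
    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return FakeTensor(device)

def move_to_device(obj, device):
    """Recursively move tensors nested in dicts/lists/tuples to one device."""
    if hasattr(obj, "to"):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(v, device) for v in obj)
    return obj  # non-tensor leaves are left untouched

features = {"ego": FakeTensor(), "map": [FakeTensor(), FakeTensor()]}
features = move_to_device(features, "cuda:0")
print(features["ego"].device)  # → cuda:0
```

With real torch tensors the same helper works unchanged, since it only relies on `.to(device)`; the model itself would additionally need `model.to(device)` before inference.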
Hi, while running the simulation I noticed that the centerline and the trajectory generated by the PDM-Closed planner are inconsistent in more than one scenario, as you can see in the pictures below (the red line is the centerline, the blue line is the PDM-Closed trajectory):
My question is the following:
for the PDM-Open model you take as input just the centerline (together with ego history). Instead, for the hybrid model, you fuse together the PDM-Closed trajectory with the PDM-Open one (trained on the centerline).
Since, as shown in the plots above, PDM-Closed trajectory and centerline are not consistent, is it possible that this leads to a distorted hybrid trajectory?
Indeed, as shown above, the hybrid trajectory is distorted at the point where PDM-Closed and PDM-Open are fused together.
Do you think this could be the cause of the distorted trajectory?
If that is the case, is there a reason why you trained PDM-Open taking only the centerline as input?
Thanks a lot in advance!
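For context on the fusion question: conceptually, a hybrid trajectory keeps the closed-loop waypoints for the short horizon and switches to the learned waypoints beyond it, so any disagreement between the two shows up exactly at the seam. A hypothetical sketch of such a fusion with a linear cross-fade over a few steps (illustrative only, not the repository's actual weighting):

```python
def fuse_trajectories(closed, learned, cut, blend=3):
    """Fuse two waypoint lists: `closed` before index `cut`,
    `learned` after, cross-faded linearly over `blend` steps."""
    assert len(closed) == len(learned)
    fused = []
    for i, (c, l) in enumerate(zip(closed, learned)):
        if i < cut:
            w = 0.0  # pure closed-loop waypoint
        elif i >= cut + blend:
            w = 1.0  # pure learned waypoint
        else:
            w = (i - cut + 1) / (blend + 1)  # linear cross-fade
        fused.append(tuple((1 - w) * a + w * b for a, b in zip(c, l)))
    return fused

# Two trajectories that disagree in y by 1 m everywhere:
closed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
learned = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
print(fuse_trajectories(closed, learned, cut=1, blend=2))
```

The larger the disagreement at the cut index, the more visible the kink, which is consistent with the distortion described above.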
Hello. I have set up the nuplan environment, installed tuplan_garage as a package, and followed every preparation step in the README. However, when I tried to train the model, I encountered a fatal Ray error. Every time after 'Ray objects' finishes, Ray soon fails to start the dashboard, causing the program to run 'Ray objects' again. Because the Ray instance fails to initialize, there is no log recording the error. I searched for a similar issue here, but it was of little help. Thank you for the assistance.
The bash script I run is below:
TRAIN_EPOCHS=100
TRAIN_LR=1e-4
TRAIN_LR_MILESTONES=[50,75]
TRAIN_LR_DECAY=0.1
BATCH_SIZE=64
SEED=0
JOB_NAME=training_pdm_open_model
CACHE_PATH=/mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/cache
USE_CACHE_WITHOUT_DATASET=False
source ~/.bashrc
conda activate nuplan
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_pdm_open_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
lightning.trainer.params.max_epochs=$TRAIN_EPOCHS \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$TRAIN_LR \
lr_scheduler=multistep_lr \
lr_scheduler.milestones=$TRAIN_LR_MILESTONES \
lr_scheduler.gamma=$TRAIN_LR_DECAY \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
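In case it helps debugging: nuplan's training script lets you swap the worker, so you can check whether the failure is Ray-specific by bypassing Ray entirely. To the best of my knowledge these worker overrides exist in nuplan-devkit's common configs, but please verify the names against your devkit version:

```shell
# 1) Bypass Ray entirely (single process, no dashboard) -- slow but isolates the bug:
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
    py_func=train +training=training_pdm_open_model \
    worker=sequential

# 2) Keep parallelism but avoid Ray via a thread pool:
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
    py_func=train +training=training_pdm_open_model \
    worker=single_machine_thread_pool worker.max_workers=32
```

If training succeeds with either worker, the problem is in the local Ray setup (often the dashboard port or shared-memory limits) rather than in the repo itself.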
Global seed set to 0
2023-09-16 11:08:48,865 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.08.46
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:08:48,870 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:08:52,865 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:08:58,481 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:08:58,483 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-09-16 11:08:59,487 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:08:59,488 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-09-16 11:09:00,464 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:09:00,465 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:09:00,989 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|██████████████████████████████████████████████████████████████████████████████| 96/96 [13:16<00:00, 8.29s/it]
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.7422979694372631, 0.9971741579476155]
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.000282842712474619
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-09-16 11:22:25,503 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
2023-09-16 11:22:25,538 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:37} Building callbacks...DONE!
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Using native 16bit precision.
2023-09-16 11:22:25,539 INFO {/home/linqing.zhao/nuplan-devkit//nuplan/planning/script/run_training.py:62} Starting training...
Global seed set to 0
2023-09-16 11:22:39,118 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.38
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:39,123 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
Global seed set to 0
I encountered issues when running the tuplan_garage project on my computer. Training on the nuPlan dataset was slow, and GPU memory usage was only around 20%. I also tried an A100 cluster, and GPU utilization was just as low.
I have been using the default training settings. Were any important settings overlooked in the process?
Hi,
Thank you for the code! However, when I run the evaluation command from the README, I encounter this error:
INFO:nuplan.planning.script.builders.main_callback_builder:Building MultiMainCallback...
INFO:nuplan.planning.script.builders.main_callback_builder:Building MultiMainCallback: 4...DONE!
2023-07-19 21:44:16,393 INFO {/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-07-19 21:44:16,436 INFO {/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-07-19 21:44:20,657 ERROR services.py:1207 -- Failed to start the dashboard , return code -11
2023-07-19 21:44:20,657 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-07-19 21:44:20,658 ERROR services.py:1276 --
The last 20 lines of /tmp/ray/session_2023-07-19_21-44-16_486474_454342/logs/dashboard.log (it contains the error message from the dashboard):
2023-07-19 21:44:18,280 INFO head.py:242 -- Starting dashboard metrics server on port 44227
2023-07-19 21:44:20,971 INFO worker.py:1636 -- Started a local Ray instance.
[2023-07-19 21:44:22,949 E 454342 454342] core_worker.cc:193: Failed to register worker 01000000ffffffffffffffffffffffffffffffffffffffffffffffff to Raylet. IOError: [RayletClient] Unable to register worker with raylet. No such file or directory
Do you have any hints on what went wrong?
I noticed the visualization item in the 'To Do' list. When will the corresponding visualization scripts be made available? Thank you!
In 'default_simulation': Could not find 'planner/pdm_closed_planner'
Available options in 'planner':
idm_planner
log_future_planner
ml_planner
remote_planner
simple_planner
I have already added file:///home/***/tuplan_garage/tuplan_garage/planning/script/config/common to the Hydra search path.
Hello,
I was wondering when updated versions of the checkpoints at this link (https://drive.google.com/drive/folders/1LLdunqyvQQuBuknzmf7KMIJiA2grLYB2) will be uploaded to be compatible with the renamed version of tuplan_garage.
Hi,
I am a little confused about the val14 data split. Did you sample the 178k samples of the train150 set from both the training and validation sets, with the val-set samples used for validation, while the val14 set (1,118 scenarios from the val set) is used to test performance? If so, might the validation set and the val14 test set overlap?
Hello,
Great work!
How much time is needed for training on a single 3090 GPU?
Thanks a lot!
Hi,
For your released gc_pgp code, I noticed that you pretrained the aggregator. Did you load the pretrained aggregator when training gc_pgp? I didn't find any model-loading code. Also, could you share the training configuration for Urban Driver? I trained Urban Driver on the train150k set with 4000 scenarios per type, but the performance is much lower than the numbers in the table in your README, with only a 55 open-loop score.
Best,
I want to run simulations using the models you provide at https://drive.google.com/drive/folders/1LLdunqyvQQuBuknzmf7KMIJiA2grLYB2. However, I get the error "No module named 'nuplan_garage'" when loading "pdm_offset_checkpoint.ckpt" and "gc_pgp_checkpoint.ckpt". The error does not occur when I load "pdm_open_checkpoint.ckpt".
Maybe the reason is that these checkpoints were not regenerated after you changed the repository name due to the trademark conflicts.
Thanks in advance!
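In case it helps others hitting this: pickled checkpoints store class references under the package name they were trained with, so one common workaround is to alias the old name onto the renamed package in `sys.modules` before loading. Below is a self-contained sketch of the technique using a made-up stale module name (`colxections`); for the actual checkpoints, the analogous step would presumably be aliasing `nuplan_garage` to `tuplan_garage` before calling `torch.load`.

```python
import pickle
import sys
import collections

# Stand-in for a checkpoint pickled under a repo's old package name: rewrite
# the module string to a name that no longer exists (same byte length, so
# the pickle stream stays valid).
payload = pickle.dumps(collections.OrderedDict(a=1))
stale = payload.replace(b"collections", b"colxections")

try:
    pickle.loads(stale)  # fails: No module named 'colxections'
except ModuleNotFoundError:
    pass

# The fix: alias the old name to the real (renamed) module before loading.
# For these checkpoints the analogous (hypothetical) step would be:
#   import tuplan_garage; sys.modules["nuplan_garage"] = tuplan_garage
# before torch.load(...).
sys.modules["colxections"] = collections
restored = pickle.loads(stale)
```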
Hello, I have run into some issues that left me confused. I would appreciate it if you could help me.
Firstly, what is the difference between open-loop and closed-loop (NR)? I have read the nuPlan paper, and I know both are non-reactive modes, so they simulate agents by directly replaying logs. I think the difference is that in closed-loop (NR) the planner observes the new state at each step, so it can correct its future trajectory, whereas open-loop does not consider that. Do I understand correctly? Can you tell me more about the differences between these two modes?
Secondly, I am going to try using my own RL-based controller to control the background agents, but I don't know how. Have you tried this before? And could you tell me how the background IDM agents are implemented in the nuplan-devkit code?
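To make my question concrete, this is how I currently picture the two modes, as a toy sketch with hypothetical planner/controller callables (not the nuPlan API):

```python
def open_loop(planner, logged_states):
    # The planner is conditioned on the *logged* history at every step; its
    # output is scored against the logged future but never executed.
    return [planner(logged_states[: t + 1]) for t in range(len(logged_states))]

def closed_loop_nonreactive(planner, controller, initial_state, n_steps):
    # The planner's output is executed: the resulting ego state becomes the
    # planner input at the next step, so errors can compound. Background
    # agents still replay the log, hence "non-reactive".
    state, rollout = initial_state, []
    for _ in range(n_steps):
        plan = planner([state])
        state = controller(state, plan)  # ego follows its own plan
        rollout.append(state)
    return rollout
```

Is that the right mental model?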
Hi, thanks for your great work.
I'm trying to compute the L2 error in the open-loop setting (as on nuScenes, where L2 is computed at the 0.5 s, 1 s, 1.5 s, 2 s, 2.5 s, and 3 s timestamps).
So I would like to know the following:
For example, when I try to analyze this simulation log file:
"exp/simulation/open_loop_boxes/2024.05.09.13.50.26/simulation_log/MLPlanner/following_lane_with_lead/2021.06.07.11.59.52_veh-35_00765_01072/5d91ff45ef9d568f/5d91ff45ef9d568f.msgpack.xz"
how can I use these two attributes (or others) to compute the L2 error?
data.simulation_history.data[i].ego_state.waypoint.center
data.simulation_history.data[i].trajectory._trajectory[j].center
(where i = 1,2,..., frame_num-1; j = 0,1,2,...,16, and data = SimulationLog.load_data(file_path=log_path))
Can the series [data.simulation_history.data[i].ego_state.waypoint.center, data.simulation_history.data[i+1].ego_state.waypoint.center, ..., data.simulation_history.data[i+N].ego_state.waypoint.center] be treated as the ground truth, and [data.simulation_history.data[i].trajectory._trajectory[j].center, data.simulation_history.data[i].trajectory._trajectory[j+1].center, ..., data.simulation_history.data[i].trajectory._trajectory[j+N].center] as the prediction?
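To make the question concrete, this is the computation I have in mind, with plain arrays standing in for the msgpack log (assuming, purely for illustration, that both the simulation history and the planned trajectory are sampled every 0.5 s; the real sampling rates may differ):

```python
import numpy as np

def l2_at_horizons(ego_xy, plan_xy, sim_dt=0.5, plan_dt=0.5,
                   horizons=(0.5, 1.0, 1.5, 2.0, 2.5, 3.0)):
    """Mean L2 error at fixed horizons.

    ego_xy:  (T, 2) logged ego centers, one per simulation step (the "GT")
    plan_xy: (T, H, 2) planned trajectory centers emitted at each step
    """
    errors = {}
    for h in horizons:
        k_sim = int(round(h / sim_dt))    # offset into the logged states
        k_plan = int(round(h / plan_dt))  # offset into each emitted plan
        n = len(ego_xy) - k_sim           # frames that still have a GT future
        diff = plan_xy[:n, k_plan] - ego_xy[k_sim:k_sim + n]
        errors[h] = float(np.linalg.norm(diff, axis=-1).mean())
    return errors
```

Here `ego_xy[i]` would come from `data.simulation_history.data[i].ego_state.waypoint.center` and `plan_xy[i, j]` from `data.simulation_history.data[i].trajectory._trajectory[j].center`. Is this the intended way to compute it?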
Many thanks!
Greetings,
Ryhn
Thank you for open-sourcing your work. How can I export high-resolution simulation images from nuBoard in SVG or another format? The images saved with nuBoard's save button are of very poor quality.
Hello. I have set up the nuPlan environment, installed tuplan_garage as a package, and followed every preparation step in the README. However, when I try to train the model, I encounter a fatal Ray error. Every time after 'Ray objects' finishes, Ray soon fails to start the dashboard, causing the program to run 'Ray objects' again. Because the Ray instance fails to initialize, there is no log recording the error. I searched for a similar issue here, but it was of little help. Thank you for your assistance.
The bash script I run:
TRAIN_EPOCHS=100
TRAIN_LR=1e-4
TRAIN_LR_MILESTONES=[50,75]
TRAIN_LR_DECAY=0.1
BATCH_SIZE=64
SEED=0
JOB_NAME=training_pdm_open_model
CACHE_PATH=/mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/cache
USE_CACHE_WITHOUT_DATASET=False
source ~/.bashrc
conda activate nuplan
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_pdm_open_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
lightning.trainer.params.max_epochs=$TRAIN_EPOCHS \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$TRAIN_LR \
lr_scheduler=multistep_lr \
lr_scheduler.milestones=$TRAIN_LR_MILESTONES \
lr_scheduler.gamma=$TRAIN_LR_DECAY \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
Global seed set to 0
2023-09-16 11:08:48,865 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.08.46
2023-09-16 11:08:48,868 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:08:48,870 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:08:52,865 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:08:58,481 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:08:58,482 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:08:58,483 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-09-16 11:08:59,487 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:08:59,488 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-09-16 11:09:00,464 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:09:00,465 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:09:00,988 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:09:00,989 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 100%|██████████████████████████████████████████████████████████████████████████████| 96/96 [13:16<00:00, 8.29s/it]
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py:171} Extracted 177435 scenarios for training
2023-09-16 11:22:25,347 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:114} World size: 8
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:115} Learning rate before: 0.0001
2023-09-16 11:22:25,348 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:119} Scaling method: Equal Variance
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:141} Betas after scaling: [0.7422979694372631, 0.9971741579476155]
2023-09-16 11:22:25,349 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:143} Learning rate after scaling: 0.000282842712474619
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:172} Updating Learning Rate Scheduler Config...
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:258} WORLD_SIZE was not set.
2023-09-16 11:22:25,478 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:266} PytorchLightning Trainer gpus was set to -1, finding number of GPUs used from torch.cuda.device_count().
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:277} Number of gpus found to be in use: 8
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:199} Updating torch.optim.lr_scheduler.MultiStepLR in ddp setting is not yet supported. Learning rate scheduler config will not be updated.
2023-09-16 11:22:25,479 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/utils/utils_config.py:245} Optimizer and LR Scheduler configs updated according to ddp strategy.
2023-09-16 11:22:25,503 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:19} Building callbacks...
2023-09-16 11:22:25,538 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/training_callback_builder.py:37} Building callbacks...DONE!
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Using native 16bit precision.
2023-09-16 11:22:25,539 INFO {/home/linqing.zhao/nuplan-devkit//nuplan/planning/script/run_training.py:62} Starting training...
Global seed set to 0
2023-09-16 11:22:39,118 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.38
2023-09-16 11:22:39,121 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:39,123 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
Global seed set to 0
2023-09-16 11:22:41,279 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:41,281 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.40
2023-09-16 11:22:41,281 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:41,283 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:42,819 INFO worker.py:1621 -- Started a local Ray instance.
Global seed set to 0
2023-09-16 11:22:45,132 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:45,138 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.44
2023-09-16 11:22:45,138 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:45,140 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:45,659 INFO worker.py:1621 -- Started a local Ray instance.
Global seed set to 0
2023-09-16 11:22:49,560 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:22:49,560 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:22:49,561 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:22:49,561 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:22:49,561 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
2023-09-16 11:22:49,782 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:49,784 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.49
2023-09-16 11:22:49,785 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:49,787 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:50,106 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:22:50,106 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
Global seed set to 0
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/8
2023-09-16 11:22:51,378 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:22:51,379 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:22:51,571 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:22:51,571 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:22:51,573 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 0%| | 0/96 [00:00<?, ?it/s]2023-09-16 11:22:54,601 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:101} Worker: RayDistributed
2023-09-16 11:22:54,601 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_pool.py:102} Number of nodes: 1
Number of CPUs per node: 96
Number of GPUs per node: 8
Number of threads across all nodes: 96
2023-09-16 11:22:54,602 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:27} Building WorkerPool...DONE!
2023-09-16 11:22:54,602 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/training/experiments/training.py:41} Building training engine...
2023-09-16 11:22:54,602 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:18} Building TorchModuleWrapper...
Global seed set to 0
2023-09-16 11:22:55,184 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/model_builder.py:21} Building TorchModuleWrapper...DONE!
2023-09-16 11:22:55,184 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:18} Building Splitter...
2023-09-16 11:22:55,567 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:22:55,599 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:22:55,607 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.22.55
2023-09-16 11:22:55,608 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:22:55,610 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:22:55,752 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:22:56,570 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/splitter_builder.py:21} Building Splitter...DONE!
2023-09-16 11:22:56,571 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:18} Building AbstractScenarioBuilder...
2023-09-16 11:22:56,780 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_building_builder.py:21} Building AbstractScenarioBuilder...DONE!
2023-09-16 11:22:56,780 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:35} Building ScenarioFilter...
2023-09-16 11:22:56,782 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/scenario_filter_builder.py:44} Building ScenarioFilter...DONE!
Ray objects: 0%| | 0/96 [00:00<?, ?it/s]Global seed set to 0
2023-09-16 11:23:03,159 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:23:03,167 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.23.02
2023-09-16 11:23:03,168 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:23:03,170 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
Global seed set to 0
2023-09-16 11:23:12,023 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:20} Building experiment folders...
2023-09-16 11:23:12,030 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/folder_builder.py:22} Experimental folder: /mnt/cfs/algorithm/linqing.zhao/haozhe/Tuplan_garage/exp/exp/training_pdm_open_model/training_pdm_open_model/2023.09.16.11.23.11
2023-09-16 11:23:12,031 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/script/builders/worker_pool_builder.py:19} Building WorkerPool...
2023-09-16 11:23:12,034 INFO {/home/linqing.zhao/nuplan-devkit/nuplan/planning/utils/multithreading/worker_ray.py:78} Starting ray local!
2023-09-16 11:23:17,381 ERROR services.py:1207 -- Failed to start the dashboard
2023-09-16 11:23:17,382 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-09-16 11:23:17,382 ERROR services.py:1242 -- Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: '/tmp/ray/session_2023-09-16_11-22-55_720350_89715/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
2023-09-16 11:23:17,382 ERROR services.py:1276 -- Failed to read dashboard.err file: cannot mmap an empty file. It is unexpected. Please report an issue to Ray github. https://github.com/ray-project/ray/issues
2023-09-16 11:23:17,582 INFO worker.py:1621 -- Started a local Ray instance.
2023-09-16 11:23:25,116 ERROR services.py:1207 -- Failed to start the dashboard
2023-09-16 11:23:25,116 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-09-16 11:23:25,116 ERROR services.py:1242 -- Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: '/tmp/ray/session_2023-09-16_11-23-03_304985_90490/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
2023-09-16 11:23:25,116 ERROR services.py:1276 -- Failed to read dashboard.err file: cannot mmap an empty file. It is unexpected. Please report an issue to Ray github. https://github.com/ray-project/ray/issues
2023-09-16 11:23:25,233 INFO worker.py:1621 -- Started a local Ray instance.
[2023-09-16 11:23:26,416 E 89301 89301] core_worker.cc:201: Failed to register worker 01000000ffffffffffffffffffffffffffffffffffffffffffffffff to Raylet. IOError: [RayletClient] Unable to register worker with raylet. No such file or directory
Thank you for open-sourcing your work. I am a little confused: I did not download any of the checkpoints mentioned in the README, yet the example pdm_closed_planner ran without any error. Is this because the method is entirely rule-based? It seems, however, that there are some networks in this algorithm.
Hi, I am trying to reproduce the result in CoRL'23 Table 1 with only the centerline input. In the paper, PDM-Open achieved a score of 85 on the OLS, while my replication only reached 21 on reduced_val14. I would like to know whether this is due to differences in my hyperparameter settings or whether there are other issues.
Changes to pdm_open_model.py:
self.planner_head = nn.Sequential(
    # nn.Linear(self.hidden_dim * 2, self.hidden_dim),
    nn.Linear(self.hidden_dim * 1, self.hidden_dim),
    nn.Dropout(0.1),
    nn.ReLU(),
    nn.Linear(self.hidden_dim, self.hidden_dim),
    nn.ReLU(),
    nn.Linear(self.hidden_dim, trajectory_sampling.num_poses * len(SE2Index)),
)
# planner_features = torch.cat([state_encodings, centerline_encodings], dim=-1)
planner_features = centerline_encodings
Hi,
Are there any plans to release the Val14 dataset to the public or the logic used to split the dataset? Thanks in advance.
Hi, great work!
If the train and val data are combined according to the official nuPlan data setup, is it possible that scenarios from Val14 are included in the training data, so that the model has already seen the data later used for testing?
Thanks for great coding and work!
I'm wondering if you still have plans to release a visualization script to generate videos like the teaser.mp4 file in the README. Was that file made from a (closed-loop) simulation?
Thanks beforehand for taking the time to consider my question!
Hi
Thank you for your work.
However, I have a question about the Urban Driver evaluation results on the val14_split.
When running the bash file at ../tuplan_garage/scripts/simulation/sim_urban_driver.sh, I hit an IOError from [RayletClient], so I changed the worker parameter to worker=single_machine_thread_pool in the bash file.
The full bash file is here:
SPLIT=val14_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT_PATH="/mnt/sdd/jyYun/planning/tuplan_garage/trained_weights/urban_driver.ckpt"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=ml_planner \
worker=single_machine_thread_pool \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
planner.ml_planner.model_config='\${model}' \
planner.ml_planner.checkpoint_path=$CHECKPOINT_PATH \
model=urban_driver_open_loop_model \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
It works, but I have noticed that some scenarios fail during evaluation: when the evaluation ended, it reported that 38 out of 1118 scenarios had failed. Despite the several failed scenarios, I have confirmed that the results are still successfully saved in the 'exp' folder, as configured in nuPlan. Under these circumstances, I have some questions:
Traceback (most recent call last):
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/runner/executor.py", line 28, in run_simulation
run_results = sim_runner.run()
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/runner/simulations_runner.py", line 128, in run
self.simulation.callback.on_simulation_end(self.simulation.setup, self.planner, self.simulation.history)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/callback/multi_callback.py", line 68, in on_simulation_end
callback.on_simulation_end(setup, planner, history)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 102, in on_simulation_end
run_metric_engine(
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 24, in run_metric_engine
metric_files = metric_engine.compute(history, scenario=scenario, planner_name=planner_name)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 133, in compute
all_metrics_results = self.compute_metric_results(history=history, scenario=scenario)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 119, in compute_metric_results
raise RuntimeError(f"Metric Engine failed with: {e}")
RuntimeError: Metric Engine failed with:
'
2023-12-09 09:45:38,768 WARNING {/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/simulation/runner/executor.py:125} Failed Simulation.
'Traceback (most recent call last):
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 112, in compute_metric_results
metric_results[metric.name] = metric.compute(history, scenario=scenario)
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/evaluation_metrics/common/speed_limit_compliance.py", line 218, in compute
time_series = TimeSeries(
File "<string>", line 7, in __init__
File "/mnt/sdd/jyYun/planning/nuplan-devkit/nuplan-devkit/nuplan/planning/metrics/metric_result.py", line 127, in __post_init__
assert len(self.time_stamps) == len(self.values)
AssertionError
Hello, I'm trying to get your urban_driver and GC-PGP pretrained models working and am experiencing an error. There are two separate problems. First, it seems that the urban_driver model was not updated with the rest when you retrained GC-PGP, so it still produces the 'cannot find module nuplan_garage' error. Furthermore, when running the GC-PGP model with the type of config used in the nuPlan tutorials, I get a different error regarding multimodal_trajectories.
The config I am using and the resulting error are below. This was run in a Jupyter notebook sitting in the nuplan-devkit repo.
# Location of path with all simulation configs
CONFIG_PATH = '../nuplan/planning/script/config/simulation'
CONFIG_NAME = 'default_simulation'
CHECKPOINT_PATH='run_sim_closed_loop/pretrained_checkpoints/gc_pgp_checkpoint.ckpt'
# Select the planner and simulation challenge
PLANNER = 'ml_planner' # [simple_planner, ml_planner]
CHALLENGE = 'closed_loop_reactive_agents' # [open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents]
DATASET_PARAMS = [
'scenario_builder=nuplan_mini', # use nuplan mini database
'scenario_filter=all_scenarios', # initially select all scenarios in the database
'scenario_filter.scenario_types=[near_multiple_vehicles, on_pickup_dropoff, starting_unprotected_cross_turn, high_magnitude_jerk]', # select scenario types
'scenario_filter.num_scenarios_per_type=5', # use 5 scenarios per scenario type
]
# Name of the experiment
EXPERIMENT = 'simulation_simple_experiment'
# Initialize configuration management system
hydra.core.global_hydra.GlobalHydra.instance().clear() # reinitialize hydra if already initialized
hydra.initialize(config_path=CONFIG_PATH)
# Compose the configuration
cfg = hydra.compose(config_name=CONFIG_NAME, overrides=[
f'experiment_name={EXPERIMENT}',
f'planner={PLANNER}',
f'model=raster_model',
'planner.ml_planner.model_config=${model}', # hydra notation to select model config
f'planner.ml_planner.checkpoint_path={CHECKPOINT_PATH}', # this path can be replaced by the checkpoint of the model trained in the previous section
f'group={SAVE_DIR}',
f'+simulation={CHALLENGE}',
*DATASET_PARAMS,
'hydra.searchpath=[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]'
])
from nuplan.planning.script.run_simulation import main as main_simulation
# Run the simulation loop (real-time visualization not yet supported, see next section for visualization)
main_simulation(cfg)
# Simple simulation folder for visualization in nuBoard
simple_simulation_folder = cfg.output_dir
AssertionError Traceback (most recent call last)
Cell In[10], line 4
1 from nuplan.planning.script.run_simulation import main as main_simulation
3 # Run the simulation loop (real-time visualization not yet supported, see next section for visualization)
----> 4 main_simulation(cfg)
6 # Simple simulation folder for visualization in nuBoard
7 simple_simulation_folder = cfg.output_dir
File /opt/conda/lib/python3.9/site-packages/hydra/main.py:44, in main.<locals>.main_decorator.<locals>.decorated_main(cfg_passthrough)
41 @functools.wraps(task_function)
42 def decorated_main(cfg_passthrough: Optional[DictConfig] = None) -> Any:
43 if cfg_passthrough is not None:
---> 44 return task_function(cfg_passthrough)
45 else:
46 args = get_args_parser()
File ~/nuplan-devkit/nuplan/planning/script/run_simulation.py:110, in main(cfg)
107 assert cfg.simulation_log_main_path is None, 'Simulation_log_main_path must not be set when running simulation.'
109 # Execute simulation with preconfigured planner(s).
--> 110 run_simulation(cfg=cfg)
112 if is_s3_path(Path(cfg.output_dir)):
113 clean_up_s3_artifacts()
File ~/nuplan-devkit/nuplan/planning/script/run_simulation.py:66, in run_simulation(cfg, planners)
63 if isinstance(planners, AbstractPlanner):
64 planners = [planners]
---> 66 runners = build_simulations(
67 cfg=cfg,
68 callbacks=callbacks,
69 worker=common_builder.worker,
70 pre_built_planners=planners,
71 callbacks_worker=callbacks_worker_pool,
72 )
74 if common_builder.profiler:
75 # Stop simulation construction profiling
76 common_builder.profiler.save_profiler(profiler_name)
File ~/nuplan-devkit/nuplan/planning/script/builders/simulation_builder.py:90, in build_simulations(cfg, worker, callbacks, callbacks_worker, pre_built_planners)
87 if 'planner' not in cfg.keys():
88 raise KeyError('Planner not specified in config. Please specify a planner using "planner" field.')
---> 90 planners = build_planners(cfg.planner, scenario)
91 else:
92 planners = pre_built_planners
File ~/nuplan-devkit/nuplan/planning/script/builders/planner_builder.py:58, in build_planners(planner_cfg, scenario)
51 def build_planners(planner_cfg: DictConfig, scenario: Optional[AbstractScenario]) -> List[AbstractPlanner]:
52 """
53 Instantiate multiple planners by calling build_planner
54 :param planners_cfg: planners config
55 :param scenario: scenario
56 :return planners: List of AbstractPlanners
57 """
---> 58 return [_build_planner(planner, scenario) for planner in planner_cfg.values()]
File ~/nuplan-devkit/nuplan/planning/script/builders/planner_builder.py:58, in <listcomp>(.0)
51 def build_planners(planner_cfg: DictConfig, scenario: Optional[AbstractScenario]) -> List[AbstractPlanner]:
52 """
53 Instantiate multiple planners by calling build_planner
54 :param planners_cfg: planners config
55 :param scenario: scenario
56 :return planners: List of AbstractPlanners
57 """
---> 58 return [_build_planner(planner, scenario) for planner in planner_cfg.values()]
File ~/nuplan-devkit/nuplan/planning/script/builders/planner_builder.py:26, in _build_planner(planner_cfg, scenario)
23 if is_target_type(planner_cfg, MLPlanner):
24 # Build model and feature builders needed to run an ML model in simulation
25 torch_module_wrapper = build_torch_module_wrapper(planner_cfg.model_config)
---> 26 model = LightningModuleWrapper.load_from_checkpoint(
27 planner_cfg.checkpoint_path, model=torch_module_wrapper
28 ).model
30 # Remove config elements that are redundant to MLPlanner
31 OmegaConf.set_struct(config, False)
File /opt/conda/lib/python3.9/site-packages/pytorch_lightning/core/saving.py:157, in ModelIO.load_from_checkpoint(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)
154 # override the hparams with values that were passed in
155 checkpoint[cls.CHECKPOINT_HYPER_PARAMS_KEY].update(kwargs)
--> 157 model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
158 return model
File /opt/conda/lib/python3.9/site-packages/pytorch_lightning/core/saving.py:199, in ModelIO._load_model_state(cls, checkpoint, strict, **cls_kwargs_new)
195 if not cls_spec.varkw:
196 # filter kwargs according to class init unless it allows any argument via kwargs
197 _cls_kwargs = {k: v for k, v in _cls_kwargs.items() if k in cls_init_args_name}
--> 199 model = cls(**_cls_kwargs)
201 # give model a chance to load something
202 model.on_load_checkpoint(checkpoint)
File ~/nuplan-devkit/nuplan/planning/training/modeling/lightning_module_wrapper.py:67, in LightningModuleWrapper.__init__(self, model, objectives, metrics, batch_size, optimizer, lr_scheduler, warm_up_lr_scheduler, objective_aggregate_mode)
65 for metric in self.metrics:
66 for feature in metric.get_list_of_required_target_types():
---> 67 assert feature in model_targets, f"Metric target: \"{feature}\" is not in model computed targets!"
AssertionError: Metric target: "multimodal_trajectories" is not in model computed targets!
Thanks for open sourcing the great work!
I want to ask about the training details of PDM-open and PDM-Offset.
Thank you in advance.
Hi, I was wondering if you have tried to implement this learning-based planner on a real vehicle, or if you have tried to build a bridge to connect the simulator with a ROS environment.
If that's the case, could you please provide more information about it?
I think it would be very interesting to test this model on a real vehicle! Thanks
Hi,
When I evaluate pdm_closed_planner and pdm_open_planner in the closed-loop, non-reactive agents setting, both planners run fine without errors if I use a reduced scenario filter (3 or 5 scenarios randomly picked from the validation dataset). However, on val14_split, when running
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py +simulation=closed_loop_nonreactive_agents planner=pdm_closed_planner scenario_filter=val14_split scenario_builder=nuplan worker=single_machine_thread_pool scenario_builder.data_root=/fs/scratch/projects/proj-ai-planning/archive/nuScenes/nuplan/dataset/nuplan-v1.1/splits/val hydra.searchpath="[pkg://nuplan_garage.planning.script.config.common, pkg://nuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
I get the following error when the simulations are being executed:
Traceback (most recent call last):
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 112, in compute_metric_results
metric_results[metric.name] = metric.compute(history, scenario=scenario)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/evaluation_metrics/common/speed_limit_compliance.py", line 218, in compute
time_series = TimeSeries(
File "<string>", line 7, in __init__
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_result.py", line 127, in __post_init__
assert len(self.time_stamps) == len(self.values)
AssertionError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/runner/executor.py", line 27, in run_simulation
return sim_runner.run()
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/runner/simulations_runner.py", line 128, in run
self.simulation.callback.on_simulation_end(self.simulation.setup, self.planner, self.simulation.history)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/callback/multi_callback.py", line 68, in on_simulation_end
callback.on_simulation_end(setup, planner, history)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 102, in on_simulation_end
run_metric_engine(
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/simulation/callback/metric_callback.py", line 24, in run_metric_engine
metric_files = metric_engine.compute(history, scenario=scenario, planner_name=planner_name)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 133, in compute
all_metrics_results = self.compute_metric_results(history=history, scenario=scenario)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 119, in compute_metric_results
raise RuntimeError(f"Metric Engine failed with: {e}")
RuntimeError: Metric Engine failed with:
Traceback (most recent call last):
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_engine.py", line 112, in compute_metric_results
metric_results[metric.name] = metric.compute(history, scenario=scenario)
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/evaluation_metrics/common/speed_limit_compliance.py", line 218, in compute
time_series = TimeSeries(
File "<string>", line 7, in __init__
File "/home/lis2syv/nuPlan/nuplan-devkit/nuplan/planning/metrics/metric_result.py", line 127, in __post_init__
assert len(self.time_stamps) == len(self.values)
AssertionError
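For reference, the invariant that fails here can be reduced to a minimal sketch (the field names come from the traceback; the class body is illustrative, not nuPlan's actual metric_result.py):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TimeSeriesSketch:
    """Illustrative stand-in for nuPlan's TimeSeries (metric_result.py)."""
    unit: str
    time_stamps: List[int]
    values: List[float]

    def __post_init__(self) -> None:
        # The assertion from the traceback: every value needs a timestamp.
        assert len(self.time_stamps) == len(self.values)

# A metric that drops a value (e.g. filtering NaNs) without dropping the
# matching timestamp trips the assertion exactly as in the log above:
TimeSeriesSketch(unit="mps", time_stamps=[0, 1, 2], values=[9.8, 9.9, 10.1])
try:
    TimeSeriesSketch(unit="mps", time_stamps=[0, 1, 2], values=[9.8, 9.9])
except AssertionError:
    print("mismatched lengths rejected")
```

In other words, the speed_limit_compliance metric produced a different number of values than timestamps for the failing scenario; inspecting that metric's inputs for the failing scenario token is a reasonable first step.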
Dear Authors,
Thank you for your work.
I have built my own prediction model based on PGP and tried to combine it with PDM-Closed, but it didn't work very well. Since GC-PGP's features differ from PDM's, I would like to ask how to combine GC-PGP with PDM-Closed.
Uploading the code would be great!
Thank you for all the help you've given me before!
Best wishes,
Yifan
Hello,
I have a question regarding the encoding of the centerline and the ego states:
1) Centerline: from my understanding, you first extract the centerline using Dijkstra, from the starting position to the goal position.
Afterwards, you keep only a 120 m segment sampled at 1 m and pass it to the neural network.
My question is: how do you select this "local" 120 m centerline? Does its starting point coincide with the ego vehicle's position in each frame, or do you use some other logic? Moreover, each centerline state only includes x, y, and theta, but in which reference frame are they expressed?
2) Ego states: ego_position, ego_velocity, and ego_acceleration each have 11 elements (the last is the present state and the first 10 are past states), each with 3 components (x, y, and theta). Which reference frame is used here as well?
I guess it's the local one, but I'm asking just to make sure.
Thank you very much in advance!
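To make question 1) concrete, here is a minimal sketch of what I imagine the "local" centerline extraction could look like: window 120 m at 1 m spacing starting from the pose nearest the ego, then express each (x, y, heading) in the ego frame. All function and variable names, and the nearest-point logic, are my own assumptions, not the repo's code:

```python
import numpy as np

def local_centerline(centerline_xyh: np.ndarray,
                     ego_xyh: np.ndarray,
                     length_m: float = 120.0,
                     step_m: float = 1.0) -> np.ndarray:
    """Hypothetical sketch: centerline_xyh is (N, 3) global poses sampled at
    step_m; returns (length_m / step_m, 3) poses in the ego frame."""
    # 1) start the window at the centerline pose nearest the current ego
    dists = np.linalg.norm(centerline_xyh[:, :2] - ego_xyh[:2], axis=1)
    start = int(np.argmin(dists))
    n = int(length_m / step_m)
    window = centerline_xyh[start:start + n]

    # 2) rotate/translate into the ego frame (ego at origin, heading 0)
    c, s = np.cos(-ego_xyh[2]), np.sin(-ego_xyh[2])
    rot = np.array([[c, -s], [s, c]])
    xy_local = (window[:, :2] - ego_xyh[:2]) @ rot.T
    heading_local = window[:, 2] - ego_xyh[2]
    return np.concatenate([xy_local, heading_local[:, None]], axis=1)
```

If the repo instead anchors or frames the window differently (e.g. with an offset behind the ego), that is exactly what I would like to know.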
Dear Authors,
I would like to make a suggestion. When running the bash file at
/home/wjl/jyf/tuplan_garage/scripts/simulation/sim_gc_pgp.sh
I hit the error mismatched input '=' expecting <EOF>.
After many attempts I realized that the problem is the checkpoint path
"/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints/epoch=1.ckpt"
When I delete the '=' and rename the checkpoint file, the bash file runs correctly. The checkpoints are named epoch=<int> by default, so I suggest naming the checkpoint files epoch-<int> instead.
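As a stop-gap until the naming changes, the rename can be scripted (the directory is an example path; `${f//=/-}` is a bash-ism):

```shell
# Hydra parses '=' inside an override value as a new key=value pair, so a
# checkpoint file named epoch=1.ckpt breaks the command line. Copy each one
# to an '='-free name before passing it:
CKPT_DIR="/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints"
for f in "$CKPT_DIR"/epoch=*.ckpt; do
    [ -e "$f" ] || continue        # no matching files: do nothing
    cp "$f" "${f//=/-}"            # epoch=1.ckpt -> epoch-1.ckpt
done
```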
APPENDIX
error shell
SPLIT=val14_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT_PATH="/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints/epoch=1.ckpt"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=ml_planner \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
planner.ml_planner.model_config='\${model}' \
planner.ml_planner.checkpoint_path=$CHECKPOINT_PATH \
model=gc_pgp_model \
model.aggregator.pre_train=false \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
correct shell
SPLIT=val14_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT_PATH="/home/wjl/jyf/nuplandata/exp/exp/training_gc_pgp_model/training_gc_pgp_model/2023.12.12.09.53.38/checkpoints/epoch1.ckpt" #delete '='
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=ml_planner \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
planner.ml_planner.model_config='\${model}' \
planner.ml_planner.checkpoint_path=$CHECKPOINT_PATH \
model=gc_pgp_model \
model.aggregator.pre_train=false \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
Best wishes,
Yifan
Great work! It helped me a lot!
Could you tell me what the plan is for open-sourcing the code?
Thank you very much!
Hi Dear Authors,
Thank you for the excellent work! I am new to the field of planning and would really like to follow the path of your paper. However, the resources in my lab are quite limited, and the >1 TB of nuPlan data is really large.
Therefore, I am curious whether this data contains only trajectory data, or whether you have other suggestions for running your algorithms with some condensed trajectory data?
Best,
Ziqi
Dear Authors,
Thank you for the excellent work! I have set up the nuplan environment and installed tuplan_garage as a package. But I ran into a problem before starting to train GC-PGP: the key callbacks.multimodal_visualization_callback.pixel_size cannot be found, so I can't train the GC-PGP model. I then looked under /home/wjl/jyf/nuplan-devkit/nuplan/planning/training/callbacks, but the file multimodal_visualization_callback does not exist there. Do you have any suggestions for training the GC-PGP model?
The bash file is below:
BATCH_SIZE=32
SEED=0
NUPLAN_DEVKIT_ROOT=/home/wjl/jyf/nuplan-devkit # nuplan-devkit path
PRETRAIN_EPOCHS=20
PRETRAIN_LR=1e-4
TRAIN_EPOCHS=90
TRAIN_LR=1e-4
TRAIN_LR_MILESTONES=[40,50,55]
TRAIN_LR_DECAY=0.5
JOB_NAME=training_gc_pgp_model
CACHE_PATH=/home/wjl/jyf/tuplan_garage/nuplandata/exp/jyf/cache # cache path
USE_CACHE_WITHOUT_DATASET=False
ROUTE_FEATURE=FALSE
ROUTE_MASK=FALSE
HARD_MASK=FALSE
TRAFFIC_LIGHT=TRUE
echo "Starting Pre-Training with gt traversals as input for decoder"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_gc_pgp_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
scenario_filter.num_scenarios_per_type=4000 \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
callbacks.visualization_callback.pixel_size=0.25 \
callbacks.multimodal_visualization_callback.pixel_size=0.25 \
lightning.trainer.params.max_epochs=$PRETRAIN_EPOCHS \
lightning.trainer.params.max_time=null \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$PRETRAIN_LR \
lr_scheduler=multistep_lr \
lr_scheduler.milestones=$TRAIN_LR_MILESTONES \
lr_scheduler.gamma=$TRAIN_LR_DECAY \
model.encoder.use_red_light_feature=$TRAFFIC_LIGHT \
model.aggregator.use_route_mask=$ROUTE_MASK \
model.aggregator.hard_masking=$HARD_MASK \
model.aggregator.pre_train=true \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
echo "Starting Training with aggregator traversals as input for decoder"
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_training.py \
seed=$SEED \
py_func=train \
+training=training_gc_pgp_model \
job_name=$JOB_NAME \
scenario_builder=nuplan \
scenario_filter.num_scenarios_per_type=4000 \
cache.cache_path=$CACHE_PATH \
cache.use_cache_without_dataset=$USE_CACHE_WITHOUT_DATASET \
callbacks.visualization_callback.pixel_size=0.25 \
callbacks.multimodal_visualization_callback.pixel_size=0.25 \
lightning.trainer.params.max_epochs=$TRAIN_EPOCHS \
lightning.trainer.params.max_time=null \
lightning.trainer.checkpoint.resume_training=true \
data_loader.params.batch_size=$BATCH_SIZE \
optimizer.lr=$TRAIN_LR \
lr_scheduler=multistep_lr \
model.encoder.use_red_light_feature=$TRAFFIC_LIGHT \
model.aggregator.use_route_mask=$ROUTE_MASK \
model.aggregator.hard_masking=$HARD_MASK \
model.aggregator.pre_train=false \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.training, pkg://tuplan_garage.planning.script.experiments, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
output
Starting Pre-Training with gt traversals as input for decoder
Could not override 'callbacks.multimodal_visualization_callback.pixel_size'.
To append to your config use +callbacks.multimodal_visualization_callback.pixel_size=0.25
Key 'multimodal_visualization_callback' is not in struct
full_key: callbacks.multimodal_visualization_callback
object_type=dict
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Starting Training with aggregator traversals as input for decoder
Could not override 'callbacks.multimodal_visualization_callback.pixel_size'.
To append to your config use +callbacks.multimodal_visualization_callback.pixel_size=0.25
Key 'multimodal_visualization_callback' is not in struct
full_key: callbacks.multimodal_visualization_callback
object_type=dict
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Best wishes,
Yifan
While running
SPLIT=mini_test_split
CHALLENGE=closed_loop_reactive_agents # open_loop_boxes, closed_loop_nonreactive_agents, closed_loop_reactive_agents
CHECKPOINT=~/tuplan_garage/pdm_open_checkpoint.ckpt
taskset -c 0-30 python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
+simulation=$CHALLENGE \
planner=pdm_open_planner \
planner.pdm_open_planner.checkpoint_path=$CHECKPOINT \
scenario_filter=$SPLIT \
scenario_builder=nuplan \
hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]" \
on a cluster, I've run into the following issue:
(wrapped_fn pid=3822105) Before assert: _drivable_area_map = <tuplan_garage.planning.simulation.planner.pdm_planner.observation.pdm_occupancy_map.PDMOccupancyMap object at 0x7f996e5255b0> [repeated 27x across cluster]
(wrapped_fn pid=3822105) Drivable area map has 'intersects' method. [repeated 21x across cluster]
(wrapped_fn pid=3822105) Test intersecting lanes at (0,0): [] [repeated 21x across cluster]
(wrapped_fn pid=3822105) Intersecting lanes at ego position [ 664863.43947442 3999616.06394007]: ['63253'] [repeated 18x across cluster]
(wrapped_fn pid=3822105) [] [repeated 5x across cluster]
(wrapped_fn pid=3822445) Intersecting lanes at ego position [ 664430.74693994 3998337.4457144 ]: ['63319']
Converting detections to smart agents: 0%| | 0/17 [00:00<?, ?it/s] [repeated 5x across cluster]
[repeated 7x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor:----------- Simulation failed: with the following trace: [repeated 5x across cluster]
(wrapped_fn pid=3822105) Traceback (most recent call last): [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/nuplan-devkit/nuplan/planning/simulation/runner/executor.py", line 27, in run_simulation [repeated 5x across cluster]
(wrapped_fn pid=3822105) return sim_runner.run() [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/nuplan-devkit/nuplan/planning/simulation/runner/simulations_runner.py", line 113, in run [repeated 5x across cluster]
(wrapped_fn pid=3822105) trajectory = self.planner.compute_trajectory(planner_input) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/nuplan-devkit/nuplan/planning/simulation/planner/abstract_planner.py", line 105, in compute_trajectory [repeated 10x across cluster]
(wrapped_fn pid=3822105) raise e [repeated 5x across cluster]
(wrapped_fn pid=3822105) trajectory = self.compute_planner_trajectory(current_input) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/tuplan_garage/tuplan_garage/planning/simulation/planner/pdm_planner/pdm_open_planner.py", line 109, in compute_planner_trajectory [repeated 5x across cluster]
(wrapped_fn pid=3822105) current_lane = self._get_starting_lane(ego_state) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/tuplan_garage/tuplan_garage/planning/simulation/planner/pdm_planner/abstract_pdm_planner.py", line 123, in _get_starting_lane [repeated 5x across cluster]
(wrapped_fn pid=3822105) on_route_lanes, heading_error = self._get_intersecting_lanes(ego_state) [repeated 5x across cluster]
(wrapped_fn pid=3822105) File "/home/tombo/tuplan_garage/tuplan_garage/planning/simulation/planner/pdm_planner/abstract_pdm_planner.py", line 158, in _get_intersecting_lanes [repeated 5x across cluster]
(wrapped_fn pid=3822105) assert ( [repeated 5x across cluster]
(wrapped_fn pid=3822105) AssertionError: AbstractPDMPlanner: Drivable area map must be initialized first! [repeated 5x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor:Simulation failed with error: [repeated 5x across cluster]
(wrapped_fn pid=3822105) AbstractPDMPlanner: Drivable area map must be initialized first! [repeated 5x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor: [repeated 5x across cluster]
(wrapped_fn pid=3822105) Failed simulation [log,token]: [repeated 5x across cluster]
(wrapped_fn pid=3822105) [repeated 13x across cluster]
(wrapped_fn pid=3822105) WARNING:nuplan.planning.simulation.runner.executor:----------- Simulation failed! [repeated 5x across cluster]
Hi,
I noticed that inside the planner (get_closed_loop_trajectory) we have two types of trajectories:
proposals_array, which do not start at the ego's current state and are later extended from 4 s to 8 s (41 poses -> 81 poses, as 10 Hz is used as the sampling frequency);
simulated_proposals_array, which are the same trajectories as proposals_array but start at the current ego pose and are simulated using the bicycle model and LQR tracker; these are used for scoring (and also have 81 poses).
What I am wondering is: why return a trajectory that does not start at the ego's current pose (the extension of the best trajectory in proposals_array) instead of one that starts at the ego position, as in simulated_proposals? Is that done on purpose, as a trick for the simulation?
Thanks in advance :)
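For anyone else counting: the 41 -> 81 pose numbers follow directly from sampling each horizon at 10 Hz with both endpoints included (a quick check, not repo code):

```python
import numpy as np

def n_poses(horizon_s: float, hz: float = 10.0) -> int:
    # t = 0, 1/hz, 2/hz, ..., horizon_s  ->  horizon_s * hz + 1 samples
    return int(round(horizon_s * hz)) + 1

times_4s = np.linspace(0.0, 4.0, n_poses(4.0))  # proposal/scoring horizon
times_8s = np.linspace(0.0, 8.0, n_poses(8.0))  # extended horizon returned to nuPlan
assert len(times_4s) == 41 and len(times_8s) == 81
```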
Dear Authors,
I tried to run the simulation and found it to be very slow, with low CPU usage. I estimate it would take 30 days to run val14, so I'm asking whether there is a way to speed it up.
Device hardware:
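For reference, nuPlan's simulation throughput is mostly governed by the worker config. One thing worth trying is pinning the Ray worker's thread count explicitly and timing a handful of scenarios first (the option names below come from the nuplan-devkit worker configs; please verify them against your devkit version, as they may differ):

```shell
# Example only: Ray worker with an explicit per-node thread count.
python $NUPLAN_DEVKIT_ROOT/nuplan/planning/script/run_simulation.py \
    +simulation=closed_loop_nonreactive_agents \
    planner=pdm_closed_planner \
    scenario_filter=val14_split \
    scenario_builder=nuplan \
    worker=ray_distributed \
    worker.threads_per_node=32 \
    hydra.searchpath="[pkg://tuplan_garage.planning.script.config.common, pkg://tuplan_garage.planning.script.config.simulation, pkg://nuplan.planning.script.config.common, pkg://nuplan.planning.script.experiments]"
```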