tonyzhaozh / act Goto Github PK

License: MIT License

Python 100.00%

act's Issues

Suggestion: Increase sample density per epoch to lower validation loss variance

After evaluating a few different checkpoints during training, I found that the checkpoint with the best validation loss didn't necessarily produce the best result when rolled out on the robot. Sometimes the last checkpoint is better, and sometimes an earlier checkpoint with a higher validation loss is better.

I highly suspect this is due to the sampling strategy used.

https://github.com/tonyzhaozh/act/blob/dfe6c7f5ff13ecb4a9dec887f000c0e5d8afba72/utils.py#L35C4-L35C4

https://github.com/tonyzhaozh/act/blob/dfe6c7f5ff13ecb4a9dec887f000c0e5d8afba72/utils.py#L20C1-L21C37

From these two lines, it looks like an "item" in the dataset is a single timestep sampled uniformly at random from a trajectory.

I think this means that the validation set in each epoch consists of, say, 10 randomly sampled timesteps, one from each episode in the validation set.

I think this is why the validation loss variance is fairly large during training.

When running 5000 epochs, it's quite possible to draw a "validation set" that achieves a low loss by "luck of the draw" (literally).
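The effect described above can be illustrated with a small simulation. This is only a sketch on synthetic data: the per-timestep losses are random numbers, and the helper names (`epoch_val_loss`, `val_loss_spread`) are made up for this example, not taken from the repository.

```python
import random
import statistics

def epoch_val_loss(per_episode_losses, samples_per_episode, rng):
    """Estimate one epoch's validation loss from random timesteps per episode."""
    draws = [rng.choice(losses)
             for losses in per_episode_losses
             for _ in range(samples_per_episode)]
    return statistics.fmean(draws)

def val_loss_spread(per_episode_losses, samples_per_episode, num_epochs=2000, seed=0):
    """Standard deviation of the per-epoch estimate across many simulated epochs."""
    rng = random.Random(seed)
    estimates = [epoch_val_loss(per_episode_losses, samples_per_episode, rng)
                 for _ in range(num_epochs)]
    return statistics.pstdev(estimates)

# Synthetic stand-in for per-timestep losses: 10 validation episodes, 400 timesteps each.
data_rng = random.Random(42)
synthetic_losses = [[data_rng.random() for _ in range(400)] for _ in range(10)]

spread_1 = val_loss_spread(synthetic_losses, samples_per_episode=1)
spread_8 = val_loss_spread(synthetic_losses, samples_per_episode=8)
print(f"std with 1 sample/episode: {spread_1:.4f}")
print(f"std with 8 samples/episode: {spread_8:.4f}")
```

With independent draws, the spread of the estimate should shrink roughly with the square root of the number of samples, which is consistent with the noisy validation curve described here.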

To deal with this, I made a simple change in my clone: I added a samples_per_epoch parameter that draws n samples from each episode per epoch (with replacement).

import torch

class EpisodicDataset(torch.utils.data.Dataset):
    def __init__(self, episode_ids, dataset_dir, camera_names, norm_stats, samples_per_epoch=1):
        super().__init__()
        self.episode_ids = episode_ids
        self.dataset_dir = dataset_dir
        self.camera_names = camera_names
        self.norm_stats = norm_stats
        self.is_sim = None
        self.samples_per_epoch = samples_per_epoch
        self.__getitem__(0) # initialize self.is_sim

    def __len__(self):
        # each episode is visited samples_per_epoch times per epoch
        return len(self.episode_ids) * self.samples_per_epoch

    def __getitem__(self, index):
        # map the enlarged index range back onto the available episodes
        index = index % len(self.episode_ids)
        # (the rest of __getitem__ is not shown in this issue)

Then divide the number of training epochs by samples_per_epoch.

In my most recent run I used 8 samples per epoch for 625 epochs of training (100 episodes in that dataset).

I saw a big drop in the validation variance. Also, if I train longer, I do appear to get better-quality rollouts for lower validation losses.

If training on 50 episodes, I might try 10/500 or 16/300.

The best part is that you can restore the original behavior just by setting samples_per_epoch = 1.
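The epoch arithmetic in this suggestion can be sketched as follows. The function names are hypothetical helpers for illustration; the 5000-epoch baseline is inferred from the 8 x 625 figure quoted above.

```python
def epochs_for_budget(original_epochs, samples_per_epoch):
    """Scale the epoch count down so the total number of sampled items stays about the same."""
    return max(1, round(original_epochs / samples_per_epoch))

def items_seen(num_episodes, samples_per_epoch, num_epochs):
    """Total dataset items drawn over a whole training run."""
    return num_episodes * samples_per_epoch * num_epochs

baseline = items_seen(num_episodes=100, samples_per_epoch=1, num_epochs=5000)
denser = items_seen(num_episodes=100, samples_per_epoch=8,
                    num_epochs=epochs_for_budget(5000, 8))
print(epochs_for_budget(5000, 8), baseline == denser)
```

The total number of sampled items is unchanged; only the number of validation passes, and the noise in each of them, goes down.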

How to Get Human Demonstration Data in Mujoco

Hello Tony,
I have finished reproducing the results on your open-source dataset in a virtual environment, and I want to collect some demonstration data in the Mujoco simulation environment to train the robot on other tasks. However, your open-source project has no code for this, and the official Mujoco documentation does not mention how to teleoperate the robotic arm (with a keyboard or a remote controller), so I'm not sure how to collect human data in Mujoco. Can you provide some help?
Thanks
Jessy

discrepancy in results

The README states a 50% success rate for the "insertion" task, but after multiple runs my results only reach 22-26%. Is there a particular reason for that, or do any parameters in the code need to be changed for the insertion task?

ModuleNotFoundError: No module named 'util'

python3 imitate_episodes.py
--task_name sim_transfer_cube_scripted
--ckpt_dir
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200
--num_epochs 2000 --lr 1e-5
--seed 0

Running the above command produces this error.

CVAE and DETR

Why not use DETR directly for network training, instead of using a CVAE for joint-state-to-joint-state reconstruction? Would the results differ greatly if DETR were used directly?

[closed]

AttributeError: 'RealEnv' object has no attribute '_physics'

Hi,

Thanks for the excellent work!

I collected some demonstrations with ALOHA (real platform) and ran the act/imitate_episodes.py program. When I evaluated the trained checkpoint with the args --eval --onscreen_render, I got the error AttributeError: 'RealEnv' object has no attribute '_physics'.

Can you give me some suggestions about how to fix it?

Subject: Potential Overfitting Issue and Loss Representation in Inference

Hello everyone,

I am currently working on a project where I encountered some issues during training and inference. Specifically, I have noticed that while the training loss decreases, the success rate during inference first increases and then decreases. This has led me to question whether overfitting might be occurring in my model.

Here are my main concerns:

1. Could this pattern, where the inference success rate first improves and then declines despite a continuously decreasing training loss, indicate overfitting?
2. Can the loss effectively represent the performance of the model during inference, or are there other metrics that might be more indicative?

I would appreciate any insights or suggestions on how to address these issues.

Thank you!

DETR VAE uses output of first layer of transformer decoder?

act/detr/models/detr_vae.py

Line 131 in 694c606

 hs = self.transformer(src, None, self.query_embed.weight, pos, latent_input, proprio_input, self.additional_pos_embed.weight)[0] 

and

act/detr/models/detr_vae.py

Line 136 in 694c606

 hs = self.transformer(transformer_input, None, self.query_embed.weight, self.pos.weight)[0] 

you index the output of the transformer with [0]. Does this not take the output of the first layer of the transformer decoder, instead of the last layer? And is this behaviour expected?

From joint position to end-effector position?

Hi Tony,

Thank you for your great work!

I notice that all states/actions of the arm are expressed as qpos and qvel. If I want the spatial position and velocity of the end-effector, what should I do? Is it possible to compute them from the recorded qpos & qvel? I see the $M$ and $slist$ matrices on the Trossen Robotics website, but I have no idea how to use them.

Best,
Dongjie
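One possible reading of the question, sketched under the assumption that $M$ (the home pose of the end-effector) and $slist$ (the space-frame screw axes) follow the Modern Robotics product-of-exponentials convention, is $T(\theta) = e^{[S_1]\theta_1} \cdots e^{[S_n]\theta_n} M$. The code below is a minimal pure-Python illustration; the two-joint planar arm is a made-up toy, not the real robot's parameters, and `screw_exp` and `fk_space` are names chosen for this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def screw_exp(screw, theta):
    """4x4 transform exp([S] * theta) for a screw axis S = (wx, wy, wz, vx, vy, vz).

    Assumes the angular part is a unit vector (revolute) or zero (prismatic).
    """
    wx, wy, wz, vx, vy, vz = screw
    W = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]  # skew-symmetric [w]
    W2 = matmul(W, W)
    s, c = math.sin(theta), math.cos(theta)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rodrigues' formula: R = I + sin(theta) [w] + (1 - cos(theta)) [w]^2
    R = [[eye[i][j] + s * W[i][j] + (1 - c) * W2[i][j] for j in range(3)]
         for i in range(3)]
    # p = (I*theta + (1 - cos(theta)) [w] + (theta - sin(theta)) [w]^2) v
    G = [[eye[i][j] * theta + (1 - c) * W[i][j] + (theta - s) * W2[i][j]
          for j in range(3)] for i in range(3)]
    v = (vx, vy, vz)
    p = [sum(G[i][k] * v[k] for k in range(3)) for i in range(3)]
    return [R[0] + [p[0]], R[1] + [p[1]], R[2] + [p[2]], [0.0, 0.0, 0.0, 1.0]]

def fk_space(M, screw_axes, joint_angles):
    """End-effector pose in the space frame: exp([S1] t1) ... exp([Sn] tn) M."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for screw, theta in zip(screw_axes, joint_angles):
        T = matmul(T, screw_exp(screw, theta))
    return matmul(T, M)

# Toy planar arm (hypothetical): two unit-length links, both joints rotate about z.
M_toy = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
axes_toy = [(0.0, 0.0, 1.0, 0.0, 0.0, 0.0),    # joint 1 at the origin
            (0.0, 0.0, 1.0, 0.0, -1.0, 0.0)]   # joint 2 at x = 1, v = -w x q
pose = fk_space(M_toy, axes_toy, [0.0, math.pi / 2])
position = [row[3] for row in pose[:3]]
print([round(x, 6) for x in position])
```

If slist is stored as a 6 x n matrix, each column would be one screw axis. The arm entries of a recorded qpos would be passed as joint_angles; gripper entries would have to be excluded. End-effector velocity would additionally need the Jacobian, which this sketch does not cover.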

pretrained model

Are you going to release the pretrained model in the future?

How to train on the real dataset?

Hi,
I found the publicly shared datasets, but I could not train on them.

(aloha) ubuntu@aloha:~/act$  python3 imitate_episodes.py --task_name aloha_fork_pick_up_compressed --ckpt_dir trainings_aloha_fork_pick_up_compressed --policy_class ACT --kl_weight 10 --chunk_size 10 --hidden_dim 128 --batch_size 8 --dim_feedforward 3200 --num_epochs 2000 --lr 1e-5 --seed 0

Data from: /home/ubuntu/dataset_dir/aloha_fork_pick_up_compressed

Traceback (most recent call last):
  File "imitate_episodes.py", line 438, in <module>
    main(vars(parser.parse_args()))
  File "imitate_episodes.py", line 106, in main
    train_dataloader, val_dataloader, stats, _ = load_data(dataset_dir, num_episodes, camera_names, batch_size_train, batch_size_val)
  File "/home/ubuntu/ros1_aloha_ws/src/act/utils.py", line 123, in load_data
    train_dataset = EpisodicDataset(train_indices, dataset_dir, camera_names, norm_stats)
  File "/home/ubuntu/ros1_aloha_ws/src/act/utils.py", line 18, in __init__
    self.__getitem__(0) # initialize self.is_sim
  File "/home/ubuntu/ros1_aloha_ws/src/act/utils.py", line 60, in __getitem__
    all_cam_images = np.stack(all_cam_images, axis=0)
  File "<__array_function__ internals>", line 200, in stack
  File "/home/ubuntu/anaconda3/envs/aloha1/lib/python3.8/site-packages/numpy/core/shape_base.py", line 464, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

I also edited the constants.py file in aloha/aloha_scripts:

TASK_CONFIGS = {    
    'aloha_fork_pick_up_compressed': {
        'dataset_dir': DATA_DIR + '/aloha_fork_pick_up_compressed',
        'num_episodes': 50,
        'episode_len': 1000,
        'camera_names': ['cam_high', 'cam_low', 'cam_left_wrist', 'cam_right_wrist']
    },    
}

Is my configuration wrong, or is the problem somewhere else?
Thanks for your reply.
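The ValueError in the traceback above is raised when the per-camera arrays passed to np.stack do not all have the same shape. A small diagnostic sketch for narrowing this down is shown below. The helper name and the example shapes are hypothetical; in practice the shapes would be read from one timestep of each camera in an episode file.

```python
def find_shape_mismatches(shapes_by_camera):
    """Return the cameras whose image shape differs from the first camera's shape."""
    reference = next(iter(shapes_by_camera.values()))
    return {name: shape
            for name, shape in shapes_by_camera.items()
            if shape != reference}

# Hypothetical shapes for illustration only, not read from the actual dataset.
example_shapes = {
    "cam_high": (480, 640, 3),
    "cam_low": (480, 640, 3),
    "cam_left_wrist": (480, 640, 3),
    "cam_right_wrist": (240, 320, 3),
}
mismatches = find_shape_mismatches(example_shapes)
print(mismatches)
```

Any camera reported here would be the one to inspect first, since stacking only works once all cameras yield arrays of identical shape.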

Policy roll-out in real-world results in the same action prediction at every timestep

I am currently using the ACT model, similar to ALOHA, for learning a single-arm policy on the UR5e robot. I am seeing the same action prediction at every timestep, and it is an action that was never seen in any trajectories in the demonstrations. Besides obvious differences in the hardware, what model/training specific problems could I be facing based on this roll-out? Is it potentially related to the "style variable" (z)? I have provided my training results in screenshots below. Here are the hyperparameters used:

python3 imitate_episodes.py \
--task_name rotate_bottle \
--ckpt_dir ./policy/ \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 5000 \
--lr 1e-5 \
--seed 0

Why use the leader robots' joint positions as supervision rather than the followers'?

In your paper, you mention:

We record the joint positions of the leader robots (i.e. input from the human operator) and use them as actions. It is important to use the leader joint positions instead of the follower's, because the amount of force applied is implicitly defined by the difference between them, through the low-level PID controller.

How should I understand this? What happens if I use qpos (the follower's joint positions) instead?
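One way to read the quoted passage is sketched below, assuming a pure proportional controller for simplicity (the real low-level controller is a PID loop, and its gains are not given here). The gain and positions are made-up numbers, and `controller_effort` is a hypothetical helper.

```python
def controller_effort(target_pos, measured_pos, kp):
    """Effort commanded by a proportional controller: gain times position error."""
    return kp * (target_pos - measured_pos)

KP = 4.0                 # hypothetical gain
leader_gripper = 0.0     # operator squeezes the leader gripper fully closed
follower_gripper = 0.25  # follower gripper is stopped by the grasped object

# Action = leader position: a nonzero error remains, so the follower keeps squeezing.
effort_from_leader = controller_effort(leader_gripper, follower_gripper, KP)

# Action = recorded follower position: the error is zero, so no grasp force is applied.
effort_from_follower = controller_effort(follower_gripper, follower_gripper, KP)

print(effort_from_leader, effort_from_follower)
```

Under this simplification, a policy trained on the follower's own positions would learn targets that produce no error when replayed, and therefore no force on the object.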

ALOHA2 simulation support

Hello Tony and the team,

First of all, thank you for the fascinating work. ALOHA and the ACT framework have given me so much inspiration.
I was wondering if there is a codebase for using ALOHA2 as the simulation environment to use with ACT.

Thank you

Potential bug in the transformer.py and detr_vae.py?

Hi, thanks for the amazing work! Recently, I tried to reimplement your code. However, I found something strange.

Specifically, the forward method of the Transformer class in transformer.py only returns hs, which is a tensor with shape [Decoder block num, B, num_query, C]:

act/detr/models/transformer.py

Line 77 in 742c753

return hs

However, in detr_vae.py, the action_head uses only hs[0] as input. This means the action is predicted based only on the first decoder feature with shape [1, B, num_query, C], and ignores all of the following features:

act/detr/models/detr_vae.py

Line 131 in 742c753

 hs = self.transformer(src, None, self.query_embed.weight, pos, latent_input, proprio_input, self.additional_pos_embed.weight)[0] 

act/detr/models/detr_vae.py

Line 136 in 742c753

 hs = self.transformer(transformer_input, None, self.query_embed.weight, self.pos.weight)[0] 

So this seems strange to me, and I wonder if it is a bug. Hoping for your reply! Thanks in advance~
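Which layer hs[0] refers to depends on what the decoder returns. The toy below assumes the DETR-style pattern, where the decoder either stacks every layer's output along a leading axis or wraps only the final output in a length-1 leading axis. It uses plain Python numbers instead of tensors, and the layer count is arbitrary.

```python
def toy_decoder(x, num_layers, return_intermediate):
    """Each toy 'layer' adds 1, so the value records how many layers were applied."""
    intermediate = []
    out = x
    for _ in range(num_layers):
        out = out + 1
        intermediate.append(out)
    if return_intermediate:
        return intermediate   # leading axis of size num_layers
    return [out]              # leading axis of size 1: final layer only

hs_all = toy_decoder(0, num_layers=7, return_intermediate=True)
hs_last_only = toy_decoder(0, num_layers=7, return_intermediate=False)

print(hs_all[0], hs_all[-1], hs_last_only[0])
```

With intermediate outputs stacked, index 0 is the first layer and index -1 is the last. With only the final output returned, index 0 is the last layer. Checking which mode the transformer is built with would settle whether [0] matches the intent.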

Asking for best_ckpt

Hi, I followed your instructions to create 50 simulated examples for the insertion task and used the default settings to train the model. However, the evaluation performance was bad: after training for 700 epochs, the eval success rate was only 10%, whereas you mentioned it should be around 50% for this task.
Could you please upload your best_ckpt for this task or the training details to achieve such a success rate?
Thank you for your sharing.

Update: After training for 2000 epochs, it achieved 50% success rate. Although the losses don't show much difference, I guess that's crucial to this task. Sorry for any inconvenience.

From where to get checkpoints please?

Hi,

From where to get checkpoints please? --ckpt_dir

Missing packages to run?

I followed the installation procedure and installed the listed pip packages in the conda environment, but I still get missing-package errors when I run python3 imitate_episodes.py

(aloha) Rams-MBP:act-plus-plus ramkumarkoppu$ python3 imitate_episodes.py
ROBOMIMIC WARNING(
No private macro file found!
It is recommended to use a private macro file
To setup, run: python /Users/ramkumarkoppu/miniforge3/envs/aloha/lib/python3.8/site-packages/robomimic/scripts/setup_macros.py
)
Traceback (most recent call last):
  File "imitate_episodes.py", line 20, in <module>
    from policy import ACTPolicy, CNNMLPPolicy, DiffusionPolicy
  File "/Users/ramkumarkoppu/GIT/GitHub/Robotic_projs/act-plus-plus/policy.py", line 12, in <module>
    from robomimic.algo.diffusion_policy import replace_bn_with_gn, ConditionalUnet1D
ModuleNotFoundError: No module named 'robomimic.algo.diffusion_policy'
(aloha) Rams-MBP:act-plus-plus ramkumarkoppu$

which coordinate system are the actions involved in training?

Hi, thanks for the great paper.

I have a question:
In which coordinate system are the "actions" (rotation and position) used in training expressed? Is it the camera coordinate system or the robot arm base coordinate system? If it is the arm base frame and there are two robotic arms, which one is used as the reference? Similarly, if it is a camera coordinate system, which camera is used?

issue of scripted_policy

Thanks for your work.

Recently, I have been trying to modify the generate_trajectory function of the PickAndTransferPolicy class in scripted_policy.py, and to increase the number of timesteps together with the task's "episode_len" in constants.py. For example, I added waypoints to the trajectory and increased the total number of timesteps to 800.

But in the simulation, the robot and the environment are always reset to the initial state at timestep 400, and ts.reward becomes None. I don't know why. Does any other file need to be modified?

By the way, if the new total number of timesteps is smaller, like 300, the simulation runs normally.

Here is what I modified:

In the scripted_policy.py:

class PickAndTransferPolicy(BasePolicy):
     def generate_trajectory(self, ts_first):
          ......
          ......
          # a list of waypoint dicts (a set of dicts would not be valid Python)
          self.left_trajectory = [
          {"t": 0, "xyz".........},
          {"t": 175, "xyz".........},
          {"t": 211, "xyz".........},
          {"t": 285, "xyz".........},
          {"t": 295, "xyz".........},
          {"t": 325, "xyz".........},
          {"t": 350, "xyz".........},
          {"t": 400, "xyz".........},
          {"t": 470, "xyz".........},
          {"t": 530, "xyz".........},
          {"t": 600, "xyz".........},
          {"t": 720, "xyz".........},
          {"t": 800, "xyz".........},
          ]

          # a list of waypoint dicts (a set of dicts would not be valid Python)
          self.right_trajectory = [
          {"t": 0, "xyz".........},
          {"t": 175, "xyz".........},
          {"t": 211, "xyz".........},
          {"t": 285, "xyz".........},
          {"t": 295, "xyz".........},
          {"t": 325, "xyz".........},
          {"t": 350, "xyz".........},
          {"t": 400, "xyz".........},
          {"t": 470, "xyz".........},
          {"t": 530, "xyz".........},
          {"t": 600, "xyz".........},
          {"t": 720, "xyz".........},
          {"t": 800, "xyz".........},
          ]

And in constants.py:

SIM_TASK_CONFIGS = {
    'task_name':{
    'dataset_dir': ...,
    'num_episodes':50,
    'episode_len':800,
    'camera_names': 'cam',
    }
}
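One thing that may be worth checking, offered as an assumption rather than a diagnosis: if the simulation is a dm_control environment created with a time limit, the episode ends once the simulated time reaches that limit, and the next step call resets the environment and returns a first timestep whose reward is None. The arithmetic is sketched below; the 8-second limit and 0.02-second control timestep are hypothetical values chosen only because they give 400 steps, not values read from the repository.

```python
def max_steps(time_limit_s, control_timestep_s):
    """Number of control steps that fit before the environment's time limit ends the episode."""
    return round(time_limit_s / control_timestep_s)

def required_time_limit(num_steps, control_timestep_s):
    """Smallest time limit (in seconds) that allows num_steps control steps."""
    return num_steps * control_timestep_s

# Hypothetical numbers for illustration only.
steps_allowed = max_steps(time_limit_s=8.0, control_timestep_s=0.02)
limit_needed = required_time_limit(num_steps=800, control_timestep_s=0.02)
print(steps_allowed, limit_needed)
```

Under these assumed numbers, an 800-step trajectory would need the environment's time limit raised as well as episode_len, and a 300-step trajectory would finish before the limit, which would match the behavior described.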

Extremely poor validation results on self-trained checkpoints

Dear Tony.
Thank you for your excellent work on ALOHA. I have tried to reproduce your work in the Mujoco simulation environment. According to your open-source data, the success rate should be around 90% for transfer cube and around 50% for insertion.
I trained and validated on your open-source dataset, and I get around 54% for transfer cube and around 14% for insertion. These results are much worse than expected, even though I have not changed any of the parameter settings. I would like to know what could cause this, or whether I need to tune anything. The exact numbers can be viewed in the spreadsheet linked below.

Here are my training parameter settings

python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir ~/data/aloha/act/sim_transfer_cube_scripted/ckpt/ \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 2000  --lr 1e-5 \
--seed 0

Here are my eval parameter settings

python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir ~/data/aloha/act/sim_transfer_cube_scripted/ckpt/ \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 2000  --lr 1e-5 \
--seed 0 \
--eval

The results obtained from training are available at the following link (a Lark document; you need to register, and please add a note when requesting view permission):
https://iklxo6z9yv.feishu.cn/sheets/LRwosV4jnh7xokt8F8UcMT7snzh?from=from_copylink

ModuleNotFoundError: No module named 'robomimic.algo.diffusion_policy'

Hi, thank you for the great work.

When I run python3 imitate_episodes.py, I get the following error:

ModuleNotFoundError: No module named 'robomimic.algo.diffusion_policy'

How can I fix it?

Why are encoder heads so important when training?

I found that the presence or absence of an encoder during training has a great impact on the final result. However, the reason is not detailed in the paper. Can you provide some information for me to learn from?

Simulation issues

The data generated by "record_sim_episodes.py" mostly fails. In the project, there is "sim_env.py", which can start the simulation environment. However, even after configuring Interbotix, the following errors still occur. How can I generate data through the simulation environment and test the program without hardware?

python sim_env.py
Timeout exceeded while waiting for service /master_left/set_operating_modes
The robot 'master_left' is not discoverable. Did you enter the correct robot_name parameter? Is the xs_sdk node running? Quitting
