cheerss / crossformer

The official code for the paper: https://openreview.net/forum?id=_PHymLIxuI

License: MIT License

Python 99.48% Shell 0.52%
classification deep-learning instance-segmentation object-detection pytorch semantic-segmentation vision-transformer

crossformer's People

Contributors: cheerss, clarissayl, cwdghh

crossformer's Issues

Questions about the design of LSDA and CEL

Hello, thank you very much for your work. I have a few small questions:

  1. It seems that LDA and SDA are used alternately. Would adjusting the ratio or order of S and L affect the results, e.g., SLS or SSLLL within one stage?
  2. It looks like G is always 7. In the pyramid structure, how would the results change if G were gradually reduced to 7, 5, 3, 1 (vanilla attention)?
  3. Placing a CEL at the very beginning can be understood as extracting multi-scale information. However, as the network deepens, the spatial meaning of the H*W dimension becomes increasingly diluted, so "multi-scale information" extracted there can hardly be interpreted as multi-scale spatial information anymore. Would removing the later [2, 4] CELs have a significant impact?
  4. Setting aside code cleanliness, isn't kernel=32 too large? Replacing it with a stack of kernel=3 convolutions should not hurt, right? (See the sketch below.)
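To make question 4 concrete, here is a toy sketch (my own code, not the repo's) of a cross-scale embedding layer in the spirit of the paper: parallel convolutions with different kernel sizes but a shared stride, concatenated channel-wise. The channel split per kernel is illustrative, not the repo's exact configuration.

import torch
import torch.nn as nn

class ToyCEL(nn.Module):
    """Toy cross-scale embedding layer: parallel convs with different
    kernel sizes but the same stride; outputs are concatenated along
    the channel dimension. Channel splits here are illustrative."""
    def __init__(self, in_ch=3, dims=(48, 24, 12, 12),
                 kernels=(4, 8, 16, 32), stride=4):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, d, k, stride=stride, padding=(k - stride) // 2)
            for d, k in zip(dims, kernels)
        )

    def forward(self, x):
        # Every branch produces the same spatial size, so concat works.
        return torch.cat([conv(x) for conv in self.convs], dim=1)

x = torch.randn(1, 3, 224, 224)
print(ToyCEL()(x).shape)  # torch.Size([1, 96, 56, 56])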

Thank you very much.

Evaluation question

I would like to know whether the accuracy values printed after each epoch during training are the results obtained by testing on the validation set.

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 1.0 and 2.0 branches for each repo:

Repo             | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch
MMEngine         | -                    | 0.x
MMCV             | 1.x                  | 2.x
MMDetection      | 0.x, 1.x, 2.x        | 3.x
MMAction2        | 0.x                  | 1.x
MMClassification | 0.x                  | 1.x
MMSegmentation   | 0.x                  | 1.x
MMDetection3D    | 0.x                  | 1.x
MMEditing        | 0.x                  | 1.x
MMPose           | 0.x                  | 1.x
MMDeploy         | 0.x                  | 1.x
MMTracking       | 0.x                  | 1.x
MMOCR            | 0.x                  | 1.x
MMRazor          | 0.x                  | 1.x
MMSelfSup        | 0.x                  | 1.x
MMRotate         | 1.x                  | 1.x
MMYOLO           | -                    | 0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Question about the pseudo code of the LSDA.

Thanks for your excellent work!

I have a question about the pseudo code of LSDA, which is implemented in only a few lines using nothing but reshape and permute operations:

if type == "SDA":
    x = x.reshape(H // G, G, W // G, G, D).permute(0, 2, 1, 3, 4)
elif type == "LDA":
    x = x.reshape(G, H // G, G, W // G, D).permute(1, 3, 0, 2, 4)

Although the two branches clearly differ in how they reshape, I still wonder about the reason for this particular design. Can you explain, from another perspective, why these two reshape patterns correspond to the two different attention implementations (long vs. short distance)?
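To make the question concrete, here is a toy sketch (my own, not from the repo) that labels each token of a 4x4 map with its index and prints the resulting groups. It shows that the SDA reshape gathers adjacent G x G windows, while the LDA reshape gathers tokens spaced a fixed interval I = H // G apart:

import torch

H = W = 4   # toy feature-map size
G = 2       # group size
D = 1       # embedding dim, kept at 1 so token ids stay readable

# Give every spatial position a unique id so the grouping is visible.
x = torch.arange(H * W, dtype=torch.float).reshape(H, W, D)

# SDA: each group is an adjacent G x G window.
sda = x.reshape(H // G, G, W // G, G, D).permute(0, 2, 1, 3, 4)
print(sda.reshape(-1, G * G))  # rows: [0,1,4,5], [2,3,6,7], ...

# LDA: each group samples tokens spaced I = H // G apart.
lda = x.reshape(G, H // G, G, W // G, D).permute(1, 3, 0, 2, 4)
print(lda.reshape(-1, G * G))  # rows: [0,2,8,10], [1,3,9,11], ...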

Thanks a lot!

Semantic FPN CrossFormer-S: the weight file seems not to have been uploaded completely

    main()
  File "./test.py", line 127, in main
    checkpoint = load_checkpoint(model, args.checkpoint, map_location='cpu')
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/mmcv/runner/checkpoint.py", line 522, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location, logger)
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/mmcv/runner/checkpoint.py", line 466, in _load_checkpoint
    return CheckpointLoader.load_checkpoint(filename, map_location, logger)
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/mmcv/runner/checkpoint.py", line 243, in load_checkpoint
    return checkpoint_loader(filename, map_location)
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/mmcv/runner/checkpoint.py", line 260, in load_from_local
    checkpoint = torch.load(filename, map_location=map_location)
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/torch/serialization.py", line 594, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/torch/serialization.py", line 853, in _load
    result = unpickler.load()
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/torch/serialization.py", line 845, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "/home/wangnan/anaconda3/envs/yolo-v5/lib/python3.6/site-packages/torch/serialization.py", line 833, in load_tensor
    storage = zip_file.get_storage_from_record(name, size, dtype).storage()
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading file data/2154620144: invalid header or archive is corrupted

question about LDA and SDA for irregular feature maps

Thank you very much for your careful reply about LDA and SDA, but I have another question about LDA and SDA for irregular feature maps.

In the paper, LDA and SDA are applied to regular input sizes such as 224x224 or 384x384, so the group size defaults to 7 and the interval I is set to (8, 4, 2, 1). For Stage 1, I = 8, because G x I must equal the feature-map width/height (56 = 7 x 8).

However, for an irregular feature-map size such as 80 x 134, the group size and interval can no longer be chosen as described in the paper. If the group size is 7, the feature map must be padded to 84x140 before it can be reshaped to [W_nG, G, H_nG, G]; SDA can then run normally (see the padding sketch below). But for the following LDA, how should the interval I be set? Moreover, since the feature map is not square, the interval differs between width and height. How can I choose the interval I reasonably?
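For reference, a minimal sketch of the padding step described above (my own code, assuming right/bottom zero padding):

import torch
import torch.nn.functional as F

def pad_to_multiple(x: torch.Tensor, G: int) -> torch.Tensor:
    """Zero-pad a (B, C, H, W) map on the right/bottom so that
    H and W become multiples of the group size G."""
    _, _, H, W = x.shape
    pad_h = (G - H % G) % G
    pad_w = (G - W % G) % G
    # F.pad pads the last two dims in (left, right, top, bottom) order.
    return F.pad(x, (0, pad_w, 0, pad_h))

x = torch.randn(1, 96, 80, 134)
print(pad_to_multiple(x, G=7).shape)  # torch.Size([1, 96, 84, 140])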

Can you give me some advice on this question? Thanks!

Some questions about your last paper

Dear author, I recently read your article 'Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework' and am particularly interested in it. Do you have a plan to open-source its code? Thank you for your answer.

Something wrong with the pre-trained model crossformer-b.pth

Hi, thanks for your great work. I am using crossformer_base as the backbone network for a downstream tracking task. But when I load your pre-trained model, load_state_dict reports a large number of unexpected key(s). My loading code is as follows:
ckpt = torch.load(ckpt_path, map_location='cpu')
missing_keys, unexpected_keys = backbone.body.load_state_dict(ckpt['model'], strict=False)

The result is as follows:
unexpected keys: ['norm.weight', 'norm.bias', 'head.weight', 'head.bias', 'layers.0.blocks.0.attn.biases', 'layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn.biases', .....
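One possible workaround (my own sketch, not an official fix): the 'norm.*'/'head.*' entries come from the classification head, and the 'attn.biases'/'relative_position_index' entries are position-bias buffers, so keeping only the keys the target backbone actually defines silences the messages. Whether the remaining weights line up with your module still needs checking. Reusing ckpt_path and backbone from the snippet above:

import torch

ckpt = torch.load(ckpt_path, map_location='cpu')
wanted = backbone.body.state_dict().keys()
# Keep only parameters/buffers that the target module actually defines.
filtered = {k: v for k, v in ckpt['model'].items() if k in wanted}
missing_keys, unexpected_keys = backbone.body.load_state_dict(filtered, strict=False)
print(f"dropped {len(ckpt['model']) - len(filtered)} unexpected entries")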

how to train segmentation on Windows 10

Dear author, I have a question: how do I train segmentation on Windows 10?
I used "python train.py configs/fpn_crossformer_s_ade20k_40k.py --cfg-options pretrained/backbone-corssformer-s.pth --work-dir output --launcher pytorch" but got the following error:

Traceback (most recent call last):
  File "train.py", line 152, in <module>
    main()
  File "train.py", line 65, in main
    args = parse_args()
  File "train.py", line 57, in parse_args
    args = parser.parse_args()
  File "C:\Python37\lib\argparse.py", line 1755, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "C:\Python37\lib\argparse.py", line 1787, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "C:\Python37\lib\argparse.py", line 1993, in _parse_known_args
    start_index = consume_optional(start_index)
  File "C:\Python37\lib\argparse.py", line 1933, in consume_optional
    take_action(action, args, option_string)
  File "C:\Python37\lib\argparse.py", line 1861, in take_action
    action(self, namespace, argument_values, option_string)
  File "C:\Python37\lib\site-packages\mmcv\utils\config.py", line 739, in __call__
    key, val = kv.split('=', maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
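Judging from the traceback, --cfg-options expects KEY=VALUE pairs (mmcv's config.py splits every item on '='), while the command above passes a bare checkpoint path. Something along the lines of --cfg-options model.pretrained=pretrained/backbone-corssformer-s.pth (the exact config key is a guess here, not verified against the repo's config) should at least get past this particular error.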

I also tried to use your shell script (dist_train.sh) directly, but got another error:

$ /bin/sh E:/project_c/crossformer-debug/segmentation/dist_train.sh
NOTE: Redirects are currently not supported in Windows or MacOs.
C:\Python37\lib\site-packages\torch\distributed\launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

FutureWarning,
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 564, in determine_local_world_size
    return int(nproc_per_node)
ValueError: invalid literal for int() with base 10: ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Python37\lib\site-packages\torch\distributed\launch.py", line 193, in <module>
    main()
  File "C:\Python37\lib\site-packages\torch\distributed\launch.py", line 189, in main
    launch(args)
  File "C:\Python37\lib\site-packages\torch\distributed\launch.py", line 174, in launch
    run(args)
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 709, in run
    config, cmd, cmd_args = config_from_args(args)
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 617, in config_from_args
    nproc_per_node = determine_local_world_size(args.nproc_per_node)
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 582, in determine_local_world_size
    raise ValueError(f"Unsupported nproc_per_node value: {nproc_per_node}")
ValueError: Unsupported nproc_per_node value:
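For what it's worth, the empty nproc_per_node value suggests the launcher's GPU-count variable expanded to an empty string, i.e., dist_train.sh was probably invoked without the arguments it expects (typically the config path and the number of GPUs). This is a guess from the traceback, not a verified fix.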

So could you give me some suggestions for a solution?
Thanks.

failed reading file data

Thank you for your contribution to science. I encountered the following issues during the reproduction process:
Traceback (most recent call last):
  File "/tmp/pycharm_project_864/tools/train.py", line 194, in <module>
    main()
  File "/tmp/pycharm_project_864/tools/train.py", line 183, in main
    train_detector(
  File "/tmp/pycharm_project_864/mmdet/apis/train.py", line 185, in train_detector
    runner.load_checkpoint(cfg.load_from)
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 349, in load_checkpoint
    return load_checkpoint(
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/checkpoint.py", line 627, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location, logger)
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/checkpoint.py", line 561, in _load_checkpoint
    return CheckpointLoader.load_checkpoint(filename, map_location, logger)
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/checkpoint.py", line 303, in load_checkpoint
    return checkpoint_loader(filename, map_location)  # type: ignore
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/checkpoint.py", line 323, in load_from_local
    checkpoint = torch.load(filename, map_location=map_location)
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/serialization.py", line 1172, in _load
    result = unpickler.load()
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/serialization.py", line 1112, in load_tensor
    storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: PytorchStreamReader failed reading file data/2237523104: invalid header or archive is corrupted
May I ask whether the uploaded detection weight file is corrupted?
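A quick way to tell whether the local download is truncated, rather than the upload being broken (my own suggestion): checkpoints saved by modern torch.save are zip archives, so a plain zip integrity check works:

import zipfile

# testzip() returns the first corrupted member, or None if intact.
# A truncated download typically raises zipfile.BadZipFile here instead.
with zipfile.ZipFile("path/to/checkpoint.pth") as zf:
    print(zf.testzip())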

Validation accuracy stays at 0.09% during training

Dear authors,

I'm interested in your paper and am training from scratch on ImageNet. However, the validation accuracy stays at "* Acc@1 0.090" throughout training, which is roughly chance level for 1,000 classes.

Do you have any idea why this happens? When I train Swin Transformer with the same setup, it works.

I use PyTorch 1.7.1 and 1.6.0, no mixed precision, 100 epochs:

--amp-opt-level O0 --output ./output --opts TRAIN.EPOCHS 100

Thanks,
Eddie

CrossFormer for small object detection

Hello, thank you for your work. I used CrossFormer for small object detection on my own dataset, and the results were very poor. Is there any way to improve the accuracy of small object detection? Thanks very much!

detection test question

I followed the detection configuration steps; why are the detection results poor after training?

Some questions about your paper and code

Hi, I'm very interested in your work on multi-scale attention in Transformers, but I have some questions:

  1. In Appendix 2 (DPB), why do the parameters i and j range from 0 to 2G-1 instead of 0 to G-1? Besides, the inputs of the DPB module are (1-G+i, 1-G+j). What is the reason for this setting? Why not just use i and j as inputs? (See the sketch at the end of this issue.)

  2. When debugging your code, I changed some parameters because I have only one 3090 with 24 GB of memory, like this:

parser = argparse.ArgumentParser('CrossFormer training and evaluation script', add_help=False)
parser.add_argument('--cfg', type=str, required=True, metavar="FILE",
                    default='/configs/small_patch4_group7_224.yaml', help='path to config file')
parser.add_argument(
    "--opts",
    help="Modify config options by adding 'KEY VALUE' pairs. ",
    default=None,
    nargs='+'
)
# easy config modification
parser.add_argument('--batch-size', type=int, default=32, help="batch size for single GPU")
parser.add_argument('--data-set', type=str, default='flower', help='dataset to use')
parser.add_argument('--data-path', type=str, help='path to dataset', default='/media/data2/huzhen/flower_data')
parser.add_argument('--zip', action='store_true', help='use zipped dataset instead of folder dataset')
parser.add_argument('--cache-mode', type=str, default='part', choices=['no', 'full', 'part'],
                    help='no: no cache, '
                         'full: cache all data, '
                         'part: sharding the dataset into nonoverlapping pieces and only cache one piece')
parser.add_argument('--resume', help='resume from checkpoint', default='')
parser.add_argument('--accumulation-steps', type=int, help="gradient accumulation steps")
parser.add_argument('--use-checkpoint', action='store_true',
                    help="whether to use gradient checkpointing to save memory")
parser.add_argument('--amp-opt-level', type=str, default='native', choices=['native', 'O0', 'O1', 'O2'],
                    help='mixed precision opt level, if O0, no amp is used')
parser.add_argument('--output', default='./Flower_weights', type=str, metavar='PATH',
                    help='root of output folder, the full path is <output>/<model_name>/<tag> (default: output)')
parser.add_argument('--tag', help='tag of experiment')
parser.add_argument('--eval', action='store_true', help='Perform evaluation only')
parser.add_argument('--throughput', action='store_true', help='Test throughput only')
parser.add_argument('--num_workers', type=int, default=8, help="")
parser.add_argument('--mlp_ratio', type=int, default=4, help="")
parser.add_argument('--warmup_epochs', type=int, default=20, help="#epoches for warm up")
parser.add_argument("--local_rank", type=int, required=True, default=0, help='local rank for DistributedDataParallel')
parser.add_argument('--device', default='cuda:2', help='device to use for training / testing')

args, unparsed = parser.parse_known_args()

But it reports an error: "An exception occurred: SystemExit 2".
The above is my parameter setting. Is there a problem?
I sincerely hope to receive your help!
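A hedged note on the SystemExit (my reading of the snippet, not an authors' reply): argparse exits with status 2 whenever a required argument is missing or malformed, and the snippet above marks both --cfg and --local_rank as required=True; running the script directly, rather than through torch.distributed.launch (which supplies --local_rank), would therefore exit exactly this way.

On question 1, a possible intuition, assuming the standard relative-position-bias construction: the bias is indexed by the relative offset between two positions within a group, and for positions in [0, G) the offset i - j spans 2G-1 distinct values, from -(G-1) to G-1; shifting a table index by 1-G maps it onto that signed range. A tiny check:

G = 7
offsets = sorted({i - j for i in range(G) for j in range(G)})
print(offsets)       # [-6, -5, ..., 6]: the values 1-G ... G-1
print(len(offsets))  # 2*G - 1 = 13 distinct relative offsets per axis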

Does CrossFormer require a fixed input size?

Hi there and thanks for the nice work.
I'm currently trying to use CrossFormer_B as my backbone in detection/instance_segmentation.
I've noticed that we need to define the img_size in the backbone configs. However, defining that can be limiting in the sense that we usually use cropping augmentations during training, or multi-scale inference at test time. Is there any way to keep these methods working with the current implementation?

I'll copy the related part of my config file down here:

model = dict(
    type='CascadeRCNN',
    pretrained=None,
    backbone=dict(
        type='CrossFormer',
        img_size=[3840, 1920],
        patch_size=[4, 8, 16, 32],
        in_chans=3,
        num_classes=7,
        embed_dim=96,
        depths=[2, 2, 18, 2],
        num_heads=[3, 6, 12, 24],
        group_size=[7, 7, 7, 7],
        crs_interval=[8, 4, 2, 1],
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.3,
        patch_norm=True,
        use_checkpoint=False,
        merge_size=[[2, 4], [2, 4], [2, 4]]),

...
...

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='Resize',
        img_scale=[(3840, 1080), (3840, 1560)],
        multiscale_mode='range',
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
]

...
...

Thanks,
