A PyTorch implementation of Grad-CAM and Grad-CAM++ that can visualize the Class Activation Map (CAM) of any classification network, including custom networks; CAM generation is also implemented for the faster r-cnn and retinanet object detection networks. Feel free to try it out, follow the project, and report issues.

License: Apache License 2.0


Grad-CAM.pytorch

A PyTorch implementation of Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

and of Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

  1. Dependencies
  2. Usage
  3. Sample Analysis
    3.1 Single Object
    3.2 Multiple Objects
  4. Summary
  5. Object Detection - faster r-cnn
    5.1 Installing detectron2
    5.2 Testing
    5.3 Grad-CAM Results
    5.4 Summary
  6. Object Detection - retinanet
    6.1 Installing detectron2
    6.2 Testing
    6.3 Grad-CAM Results
    6.4 Summary
  7. Object Detection - fcos
    7.1 Installing AdelaiDet
    7.2 Testing
    7.3 Grad-CAM Results
    7.4 Summary

Grad-CAM overall architecture (figure omitted)

How Grad-CAM++ differs from Grad-CAM (figure omitted)
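To make the pipeline concrete, here is a minimal Grad-CAM sketch for a classification network. This is an illustration only, not the repo's main.py: the torchvision model, the hooked layer and the random input are stand-ins for the real image pipeline.

import cv2
import torch
from torchvision import models

model = models.resnet50(pretrained=True).eval()
features, gradients = [], []

def forward_hook(module, inputs, output):
    features.append(output)
    # capture the gradient flowing back through this layer's output
    output.register_hook(lambda grad: gradients.append(grad))

# hook the last convolutional block (the repo defaults to the last conv layer)
model.layer4[-1].register_forward_hook(forward_hook)

x = torch.randn(1, 3, 224, 224)        # stand-in for a preprocessed image
scores = model(x)
scores[0, scores.argmax()].backward()  # back-propagate the top class score

weights = gradients[0].mean(dim=(2, 3), keepdim=True)     # [1, C, 1, 1]
cam = torch.relu((weights * features[0]).sum(dim=1))[0]   # [H, W]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
cam = cv2.resize(cam.detach().numpy(), (224, 224))        # upsample to image size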

Dependencies

python 3.6.x
pytorch 1.0.1+
torchvision 0.2.2
opencv-python
matplotlib
scikit-image
numpy
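These dependencies can typically be installed with pip, for example (package names assumed; versions as listed above):

pip install torch torchvision opencv-python matplotlib scikit-image numpy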

Usage

python main.py --image-path examples/pic1.jpg \
               --network densenet121 \
               --weight-path /opt/pretrained_model/densenet121-a639ec97.pth

Parameter description

  • image-path: path of the image to visualize (optional; default ./examples/pic1.jpg)

  • network: network name (optional; default resnet50)

  • weight-path: path to the network's pretrained weights (optional; by default the matching pretrained weights are downloaded from the official PyTorch site)

  • layer-name: name of the layer Grad-CAM uses (optional; default is the last convolutional layer)

  • class-id: class id used by Grad-CAM and Guided Back Propagation for back-propagation (optional; default is the class the network predicts)

  • output-dir: directory where the visualization results are saved (optional; default is the results directory)
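For example, to generate maps for a specific class from an explicitly chosen layer (the layer name and class id below are illustrative):

python main.py --image-path examples/pic1.jpg \
               --network resnet50 \
               --layer-name layer4 \
               --class-id 254 \
               --output-dir results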

Sample Analysis

Single Object

Original image (omitted)

Results

Result images (omitted here) are arranged per network — vgg16, vgg19, resnet50, resnet101, densenet121, inception_v3, mobilenet_v2, shufflenet_v2 — with columns: HeatMap, Grad-CAM, HeatMap++, Grad-CAM++, Guided backpropagation, Guided Grad-CAM.

Multiple Objects

For images containing multiple objects, Grad-CAM++ covers the objects more completely than Grad-CAM; this is Grad-CAM++'s main advantage (a sketch of its weighting scheme follows the results below).

Original image (omitted)

Results

Result images (omitted here) for the same eight networks and columns as in 3.1.
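The computational difference behind this lies in the channel weights: Grad-CAM uses the plain spatial mean of the gradients, whereas Grad-CAM++ re-weights the ReLU'd gradients with coefficients built from higher-order gradient terms. A minimal sketch of the Grad-CAM++ weighting under the usual approximation from the paper (an illustration; `feature` and `gradient` are the hooked [C, H, W] arrays):

import numpy as np

def grad_cam_pp_weights(feature: np.ndarray, gradient: np.ndarray) -> np.ndarray:
    """Grad-CAM++ channel weights from a [C, H, W] feature map and its gradient."""
    grad_2 = gradient ** 2
    grad_3 = grad_2 * gradient
    # alpha coefficients; the epsilon substitute guards against division by zero
    denom = 2.0 * grad_2 + feature.sum(axis=(1, 2), keepdims=True) * grad_3
    alpha = grad_2 / np.where(denom != 0.0, denom, 1e-8)
    # each channel's weight is the spatial sum of alpha * ReLU(gradient)
    return (alpha * np.maximum(gradient, 0.0)).sum(axis=(1, 2))  # [C]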

Summary

  • The vgg models' Grad-CAM does not cover the whole object; resnet and densenet cover it more fully, especially densenet. This suggests that, in terms of generalization and robustness, densenet > resnet > vgg.
  • Grad-CAM++ covers objects more completely than Grad-CAM, especially when one class has multiple instances: Grad-CAM may cover only some of the objects, while Grad-CAM++ covers essentially all of them. However, this holds mainly for vgg; for a network like densenet, plain Grad-CAM already covers essentially all objects.
  • MobileNet V2's Grad-CAM coverage is also quite complete.
  • The Guided backpropagation maps of Inception V3 and MobileNet V2 have very blurry contours, while ShuffleNet V2's contours are fairly sharp.

Object Detection - faster r-cnn

A user, SHAOSIHAN, asked how to use Grad-CAM in object detection. Neither the Grad-CAM nor the Grad-CAM++ paper mentions generating CAMs for object detection. I think there are two main reasons:

a) Detection differs from classification. A classification network has a single classification loss, and all such networks are structurally alike (the last layer has one neuron per class), so the prediction is always a single class-score distribution. In detection the output is not single-valued, and different networks such as Faster R-CNN, CornerNet, CenterNet and FCOS model the problem differently, so their outputs mean different things. There can therefore be no single, unified way to generate a Grad-CAM map.

b) Classification is weakly supervised; a CAM reveals which spatial locations the network attends to when predicting, i.e. "where it looks", which has real diagnostic value. Detection is strongly supervised: the predicted box already indicates "where it looks" directly.

Here we take the faster-rcnn network in detectron2 as an example and generate Grad-CAM maps. The main idea: take the predicted box with the highest score, back-propagate that score's gradient onto the feature map of the proposal box that produced it, and generate the CAM from that feature map.
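A minimal sketch of that computation (a simplified illustration, not the repo's detection code; feature and gradient are assumed to have been captured by hooks on the proposal's feature map after back-propagating the top box's class score):

import numpy as np

def detection_grad_cam(feature: np.ndarray, gradient: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap for one detection; feature/gradient are [C, H, W]."""
    weight = gradient.mean(axis=(1, 2))                  # [C] channel weights
    cam = (weight[:, None, None] * feature).sum(axis=0)  # weighted channel sum
    cam = np.maximum(cam, 0)                             # ReLU
    cam -= cam.min()
    cam /= cam.max() + 1e-8                              # normalize to [0, 1]
    return cam                                           # [H, W]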

Installing detectron2

a) Download

git clone https://github.com/facebookresearch/detectron2.git

b) Modify the fast_rcnn_inference_single_image function in detectron2/modeling/roi_heads/fast_rcnn.py; the main change is to add index bookkeeping that records which proposal box each high-scoring prediction came from. The modified fast_rcnn_inference_single_image looks like this:

def fast_rcnn_inference_single_image(
        boxes, scores, image_shape, score_thresh, nms_thresh, topk_per_image
):
    """
    Single-image inference. Return bounding-box detection results by thresholding
    on scores and applying non-maximum suppression (NMS).

    Args:
        Same as `fast_rcnn_inference`, but with boxes, scores, and image shapes
        per image.

    Returns:
        Same as `fast_rcnn_inference`, but for only one image.
    """
    valid_mask = torch.isfinite(boxes).all(dim=1) & torch.isfinite(scores).all(dim=1)
    # NEW: per-row proposal indices, broadcast across classes, so that we can
    # later recover which proposal produced each kept prediction
    indices = torch.arange(start=0, end=scores.shape[0], dtype=int)
    indices = indices.expand((scores.shape[1], scores.shape[0])).T
    if not valid_mask.all():
        boxes = boxes[valid_mask]
        scores = scores[valid_mask]
        indices = indices[valid_mask]
    scores = scores[:, :-1]
    indices = indices[:, :-1]

    num_bbox_reg_classes = boxes.shape[1] // 4
    # Convert to Boxes to use the `clip` function ...
    boxes = Boxes(boxes.reshape(-1, 4))
    boxes.clip(image_shape)
    boxes = boxes.tensor.view(-1, num_bbox_reg_classes, 4)  # R x C x 4

    # Filter results based on detection scores
    filter_mask = scores > score_thresh  # R x K
    # R' x 2. First column contains indices of the R predictions;
    # Second column contains indices of classes.
    filter_inds = filter_mask.nonzero()
    if num_bbox_reg_classes == 1:
        boxes = boxes[filter_inds[:, 0], 0]
    else:
        boxes = boxes[filter_mask]

    scores = scores[filter_mask]
    indices = indices[filter_mask]
    # Apply per-class NMS
    keep = batched_nms(boxes, scores, filter_inds[:, 1], nms_thresh)
    if topk_per_image >= 0:
        keep = keep[:topk_per_image]
    boxes, scores, filter_inds = boxes[keep], scores[keep], filter_inds[keep]
    indices = indices[keep]

    result = Instances(image_shape)
    result.pred_boxes = Boxes(boxes)
    result.scores = scores
    result.pred_classes = filter_inds[:, 1]
    result.indices = indices
    return result, filter_inds[:, 0]

c) Install; if you run into problems, refer to the detectron2 documentation; installation differs across operating systems

cd detectron2
pip install -e .

Testing

a) Download the pretrained model

wget https://dl.fbaipublicfiles.com/detectron2/PascalVOC-Detection/faster_rcnn_R_50_C4/142202221/model_final_b1acc2.pkl

b) Test Grad-CAM image generation

Run the following commands in this project's root directory:

export KMP_DUPLICATE_LIB_OK=TRUE
python detection/demo.py --config-file detection/faster_rcnn_R_50_C4.yaml \
--input ./examples/pic1.jpg \
--opts MODEL.WEIGHTS /Users/yizuotian/pretrained_model/model_final_b1acc2.pkl MODEL.DEVICE cpu

Grad-CAM Results

Result images (omitted here): columns are the original image, detection box, Grad-CAM HeatMap, Grad-CAM++ HeatMap and the box's predicted class, with one row each for Dog, Aeroplane, Person and Horse.

Summary

For object detection, Grad-CAM++ does not outperform Grad-CAM. A plausible explanation is that a predicted box already contains a single object, while Grad-CAM++'s advantage over Grad-CAM shows mainly when multiple objects are present.

Object Detection - retinanet

After Grad-CAM for the faster r-cnn detection network was finished, two users, abhigoku10 and wangzyon, asked how to implement Grad-CAM for retinanet. retinanet's network structure differs from faster r-cnn's, so generating the CAM also differs somewhat; the detailed procedure follows:

Installing detectron2

a) Download

git clone https://github.com/facebookresearch/detectron2.git

b) Modify the inference_single_image function in detectron2/modeling/meta_arch/retinanet.py; the main change is to add a feature-level index that records which feature-map level each high-scoring predicted box came from. The modified inference_single_image looks like this:

    def inference_single_image(self, box_cls, box_delta, anchors, image_size):
        """
        Single-image inference. Return bounding-box detection results by thresholding
        on scores and applying non-maximum suppression (NMS).

        Arguments:
            box_cls (list[Tensor]): list of #feature levels. Each entry contains
                tensor of size (H x W x A, K)
            box_delta (list[Tensor]): Same shape as 'box_cls' except that K becomes 4.
            anchors (list[Boxes]): list of #feature levels. Each entry contains
                a Boxes object, which contains all the anchors for that
                image in that feature level.
            image_size (tuple(H, W)): a tuple of the image height and width.

        Returns:
            Same as `inference`, but for only one image.
        """
        boxes_all = []
        scores_all = []
        class_idxs_all = []
        feature_level_all = []

        # Iterate over every feature level
        for i, (box_cls_i, box_reg_i, anchors_i) in enumerate(zip(box_cls, box_delta, anchors)):
            # (HxWxAxK,)
            box_cls_i = box_cls_i.flatten().sigmoid_()

            # Keep top k top scoring indices only.
            num_topk = min(self.topk_candidates, box_reg_i.size(0))
            # torch.sort is actually faster than .topk (at least on GPUs)
            predicted_prob, topk_idxs = box_cls_i.sort(descending=True)
            predicted_prob = predicted_prob[:num_topk]
            topk_idxs = topk_idxs[:num_topk]

            # filter out the proposals with low confidence score
            keep_idxs = predicted_prob > self.score_threshold
            predicted_prob = predicted_prob[keep_idxs]
            topk_idxs = topk_idxs[keep_idxs]

            anchor_idxs = topk_idxs // self.num_classes
            classes_idxs = topk_idxs % self.num_classes

            box_reg_i = box_reg_i[anchor_idxs]
            anchors_i = anchors_i[anchor_idxs]
            # predict boxes
            predicted_boxes = self.box2box_transform.apply_deltas(box_reg_i, anchors_i.tensor)

            boxes_all.append(predicted_boxes)
            scores_all.append(predicted_prob)
            class_idxs_all.append(classes_idxs)
            # NEW: record which feature level produced these detections
            feature_level_all.append(torch.ones_like(classes_idxs) * i)

        boxes_all, scores_all, class_idxs_all, feature_level_all = [
            cat(x) for x in [boxes_all, scores_all, class_idxs_all, feature_level_all]
        ]
        keep = batched_nms(boxes_all, scores_all, class_idxs_all, self.nms_threshold)
        keep = keep[: self.max_detections_per_image]

        result = Instances(image_size)
        result.pred_boxes = Boxes(boxes_all[keep])
        result.scores = scores_all[keep]
        result.pred_classes = class_idxs_all[keep]
        result.feature_levels = feature_level_all[keep]
        return result

c) Modify detectron2/modeling/meta_arch/retinanet.py to add a predict function, as follows:

    def predict(self, batched_inputs):
        """
        Args:
            batched_inputs: a list, batched outputs of :class:`DatasetMapper` .
                Each item in the list contains the inputs for one image.
                For now, each item in the list is a dict that contains:

                * image: Tensor, image in (C, H, W) format.
                * instances: Instances

                Other information that's included in the original dicts, such as:

                * "height", "width" (int): the output resolution of the model, used in inference.
                  See :meth:`postprocess` for details.
        Returns:
            list[dict]:
                one dict per input image, mapping "instances" to the
                post-processed detection results (an Instances object).
        """
        images = self.preprocess_image(batched_inputs)

        features = self.backbone(images.tensor)
        features = [features[f] for f in self.in_features]
        box_cls, box_delta = self.head(features)
        anchors = self.anchor_generator(features)

        results = self.inference(box_cls, box_delta, anchors, images.image_sizes)
        processed_results = []
        for results_per_image, input_per_image, image_size in zip(
                results, batched_inputs, images.image_sizes
        ):
            height = input_per_image.get("height", image_size[0])
            width = input_per_image.get("width", image_size[1])
            r = detector_postprocess(results_per_image, height, width)
            processed_results.append({"instances": r})
        return processed_results

d) Install; if you run into problems, refer to the detectron2 documentation; installation differs across operating systems

cd detectron2
pip install -e .

Testing

a) Download the pretrained model

wget https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_3x/137849486/model_final_4cafe0.pkl

b) Test Grad-CAM image generation

Run the following commands in this project's root directory:

export KMP_DUPLICATE_LIB_OK=TRUE
python detection/demo_retinanet.py --config-file detection/retinanet_R_50_FPN_3x.yaml \
      --input ./examples/pic1.jpg \
      --layer-name head.cls_subnet.0 \
      --opts MODEL.WEIGHTS /Users/yizuotian/pretrained_model/model_final_4cafe0.pkl MODEL.DEVICE cpu

Grad-CAM Results

Result images (omitted here) for four test images. Rows: the original image, the predicted boxes, then GradCAM and GradCAM++ maps for each of the eight layers cls_subnet.0 through cls_subnet.7.

Note: the Grad-CAM maps above are generated for the eight layers head.cls_subnet.0 through head.cls_subnet.7, which correspond to the four convolutional feature maps of retinanet's classification subnet and the feature maps after their ReLU activations.
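If you are unsure which names are valid for --layer-name, a generic PyTorch idiom (not a repo utility) is to enumerate the built model's modules:

# `model` is assumed to be the built detectron2 RetinaNet; this prints
# candidate --layer-name values such as head.cls_subnet.0 ... head.cls_subnet.7
for name, module in model.named_modules():
    if "cls_subnet" in name:
        print(name, type(module).__name__)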

Summary

a) None of retinanet's Grad-CAM maps look particularly good; the middle layers head.cls_subnet.2 through head.cls_subnet.4 are relatively the best.

b) In my view, the reason retinanet's maps are poor is that its final classifier is a convolutional layer with a 3×3 kernel, so when a score is back-propagated to the last convolutional feature map, only a 3×3 patch of units receives any gradient. In classification networks, and in faster r-cnn's classifier, the final layer is fully connected and sees global information, so every unit of the last convolutional feature map receives gradient.

c) Back-propagating to shallower feature maps, the number of units with gradient gradually grows; but, as the Grad-CAM paper notes, shallower feature maps carry weaker semantics, which is why the CAM for head.cls_subnet.0 looks so poor. (A tiny experiment below makes point b) concrete.)
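A tiny self-contained experiment (my illustration, not from the repo): back-propagating a single output score through a 3×3 convolutional classifier leaves gradient on only a 3×3 patch of the input feature map, while a fully-connected classifier leaves gradient on every unit.

import torch
import torch.nn as nn

# 3x3 conv classifier: only a 3x3 neighbourhood of the scored location
# receives gradient
feat = torch.randn(1, 8, 16, 16, requires_grad=True)
score_map = nn.Conv2d(8, 1, kernel_size=3, padding=1)(feat)
score_map[0, 0, 8, 8].backward()      # the score at one spatial location
print((feat.grad != 0).sum().item())  # 8 channels * 3 * 3 = 72 units

# fully-connected classifier: every unit receives gradient
feat = torch.randn(1, 8, 16, 16, requires_grad=True)
score = nn.Linear(8 * 16 * 16, 1)(feat.flatten(1))
score[0, 0].backward()
print((feat.grad != 0).sum().item())  # 8 * 16 * 16 = 2048 units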

Object Detection - fcos

After Grad-CAM for the faster r-cnn and retinanet detection networks was finished, a user, linsy-ai, asked how to implement Grad-CAM for fcos. fcos is broadly similar to retinanet, since their overall network structures are alike; here we use the fcos network from the AdelaiDet project. The detailed procedure follows:

Installing AdelaiDet

a) Download

git clone https://github.com/aim-uofa/AdelaiDet.git

b) Install

cd AdelaiDet
python setup.py build develop

Note: 1. AdelaiDet depends on detectron2, so install detectron2 first.

2. fcos supports GPU only, not CPU; make sure to install and test in a GPU environment.

Testing

a) Download the pretrained model

wget https://cloudstor.aarnet.edu.au/plus/s/glqFc13cCoEyHYy/download -O fcos_R_50_1x.pth

b) Test Grad-CAM image generation

Run the following commands in this project's root directory:

export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES="0"
python AdelaiDet/demo_fcos.py --config-file AdelaiDet/R_50_1x.yaml \
  --input ./examples/pic1.jpg \
  --layer-name proposal_generator.fcos_head.cls_tower.8 \
  --opts MODEL.WEIGHTS /path/to/fcos_R_50_1x.pth MODEL.DEVICE cuda

Grad-CAM Results

Result images (omitted here) for four test images. Rows: the original image, the predicted boxes, then GradCAM and GradCAM++ maps for each of the twelve layers cls_tower.0 through cls_tower.11.

Note: the Grad-CAM maps above are generated for the twelve layers proposal_generator.fcos_head.cls_tower.0 through proposal_generator.fcos_head.cls_tower.11, which correspond to the fcos classification tower's four convolutional feature maps, the feature maps after group normalization, and the feature maps after ReLU activation.

Summary

No written summary this time; judge the results from the images!

grad-cam.pytorch's Issues

Semantic segmentation

For a unet network, is it worthwhile to apply grad-cam to the last layer of the encoder? Would it help the segmentation results?

Whole-image CAM

Hi, I modified fasterrcnn to run on the whole image, using the last layer of the extractor, but the output is roi_scores. What should I do to get its CAM?

How can I implement Grad-CAM on object detection??

I find that almost all code about Grad-CAM is aimed at classification models; can it be implemented for an object detection model?

Recently I read some papers about attention; these papers use Grad-CAM to compare the performance improvement of object detection models, but I cannot find the code for the visualization.

Can you give me some suggestions for modifying the code?

Thanks!!!

feature = self.feature[0].cpu().data.numpy()

Hello, and many thanks for your approach to object-detection CAMs! I have two questions about the code.

In grad_cam.py:

  1. weight = np.mean(gradient, axis=(1, 2)) # [C]
     Why take the mean here?
  2. feature = self.feature[0].cpu().data.numpy() # [C,H,W]
     Why use feature[0] rather than feature[proposal_idx]?

Thanks!!!

IndexError: index 0 is out of bounds for dimension 0 with size 0

I ran heatmap.py with detectron2 but hit the following problem.

[{'instances': Instances(num_instances=0, image_height=1024, image_width=1024, fields=[pred_boxes: Boxes(tensor([], size=(0, 4), grad_fn=)), scores: tensor([], grad_fn=), pred_classes: tensor([], dtype=torch.int64), indices: tensor([], dtype=torch.int64)])}]
Traceback (most recent call last):
File "/home/dxy/ll/obj_competition/detectron2框架/heatmap.py", line 160, in
mask, box, class_id = grad_cam(inputs) # cam mask
File "/home/dxy/ll/obj_competition/detectron2框架/grad_cam.py", line 52, in call
score = output[0]['instances'].scores[index]
IndexError: index 0 is out of bounds for dimension 0 with size 0

Debugging shows that the output of output = self.net.inference([inputs]) in grad_cam.py is empty; more precisely, the roi set is empty. Has anyone run into the same problem?

Value normalization in Grad CAM

Regarding your code in Grad CAM and Grad CAM++:

# value normalization
cam -= np.min(cam)
cam /= np.max(cam)

Is this meant to be min-max normalization? It differs from the usual min-max formula, and after this step some cam values are below 0; also, neither the Grad-CAM nor the Grad-CAM++ paper mentions value normalization. This has puzzled me for a long time; could you help clarify it? Many thanks.

Grad-CAM for Faster-RCNN with config file faster_rcnn_X_101_32x8d_FPN_3x.yaml

hi, @yizt, thanks for the great work. I am trying to apply grad-cam to faster-rcnn with the config file faster_rcnn_X_101_32x8d_FPN_3x.yaml. I am using "roi_heads.box_pooler" as the last layer for computing grad-cam, and I have applied the changes to detectron2/modeling/roi_heads/fast_rcnn.py and the fast_rcnn_inference_single_image function as you mentioned in the readme. When I run the code, all the elements of the cam (before ReLU) are negative.
Could you please help me figure out what the problem is?

Loading my own model weights

How can I load my own model and its weights in main.py? The model I use is not one of the official PyTorch models. Thanks.

Can heatmap regions be generated and cropped on CUDA during training?

I recently wanted to crop the original image according to the heatmap region, resize the crop back to the original size, and train the same network on it. I implemented a CPU version based on this repo, but some connected-component operations have to go through opencv and scikit-image, and the resize in transform cannot operate on CUDA tensors, so training is very slow. Is there any way to implement a CUDA version?

AttributeError: 'NoneType' object has no attribute 'zero_'

Hi, thanks for sharing the code.
I use Grad-CAM in my own classification model, but I got the following error and can't fix it.
Could you please figure out the solutions? Thanks.

Traceback (most recent call last):
File "/remote-home/my/Code/baseline/CAM_Visual.py", line 289, in
main(config, args)
File "/remote-home/my/Code/baseline/CAM_Visual.py", line 252, in main
inputs.grad.zero_() # zero the gradients
AttributeError: 'NoneType' object has no attribute 'zero_'

Hit a problem porting your method to YOLOv3; please advise

Thank you for open-sourcing this work. I tried porting it to the YOLOv3 network, and the basic framework is in place, but the intermediate gradients I get through the hook are all zeros, so my cam cannot be computed correctly and the final heatmap for the target bbox is a solid dark-blue image. I noticed that in your code the gradient hook simply receives input_grad and output_grad and stores them. I don't really understand what these two variables mean; what I print out is all zeros, although the shapes look right. The hook that captures features gives me the results I expect. If you have time, please advise. Thanks.

error when running retinaNet demo.py

feature shape:torch.Size([1, 256, 100, 136])
feature shape:torch.Size([1, 256, 50, 68])
feature shape:torch.Size([1, 256, 25, 34])
feature shape:torch.Size([1, 256, 13, 17])
feature shape:torch.Size([1, 256, 7, 9])
Traceback (most recent call last):
File "detection/demo_retinanet.py", line 185, in
main(arguments)
File "detection/demo_retinanet.py", line 147, in main
mask, box, class_id = grad_cam(inputs) # cam mask
File "detectors/Grad-CAM.pytorch/detection/grad_cam_retinanet.py", line 61, in call
output = self.net.predict([inputs])
File "detectors/detectron2/detectron2/modeling/meta_arch/retinanet.py", line 473, in predict
results = self.inference(box_cls, box_delta, anchors, images.image_sizes)
File "detectors/detectron2/detectron2/modeling/meta_arch/retinanet.py", line 316, in inference
anchors, pred_logits_per_image, deltas_per_image, tuple(image_size)
File "detectors/detectron2/detectron2/modeling/meta_arch/retinanet.py", line 428, in inference_single_image
predicted_boxes = self.box2box_transform.apply_deltas(box_reg_i, anchors_i.tensor)
File "detectors/detectron2/detectron2/modeling/box_regression.py", line 100, in apply_deltas
pred_ctr_x = dx * widths[:, None] + ctr_x[:, None]
RuntimeError: The size of tensor a (25) must match the size of tensor b (0) at non-singleton dimension 1

Run the RetinaNet example

Hi, I tried to run the RetinaNet sample code. The program started to run, but failed shortly after. These are the last few lines:

feature shape:torch.Size([1, 256, 100, 100])
feature shape:torch.Size([1, 256, 50, 50])
feature shape:torch.Size([1, 256, 25, 25])
feature shape:torch.Size([1, 256, 13, 13])
feature shape:torch.Size([1, 256, 7, 7])
Traceback (most recent call last):
  File "detection/demo_retinanet.py", line 185, in <module>
    main(arguments)
  File "detection/demo_retinanet.py", line 147, in main
    mask, box, class_id = grad_cam(inputs)  # cam mask
  File "/content/drive/My Drive/Pytorch/yizt_GradCam/detection/grad_cam_retinanet.py", line 61, in __call__
    output = self.net.predict([inputs])
  File "/content/drive/My Drive/Pytorch/yizt_GradCam/detectron2/detectron2/modeling/meta_arch/retinanet.py", line 438, in predict
    results = self.inference(box_cls, box_delta, anchors, images.image_sizes)
  File "/content/drive/My Drive/Pytorch/yizt_GradCam/detectron2/detectron2/modeling/meta_arch/retinanet.py", line 335, in inference
    anchors, pred_logits_per_image, deltas_per_image, tuple(image_size)
  File "/content/drive/My Drive/Pytorch/yizt_GradCam/detectron2/detectron2/modeling/meta_arch/retinanet.py", line 367, in inference_single_image
    num_topk = min(self.topk_candidates, box_reg_i.size(0))
AttributeError: 'Boxes' object has no attribute 'size' 

Does anybody know how to solve this problem?

a little bug in detection/demo.py/gen_cam()

def gen_cam(image, mask):
    heatmap = cv2.applyColorMap(np.uint8(255 * mask), cv2.COLORMAP_JET)
    # heatmap = np.float32(heatmap) / 255  # little bug here
    heatmap = np.float32(heatmap)
    heatmap = heatmap[..., ::-1]  # gbr to rgb
    cam = heatmap + np.float32(image)
    return norm_image(cam), heatmap

For variable "heatmap" there is no need to divide by 255.

Can heatmaps be used for image restoration?

Hello, I work on image restoration. Can heatmaps be applied to the image-restoration domain? If so, what should be back-propagated? The output of image restoration is a 2D matrix, not a single value as in classification. Any advice would be appreciated.

Running retinanet demo example

Hello,

I have the following issue when running any of the demo examples on retinanet.
Could you please advise?

self.anchor_idxs tensor([], dtype=torch.int64) tensor([], dtype=torch.int64)
Traceback (most recent call last):
File "detection/demo_retinanet.py", line 185, in
main(arguments)
File "detection/demo_retinanet.py", line 147, in main
mask, box, class_id = grad_cam(inputs) # cam mask
File "/content/Grad-CAM.pytorch/detection/grad_cam_retinanet.py", line 61, in call
output = self.net.predict([inputs])
File "/content/detectron2_repo/detectron2/modeling/meta_arch/retinanet.py", line 531, in predict
results = self.inference(box_cls, box_delta, anchors, images.image_sizes)
File "/content/detectron2_repo/detectron2/modeling/meta_arch/retinanet.py", line 422, in inference
anchors, pred_logits_per_image, deltas_per_image, image_size
File "/content/detectron2_repo/detectron2/modeling/meta_arch/retinanet.py", line 477, in inference_single_image
predicted_boxes = self.box2box_transform.apply_deltas(box_reg_i, anchors_i.tensor)
File "/content/detectron2_repo/detectron2/modeling/box_regression.py", line 101, in apply_deltas
pred_ctr_x = dx * widths[:, None] + ctr_x[:, None]
RuntimeError: The size of tensor a (25) must match the size of tensor b (0) at non-singleton dimension 1

Thanks.

'NoneType' object is not subscriptable

When I use GuidedBackPropagation the gradient is None, and this error occurs:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-f5189ca71dcc> in <module>
      1 gbp = GuidedBackPropagation(model)
----> 2 gbp.__call__(x)

<ipython-input-8-b837947b3c16> in __call__(self, inputs, index)
     34         target.backward()
     35 
---> 36         return inputs.grad[0]  # [3,H,W]

TypeError: 'NoneType' object is not subscriptable

The model is a ResNet34 in which only the first layer's kernel channels were changed from 3 to 4; the problem occurs whether or not I load the state_dict. Could you advise where the problem might be? Thanks!

Using GradCAM in reinforcement learning

Hello, I would like to try GradCAM in reinforcement learning, but the loss function there is unlike classification. I am not sure whether the GradCAM gradient should come from the RL loss function, or from the maximum output as in classification. Do you have any insight into this?

TypeError: 'NoneType' object is not subscriptable

When testing my own resnet18 model, I got the following error:


TypeError Traceback (most recent call last)
in
----> 1 test()

in test()
74 print('actual: ' + str(int(target[0])))
75
---> 76 getGradCAM(images[0].numpy().transpose(1,2,0).astype(np.float) * [0.229, 0.224, 0.225] + [0.485, 0.456, 0.406], model, class_id = target[0])
77
78 for j in range(len(output)):

in getGradCAM(img, net, layer, class_id)
236 layer_name = get_last_conv_name(net) if layer is None else layer
237 grad_cam = GradCAM(net, layer_name)
--> 238 mask = grad_cam(inputs, class_id) # cam mask
239 image_dict['cam'], image_dict['heatmap'] = gen_cam(img, mask)
240 grad_cam.remove_handlers()

in call(self, inputs, index)
68 print(output[0].grad_fn)
69
---> 70 gradient = self.gradient[0].cpu().data.numpy() # [C,H,W]
71 weight = np.mean(gradient, axis=(1, 2)) # [C]
72

TypeError: 'NoneType' object is not subscriptable

The image size is 1000 × 1500 × 3, and the feature size is torch.Size([1, 512, 47, 32]).

How to load custom model weights

Problem description:
Loading the imagenet-pretrained weights works fine, but when I try to load my own trained model weights I get the following error:

size mismatch for fc.weight: copying a param with shape torch.Size([5, 2048]) from checkpoint, the shape in current model is torch.Size([1000, 2048]).       
        size mismatch for fc.bias: copying a param with shape torch.Size([5]) from checkpoint, the shape in current model is torch.Size([1000]).

imagenet has 1000 training classes, while my own trained model has only 5 classes, and they are not among imagenet's classes.
How should I modify the code to resolve this dimension mismatch? Any pointers would be appreciated.

How to load my own trained weight file

Hi, how do I load the weight file of a model I trained myself? After passing in my own network and weight-file path, I always get the following error (the second line is very long, so I truncated it):
RuntimeError: Error(s) in loading state_dict for DenseNet:
Missing key(s) in state_dict: "features.conv0.weight",
Unexpected key(s) in state_dict: "state_dict".

Using Grad-CAM for SSD network

Hi,
Thanks for publishing this great repo! I am debugging an SSD network used for a 3-class object detection task.
Can I use this repo to visualize such a network as well? I would appreciate it if you could point me to the initial steps...

Hi, could you give some hints for transferring your great work to YOLOv3?

I want to learn more about the training process of the YOLO network, which requires a framework for visualizing Grad-CAM maps or something similar. However, I have found limited resources on this. Could you give me some guidance on using your methods with YOLO v3 models?

Thanks in advance!

save image error

Traceback (most recent call last):
File "D:/2019Pytorch_Programs/Papers_codes/efficient-finetuning/visualization/main.py", line 215, in
main(arguments)
File "D:/2019Pytorch_Programs/Papers_codes/efficient-finetuning/visualization/main.py", line 194, in main
save_image(image_dict, os.path.basename(args.image_path), args.network, args.output_dir)
File "D:/2019Pytorch_Programs/Papers_codes/efficient-finetuning/visualization/main.py", line 153, in save_image
io.imsave(os.path.join(output_dir, '{}-{}-{}.jpg'.format(prefix, network, key)), image)
File "D:\Anaconda3\lib\site-packages\skimage\io_io.py", line 139, in imsave
if is_low_contrast(arr):
File "D:\Anaconda3\lib\site-packages\skimage\exposure\exposure.py", line 501, in is_low_contrast
image = rgb2gray(image)
File "D:\Anaconda3\lib\site-packages\skimage\color\colorconv.py", line 804, in rgb2gray
rgb = _prepare_colorarray(rgb[..., :3])
File "D:\Anaconda3\lib\site-packages\skimage\color\colorconv.py", line 158, in _prepare_colorarray
return dtype.img_as_float(arr)
File "D:\Anaconda3\lib\site-packages\skimage\util\dtype.py", line 378, in img_as_float64
return convert(image, np.float64, force_copy)
File "D:\Anaconda3\lib\site-packages\skimage\util\dtype.py", line 244, in convert
raise ValueError("Images of type float must be between -1 and 1.")
ValueError: Images of type float must be between -1 and 1.
