facebookresearch / fvcore

Collection of common code that's shared among different research projects in FAIR computer vision team.

License: Apache License 2.0

Python 99.62% Shell 0.38%

fvcore's Introduction

fvcore

fvcore is a light-weight core library that provides the most common and essential functionality shared in various computer vision frameworks developed in FAIR, such as Detectron2, PySlowFast, and ClassyVision. All components in this library are type-annotated, tested, and benchmarked.

The computer vision team in FAIR is responsible for maintaining this library.

Features:

Besides some basic utilities, fvcore includes the following features:

  • Common pytorch layers, functions and losses in fvcore.nn.
  • A hierarchical per-operator flop counting tool: see this note for details.
  • Recursive parameter counting: see API doc.
  • Recompute BatchNorm population statistics: see its API doc.
  • A stateless, scale-invariant hyperparameter scheduler: see its API doc.

Install:

fvcore requires pytorch and python >= 3.6.

Use one of the following ways to install:

1. Install from PyPI (updated nightly)

pip install -U fvcore

2. Install from Anaconda Cloud (updated nightly)

conda install -c fvcore -c iopath -c conda-forge fvcore

3. Install latest from GitHub

pip install -U 'git+https://github.com/facebookresearch/fvcore'

4. Install from a local clone

git clone https://github.com/facebookresearch/fvcore
pip install -e fvcore

License

This library is released under the Apache 2.0 license.

fvcore's People

Contributors

amtagrwl, amyreese, bigfootjon, bottler, bxiong1202, connernilsen, dmitryvinn, ericmintun, hannamao, haooooooqi, hdcharles, hudeven, janeyx99, jonmorton, lauragustafson, lyttonhao, marcszafraniec, maxfrei750, ngimel, normster, patricklabatut, ppwwyyxx, sampepose, sujitoc, thatch, theschnitz, theweiho, wangg12, wat3rbro, zdavid1995

fvcore's Issues

Add release tags

Hi,

Currently, the repo has no release tags, meaning that one has to point to a commit hash in order to pin a "version". Is it an option to add some tags to the repo?

Thanks,
Diogo

TypeError: _get_local_path() got an unexpected keyword argument 'force'

Hi guys,

Thanks for the great project 🙏
I ran into this error while running my usual Colab project, and it was resolved by downgrading fvcore 0.1.4 -> 0.1.3.post20210317.

It seems get_local_path() doesn't call _get_local_path() of iopath 0.1.7.
Is it an environment issue?

Test Environment

fvcore: 0.1.4
python: 3.7
iopath: 0.1.7
...

Stack Trace

...
  File "/usr/local/lib/python3.7/dist-packages/fvcore/common/checkpoint.py", line 140, in load
    path = self.path_manager.get_local_path(path)
  File "/usr/local/lib/python3.7/dist-packages/fvcore/common/file_io.py", line 157, in get_local_path
    path, force=force, **kwargs
TypeError: _get_local_path() got an unexpected keyword argument 'force'

RuntimeError in ParamScheduler

I encountered a runtime error when running the latest versions of detectron2 and fvcore. It happens exactly when training the object detection model with scheduler.step() in the code.

  File "/../lib/computer-vision/pfr-vision/env/pytorch_gpu/lib/python3.8/site-packages/fvcore/common/param_scheduler.py", line 236, in __call__
    raise RuntimeError(
RuntimeError: where in ParamScheduler must be in [0, 1]: got 1.1666666666666667

library with version

  • torch 1.9.0+cu111
  • torchaudio 0.9.0
  • torchvision 0.10.0+cu111
  • detectron2 0.5+cu111
  • fvcore 0.1.5
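
The reported value 1.1666666666666667 equals 7/6. That would be consistent with `where` being computed as the current iteration divided by the total number of iterations, and the scheduler being stepped past that total. This is an assumption about the training setup, not something the traceback confirms. A minimal sketch of the arithmetic, using hypothetical helper names that are not part of fvcore:

```python
def progress(iteration: int, max_iter: int) -> float:
    """Fraction of training completed (hypothetical helper, not fvcore API)."""
    return iteration / max_iter


def clamped_progress(iteration: int, max_iter: int) -> float:
    """Same fraction, clamped so it never leaves the interval [0, 1]."""
    return min(max(progress(iteration, max_iter), 0.0), 1.0)


# Hypothetical numbers chosen only to reproduce the value in the error message.
max_iter = 6
print(progress(7, max_iter))          # larger than 1: one step too many
print(clamped_progress(7, max_iter))  # clamped back into the valid range
```

If this is what happens, checking that the number of scheduler.step() calls matches the configured total number of iterations may be a better fix than clamping.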

calculate param problem

Hi, I find that fvcore.nn.parameter_count and TensorFlow give different results for the same model.
The difference comes from the BN layer: TensorFlow counts beta, gamma, moving_mean and moving_variance (four parameters), but fvcore.nn.parameter_count counts only beta and gamma.

for example:

import torch
from fvcore.nn import parameter_count_table


def main():
    model = torch.nn.BatchNorm2d(1)
    print(parameter_count_table(model))


if __name__ == '__main__':
    main()

result:

| name    | #elements or shape   |
|:--------|:---------------------|
| model   | 2                    |
|  weight |  (1,)                |
|  bias   |  (1,)                |

Only beta and gamma are counted.

In https://github.com/facebookresearch/fvcore/blob/master/fvcore/nn/parameter_count.py#L10-L30

r = defaultdict(int)
for name, prm in model.named_parameters():
    size = prm.numel()
    name = name.split(".")
    for k in range(0, len(name) + 1):
        prefix = ".".join(name[:k])

Using model.named_parameters() yields only beta and gamma, not moving_mean and moving_variance.

Are there any other considerations?

ShapelyDeprecationWarning in CropTransform

I'm getting the following warning when using CropTransform on polygons:

"Iteration over multi-part geometries is deprecated and will be removed in Shapely 2.0. Use the geoms property to access the constituent parts of a multi-part geometry."

for poly in cropped:
    # It could produce lower dimensional objects like lines or
    # points, which we want to ignore
    if not isinstance(poly, geometry.Polygon) or not poly.is_valid:
        continue
    coords = np.asarray(poly.exterior.coords)
    # NOTE This process will produce an extra identical vertex at
    # the end. So we remove it. This is tested by
    # `tests/test_data_transform.py`
    cropped_polygons.append(coords[:-1])

Problem with 'flop_count'

Hi,

When I used the function 'flop_count', I ran into a problem.
My code is here:

import torch
from fvcore.nn import flop_count
from efficientnet_pytorch import EfficientNet
model = EfficientNet.from_name('efficientnet-b2')
netinput = torch.randn(1, 3, 224, 224)
final_count, skipped_ops = flop_count(model, (netinput, ))
print(final_count)

And the result is

Skipped operation aten::batch_norm 69 time(s)
Skipped operation prim::PythonOp 69 time(s)
Skipped operation aten::adaptive_avg_pool2d 24 time(s)
Skipped operation aten::sigmoid 23 time(s)
Skipped operation aten::mul 39 time(s)
Skipped operation aten::rand 16 time(s)
Skipped operation aten::add 32 time(s)
Skipped operation aten::div 16 time(s)
Skipped operation aten::dropout 1 time(s)
defaultdict(<class 'float'>, {'conv': 0.65755544, 'addmm': 0.001408})

The result, 0.657B, differs from the one reported in the EfficientNet paper: https://arxiv.org/abs/1905.11946

Can you help me with this?

Why not return the variable `save_file` in Line 184 directly?

def get_checkpoint_file(self) -> str:
    """
    Returns:
        str: The latest checkpoint file in target directory.
    """
    save_file = os.path.join(self.save_dir, "last_checkpoint")
    try:
        with self.path_manager.open(save_file, "r") as f:
            last_saved = f.read().strip()
    except IOError:
        # if file doesn't exist, maybe because it has just been
        # deleted by a separate process
        return ""
    return os.path.join(self.save_dir, last_saved)

Module never called

Hi, when I calculate the FLOPs of mobilenet_v3 (from torchvision), I get the following result:

Skipped operation aten::batch_norm 46 time(s)
Skipped operation aten::hardswish_ 21 time(s)
Skipped operation aten::add_ 10 time(s)
Skipped operation aten::adaptive_avg_pool2d 9 time(s)
Skipped operation aten::hardsigmoid_ 8 time(s)
Skipped operation aten::mul 8 time(s)
Skipped operation aten::dropout_ 1 time(s)
The following submodules of the model were never called during the trace of the graph. 
They may be unused, or they were accessed by direct calls to .forward() or via other python methods. 
In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
Module never called: features.10.block.2.2
Module never called: features.12.block.3.2
Module never called: features.14.block.3.2
Module never called: features.8.block.2.2
Module never called: features.15.block.3.2
Module never called: features.1.block.1.2
Module never called: features.3.block.2.2
Module never called: features.4.block.3.2
Module never called: features.6.block.3.2
Module never called: features.7.block.2.2
Module never called: features.5.block.3.2
Module never called: features.9.block.2.2
Module never called: features.2.block.2.2
Module never called: features.11.block.3.2
Module never called: features.13.block.3.2
216589760

using script:

from fvcore.nn import FlopCountAnalysis
from torchvision.models import mobilenet_v3_large

import torch
tensor = (torch.rand(1, 3, 224, 224),)
model = mobilenet_v3_large()
flops = FlopCountAnalysis(model, tensor)
print(flops.total())

env:
python: 3.7
torch: 1.8.1+cpu
torchvision: 0.9.1+cpu
system: ubuntu18.04

I guess the problem is caused by nn.Identity Module.

Counting bias term

Hello
I stumbled upon this package as I'm looking for a way to count FLOPs/MACs in my model.
I compared the results that I got to another popular package (thop) and noticed that fvcore does not count the bias term as part of the FLOPs count.
I wrote a small example to demonstrate this using a convolutional layer (I'm running on Linux CPU, with the latest version of fvcore and pytorch v1.9.1); notice the difference when we set bias=True and how the results are identical when bias=False:

import torch
import torch.nn as nn
from thop import profile
from fvcore.nn import FlopCountAnalysis

class MyModel(nn.Module):

    def __init__(self, bias):
        super().__init__()
        self.conv = nn.Conv2d(3, 10, 3, bias=bias)

    def forward(self, x):
        return self.conv(x)


input_ = torch.randn((1, 3, 32, 32))
for bias in [True, False]:
    print(f"********************** bias = {bias} ***********************")
    model = MyModel(bias=bias)
    print("fvcore result:")
    flops = FlopCountAnalysis(model, input_)
    print(flops.total())
    print("thop results:")
    macs, _ = profile(model, inputs=(input_,))
    print(macs)
    print("***********************************************************")

It outputs:

********************** bias = True ***********************
fvcore result:
[W NNPACK.cpp:79] Could not initialize NNPACK! Reason: Unsupported hardware.
243000
thop results:
[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[WARN] Cannot find rule for <class '__main__.MyModel'>. Treat it as zero Macs and zero Params.
252000.0
***********************************************************
********************** bias = False ***********************
fvcore result:
243000
thop results:
[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[WARN] Cannot find rule for <class '__main__.MyModel'>. Treat it as zero Macs and zero Params.
243000.0
***********************************************************

Process finished with exit code 0

My questions:

  1. Is that a bug, or a feature?
  2. Is there a way to include the bias in the calculation without modifying the internal code?

Thanks
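
The two totals above can be reproduced by hand, which makes the source of the gap visible. The sketch below is plain arithmetic for a stride-1, unpadded square convolution; the helper name is hypothetical and it does not call fvcore or thop. It assumes one multiply-accumulate per kernel weight per output element, and one extra addition per output element when a bias is present.

```python
def conv2d_counts(c_in, c_out, kernel, in_size, bias=False):
    """Return (multiply-accumulates, bias additions) for a square conv layer.

    Assumes stride 1 and no padding, as in nn.Conv2d(3, 10, 3).
    """
    out_size = in_size - kernel + 1                 # 32 - 3 + 1 = 30
    outputs = c_out * out_size * out_size           # number of output elements
    macs = outputs * c_in * kernel * kernel         # one MAC per kernel weight
    bias_adds = outputs if bias else 0              # one add per output element
    return macs, bias_adds


macs, bias_adds = conv2d_counts(c_in=3, c_out=10, kernel=3, in_size=32, bias=True)
print(macs)              # 243000, the number reported by fvcore
print(macs + bias_adds)  # 252000, the number reported by thop with bias=True
```

Under these assumptions the 9000 difference is exactly one addition per output element (10 x 30 x 30).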

ImportError: cannot import name 'PathManagerBase'

from fvcore.common.file_io import PathManagerBase
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'PathManagerBase'

fvcore version:
fvcore 0.1.1.post20200608

Could I ask you a question about sigmoid_focal_loss() function?

Thanks for your amazing work. Can I use sigmoid_focal_loss() for a multiclass classification problem where the classes are mutually exclusive? If yes, is this the right way?
Example:
Suppose I want to classify objects over 3 possible classes: 0, 1, 2. D is my network, which outputs a vector of dimension N*3 where N is the batch size (for simplicity, N=1 in this example). For each class I create a target vector as follows:
class 0 = [0,0,1] class 1 = [0,1,0] class 2 = [1,0,0]

Can I therefore use this loss as follows?

# `class` is a reserved word in Python, so the argument is named class_idx
def get_loss(output, class_idx):
    if class_idx == 0:
        return sigmoid_focal_loss(output,[0,0,1])
    elif class_idx == 1:
        return sigmoid_focal_loss(output,[0,1,0])
    else:
        return sigmoid_focal_loss(output,[1,0,0])

output = D(input) #example output vector [-1,2.33,9.5]
loss = get_loss(output,0)
...
output2 = D(input2) #example output vector [-7,-8,4]
loss = get_loss(output2,1)
...
and so on

Thank you.
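
To make the question concrete, here is a minimal pure-Python sketch of the standard sigmoid focal loss applied to a one-hot target. It follows the usual focal loss definition and is not fvcore's code; the function names are hypothetical. It also assumes the conventional one-hot layout where position i marks class i, which is the reverse of the mapping described above. Whether a sigmoid-based loss is the best choice for mutually exclusive classes is a separate modeling question that this sketch does not settle.

```python
import math


def focal_term(logit, target, alpha=-1.0, gamma=2.0):
    """Focal loss for one logit and one binary target (0.0 or 1.0)."""
    p = 1.0 / (1.0 + math.exp(-logit))
    # Binary cross-entropy for this element
    ce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    # Probability assigned to the true label
    p_t = p * target + (1.0 - p) * (1.0 - target)
    loss = ce * (1.0 - p_t) ** gamma
    if alpha >= 0:
        loss *= alpha * target + (1.0 - alpha) * (1.0 - target)
    return loss


def one_hot(class_index, num_classes):
    """Assumed layout: position i is 1.0 for class i."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def focal_loss(logits, class_index):
    """Sum of per-element focal terms against a one-hot target."""
    targets = one_hot(class_index, len(logits))
    return sum(focal_term(x, t) for x, t in zip(logits, targets))


logits = [-1.0, 2.33, 9.5]
print(focal_loss(logits, 2))  # target matches the largest logit
print(focal_loss(logits, 0))  # target matches the smallest logit
```

The loss is much smaller when the target is the class with the largest logit, which is the behavior one would want from a classification loss.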

Some error when i trained yoloF

File "/home/zzf/miniconda3/envs/torch1.7.1/lib/python3.7/site-packages/fvcore-0.1.3.post20210317-py3.7.egg/fvcore/nn/giou_loss.py", line 32, in giou_loss
AssertionError: bad box: x1 larger than x2

Why convert numpy to tensor in VFlipTransform

I notice that in HFlipTransform, you use numpy to flip images directly. But in VFlipTransform, you first convert the numpy array to a torch.Tensor, then flip it, and finally convert it back to numpy.

Why don't you use the numpy directly? Is there any performance concern?

Thanks!

May I ask you a question about precise-bn ?

Hi,

Thanks for releasing this helpful codebase. After reading the part of precise-bn, a question comes to me:
Do I need to enlarge the batch size when I do precise bn?
I noticed that the performance of BN degrades when the batch size is small. People say this is due to noisy estimation of the running mean/var of the BN layers: when the batch size is small, each batch's mean/var is noisier, which leads to a poor estimate of the running mean/var. So, to cope with the small-batch problem, do I need to enlarge the batch size when I use precise BN, or are there other explanations for the BN performance drop?

Error while running Detectron2 code through spyder

I am able to execute the Detectron2 script once but the next time I try to run it I get this error. It works once after I restart the kernel.

File "/home/user/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "/home/user/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/user/Table_Structure_Extract/Detectron2/detectron2/detectron2/Scripts/Table_Instance1.py", line 23, in <module>
    from detectron2.engine import DefaultTrainer

  File "/home/user/Prateek_Data/Table_Structure_Extract/Detectron2/detectron2/detectron2/engine/__init__.py", line 12, in <module>
    from .defaults import *

  File "/home/user/Table_Structure_Extract/Detectron2/detectron2/detectron2/engine/defaults.py", line 23, in <module>
    from detectron2.checkpoint import DetectionCheckpointer

  File "/home/user/Table_Structure_Extract/Detectron2/detectron2/detectron2/checkpoint/__init__.py", line 6, in <module>
    from . import catalog as _UNUSED  # register the handler

  File "/home/user/Table_Structure_Extract/Detectron2/detectron2/detectron2/checkpoint/catalog.py", line 133, in <module>
    PathManager.register_handler(ModelCatalogHandler())

  File "/home/user/anaconda3/lib/python3.7/site-packages/fvcore/common/file_io.py", line 768, in register_handler
    assert prefix not in PathManager._PATH_HANDLERS

AssertionError

Roadmap

Is there some sort of a rough roadmap for this repo?
It would be really helpful for fellow contributors as well if you could release some info along those lines.

Overall Flops are less than the modules

Hi,
Thank you for this super-cool code.
I am using this code to calculate the flops of my model. However, I encountered a very strange problem, shown below: the overall flops of the whole model are lower than the flops of some individual modules.
Also, it seems that no flops are counted for some conv layers. Why does this happen?
Is there any solution to this problem?
Thank you very much.

D:\anaconda3\envs\PyTorch_py3\lib\site-packages\fvcore\nn\jit_handles.py:139: RuntimeWarning: overflow encountered in long_scalars
  flop = batch_size * out_size * Cout_dim * Cin_dim * kernel_size
D:\anaconda3\envs\PyTorch_py3\lib\collections\__init__.py:802: RuntimeWarning: overflow encountered in long_scalars
  self[elem] += count

| module                      | #parameters or shape | #flops  |
|:----------------------------|:---------------------|:--------|
| model                       | 6.789M               | 1.09G   |
|  conv1.conv2d               | 0.379M               | 1.904G  |
|   conv1.conv2d.weight       | (256, 1478, 1, 1)    |         |
|   conv1.conv2d.bias         | (256,)               |         |
|  conv2.conv2d               | 0.59M                | 1.074G  |
|   conv2.conv2d.weight       | (256, 256, 3, 3)     |         |
|   conv2.conv2d.bias         | (256,)               |         |
|  conv3.conv2d               | 0.59M                | 0       |
|   conv3.conv2d.weight       | (256, 256, 3, 3)     |         |
|   conv3.conv2d.bias         | (256,)               |         |
|  res_module                 | 3.59M                | 3.195M  |
|   res_module.0              | 1.197M               | 1.065M  |
|    res_module.0.conv1.conv2d | 0.59M               | 0       |
|    res_module.0.conv2.conv2d | 0.59M               | 0       |
|    res_module.0.se_layer    | 16.672K              | 1.065M  |
|   res_module.1              | 1.197M               | 1.065M  |
|    res_module.1.conv1.conv2d | 0.59M               | 0       |
|    res_module.1.conv2.conv2d | 0.59M               | 0       |
|    res_module.1.se_layer    | 16.672K              | 1.065M  |
|   res_module.2              | 1.197M               | 1.065M  |
|    res_module.2.conv1.conv2d | 0.59M               | 0       |
|    res_module.2.conv2.conv2d | 0.59M               | 0       |
|    res_module.2.se_layer    | 16.672K              | 1.065M  |
|  deconv1.conv2d             | 1.049M               | 0       |
|   deconv1.conv2d.weight     | (256, 256, 4, 4)     |         |
|   deconv1.conv2d.bias       | (256,)               |         |
|  deconv2.conv2d             | 0.59M                | 1.074G  |
|   deconv2.conv2d.weight     | (256, 256, 3, 3)     |         |
|   deconv2.conv2d.bias       | (256,)               |         |
|  deconv3.conv2d             | 0.771K               | 12.583M |
|   deconv3.conv2d.weight     | (3, 256, 1, 1)       |         |
|   deconv3.conv2d.bias       | (3,)                 |         |
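
The "overflow encountered in long_scalars" warnings suggest that the counts are being accumulated in fixed-width integers. On Windows, NumPy's default integer was 32 bits wide before NumPy 2.0, so any count above 2**31 - 1 (about 2.147G) would wrap around. Whether that fully explains the table above is an assumption; the sketch below only illustrates the wrap-around itself, using the standard library and a hypothetical helper name.

```python
import ctypes


def as_int32(value: int) -> int:
    """Reinterpret a Python int as a signed 32-bit integer (wraps on overflow)."""
    return ctypes.c_int32(value).value


INT32_MAX = 2**31 - 1

# A hypothetical flop count of 3 billion does not fit in 32 bits.
flops = 3_000_000_000
print(flops > INT32_MAX)  # True
print(as_int32(flops))    # a negative number: the value has wrapped around

# Python's own integers are arbitrary precision and do not overflow.
print(flops + flops)
```

If this is the cause, converting shape values to Python int before multiplying, or accumulating in 64-bit integers, would avoid the wrap-around.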

flop count analysis of LSTM layers

Hello, it seems that LSTM layers are not yet supported for the fvcore.nn.FlopCountAnalysis method:

import torch
from fvcore.nn import FlopCountAnalysis
from torch import nn


class ToyLSTMModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(10, 20, 1)

    def forward(self, x):
        h0 = torch.randn(1, 3, 20)
        c0 = torch.randn(1, 3, 20)
        output, _ = self.rnn(x, (h0, c0))

        return output

model = ToyLSTMModel()
example_input = torch.randn(5, 3, 10)

print(FlopCountAnalysis(model, example_input).by_module())

gives:

Unsupported operator aten::randn encountered 2 time(s)
Unsupported operator aten::lstm encountered 1 time(s)
Counter({'': 0, 'rnn': 0})

While the same works with an LSTM cell:

import torch
from fvcore.nn import FlopCountAnalysis
from torch import nn

class ToyLSTMModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTMCell(10, 20)

    def forward(self, x):
        hx = torch.randn(3, 20)
        cx = torch.randn(3, 20)
        output = []
        for i in range(x.size()[0]):
            hx, cx = self.rnn(x[i], (hx, cx))
            output.append(hx)
        output = torch.stack(output, dim=0)
        return output


model = ToyLSTMModel()
example_input = torch.randn(5, 3, 10)

print(FlopCountAnalysis(model, example_input).by_module())

output:

Unsupported operator aten::randn encountered 2 time(s)
Unsupported operator aten::add_ encountered 10 time(s)
Unsupported operator aten::unsafe_chunk encountered 5 time(s)
Unsupported operator aten::sigmoid_ encountered 15 time(s)
Unsupported operator aten::tanh_ encountered 5 time(s)
Unsupported operator aten::mul encountered 15 time(s)
Unsupported operator aten::tanh encountered 5 time(s)
Counter({'': 36000, 'rnn': 36000})

Is there any particular reason for that? The number of FLOPs of the LSTM layer should equal that of one LSTM cell times the number of time steps.
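
The 36000 figure for the LSTMCell version can be reproduced by counting only the two matrix products in each step, which matches the fact that the element-wise operators are listed as unsupported in the output above. A minimal arithmetic sketch with a hypothetical helper name, assuming one flop per multiply-accumulate and four gates per cell:

```python
def lstm_cell_matmul_flops(batch, input_size, hidden_size):
    """Multiply-accumulates in the two matrix products of one LSTM cell step."""
    gates = 4 * hidden_size                   # input, forget, cell, output gates
    input_proj = batch * input_size * gates   # x @ W_ih^T
    hidden_proj = batch * hidden_size * gates # h @ W_hh^T
    return input_proj + hidden_proj


# Shapes from the example above: input (5, 3, 10), hidden size 20.
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20
per_step = lstm_cell_matmul_flops(batch, input_size, hidden_size)
print(per_step)            # 7200
print(seq_len * per_step)  # 36000, the total reported for the LSTMCell model
```

Under the same assumptions, a single-layer nn.LSTM over the same input would be expected to report the same total once aten::lstm has a handler.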

BUG: ImportError: cannot import name 'FakeQuantizeBase' from 'torch.quantization'

After your recent release, I got this error from my code. I can only use my code by downgrading fvcore version to previous one.

Error Traceback:

2021-10-15T09:46:49.0347680Z ==================================== ERRORS ====================================
2021-10-15T09:46:49.0350615Z ________________________ ERROR collecting test session _________________________
2021-10-15T09:46:49.0352801Z /usr/local/lib/python3.8/dist-packages/_pytest/config/__init__.py:495: in _importconftest
2021-10-15T09:46:49.0358791Z     return self._conftestpath2mod[key]
2021-10-15T09:46:49.0360317Z E   KeyError: PosixPath('/app/cv_ner_detectron/tests/conftest.py')
2021-10-15T09:46:49.0360877Z 
2021-10-15T09:46:49.0362075Z During handling of the above exception, another exception occurred:
2021-10-15T09:46:49.0363832Z /usr/local/lib/python3.8/dist-packages/_pytest/config/__init__.py:501: in _importconftest
2021-10-15T09:46:49.0364711Z     mod = conftestpath.pyimport()
2021-10-15T09:46:49.0365834Z /usr/local/lib/python3.8/dist-packages/py/_path/local.py:704: in pyimport
2021-10-15T09:46:49.0368988Z     __import__(modname)
2021-10-15T09:46:49.0370340Z /usr/local/lib/python3.8/dist-packages/_pytest/assertion/rewrite.py:152: in exec_module
2021-10-15T09:46:49.0372267Z     exec(co, module.__dict__)
2021-10-15T09:46:49.0373441Z cv_ner_detectron/tests/conftest.py:11: in <module>
2021-10-15T09:46:49.0377209Z     from cv_ner_detectron.detectron.datamodels import CvNerExample
2021-10-15T09:46:49.0378686Z cv_ner_detectron/detectron/datamodels.py:7: in <module>
2021-10-15T09:46:49.0380548Z     from detectron2.structures import BoxMode
2021-10-15T09:46:49.0382497Z /usr/local/lib/python3.8/dist-packages/detectron2/structures/__init__.py:6: in <module>
2021-10-15T09:46:49.0385934Z     from .keypoints import Keypoints, heatmaps_to_keypoints
2021-10-15T09:46:49.0387443Z /usr/local/lib/python3.8/dist-packages/detectron2/structures/keypoints.py:6: in <module>
2021-10-15T09:46:49.0389906Z     from detectron2.layers import interpolate
2021-10-15T09:46:49.0391270Z /usr/local/lib/python3.8/dist-packages/detectron2/layers/__init__.py:10: in <module>
2021-10-15T09:46:49.0397234Z     from .blocks import CNNBlockBase, DepthwiseSeparableConv2d
2021-10-15T09:46:49.0402621Z /usr/local/lib/python3.8/dist-packages/detectron2/layers/blocks.py:4: in <module>
2021-10-15T09:46:49.0406940Z     import fvcore.nn.weight_init as weight_init
2021-10-15T09:46:49.0408469Z /usr/local/lib/python3.8/dist-packages/fvcore/nn/__init__.py:2: in <module>
2021-10-15T09:46:49.0409904Z     from .activation_count import ActivationCountAnalysis, activation_count
2021-10-15T09:46:49.0411437Z /usr/local/lib/python3.8/dist-packages/fvcore/nn/activation_count.py:10: in <module>
2021-10-15T09:46:49.0414217Z     from .jit_analysis import JitModelAnalysis
2021-10-15T09:46:49.0415640Z /usr/local/lib/python3.8/dist-packages/fvcore/nn/jit_analysis.py:15: in <module>
2021-10-15T09:46:49.0417905Z     from fvcore.common.checkpoint import _named_modules_with_dup
2021-10-15T09:46:49.0419295Z /usr/local/lib/python3.8/dist-packages/fvcore/common/checkpoint.py:23: in <module>
2021-10-15T09:46:49.0422403Z     from torch.quantization import ObserverBase, FakeQuantizeBase
2021-10-15T09:46:49.0424633Z E   ImportError: cannot import name 'FakeQuantizeBase' from 'torch.quantization' (/usr/local/lib/python3.8/dist-packages/torch/quantization/__init__.py)
2021-10-15T09:46:49.0425328Z 

preciseBN example

@jreese How do we use preciseBN in pytorch training code in the right manner? Can you please publish an example for the same?

Wrong ConvTranspose flops

The flops of transposed conv seem to be wrong:

from fvcore.nn import FlopCountAnalysis
import torch
from torch import nn

model = nn.ConvTranspose2d(32, 1, (3, 3))
x = torch.empty((1, 32, 1, 1))

print(FlopCountAnalysis(model, x).total())  # 81
# expected_flops: 32 * 3 * 3

Maybe we could change conv_flop_count to

def conv_flop_count(
    x_shape: List[int], w_shape: List[int], out_shape: List[int], transposed: bool,
) -> Number:
    batch_size = x_shape[0]
    # A transposed conv applies the kernel once per input position,
    # a regular conv once per output position.
    conv_size = prod((x_shape if transposed else out_shape)[2:])
    flop = prod(w_shape) * batch_size * conv_size
    return flop

and

# use a custom name instead of "_convolution"
return Counter({"conv": conv_flop_count(x_shape, w_shape, out_shape)})

to

    transposed = inputs[6].toIValue()
    # use a custom name instead of "_convolution"
    return Counter({"conv": conv_flop_count(x_shape, w_shape, out_shape, transposed)})

or

    transposed = inputs[6].toIValue()
    # use a custom name instead of "_convolution"
    name = "conv_transpose" if transposed else "conv"
    return Counter({name: conv_flop_count(x_shape, w_shape, out_shape, transposed)})
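
The proposal can be checked against the example at the top of this issue with a self-contained sketch. This is a standalone re-implementation of the proposed formula using math.prod (Python 3.8+), not fvcore's code; the shapes are the ones implied by nn.ConvTranspose2d(32, 1, (3, 3)) applied to a (1, 32, 1, 1) input.

```python
from math import prod
from typing import List


def conv_flop_count(
    x_shape: List[int], w_shape: List[int], out_shape: List[int],
    transposed: bool = False,
) -> int:
    """Kernel size times the number of positions the kernel is applied to."""
    batch_size = x_shape[0]
    # Transposed conv: one kernel application per input position.
    # Regular conv: one kernel application per output position.
    positions = prod((x_shape if transposed else out_shape)[2:])
    return batch_size * prod(w_shape) * positions


# ConvTranspose2d(32, 1, (3, 3)) on a (1, 32, 1, 1) input
print(conv_flop_count([1, 32, 1, 1], [32, 1, 3, 3], [1, 1, 3, 3], transposed=True))
# -> 288, i.e. 32 * 3 * 3 as expected in the issue

# Regular Conv2d(3, 10, 3) on a (1, 3, 32, 32) input is unchanged by the proposal
print(conv_flop_count([1, 3, 32, 32], [10, 3, 3, 3], [1, 10, 30, 30]))
# -> 243000
```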

flop_count support more ops, such as bmm

Now flop_count already supports several ops with high computational costs:

_DEFAULT_SUPPORTED_OPS: typing.Dict[str, typing.Callable] = {
    "aten::addmm": addmm_flop_jit,
    "aten::_convolution": conv_flop_jit,
    "aten::einsum": einsum_flop_jit,
    "aten::matmul": matmul_flop_jit,
}

However with the popularity of MultiHeadAttention, some ops with high computational costs are not taken into consideration, such as bmm. Some element-wise ops can also be costly such as mul, add and layer_norm.
Will flop_count support these ops in the future?

How do I cite this repo?

Hi,

Thank you for the great tools you supply in this repo. I use the module for computing the FLOPs of a model for my project. How do you propose to cite this repo in my project report?

sample_points_from_meshes() breaks when using 'cuda:1'

Hello,

The sample_points_from_meshes() function fails when my mesh is loaded to my second GPU 'cuda:1' but works when the mesh is loaded to my first GPU, 'cuda:0'.

import os
import torch
from pytorch3d.io import load_obj, save_obj
from pytorch3d.structures import Meshes
from pytorch3d.utils import ico_sphere
from pytorch3d.ops import sample_points_from_meshes
from pytorch3d.loss import (
    chamfer_distance,
    mesh_edge_loss,
    mesh_laplacian_smoothing,
    mesh_normal_consistency,
)
import numpy as np
from tqdm import tqdm_notebook
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import matplotlib as mpl

def plot_pointcloud(mesh, title=""):
    # Sample points uniformly from the surface of the mesh.
    points = sample_points_from_meshes(mesh, 5000)
    x, y, z = points.clone().detach().cpu().squeeze().unbind(1)
    fig = plt.figure(figsize=(5, 5))
    ax = Axes3D(fig)
    ax.scatter3D(x, z, -y)
    ax.set_xlabel('x')
    ax.set_ylabel('z')
    ax.set_zlabel('y')
    ax.set_title(title)
    ax.view_init(190, 30)
    plt.show()

path = 'cube768.obj'

# Change this to 'cuda:0' and the code works

device = "cuda:1"

verts, faces, aux = load_obj(path)

# textures_idx

faces_idx = faces.verts_idx.to(device)
verts = verts.to(device)

center = verts.mean(0)
verts = verts - center
scale = max(verts.abs().max(0)[0])
verts = verts / scale

trg_mesh = Meshes(verts=[verts], faces=[faces_idx])

plot_pointcloud(trg_mesh, "Target mesh")

The following error is thrown:

/opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/ATen/native/cuda/MultinomialKernel.cu:87: int at::native::::binarySearchForMultinomial(scalar_t *, scalar_t *, int, scalar_t) [with scalar_t = float]: block: [0,0,0], thread: [0,1,0] Assertion cumdist[size - 1] > static_cast<scalar_t>(0) failed.
/opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/ATen/native/cuda/MultinomialKernel.cu:87: int at::native::::binarySearchForMultinomial(scalar_t *, scalar_t *, int, scalar_t) [with scalar_t = float]: block: [0,0,0], thread: [0,2,0] Assertion cumdist[size - 1] > static_cast<scalar_t>(0) failed.
/opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/ATen/native/cuda/MultinomialKernel.cu:87: int at::native::::binarySearchForMultinomial(scalar_t *, scalar_t *, int, scalar_t) [with scalar_t = float]: block: [0,0,0], thread: [0,3,0] Assertion cumdist[size - 1] > static_cast<scalar_t>(0) failed.
/opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/ATen/native/cuda/MultinomialKernel.cu:87: int at::native::::binarySearchForMultinomial(scalar_t *, scalar_t *, int, scalar_t) [with scalar_t = float]: block: [0,0,0], thread: [0,0,0] Assertion cumdist[size - 1] > static_cast<scalar_t>(0) failed.
Traceback (most recent call last):
  File "/home/albert/anaconda3/envs/testConda/lib/python3.7/site-packages/pytorch3d/ops/sample_points_from_meshes.py", line 67, in sample_points_from_meshes
    sample_face_idxs += mesh_to_face[meshes.valid].view(num_valid_meshes, 1)
RuntimeError: copy_if failed to synchronize: device-side assert triggered

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=59 : device-side assert triggered

Thank you.

how to parse an empty list for config

E.g:

case 1.

MODULE:
    LOCATION: []

case 2.

MODULE:
    LOCATION: [1, 2, 3]

How can I set case 1 from the command line? I.e., do not modify the config file.

python test.py  MODULE.LOCATION ?
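
fvcore's CfgNode builds on yacs, which, as far as I know, decodes string values given to merge_from_list as Python literals. Assuming test.py forwards its trailing arguments to merge_from_list, the value "[]" should therefore decode to an empty list. The sketch below imitates that decoding step with the standard library only; decode_cli_value is a hypothetical stand-in, not the yacs function.

```python
import ast


def decode_cli_value(raw: str):
    """Parse a command-line string as a Python literal, falling back to the raw string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Not a literal (e.g. a bare word or a path): keep it as a string
        return raw


print(decode_cli_value("[]"))          # an empty list
print(decode_cli_value("[1, 2, 3]"))   # a list of three ints
print(decode_cli_value("some/path"))   # stays a string
```

Under that assumption, the command would be `python test.py MODULE.LOCATION "[]"`, with the quotes preventing the shell from interpreting the brackets.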

CropTransform leaves boxes hanging over the edge of the image

The CropTransform currently leaves boxes hanging over the edge of the images.
I think in most object detection pipelines I would want the boxes cropped just like the image.

import numpy as np

from fvcore.transforms import CropTransform

transform = CropTransform(5, 5, 5, 5)
box = np.array([[0, 0, 10, 10]])
transform.apply_box(box)
# get: array([[-5, -5,  5,  5]])
# wanted: array([[0, 0, 5, 5]])
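
Until such behavior exists in the library, the clipping can be done as a separate step after the transform. A minimal sketch in plain Python, with a hypothetical helper name, assuming boxes in (x1, y1, x2, y2) format and a crop given as (x0, y0, w, h) as in CropTransform(5, 5, 5, 5):

```python
def crop_and_clip_box(box, x0, y0, w, h):
    """Shift a box into crop coordinates, then clip it to the crop window."""
    x1, y1, x2, y2 = box
    # Same shift that produces [-5, -5, 5, 5] in the example above
    x1, x2 = x1 - x0, x2 - x0
    y1, y2 = y1 - y0, y2 - y0

    def clip(value, upper):
        return min(max(value, 0), upper)

    return [clip(x1, w), clip(y1, h), clip(x2, w), clip(y2, h)]


print(crop_and_clip_box([0, 0, 10, 10], x0=5, y0=5, w=5, h=5))  # [0, 0, 5, 5]
```

A box that lies entirely outside the crop collapses to zero area under this scheme, so a pipeline would probably also want to filter such boxes out.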

The problem in conda install

Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: \
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

The following specifications were found to be incompatible with your CUDA driver:

  • feature:/linux-64::__cuda==10.1=0

Your installed CUDA driver is: 10.1

Why is that?

fvcore flop-counting Error, ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

When using fvcore to count the flops of my model like this,

with torch.cuda.device(0):
    model = Net()
    print(flop_count_table(FlopCountAnalysis(model, inputs=(torch.randn((1, 3, 224, 224)),torch.randn((1, 1, 224, 224))))))

There are errors as follows,

  File "/home/lee/model/util.py", line 123, in forward
    conv = F.upsample(self.conv(F.adaptive_avg_pool2d(x, 1)), size=x.size()[2:], mode='bilinear')
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 720, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 704, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 720, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 704, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 131, in forward
    return F.batch_norm(
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2012, in batch_norm
    _verify_batch_size(input.size())
  File "/home/lee/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 1995, in _verify_batch_size
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

How to fix it? Many thanks!
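
A minimal sketch of what the traceback points to, assuming nothing about Net beyond what the error shows: BatchNorm in training mode rejects an input with a single value per channel, which is what F.adaptive_avg_pool2d(x, 1) produces at batch size 1. Switching the model to eval mode (for example, calling model.eval() before building FlopCountAnalysis) is a common workaround; whether it suits this particular model is an assumption. The 4-channel layer below is a stand-in, not the original network.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)           # stand-in for the BatchNorm layer in the traceback
x = torch.randn(1, 4, 1, 1)      # batch size 1, one value per channel

# Training mode: batch statistics cannot be computed from a single value.
bn.train()
try:
    bn(x)
    train_error = None
except ValueError as exc:
    train_error = str(exc)
print("train mode:", train_error)

# Eval mode: running statistics are used instead, so the call succeeds.
bn.eval()
with torch.no_grad():
    out = bn(x)
print("eval mode output shape:", tuple(out.shape))
```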

FlopCountAnalysis: ignore modules or modules with specified type

Hi all,

Is there an option to ignore specific modules, or all modules of a specified type?
For example,

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()  # required before assigning submodules
        self.m1 = nn.Conv2d(...)
        self.m2 = nn.Conv2d(...)
        ...
        self.mN = nn.Conv2d(...)

Here I want to exclude all modules embedded in MyModule from the flops calculation.
I tried registering a None handler for MyModule with set_op_handle, but MyModule is not an operator, so this did not work.

Thanks.
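
One possible workaround, sketched under assumptions: FlopCountAnalysis.by_module() returns a counter keyed by dotted submodule names (with "" for the whole model), so the flops of unwanted modules can be subtracted from the total afterwards. The helper name flops_excluding and the numbers in the example mapping are hypothetical. The sketch also assumes that each submodule's count already includes its children and that no module is shared between several parents.

```python
from typing import Iterable, Mapping


def flops_excluding(by_module: Mapping[str, float], excluded: Iterable[str]) -> float:
    """Total flops minus the flops of the excluded submodules.

    `by_module` is assumed to look like the result of
    FlopCountAnalysis(model, inputs).by_module(): dotted module names mapped
    to flop counts, with "" holding the total for the whole model.
    """
    excluded = sorted(set(excluded))
    # Drop names nested inside another excluded module, so that a parent and
    # its child are not subtracted twice.
    outermost = [
        name for name in excluded
        if not any(name != other and name.startswith(other + ".") for other in excluded)
    ]
    return by_module[""] - sum(by_module.get(name, 0) for name in outermost)


# Hypothetical per-module counts for a model with a `head` and a `my_module`.
example = {"": 100, "head": 40, "my_module": 60, "my_module.m1": 25, "my_module.m2": 35}
print(flops_excluding(example, ["my_module"]))                  # 40
print(flops_excluding(example, ["my_module", "my_module.m1"]))  # 40
```

To exclude by type, the names could be collected with something like [n for n, m in model.named_modules() if isinstance(m, MyModule)], assuming those names match the keys of by_module().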

ImportError: cannot import name 'FakeQuantizeBase' from 'torch.quantization'

Hey, I am using torch 1.7.1 (CPU) together with detectron2. When I import detectron2 through my own package, I get the error below. How should I fix this?

ImportError                               Traceback (most recent call last)
<ipython-input-4-d3af5af0e43a> in <module>()
----> 1 from INPR import inpr
      2 import matplotlib.pyplot as plt
      3 get_ipython().magic('matplotlib inline')

4 frames
/usr/local/lib/python3.7/dist-packages/INPR/inpr.py in <module>()
      4 from PIL import Image
      5 import numpy as np
----> 6 from .utils import Load_model
      7 from .get_num_plate import get_number_plate
      8 from .get_details import fetch

/usr/local/lib/python3.7/dist-packages/INPR/utils.py in <module>()
----> 1 from detectron2.engine import DefaultPredictor
      2 from detectron2.data import MetadataCatalog
      3 from detectron2.config import get_cfg
      4 from detectron2.utils.visualizer import ColorMode, Visualizer
      5 from detectron2 import model_zoo

/usr/local/lib/python3.7/dist-packages/detectron2/engine/__init__.py in <module>()
      9 # prefer to let hooks and defaults live in separate namespaces (therefore not in __all__)
     10 # but still make them available here
---> 11 from .hooks import *
     12 from .defaults import *

/usr/local/lib/python3.7/dist-packages/detectron2/engine/hooks.py in <module>()
     11 from collections import Counter
     12 import torch
---> 13 from fvcore.common.checkpoint import PeriodicCheckpointer as _PeriodicCheckpointer
     14 from fvcore.common.param_scheduler import ParamScheduler
     15 from fvcore.common.timer import Timer

/usr/local/lib/python3.7/dist-packages/fvcore/common/checkpoint.py in <module>()
     21 else:
     22     from torch import quantization
---> 23     from torch.quantization import ObserverBase, FakeQuantizeBase
     24 
     25 __all__ = ["Checkpointer", "PeriodicCheckpointer"]

ImportError: cannot import name 'FakeQuantizeBase' from 'torch.quantization' (/usr/local/lib/python3.7/dist-packages/torch/quantization/__init__.py)
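
The last frame shows this fvcore version importing FakeQuantizeBase from torch.quantization, which the installed torch does not provide, so the two packages appear to be out of step. A small check like the following can confirm which names an installed module exposes before deciding whether to upgrade torch or pin a different fvcore release. This is only a sketch; module_exposes is a hypothetical helper, not part of torch or fvcore.

```python
import importlib


def module_exposes(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and has attribute `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its parents) is not installed.
        return False
    return hasattr(module, attr)


# The two names imported on line 23 of fvcore/common/checkpoint.py in the traceback.
for name in ("ObserverBase", "FakeQuantizeBase"):
    print(name, module_exposes("torch.quantization", name))
```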

"flop_count" for Detectron2 DeformConv

I'm trying to calculate the flops for Detectron2 Deformable Conv, but I'm having trouble figuring out what should be the name of the handler:

import typing
from typing import Any, List

def deform_conv_flop_jit(inputs: List[Any], outputs: List[Any]) -> typing.Counter[str]:
    """
    Count flops for deformable convolution.
    """

flops = FlopCountAnalysis(model, data)

# `name` is the operator name I am trying to figure out
handlers = {name: deform_conv_flop_jit}
flops.set_op_handle(**handlers)

According to the doc string, the name should be "in the form of list(torch._C.Value)". But I got TypeError: 'pybind11_type' object is not iterable when I try to check what's in this list. Could you let me know where I can figure out the name?

urllib HTTPError

Hi!
I tried to run the Detectron2 demo. The first time I executed it, it ran up to the point where it loads an image (I had given it a wrong path in the argument, so execution stopped there).
After I fixed the path and ran it a second time, I repeatedly got an HTTPError raised from the fvcore package. The traceback is below.
Is there any restriction on automatic downloads by fvcore?

[12/05 18:23:44 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml', input=['/home/kurapan/Datasets/YFCC100M/111/aac/111aac90261da6d67961cec95367b21a.jpg'], opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/137849600/model_final_f10217.pkl'], output=None, video_input=None, webcam=False)
WARNING [12/05 18:23:44 d2.config.compat]: Config 'configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml' has no VERSION. Assuming it to be compatible with latest v2.
model_final_f10217.pkl: 0.00B [00:01, ?B/s]
Failed to download https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/137849600/model_final_f10217.pkl
Traceback (most recent call last):
  File "demo/demo.py", line 73, in <module>
    demo = VisualizationDemo(cfg)
  File "/home/kurapan/Code/3rdparty/repos/detection/detectron2/demo/predictor.py", line 35, in __init__
    self.predictor = DefaultPredictor(cfg)
  File "/home/kurapan/Code/3rdparty/repos/detection/detectron2/detectron2/engine/defaults.py", line 161, in __init__
    checkpointer.load(cfg.MODEL.WEIGHTS)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/site-packages/fvcore/common/checkpoint.py", line 99, in load
    path = PathManager.get_local_path(path)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/site-packages/fvcore/common/file_io.py", line 529, in get_local_path
    )._get_local_path(path, **kwargs)
  File "/home/kurapan/Code/3rdparty/repos/detection/detectron2/detectron2/checkpoint/catalog.py", line 125, in _get_local_path
    return PathManager.get_local_path(self.S3_DETECTRON2_PREFIX + name)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/site-packages/fvcore/common/file_io.py", line 529, in get_local_path
    )._get_local_path(path, **kwargs)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/site-packages/fvcore/common/file_io.py", line 404, in _get_local_path
    cached = download(path, dirname, filename=filename)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/site-packages/fvcore/common/download.py", line 62, in download
    url, filename=tmp, reporthook=hook(t)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/kurapan/miniconda3/envs/detectron/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
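
To check whether the 403 comes from the server rather than from fvcore, the URL can be requested directly. The sketch below rebuilds the URL from the detectron2:// path and can report the HTTP status. The prefix is taken from the "Failed to download" line in the log; the helper names resolve_detectron2_url and probe are hypothetical.

```python
import urllib.error
import urllib.request

# Prefix as it appears in the "Failed to download" log line.
DETECTRON2_PREFIX = "https://dl.fbaipublicfiles.com/detectron2/"


def resolve_detectron2_url(path: str) -> str:
    """Map a detectron2:// path to the plain HTTPS URL."""
    scheme = "detectron2://"
    if not path.startswith(scheme):
        raise ValueError(f"not a detectron2:// path: {path!r}")
    return DETECTRON2_PREFIX + path[len(scheme):]


def probe(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 403 for the failure in the log


weights = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/137849600/model_final_f10217.pkl"
url = resolve_detectron2_url(weights)
print(url)
# status = probe(url)  # uncomment to query the server
```

If the direct request fails as well, one option is to fetch the file by other means and point MODEL.WEIGHTS at the local copy.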

Bug in ScaleTransform

If I'm not mistaken, the scale transform scales the width by the new_h parameter and the height by the new_w parameter.
The expected input is NxHxWxC (height first), but the size argument passed to interpolate is (self.new_w, self.new_h) (width first).
The call to to_float_tensor doesn't seem to swap the spatial dimensions either.

The line I'm concerned with is the following:

size=(self.new_w, self.new_h),

Maybe the following makes it clearer:

>>> new_h, new_w = 480, 720
>>> array = numpy.empty((1, 1080, 1920, 3), dtype=numpy.float64)
>>> float_tensor = torch.nn.functional.interpolate(
...     to_float_tensor(array),
...     size=(new_w, new_h),
...     mode="bilinear",
...     align_corners=False,
... )
>>> float_tensor.shape
torch.Size([1, 3, 720, 480])

Now height is new_w and width is new_h.
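
For comparison, a self-contained sketch of the same call with both argument orders. It uses smaller, hypothetical sizes and a tensor that is already in NxCxHxW layout, so to_float_tensor is not needed. For 4-D input, torch.nn.functional.interpolate reads size as (height, width).

```python
import torch
import torch.nn.functional as F

new_h, new_w = 48, 72
x = torch.zeros(1, 3, 108, 192)  # N x C x H x W

# Order used in the line quoted above: width first.
swapped = F.interpolate(x, size=(new_w, new_h), mode="bilinear", align_corners=False)
# Proposed order: height first.
fixed = F.interpolate(x, size=(new_h, new_w), mode="bilinear", align_corners=False)

print(swapped.shape)  # torch.Size([1, 3, 72, 48])
print(fixed.shape)    # torch.Size([1, 3, 48, 72])
```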

I'll submit a pull request, but I'm not sure if that's a breaking change and if that is a concern at this stage of the project.
Maybe you should warn the detectron2 team and anyone else who relies on this code.

Best Regards,
Lucas Steinmann
