Comments (3)
@maximsch2 I like the idea. The only thing I'm a bit afraid of is that if we don't check on each update, we will silently calculate wrong values. Therefore I like having the method explicitly, but I'd probably do it on an opt-out basis rather than opt-in (i.e. having a flag that defaults to true and can be set to false). What do you think?
@SkafteNicki @Borda thoughts?
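Roughly what I have in mind (just a sketch; the check_inputs flag and the shape check here are made up for illustration, not existing torchmetrics API):

import torch
from torchmetrics import Metric

class MyAccuracy(Metric):
    # hypothetical opt-out flag: checks run by default, can be disabled explicitly
    def __init__(self, check_inputs: bool = True):
        super().__init__()
        self.check_inputs = check_inputs
        self.add_state("correct", default=torch.tensor(0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        if self.check_inputs:
            # the (potentially expensive) validation users could opt out of
            if preds.shape != target.shape:
                raise ValueError("preds and target must have the same shape")
        self.correct += (preds == target).sum()
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        return self.correct.float() / self.total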
1. We are seeing slowdown from format checking [PyTorchLightning/pytorch-lightning#6605](https://github.com/PyTorchLightning/pytorch-lightning/issues/6605)
I would argue that the first step would be trying to lower the computational time of our implementations (they may not be optimal)
2. We would like to be able to do more sanity checking that metrics specified by a LightningModule are a correct fit for the task.
I think it is important that we remember that torchmetrics are not intended to be used only with lightning but also with native pytorch. Also the concept of task is more related to flash than lightning right?
If this really boils down to our implementations being too slow because we make sure that the user input is correct, I would argue that we should have some kind of flag:
import torchmetrics
torchmetrics.performance_mode = True
that turns off all checking (meant for users who know what they are doing).
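A rough sketch of how such a flag could be wired in (performance_mode is just the proposed name from above, and _input_format_checks is a placeholder for this illustration):

import torch
import torchmetrics

def _input_format_checks(preds: torch.Tensor, target: torch.Tensor) -> None:
    # all (potentially expensive) input validation lives here and can be skipped wholesale
    if getattr(torchmetrics, "performance_mode", False):
        return
    if preds.shape != target.shape:
        raise ValueError("preds and target must have the same shape")
    if target.is_floating_point():
        raise ValueError("target is expected to contain integer class labels")

def accuracy_update(preds: torch.Tensor, target: torch.Tensor):
    _input_format_checks(preds, target)
    correct = (preds == target).sum()
    total = torch.tensor(target.numel())
    return correct, total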
Performance optimization is not the main goal (as that can be addressed by other implementations anyway).
The connection between the training task and metrics is the key. Assume you are building a framework that allows people to train various tasks of different shapes, and you want to make metrics configurable in it. For simplicity, you can say that task==LightningModule, but of course this doesn't have to be the case. Now you need a way to pipe the output from an arbitrary model to a set of metrics. There are two ways:
- Let users write it manually - most flexible, but makes configuring things harder
- Explicitly support task_types and give the model the ability to do things generically.
In Lightning terms:
def training_step(self, batch):
    loss, outputs = self.model.get_loss_and_outputs(batch)
    # outputs is Dict[TaskType, TTaskTypeOutput]
    for task_type, output in outputs.items():
        for metric_name, metric in self.metrics_collection.items():
            # only feed an output to the metrics that can handle its task type
            if metric.supports(task_type):
                self.log(metric_name, metric(*output))
    return loss
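For concreteness, here is one way the pieces this snippet assumes could look (TaskType, TaskMetric and supports() are made-up names for this sketch, not anything that exists in torchmetrics today):

from enum import Enum, auto
import torchmetrics

class TaskType(Enum):
    # the kinds of outputs a model can emit
    CLASSIFICATION = auto()
    REGRESSION = auto()

class TaskMetric:
    # thin wrapper pairing a metric with the task types it knows how to consume
    def __init__(self, metric: torchmetrics.Metric, task_types: set):
        self.metric = metric
        self.task_types = task_types

    def supports(self, task_type: TaskType) -> bool:
        return task_type in self.task_types

    def __call__(self, *args):
        return self.metric(*args)

metrics_collection = {
    "acc": TaskMetric(torchmetrics.Accuracy(task="multiclass", num_classes=10),
                      {TaskType.CLASSIFICATION}),
    "mse": TaskMetric(torchmetrics.MeanSquaredError(), {TaskType.REGRESSION}),
}

With that in place, the model would return something like {TaskType.CLASSIFICATION: (logits, labels), TaskType.REGRESSION: (preds, targets)} and the loop above routes each output only to the metrics that support it.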
Why would the same model output different types? This can happen in various ways:
- Multi-tasking where you have a model with multiple heads outputting different types (e.g. classification head, regression head, similarity head, etc)
- Representation learning models where we can output both a set of class probabilities and an embedding representation of the object and want to compute metrics on both
- etc
I think it is important that we remember that torchmetrics are not intended to be used only with lightning but also with native pytorch. Also the concept of task is more related to flash than lightning right?
Right, flash is a more appropriate analogy.