
Comments (33)

koaning avatar koaning commented on August 18, 2024 48

I should mention that I have little knowledge of the tensorboard/tensorflow internals, and I was not aware of the database roadmap. To introduce the idea I have in mind, I will first explain the hack we currently use.

Current Practice

We usually write command-line apps with fire to run a bit of grid search. From the command line we give the run a name and set the hyperparams (which optimiser, learning rate, dropout chance, etc.). Something like:

$ tensoryo train --lr 0.001 --opt adam --dropout 20 --name 8-layers 

The command line app tensoryo will create a log folder over at <logdir>/8-layers-lr=0.001-opt=adam-dropout=20-id=<hash> and train a model.

Every run gets a randomised hash so that each run has a unique id.

Our current hack involves pushing the hyperparams into the log folder name and afterwards grabbing any relevant data from the tf logs using the EventAccumulator. With a few loops and pandas we get what we want: a simple summary of all models, where all the hyperparams are encoded in the log folder name. It is a hack, but it works very well for us.
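
A rough sketch of this hack (the log directory and scalar tag names here are only illustrative, not the actual ones we use) looks something like:

    import os

    import pandas as pd
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    logdir = "logs"  # illustrative root log directory
    rows = []
    for run in os.listdir(logdir):  # e.g. "8-layers-lr=0.001-opt=adam-dropout=20-id=abc123"
        # Pieces containing "=" are hyperparams; the remaining pieces form the run name.
        parts = run.split("-")
        row = dict(p.split("=", 1) for p in parts if "=" in p)
        row["name"] = "-".join(p for p in parts if "=" not in p)

        acc = EventAccumulator(os.path.join(logdir, run))
        acc.Reload()
        for tag in acc.Tags()["scalars"]:  # e.g. "acc_train", "acc_val" (illustrative tags)
            row[tag] = acc.Scalars(tag)[-1].value  # keep the last logged value per scalar
        rows.append(row)

    summary = pd.DataFrame(rows)  # one row per run: hyperparams + final metrics
    print(summary)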

The main use case for these hyperparams, at least for us, is to generate a large table of all the runs we've ever done in order to determine whether there is a pattern. I imagine that this is the main use case for most people: to be able to query summary statistics.

What Might Be Better

We are really just interested in getting a table with consistent summaries for every run. It would be great if tensorboard had a table view for something like:

id          dropout  lr     alg   acc_train  acc_val
8-layers-1  20       0.001  adam  0.97       0.96
8-layers-2  20       0.001  adam  0.94       0.95
8-layers-3  30       0.001  sgd   0.98       0.94
8-layers-4  30       0.001  sgd   0.99       0.93

You should very easily be able to add support for this by having a summary method for it. This might look something like:

tf.summary.attribute('id', name + str(uuid()))
tf.summary.attribute('dropout', 20)
tf.summary.attribute('lr', 0.001)
tf.summary.attribute('alg', 'adam')
tf.summary.attribute('acc_train', training_acc.eval())
tf.summary.attribute('acc_val', validation_acc.eval())

Capturing hyperparameters automatically is certainly not trivial, but I wonder if that is the feature developers really need. I imagine developers would not mind setting these things explicitly, as long as they are logged. Currently, though, I am not aware of such a feature in tensorflow/tensorboard.

We're looking into perhaps making a custom UI for this either way, but it feels like a rather general feature for tensorboard that would address the hyperparam issue. I'm very curious to hear from you folks whether this sounds sensible. It might very well be that my use case is different from the general one.


carlthome avatar carlthome commented on August 18, 2024 22

This is the best temporary workaround for a tf.estimator's params dict that I could come up with, though it doesn't let you compare runs. I would love something sortable like what @koaning describes.

import tensorflow as tf

# One [key, value] row per hyperparameter, rendered by the Text plugin.
hyperparameters = [tf.convert_to_tensor([k, str(v)]) for k, v in params.items()]
tf.summary.text('hyperparameters', tf.stack(hyperparameters))

[Screenshot: hyperparameters rendered as a text summary in TensorBoard]


hardianlawi avatar hardianlawi commented on August 18, 2024 21

Is there any update on this?


GalOshri avatar GalOshri commented on August 18, 2024 19

Hi all, apologies about the delay, but check out the HParams dashboard in TensorBoard here. This documentation can easily be run in Colab for you to try it out.
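
For anyone skimming this thread later, a minimal sketch of writing summaries for this dashboard with the hparams plugin API (assuming TF 2.x-style summary writing; the log directory, values, and metric tag are only illustrative) is roughly:

    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    with tf.summary.create_file_writer("logs/run-1").as_default():
        hp.hparams({"lr": 1e-3, "dropout": 0.2, "optimizer": "adam"})  # this run's hyperparameters
        # ... train the model ...
        tf.summary.scalar("acc_val", 0.96, step=1)  # a metric the dashboard can show alongside them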

This dashboard provides a table similar to the one suggested by @koaning above, but also enables two other visualizations such as the parallel coordinates view:
[Screenshot: parallel coordinates view]

This is still evolving, so it would be great to get your feedback and discuss how to make this better.

/cc @erzel


Spenhouet avatar Spenhouet commented on August 18, 2024 5

@GalOshri Nice to see some progress on this! I really like the new view.

My concern is the ton of boilerplate code that currently seems to be necessary.
I wish you would reduce it to something like:

tf.summary.hparam(value, name)
tf.summary.metric(value_tensor, name)

That would make it much more usable and I don't see why that wouldn't be possible.
What do you think? Are you planning to reduce it to that?

EDIT 1: If a set of runs is selected in the new hparams view and I switch to the scalar, histogram or image view, do I get the same selection in these views?

EDIT 2: Is it planned to enable viewing hyperparameters in the tooltip on the scalar view?

EDIT 3: Why do we need to define metrics for this view anyway? Couldn't all scalars be used for this? It could be selectable which scalars we would like to view in the parallel coordinates view.
This would also allow a step slider for this view.


mamaj avatar mamaj commented on August 18, 2024 5

Based on @carlthome's answer, you can simply turn a dictionary of your hyperparameter name/values into a markdown table:

def dict2mdtable(d, key='Name', val='Value'):
    rows = [f'| {key} | {val} |']
    rows += ['|--|--|']
    rows += [f'| {k} | {v} |' for k, v in d.items()]
    return "  \n".join(rows)

Then add a text summary, e.g. for PyTorch:

writer.add_text('Hyperparams', dict2mdtable(hparam_dict), 1)
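
Put together, a self-contained sketch of that (the run directory and values are just placeholders):

    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter("runs/example")  # placeholder log directory
    hparam_dict = {"lr": 1e-3, "dropout": 0.2, "optimizer": "adam"}
    writer.add_text("Hyperparams", dict2mdtable(hparam_dict), 1)  # shows up in the Text tab as a table
    writer.close()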

[Screenshot: rendered hyperparameter table in the Text tab]


ricoms avatar ricoms commented on August 18, 2024 4

These two projects allow hyperparameter logging, among other things:

  1. https://mlflow.org/
  2. https://www.comet.ml/

They might help.


koaning avatar koaning commented on August 18, 2024 3

After some rethinking, it may be better to have the id column always map to the name of the logdir.


berendo avatar berendo commented on August 18, 2024 3

Please pardon my ignorance, but I'm having a bit of difficulty (wholly) reconciling the later discussions here with the work done in the hparams plugin. Is the hparams plugin still the intended target for integrating the features discussed in this issue? If so, can someone (@jart?) summarize the current state of things? This is an area that I could (and would like to) contribute to...


dvisztempacct avatar dvisztempacct commented on August 18, 2024 3

Any movement or discussion on this issue? There are people willing to contribute if we know what direction the maintainers want things to go.


jfaleiro avatar jfaleiro commented on August 18, 2024 2

An alternative would be to allow the association of a dictionary to a FileWriter, and allow the regex-search of keys or values of that dictionary, similar to the way it is currently done for names of runs. Is there anything similar available?


erzel avatar erzel commented on August 18, 2024 1

@Spenhouet Thanks so much for using the tool and the thoughtful feedback.

It wasn't my intention to see all scalars by default. The intuition here is more that the new metrics feel redundant. I think it is safe to say that the metrics someone defines will be within the scalars. Also, the scalars already contain all necessary information about this metric. Implementing it in such a general way that scalars can be used as metrics would make the system much more flexible (and save boilerplate code).

Currently, if one does not emit an 'experiment' summary, the plugin backend will try to create one by scanning all the sub-directories of the log dir and gathering session-info summaries. This would have the effect of collecting all the scalar summaries as metrics (and by default enabling only the first few of them for display).

As already mentioned, this would also allow for the addition of a step slider in this new view. I think that would be important since I might not be interested in the end state of my training but rather in a peak state some epochs ago.

Good idea. Indeed, we've already gotten this request in the past. There are actually two different requests here: 1) enabling seeing metrics for intermediate steps, and 2) finding the step for each session (~run) where some chosen metric is optimized (maximized or minimized), and then using that step to get the metrics for that run and compare all runs using one of the visualizations.

In terms of priority I'd think 2) is probably what you'd really want to see first, but I'm wondering what you think?

Note also that not every step contains metrics and different runs may have different steps for which metrics are available, so for 1) I'm not sure what's the best way to implement a "step" slider here. Should we get the metric from the closest step for which a metric is available?


mmuneebs avatar mmuneebs commented on August 18, 2024

Hi, is this feature still in the plans?


jart avatar jart commented on August 18, 2024

Yes, storing hyperparameters is something that has long been in the works and should happen, but we haven't had the cycles to implement it yet.


koaning avatar koaning commented on August 18, 2024

We have a serious use case for this and wouldn't mind spending a bit of time implementing it. Is there a branch where some of this work is being done? I would not mind contributing!


jart avatar jart commented on August 18, 2024

How would you implement it? Because it's not a simple problem to solve. Right now I'm leaning in the direction that it might be best to wait until more of the database stuff is done. But if you have ideas, I'd love to listen.


neighthan avatar neighthan commented on August 18, 2024

I would also be very appreciative of just a nice way to show a table of parameters for each model. Currently I use the Text plugin for this, which works okay for making a little table for each model. However, it would be nicer if there were an integrated table (across all of the selected models) that supported simple things like sorting by a given column or filtering. I agree with @koaning that I don't need TensorBoard to automatically pull in any flags; I'd be happy to explicitly pass in a list of table headers and the values for a particular model (right now, with the Text plugin, I do this with a 2D list of strings like [['header1', 'header2'], ['val1', 'val2']]).
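
For what it's worth, a minimal sketch of that Text-plugin table (assuming TF 1.x graph-mode summaries; the tag and log directory are just placeholders) could be:

    import tensorflow as tf

    # A 2-D string tensor is rendered as a table by the Text plugin.
    table = tf.constant([["header1", "header2"], ["val1", "val2"]])
    summary_op = tf.summary.text("params", table)

    with tf.Session() as sess:
        writer = tf.summary.FileWriter("logs/run-1", sess.graph)
        writer.add_summary(sess.run(summary_op), global_step=0)
        writer.close()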


ahundt avatar ahundt commented on August 18, 2024

This would be fantastic to have!


Spenhouet avatar Spenhouet commented on August 18, 2024

@koaning This would be wonderful.

I wish this form of table would just replace the current list of runs.
Every column with hyperparameters should be filterable.

The hyperparameters should also be shown in the tooltip for the plots.

I like the simple approach here:

tf.summary.attribute('lr', 0.001)

What is the current state of this issue? @koaning your comment is a year old. Are you still interested?


koaning avatar koaning commented on August 18, 2024

@Spenhouet I am certainly still interested. I should say I haven't done much with tensorboard in a while, so I don't know if anybody has committed something meaningful here.

I'm not aware of the current status of the roadmap though. A year ago I recall hearing a warning about something internal changing.


dvisztempacct avatar dvisztempacct commented on August 18, 2024

@Spenhouet I've been watching this thread because I'm also very interested in such a feature. I've come across this which seems to exist expressly for this purpose:

https://www.tensorflow.org/api_docs/python/tf/contrib/training/HParams

I'm somewhat actively trying to figure out how to use this to make my experiments searchable (edit) and visualizable (e.g. plotting min validation loss vs. some hyperparam, as opposed to plotting validation loss vs. training step.) If you're doing similar, let's share notes :)
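
For reference, tf.contrib.training.HParams only holds and serializes the values; TensorBoard doesn't pick it up automatically. A minimal sketch (assuming TF 1.x; the names and values are just examples):

    import tensorflow as tf

    # Container for named hyperparameters with type-checked overrides.
    hparams = tf.contrib.training.HParams(learning_rate=1e-3, num_layers=8, optimizer="adam")
    hparams.parse("learning_rate=0.01,num_layers=4")  # override from a command-line style string
    print(hparams.learning_rate)  # 0.01
    print(hparams.to_json())      # serialize, e.g. to log alongside the run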


ajbouh avatar ajbouh commented on August 18, 2024

I am also interested in a feature along these lines.

I've experimented with changing the name of the log dir based on the names and values of hyperparameters. I use a format string like: "mymodel_beta={{beta}}_gamma={{gamma}}"

And I patch the tensorboard log dir loading logic to use the resulting string instead of the directory's basename.

This makes it possible to do basic regex filtering of runs.


dvisztempacct avatar dvisztempacct commented on August 18, 2024

@ajbouh I think that's typically what people end up doing (overloading log dir) but we must do better! 🥇

Can you link to your patch/branch?


ajbouh avatar ajbouh commented on August 18, 2024


dvisztempacct avatar dvisztempacct commented on August 18, 2024

@jart it says I unassigned you:

[Screenshot from 2018-12-04 showing the unassignment]

No idea how I might have done this...


cwbeitel avatar cwbeitel commented on August 18, 2024

@GalOshri This is awesome, great work.

My first thought is that we (by way of using tensor2tensor) are logging a large number of hparams to a json file in the root of each ckpt dir. We could write a script that produces a ckpt summary from such a json, but since you're asking, it would be convenient if tboard had this built in (i.e. parsed and loaded data from an hparams.json file when present). But idk if that would perhaps be limiting or otherwise not aligned with others' use patterns.


GalOshri avatar GalOshri commented on August 18, 2024

I think it would be great to have some common format so users don't need to do the conversion themselves or duplicate the hparam configuration. Another option is also for tensor2tensor to directly log the summary to TensorBoard (as it can already log to TensorBoard). Is your hparams.json always a flat list or are there ever nested hyperparameters?

/cc @erzel @wchargin @manivaradarajan for their thoughts as well.


cwbeitel avatar cwbeitel commented on August 18, 2024

@GalOshri Having t2t write the necessary summaries sounds appropriate for those using t2t. I could also see it from the perspective of writing to a common format that is used more generally.

Currently always flat, for example. But this does create some challenges, and I would prefer to organize these hierarchically. I think some others might be working on that using gin, which might look something like this.


wchargin avatar wchargin commented on August 18, 2024

Hi @Spenhouet! Thanks for the feedback. We definitely agree—as noted in
the tutorials, this is a very rough cut of the Python API. We’re using
issue #1998 to track improvements to this API.

If a set of runs is selected in the new hparams view and I switch to
the scalar, histogram or image view, do I get the same selection in
these views?

Currently: no. This is also something that we’re interested in; it ties
into broader questions about how TensorBoard should consider “runs” in
general.

Is it planned to enable viewing hyperparameters in the tooltip on the
scalar view?

We haven’t discussed this much; if we were to do this, we’d need to be
careful to not make the tooltips so wide as to be unwieldy.

Why do we need to define metrics for this view anyway? Couldn't all
scalars be used for this? It could be selectable which scalars we
would like to view in the parallel coordinates view.

Many TensorBoard jobs include scalar summaries that may not be useful as
metrics: for instance, logging annealing functions over time (learning
rate decay, regularization parameters, teamwork coefficients for RL), or
diagnostics about health of per-layer activations (sparsity, extent), or
operational statistics like training steps per second or convergence
rate. We frequently see TensorBoard jobs with many dozens of scalar
summaries, which would make a parallel coordinates plot or multi-scatter
view harder to interpret.


Spenhouet avatar Spenhouet commented on August 18, 2024

@wchargin Thank you for your feedback and good to hear this is in the works.

Is it planned to enable viewing hyperparameters in the tooltip on the
scalar view?

We haven’t discussed this much; if we were to do this, we’d need to be
careful to not make the tooltips so wide as to be unwieldy.

It definitely would need to be selectable which hparams are shown in the tooltip. I agree that showing all hparams in the tooltip by default wouldn't work. I think it would be better to show no hparams in the tooltip by default, but with the option to select hparams that should be shown. Then, if needed, someone can just add the hparams that are of interest to the tooltip.

Why do we need to define metrics for this view anyway? Couldn't all
scalars be used for this? It could be selectable which scalars we
would like to view in the parallel coordinates view.

We frequently see TensorBoard jobs with many dozens of scalar
summaries, which would make a parallel coordinates plot or multi-scatter
view harder to interpret.

It wasn't my intention to see all scalars by default. The intuition here is more that the new metrics feel redundant. I think it is safe to say that the metrics someone defines will be within the scalars. Also, the scalars already contain all necessary information about this metric.
Implementing it in such a general way that scalars can be used as metrics would make the system much more flexible (and save boilerplate code).

The parallel coordinates view would be handled similarly to the tooltip: if someone is interested in a metric, it is selectable from the scalars. By default no metric is selected and only hparams are shown.

As already mentioned, this would also allow for the addition of a step slider in this new view. I think that would be important since I might not be interested in the end state of my training but rather in a peak state some epochs ago.


Spenhouet avatar Spenhouet commented on August 18, 2024

@erzel Correct me if I'm wrong, but the steps (frequency and size) are defined by the act of saving a summary. Different steps only appear if multiple FileWriters are used (given a proper implementation).

Yes, you are right that there isn't a step slider that fits all since it is not clear how different steps relate to each other, but I'm not sure this is such a hard problem to solve.

EDIT (changed my opinion on option 2)
Option 2) definitely could be an option on its own, a form of min/max mode. This mode would detach results from steps and compare models on their best performance.

  1. To enable seeing metrics for intermediate steps

I think that this should be the goal. Adding that this also includes seeing hyperparameters for intermediate steps, which is equally important. Any step, given any reason / motivation, could be interesting to me.

I see different ways to go about this issue:

Given the multiple FileWriter assumption, it could be possible to split between those. Only scalars from the same FileWriter can be mixed and matched in the parallel coordinates view. Most of the time these denote training and testing.

Another option could be to add a form of global step variable. The user would need to define the smallest measure of step size which is universally applicable. This would allow to view any metric in combination with each other.

Yet another option could be to create sets of equal steps (size and frequency). Only hyperparameters and metrics that have the same steps could be mixed within the same view.

We could also assume that the beginning and ending of all sequences, independent of their steps, are at the same time / step. From that we could interpolate all matching steps. This would result in the smoothest experience BUT could be wrong in cases where the beginnings or endings don't match.


williamFalcon avatar williamFalcon commented on August 18, 2024

These two projects allow hyperparameter logging, among other things:

  1. https://mlflow.org/
  2. https://www.comet.ml/

They might help.

Test Tube


paloha avatar paloha commented on August 18, 2024

@GalOshri Nice to see some progress on this! I really like the new view.

My concern is the ton of boilerplate code that currently seems to be necessary.
I wish you would reduce it to something like:

tf.summary.hparam(value, name)
tf.summary.metric(value_tensor, name)

That would make it much more usable and I don't see why that wouldn't be possible.
What do you think? Are you planning to reduce it to that?

EDIT 1: If a set of runs is selected in the new hparams view and I switch to the scalar, histogram or image view, do I get the same selection in these views?

EDIT 2: Is it planned to enable viewing hyperparameters in the tooltip on the scalar view?

EDIT 3: Why do we need to define metrics for this view anyway? Couldn't all scalars be used for this? It could be selectable which scalars we would like to view in the parallel coordinates view.
This would also allow a step slider for this view.

I just wanted to make a feature request, but I have seen this comment from @Spenhouet, which basically describes exactly what I need in EDIT 1. We have more than a thousand experiments logged in Tensorboard. The hyperparameters plugin is awesome, but it would be really useful if we could use the same selection from hyperparameters in other plugins like scalars, images, etc. Now I need to use a regex with e.g. 10 trial_id values, which is annoying to type in.

Regarding EDIT 2 - this would also be a very nice feature, though currently I use the most important hyperparameters to name the experiment folder, so the tooltip that shows the run folder basically shows me the hyperparameters. Maybe this helps until it is implemented. I also do this in order to navigate through my experiments quickly in the file manager even without Tensorboard. My experiment folder looks e.g. like this: v_0.8625-t_0.8875-b_00122-s_005-w_970-o_adam-r_11min-seed_1-id_1580606051.332965, so I can sort it in the file manager according to the highest validation accuracy and see everything else I am interested in.

Regarding EDIT 3 - I am using tensorboard.plugins.hparams in Keras and I wanted to have the information about the highest val_accuracy metric (best epoch) in the hyperparameters. Right now there is just the value from the last epoch, the one on which the EarlyStopping callback terminated the training. I have solved this by simply adding best_epoch_val_accuracy as a hyperparameter to tensorboard.plugins.hparams.api.hparams(config). The only difference is that it does not show up in the metrics sidebar, but I do not care about that.
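
A sketch of that workaround in Keras (the data, model, hyperparameter values, and log directory here are only stand-ins) might look like:

    import numpy as np
    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    logdir = "logs/run-1"                  # stand-in log directory
    config = {"lr": 1e-3, "dropout": 0.2}  # stand-in hyperparameters

    # Tiny stand-in model and data so the sketch runs on its own.
    x, y = np.random.rand(256, 10), np.random.randint(0, 2, size=256)
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
    model.compile(optimizer=tf.keras.optimizers.Adam(config["lr"]),
                  loss="binary_crossentropy", metrics=["accuracy"])
    history = model.fit(x, y, validation_split=0.2, epochs=20, verbose=0,
                        callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)])

    # Log the best (not the last) validation accuracy as an extra "hyperparameter".
    config["best_epoch_val_accuracy"] = float(np.max(history.history["val_accuracy"]))
    with tf.summary.create_file_writer(logdir).as_default():
        hp.hparams(config)  # appears as a column in the HParams table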

