
tensorflow / profiler-ui


[Deprecated] The TensorFlow Profiler (TFProf) UI provides a visual interface for profiling TensorFlow models.

License: Apache License 2.0

Python 0.39% CSS 0.29% HTML 98.47% JavaScript 0.86%

profiler-ui's Introduction

Note: This project has been deprecated in favor of TensorBoard

https://www.tensorflow.org/tensorboard/r2/tensorboard_profiling_keras

TensorFlow Profiler UI

The TensorFlow Profiler (TFProf) UI provides a visual interface for profiling TensorFlow models.

Installation

  1. Install Python dependencies.
    pip install --user -r requirements.txt
  2. Install pprof.
  3. Create a profile context file using the tf.contrib.tfprof.ProfileContext class.
  4. Start the UI.
    python ui.py --profile_context_path=/path/to/your/profile.context

Learn more

You can learn more about the TensorFlow Profiler's Python API and CLI here.

Browser support

Currently only Chrome is supported.

Contributing

Please see our contributor's guide

Feature requests

Want ideas for ways to contribute to the TensorFlow Profiler UI? Here are some requested features:

  • Support multiple profile contexts at once (#11)

profiler-ui's People

Contributors

chrisantaki, wonglkd


profiler-ui's Issues

can not produce context file

I use tf.contrib.tfprof.ProfileContext as follows, but it does not produce any context file:

with tf.contrib.tfprof.ProfileContext('./test') as pctx:
    train(model, train_data_files, val_data_file)

How can I solve this?
Thank you very much.

Check failed: PyBfloat16_Type.tp_base != nullptr

After following the installation instructions, I get the following error when attempting to start up the UI:

python ui.py --profile_context_path=/opt/devel/src/autocoder/rad-autocoder/model_configurable/output/profile_100
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError: numpy.core.multiarray failed to import

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 968, in _find_and_load
SystemError: <class '_frozen_importlib._ModuleLockManager'> returned a result with an error set
ImportError: numpy.core._multiarray_umath failed to import
ImportError: numpy.core.umath failed to import
2019-03-12 12:50:48.229931: F tensorflow/python/lib/core/bfloat16.cc:675] Check failed: PyBfloat16_Type.tp_base != nullptr
Abort trap: 6

Profiled time for one step higher than in actual training

The time for one training step as reported by the profiler is much higher than the time one step takes during actual training. For example, the profiler shows that one training step of my model takes around 4 seconds, whereas in actual training a step takes less than 1 second on average.
Please note that I am not referring to the slowdown of TensorFlow during profiling.
I am wondering what the reason for this is. One hypothesis I have is that during actual training many ops run in parallel, whereas the profiler probably just adds up the time of each individual operation to report the total time?
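
The hypothesis above can be illustrated with a small sketch. It is not how TFProf is known to compute step time; it only shows how far apart the two numbers can be when ops overlap. The op intervals are made-up values, and the helper names summed_time and wall_clock_time are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds) for one op


def summed_time(ops: List[Interval]) -> float:
    """Add up every op's duration, ignoring any overlap between ops."""
    return sum(end - start for start, end in ops)


def wall_clock_time(ops: List[Interval]) -> float:
    """Measure the time covered by at least one op (union of intervals)."""
    covered = 0.0
    current_start, current_end = None, None
    for start, end in sorted(ops):
        if current_end is None or start > current_end:
            # Gap before this op: close the previous merged interval.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps the merged interval: extend it if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered


# Hypothetical step: four ops of 1 s each, all running at the same time.
parallel_ops = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
print(summed_time(parallel_ops))      # 4.0
print(wall_clock_time(parallel_ops))  # 1.0
```

If the profiler reports something closer to the first number while training experiences the second, a gap like the one described would be expected.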

Insufficient documentation

This TensorFlow Profiler UI looks promising, but the documentation is too brief. I installed pprof and bazel. I managed to make it work in the CLI following these instructions, but I could not make it work in the web browser. Please give us some more details.

Error on request...

Hi Chris !
Just to let you know that I'm facing an issue similar to #3. I'm trying to use the tfprof UI (in Google Chrome) with the following setup:

Experiment Side
conda environment (TLDR: tensorflow-gpu 1.6.0 - python 3.6.4)
conda_env.txt
corresponds to the environment in which I run my model and where the file "profile.context" is generated by the following Python line (as stated here):
with tf.contrib.tfprof.ProfileContext('/tmp/train_dir') as pctx:

Profiling Side
virtual environment (TLDR: tensorflow 1.6.0 - python 3.6.4)
venv.txt
corresponds to the place where I run the following command:
python ui.py --profile_context_path=~/path_to_something/profile.context
I followed your installation guide.

Issue
The "profile.context" file (the real filename is profile_100 by default) is indeed generated (size ~5 MB) only once, at step 100, as I only need one profiling step for now.
The tfprof UI does open in Google Chrome, showing all the buttons, scroll bars, and so on that it should show.
My issue is that the interface is empty in my case:

  • Timeline tab : The server returned an error.
  • PPROF : The server seems to be offline.
  • Python : Profile was not generated.

Note: there is no VM involved. Everything runs locally on the same machine, and I use the Experiment/Estimator APIs.

Sorry for the long post and thanks for reading this !
Cheers
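
One thing that may be worth ruling out for reports like this (an assumption, not a confirmed diagnosis): the path given to --profile_context_path may not point at an existing file. Some shells do not expand ~ when it follows = inside an argument, and the generated file is named profile_100 rather than profile.context. A minimal sketch of a pre-flight check follows; the helper resolve_profile_path is hypothetical and is not part of ui.py.

```python
import os


def resolve_profile_path(raw_path: str) -> str:
    """Expand ~ and return an absolute path, or raise if no file is there."""
    path = os.path.abspath(os.path.expanduser(raw_path))
    if not os.path.isfile(path):
        raise FileNotFoundError("No profile file at: " + path)
    return path
```

Calling resolve_profile_path on the value you intend to pass, before starting the UI, shows whether the server would be reading a real file.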

Profiler UI does not work with Python3 with error: No module named 'route_handlers'

Looks like this UI can only be run under Python 2 (i.e. python2 ui.py --profile_context_path=/tmp/train_dir/profile). With Python 3, it fails with this error message:

Traceback (most recent call last):
  File "ui.py", line 19, in <module>
    from server.server import start_server
  File "/home/xiaoyzhu/notebooks/profiler-ui/server/server.py", line 20, in <module>
    from route_handlers import handle_home_page
ImportError: No module named 'route_handlers'

Just want to make sure it's tracked and hopefully it's helpful for people who meet the same issue.
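
The traceback is consistent with an implicit relative import, which Python 2 allowed inside a package and Python 3 does not. The sketch below reproduces the behaviour in a throwaway package. It assumes route_handlers.py sits next to server.py inside the server package, which the traceback suggests but does not prove. The package name demo_server and the helper imports_cleanly are made up for the demo.

```python
import importlib
import os
import sys
import tempfile

PACKAGE = "demo_server"  # hypothetical package name, used only for this demo


def imports_cleanly(import_line: str) -> bool:
    """Build a throwaway package whose server.py contains import_line,
    then report whether importing that module succeeds."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, PACKAGE)
        os.makedirs(pkg_dir)
        files = {
            "__init__.py": "",
            "route_handlers.py": "def handle_home_page():\n    return 'home'\n",
            "server.py": import_line + "\n",
        }
        for name, body in files.items():
            with open(os.path.join(pkg_dir, name), "w") as handle:
                handle.write(body)

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module(PACKAGE + ".server")
            return True
        except ImportError:
            return False
        finally:
            sys.path.remove(root)
            # Forget the demo modules so each call starts from scratch.
            for mod in [m for m in sys.modules if m.split(".")[0] == PACKAGE]:
                del sys.modules[mod]


# Python 2 style implicit relative import, as in the traceback above.
print(imports_cleanly("from route_handlers import handle_home_page"))   # False on Python 3
# Explicit relative import, which Python 3 requires inside a package.
print(imports_cleanly("from .route_handlers import handle_home_page"))  # True
```

Under that assumption, changing the import in server/server.py to the explicit form would be the usual fix, though other Python 2 only code may remain elsewhere in the project.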

merge multiple profile context together

Use cases:

  • I have two training processes running on the same GPUs. I want to profile each process but view them at the same time in the profiler-ui.

The current profiler UI takes --profile_context_path=path/to/directory/profile

Is it possible to visualize multiple profiles, as TensorBoard does, with --profile_context_path=path/to/directory?
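
The UI does not support this as far as this page shows. As a rough sketch of what the discovery step for a directory argument might look like, the snippet below lists profile files in a directory. It assumes files are named profile_<step>, as with the profile_100 files mentioned in other issues here. The helper find_profiles is hypothetical.

```python
import os
import re
import tempfile
from typing import Dict

# Assumption: profile files are named profile_<step>, e.g. profile_100.
PROFILE_NAME = re.compile(r"^profile_(\d+)$")


def find_profiles(directory: str) -> Dict[int, str]:
    """Map step number -> path for every profile_<step> file in directory."""
    found = {}
    for name in os.listdir(directory):
        match = PROFILE_NAME.match(name)
        path = os.path.join(directory, name)
        if match and os.path.isfile(path):
            found[int(match.group(1))] = path
    return found


# Demo on a temporary directory with two profiles and one timeline file.
with tempfile.TemporaryDirectory() as train_dir:
    for name in ("profile_100", "profile_200", "timeline_100_11"):
        open(os.path.join(train_dir, name), "w").close()
    print(sorted(find_profiles(train_dir)))  # [100, 200]
```

Loading and displaying several profiles side by side would still need changes in the server and the front end; this only covers finding the files.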

Problem getting UI to display data from tf profiler

I added code to the NMT train.py script. I have run the NMT trainer many times, and it works fine. This created a profile:
with tf.contrib.tfprof.ProfileContext('/home/levinth/train_dir') as pctx:
    while global_step < num_train_steps:
        ### Run a step ###
        start_time = time.time()
        try:
            step_result = loaded_train_model.train(train_sess)
            (_, step_loss, step_predict_count, step_summary, global_step,
             step_word_count, batch_size) = step_result
            hparams.epoch_step += 1
        except tf.errors.OutOfRangeError:
            etc..etc
ls -l ~/train_dir/
total 52688
-rw-rw-r-- 1 levinth levinth 8912664 Mar 8 11:43 profile_100
-rw-rw-r-- 1 levinth levinth 5977 Mar 8 12:11 timeline_100_100
-rw-rw-r-- 1 levinth levinth 45030165 Mar 8 12:11 timeline_100_11

I installed Go and used it to install pprof.
I set PYTHONPATH to my install of the r1.5 wheel built for CUDA 9.1, libcudnn 7.0.4, and Python 2.
I have to maintain many versions of TF due to limitations of assorted applications.
export PYTHONPATH=~/tf_r1.5_c91_py2/
I removed the requirement on TF r1.4.1 from requirements.txt,
then did the pip install:
pip install --user -r requirements.txt

Invoking ui.py did the following:
python ui.py --profile_context_path=/home/levinth/train_dir/profile_100
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters

* Running on http://0.0.0.0:7007/ (Press CTRL+C to quit)
127.0.0.1 - - [08/Mar/2018 13:33:26] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [08/Mar/2018 13:33:26] "GET /static/js/app.js?ts=1520544806.33 HTTP/1.1" 200 -
127.0.0.1 - - [08/Mar/2018 13:33:26] "GET /static/css/style.css?ts=1520544806.33 HTTP/1.1" 200 -
127.0.0.1 - - [08/Mar/2018 13:33:27] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [08/Mar/2018 13:33:27] "GET /static/css/style.css?ts=1520544807.2 HTTP/1.1" 200 -
127.0.0.1 - - [08/Mar/2018 13:33:27] "GET /static/js/app.js?ts=1520544807.2 HTTP/1.1" 200 -
127.0.0.1 - - [08/Mar/2018 13:33:27] "GET /static/images/tf-400.png HTTP/1.1" 200 -

The GUI came up, but I could not get it to display anything.
Any assistance would be greatly appreciated.

Problem getting UI working

I started a training run inside with tf.contrib.tfprof.ProfileContext('/tmp/train_dir') as pctx:
and then tried to launch the UI with python ui.py --profile_context_path=/tmp/train_dir/profile_100

I did see a warning in the terminal, but otherwise it looked fine:
* Serving Flask app "server.server" (lazy loading)
* Environment: production
WARNING: Do not use the development server in a production environment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://0.0.0.0:7007/ (Press CTRL+C to quit)

And this is what I saw in the UI at http://0.0.0.0:7007/:
screen shot 2018-05-03 at 3 55 27 pm

Could somebody help me with that? Thanks!

Tensorflow: 1.4.1
Browser: Chrome

Creating a release.

Would it be possible to cut a release of this project? I am unable to get through the bureaucratic process of bringing this software into the firm I work at because there is technically no release of this project.

Thanks

Archive repository

Since the repo is deprecated I think it would make sense to archive it.
