Comments (9)
See also #84
from training.
OK, didn't see it before. Feel free to close this issue.
Hi Lucien (@lucienwang1009), I wanted to link the two issues. Your finding makes it more urgent to update the TF version specified in the Dockerfile.
This is unfortunate. Thank you for letting us know. If this is blocking you, feel free to link a PR here and I'll take a look at it. Otherwise we will try to address this in the coming days.
I agree with @nvcforster that the Dockerfile ought to use a stable version rather than a nightly build that can change. I would also prefer that the Dockerfile expose variables for setting the versions of TF, CUDA, cuDNN, etc.
By definition, a reference result is the quality or performance of a benchmark running on a given system with a given framework. So the Dockerfile, as the definition of the system environment, should provide a way to specify these software versions.
Anyway, I would really like you to consider and address this issue ASAP, or at least pin an available stable version.
BTW: in the translation benchmark, tf-nightly-gpu is installed a second time inside the Docker container, which is redundant.
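The suggestion above to expose software versions as Dockerfile variables could be sketched roughly as follows. This is only an illustration, not the repository's actual Dockerfile: the argument names `TF_PACKAGE` and `TF_VERSION` are hypothetical, and no particular version is implied to work with the benchmark.

```dockerfile
# Fragment only: place after the existing FROM line.
# Hypothetical build arguments (assumptions, not part of the original Dockerfile).
# TF_PACKAGE selects the pip package name; TF_VERSION deliberately has no default,
# so the exact version has to be chosen explicitly at build time.
ARG TF_PACKAGE=tensorflow-gpu
ARG TF_VERSION

# Install line with the TensorFlow package pinned through the arguments above.
RUN pip3 install --upgrade numpy scipy sklearn "${TF_PACKAGE}==${TF_VERSION}"
```

The version would then be supplied with something like `docker build --build-arg TF_VERSION=<version> .`. CUDA and cuDNN versions are usually determined by the base image, so the same pattern could be applied to the image tag in the `FROM` line, using an `ARG` declared before it.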
Could somebody at least post a functioning edit for the Dockerfile line?
RUN pip3 install --upgrade numpy scipy sklearn tf-nightly-gpu==1.9.0.dev20180419
I'll take a look at this in the next day or two... sorry for the issue here.
In the meantime, one option is to take a look at the source of this fork:
https://github.com/tensorflow/models/tree/master/official/transformer
The "official" version will be very similar (since we forked it with only minor modifications) and is better maintained, so it may give insight into these kinds of problems.
PTAL #119
Closing because GNMT is deprecated from the benchmark suite.