Comments (10)
I can successfully run BERT in FP16, but I do not find the file "run_classifier_fastertf.py". I run FP16 with the following command:
python profile_transformer_inference.py --init_checkpoint=tmp.ckpt --tf_profile=false --output_dir=mrpc_output --profiling_output_file=time_elapsed --xla=false --floatx=float16
where tmp.ckpt is the converted FP16 checkpoint.
Can you provide more details about how to reproduce your problem?
from fastertransformer.
I also face this problem. Can anyone give me some suggestions?
Compiling command:
cmake -DSM=70 -DCMAKE_BUILD_TYPE=Release -DBUILD_TF=ON -DTF_PATH=/search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow ..
GPU: Titan V
CUDA: 10.1
NVIDIA driver: 418.40.04
Python: 2.7
TensorFlow: 1.14.0
CMake: 3.9.2
I can run ./bin/encoder_gemm 1 300 8 48 0, but at the step /search/anaconda2/envs/pytf14/bin/python encoder_sample.py --batch_size 1 --seq_len 300 --head_number 8 --size_per_head 48 --num_layer 8 --data_type fp32 --test_time 1, I got:
Traceback (most recent call last):
File "encoder_sample.py", line 84, in
attention_mask=attention_mask)
File "/search/DeepLearningExamples/FasterTransformer/v2/build/utils/encoder.py", line 303, in op_encoder
os.path.join('./lib/libtf_fastertransformer.so'))
File "/search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: ./lib/libtf_fastertransformer.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs
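For what it's worth, the mangled name in that error can be decoded locally to see which signature the op library expects. A diagnostic sketch, assuming GNU binutils (`c++filt`, `nm`) are installed; the library path is the one from the environment above:

```shell
# Decode the mangled symbol from the error message. The trailing "Ss" is
# the Itanium-ABI shorthand for the pre-C++11 std::string.
c++filt _ZN10tensorflow12OpDefBuilder4AttrESs
# -> tensorflow::OpDefBuilder::Attr(std::basic_string<...>)

# If the TensorFlow framework library is present, list which variants of
# OpDefBuilder::Attr it actually exports; a mismatch with the symbol above
# would explain the NotFoundError.
TF_LIB=/search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow/libtensorflow_framework.so.1
if [ -f "$TF_LIB" ]; then
  nm -D "$TF_LIB" | grep OpDefBuilder4Attr
fi
```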
Have you tried the image "nvcr.io/nvidia/tensorflow:19.07-py2"?
I don't use the image. Is it necessary? I think my environment should meet the requirements.
It is not necessary, but it is hard for us to make sure that FT runs in every environment. For example, installing TensorFlow by different methods may cause different problems.
Generally, users should first be able to run FasterTransformer (FT) in the image successfully and make sure they understand how to use it, and then test FT in their own environments. If FT does not run in their environment, it means they may need to install or modify something in that environment.
You can provide more information and details and we will try to help.
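In case it helps, a minimal sketch of trying the suggested image, assuming Docker with the NVIDIA container runtime is set up (the 19.07-py2 tag ships TensorFlow 1.14 and Python 2.7, matching the requirements above):

```shell
# Pull the NGC TensorFlow image mentioned above.
docker pull nvcr.io/nvidia/tensorflow:19.07-py2

# Start an interactive container with GPU access; --runtime=nvidia assumes
# the NVIDIA container runtime is configured (older setups use nvidia-docker).
docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tensorflow:19.07-py2 bash
```

This is an infrastructure fragment; it needs a GPU host with Docker and NGC access to actually run.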
@byshiue thanks for your help.
Firstly, the main reason I didn't try the image "nvcr.io/nvidia/tensorflow:19.07-py2" is that the environment I'm using is a Docker container with TF 1.12, so I think I can only install FT without the image.
Secondly, because the TF version (1.12) does not meet the requirements, I created a TF 1.14 environment through Anaconda (the path is /search/anaconda2/envs/pytf14/bin/python).
Then I made sure that my environment meets the following requirements:
CMake >= 3.8 (3.9.2)
CUDA 10.1
Python 2.7
TensorFlow 1.14 (conda env)
GPU: Titan V
Then I followed the instructions to install FT:
1. Clone the repository:
git clone https://github.com/NVIDIA/DeepLearningExamples
cd DeepLearningExamples/FasterTransformer/v2
git submodule init
git submodule update
2. Build the project:
ln -s /search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow/libtensorflow_framework.so.1 /search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow/libtensorflow_framework.so
(Is there a problem with this command?)
mkdir -p build
cd build
cmake -DSM=70 -DCMAKE_BUILD_TYPE=Release -DBUILD_TF=ON -DTF_PATH=/search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow ..
3. Generate the gemm_config.in file (no errors here):
./bin/encoder_gemm 1 300 8 48 0
4. Finally, run the encoder in TensorFlow:
/search/anaconda2/envs/pytf14/bin/python encoder_sample.py \
  --batch_size 1 \
  --seq_len 300 \
  --head_number 8 \
  --size_per_head 48 \
  --num_layer 8 \
  --data_type fp32 \
  --test_time 1
and I got:
Traceback (most recent call last):
File "encoder_sample.py", line 84, in
attention_mask=attention_mask)
File "/search/DeepLearningExamples/FasterTransformer/v2/build/utils/encoder.py", line 303, in op_encoder
os.path.join('./lib/libtf_fastertransformer.so'))
File "/search/anaconda2/envs/pytf14/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: ./lib/libtf_fastertransformer.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs
The above are the details of all my attempts to install FT, thanks for any help!
Sorry, I tried installing TensorFlow with Anaconda2, but I am still not able to reproduce your problem.
A possible solution is to build the project with gcc/g++ 4.8. In my experience, TensorFlow 1.14 has problems when I use other versions of gcc/g++.
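A likely reason the gcc/g++ version matters here is the libstdc++ dual ABI: objects compiled with different `_GLIBCXX_USE_CXX11_ABI` settings mangle `std::string` differently, which produces exactly this kind of undefined-symbol error. A small illustrative sketch (the `string_abi` helper is hypothetical, just a heuristic over the mangled name):

```shell
# Classify which std::string ABI a mangled C++ symbol was built against:
# "__cxx11" appears with -D_GLIBCXX_USE_CXX11_ABI=1, while the plain "Ss"
# abbreviation is the old pre-C++11 std::string.
string_abi() {
  case "$1" in
    *__cxx11*) echo "cxx11" ;;
    *Ss*)      echo "pre-cxx11" ;;
    *)         echo "no std::string in signature" ;;
  esac
}

# The symbol from the error above was built against the old ABI:
string_abi _ZN10tensorflow12OpDefBuilder4AttrESs   # -> pre-cxx11
```

If the op library and libtensorflow_framework.so disagree here, rebuilding with a matching `-D_GLIBCXX_USE_CXX11_ABI=...` (or the gcc/g++ 4.8 toolchain suggested above) is the usual remedy.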
@byshiue
The gcc version in my container is 4.8.5, so I think the problem might be in the Anaconda build of TensorFlow. Besides, I tried another method: I used a new container with TF 1.14. The above problem then no longer occurs, but there is another problem (please refer to this link for more information: https://github.com/NVIDIA/DeepLearningExamples/issues/436).
Thanks again for your help!
It seems that this bug is solved.
Please re-open it if you still have problems.
Hi, I met the same problem as you did. Could you share your solution with me? I created a virtual conda env with Python 2.7 and tensorflow-gpu 1.14, and my other settings, such as gcc, are the same as yours. Thanks.