netease-fuxi / eet

Easy and Efficient Transformer : Scalable Inference Solution For Large NLP model

License: Apache License 2.0

Languages: Python 63.13%, C++ 19.50%, Cuda 17.13%, Dockerfile 0.12%, CMake 0.09%, C 0.04%
Topics: gpt2, bert-inference-performance, gpt2-inference-performance, eet, bert

eet's Introduction

Easy and Efficient Transformer


EET (Easy and Efficient Transformer) is a friendly PyTorch inference plugin focused on Transformer-based models, built to make mega-size models affordable.

Features

  • New🔥: Support for Baichuan, LLaMA, and other LLMs.
  • New🔥: Support for int8 quantization.
  • Runs mega-size models on a single GPU.
  • Expertise in inference for multi-modal and NLP tasks (CLIP, GPT-3, Bert, Seq2seq, etc.).
  • High performance: CUDA kernel optimization plus quantization/sparsity algorithms make Transformer-based models significantly faster.
  • Out of the box for Transformers and Fairseq: saves you the pain of trivial configuration and gets your model working within a few lines.

Model Matrix

model type     Transformers  Fairseq  Quantization  SpeedUp  Since version
GPT-3          ✓             ✓        ✓             2~8x     0.0.1 beta
Bert           ✓             X        ✓             1~5x     0.0.1 beta
ALBert         ✓             X        ✓             1~5x     0.0.1 beta
Roberta        ✓             X        X             1~5x     0.0.1 beta
T5             ✓             X        X             4~8x     1.0
ViT            ✓             X        X             1~5x     1.0
CLIP(GPT+ViT)  ✓             X        X             2~4x     1.0
Distillbert    ✓             X        X             1~2x     1.0
Baichuan       ✓             X        ✓             1~2x     2.0
LLaMA          ✓             X        ✓             1~2x     2.0

(X = not supported)

Quick Start

Environment

  • cuda:>=11.4
  • python:>=3.7
  • gcc:>= 7.4.0
  • torch:>=1.12.0
  • numpy:>=1.19.1
  • fairseq:==0.10.0
  • transformers:>=4.31.0

The above is the minimum configuration; newer versions are recommended.
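
As a quick sanity check (a small script of ours, not part of EET), you can print the installed versions and compare them against these minimums:

import numpy
import torch
import transformers

# Compare installed versions against the minimums listed above.
print("torch        :", torch.__version__)          # >= 1.12.0
print("CUDA (torch) :", torch.version.cuda)         # >= 11.4
print("numpy        :", numpy.__version__)          # >= 1.19.1
print("transformers :", transformers.__version__)   # >= 4.31.0
print("GPU available:", torch.cuda.is_available())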

Installation

We recommend using the Docker images.

From Source

If you are installing from source, you will need to set up the necessary environment first. Then proceed as follows:

$ git clone https://github.com/NetEase-FuXi/EET.git
$ pip install .

We recommend nvcr.io/nvidia/pytorch:23.04-py3 or other images in that series; you can also use the provided Dockerfile.

From Docker

$ git clone https://github.com/NetEase-FuXi/EET.git
$ docker build -t eet_docker:0.1 .
$ nvidia-docker run -it --net=host -v /your/project/directory/:/root/workspace  eet_docker:0.1 bash

EET and its required environment are already installed inside the Docker image.

Run

We provide three types of APIs:

  • Operators APIs, such as embedding, masked-multi-head-attention, ffn, etc., which enable you to define your own custom models.
  • Model APIs, such as TransformerDecoder, BertEncoder, etc., which enable you to integrate EET into your PyTorch project.
  • Application APIs, such as the Transformers pipeline, which enable you to run your model in a few lines.

Operators APIs

Operators APIs are the intermediate representation between C++/CUDA and Python. We provide almost all of the operators required for Transformer models, and you can combine different OPs to build other model structures.

  • Operators API table

    operators python API Remarks
    multi_head_attention EETSelfAttention self attention
    masked_multi_head_attention EETSelfMaskedAttention causal attention
    cross_multi_head_attention EETCrossAttention cross attention
    ffn EETFeedforward feed forward network
    embedding EETBertEmbedding corresponds to the Fairseq and Transformers embeddings
    layernorm EETLayerNorm same as nn.LayerNorm
  • How to use

    These OPs are defined in the file EET/csrc/py11/eet2py.cpp, and the files under python/eet contain usage examples showing how to combine them into classic models; a quick way to inspect the bindings is shown below.
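
If EET is installed, you can list the bound operator classes by importing the compiled extension directly (a sketch; the module name EET comes from the pybind11 build, per TORCH_EXTENSION_NAME=EET in the build logs below):

import EET  # the compiled C++/CUDA extension built from csrc/

# List the operator classes bound in EET/csrc/py11/eet2py.cpp;
# the Python wrappers under python/eet build on these.
print([name for name in dir(EET) if not name.startswith('_')])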

Model APIs

As a plugin, EET provides friendly model APIs (python/eet) that integrate into Fairseq and Transformers.

All you need to do is find the corresponding class in the tables below (usually prefixed with 'EET') and initialize an object with the from_torch or from_pretrained function.

Note: We currently only support pre-padding for GPT-3.

EET and fairseq class comparison table :

EET fairseq Remarks
EETTransformerDecoder TransformerDecoder
EETTransformerDecoderLayer TransformerDecoderLayer
EETTransformerAttention MultiheadAttention
EETTransformerFeedforward TransformerDecoderLayer fusion of multiple small operators
EETTransformerEmbedding Embedding + PositionalEmbedding
EETTransformerLayerNorm nn.LayerNorm

EET and Transformers class comparison table :

EET transformers Remarks
EETBertModel BertModel
EETBertEmbedding BertEmbeddings
EETGPT2Model GPT2Model
EETGPT2Decoder GPT2Model Transformers has no GPT2Decoder
EETGPT2DecoderLayer Block
EETGPT2Attention Attention
EETGPT2Feedforward MLP
EETGPT2Embedding nn.Embedding
EETLayerNorm nn.LayerNorm

In addition to the basic model types above, we provide extended task-specific APIs to support different tasks. The table below lists part of our task-specific model APIs:

EET transformers Remarks
EETBertForPreTraining BertForPreTraining
EETBertLMHeadModel BertLMHeadModel
EETBertForMaskedLM BertForMaskedLM
EETBertForNextSentencePrediction BertForNextSentencePrediction
EETBertForSequenceClassification BertForSequenceClassification
EETBertForMultipleChoice BertForMultipleChoice
EETBertForTokenClassification BertForTokenClassification
EETBertForQuestionAnswering BertForQuestionAnswering
  • How to use

This is a code snippet showing how to use the model APIs:

(screenshot: useofbert)
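
A minimal sketch of the same pattern, using EETBertModel; it assumes from_pretrained accepts the same max_batch / data_type keyword arguments as the fill-mask example below:

import torch
from eet import EETBertModel
from transformers import BertTokenizer

# Sketch: run a Transformers BERT checkpoint through EET.
# Assumption: EETBertModel.from_pretrained takes the same max_batch /
# data_type keywords as EETRobertaForMaskedLM in the example below.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
eet_model = EETBertModel.from_pretrained('bert-base-uncased',
                                         max_batch=1,
                                         data_type=torch.float16)
model_inputs = tokenizer("Hello from EET!", return_tensors='pt')
out = eet_model(model_inputs['input_ids'].cuda(),
                attention_mask=model_inputs['attention_mask'])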

You can also build your application directly with the task-specific model APIs. Here is a fill-mask example:

import torch
from eet import EETRobertaForMaskedLM
from transformers import RobertaTokenizer

max_batch_size = 1
data_type = torch.float16
input = ["My <mask> is Sarah and I live in London"]
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
eet_roberta_model = EETRobertaForMaskedLM.from_pretrained('roberta-base', max_batch=max_batch_size, data_type=data_type)
# first step: tokenize
model_inputs = tokenizer(input, return_tensors='pt')
masked_index = torch.nonzero(model_inputs['input_ids'][0] == tokenizer.mask_token_id, as_tuple=False).squeeze(-1)
# second step: predict
prediction_scores = eet_roberta_model(model_inputs['input_ids'].cuda(), attention_mask=model_inputs['attention_mask'])
# third step: argmax over the masked position
predicted_index = torch.argmax(prediction_scores.logits[0, masked_index]).item()
predicted_token = tokenizer.convert_ids_to_tokens(predicted_index)

For more examples, please refer to example/python/models.

Application APIs

EET provides ready-made pipelines to simplify building applications for different tasks, without using the model APIs above.

Here is an example :

import torch
from eet import pipeline
max_batch_size = 1
model_path = 'roberta-base'
data_type = torch.float16
input = ["My <mask> is Sarah and I live in London"]
nlp = pipeline("fill-mask",model = model_path,data_type = data_type,max_batch_size = max_batch_size)
out = nlp(input)

Now we support these tasks:

Task Since version
text-classification 1.0
token-classification 1.0
question-answering 1.0
fill-mask 1.0
text-generation 1.0
image-classification 1.0
zero_shot_image_classification 1.0
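
These tasks all share the calling pattern shown above. For example, text generation (a sketch assuming a gpt2 checkpoint; the same pipeline signature appears in the issue reports below):

import torch
from eet import pipeline

# Text generation with the same pipeline API as the fill-mask example.
generator = pipeline("text-generation",
                     model='gpt2',
                     data_type=torch.float16,
                     max_batch_size=1)
out = generator("My name is Sarah and I live in")
print(out)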

For more examples, please refer to example/python/pipelines.

Performance

Detailed performance data of GPT-3 and Bert model inference can be viewed at link.

  • GPT-3 on A100 (figure: a100_prompt)
  • Bert on 2080ti (figure: bert_ft)
  • Llama13B on 3090 (figure)

Cite Us

If you use EET in your research, please cite the following paper.

@misc{https://doi.org/10.48550/arxiv.2104.12470,
  doi = {10.48550/ARXIV.2104.12470},
  url = {https://arxiv.org/abs/2104.12470},
  author = {Li, Gongzheng and Xi, Yadong and Ding, Jingzhen and Wang, Duan and Liu, Bai and Fan, Changjie and Mao, Xiaoxi and Zhao, Zeng},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Easy and Efficient Transformer: Scalable Inference Solution For Large NLP Model},
  publisher = {arXiv},
  year = {2021}
}

Video

We gave a talk on ZhiYuan LIVE: https://event.baai.ac.cn/activities/325.

Contact us

You can report problems via GitHub issues.

You can also contact us by email :

[email protected], [email protected], [email protected]

eet's People

Contributors

dingjingzhen, gongzhengli, netease-fuxi, sidazh, zhisunyy


eet's Issues

gpt2 generation example bug

Hello,

when running

python example/python/models/gpt2_transformers_example.py

I get the following error:

  File "/mnt/2287294e-32c7-437b-84bd-452a29548b1a/conda_env/EET/lib/python3.8/site-packages/eet/pipelines/generation.py", line 12, in <module>
    from transformers.generation_beam_constraints import Constraint, DisjunctiveConstraint, PhrasalConstraint
ModuleNotFoundError: No module named 'transformers.generation_beam_constraints'

There is probably a wrong import statement; I will try to look into it.
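
In newer transformers releases these classes moved under transformers.generation, so one possible workaround is a version-tolerant import (a sketch, not the project's actual patch):

try:
    # older transformers
    from transformers.generation_beam_constraints import (
        Constraint, DisjunctiveConstraint, PhrasalConstraint)
except ModuleNotFoundError:
    # newer transformers moved the module under transformers.generation
    from transformers.generation.beam_constraints import (
        Constraint, DisjunctiveConstraint, PhrasalConstraint)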

Inference with Fairseq translation models

Hi, the documentation explains how to use GPT-2 under the Fairseq framework, but I could not find how to run inference for a Transformer-architecture translation model. I followed the instructions:

Replace the original transformer.py in Fairseq with our transformer.py and reinstall the Fairseq, that is all !

When running fairseq-generate I got an error saying the transformer arch could not be found. The transformer.py provided by this project has no place where the model architecture is registered; it only contains the decoder, embedding, and related parts, with nothing for the encoder. So I kept the original transformer file, added the project's transformer.py as a new file, and overwrote the original fairseq sequence_generator.py with the project's extra/fairseq/sequence_generator.py (if I understand correctly, this sequence_generator.py is a rewritten generator). Running inference then fails with:
File "/examples/NMT/fairseq/fairseq/sequence_generator.py", line 811, in forward_decoder
decoder_out = model.decoder.forward(tokens, encoder_out=encoder_out, first_pass=first_pass)
TypeError: forward() got an unexpected keyword argument 'first_pass'

It looks like this is because the project's rewritten EETTransformerDecoder takes different arguments from the original TransformerDecoder.
So I would like to ask: is there a good way to accelerate inference for a translation model trained with Fairseq?

gpt2 text generation pipeline batch generation output

Hello,
I am trying to use the text-generation pipeline in the Docker image with these parameters:

import torch
from eet import pipeline
max_batch_size = 16
data_type = torch.float16
input = "My name is Sarah and I live in London"
nlp = pipeline("text-generation", model = 'gpt2-medium', data_type = data_type, max_batch_size = max_batch_size, model_kwargs = {'nsamples':'1024', 'top_k':'40', 'temperature':'0.5', 'length':'30'})
out = nlp(input)
print(len(out))
print(out)

after the execution I get the following output:

There are 0 buffer in cache vector
Request a cache of size : 8388608
There are 1 buffer in cache vector
Request a cache of size : 8388608
There are 2 buffer in cache vector
Request a cache of size : 8388608
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
There are 0 buffer in vector
Request a buffer of size : 25165824
There are 1 buffer in vector
Request a buffer of size : 8388608
There are 2 buffer in vector
Request a buffer of size : 8388608
There are 3 buffer in vector
Request a buffer of size : 8388608
There are 4 buffer in vector
Request a buffer of size : 67108864
1
[{'generated_text': "My name is Sarah and I live in London. I'm 29 and I would like to join the Army. I'd like it to be to my benefit. What was your experience of joining the Army?\n\nSarah Waugh: My first impressions"}]

That means that instead of 1024 samples (from the 'nsamples': '1024' parameter) I am still getting just one output.
Is there something I am missing here?

pip install build error

env:
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
python 3.8.10
torch 2.0.1+cu117

pip install .

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple, https://pypi.ngc.nvidia.com
Processing /home/classifier/data_test/EET
Preparing metadata (setup.py) ... done
Building wheels for collected packages: EET
Building wheel for EET (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [900 lines of output]
/home/py38_textcnn/lib/python3.8/site-packages/setuptools/dist.py:509: InformationOnly: Normalizing 'v1.0' to '1.0'
self.metadata.version = self.normalize_version(
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-38
creating build/lib.linux-x86_64-cpython-38/eet
copying python/eet/init.py -> build/lib.linux-x86_64-cpython-38/eet
creating build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/generation.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/zero_shot_image_classification.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/text_generation.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/image_classification.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/init.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/text_classification.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/fill_mask.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/question_answering.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/base.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/model_auto.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
copying python/eet/pipelines/token_classification.py -> build/lib.linux-x86_64-cpython-38/eet/pipelines
creating build/lib.linux-x86_64-cpython-38/eet/fairseq
copying python/eet/fairseq/init.py -> build/lib.linux-x86_64-cpython-38/eet/fairseq
copying python/eet/fairseq/transformer.py -> build/lib.linux-x86_64-cpython-38/eet/fairseq
copying python/eet/fairseq/config.py -> build/lib.linux-x86_64-cpython-38/eet/fairseq
creating build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_vit.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_ernie.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_clip.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/init.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_gpt2.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/encoder_decoder.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_bart.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_albert.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_roberta.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_bert.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_t5.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
copying python/eet/transformers/modeling_distilbert.py -> build/lib.linux-x86_64-cpython-38/eet/transformers
creating build/lib.linux-x86_64-cpython-38/eet/utils
copying python/eet/utils/mapping.py -> build/lib.linux-x86_64-cpython-38/eet/utils
copying python/eet/utils/init.py -> build/lib.linux-x86_64-cpython-38/eet/utils
running build_ext
/home/py38_textcnn/lib/python3.8/site-packages/torch/utils/cpp_extension.py:398: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 12.0
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'EET' extension
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/core
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/op
creating /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/py11
Emitting ninja build file /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/22] /usr/local/cuda-12/bin/nvcc -DVERSION_INFO=v1.0 -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/classifier/data_test/EET/csrc -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/py38_textcnn/include -I/usr/include/python3.8 -c -c /home/classifier/data_test/EET/csrc/core/pre_process.cu -o /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/core/pre_process.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EET -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_62,code=sm_62 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -std=c++17
FAILED: /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/core/pre_process.o
/usr/local/cuda-12/bin/nvcc -DVERSION_INFO=v1.0 -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/classifier/data_test/EET/csrc -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/py38_textcnn/include -I/usr/include/python3.8 -c -c /home/classifier/data_test/EET/csrc/core/pre_process.cu -o /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/core/pre_process.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EET -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_62,code=sm_62 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -std=c++17
/home/classifier/data_test/EET/csrc/core/pre_process.cu(28): error: namespace "thrust" has no member "device"

  /home/classifier/data_test/EET/csrc/core/pre_process.cu(28): error: no instance of overloaded function "thrust::fill" matches the argument list
              argument types are: (<error-type>, thrust::device_ptr<int64_t>, thrust::device_ptr<int64_t>, int64_t)

  /home/classifier/data_test/EET/csrc/core/pre_process.cu(34): error: namespace "thrust" has no member "device"

  /home/classifier/data_test/EET/csrc/core/pre_process.cu(34): error: no instance of overloaded function "thrust::reduce" matches the argument list
              argument types are: (<error-type>, thrust::device_ptr<int64_t>, thrust::device_ptr<int64_t>, int)

  /home/classifier/data_test/EET/csrc/core/pre_process.cu(50): error: namespace "thrust" has no member "device"

  /home/classifier/data_test/EET/csrc/core/pre_process.cu(50): error: no instance of overloaded function "thrust::reduce" matches the argument list
              argument types are: (<error-type>, int64_t *, int64_t *, int)

  6 errors detected in the compilation of "/home/classifier/data_test/EET/csrc/core/pre_process.cu".
  [2/22] /usr/local/cuda-12/bin/nvcc  -DVERSION_INFO=v1.0 -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/classifier/data_test/EET/csrc -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/py38_textcnn/include -I/usr/include/python3.8 -c -c /home/classifier/data_test/EET/csrc/core/add_bias.cu -o /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/core/add_bias.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EET -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_62,code=sm_62 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -std=c++17
  [3/22] /usr/local/cuda-12/bin/nvcc  -DVERSION_INFO=v1.0 -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/classifier/data_test/EET/csrc -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/TH -I/home/py38_textcnn/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-12/include -I/home/py38_textcnn/include -I/usr/include/python3.8 -c -c /home/classifier/data_test/EET/csrc/core/transpose.cu -o /home/classifier/data_test/EET/build/temp.linux-x86_64-cpython-38/home/classifier/data_test/EET/csrc/core/transpose.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=EET -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_62,code=sm_62 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -std=c++17
  /home/classifier/data_test/EET/csrc/core/transpose.cu(61): warning #177-D: variable "bid2" was declared but never referenced
            detected during instantiation of "void copyKV_transpose_cross_kernel<T>(void *, void *, void *, void *, const int &, const int &, const int &, const int &) [with T=float]"
  (147): here

  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
            detected during instantiation of "void copyKV_transpose_cross_kernel<T>(void *, void *, void *, void *, const int &, const int &, const int &, const int &) [with T=float]"
  /home/classifier/data_test/EET/csrc/core/transpose.cu(147): here

  /home/classifier/data_test/EET/csrc/core/transpose.cu(63): warning #177-D: variable "seq_id2" was declared but never referenced
            detected during instantiation of "void copyKV_transpose_cross_kernel<T>(void *, void *, void *, void *, const int &, const int &, const int &, const int &) [with T=float]"
  (147): here

....
....

[-Wattributes]
/home/py38_textcnn/lib/python3.8/site-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<eet::op::LayerNorm>’:
/home/classifier/data_test/EET/csrc/py11/eet2py.cpp:98:50: required from here
/home/py38_textcnn/lib/python3.8/site-packages/torch/include/pybind11/pybind11.h:1479:7: warning: ‘pybind11::class_<eet::op::LayerNorm>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/py38_textcnn/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/home/classifier/data_test/EET/setup.py", line 36, in <module>
      setup(
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
      return distutils.core.setup(**attrs)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
      super().run_command(command)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/home/py38_textcnn/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 343, in run
      self.run_command("build")
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
      super().run_command(command)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
      self.run_command(cmd_name)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
      super().run_command(command)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 84, in run
      _build_ext.run(self)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
      self.build_extensions()
    File "/home/py38_textcnn/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
      build_ext.build_extensions(self)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
      self._build_extensions_serial()
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
      self.build_extension(ext)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
      _build_ext.build_extension(self, ext)
    File "/home/py38_textcnn/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
    File "/home/py38_textcnn/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "/home/py38_textcnn/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1574, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "/home/py38_textcnn/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for EET
Running setup.py clean for EET
Failed to build EET
ERROR: Could not build wheels for EET, which is required to install pyproject.toml-based projects

docker build issue

The base image FROM nvcr.io/nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04 can no longer be downloaded.
Suggest changing it to: FROM nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04
