Comments (13)
Hi Damiox,
Thanks for your question. In torch.neuron.trace we use torch.jit.trace to generate a graph of operators that gets compiled for Neuron hardware. The following modification to your code shows which PyTorch operators are not supported. Operators are prefixed by aten:: in the nodes of the generated graph.
from transformers.tokenization_gpt2 import GPT2Tokenizer
from transformers.modeling_gpt2 import GPT2LMHeadModel
import torch
import torch_neuron
# loading gpt2 model (the 'gpt2' checkpoint)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2', pad_token='<|endoftext|>')
model = GPT2LMHeadModel.from_pretrained('gpt2', torchscript=True)
model.eval()
# generating example input
tokens = [tokenizer.encode(t) for t in ['I like to drink coke']]
tensors = torch.LongTensor(tokens)
# tracing the model with torch.jit.trace to obtain the operator graph
model_jit = torch.jit.trace(model, example_inputs=[tensors])
#print( model_jit.graph )
## Get the operators in the model
operators_in_model = set()
for node in model_jit.graph.nodes():
    if node.kind().startswith("aten"):
        operators_in_model.add( node.kind() )
## Get the supported operations in the current version of torch-neuron
supported_operators = set(torch.neuron.get_supported_operations())
print("The following operations are currently supported in torch-neuron")
for op in supported_operators:
    print(op)
not_supported_operators = operators_in_model - supported_operators
print()
print("The following operations are currently not supported in torch-neuron for this model")
for op in not_supported_operators:
    print(op)
#model_neuron = torch.neuron.test_trace(model, example_inputs=[tensors])
If you are interested, there is more information in the PyTorch GitHub wiki: https://github.com/pytorch/pytorch/wiki/PyTorch-IR
I'll also find out more about what we have tested for GPT2 on tensorflow-neuron and get back to you here.
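The operator check above can also be written as a small helper that does not need Neuron hardware to try out. This is only a sketch: the function name unsupported_aten_ops and the sample operator lists are hypothetical and chosen for illustration, not taken from torch-neuron.

```python
def unsupported_aten_ops(node_kinds, supported_ops):
    """Return the sorted aten:: operator names that are not in supported_ops."""
    # Keep only aten:: operators; a set removes repeated node kinds.
    ops_in_model = {kind for kind in node_kinds if kind.startswith("aten::")}
    return sorted(ops_in_model - set(supported_ops))


# Hypothetical sample data, for illustration only.
sample_node_kinds = [
    "aten::arange",
    "aten::matmul",
    "prim::Constant",
    "aten::matmul",
    "aten::softmax",
]
sample_supported = ["aten::matmul", "aten::softmax", "aten::relu"]

missing = unsupported_aten_ops(sample_node_kinds, sample_supported)
print(missing)
```

With a traced model, the first argument could be (node.kind() for node in model_jit.graph.nodes()) and the second torch.neuron.get_supported_operations(), as in the script above.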
from aws-neuron-sdk.
Alternatively, I also tried re-saving the PyTorch model to run a clean test, so I loaded my model and re-saved it as follows:
from transformers.tokenization_gpt2 import GPT2Tokenizer
from transformers.modeling_gpt2 import GPT2LMHeadModel
import torch
# loading gpt2 medium model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2', pad_token='<|endoftext|>')
model = GPT2LMHeadModel.from_pretrained('data/GPT2-345M/', torchscript=True)
model.eval()
# generating input
tokens = [tokenizer.encode(t) for t in ['I like to drink coke']]
tensors = torch.LongTensor(tokens)
print(tensors)
# re-saving model
torch.save(model, 'gpt2-serial.pt2')
Then I tried to use that gpt2-serial.pt2 model with neuron-sdk, following the steps detailed in https://github.com/aws/aws-neuron-sdk/blob/master/docs/pytorch-neuron/tutorial-compile-infer.md :
import torch
import torch_neuron
# loading gpt2 medium model
model = torch.load('gpt2-serial.pt2')
model.eval()
# generating example input
tensors = torch.LongTensor([[40, 588, 284, 4144, 763, 365]])
print(tensors)
# using neuron sdk to compile the model
model_neuron = torch.neuron.trace(model, example_inputs=[tensors])
model_neuron.save('gpt2-serial-neuron.pt2')
And ran into the same issue:
Traceback (most recent call last):
  File "neuron-gpt2-next.py", line 17, in <module>
    model_neuron = torch.neuron.trace(model, example_inputs=[tensors])
  File "/home/prod/neuron/lib/python3.5/site-packages/torch_neuron/decorators.py", line 150, in trace
    transform_torch_graph_to_tensorflow( func, example_inputs, args, kwargs )
  File "/home/prod/neuron/lib/python3.5/site-packages/torch_neuron/decorators.py", line 294, in transform_torch_graph_to_tensorflow
    tensor_outputs = _resolve_func(node)(op, *tensor_inputs)
TypeError: arange() takes from 2 to 6 positional arguments but 8 were given
from aws-neuron-sdk.
Hi Damiox,
Thanks for reporting this issue. Right now, Neuron-torch does not support all of the required operators for GPT2. This error message should be improved and we have opened an internal issue to track it.
I did a quick test that bypasses that failure, and found other operators that also need to be added to the Neuron-torch supported set. Please watch for future announcements on operator support and for any release notes that mention GPT2.
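As a rough illustration of how a message like the TypeError above can arise: the traceback shows the handler being called as _resolve_func(node)(op, *tensor_inputs), so a handler that accepts fewer positional arguments than the graph node supplies will fail in this way. The sketch below assumes that cause, which is not confirmed in this thread. The handler signature and the input values are hypothetical and are not torch-neuron's actual code.

```python
def arange(op, end, start=None, step=None, dtype=None, device=None):
    """Hypothetical handler that accepts between 2 and 6 positional arguments."""
    return ("arange", end)


# Hypothetical node inputs: seven values, so the call below passes
# eight positional arguments in total (the op name plus seven inputs).
node_inputs = [0, 6, 1, None, None, None, None]

try:
    arange("aten::arange", *node_inputs)
    error_message = None
except TypeError as exc:
    # Python rejects the call before the handler body ever runs.
    error_message = str(exc)

print(error_message)
```

The failure happens at call time, which is why the message talks about argument counts rather than about an unsupported operator.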
from aws-neuron-sdk.
Would the GPT2 architecture work in Neuron with TensorFlow? Also, do you have any benchmarks showing the performance improvement from using inf1 machines?
from aws-neuron-sdk.
We are still working on TensorFlow 2.0 support, which is a requirement for GPT2. For a sample of Inferentia capabilities, please take a look at our ResNet50 example (https://github.com/aws/aws-neuron-sdk/blob/master/docs/technotes/performance-tuning.md) and BERT example (https://github.com/aws/aws-neuron-sdk/tree/master/src/examples/tensorflow/bert_demo).
from aws-neuron-sdk.
Hello Damian,
Is there anything else with which we can help?
-Taylor
from aws-neuron-sdk.
I'm waiting for the newly required operators to become available in neuron-sdk. Will this be announced? In the meantime, I'm not using this framework.
from aws-neuron-sdk.
Hello Damian,
I would encourage you to follow our PyTorch release notes for release announcements. You can also watch our roadmap.
-Taylor
from aws-neuron-sdk.
Hello Damian,
It appears your immediate questions have been addressed. Please feel free to re-open if you have any further questions.
Regards,
Taylor
from aws-neuron-sdk.
Hey @aws-taylor I see there has been a new release from aws-neuron-sdk ...
I was trying to check whether the unsupported pytorch operators were already supported by neuron-sdk. I can't find details in https://github.com/aws/aws-neuron-sdk/projects/2 to track the progress of this ticket.
from aws-neuron-sdk.
Resolved. We tested GPT-2, and the fix is coming in the next neuron-cc release. Please reopen if you run into any issues. Thanks for your patience.
from aws-neuron-sdk.