Comments (4)
Hi @DevinTDHa,
Thanks for your reply. I just tried to run the first task (GPT2 Pipeline) as shown here.
I updated both the PyPI spark-nlp package and the Maven library. This is my current environment:
- Azure Databricks cluster with 14.0 ML runtime
- Spark 3.5.0
- Spark Maven library com.johnsnowlabs.nlp:spark-nlp_2.12:5.2.3 following Install Spark NLP on Databricks
- pypi spark-nlp==5.2.3
However, I still get the same error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 23.0 failed 4 times, most recent failure: Lost task 2.3 in stage 23.0 (TID 173) (10.139.64.12 executor 1): java.lang.IllegalArgumentException: No Operation named [init_all_tables] in the Graph
I attach my notebook GPT2Transformer_ OpenAI Text-To-Text Transformer.zip to reproduce the error.
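For anyone comparing environments in this thread, here is a minimal sketch of a sanity check that the Maven coordinate and the PyPI package pin the same Spark NLP release. It uses only the Python standard library; the helper names `parse_maven_coordinate` and `versions_match` are hypothetical and not part of Spark NLP, and the check only compares version strings, so it says nothing about the `init_all_tables` error itself.

```python
from typing import Dict


def parse_maven_coordinate(coordinate: str) -> Dict[str, str]:
    """Split a 'group:artifact_scalaVersion:version' coordinate into its parts.

    Hypothetical helper for illustration; assumes the artifact name ends
    with '_<scala version>', as in 'spark-nlp_2.12'.
    """
    parts = coordinate.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected group:artifact:version, got {coordinate!r}")
    group, artifact, version = parts
    # rpartition splits on the LAST underscore, so 'spark-nlp_2.12'
    # becomes ('spark-nlp', '_', '2.12').
    name, _, scala = artifact.rpartition("_")
    return {"group": group, "artifact": name, "scala": scala, "version": version}


def versions_match(maven_coordinate: str, pypi_version: str) -> bool:
    """Return True when the Maven jar and the PyPI package share a version."""
    return parse_maven_coordinate(maven_coordinate)["version"] == pypi_version


if __name__ == "__main__":
    # Values taken from the environment listed above.
    coordinate = "com.johnsnowlabs.nlp:spark-nlp_2.12:5.2.3"
    print(parse_maven_coordinate(coordinate))
    print(versions_match(coordinate, "5.2.3"))
```

In the environment above both sides are 5.2.3, so a mismatch between the jar and the Python package does not appear to be the cause here.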
from spark-nlp.
Hi @mdrobena,
Thanks for the thorough description! I was able to reproduce it with your instructions. It is also important that it is run on a multi-node setup. I am looking into this problem.
Hi,
I have the exact same issue.
My environment:
Databricks runtime version 14.0 ML (includes Apache Spark 3.5.0, Scala 2.12)
Spark NLP: 5.2.2
I tried to reproduce it, but I don't see the issue on my side. Could you detail your steps to recreate it? Additionally, could you try updating Spark NLP to the latest version and see whether that makes a difference?
I tried the following combinations of settings:
- Run the notebook on Colab with your specified versions, applying the config
- Run the notebook locally with the specified versions, applying the config
- Run the notebook on Databricks with the same runtime and specified versions, following Install Spark NLP on Databricks
But in all of them, I get the results without any problems. Thanks for reporting!