Comments (4)
Is it just a version incompatibility because we've pinned transformers to <4.37.0, or are you able to actually update your transformers install locally, and does everything still work as expected?
Which version of spaCy are you on, if I may ask? Because from 3.7 onwards we've started switching towards https://github.com/explosion/spacy-curated-transformers instead - have you tried it?
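To answer the version questions above, it can help to list what is actually installed. The following is a minimal sketch using only the standard library; the helper names (installed_versions, release_tuple, is_below) are hypothetical, and the comparison is a rough one that only looks at the leading numeric release parts, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def release_tuple(version):
    """Leading numeric release parts, padded to three (suffixes such as .dev0 are ignored)."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def is_below(version, bound):
    """True if `version` sorts strictly below `bound` on the numeric release parts."""
    return release_tuple(version) < release_tuple(bound)


if __name__ == "__main__":
    names = ["spacy", "transformers", "spacy-transformers", "spacy-curated-transformers"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
    found = installed_versions(["transformers"])["transformers"]
    if found is not None:
        # The <4.37.0 bound is the pin mentioned in the comment above.
        print("transformers satisfies <4.37.0:", is_below(found, "4.37.0"))
```

Pasting the printed lines into the issue makes it clear which of the two transformer packages is present alongside spaCy.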
@svlandeg I'm using spacy==3.7.3.
Can I uninstall spacy-transformers for spacy-curated-transformers?
Using spacy-transformers==1.3.4, everything seems to work; I just get the version error.
Using spacy-curated-transformers, I get this error:
ValueError: [E002] Can't find factory for 'transformer' for language English (en). This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).
Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, doc_cleaner, parser, beam_parser, lemmatizer, trainable_lemmatizer, entity_linker, entity_ruler, tagger, morphologizer, ner, beam_ner, senter, sentencizer, spancat, spancat_singlelabel, span_finder, future_entity_ruler, span_ruler, textcat, textcat_multilabel, en.lemmatizer
Running:
spacy.load(path)
We had to yank 3.7.3 (for unrelated reasons - a bug in the multiprocessing code), so please update to 3.7.4 if you can.
Can I uninstall spacy-transformers for spacy-curated-transformers?
Yes, but you'll then need to use curated_transformer as the factory instead of just transformer. You can see an example config here:
[components.transformer]
factory = "curated_transformer"
spacy.load(path)
Which model are you loading? If this is a pretrained model using the old spacy-transformers transformer factory, then you'll still need spacy-transformers. If it's a pretrained model from us, you can likely update though.
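One way to see which factory a pipeline expects is to read its config before loading it. The sketch below is stdlib-only and makes two assumptions: that the pipeline directory contains a config.cfg file, and that the file parses with configparser. The function name component_factories is hypothetical.

```python
import configparser


def component_factories(config_text):
    """Return {component name: factory name} for each [components.<name>] section."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(config_text)
    factories = {}
    for section in parser.sections():
        parts = section.split(".")
        # Only top-level component sections, not nested ones like components.x.model
        if len(parts) == 2 and parts[0] == "components":
            factory = parser[section].get("factory")
            # Components without a factory key (e.g. sourced ones) map to None
            factories[parts[1]] = factory.strip('"') if factory else None
    return factories


if __name__ == "__main__":
    example = '[components.transformer]\nfactory = "curated_transformer"\n'
    print(component_factories(example))
```

To use it on a saved pipeline, pass in the text of its config file, for example pathlib.Path(path, "config.cfg").read_text(). A component whose factory is transformer needs spacy-transformers installed, while curated_transformer needs spacy-curated-transformers.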
@svlandeg Thanks. I want to train a spancat (with transformers) pipeline. I installed spacy-curated-transformers and spacy==3.7.4, and got this error:
catalogue.RegistryError: [E892] Unknown function registry: 'span_getters'.
Available names: architectures, augmenters, batchers, callbacks, cli, datasets, displacy_colors, factories, initializers, languages, layers, lemmatizers, loggers, lookups, losses, misc, model_loaders, models, ops, optimizers, readers, schedules, scorers, tokenizers, vectors
I used the "This is an auto-generated partial config." template from the spaCy website, but it's for spacy-transformers only.
I tried to adapt it to spacy-curated-transformers.
This is my actual cfg file, used in !python -m spacy init labels mycfg.cfg ...:
[paths]
train = null
dev = null
vectors = null
init_tok2vec = null
[system]
gpu_allocator = "pytorch"
seed = 0
[nlp]
lang = "en"
pipeline = ["transformer","spancat"]
batch_size = 512
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
vectors = {"@vectors":"spacy.Vectors.v1"}
[components]
[components.spancat]
factory = "spancat"
max_positive = null
scorer = {"@scorers":"spacy.spancat_scorer.v1"}
spans_key = "sc"
threshold = 0.5
[components.spancat.model]
@architectures = "spacy.SpanCategorizer.v1"
[components.spancat.model.reducer]
@layers = "spacy.mean_max_reducer.v1"
hidden_size = 128
[components.spancat.model.scorer]
@layers = "spacy.LinearLogistic.v1"
nO = null
nI = null
[components.spancat.model.tok2vec]
@architectures = "spacy-curated-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "*"
[components.spancat.suggester]
@misc = "spacy.ngram_suggester.v1"
sizes = [1,2,3]
[components.transformer]
factory = "curated_transformer"
max_batch_items = 4096
set_extra_annotations = {"@annotation_setters":"spacy-curated-transformers.null_annotation_setter.v1"}
[components.transformer.model]
@architectures = "spacy-curated-transformers.RobertaTransformer.v1"
name = "roberta-base"
mixed_precision = false
[components.transformer.model.get_spans]
@span_getters = "spacy-curated-transformers.strided_spans.v1"
window = 128
stride = 96
[components.transformer.model.grad_scaler_config]
[components.transformer.model.tokenizer_config]
use_fast = true
[components.transformer.model.transformer_config]
[corpora]
...other..
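The E892 message lists the registries that exist in this environment, so the config can be checked against that list before running spacy init. This is a rough sketch, not a spaCy API: the function name unknown_registries is hypothetical, the AVAILABLE set is copied from the error message above (it will differ with other installed packages), and the regex simply looks for @name keys in both block and inline form.

```python
import re

# Registry names copied from the "Available names" list in the E892 message above.
AVAILABLE = {
    "architectures", "augmenters", "batchers", "callbacks", "cli", "datasets",
    "displacy_colors", "factories", "initializers", "languages", "layers",
    "lemmatizers", "loggers", "lookups", "losses", "misc", "model_loaders",
    "models", "ops", "optimizers", "readers", "schedules", "scorers",
    "tokenizers", "vectors",
}


def unknown_registries(config_text, available=AVAILABLE):
    """Return sorted registry names referenced as @name in the config but not available."""
    # Matches both `@layers = "..."` and inline `{"@layers":"..."}` references.
    referenced = set(re.findall(r'@(\w+)"?\s*[=:]', config_text))
    return sorted(referenced - set(available))


if __name__ == "__main__":
    snippet = '''
[components.transformer.model.get_spans]
@span_getters = "spacy-curated-transformers.strided_spans.v1"
'''
    print(unknown_registries(snippet))
```

Run over the sections shown above, this flags annotation_setters and span_getters, which are the two blocks carried over from the spacy-transformers template.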