Comments (4)

svlandeg commented on June 16, 2024

Is it just a version incompatibility because we've pinned transformers to <4.37.0, or are you able to actually update your transformers install locally and does everything still work as expected?

Which version of spaCy are you on, if I may ask? Because from 3.7 onwards we've started switching towards https://github.com/explosion/spacy-curated-transformers instead - have you tried it?
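To answer the version questions above, it can help to print what is actually installed. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `version_tuple` are made up for illustration, and the `<4.37.0` bound is the pin mentioned in this comment.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '4.36.2' into (4, 36, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


for name in ("spacy", "spacy-transformers", "spacy-curated-transformers", "transformers"):
    print(name, installed_version(name))

# Check the installed transformers against the pin quoted in the thread (<4.37.0).
found = installed_version("transformers")
if found is not None:
    print("satisfies <4.37.0:", version_tuple(found) < (4, 37, 0))
```

This is a rough comparison that ignores pre-release suffixes; pip's own resolver is the authority on whether a pin is satisfied.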

from spacy.

GennVa commented on June 16, 2024

@svlandeg I'm using spacy==3.7.3
Can I uninstall spacy-transformers for spacy-curated-transformers?
Using spacy-transformers==1.3.4 everything seems to work, I just get the version ERROR.

Using spacy-curated-transformers, I get this error:

ValueError: [E002] Can't find factory for 'transformer' for language English (en). This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).

Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, doc_cleaner, parser, beam_parser, lemmatizer, trainable_lemmatizer, entity_linker, entity_ruler, tagger, morphologizer, ner, beam_ner, senter, sentencizer, spancat, spancat_singlelabel, span_finder, future_entity_ruler, span_ruler, textcat, textcat_multilabel, en.lemmatizer

Running:
spacy.load(path)

svlandeg commented on June 16, 2024

We had to yank 3.7.3 (for unrelated reasons - a bug in the multiprocessing code), so please update to 3.7.4 if you can.

Re "Can I uninstall spacy-transformers for spacy-curated-transformers?":

Yes, but you'll then need to use curated_transformer as the factory instead of just transformer. You can see an example config here:

[components.transformer]
factory = "curated_transformer"

Re "spacy.load(path)":

Which model are you loading? If this is a pretrained model that uses the transformer factory from the old spacy-transformers package, then you'll still need spacy-transformers. If it's a pretrained model from us, you can likely update though.
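One way to check which factory a saved pipeline expects is to read the config.cfg stored in the pipeline directory. The sketch below uses the standard-library `configparser` as a rough stand-in (spaCy itself parses configs differently), and the function name `component_factories` and the sample text are illustrative assumptions, not part of spaCy.

```python
import configparser


def component_factories(config_text):
    """Map each pipeline component name to the factory named in its config section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    factories = {}
    for section in parser.sections():
        parts = section.split(".")
        # Only top-level component sections such as [components.transformer].
        if len(parts) == 2 and parts[0] == "components" and parser.has_option(section, "factory"):
            factories[parts[1]] = parser.get(section, "factory").strip("\"'")
    return factories


# Shortened sample in the shape of the configs discussed in this thread.
sample = """
[nlp]
lang = "en"
pipeline = ["transformer","spancat"]

[components]

[components.spancat]
factory = "spancat"

[components.transformer]
factory = "curated_transformer"
"""

print(component_factories(sample))
# {'spancat': 'spancat', 'transformer': 'curated_transformer'}
```

If the result shows `transformer` as the factory, the pipeline was built with spacy-transformers and, per the comment above, still needs that package to load.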

GennVa commented on June 16, 2024

@svlandeg Thanks. I want to train a spancat (with transformers) pipeline. I installed spacy-curated-transformers and spacy==3.7.4.

I got this error:

catalogue.RegistryError: [E892] Unknown function registry: 'span_getters'.

Available names: architectures, augmenters, batchers, callbacks, cli, datasets, displacy_colors, factories, initializers, languages, layers, lemmatizers, loggers, lookups, losses, misc, model_loaders, models, ops, optimizers, readers, schedules, scorers, tokenizers, vectors

I started from the quickstart config on the spaCy website (the one that begins with "This is an auto-generated partial config."), but it's for spacy-transformers only.
I tried to adapt it to spacy-curated-transformers.
This is my current cfg file, used in !python -m spacy init labels mycfg.cfg ... :

[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = "pytorch"
seed = 0

[nlp]
lang = "en"
pipeline = ["transformer","spancat"]
batch_size = 512
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
vectors = {"@vectors":"spacy.Vectors.v1"}

[components]

[components.spancat]
factory = "spancat"
max_positive = null
scorer = {"@scorers":"spacy.spancat_scorer.v1"}
spans_key = "sc"
threshold = 0.5

[components.spancat.model]
@architectures = "spacy.SpanCategorizer.v1"

[components.spancat.model.reducer]
@layers = "spacy.mean_max_reducer.v1"
hidden_size = 128

[components.spancat.model.scorer]
@layers = "spacy.LinearLogistic.v1"
nO = null
nI = null

[components.spancat.model.tok2vec]
@architectures = "spacy-curated-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "*"

[components.spancat.suggester]
@misc = "spacy.ngram_suggester.v1"
sizes = [1,2,3]

[components.transformer]
factory = "curated_transformer"
max_batch_items = 4096
set_extra_annotations = {"@annotation_setters":"spacy-curated-transformers.null_annotation_setter.v1"}

[components.transformer.model]
@architectures = "spacy-curated-transformers.RobertaTransformer.v1"
name = "roberta-base"
mixed_precision = false

[components.transformer.model.get_spans]
@span_getters = "spacy-curated-transformers.strided_spans.v1"
window = 128
stride = 96

[components.transformer.model.grad_scaler_config]

[components.transformer.model.tokenizer_config]
use_fast = true

[components.transformer.model.transformer_config]

[corpora]
...other..
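The E892 error says that a registry named in the config does not exist in the current environment. As a rough diagnostic, the sketch below scans config text for `@registry` keys and compares them with the "Available names" list copied from the error message above. The function name `referenced_registries` and the regular expression are illustrative assumptions, and the excerpt is taken from the config in this comment.

```python
import re

# Registry names copied from the "Available names" list in the E892 message.
AVAILABLE = {
    "architectures", "augmenters", "batchers", "callbacks", "cli", "datasets",
    "displacy_colors", "factories", "initializers", "languages", "layers",
    "lemmatizers", "loggers", "lookups", "losses", "misc", "model_loaders",
    "models", "ops", "optimizers", "readers", "schedules", "scorers",
    "tokenizers", "vectors",
}


def referenced_registries(config_text):
    """Collect registry names used as '@name = ...' or inline '"@name": ...' keys."""
    return set(re.findall(r'@([a-z_]+)"?\s*[:=]', config_text))


excerpt = '''
[components.spancat.model.tok2vec]
@architectures = "spacy-curated-transformers.TransformerListener.v1"
pooling = {"@layers":"reduce_mean.v1"}

[components.transformer]
factory = "curated_transformer"
set_extra_annotations = {"@annotation_setters":"spacy-curated-transformers.null_annotation_setter.v1"}

[components.transformer.model.get_spans]
@span_getters = "spacy-curated-transformers.strided_spans.v1"
'''

unknown = sorted(referenced_registries(excerpt) - AVAILABLE)
print(unknown)
# ['annotation_setters', 'span_getters']
```

Both flagged blocks were carried over from the spacy-transformers template, so they are the places to compare against a config written for spacy-curated-transformers.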
