Comments (17)
Thanks, but for my use case (serving a model as an API) a context manager doesn't fit, since I need to call predict after an external event (e.g. an HTTP request), so I'm just calling _cached_inference directly.
Anyhow, I think we can finally close this issue. Thanks a lot for your great work!
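For reference, a minimal sketch of that serving setup, assuming Flask and entering cached_predict() once at startup via contextlib.ExitStack (an alternative to calling _cached_inference directly; train_data and train_labels are placeholders for your own data):

from contextlib import ExitStack
from flask import Flask, jsonify, request
from finetune import Classifier

model = Classifier()
model.fit(train_data, train_labels)

# Enter cached_predict() once at startup rather than per request, so the
# prediction graph is built on the first call and reused afterwards.
stack = ExitStack()
stack.enter_context(model.cached_predict())

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    texts = request.get_json()["texts"]
    return jsonify(model.predict(texts))
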
This is imperfect, but here is some WIP code that might be a helpful starting point:
def _data_generator(self):
    # Keep yielding queued examples until the generator is explicitly closed.
    while not self._closed:
        yield self._data.pop(0)

def _inference(self, Xs, mode=None):
    self._data = list(Xs)  # copy, so pop(0) above doesn't mutate the caller's list
    n = len(Xs)
    if self.__class__ == SequenceLabeler:
        self._data = [[x] for x in self._data]
    if not getattr(self, 'estimator', None):
        # Build the estimator once and reuse it on subsequent calls.
        self.estimator = self.get_estimator()
    self._closed = False
    dataset = lambda: self.input_pipeline._dataset_without_targets(
        self._data_generator, train=None).batch(1)
    self.predictions = self.estimator.predict(
        input_fn=dataset, yield_single_examples=True)
    _predictions = []
    for _ in range(n):
        try:
            y = next(self.predictions)
        except StopIteration:
            # The input generator was exhausted early; re-raise.
            raise
        y = y[mode] if mode else y
        _predictions.append(y)
    return _predictions

I think we've found a way to have our cake and eat it too without complicating the user interface. Just padding out the final batches should allow us to get a batch speedup but not have to recompile the predict function. PR in progress at #193
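For anyone curious, the padding idea is roughly the following (a hedged sketch, not the actual PR; predict_batch and pad_example are stand-ins):

def padded_batches(examples, batch_size, pad_example):
    # Yield fixed-size batches, padding the last one so the compiled
    # predict function never sees a differently shaped final batch.
    for i in range(0, len(examples), batch_size):
        batch = examples[i:i + batch_size]
        n_pad = batch_size - len(batch)
        yield batch + [pad_example] * n_pad, n_pad

def predict_padded(predict_batch, examples, batch_size, pad_example):
    outputs = []
    for batch, n_pad in padded_batches(examples, batch_size, pad_example):
        preds = predict_batch(batch)
        outputs.extend(preds[:len(preds) - n_pad])  # drop padding predictions
    return outputs
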
Hi Guillermo,
I'm getting sub-second end-to-end times (as measured from the web interface) using Flask.
See #153 for details.
Hi @dimidd,
Thanks for checking back in! I was hoping to end up with a solution where we could have our metaphorical cake and eat it too, but we ran into some limitations with how TensorFlow cleans up memory, so we had to opt for a more explicit interface if you want to avoid rebuilding the graph at prediction time: https://finetune.indico.io/#prediction
model = Classifier()
model.fit(train_data, train_labels)

with model.cached_predict():
    model.predict(test_data)  # triggers prediction graph construction
    model.predict(test_data)  # graph is already cached, so subsequent calls are faster

Let me know if this solution works for you!
This could be a possible solution:
https://raw.githubusercontent.com/marcsto/rl/master/src/fast_predict2.py
Details:
tensorflow/tensorflow#4648
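The core trick in that file, paraphrased (a hedged sketch rather than the gist verbatim; input_fn_builder stands in for whatever wraps a generator into an estimator input_fn): create the estimator.predict generator once and keep feeding it, so the graph is only built on the first call.

class FastPredict:
    def __init__(self, estimator, input_fn_builder):
        self.estimator = estimator
        self.input_fn_builder = input_fn_builder
        self.batch = None
        self.predictions = None
        self.closed = True

    def _generator(self):
        # Keeps yielding the latest features until close() is called,
        # which keeps the single predict generator below alive.
        while not self.closed:
            yield self.batch

    def predict(self, features):
        self.batch = features
        if self.predictions is None:
            # First call only: the graph is built here and then reused.
            self.closed = False
            self.predictions = self.estimator.predict(
                input_fn=self.input_fn_builder(self._generator))
        return [next(self.predictions) for _ in range(len(features))]

    def close(self):
        # Ends _generator so the open predict generator can terminate.
        self.closed = True
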
Yes! It's the fact that the tf Estimator API rebuilds the graph on every call to predict that's the problem. There's some tricky logic around making sure you can still batch properly if you keep a generator open, but this is absolutely the right way to go.
Thanks! I've tried to import SequenceLabeler in base.py, but this caused a strange error:
~/anaconda3/envs/tensorflow_p36_new_ft/lib/python3.6/site-packages/finetune/base.py in <module>()
30 from finetune.download import download_data_if_required
31 from finetune.estimator_utils import PatchedParameterServerStrategy
---> 32 from finetune.sequence_labeling import SequenceLabeler
33
34 JL_BASE = os.path.join(os.path.dirname(__file__), "model", "Base_model.jl")
~/anaconda3/envs/tensorflow_p36_new_ft/lib/python3.6/site-packages/finetune/sequence_labeling.py in <module>()
7 import numpy as np
8
----> 9 from finetune.base import BaseModel, PredictMode
10 from finetune.target_encoders import SequenceLabelingEncoder, SequenceMultiLabelingEncoder
11 from finetune.network_modules import sequence_labeler
ImportError: cannot import name 'BaseModel'
@dimidd you have a circular import reference. For now you can probably just delete the SequenceLabeler-specific code and have this work for the other model types. Alternatively, you could move the check to be if self.__class__.__name__ == "SequenceLabeler" or similar.
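i.e. something like this in base.py, which compares by class name and so avoids importing SequenceLabeler at module load time (a sketch of the suggestion, not a committed fix):

# In base.py: string comparison sidesteps the
# base.py <-> sequence_labeling.py circular import.
if self.__class__.__name__ == "SequenceLabeler":
    self._data = [[x] for x in self._data]
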
I think as a long-term solution we need to refactor things to avoid having to override the _inference method in SequenceLabeler. But thanks for taking a look at this issue!
Great! Thank you! It's very fast now. I'll close the issue when this is merged.
@dimidd The tricky part is that the old behavior still makes sense in certain scenarios because it's batched. So if you need to predict over a large amount of data in a single call to "predict" that will still be faster, because the lazy generator solution uses a batch size of 1. I think there's a way to have our cake and eat it too (use a generator but have a dynamic batch size or yield batches instead of single examples) but not sure on the details. LMK if you find a good solution -- if we can get around that problem I fully support doing this by default.
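For concreteness, the dynamic-batch idea could look roughly like this (a hedged sketch building on the _data_generator above; max_batch_size is a stand-in, and a real implementation would block on a queue rather than spin when no data is pending):

def _batch_generator(self):
    # Variant of _data_generator: drain whatever is currently queued into
    # one dynamically sized batch, so a single large predict() call still
    # gets batched throughput while an online call with one example is
    # served immediately.
    while not self._closed:
        batch = self._data[:self.max_batch_size]
        del self._data[:self.max_batch_size]
        if batch:
            yield batch
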
Hi Madison,
IMHO we shouldn't try to square the circle, but rather let the user decide. The user usually knows in advance whether the batch version or the online one is needed. The downside is having two versions of each method, e.g. predict_batch and predict_online.
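Sketching the suggested split (hypothetical method and helper names; _batched_inference and _cached_inference stand in for the two existing code paths):

from finetune.base import BaseModel

class Classifier(BaseModel):
    def predict_batch(self, Xs):
        # Throughput-oriented: rebuild a batched input pipeline per call.
        return self._batched_inference(Xs)

    def predict_online(self, Xs):
        # Latency-oriented: reuse an open estimator predict generator.
        return self._cached_inference(Xs)
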
Using predict on 0.5.12, I get much better results than on 0.5.11 (around 2 seconds per prediction), but it's still not instantaneous like calling _inference2 from Madison's initial suggestion. I can provide more exact stats next week if you'd like to perfect it.
Ah, sorry, I should have guessed the docs would get updated. I'll try next week.
No worries, should have updated this thread!