hamelsmu / seq2seq_tutorial
Code For Medium Article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"
License: Apache License 2.0
Hi, when I run the Docker container it exits automatically, so I can't use it. What is the problem?
I have pulled the image and am running the container. I am accessing the Jupyter notebook locally but I get many import errors. Is it possible that the Docker image was not built with all the dependencies pre-installed?
I've imported the data and installed ktext without problems, but I get the error below when trying to execute from ktext.preprocess import processor
It seems that the spaCy 'en' model can't be loaded. I'm using Python 3.6 on Mac OS 10.12.6
By the way, I would like to share your blog post with the G+ Deep Learning community, but I have a habit of validating every tutorial myself first.
`---------------------------------------------------------------------------
OSError Traceback (most recent call last)
in ()
1 get_ipython().magic('reload_ext autoreload')
2 get_ipython().magic('autoreload 2')
----> 3 from ktext.preprocess import processor
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ktext/preprocess.py in ()
19 import timeit
20
---> 21 spacyen_default = spacy.load('en')
22
23 def get_time():
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/spacy/init.py in load(name, **overrides)
17 "to load. For example:\nnlp = spacy.load('{}')".format(depr_path),
18 'error')
---> 19 return util.load_model(name, **overrides)
20
21
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/spacy/util.py in load_model(name, **overrides)
117 elif hasattr(name, 'exists'): # Path or Path-like to model data
118 return load_model_from_path(name, **overrides)
--> 119 raise IOError("Can't find model '%s'" % name)
120
121
OSError: Can't find model 'en'
---------------------------------------------------------------------------`
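A note for anyone who lands on the same traceback: `Can't find model 'en'` generally means the spaCy package is installed but the English model data is not. On the spaCy versions that still accept the `'en'` shortcut, the model is normally fetched with `python -m spacy download en`; newer spaCy releases dropped the shortcut names and expect a full package name such as `en_core_web_sm`. The sketch below is only a diagnostic aid: `spacy_model_hint` is a hypothetical helper, not part of ktext or spaCy, and it assumes nothing beyond `spacy.load` raising `OSError` when a model is missing.

```python
import importlib.util


def spacy_model_hint(name="en"):
    """Return None if spaCy can load `name`, otherwise a short hint string.

    Hypothetical helper for illustration; not part of ktext or spaCy.
    """
    if importlib.util.find_spec("spacy") is None:
        return "spaCy is not installed; try: pip install spacy"
    import spacy
    try:
        spacy.load(name)
    except OSError:
        # spaCy raises OSError (IOError) when the model data cannot be found.
        return f"Model {name!r} not found; try: python -m spacy download {name}"
    return None


hint = spacy_model_hint("en")
print(hint or "spaCy model loaded fine")
```

Running this in the same environment as the notebook (same Python interpreter) matters, since the model has to be installed for the interpreter that imports ktext.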
I am following along with this tutorial but am unable to get past this step:
seq2seq_encoder_out = encoder_model(encoder_inputs)
This is located at In[20] in the notebook.
This throws the following error:
ValueError Traceback (most recent call last)
in ()
----> 1 seq2seq_encoder_out = encoder_model(encoder_inputs)
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\keras\engine\topology.py in call(self, inputs, **kwargs)
615
616 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 617 output = self.call(inputs, **kwargs)
618 output_mask = self.compute_mask(inputs, previous_mask)
619
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\keras\engine\topology.py in call(self, inputs, mask)
2076 return self._output_tensor_cache[cache_key]
2077 else:
-> 2078 output_tensors, _, _ = self.run_internal_graph(inputs, masks)
2079 return output_tensors
2080
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\keras\engine\topology.py in run_internal_graph(self, inputs, masks)
2227 if 'mask' not in kwargs:
2228 kwargs['mask'] = computed_mask
-> 2229 output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
2230 output_masks = _to_list(layer.compute_mask(computed_tensor,
2231 computed_mask))
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\keras\layers\normalization.py in call(self, inputs, training)
183 self.add_update([K.moving_average_update(self.moving_mean,
184 mean,
--> 185 self.momentum),
186 K.moving_average_update(self.moving_variance,
187 variance,
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\keras\backend\tensorflow_backend.py in moving_average_update(x, value, momentum)
999 """
1000 return moving_averages.assign_moving_average(
-> 1001 x, value, momentum, zero_debias=True)
1002
1003
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\training\moving_averages.py in assign_moving_average(variable, value, decay, zero_debias, name)
68 decay = math_ops.cast(decay, variable.dtype.base_dtype)
69 if zero_debias:
---> 70 update_delta = _zero_debias(variable, value, decay)
71 else:
72 update_delta = (variable - value) * decay
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\training\moving_averages.py in _zero_debias(unbiased_var, value, decay)
178 local_step_initializer = init_ops.zeros_initializer()
179 biased_var = variable_scope.get_variable(
--> 180 "biased", initializer=biased_initializer, trainable=False)
181 local_step = variable_scope.get_variable(
182 "local_step",
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\ops\variable_scope.py in get_variable(name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
1063 collections=collections, caching_device=caching_device,
1064 partitioner=partitioner, validate_shape=validate_shape,
-> 1065 use_resource=use_resource, custom_getter=custom_getter)
1066 get_variable_or_local_docstring = (
1067 """%s
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\ops\variable_scope.py in get_variable(self, var_store, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
960 collections=collections, caching_device=caching_device,
961 partitioner=partitioner, validate_shape=validate_shape,
--> 962 use_resource=use_resource, custom_getter=custom_getter)
963
964 def _get_partitioned_variable(self,
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\ops\variable_scope.py in get_variable(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
365 reuse=reuse, trainable=trainable, collections=collections,
366 caching_device=caching_device, partitioner=partitioner,
--> 367 validate_shape=validate_shape, use_resource=use_resource)
368
369 def _get_partitioned_variable(
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\ops\variable_scope.py in _true_getter(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource)
350 trainable=trainable, collections=collections,
351 caching_device=caching_device, validate_shape=validate_shape,
--> 352 use_resource=use_resource)
353
354 if custom_getter is not None:
~\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\ops\variable_scope.py in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource)
662 " Did you mean to set reuse=True in VarScope? "
663 "Originally defined at:\n\n%s" % (
--> 664 name, "".join(traceback.format_list(tb))))
665 found_var = self._vars[name]
666 if not shape.is_compatible_with(found_var.get_shape()):
ValueError: Variable Encoder-Batchnorm-1_1/moving_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
File "C:\Users\lrichards\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\framework\ops.py", line 1269, in init
self._traceback = _extract_stack()
File "C:\Users\lrichards\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\lrichards\AppData\Local\Continuum\anaconda3\envs\Seq2Seq\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
Using Anaconda3 on Windows 7 x64
Tensorflow v1.2.1 (installed via Anaconda)
Keras 2.1.3 (installed via Anaconda)
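Since this report pairs TensorFlow 1.2.1 with Keras 2.1.3, one thing worth ruling out is a version mismatch between the two libraries. This is an assumption on my part, not a confirmed diagnosis of the error above. The sketch below just prints what is actually installed in the active environment, which is also useful to paste into an issue. `installed_versions` is a hypothetical helper name, and `importlib.metadata` requires Python 3.8 or newer, so it will not run as-is on an older interpreter.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "keras")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[pkg] = None
    return versions


print(installed_versions())
```

Run it from inside the notebook kernel rather than a separate shell, because a conda environment and the kernel can point at different installations.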
Hello, I recently started following this tutorial, but I ran into a problem in the source code. When I executed
from ktext.preprocess import processor
body_pp = processor(keep_n=8000, padding_maxlen=70)
train_body_vecs = body_pp.fit_transform(train_body_raw)
I got the following error:
OSError Traceback (most recent call last)
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/ktext/preprocess.py in apply_parallel(func, data, cpu_cores)
72 chunk_size = ceil(len(data) / cpu_cores)
---> 73 pool = Pool(cpu_cores)
74 transformed_data = pool.map(func, chunked(data, chunk_size), chunksize=1)
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/multiprocess/pool.py in __init__(self, processes, initializer, initargs, maxtasksperchild, context)
173 self._pool = []
--> 174 self._repopulate_pool()
175
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/multiprocess/pool.py in _repopulate_pool(self)
238 w.daemon = True
--> 239 w.start()
240 util.debug('added worker')
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/multiprocess/process.py in start(self)
104 _cleanup()
--> 105 self._popen = self._Popen(self)
106 self._sentinel = self._popen.sentinel
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/multiprocess/context.py in _Popen(process_obj)
276 from .popen_fork import Popen
--> 277 return Popen(process_obj)
278
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/multiprocess/popen_fork.py in __init__(self, process_obj)
18 self.returncode = None
---> 19 self._launch(process_obj)
20
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/multiprocess/popen_fork.py in _launch(self, process_obj)
65 parent_r, child_w = os.pipe()
---> 66 self.pid = os.fork()
67 if self.pid == 0:
OSError: [Errno 12] Cannot allocate memory
During handling of the above exception, another exception occurred:
UnboundLocalError Traceback (most recent call last)
<timed exec> in <module>()
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/ktext/preprocess.py in fit_transform(self, data)
336
337 """
--> 338 tokenized_data = self.fit(data, return_tokenized_data=True)
339
340 logging.warning(f'...fit is finished, beginning transform')
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/ktext/preprocess.py in fit(self, data, return_tokenized_data)
278 now = get_time()
279 logging.warning(f'....tokenizing data')
--> 280 tokenized_data = self.parallel_process_text(data)
281
282 if not self.padding_maxlen:
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/ktext/preprocess.py in parallel_process_text(self, data)
233 end_tok=self.end_tok)
234 n_cores = self.num_cores
--> 235 return flattenlist(apply_parallel(process_text, data, n_cores))
236
237 def generate_doc_length_stats(self):
~/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/ktext/preprocess.py in apply_parallel(func, data, cpu_cores)
74 transformed_data = pool.map(func, chunked(data, chunk_size), chunksize=1)
75 finally:
---> 76 pool.close()
77 pool.join()
78 return transformed_data
UnboundLocalError: local variable 'pool' referenced before assignment
This error seems to be caused by a bug in your source code. Could you take a look at what happened?
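Reading the two tracebacks together: `Pool(cpu_cores)` failed with `OSError: [Errno 12] Cannot allocate memory` before `pool` was ever assigned, and the `finally` block then called `pool.close()` on a name that did not exist, which is what raised the `UnboundLocalError`. The underlying problem is the memory error; the second exception only hides it. A minimal sketch of the usual pattern follows, where the pool is created before the `try` so a failed construction surfaces as the original error. This is not ktext's actual code: the `pool_factory` parameter, `double_all`, and `failing_pool` are hypothetical names added so the example can be run, and `ThreadPool` stands in for the process pool.

```python
from math import ceil
from multiprocessing.pool import ThreadPool


def chunked(data, size):
    """Split `data` into consecutive lists of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def apply_parallel(func, data, cpu_cores, pool_factory=ThreadPool):
    """Apply `func` to chunks of `data` using a pool of `cpu_cores` workers."""
    chunk_size = max(1, ceil(len(data) / cpu_cores))
    # Create the pool *before* the try block: if construction fails, the
    # original exception propagates and `finally` never touches `pool`.
    pool = pool_factory(cpu_cores)
    try:
        return pool.map(func, chunked(data, chunk_size), chunksize=1)
    finally:
        pool.close()
        pool.join()


def double_all(chunk):
    """Toy worker: double every item in one chunk."""
    return [x * 2 for x in chunk]


def failing_pool(_cores):
    """Simulate the out-of-memory failure from the traceback."""
    raise OSError(12, "Cannot allocate memory")


result = apply_parallel(double_all, list(range(5)), cpu_cores=2)
print(result)

try:
    apply_parallel(double_all, [1, 2, 3], cpu_cores=2, pool_factory=failing_pool)
except OSError as err:
    print("original error surfaced:", err)
```

Fixing the masking does not fix the memory shortage itself. Freeing RAM, or processing less data per call, is probably still needed on the machine that produced this traceback.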
Happy to make a PR for this if required, but I found that many of the modules in requirements.txt
are not needed for this tutorial.
If you are using the AWS Deep Learning Ubuntu AMI, you can simply clone the repo and then run:
source activate tensorflow_p36
pip install ktext annoy nltk pydot
Thanks for this great tutorial, it's one of the few deep learning tutorials that just works out of the box.
Hi,
Has anyone else run into an issue like the one below when running this notebook?
ValueError: Variable Encoder-Batchnorm-1_1/moving_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
File "/Users/xxxxx/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1005, in moving_average_update
x, value, momentum, zero_debias=True)
File "/Users/xxxxxxx/anaconda3/lib/python3.6/site-packages/keras/layers/normalization.py", line 193, in call
self.momentum),
File "/Users/xxxxxx/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py", line 619, in call
output = self.call(inputs, **kwargs)
Thanks a lot
After training a model, how can I generate a title for text that I enter myself? In your code, the data passed in at prediction time is a DataFrame containing both the body and the title. In practice I only want to generate a title for a given text, so I have no title at prediction time. What should I do?
Thank You!