jina-ai / jina
☁️ Build multimodal AI applications with cloud-native stack
Home Page: https://docs.jina.ai
License: Apache License 2.0
PEP-8 suggests using self instead of cls at jina/executors/__init__.py#L25.
Always use self for the first argument to instance methods.
Always use cls for the first argument to class methods.
Reference: PEP-8, Function and Method Arguments
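The naming rule can be illustrated with a minimal sketch. The Executor class and its from_config() helper below are hypothetical names chosen for illustration and are not taken from the Jina code base.

```python
class Executor:
    """Hypothetical class used only to illustrate the PEP-8 naming rule."""

    def __init__(self, name):
        # Instance method: the first argument is the instance, named self.
        self.name = name

    def describe(self):
        # Instance method: again self.
        return f'executor {self.name}'

    @classmethod
    def from_config(cls, config):
        # Class method: the first argument is the class itself, named cls.
        return cls(config['name'])


encoder = Executor.from_config({'name': 'encoder'})
description = encoder.describe()
```

Using cls in from_config() also means a subclass calling the same method gets an instance of the subclass, which is the usual reason to prefer a class method for alternative constructors.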
transformer tutorials: https://github.com/huggingface/transformers#quick-tour
docker buildx build --platform linux/arm64 -t jinaai/jina:master-multiarch -o type=registry --file ./Dockerfiles/debianx.Dockerfile --progress plain .
AttributeError: 'NoneType' object has no attribute 'anchor'
#12 7.392 node = self.compose_node(None, None)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 120, in compose_node
#12 7.392 anchor = event.anchor
#12 7.392 AttributeError: 'NoneType' object has no attribute 'anchor'
Complete trace:
#12 7.392 {'is_trained': False, 'is_updated': False, 'batch_size': None, 'workspace': '$PWD', 'name': None, 'on_gpu': False, 'warn_unnamed': False, 'max_snapshot': 0, 'py_modules': None, 'replica_id': '{root.metas.replica_id}', 'separated_workspace': '{root.metas.separated_workspace}', 'replica_workspace': '{root.metas.workspace}/{root.metas.name}-{root.metas.replica_id}'}
#12 7.392 router@22[E][pea:run:247]:unknown exception: 'NoneType' object has no attribute 'anchor'
#12 7.392 Traceback (most recent call last):
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/peapods/pea.py", line 217, in run
#12 7.392 self.post_init()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/peapods/pea.py", line 269, in post_init
#12 7.392 self.load_executor()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/peapods/pea.py", line 158, in load_executor
#12 7.392 self.args.separated_workspace, self.replica_id)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/executors/__init__.py", line 396, in load_config
#12 7.392 return yaml.load(tmp_s)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 343, in load
#12 7.392 return constructor.get_single_data()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 113, in get_single_data
#12 7.392 return self.construct_document(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 118, in construct_document
#12 7.392 data = self.construct_object(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392 data = self.construct_non_recursive_object(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 181, in construct_non_recursive_object
#12 7.392 data = constructor(self, node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/executors/__init__.py", line 438, in from_yaml
#12 7.392 return cls._get_instance_from_yaml(constructor, node, stop_on_import_error)[0]
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/executors/__init__.py", line 443, in _get_instance_from_yaml
#12 7.392 constructor, node, deep=True)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping
#12 7.392 return BaseConstructor.construct_mapping(self, node, deep=deep)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 255, in construct_mapping
#12 7.392 value = self.construct_object(value_node, deep=deep)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392 data = self.construct_non_recursive_object(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
#12 7.392 for _dummy in generator:
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 723, in construct_yaml_map
#12 7.392 value = self.construct_mapping(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping
#12 7.392 return BaseConstructor.construct_mapping(self, node, deep=deep)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 255, in construct_mapping
#12 7.392 value = self.construct_object(value_node, deep=deep)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392 data = self.construct_non_recursive_object(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
#12 7.392 for _dummy in generator:
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 723, in construct_yaml_map
#12 7.392 value = self.construct_mapping(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping
#12 7.392 return BaseConstructor.construct_mapping(self, node, deep=deep)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 255, in construct_mapping
#12 7.392 value = self.construct_object(value_node, deep=deep)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392 data = self.construct_non_recursive_object(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
#12 7.392 for _dummy in generator:
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 717, in construct_yaml_seq
#12 7.392 data.extend(self.construct_sequence(node))
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 211, in construct_sequence
#12 7.392 return [self.construct_object(child, deep=deep) for child in node.value]
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 211, in <listcomp>
#12 7.392 return [self.construct_object(child, deep=deep) for child in node.value]
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392 data = self.construct_non_recursive_object(node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 181, in construct_non_recursive_object
#12 7.392 data = constructor(self, node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/drivers/__init__.py", line 111, in from_yaml
#12 7.392 return cls._get_instance_from_yaml(constructor, node)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/drivers/__init__.py", line 118, in _get_instance_from_yaml
#12 7.392 obj = cls(**data.get('with', {}))
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/executors/decorators.py", line 85, in arg_wrapper
#12 7.392 _defaults = get_default_metas()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/jina/executors/metas.py", line 134, in get_default_metas
#12 7.392 _defaults = yaml.load(fp) # do not expand variables at here, i.e. DO NOT USE expand_dict(yaml.load(fp))
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 343, in load
#12 7.392 return constructor.get_single_data()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 111, in get_single_data
#12 7.392 node = self.composer.get_single_node()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 78, in get_single_node
#12 7.392 document = self.compose_document()
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 101, in compose_document
#12 7.392 node = self.compose_node(None, None)
#12 7.392 File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 120, in compose_node
#12 7.392 anchor = event.anchor
#12 7.392 AttributeError: 'NoneType' object has no attribute 'anchor'
#12 7.397 router@22[I][zmq:__e:122]:bytes_sent: 0 KB bytes_recv:0 KB
Setting all host_in and host_out to 0.0.0.0 works fine locally, but it makes the Docker container inaccessible.
I also feel that once this is solved, we are one step closer to a remote Flow API.
There are 12 models in torchvision.
Semantic segmentation models can be used as transformers.
Object detection models can be used as transformers.
With the help of a multilingual pretrained model, we can do cross language search. The quality of the results is in doubt.
flow serialization error? need to check
paddlePaddle models: https://github.com/PaddlePaddle/models
flair reference: https://github.com/flairNLP/flair
Keras application includes 7 types of models (10 models in total)
I fixed the Python version requirements in extra-requirements.txt. This should fix the unit test on Python 3.7,
but bert-for-tf2 requires tensorflow to be installed as a dependency. This is overkill: in order to use a tokenizer for BERT, does one need to install almost all of pytorch, paddlepaddle and tensorflow? Apparently bert-for-tf2 is badly written:
https://github.com/kpe/bert-for-tf2/
It is also not official.
Remove it from jina; we don't want this overkill yet trivial function as a dependency. Write a simple function if possible.
PaddleHub has 14 types of models (17 models in total)
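Use the latent variables of a GAN to represent images.
Models in the object detection category can be used as transformers.
Models in the image segmentation category can be used as transformers.
The output of models in the keypoint detection category needs to be used together with an embedding.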
A SPAWN request contains the args of a Pod.
A SPAWN request comes from request_generator in the form of a gRPC request and goes to the frontend.
Not affecting the run, but annoying; needs digging into.
Jina build will build the docker image and do a unit test to make sure this container is valid and usable
perhaps a new type of message in ControlRequest?
PaddleHub has 3 types of models (4 models in total)
Attribute is_updated is defined outside __init__. See details: executors/__init__.py#L204
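A minimal sketch of what this warning is about, using two hypothetical classes (Before and After) rather than the actual Jina executor: when an attribute is first assigned in an ordinary method, it does not exist until that method has run, whereas initializing it in __init__ guarantees it is always present.

```python
class Before:
    """Hypothetical class: is_updated first appears inside touch()."""

    def touch(self):
        self.is_updated = True  # attribute defined outside __init__


class After:
    """Hypothetical class: is_updated is declared up front."""

    def __init__(self):
        self.is_updated = False  # always present, with a known default

    def touch(self):
        self.is_updated = True


# Before: the attribute is missing until touch() has been called.
missing_before_touch = not hasattr(Before(), 'is_updated')

# After: the attribute can be read safely at any time.
default_value = After().is_updated
```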
not via QEMU
DummyIndexer inherits everything from NumpyIndexer except add(). During indexing, the add() function in DummyIndexer is not wrapped by as_update_method(). The add() function is expected to be wrapped in executors/__init__.py::register_class().
class DummyIndexer(NumpyIndexer):
    # the add() function is simply copied from NumpyIndexer
    def add(self, keys: 'np.ndarray', vectors: 'np.ndarray', *args, **kwargs):
        if len(vectors.shape) != 2:
            raise ValueError('vectors shape %s is not valid, expecting "vectors" to have rank of 2' % vectors.shape)
        if not self.num_dim:
            self.num_dim = vectors.shape[1]
            self.dtype = vectors.dtype.name
        elif self.num_dim != vectors.shape[1]:
            raise ValueError(
                "vectors' shape [%d, %d] does not match with indexers's dim: %d" %
                (vectors.shape[0], vectors.shape[1], self.num_dim))
        elif self.dtype != vectors.dtype.name:
            raise TypeError(
                "vectors' dtype %s does not match with indexers's dtype: %s" %
                (vectors.dtype.name, self.dtype))
        elif keys.shape[0] != vectors.shape[0]:
            raise ValueError('number of key %d not equal to number of vectors %d' % (keys.shape[0], vectors.shape[0]))
        elif self.key_dtype != keys.dtype.name:
            raise TypeError(
                "keys' dtype %s does not match with indexers keys's dtype: %s" %
                (keys.dtype.name, self.key_dtype))
        self.write_handler.write(vectors.tobytes())
        self.key_bytes += keys.tobytes()
        self.key_dtype = keys.dtype.name
        self._size += keys.shape[0]
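The expected behaviour can be sketched in isolation. The code below is a simplified stand-in, not Jina's implementation: the names as_update_method, register_class and ExecutorType mirror the ones mentioned in the report, but their bodies, the update_methods tuple and the _is_update_wrapped marker are assumptions made for illustration. The idea is that registration inspects the methods defined directly on each new class, so an overridden add() in a subclass is wrapped as well.

```python
import functools


def as_update_method(func):
    """Simplified stand-in: flag the executor as updated after func runs."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)
        self.is_updated = True
        return result
    wrapper._is_update_wrapped = True  # hypothetical marker to avoid double wrapping
    return wrapper


class ExecutorType(type):
    update_methods = ('add',)  # assumed list of methods that modify state

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        return mcs.register_class(cls)

    @staticmethod
    def register_class(cls):
        # Look only at methods defined on this class, so overrides are wrapped too.
        for name in ExecutorType.update_methods:
            func = cls.__dict__.get(name)
            if func is not None and not getattr(func, '_is_update_wrapped', False):
                setattr(cls, name, as_update_method(func))
        return cls


class BaseIndexer(metaclass=ExecutorType):
    def __init__(self):
        self.is_updated = False
        self.items = []

    def add(self, item):
        self.items.append(item)


class DummyIndexer(BaseIndexer):
    def add(self, item):  # override, wrapped again at class creation
        self.items.append(('dummy', item))


indexer = DummyIndexer()
indexer.add(1)
```

In this sketch the subclass override ends up wrapped because wrapping happens per class at creation time. Whether the reported problem has the same cause in Jina is not established by the report.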
Related to #33
See PR in #35 (JEP-2) Section "Can we support remote Pod in the Flow API?"
JEP-2 rules out remote control of a Jina Flow, but on second thought, I believe adding remote control may give Jina a competitive advantage over K8s and Docker Swarm: it enables users to use Jina in a distributed manner without learning container orchestration. The learning curve of the Jina Flow API is much gentler than that of its counterparts.
So maybe it is a good idea to include remote control in the Flow API?
For example, all encoders inherited from BaseEncoder should have the following as default:
requests:
  on:
    [SearchRequest, TrainRequest, IndexRequest]:
      - !EncodeDriver
        with:
          method: encode
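One way such per-type defaults could be resolved is sketched below. This is an assumption-laden illustration, not Jina's mechanism: DEFAULT_REQUESTS, resolve_requests() and the plain-dict representation of drivers are all hypothetical. The lookup walks the class hierarchy and returns the first default registered for a base class, unless the user supplied a configuration of their own.

```python
# Hypothetical defaults keyed by base-class name (not Jina's actual data structure).
DEFAULT_REQUESTS = {
    'BaseEncoder': {
        ('SearchRequest', 'TrainRequest', 'IndexRequest'): [
            {'driver': 'EncodeDriver', 'with': {'method': 'encode'}},
        ],
    },
}


class BaseExecutor:
    pass


class BaseEncoder(BaseExecutor):
    def encode(self, data):
        return data


class MyEncoder(BaseEncoder):
    pass


def resolve_requests(executor_cls, user_requests=None):
    """Return the user's request config, else the nearest base-class default."""
    if user_requests:
        return user_requests
    # __mro__ lists the class itself first, then its bases in resolution order.
    for klass in executor_cls.__mro__:
        default = DEFAULT_REQUESTS.get(klass.__name__)
        if default is not None:
            return default
    return {}


encoder_requests = resolve_requests(MyEncoder)
```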
then we can have multiple connect
As a part of co-op with deepset
and put into self-hosted runner ci
this leads to the following exception
compound_c@3378[E][pea:run:270]:unknown exception: list indices must be integers or slices, not str
Traceback (most recent call last):
File "/Users/nanwang/Codes/jina-ai/jina/jina/peapods/pea.py", line 252, in run
self.post_init()
File "/Users/nanwang/Codes/jina-ai/jina/jina/peapods/pea.py", line 288, in post_init
self.load_executor()
File "/Users/nanwang/Codes/jina-ai/jina/jina/peapods/pea.py", line 167, in load_executor
self.executor.attach(pea=self)
File "/Users/nanwang/Codes/jina-ai/jina/jina/executors/__init__.py", line 528, in attach
d.attach(executor=self, *args, **kwargs)
File "/Users/nanwang/Codes/jina-ai/jina/jina/drivers/__init__.py", line 169, in attach
self._exec = executor.components[self._executor_name]
TypeError: list indices must be integers or slices, not str
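The failing line indexes executor.components with a string, which only works if components is a mapping. A minimal sketch of the error and of one possible workaround, assuming each component exposes a name attribute (the Component class here is hypothetical):

```python
class Component:
    """Hypothetical stand-in for an executor component with a name."""

    def __init__(self, name):
        self.name = name


components = [Component('encoder'), Component('indexer')]

# Indexing a list with a string reproduces the reported TypeError.
try:
    components['indexer']
except TypeError as ex:
    error_message = str(ex)

# Possible workaround: build a name -> component mapping before the lookup.
by_name = {c.name: c for c in components}
target = by_name['indexer']
```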
JEP2 turns out to be very challenging. Let's first work on supporting container alone.