jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack

Home Page: https://docs.jina.ai

License: Apache License 2.0

Python 96.94% Shell 0.84% Dockerfile 0.48% Go 1.65% C 0.09%
neural-search cloud-native deep-learning machine-learning framework grpc kubernetes multimodal mlops pipeline

jina's Issues

figure out why ruamel.yaml returns a strange error on non-x64 architectures

docker buildx build --platform linux/arm64 -t jinaai/jina:master-multiarch -o type=registry --file ./Dockerfiles/debianx.Dockerfile --progress plain .

AttributeError: 'NoneType' object has no attribute 'anchor'

#12 7.392     node = self.compose_node(None, None)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 120, in compose_node
#12 7.392     anchor = event.anchor
#12 7.392 AttributeError: 'NoneType' object has no attribute 'anchor'

Complete trace:

#12 7.392 {'is_trained': False, 'is_updated': False, 'batch_size': None, 'workspace': '$PWD', 'name': None, 'on_gpu': False, 'warn_unnamed': False, 'max_snapshot': 0, 'py_modules': None, 'replica_id': '{root.metas.replica_id}', 'separated_workspace': '{root.metas.separated_workspace}', 'replica_workspace': '{root.metas.workspace}/{root.metas.name}-{root.metas.replica_id}'}
#12 7.392     router@22[E][pea:run:247]:unknown exception: 'NoneType' object has no attribute 'anchor'
#12 7.392 Traceback (most recent call last):
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/peapods/pea.py", line 217, in run
#12 7.392     self.post_init()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/peapods/pea.py", line 269, in post_init
#12 7.392     self.load_executor()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/peapods/pea.py", line 158, in load_executor
#12 7.392     self.args.separated_workspace, self.replica_id)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/executors/__init__.py", line 396, in load_config
#12 7.392     return yaml.load(tmp_s)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 343, in load
#12 7.392     return constructor.get_single_data()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 113, in get_single_data
#12 7.392     return self.construct_document(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 118, in construct_document
#12 7.392     data = self.construct_object(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392     data = self.construct_non_recursive_object(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 181, in construct_non_recursive_object
#12 7.392     data = constructor(self, node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/executors/__init__.py", line 438, in from_yaml
#12 7.392     return cls._get_instance_from_yaml(constructor, node, stop_on_import_error)[0]
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/executors/__init__.py", line 443, in _get_instance_from_yaml
#12 7.392     constructor, node, deep=True)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping
#12 7.392     return BaseConstructor.construct_mapping(self, node, deep=deep)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 255, in construct_mapping
#12 7.392     value = self.construct_object(value_node, deep=deep)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392     data = self.construct_non_recursive_object(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
#12 7.392     for _dummy in generator:
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 723, in construct_yaml_map
#12 7.392     value = self.construct_mapping(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping
#12 7.392     return BaseConstructor.construct_mapping(self, node, deep=deep)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 255, in construct_mapping
#12 7.392     value = self.construct_object(value_node, deep=deep)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392     data = self.construct_non_recursive_object(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
#12 7.392     for _dummy in generator:
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 723, in construct_yaml_map
#12 7.392     value = self.construct_mapping(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping
#12 7.392     return BaseConstructor.construct_mapping(self, node, deep=deep)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 255, in construct_mapping
#12 7.392     value = self.construct_object(value_node, deep=deep)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392     data = self.construct_non_recursive_object(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
#12 7.392     for _dummy in generator:
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 717, in construct_yaml_seq
#12 7.392     data.extend(self.construct_sequence(node))
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 211, in construct_sequence
#12 7.392     return [self.construct_object(child, deep=deep) for child in node.value]
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 211, in <listcomp>
#12 7.392     return [self.construct_object(child, deep=deep) for child in node.value]
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 146, in construct_object
#12 7.392     data = self.construct_non_recursive_object(node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 181, in construct_non_recursive_object
#12 7.392     data = constructor(self, node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/drivers/__init__.py", line 111, in from_yaml
#12 7.392     return cls._get_instance_from_yaml(constructor, node)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/drivers/__init__.py", line 118, in _get_instance_from_yaml
#12 7.392     obj = cls(**data.get('with', {}))
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/executors/decorators.py", line 85, in arg_wrapper
#12 7.392     _defaults = get_default_metas()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/jina/executors/metas.py", line 134, in get_default_metas
#12 7.392     _defaults = yaml.load(fp)  # do not expand variables at here, i.e. DO NOT USE expand_dict(yaml.load(fp))
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 343, in load
#12 7.392     return constructor.get_single_data()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 111, in get_single_data
#12 7.392     node = self.composer.get_single_node()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 78, in get_single_node
#12 7.392     document = self.compose_document()
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 101, in compose_document
#12 7.392     node = self.compose_node(None, None)
#12 7.392   File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/composer.py", line 120, in compose_node
#12 7.392     anchor = event.anchor
#12 7.392 AttributeError: 'NoneType' object has no attribute 'anchor'
#12 7.397     router@22[I][zmq:__e:122]:bytes_sent: 0 KB bytes_recv:0 KB

implement torchvision encoder

image classification

torchvision provides 12 image classification model families:

  1. AlexNet
  2. VGG
  3. ResNet
  4. SqueezeNet
  5. DenseNet
  6. Inception v3
  7. GoogLeNet
  8. ShuffleNet v2
  9. MobileNet v2
  10. ResNeXt
  11. Wide ResNet
  12. MNASNet

discussion

  • semantic segmentation models can be used as transformers

    1. FCN ResNet101
    2. DeepLabV3 ResNet101
  • object detection models can be used as transformers

    1. Faster R-CNN ResNet-50 FPN
    2. Mask R-CNN ResNet-50 FPN

reference

https://pytorch.org/docs/stable/torchvision/models.html

demo on a multilingual search

With the help of a multilingual pretrained model, we can do cross-language search. The quality of the results is still in doubt.

removing bert-for-tf2 from extra-requirements.txt

I fixed the Python version requirements in extra-requirements.txt. This should fix the unit test on Python 3.7.

However, bert-for-tf2 requires tensorflow to be installed, which is overkill. Just to use a tokenizer for BERT, one would end up installing nearly all of pytorch, paddlepaddle and tensorflow? Apparently bert-for-tf2 is badly written:

https://github.com/kpe/bert-for-tf2/

It is also not official.

Action point

Remove it from jina. We don't want a dependency this heavy for such a trivial function. Write a simple function instead if possible.
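As a rough idea of how small such a function could be, here is a minimal sketch of greedy longest-match-first WordPiece tokenization in pure Python. It is not the bert-for-tf2 implementation and skips the punctuation splitting and Unicode normalization a full BERT tokenizer performs. The function name, its parameters and the toy vocabulary are assumptions made for illustration; a real vocabulary would be loaded from the pretrained model's vocabulary file.

```python
def wordpiece_tokenize(text, vocab, unk_token="[UNK]", max_chars_per_word=100):
    """Greedy longest-match-first WordPiece over whitespace-split, lowercased words."""
    tokens = []
    for word in text.lower().split():
        if len(word) > max_chars_per_word:
            tokens.append(unk_token)
            continue
        pieces = []
        start = 0
        while start < len(word):
            end = len(word)
            piece = None
            # Shrink the window from the right until a vocabulary entry matches.
            while start < end:
                candidate = word[start:end]
                if start > 0:
                    candidate = "##" + candidate  # continuation pieces carry a "##" prefix
                if candidate in vocab:
                    piece = candidate
                    break
                end -= 1
            if piece is None:
                # Nothing matches the remainder, so the whole word becomes unknown.
                pieces = [unk_token]
                break
            pieces.append(piece)
            start = end
        tokens.extend(pieces)
    return tokens


# Toy vocabulary, for illustration only.
toy_vocab = {"un", "##aff", "##able", "search", "##ing", "jina"}
print(wordpiece_tokenize("unaffable searching", toy_vocab))
# ['un', '##aff', '##able', 'search', '##ing']
```

A function like this needs no deep learning framework at all, which is the point of the action item.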

implement PaddleHub encoder

image classification

PaddleHub has 14 types of models (17 models in total)

  1. Xception
  2. VGG
  3. ShuffleNet V2
  4. ResNeXt
    • ResNeXt_vd
    • SE_ResNeXt
    • ResNeXt_wsl
    • ResNeXt
  5. ResNet
    • ResNet V2
  6. PNASNet
  7. Mobilenet_v2
  8. Inception_V4
  9. GoogleNet
  10. EfficientNet
  11. DPN
  12. DenseNet
  13. DarkNet
  14. AlexNet

image generation

Use the latent variables of a GAN to represent images

  1. STGAN
  2. StarGAN
  3. CycleGAN
  4. AttGAN

discussion

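  • models in the object detection category can be used as transformers

    1. YOLOv3
    2. Ultra-Light-Fast-Generic-Face-Detector
    3. SSD
    4. PyramidBox
    5. faster_rcnn
  • models in the image segmentation category can be used as transformers

    1. deeplabv3p
    2. ACE2P
  • the output of models in the keypoint detection category needs to be used together with embeddings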

    1. pos_resnet50

reference:

https://www.paddlepaddle.org.cn/hublist

Support remote control using Pod and frontend

  • The SPAWN request contains the args of a Pod.
  • The SPAWN request comes from request_generator in the form of a gRPC request and goes to the frontend.
  • The frontend receives the request and starts a Pod.
  • All log output of this Pod is streamed back over the connection the request came in on.
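The flow above can be sketched with the standard library alone. This is a minimal sketch, not Jina's implementation: `handle_spawn_request` is a hypothetical name, the "Pod" is stood in for by an arbitrary child process, and the gRPC transport is left out. The generator shape is chosen because a server-streaming gRPC handler in Python is typically written as a generator that yields one response per message.

```python
import subprocess
import sys


def handle_spawn_request(pod_args):
    """Start a child process for the requested Pod and yield its log lines as they appear."""
    # Hypothetical stand-in: a real frontend would build a Pod from `pod_args`;
    # here any command line is accepted so the sketch stays runnable.
    proc = subprocess.Popen(
        pod_args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so all log output is sent back
        text=True,
    )
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.stdout.close()
        proc.wait()


# A fake "Pod" that just prints two log lines.
fake_pod = [sys.executable, "-c", "print('pod ready'); print('pod closed')"]
for log_line in handle_spawn_request(fake_pod):
    print("remote log:", log_line)
```

Each yielded line would become one message on the response stream, so the caller sees the Pod's logs while it is still running rather than after it exits.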

Ideas on NLP demos

principle

  1. The demo must be AWESOME. We must show that we are way better than conventional search engines. Specifically, we need to show something that the old-school search engines can NOT do.
  2. The demo must be simple and easy to reproduce.
  3. The demo should present good results.
  4. The data and the model should be small, i.e. less than 300MB.
  5. The data should be neutral, non-racial, and non-offensive. For example, hate speech identification is a bad idea.
  6. The data should be publicly available, with no need to register or log in.
  7. The data should not need a lot of munging.
  8. The quality of the search results should be easy to judge.
  9. The search task should be meaningful and useful.

Potential dataset

  1. Amazon Food Review, 240MB
  2. SouthParkData
  3. Urban dictionary words

add `jina build` to CLI API

`jina build` will build the Docker image and run a unit test to make sure the container is valid and usable.
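One way the build-then-verify step could look is sketched below. This is only an illustration under stated assumptions: `build_and_check` and its parameters are hypothetical, the "unit test" is reduced to running the container once with `--help` and checking the exit code, and the image tag is made up. Only the `docker build -t/-f` and `docker run --rm` invocations are standard Docker CLI usage.

```python
import subprocess
from types import SimpleNamespace


def build_and_check(image_tag, context_dir=".", dockerfile=None,
                    smoke_cmd=("--help",), runner=subprocess.run):
    """Build a Docker image, then run it once as a smoke test; True if both succeed."""
    build_cmd = ["docker", "build", "-t", image_tag]
    if dockerfile:
        build_cmd += ["-f", dockerfile]
    build_cmd.append(context_dir)
    # --rm removes the throwaway container once the check finishes.
    test_cmd = ["docker", "run", "--rm", image_tag, *smoke_cmd]
    for cmd in (build_cmd, test_cmd):
        if runner(cmd).returncode != 0:
            return False
    return True


# Dry run: record the commands instead of invoking Docker.
recorded = []


def dry_run(cmd):
    recorded.append(cmd)
    return SimpleNamespace(returncode=0)


ok = build_and_check("example/pod:test", runner=dry_run)
for cmd in recorded:
    print(" ".join(cmd))
```

Passing the runner in as a parameter keeps the command construction testable on machines without Docker installed.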

The `as_update_method` decorator is not working for customized executors

DummyIndexer inherits everything from NumpyIndexer except add(). During indexing, the add() function in DummyIndexer is not wrapped by as_update_method(). The add() function is expected to be wrapped in executors/__init__.py::register_class().

class DummyIndexer(NumpyIndexer):
    # the add() function is simply copied from NumpyIndexer
    def add(self, keys: 'np.ndarray', vectors: 'np.ndarray', *args, **kwargs):
        if len(vectors.shape) != 2:
            raise ValueError('vectors shape %s is not valid, expecting "vectors" to have rank of 2' % vectors.shape)

        if not self.num_dim:
            self.num_dim = vectors.shape[1]
            self.dtype = vectors.dtype.name
        elif self.num_dim != vectors.shape[1]:
            raise ValueError(
                "vectors' shape [%d, %d] does not match with indexers's dim: %d" %
                (vectors.shape[0], vectors.shape[1], self.num_dim))
        elif self.dtype != vectors.dtype.name:
            raise TypeError(
                "vectors' dtype %s does not match with indexers's dtype: %s" %
                (vectors.dtype.name, self.dtype))
        elif keys.shape[0] != vectors.shape[0]:
            raise ValueError('number of key %d not equal to number of vectors %d' % (keys.shape[0], vectors.shape[0]))
        elif self.key_dtype != keys.dtype.name:
            raise TypeError(
                "keys' dtype %s does not match with indexers keys's dtype: %s" %
                (keys.dtype.name, self.key_dtype))

        self.write_handler.write(vectors.tobytes())
        self.key_bytes += keys.tobytes()
        self.key_dtype = keys.dtype.name
        self._size += keys.shape[0]
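For reference, the expected behaviour (an overriding add() in a subclass still gets wrapped at class registration time) can be illustrated with a toy metaclass. This is a simplified sketch, not Jina's register_class(): `ToyExecutorType`, `ToyIndexer`, `ToyDummyIndexer` and the simplified `as_update_method` below are all stand-ins written for this example.

```python
import functools


def as_update_method(func):
    """Simplified stand-in: flag the executor as updated after the wrapped call."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)
        self.is_updated = True
        return result
    wrapper._wrapped_as_update = True  # marker to avoid wrapping twice
    return wrapper


class ToyExecutorType(type):
    """Toy metaclass: wrap add() on every class that defines it, subclasses included."""
    def __new__(mcs, name, bases, namespace):
        add = namespace.get("add")
        # Inspect the class's own namespace so an overriding add() is wrapped too.
        if callable(add) and not getattr(add, "_wrapped_as_update", False):
            namespace["add"] = as_update_method(add)
        return super().__new__(mcs, name, bases, namespace)


class ToyIndexer(metaclass=ToyExecutorType):
    is_updated = False

    def add(self, keys):
        self.size = getattr(self, "size", 0) + len(keys)


class ToyDummyIndexer(ToyIndexer):
    def add(self, keys):  # override, as DummyIndexer does in the report
        self.size = getattr(self, "size", 0) + len(keys)


indexer = ToyDummyIndexer()
indexer.add([1, 2, 3])
print(indexer.is_updated)  # True
```

If the wrapping only looked at methods defined on the base class, the override in the subclass would replace the wrapped method with a plain one, which matches the symptom described here.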

Adding remote control support to Jina Flow?

Related to #33

See PR in #35 (JEP-2) Section "Can we support remote Pod in the Flow API?"

JEP-2 rules out remote control of Jina Flow, but on second thought, I believe adding remote control may give Jina a competitive advantage over K8s and Docker Swarm: it enables users to run Jina in a distributed manner without learning container orchestration. The learning curve of the Jina Flow API is much gentler than that of its counterparts.

So maybe it is a good idea to include remote control in Flow API?

getter method is missing in CompoundExecutor.components

This leads to the following exception:

compound_c@3378[E][pea:run:270]:unknown exception: list indices must be integers or slices, not str
Traceback (most recent call last):
  File "/Users/nanwang/Codes/jina-ai/jina/jina/peapods/pea.py", line 252, in run
    self.post_init()
  File "/Users/nanwang/Codes/jina-ai/jina/jina/peapods/pea.py", line 288, in post_init
    self.load_executor()
  File "/Users/nanwang/Codes/jina-ai/jina/jina/peapods/pea.py", line 167, in load_executor
    self.executor.attach(pea=self)
  File "/Users/nanwang/Codes/jina-ai/jina/jina/executors/__init__.py", line 528, in attach
    d.attach(executor=self, *args, **kwargs)
  File "/Users/nanwang/Codes/jina-ai/jina/jina/drivers/__init__.py", line 169, in attach
    self._exec = executor.components[self._executor_name]
TypeError: list indices must be integers or slices, not str
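The last frame of the traceback indexes `executor.components` with a string while it is a list. One possible shape for a fix is sketched below, assuming components are stored in a list and each has a `name` attribute. The class names and the `__getitem__` lookup are hypothetical and not taken from the Jina code base.

```python
class ToyComponent:
    def __init__(self, name):
        self.name = name


class ToyCompoundExecutor:
    def __init__(self, components):
        self._components = list(components)

    @property
    def components(self):
        """Getter: return the underlying list of components."""
        return self._components

    def __getitem__(self, item):
        """Accept a position (int or slice) or a component name (str)."""
        if isinstance(item, str):
            for comp in self._components:
                if comp.name == item:
                    return comp
            raise KeyError(f"no component named {item!r}")
        return self._components[item]


compound = ToyCompoundExecutor([ToyComponent("encoder"), ToyComponent("indexer")])
print(compound["indexer"].name)  # indexer
print(compound[0].name)          # encoder
```

With a lookup like this, a driver could resolve its executor by name through the compound executor itself instead of indexing the list directly.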
