jina-ai / example-video-search

This is an example of searching videos using Jina.

License: Apache License 2.0

Python 90.93% Shell 9.07%

example-video-search's Introduction

Build A Video Search System using Jina

NOTE: A simplified version of this example is on the feat-simple-tutorial branch, which contains the full code for the tutorial at docs.jina.ai.

Imagine that you remember one specific scene from a movie, for example the scene from The Lord of the Rings where Gandalf is fighting the dragon Balrog. Unfortunately, you have forgotten the names Gandalf and Balrog, and also which of the three movies the scene occurs in. How could you find the scene now? This is where this example can help. This video search system lets you search movies with text: you could search for 'Old wizard fighting dragon' and the system would return the matching movie and the timestamp of the scene.

Overview

About this example:
Learnings	How to search through both image frames and audio of a video.
Used for indexing	Video Files.
Used for querying	Text Query (e.g. "girl studying engineering")
Dataset used	Choose your own videos
Model used	AudioCLIP

In this example, we create a video search system that retrieves the videos based on short text descriptions of the scenes. The main challenge is to enable the user to search videos without using any labels or text information about the videos.

We use the AudioCLIP model to encode both the video frames and the audio.

Jina searches both the image frames and the audio of the video and returns the matched video and a timestamp.

🐍 Build the app with Python

These instructions explain how to build the example yourself and deploy it with Python.

🗝️ Requirements

You have a working Python 3.7 or 3.8 environment and an installation of Docker. Ensure that Docker is allocated enough memory (more than 6 GB). You can set this under Settings > Resources > Advanced in Docker.
You have at least 5 GB of free space on your hard drive.
You have installed ffmpeg and it is available from the command line (it's in your PATH environment variable). On Ubuntu, this should cover it: sudo apt-get install -y ffmpeg
We recommend creating a new Python virtual environment to have a clean installation of Jina and prevent dependency conflicts.

python -m venv venv
source venv/bin/activate
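
The requirements above can be checked from Python before you continue. This is a minimal sketch, not part of the repository: the function names `python_version_ok` and `check_requirements` are made up for illustration, and the script only checks what the list states (Python 3.7 or 3.8, `ffmpeg` and `docker` in your PATH, at least 5 GB of free disk space). It does not check how much memory Docker has been given.

```python
import shutil
import sys


def python_version_ok(version=None):
    """Return True if the (major, minor) version is one the README asks for."""
    major_minor = tuple(version or sys.version_info)[:2]
    return major_minor in ((3, 7), (3, 8))


def check_requirements(min_free_gb=5, path="."):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if not python_version_ok():
        found = "{}.{}".format(*sys.version_info[:2])
        problems.append(f"Python {found} found, README asks for 3.7 or 3.8")
    # Both tools must be callable from the command line.
    for tool in ("ffmpeg", "docker"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found in PATH")
    # Compare the free space on the drive holding `path` with the minimum.
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    if free_gb < min_free_gb:
        problems.append(f"only {free_gb:.1f} GB free, need at least {min_free_gb}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

Run it from the folder where you plan to clone the repo so that the disk check looks at the right drive.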

👾 Step 1. Clone the repo and install Jina

Begin by cloning the repo, so you can get the required files and datasets.

git clone https://github.com/jina-ai/example-video-search
cd example-video-search

In your terminal, you should now be located in the example-video-search folder. Let's install Jina and the other required Python libraries. For further information on installing Jina check out our documentation.

pip install -r requirements.txt

Step 2. Download the AudioCLIP model.

We recommend downloading the AudioCLIP model in advance. To do that, run:

bash scripts/download_model.sh

🏃 Step 3. Index your data

To quickly get started, you can index a small dataset to make sure everything is working correctly.

To index the toy dataset, run

python app.py -m grpc

After indexing, the search flow is started automatically and three simple test queries are performed. The results are displayed in your terminal.

We recommend you come back to this step later and index more data.

🔎 Step 4: Query your data

After indexing once, you can query without indexing by running

python app.py -m restful_query

Afterwards, you can query with

curl -X 'POST' 'localhost:45678/search' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"data": [{"text": "this is a highway"}]}'

The retrieved results contain the video filename (id) and the best-matching frame in that video, together with its timestamp.

You can also add more parameters to the query:

curl -X POST -d '{"parameters":{"top_k": 5}, "data": ["a black dog and a spotted dog are fighting"]}' -H 'accept: application/json' -H 'Content-Type: application/json' 'http://localhost:45678/search'

Once you run this command, you should see JSON output. It contains the video URI and the timestamp, which together identify the part of the video that matches the query text. By default, the toy_data contains two videos clipped from YouTube.
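
The same queries can be sent from Python instead of curl. The sketch below uses only the standard library and mirrors the endpoint, headers, and payload of the curl examples above; the helper names `build_search_request` and `search` are made up for illustration. The layout of the returned JSON is not described here, so the sketch returns the parsed response without reading specific fields.

```python
import json
import urllib.request

# Endpoint taken from the curl examples above.
SEARCH_URL = "http://localhost:45678/search"


def build_search_request(text, top_k=None, url=SEARCH_URL):
    """Build the POST request that the first curl example sends."""
    payload = {"data": [{"text": text}]}
    if top_k is not None:
        # Optional query parameters, as in the second curl example.
        payload["parameters"] = {"top_k": top_k}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def search(text, top_k=None):
    """Send the query and return the parsed JSON response."""
    with urllib.request.urlopen(build_search_request(text, top_k)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the query Flow to be running (python app.py -m restful_query).
    print(json.dumps(search("this is a highway", top_k=5), indent=2))
```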

🌀 Flow diagram

This diagram provides a visual representation of the Flows in this example, showing which executors are used and in which order. Remember, our goal is to compare vectors representing the semantics of images and audio with vectors encoding the semantics of short text descriptions.

Indexing

As you can see, the Flow that indexes the data contains two parallel branches:

Image: Encodes image frames from the video and indexes them.
Audio: Encodes the audio of the video and indexes it.

Querying

The query Flow differs from the index Flow. We encode the text input using the AudioCLIP model and then compare the resulting embedding with the audio and image embeddings stored in the indexers. The indexers then add the closest matches to the documents.
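
To make the comparison step concrete, here is a toy sketch of nearest-neighbour matching between one query embedding and a few stored embeddings. It is an illustration under assumptions, not the indexer's code: cosine similarity is assumed as the metric (the metric the indexers use is not stated here), the three-dimensional vectors are invented, and real AudioCLIP embeddings are much larger.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_matches(query, indexed, top_k=2):
    """Return the top_k (key, score) pairs, highest similarity first."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in indexed.items()]
    scored.sort(reverse=True)
    return [(key, score) for score, key in scored[:top_k]]


if __name__ == "__main__":
    # Invented embeddings standing in for indexed frames and audio clips.
    indexed = {
        "frame@3s": [0.9, 0.1, 0.0],
        "frame@12s": [0.1, 0.8, 0.3],
        "audio@5s": [0.0, 0.2, 0.9],
    }
    text_query = [1.0, 0.2, 0.0]  # invented embedding of the text query
    for key, score in closest_matches(text_query, indexed):
        print(f"{key}: {score:.3f}")
```

Because text, image, and audio are all encoded by the same model into one vector space, a single similarity function can score a text query against both kinds of stored embedding.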

🔮 Overview of the files


📃 `index-flow.yml`	YAML file to configure indexing Flow
📃 `search-flow.yml`	YAML file to configure querying Flow
📃 `executors.py`	File that contains Ranker and ModalityFilter executors
📂 `workspace/`	Folder to store indexed files (embeddings and documents). Automatically created after the first indexing
📂 `toy-data/`	Folder to store the toy dataset for the example
📃 `app.py`	Main file that runs the example

⏭️ Next steps

Did you like this example and are you interested in building your own? For a detailed tutorial on how to build your Jina app, check out the How to Build Your First Jina App guide in our documentation.

To learn more about Jina concepts, check out the cookbooks.

If you have any issues following this guide, you can always get support from our Slack community.

👩‍👩‍👧‍👦 Community

Slack channel - a communication platform for developers to discuss Jina.
LinkedIn - get to know Jina AI as a company and find job opportunities.
Twitter - follow us and interact with us using hashtag #JinaSearch.
Company - know more about our company, we are fully committed to open-source!

🦄 License

Jina is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

example-video-search's Issues

Flow is aborted due to ['image_encoder', 'audio_encoder'] can not be started

Hi,
I'm running on macOS 12.0.1.
While trying to run the example (AS-IS) I'm getting the following errors:

image_encoder@90235[E]:Exception('the container fails to start, check the arguments or entrypoint') during <class 'jina.peapods.runtimes.container.ContainerRuntime'> initialization
...
audio_encoder@90246[E]:Exception('the container fails to start, check the arguments or entrypoint') during <class 'jina.peapods.runtimes.container.ContainerRuntime'> initialization

Any assistance would be appreciated

Here is the full trace

(base) ➜  example-video-search git:(main) ✗ python app.py

UserWarning: It looks like you are trying to import multiple python modules using `py_modules`. When using multiple python files to define an executor, the recommended practice is to structure the files in a python package, and only import the `__init__.py` file of that package. For more details, please check out the cookbook: https://docs.jina.ai/fundamentals/executor/repository-structure/ (raised from /Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/jaml/helper.py:242)
UserWarning: 
            executor shadows one of built-in Python module name.
            It is imported as `user_module.executor`

            Affects:
            - Either, change your code from using `from executor import ...`
              to `from user_module.executor import ...`
            - Or, rename executor to another name
             (raised from /Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/importer.py:111)
UserWarning: It looks like you are trying to import multiple python modules using `py_modules`. When using multiple python files to define an executor, the recommended practice is to structure the files in a python package, and only import the `__init__.py` file of that package. For more details, please check out the cookbook: https://docs.jina.ai/fundamentals/executor/repository-structure/ (raised from /Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/jaml/helper.py:242)
UserWarning: 
            executor shadows one of built-in Python module name.
            It is imported as `user_module.executor`

            Affects:
            - Either, change your code from using `from executor import ...`
              to `from user_module.executor import ...`
            - Or, rename executor to another name
             (raised from /Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/importer.py:111)
⠙ 8/10 waiting image_encoder audio_encoder to be ready...  image_encoder@64132[I]:                  
  image_encoder@64132[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNNNNNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNNNNNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNNNNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMWxxxxxxxxxOMMMMMNxxxxxxxxx0MMMMMKddddddxkKWMMMMMMMMMMMMXOxdddxONMMMM
  image_encoder@64132[I]:MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOllllllllllo0MMMMMMMM0olllllllllo0MM
  image_encoder@64132[I]:MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOlllllllllllloWMMMMMdllllllllllllldM
  image_encoder@64132[I]:MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOllllllllllllloMMMM0lllllllllllllllK
  image_encoder@64132[I]:MMMMMMMMMMMMKllllllllldMMMMM0lllllllllxMMMMMOllllllllllllllKMMM0lllllllllllllllO
  image_encoder@64132[I]:MMMMMMMMMMMMKllllllllldMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMollllllllllllllO
  image_encoder@64132[I]:MWOkkkkk0MMMKlllllllllkMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMMxlllllllllllllO
  image_encoder@64132[I]:NkkkkkkkkkMMKlllllllloMMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMMMWOdolllllllllO
  image_encoder@64132[I]:KkkkkkkkkkNMKllllllldMMMMMMMMWWWWWWWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MOkkkkkkk0MMKllllldXMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:MMWX00KXMMMMXxk0XMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  image_encoder@64132[I]:
  image_encoder@64132[I]:▶️  /usr/local/bin/jina executor --uses config.yml --name image_encoder --workspace /Users/sephi/git_forked/example-video-search --identity 5627a0e8-a635-4863-bcab-ab4e5608715f --workspace-id ceb0455c-0b52-447b-8b00-77dd84264f03 --zmq-identity 1332449d-1aac-4038-bd33-22dd03f1d681 --port-ctrl 52448 --uses-with {"traversal_paths": ["c"]} --port-in 52449 --port-out 52450 --hosts-in-connect --socket-in ROUTER_BIND --socket-out ROUTER_BIND --native --num-part 1 --dynamic-routing-out --dynamic-routing-in --runs-in-docker --upload-files --noblock-on-start
  image_encoder@64132[I]:🔧️                            cli = executor                      
  image_encoder@64132[I]:ctrl-with-ipc = False
  image_encoder@64132[I]:daemon = False
  image_encoder@64132[I]:disable-remote = False
  image_encoder@64132[I]:docker-kwargs = None
  image_encoder@64132[I]:dump-path =
  image_encoder@64132[I]:🔧️             dynamic-routing-in = True                          
  image_encoder@64132[I]:🔧️            dynamic-routing-out = True                          
  image_encoder@64132[I]:entrypoint = None
  image_encoder@64132[I]:env = None
  image_encoder@64132[I]:expose-public = False
  image_encoder@64132[I]:force = False
  image_encoder@64132[I]:gpus = None
  image_encoder@64132[I]:grpc-data-requests = False
  image_encoder@64132[I]:host = 0.0.0.0
  image_encoder@64132[I]:host-in = 0.0.0.0
  image_encoder@64132[I]:host-out = 0.0.0.0
  image_encoder@64132[I]:🔧️               hosts-in-connect = []                            
  image_encoder@64132[I]:🔧️                       identity = 5627a0e8-a635-4863-bcab-ab4e56
  image_encoder@64132[I]:install-requirements = False
  image_encoder@64132[I]:k8s-connection-pool = True
  image_encoder@64132[I]:k8s-namespace = None
  image_encoder@64132[I]:log-config = /usr/local/lib/python3.7/site-
  image_encoder@64132[I]:memory-hwm = -1
  image_encoder@64132[I]:🔧️                           name = image_encoder                 
  image_encoder@64132[I]:🔧️                         native = True                          
  image_encoder@64132[I]:🔧️               noblock-on-start = True                          
  image_encoder@64132[I]:🔧️                       num-part = 1                             
  image_encoder@64132[I]:on-error-strategy = IGNORE
  image_encoder@64132[I]:pea-id = 0
  image_encoder@64132[I]:pea-role = SINGLETON
  image_encoder@64132[I]:🔧️                      port-ctrl = 52448                         
  image_encoder@64132[I]:🔧️                        port-in = 52449                         
  image_encoder@64132[I]:port-jinad = 8000
  image_encoder@64132[I]:🔧️                       port-out = 52450                         
  image_encoder@64132[I]:pull-latest = False
  image_encoder@64132[I]:py-modules = None
  image_encoder@64132[I]:quiet = False
  image_encoder@64132[I]:quiet-error = False
  image_encoder@64132[I]:quiet-remote-logs = False
  image_encoder@64132[I]:routing-table = None
  image_encoder@64132[I]:🔧️                 runs-in-docker = True                          
  image_encoder@64132[I]:runtime-backend = PROCESS
  image_encoder@64132[I]:runtime-cls = ZEDRuntime
  image_encoder@64132[I]:🔧️                      socket-in = ROUTER_BIND                   
  image_encoder@64132[I]:🔧️                     socket-out = ROUTER_BIND                   
  image_encoder@64132[I]:ssh-keyfile = None
  image_encoder@64132[I]:ssh-password = None
  image_encoder@64132[I]:ssh-server = None
  image_encoder@64132[I]:static-routing-table = False
  image_encoder@64132[I]:timeout-ctrl = 5000
  image_encoder@64132[I]:timeout-ready = 600000
  image_encoder@64132[I]:🔧️                   upload-files = []                            
  image_encoder@64132[I]:🔧️                           uses = config.yml                    
  image_encoder@64132[I]:uses-metas = None
  image_encoder@64132[I]:uses-requests = None
  image_encoder@64132[I]:🔧️                      uses-with = {'traversal_paths': ['c']}    
  image_encoder@64132[I]:volumes = None
  image_encoder@64132[I]:🔧️                      workspace = /Users/sephi/git_forked/exampl
  image_encoder@64132[I]:🔧️                   workspace-id = ceb0455c-0b52-447b-8b00-77dd84
  image_encoder@64132[I]:🔧️                   zmq-identity = 1332449d-1aac-4038-bd33-22dd03
⠏ 8/10 waiting image_encoder audio_encoder to be ready...  image_encoder@64132[I]:                  
  image_encoder@64132[I]:           JINA@ 1[W]:You are using Jina version 2.1.7, however version 2.4.5 is available. You should consider upgrading via the "pip install --upgrade jina" command.
⠋ 8/10 waiting image_encoder audio_encoder to be ready...  image_encoder@64132[I]:Model already exists! Skipping.
  image_encoder@64132[I]:Vocab already exists! Skipping.
⠼ 8/10 waiting image_encoder audio_encoder to be ready...  audio_encoder@64144[I]:                  
  audio_encoder@64144[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNNNNNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNNNNNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNNNNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMWxxxxxxxxxOMMMMMNxxxxxxxxx0MMMMMKddddddxkKWMMMMMMMMMMMMXOxdddxONMMMM
  audio_encoder@64144[I]:MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOllllllllllo0MMMMMMMM0olllllllllo0MM
  audio_encoder@64144[I]:MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOlllllllllllloWMMMMMdllllllllllllldM
  audio_encoder@64144[I]:MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOllllllllllllloMMMM0lllllllllllllllK
  audio_encoder@64144[I]:MMMMMMMMMMMMKllllllllldMMMMM0lllllllllxMMMMMOllllllllllllllKMMM0lllllllllllllllO
  audio_encoder@64144[I]:MMMMMMMMMMMMKllllllllldMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMollllllllllllllO
  audio_encoder@64144[I]:MWOkkkkk0MMMKlllllllllkMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMMxlllllllllllllO
  audio_encoder@64144[I]:NkkkkkkkkkMMKlllllllloMMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMMMWOdolllllllllO
  audio_encoder@64144[I]:KkkkkkkkkkNMKllllllldMMMMMMMMWWWWWWWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MOkkkkkkk0MMKllllldXMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:MMWX00KXMMMMXxk0XMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
  audio_encoder@64144[I]:
  audio_encoder@64144[I]:▶️  /usr/local/bin/jina executor --uses config.yml --name audio_encoder --workspace /Users/sephi/git_forked/example-video-search --identity 81bae0e1-775a-43b0-9970-93f4aaec97b3 --workspace-id ad82497e-c4b9-4d35-89ee-aa1918bf1610 --zmq-identity f5ebfade-f5a4-4115-a521-4cbded8be07d --port-ctrl 53749 --uses-with {"traversal_paths": ["c"]} --port-in 53750 --port-out 53751 --hosts-in-connect --socket-in ROUTER_BIND --socket-out ROUTER_BIND --native --num-part 1 --dynamic-routing-out --dynamic-routing-in --runs-in-docker --upload-files --noblock-on-start
  audio_encoder@64144[I]:🔧️                            cli = executor                      
  audio_encoder@64144[I]:ctrl-with-ipc = False
  audio_encoder@64144[I]:daemon = False
  audio_encoder@64144[I]:disable-remote = False
  audio_encoder@64144[I]:docker-kwargs = None
  audio_encoder@64144[I]:dump-path =
  audio_encoder@64144[I]:🔧️             dynamic-routing-in = True                          
  audio_encoder@64144[I]:🔧️            dynamic-routing-out = True                          
  audio_encoder@64144[I]:entrypoint = None
  audio_encoder@64144[I]:env = None
  audio_encoder@64144[I]:expose-public = False
  audio_encoder@64144[I]:force = False
  audio_encoder@64144[I]:gpus = None
  audio_encoder@64144[I]:grpc-data-requests = False
  audio_encoder@64144[I]:host = 0.0.0.0
  audio_encoder@64144[I]:host-in = 0.0.0.0
  audio_encoder@64144[I]:host-out = 0.0.0.0
  audio_encoder@64144[I]:🔧️               hosts-in-connect = []                            
  audio_encoder@64144[I]:🔧️                       identity = 81bae0e1-775a-43b0-9970-93f4aa
  audio_encoder@64144[I]:install-requirements = False
  audio_encoder@64144[I]:k8s-connection-pool = True
  audio_encoder@64144[I]:k8s-namespace = None
  audio_encoder@64144[I]:log-config = /usr/local/lib/python3.7/site-
  audio_encoder@64144[I]:memory-hwm = -1
  audio_encoder@64144[I]:🔧️                           name = audio_encoder                 
  audio_encoder@64144[I]:🔧️                         native = True                          
  audio_encoder@64144[I]:🔧️               noblock-on-start = True                          
  audio_encoder@64144[I]:🔧️                       num-part = 1                             
  audio_encoder@64144[I]:on-error-strategy = IGNORE
  audio_encoder@64144[I]:pea-id = 0
  audio_encoder@64144[I]:pea-role = SINGLETON
  audio_encoder@64144[I]:🔧️                      port-ctrl = 53749                         
  audio_encoder@64144[I]:🔧️                        port-in = 53750                         
  audio_encoder@64144[I]:port-jinad = 8000
  audio_encoder@64144[I]:🔧️                       port-out = 53751                         
  audio_encoder@64144[I]:pull-latest = False
  audio_encoder@64144[I]:py-modules = None
  audio_encoder@64144[I]:quiet = False
  audio_encoder@64144[I]:quiet-error = False
  audio_encoder@64144[I]:quiet-remote-logs = False
  audio_encoder@64144[I]:routing-table = None
  audio_encoder@64144[I]:🔧️                 runs-in-docker = True                          
  audio_encoder@64144[I]:runtime-backend = PROCESS
  audio_encoder@64144[I]:runtime-cls = ZEDRuntime
  audio_encoder@64144[I]:🔧️                      socket-in = ROUTER_BIND                   
  audio_encoder@64144[I]:🔧️                     socket-out = ROUTER_BIND                   
  audio_encoder@64144[I]:ssh-keyfile = None
  audio_encoder@64144[I]:ssh-password = None
  audio_encoder@64144[I]:ssh-server = None
  audio_encoder@64144[I]:static-routing-table = False
  audio_encoder@64144[I]:timeout-ctrl = 5000
  audio_encoder@64144[I]:timeout-ready = 600000
  audio_encoder@64144[I]:🔧️                   upload-files = []                            
  audio_encoder@64144[I]:🔧️                           uses = config.yml                    
  audio_encoder@64144[I]:uses-metas = None
  audio_encoder@64144[I]:uses-requests = None
  audio_encoder@64144[I]:🔧️                      uses-with = {'traversal_paths': ['c']}    
  audio_encoder@64144[I]:volumes = None
  audio_encoder@64144[I]:🔧️                      workspace = /Users/sephi/git_forked/exampl
  audio_encoder@64144[I]:🔧️                   workspace-id = ad82497e-c4b9-4d35-89ee-aa1918
  audio_encoder@64144[I]:🔧️                   zmq-identity = f5ebfade-f5a4-4115-a521-4cbded
⠧ 8/10 waiting image_encoder audio_encoder to be ready...  audio_encoder@64144[I]:                  
  audio_encoder@64144[I]:           JINA@ 1[W]:You are using Jina version 2.1.7, however version 2.4.5 is available. You should consider upgrading via the "pip install --upgrade jina" command.
⠇ 8/10 waiting image_encoder audio_encoder to be ready...  image_encoder@64132[E]:Exception('the container fails to start, check the arguments or entrypoint') during <class 'jina.peapods.runtimes.container.ContainerRuntime'> initialization
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/peapods/peas/__init__.py", line 79, in run
    runtime = runtime_cls(
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/peapods/runtimes/container/__init__.py", line 49, in __init__
    raise Exception(
Exception: the container fails to start, check the arguments or entrypoint
⠋ 9/10 waiting audio_encoder to be ready...  audio_encoder@64144[I]:Model already exists! Skipping. 
  audio_encoder@64144[I]:Vocab already exists! Skipping.
⠼ 9/10 waiting audio_encoder to be ready...  audio_encoder@64144[E]:Exception('the container fails to start, check the arguments or entrypoint') during <class 'jina.peapods.runtimes.container.ContainerRuntime'> initialization
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/peapods/peas/__init__.py", line 79, in run
    runtime = runtime_cls(
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/peapods/runtimes/container/__init__.py", line 49, in __init__
    raise Exception(
Exception: the container fails to start, check the arguments or entrypoint
           Flow@63885[E]:Flow is aborted due to ['image_encoder', 'audio_encoder'] can not be started.
Traceback (most recent call last):
  File "/Users/sephi/git_forked/example-video-search/app.py", line 76, in <module>
    main()
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/sephi/git_forked/example-video-search/app.py", line 58, in main
    with Flow.load_config('index-flow.yml', override_with=override_dict) as f:
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/flow/base.py", line 1115, in __enter__
    return self.start()
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/flow/base.py", line 1161, in start
    self._wait_until_all_ready()
  File "/Users/sephi/git_forked/example-video-search/.direnv/python-3.9.5/lib/python3.9/site-packages/jina/flow/base.py", line 1245, in _wait_until_all_ready
    raise RuntimeFailToStart
jina.excepts.RuntimeFailToStart

RuntimeError('MD5 checksum failed.')

Hi,
I'm starting to get MD5 checksum failed errors on various files.
I've run a clean installation of the example and I'm getting the following error

frame_extractor@63989[E]:Error while pulling jinahub://VideoLoader/v0.3: 
RuntimeError('MD5 checksum failed.')

Updated to jinahub://VideoLoader/v0.4

But then I got an error for SimpleIndexer

image_indexer@75458[E]:Error while pulling jinahub://SimpleIndexer/v0.4: 
RuntimeError('MD5 checksum failed.')

I then updated to jinahub://SimpleIndexer/v0.6

Finally, the index is working.

The search failed with jinahub://SimpleIndexer/v0.4; upgrading to v0.6 let the search complete successfully.

Are you aware of the MD5 issue?

Advice for the ranker in the search flow

Currently, the search flow returns a ranked list of videos: as I increase top_k in the search flow, the displayed result is a longer list of videos. I think it might be better to rank by the returned chunks instead. For example, if I search for 'ring bell', a single video might contain several segments with the sound of a bell ringing. If the user asks for the top 10 results, the current output is the 10 highest-ranked distinct videos. But the last two of those may not contain a bell sound at all, while the best-matching video contains several such segments. So wouldn't it be better to display 10 chunks? That may work better when users want several results.
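
The difference between the two ranking strategies discussed here can be sketched with toy data. This is not the repository's Ranker executor: the function names and the `(video, timestamp, score)` tuples are invented for illustration, and a higher score is assumed to mean a better match.

```python
def rank_videos(chunk_matches, top_k):
    """Video-level ranking: keep each video's best chunk, then take top_k videos."""
    best = {}
    for video, timestamp, score in chunk_matches:
        if video not in best or score > best[video][1]:
            best[video] = (timestamp, score)
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [(video, ts, score) for video, (ts, score) in ranked[:top_k]]


def rank_chunks(chunk_matches, top_k):
    """Chunk-level ranking: take the top_k chunks, even if they share a video."""
    return sorted(chunk_matches, key=lambda match: match[2], reverse=True)[:top_k]


if __name__ == "__main__":
    # Invented matches: a.mp4 has two strong segments, c.mp4 barely matches.
    matches = [
        ("a.mp4", 3.0, 0.95),
        ("a.mp4", 41.0, 0.90),
        ("b.mp4", 7.0, 0.40),
        ("c.mp4", 1.0, 0.10),
    ]
    print("by video:", rank_videos(matches, top_k=3))
    print("by chunk:", rank_chunks(matches, top_k=3))
```

With these numbers, video-level ranking fills the third slot with the weak match from c.mp4, while chunk-level ranking returns the second strong segment of a.mp4 instead.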

Migrate to Jina 3

Migrate to Jina 3 as part of LP tutorials.

Image encoder failed to start

I am trying to run this example. I also tried different versions of Jina, and I set the path dynamically as well, but it still throws the same error. Please help.

⠸ Fetching meta data of SimpleIndexer...  image_encoder@11633[E]:fail to load file dependency
  image_encoder@11633[E]:ExecutorFailToLoad() during <class 'jina.peapods.runtimes.zmq.zed.ZEDRuntime'> initialization
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 85, in _load_executor
    runtime_args=vars(self.args),
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/__init__.py", line 556, in load_config
    return JAML.load(tag_yml, substitute=False)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/__init__.py", line 90, in load
    r = yaml.load(stream, Loader=JinaLoader)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/constructor.py", line 55, in construct_document
    data = self.construct_object(node)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/__init__.py", line 427, in _from_yaml
    return get_parser(cls, version=data.get('version', None)).parse(cls, data)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/parsers/executor/legacy.py", line 73, in parse
    runtime_args=data.get('runtime_args', {}),
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/executors/decorators.py", line 65, in arg_wrapper
    f = func(self, *args, **kwargs)
  File "/home/jyoti/.jina/hub-packages/3atsazub/executor/audioclip_image.py", line 51, in __init__
    self.model = AudioCLIP(pretrained=model_path).to(device).eval()
  File "/home/jyoti/.jina/hub-packages/3atsazub/executor/audio_clip/model/audioclip.py", line 98, in __init__
    self.load_state_dict(torch.load(self.pretrained, map_location='cpu'), strict=False)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '.cache/AudioCLIP-Full-Training.pt'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/peas/__init__.py", line 76, in run
    cancel_event=cancel_event,
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 52, in __init__
    self._data_request_handler = DataRequestHandler(self.args, self.logger)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 68, in __init__
    self._load_executor()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 95, in _load_executor
    raise ExecutorFailToLoad from ex
jina.excepts.ExecutorFailToLoad
UserWarning: 
                executor shadows one of built-in Python module name.
                It is imported as `jinahub.executor`
                
                Affects:
                - Either, change your code from using `from executor import ...` to `from jinahub.executor import ...`
                - Or, rename executor to another name
                 (raised from /home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/importer.py:197)
UserWarning: 
                executors shadows one of built-in Python module name.
                It is imported as `jinahub.executors`
                
                Affects:
                - Either, change your code from using `from executors import ...` to `from jinahub.executors import ...`
                - Or, rename executors to another name
                 (raised from /home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/importer.py:197)
UserWarning: 
                executors shadows one of built-in Python module name.
                It is imported as `jinahub.executors`
                
                Affects:
                - Either, change your code from using `from executors import ...` to `from jinahub.executors import ...`
                - Or, rename executors to another name
                 (raised from /home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/importer.py:197)
⠼ Fetching meta data of SimpleIndexer...  audio_encoder@11693[E]:fail to load file dependency
  audio_encoder@11693[E]:ExecutorFailToLoad() during <class 'jina.peapods.runtimes.zmq.zed.ZEDRuntime'> initialization
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/home/jyoti/.jina/hub-packages/f4d22e1r/executor/audio_clip_encoder.py", line 43, in __init__
    self.model = AudioCLIP(pretrained=model_path).to(device).eval()
  File "/home/jyoti/.jina/hub-packages/f4d22e1r/executor/audio_clip/model/audioclip.py", line 98, in __init__
    self.load_state_dict(torch.load(self.pretrained, map_location='cpu'), strict=False)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'assets/AudioCLIP-Full-Training.pt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 85, in _load_executor
    runtime_args=vars(self.args),
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/__init__.py", line 556, in load_config
    return JAML.load(tag_yml, substitute=False)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/__init__.py", line 90, in load
    r = yaml.load(stream, Loader=JinaLoader)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/constructor.py", line 55, in construct_document
    data = self.construct_object(node)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/__init__.py", line 427, in _from_yaml
    return get_parser(cls, version=data.get('version', None)).parse(cls, data)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/jaml/parsers/executor/legacy.py", line 73, in parse
    runtime_args=data.get('runtime_args', {}),
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/executors/decorators.py", line 65, in arg_wrapper
    f = func(self, *args, **kwargs)
  File "/home/jyoti/.jina/hub-packages/f4d22e1r/executor/audio_clip_encoder.py", line 46, in __init__
    'Please download AudioCLIP model and set the `model_path` argument.'
FileNotFoundError: Please download AudioCLIP model and set the `model_path` argument.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/peas/__init__.py", line 76, in run
    cancel_event=cancel_event,
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 52, in __init__
    self._data_request_handler = DataRequestHandler(self.args, self.logger)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 68, in __init__
    self._load_executor()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 95, in _load_executor
    raise ExecutorFailToLoad from ex
jina.excepts.ExecutorFailToLoad
UserWarning: 
                executor shadows one of built-in Python module name.
                It is imported as `jinahub.executor`
                
                Affects:
                - Either, change your code from using `from executor import ...` to `from jinahub.executor import ...`
                - Or, rename executor to another name
                 (raised from /home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/importer.py:197)
frame_extractor@11581[L]:ready and listening
   image_filter@11581[L]:ready and listening
           Flow@11581[E]:image_encoder:<jina.peapods.pods.Pod object at 0x7f2ebf55ab90> can not be started due to RuntimeFailToStart(), Flow is aborted
        gateway@11581[W]:Pea is being closed before being ready. Most likely some other Pea in the Flow or Pod failed to start
       join_all@11581[W]:Pea is being closed before being ready. Most likely some other Pea in the Flow or Pod failed to start
Traceback (most recent call last):
  File "app.py", line 76, in <module>
    main()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "app.py", line 58, in main
    with Flow.load_config('index-flow.yml', override_with=override_dict) as f:
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/flow/base.py", line 930, in __enter__
    return self.start()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/flow/base.py", line 975, in start
    v.wait_start_success()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/pods/__init__.py", line 517, in wait_start_success
    p.wait_start_success()
  File "/home/jyoti/jina_video/jin_videoenv/lib/python3.7/site-packages/jina/peapods/peas/__init__.py", line 266, in wait_start_success
    raise RuntimeFailToStart
jina.excepts.RuntimeFailToStart

Please help. Thank you.
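
For reference, both `FileNotFoundError` messages in the log name relative paths (`.cache/AudioCLIP-Full-Training.pt` and `assets/AudioCLIP-Full-Training.pt`), and Python resolves relative paths against the directory the process was started from. A minimal sketch of a pre-flight check that could be run from the same directory as `app.py` before starting the Flow is shown below. The helper name `find_missing` and the `EXPECTED_WEIGHTS` list are hypothetical and not part of Jina or the example repository; the two paths are copied from the log, and whether an executor's `model_path` resolves to them in a given setup is an assumption.

```python
from pathlib import Path

# Relative paths copied from the two FileNotFoundError messages in the log.
# Assumption: these are the locations the encoders look for in this setup.
EXPECTED_WEIGHTS = [
    Path(".cache/AudioCLIP-Full-Training.pt"),
    Path("assets/AudioCLIP-Full-Training.pt"),
]


def find_missing(paths, base_dir=None):
    """Return the entries of `paths` that are not existing files.

    Relative paths are resolved against `base_dir`, which defaults to the
    current working directory (the same rule `open()` applies).
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return [p for p in paths if not (base / p).is_file()]


if __name__ == "__main__":
    missing = find_missing(EXPECTED_WEIGHTS)
    for p in missing:
        print(f"missing: {p} (looked in {Path.cwd()})")
    if not missing:
        print("all model files found")
```

If either path is reported missing, that matches the message at the end of the traceback, which asks to download the AudioCLIP model and set the `model_path` argument.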