kimtth / visual-genius

🪅 Visual learning aids for children with autism spectrum disorder. Built with Azure OpenAI, Azure Cognitive Search, Azure AI services, FastAPI, and Next.js

CSS 1.13% TypeScript 47.86% JavaScript 1.00% Python 44.44% PLpgSQL 1.03% Bicep 4.14% PowerShell 0.41%
autism-spectrum-disorder education healthcare autism azure-openai bing visual-learning childrens card-generator neurodiversity

visual-genius's Introduction

Visual Genius: Communication Assistant

Most children with autism spectrum disorders (ASD) are visual learners. They tend to comprehend visual information better than auditory input, making visual supports more effective for their learning process.

The project was started because creating visual cards by hand is laborious and because there is market demand for a better, more cost-effective product. It also aims to democratize AI by making it accessible to everyone, and to overcome the limitations of earlier products with the help of new technologies.

  • Visual aid products on the market (Bing search results) can be quite costly (Amazon search results).
  • Applied behavior analysis (ABA) is a therapeutic approach for treating ASD.
  • Visual aids in applied behavior analysis (ABA) URL

Key Features

  1. Switching between personas (Parents and Caregivers, Children) and generation modes (List, Steps, Manual)
  2. Visual Card generation and management (Set the order of images by Drag and Drop)
  3. Semantic Image search
  4. Video generation from images (To teach work procedures)
  5. Text-to-Image

Application Preview

git_preview.mp4

Technology stack

  1. Vector-based image search: Azure Cognitive Search & Computer Vision API for Vector embedding
  2. Text-to-image generation: Azure OpenAI GPT-3.5 and image generation by Azure OpenAI DALL-E

    Because image generation is slow, only the last image is generated by DALL-E.

  3. Bing Image Search
  4. Fluent emoji dataset
  5. Azure Cognitive Services Speech, text to speech (reads the text on the card aloud)
  6. [Optional] Microsoft COCO dataset (everyday life images)

    Used as the test dataset for semantic image search. Semantic search finds images based on their visual features rather than on metadata tags or image file names.
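To illustrate the idea of searching by features rather than by tags, here is a minimal sketch that ranks images by cosine similarity between embedding vectors. The three-dimensional vectors and file names are made up for illustration. In the project the embeddings come from the Computer Vision API and the ranking is done by Azure Cognitive Search, not by code like this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(query_vector, image_vectors, top_k=3):
    """Return the top_k image names ordered by similarity to the query."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in image_vectors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Toy embeddings (hypothetical values, real ones have many more dimensions).
images = {
    "toothbrush.png": [0.9, 0.1, 0.0],
    "school_bus.png": [0.0, 0.2, 0.9],
    "toothpaste.png": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedding of "brush teeth"
print(rank_images(query, images, top_k=2))
```

Note that none of the file names needs to contain the query words. Only the closeness of the vectors decides the ranking.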

Development environment

Note: Make sure Node.js and Python 3 are installed.

To preview and run the project on your device:

  1. Open the project folder in Visual Studio Code.
  2. In the terminal, run npm install.
  3. Run npm run dev to view the project in a browser.
  4. Run python app.py to launch the backend.

Important: react-beautiful-dnd does not work well with reactStrictMode: true in Next.js. Turn the option off in next.config.js.

Loading Data

  • The [Optional] steps are needed for demonstration purposes and are not mandatory for deploying the application.

    1. Upload your image data to Azure Blob Storage.

      • [Optional] dataset > data > Upload to Blob image container
      • dataset > emoji > Upload to Blob emoji container
    2. Image and category metadata are managed in a SQL database.

      • DB Creation: backend\infra\db_postgres.sql
      • [Optional] DB Data Generate: backend\util\postgre_gen_db_data.py
    3. Image search requires an Azure Cognitive Search index.

      • Azure Cognitive Search index creation: backend\util\acs_index_gen.py
      • [Optional] Trigger the indexer: the web skill (Azure Functions: acs_skillset_for_indexer) must be deployed before the indexer is triggered.
    4. [Optional] Update and synchronize the 'sid' attribute in Azure Cognitive Search based on the metadata in the SQL database.

      • backend\util\data_for_dev\acs_index_mapping_with_postgre.py

Sample images and data for development can be found in the dataset and backend\util directories.
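The optional 'sid' synchronization step can be pictured with the sketch below. It is not the code in acs_index_mapping_with_postgre.py. It only shows one way to turn database rows into a merge payload in the shape the Azure Cognitive Search "index documents" REST call accepts. The field names id and file_name, and the choice to match on file name, are assumptions made for illustration. Only sid comes from the text above.

```python
def build_merge_actions(db_rows, index_docs):
    """Build a merge payload that copies 'sid' from DB rows onto index documents.

    db_rows:    iterable of (sid, file_name) tuples read from the SQL database.
    index_docs: iterable of dicts with hypothetical 'id' and 'file_name' fields.
    """
    sid_by_file = {file_name: sid for sid, file_name in db_rows}
    actions = []
    for doc in index_docs:
        sid = sid_by_file.get(doc["file_name"])
        # Skip documents with no matching row or with an up-to-date sid.
        if sid is None or doc.get("sid") == sid:
            continue
        actions.append({"@search.action": "merge", "id": doc["id"], "sid": sid})
    return {"value": actions}


rows = [(1, "a.png"), (2, "b.png")]
docs = [
    {"id": "x", "file_name": "a.png", "sid": None},  # needs an update
    {"id": "y", "file_name": "b.png", "sid": 2},     # already in sync
    {"id": "z", "file_name": "c.png"},               # no matching DB row
]
print(build_merge_actions(rows, docs))
```

Sending only the documents whose sid actually changed keeps the update request small when the script is run repeatedly.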

API Documentation (Swagger)

http://localhost:5000/docs
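FastAPI serves the Swagger UI at /docs and, by default, the underlying OpenAPI document at /openapi.json. Assuming those defaults are unchanged in this backend, a small sketch like the following lists the available endpoints from the command line. The function names are illustrative and not part of the project.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_endpoints(spec):
    """Return 'METHOD /path' strings for every operation in an OpenAPI dict."""
    endpoints = []
    for path, operations in sorted(spec.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items may also hold non-operation keys such as 'parameters'.
            if method in HTTP_METHODS:
                endpoints.append(f"{method.upper()} {path}")
    return endpoints


def fetch_spec(base_url="http://localhost:5000"):
    """Download the OpenAPI document from a running backend."""
    with urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)
```

With the backend running locally, `print("\n".join(list_endpoints(fetch_spec())))` prints one line per route.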

Deploy to Azure

  1. Deploy the Azure resources using either an Azure template or Azure Bicep.
  • Azure Template

    • Click the Deploy to Azure template button.

  • Azure Bicep

    1. Deploy Azure resources > backend\infra

    Set up your parameters for Azure Bicep.

    "prefix": {
        "value": "<your-value-for-prefix>"
    },
    "pgsqlId": {
        "value": "<your-postgre-sql-id>"
    },
    "pgsqlPwd": {
        "value": "<your-postgre-sql-password>"
    }
    2. Execute the script for Azure Bicep
    PS> .\main.ps1 -resourceGroup <your-resource-group-name> -location <your-resource-location>
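If you prefer to generate the parameter values from a script, the sketch below builds them in the standard ARM/Bicep parameters-file layout. The three parameter names come from the fragment above. The output file name used in the usage note is a placeholder, so check backend\infra for the file that main.ps1 actually reads.

```python
import json

# Schema URI of the standard ARM deployment parameters file format.
SCHEMA = (
    "https://schema.management.azure.com/schemas/"
    "2019-04-01/deploymentParameters.json#"
)


def build_parameters(prefix, pgsql_id, pgsql_pwd):
    """Return a parameters document for the three values listed above."""
    return {
        "$schema": SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {
            "prefix": {"value": prefix},
            "pgsqlId": {"value": pgsql_id},
            "pgsqlPwd": {"value": pgsql_pwd},
        },
    }


def write_parameters(path, prefix, pgsql_id, pgsql_pwd):
    """Write the parameters document to 'path' as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_parameters(prefix, pgsql_id, pgsql_pwd), handle, indent=2)
```

For example, `write_parameters("my.parameters.json", "vg", "pgadmin", "<password>")` writes the file. Keep the generated file out of version control, since it contains the database password.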
  2. Build the Next.js application
  • Run npm run build. This builds the UI code and creates the public directory in the backend.

    "scripts": {
        "dev": "next dev",
        "build": "next build && next export -o backend/public",
        "start": "next start",
        "lint": "next lint"
    }
  • The .env.production file in the project root is embedded into the generated JavaScript files.

    # [Optional]
    MS_CLARITY_ID=
    ENV_TYPE=prod
  3. Upload the UI and Python code to Azure App Service (using the Azure Tools extension for Visual Studio Code)

    1. Navigate to the target App service using Azure Tools.
    2. Select the backend directory as the target directory.
    3. Click on Deploy to Web app...
  4. Set the startup command in Azure App Service.

    1. Open your Web App in the Azure Portal.

    2. Scroll to Configuration under Settings.

    3. Click on the General Settings tab.

    4. Enter the appropriate startup command.

      python app.py
  5. Set the environment variables in Azure App Service:

    1. In the Azure Portal, locate your App Service.
    2. On the left pane, click “Configuration”.
    3. Under “Application settings”, click “New application setting”.
    4. Fill in the name and value for each environment variable.
    5. Click “OK”, then click “Save” at the top.
    • Most of the values are mapped during deployment.

      AZURE_SEARCH_SERVICE_ENDPOINT=https://?.search.windows.net
      AZURE_SEARCH_INDEX_NAME=
      AZURE_SEARCH_ADMIN_KEY=
      COGNITIVE_SERVICES_ENDPOINT=https://?.cognitiveservices.azure.com
      COGNITIVE_SERVICES_API_KEY=
      BLOB_CONNECTION_STRING=
      BLOB_CONTAINER_NAME=
      BLOB_EMOJI_CONTAINER_NAME=
      AZURE_OPENAI_ENDPOINT=https://?.openai.azure.com/
      AZURE_OPENAI_API_KEY=
      AZURE_OPENAI_API_VERSION_IMG=2023-06-01-preview
      AZURE_OPENAI_API_VERSION_CHAT=2023-07-01-preview
      BING_IMAGE_SEARCH_KEY=
      SPEECH_SUBSCRIPTION_KEY=
      SPEECH_REGION=
      POSTGRE_HOST=
      POSTGRE_USER=
      POSTGRE_PORT=5432
      POSTGRE_DATABASE=
      POSTGRE_PASSWORD=
      ENV_TYPE=PROD
      APP_SECRET_STRING= // JWT token authentication key, e.g., mysecret
    • [Optional] backend/util/env_to_app_service_fmt.py converts the .env file to an appsettings.json file for the settings in Azure App Service.

      [
        {
          "name": "AZURE_SEARCH_SERVICE_ENDPOINT",
          "value": "https://?.search.windows.net"
        },
        {
          ...
        }
        ...
      ]
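The conversion described above amounts to turning KEY=VALUE lines into a list of name/value objects. The following is a minimal sketch of that idea and not the contents of env_to_app_service_fmt.py. It assumes a simple .env file without quoting or multi-line values.

```python
import json


def parse_env(text):
    """Convert .env-style text into a list of {'name', 'value'} dicts."""
    settings = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as connection
        # strings may themselves contain '='.
        name, _, value = line.partition("=")
        settings.append({"name": name.strip(), "value": value.strip()})
    return settings


sample = """
# search settings
AZURE_SEARCH_SERVICE_ENDPOINT=https://?.search.windows.net
POSTGRE_PORT=5432
AZURE_SEARCH_INDEX_NAME=
"""
print(json.dumps(parse_env(sample), indent=2))
```

The resulting JSON array has the same shape as the example above.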
  6. PostgreSQL supports vector search through the pgvector extension. It can be used to implement image search and may serve as an alternative to Azure Cognitive Search. Note, however, that PostgreSQL is not optimized for vector search.
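As a rough sketch of what a pgvector-based image search could look like, the helpers below build a nearest-neighbour query using pgvector's cosine distance operator (<=>). The table name images and the columns id, file_name, and embedding are hypothetical. The project does not define this schema, and running the query requires a PostgreSQL driver and the pgvector extension, neither of which is shown here.

```python
def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[1.0,0.5]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def build_similarity_query(table="images", column="embedding", limit=5):
    """Return a parameterized SQL string ordered by cosine distance.

    The table and column names are interpolated directly, so they must come
    from trusted code, never from user input. The query vector is passed
    separately as the single %s parameter.
    """
    return (
        f"SELECT id, file_name FROM {table} "
        f"ORDER BY {column} <=> %s::vector LIMIT {int(limit)}"
    )


sql = build_similarity_query(limit=3)
params = (to_vector_literal([0.12, 0.5, 1]),)
print(sql)
print(params)
```

With a driver that uses %s placeholders, such as psycopg, the pair would be passed as `cursor.execute(sql, params)`.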

visual-genius's People

Contributors

dependabot[bot], kimtth


visual-genius's Issues

Add select box for steps optimized prompt [closed]

UI
image

Backend

imgStepPrompt = '''
Please generate steps list based on the user's query or a topic within it. 
Each step should be concise and clear, less than 20 characters suitable for assisting children educations.
This output will be used for image searches, so please consider that. 
Output should be comma-separated string without any additional explanation.

The user query
{query}

Example of output
Look at the picture, Read the text, Listen to the audio, Repeat the word, Repeat the sentence
'''
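The prompt asks the model for a comma-separated string, so the backend has to fill in the query and split the reply into individual steps. A minimal sketch of that handling follows. The function names are illustrative, the short template stands in for the full prompt, and no model call is made.

```python
def build_prompt(template, query):
    """Insert the user's query into a template containing a {query} placeholder."""
    return template.format(query=query)


def parse_steps(model_output):
    """Split a comma-separated reply into clean, non-empty step strings."""
    return [step.strip() for step in model_output.split(",") if step.strip()]


def too_long(steps, limit=20):
    """Return the steps that break the 'less than 20 characters' rule."""
    return [step for step in steps if len(step) >= limit]


template = "The user query\n{query}\n"  # shortened stand-in for imgStepPrompt
print(build_prompt(template, "how to wash hands"))

reply = "Look at the picture, Read the text, Listen to the audio, Repeat the word"
steps = parse_steps(reply)
print(steps)
print(too_long(steps))
```

Each parsed step can then be used as one image search query, which is why the prompt asks for short, concrete phrases.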

Upgrade Broken Link verification mechanism [partially completed]

  1. The same URL is used both to generate vectors in Azure Cognitive Search and to save the content to Blob storage. Broken URLs are checked when results are first fetched from Bing search, but a verified URL can still produce different results in Azure Cognitive Search and in Blob storage.

  2. Switch the current logic to use the image binary instead of the URL, which should also speed up the overall image-handling process.
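One way to read the second point is to download each image once, confirm that the bytes really are an image, and then reuse those bytes for both embedding and storage. The sketch below shows that idea with the standard library only. It is an assumption about a possible approach, not the project's implementation, and it recognizes just a few common formats by their file signatures.

```python
from urllib.request import urlopen

# Leading bytes ("magic numbers") of common image formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


def detect_image_type(data):
    """Return the image format name for the given bytes, or None."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    # WebP files are RIFF containers with 'WEBP' at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def fetch_image(url, timeout=10):
    """Download a URL once and return its bytes only if they look like an image."""
    try:
        with urlopen(url, timeout=timeout) as response:
            data = response.read()
    except (OSError, ValueError):
        # Covers network errors, HTTP errors, timeouts, and malformed URLs.
        return None
    return data if detect_image_type(data) else None
```

A URL that returns an HTML error page with status 200 passes a plain link check but is rejected here, because the downloaded bytes do not start with a known image signature.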

Implement for Upload my own photos [in progress]

image

@todo

1. Real-time embedding creation (need to fix the indexer or develop separate functions) [not needed]
2. Migrate the local database to a cloud-native database (SQLite -> PostgreSQL) [completed]
3. Sync image IDs between the database and Azure Cognitive Search [in progress]
4. UI dialog for selecting multiple files [completed]
