kwankiahn / semantic-search-with-amazon-opensearch

This project forked from aws-samples/semantic-search-with-amazon-opensearch

License: MIT No Attribution

Shell 0.35% JavaScript 2.58% Python 10.14% CSS 0.19% HTML 0.38% Jupyter Notebook 86.35%

semantic-search-with-amazon-opensearch's Introduction

Improve search relevance with machine learning in Amazon OpenSearch Service

This repository guides users through building semantic search with Amazon SageMaker and Amazon OpenSearch Service.

How does it work?

This code repository accompanies the Semantic and Vector Search with Amazon OpenSearch Service Workshop. For more information about semantic search, please refer to the workshop content.

Semantic Search Architecture

semantic_search_fullstack

Retrieval Augmented Generation Architecture

rag

Conversational Search Architecture

converstational-search

CloudFormation Deployment

  1. The workshop can only be deployed in the us-east-1 region.
  2. Use the CloudFormation template cfn/semantic-search.yaml to create the CloudFormation stack.
  3. The CloudFormation stack name must be semantic-search, because the labs reference this stack name.
  4. You can use the following link to deploy the CloudFormation stack:

     Region                   Launch Template
     US East (N. Virginia)    Deploy to AWS
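
For readers who prefer scripting the deployment over the launch link, the steps above can be sketched with boto3. This is a minimal sketch, not the workshop's own tooling: the helper names `build_create_stack_args` and `deploy` are hypothetical, and the `Capabilities` values assume the template creates IAM resources, which the README does not state. The stack name, region, and template path come from the steps above.

```python
STACK_NAME = "semantic-search"           # the labs expect exactly this name
REGION = "us-east-1"                     # the only region the workshop supports
TEMPLATE_PATH = "cfn/semantic-search.yaml"


def build_create_stack_args(template_body: str) -> dict:
    """Return keyword arguments for the CloudFormation create_stack call."""
    return {
        "StackName": STACK_NAME,
        "TemplateBody": template_body,
        # Assumption: the template creates IAM resources. Remove if it does not.
        "Capabilities": ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    }


def deploy(template_path: str = TEMPLATE_PATH) -> str:
    """Create the stack and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()

    client = boto3.client("cloudformation", region_name=REGION)
    response = client.create_stack(**build_create_stack_args(template_body))
    # Block until creation finishes; the waiter raises if the stack fails.
    client.get_waiter("stack_create_complete").wait(StackName=STACK_NAME)
    return response["StackId"]
```

Calling `deploy()` from the repository root starts the stack creation. `TemplateBody` is limited to 51,200 bytes; if the template is larger, upload it to Amazon S3 and pass `TemplateURL` instead.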

Lab Instruction

There are 8 modules in this workshop:

  • Module 1 - Search basics: You will learn the fundamentals of text search and semantic search. This section also introduces the differences between the Best Matching algorithm, popularly known as BM25 similarity, and semantic similarity.

  • Module 2 - Text search: You will learn text search with Amazon OpenSearch Service. In information retrieval, this type of search is traditionally called 'keyword' search.

  • Module 3 - Semantic search: You will learn semantic search with Amazon OpenSearch Service and Amazon SageMaker. You will use a machine learning technique called Bidirectional Encoder Representations from Transformers, popularly known as BERT. BERT is a pre-trained natural language processing (NLP) model that represents text as numbers, or in other words, vectors. You will learn to use these vectors with the k-NN feature in Amazon OpenSearch Service.

  • Module 4 - Fullstack semantic search: You will bring together all the concepts learnt earlier in a user interface that shows the advantages of combining semantic search with text search. You will use Amazon OpenSearch Service, Amazon SageMaker, AWS Lambda, Amazon API Gateway and Amazon S3 for this purpose.

  • Module 5 - Fine-tuning semantic search: Large language models like BERT show better results when they are trained in-domain, which means fine-tuning the general model to fit one's particular business requirements in the domain of its application. You will learn how to fine-tune the model for semantic search with the chosen data set.

  • Module 6 - Neural Search: Implement semantic search with the OpenSearch Neural Search plugin.

  • Module 7 - Retrieval Augmented Generation: Use semantic search results as context, and combine the user input with that context into a prompt for a large language model to generate factual content for knowledge-intensive applications.

  • Module 8 - Conversational Search: Search with history context while leveraging RAG.
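
To make the difference between Modules 2, 3, and 7 concrete, the sketch below builds the request bodies involved as plain Python dictionaries, with no cluster connection. It is illustrative only, not the workshop's lab code: the field names `question` and `question_vector`, the helper names, and the prompt wording are all assumptions. The `match` query, the `knn` query, and the `knn_vector` mapping are standard OpenSearch constructs.

```python
def build_keyword_query(text: str, field: str = "question", size: int = 10) -> dict:
    """Module 2: keyword (BM25-scored) search using a match query."""
    return {"size": size, "query": {"match": {field: text}}}


def build_knn_index_body(dimension: int, field: str = "question_vector") -> dict:
    """Module 3: index settings and mapping that enable k-NN on a vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {field: {"type": "knn_vector", "dimension": dimension}}
        },
    }


def build_knn_query(vector, field: str = "question_vector", k: int = 10) -> dict:
    """Module 3: find the k documents whose vectors are nearest to `vector`."""
    return {"size": k, "query": {"knn": {field: {"vector": list(vector), "k": k}}}}


def build_rag_prompt(question: str, passages) -> str:
    """Module 7: combine retrieved passages and the user input into one prompt."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In a setup like the one the workshop describes, the vector passed to `build_knn_query` would come from the BERT model hosted on Amazon SageMaker, and the passages passed to `build_rag_prompt` would be the text of the top search hits.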

Please refer to the Semantic Search Workshop for lab instructions.

Note

In this workshop, we use the OpenSearch internal database to store the username and password to simplify the lab. In a production environment, however, you should design your security solution according to your requirements. For more information, please refer to Fine-grained access control and Identity and Access Management.

Feedback

If you have any questions or feedback, please reach us by email at [email protected].

License

This library is licensed under the MIT-0 License. See the LICENSE file.

semantic-search-with-amazon-opensearch's People

Contributors

amazon-auto, arunx2, famestad, gaiamogh, jkitaok, joshtow, leejianwei, prasadnu, sec-oops
