
mlonaws / aws-neptune-sagemaker-knowledge-graph-bert


This project forked from aws-samples/aws-neptune-sagemaker-knowledge-graph-bert


License: MIT No Attribution

Python 54.83% Jupyter Notebook 45.17%

aws-neptune-sagemaker-knowledge-graph-bert's Introduction

Fine-tune a TensorFlow 2 BERT model for named entity recognition and build a knowledge graph of extracted entities

Background and motivation

Knowledge graphs are a powerful way to represent items and connect them through their properties: for example, connecting products according to their characteristics, or connecting online news articles according to the subjects they cover. This makes it possible to search, compare, or recommend items by their properties effectively.
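To make the idea concrete, here is a minimal in-memory sketch of items connected through shared properties. It is an illustration only, not the repository's implementation: the `KnowledgeGraph` class, the item names, and the property values are all hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph: items link to (property, value) nodes."""

    def __init__(self):
        self.item_props = defaultdict(set)  # item -> {(key, value)}
        self.prop_items = defaultdict(set)  # (key, value) -> {item}

    def add(self, item, key, value):
        """Connect an item to a property node."""
        self.item_props[item].add((key, value))
        self.prop_items[(key, value)].add(item)

    def search(self, key, value):
        """Return all items that have the given property, sorted by name."""
        return sorted(self.prop_items.get((key, value), set()))

    def similar(self, item):
        """Rank other items by the number of properties shared with `item`."""
        counts = defaultdict(int)
        for prop in self.item_props.get(item, set()):
            for other in self.prop_items[prop]:
                if other != item:
                    counts[other] += 1
        # Most shared properties first, ties broken alphabetically.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical example data.
kg = KnowledgeGraph()
kg.add("laptop-a", "brand", "Acme")
kg.add("laptop-a", "screen", "13 inch")
kg.add("laptop-b", "brand", "Acme")
kg.add("laptop-b", "screen", "15 inch")
kg.add("laptop-c", "brand", "Other")
kg.add("laptop-c", "screen", "13 inch")

print(kg.search("brand", "Acme"))
print(kg.similar("laptop-a"))
```

A graph database plays the same role at scale: search becomes a lookup on a property node, and comparison or recommendation becomes a traversal over shared neighbours.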

As a consequence, our customers often ask us to build internal search and comparison engines running on graph databases. However, data is often a limiting factor: while text descriptions are usually available for all items, properties are regularly missing or inconsistent. These properties can, however, be extracted in a standardized manner through named entity recognition.
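As a sketch of how token-level predictions can become item properties, the function below groups tagged tokens into entities. It assumes the model emits BIO tags (`B-X`, `I-X`, `O`), which this README does not specify; the tag names and the example sentence are hypothetical.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity being built, if any, and reset the state.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and drop this token.
            flush()
    flush()
    return entities


# Hypothetical product description and predicted tags.
tokens = ["Acme", "13", "inch", "laptop", "with", "Intel", "CPU"]
tags = ["B-BRAND", "B-SCREEN", "I-SCREEN", "O", "O", "B-CPU", "O"]
print(bio_to_entities(tokens, tags))
```

Each resulting pair can then be attached to the item as a property, which is what makes otherwise free-text descriptions comparable.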

In this repository we show how to build a custom named entity recognition model by fine-tuning BERT on TensorFlow 2 with Keras using the SageMaker Python SDK, and then how to build and populate a knowledge graph of items from the extracted properties using Amazon Neptune.

Amazon SageMaker is a fully managed service that provides developers and data scientists with the ability to build, train, and deploy machine learning (ML) models quickly. Amazon SageMaker removes the heavy lifting from each step of the machine learning process to make it easier to develop high-quality models. The SageMaker Python SDK provides open source APIs and containers that make it easy to train and deploy models in Amazon SageMaker with several different machine learning and deep learning frameworks. We use an Amazon SageMaker Notebook Instance for running the code. For information on how to use Amazon SageMaker Notebook Instances, see the AWS documentation.

Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Neptune is a purpose-built, high-performance graph database engine. This engine is optimized for storing billions of relationships and querying the graph with millisecond latency. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.

Quick start

Named Entity Recognition with TensorFlow 2 BERT on SageMaker

To train and deploy the custom named entity recognition model on SageMaker, follow the instructions in notebooks/ner-bert-keras-sagemaker.ipynb
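The notebook is the authoritative walkthrough. For orientation only, the sketch below shows roughly what configuring a TensorFlow training job looks like with version 2 of the SageMaker Python SDK. Everything specific here is an assumption rather than a value taken from the notebook: the `train.py` entry point, the `code` source directory, the instance type, the framework and Python versions, and the `epochs` hyperparameter.

```python
def estimator_kwargs(role_arn, framework_version="2.3", epochs=3):
    """Build keyword arguments for sagemaker.tensorflow.TensorFlow (SDK v2 names)."""
    return {
        "entry_point": "train.py",          # hypothetical training script
        "source_dir": "code",               # hypothetical directory holding it
        "role": role_arn,                   # IAM role SageMaker assumes
        "instance_count": 1,
        "instance_type": "ml.p3.2xlarge",   # assumption: a single-GPU instance
        "framework_version": framework_version,  # assumption: check supported versions
        "py_version": "py37",               # assumption: must match the framework version
        "hyperparameters": {"epochs": epochs},   # hypothetical hyperparameter
    }


def make_estimator(role_arn, **overrides):
    """Create the estimator; requires the `sagemaker` package and AWS credentials."""
    # Imported lazily so the configuration helper above works without the SDK.
    from sagemaker.tensorflow import TensorFlow

    kwargs = estimator_kwargs(role_arn)
    kwargs.update(overrides)
    return TensorFlow(**kwargs)


# Placeholder role ARN for illustration only.
config = estimator_kwargs("arn:aws:iam::123456789012:role/ExampleRole")
print(config["instance_type"], config["hyperparameters"])
```

With an estimator in hand, calling its `fit` method with S3 input locations starts the training job, and `deploy` creates an endpoint; the notebook shows the actual values used in this project.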

Neptune cluster

Follow the steps and CloudFormation templates available in the Amazon Neptune documentation to start a cluster, or use one of the following links to get the CloudFormation stack:

Once these steps are complete, follow the instructions in notebooks/knowledge-graph-neptune.ipynb

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Authors

Contributors: alkhanai, amazon-auto, othmane796
