tarekziade / nucliadb

This project forked from nuclia/nucliadb

NucliaDB, The vector database optimized for documents and video search

Home Page: https://nucliadb.com/

License: Other

Shell 0.01% Python 71.04% Rust 21.59% Makefile 0.34% PureBasic 6.86% HTML 0.08% Smarty 0.04% Dockerfile 0.05%

nucliadb's Introduction

Nuclia

The AI Search Database.

NucliaDB is a robust database for storing and searching unstructured data.

It is an out-of-the-box hybrid search database that combines vector, full-text and graph indexes.

NucliaDB is written in Rust and Python. We designed it to index large datasets and provide multi-tenant support.

When you use NucliaDB with Nuclia cloud, you get the power of an NLP database without the hassle of data extraction, enrichment and inference. We do all the hard work for you.

Features

  • Store text, files, vectors, labels and annotations
  • Perform text searches: given a word or set of words, return the resources in our database that contain them.
  • Perform semantic searches with vectors. For example, given a set of vectors, return the closest matches in our database. With NLP, this allows us to look for similar sentences without being constrained by exact keywords.
  • Export your data in a format compatible with most NLP pipelines (HuggingFace datasets, PyTorch, etc.)
  • Store original data, extracted data, and data pulled from the Understanding API
  • Index fields, paragraphs, and semantic sentences on index storage
  • Cloud data and insight extraction with the Nuclia Understanding API™
  • Cloud connection to train ML models with Nuclia Learning API™
  • Role-based security system with upstream proxy authentication validation
  • Resources with multiple fields and metadata
  • Text/HTML/Markdown plain fields support
  • Field types: text, file, link, conversation, layout
  • Storage layer support: TiKV, Redis and PostgreSQL
  • Blob support with S3-compatible API, GCS and PG drivers
  • Replication of index storage
  • Distributed search
  • Cloud-native
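
To make the two search modes in the feature list concrete, here is a minimal sketch in plain Python. It is not NucliaDB's implementation and does not use its API: the toy corpus, the 3-dimensional vectors, and the function names (`build_inverted_index`, `text_search`, `semantic_search`) are all assumptions made up for illustration. Text search is modeled as an inverted-index lookup that requires every query word, and semantic search as a brute-force ranking by cosine similarity.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus: ids, texts and 3-dimensional vectors are invented.
RESOURCES = {
    "r1": {"text": "the cat sat on the mat", "vector": [0.9, 0.1, 0.0]},
    "r2": {"text": "a kitten rests on a rug", "vector": [0.8, 0.2, 0.1]},
    "r3": {"text": "stock markets fell sharply", "vector": [0.0, 0.1, 0.9]},
}


def build_inverted_index(resources):
    """Map each lowercase word to the set of resource ids containing it."""
    index = defaultdict(set)
    for rid, res in resources.items():
        for word in res["text"].lower().split():
            index[word].add(rid)
    return index


def text_search(index, query):
    """Return the ids of resources that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(resources, query_vector, top_k=2):
    """Rank all resources by cosine similarity to the query vector, best first."""
    scored = [
        (cosine_similarity(query_vector, res["vector"]), rid)
        for rid, res in resources.items()
    ]
    scored.sort(reverse=True)
    return [rid for _, rid in scored[:top_k]]


index = build_inverted_index(RESOURCES)
keyword_hits = text_search(index, "cat mat")
semantic_hits = semantic_search(RESOURCES, [1.0, 0.1, 0.0], top_k=2)
print(sorted(keyword_hits))
print(semantic_hits)
```

With these made-up vectors, the keyword query matches only the resource that literally contains both words, while the vector query also surfaces the "kitten" sentence, which shares no keyword with it. A real index would replace the brute-force scan with an approximate nearest-neighbor structure.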

Architecture


Quickstart

Trying NucliaDB is super easy! You can extend your knowledge with the following readings:

💬 Community

🙋 FAQ

How is NucliaDB different from traditional search engines like Elasticsearch or Solr?

The core difference and advantage of NucliaDB is its architecture, built from the ground up for unstructured data. Its vector, keyword, graph and fuzzy search indexes are exposed through an API that lets you use all the information extracted by the Nuclia Understanding API, bringing powerful NLP abilities to any application with little code and peace of mind.

What license does NucliaDB use?

NucliaDB is open source under the GNU Affero General Public License version 3 (AGPLv3). Fundamentally, this means that you are free to use NucliaDB for your project as long as you don't modify it. If you do modify NucliaDB, you have to make the modifications public.

What is Nuclia's business model?

Our business model relies on our normalization API, which is based on the Nuclia Learning API and the Nuclia Understanding API. These two APIs use AI to transform unstructured data into NucliaDB-compatible data. We also offer NucliaDB as a service on our multi-cloud infrastructure: https://nuclia.cloud.

๐Ÿค Contribute and spread the word

We are always happy to have contributions: code, documentation, issues, feedback, or even saying hello on discord! Here is how you can get started:

โœจ And to thank you for your contributions, claim your swag by emailing us at info at nuclia.com.

Reference

Meta

