hartl3y94 / nucliadb

This project forked from nuclia/nucliadb

NucliaDB is a cloud-native database for unstructured data, indexing vectors, text, paragraphs and relations.

Home Page: https://docs.nuclia.dev/

License: Other

Shell 0.02% Python 60.56% Rust 27.56% Makefile 0.29% PureBasic 11.26% Smarty 0.15% Dockerfile 0.17%

nucliadb's Introduction

Nuclia

Searchable database for unstructured data

Quickstart | Docs | Tutorials | Chat

Check out our blog post to see what we have been working on over the last few months.

NucliaDB is a distributed search engine built from the ground up to offer high-accuracy semantic search on unstructured data. Built by mere mortals for mere mortals, NucliaDB's architecture is kept as simple as possible while remaining scalable and delivering what an NLP database requires.

NucliaDB is written in Rust and Python and built on top of the mighty tantivy library. We designed it to index big datasets and provide multi-tenant support.

Features

  • Store original data, plus extracted and understanding data, on object and blob storage
  • Index fields, paragraphs, and semantic sentences on index storage
  • Cloud extraction and understanding with Nuclia Understanding API™
  • Cloud connection to train ML models with Nuclia Learning API™
  • Role-based container security with Reader, Manager, and Writer roles
  • Resources with multiple fields and metadata
  • Text/HTML/Markdown plain fields support
  • File field support with direct upload and TUS upload
  • Link field support
  • Conversation field support
  • Blocks/Layout field support
  • Eventual consistency transactions based on Nats.io
  • Distributed source of truth with TiKV and Redis support
  • Blob support with S3-compatible API and GCS
  • Replication of index storage
  • Distributed search
  • Cloud-native: Kubernetes only

Upcoming Features

  • Blob support with Azure Blob storage
  • Index relations on index storage

Architecture

[Figure: NucliaDB architecture diagram]

Quickstart

Get a NucliaDB token to connect to Nuclia Understanding API™

Only needed if you want to use Nuclia Understanding API™ and Nuclia Learning API™

Start a minimal NucliaDB

First we need object storage and blob storage:

docker run -d -p 6379:6379 redis
docker run -d -p 9000:9000 minio/minio server /data

TODO

Create a Knowledge box container

curl http://localhost:8080/v1/kb \
  -X POST \
  -H "X-NUCLIADB-ROLES: MANAGER"
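
For scripted setups, the curl call above can be mirrored with Python's standard library. This is a minimal sketch, assuming the same host, port, path, and role header as the curl example; the function name `create_kb_request` is hypothetical, and the request is only built here, not sent.

```python
import urllib.request

# Same base URL as the curl example; adjust to your deployment.
BASE_URL = "http://localhost:8080/v1"


def create_kb_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a Knowledge Box."""
    return urllib.request.Request(
        url=f"{base_url}/kb",
        method="POST",
        # The MANAGER role is taken from the curl example above.
        headers={"X-NUCLIADB-ROLES": "MANAGER"},
    )


req = create_kb_request()
# urllib stores header names capitalized ("X-nucliadb-roles");
# HTTP header names are case-insensitive, so the server sees the same header.
print(req.get_method(), req.full_url)

# To send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

The response format is not described here, so inspect the response body to find the identifier of the new Knowledge Box.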

Upload a file

After starting NucliaDB and creating a Knowledge Box you can upload a file:

curl http://localhost:8080/v1/kb/<your-knowledge-box-id>/upload \
  -X POST \
  -H "X-NUCLIADB-ROLES: WRITER" \
  -T /path/to/file
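
The same upload can be sketched in Python with the standard library. This is an illustration under assumptions: the function name `upload_request` and the placeholder id `my-kb-id` are hypothetical, and the `Content-Type` header is an assumption added so that urllib does not fall back to its form-encoded default; the curl example does not set one.

```python
import urllib.request


def upload_request(
    kb_id: str,
    payload: bytes,
    base_url: str = "http://localhost:8080/v1",
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the raw file bytes."""
    return urllib.request.Request(
        url=f"{base_url}/kb/{kb_id}/upload",
        data=payload,  # raw bytes as the request body, like curl -T
        method="POST",
        headers={
            "X-NUCLIADB-ROLES": "WRITER",  # role taken from the curl example
            "Content-Type": "application/octet-stream",  # assumption, see lead-in
        },
    )


# In-memory bytes stand in for a real file here; for a file on disk use
# pathlib.Path("/path/to/file").read_bytes() instead.
req = upload_request("my-kb-id", b"example file contents")
print(req.get_method(), req.full_url, len(req.data))

# To send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```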

Search a file

After uploading a file to your Knowledge Box you can search it:

curl http://localhost:8080/v1/kb/<your-knowledge-box-id>/search \
  -X GET \
  -H "X-NUCLIADB-ROLES: READER"
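
The curl call above does not show how the search text is passed. The Python sketch below assumes it goes in a `query` URL parameter; that parameter name, the function name `search_request`, and the placeholder id `my-kb-id` are assumptions for illustration, so check the API documentation before relying on them. The request is built but not sent.

```python
import urllib.parse
import urllib.request


def search_request(
    kb_id: str,
    query: str,
    base_url: str = "http://localhost:8080/v1",
) -> urllib.request.Request:
    """Build (but do not send) a GET request against the search endpoint."""
    # "query" is an assumed parameter name; urlencode escapes the text safely.
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url=f"{base_url}/kb/{kb_id}/search?{params}",
        method="GET",
        headers={"X-NUCLIADB-ROLES": "READER"},  # role from the curl example
    )


req = search_request("my-kb-id", "hello world")
print(req.get_method(), req.full_url)

# To send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```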

API Tutorials

💬 Community

🙋 FAQ

How is NucliaDB different from traditional search engines like Elasticsearch or Solr?

The core difference and advantage of NucliaDB is its architecture, built from the ground up for the cloud and for unstructured data. Its vector index, combined with standard keyword and fuzzy search, exposes all the information extracted and learned through the Nuclia Understanding API behind a single API, bringing NLP capabilities to any application with little code and peace of mind.
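
To make the idea of combining a vector index with keyword search concrete, here is a toy sketch of hybrid scoring. It is not NucliaDB's implementation: the weighting scheme, the function names, and the hand-written two-dimensional vectors are all assumptions chosen for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_score(
    query: str,
    query_vec: list[float],
    text: str,
    text_vec: list[float],
    alpha: float = 0.5,
) -> float:
    """Weighted mix of semantic (vector) and lexical (keyword) relevance."""
    semantic = cosine(query_vec, text_vec)
    lexical = keyword_score(query, text)
    return alpha * semantic + (1 - alpha) * lexical


# Hand-written vectors stand in for real sentence embeddings.
score = hybrid_score("red car", [1.0, 0.0], "a red bike", [1.0, 0.0])
print(score)
```

A text can rank well through the vector term even when it shares few words with the query, which is the practical benefit of adding a vector index to keyword search.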

What license does NucliaDB use?

NucliaDB is open source under the GNU Affero General Public License Version 3 (AGPLv3). Fundamentally, this means that you are free to use NucliaDB for your project as long as you don't modify it. If you do modify NucliaDB, you have to make the modifications public.

What is Nuclia's business model?

Our business model relies on our Nuclia Learning API and Nuclia Understanding API. We also offer NucliaDB as a service on our multi-cloud infrastructure: https://nuclia.cloud.

🤝 Contribute and spread the word

We are always super happy to have contributions: code, documentation, issues, feedback, or even saying hello on Discord! Here is how you can get started:

✨ And to thank you for your contributions, claim your swag by emailing us at info at nuclia.com.

Reference

Meta
