fredhu21 / havenask

This project forked from alibaba/havenask

License: Apache License 2.0

Shell 0.05% C++ 96.59% Python 2.94% C 0.22% LLVM 0.04% Yacc 0.15% Starlark 0.01%

havenask's Introduction

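Project Introduction

Havenask is a search engine developed in-house by Alibaba Group and a large-scale distributed retrieval system widely used inside Alibaba. It supports the search business of the entire Alibaba Group, including Taobao, Tmall, Cainiao, Amap, Ele.me, and the global businesses. It provides high-performance, low-cost, easy-to-use search services, and its flexible customization and development capabilities support rapid algorithm iteration, helping customers and developers build intelligent search services tailored to their own business.

In addition, Alibaba Cloud OpenSearch, an industry AI search product built on Havenask, will continue to offer enterprise developers a fully managed, maintenance-free, one-stop intelligent search service on Alibaba Cloud. Enterprise developers are welcome to try it.

Core Capabilities

Havenask's core capabilities and advantages are as follows:

  • Extreme engineering performance: real-time retrieval over hundreds of billions of records, millions of QPS for queries, millions of TPS for writes, millisecond-level query latency, and second-level data updates.
  • Built on C++: stronger guarantees for performance, memory usage, and stability.
  • SQL query support: queries can be written in SQL syntax, which makes querying more convenient.
  • Rich plugin mechanism: supports a wide range of business plugins and is highly extensible.
  • Graph-based development: enables algorithm iteration within minutes, offers rich customization, and performs well in next-generation intelligent retrieval scenarios.
  • Vector retrieval: can be combined with plugins to implement multimodal search, covering more search scenarios (to be released).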

Getting Started

Before you begin, make sure Docker is installed and the Docker service is running.
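
A minimal sketch of checking this prerequisite from Python. It assumes the `docker` CLI is on `PATH` and that `docker info` exits with a non-zero status when the daemon cannot be reached. The helper name `docker_ready` and its injectable parameters are hypothetical and not part of havenask.

```python
import shutil
import subprocess


def docker_ready(which=shutil.which, run=subprocess.run):
    """Return True if the docker CLI exists and `docker info` succeeds.

    `which` and `run` are injectable so the check can be exercised
    without a real Docker installation.
    """
    if which("docker") is None:
        return False  # CLI not installed, or not on PATH
    try:
        result = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # could not launch the CLI, or the daemon hung
    return result.returncode == 0
```

Calling `docker_ready()` with no arguments performs the real check. If it returns `False`, install or start Docker before continuing.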

Start the container

Clone the repository and create the container. DOCKER_NAME is the container name you choose.

docker pull havenask/ha3_runtime:0.1
cd ~
git clone git@github.com:alibaba/havenask.git
cd ~/havenask/docker

## On Linux, run:
./create_container.sh <DOCKER_NAME> havenask/ha3_runtime:0.1
## On Mac, run:
./create_container_mac.sh <DOCKER_NAME> havenask/ha3_runtime:0.1
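
The choice between the two scripts can be sketched in Python as follows. This is an illustration only: the function name `create_container_command` is hypothetical, and the only facts taken from this README are the two script names and the image tag.

```python
import platform


def create_container_command(docker_name, image="havenask/ha3_runtime:0.1", system=None):
    """Build the argv for the container-creation script that matches the host OS."""
    system = system or platform.system()
    if system == "Linux":
        script = "./create_container.sh"
    elif system == "Darwin":  # platform.system() reports macOS as "Darwin"
        script = "./create_container_mac.sh"
    else:
        # This README lists scripts for Linux and Mac only.
        raise ValueError("no create_container script listed for %s" % system)
    return [script, docker_name, image]
```

The returned list can be passed to `subprocess.run(cmd, cwd=..., check=True)` with `cwd` set to the `havenask/docker` directory of your clone.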

Log in to the container

cd ~/havenask/docker/<DOCKER_NAME>
./sshme

Test index building

Build the full index. USER is the username you were using on the host before logging in to the container.

cd /home/<USER>/havenask/example/scripts
python build_demo_data.py /ha3_install

Test engine queries

Start the havenask engine

python start_demo_searcher.py /ha3_install

The engine's default query port is 45800. Use the script below to run test queries. Here are some sample queries.

python curl_http.py 45800 "query=select count(*) from in0"

python curl_http.py 45800 "query=select id,hits from in0 where MATCHINDEX('title', '搜索词典')"

python curl_http.py 45800 "query=select title, subject from in0_summary_ where id=1 or id=2"
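
To run the same sample queries from a script, the command lines above can be built as argument lists, which avoids shell quoting problems with the nested quotes in the SQL. This is a sketch under assumptions: it is meant to be run from the directory that contains `curl_http.py`, it relies only on the command-line form shown above, and the names `query_command` and `run_demo_queries` are hypothetical.

```python
import subprocess

DEFAULT_PORT = 45800  # default query port stated in this README

# The three sample statements from this README, unchanged.
DEMO_QUERIES = [
    "select count(*) from in0",
    "select id,hits from in0 where MATCHINDEX('title', '搜索词典')",
    "select title, subject from in0_summary_ where id=1 or id=2",
]


def query_command(sql, port=DEFAULT_PORT, python_exe="python"):
    """Return the argv equivalent to: python curl_http.py <port> "query=<sql>"."""
    return [python_exe, "curl_http.py", str(port), "query=" + sql]


def run_demo_queries(run=subprocess.run):
    """Run each sample query in turn; `run` is injectable for dry runs."""
    for sql in DEMO_QUERIES:
        # check=True raises CalledProcessError if the script exits non-zero.
        run(query_command(sql), check=True)
```

Passing a recording function as `run` gives a dry run that only collects the commands without contacting the engine.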

Documentation

havenask's People

Contributors

dyuyang, xuxijie, alibaba-oss, wenyiduan, ruijieguo
