kronuz / xapiand

Xapiand: A RESTful Search Engine

Home Page: https://kronuz.io/Xapiand

License: MIT License

Shell 0.04% C++ 86.12% C 10.95% Python 1.79% CMake 0.54% JavaScript 0.32% Dockerfile 0.01% Perl 0.16% Tcl 0.07%
search search-engine indexing c-plus-plus elasticsearch

xapiand's Introduction

Xapiand

A RESTful Search Engine

Xapiand is a modern, highly available, distributed RESTful search and storage engine built for the cloud and with data locality in mind. It takes JSON (or MessagePack) documents and indexes them efficiently for later retrieval.

The official site is at https://kronuz.io/Xapiand

License

MIT

xapiand's People

Contributors

bryaneduardo24, dependabot[bot], josemariavr, kronuz, lodestone, vit1251, yosefmac

xapiand's Issues

Option to store (or not store) field

I just discovered this project and it looks fantastic! Thank you for building it.

After looking through the docs, I didn't see a way to specify whether the value for a field should be stored or not. It looks like all values are always stored. Being able to index fields without storing the value is useful for indexing external data, and I think it would be a good addition to this project.

Update README.md with usage guide

Most people, like me, want to see installation and usage instructions in README.md.

How about adding Docker, Debian, and build-from-source instructions to README.md?

Listing of indices

Is there a way to list all indices that are currently available in the cluster without prior knowledge of their names?
I was unable to find this information in the documentation.

Thank you.

How to search in all fields?

Is it possible to search in all fields when using the SEARCH HTTP method?

To search in a given field, I send a query like this one: {"_query": {"fieldname": "search terms"}}.

To search in all fields, I tried this: {"_query": "search terms"}. But the query returned no documents.
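
For reference, the two request bodies described above can be written out as follows. This is only a sketch: the helper names `field_query` and `all_fields_query` are hypothetical, nothing is sent to a server, and it merely reproduces the JSON shapes quoted in this issue. Whether the second form is meant to match every field is exactly the open question.

```python
import json

def field_query(field, terms):
    """Body for searching a single field (the form that works, per this issue)."""
    return json.dumps({"_query": {field: terms}})

def all_fields_query(terms):
    """Body tried for searching all fields (the form that returned no documents)."""
    return json.dumps({"_query": terms})

print(field_query("fieldname", "search terms"))
print(all_fields_query("search terms"))
```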

Build fails with -DCLUSTERING="OFF"

The issue affects the 0.23 release but should manifest in the current master as well.

I'm building Xapiand on Alpine v3.9 like this:

cmake -DCLUSTERING="OFF" -DCMAKE_INSTALL_PREFIX="/usr" -GNinja "$builddir"
ninja

I get the following error:

-- CMake v3.13.0
-- The C compiler identification is GNU 8.3.0
-- The CXX compiler identification is GNU 8.3.0
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Option DATABASE_WAL - on
-- Option DATA_STORAGE - on
-- Option CHAISCRIPT - on
-- Option UUID_ENCODED - on
-- Option LTO - on
-- Checking whether system has ANSI C header files
-- Looking for 8 include files dlfcn.h, ..., float.h
-- Looking for 8 include files dlfcn.h, ..., float.h - found
-- Performing Test memchrExists
-- Performing Test memchrExists - Success
-- Performing Test freeExists
-- Performing Test freeExists - Success
-- ANSI C header files - found
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for DIR in sys/stat.h;sys/types.h;dirent.h
-- Looking for DIR in sys/stat.h;sys/types.h;dirent.h - found
-- Found Git: /usr/bin/git (found version "2.20.1")
-- Performing Test HAVE_FLAG_STDLIB_LIBCPP
-- Performing Test HAVE_FLAG_STDLIB_LIBCPP - Failed
-- Found ZLIB: /lib/libz.so (found version "1.2.11")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Found UUID: /usr/lib/libuuid.so
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.6.0")
-- Found the following ICU libraries:
--   uc (required)
-- Found ICU: /usr/include (found suitable version "62.1", minimum required is "54.1")
-- Date Revision: 20190603101112
-- Performing Test HAVE_CXX_FLAG_STD_CXX17
-- Performing Test HAVE_CXX_FLAG_STD_CXX17 - Success
-- Performing Test HAVE_CXX_FLAG_FNO_COMMON
-- Performing Test HAVE_CXX_FLAG_FNO_COMMON - Success
-- Performing Test HAVE_CXX_FLAG_FLTO
-- Performing Test HAVE_CXX_FLAG_FLTO - Success
-- Performing Test HAVE_CXX_FLAG_FDIAGNOSTICS_COLOR_ALWAYS
-- Performing Test HAVE_CXX_FLAG_FDIAGNOSTICS_COLOR_ALWAYS - Success
-- Compiler: GCC (GNU)
-- Performing Test HAVE_CXX_FLAG_WNO_ATTRIBUTES
-- Performing Test HAVE_CXX_FLAG_WNO_ATTRIBUTES - Success
-- Performing Test HAVE_CXX_FLAG_WNO_SUBOBJECT_LINKAGE
-- Performing Test HAVE_CXX_FLAG_WNO_SUBOBJECT_LINKAGE - Success
-- Looking for include file pthread_np.h
-- Looking for include file pthread_np.h - not found
-- Looking for include file fcntl.h
-- Looking for include file fcntl.h - found
-- Looking for include file limits.h
-- Looking for include file limits.h - found
-- Looking for include file netinet/in.h
-- Looking for include file netinet/in.h - found
-- Looking for include file sys/socket.h
-- Looking for include file sys/socket.h - found
-- Looking for include file sys/time.h
-- Looking for include file sys/time.h - found
-- Looking for include file execinfo.h
-- Looking for include file execinfo.h - not found
-- Looking for include file libunwind.h
-- Looking for include file libunwind.h - not found
-- Looking for include file sys/sysctl.h
-- Looking for include file sys/sysctl.h - not found
-- Looking for include file sys/capability.h
-- Looking for include file sys/capability.h - not found
-- Looking for C++ include sstream
-- Looking for C++ include sstream - found
-- Looking for C++ include strstream
-- Looking for C++ include strstream - found
-- Looking for fallocate
-- Looking for fallocate - found
-- Looking for fsync
-- Looking for fsync - found
-- Looking for getcwd
-- Looking for getcwd - found
-- Looking for gettimeofday
-- Looking for gettimeofday - found
-- Looking for memcpy
-- Looking for memcpy - found
-- Looking for posix_fadvise
-- Looking for posix_fadvise - found
-- Looking for posix_fallocate
-- Looking for posix_fallocate - found
-- Looking for pread
-- Looking for pread - found
-- Looking for pwrite
-- Looking for pwrite - found
-- Looking for socket
-- Looking for socket - found
-- Looking for setresuid
-- Looking for setresuid - found
-- Looking for pthread_getname_np in pthread
-- Looking for pthread_getname_np in pthread - not found
-- Looking for pthread_get_name_np in pthread
-- Looking for pthread_get_name_np in pthread - not found
-- Looking for pthread_setname_np in pthread
-- Looking for pthread_setname_np in pthread - found
-- Looking for pthread_set_name_np in pthread
-- Looking for pthread_set_name_np in pthread - not found
-- Looking for pthread_attr_setaffinity_np in pthread
-- Looking for pthread_attr_setaffinity_np in pthread - not found
-- Performing Test HAVE_DECL___BUILTIN_EXPECT
-- Performing Test HAVE_DECL___BUILTIN_EXPECT - Success
-- Looking for fdatasync
-- Looking for fdatasync - found
-- Looking for include file sys/epoll.h
-- Looking for include file sys/epoll.h - found
-- Looking for include files sys/types.h, sys/event.h
-- Looking for include files sys/types.h, sys/event.h - not found
-- Looking for include file sys/eventfd.h
-- Looking for include file sys/eventfd.h - found
-- Looking for include file sys/inotify.h
-- Looking for include file sys/inotify.h - found
-- Looking for include file sys/select.h
-- Looking for include file sys/select.h - found
-- Looking for include file sys/signalfd.h
-- Looking for include file sys/signalfd.h - found
-- Looking for include file port.h
-- Looking for include file port.h - not found
-- Looking for include file poll.h
-- Looking for include file poll.h - found
-- Looking for inotify_init
-- Looking for inotify_init - found
-- Looking for epoll_ctl
-- Looking for epoll_ctl - found
-- Looking for kqueue
-- Looking for kqueue - not found
-- Looking for select
-- Looking for select - found
-- Looking for eventfd
-- Looking for eventfd - found
-- Looking for signalfd
-- Looking for signalfd - found
-- Looking for port_create
-- Looking for port_create - not found
-- Looking for poll
-- Looking for poll - found
-- Looking for clock_gettime
-- Looking for clock_gettime - found
-- Looking for nanosleep
-- Looking for nanosleep - found
-- Looking for exp10
-- Looking for exp10 - not found
-- Looking for log2
-- Looking for log2 - found
-- Looking for __exp10
-- Looking for __exp10 - not found
-- Looking for strerror_r
-- Looking for strerror_r - found
-- Looking for _byteswap_uint64
-- Looking for _byteswap_uint64 - not found
-- Looking for _byteswap_ulong
-- Looking for _byteswap_ulong - not found
-- Looking for _byteswap_ushort
-- Looking for _byteswap_ushort - not found
-- Looking for _putenv_s
-- Looking for _putenv_s - not found
-- Looking for __popcnt
-- Looking for __popcnt - not found
-- Looking for __popcnt64
-- Looking for __popcnt64 - not found
-- Looking for closefrom
-- Looking for closefrom - not found
-- Looking for fork
-- Looking for fork - found
-- Looking for ftime
-- Looking for ftime - found
-- Looking for ftruncate
-- Looking for ftruncate - found
-- Looking for getdirentries
-- Looking for getdirentries - not found
-- Looking for gethostname
-- Looking for gethostname - found
-- Looking for getrlimit
-- Looking for getrlimit - found
-- Looking for getrusage
-- Looking for getrusage - found
-- Looking for link
-- Looking for link - found
-- Looking for nftw
-- Looking for nftw - found
-- Looking for random
-- Looking for random - found
-- Looking for setenv
-- Looking for setenv - found
-- Looking for sigaction
-- Looking for sigaction - found
-- Looking for sigsetjmp
-- Looking for sigsetjmp - found
-- Looking for sleep
-- Looking for sleep - found
-- Looking for socketpair
-- Looking for socketpair - found
-- Looking for srandom
-- Looking for srandom - found
-- Looking for strerror_r
-- Looking for strerror_r - found
-- Looking for sysconf
-- Looking for sysconf - found
-- Looking for timer_create
-- Looking for timer_create - found
-- Looking for times
-- Looking for times - found
-- Looking for writev
-- Looking for writev - found
-- Performing Test HAVE_SYS_ERRLIST_AND_SYS_NERR
-- Performing Test HAVE_SYS_ERRLIST_AND_SYS_NERR - Failed
-- Performing Test HAVE__SYS_ERRLIST_AND__SYS_NERR
-- Performing Test HAVE__SYS_ERRLIST_AND__SYS_NERR - Failed
-- Performing Test HAVE___BUILTIN_EXP10
-- Performing Test HAVE___BUILTIN_EXP10 - Success
-- Performing Test FTIME_RETURNS_INT
-- Performing Test FTIME_RETURNS_INT - Success
-- Performing Test STRERROR_R_CHAR_P
-- Performing Test STRERROR_R_CHAR_P - Failed
-- Performing Test HAVE_DECL___BUILTIN_ADD_OVERFLOW
-- Performing Test HAVE_DECL___BUILTIN_ADD_OVERFLOW - Success
-- Performing Test HAVE_DECL___BUILTIN_BSWAP16
-- Performing Test HAVE_DECL___BUILTIN_BSWAP16 - Success
-- Performing Test HAVE_DECL___BUILTIN_BSWAP32
-- Performing Test HAVE_DECL___BUILTIN_BSWAP32 - Success
-- Performing Test HAVE_DECL___BUILTIN_BSWAP64
-- Performing Test HAVE_DECL___BUILTIN_BSWAP64 - Success
-- Performing Test HAVE_DECL___BUILTIN_CLZ
-- Performing Test HAVE_DECL___BUILTIN_CLZ - Success
-- Performing Test HAVE_DECL___BUILTIN_CLZL
-- Performing Test HAVE_DECL___BUILTIN_CLZL - Success
-- Performing Test HAVE_DECL___BUILTIN_CLZLL
-- Performing Test HAVE_DECL___BUILTIN_CLZLL - Success
-- Performing Test HAVE_DECL___BUILTIN_CTZ
-- Performing Test HAVE_DECL___BUILTIN_CTZ - Success
-- Performing Test HAVE_DECL___BUILTIN_CTZL
-- Performing Test HAVE_DECL___BUILTIN_CTZL - Success
-- Performing Test HAVE_DECL___BUILTIN_CTZLL
-- Performing Test HAVE_DECL___BUILTIN_CTZLL - Success
-- Performing Test HAVE_DECL___BUILTIN_MUL_OVERFLOW
-- Performing Test HAVE_DECL___BUILTIN_MUL_OVERFLOW - Success
-- Performing Test HAVE_DECL___BUILTIN_POPCOUNT
-- Performing Test HAVE_DECL___BUILTIN_POPCOUNT - Success
-- Performing Test HAVE_DECL___BUILTIN_POPCOUNTL
-- Performing Test HAVE_DECL___BUILTIN_POPCOUNTL - Success
-- Performing Test HAVE_DECL___BUILTIN_POPCOUNTLL
-- Performing Test HAVE_DECL___BUILTIN_POPCOUNTLL - Success
-- Performing Test HAVE_STD_IS_TRIVIALLY_COPYABLE
-- Performing Test HAVE_STD_IS_TRIVIALLY_COPYABLE - Failed
-- Looking for include file memory.h
-- Looking for include file memory.h - found
-- Looking for include file sys/resource.h
-- Looking for include file sys/resource.h - found
-- Looking for include file sys/uio.h
-- Looking for include file sys/uio.h - found
-- Looking for include file sys/utsname.h
-- Looking for include file sys/utsname.h - found
-- Looking for include file uuid.h
-- Looking for include file uuid.h - not found
-- Looking for include file uuid/uuid.h
-- Looking for include file uuid/uuid.h - found
-- Looking for include file zlib.h
-- Looking for include file zlib.h - found
-- Performing Test SOCKLEN_T
-- Performing Test SOCKLEN_T - Success
-- Performing Test SNPRINTF
-- Performing Test SNPRINTF - Success
-- Performing Test IEEE
-- Performing Test IEEE - Success
-- Performing Test PREAD_PROTOTYPE
-- Performing Test PREAD_PROTOTYPE - Success
-- Performing Test PWRITE_PROTOTYPE
-- Performing Test PWRITE_PROTOTYPE - Success
-- Performing Test ATOMIC_AVAILABLE
-- Performing Test ATOMIC_AVAILABLE - Failed
-- Check size of long long
-- Check size of long long - done
-- Check size of uint16_t
-- Check size of uint16_t - done
-- Check size of u_int16_t
-- Check size of u_int16_t - done
-- Check size of __uint16
-- Check size of __uint16 - failed
-- Check size of _Bool
-- Check size of _Bool - done
-- Looking for ccache - not found
-- RelWithDebInfo build
-- Compile flags: -Os -fomit-frame-pointer -std=c++17 -fno-common -fdiagnostics-color=always -Wno-attributes -Wno-subobject-linkage -D_Atomic=volatile -O2 -g -DNDEBUG
-- Configuring done
-- Generating done
-- Build files have been written to: /work/user/xapiand/src/Xapiand-0.23.0/build


[42/400] Building CXX object CMakeFiles/XAPIAND_OBJ.dir/src/database/schemas_lru.cc.o
FAILED: CMakeFiles/XAPIAND_OBJ.dir/src/database/schemas_lru.cc.o
/usr/bin/c++   -I../src -Isrc -Os -fomit-frame-pointer -std=c++17 -fno-common -fdiagnostics-color=always -Wno-attributes -Wno-subobject-linkage -D_Atomic=volatile -O2 -g -DNDEBUG -MD -MT CMakeFiles/XAPIAND_OBJ.dir/src/database/schemas_lru.cc.o -MF CMakeFiles/XAPIAND_OBJ.dir/src/database/schemas_lru.cc.o.d -o CMakeFiles/XAPIAND_OBJ.dir/src/database/schemas_lru.cc.o -c ../src/database/schemas_lru.cc
../src/database/schemas_lru.cc: In member function 'std::tuple<bool, std::shared_ptr<const MsgPack>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > SchemasLRU::_update(const char*, bool, const std::shared_ptr<const MsgPack>&, const MsgPack*, const Endpoints&, int, std::shared_ptr<std::unordered_set<std::__cxx11::basic_string<char> > >)':
../src/database/schemas_lru.cc:478:7: error: 'schema_updater' was not declared in this scope
       schema_updater()->debounce(foreign_uri, version, foreign_uri);
       ^~~~~~~~~~~~~~
../src/database/schemas_lru.cc:478:7: note: suggested alternative: 'schema_ptr'
       schema_updater()->debounce(foreign_uri, version, foreign_uri);
       ^~~~~~~~~~~~~~
       schema_ptr
ninja: build stopped: subcommand failed.

The compilation succeeds if the following patch is applied (though I'm not sure about any other inadvertent effects):

--- a/src/database/schemas_lru.cc
+++ b/src/database/schemas_lru.cc
@@ -475,7 +475,9 @@ SchemasLRU::_update([[maybe_unused]] const char* prefix, bool writable, const st
 					auto version = save_shared(foreign_id, *schema_ptr, Endpoint(foreign_path), context);
 					schema_ptr->set_flags(version);
 					if (version) {
+#ifdef XAPIAND_CLUSTERING
 						schema_updater()->debounce(foreign_uri, version, foreign_uri);
+#endif
 					}
 					L_SCHEMA("{}" + YELLOW_GREEN + "Foreign Schema [{}] was saved to {} id={}: " + DIM_GREY + "{}", prefix, repr(foreign_uri), repr(foreign_path), repr(foreign_id), repr(schema_ptr->to_string()));
 				} catch (const Xapian::DocVersionConflictError&) {

N-Gram index/search example

Do you have an example of how to do N-gram-based indexing and retrieval?

E.g. I'd like to index the phrase "Search and Storage Server", and when I search for "storaeg" it should have a good chance of finding it, because some of its N-grams match the indexed N-grams ("sto", "tor", "ora").

I think Xapian supports some kind of N-grams, but I couldn't find an end-to-end example with Xapiand.
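
To make the idea concrete, here is a minimal sketch of character trigram matching in plain Python. It is not how Xapian or Xapiand implement N-grams; the function names `char_ngrams` and `ngram_similarity` are made up for illustration, and the Jaccard overlap is just one possible scoring choice.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of each lowercased word in text."""
    grams = set()
    for word in text.lower().split():
        for i in range(len(word) - n + 1):
            grams.add(word[i:i + n])
    return grams

def ngram_similarity(query, document, n=3):
    """Jaccard overlap between the n-gram sets of query and document."""
    q, d = char_ngrams(query, n), char_ngrams(document, n)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)

indexed = char_ngrams("Search and Storage Server")
shared = char_ngrams("storaeg") & indexed
print(sorted(shared))  # ['ora', 'sto', 'tor']
```

The misspelled query still shares three trigrams with the indexed phrase, which is why an N-gram index could retrieve it where an exact term match would not.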

Compilation on Debian fails with an error

May I suggest using a CI system, for example Travis CI or CircleCI?
It could improve source code quality and, for example, provide a Debian package.

In file included from /opt/Xapiand/src/msgpack_patcher.cc:23:
In file included from ../src/msgpack_patcher.h:31:
In file included from ../src/exception.h:42:
In file included from ../src/fmt/format.h:60:
../src/fmt/core.h:541:3: error: static_assert failed "don't know how to format the type, include fmt/ostream.h if it provides an operator<< that should be used"
  static_assert(
  ^
../src/fmt/core.h:677:15: note: in instantiation of template class 'fmt::v5::internal::fallback_formatter<std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> >, char, void>' requested here
    Formatter f;
              ^
../src/fmt/core.h:663:22: note: in instantiation of function template specialization 'fmt::v5::internal::value<fmt::v5::basic_format_context<std::back_insert_iterator<fmt::v5::internal::basic_buffer<char> >, char> >::format_custom_arg<std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> >, fmt::v5::internal::fallback_formatter<std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> >, char, void> >' requested here
    custom.format = &format_custom_arg<
                     ^
../src/fmt/core.h:690:58: note: in instantiation of function template specialization 'fmt::v5::internal::value<fmt::v5::basic_format_context<std::back_insert_iterator<fmt::v5::internal::basic_buffer<char> >, char> >::value<std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> > >' requested here
  FMT_CONSTEXPR operator value<Context>() const { return value<Context>(val); }
                                                         ^
../src/fmt/core.h:1249:25: note: in instantiation of function template specialization 'fmt::v5::internal::make_arg<true, fmt::v5::basic_format_context<std::back_insert_iterator<fmt::v5::internal::basic_buffer<char> >, char>, std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> > >' requested here
      : data_{internal::make_arg<IS_PACKED, Context>(args)...} {}
                        ^
../src/fmt/core.h:1269:10: note: in instantiation of member function 'fmt::v5::format_arg_store<fmt::v5::basic_format_context<std::back_insert_iterator<fmt::v5::internal::basic_buffer<char> >, char>, const char *, std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> > >::format_arg_store' requested here
  return {args...};
         ^
../src/exception.h:83:90: note: in instantiation of function template specialization 'fmt::v5::make_format_args<fmt::v5::basic_format_context<std::back_insert_iterator<fmt::v5::internal::basic_buffer<char> >, char>, const char *, std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> > >' requested here
                : BaseException(private_ctor{}, default_exc(), function, filename, line, type, format, make_format_args(std::forward<Args>(args)...)) { }
                                                                                                       ^
../src/exception.h:219:36: note: in instantiation of function template specialization 'BaseException::BaseException<const char *const &, const std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> > &>' requested here
        InvalidArgument(Args&&... args) : BaseException(std::forward<Args>(args)...), std::invalid_argument(message) { }
                                          ^
../src/strict_stox.hh:101:11: note: in instantiation of function template specialization 'InvalidArgument::InvalidArgument<char const (&)[11], char const (&)[22], int, char const (&)[16], char const (&)[29], const char *const &, const std::experimental::fundamentals_v1::basic_string_view<char, std::char_traits<char> > &>' requested here
                                THROW(InvalidArgument, "{}: Cannot convert value: {}", name, str);
                                      ^
3 errors generated.
ninja: build stopped: subcommand failed.

Windows

Any plans to support Windows?

Limiting Returned Fields on Query

I followed the instructions here: https://kronuz.io/Xapiand/docs/exploring/

When I executed the following curl command, I didn't get back only the account_number and balance fields though. It appears to return the whole document.

curl -H 'Content-Type: application/json' --data-binary '{
  "_query": "*",
  "_source": ["account_number", "balance"]
}' -X GET 'localhost:8880/bank/:search?pretty'

How can I limit the fields returned?
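
Until the server-side behaviour is clarified, one possible workaround is to trim each returned document on the client. The sketch below assumes a hit has already been decoded into a Python dict; the helper name `pick_fields` is hypothetical, the sample values are made up, and how hits are wrapped inside the actual response is not covered here.

```python
def pick_fields(document, fields):
    """Return a copy of document containing only the requested top-level fields."""
    return {name: document[name] for name in fields if name in document}

# Made-up sample hit; real values would come from the search response.
hit = {"account_number": 1, "balance": 100, "firstname": "Example"}
print(pick_fields(hit, ["account_number", "balance"]))
```

This still transfers the whole document over the network, so it only tidies up the result and does not replace a server-side field filter.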

Data Encryption

Hi!

I need to index data that must stay secure, so I would like to encrypt all data stored by Xapiand.

Do you have this enhancement in mind, or do you have any ideas on how to do it?

While reviewing the source code, I found that data is stored on the file system in a text file, and the same goes for the indexed data.

So, we could either encrypt the file system and decrypt it whenever we need to read or write something, or encrypt only the data and the indexed data in the text file.

For the first option I am thinking of partition encryption, but I have not put enough thought into it yet.
For the second option, I found saltpack (https://saltpack.org/). It uses an asymmetric encryption system, which is not the best-suited solution. We could use AES, but we would have to think about the key.

So, here I am. I tested Xapiand and it works really well. The stemming (which Xapiand does not do) is not perfect, especially for French content, but, yeah, French is hard!

I hope you can help me!

Segmentation fault on Alpine v3.9

Hi,

When I compile Xapiand 0.23 for Alpine Linux v3.9 (g++-8.3.0, libstdc++-8.3.0-r0, musl-1.1.20-r4), run it, and try to insert some data into a clean index using an empty database directory as a starting point, the Xapiand process crashes with SIGSEGV.

Data insertion is performed with the following request:
curl -v -X POST --data-binary @accounts.ndjson -H 'Content-Type: application/x-ndjson' 'http://localhost:8880/bank/:restore'

This is what I get when I run xapiand under valgrind (with default settings) after compiling it with -DCMAKE_BUILD_TYPE=RelWithDebInfo (the full log is in the attachments):

$ valgrind /usr/bin/xapiand -vvvv --uid 1000  --database /tmp/xapiand --bind-address=127.0.0.1 --port=8880 --solo

... trimmed to a relevant part only ...

==5640== Thread 25 Xapiand:SU00:
==5640== Invalid read of size 8
==5640==    at 0x5C06FF: lock (std_mutex.h:103)
==5640==    by 0x5C06FF: lock_guard (std_mutex.h:162)
==5640==    by 0x5C06FF: enqueue (concurrent_queue.h:54)
==5640==    by 0x5C06FF: Discovery::schema_updated_send(unsigned long, std::basic_string_view<char, std::char_traits<char> >) (discovery.cc:1438)
==5640==    by 0x42EE4D: __invoke_impl<void, void (*&)(long unsigned int, std::__cxx11::basic_string<char>), long unsigned int&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (invoke.h:60)
==5640==    by 0x42EE4D: __invoke<void (*&)(long unsigned int, std::__cxx11::basic_string<char>), long unsigned int&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (invoke.h:95)
==5640==    by 0x42EE4D: __apply_impl<void (*&)(long unsigned int, std::__cxx11::basic_string<char>), std::tuple<long unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >&, 0, 1> (tuple:1678)
==5640==    by 0x42EE4D: apply<void (*&)(long unsigned int, std::__cxx11::basic_string<char>), std::tuple<long unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >&> (tuple:1687)
==5640==    by 0x42EE4D: operator() (debouncer.h:135)
==5640==    by 0x42EE4D: operator() (debouncer.h:126)
==5640==    by 0x42EE4D: operator() (scheduler.h:83)
==5640==    by 0x42EE4D: operator() (threadpool.hh:132)
==5640==    by 0x42EE4D: ThreadPoolThread<std::shared_ptr<ScheduledTask<ThreadedScheduler<DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0>, DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0> >, (ThreadPolicyType)0>::operator()() (threadpool.hh:339)
==5640==    by 0x42F983: Thread<ThreadPoolThread<std::shared_ptr<ScheduledTask<ThreadedScheduler<DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0>, DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0> >, (ThreadPolicyType)0>, (ThreadPolicyType)0>::_runner(void*) (thread.hh:80)
==5640==    by 0x4F1982D: ??? (in /usr/lib/libstdc++.so.6.0.25)
==5640==    by 0x40524D7: ??? (in /lib/ld-musl-x86_64.so.1)
==5640==  Address 0x5d8 is not stack'd, malloc'd or (recently) free'd
==5640== 
==5640== 
==5640== Process terminating with default action of signal 11 (SIGSEGV)
==5640==  Access not within mapped region at address 0x5D8
==5640==    at 0x5C06FF: lock (std_mutex.h:103)
==5640==    by 0x5C06FF: lock_guard (std_mutex.h:162)
==5640==    by 0x5C06FF: enqueue (concurrent_queue.h:54)
==5640==    by 0x5C06FF: Discovery::schema_updated_send(unsigned long, std::basic_string_view<char, std::char_traits<char> >) (discovery.cc:1438)
==5640==    by 0x42EE4D: __invoke_impl<void, void (*&)(long unsigned int, std::__cxx11::basic_string<char>), long unsigned int&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (invoke.h:60)
==5640==    by 0x42EE4D: __invoke<void (*&)(long unsigned int, std::__cxx11::basic_string<char>), long unsigned int&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (invoke.h:95)
==5640==    by 0x42EE4D: __apply_impl<void (*&)(long unsigned int, std::__cxx11::basic_string<char>), std::tuple<long unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >&, 0, 1> (tuple:1678)
==5640==    by 0x42EE4D: apply<void (*&)(long unsigned int, std::__cxx11::basic_string<char>), std::tuple<long unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >&> (tuple:1687)
==5640==    by 0x42EE4D: operator() (debouncer.h:135)
==5640==    by 0x42EE4D: operator() (debouncer.h:126)
==5640==    by 0x42EE4D: operator() (scheduler.h:83)
==5640==    by 0x42EE4D: operator() (threadpool.hh:132)
==5640==    by 0x42EE4D: ThreadPoolThread<std::shared_ptr<ScheduledTask<ThreadedScheduler<DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0>, DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0> >, (ThreadPolicyType)0>::operator()() (threadpool.hh:339)
==5640==    by 0x42F983: Thread<ThreadPoolThread<std::shared_ptr<ScheduledTask<ThreadedScheduler<DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0>, DebouncerTask<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void (*)(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::tuple<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, (ThreadPolicyType)9>, (ThreadPolicyType)0> >, (ThreadPolicyType)0>, (ThreadPolicyType)0>::_runner(void*) (thread.hh:80)
==5640==    by 0x4F1982D: ??? (in /usr/lib/libstdc++.so.6.0.25)
==5640==    by 0x40524D7: ??? (in /lib/ld-musl-x86_64.so.1)
==5640==  If you believe this happened as a result of a stack
==5640==  overflow in your program's main thread (unlikely but
==5640==  possible), you can try to increase the size of the
==5640==  main thread stack using the --main-stacksize= flag.
==5640==  The main thread stack size used in this run was 8388608.
==5640== 
==5640== HEAP SUMMARY:
==5640==     in use at exit: 3,486,814 bytes in 4,921 blocks
==5640==   total heap usage: 48,507 allocs, 43,586 frees, 33,747,760 bytes allocated
==5640== 
==5640== LEAK SUMMARY:
==5640==    definitely lost: 264,232 bytes in 2 blocks
==5640==    indirectly lost: 0 bytes in 0 blocks
==5640==      possibly lost: 3,484 bytes in 4 blocks
==5640==    still reachable: 3,219,098 bytes in 4,915 blocks
==5640==                       of which reachable via heuristic:
==5640==                         newarray           : 48 bytes in 2 blocks
==5640==         suppressed: 0 bytes in 0 blocks
==5640== Rerun with --leak-check=full to see details of leaked memory
==5640== 
==5640== For counts of detected and suppressed errors, rerun with: -v
==5640== Use --track-origins=yes to see where uninitialised values come from
==5640== ERROR SUMMARY: 29081 errors from 174 contexts (suppressed: 0 from 0)
==5640== could not unlink /tmp/vgdb-pipe-from-vgdb-to-5640-by-root-on-???
==5640== could not unlink /tmp/vgdb-pipe-to-vgdb-from-5640-by-root-on-???
==5640== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-5640-by-root-on-???
Segmentation fault

If I compile Xapiand with -DCLUSTERING="OFF", the crash won't occur.

For reference, I'm building xapiand using this APKBUILD:

# Contributor: Jiri Spacek <[email protected]>
# Maintainer: Jiri Spacek <[email protected]>
pkgname=xapiand
_pkgname="$(echo ${pkgname:0:1} | tr '[:lower:]' '[:upper:]')${pkgname:1}"
pkgver=0.23.0
pkgrel=0
pkgdesc="A Modern Highly Available Distributed RESTful Search and Storage Engine built for the Cloud and with Data Locality in mind."
url="https://kronuz.io/Xapiand/"
arch="x86_64"
license="Apache-2.0"
depends="zlib icu"
makedepends="cmake tcl perl ninja util-linux-dev zlib-dev icu-dev"
install="$pkgname.pre-install"
source="$pkgname-$pkgver.tar.gz::https://github.com/Kronuz/Xapiand/archive/v$pkgver.tar.gz
        $pkgname.initd
        $pkgname.confd
"
builddir="$srcdir/$_pkgname-$pkgver"
pkgusers="$pkgname"
pkggroups="$pkgname"
subpackages="$pkgname-doc $pkgname-openrc"
options="!check !strip" # tests are disabled for now
_bootstrapdir="$builddir"/build

build() {
        mkdir -p "$_bootstrapdir"
        cd "$_bootstrapdir"
        cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX="/usr" -GNinja "$builddir"
        ninja -j1
}

package() {
    cd "$_bootstrapdir"

    DESTDIR="$pkgdir" ninja install

    install -Dm755 "$srcdir"/"$pkgname".initd \
            "$pkgdir"/etc/init.d/"$pkgname"
    install -Dm644 "$srcdir"/"$pkgname".confd \
            "$pkgdir"/etc/conf.d/"$pkgname"

}

The compile flags are:
-- Compile flags: -Os -fomit-frame-pointer -std=c++17 -fno-common -fdiagnostics-color=always -Wno-attributes -Wno-subobject-linkage -D_Atomic=volatile -O2 -g -DNDEBUG

Thank you.

How to make AND the default boolean operator (instead of OR)?

When sending a search query like {"_query": {"fieldname": "hello world"}} through the SEARCH HTTP method, the default boolean operator between "hello" and "world" is OR.

Is it possible to make AND the default (as with Xapian::QueryParser::set_default_op(Xapian::Query::OP_AND))?
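
Until a server-side setting is confirmed, one possible client-side workaround is to join the terms with an explicit AND before sending the query. This is only a sketch: it assumes Xapiand's text query parser honors an explicit AND between terms (as Xapian's QueryParser does when boolean operators are enabled), and with_default_and and build_query are hypothetical helper names, not part of Xapiand.

```python
import json

# Operator words from Xapian's QueryParser syntax; assumed to apply to Xapiand too.
BOOLEAN_WORDS = {"AND", "OR", "NOT", "XOR", "NEAR", "ADJ"}


def with_default_and(text):
    """Join bare terms with AND (hypothetical helper, not a Xapiand API).

    Text that already contains an operator word or a quoted phrase is
    returned unchanged, so explicit user intent is never rewritten.
    """
    terms = text.split()
    if '"' in text or any(term in BOOLEAN_WORDS for term in terms):
        return text
    return " AND ".join(terms)


def build_query(field, text):
    """Build the request body shape used in the question above."""
    return {"_query": {field: with_default_and(text)}}


print(json.dumps(build_query("fieldname", "hello world")))
# prints {"_query": {"fieldname": "hello AND world"}}
```

The rewrite is deliberately conservative: anything that looks like it already uses query syntax is passed through untouched.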

Bulk Insert/Update API

Hi!

I'm trying to set up syslog message forwarding with Xapiand as the target. The syslog server implementation (rsyslog) batches messages by default, so I'm wondering whether there is an option for bulk insert/update of data via Xapiand's API. If there is not, are there any blockers that would prevent such functionality in Xapiand? (I'm not familiar with libxapian internals.)
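
Whether Xapiand offers a bulk endpoint is the open question here. Independently of the answer, the client side of batching can be sketched as below: messages are grouped into fixed-size batches and handed to a send callback. Both batched and forward are hypothetical helper names, and the callback stands in for whatever request Xapiand ends up accepting (one request per document, or a single bulk request). Nothing here is Xapiand API.

```python
from itertools import islice


def batched(items, size):
    """Yield lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def forward(messages, send, size=100):
    """Pass messages to `send` one batch at a time; return the batch count.

    `send` is a placeholder for the actual delivery step, which depends on
    what the server supports.
    """
    count = 0
    for batch in batched(messages, size):
        send(batch)
        count += 1
    return count


if __name__ == "__main__":
    delivered = []
    forward(["msg1", "msg2", "msg3"], delivered.append, size=2)
    print(delivered)  # prints [['msg1', 'msg2'], ['msg3']]
```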

Thanks a lot.

Limit query execution time

Thank you for your amazing work on this; Xapiand is awesome to use!
We really want to use Xapiand in our current project, but we are concerned about security. Since I couldn't find any information on this in the docs, I figured I would just ask:

  • What operators are allowed within text queries? From trying it out, they seem to be mainly logical operators, but I couldn't figure out how the queries are parsed or where in the code this happens. We would like to sanitize queries before they reach Xapiand, so we need to know what to look for.
  • Is there a way to limit query execution time, or is there already a limit? I didn't seem to reach one. We are especially concerned about this because we want to prevent DoS attacks on our search server, and limiting execution time would be a good countermeasure.
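
As a stopgap, the first concern can be partly addressed on the client before a query reaches Xapiand. The sketch below rests on assumptions and does not describe how Xapiand parses queries: the operator list is taken from Xapian's QueryParser syntax, the term cap is arbitrary, and sanitize_query is a hypothetical helper name.

```python
import re

# Operator words from Xapian's QueryParser syntax (assumed to apply to Xapiand).
OPERATOR_WORDS = {"AND", "OR", "NOT", "XOR", "NEAR", "ADJ"}
MAX_TERMS = 16  # arbitrary cap, chosen only for illustration


def sanitize_query(text, max_terms=MAX_TERMS):
    """Reduce free text to a bounded list of plain terms (hypothetical helper).

    Punctuation such as quotes, brackets, wildcards and +/- prefixes is
    replaced by spaces, uppercase operator words are dropped, and the
    number of terms is capped.
    """
    cleaned = re.sub(r"[^\w\s]", " ", text)
    terms = [term for term in cleaned.split() if term not in OPERATOR_WORDS]
    return " ".join(terms[:max_terms])


print(sanitize_query('cats AND (dogs OR "big birds") -fish*'))
# prints cats dogs big birds fish
```

For the second concern, a timeout on the HTTP client (for example the timeout argument of urllib.request.urlopen) bounds how long the caller waits, but it does not stop the server from continuing to work on the query, so it is not a substitute for a server-side limit.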

docker image is not starting

Hello, I don't know if there is a problem with my Docker install.
I'm running it on CentOS 7 (3.10.0-862.14.4.el7.x86_64)
docker-ce-cli-18.09.1-3.el7.x86_64
docker-ce-18.09.1-3.el7.x86_64

docker ps shows no running containers.
(firewalld is stopped)

Documentation Status

I've been through the existing documentation and found several places that need to be finished. Is there a plan for the documentation? And is there any way I can help?

Result Highlighting

Can Xapiand highlight matching text within document fragments, using Xapian::MSet::snippet under the hood?

Write performance problem, can not reach 1000qps

I ran Xapiand as a test and found that write throughput can hardly reach 1000 qps, even as a cluster. The writers are written in Go and use msgpack, and the same services sustain about 800k qps when writing to Elasticsearch. With Xapiand, the rate is about 900 qps at the beginning and slows to 500 qps as the document count grows to 500k.

Meanwhile, CPU usage of the Xapiand service stays below 1000% (the machine has 64 cores). The server has 2x Xeon Gold 5218, 256 GB of RAM and 2x Intel SSD DC P4510 2.0TB, and runs Debian 9, so the hardware does not seem to be the bottleneck.

I profiled the Xapiand service with perf (SVG attached to this issue) and found the doc_preparer threads waiting on a spin lock at enqueue and dequeue. I read the code and tried a few things, but my C++ skills are very poor. Is there anywhere this could be optimized?

Thank you for your awesome work; I look forward to your help.

xapian-perf.svg.zip

compile with cmake : linking (error)

[100%] Linking CXX executable bin/xapiand
/tmp/ccCJI9fa.ltrans2.ltrans.o: In function `main':
<artificial>:(.text.startup+0x6bf): undefined reference to `cap_init'
<artificial>:(.text.startup+0x6f5): undefined reference to `cap_set_flag'
<artificial>:(.text.startup+0x715): undefined reference to `cap_set_flag'
<artificial>:(.text.startup+0x725): undefined reference to `cap_set_proc'
<artificial>:(.text.startup+0x771): undefined reference to `cap_clear'
<artificial>:(.text.startup+0x7a5): undefined reference to `cap_set_flag'
<artificial>:(.text.startup+0x7c5): undefined reference to `cap_set_flag'
<artificial>:(.text.startup+0x7d5): undefined reference to `cap_set_proc'
collect2: error: ld returned 1 exit status
CMakeFiles/xapiand.dir/build.make:803: recipe for target 'bin/xapiand' failed
make[2]: *** [bin/xapiand] Error 1
CMakeFiles/Makefile2:243: recipe for target 'CMakeFiles/xapiand.dir/all' failed
make[1]: *** [CMakeFiles/xapiand.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

How to config a cluster without multicast?

Pardon my poor English.
I have about 50 servers and want to try running Xapiand as a cluster, but my servers are not on the same network.
Is there any way to configure a cluster without multicast?
For example, as in Elasticsearch, by discovering the cluster from a list of IP addresses that belong to some of its nodes.
Thank you very much.

License question

Hi, I see that xapiand is released under the MIT license, but it appears to either directly contain or link the Xapian core. What is the relationship there, and how do you release xapiand under the MIT license without inheriting the GPL from the Xapian core? If I use the vanilla xapiand Docker container as an appliance to index information in another application, am I obligated to open-source my application too?

DELETE all documents

I use

DELETE /some/resource/path/

it says

{ "status": 501, "type": "Not Implemented" }

Xapiand Glossary Terms

Xapiand Glossary Missing!

I've been poking around again. This time I was trying to grok what capabilities xapiand has for syncing data between systems. Despite reading what documentation is written around this subject, I still am extremely foggy in understanding.

Story Time

I fired up three instances like this:

# 1st
# pwd: xapiand-test
xapiand -vvvv --name=Cthulhu --cluster=fuck --database=db

# 2nd
# pwd: xapiand-test/GibberingMouther
xapiand -vvvv --name=GibberingMouther --cluster=fuck --database=db

# 3rd
# pwd: xapiand-test/EldritchEffulgence
xapiand -vvvv --name=EldritchEffulgence --cluster=fuck --database=db

All that seems to work flawlessly as far as I can tell. The data seems to sync between the 3 nodes. I am confused, though, about scenarios when nodes go offline.

If I have two instances, one local, one in a remote data center, will they replicate/sync?

Finally, can I get a quick description of the following terms and their relations to each other?

  • Clusters
  • Nodes
  • Shards
  • Replication
  • Leaders

...🧠 on 🔥...
