GithubHelp home page GithubHelp logo

doc22940 / showcase-books-search Goto Github PK

View Code? Open in Web Editor NEW

This project forked from typesense/showcase-books-search

0.0 1.0 0.0 454 KB

A site to instantly search 28M books from OpenLibrary using Typesense Search (an open source alternative to Algolia / ElasticSearch) โšก ๐Ÿ“š ๐Ÿ”

Home Page: https://books-search.typesense.org/

License: Apache License 2.0

Ruby 33.89% HTML 27.30% JavaScript 34.12% SCSS 4.69%

showcase-books-search's Introduction

๐Ÿ“š Instant Books Search, powered by Typesense

This is a demo that showcases some of Typesense's features using a 28 Million database of books from OpenLibrary (Internet Archive).

View it live here: books-search.typesense.org

Tech Stack

This search experience is powered by Typesense which is a blazing-fast, open source typo-tolerant search-engine. It is an open source alternative to Algolia and an easier-to-use alternative to ElasticSearch.

The book dataset is from openlibrary.org. If you're able to contribute book metadata, please do ๐Ÿ™

The app was built using the Typesense Adapter for InstantSearch.js and is hosted on S3, with CloudFront for a CDN.

The search backend is powered by a geo-distributed 3-node Typesense cluster running on Typesense Cloud, with nodes in Oregon, Frankfurt and Mumbai.

The dataset has ~28M records, takes up 6.8GB on disk and 14.3GB in RAM when indexed in Typesense. Takes ~3 hours to index these 28M records.

Repo structure

  • src/ and index.html - contain the frontend UI components, built with Typesense Adapter for InstantSearch.js
  • scripts/indexer - contains the script to index the book data into Typesense.
  • scripts/data - contains a 1K sample subset of the books database. But you can download the full dataset from the link above.

Development

To run this project locally, install the dependencies and run the local server:

yarn
bundle # JSON parsing takes a while to run using JS when indexing, so we're using Ruby just for indexing

yarn run typesenseServer

ln -s .env.development .env

yarn run indexer:extractAuthors # This will output an authors.jsonl file
yarn run indexer:transformDataset # This will output a transformed_dataset.json file
BATCH_SIZE=100000 yarn run indexer:importToTypesense # This will import the JSONL file into Typesense

yarn start

Open http://localhost:3000 to see the app.

Deployment

The app is hosted on S3, with Cloudfront for a CDN.

yarn build
yarn deploy

showcase-books-search's People

Contributors

jasonbosco avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.