Rummager

Rummager is now primarily based on elasticsearch.

Get started

Install elasticsearch 0.20. Rummager doesn't work with 0.90.
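
Before starting Rummager, it can help to confirm which elasticsearch version is running. The sketch below assumes elasticsearch is listening on its default HTTP port (9200) and that the root endpoint returns JSON containing a version "number" field. The function names are hypothetical helpers, not part of Rummager.

```shell
# Hypothetical helpers; not part of Rummager.

# Read JSON on stdin and print the value of the first "number" field.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only for 0.20 or 0.20.x version strings.
es_version_supported() {
  case "$1" in
    0.20|0.20.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (assumes elasticsearch on localhost:9200):
#   version=$(curl -s http://localhost:9200/ | es_version_from_json)
#   es_version_supported "$version" || echo "unsupported elasticsearch: $version" >&2
```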

Run the application with ./startup.sh, which uses shotgun/thin.

To create indices, or to update them to the latest index settings, run:

RUMMAGER_INDEX=all bundle exec rake rummager:migrate_index

If you have indices from a Rummager instance before aliased indices, run:

RUMMAGER_INDEX=all bundle exec rake rummager:migrate_from_unaliased_index

If you don't know which of these you need to run, try running the first one; it will fail safely with an error if you have an unmigrated index.
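
The "try the first one" advice can be wrapped in a small shell function. This is only a sketch: the function name and the RAKE override (used so the function can be exercised without a Rummager checkout) are hypothetical. It deliberately does not run the second task automatically, since migrate_index could fail for other reasons.

```shell
# Hypothetical wrapper; RAKE defaults to the command used in this README.
migrate_index_with_hint() {
  index="${1:-all}"
  if RUMMAGER_INDEX="$index" ${RAKE:-bundle exec rake} rummager:migrate_index; then
    echo "migrated: $index"
  else
    # Point at the second task rather than running it unprompted.
    echo "migrate_index failed for '$index'." >&2
    echo "If this index predates aliased indices, try:" >&2
    echo "  RUMMAGER_INDEX=$index bundle exec rake rummager:migrate_from_unaliased_index" >&2
    return 1
  fi
}
```

Calling migrate_index_with_hint with no argument targets all indices, matching the RUMMAGER_INDEX=all commands above.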

Rummager has an asynchronous mode, disabled in development by default, that posts documents to a queue to be indexed later by a worker. To run this in development, you need to run both of these commands:

ENABLE_QUEUE=1 ./startup.sh
bundle exec rake jobs:work

Indexing GOV.UK content

Since search indexing happens through Panopticon's single registration API, you'll need to have both Panopticon and Rummager running. By default, Panopticon will not try to index search content in development mode, so you'll need to pass an extra environment variable to it.

If you have Bowler installed, you can set these both running with a single command from the development repository:

UPDATE_SEARCH=1 bowl panopticon rummager

The next stage is to register content from the applications you want. For example:

  • Business Support Finder
  • Calendars
  • Licence Finder
  • Publisher
  • Smart Answers
  • Trade Tariff

To re-register content for a single application, go to its directory and run:

bundle exec rake panopticon:register
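
To re-register a handful of applications in one go, the same command can be looped over their directories. The sketch below assumes each application lives in a subdirectory of a common base directory. The function name, the RAKE override and the directory names in the usage comment are hypothetical.

```shell
# Hypothetical helper: run panopticon:register in each named app directory.
# Usage: register_apps BASE_DIR APP [APP...]
register_apps() {
  base_dir="$1"
  shift
  for app in "$@"; do
    if [ -d "$base_dir/$app" ]; then
      # Run in a subshell so the caller's working directory is unchanged.
      (cd "$base_dir/$app" && ${RAKE:-bundle exec rake} panopticon:register) \
        || echo "failed: $app" >&2
    else
      echo "skipping $app: no directory at $base_dir/$app" >&2
    fi
  done
}

# Example (directory names are assumptions):
#   register_apps .. calendars smart-answers
```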

To register content for all the applications, go to the replication directory in the development project and run:

./rebuild-search-local.sh

To rebuild from the Whitehall application, follow the instructions in the app.

Adding a new index

To add a new index to Rummager, you'll first need to add it to the list of index names Rummager knows about in elasticsearch.yml. For instance, you might change it to:

index_names: ["mainstream", "detailed", "government", "my_new_index"]

To create the index, you'll need to run:

RUMMAGER_INDEX=my_new_index bundle exec rake rummager:migrate_index

This task will fail if you've already created an index with this name, as Rummager can't add an alias that is the name of an existing index. In this case, you'll either need to delete your existing index or, if you want to keep its contents, run:

RUMMAGER_INDEX=my_new_index bundle exec rake rummager:migrate_from_unaliased_index
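
A quick pre-flight check can catch a missing entry in elasticsearch.yml before running the rake task. This sketch assumes the index_names list sits on a single line with double-quoted names, as in the example above. The function name is hypothetical, and the path to elasticsearch.yml must be supplied for your checkout.

```shell
# Hypothetical helper: succeed if NAME appears in the index_names list.
# Usage: index_is_configured NAME [CONFIG_FILE]
index_is_configured() {
  name="$1"
  config="${2:-elasticsearch.yml}"
  grep -E '^[[:space:]]*index_names:' "$config" | grep -q "\"$name\""
}

# Example:
#   index_is_configured my_new_index elasticsearch.yml \
#     && RUMMAGER_INDEX=my_new_index bundle exec rake rummager:migrate_index
```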

Health check

As we work on Rummager, we want objective metrics of search performance. That's what the health check is for.

To run it, first download the health check data:

./bin/health_check -d

Then run against your chosen indices:

./bin/health_check government mainstream

By default it will run against the local search instance. You can run against a remote search service using the --json or --html options.

Contributors

alext, alextea, bradwright, chrisroos, craigw, daibach, davidb51, dhwthompson, emilydacosta, floehopper, garethr, h-lame, heathd, jamiecobbett, john-griffin, jordanhatch, jystewart, kushalp, lanagibson, lazyatom, lazyatom-and-floehopper, matthewford, mnowster, partiallyblind, richardjpope, robyoung, tarastockford, threedaymonk, tomafro, tombye
