
Ensembl's Automated QC Framework

License: Apache License 2.0


EnsEMBL HealthCheck
===================


REQUIREMENTS
============

1. Java 6 JDK (v1.6 or later - see http://java.com/en/download/); this is already
   installed on the Farm in /software/jdk1.6.0_14.  Make sure that your
   JAVA_HOME environment variable is pointing to the correct directory and that the
   *correct* Java executables are in your path; put something like the following in
   your .cshrc:

     setenv JAVA_HOME /software/jdk1.6.0_14
     setenv PATH ${JAVA_HOME}/bin:${PATH} 

   Note that if you get errors indicating that the java executable can't be found,
   check that $JAVA_HOME is set correctly by doing 

     which java

   and setting $JAVA_HOME to the directory in which bin/java resides.
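   The settings above use csh syntax. For Bourne-style shells (sh, bash) a
   minimal sketch of the same setup follows. The helper name java_home_from is
   hypothetical, and the sketch assumes the path you pass in is the real
   location of the java executable rather than a symlink into another directory.

```shell
# Hypothetical helper: given the full path to a java executable,
# print the directory that contains bin/java (two levels up).
java_home_from() {
  dirname "$(dirname "$1")"
}

# Example using the Farm path mentioned in this README.
JAVA_HOME=$(java_home_from /software/jdk1.6.0_14/bin/java)
PATH="${JAVA_HOME}/bin:${PATH}"
export JAVA_HOME PATH
echo "$JAVA_HOME"
```

   In practice you would pass the output of "which java" to the helper instead
   of a hard-coded path.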


INSTALLATION
============

1. Obtain the source files by cloning the ensj-healthcheck repository from GitHub:

     git clone https://github.com/Ensembl/ensj-healthcheck.git

   Use the -b option of git clone to check out a specific tag if required.

2. cd ensj-healthcheck

3. Edit database.defaults.properties so that its values correspond to the database
   server you want to connect to. Any value in the config file can also be
   overridden on the command line.
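
   As a rough illustration of step 3, the sketch below writes a small properties
   file and reads a value back. The file name, the helper get_prop and all values
   are hypothetical, and the key names are assumed to mirror the long option
   names (host, port, user, password); check them against your own copy of
   database.defaults.properties before relying on them.

```shell
# Write a hypothetical properties file to a temporary location.
# Key names are ASSUMED to match the long command-line option names.
conf_file="${TMPDIR:-/tmp}/my-healthcheck.properties"
cat > "$conf_file" <<'EOF'
host=my_host
port=3306
user=my_user
password=my_password
EOF

# Hypothetical helper: print the value stored for key $1 in file $2.
get_prop() {
  sed -n "s/^$1=//p" "$2"
}

get_prop host "$conf_file"
```

   A file like this could then be passed to the test runner with the --conf
   option described under RUNNING.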


RUNNING
=======

A number of shell scripts (with a .sh extension) are provided to aid in running
healthchecks. These are summarised below; the main one you will use is called
run-configurable-testrunner.sh; note that this script actually passes all of its
command-line options through to the ConfigurableTestRunner class.

 Usage: ./run-configurable-testrunner.sh -d my_db -h my_host -g healthcheck_group
 
 Options:
        [--conf -c value...]               : Name of one or many configuration files. Parameters in configuration files override each other. If a parameter is provided in more than one file, the first occurrence is used.

        [--databaseURL value]              : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase

        [--dbtype value]                   : If set, this will be used as the type for all databases.

        [--driver value]                   : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase

        [--driver1 value]                  : Driver for server 1 (support for multiple staging servers in Ensembl)

        [--driver2 value]                  : Driver for server 2 (support for multiple staging servers in Ensembl)

        [--driver3 value]                  : Driver for server 3 (support for multiple staging servers in Ensembl)

        [--endSession value]               : Flag to run an empty testrunner. Used to mark the end of a parallel run.

        [--exclude_groups -G value...]     : Specify which groups of tests should not be run. Fully qualified class names can be used as well as their short names.

        [--exclude_tests -T value...]      : Specify which tests should not be run. Fully qualified class names can be used as well as their short names.

        [--file.separator value]           : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase

        [--funcgen_schema.file value]      : Parameter used only in org.ensembl.healthcheck.testcase.funcgen.CompareFuncgenSchema

        [--help -h]                        : display help

        [--host -h value]                  : The host for the database server you wish to connect to.

        [--host1 value]                    : Host for server 1 (support for multiple staging servers in Ensembl)

        [--host2 value]                    : Host for server 2 (support for multiple staging servers in Ensembl)

        [--host3 value]                    : Host for server 3 (support for multiple staging servers in Ensembl)

        [--ignore.previous.checks value]   : Parameter used only in org.ensembl.healthcheck.testcase.generic.ComparePreviousVersionExonCoords, org.ensembl.healthcheck.testcase.generic.ComparePreviousVersionBase and org.ensembl.healthcheck.testcase.generic.GeneStatus

        [--include_groups -g value...]     : Specify which groups of tests should be run. Fully qualified class names can be used as well as their short names.

        [--include_tests -t value...]      : Specify which tests should be run. Fully qualified class names can be used as well as their short names.

        [--master.funcgen_schema value]    : Parameter used in org.ensembl.healthcheck.testcase.funcgen.CompareFuncgenSchema

        [--master.schema value]            : Parameter used only in org.ensembl.healthcheck.testcase.generic.CompareSchema and org.ensembl.healthcheck.testcase.funcgen.CompareFuncgenSchema

        [--master.variation_schema value]  : Parameter used only in master.variation_schema

        [--output -o value]                : Specify the level of output that will be used. The allowed options are "All", "None", "Problem", "Current", "Warning" and "Info".

        [--output.database value]          : The name of the database where the results of the healthchecks are written to, if the database reporter is used.

        [--output.driver value]            : The driver for the database where the results of the healthchecks are written to, if the database reporter is used.

        [--output.host value]              : The host of the database server where the results of the healthchecks are written to, if the database reporter is used.

        [--output.password value]          : The password for the database where the results of the healthchecks are written to, if the database reporter is used.

        [--output.port value]              : The port of the database where the results of the healthchecks are written to, if the database reporter is used.

        [--output.release value]           : Gets written into the session table for describing the test session, if the database reporter is used.

        [--output.schemafile value]        : If output.database does not exist, it will be created automatically. This file should have the SQL commands to create the schema. Please remember that hashes (#) are not allowed to start comments in SQL. Use two dashes "--" at the beginning of a line instead. If the configurable testrunner can't find this file from the current working directory, it will search for it in the classpath.

        [--output.user value]              : The user name for the database where the results of the healthchecks are written to, if the database reporter is used.

        [--password value]                 : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase

        [--password1 value]                : Password for server 1 (support for multiple staging servers in Ensembl)

        [--password2 value]                : Password for server 2 (support for multiple staging servers in Ensembl)

        [--password3 value]                : Password for server 3 (support for multiple staging servers in Ensembl)

        [--perl value]                     : Parameter used only in org.ensembl.healthcheck.testcase.AbstractPerlBasedTestCase

        [--port -P value]                  : The port for the database server you wish to connect to.

        [--port1 value]                    : Port for server 1 (support for multiple staging servers in Ensembl)

        [--port2 value]                    : Port for server 2 (support for multiple staging servers in Ensembl)

        [--port3 value]                    : Port for server 3 (support for multiple staging servers in Ensembl)

        [--production.database value]      : The name of the Ensembl production database to use to retrieve division information. Assumed to be on the same server as the output databases.

        [--compara_master.database value]  : The name of the Ensembl Compara master database to use to control the content of the tested Compara database. Assumed to be on one of the configured servers.

        [--repair value]                   : Allow the tests to try to repair the database (if they can)

        [--reporterType -R value]          : Specify the reporter type that will be used. The allowed options are "Database" and "Text".

        [--schema.file value]              : Parameter used only in org.ensembl.healthcheck.testcase.generic.CompareSchema

        [--secondary.database value]       : Some tests require a second database containing the previous release. This configures the database name for the second database server.

        [--secondary.driver value]         : Some tests require a second database containing the previous release. This configures the driver for the second database server.

        [--secondary.host value]           : Some tests require a second database containing the previous release. This configures the hostname of the second database server.

        [--secondary.password value]       : Some tests require a second database containing the previous release. This configures the password for the second database server.

        [--secondary.port value]           : Some tests require a second database containing the previous release. This configures the port of the second database server.

        [--secondary.user value]           : Some tests require a second database containing the previous release. This configures the user name for the second database server.

        [--sessionID value]                : The session to add these results for. Used in parallel runs.

        [--species value]                  : If set, this will be used as the species for all databases, overriding anything the name or meta table of the database may indicate.

        [--testRegistryType -r value]      : Specify the type of test registry that will be used. The allowed options are "Discoverybased" and "ConfigurationBased"

        [--test_databases -d value...]     : Name of databases that should be tested (e.g.: ensembl_compara_bacteria_5_58). If there is more than one database, separate with spaces. Any configured tests will be run on these databases. Does not support same format as output.databases!

        [--test_divisions -D value...]     : Names of divisions to which the databases to test should belong, e.g. EPl or EnsemblPlants. This option requires the production database to be set up.

        [--user value]                     : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase

        [--user.dir value]                 : Parameter used in org.ensembl.healthcheck.testcase.EnsTestCase

        [--user1 value]                    : User for server 1 (support for multiple staging servers in Ensembl)

        [--user2 value]                    : User for server 2 (support for multiple staging servers in Ensembl)

        [--user3 value]                    : User for server 3 (support for multiple staging servers in Ensembl)

        [--variation_schema.file value]    : Parameter used only in org.ensembl.healthcheck.testcase.variation.CompareVariationSchema
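
The --output.schemafile description says that a comment line must start with
"--" rather than "#". A minimal sketch of converting such lines with sed is shown
below. The helper name fix_sql_comments and the sample SQL are hypothetical, and
the sketch assumes each comment occupies a whole line.

```shell
# Hypothetical helper: rewrite lines that start with '#' (optionally
# indented) so that they start with '--' instead. Other lines pass through.
fix_sql_comments() {
  sed 's/^\([[:space:]]*\)#/\1--/'
}

# Illustrative input only; this is not the real healthcheck schema.
printf '%s\n' '# session table' 'CREATE TABLE session (id INT);' | fix_sql_comments
```

A '#' that appears later in a line (for example inside a string literal) is left
untouched, because the pattern is anchored to the start of the line.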

Here is an example command line:

      ./run-configurable-testrunner.sh -d homo_sapiens_core_80_38 --dbtype core --species homo_sapiens --output problem -g PostGenebuild
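
Because --species applies to every database in a run, databases from different
species that need an explicit --species value have to be checked in separate
invocations. The sketch below only prints the commands so they can be reviewed
first. The helper name healthcheck_cmd and the second database name are
hypothetical.

```shell
# Hypothetical helper: print (do not run) the testrunner command for one
# core database ($1) of a given species ($2), using the PostGenebuild group.
healthcheck_cmd() {
  printf './run-configurable-testrunner.sh -d %s --dbtype core --species %s --output problem -g PostGenebuild\n' "$1" "$2"
}

# The first database name is from the example above; the second is made up.
healthcheck_cmd homo_sapiens_core_80_38 homo_sapiens
healthcheck_cmd mus_musculus_core_80_38 mus_musculus
```

Once the printed commands look right, they can be run directly from the
ensj-healthcheck directory.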

Test Groups
-----------

It is possible to run a single test or a group of tests.
Groups of tests are defined in src/org/ensembl/healthcheck/testgroup
and represent sets of healthchecks that are usually run together.
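
To see which group names are available, the directory can be listed. The sketch
below assumes that each group is defined in its own <Name>.java file, which this
README does not state explicitly. The helper name list_test_groups is
hypothetical.

```shell
# Hypothetical helper: print the short names of the test groups found in
# directory $1, ASSUMING one <Name>.java file per group.
list_test_groups() {
  for f in "$1"/*.java; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    basename "$f" .java
  done
}

# Run from the top of the ensj-healthcheck checkout.
list_test_groups src/org/ensembl/healthcheck/testgroup
```

A name printed this way can be passed to -g (--include_groups). A single test is
selected with -t (--include_tests) instead.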

Other Utilities
---------------

Run each of these with the -h option to show usage.
   
  database-name-matcher.sh
  Shows which database names match a particular regular expression.
  
  compile-healthcheck.sh
  Only used if you've made changes to the source, e.g. when writing your own
  tests.
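
  To get a feel for how a regular expression selects database names before
  using database-name-matcher.sh, a pattern can be tried against a list of names
  with grep. This is only a local approximation: it is not how the script itself
  works, and the script's regular expression dialect may differ from grep -E.
  The helper name match_db_names and the variation database name are hypothetical.

```shell
# Hypothetical helper: read database names on stdin, one per line, and print
# those that match the extended regular expression given as $1.
match_db_names() {
  grep -E -- "$1" || true   # grep exits non-zero on no match; treat that as "no output"
}

# The first and last names appear in this README; the middle one is made up.
printf '%s\n' homo_sapiens_core_80_38 homo_sapiens_variation_80_38 ensembl_compara_bacteria_5_58 \
  | match_db_names '^homo_sapiens_.*_80_38$'
```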

WRITING YOUR OWN TESTS
======================

If you want to write your own healthchecks, rather than running the pre-defined
ones, see the file README-writing-tests.txt.
