GeoIP Updater is a PHP tool that helps update the GeoIP Lite databases.

License: Apache License 2.0

PHP 94.43% Shell 5.57%

GeoIP Updater

Description

GeoIP Updater is a PHP program that helps update the GeoIP databases:

  • Retrieve DB files from MaxMind (based on a list of URLs stored as CSV)
  • Archive older DB files
  • Validate DB files
  • Load the new DB files in the GeoIP DB directory

This program works with the 3 types of DB files :

  • Lite (Free version)
  • Legacy (Paid version, old .dat format)
  • GeoIP2 (Paid version, new .mmdb format)

Requirements

Installation

In the directory of your choice :

$ git clone https://github.com/Twenga/geoip-updater.git
$ cd geoip-updater
$ ./install.sh

The install script simply creates working directories such as /tmp/GeoIP and /usr/share/GeoIP_Archives. These paths can be edited in conf/config.php.

Configuration

Folders

GeoIP Updater needs to know 3 paths:

  • Directory to copy the final DB files
  • Directory to archive DB files
  • A temporary directory that will be used for intermediate tasks

These settings can be edited for each type of DB with the following constants (see conf/config.php):

  • XXX_DB_PATH
  • XXX_DB_ARCHIVE_PATH
  • XXX_DB_TMP_PATH

Where XXX can be GEOIP_LITE, GEOIP_LEGACY or GEOIP2.
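
As an illustration, the three constants for the Lite type might look like the excerpt below. The constant names follow the XXX_ pattern described above and the directories are the ones mentioned in this README, but the use of define() and the pairing of each directory with each constant are assumptions; check the conf/config.php shipped with the tool for the real defaults.

```php
<?php
// Hypothetical excerpt of conf/config.php for the Lite type (XXX = GEOIP_LITE).
// The values reuse directories named in this README; the pairing is assumed.
define('GEOIP_LITE_DB_PATH', '/usr/share/GeoIP');                  // final DB files
define('GEOIP_LITE_DB_ARCHIVE_PATH', '/usr/share/GeoIP_Archives'); // archived DB files
define('GEOIP_LITE_DB_TMP_PATH', '/tmp/GeoIP');                    // intermediate tasks
```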

Notes :

  • When Lite and Legacy DB files are used through the native PHP API, the API looks in the folder /usr/share/GeoIP by default.
  • For GeoIP2, there is no default folder. Paths to GeoIP DB files must be specified when using the GeoIP PHP API.

License key

Legacy and GeoIP2 are paid versions of GeoIP. Accessing the available DB files through HTTP requires sending a license key as part of the request.

You can configure the MAXMIND_LICENSE_KEY constant in conf/config.php to specify a valid MaxMind license key.

Validation

For DB file validation, GeoIP Updater uses PHP. There are 2 PHP APIs, depending on the DB type:

These APIs must be available for DB file validation to be enabled.

Validation items (Lite and Legacy)

For validating a set of DB files, GeoIP Updater will use a list of validation items. An item is made of :

  • A GeoIp function name (PHP API)
  • A host/IP
  • An expected result.

Validation items are stored in a CSV file inc/validation_list.csv which you can populate with your own validation items as follows :

"geoip_country_code_by_name","xx.xx.xx.xx","Country code"
"geoip_country_code_by_name","xx.xx.xx.xx","Country code"

NOTE : Use commas as separators and double quotes as enclosure.

For now, the expected result can only be specified as a string. This means geoip_record_by_name() cannot be validated, because it returns an array.
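
The following is a minimal sketch of how such a list could be parsed and checked; it is not the tool's actual code. The function names parseValidationItems and findFailedItems are hypothetical, and the $lookup callable stands in for the GeoIP function named in the first column, so the sketch runs without the GeoIP extension. Rows that do not have exactly three non-empty fields are rejected before any lookup runs.

```php
<?php
/**
 * Parses validation items from CSV text and rejects malformed rows.
 * Each row must hold exactly: function name, host/IP, expected result.
 */
function parseValidationItems(string $csv): array
{
    $items = [];
    foreach (preg_split('/\R/', trim($csv)) as $lineNo => $line) {
        if ($line === '') {
            continue; // ignore blank lines
        }
        $fields = str_getcsv($line, ',', '"', '\\');
        if (count($fields) !== 3 || in_array('', array_map('trim', $fields), true)) {
            throw new InvalidArgumentException('Malformed validation item on line ' . ($lineNo + 1));
        }
        $items[] = ['function' => $fields[0], 'host' => $fields[1], 'expected' => $fields[2]];
    }
    return $items;
}

/**
 * Runs every item through $lookup and returns the items that failed.
 * $lookup is a stand-in for the GeoIP call: (string $function, string $host) => ?string
 */
function findFailedItems(array $items, callable $lookup): array
{
    return array_values(array_filter($items, function (array $item) use ($lookup) {
        return $lookup($item['function'], $item['host']) !== $item['expected'];
    }));
}
```

An empty result from findFailedItems means every item returned its expected string; any returned item identifies the function and host that did not match.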

Validation items (GeoIP2)

Validation items for GeoIP2 are stored in a CSV file inc/validation_list_geoip2.csv and are formatted as follows :

"city->name","xx.xx.xx.xx","City name","GeoIP2-City.mmdb"
"country->name","xx.xx.xx.xx","Country name","GeoIP2-City.mmdb"

NOTES :

  • You only specify the NAME of the DB file, the path to the file is specified by GEOIP2_DB_PATH in conf/config.php
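
The first column reads like a property path on a GeoIP2 record. One plausible way to evaluate it, shown purely as an assumption about how the list is interpreted, is to walk the path one property at a time. The helper name resolvePropertyPath is hypothetical and a plain stdClass object stands in for the real GeoIP2 record.

```php
<?php
/**
 * Walks a property path such as "city->name" on a record object.
 * Returns null when any segment of the path is missing.
 */
function resolvePropertyPath(object $record, string $path)
{
    $current = $record;
    foreach (explode('->', $path) as $property) {
        if (!is_object($current) || !isset($current->$property)) {
            return null;
        }
        $current = $current->$property;
    }
    return $current;
}

// Stand-in record; a real one would come from the GeoIP2 PHP API.
$record = (object) ['city' => (object) ['name' => 'City name']];
$cityName = resolvePropertyPath($record, 'city->name');
```

The resolved value can then be compared with the expected result from the third column.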

DB files URLs

The files inc/db_url_list_XXX.csv contain lists of URLs to MaxMind GeoIP DB files. GeoIP Updater comes with default lists for each type, which you can update as needed.

Default URLs come from :

Usage

Update

$ sudo php geoip-updater.php -v -m update

When a set of DB files is retrieved, GeoIP Updater computes a 'version' hash (sha1 of all DB files contents) and will archive these files in a directory for that version. When a set of DB files is not valid, the version is blacklisted and will never be loaded again.

Rollback

$ sudo php geoip-updater.php -v -m rollback

A rollback attempts to load the previous set of DB files from the archives. If there is no older archived version, the rollback stops. If the loaded archive is not valid, the rollback attempts to load the next older version, and so on.
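
The selection logic described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: pickRollbackVersion is a hypothetical name, versions are assumed to be listed from newest to oldest, and $isValid stands in for the DB file validation.

```php
<?php
/**
 * Picks the version a rollback should load.
 *
 * $archivedVersions: version hashes ordered from newest to oldest.
 * $currentVersion:   hash of the currently loaded set.
 * $isValid:          callable(string $version): bool, stand-in for validation.
 * Returns the first valid version older than the current one, or null.
 */
function pickRollbackVersion(array $archivedVersions, string $currentVersion, callable $isValid): ?string
{
    $position = array_search($currentVersion, $archivedVersions, true);
    // If the current version is not archived, every archive is a candidate.
    $candidates = $position === false
        ? $archivedVersions
        : array_slice($archivedVersions, $position + 1);

    foreach ($candidates as $version) {
        if ($isValid($version)) {
            return $version;
        }
    }
    return null; // no older valid version: the rollback stops
}
```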

Options

Mode : -m [mode]

Specifies GeoIP Updater mode.

Values :

  • update
  • rollback

Type : -t [type] (optional)

Specifies the type of DB files to update.

Values :

  • Lite (default)
  • Legacy
  • GeoIP2

Verbose : -v (optional)

Specifies whether GeoIP Updater should output logs to the console.
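
As a rough sketch of how these three options fit together, the helper below normalises the array that PHP's getopt('vm:t:') would return. The name normaliseOptions is hypothetical, and treating -m as mandatory and the values as case-sensitive are assumptions; the tool's own parsing may differ.

```php
<?php
/**
 * Normalises options shaped like the result of getopt('vm:t:').
 * Returns ['mode' => string, 'type' => string, 'verbose' => bool].
 */
function normaliseOptions(array $rawOptions): array
{
    $modes = ['update', 'rollback'];
    $types = ['Lite', 'Legacy', 'GeoIP2'];

    $mode = $rawOptions['m'] ?? null;
    if (!in_array($mode, $modes, true)) {
        throw new InvalidArgumentException('Option -m must be one of: ' . implode(', ', $modes));
    }

    $type = $rawOptions['t'] ?? 'Lite'; // Lite is the documented default
    if (!in_array($type, $types, true)) {
        throw new InvalidArgumentException('Option -t must be one of: ' . implode(', ', $types));
    }

    // getopt() reports a flag without a value as a key mapped to false.
    return ['mode' => $mode, 'type' => $type, 'verbose' => array_key_exists('v', $rawOptions)];
}

// On the command line this could be fed with: normaliseOptions(getopt('vm:t:'));
```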

Notes

Files are packaged differently from one type to another:

| Type | Package | City DB (=> path/to/destination) | Country DB (=> path/to/destination) |
|------|---------|----------------------------------|-------------------------------------|
| Lite | Gzip | GeoIPLiteCity.dat => /usr/share/GeoIP/GeoIPCity.dat | GeoIP.dat => /usr/share/GeoIP/GeoIP.dat |
| Legacy | Tar.gz | /GeoIP-133_XXX/GeoIPCity.dat => /usr/share/GeoIP/GeoIPCity.dat | /GeoIP-106_XXX/GeoIP-106_XXX.dat => /usr/share/GeoIP/GeoIP.dat |
| GeoIP2 | Tar.gz | /GeoIP2-City_XXX/GeoIP2-City.mmdb | /GeoIP2-Country_XXX/GeoIP2-Country.mmdb |

XXX = DB file timestamp

Notes :

  • For Lite and Legacy types, GeoIP Updater renames .dat files to GeoIP.dat and GeoIPCity.dat
  • For GeoIP2, it's up to the application to load the appropriate GeoIP2 MMDB.

Principles

DB versions

When using Lite or Legacy with the PHP API, geoip_database_info() returns a DB file's version, but it has to be called for every DB file, and keeping track of each file's version would be too much of a hassle.

GeoIP Updater builds an 'overall' version for a given set of DB files. It's basically a SHA-1 of the files' contents.

The version is then written to a hash file, stored along with the db files. Next time we need this DB set version, we can just read the hash file.
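
A minimal sketch of such an overall version is shown below. It assumes the files are hashed in name order as one continuous stream, which the README does not specify, and computeSetVersion is a hypothetical name rather than a function of the tool.

```php
<?php
/**
 * Computes one SHA-1 'version' for a set of files from their contents.
 * Paths are sorted so the result does not depend on listing order.
 */
function computeSetVersion(array $filePaths): string
{
    sort($filePaths, SORT_STRING);
    $context = hash_init('sha1');
    foreach ($filePaths as $path) {
        if (!is_file($path)) {
            throw new RuntimeException("Not a file: $path");
        }
        hash_update_file($context, $path); // streams the file into the hash
    }
    return hash_final($context);
}
```

Any change to any file in the set yields a different version, which is what makes a single hash sufficient to identify the whole set.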

Archives

Every set of DB files is archived in a directory specified in the conf/config.php file. When updating the DB files, both the current files and the newly retrieved files are archived.

If a version is already archived, GeoIP Updater simply reports it.

The maximum number of archives can be configured in the conf/config.php file.
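
The README does not say what happens once the maximum is reached. Assuming the oldest archives are the ones dropped, the selection could look like the sketch below; selectArchivesToPrune is a hypothetical helper and the timestamps are supplied by the caller.

```php
<?php
/**
 * Returns the archive names that exceed the configured maximum.
 * $archives maps an archive directory name to its creation timestamp.
 * Assumption: the newest $maxArchives archives are kept.
 */
function selectArchivesToPrune(array $archives, int $maxArchives): array
{
    arsort($archives, SORT_NUMERIC); // newest first, keys preserved
    return array_keys(array_slice($archives, $maxArchives, null, true));
}
```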

Validation

For Lite and Legacy only.

To make sure that the loaded DB files actually work, we simply call GeoIp functions and check that they return expected results for given parameters. The functions, IP/hosts and expected results are to be listed in the inc/validation_list.csv file.

The GeoIp functions are executed through exec() to prevent GeoIP Updater from crashing if the DB files are corrupted.
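
The idea of isolating a lookup in a child process can be sketched as follows. This is an illustration, not the tool's code: runIsolated is a hypothetical helper, it assumes a Unix-like shell and the PHP CLI binary exposed by PHP_BINARY, and the commented usage leaves the host as a placeholder.

```php
<?php
/**
 * Runs a PHP snippet in a separate PHP process and reports what happened.
 * A crash in the child process does not take the parent down with it.
 */
function runIsolated(string $phpCode): array
{
    $command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($phpCode) . ' 2>&1';
    $outputLines = [];
    $exitCode = 0;
    exec($command, $outputLines, $exitCode);
    return ['output' => implode("\n", $outputLines), 'exitCode' => $exitCode];
}

// Hypothetical use with the native GeoIP API:
// $result = runIsolated('echo geoip_country_code_by_name("xx.xx.xx.xx");');
// A non-zero exit code or unexpected output marks the validation item as failed.
```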

Limitations

It is not currently possible to validate geoip_record_by_name(). It returns an array and would not match a string as specified in the CSV validation items list.

GeoIP Updater MUST be executed by 'root' as it needs to write to /usr/share/.

To do

Unit tests

Contributors

nassimseddiki

geoip-updater's Issues

Return error code to shell on failure

Currently, errors are simply logged. They should make PHP return an error code to the shell that could be caught by Puppet or any other program that runs GeoIP Updater.

Extensive validation

The validation process only runs the geoip_country_code_by_name() function. That's not enough to validate all databases!!!

We just had the issue, 2 db files were invalid, but geoip_country_code_by_name() still worked... And the db files were validated.

The validation process should run GeoIp functions that exercise all DB files. The IP list should be extended to include more data to check (country name, ISP, region, city, ...); geoip_record_by_name() is a good candidate.

Validation fails when CSV IP list is malformed

A user modified the list and forgot to specify matching country codes for each IP. This made the validation fail and blacklisted the DB file!

The validation process should first check that it has all the data it needs.

Can't retrieve files into /tmp/GeoIP with php 5.6

Hello Nassim,

Line 383:
$aDbFiles = $this->_oFileSystem->glob($sPath.DIRECTORY_SEPARATOR."*".DIRECTORY_SEPARATOR);

$aDbFiles = $this->_oFileSystem->glob($sPath.DIRECTORY_SEPARATOR."*");

Can you please fix this ?

Thanks
