
Welcome to Hibari

A Distributed, Consistent, Ordered Key-Value Store

Hibari is a distributed, ordered key-value store with a strong consistency guarantee. Hibari is written in Erlang and designed to be:

  • Fast, Read Optimized: Hibari serves read and write requests with low, predictable latency, and has excellent performance especially for read and large-value operations

  • High Bandwidth: Batch and lock-free operations help achieve high throughput while ensuring data consistency and durability

  • Big Data: Stores petabytes of data by automatically distributing it across servers. The largest production Hibari cluster spans more than 100 servers

  • Reliable: High fault tolerance by replicating data between servers. Data is repaired automatically after a server failure

Hibari is able to deliver scalable high performance that is competitive with leading open source NOSQL (Not Only SQL) storage systems, while also providing the data durability and strong consistency that many systems lack. Hibari's performance relative to other NOSQL systems is particularly strong for reads and for large value (> 200KB) operations.

As one example of real-world performance, in a multi-million-user webmail deployment equipped with traditional HDDs (not SSDs), Hibari processes about 2,200 transactions per second, with read latencies averaging between 1 and 20 milliseconds and write latencies averaging between 20 and 80 milliseconds.

Distinct Features

Unlike many other distributed databases, Hibari uses chain replication, which gives it several distinct features.

  • Ordered Key-Values: Data is distributed across "chains" by key prefix, and keys within a chain are sorted in lexicographic order

  • Always Guarantees Strong Consistency: This simplifies the creation of robust client applications

    • Compare and Swap (CAS): a key timestamping mechanism that facilitates "test-and-set" style operations
    • Micro-Transactions: multi-key atomic transactions, within range limits
  • Custom Metadata: per-key custom metadata

  • TTL (Time To Live): per-key expiration times
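The features above are exposed through the client APIs (see Hibari Clients below). Here is a minimal sketch of how CAS and TTL might look through the native Erlang client's brick_simple module; the arities, the {testset, ...} flag name, and the table name are assumptions inferred from the API names mentioned elsewhere on this page (brick_simple:{add,set,replace}), and should be checked against the Hibari Application Developer's Guide:

```erlang
%% Sketch only: assumes a running Hibari cluster with a table 'tab1'.
%% Function arities and flag names are assumptions, not confirmed API.
demo() ->
    %% Plain write, then read; get is assumed to return the key's timestamp.
    ok = brick_simple:set(tab1, <<"k">>, <<"v1">>),
    {ok, TS, <<"v1">>} = brick_simple:get(tab1, <<"k">>),

    %% Compare and Swap: the write succeeds only if the key's current
    %% timestamp still equals TS (hypothetical 'testset' flag).
    ok = brick_simple:set(tab1, <<"k">>, <<"v2">>, [{testset, TS}]),

    %% TTL: per-key expiration, assumed to be an absolute-time argument.
    Exp = erlang:system_time(second) + 3600,
    ok = brick_simple:set(tab1, <<"tmp">>, <<"v">>, Exp, [], 5000).
```

Micro-transactions are left out of this sketch, since their exact client-side form is documented in the developer guide.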

Travis CI Status

http://travis-ci.org/hibari/hibari-ci-wrapper

Branch                               Erlang/OTP Versions     Remarks
master                               17.5, R16B03-1
dev                                  18.1, 17.5, R16B03-1
hibari-gh54-thrift-api               18.1, 17.5, R16B03-1
gbrick-gh17-redesign-disk-storage    18.1, 17.5              no tests, compile only

News

  • Apr 5, 2015 - Hibari v0.1.11 Released. Release Notes

    • Update for Erlang/OTP 17 and R16. (Note: Erlang/OTP releases prior to R16 are no longer supported)
    • Update external libraries such as UBF to the latest versions
    • Enhanced client API: server side rename and server side timestamp
    • New logging format. Introduce Basho Lager for more traditional logging that plays nicely with Unix logging tools like logrotate and syslog
  • Feb 4, 2013 - Hibari v0.1.10 Released. Release Notes

    • A bug fix in Python EBF Client
    • Update for Erlang/OTP R15
    • Support for building on Ubuntu, including ARMv7 architecture
    • Remove S3 and JSON-RPC components from Hibari distribution. They will become separate projects
  • Older News

Quick Start

Please read the Getting Started section of the Hibari Application Developer's Guide.

Hibari Documentation

The documents are a bit outdated -- sorry; a documentation rework is planned for Hibari v0.6.

Mailing Lists

Hibari Clients

As of Hibari v0.1 (since 2010), only the native Erlang client is used in production. All other client APIs (Thrift, JSON-RPC, UBF, and S3) are still at the proof-of-concept stage and implement only basic operations.

If you need a client library for another programming language, please feel free to post a request to the Hibari mailing list.

Supported Platforms

Hibari is written in pure Erlang/OTP and runs on many Unix/Linux platforms.

Please see the Supported Platforms page in Hibari Wiki for details.

Roadmap

Please see the Roadmap page in Hibari Wiki for the planned features for Hibari v0.3, v0.5, and v0.6.

Hibari's Origins

Hibari was originally written by Cloudian, Inc., formerly Gemini Mobile Technologies, to support mobile messaging and email services. Hibari was open-sourced under the Apache Public License version 2.0 in July 2010.

Hibari has been deployed by multiple telecom carriers in Asia and Europe. Hibari may lack some features such as monitoring, event and alarm management, and other "production environment" support services. Since each telecom operator has its own data center support infrastructure, Hibari's development has not included many services that would be redundant in a carrier environment.

We hope that Hibari's release to the open source community will close those functional gaps as Hibari spreads outside of carrier data centers.

What does Hibari mean?

The word "Hibari" means skylark in Japanese; the Kanji characters stand for "cloud bird".

License

Copyright (c) 2005-2017 Hibari developers.  All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Note for License

The Hibari project has decided to display "Hibari developers" as the copyright holder name in the source code files and manuals. The actual copyright holders (contributors) are listed in the AUTHORS file.


Hibari's People

Contributors

kinogmt, norton, tatsuya6502


Hibari's Issues

reltool command will fail to generate package due to broken symlinks created by repo command

  • repo version v1.12.2

The initial run of repo sync command seems to create broken links in .git directories, and this behavior causes reltool command to fail.

$ make package
...
generating: hibari-0.1.11-dev-x86_64-unknown-linux-gnu-64 ...
./rebar generate
==> rel (generate)
ERROR: Unable to generate spec: read file info /home/user/projects/hibari-0.1.11/tmp/rel/../lib/cluster_info/.git/shallow failed
ERROR: Unexpected error: rebar_abort
ERROR: generate failed while processing /home/user/projects/hibari-0.1.11/tmp/rel: rebar_abort
make: *** [generate] Error 1

Work around this by adding a find command to the Makefile that finds and deletes such broken links.

Support for building and running on Joyent SmartOS

Joyent SmartOS is a data center infrastructure for virtual machines. The OS itself is an illumos-based Solaris variant armed with ZFS, DTrace, Zones, KVM, and Crossbow (network virtualization). It can be a primary platform for deploying production Hibari clusters.

As of February 2013, some Erlang/OTP based distributed systems like RabbitMQ, Basho Riak, and Opscode Chef Server officially support SmartOS and provide binary packages.

  • Update Hibari's build and command scripts to work with Solaris Zones based SmartOS virtual machines
  • Write setup and administration guides
  • Measure performance on virtual machine clusters at Joyent Cloud.

Drop support for older Erlang/OTP releases. (Now server requires >= R16, client requires >= R14)

Drop support for older Erlang/OTP releases.

  • Hibari server components will drop support for R13, R14, and R15, and will continue to support R16 and 17.x. The server now has an h2leveldb-based metadata store, which takes advantage of new port driver APIs introduced in R16.
  • Hibari Erlang native client components will drop support for R13 and will continue to support R14, R15, R16, and 17.x. (Travis does not appear to support R13.)

Test Hibari deployment on Linux ARMv7 based scale out cluster (Calxeda EnergyCore)

Deploy Hibari on a test cluster of Calxeda EnergyCore servers running ARM Ubuntu 12.04 LTS (Linux ARMv7 architecture). Create documentation, run Basho Bench for performance measurement, and run a 48-hour stability test. If possible, run this as a project at Cloudian, Inc. to create a white paper for Calxeda's Software Partner page.

  • Create installation instructions for ARM Ubuntu
  • Create sample Hibari configurations
  • Develop and run custom Basho Bench drivers that simulate the workloads of typical applications for Hibari
  • Write a white paper with performance measurements that can be published on the web sites of Calxeda, Altima, and Cloudian

embedded hibari

Hi Tatsuya,

I'm not sure this is the right place to ask, but I couldn't find anything in the documentation on the subject of embedded Hibari.

I'm trying to build a real-time, self-learning data management platform, and I would like to use Hibari as the default data storage system, but I could not find anything that explains how to embed Hibari into my application.

Is that even possible with Hibari? The idea is to add Hibari as a rebar dependency and add it to the main application's supervision tree.

thanks,
Dalibor

Compilation fails with "dependency not available" for "lib/meck":

Compiling as described in the "building from source" docs fails:

$ repo sync; cd hibari; make
==> gmt_util (compile)
==> riak_err (compile)
==> cluster_info (compile)
==> partition_detector (compile)
==> congestion_watcher (compile)
==> mochiweb (compile)
==> s3 (compile)
==> ubf (compile)
==> ubf_jsonrpc (compile)
==> ubf_thrift (compile)
==> gdss_brick (compile)
==> gdss_client (compile)
==> gdss_admin (compile)
==> gdss_ubf_proto (compile)
==> gdss_json_rpc_proto (compile)
==> gdss_s3_proto (compile)
==> meck (compile)
==> rel (compile)
==> hibari (compile)
Dependency not available: meck-0.1.1 ("lib/meck")

Editing rebar.config and removing "meck" from the deps list fixes the issue.
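The fix boils down to deleting the meck entry from the deps list in rebar.config. A sketch of the relevant fragment follows; the version string and repository URL are illustrative, not the actual file contents:

```erlang
%% rebar.config (fragment; surrounding entries elided, values illustrative)
{deps, [
    %% ... other dependencies ...
    %% Deleting or commenting out this entry lets `make` complete:
    %% {meck, "0.1.1", {git, "git://github.com/eproxus/meck.git"}}
]}.
```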

This is Erlang R14B04 on OS X.

Update for Erlang/OTP R16(A)

Prepare Hibari v0.3.0 to build and run on Erlang/OTP R16(A).

Since Erlang/OTP R16 has dropped the experimental parameterized modules (Issue 4 under http://www.erlang.org/news/35), the following external libraries have to be updated to the latest version.

  • ubf
  • ubf-thrift
  • qc
  • meck

Hibari v0.1.11 Release

Current stable release Hibari v0.1.10 doesn't work on Erlang/OTP 17. So prepare Hibari v0.1.11 that is compatible with OTP 17.

v0.1.11 will also include many features originally scheduled for v0.3.0. I may remove some of them, but the following are the current changes on the dev branch.

NEW FEATURES

  • hibari >> GH18 - Add DTrace tracepoints for Erlang/OTP R15 or later
  • hibari >> GH19 - Introduce Basho Lager as the primary logging facility
  • gdss-brick >> GH2 - brick_server new client API - rename
  • gdss-brick >> GH13 - Add attrib and exp_time directive flags to rename/6
  • gdss-brick >> GH14 - Add attrib and exp_time directive flags to replace/6 and set/6

FIXES

  • hibari >> GH47 - reltool command will fail to generate package due to broken symlinks created by repo command
  • gdss-client >> GH8 - brick_simple_client:find_the_brick/3 crashes for unknown table

ENHANCEMENTS

  • gdss-client >> GH2 - brick_simple:{add,set,replace} APIs do not return the server-side timestamp on success

OTHERS

  • hibari >> GH20 - Remove riak-err
  • hibari >> GH21 - Add gmt_elog module again
  • hibari >> GH23 - Introduce qc wrapper
  • hibari >> GH24 - Update for Erlang/OTP R16
  • hibari >> GH27 - Support for building and running on Joyent SmartOS
  • hibari >> GH37 - Update for Erlang/OTP 17.x (Remove ubf and ubf-thrift)
  • hibari >> GH39 - Drop support for older Erlang/OTP releases

[Hibari release candidate] bad sequence file 'enoent' on a common log file deleted by the scavenger

This is a blocker issue.

This error occurred while running an 8-hour stability test against a 4-node Hibari 0.3.0 RC (Release Candidate). All Hibari nodes were running within one Solaris Zone-based SmartOS machine at Joyent Cloud. (Machine type: smartmachine medium, 4 GB RAM, 4 vCPUs)

Summary:

At 04:00, node [email protected] got the following bad sequence file error, enoent, on a common log file with sequence 1. That file had been deleted by the scavenger between 03:01 and 03:11.

2013-03-30 04:00:57.401 [warning] <0.31010.58>@brick_ets:bigdata_dir_get_val:2922 stability_test_ch3_b3: error {error,enoent} at 1 36210172
2013-03-30 04:00:57.403 [warning] <0.610.0>@brick_ets:sequence_file_is_bad_common:3022 Brick commonLogServer bad sequence file: seq 1 offset 36210172: rename data/brick/hlog.commonLogServer/s/000000000001.HLOG or data/brick/hlog.commonLogServer/2/2/-000000000001.HLOG to data/brick/hlog.commonLogServer/BAD-CHECKSUM/1.at.offset.36210172 failed: enoent

[email protected] was hosting 3 logical bricks and reported that a total of 2 keys were purged due to the absence of log 1. It put the affected bricks in read-only mode, and other brick servers took over their roles, so there was no data loss in this particular case.

2013-03-30 04:00:57.818 [info] <0.769.0>@brick_server:do_common_log_sequence_file_is_bad:5555 common_log_sequence_file_is_bad: stability_test_ch3_b3: 1 purged keys for log 1
2013-03-30 04:00:57.897 [info] <0.758.0>@brick_server:do_common_log_sequence_file_is_bad:5555 common_log_sequence_file_is_bad: stability_test_ch4_b2: 0 purged keys for log 1
2013-03-30 04:00:57.900 [info] <0.783.0>@brick_server:do_common_log_sequence_file_is_bad:5555 common_log_sequence_file_is_bad: stability_test_ch1_b1: 1 purged keys for log 1

...

2013-03-30 04:00:59.510 [info] <0.530.0> alarm_handler: {set,{{disk_error,{stability_test_ch1_b1,'[email protected]'}},"Administrator intervention is required."}}
2013-03-30 04:00:59.511 [info] <0.696.0>@brick_chainmon:set_all_chain_roles:1309 set_all_chain_roles: stability_test_ch1: brick read-only 1
2013-03-30 04:00:59.514 [info] <0.696.0>@brick_chainmon:set_all_chain_roles:1352 New head {stability_test_ch1_b2,'[email protected]'}: ChainDownSerial 368037 /= ChainDownAcked -4242, reflushing log downstream
2013-03-30 04:00:59.516 [info] <0.696.0>@brick_chainmon:set_all_chain_roles:1360 New head {stability_test_ch1_b2,'[email protected]'}: flush was ok
2013-03-30 04:00:59.516 [info] <0.645.0>@brick_admin:do_chain_status_change:1410 status_change: Chain stability_test_ch1 status degraded belongs to tab stability_test

The scavenger process started at 03:00 and finished at 03:11. It did not report any problems, including with updating locations for log 1.

Conditions:

  • 4-node Hibari 0.3.0 RC cluster (1 admin, 4 physical bricks)
  • syncwrites = true
  • scavenger setting
    • {brick_scavenger_start_time, "03:00"}
    • {brick_scavenger_temp_dir,"/scav/hibari1_scavenger"}
    • {brick_scavenger_throttle_bytes, 2097152}
    • {brick_skip_live_percentage_greater_than, 90}

Enhancements on hlog compaction (aka scavenger) for Hibari v0.3.0

Brush up the scavenging process to avoid some extra work.

I'll explain details later, but for example:

  • Calculate the live-hunk percentage of an hlog before sorting the contents of the work file for that hlog.
  • Reorganize the process of copying live hunks and updating their storage locations, so that the resume-scavenger feature is no longer needed. (Having only stop-scavenger is sufficient.)

Some work (including a major refactoring of the code) has been done for the GitHub issue: Hibari 0.3.0 RC - bad sequence file 'enoent' on a common log file deleted by the scavenger

Target Release: v0.3.0

hibari/bin/hibari attach not working

bootstrapping package: hibari-0.1.0-dev-x86_64-unknown-linux-gnu-64 ...
tar -C ./tmp -xzf ../hibari-0.1.0-dev-x86_64-unknown-linux-gnu-64.tgz
./tmp/hibari/bin/hibari start
./tmp/hibari/bin/hibari-admin bootstrap
ok
[norton@norton-pc hibari (master)]$ ./tmp/hibari/bin/hibari attach
No running Erlang on pipe /home/norton/workhub/hibari/hibari/tmp/hibari/tmp: No such file or directory

Hibari - config file and rebar-related changes for R15B

As of R15B on Hibari's dev branch, the following changes are in progress:

vm.args has moved to the releases/X.Y.Z directory
app.args has been renamed to sys.args and moved to the releases/X.Y.Z directory
congestion_watcher.config needs to be moved to the releases/X.Y.Z directory

Hibari's helper script for multi-node installation "clus-hibari.sh" and Hibari's documentation need updating as well.

Further testing is also required.

Lager - trunc_io_eqc is failing with badarg

Quviq QuickCheck version 1.26.2 (compiled at {{2012,6,20},{8,22,48}})
/home/tatsuya/workhub/dev/hibari/hibari/lib/lager% git branch -v
* master e749242 Merge pull request #105 from basho/adt-fix-tracing
/home/tatsuya/workhub/dev/hibari/hibari% make test

...

module 'trunc_io_eqc'
  trunc_io_eqc:42: eqc_test_...
Exception: badarg
FmtStr: []
Args:   []
Failed! After 1 tests.
{[],3}
*failed*
in function trunc_io_eqc:'-eqc_test_/0-fun-1-'/1 (test/trunc_io_eqc.erl, line 42)
**error:{assertEqual_failed,[{module,trunc_io_eqc},
                     {line,42},
                     {expression,"eqc : quickcheck ( eqc : testing_time ( 14 , ? QC_OUT ( prop_format ( ) ) ) )"},
                     {expected,true},
                     {value,false}]}
  output:<<"Starting Quviq QuickCheck version 1.26.2
   (compiled at {{2012,6,20},{8,22,48}})
Licence for ******** reserved until {{2013,3,9},{12,29,44}}
">>

  trunc_io_eqc:43: eqc_test_...Failed! Reason: 
{'EXIT',badarg}
After 1 tests.
[]
*failed*
in function trunc_io_eqc:'-eqc_test_/0-fun-4-'/1 (test/trunc_io_eqc.erl, line 43)
**error:{assertEqual_failed,[{module,trunc_io_eqc},
                     {line,43},
                     {expression,"eqc : quickcheck ( eqc : testing_time ( 14 , ? QC_OUT ( prop_equivalence ( ) ) ) )"},
                     {expected,true},
                     {value,false}]}


  [done in 0.669 s]

...

=======================================================
  Failed: 2.  Skipped: 0.  Passed: 136.
Cover analysis: /home/tatsuya/workhub/dev/hibari/hibari/lib/lager/.eunit/index.html

=INFO REPORT==== 9-Mar-2013::12:15:43 ===
    application: inets
    exited: stopped
    type: temporary
ERROR: One or more eunit tests failed.
ERROR: eunit failed while processing /home/tatsuya/workhub/dev/hibari/hibari/lib/lager: rebar_abort

Update license headers

  • Change the copyright holder name in all license banners and documents from Gemini Mobile Technologies, Inc. to Hibari developers.
  • Create an AUTHORS file and put the actual copyright holder names (contributors) there.
    (Cloudian, Inc. and Gemini should be listed there as well)

Check riak_err usage and remove it if it's not in use

riak_err was added to Hibari in v0.1.4, but I don't think it is used in Hibari. I don't see any place that starts the riak_err application. I think riak_err is only used by Cloudian KK's proprietary products built on top of Hibari.

AFAIK, Hibari won't produce lengthy log messages, and there is a plan to introduce lager as the primary logging facility (#19), so maybe it's a good time to retire riak_err?

Problem when running the hibari script via a symlink

e.g.:
$ sudo ln -s /usr/local/var/lib/hibari/hibari/bin/hibari /etc/init.d/gogogo

Hibari then cannot be run by executing /etc/init.d/gogogo.

Issue:

  • The hibari script uses $0 and pwd to determine its paths.

As a workaround, edit the hibari script as follows.

Replace:

RUNNER_SCRIPT_DIR=$(cd ${0%/*} && pwd)
RUNNER_BASE_DIR=${RUNNER_SCRIPT_DIR%/*}

with:

RUNNER_BASE_DIR="/usr/local/var/lib/hibari/hibari/"
RUNNER_SCRIPT_DIR=$RUNNER_BASE_DIR/bin

and replace:

SCRIPT=`basename $0`

with:

SCRIPT="hibari"

No official communications channel

I wanted to ask a few questions about Hibari, but I didn't find a single mention of a mailing list, IRC channel or anything similar.

Does anyone still care at all?

dialyzer warnings (pre v0.3)

Related to: #30 (comment)

Before v0.3 release, dialyzer options -Werror_handling and -Wunderspecs were added, so dialyzer now produces more warnings. Examine those warnings and correct the code or spec when necessary.

checkpoint_start processing is failing during load test

checkpoint_start processing is failing during load test. Needs further investigation.

=ERROR REPORT==== 18-Jan-2011::00:46:53 ===
Error in process <0.22661.0> on node 'hibari@perf03' with exit value: {{badmatch,{error,enoent}},[{brick_ets,checkpoint_start,4}]}

Add gmt_elog module again

Add back the gmt_elog module, which was deleted a couple of years ago (#18 (comment)). It contained utilities for the Erlang dbg module to enable the tracepoints and create a report.

http://hibari.github.com/hibari-doc/hibari-contributor-guide.en.html#_hibari_internal_tracepoints

Hibari internal tracepoints
The Hibari source code has been annotated with over 400 tracepoints using
macros based on the gmt_elog.erl and gmt_elog_policy.erl modules. These
tracepoints give the developer (and even field support staff) more options
for tracing events through Hibari’s code.

The gmt_elog tracepoints are designed to be extremely lightweight. While
they can be disabled completely at compile-time, their overhead is so low
that they can remain in production code and be enabled only when needed
for debugging.

Hibari v0.1.10 Release

Steps

  1. Update dialyzer warnings for the latest Erlang/OTP R15B03-1.
  2. Update the release note.
  3. In each sub-project with commits, update the version number on the
    dev branch, merge the dev branch into the master branch, and tag the
    merge commit.
    • gdss_admin v0.1.6
    • partition_detector v0.1.3
    • gmt_util v0.1.7
    • riak_err v1.0.1-hibari01
    • gdss_ubf_proto v0.1.8
    • ubf v1.15.7-hibari01
  4. In the main project, update the version number on the dev branch,
    merge the dev branch into the master branch, and tag the merge commit.
    • hibari v0.1.10

Add Basho Bench driver (Phase 1)

The driver in the Basho Bench repository is outdated. Develop new ones.

Phase 1

  • A simple driver for direct comparison between Riak, Cassandra, etc.
    • Native Erlang (brick_simple)
  • Custom drivers that mimic typical applications on Hibari
    • Native Erlang (brick_simple)

Phase 2

  • Simple drivers for the following client APIs
    • UBF
    • Thrift
  • Custom drivers that mimic typical applications on Hibari
    • Thrift

Hibari v0.3.0 Release

Target Date

March 18th or 25th, 2013

Steps

  1. Update dialyzer warnings for Erlang/OTP R15B03-1. (Not R16B)
  2. Update the release note
  3. In each sub-project with commits, update the version number on the dev branch, merge the dev branch into the master branch, and tag the merge commit.
    • gdss-admin v0.3.0
    • gdss-brick v0.3.0
    • gdss-client v0.3.0
    • gdss-ubf-proto v0.3.0
    • cluster-info v0.3.0
    • partition-detector v0.3.0
    • congestion-watcher v0.3.0
    • gmt-util v0.3.0
  4. Create and test release candidates (RCs)
  5. In the main project, update the version number on the dev branch, merge the dev branch into the master branch, and tag the merge commit.
    • hibari v0.3.0

Add DTrace tracepoints for Erlang/OTP R15 or later

Compare the overhead of DTrace/SystemTap tracepoints (dyntrace:p/1 -- p/8) to that of the current gmt_elog tracepoints. If the performance is acceptable, add DTrace/SystemTap tracepoints for Erlang/OTP R15 or later to get a unified view across Hibari, the Erlang VM, and the Unix/Linux kernel. (Keep gmt_elog for R13 and R14.)

http://hibari.github.com/hibari-doc/hibari-contributor-guide.en.html#_hibari_internal_tracepoints

Hibari internal tracepoints
The Hibari source code has been annotated with over 400 tracepoints using
macros based on the gmt_elog.erl and gmt_elog_policy.erl modules. These
tracepoints give the developer (and even field support staff) more options
for tracing events through Hibari’s code.

The gmt_elog tracepoints are designed to be extremely lightweight. While
they can be disabled completely at compile-time, their overhead is so low
that they can remain in production code and be enabled only when needed
for debugging.
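The dyntrace module referred to here ships with Erlang/OTP R15+ (in runtime_tools), and its probes are no-ops unless the VM was built with DTrace/SystemTap support. A minimal sketch of how such a probe might sit in a brick-server code path; the surrounding function and the probe labels are hypothetical:

```erlang
%% Hypothetical brick-server function instrumented with user probes.
%% dyntrace:p/1..8 accept integer and string arguments; on a VM built
%% without dtrace support they are cheap no-ops.
handle_get(Key) ->
    dyntrace:p(1, "brick_get_start"),
    Result = do_get(Key),              %% do_get/1 is a placeholder
    dyntrace:p(2, "brick_get_done"),
    Result.
```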

Enhancements on hlog compaction for the hybrid metadata storage

The current (up to Hibari v0.3.0) compaction process takes a snapshot dump of all key-value metadata (aka store tuples) on the physical brick so that it can find the live blob hunks to move to new common hlogs. This requires all keys to be in memory, and it won't be possible once Hibari starts to support an in-memory/disk hybrid metadata storage, which is currently in the design phase.

Design and implement an experimental alternative compaction mechanism that would work with the hybrid metadata storage. One idea is, instead of taking the snapshot dump, to maintain an on-disk database of live hunks at all times by using gmt_hlog_common's asynchronous metadata write-back process.

basho/eleveldb and krestenkrab/hanoidb are the candidates for this on-disk database.

JSON RPC adapter not documented, nor apparently working

Added the following config:

{gdss_json_rpc_proto,
 [
  {gdss_json_rpc_tcp_port, 7598},
  {gdss_json_rpc_uri, "/gdss"}
 ]},

However, whatever request I throw at it, I get 500 Internal Server Error back. The log says nothing, the documentation does not say what operations are supported, and requests I think are declared in the Erlang code (I'm not an Erlang dev, so I'm just flailing) all give the same result.

Host Hibari docs at Read The Docs

Read The Docs (https://readthedocs.org) provides nice features, such as automated document building, as well as a prettier template. I began converting the current Hibari manuals from asciidoc to the reStructuredText markup format. This is a manual process and might take some time, but it will easily pay off.

https://read-the-docs.readthedocs.org/en/latest/index.html

Read the Docs hosts documentation for the open source community. We support Sphinx docs written with reStructuredText and Markdown docs written with Mkdocs. We pull your code from your Subversion, Bazaar, Git, and Mercurial repositories. Then we build documentation and host it for you. Think of it as Continuous Documentation.

partition_detector application is not running even though it is configured

The partition_detector application is not running even though it is (properly?) configured in etc/app.config. Needs further investigation.

=INFO REPORT==== 18-Jan-2011::00:44:22 ===
alarm_handler: {set,{{app_disabled,partition_detector},
"This application must run in production environments."}}

=ERROR REPORT==== 18-Jan-2011::00:44:22 ===
module: brick_admin
line: 834
msg: "ERROR: partition_detector application is not running! This should not happen in a production environment."
Eshell V5.8.2 (abort with ^G)

etc/app.config:

%%
%% Partition Detector
%%
{partition_detector,
[{heartbeat_status_udp_port, 63099},
{heartbeat_status_xmit_udp_port, 63100},
{network_a_address, "192.167.104.102"},
{network_a_broadcast_address, "192.167.104.255"},
{network_a_tiebreaker, "192.167.104.1"},
{network_b_address, "192.168.104.102"},
{network_b_broadcast_address, "192.168.104.255"},
{network_monitor_enable, true},
{network_monitor_monitored_nodes, [hibari@perf02,hibari@perf03,hibari@perf04,hibari@perf05,hibari@perf06]}
]},

Add brick_metrics module - a Folsom based metrics system in production

In addition to the DTrace tracepoints (hibari GH18), introduce a brick_metrics module, a folsom-based metrics system that provides statistics in production.

This will replace the DB operation counters in brick servers and add more statistical information such as 95 percentile and standard deviation of latencies in subsystems.

For example, this is a log message from the current Hibari 0.3-dev:

2014-01-30 07:35:34.420 [info] <0.672.0>@brick_metrics:process_stats:132 statistics report
    (read)  sqflash prminig  median: 0.15 ms, 95 percentile: 0.244 ms
    (write) logging wait     median: 60.627 ms, 95 percentile: 100.651 ms
    (write) wal sync         median: 38.854 ms, 95 percentile: 66.769 ms, reqs 1, 4

exdec was used as the sampling method for the above metrics; it exponentially decays less significant readings over time. Only the most recent 1028 readings are kept, to minimize performance impact. Note that the sampling method and the number of readings are configurable.

From the log, you can tell:

  • all (recent) reads were done from the filesystem cache, none from disk (as 95 percentile is less than 1 ms)
  • the disk drive (single, 2.5 inch, PC-grade) is overloaded by WAL sync (group commit)
  • logging wait takes twice as long as wal sync. I am rewriting the old WAL module (gmt_hlog) from scratch to improve this area.

Early work has been done in this commit: hibari/gdss-brick@2e52fc5fc5a64
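For reference, creating such an exdec histogram with folsom looks roughly like this; the metric name and the latency value are hypothetical, and the exact call shapes should be double-checked against the folsom documentation:

```erlang
%% Histogram with an exponentially-decaying (exdec) sample of 1028
%% readings, as described above; 0.015 is a commonly used decay alpha.
ok = folsom_metrics:new_histogram(wal_sync_latency, exdec, 1028, 0.015),

%% Record one WAL-sync latency reading (microseconds).
ok = folsom_metrics:notify({wal_sync_latency, 38854}),

%% Fetch the median / 95th percentile for a periodic stats report.
Stats  = folsom_metrics:get_histogram_statistics(wal_sync_latency),
Median = proplists:get_value(median, Stats),
P95    = proplists:get_value(95, proplists:get_value(percentile, Stats)).
```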

Notes about the metrics system and DTrace

brick_metrics will provide continuous performance statistics in production, good for monitoring brick servers' resource usage and operation latencies.

DTrace tracepoints will be used to drill down into performance issues in production, e.g. to draw a latency histogram for a subsystem.

Update for Erlang/OTP 17.x

Prepare Hibari v0.3.0 to build and run on Erlang/OTP 17.x.

I tried to build Hibari's develop branch on 17.3, and the only (obvious) problem is that type specs for dict() and queue() now have to be namespaced, so they should be dict:dict() and queue:queue().
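Concretely, the change is mechanical; the new_cache/0 and new_queue/0 functions below are hypothetical, used only to illustrate it:

```erlang
%% Before (Erlang/OTP R16 and earlier), the bare built-in type was allowed:
%%   -spec new_cache() -> dict().
%% From OTP 17 on, the types must be namespaced with their module:
-spec new_cache() -> dict:dict().
new_cache() -> dict:new().

-spec new_queue() -> queue:queue().
new_queue() -> queue:new().
```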

documentation error with a2x even after patch applied

Symptom:
a2xp:a2xp: adoc_vers=''
Traceback (most recent call last):
File "/usr/bin/a2x", line 733, in
a2x.execute()
File "/usr/bin/a2x", line 297, in execute
self.__getattribute__('to_'+self.format)() # Execute to_* functions.
File "/usr/bin/a2x", line 496, in to_xhtml
self.copy_resources(xhtml_file, src_dir, self.destination_dir)
File "/usr/bin/a2x", line 446, in copy_resources
lambda attrs: attrs.get('type') == 'text/css')
File "/usr/bin/a2x", line 238, in find_resources
parser.feed(open(f).read())
File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
self.goahead(0)
File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
k = self.parse_starttag(i)
File "/usr/lib/python2.6/HTMLParser.py", line 249, in parse_starttag
attrvalue = self.unescape(attrvalue)
File "/usr/lib/python2.6/HTMLParser.py", line 387, in unescape
return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)
File "/usr/lib/python2.6/re.py", line 151, in sub
return _compile(pattern, 0).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 4: ordinal not in range(128)
make: *** [public_html/hibari-contributor-guide.en.html] Error 1

I will submit a patch when I have hunted this bug down.
