
This project forked from tritondatacenter/mako-regional-report

Mechanism for aggregating object sizes and counts across all makos in a region.

License: Mozilla Public License 2.0

mako-regional-report

This repository is part of the Joyent Manta project. For contribution guidelines, issues, and general documentation, visit the main Manta project page.

This is a mechanism for aggregating object sizes and counts across all storage nodes (makos) in a region. Currently, it consists of a single script which uses Manta public APIs to fetch the entire manifest that each mako generates to describe its storage contents. These manifests are located in /poseidon/stor/mako/<storage id>.

This is an engineering tool intended for use by operators to obtain a regional view of storage consumption in a Manta deployment. Since it only consumes public APIs, it does not require access to any private networks or interfaces. Currently, it is not part of the Triton/Manta offering, but may be integrated in the future.

Initially, a summary of each mako manifest is generated, derived from the full manifest. It contains a per-account summarization of the number of objects stored on the mako, along with their average size and cumulative size. Size units are in kilobytes. The last line in the summary will always be a global calculation of the stats mentioned above across all accounts on the storage node.
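
The derivation described above can be sketched in awk. This is a minimal sketch and not the code shipped in bin. It assumes that each manifest line is tab-separated, with the object path (/manta/<account>/<object>) in the first field and the physical size in kilobytes in the fourth field. The function name summarize_manifest and the output layout are illustrative only.

```shell
# Hypothetical helper: reads a mako manifest on stdin and prints one line per
# account (account, objects, average kilobytes, total kilobytes), followed by
# a final "totals" line for the whole storage node.
summarize_manifest() {
    awk -F '\t' '
    {
        # Assumed path layout: /manta/<account>/<object>
        split($1, parts, "/")
        account = parts[3]
        if (account == "" || account == "tombstone")
            next    # objects scheduled for deletion are left out of this sketch
        objects[account]++
        kilobytes[account] += $4
        total_objects++
        total_kilobytes += $4
    }
    END {
        for (account in objects)
            printf "%s\t%d\t%.6f\t%d\n", account, objects[account],
                kilobytes[account] / objects[account], kilobytes[account]
        if (total_objects > 0)
            printf "totals\t%d\t%.6f\t%d\n", total_objects,
                total_kilobytes / total_objects, total_kilobytes
    }'
}
```

Because the input is read as a stream, the manifest never needs to be stored on disk. The per-account lines come out in no particular order, but the totals line is always last.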

Next, the totals of each summary are included in a region-wide report as part of a JSON array, where each object is a storage node. An example of such a report might look like this:

[
  {
    "datacenter": "robertdc",
    "storage_id": "1.stor.west.example.com",
    "kilobytes": 30271952,
    "objects": 163525,
    "avg": 185.121248,
    "tombstone": [
      {
        "date": "2018-10-11",
        "kilobytes": 2231,
        "objects": 24
      },
      {
        "date": "2018-10-12",
        "kilobytes": 249,
        "objects": 5
      },
      {
        "date": "2018-10-13",
        "kilobytes": 109,
        "objects": 3
      }
    ]
  },
  {
    "datacenter": "robertdc",
    "storage_id": "2.stor.west.example.com",
    "kilobytes": 31533450,
    "objects": 162932,
    "avg": 193.537488,
    "tombstone": []
  },
  {
    "datacenter": "robertdc",
    "storage_id": "3.stor.west.example.com",
    "kilobytes": 29700783,
    "objects": 163327,
    "avg": 181.84858,
    "tombstone": []
  }
]

Note that each object in the array represents a single storage node. Each member of the object is described below:

  • datacenter: The name of the datacenter that the storage node is a part of.
  • storage_id: The name of the storage node.
  • kilobytes: The total amount of physical storage (in kilobytes) currently consumed by the storage node.
  • objects: Total number of objects on the storage node.
  • avg: Average object size on the storage node.
  • tombstone: A JSON array containing information about objects that are scheduled for deletion. The array can be empty, or it can hold a varying number of records depending on which objects have been marked for deletion (and when). Each element in the array represents a subdirectory within /manta/tombstone and is named after the date on which it was created. This is part of a larger process referred to as Garbage Collection. For more information on that process, refer to the Garbage Collection project page.

Since generating the summary can be quite expensive, we always check /poseidon/stor/mako/summary first to see if one already exists. If one is not present, we download the full mako manifest and derive the summary from it. Note that when we download the full manifest, we do not save it to disk. Instead, we perform the calculations on the stream and save only the resulting summary to disk.
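
To turn a report like the one above into a single regional figure, the per-node values can be added up. The following is a minimal sketch, assuming the report has been saved locally as region.json and that the json command-line tool is installed on the machine. sum_region is a hypothetical helper that reads one "kilobytes objects" pair per line.

```shell
# Hypothetical helper: sums "kilobytes objects" pairs read from stdin and
# prints the regional totals along with the regional average object size.
sum_region() {
    awk '
    { kilobytes += $1; objects += $2 }
    END {
        if (objects > 0)
            printf "kilobytes=%d objects=%d avg=%.6f\n",
                kilobytes, objects, kilobytes / objects
    }'
}

# Usage, assuming region.json is in the current directory:
#   json -a kilobytes objects < region.json | sum_region
```

Plain awk arithmetic is double precision, so very large regions may call for the arbitrary-precision support in GNU awk instead.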
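
The lookup order described above can be sketched as follows. This is an illustration under stated assumptions, not the logic in bin/report.sh: get_summary and derive_summary are hypothetical names, and derive_summary is a stand-in that only counts manifest lines where the real tool computes full statistics.

```shell
# Stand-in for the real summarization: counts manifest lines only.
derive_summary() {
    awk 'END { printf "totals\t%d\n", NR }'
}

# Hypothetical helper: get_summary <storage id> <output file>
get_summary() {
    storage_id=$1
    out=$2
    summary_path=/poseidon/stor/mako/summary/$storage_id
    manifest_path=/poseidon/stor/mako/$storage_id

    # Cheap path: the storage node has already uploaded its own summary.
    if mget -q "$summary_path" >"$out" 2>/dev/null; then
        return 0
    fi

    # Expensive path: stream the full manifest through the summarizer so
    # that only the resulting summary is written to disk.
    echo "info: no summary for $storage_id, deriving it from the manifest" >&2
    mget -q "$manifest_path" | derive_summary >"$out"
}
```

Unlike the real tool, this sketch hides the mget error output when the summary is missing and prints only its own informational message.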

After all mako summaries in a region have been obtained (whether downloaded or derived from the full mako manifests) and the regional report region.json has been completed, it is then uploaded to /poseidon/stor/mako/summary.

Requirements

The automation in bin requires that the following are installed:

  • GNU awk 4.1.3 or later (earlier versions do not allow you to specify the level of precision, which would limit calculations to 53 bits).
  • Manta Client Tools: The automation consumes Manta public APIs only and uses several utilities to access mako manifests as well as to upload regional results. This has been tested with Manta Client Tools 5.1.1.

Note: Several environment variables must be set in order to access Manta via the client tools. More information regarding configuring the Manta Client Tools can be found on the Manta Client Tools and SDK project page.
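
Both requirements can be verified before running the tool. The following preflight check is a sketch, not part of the repository. The function names are hypothetical, and the three variables it checks are the ones exported in the installation steps of this document.

```shell
# Succeeds when dotted version $1 is greater than or equal to version $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        for (i = 1; i <= (n > m ? n : m); i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Succeeds when gawk is installed and is at least version 4.1.3.
check_gawk() {
    version=$(gawk 'BEGIN { print PROCINFO["version"] }' 2>/dev/null) || version=
    [ -n "$version" ] && version_ge "$version" 4.1.3
}

# Succeeds when the Manta client environment variables are all set.
check_manta_env() {
    missing=0
    for name in MANTA_KEY_ID MANTA_URL MANTA_USER; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_gawk && check_manta_env && ./bin/report.sh
```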

Installation Steps

  1. Install node.js:
# pkgin install nodejs-6.14.1

Note: This may work with other versions of node, but 6.14.1 is the version that the tool was tested with.

  2. Install and configure Manta Command Line Tools:
# npm install -g manta

If your system does not have an ssh key, one can be generated with the following:

# ssh-keygen -t rsa -b 2048

In order to use the command line tools, it is necessary to set at least the following environment variables the first time:

$ export MANTA_KEY_ID=$(ssh-keygen -l -f ~/.ssh/id_rsa.pub | awk '{print $2}')
$ export MANTA_URL=https://us-east.manta.joyent.com
$ export MANTA_USER=rbogart

Note:

  • MANTA_URL is set based on the desired region.
  • In the example above, the account has operator access and the public key was added through the admin portal at https://my.samsungcloud.io. The alternative is to ask to have your public key included in the poseidon account, in which case do not forget to set MANTA_USER to poseidon.
  3. Install GNU awk 4.1.3 (or later):
# pkgin install gawk-4.1.3
  4. Install git:
# pkgin install git
  5. Clone the mako-regional-report GitHub repository and run the tool:
# git clone https://github.com/joyent/mako-regional-report.git
# cd mako-regional-report
# ./bin/report.sh

Note: When generating the report, it is possible that there are no pre-existing summaries for the mako manifests in /poseidon/stor/mako/summary. As discussed above, the utility will then resort to generating its own summary for the mako by deriving it from the full manifest in /poseidon/stor/mako. This is mentioned again because, when the summary is not present, you will see a message like this appear in the terminal:

mget: ResourceNotFoundError: /poseidon/stor/mako/summary/1.stor.us-east.scloud.host was not found
Mon Dec 17 21:31:30 UTC 2018: report.sh: info: Unable to find summary for 1.stor.us-east.scloud.host.

This is not a sign of anything that has gone wrong -- it simply means that the tool will be generating the summary itself since the mako apparently has not been configured to do it. For more information on how to configure a mako to generate its own summary, please see the section (below) on enabling post-processing of the manifest on the mako.

Configuration

In the event that the process of aggregating a region does not complete in a timely fashion due to having to derive summaries for all mako manifests, it is possible to offload that responsibility by enabling post-processing of the mako manifest on the storage node itself. The net result is that the storage node will generate its own summary based on the contents of its full manifest and upload that summary to /poseidon/stor/mako/summary.

To enable post-processing of the manifest on the mako:

MANTA_APP=$(sdc-sapi /applications?name=manta | json -Ha uuid)
echo '{ "metadata": {"MAKO_PROCESS_MANIFEST": true } }' | sapiadm update $MANTA_APP

To disable post-processing of the manifest on the mako:

MANTA_APP=$(sdc-sapi /applications?name=manta | json -Ha uuid)
echo '{ "metadata": {"MAKO_PROCESS_MANIFEST": false } }' | sapiadm update $MANTA_APP

Note: Setting this parameter will likely take effect within seconds or minutes. However, the value of this parameter will not be evaluated on the mako until the next time /opt/smartdc/mako/bin/upload_mako_ls.sh is run on the storage node. If a summary is needed sooner than that, it might be necessary to log in to the storage node and run the script by hand rather than waiting for the daily scheduled cron job to run it.
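
Since the enable and disable commands differ only in the boolean value, the payload can be generated by a small helper that rejects anything other than true or false. This is a sketch, and manifest_payload is a hypothetical name that is not part of SAPI or of this repository.

```shell
# Hypothetical helper: prints the SAPI metadata payload for the given value.
manifest_payload() {
    case "$1" in
        true|false)
            printf '{ "metadata": {"MAKO_PROCESS_MANIFEST": %s } }\n' "$1"
            ;;
        *)
            echo "usage: manifest_payload true|false" >&2
            return 1
            ;;
    esac
}

# Usage, on a host where sdc-sapi, json, and sapiadm are available:
#   MANTA_APP=$(sdc-sapi /applications?name=manta | json -Ha uuid)
#   manifest_payload true | sapiadm update "$MANTA_APP"
```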

To delete the SAPI metadata key

In the event that the operator would like to revert their metadata to the state that it was in prior to enabling this process, they can delete the metadata key by issuing the following:

MANTA_APP=$(sdc-sapi /applications?name=manta | json -Ha uuid)
sdc-sapi /applications/$MANTA_APP -X PUT -d '{"action":"delete", "metadata":{"MAKO_PROCESS_MANIFEST":{}}}'

Functionally, this serves the same purpose as setting the latch to false. However, setting the value of the metadata key to false has a different semantic meaning than deleting it. The former is intended to suggest a desired configuration, whereas the latter implies that the operator may not even care about this feature at all.

To Force Processing

Normally, the upload of manifests from the makos is handled via a cron job on each storage node:

1 8 * * * /opt/smartdc/mako/bin/upload_mako_ls.sh >>/var/log/mako-ls-upload.log 2>&1

Note: This time may have been adjusted in the environment you are working in.

If you need to force the mako to run an upload (for example, an out-of-band upload for testing), you can do this by running upload_mako_ls.sh by hand.
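
A manual run can mirror the cron entry so that its output lands in the same log. The following is a sketch. The default paths are taken from the cron line above, while force_upload and the UPLOAD_SCRIPT and UPLOAD_LOG overrides are hypothetical names added for illustration.

```shell
# Hypothetical helper: runs the upload script by hand and appends its output
# to the same log file that the cron job writes to.
force_upload() {
    script=${UPLOAD_SCRIPT:-/opt/smartdc/mako/bin/upload_mako_ls.sh}
    log=${UPLOAD_LOG:-/var/log/mako-ls-upload.log}
    echo "$(date): manual run of $script" >>"$log"
    "$script" >>"$log" 2>&1
}

# Usage, on the storage node:
#   force_upload
```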

Checking Results

The pre-processed summary for the makos will be placed in /poseidon/stor/mako/summary. For example, if you are testing your canary deployment with 1.stor.west.example.com, you should see a file created in that directory after processing is finished:

[root@bf4d027b-a0df-e8bf-9b84-9487f9a5eab4 ~]# mls -l /poseidon/stor/mako/summary/1.stor.west.example.com
-rwxr-xr-x 1 poseidon           473 Nov 14 05:53 1.stor.west.example.com

Looking into the file, you should see a totals line:

[root@6890ac1b (storage) ~]$ mget -q /poseidon/stor/mako/summary/1.stor.west.example.com | egrep "^totals"
totals  262755955915.000000     163891.000000   184.999487      30319751.000000
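
The totals line can be turned into labeled values for a quick sanity check. This is a sketch. The column meanings are an assumption inferred from the example above, where the fourth value (184.999487) equals the fifth value divided by the third (30319751 / 163891). The second value is not interpreted here, and show_totals is a hypothetical name.

```shell
# Hypothetical helper: reads a summary on stdin and prints the totals line
# with labels. Columns 3, 4, and 5 are assumed to be the object count, the
# average size in kilobytes, and the total kilobytes.
show_totals() {
    awk '$1 == "totals" {
        printf "objects=%d avg_kb=%s kilobytes=%d\n", $3, $4, $5
    }'
}

# Usage:
#   mget -q /poseidon/stor/mako/summary/1.stor.west.example.com | show_totals
```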

Contributors

  • rhb2
