
opsworks-logstash-ea's Introduction

A complete Logstash stack on AWS OpsWorks

This set of cookbooks began as a fork of the Springtest project, but has been updated to use "official" cookbooks (with customized "wrapper" cookbooks where necessary), rather than creating forked dependencies.

Specifically:

  • kibana -> uses "vanilla" kibana cookbook, with wrapping done by an opsworks-kibana recipe
  • elasticsearch -> uses the official cookbook from Elasticsearch
  • logstash -> uses the "semi-official" /lusis/logstash cookbook

Kibana, Elasticsearch & Logstash

We're going to end up creating three separate layers in an OpsWorks stack:

  • Kibana - web frontend for viewing logs; basically just an Nginx proxy to the Elasticsearch layer
  • Elasticsearch - log storage, indexing, querying
  • Logstash - log collection

For these instructions, we're assuming that you're using SQS as a broker and will demonstrate configuring the Logstash agents appropriately. If this isn't the case, the Kibana and Elasticsearch configuration will remain the same, but you'll need to modify the Logstash parts.

EC2 Setup

Before diving into OpsWorks, you'll need to do a bit of setup in the EC2 area of AWS.

Securing the Stack

By default, Elasticsearch does not require authentication to make requests. It is possible to enable HTTP Basic auth, but this covers only the REST API and also prevents some web-based plugins from working properly. It's better to think of Elasticsearch as a backend database and secure it as such.

What we want to end up with is:

  • Kibana - available via ssh and http on the public internet, secured by HTTP Basic auth
  • Elasticsearch - available via ssh on the public internet, otherwise only reachable by Kibana and Logstash instances
  • Logstash - available only via ssh on the public internet

We're going to accomplish this with a very basic VPC setup and some security groups. You could go further and put Elasticsearch and Logstash into a private subnet with appropriate NAT rules, but that requires a more involved VPC setup.

Create VPC

Go to the VPC dashboard and click Start VPC Wizard. Select VPC with a Single Public Subnet Only, and then just click through until the VPC has been created.

It's probably a good idea to create a "Name" tag for your VPC, as it can be tricky to keep track of which one is which once you've created several.

Configure Security Groups

Go to the Security Groups section (in the VPC area, not the regular EC2 one). There should already be a default security group defined with a single rule allowing all instances within the group to talk to each other. To additionally enable ssh access, you can add an inbound rule allowing traffic on port 22.

TCP Port      Source
--------      ------
ALL           sg-xxxxxxxx (ID of the default security group)
22 (SSH)      0.0.0.0/0

We additionally want to create a Kibana security group that will allow web traffic to the Kibana dashboard.

TCP Port      Source
--------      ------
22 (SSH)      0.0.0.0/0
80 (HTTP)     0.0.0.0/0
443 (HTTPS)   0.0.0.0/0

Note: For both security groups, ssh access is typically only required for debugging purposes. If you want to really lock things down, you can remove the SSH rules from the groups (changes you make to a security group take effect immediately; you don't need to restart any affected instances).

Create Elasticsearch Load Balancer

Next, we want to be able to put an ELB in front of our Elasticsearch array. We'll create an internal ELB in our VPC; Kibana and Logstash instances will be able to talk to it, but it will be inaccessible to the outside world.

In the EC2 dashboard, create a new ELB:

Load Balancer Name: <name>
Create LB inside: <id of your VPC>
Create an internal load balancer: yes

Listener Configuration:

HTTP 9200 -> HTTP 9200
TCP 9300 -> TCP 9300

Configuration Options:

Ping Protocol: HTTP
Ping Port: 9200
Ping Path: /

Selected Subnets:

  • select all of the subnets you created in your VPC

Security Groups:

  • Choose from your existing Security Groups
    • find the default security group and select it
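
The health check configured above amounts to an HTTP GET on / at port 9200. Once Elasticsearch instances are running behind the ELB, the same probe can be run by hand from a Kibana or Logstash instance to confirm that the security groups allow the traffic. A minimal Python sketch using only the standard library; the function name and the example hostname are hypothetical and not part of these cookbooks:

```python
import json
import urllib.request


def elasticsearch_reachable(host, port=9200, timeout=5):
    """Return True if GET http://host:port/ answers with HTTP 200 and a JSON body."""
    url = "http://%s:%d/" % (host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Elasticsearch answers its root path with a small JSON document.
            json.loads(response.read().decode("utf-8"))
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts and HTTP errors;
        # ValueError covers a body that is not valid JSON.
        return False


# Example (hypothetical ELB address, run from inside the VPC):
# elasticsearch_reachable("internal-logstash-es.us-east-1.elb.amazonaws.com")
```

Since the ELB is internal, this is expected to return False when run from outside the VPC.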

Key Pair

If you have an existing ssh key pair you want to use, that's fine. Otherwise, create a new one.

SQS Setup

If you're planning on using Amazon's SQS as a "broker" between log producers and Elasticsearch, you'll need to configure a queue for this purpose, plus IAM users to read from and write to the queue.

You can just use default values when creating a queue. Make a note of the ARN of your new queue.

IAM Setup

Create two users in IAM called logstash-reader and logstash-writer.

Assign logstash-writer the policy below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1389301427000",
      "Effect": "Allow",
      "Action": [
        "sqs:SendMessage"
      ],
      "Resource": [
        "{ARN of your queue, eg. arn:aws:sqs:us-east-1:000000000:logstash}"
      ]
    }
  ]
}

Assign logstash-reader the policy below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1389733069000",
      "Effect": "Allow",
      "Action": [
        "sqs:ChangeMessageVisibility",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes",
        "sqs:GetQueueUrl",
        "sqs:ListQueues",
        "sqs:ReceiveMessage"
      ],
      "Resource": [
        "{ARN of your queue, eg. arn:aws:sqs:us-east-1:000000000:logstash}"
      ]
    }
  ]
}

Create an Access Key for logstash-reader and make a note of it. You'll need to put it in your custom Chef json (discussed below).
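
The two queue policies above differ only in their list of actions. If you manage several queues, a small script can render them for a given ARN. This is a sketch, not part of the cookbooks: the function name and the example ARN are hypothetical, and the optional Sid field is left out.

```python
import json

# Actions copied from the logstash-writer and logstash-reader policies above.
WRITER_ACTIONS = ["sqs:SendMessage"]
READER_ACTIONS = [
    "sqs:ChangeMessageVisibility",
    "sqs:DeleteMessage",
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl",
    "sqs:ListQueues",
    "sqs:ReceiveMessage",
]


def queue_policy(actions, queue_arn):
    """Return an IAM policy document allowing `actions` on a single queue."""
    if not queue_arn.startswith("arn:aws:sqs:"):
        raise ValueError("expected an SQS ARN, got %r" % queue_arn)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": [queue_arn],
            }
        ],
    }


if __name__ == "__main__":
    arn = "arn:aws:sqs:us-east-1:000000000000:logstash"  # hypothetical ARN
    print(json.dumps(queue_policy(WRITER_ACTIONS, arn), indent=2))
```

The printed JSON can be pasted into the IAM console as a custom policy for the matching user.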

We're also going to take advantage of IAM Roles and Instance Profiles. Also in IAM, create a Role called logstash-elasticsearch-instance with the policy below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1393205558000",
      "Effect": "Allow",
      "Action": [
        "ec2:AttachVolume",
        "ec2:CreateSnapshot",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:DeleteSnapshot",
        "ec2:DeleteVolume",
        "ec2:DescribeSnapshotAttribute",
        "ec2:Describe*",
        "ec2:DetachVolume",
        "ec2:EnableVolumeIO",
        "ec2:ImportVolume",
        "ec2:ModifyVolumeAttribute"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

We'll use this role later when setting up our Elasticsearch layer in OpsWorks.

OpsWorks

Stack Setup

On the OpsWorks dashboard, select "Add Stack". Most default values are fine (or you can change things like regions or availability zones to suit your needs), but make sure to set:

  • VPC -> Select the VPC you created earlier
  • Default operating system -> Ubuntu
  • Under the "Advanced" settings:
    • Chef version -> 11.4
    • Use custom Chef cookbooks -> Yes
    • Repository URL -> URL of the repository you got these cookbooks from
    • Custom Chef Json -> See below

Custom Chef Json

The custom json below will configure your Kibana, Elasticsearch and Logstash layers. Make sure to fill in appropriate values for things like {some user name}. The web_user and web_password values are how you'll log into your logs dashboard. The webserver_hostname value isn't critical if you don't have a nice hostname.

{
    "chef_environment": "production",
    "java": {
        "jdk_version": 7,
        "install_flavor": "openjdk"
    },
    "opsworks-kibana": {
        "web_auth_enabled": true,
        "web_user": "{some user name}",
        "web_password": "{super secret password}"
    },
    "kibana": {
        "webserver": "nginx",
        "webserver_hostname": "logs.example.com",
        "es_port": "9200",
        "es_role": "elasticsearch",
        "es_server": "{address of your Elasticsearch ELB}",
        "config_cookbook": "opsworks-kibana",
        "nginx": {
            "template_cookbook": "opsworks-kibana"
        }
    },
    "elasticsearch": {
        "version": "0.90.9",
        "cluster": {
            "name": "logstash"
        },
        "discovery": {
            "type": "ec2",
            "ec2": {
                "tag": {
                    "opsworks:stack": "{name-of-your-OpsWorks-stack}",
                    "opsworks:layer:elasticsearch": "{name-of-your-Elasticsearch-layer}"
                }
            }
        },
        "plugins": {
            "karmi/elasticsearch-paramedic": {},
            "royrusso/elasticsearch-HQ": {}
        },
        "data": {
            "devices": {
                "/dev/xvdi": {
                    "file_system": "ext3",
                    "mount_options": "rw,user",
                    "mount_path": "/usr/local/var/data/elasticsearch",
                    "format_command": "mkfs.ext3",
                    "fs_check_command": "dumpe2fs",
                    "ebs": {
                        "size": {amount of space in GB you want for log storage},
                        "delete_on_termination": true,
                        "type": "standard"
                    }
                }
            }
        }
    },
    "logstash": {
        "elasticsearch_cluster": "logstash",
        "agent": {
            "version": "1.3.3",
            "source_url": "https://download.elasticsearch.org/logstash/logstash/logstash-1.3.3-flatjar.jar",
            "inputs": [
                {
                    "sqs": {
                        "access_key_id": "{access key id for logstash reader}",
                        "secret_access_key": "{secret access key for logstash reader}",
                        "queue": "{name of your logstash queue}",
                        "region": "us-east-1",
                        "threads": 25,
                        "use_ssl": "false",
                        "codec": "json"
                    }
                }
            ],
            "outputs": [
                {
                    "elasticsearch": {
                        "host": "{address of your Elasticsearch ELB}",
                        "cluster": "logstash"
                    }
                }
            ]
        }
    }
}

This setup assumes you're using SQS as a broker in your logstash layer; if you're not, you'll need to modify the input settings for the logstash section.

The elasticsearch config will create and mount an EBS volume sized as large as you want. By default, it will delete the volume if you terminate the instance. Keep that in mind before you terminate the last instance in your cluster and lose all your logs!

We also install both Elasticsearch-HQ and Paramedic as examples of how to install plugins. If you don't want to use these plugins, you can remove them from the plugins section (or remove the plugins section altogether).
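
Because the template contains many {placeholder} values, it can help to check your filled-in copy before pasting it into OpsWorks. The sketch below is an optional helper, not part of the cookbooks, and its function names are hypothetical. It parses the text as strict JSON, which fails on leftover comments or unquoted placeholders, and lists any string values that still look like placeholders.

```python
import json
import re

# A string value that is nothing but a brace-wrapped placeholder,
# e.g. "{some user name}".
PLACEHOLDER = re.compile(r"^\{.*\}$")


def find_placeholders(node, path=""):
    """Yield the paths of string values that still look like {placeholders}."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = path + "." + key if path else key
            yield from find_placeholders(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_placeholders(item, "%s[%d]" % (path, index))
    elif isinstance(node, str) and PLACEHOLDER.match(node):
        yield path


def check_custom_json(text):
    """Parse `text` as strict JSON and return sorted paths of unfilled values."""
    config = json.loads(text)  # raises ValueError if the text is not valid JSON
    return sorted(find_placeholders(config))
```

An empty list from check_custom_json means every placeholder string has been replaced; it says nothing about whether the values themselves are correct.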

Layer Configuration

Add some layers to your stack:

  • Elasticsearch
    • Layer type - Custom
    • Name - Elasticsearch
    • Short name - elasticsearch
  • Kibana
    • Layer type - Custom
    • Name - Kibana
    • Short name - kibana
  • Logstash
    • Layer type - Custom
    • Name - Logstash
    • Short name - logstash

Then configure them:

Elasticsearch

  • Custom Chef Recipes
    • Setup - java, elasticsearch, elasticsearch::ebs, elasticsearch::data, elasticsearch::aws, elasticsearch::plugins
  • Elastic Load Balancing
    • Select the load balancer you created previously
  • EBS Volumes
    • EBS optimized instances - No
  • Automatically Assign IP Addresses
    • Public IP Addresses: Yes
    • Elastic IP Addresses: No
  • Security Groups
    • Additional Groups - default
  • IAM Instance Profile
    • Layer Profile: logstash-elasticsearch-instance - this is the role we created in IAM previously

Kibana

  • Custom Chef Recipes
    • Setup - opsworks-kibana
  • EBS Volumes
    • EBS optimized instances - No
  • Automatically Assign IP Addresses
    • Public IP Addresses: Yes
    • Elastic IP Addresses: No
  • Security Groups
    • Additional Groups - default, kibana

Logstash

  • Custom Chef Recipes
    • Setup - java, logstash::agent
  • EBS Volumes
    • EBS optimized instances - No
  • Automatically Assign IP Addresses
    • Public IP Addresses: Yes
    • Elastic IP Addresses: No
  • Security Groups
    • Additional Groups - default

Then launch some instances!

Try It Out

Assuming you're using SQS, you can post messages to the queue directly using Amazon's web UI. If everything is working properly, you should see them arrive in the Kibana dashboard shortly thereafter.
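
Because the SQS input in the custom json uses the json codec, a test message body should itself be a JSON document. The sketch below generates one that you can paste into the SQS console. The function name and the extra "type" field are illustrative only; the "@timestamp" and "message" names are assumed to follow Logstash's usual event fields.

```python
import json
from datetime import datetime, timezone


def build_test_event(message, **fields):
    """Return a JSON string to use as the body of a test SQS message."""
    event = {
        # ISO 8601 timestamp in UTC, the format Logstash uses for @timestamp.
        "@timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "message": message,
    }
    event.update(fields)  # any extra fields become searchable in Kibana
    return json.dumps(event)


if __name__ == "__main__":
    print(build_test_event("hello from SQS", type="smoke-test"))
```

Searching Kibana for the message text is then a quick way to confirm the whole pipeline works.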
