Update 22/12/2015
I’ve reviewed the book Learning ELK Stack by Packt Publishing; it’s available online for only $5: https://www.packtpub.com/big-data-and-business-intelligence/learning-elk-stack/?utm_source=DD-deviantonywp&utm_medium=referral&utm_campaign=OME5D2015
I’ve recently set up an ELK stack to centralize the logs of many services in my company, and it’s just amazing!
I’ve used the following software versions on Ubuntu 12.04 (the setup also works on Ubuntu 14.04):
- Elasticsearch 1.4.1
- Kibana 3.1.2
- Logstash 1.4.2
- Logstash-forwarder 0.3.1
About the software
Elasticsearch
Elasticsearch is a RESTful, distributed search engine based on the Apache Lucene engine, storing data as schema-free JSON documents. It is developed by the Elasticsearch company, which also maintains Kibana and Logstash.
Elasticsearch Homepage
Logstash
Logstash is a tool used to harvest and filter logs. It runs on the Java Virtual Machine and is released under the Apache 2.0 license.
Logstash Homepage
Logstash-forwarder
Logstash-forwarder (previously named Lumberjack) is one of the many log shippers compatible with Logstash.
It has the following advantages:
- a light footprint (written in Go, no need for a Java Virtual Machine to harvest logs)
- uses a data compression algorithm
- uses encryption to send data over the network
Logstash-forwarder Homepage
Kibana
Kibana is a web UI for searching and visualizing the data stored in Elasticsearch by Logstash.
Kibana Homepage
Architecture
Here is a simple schema of the expected architecture: we will run logstash-forwarder (which speaks the lumberjack protocol) on each server whose logs we want to harvest. These nodes will send data to the indexer: logstash, which will process the events using filters and send the formatted data to elasticsearch.
Kibana, the UI, will let you display and aggregate the data. This architecture is scalable: you can quickly add more indexer nodes by adding logstash instances, and likewise for elasticsearch, which works as a one-node cluster by default.
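The pipeline described above can be sketched as follows (roles and transports as described in this post; hostnames are placeholders):

```
 monitored servers            indexer              storage             UI
+--------------------+    +-----------+    +---------------+    +--------+
| logstash-forwarder | -> | logstash  | -> | elasticsearch | <- | kibana |
+--------------------+    +-----------+    +---------------+    +--------+
   lumberjack (SSL)          filters          HTTP (9200)
```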
Setup the Elasticsearch node
NOTE: A one-node elasticsearch cluster is not recommended for production; I’ve written another blog post describing the setup of a three-node cluster on Ubuntu 12.04, see How to setup an Elasticsearch cluster with Logstash on Ubuntu 12.04.
Requirements
Elasticsearch runs on Java: ensure you have a JDK installed on your system and that it is available in the PATH.
Install via repository
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/elasticsearch/1.4/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list
$ sudo apt-get update && sudo apt-get install elasticsearch
You can also decide to start the elasticsearch service on boot using the following command:
$ sudo update-rc.d elasticsearch defaults 95 10
Configuration
To start the service, you must specify the path to the Java JDK in the file /etc/default/elasticsearch by adding the following variable:
JAVA_HOME=/path/to/java/JDK
If you want to tune your elasticsearch installation, the configuration is available in the file /etc/elasticsearch/elasticsearch.yml.
You can now start the service:
$ sudo service elasticsearch start
Automatic index cleaning via Curator
You can use the curator program to delete old indices. See the GitHub repository for more information: https://github.com/elasticsearch/curator
You’ll need pip in order to install curator:
$ sudo apt-get install python-pip
Once it’s done, you can install curator:
$ sudo pip install elasticsearch-curator
Now it’s easy to set up a cron job in /etc/cron.d/elasticsearch_curator that deletes the indices older than 30 days:
@midnight root curator delete --older-than 30 >> /var/log/curator.log 2>&1
Cluster overview via Marvel
NOTE: marvel needs to be installed on each node of an elasticsearch cluster in order to supervise the whole cluster.
See marvel’s homepage for more info.
Install it:
$ /usr/share/elasticsearch/bin/plugin -i elasticsearch/marvel/latest
Restart the elasticsearch service:
$ sudo service elasticsearch restart
Now you can access the marvel UI in your browser at http://elasticsearch-host:9200/_plugin/marvel
Setup the Logstash node
Requirements
Logstash runs on Java: ensure you have a JDK installed on your system and that it is available in the PATH.
Install via repository
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/logstash/1.4/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list
$ sudo apt-get update && sudo apt-get install logstash
Generate a SSL certificate
Use the following command to generate a self-signed SSL certificate and its private key in /etc/ssl:
$ openssl req -x509 -newkey rsa:2048 -keyout /etc/ssl/logstash.key -out /etc/ssl/logstash.pub -nodes -days 1095
Configuration
By default, the logstash configuration lives in the /etc/logstash/conf.d directory. As the configuration can become quite messy over time, I’ve split it into multiple files:
- 00_input.conf
- 02_filter_*.conf
- 10_output.conf
This allows you to define separate sections of the logstash configuration:
Input section
Here you define all the inputs for the indexer; an input is a source from which logstash reads events. It can be a file, a message queue connection… We are going to use the lumberjack input to communicate with the logstash-forwarder harvesters.
See: http://logstash.net/docs/1.4.2/inputs/lumberjack for more information.
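As a sketch, a minimal 00_input.conf for the lumberjack input could look like this (the port number is an arbitrary choice; the certificate paths match the ones generated in the SSL section above):

```
# /etc/logstash/conf.d/00_input.conf
input {
  lumberjack {
    # arbitrary port; must match the one configured in logstash-forwarder
    port => 5000
    # certificate/key pair generated with openssl on this node
    ssl_certificate => "/etc/ssl/logstash.pub"
    ssl_key => "/etc/ssl/logstash.key"
  }
}
```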
Filter section
Filters are processing methods applied to the received events. For example, you can apply a calculation to some numeric value, or drop a specific event based on its text value… There are a LOT of filters you can use with logstash; see the documentation for more information.
I recommend using a specific configuration file for each service you want to process: 02_filter_apache.conf, 02_filter_mysql.conf…
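As an illustration, a minimal sketch of such a 02_filter_apache.conf using the grok and date filters (the "apache" type is an assumption; it must match the type set by the shipper):

```
# /etc/logstash/conf.d/02_filter_apache.conf
filter {
  # only process events tagged with the "apache" type by the shipper
  if [type] == "apache" {
    grok {
      # parse the standard combined access log format
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
    }
    date {
      # use the request timestamp as the event timestamp
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
```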
Output section
NOTE: There is another way to configure the logstash integration with an elasticsearch cluster, which is more suitable if you have more than one node in your cluster; see How to setup an Elasticsearch cluster with Logstash on Ubuntu 12.04.
The output defines where logstash will send the processed events. We are going to use elasticsearch as the output destination:
/etc/logstash/conf.d/10_output.conf
output {
  elasticsearch {
    host => "elasticsearch-node"
    protocol => "http"
  }
}
Optional: Logstash contrib plugins
Starting from version 1.4 of logstash, some plugins have been split out of the core project. A new project was born: logstash-contrib, gathering a lot of plugins in one bundle.
See http://logstash.net/docs/1.4.2/contrib-plugins for more information. It’s also available on Github: https://github.com/elasticsearch/logstash-contrib.
Installation:
$ /opt/logstash/bin/plugin install contrib
Setup the Kibana node
Requirements
You’ll need a web server to serve kibana; I’ve chosen apache:
$ sudo apt-get install apache2
Installation
You can retrieve the latest archive for kibana here: http://www.elasticsearch.org/overview/kibana/installation/
Download the archive and extract it into your web server root (replace VERSION with the right version):
$ wget https://download.elasticsearch.org/kibana/kibana/kibana-VERSION.tar.gz
$ tar xvf kibana-*.tar.gz -C /var/www
$ sudo mv /var/www/kibana-VERSION /var/www/kibana
Now setup the default dashboard:
$ cp /var/www/kibana/app/dashboards/default.json /var/www/kibana/app/dashboards/default.json.bak
$ mv /var/www/kibana/app/dashboards/logstash.json /var/www/kibana/app/dashboards/default.json
Update the elasticsearch value in /var/www/kibana/config.js to match your elasticsearch node:
elasticsearch: "http://elasticsearch-host:9200",
You can now access the kibana UI: http://kibana-host/kibana
Setup the Logstash forwarder on a node
Install via repository
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/logstashforwarder/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list
$ sudo apt-get update && sudo apt-get install logstash-forwarder
Configuration
You’ll need to copy the public certificate you generated previously on the logstash node to the same path on each node: /etc/ssl/logstash.pub
The configuration of the logstash-forwarder is defined in the file /etc/logstash-forwarder. You can see examples in my logstash recipes posts.
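As a sketch, a minimal /etc/logstash-forwarder could look like this (the hostname, port, and watched file are placeholders; the port must match the lumberjack input configured on the logstash node):

```json
{
  "network": {
    "servers": [ "logstash-host:5000" ],
    "ssl ca": "/etc/ssl/logstash.pub",
    "timeout": 15
  },
  "files": [
    {
      "paths": [ "/var/log/apache2/access.log" ],
      "fields": { "type": "apache" }
    }
  ]
}
```

The "type" field set here is what the logstash filter section can match on to pick the right filters for each service.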
Logstash recipes
Here is a list of posts I’ve made with logstash recipes; they contain both logstash and logstash-forwarder configuration samples:
You can also check my post on how to debug your logstash filters.
Enjoy your logs!