Update 22/12/2015
I’ve reviewed the book Learning ELK Stack by Packt Publishing; it’s available online for only $5: https://www.packtpub.com/big-data-and-business-intelligence/learning-elk-stack/?utm_source=DD-deviantonywp&utm_medium=referral&utm_campaign=OME5D2015
I’ve recently set up an ELK stack to centralize the logs of many services in my company, and it’s just amazing!
I’ve used the following software versions on Ubuntu 12.04 (the setup also works on Ubuntu 14.04):
- Elasticsearch 1.4.1
- Kibana 3.1.2
- Logstash 1.4.2
- Logstash-forwarder 0.3.1
About the software
Elasticsearch
Elasticsearch is a RESTful, distributed search engine based on the Apache Lucene engine, storing data as schema-free JSON documents. It is developed by the Elasticsearch company, which also maintains Kibana and Logstash.
Elasticsearch Homepage
Logstash
Logstash is a tool used to harvest and filter logs. It runs on the Java Virtual Machine and is released under the Apache 2.0 license.
Logstash Homepage
Logstash-forwarder
Logstash-forwarder (previously named Lumberjack) is one of the many log shippers compatible with Logstash.
It has the following advantages:
- a light footprint (written in Go, no need for a Java Virtual Machine to harvest logs)
- uses a data compression algorithm
- uses encryption to send data over the network
Logstash-forwarder Homepage
Kibana
Kibana is a web UI for searching and visualizing the data stored in Elasticsearch by Logstash.
Kibana Homepage
Architecture
Here is a simple schema of the expected architecture: we will run logstash-forwarder (which speaks the lumberjack protocol) on each server whose logs we want to harvest. These nodes will send data to the indexer: logstash, which will process the events using filters and send the formatted data to elasticsearch.
Kibana, the UI, will let you display and aggregate the data. This architecture is scalable: you can quickly add more indexer nodes by adding logstash instances, and likewise for elasticsearch, which works as a one-node cluster by default.
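The pipeline described above can be sketched as follows (roles and transports as described in this post; hostnames are placeholders):

```
 monitored servers            indexer              storage             UI
+--------------------+    +-----------+    +---------------+    +--------+
| logstash-forwarder | -> | logstash  | -> | elasticsearch | <- | kibana |
+--------------------+    +-----------+    +---------------+    +--------+
   lumberjack (SSL)          filters          HTTP (9200)
```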
Setup the Elasticsearch node
NOTE: A one-node elasticsearch cluster is not recommended for production; I’ve written another blog post describing the setup of a three-node cluster on Ubuntu 12.04, see How to setup an Elasticsearch cluster with Logstash on Ubuntu 12.04.
Requirements
Elasticsearch runs on Java: ensure you have a JDK installed on your system and that it is available in the PATH.
Install via repository
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/elasticsearch/1.4/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list
$ sudo apt-get update && sudo apt-get install elasticsearch
You can also decide to start the elasticsearch service on boot using the following command:
$ sudo update-rc.d elasticsearch defaults 95 10
Configuration
To start the service, you must specify the path to the Java JDK in the file /etc/default/elasticsearch by adding the following variable:
JAVA_HOME=/path/to/java/JDK
If you want to tune your elasticsearch installation, the configuration is available in the file /etc/elasticsearch/elasticsearch.yml.
You can now start the service:
$ sudo service elasticsearch start
Automatic index cleaning via Curator
You can use the curator program to delete old indices. See the GitHub repository for more information: https://github.com/elasticsearch/curator
You’ll need pip in order to install curator:
$ sudo apt-get install python-pip
Once it’s done, you can install curator:
$ sudo pip install elasticsearch-curator
Now it’s easy to set up a cron job in /etc/cron.d/elasticsearch_curator that deletes the indices older than 30 days:
@midnight root curator delete --older-than 30 >> /var/log/curator.log 2>&1
Cluster overview via Marvel
NOTE: marvel needs to be installed on each node of an elasticsearch cluster in order to supervise the whole cluster.
See marvel’s homepage for more info.
Install it:
$ /usr/share/elasticsearch/bin/plugin -i elasticsearch/marvel/latest
Restart the elasticsearch service:
$ sudo service elasticsearch restart
Now you can access the marvel UI in your browser at http://elasticsearch-host:9200/_plugin/marvel
Setup the Logstash node
Requirements
Logstash runs on Java: ensure you have a JDK installed on your system and that it is available in the PATH.
Install via repository
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/logstash/1.4/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list
$ sudo apt-get update && sudo apt-get install logstash
Generate a SSL certificate
Use the following command to generate a self-signed SSL certificate and its private key in /etc/ssl:
$ openssl req -x509 -newkey rsa:2048 -keyout /etc/ssl/logstash.key -out /etc/ssl/logstash.pub -nodes -days 1095
Configuration
By default, the logstash configuration lives in the /etc/logstash/conf.d directory. As the configuration can become quite messy over time, I’ve split it into multiple files:
- 00_input.conf
- 02_filter_*.conf
- 10_output.conf
This allows you to define separate sections of the logstash configuration:
Input section
Here you define all the inputs for the indexer; an input is a source from which logstash reads events. It can be a file, a message queue connection… We are going to use the lumberjack input to communicate with the logstash-forwarder harvesters.
See: http://logstash.net/docs/1.4.2/inputs/lumberjack for more information.
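As a sketch, a minimal 00_input.conf for the lumberjack input could look like this (the port number is an arbitrary choice; the certificate paths match the ones generated in the SSL section above):

```
# /etc/logstash/conf.d/00_input.conf
input {
  lumberjack {
    # arbitrary port; must match the one configured in logstash-forwarder
    port => 5000
    # certificate/key pair generated with openssl on this node
    ssl_certificate => "/etc/ssl/logstash.pub"
    ssl_key => "/etc/ssl/logstash.key"
  }
}
```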
Filter section
Filters are processing methods applied to the received events. For example, you can apply a calculation to some numeric value, or drop a specific event based on its text value… There are a LOT of filters you can use with logstash; see the documentation for more information.
I recommend using a specific configuration file for each service you want to process: 02_filter_apache.conf, 02_filter_mysql.conf…
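As an illustration, a minimal sketch of such a 02_filter_apache.conf using the grok and date filters (the "apache" type is an assumption; it must match the type set by the shipper):

```
# /etc/logstash/conf.d/02_filter_apache.conf
filter {
  # only process events tagged with the "apache" type by the shipper
  if [type] == "apache" {
    grok {
      # parse the standard combined access log format
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
    }
    date {
      # use the request timestamp as the event timestamp
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
```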
Output section
NOTE: There is another way to configure the logstash integration with an elasticsearch cluster, which is more suitable if you have more than one node in your cluster; see How to setup an Elasticsearch cluster with Logstash on Ubuntu 12.04.
The output defines where logstash will send the processed events. We are going to use elasticsearch as the output destination:
/etc/logstash/conf.d/10_output.conf
output {
  elasticsearch {
    host => "elasticsearch-node"
    protocol => "http"
  }
}
Optional: Logstash contrib plugins
Starting from version 1.4 of logstash, some plugins have been split out of the core project. A new project was born: logstash-contrib, gathering a lot of plugins in one bundle.
See http://logstash.net/docs/1.4.2/contrib-plugins for more information. It’s also available on Github: https://github.com/elasticsearch/logstash-contrib.
Installation:
$ /opt/logstash/bin/plugin install contrib
Setup the Kibana node
Requirements
You’ll need a web server to serve kibana; I’ve chosen apache:
$ sudo apt-get install apache2
Installation
You can retrieve the latest archive for kibana here: http://www.elasticsearch.org/overview/kibana/installation/
Download the archive and extract it into your web server root (replace VERSION with the right version):
$ wget https://download.elasticsearch.org/kibana/kibana/kibana-VERSION.tar.gz
$ tar xvf kibana-*.tar.gz -C /var/www
$ sudo mv /var/www/kibana-VERSION /var/www/kibana
Now setup the default dashboard:
$ cp /var/www/kibana/app/dashboards/default.json /var/www/kibana/app/dashboards/default.json.bak
$ mv /var/www/kibana/app/dashboards/logstash.json /var/www/kibana/app/dashboards/default.json
Update the elasticsearch value in /var/www/kibana/config.js to match your elasticsearch node:
elasticsearch: "http://elasticsearch-host:9200",
You can now access the kibana UI: http://kibana-host/kibana
Setup the Logstash forwarder on a node
Install via repository
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/logstashforwarder/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list
$ sudo apt-get update && sudo apt-get install logstash-forwarder
Configuration
You’ll need to copy the public certificate you generated previously on the logstash node to the same path on each node: /etc/ssl/logstash.pub
The configuration of the logstash-forwarder is defined in the file /etc/logstash-forwarder. You can see examples in my logstash recipes posts.
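As a sketch, a minimal /etc/logstash-forwarder could look like this (the hostname, port, and watched file are placeholders; the port must match the lumberjack input configured on the logstash node):

```json
{
  "network": {
    "servers": [ "logstash-host:5000" ],
    "ssl ca": "/etc/ssl/logstash.pub",
    "timeout": 15
  },
  "files": [
    {
      "paths": [ "/var/log/apache2/access.log" ],
      "fields": { "type": "apache" }
    }
  ]
}
```

The "type" field set here is what the logstash filter section can match on to pick the right filters for each service.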
Logstash recipes
Here is a list of posts I’ve made with logstash recipes; they contain both logstash and logstash-forwarder configuration samples:
You can also check my post on how to debug your logstash filters.
Enjoy your logs!