Testing Logstash configuration with JSON input/output
Aug 12th, 2020 | 4 min read

Logstash is a data processing pipeline that collects data from various sources, transforms it, and sends it on to a destination. It is most commonly used to send data to Elasticsearch (a search and analytics engine), which can then be visualised using Kibana. Together, Elasticsearch, Logstash and Kibana form the ELK stack.
Logstash uses configuration files to configure how incoming events are processed.
The problem
Recently, I wanted to test out a Logstash configuration file locally in the simplest possible way.
I had no pieces of the ELK stack installed or set up, and had minimal time to push out a fairly complex Logstash config change to a remote environment. I didn't have time to faff about installing the full ELK stack and getting each component to talk to the others. What I wanted to achieve was simple:
Given an incoming log in a JSON format, apply a Logstash configuration and view the output event(s).
Online documentation and posts seem to assume a Linux environment, which is fair enough, since most production Logstash instances are deployed on Linux. It is worth pointing out that I was trying to achieve this on Windows. The end result doesn't look wildly different, but I did have some teething issues.
Setup
For demo purposes I’m going to work from an arbitrary C:\temp directory.
Firstly, create 3 blank files in C:\temp:
- logstash.conf — declares the config we want to test out.
- logstash_out.logs — where we will store the Logstash config output.
- logstash_in.logs — input logs for the Logstash config to consume.
Installations
You do not need Kibana or Elasticsearch installed as we are going to store the output in a local file.
You do need Logstash itself. I installed it with Chocolatey (choco install logstash), as I do with most things on Windows. The install includes a logstash.bat file, which is used to run Logstash and lets you specify the configuration to use.
Please take note of the install location! My binaries were installed under C:\ProgramData\chocolatey\lib\logstash\tools
You will also need Java installed as this is what Logstash uses to run.
Input
Inside the log file should be a list of input logs, in JSON format — one per line. Logstash will consume each line as a separate event.
Let's add the following to our logstash_in.logs file:
{"index":"12345","id":1,"message":"hello world"}
{"index":"12345","id":2,"message":"help me"}
{"index":"12345","id":3,"message":"error"}
Note that the keys and string values are quoted, since the json codec expects each line to be valid JSON.
Then add the following to the logstash.conf file:
input {
  file {
    path => "C:/temp/*.logs"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => json
  }
}
- path — tells Logstash that the input comes from all .logs files in the C:\temp directory. There is only one in our example.
- codec — there is nothing special about the .logs extension. The events are consumed as plain text; it is the codec that tells Logstash the format (JSON in our example).
- start_position — tells Logstash to start processing from the beginning of the file, rather than only tailing new events.
- sincedb_path — Logstash stores a pointer (the sincedb) to track how far through each file it has read. Specifying NUL (the Windows null device) means we discard that pointer and always read the whole file. I have been left confused in situations where I'm thinking "why are my input logs not being consumed?", and it's because Logstash thinks it has consumed them already.
Note also that all slashes in the path are forward slashes, not backslashes! This caught me out on my Windows machine, where I am used to backslashes.
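As an aside, if you'd rather avoid file paths and sincedb bookkeeping entirely, Logstash also ships with a stdin input plugin, which reads events piped into the process. A minimal sketch, using the same json codec, would replace the file input above with:

```
input {
  stdin {
    codec => json
  }
}
```

You could then pipe the input file in on Windows with something like type C:\temp\logstash_in.logs | logstash.bat -f C:\temp\logstash.conf. For this post we'll stick with the file input.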
Plugins
This is where you can add whatever custom logic you like, and it is probably what you are trying to experiment with. Let's add to our logstash.conf file to do something trivial, like adding an arbitrary field:
filter {
  mutate {
    add_field => { "source" => "Medium" }
  }
}
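Filters can also be made conditional on the event's contents. As a hypothetical example (not part of the original config), suppose we want to tag any event whose message is "error", using the mutate filter's add_tag option:

```
filter {
  if [message] == "error" {
    mutate {
      add_tag => ["problem"]
    }
  }
}
```

With the sample input above, only the third event would pick up the "problem" tag.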
Output
You can output to any text-based file you like. Let's tell Logstash to output events to our (already created) logstash_out.logs file:
output {
  file {
    path => "C:/temp/logstash_out.logs"
  }
}
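While iterating on a config, it can also be handy to print events straight to the console instead of (or as well as) a file. A sketch using the stdout output plugin with the rubydebug codec, which pretty-prints each event:

```
output {
  stdout {
    codec => rubydebug
  }
}
```

Multiple outputs can sit side by side in the same output block, so you can keep the file output and add this one alongside it.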
The result
Almost there. Our end configuration is:
input {
  file {
    path => "C:/temp/*.logs"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => json
  }
}
filter {
  mutate {
    add_field => { "source" => "Medium" }
  }
}
output {
  file {
    path => "C:/temp/logstash_out.logs"
  }
}
After navigating to the directory containing your Logstash install, run Logstash with our configuration like so:
logstash.bat -f C:\temp\logstash.conf
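Before running the pipeline for real, it can save time to validate the config syntax first. Logstash supports a --config.test_and_exit flag (short form -t), which checks the configuration and exits without processing any events:

```
logstash.bat -f C:\temp\logstash.conf --config.test_and_exit
```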
And we see the logstash_out.logs file is populated with lines that look like:
{"@version":"1","index":"12345","message":"hello world","path":"C:/temp/logstash_in.logs","id":1,"host":"HXXXXXXXXXXX","@timestamp":"2020-08-12T08:53:06.774Z","source":"Medium"}
Notice how Logstash has added some default fields (@version, path, host and @timestamp), but we also have our "source": "Medium" field, which we specified in the filter block. Winner.
Now you can experiment with the plugins and try out whatever config you like. We did a simple filter here, but it can get a lot more complex.
Troubleshooting
- If you are seeing errors about file locking, try deleting the .lock file located in your Logstash install directory.