
Testing Logstash configuration with JSON input/output

Aug 12th, 2020 | 4 min read

Logstash is a data processing pipeline that allows you to collect data from various sources, then transform and send it to a destination. It is most commonly used to send data to Elasticsearch (an analytics and search engine), which can then be viewed using Kibana. Together Elasticsearch, Logstash and Kibana form the ELK stack.

Logstash uses configuration files to configure how incoming events are processed.

The problem

Recently, I wanted to test out a Logstash configuration file locally in the simplest possible way.

I had no pieces of the ELK stack installed or set up, and minimal time to push a fairly complex Logstash config change out to a remote environment. I didn't have time to faff about installing the full ELK stack and getting each component talking to the others. What I wanted to achieve was simple:

Given an incoming log in a JSON format, apply a Logstash configuration and view the output event(s).

Online documentation/posts seem to be based on Linux environments — fair enough since most production Logstash instances will be deployed in a Linux environment. It is worth pointing out that I was trying to achieve this on Windows. The end results won’t look wildly different but I did have some teething issues.

Setup

For demo purposes I’m going to work from an arbitrary C:\temp directory.

Firstly, create 3 blank files in C:\temp:

  • logstash.conf — will be used to declare our config we want to test out.
  • logstash_out.logs — where we will store the logstash config output.
  • logstash_in.logs — input logs for logstash config to consume.

Installations

You do not need Kibana or Elasticsearch installed as we are going to store the output in a local file.

You do need Logstash itself. I installed it with Chocolatey, as I would recommend for most things on Windows. The install includes a logstash.bat file which is used to run Logstash and lets you specify the configuration.

Please take note of the install location! My binaries were installed under C:\ProgramData\chocolatey\lib\logstash\tools

You will also need Java installed as this is what Logstash uses to run.

Input

Inside the log file should be a list of input logs, in JSON format — one per line. Logstash will consume each line as a separate event.

Let’s add the following to our logstash_in.logs file

{"index":"12345","id":1,"message":"hello world"}
{"index":"12345","id":2,"message":"help me"}
{"index":"12345","id":3,"message":"error"}

Then add the following to the logstash.conf file

input {
  file {
    path => "C:/temp/*.logs"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => json
  }
}

path Here, we are telling Logstash that the input comes from all .logs files in the C:\temp directory. There is only one in our example.

codec There is nothing special about the .logs extension. The events are consumed as plain text - it is the codec that indicates the format to Logstash (JSON in our example).

start_position We have specified that Logstash should start processing from the start of the list of events.

sincedb_path We specify NUL for the sincedb_path. Logstash stores a pointer recording how far through each file it has read; specifying NUL (on Windows) discards that pointer, so the whole file is always re-read. I have been left confused before, wondering "why are my input logs not being consumed?", when in fact Logstash thought it had consumed them already.
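If you do want Logstash to remember its position between runs, point sincedb_path at a real writable file instead. A sketch of that variant (the C:/temp/sincedb location here is just an example, not a required path):

```
input {
  file {
    path => "C:/temp/*.logs"
    start_position => "beginning"
    # Persist the read pointer so already-consumed lines are skipped on restart
    sincedb_path => "C:/temp/sincedb"
    codec => json
  }
}
```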

Note also that all slashes in the path are forward, not backward! This caught me out on my Windows machine, where I am used to backslash.
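As an aside, if you would rather type or pipe events in interactively than read them from a file, the stdin input plugin with the same codec works too — a minimal sketch:

```
input {
  stdin {
    # Each line entered on standard input is parsed as one JSON event
    codec => json
  }
}
```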

Plugins

This is where you can add whatever custom logic you like and is what you are probably trying to experiment with. Let’s add to our logstash.conf file to do something trivial, like adding an arbitrary field

filter {
  mutate {
    add_field => { "source" => "Medium" }
  }
}

Output

You can output to any text-based file you like. Let's tell Logstash to write events to our (already created) logstash_out.logs file

output {
  file {
    path => "C:/temp/logstash_out.logs"
  }
}
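While experimenting, it can also be handy to see events in the console as they are processed. The stdout output with the rubydebug codec pretty-prints each event, and you can use it alongside (or instead of) the file output:

```
output {
  stdout {
    # Pretty-print each event to the console for quick inspection
    codec => rubydebug
  }
}
```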

The result

Almost there. Our end configuration is

input {
  file {
    path => "C:/temp/*.logs"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => json
  }
}

filter {
  mutate {
    add_field => { "source" => "Medium" }
  }
}

output {
  file {
    path => "C:/temp/logstash_out.logs"
  }
}

After navigating to the directory containing your Logstash install, run Logstash using our configuration like so

logstash.bat -f C:\temp\logstash.conf

And we see the logstash_out.logs file is populated with lines that look like

{"@version":"1","index":"12345","message":"hello world","path":"C:/temp/logstash_in.logs","id":1,"host":"HXXXXXXXXXXX","@timestamp":"2020-08-12T08:53:06.774Z","source":"Medium"}

Notice how Logstash has added some default fields (@version, path, host and @timestamp), but we also have our "source": "Medium" field, which we specified in the filter block. Winner.

Now you can experiment with the plugins and try out whatever config you like. We used a simple filter here, but it can get a lot more complex.
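For example, here is a sketch of a slightly more involved filter (assuming the message field from our sample input) that uses a conditional to tag only certain events:

```
filter {
  # Runs for every event, as before
  mutate {
    add_field => { "source" => "Medium" }
  }

  # Only events whose message is exactly "error" get tagged
  if [message] == "error" {
    mutate {
      add_tag => ["alert"]
    }
  }
}
```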

Troubleshooting

  • If you get errors about file locking, try deleting the .lock file located in your Logstash install directory
