
As we start the next part of the ELK for security analysis course, we're leaving ES behind for a little while, and we're going to focus specifically on Logstash. We're going to start, of course, with Logstash inputs.

Now if you remember our discussion about the ELK stack architecture, you'll remember that the centerpiece is really ES.

That's the database we want to get everything into. But Logstash is a critical part of that, because Logstash is what allows us to read data and actually send it to ES.

It's the communication director sitting in front of ES, sending it the data it needs and making sure that data is properly indexed.

So all the things we just did in the ES section, well, Logstash is going to be doing a lot of those for us. Of course, we have to have that ES knowledge to make Logstash do exactly what we want.

Now the great thing about Logstash is that it can read data from a variety of sources. I have a couple of them listed here.

It can read from files, such as log files; it can read from databases, whether SQL/relational databases or other ES databases; and it can actually even listen on the network for data.

So Logstash can actually go out and get data and pull it back to itself, or it can
simply listen for data to be pushed to it, which makes it quite powerful.

Now if we zoom in specifically on the Logstash function with this image from the Logstash documentation, you see we have really three processes that we have to deal with.

And I like how they have this set up, because it's very simple and provides a very clear workflow of what we're doing between our data source and ES. We have three processes: we have inputs, which specify how we get the data (whether we wait for something to be pushed to us at Logstash or we go out and get it); we have filters, where the data is changed and manipulated into the way we want it to be indexed by ES; and then we have outputs (in this case we'll generally be using an ES output plugin to send data to an ES database).
So all of these functions, inputs, filters, and outputs are based upon ‘plugins’.
And we're going to start with inputs.

Now as I mentioned we can get data to Logstash in a couple of different ways.


The first paradigm is ‘push’ to Logstash. This is where Logstash actually sits
there and just listens for data to be sent to it, and you can do this in a couple
different ways.

One is with Syslog. If you're familiar with Syslog, it's been around for quite a long time. It's a very common mechanism for moving logs around your network, and if you're already doing some type of SIM[1] work, then you probably use Syslog to some degree.

Syslog can push data to Logstash, which will listen for it using its syslog input plugin.
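Just as a hedged illustration (we don't configure this one in this lesson), a minimal syslog input might look like the following; the port here is a hypothetical choice, since the plugin's default of 514 requires root privileges:

    input {
      syslog {
        port => 5514    # hypothetical port; the plugin defaults to 514
      }
    }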

There's also another tool we've talked about a little bit, called Beats. Beats is actually an Elastic product, designed specifically for sending data into ES databases, usually through Logstash. So Beats is a small, lightweight client you can install on various types of technology to send logs and data to Logstash.
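For reference, a minimal beats input sketch; 5044 is the conventional port, not something the plugin mandates:

    input {
      beats {
        port => 5044    # conventional Beats port
      }
    }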

And finally, we have the ability with Logstash to simply listen on the network for certain data to come in. A common scenario that comes to mind here might be something like using NXLog[2], or even something like flow data.

A lot of flow collectors will collect the data and just send it over TCP or UDP, and you can actually configure Logstash to listen for that data on a specific port, then parse it and send it to its output plugin. There are a lot of options here in terms of having things push to Logstash and having it listen for data on the network.
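A sketch of what that listening might look like; both port numbers below are hypothetical and depend entirely on what your sender is configured to use:

    input {
      udp {
        port => 2055    # hypothetical port a flow exporter sends to
      }
      tcp {
        port => 5140    # hypothetical port for a TCP sender like NXLog
      }
    }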

On the flip side, we have the ability to 'pull' with Logstash, where we instruct Logstash to go out, grab the data, and bring it back itself.

[1] Security Information Management
[2] A multi-platform log collection and centralization tool that offers log processing features, including log enrichment (parsing, filtering, and conversion) and log forwarding.
And there are a couple different ways that this is commonly done. One of those
is simply by ‘grabbing’ files, and those files can be on the local file system, or you
can access those remotely through various types of remote access protocols.

Another option is a 'database pull'. By providing Logstash credentials to a database and an address, again whether it's local or remote, it can go out and read many types of databases, whether it's a SQL database or an Oracle database or any of those. And of course, one of those is an ES database; Logstash can read that as well, either locally or remotely.
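As a rough illustration of a database pull, a jdbc input looks something like this; every value below (driver path, connection string, credentials, query) is a hypothetical placeholder:

    input {
      jdbc {
        jdbc_driver_library => "/path/to/mysql-connector-java.jar"    # hypothetical driver path
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://db.example.com:3306/logs"
        jdbc_user => "reader"
        jdbc_password => "changeme"
        statement => "SELECT * FROM events"    # hypothetical query
      }
    }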

In our examples in this course, we've got everything installed on the same machine in terms of ES, Logstash, and Kibana. So you'll see that we're using Logstash to send data to an ES database locally, but it can also read from it locally too, so we have both options there.

Now, I covered a couple of input plugins on that last slide, but I just want to show you here the Elastic documentation and all of the different input plugins that are supported.

You'll see a few I mentioned. 'beats', obviously, is right here at the top. We have an 'elasticsearch' plugin, but we really have so much more here. You see we have 'cloudwatch' so we can connect to Amazon Web Services, we have an 'eventlog' plugin so we can pull directly from the Windows event log, we see 'file' here, it looks like we have the ability to connect into GitHub and read from a GitHub webhook, we can connect to 'heroku'[3], we can receive events over HTTP or HTTPS which can be pretty useful, the ability to read events from 'irc', 'jdbc' sources, and so on. This page goes on for quite a while; we even have 'kafka', we have 'kinesis' for a lot of your big data processing, 'log4j', a lot of stuff, even an 'rss' feed.

I show this page just to demonstrate that there are a lot of ways to get data into ES via Logstash. There's even a 'stdin' plugin, so you can pipe things through standard input; if you have a custom application you've written, maybe that's the simplest way to do it.

Lots of options here. I've run into very few scenarios where you can't figure out a way to get data into ES via Logstash.

[3] A cloud platform as a service (PaaS) supporting several programming languages.
I show this so you can see all the different options. We're going to go through a lot of these in the course, though not all of them, of course. There are way too many to do here, and some of them are kind of edge cases.

We're going to go through a lot of the common ones, especially a lot of the
common ones that I think are particularly appropriate for digital incident
response and network security monitoring.

I've switched over to my terminal now. Let's go through an actual example and look around at what we have to work with in Logstash. Of course, this is the lab environment we've already installed, so if you followed along, you should have a Logstash installation already up and running, and I want to show you a couple of things here.

First, in terms of the directory structure, I want to show you where the binary for Logstash resides, and that's going to be in '/usr/share/logstash'. That's the directory where the binary resides, and you see there's actually a 'bin' directory. That's where we can find the Logstash binaries that we need to use and all the things associated with them. So we may call upon that later on.
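If you want to peek at that directory yourself, something like this will show the binaries (the exact contents vary a bit by version):

    ls /usr/share/logstash/bin
    # logstash  logstash-plugin  ...plus various helper scripts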

If we were running Logstash manually, we may need to run it from here. We're not doing that, of course; we configured Logstash to run automatically when we booted the system up, and we also configured it to automatically check our configuration directory for updates.

So we don't have to constantly start it and stop it; we don't need to do any of that. We've set that up well for this particular environment.

Now I'm going to go ahead and change directories into the '/etc/logstash' directory. This is actually the configuration directory, and it's the directory we're going to be working out of the most, because it contains the things we need.

So we've already looked at 'logstash.yml' earlier to configure the setting where it would automatically pick up configuration changes.
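For reference, the relevant settings in '/etc/logstash/logstash.yml' likely look something like this, based on the three-second reload behavior described in this lesson (older Logstash versions take the interval as a plain number of seconds):

    # check the pipeline configuration for changes every three seconds
    config.reload.automatic: true
    config.reload.interval: 3s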

So that's great. That's a good place to start, but the thing I want to do now is go
ahead and create a simple configuration, and to do that I’m going to create a
simple template file.
I want to create a template file that we can use to build other configurations
from moving forward in this course. And it's going to be a really simple one. It's
not going to be too crazy.

So I’m going to go ahead and do ‘nano’ and let's call this file ‘template.config’.

Now notice when I do that, look down here at the bottom: permission denied. Why is that? Well, in this case, it's because I don't have permission under my current user to write into this directory. It's protected via the Logstash user account.

So what I'm actually going to do here is 'sudo bang bang', which is just going to say 'sudo nano template.config'. I enter my password here, and that will let me create an editable file.
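In other words, the shell exchange goes something like this; '!!' is just bash history expansion for 'the previous command':

    nano template.config    # fails: permission denied
    sudo !!                 # the shell expands this to: sudo nano template.config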

Now with that said, the template is going to be really simple. What were the three components of Logstash? They were input, filtering, and output. So that's all we're going to do here. We're going to write 'input' and create an object, we're going to write 'filter' and create an object, and then we're going to write 'output' and create an object.
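On screen, that gives us:

    input {
    }

    filter {
    }

    output {
    }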

There you go. That's our template. Not too crazy. Everything we do with Logstash will follow this template. We will always have some type of input, and we'll generally have some type of filter and some type of output. A filter isn't always required, but you do want an input and an output, or you're not really doing anything with Logstash.

So when Logstash runs, it reads these configuration files, and it expects to have something to do there. Of course, ours doesn't have any configuration files currently enabled for it, so it's not doing anything. And the way we can check that is by looking at this 'conf.d' directory. Notice, if I 'ls' it, it's empty.

What we do here is, whenever we build a new configuration file we want Logstash to use, we simply place it in the 'conf.d' directory. Every three seconds, Logstash will read that directory looking for files or changes to files, and if there's a 'config' file in there, it will go ahead and execute it and try to start pulling in data.
So if I were to take our 'template.config' and move it into that 'conf.d' folder, it would attempt to execute. Of course, that file doesn't have anything in it right now, so there's not much to be done, and it isn't actually going to produce anything meaningful. But we're going to get there, and we'll do that right now.

So as our first example I’m actually going to repeat one that we looked at (at the
very beginning of the course) when I gave the quick demo of all the components
at once. We're going to dive into that one a little bit more, since you've already
seen it and it's a pretty simple example.

So in this case, what we want to do is look at the 'auth.log', which can be found on this UNIX system at '/var/log/auth.log'. I just tailed that out, and you see we have things we expect to see.

We see some sessions being established. We see me using 'sudo' here; of course, you've seen me do that, so we expect to see this here. The log will also have SSH logins, logoffs, and so on.

So this is authentication to the local system we have the ELK Stack installed on.
So this should be pretty straightforward. Now to do that, we're just going to
create a new ‘config’ file.

I'm going to take our template and copy it, so 'sudo cp template.config authlog.config', and we should have that file now. I'm not going to put it in the 'conf.d' directory yet, because it's not functional. I want to get it to where I think it works and then move it into the 'conf.d' directory, and we'll see if we can pull in data.

So let's go ahead and edit that file, 'authlog.config'. We have our template, and we can go ahead and put some things in here. Now, we're focused on input right now, not on filtering, so I'm actually not going to put any filtering statements in here whatsoever. I am going to go down to output, though, and put in a simple output. This is what we'll use for pretty much all the examples, at least until we start to focus specifically on output.

And this is going to be a pretty simple one. We're just going to send all the logs to our local ES instance, which is listening on port 9200 on localhost, so pretty straightforward. I'll go ahead and save that so I don't lose it.
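For reference, that output section likely looks like this; it just points at the local ES instance and accepts the plugin's defaults for everything else:

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
      }
    }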
So we're not going to put anything in the filter. We're going to focus exclusively
on input, and the question is ‘how do we retrieve input from a local file’.

To figure that out, let's actually flip back over to the Logstash documentation and look at the input plugins. The general process you're going to take when you want to ingest a certain type of input is to go to this page and look for a plugin that makes sense.

In this case, we're going to use the file plugin, which streams events from files. That's what we want, since we have a file-based log locally on the system.

Now we have some description information here. It tells us what we can do, but I'm going to scroll down here to this area.

And this is important, because it tells us what information is absolutely required, and what additional information we can provide.

In this case, we look at the required column. The only thing that is required here is the path to the file, which makes sense.

If we want to index the file, we need to tell Logstash where that file is. So we provide the path variable, and it looks like it's an array, if we want to provide multiples. But it looks like that's the only required thing we have to provide.

Now the rest of this is where we can get a little more granular: we can change the max number of open files, ignore older things, exclude certain things, and so on.

I think in this case the only thing I want to use is this start position option, where you can specify the string 'beginning' or 'end'. And I want to tell it to start from the beginning, because I want everything in the file.

I've never indexed the file before, so I want everything from the beginning of it. I don't just want it to bring in the new stuff; I want everything.

So what I'm going to do is specify the path and the start position in this file input plugin.

I'm going to switch back to my template here. And under the input section, the file plugin is simply called 'file'. It's pretty straightforward, isn't it?

Let's go ahead and close this. It's not quite as nice using 'nano' here as it was when we were using Kibana to do some of this JSON work, where it was automatically closing our brackets for us; we're doing that manually here.

Now I'm going to go ahead and specify the required parameter, in this case the path parameter, and I'm going to tell it to look at '/var/log/auth.log'. Let's close that out, and let's set our 'start_position', which is something we talked about a minute ago, to 'beginning'.

So that is basically telling Logstash that we want to use the file input plugin, we want to read from the path '/var/log/auth.log', and the start position is 'beginning'. We should be pretty good to go with that, and good to go to start trying to read this file, once we move this config into the appropriate directory.
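Putting all of that together, the finished 'authlog.config' should look something like this:

    input {
      file {
        path => "/var/log/auth.log"
        start_position => "beginning"
      }
    }

    filter {
    }

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
      }
    }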

So I'm going to move 'authlog.config' to 'conf.d'. It's going to give me an error, because I don't have permission. So we'll run that again with 'sudo', and now if we 'ls conf.d', the 'authlog.config' is in there.
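With the permissions sorted, that comes down to something like:

    sudo mv authlog.config conf.d/
    ls conf.d/    # authlog.config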

So things ought to be running. Of course, Logstash should pick that up within three seconds. So let's see if we can try to verify that.

Now, we could just go straight to Kibana, but the thing I always do when I deploy a new config is check the logs for Logstash, and that's a bit of a tongue twister.

Logstash generates its own logs about its activities and the parsing it's doing of these configuration files, and I want to check those to see if there are any errors. Chances are, if there's any stupid error I've made, it's going to show up there, and I'm going to be able to learn more about it there.

So it's really the first place I go. Now, to do that, we're going to 'tail /var/log/logstash/logstash-plain.log', which is the Logstash log where it produces its output. If I run this, it looks like we have a couple of things here, and if we start at the bottom, which is most recent, this looks interesting, and this looks like something we need to address.

It says: “failed to open /var/log/auth.log: permission denied”.

Well, that brings up an important point: if we're reading a file, it's not our user account reading it. Logstash is not running under my 'sanders' user account, and it's not running under 'sudo'. It's actually running under a 'logstash' user in a 'logstash' group.

So if we want Logstash to be able to access a file on the system that it doesn't inherently have permission to, we need to give it permission to that file.

This is, you know, basic file system stuff. They always say if you can't figure out the problem, it's probably a permissions issue, and that's exactly what we have going on here: a permissions issue.

So we need to make sure Logstash is able to read the file we want it to read. To solve this problem, in this case I'm simply going to add the 'logstash' user to a specific group that will give it the ability to read these particular log files in '/var/log', and that's going to be the 'adm' group.

So I'm going to go ahead and do that here, and that should fix my problem. I will need to go ahead and actually restart the Logstash service, just to make sure my changes go into effect and the appropriate permissions are applied.
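On a typical Ubuntu-style install, those two steps likely look like this; the service name assumes the systemd unit the Logstash package set up:

    sudo usermod -a -G adm logstash    # add the logstash user to the adm group
    sudo systemctl restart logstash    # restart so the new group membership takes effect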

Now let's go ahead and tail our Logstash log again, and it looks like we don't have any errors here. The pipeline started successfully, the Logstash API endpoint started, and there are no permissions issues.

So I think we should be good to go here let's actually flip over to Kibana and
check this out.

Let's go ahead and cat out the indices we have available, and it looks like we do have a new index: 'logstash-2017.07.18'. We didn't specify any type of index name in our output, so one was just automatically created. It looks like one index was created, and that means we should have some data we can look at.
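From the Kibana console we used in the ES section, that check is simply:

    GET _cat/indices?v

and the response should include a row for the new 'logstash-2017.07.18' index.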

I'll flip over here to the discover tab, and it looks like we now have a time filter option, because our pattern actually matches the index name. So I'm going to go ahead and create that index pattern, and it looks like we have actual fields we're using here. So that looks pretty good. It looks like we actually have real data.
If I flip back to discover, sure enough, we have data. Great. This is the first time we're actually seeing our own data in Logstash here. So we see things we expect to see: we see the path '/var/log/auth.log', we see a timestamp, which is when the ingestion occurred, the version, the host, the actual message, the name, and we have some other metadata here as well.

But for the most part, we have the data we want, and we can actually start searching it. I could just search for my name here, 'sanders', and sure enough it appears in pretty much every log, because I'm the only user using the system.

But we have what we want to see here. We see some sudo action going on, so we can actually see the commands being run and so on. Pretty nice.

So that was our first Logstash config, and it worked. It was a very simple one: just reading from a log file and sending it straight to ES, with no filtering or additional manipulation.

So we're making progress, and we're well on our way to having our full analysis suite set up using the entire ELK stack: Logstash, ES, and Kibana.

So let's recap the process we used here to get input data into Logstash. This is really a repeatable framework I want you to get familiar with, because you're certainly going to use it in this course.

But in real life, these are the questions I want you to ask yourself anytime you want to get data into Logstash:

First, what data do I want, and what's its value? (We talked about that earlier when we discussed the applied collection framework.)

From there, where is it located? Where does that data sit? Is it on the local system or on a remote system, and what mechanism needs to be used to access it?

From there, what plugin can I use to get it? That's where we go to the Logstash documentation and look at that big, long page I showed you of all those various input plugins, and figure out what method is going to be best. Maybe you need to have Logstash listen for data that a remote system will send; maybe you need to have Logstash go out and get it. The options are nearly endless, and it's really about what makes sense with your system and network architecture. But you have to figure out what plugin you can use to get it.

And finally, what are the requirements of that plugin? What fields are required? What do I absolutely have to have?

And from there you can go to what's optional: what can make this process a little bit better and represent my data more in the way I would like to look at it?

So if you can answer these four questions, you can get pretty much any data
into Logstash.

This lesson was really your first introduction to how to use Logstash at a detailed level. First of all, we learned that Logstash is really the central communication hub for data between all the evidence sources you have and ES. It's a critical piece of the pathway to getting data ready for analysis and investigation.

From there we learned about input plugins, and how they're used to introduce
data to Logstash.

We looked at a whole lot of those, and in particular we examined the file plugin, which will read logs from a file. I showed you how to use it to read data from a local log file, and we made sure we were able to get that data in once we had fixed a little permissions issue.

We looked at the Logstash config files and the '/etc/logstash' directory, and we also looked at the '/etc/logstash/conf.d' directory, which is where we place configuration files in order for them to be read by Logstash.

And of course, we've configured Logstash to read that directory every three seconds, so we're not having to constantly restart things, which is certainly beneficial.

Of course, we also looked at the Logstash log file itself, where we can find errors. We had one here, which is great, because it was a good example for you to see what kind of error we were dealing with and how to track it down.

Now, this was just a very simple Logstash example, about as simple as it gets. We're going to do a lot more with Logstash. We're going to look first of all at more and varied types of input, then we'll eventually get on to filtering, which is where things get really crazy, and finally output, where we can send data to a lot of different destinations.

So our Logstash journey has really just begun. Hold on, because we've got a lot more to cover.
