
1st Module:
===========
Machine Data (everywhere; constantly flowing data generated as we interact with machines)
structured/unstructured
How Splunk uses it
makes up more than 90% of industry data.
data may be anything (not only generated by web servers)

without Operational Intelligence --> finding answers takes much more time

with Operational Intelligence (using machine data) --> minutes


pinpoint/correlate/alert
Splunk:
--> Splunk can take any data, add it to an intelligent, searchable index, bring
structure to unstructured data, and allow you to extract all sorts of insights into your business.
--> application/security/user behaviour/hardware monitoring/sales
--> the translator to your machine data, making you its master

2nd Module:
===========
Splunk Components (5 main functions of Splunk Enterprise and how it makes your
machine data available, accessible and usable to everyone in the organization)
Index Data
heart of Splunk
the index collects data from virtually any source, and it is important
think of the index as a factory and machine data as raw materials
as data enters, Splunk inspects it to determine how to process it (e.g. apache.log)
when it finds a match, it labels the data with a source type
once the source type is applied, the data is broken into single events
timestamps are identified and normalized to a consistent format
events are then stored in the Splunk index, where they can be searched
Search & Investigate
by entering a query in the Splunk search bar
we can find events that contain values across multiple data sources, allowing us to
analyze and run statistics on the events using the Splunk search language
Add Knowledge
we can add knowledge objects to the data
this allows you to affect how your data is interpreted (give it
classification, enrichment and normalization, and save reports for future use)
Monitor & Alert
Splunk can proactively monitor all your infrastructure in real time to identify
issues, problems and attacks before they impact customers and services.
we can create alerts to monitor for specific conditions and automatically respond
with actions
Report & Analyze
Splunk allows us to collect reports and visualizations into dashboards,
empowering groups in the organization by giving them the information they need, organized

Splunk processing components: (3 main)


Indexer
processes incoming machine data, storing the results in an index as events
as the indexer indexes data, it creates a number of files organized in sets of
directories by age
this organization is important for search.
when we search, Splunk only opens the directories that match the time frame
of the search, making the search more efficient
Search Head
allows us to use the Splunk search language to search indexed data
handles search requests from users and distributes the requests to the indexers, which
perform the actual searches on the data
the SH then consolidates and enriches the results from the indexers and returns them to users
tools
dashboards
reports
visualizations
Forwarder
Splunk Enterprise instances that consume data and forward it to the indexers for
processing
require minimal resources
have little impact on performance
reside on the machines where the data originates

webserver --> forwarder installed on server --> sends data to indexers

Deploying and scaling Splunk:


Splunk instance
Single-instance deployment
one instance of Splunk Enterprise handles all functions of Splunk:
input
parsing
indexing
searching
perfect environment for
proof of concept
personal use
learning
might serve the needs of small, department-sized environments

Clustering in multi-instance deployments
Splunk Enterprise can be scaled to fit any environment by distributing these functions across multiple instances

3rd module:
===========
Installing Splunk Enterprise
software from splunk.com (Free Splunk)
commands:
./splunk start --accept-license (starts Splunk, accepting the license agreement in one step)
./splunk stop (stops the running instance)
./splunk restart (restarts the instance, e.g. after configuration changes)
./splunk help (lists available CLI commands)
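
Also useful (assuming a standard Splunk Enterprise install; check ./splunk help on your version):
./splunk status (reports whether the splunkd process is currently running)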

linux/windows/mac
Splunk cloud

Splunk roles --> determine what a user is able to see, do and interact with.
Administrator (can install apps and create knowledge objects for all users)
Power (can create and share knowledge objects for users of an app and do
real-time searches)
User (will only see their own knowledge objects and those shared with them)

Admin/Power/User

admin/changeme/Ansrloader1@3
power users created:
subbukandula/5p1unkbcup

Users can launch and manage apps from the Home app.
The Search & Reporting app ships with Splunk Enterprise.

4th module:
===========
Getting Data in

Types of data input


Adding data is done by the admin user for the deployment.

upload
allows us to upload local files to the Splunk Enterprise instance; they only get
indexed once
good for data that is created once and never gets updated
monitor
allows us to monitor files, directories, HTTP events, TCP/UDP network ports, and data-
gathering scripts
located on the Splunk Enterprise instance.
windows-specific inputs
event logs
file system changes
performance metrics
network information
forward
we can receive data from external forwarders (e.g. forwarders installed on web servers)

Data upload input option:


data upload --> file input --> source type --> locate timestamp --> host name
--> index (create new index)
indexes are directories where data is stored.
by default users store all events in the main index, allowing one index to be used to search all the
data
1) having separate indexes can make searches faster (multiple indexes).
Web Data index
Main index
Security index
eg index=web_data_index fail*
(limits the amount of data Splunk searches, returning events only from that index)
2) multiple indexes allow limiting access by user role
we can control who sees what data
3) separate indexes allow custom retention policies
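
A hedged example of searching two indexes at once (index names are the hypothetical ones from the list above; the parentheses keep the OR from mixing with the other search terms):
(index=web_data_index OR index=security_index) fail*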

Monitor option:
files and directories --> browse to a log file and select --> continuously monitor/index
once
--> whitelist and blacklist --> app context --> hostname --> index

Source types

Splunk uses source types (not forwarders) to categorize the type of data being indexed


using the source type, Splunk knows where to break the event, where the timestamp is located and how to
automatically create field/value pairs

5th module:
===========
Basic Searching (Search and Reporting interface):
--> for searching and analyzing data
--> used to create knowledge objects
--> reports
--> dashboards
Data Summary
--> source (file/directory)
--> source types
--> host (hostname)

Commands that create statistics and visualizations are called transforming
commands
by default a search job will remain active for 10 minutes
a shared search job remains active for 7 days
export --> raw/csv/xml/json
search modes
--> Fast mode (field discovery is disabled in Fast mode)
--> Smart mode
--> Verbose mode
when selecting or zooming into events, Splunk uses your original search job
when we zoom out, Splunk runs a new search job to return the newly selected events

Exploring the events list:


timestamps in events are displayed based on the time zone set in your account

Search everything:
wildcards
--> fail*
--> booleans (NOT, OR, AND); order of evaluation: 1. NOT 2. OR 3. AND
failed NOT password
--> exact phrases: "failed password"
--> escaping characters (use backslash \)
info="user \"chrisV4\" not in database"

6th Module:
===========
Using Fields
Fields sidebar --> shows all the fields that were extracted at search time
--> selected fields (of utmost importance)
they appear at the bottom of events
--> interesting fields (we can add these fields to appear at the bottom of events)
have values in at least 20% of the events
a date_wday (a denotes string)
# date_month (# denotes numeric)
a dest 4 (it contains string values, and it has 4 distinct values)
clicking on a field shows
--> values
--> count of values
--> percentage
wildcards can be used in field searches

Searching fields: (search using fields)


sourcetype=linux_secure
field names are case sensitive; field values are not.
=, != --> used for numerical or string values
<, <=, >, >= --> used for numerical values
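
Two illustrative field searches (status and price are fields from the course sample data used elsewhere in these notes):
sourcetype=access_combined status!=200
sourcetype=access_combined price>=100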

7th module:
===========
Best Practices:
--> using a time limit is the most efficient way to search (Last 7 days --> good, All Time -->
bad)
--> the less data you want to search, the faster Splunk will be
--> the more you tell the search engine, the more likely you will get good results.
--> inclusion is better than exclusion in a Splunk search.
Using Time:
several preset time selections
real-time search (e.g. a sliding window over the last 10 minutes)
advanced: (-30m) -30s -30h d w mon y
@ in a time modifier rounds down to the nearest unit of the specified time.
(-30m@h run at 9:37 --> the search starts from 9:00)
time abbreviations tell Splunk what time to search
Quiz: Time to search can only be set by the time range picker --> false
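
Time can also be set inline with the earliest/latest time modifiers (a sketch; the sourcetype is from the course sample data):
earliest=-30m@h latest=now sourcetype=access_combined
(run at 9:37, this searches from 9:00 up to now)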
Using Indexes
specify an index to search for specific events
separate indexes give you:
multiple retention policies
faster searches
the ability to limit access

8th Module:
===========
SPL Fundamentals

Splunk search language (built from 5 components)


search terms
Commands --> tell Splunk what we want to do with search results
(computing statistics, formatting)
Functions --> explain how we want to chart, compute and evaluate results
Arguments --> variables that we want to apply to functions
Clauses --> explain how results are grouped and redefined.

eg: sourcetype=acc* status=200 | stats list(product_name) as "Games Sold"

sourcetype=acc* status=200 --> search terms


stats --> command
list --> function
product_name --> argument
as --> clause
| --> pipe --> passes the results of the previous search to the next component

Visual pipeline:
boolean operators/command modifiers --> orange
commands --> blue
command arguments --> green
functions --> purple
ctrl + \ --> moves each pipe to a new line

Limits explained:
syntax:
search terms | command/function | command/function

Fields command: (fields --> command)


includes or excludes specific fields from search results.
useful to limit the fields displayed and can make searches faster

sourcetype=access_combined
| fields status clientip (results will include only status and clientip in the
search results)
| fields - status clientip (results will exclude status and clientip from the
search results)

field extraction is one of the most costly parts of searching in Splunk


field inclusion happens before field extraction and can improve performance
field exclusion happens after field extraction, only affecting displayed
results.

Table command:
displays searched data in a tabulated format

sourcetype=access_combined
| table JSESSIONID, product_name, price

Rename command:
used to rename fields

sourcetype=access_combined
| table JSESSIONID, product_name, price
| rename JSESSIONID as "User Session",
price as "Purchase Price"
| fields - "User Session" (use the renamed name)

once renamed, the original name is not available to subsequent search commands


new field names will need to be used further down the pipeline

Dedup command:
removes events with duplicate values.

sourcetype=access_combined
| dedup product_name (or on multiple fields: | dedup product_name price)
| table JSESSIONID, product_name, price

Sort command:
displays results in ascending or descending order.

sourcetype=access_combined
| table JSESSIONID, product_name, price
| sort product_name price

string data is sorted alphanumerically; numeric data numerically


default is ascending order.

sourcetype=access_combined
| table JSESSIONID, product_name, price
| sort limit=20 + product_name price (ascending)

limit=20 --> displays only the first 20 results

sourcetype=access_combined
| table JSESSIONID, product_name, price
| sort - product_name price (descending)

Quiz: Excluding fields using the fields command will benefit performance --> No
(exclusion happens after field extraction, so only inclusion helps performance)

9th Module:
==========
Transforming commands (top/rare/stats commands)
(order search results into a data table for statistical purposes)

Top command:
finds the most common values of a given field in the result set

e.g. which vendor sold the most products this week?


sourcetype=access_combined
| top vendor
the top command automatically adds count and percent columns and limits results to 10 by
default

sourcetype=access_combined
| top vendor limit=20

sourcetype=access_combined
| top vendor limit=0 (gives all results)

sourcetype=access_combined
| top vendor product_name limit=20 (top results of vendor and product name)

top command clauses:


limit = int
countfield = string
percentfield = string
showcount = True/False
showperc = True/False
showother = True/False
otherstr = string

sourcetype=access_combined
| top product_name by vendor limit=20 (shows the top 20 products sold by each vendor)
countfield="number of sales" showperc=False

Rare command:
same options as the top command
shows the least common values of a field set

sourcetype=access_combined
| rare vendor product_name limit=20

Stats command:
count --> returns the number of events matching the search pattern
distinct count --> count of unique values
sum --> returns the sum of numeric values
average (avg) --> returns the average of numeric values
list --> lists all values of a field
values --> unique values of a given field.

Count function:
sourcetype=vendor_sales
| stats count (gives the event count)

sourcetype=vendor_sales
| stats count as "Total Sales"
by product_name (gives the count by product_name)

sourcetype=vendor_sales
| stats count(action) as ActionEvents, count as TotalEvents
(count(action) counts only events where the action field is present)
Distinct count function:
returns the count of unique values in the search results.

sourcetype=vendor_sales
| stats distinct_count(product_name) (shorthand: | stats dc(product_name))

Sum function:
returns the sum of all numeric values in a field

sourcetype=vendor_sales
| stats sum(price) as "Gross Sales" by product_name

sourcetype=vendor_sales
| stats count as "Units Sold", sum(price) as "Gross Sales" by product_name
(count and sum must be in the same stats command/pipe)

Avg function:
calculates the average value of a numeric field

sourcetype=vendor_sales
| stats avg(seller_price) as "Avg Selling Price" by product_name

List function:
lists all values of a given field

sourcetype=vendor_sales
| stats list(Asset) as "Company Assets" by Employee

Values function:
returns the unique values of a given field

sourcetype=vendor_sales
| stats values(s_hostname) by user_name
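
A sketch contrasting list and values in one search (field names here are illustrative; list keeps duplicates, values de-duplicates):
sourcetype=vendor_sales
| stats list(product_name) as "All Products Sold", values(product_name) as "Unique Products" by vendor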

10th Module:
============
Reports and Dashboards

Reports:
save as --> report (report will be created)
Reports tab in the application menu
we can edit permissions
schedule reports
bar chart / statistical table

Visualizations:
to visualize our data in many ways
Visualization tab in search
save a visualization as a report or dashboard panel

Dashboards:
a collection of reports gathered together into a single pane of glass, allowing quick visual
access to data
search --> visualization --> save as dashboard panel --> new --> save
Dashboards tab in the application menu
the time range picker will only work on panels with an inline search
Quiz: If a search returns this, you can view the results as a chart --> statistical values
user/power/admin roles can create reports
a time range picker can be included in a report

11th module:
============
Pivot
allows users (who don't know Splunk) to design reports in a simple-to-use interface
without ever having to write search strings
Data models:
knowledge objects that provide the data structure that drives pivots
created by admin/power users
think of the data model as the framework and the pivot as the interface to the data
each data model is made of datasets (smaller collections of data defined for a
specific purpose; they are represented as tables)
Settings menu --> Data models --> select a pivot to see its datasets
save a pivot as a report/dashboard panel
Instant pivot
application menu
search --> statistics/visualization --> pivot --> save options --> ok
Datasets
allowing users to access small slices of data can help them gain operational
intelligence from the data without using the search language
Datasets menu
displays the list of datasets
explore --> to visualize data in pivot
datasets help users find data and get answers faster
Splunk Datasets Add-on

Quiz: Adding child data model objects is like the ______ Boolean in the Splunk search
language.
--> AND

12th module:
===========
Lookups (allow you to add fields and values to your events that are not included in the indexed
data)
we can combine fields from external sources with indexed, searched events
CSV files, scripts, geospatial data
when using a .csv file for lookups, the first row in the file represents the
field names
data useful for a search might not be available in the index
a lookup is categorized as a dataset
two steps to set up a lookup file:
1) define a lookup table
2) define a lookup
optionally, configure your lookup to run automatically
lookup field values are case sensitive by default

Lookup table: (to use an external csv file; inputlookup --> command)


settings --> lookups --> add new (lookup table files) --> search (destination
app), choose file, destination filename (http_status.csv) --> save --> permissions
| inputlookup http_status.csv
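
For reference, a minimal sketch of what http_status.csv might contain (hypothetical rows; the first row holds the field names code and description used in the lookup below):
code,description
200,OK
404,Not Found
503,Service Unavailable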

Define lookup: (create a definition for the lookup table created)


settings --> lookups --> add new (lookup definitions) --> search, http_status,
file-based, http_status.csv --> save

Lookup command:
sourcetype=access_combined NOT status=200
| lookup http_status code as status (the lookup's code field is matched against the event's status field)
OUTPUT code as "HTTP Code",
description as "HTTP Description"
OUTPUT --> overrides existing fields
OUTPUTNEW --> does not override existing fields
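
A variant of the search above using OUTPUTNEW, so any existing description field in the events would be preserved (illustrative):
sourcetype=access_combined NOT status=200
| lookup http_status code as status OUTPUTNEW description as "HTTP Description"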

Automatic lookup:
settings --> lookups --> add new (automatic lookup)
code will be mapped to status automatically for a particular sourcetype

Additional lookup options


populate a lookup table with search results
define a lookup based on an external script or command
use the Splunk DB Connect app
use geospatial lookups to create queries that can generate choropleth
map visualizations
populate events with KV Store fields

13th Module:
===========
Scheduled Reports and Alerts
scheduled reports are reports that run on a scheduled interval and can trigger an action
each time they run
useful for creating monthly reports, powering dashboards and automatically sending reports by
email

Create scheduled reports:


create a report (save as in search)
use the schedule option to schedule it
running concurrent reports, and the searches behind them, can put a big demand on your
system hardware, even if everything is configured to the recommended specs
include a schedule window only if the report doesn't have to start at a specific
time

Manage scheduled reports:


settings --> searches, reports and alerts
permissions are particularly important for reports
before a report can be embedded, it must first be scheduled
anyone with access to the web page will be able to see an embedded report

Alerts intro
based on searches that run on scheduled intervals or in real time
alerts are triggered when specific conditions are met; they can:
be listed in the interface
log events
send emails
trigger scripts
use a webhook
run a custom alert action
an alert is an action triggered by a saved search

Create alert
search --> save as --> alert --> permissions
the scheduled alert type allows us to set a schedule and time range for the search to
be run
the real-time alert type will run the search continuously in the background (more overhead
on system performance)
Manage alerts
activity --> triggered alerts
settings --> alert actions (predefined actions)

14th Module:
===========
Final thoughts
