Splunk 8.1 Fundamentals Part 3

Splunk Fundamentals 3

Document Usage Guidelines
• Should be used only for enrolled students
• Not meant to be a self-paced document; an instructor is needed
• Do not distribute

Course Prerequisites
To be successful in this course, you should have completed:
• Splunk Fundamentals 1
• Splunk Fundamentals 2

Course Guidelines
• Hands-on lab exercises reinforce information presented in the
lecture modules
• To receive a certificate of completion for this course, you must
complete all of the lab exercises
• The lab exercises must be completed sequentially

Course Goals
• Explore additional statistical functions
• Explore additional eval functions, including comparison, conversion,
mathematical, and statistical functions
• Include and exclude events based on lookup values
• Create a lookup and use it in an alert
• Learn about regular expressions and use the erex and rex
commands to create temporary fields
• Use spath to work with self-describing data
• Create and use nested macros and macros with event types
• Accelerate reports and data models
Course Outline
Module 1: Exploring Statistical Commands
Module 2: Exploring eval Command Functions
Module 3: Exploring Lookups
Module 4: Exploring Alerts
Module 5: Extracting Fields at Search Time
Module 6: Working with Self-Describing Data
Module 7: Exploring Search Macros
Module 8: Using Acceleration Options
Module 9: Report Acceleration
Module 10: Summary Indexing
Module 11: datamodel Command & Data Model Acceleration
Module 12: tsidx Files & tstats Command
Callouts
• Scenarios
  – Examples in this course relate to a specific scenario
  – For each example, a question is posed from a colleague or manager at Buttercup Games
• Notes & Warnings
  – References for more information on a topic and tips for best practices
  – Warnings contain important information you should know

Scenario: The online sales manager wants to see the action, productId, and status of customer interactions in the online store.
Note: Lookups are discussed in the Splunk Fundamentals 1 course.
Warning: Make sure to regularly feed Splunk good data or else it might get very hangry.
Commands and Functions Syntax
...| command [option=arg] function(fieldName) [as newField]

• The ... indicates the command is not a generating command


• Text that is italicized will be replaced with appropriate inputs
– newField might become products or TOTAL
– arg can be an integer, Boolean, or string

• Brackets [ ] indicate optional syntax


• If a slide is dedicated to introducing a function, then the function
will be shown following a compatible command
Course Scenario
• Use cases are based on Buttercup Games, a multinational
gaming company
• The Buttercup Games Splunk environment contains data from:
– Business analytics: web access logs and lookups
– Internal operations: mail and internal network data
– Security operations: internal network and badge reader data

• You've recently been promoted to a Splunk Power User who must:


– Manage Splunk knowledge objects and implement best practices
– Utilize Splunk to provide insightful statistics and meaningful reports
– Be at the beck and call of other departments and no, there is no raise
Buttercup Games Network
Index: web — Online transactions (access_combined) on hosts www1, www2, www3
Index: security — Badge reader data (history_access) on badgesv1; AD/DNS data (winauthentication_security) on adldapsv1; Web login data (linux_secure) on www1, www2, www3
Index: sales — Retail sales data (vendor_sales) on vendorUS1; BI data (sales_entries) on ecommsv1
Index: network — Firewall data (cisco_firewall), Email data (cisco_esa), and Web security appliance data (cisco_wsa_squid) on cisco_router1
Index: systems — AWS instance data (system_info) on json_system_data; Linux system log (server_log) on mixed_system_data; HTTP status code definitions (status_definitions) on sh-8883
Index: games — Game logs (SimCubeBeta) on sim_cube_server
Module 1:
Exploring Statistical
Commands

Module Objectives
• Use stats command functions to perform statistical analysis:
min, max, mean, median, stdev, var, & range
• Generate summary statistics using fieldsummary
• Add results of a “subpipeline” to search results using appendpipe
• Generate summary statistics on search results using eventstats
• Use the streamstats command to add cumulative summary
statistics over all the results

Reviewing the stats Command
• The stats command calculates statistics on search results
...| stats statsfunction(field) [as field] [by field-list]

• stats command functions from previous courses:
  ...| stats count — returns the number of events that match the search criteria
  ...| stats dc(field) — returns a count of unique values for field
  ...| stats sum(field) — returns a sum of numeric values for field
  ...| stats list(field) — lists all values of field
  ...| stats values(field) — lists unique values of field

Note: To view all functions for stats, refer to the Search Reference guide.
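For instance, a quick sketch (assuming the web access data described in the course environment) that combines several of these functions in a single stats command:

index=web sourcetype=access_combined action=purchase
| stats count as purchases, dc(clientip) as uniqueCustomers, sum(price) as totalRevenue, values(product_name) as productsSold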

Additional Functions for the stats Command
Other stats functions are available for calculating statistical information
...| stats min(field) returns the minimum value for field

...| stats max(field) returns the maximum value for field

...| stats avg(field) returns the average value for field

...| stats median(field) returns the middle-most value of field

...| stats range(field) returns the difference between the min and max values of field

...| stats mean(field) returns the arithmetic mean for field; results should match the
values calculated using avg
...| stats stdev(field) returns the standard deviation (measure of the extent of
deviation of the values) for field
...| stats var(field) returns the variance (measure of how far the values are spread
out) for field
stats Command and Functions: Example
Scenario: Marketing wants to perform a statistical analysis on all APAC retail sales by country over the last 24 hours.

index=sales sourcetype=vendor_sales VendorID>=7000 AND VendorID<9000
| stats count(price) as count, sum(price) as sum, min(price) as Minimum, max(price) as Maximum, range(price) as Range, mean(price) as Average, median(price) as Median, stdev(price) as sdev, var(price) as Variance by VendorCountry
| eval Average = round(Average,2), sdev=round(sdev,2), Variance=round(Variance,2)
| sort -sum
| rename count as "Units Sold", sum as "Total Sales", sdev as "Standard Deviation"

stats Function: list Review
This search produces a crowded table in which the duplicate hosts can be distracting or difficult to read (this is what they GET):

index=web sourcetype=access_combined action=purchase status=200
| stats count by host, product_name
| sort host, -count

Piping the results to the list function of the stats command creates an easier-to-read table (this is what they WANT):

index=web sourcetype=access_combined action=purchase status=200
| stats count by host, product_name
| sort host, -count
| stats list(product_name) as "Product Name" list(count) as Count by host

stats Function: list Example
Scenario: Sales wants a count of all successful online purchases over the last 24 hours by host. The report should have counts for each product in descending order and a total count for each host. Sales also requested to "make it look nice."

index=web sourcetype=access_combined product_name=* action=purchase status=200
| stats count by host, product_name
| sort -count
| stats list(product_name) as "Product Name", list(count) as Count, sum(count) as total by host

A: stats list displays all product name and count combinations in the same row
B: sum(count) as total sums up all the count values into a new field called total
C: by host tells Splunk how to group the results of list and sum
The search produces 3 easy-to-read rows of results.
fieldsummary Command
...| fieldsummary [maxvals=num] [field-list]

• Calculates a variety of summary stats for all or a subset of fields


• Displays summary info as results table
• maxvals specifies the maximum number of unique values to display
for each field (optional; defaults to 100)

fieldsummary Command Output Fields
• field: field name
• count: number of events with that field
• distinct_count: number of unique values in field
• is_exact: boolean (0 or 1) indicating whether distinct_count is exact
• max (if field is numeric)
• mean (if field is numeric)
• min (if field is numeric)
• numeric_count: count of numeric values in field
• stdev (if field is numeric)
• values: distinct values of field and count of each value
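As an illustrative sketch (assuming the course's web access data), these output fields can be filtered and displayed like any other result columns, for example keeping only fields that contain numeric values:

index=web sourcetype=access_combined
| fieldsummary maxvals=5
| where numeric_count > 0
| table field, count, distinct_count, mean, min, max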

fieldsummary Command: Example 1
Scenario ?
A security operations manager wants to
compare a sampling of IP addresses from
the security index over the past 24 hours.
index=security
| fieldsummary maxvals=10 bcg_ip src_ip

fieldsummary Command: Example 2
Scenario ?
A business analyst wants to study the variance
between successful online purchase events and
purchase events hindered by a 503 error that
occurred yesterday.

index=web sourcetype=access_combined status IN (200,503)


| stats sum(price) as sales by status
| fieldsummary

appendpipe Command
...| appendpipe [subpipeline]

• Appends the output of a transforming command(s) to the current results set
• The subpipeline is executed when Splunk reaches the appendpipe command
  1. Splunk takes the existing results and pushes them to the subpipeline
  2. Then, Splunk appends the result of the subpipeline as new lines to the outer search
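A minimal sketch of this order of operations, assuming the vendor_sales data from the course environment: the subpipeline receives the per-country rows that exist at that point and appends a single overall total row.

index=sales sourcetype=vendor_sales
| stats count by VendorCountry
| appendpipe [stats sum(count) as count | eval VendorCountry = "TOTAL"]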

appendpipe Command: Example
Scenario: The CTO wants to find the number of nonbusiness-related connections to the internet for the last 24 hours, by user and usage, and the total attempts by usage.

A index=network sourcetype=cisco_wsa_squid usage!=Business
B | stats count by usage, cs_username
C | appendpipe [stats sum(count) as count by usage]

A: Limit the basic search to retrieve non-business usage connections
B: Count events by each unique usage and user combination
C: Use appendpipe to calculate total event counts for each usage type, at the end of the results
The information is all here, but it's not easy to read.
appendpipe Command: Example (cont.)
Scenario: The CTO wants to find the number of nonbusiness-related connections to the internet for the last 24 hours, by user and usage, and the total attempts by usage.

index=network sourcetype=cisco_wsa_squid usage!=Business
| stats count by usage, cs_username
| appendpipe
    [stats sum(count) as count by usage
    | eval cs_username = "Total for usage of ".usage]   (D)
| sort usage   (E)

D: Use eval to provide a description for the usage totals that will appear in the cs_username column
E: Sorting by usage organizes results into an easy-to-read table
appendpipe Command: Grand Total
Scenario: Before you send off the report, you decide to add a grand total to the end of the report.

index=network sourcetype=cisco_wsa_squid usage!=Business
| stats count by usage, cs_username
| appendpipe                                 (1st appendpipe)
    [stats sum(count) as count by usage
    | eval cs_username = "Total for usage of ".usage]
| sort usage
| appendpipe                                 (2nd appendpipe)
    [search cs_username = "Total for*"
    | stats sum(count) as count
    | eval cs_username = "GRAND TOTAL"]

• Multiple appendpipe commands can be used
• The second appendpipe adds up the usage totals and appends a grand total to the results
eventstats Command
...| eventstats statsfunction(field) [as field] [by field-list]

• Generates summary statistics of all existing fields in your search


results and saves them as values in new fields
• statsfunction is applied to field and the resulting value is
appended to each of the results
• Just like stats, eventstats supports multiple functions

eventstats Command: Example 1
Scenario: For a new campaign, the online sales manager wants to see which products are losing more sales than the average during the last 24 hours, visualized in a bar chart.

A index=web sourcetype=access_combined action=remove
  | chart sum(price) as lostSales by product_name
B | eventstats avg(lostSales) as averageLoss
C | where lostSales > averageLoss

A: Find the aggregate lost sales for each product
B: Use eventstats to calculate the average lost sales and append this value to each product
C: Use where to keep only the products losing more than the average
eventstats Command: Example 1 (cont.)
Scenario: For a new campaign, the online sales manager wants to see which products are losing more sales than the average during the last 24 hours, visualized in a bar chart.

index=web sourcetype=access_combined action=remove
| chart sum(price) as lostSales by product_name
| eventstats avg(lostSales) as averageLoss
| where lostSales > averageLoss

eventstats Command: Example 2
Scenario: The sales team wants to know the lowest and highest sales totals during the previous week, and on which days they occurred.

index=web sourcetype=access_combined action=purchase status=200
| timechart sum(price) as totalSales
A | eventstats max(totalSales) as highest, min(totalSales) as lowest
B | where totalSales=highest OR totalSales=lowest
C | eval Outcome = if(totalSales=highest,"Highest","Lowest")
D | eval Day = strftime(_time,"%A")
| table Day, Outcome, totalSales
| eval totalSales="$".tostring(totalSales,"commas")

A: Label the highest and lowest totalSales values
B: Filter the search
C: Create a new field and assign "Highest" and "Lowest" as values
D: Format the time to show the day of the week

Note: The strftime() function is discussed in more detail in Module 2.
streamstats Command
...| streamstats statsfunction(field) [as field] [by field-list] [option=arg]

• Calculates summary statistics for each result row at the time the
command encounters it and adds these values to the results
• Unlike stats and eventstats, does not need entire result set
• Just like stats, streamstats supports multiple functions
• The following options are available:
– window specifies the number of events to use; default=0 (all events)
– current includes the current event in summary calculations when set to
true (default); if false, search uses field value from previous result
streamstats Command: Example 1
Scenario: Sales wants to monitor a moving average of the price of a purchase on the Buttercup Games website over the previous 100 purchases during the last 4 hours.

index=web sourcetype=access_combined action=purchase status=200 productId=*
| table _time, price
| sort _time
| streamstats avg(price) as averageOrder current=f window=100   (A, B)

A: Calculate the average price over the past 100 events (window=100)
B: Do not include the current event in summary calculations (current=f)
streamstats Command: Example 2
Scenario: Internal users are complaining they can't reach remote FTP servers. IT needs a report of recent user IPs by server. The report should cover the last 4 hours and have the server with the most attempts listed first.

index=network sourcetype=cisco_wsa_squid action!="TCP_REFRESH_HIT"
| streamstats count as recentAttempts by bcg_ip
| stats list(c_ip) as users, list(recentAttempts) as recentAttempts, max(recentAttempts) as numAttempts by bcg_ip
| sort -numAttempts
| table bcg_ip, users, recentAttempts

Module 1 Knowledge Check
• True or False: If there is an appendpipe in a search, its
subpipeline will always be executed last.
• True or False: eventstats and streamstats support multiple
stats functions, just like stats.
• What command was used in this search?

index=games sourcetype=SimCubeBeta Action


IN("Promoted","Given A Raise")
| stats count as PositiveEvents by Action,
CurrentStanding
| sort -PositiveEvents
???
| streamstats count(PositiveEvents) by Action

Module 1 Knowledge Check
• False: The subpipeline is executed when Splunk encounters the appendpipe command, not necessarily last.
• True: eventstats and streamstats support multiple stats
functions, just like stats.
• What command was used? streamstats

index=games sourcetype=SimCubeBeta Action


IN("Promoted","Given A Raise")
| stats count as PositiveEvents by Action,
CurrentStanding
| sort -PositiveEvents
| streamstats count(PositiveEvents) by Action

Lab Exercise 1
Time: 45 minutes
Tasks:
• Display retail sales over the last 24 hours for each category and
product name with total sales for each category
• Create a table for the previous week's online purchases by host and
category to see which games were purchased most
• Identify the retail products with less than the average total sales for the
previous week
• Report on the 3 most active network users last week by usage type

Module 2:
Exploring eval
Command Functions

Module Objectives
• Use makeresults to generate and test data
• Explore eval…
– Conversion functions
– Date and time functions
– String functions
– Comparison and conditional functions
– Informational functions
– Statistical functions
– Mathematical functions

makeresults Command
| makeresults

• Generates a single event in memory with only the _time field
• Generally used with one or more eval commands
• Must be the first command in the search, following a leading pipe (|)

Scenario: Test the regex you created to pull the reason description for events in the firewall data.

| makeresults
| eval raw = "Aug 27 2020 21:10:08 awesome-vpn.buttercupgames.com %ASA-4-113019: Group = buttercupgames Username = lteng, IP = 10.2.10.44, Session disconnected. Session type = IPsec, Duration = 8h:8m:25s, Bytes xmt: 18998681, Bytes rcv: 1453738, Reason: Connection Lost"
| rex field=raw "^(?:[^:\n]*:){8}\s+(?P<reason>.+)"

Reviewing the eval Command
...| eval fieldname1 = expression1 [, fieldname2 = expression2...]

• eval allows you to calculate and manipulate fields and their values in search results
• Supports a vast assortment of functions
• Results of eval are written to a new or existing field
  – If the destination field exists, its value is overwritten with the eval result
  – If the destination field does not exist, it is created and populated with the eval result

Note: The full list of eval functions can be found in Splunk Docs, on the Evaluation Functions page.
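A quick way to see both behaviors is with makeresults (covered at the start of this module). This is only an illustrative sketch with field names chosen for the example: the first eval creates status, the second overwrites it and creates message.

| makeresults
| eval status = "pending"
| eval status = "complete", message = "status was overwritten; message was created"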

Reviewing the eval Command (cont.)
...| eval E = ...

• If field E already exists in the search results, its values are overwritten, but nothing in the index is changed
• If field E does not exist, it is created and populated with values, but nothing is added to the index

Reviewing the eval Command (cont.)

The eval command allows you to:
• Calculate expressions
• Place the results in a field
• Use that field in searches or other expressions
• Concatenate and compare field values

Operator types:
arithmetic: + - * / %
concatenation: + .
boolean: AND OR NOT XOR
comparison: < > <= >= != = LIKE

Reviewing eval Expressions
index=sales sourcetype=vendor_sales VendorCountry IN("United States", Canada)
| stats sum(price) as "USA+Canada Sales" count as "Total Products Sold"
count(eval(VendorCountry = "United States")) as "Products Sold in US",
count(eval(VendorCountry = "Canada")) as "Products Sold in Canada" by product_name
| eval "USA+Canada Sales" = "$".'USA+Canada Sales'

• Field values are treated in a case-sensitive manner


• String values must be "double-quoted"
• Field names must be unquoted or single quoted when they include a
special character like a space
• Use a period (.) instead of (+) when concatenating strings and
numbers to avoid conflicts

Ways to Write Multiple evals
All 3 statements produce the same result:

Separate eval pipeline segments:
index=network sourcetype=cisco_wsa_squid
| stats sum(sc_bytes) as bytes by usage
| eval bandwidth = bytes/(1024*1024)
| eval bandwidth = round(bandwidth, 2)

Nested eval expressions targeting the same field:
index=network sourcetype=cisco_wsa_squid
| stats sum(sc_bytes) as bytes by usage
| eval bandwidth = round(bytes/(1024*1024), 2)

Combining eval commands with commas:
index=network sourcetype=cisco_wsa_squid
| stats sum(sc_bytes) as bytes by usage
| eval bandwidth = bytes/(1024*1024), bandwidth = round(bandwidth, 2)

Conversion Functions: tostring Example
...| eval field1 = tostring(X[,Y])

• X is number
• Y (optional) determines formatting of X (e.g. "commas", "duration", "hex")
index=web sourcetype=access_combined action=purchase status=503
| stats count(price) as NumberOfLostSales,
avg(price) as AverageLostSales,
sum(price) as TotalLostRevenue
| eval AverageLostSales = "$".tostring(AverageLostSales,
"commas"), TotalLostRevenue = "$".tostring(TotalLostRevenue, "commas")

Conversion Functions: tonumber
...| eval field1 = tonumber(NUMSTR[,BASE])

• NUMSTR may be a field name or literal string value


• BASE defines the base of NUMSTR and may be 2 to 36; defaults to 10

Convert string values (such as the field store_sales) to numeric:
...| eval myValue="1.4848974e+12"
| eval myValueAsInteger = tonumber(myValue)

Convert the octal number (base-8) 244 and the hexadecimal number (base-16) A4 to decimal:
...| eval n1 = tonumber("244",8), n2 = tonumber("A4",16)

Date and Time Functions
now(): returns the time a search was started
...| eval field1 = now()

time(): returns the time an event was processed by eval command

...| eval field1 = time()
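A minimal sketch contrasting the two (assuming the course's web access data): search_started is fixed when the search starts, while event_processed is assigned as each result passes through eval, so the values can differ slightly on large result sets.

index=web sourcetype=access_combined
| head 5
| eval search_started = now(), event_processed = time()
| table _time, search_started, event_processed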

Date and Time Functions: relative_time
...| eval field1 = relative_time(X,Y)

• Returns a timestamp relative to a supplied time


• X is a number, representing desired time in epoch seconds
• Y is a relative time specifier
• Relative time specifiers use time unit abbreviations such as:
s = seconds m = minutes h = hours d = days w = week mon = months y = year

Scenario ?
Return timestamp one day prior to
...| eval yesterday = relative_time(now(),"-1d@h")
when the search was started and | eval yesterday = strftime(yesterday,"%F %H:%M")
convert it into a string format.

Date and Time Functions: Format Variables
The Y argument in strftime(X,Y) and strptime(X,Y) is the timestamp format variable(s)

Time:
%H  24 hour (00 - 23)
%T  24 hour (HMS)
%I  12 hour (01 - 12)
%M  minute (00 - 59)
%p  AM or PM

Days:
%d  Day of month (01 to 31)
%w  Weekday (0 to 6)
%a  Abbreviated weekday (Sun)
%A  Weekday (Sunday)
%F  year-month-day (%Y-%m-%d)

Months & Years:
%b  Abbreviated month name (Jan)
%B  Month name (January)
%m  Month number (01 - 12)
%Y  Year (2020)

Note: The full list of format variables can be found in Splunk Docs: Date and time format variables page.
Date and Time Functions: strftime
...| eval field1 = strftime(X,Y)

• Converts a timestamp (X) to a string
• Y determines how X is formatted

Before formatting:
index=sales sourcetype=vendor_sales
| timechart span=1h sum(price) as h_sales

After formatting:
index=sales sourcetype=vendor_sales
| timechart span=1h sum(price) as h_sales
| eval _time = strftime(_time,"%b %d, %I %p")

Date and Time Functions: strptime
...| eval field1 = strptime(X,Y)

• X is a time represented by a string


• Y is the format of the string time value in field X
index=systems sourcetype=system_info asctime=*
| eval NewAsctime = strptime(asctime, "%Y-%m-%d %H:%M:%S,%N")
| table asctime, NewAsctime

Text Functions: substr
...| eval field1 = substr(X,Y[,Z])

• Returns a substring of X starting at Y with the number of characters specified by Z


• X is a literal string (e.g. “abcd” or “1234”) or an existing field
• Y is numeric and specifies where the substring begins
– If positive, substring starts at Y and moves forward
– If negative, Splunk starts from end of string and grabs Y characters working backwards
• Z (optional) is numeric and specifies the number of characters to return if Y is
positive; if not specified, returns the rest of the string
Scenario ?
Return the employee username ...| eval employee = substr(bcg_workstation,6)
from workstations where the
names are BG##-username.

Text Functions Example
Scenario: User wants to create a new field with the first three letters of the categoryId, the uppercase product name, and the last two digits of the itemId.

• lower(X): converts a string to lowercase
• upper(X): converts a string to uppercase
• X can be a field or a literal string

index=web sourcetype=access_combined
| dedup product_name, categoryId, itemId
| eval lowercase_product = lower(product_name),
    uppercase_product = upper(product_name),
    substr_category = substr(categoryId, 1, 3),
    substr_item = substr(itemId, 5),
    ItemCode = substr_category."-".uppercase_product."-".substr_item
| table lowercase_product, uppercase_product, categoryId, substr_category, itemId, substr_item, ItemCode

Text Functions: replace
...| eval field1=replace(X,Y,Z)

• Returns a string by substituting Z for every occurrence of Y in X
• X is a literal string or field
• Y contains a regex to identify a pattern in X
• Z specifies the substitution
• Useful for masking data such as account numbers and IP addresses

A index=sales sourcetype=sales_entries
  | stats count by AcctCode
B | eval AcctCode = replace(AcctCode,"(\d{4}-)\d{4}","\1xxxx")

Text Functions: replace Example
Scenario: Show sales information for the 3 best-selling products in the last 24 hours. Mask the middle octets of the customer IP addresses.

index=web sourcetype=access_combined action=purchase status=200
| chart sum(price) as totalSales over clientip by product_name limit=3 useother=f
| eval clientip = replace(clientip, "(\d+\.)\d+\.\d+(\.\d+)","\1xxx.xxx\2")
| fillnull

Note: The limit argument of the chart command limits the number of products displayed to the top x values (in this case, the top 3).

Comparison and Conditional Functions: if Example
...| eval field1 = if(X,Y,Z)

• X is a boolean expression
  – If X evaluates to TRUE, returns Y
  – If X evaluates to FALSE, returns Z
• Strings must be enclosed in double quotes
• Field values are case sensitive

index=sales sourcetype=vendor_sales
| eval SalesTerritory = if((VendorID>=7000 AND VendorID<8000), "Asia", "Rest of the World")
| stats sum(price) as TotalRevenue by SalesTerritory
| eval TotalRevenue = "$".tostring(TotalRevenue, "commas")

Note: The Vendor ID range for Asia is 7000 – 7999.
Comparison and Conditional Functions: cidrmatch & match

• cidrmatch(X,Y): returns TRUE/FALSE based on whether the provided IP address Y falls within the subnet specified by X

index=network sourcetype=cisco_wsa_squid
| eval isLocal = if(cidrmatch("10.2/16",bcg_ip), "IS local sub2", "NOT local sub2")

• match(subject,regex): returns TRUE/FALSE based on whether the subject value matches the regular expression regex

index=network sourcetype=cisco_wsa_squid
| eval proper_ip_address = if(match(src,"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"), "true", "false")

Comparison and Conditional Functions: coalesce
...| eval field1 = coalesce(X1,X2,...)

• Returns the first non-null value from the provided arguments X1, X2, ...
• Great for normalizing field values from results sets where two or more field names represent the same data

(index=network sourcetype=cisco_wsa_squid) OR (index=web sourcetype=access_combined)
| table c_ip, clientip, oneIP, sourcetype
| eval oneIP = coalesce(clientip,c_ip)

Informational Functions
• isnull: evaluates X for a null value and returns TRUE or FALSE
...| eval field1 = isnull(X)

index=web sourcetype=access_combined
| eval IsProductIdNull = if(isnull(productId),"yes","no")
| table IsProductIdNull, productId

• typeof: returns a string that represents the data type of X
  – Possible results: number, string, or bool
  – Can be used for validating or troubleshooting fields
...| eval field1 = typeof(X)

index=web sourcetype=access_combined status>399
| stats sum(price) as totalSales, avg(price) as avgSales, count(price) as numLostSales
| eval totalSales = tostring(totalSales,"commas"), typeOf = typeof(totalSales)
Statistical Functions
max: takes an arbitrary number of numeric or string arguments and returns the maximum
...| eval field1 = max(X1,X2,...)

min: takes an arbitrary number of numeric or string arguments and returns the minimum
...| eval field1 = min(X1,X2,...)

random: takes no arguments and returns a pseudo-random integer ranging from zero to 2^31 - 1
...| eval field1 = random()

Note: For both max and min, strings are considered greater than numbers.

Statistical Functions Example
Scenario: The Sim Cubicle Beta team wants to test a new feature on random groups of users. Assign a random group number to unique user emails from the past 60 minutes.

index=games sourcetype=SimCubeBeta
| table User
| dedup User
| eval group = (random() % 5) + 1
| stats list(User) as Users by group

Note
The % (modulo) operator returns the
remainder of a division. The remainder is
always less than the divisor (in this example,
the number 5) so using a modulo operator
can give you control over your random
numbers. If you want a range of random
numbers, then just add the first range
number to your operation. In this case, we
want a range of 1 – 5, so we add 1.
where Command (Review)
• The where command can use the eval expression syntax
• The where command can use most of the eval functions
index=network sourcetype=cisco_wsa_squid
| where cidrmatch("10.2/16",bcg_ip)
| dedup bcg_ip, bcg_workstation
| table bcg_ip, bcg_workstation

Module 2 Knowledge Check
• True or False: Functions can be nested in an eval expression.
• How many results does makeresults generate by default?
• Which eval function is the best option for masking sensitive data?
• If you need to evaluate a regex string within an eval expression,
which of the following functions would be required: cidrmatch,
match, or typeof

Module 2 Knowledge Check
• True: Functions can be nested in an eval expression.
• How many results does makeresults generate by default?
A single _time value is generated, by default
• Which eval function is the best option for masking sensitive data?
replace(X,Y,Z)
• If you need to evaluate a regex string within an eval expression,
which of the following functions would be required:
match

Lab Exercise 2
Time: 45 minutes
Tasks:
• Sim Cubicle Beta team wants to know the top 3 "Made" actions performed in the
games over the last 7 days
• Verify the usernames of everyone who badged into the building over the last 7 days and identify an unauthorized entrant
• Display a count of sales over the last 4 hours and mask the vendors’ customer
account codes
• Supervisor wants to know which employees were shopping in the online store
over the last 4 hours
• Challenge: Troubleshoot a search that is not performing as expected

Module 3:
Exploring Lookups

Module Objectives
• Review lookups
• Include events based on lookup values
• Use KV Store lookups
• Use external lookups
• Use geospatial lookups
• Identify best practices for lookups

Reviewing File-based Lookups
• Sometimes static (or relatively unchanging) data is required for searches but isn't available in the index
• Lookups pull such data from standalone files at search time and add it to search results
• A lookup matches an input field from your events against the lookup table and returns one or more output fields to your results

Reviewing File-based Lookups (cont.)
• Lookups allow you to add more fields to your events
• Lookups can be invoked by the lookup command or configured to
run automatically
• After a lookup is invoked, you can use the lookup fields in searches
and reports
• The lookup fields also appear in the Fields sidebar

Note
Lookups were introduced in
Splunk Fundamentals 1.
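As a quick illustration, using the employees.csv lookup file that appears later in this module, the lookup command matches an event field against the lookup's input field and adds the requested output fields to each result:

index=security sourcetype=history_access
| lookup employees.csv RFID as rfid OUTPUT USERNAME, DEPT
| table rfid, USERNAME, DEPT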

Reviewing File-based Lookups (cont.)
Settings > Lookups > Lookup table files > New Lookup Table File
1. Click New Lookup Table File
2. Select a Destination app
3. Browse and select the file to use for the lookup table
4. Enter a name for the lookup file

Note: Uploading a lookup file is required for file-based and geospatial lookups.

Reviewing File-based Lookups (cont.)
Settings > Lookups > Lookup definitions > New Lookup Definition
1. Click New Lookup Definition
2. Select a Destination app
3. Name the Lookup definition
4. For Type, select File-based
5. Browse and select the file to use for the lookup table

The lookup is now listed in Settings > Lookups > Lookup definitions.

Reviewing File-based Lookups (cont.)
• Minimum matches: minimum number of matches for each input lookup value
• Maximum matches: maximum number of matches for each lookup value
• Default matches: value to output when fewer than the minimum matches are returned for a given input; defaults to an empty string
• Case sensitive match: if unchecked, case-insensitive matching is performed for all fields in the lookup table

Reviewing File-based Lookups (cont.)
Note: For information about the other advanced lookup options, refer to the Knowledge Manager Manual.

• Match type: similar to Filter lookup, but uses wildcards and/or CIDR (for matching IP addresses against a pattern)
• Batch index query: if checked, improves performance of large lookup files
• Filter lookup: filters results from the lookup table before returning data

Using Lookup Tables to Include Values
With a lookup file, you can use the lookup command to access a
lookup file and pass values to the search
Scenario: A suspicious event occurred at the San Francisco HQ early yesterday morning. SecOps wants a list of all employee usernames and their departments who badged into and out of the building once between 10pm and 8am.

index=security sourcetype=history_access Address_Description="San Francisco"
date_hour>=22 OR date_hour<=8
| lookup employees.csv RFID as rfid OUTPUTNEW USERNAME, DEPT
| stats count by rfid, USERNAME, DEPT
| where count=2

Note: This search would also work with the lookup definition, employee_lookup.

Using KV Store Lookups
• Instead of matching against values in a CSV file, you can also
match against values in a KV Store (key value store)
• Use for large lookup tables or ones that are updated often
• KV Store saves and retrieves data in collections of key-value pairs
  – Similar to database tables in which each record has a unique key
  – Provides multiuser access locking so that multiple users can't edit the same record at the same time

CSV Files vs KV Store
File-based (CSV) lookups:
• Match against values in a .csv file
• Require a full rewrite of the file to edit values
• Support case-insensitive field lookups and can toggle case-sensitivity

KV Store lookups:
• Match against values in a KV Store collection (uploading a lookup file is not mandatory)
• Allow for per-record insertion and editing, so they are suitable for frequent updating
• Allow for data type enforcement and field accelerations
• Provide REST API access to the data collection
• Allow for multiuser access locking
Steps to Set up a KV Store Lookup
1. Add configuration stanzas to collections.conf (admin only)
2. Create KV Store definition
3. Populate the KV Store lookup with data via:
– outputlookup command (admin and power user ability)
– REST API (admin ability)
– A front-end form (not discussed in this course)

Note
Once defined, the admin can
share the KV Store collection with
other apps and users
Setting Up a KV Store: collections.conf
An admin must add a stanza for each KV Store in the collections.conf file (under $SPLUNK_HOME/etc/apps/<appname>/local) before a definition can be created

collections.conf:
[kvstorecoll]
enforceTypes = true
field.name = string
field.id = number
field.address_street = string
field.address_city = string
field.address_state = string
field.address_zip = string

• A [collection_name] stanza is required
• Admins must define the data types for each field and have the option to enforce those data types
• Data type options include: number, bool, string, time

Note: When data type values are enforced, any invalid value added to a collection causes record insertion to fail.
Setting Up a KV Store: collections.conf (cont.)
• Enforcing data types is useful if you want to:
  – Guarantee a field is always treated as a specific data type
  – Improve the collection's accelerations (beyond the scope of this course)
• For example, an admin would create the following configuration stanza to enforce the data types of this JSON record

collections.conf:
[kvstorecoll]
enforceTypes = true
field.name = string
field.id = number
field.address_street = string
field.address_city = string
field.address_state = string
field.address_zip = string

Sample JSON data:
{
  "name" : "Splunk Seattle",
  "id" : 123,
  "address" : {
    "street" : "1730 Minor Avenue",
    "city" : "Seattle",
    "state" : "WA",
    "zip" : "98101"
  }
}

Setting Up a KV Store: Create a Definition
Settings > Lookups > Lookup definitions > New Lookup Definition
1. Choose a destination app
2. Name the definition (what you'll use in the search string)
3. Change Type to "KV Store"
4. Enter the collection name as defined in collections.conf
5. List all fields supported by the lookup

If only results from a subset of records in a large KV Store collection are required for a search, improve performance by filtering.

Note: Each collection must have at least two fields. One of these fields must match the values of a field in your event data.
Setting Up a KV Store: Populating
Option 1: Use outputlookup to write search results into a specific KV Store collection

... |outputlookup kvstorecoll_lookup

Option 2: Use the Splunk REST API (outside the scope of this course)

curl -k -u admin:yourpassword \
https://localhost:8089/servicesNS/nobody/kvstoretest/storage/collections/data/kvstorecoll \
-H 'Content-Type: application/json' \
-d '{"name": "Splunk HQ", "id": 123, "address": { "street": "250 Brannan Street", "city": "San Francisco", "state": "CA", "zip": "94107"}}'

Note: You must have access to the collection in order to write results using the outputlookup command.
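Once the collection is populated, it behaves like any other lookup. A minimal sketch, assuming the kvstorecoll_lookup definition created on the previous slide and the field names from the earlier collections.conf example: inputlookup lists the stored records.

| inputlookup kvstorecoll_lookup
| table name, id, address_city, address_state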
Review: lookup Command
lookup is used to add fields from a lookup to the search results

...|lookup lookup-table-name lookupField AS eventField OUTPUT lookupDestfield(s)

• eventField is matched against lookupField; the AS clause is not required if the fields share the same name
• OUTPUT specifies the fields from the lookup that you want added to your events
• The lookup field and the event field must have matching values

Review: lookup Command (cont.)
Use the same name for lookupField and lookupDestfield to
limit results to only those events that contain the lookupField

index=security sourcetype=linux_secure

index=security sourcetype=linux_secure
| lookup knownusers.csv user OUTPUT user

Note
This search will not work in
the Fundamentals 3 app
until knownusers.csv has
been uploaded by the
student. (See Lab Guide.)
Using External Lookups
• Uses scripts or executables to populate events with field values from an external source
• Often referred to as scripted lookups
• An underlying script or executable (e.g. external_lookup.py) queries the external source

index=web sourcetype=access_combined
| lookup dnslookup clientip
| stats count by clienthost
The External Lookup Script
• Must be a Python script or binary executable
• Must be added to your Splunk deployment in either:
– $SPLUNK_HOME/etc/searchscripts
– $SPLUNK_HOME/etc/apps/<app_name>/bin

External Lookup Example: external_lookup.py
• Splunk ships with a sample script external_lookup.py in $SPLUNK_HOME/etc/system/bin
To use the sample script:
1. Move the external_lookup.py script to the appropriate directory (this step has already been completed)
2. Create the dnslookup definition as shown on the next slide
3. Invoke the lookup, either:
   ... | lookup dnslookup clienthost   or
   ... | lookup dnslookup clientip
• Splunk passes values for clienthost into the script and the script returns clientip (or vice versa)
• Returned values are used to populate clientip or clienthost in the results
Defining an External Lookup
Settings > Lookups > Lookup definitions > New Lookup Definition
1. Select a Destination app
2. Name the lookup definition
3. Change Type to External
4. Enter the script name and the arguments passed to the script
5. List all fields supported by the lookup

Note: The arguments passed to the script are the field headers from the input/output CSV files.
Using Geospatial Lookups
• Used to create searches that generate choropleth map visualizations
• Matches location coordinates in your
events to geographic feature collections in
a .KMZ or .KML file
• Outputs fields to your events that provide corresponding geographic feature info encoded in the KMZ or KML (e.g., country, state, or county names)
• Splunk ships with two geospatial lookup files, geo_us_states and geo_countries
Note
Choropleth maps and the geom command were introduced in Splunk Fundamentals 2.
KML and KMZ Files
• In order to define a geospatial lookup, upload
a .KML (Keyhole Markup Language) or .KMZ
(zipped KML) file
• Similar to uploading a .CSV file before defining
a CSV lookup
• Many free KML/KMZ files available online or
you can create your own

Note
For more information, refer to
Appendix C: Creating New
Choropleth Maps.

Adding a Geospatial Lookup Table File
Settings > Lookups > Lookup table file > new lookup table file
1. Select a Destination app
2. Browse and select the .kmz or .kml file to use for the lookup table
3. Enter a name for the lookup file
Defining a Geospatial Lookup
Settings > Lookups > Lookup definitions > new lookup definition
1. Select a Destination app
2. Name the lookup definition
3. Change Type to Geospatial
4. Select the Lookup file from the drop-down list
Review: geom Command
Scenario ?
Display the previous week’s retail sales in
EMEA.

index=sales sourcetype=vendor_sales
VendorID > 4999 AND VendorID < 6000
| stats count as Sales by VendorCountry
| geom geo_countries featureIdField=VendorCountry

Use the geom command to access geospatial lookups
Database Lookups with DB Connect
• With the DB Connect (DBX) app, you
can use lookups to reference fields in
an external SQL database
– Import database data for indexing,
analysis, and visualization
– Export machine data to an external
database
– Use SQL queries to build dashboards
mixing Splunk-ingested and DB data
Note
For more information, check out the YouTube
video Using Splunk DB Connect.
Best Practices for Lookups
• Order fields in lookup tables so that the ‘key’ field is first (leftmost), followed by other values
• If a lookup needs to be invoked, include the lookup command at the beginning of the search, when possible
• For commonly used fields, make lookups automatic
• Use gzipped CSV files or KV Store for large lookups
• Keep your lookups fresh and relevant (see the sketch after this list):
– Do you really need the lookup table to contain a year’s worth of data, or is one week enough?
– Maintain the lookup table and delete older data if not needed
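For example, a minimal maintenance sketch (assuming a hypothetical lookup file mylookup.csv with an epoch-time last_seen field) could be run on a schedule to keep only the most recent week of entries:

| inputlookup mylookup.csv
| where last_seen >= relative_time(now(), "-7d@d")
| outputlookup mylookup.csv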

Best Practices for Lookups (cont.)
• Check search.lookups
in job inspector to see
how long lookups took to
execute
• If there is latency, check whether one or many lookups are being invoked against large files/tables
Note
You can also use lookups as input
to or output from alerts. We'll
discuss this in the next module.
Module 3 Knowledge Check
• True or False: Use KV Store lookups for large sets of data that is
rarely updated.
• True or False: You must upload a lookup file for both file-based
and KV store lookups.
• What command should you use if you want to write the results of a
search to a lookup?
• Splunk ships with the external_lookup.py script. What steps
need to happen next, so the external lookup can be used
in search?

Module 3 Knowledge Check
• False: Use KV Store lookups for large sets of data that is frequently updated, not rarely updated.
• False: You must upload a lookup file for file-based lookups, but this is not a requirement for KV Store lookups.
• What command should you use if you want to write the results of a
search to a lookup?
outputlookup
• Splunk ships with the external_lookup.py script. What steps
need to happen next, so the external lookup can be used
in search? An admin needs to move the script to an appropriate
directory and the external lookup needs to be defined.
Lab Exercise 3
Time: 50 minutes
Tasks:
• Upload a lookup file of Buttercup employees
• Generate a report of known employees who have visited uncategorized websites over the last
24 hours
• Create a lookup definition to filter non-standard Buttercup employees
• Use an external lookup to return a count of sales events by host over the last 15 minutes
• Challenge: Include HTTP status and HTTP status descriptions in previous report
• Create a geospatial lookup to return a choropleth map of Canadian retail sales by province
during the previous week
• Challenge: Fix Canadian choropleth map
• Challenge: Find unknown users with more than 3 login attempts
Module 4:
Exploring Alerts

Module Objectives
• Review alerts
• Use lookups in alerts
• Output alert results to a lookup
• Log and index searchable alert events
• Use a webhook alert action

Reviewing Alerts
• Splunk alerts are based on searches
that can run either:
– On a regular scheduled interval
– In real-time

• Alerts are triggered when the result(s)


of the search meet a specific
condition that you define
• One or more actions can be selected

Available Alert Actions

Output Alert Results to a Lookup

Option 1: Associate the “Output results to a lookup” action with a saved alert

Option 2: Use the outputlookup command
...| outputlookup weberrors.csv

Output Results to Lookup Action: Option 1
index=web sourcetype=access_combined status!=200
| stats count by host action status

1. Run the search, then click Save As > Alert
2. Set the schedule
3. Click +Add Actions > Output results to lookup

When the alert triggers, Splunk sends the search results to a CSV lookup file

outputlookup Command: Option 2
...| outputlookup <filename>|<tablename>

• Writes search results to a specified file-based lookup (CSV) or KV


Store collection
• Can be executed from a search, ad-hoc report, scheduled search
or alert
• Saves results to a filename ending in .csv or .gz, or to a lookup definition

users.csv = filename ...|outputlookup users.csv
usergroup = definition/tablename ...|outputlookup usergroup

Note
If saving to a lookup definition, the lookup table file or KV collection must already exist.

outputlookup Command: Option 2 (cont.)
The createinapp argument is one of many optional arguments

outputlookup lookup.csv
• If lookup.csv already exists, the lookup contents will be overwritten with the new results
• If lookup.csv does not exist, the createinapp argument determines where the lookup is created:
– createinapp=true (default): lookup file is created in the lookups directory of the current app
  outputlookup lookup.csv createinapp=true
– createinapp=false: lookup file is created in the system lookups directory
  outputlookup lookup.csv createinapp=false

Note
There are many options to control the behavior of outputlookup in different scenarios. Refer to Splunk documentation for complete details.
Outputting Alert Results to a Lookup Example
Scenario ?
SecOps has discovered an increase in malicious activity. They want you to build an alert that runs every 5 minutes and triggers when a user exceeds the average daily failed login attempts within a 24-hour period. Use a 30-day sampling window to calculate average daily failed login attempts.

Step 1: Create a daily scheduled report that:
• Calculates the average daily failed login attempts per user for the last 30 days
• Uses outputlookup to send results to averages.csv

Step 2: Create an alert that:
• Triggers when a user exceeds the daily average
• Logs a searchable event that the SecOps team can monitor

Note
You will be creating a scheduled report instead of an alert to generate the lookup because it doesn't require a trigger action.
Outputting Alert Results to a Lookup Example (cont.)
Step 1: Create a daily scheduled report that populates a lookup
index=security sourcetype=linux_secure "failed password" earliest=-30d
| stats count by user
| eval daily_average = round(count/30) calculates daily average for each user
| fields - count
| outputlookup averages.csv createinapp=true creates lookup

| inputlookup averages.csv

Outputting Alert Results to a Lookup Example (cont.)
Step 2: Create an alert that runs every 5 minutes, correlates user values to the lookup, extracts daily_average values, and assigns appropriate actions, like logging an event
index=security sourcetype=linux_secure "failed password" src_ip=*
| lookup averages.csv user OUTPUT daily_average
| stats count, values(src_ip) by user, daily_average
| where count > daily_average
| eval percent_increase=tostring(round(((count/daily_average)*100),2))."%"
| sort -percent_increase

Logging Searchable Alert Events
• Alerts can be configured to create new searchable events
• Log events are sent to your Splunk deployment for indexing
• Can be used alone or combined with other alert actions
• Requires admin privileges or edit_tcp capability
Events trigger... ...alerts that create... ...events!

Creating a Log Event Alert Action
Event: What you’ll see in the raw data of the logged event (see following slides for an example and info on tokens)

Values in these fields populate the source, sourcetype, host, and index fields of the logged event

By default:
• Source = alert:$name$ where $name$ is the name of the alert
• Sourcetype = generic_single_line
• Host = IP address of the host of the alert
• Index = main

Note
It is highly recommended to use an index
other than the default index, main. In this
example, we created an index just for alerts.
Tokens for Log Events
• Tokens act as placeholders for data values that populate when the search completes
• Event fields can be populated with plain text and/or tokens
• Tokens are available to represent:
– Search metadata
– Search results
– Server information
– Job information

Examples:
$trigger_date$: date when the alert triggered, formatted as YYYY-MM-DD
$trigger_time$: time when the alert triggered, formatted as epoch time
$result.sourcetype$: sourcetype value from the first search result row
Token Types for Log Events
Search metadata tokens access information about the search
$name$: search name
$description$: search description
$alert.severity$: alert severity level

Result tokens provide field values from the first search results row

$result.fieldname$
(e.g. $result.sourcetype$ or $result.src_ip$)

Token Types for Log Events (cont.)
Server tokens provide details about your Splunk deployment
$server.version$: Splunk version number
$server.build$: Splunk build number
$server.serverName$: name of server hosting Splunk

Job information tokens provide data specific to a search job


$job.resultCount$: search job result count
$job.earliestTime$: initial job start time
$job.runDuration$: time for search job completion
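As an illustrative sketch (not the lab solution), the Event field of a Log Event action could combine plain text with several of the tokens above:

Alert $name$ (severity $alert.severity$) triggered at $trigger_time$ with $job.resultCount$ results; first result sourcetype: $result.sourcetype$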

Token Types for Log Events (cont.)
Refer to the Splunk Documentation Alerting Manual for a full list
of tokens

Searching Log Events

Using a Webhook Alert Action
• Allows you to define custom callbacks on a web resource
• When an alert triggers, a webhook action:
– Generates JSON-formatted info about the alert
– Sends an HTTP POST request to the specified URL with the alert info
in the body
• Why use a webhook alert action?
– Generate a ticket for BCG or other vendor ticketing systems
– Make an alert message pop up in a chat room
– Post a notification on a web page

Webhook Data Payload
The webhook POST request
sends a JSON data payload that
includes:
• Result is the first row/event from
the triggered search results
• Search ID (SID) for the saved
search that triggered the alert
• results_link is the URL to search
results
• Search owner and app
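A simplified sketch of the JSON data payload a webhook POST might carry, built from the fields listed above (all values are illustrative, not taken from the lab environment):

{
  "sid": "scheduler__admin__search__RMD5...",
  "results_link": "http://localhost:8000/app/search/@go?sid=...",
  "search_name": "Web Errors Alert",
  "owner": "admin",
  "app": "search",
  "result": {"host": "www1", "count": "42"}
}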
Creating a Webhook Alert Action
index=web sourcetype=access_combined status!=200

Choose the Webhook action and enter the URL where you want the POST request sent (default value shown).

The JSON data payload is automatically generated from search metadata and results.
Module 4 Knowledge Check
• True or False: When using the outputlookup command, you can
use the lookup's filename or definition.
• What does a webhook POST request send?
• To use the Log Event alert action, the user will need ___ privileges or the edit_tcp capability.

Module 4 Knowledge Check
• True: When using the outputlookup command, you can use the
lookup's filename or definition.
• What does a webhook POST request send? A JSON data payload
• To use the Log Event alert action, the user will need admin privileges or the edit_tcp capability.

Lab Exercise 4
Time: 20 minutes
Task:
Create a scheduled alert that logs events detailing how many client IPs
are experiencing web server errors from buttercupgames.com.

Module 5:
Extracting Fields
at Search Time

Module Objectives
• Review the Field Extractor
• Use regex (regular expressions)
• Use the erex command
• Use the rex command
• Identify regex best practices

Field Extraction Methods
• Can be persistent or temporary, depending on use case
• Choose between ease and precision

              Ease of use                 Precision
Persistent    Use Field Extractor (FX)    Manually code a regular expression
Temporary     Use erex SPL command        Use rex SPL command

Reviewing the Field Extractor
• Splunk provides the Field
Extractor (FX) graphical UI
to extract fields
• FX offers two methods to
perform a field extraction:
– Regular Expression: use for
unstructured data (e.g., a
system log file)
– Delimiters: use for structured data (e.g., a CSV file)
Note
Using the Field Extractor was discussed in Splunk Fundamentals 2.

Reviewing the Field Extractor (cont.)
1. If you choose the Regular Expression method, FX generates a regex
2. You can edit that regex to more precisely match your needs

Reviewing the Field Extractor (cont.)
Best practice: Use the FX to generate an initial regex, then edit it to your
specifications for best performance and accuracy
Warning
Once you edit the regex, you
can’t return to the automatic Field
Extractor UI workflow.

What Is Regex?
• A regex (regular expression) is a case-sensitive sequence of
characters defining a pattern
• Each character is either a regular character (with literal meaning)
or a metacharacter (with special meaning)
• Widely used in programming and scripting languages for a variety
of string processing tasks
regex example for email addresses

\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9_-]+\.[a-zA-Z]{2,}\b
Note
Splunk uses Perl-compatible
regex.
Basics of Regex
regex: cat
Regular characters are treated literally
Matches: cat

regex: c.t
A . is treated as a wildcard and will match any character
Matches: cat, cut, c1t, c#t, …and many others…

regex: c\.t
A \ is used to “escape” characters so they can be treated as literal characters
Matches: c.t

Regex Captures
• Regex can “capture” part of the matching pattern by using
parentheses
• You can reference the capture by giving it a name using: ?<name>
TraderID:(?<TraderID>\S+)
• TraderID: — what comes before the capture group; in this case, the pattern “TraderID:”
• ?<TraderID> — what the capture group will be named
• \S+ — what will be captured; in this case, the next non-whitespace characters

• With some older versions of regex, a “P” must be inserted in


order to perform named captures (?P<> )

Regex Examples
user\s(\w+) captures the word following user
Failed password for invalid user fpass from 211.24.4.4
Successfully captures fpass
Failed password for invalid user jean-luc from 211.25.4.4
Doesn’t successfully capture jean-luc because “-” isn’t a “word”
character

user\s(\S+)
Failed password for invalid user jean-luc from 211.25.4.4
Successfully captures jean-luc

Temporary Field Extraction
• Temporary field extraction is also known as extracting fields at
search time
• Extraction only exists for duration of search, doesn't persist as
knowledge object
• Good for rarely used fields
• Splunk offers 2 search time extraction commands
– erex: don’t have to know regex, just provide example values
– rex: must write regex, finds data that matches pattern

erex Command
...| erex fieldName examples="example1, example2,..."

• Extracts a field based on example values you provide


• The examples used must be in the returned results
• fieldName is the name of the new field created for this search

erex Command Example
Scenario ?
Sec Ops wants to display the IP address and port of potential attackers. The field port does not currently exist and would need to be created.

index=security sourcetype=linux_secure port "failed password"
| erex port examples="3572,2471"
| table src_ip, port

1. Creates a temporary new field, port
2. Extracts values using the examples provided (3572 and 2471)
3. To view the regex generated by your search, click the Job drop-down menu
rex Command
...| rex [field=field] "regex-expression"

• Matches the value of the field against unanchored regex


• field is any available field you want to extract information from;
defaults to field=_raw
• The regex-expression must include the capture, which would
include the field name and match pattern
• You can use erex to generate an initial regex, then edit it to your
specifications for use with the rex command

rex Command Example 1
Scenario ?
Display the usernames of potential email attackers.

index=network sourcetype=cisco_esa mailfrom=*
| rex "\<(?<potentialAttacker>.+)@"
| table potentialAttacker

• The Cisco email security (cisco_esa) data contains the email addresses of those sending email to the company
• Use rex to extract just the local-part of the email address at search time

rex Command Example 2
Scenario ?
Display the usernames and mail domains from which employees are receiving email.

index=network sourcetype=cisco_esa mailfrom=*
| rex field=mailfrom "(?<potentialAttacker>.+)@(?<domain>.+)"
| table mailfrom, potentialAttacker, domain

• The field mailfrom is being used to create two new fields, potentialAttacker and domain
• Use rex to extract them at search time
• Can perform multiple extractions in one rex command

Note
By default, the scope of the rex command is _raw. You can limit the scope of the command by specifying a particular field using the field argument. This can significantly reduce the complexity of your regex expression.

rex Command: sed Mode
...| rex [field=field] mode=sed "sed-expression"

• Search and replace within a field using a sed (Unix stream editor)
expression
• Use the s flag to replace strings or the y flag to
substitute characters
• Example sed expressions:
– "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g" matches the regex to a
series of numbers and replaces them with an anonymized string
– "y/string1/string2/" substitutes “string2” for “string1”

rex Command: sed Mode Example
Scenario ?
Customer Success would like to view the number of transactions by account code in the last 4 hours. However, the account code must be masked for the shared dashboard.

index=sales sourcetype=sales_entries AcctCode
| stats count, values(TransactionID) as TransactionID by AcctCode
| rex field=AcctCode mode=sed "s/(\w{4}-)\S+/\1xxxx/g"

The forward slash / splits the sed expression "s/(\w{4}-)\S+/\1xxxx/g" into four parts:
• The option (s for replace or y for substitute)
• The regex
• The replacement
• The flag (g stands for global, so all matching occurrences are replaced)
erex vs. rex
Scenario ?
Display IP addresses and ports of potential attackers.

erex:
index=security sourcetype=linux_secure port "failed password"
| erex port examples="3048,2601"
| table src_ip, port

rex:
index=security sourcetype=linux_secure port "failed password"
| rex "port\s(?<port>\d+)"
| table src_ip, port

erex
• Easier because regex knowledge is not needed
• Generates a regex expression
• Must constantly provide examples from current data
• Should not be used in saved reports

rex
• Must know regex (difficult)
• Don’t have to provide examples
• Can use regex generated by erex and then customize as needed
• Can be used in saved reports
Regex Best Practices
• Avoid “backtracking” by writing simple and concise regex
– Backtracking occurs when the regex engine must return to a
previously saved state to continue its search for a match causing the
engine to make multiple passes
• Quantifiers (e.g. *) and alternation constructs (e.g. |) are powerful
but can slow performance by causing backtracking
• Use + rather than *
• Avoid using multiple .* matches
• Avoid greedy operators (.*); use non-greedy (.*?) instead (see the sketch below)
Note
For regex examples, see Appendix A.
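As an illustration of the greedy vs. non-greedy point, both of these extract a value into a temporary user field from the linux_secure events shown earlier, but the second avoids unnecessary backtracking (the capture name is illustrative):

... | rex "user\s(?<user>.*)\sfrom"
... | rex "user\s(?<user>.*?)\sfrom"

The greedy .* first grabs the rest of the line and then backtracks to find " from", while the non-greedy .*? stops at the first " from" it reaches.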
Benchmarking Field Extraction

Check search.kv in job


inspector to see time
taken to apply field
extractions to events

Note
There are many books,
tutorials, and tools
available for regex. The
Splunk Wiki has pages on
Regex Syntax in Splunk
and Regex Testing Tools.

Module 5 Knowledge Check
Fill in the blanks with the four methods of field extraction:

Ease of use Precision

Persistent ??? ???

Temporary ??? ???

Options: erex SPL command, manually code a regular expression,


rex SPL command, Field Extractor (FX)
Module 5 Knowledge Check
Fill in the blanks with the four methods of field extraction:

              Ease of use               Precision
Persistent    Field Extractor (FX)      Manually code a regular expression
Temporary     erex SPL command          rex SPL command

Lab Exercise 5
Time: 30 minutes
Task:
Use the erex and rex commands to extract fields at runtime and
include or exclude events based on pattern matching

Module 6:
Working with
Self-Describing Data

Module Objectives
• Define self-describing data
• Read JSON and XML files
• Use the spath command
• Use the eval command with the spath function
• Extract fields from table-formatted events with multikv

What Is Self-Describing Data?
• The structure is embedded in the data itself
• Comprised of metadata which may include:
– Properties/elements/attributes
– Data types/items
– Compression/encoding scheme
– Other info

JSON and XML Examples
• Splunk extracts fields from JSON and XML based on the formatting
• Both consist of:
– Collections of name/value pairs
– Ordered lists of values (arrays)

JSON
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}

XML
<menu id="file" value="File">
  <popup>
    <menuitem value="New" onclick="CreateNewDoc()" />
    <menuitem value="Open" onclick="OpenDoc()" />
    <menuitem value="Close" onclick="CloseDoc()" />
  </popup>
</menu>

Note
Indexes in JSON and XML are slightly different. In JSON, numbering begins with 0; in XML, it begins with 1.

Automatically Interpreting JSON
When JSON data is onboarded and given a sourcetype configured for JSON, Splunk automatically extracts fields at search time (otherwise, no fields are extracted)

index=systems sourcetype=colors_json

Note
In props.conf, AUTO_KV_JSON is a parameter that tells Splunk whether to try for JSON extraction automatically. It defaults to true.

Interpreting XML
By default, XML data doesn't extract fields automatically
index=systems sourcetype=status_definitions

Note
Admins can configure the
sourcetype with KV_MODE=XML to
automatically extract fields at
search time.
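A minimal props.conf sketch of the two settings mentioned on these slides, using the course sourcetypes as examples (exact stanzas depend on your deployment and are normally managed by an admin):

[colors_json]
AUTO_KV_JSON = true

[status_definitions]
KV_MODE = xml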
Using spath with XML
The spath command interprets the XML structure so you can
access the data as Splunk fields
index=systems sourcetype=status_definitions
| spath

Note
Field names include the
parent elements (datapath).
spath Command
...| spath [input=field][output=field][path=datapath|datapath]

• Extracts fields from XML or JSON data


• Can be used alone or with optional arguments:
– input: field from which data is to be extracted (defaults to _raw)
– output: data to be extracted is written to this field (defaults to value of
the path argument)
– path: the location path to the value you want to extract
◦ Valid syntax is path=datapath or just datapath
◦ If not specified, extracts all fields from the first 5000 characters of input

spath Command: Location Steps
• path argument can contain one or more location steps, separated
by periods
• Each step contains field name and optional index (position) in
curly brackets
– If index is an integer, it specifies the position of data in an array
– If index is a string preceded by an @ symbol, it specifies an
XML attribute
• Examples:
– recordings.album.artist
– entities.hashtags{3}.text
– purchases.book.title{@yearPublished}
spath Example 1
Scenario ?
IT wants to create a table containing status, description, and status type from data in an XML file.

index=systems sourcetype=status_definitions
| spath
| table root.*
| rename root.row.* as *

spath Example 2
Use curly brackets to indicate the index of an array

Scenario ?
The Dev team wants to display a table of popup menu values from data in a JSON file.

index=systems sourcetype=menu_json
| spath output=menuItems path=menu.popup.menuitem{}.value
| table menuItems

Note
If you do not use the output option, then you should rename the field to remove the {}. Doing so avoids issues that can occur when using field names containing {} with certain commands.

spath Example 3
Use the @ symbol to specify an XML attribute

Scenario ?
The Docs team wants to extract a table of book publication dates from an XML file.

index=systems sourcetype=library_xml
| spath output=publicationDates path=purchases.book.title{@yearPublished}
| table publicationDates

spath Example 4
If a field has a valid structure, i.e. contains a block of JSON or XML,
then spath can be used to extract additional fields

index=systems sourcetype=server_log

spath Example 4 (cont.)
To extract data from the system_info field, invoke the spath
command with the input option
index=systems sourcetype=server_log
| spath input=system_info

Using the eval Command spath Function
• As an alternative to the spath command, you can also use the spath function of the eval command
• spath(X,Y) where:
  X: input source field
  Y: XML or JSON formatted location path to the value you want to extract from X

index=systems sourcetype=status_definitions
| eval description = upper(spath(_raw, "root.row.status_description"))
| table description
Table-Formatted Event Example
• Some data types are formatted as large single events in a table
• Each event contains titles with tabular values
– Field names are derived from the title row
– All other rows represent values

index=main sourcetype=netstat

Extracting Fields Using the multikv Command
• For table-formatted events, multikv creates an event for each row
• Field names derived from header row of each event
index=main sourcetype=netstat

index=main sourcetype=netstat
| multikv

Warning
Students do not have
access to the main index.
Using multikv: fields and filter Options
• Use fields option to extract only specified fields
• Use filter option to include only table rows containing at least one
field value from specified list
index=main sourcetype=netstat
| multikv fields LocalAddress ForeignAddress State filter ESTAB LISTEN
| table LocalAddress, ForeignAddress, State

Note
The arguments supplied to
the filter option are treated
like they are connected using
OR operators.
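A short follow-up sketch: once multikv has split the table into per-row events, the extracted fields behave like any other fields, so they can feed transforming commands (this assumes the netstat example above, which students cannot run against the main index):

index=main sourcetype=netstat
| multikv fields LocalAddress ForeignAddress State filter ESTAB LISTEN
| stats count by State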

Module 6 Knowledge Check
• True or False: By default, JSON and XML data extract fields
automatically.
• The path argument of the spath command can contain one or
more ___ ___.
• If no ___ argument is defined for the spath command, the
extracted data will be written to the value of the path argument.
• For table-formatted events, ___ creates an event for each row.

Module 6 Knowledge Check
• False: By default, JSON data with a JSON-configured sourcetype extracts fields automatically, but XML data does not.
• The path argument of the spath command can contain one or
more location steps.
• If no output argument is defined for the spath command, the
extracted data will be written to the value of the path argument.
• For table-formatted events, multikv creates an event for
each row.

Lab Exercise 6
Time: 35 minutes
Tasks:
• Extract fields from an XML file using the spath command and the spath
function of the eval command
• Analyze the performance of a Linux server based on a system log
• Display an area chart comparing the average CPU and RAM
percentages used over the last week
• Display a line chart comparing the average free RAM and used RAM
over the last 24 hours

Module 7:
Exploring Search Macros

Module Objectives
• Review search macros
• Use nested search macros
• Preview search macros before executing
• Use tags and event types in search macros

Reviewing Search Macros
• Useful for frequently run searches with similar syntax
• Time range selected at search time
• Can be a full search string or a portion of a search that can be reused in multiple places
• Allows you to define arguments within a search segment and pass parameters at execution time

index=sales sourcetype=vendor_sales VendorCountry="United States"
| stats sum(price) as USD by product_name
| eval USD = "$".tostring(USD, "commas")
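As a sketch of how part of this search could become a reusable macro (the macro name sales_usd is illustrative, not the lab solution):

Definition of sales_usd:
stats sum(price) as USD by product_name
| eval USD = "$".tostring(USD, "commas")

Usage:
index=sales sourcetype=vendor_sales VendorCountry="United States"
| `sales_usd`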

Reviewing Search Macros (cont.)
1. Test the macro definition (i.e. the search string) before creating the macro
2. Choose the destination app
3. Create the macro name that will be used in search

Macros are called to the search line by wrapping the macro name in backticks (the backtick key location differs on Mac and Windows keyboards)

Note
Creating search macros was discussed in Splunk Fundamentals 2.
Reviewing Search Macros: Arguments

1. Add arguments by including the number of arguments in parentheses after the macro name
2. Specify argument names and surround them with $
3. The order of arguments must match when running the macro in search

index=sales sourcetype=vendor_sales VendorCountry IN(Germany, France, Italy)
| `monthly_sales(Euro,€,0.85)`
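A sketch of what a monthly_sales(3) definition might look like, with the three arguments named currency, symbol, and rate (the argument names and definition are illustrative, not the lab solution):

stats sum(price) as $currency$ by product_name
| eval $currency$ = "$symbol$".tostring(round($currency$*$rate$,2), "commas")

Calling `monthly_sales(Euro,€,0.85)` then substitutes Euro, €, and 0.85 for the three argument tokens before the search runs.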

Reviewing Search Macros: Expand Search
To expand search macros, use the keyboard shortcut:
• Windows or Linux: Ctrl + Shift + E
• Mac OS X: Command + Shift + E

`monthly_sales(Euro,€,0.85)`

Splunk displays the expanded search string, resolving all nested search macros
(If syntax highlighting or line numbering is enabled, these features are displayed as well)

Note
If the macro is private, you need to run the search before expanding.
Using Nested Search Macros
Macros can be nested within each other
1. Create “inner” macro first
2. Put “inner” macro name surrounded by backticks in definition of
“outer” macro

(Screenshots: the "inner" macro definition and the "outer" macro definition that references it; see the sketch below.)
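A minimal sketch of the nesting pattern (macro names and definitions here are illustrative, not the course's actual definitions):

Inner macro status_filter:
status=404

Outer macro not_found_by_location:
`status_filter` | stats count by location

A search such as index=network sourcetype=cisco_wsa_squid `not_found_by_location` expands both macros before it runs, becoming index=network sourcetype=cisco_wsa_squid status=404 | stats count by location.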
Using Nested Search Macros Example
Scenario ?
Display the count of web security events with status 404 by location during the last 7 days.

1. Execute the outer search macro…
index=network sourcetype=cisco_wsa_squid status=404 | `location_count`

2. …which references the inner search macro, location_count…

3. …so that the stats command is processed and the search is executed…
index=network sourcetype=cisco_wsa_squid status=404 | stats count by location

4. …which returns events!

Using Nested Search Macros with Arguments
• Arguments can be passed from “outer” macro to “inner” macro
• Provide variable values at search time

(Screenshots: the "inner" and "outer" macro definitions with an argument passed between them; see the sketch below.)
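A comparable sketch of passing an argument from an outer macro to an inner one (all names here are illustrative, not the course's actual definitions):

Inner macro count_by(1), argument field:
stats count by $field$

Outer macro count_report(1), argument field:
`count_by($field$)` | sort -count

Calling ...| `count_report(dept)` passes dept through to the inner macro, so the search expands to ...| stats count by dept | sort -count.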
Using Nested Search Macros with Arguments (cont.)
Scenario ?
Display the count of web security events with status 404 by department during the last 60 minutes.

1. Pass a value to the outer search macro at execution time…
index=network sourcetype=cisco_wsa_squid status=404 | `stats_count(dept)`

2. …which passes the value to the inner macro so that…

3. …the stats command is processed and the search is executed using that value…
index=network sourcetype=cisco_wsa_squid status=404 | stats count by dept

4. …which returns events!

Using Other Knowledge Objects with Macros
• Reference knowledge objects in macros just as you would from the search bar—field aliases, calculated fields, tags, etc.
• The following slides show how to use tags and event types with macros

Using Tags/Event Types with Macros
index=network sourcetype=cisco_wsa_squid status=404

1. Run a search and verify that all results meet your event type criteria
2. Save As > Event Type
3. Provide a Name and optional Tags for your event type
4. Save

Using Tags/Event Types with Macros (cont.)
Settings > Advanced Search > Search Macros > New Search Macro
Settings > Advanced Search > Search Macros > New Search Macro
1. Choose destination app
2. Name the macro
3. Include the event type OR tag in the macro definition

Using Tags/Event Types with Macros (cont.)
Scenario ?
Display the count of web security events with status 404 by location during the last 7 days.

1. Execute the search macro…
tag=wsa-not-found | `location_count`

2. …which references the tag wsa-not-found…

3. …which references the event type search string and pipes results to the location_count macro that executes the stats command…
index=network sourcetype=cisco_wsa_squid status=404 | stats count by location

4. …and returns results!

Lab Exercise 7
Time: 35 minutes
Tasks:
• Build a macro that accepts a first and last name and returns information about
whether that employee scanned into more than one office
• Create a tagged event type to identify all attempted online purchases with
invalid HTTP status codes
• Create a macro that enables users to choose between generating summary and
average statistics
• Create a macro that enables users to specify a status code when performing a
search
• Create a macro that uses the tagged event type and both previously
created macros
Module 8:
Using Acceleration Options

Module Objectives
• Describe acceleration
• Determine how summaries make search efficient
• Identify acceleration methods

What is Acceleration?
• A Splunk feature that relies on summaries of event data to speed
up search performance
• There are 3 acceleration methods:
– Report acceleration
– Summary indexing
– Data model acceleration

Note
Data model acceleration is the easiest
and most efficient acceleration option
and should be your first choice. We'll
discuss data model acceleration in
Module 11.

How Summaries Make Searches Efficient
• Searches run against summaries should complete much
faster because:
– Summaries are considerably smaller than the original data set from
which they were generated
– Summaries contain only the data needed to fulfill the searches run
against them
• Summaries can be automatically or manually created; this is
determined by what searches are being accelerated

Terms
• The terms "acceleration" and "summary" are not interchangeable
• Report acceleration, data model acceleration, and summary indexing
are all acceleration methods that rely on summaries
• The differences in these methods are:
– How they are made
– How they are maintained
– How they are used

• These differences are discussed in more detail in the


following 4 modules

Types of Acceleration Methods
Report Acceleration
• How it's made: Search is saved as a report and the Accelerate Report option is chosen
• How it's accelerated (i.e. how the summary is generated): Splunk automatically creates summaries
• How to use: Run the report

Summary Indexing
• How it's made: User manually creates a scheduled populating search and enables Summary Indexing
• How it's accelerated: The summary is updated every time the populating search is run
• How to use: Run a search against the summary

Data Model Acceleration
• How it's made: Automatically created when a user enters the Pivot editor or an admin forces persistent acceleration for a data model
• How it's accelerated: Splunk automatically creates summaries
• How to use: Search the data model in Pivot or use a persistently accelerated data model

Module 9:
Report Acceleration

Module Objectives
• Describe report acceleration
• List the requirements for report acceleration
• Identify which commands must be and can be used for report
acceleration and the order in which they should appear
• Identify who can accelerate a report and how to do it
• Understand scenarios where Splunk does not create an
acceleration summary for a report
• Search against an acceleration summary

Report Acceleration
• Reports that span a large volume of data can:
– Take a long time to complete
– Consume a lot of system resources
• Accelerated reports run off acceleration summaries which:
– Store only the data needed to fulfill the report
– Are automatically populated in the background

Accelerated reports run faster because they are running off curated, updated data
Report Acceleration: How to Qualify
• Reports must meet certain guidelines to be eligible for
report acceleration:
– Users must have the schedule_search privilege and
accelerate_search capabilities (power and admin users
have this by default)
– Search mode must be set to either Smart or Fast
– Search must include a transforming command
Report Acceleration: Commands
An accelerated report must include transforming commands and
may include streaming and non-streaming commands:
Must be included:
– Transforming commands: order results into a data table
  ...| stats    ...| timechart    ...| top

May be included, however order is important:
– Streaming commands: operate on each event as it is returned by search
  ...| eval    ...| search    ...| fields    ...| rename
– Non-streaming commands: execute after all events are returned
  ...| table    ...| sort    ...| fillnull

Note: This is not a full list. Refer to the Search Reference manual for a more detailed list.
Report Acceleration: Commands (cont.)
Distributable streaming commands typically run on the indexers and are the only command type allowed before a transforming command (they are allowed after too). Execution depends on command order:
... | <transforming command> ... | <distributable streaming command> ...
... | <distributable streaming command> ... | <transforming command> ...

Centralized streaming commands always execute at the search head and are only allowed after a transforming command:
... | <transforming command> ... | <centralized streaming command> ...
Report Acceleration: To Summarize
If there are any commands that come before the transforming command, they must be distributable streaming commands:

| eval ... | stats ...            (qualifies: eval is distributable streaming)
| transaction ... | stats ...     (does not qualify: transaction is not a distributable streaming command)

If there are any commands that come after the transforming command, they can be streaming (distributable or centralized) or non-streaming commands:

| stats ... | eval ...
| stats ... | eventstats ...
What Reports Qualify for Acceleration?
index=web sourcetype=access_combined action=purchase status=200
| stats sum(price) as revenue by productId
| eval revenue = "$".revenue
(qualifies: stats is transforming and eval comes after it)

index=web sourcetype=access_combined action=purchase status=404
(does not qualify: no transforming command)

index=web sourcetype=access*
| fields price action host
| chart sum(price) over action by host
(qualifies: fields is distributable streaming and comes before the transforming chart)

index=web sourcetype=access_combined
| transaction clientip startswith="view" endswith="purchase"
| stats avg(duration) as avgDuration
(does not qualify: transaction is not a distributable streaming command and comes before the transforming stats)

Note: Refer to the Search Reference manual for commands and their types.
Report Acceleration Flowchart
1. What is your role? Power and Admin users can continue. Other users should run a regular search (or check with an admin about being granted the schedule_search and accelerate_search privileges).
2. Does your search have a transforming command? If not, run a regular search.
3. Are there any commands before the transforming command? If yes, are they distributable streaming commands? If they are not, run a regular search.
4. What is the search mode? (Splunk changes the mode to Smart when you accelerate.)
5. If all of the above checks pass, you can accelerate the report!
Accelerating a Report
1. Create a qualifying search and save it as a report:
   index=sales sourcetype=vendor_sales
   | stats values(Vendor) as Vendor sum(price) as revenue by VendorCity
   | sort 10 -revenue
2. After it is saved, click Acceleration
3. Check the box next to Accelerate Report and choose the Summary Range
Report Acceleration: Summary Range
• Determines how much time
the acceleration summary
spans relative to now
• Searches within the time
range will only use summary
data
• Splunk automatically
removes older summary data
that ages out of range

Note: Report acceleration features automatic backfill. If for some reason you have a data interruption, Splunk software can detect this and automatically update or rebuild your summaries as appropriate.
Accelerating a Previously Saved Report
A previously saved report can be accelerated, too:
1. Click on Reports in the app navigation bar and select a saved report
2. Click Edit > Edit Acceleration and enable acceleration for the qualifying report
Viewing Accelerated Reports
Once accelerated, a lightning bolt appears next to the saved
report in Settings > Searches, Reports, and Alerts
Acceleration Summary Not Created
Even if report acceleration is enabled, Splunk may not create an
acceleration summary
Is the number of events returned > 100,000 hot bucket events?
– No: run as a normal search
– Yes: will the summary be too large?
  • Yes: suspend summary creation for 24 hours and check again
  • No: create the acceleration summary

Note: A summary is too large if it will exceed 10% of your total bucket size in your deployment. (Refer to docs for more information.)
Acceleration Summary Not Created (cont.)
• Some searches run faster without a summary if:
– There are fewer than 100K events in hot buckets covered by summary range
– Summary size is projected to be too big
• If acceleration summary was defined and not created for the above
reasons, Splunk:
– Continues to check periodically
– Automatically creates a summary if/when the report meets the requirements
Acceleration Summary Created
• Acceleration is a good option for reports that call on 100k or
more events from hot buckets for the summary range selected
• Splunk automatically populates acceleration summaries every
10 minutes
• Report acceleration summaries are stored by time alongside
buckets in your indexes
– This
is different from summary indexes which are stored at the search
head (more on this in the next module)
Acceleration Summary Created (cont.)
Splunk automatically shares summaries with users who have
access to the accelerated report
Users of an accelerated shared report benefit from having access to the acceleration summary for that report. Any searches run by these users pull data from the acceleration summary when possible.
Searching Against an Acceleration Summary
• In addition to saved accelerated reports, ad hoc searches can use
the summary when:
– Search criteria matches the base saved search
– The user executing the ad hoc query has permission to the acceleration
summary
• You can also append the search string with additional commands,
for example:
Populating Search:
index=web sourcetype=access_combined
| stats count by price

Ad Hoc Search:
index=web sourcetype=access_combined
| stats count by price
| eval discount = price/2
Using Summaries
• The Job Inspector shows when summaries are being used for a
search
• Deleting all reports that use an acceleration summary automatically deletes the acceleration summary
Viewing Report Acceleration Summaries
Settings > Report Acceleration Summaries
– Summary ID and Normalized Summary ID: unique hashes assigned to the
summary (clicking these hashes loads the summary details page)
– Reports Using Summary: saved reports associated with the summary
Viewing Report Acceleration Summaries (cont.)
Settings > Report Acceleration Summaries
– Summarization Load: calculation of effort to update the summary
  SL = time to run populating report / interval of populating report
  (for example, a populating report that takes 60 seconds to run on a 600-second interval has a Summarization Load of 0.1)
– Access Count: how often the summary is used

Note: If Summarization Load is high and Access Count is low, consider deleting the summary.
Viewing Report Acceleration Summaries (cont.)
Settings > Report Acceleration Summaries
– Summary Status: either % of summary complete at that moment, or a
status value
• Summarization not started
• Pending: the search head is about to schedule a new update for the summary
• Building summary
• Complete
• Suspended: summary size too big to be useful
• Not enough data to summarize: summary size too small (fewer than 100K events)
Viewing Summary Details
• Click on summary ID to view
Summary Details
– Size on Disk: how much storage
space the summary takes up
– Summary Range: range of time
spanned by the summary, relative to
present moment
– Timespans: size of data chunks
comprising the summary
– Buckets: number of index buckets the
summary spans
– Chunks: number of data chunks
comprising the summary
Viewing Summary Details (cont.)
• Actions
– Verify: examines a subset of the
summary and verifies that all
examined data is consistent
– Update: updates the summary
– Rebuild: rebuilds the summary
from scratch
– Delete: deletes the summary
Note
If an accelerated report isn't returning expected results, it may
be that an underlying tag, event type, or field extraction
rule was changed. If that happens, use Verify to determine
whether data is consistent. If verification fails, use Rebuild
to recreate the summary.
Module 9 Knowledge Check
• True or False: By default, Power and Admin users have the
privileges that allow them to accelerate reports.
• An accelerated report must include a ___ command.
• Which command type is allowed before a transforming command
in an accelerated report?
• True or False: Report acceleration summaries are stored on the
search head.
• Does this search qualify for report acceleration?
index=network sourcetype=cisco_wsa_squid
| fields http_content_type dept
| chart count by http_content_type, dept useother=false
Module 9 Knowledge Check
• True: By default, Power and Admin users have the privileges that
allow them to accelerate reports.
• An accelerated report must include a transforming command.
• Which command type is allowed before a transforming command
in an accelerated report? distributable streaming commands
• False: Report acceleration summaries are not stored on the search head; they are stored alongside buckets in your indexes.
• Does this search qualify for report acceleration? Sure does!
index=network sourcetype=cisco_wsa_squid
| fields http_content_type dept
| chart count by http_content_type, dept useother=false
Lab Exercise 9
Time: 20 minutes
Task:
Create a rolling 90 day report on all successful online purchases
and accelerate it
Module 10:
Summary Indexing
Module Objectives
• Describe summary indexing
• Identify when to use a summary index
• Define a summary index
• Create a report with si-commands
• Schedule the report
• Enable summary indexing
• Search against a summary index
• Avoid gaps and overlaps in summary indexes
Summary Indexing Overview
• An alternative option for reports that do not qualify
for report acceleration
• Your role needs write access to the index, like summary, to enable summary indexing
• Summary indexes are built and stored on the search head, by default
• A summary index is manually populated by a special type of scheduled report, discussed in the following slides

(Diagram: Forwarders → Indexers → Search Head, with the summary index on the search head. Summary indexes exist separately from other indexes in your deployment.)
Populating a Summary Index
• Populating searches for summary indexes must be manually created and:
  – Contain an si<command>
  – Must be scheduled to run more often (generally smaller intervals) than the subsequent searches
  – Must have summary indexing enabled and an index selected
• The summary index is updated every time the populating search is run
• Subsequent searches run against the specified summary index

(Diagram: Populating Search → Summary Index → Subsequent Searches)
Summary Indexing Transforming Commands
Use when defining the scheduled report that populates the summary
index
...| sichart       si- version of chart command
...| sitimechart   si- version of timechart command
...| sistats       si- version of stats command
...| sitop         si- version of top command
...| sirare        si- version of rare command
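For illustration only, a populating report built with one of these commands and a later search against the summary might look like the following sketch (the index, sourcetype, and report name here are assumptions for this example, not values used in the labs):

Populating report, scheduled to run hourly:
index=web sourcetype=access_combined action=purchase
| sitimechart span=1h sum(price) by product_name

Later search against the summary index, using the regular (non-si) version of the command:
index=summary search_name="summary web hourly sales by product"
| timechart span=1d sum(price) by product_name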
Summary Indexing Transforming Commands (cont.)
• A set of special fields is added to the summary index data; these
fields all begin with psrsvd ("prestats reserved")
• When a transforming command-containing search is run against
the summary index, these fields are used to calculate results that
are statistically correct
index=web sourcetype=access_combined
| sistats count by product_name

Note: For information about what the suffixes gc and v mean, visit the Knowledge Manager Manual.
Summary Indexing Transforming Commands (cont.)
index=web sourcetype=access_combined
| sistats count by product_name
1. Save as a report; the report will run frequently (e.g., sales_Summary_purchasedProducts)
2. Populating search results are formatted and stored as statistics in the summary index (the populating search reads from index=web and writes to index=summary)
3. For this example, subsequent searches must be run against the summary index

Note: Note how the populating search runs more frequently than the searches run against the summary.
Defining a Summary Index
1. Create a search with an si- command to populate the summary index:
   index=security sourcetype=linux_secure
   | sitop src_ip, user
2. Save the search as a report with a meaningful name
3. After the report is created, click Schedule

Note: The name of the report should indicate that it's populating a summary index.
Defining a Summary Index (cont.)
4. Click the Schedule Report checkbox
5. Enter an appropriate Schedule and Time Range

Note: Remember that the populating search should run more frequently than searches against the summary index.
Defining a Summary Index (cont.)
7. Go to Settings > Searches, reports, and alerts
8. Click Edit > Edit Summary Indexing for the report

Note: You must have appropriate permissions to edit summary indexing.
Defining a Summary Index (cont.)
9. Click the Enable summary indexing checkbox
10. Select the index type
11. Choose your summary index (summary by default)

Note: Not surprisingly, the default summary index is the summary index, but you can specify any index to which you have permission to write. Splunk recommends having different summary indexes dedicated to different types of data.
Searching Against a Summary Index
• You created a summary index that populates every day at 12 AM with data from the last 24 hours:
  index=security sourcetype=linux_secure
  | sitop src_ip, user
  (results are written to index=summary)

• After population, you can run efficient searches against the summary index based on the available historical data:
  index=summary search_name="summary linux_secure top src_ip user"
  | top src_ip, user
Benefits of Summary Indexing
• More efficient reports on large datasets over long time ranges
– Summary index searches run faster because they’re searching a much
smaller, more narrowly focused data set
• Amortize costs over different reports
– Summary indexing volume isn’t counted against your license, only the
primary data volume
– Cost of the populating search can be spread over different reports and
different, overlapping time ranges
• Enables creation of rolling reports

Note: By default, the summary indexing data does not count against your license because the sourcetype=stash. If the sourcetype is changed, then summary indexing will count against your license.
Avoid Gaps and Overlaps in Summary Indexes
• Be careful to schedule populating reports appropriately to avoid
gaps and overlaps
• Gaps in data can occur if:
– A populating report runs too long, past the next scheduled run time
– The time range for the report has a smaller time window than the
frequency of the report schedule
– Splunk goes down
• Use the fill_summary_index.py script to backfill gaps (an illustrative invocation is sketched below)
  – A backfill script runs saved searches to populate the summary index as they would have been executed at their regularly scheduled times
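As a rough sketch only, not an exact recipe: the app, report name, time range, job count, and credentials below are placeholder assumptions, and the available flags can vary by version, so check the Knowledge Manager Manual for the exact syntax. Run from $SPLUNK_HOME/bin:

splunk cmd python fill_summary_index.py -app search -name "summary linux_secure top src_ip user" -et -30d@d -lt now -j 4 -dedup true -auth admin:changeme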
Avoid Gaps and Overlaps in Summary Indexes (cont.)
• Overlaps appear as events in a summary index that share the
same timestamp
– Skews reports and statistics created from summary index
• Overlaps can occur if:
– You set the report time range to be longer than the frequency of the report
schedule
Other Summary Indexing Considerations
• Don’t pipe other search operators after main transforming
command in the populating search
– For example, don't include additional | eval commands
– Save extra search operators for the searches you run against the
summary index, not the search you use to populate it
• Results from an optimized search can’t be modified but you can
append commands to manipulate the output
– If you populate a summary index with sistats <args>, the only data
you can retrieve is stats <args>
– You can’t create or modify fields before the stats <args> command
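To make that concrete, here is a minimal sketch (the index, report name, and field names are assumptions for this example): the populating report uses only the si- command, and any field manipulation is saved for the search that runs against the summary index.

Populating report:
index=web sourcetype=access_combined action=purchase
| sistats sum(price) by product_name

Search against the summary index:
index=summary search_name="summary web sales by product"
| stats sum(price) as total_sales by product_name
| eval total_sales = "$".tostring(total_sales,"commas")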
Report Acceleration vs. Summary Indexing
• Report acceleration is both easier to use and more efficient
– Doesn’t require manually scheduling a populating search
– Doesn’t require specifying a summary index name in order to use
– Doesn't require a subsequent search to run against the summary index
• Remember, once an acceleration summary is created from a shared report, any report that can use it, will use it
• Use summary indexing for reports that don’t qualify for acceleration
Note
Data model acceleration is the easiest
and most efficient acceleration option
and should be your first choice. Data
model acceleration is discussed in the
next module.
Module 10 Knowledge Check
• True or False: Summary indexing is an alternative for reports that do not qualify for report acceleration.
• Summary indexes are stored on the ___.
• Events in a summary index that share the same timestamp
represent ___ in data.
• ___ can occur if the time range of the populating report is shorter
than the frequency of the report schedule.
Module 10 Knowledge Check
• True: Summary indexing is an alternative for reports that do not qualify for report acceleration.
• Summary indexes are stored on the search head.
• Events in a summary index that share the same timestamp
represent overlaps in data.
• Gaps can occur if the time range of the populating report is
shorter than the frequency of the report schedule.
Lab Exercise 10
Time: 25 minutes
Tasks:
• Create a scheduled report that returns the most frequently occurring
IP addresses that are making sshd connections on the web server
• Enable summary indexing on the scheduled report
• Search against the summary index
Module 11
datamodel Command &
Data Model Acceleration
Module Objectives
• Review data models
• Explore data models using the datamodel command
• Differentiate between ad hoc and persistent data model
acceleration
Reviewing Data Models
• Hierarchically structured datasets that generate searches and drive Pivot
• Pivot reports are created based on datasets
• Each event, search, or transaction is saved as a separate dataset

Note: Data models were discussed in Splunk Fundamentals 2.
datamodel Command
• Used to display the structure of a data model or to search against it
• Returns a description of all or a specified data model and its objects
• datamodel is a generating command so it must be used as the first command in a search following a leading | pipe

| datamodel

Note: Use the datamodel command by itself (without arguments) to display all the data models in your deployment that you have access to.
datamodel Command (cont.)
| datamodel AccButtercup_Games_Online_Sales
• If the name of the data model is included as the first argument, Splunk shows the details of that data model in JSON format
  | datamodel [data model name]
• Click the + next to objectNameList to show all dataset names in the selected data model
datamodel Command (cont.)
| datamodel AccButtercup_Games_Online_Sales
• To view the details of a dataset
within the data model, expand and
explore objects
– For example, for information about
the successful_purchase dataset,
click the third + under objects
datamodel Command (cont.)
Alternatively, you can display an object within a data model by
using the dataset name as the second argument
| datamodel [data model name] [dataset name]
| datamodel vsales apac
datamodel Command: Options
To view the events associated with the specified dataset, use the
search option
| datamodel [data model name] [dataset name] search
| datamodel vsales apac search
datamodel Command: Options (cont.)
The flat option returns the same results as search but field
names are "flattened" by stripping hierarchical information
| datamodel vsales apac search
| datamodel vsales apac flat
datamodel Command: Options (cont.)
• The dataset name and search argument aren’t valid unless
preceded by the data model name
• When using the datamodel command, the data model name and
dataset name are case sensitive
Data Model Acceleration
• Accelerating a data model accelerates an entire set of fields
defined by the data model
• Any pivot or report generated by the data model should complete
much quicker, even if the data model represents a large dataset
• Unlike report acceleration and summary indexing, there are two
types of accelerations for data models, each with its own creation
method, benefits, and limitations
Data Model Acceleration: Ad hoc
• An ad hoc acceleration summary is built on the search head anytime a user runs a pivot on a dataset
  – Summaries are created in dispatch directories on the search head
  – A summary is created for each user currently accessing a data model dataset in Pivot (increasing search head load)
• Storing acceleration summaries on the search head allows for the acceleration of all 3 root dataset types: event, search, and transaction (and their children)
Data Model Acceleration: Ad hoc (cont.)
• The initial acceleration summary is built over all time and as the
pivot is fine-tuned in the Pivot editor, the performance improves
• However, Ad hoc acceleration is temporary and only exists for the
duration of the pivot session
– Therefore, reports or dashboard panels made in a pivot session will
not benefit from ad hoc acceleration
Data Model Acceleration: Persistent
• Persistent data model acceleration builds dedicated summaries in
indexes and exists as long as the data model exists
• Reports and dashboard panels generated from persistently
accelerated data models complete more quickly
• Multiple users can access the summary at the same time
• Once accelerated, Splunk maintains the dedicated summary and
the summaries can be used by Pivot, datamodel, and tstats
Note
The tstats command is
discussed in the next module.
Persistent Data Model Acceleration Restrictions
• You must have admin permissions or the accelerate_datamodel
privilege to accelerate a data model
• Private data models can’t be accelerated
• Accelerated data models can’t be edited
• When accelerating a data model, only the following datasets are
accelerated:
– Root event datasets
– Root search datasets that only include streaming commands
Persistent vs. Ad Hoc Data Model Acceleration
Persistent:
– Explicitly defined before using
– Exists as long as the data model exists
– Can be scoped to specific time ranges
– Has some restrictions
– Reports run faster and perform better overall

Ad Hoc:
– The acceleration is built every time the Pivot editor is accessed
– Exists only for the duration of the user's pivot session
– Runs over all time (i.e., can't be scoped to a specific time range)
– Has no restrictions on use
– Reports run without any acceleration

The rest of this module will deal with persistent data model acceleration only.
Accelerating a Data Model
1. Click Settings > Data Models
2. Select a data model and click Edit > Edit Acceleration
3. Click the Accelerate check box and choose Summary Range
After You Accelerate a Data Model...
• Splunk builds an acceleration summary for the specified
summary range
• Summary takes the form of inverted time-series index (tsidx) files
that have been optimized for speed
• Files are stored in the index containing events that have fields
specified in the data model (parallel to their corresponding
index buckets)
• Each bucket in each index may contain multiple tsidx files, one
for each associated accelerated data model
After You Accelerate a Data Model... (cont.)
• Acceleration summary always contains a store of data that at least
meets the summary range (may slightly exceed)
• Splunk updates tsidx files every 5 minutes and removes
outdated data every 30 minutes
• Pivot reports with time ranges within summary range run against
acceleration summary rather than against source index data
Data Model Acceleration on the Indexers
Data model acceleration creates dedicated summaries containing
.tsidx files alongside buckets in your indexes
(Diagram: inside a bucket are .tsidx and raw files; the data model acceleration summary's .tsidx files sit alongside the index buckets on the indexers.)

tsidx files will exist parallel to the buckets that A) contain the events referenced in the file and B) cover the range of time for the summary.

Note: The role of tsidx files in persistent data model acceleration is discussed in more detail in the next module.
datamodel Command: summariesonly Option
• Returns results only from the tsidx data generated by the acceleration
and does not include unsummarized data
| datamodel [data model name] [dataset name] search summariesonly=true
• Maximizes speed of search execution

| datamodel vsales apac search
| datamodel vsales apac search summariesonly=true
Module 11 Knowledge Check
• If you run the datamodel command by itself, what will Splunk return?
• How would you edit this search to explore the events associated
with the dataset?
| datamodel Buttercup_Games_Online_Sales failed_add_to_cart ???
• True or False: Reports and dashboards created within Pivot from a persistently accelerated data model will not benefit from acceleration when the Pivot editor is closed.
• Which component stores acceleration summaries for ad hoc data
model acceleration?
Module 11 Knowledge Check
• If you run the datamodel command by itself, what will Splunk return?
Splunk will display all the data models you have access to
• How would you edit this search to explore the events associated with
the dataset?
| datamodel Buttercup_Games_Online_Sales failed_add_to_cart search
• False: Reports and dashboards created within Pivot from a persistently accelerated data model will not benefit from acceleration when the Pivot editor is closed. (They do benefit, because the persistent summary exists independently of the pivot session.)
• Which component stores acceleration summaries for ad hoc data
model acceleration? The search head
Module 12
tsidx Files &
tstats Command
Module Objectives
• Learn about tsidx files
• Work with tsidx files using the tstats command
• Use the tstats command with data models
• Determine which acceleration option to use
Index tsidx Files
• tsidx = "Time Series Index Files"
• Exist inside buckets alongside raw data files and consist of a lexicon
and a posting list and the indexed field::value combinations (by
default, these are host, source, and sourcetype)
Lexicon: the lexicon is an alpha-numerically ordered list of terms found in the data at index time, for example:
… Accepted djohnson Failed for from invalid nobody password port ssh2 sshd sysadmin …

Posting List: the posting list is an array of pointers that match each term to events in the raw data files. Splunk uses the pointers to search just the events that match the terms, making the search much more efficient.

Example raw events:
sshd[87755]: Accepted password for djohnson from 10.3.10.46 port 2988 ssh2
sshd[3954]: Failed password for invalid user sysadmin from 10.3.10.46 port 4759 ssh2
sshd[1268]: Failed password for mail from 10.3.10.46 port 1617 ssh2
sshd[4816]: Failed password for nobody from 10.3.10.46 port 4412 ssh2
sshd[5744]: Failed password for sync from 10.3.10.46 port 4664 ssh2
Data Model Acceleration on the Indexers
When you search against an accelerated data model, Splunk:
1. Retrieves information about the data model that has been stored on disk in the tsidx files that make up the acceleration summary
2. Will pull additional events from the index bucket files if the search is outside the data summary range

Note: In addition to data model acceleration, tsidx files play a huge role in the Splunk search process. This topic is discussed further in Advanced Searching and Reporting.
tstats Command
| tstats statsFunction [summariesonly=bool] [from datamodel=data_model_name]
[where searchQuery] [by fields]
• A generating command that performs statistical queries on tsidx files
• Perform a basic count or a (supported) function on a field by specifying
a statsFunction
• Specify the data model to read .tsidx files from using:
from datamodel=data_model_name (optional)
• Use where to filter results (optional)
• Results can be grouped by field(s) (optional)
tstats Command (cont.)
• tstats only searches tsidx files which means:
– Search is limited to only indexed fields in the tsidx
– tstats searches execute very fast because it does not open or read raw events
| tstats values(sourcetype) as sourcetype by index

This search was run over All Time and completed in 0.17 seconds!
tstats Command: from Clause
• Use the from clause to search through tsidx files not created at
index time such as data model acceleration summary tsidx files
• To query accelerated data model tsidx files, use the syntax
from datamodel=data_model_name
Scenario: TechOps wants a count of all web requests during the last 24 hours.

| tstats count from datamodel=AccButtercup_Games_Online_Sales
tstats Command: Without from Clause
Scenario: A user wants to count the events per index, for all indexes to which they have access.

| tstats count by index
| sort -count

If you don't use a from clause, a search is performed of indexed fields in the index tsidx files.

Note: Statistical queries can only be performed on indexed fields, not search time fields.
tstats Command: by Clause
Group by any number of fields using by field-list
Scenario: TechOps is reconfiguring the web servers and wants a count of all web requests per web server over the last 24 hours.

| tstats count from datamodel=AccButtercup_Games_Online_Sales by host
tstats Command: summariesonly Option
| tstats statsFunction [summariesonly=bool] [from datamodel=data_model_name]
• true or t returns results only from the tsidx data generated by the
acceleration and does not include non-summarized data
• false or f (default) generates results from both summarized and
non-summarized data
| tstats count from datamodel=AccButtercup_Games_Online_Sales
(last 4 months) 2629 results by scanning 25,342 events in 0.279 seconds

| tstats count from datamodel=AccButtercup_Games_Online_Sales summariesonly=t
(last 4 months) 2617 results by scanning 25,320 events in 0.054 seconds
tstats Command: summariesonly Option (cont.)
• When running a search with summariesonly set to false, you might
notice a larger result count because:
– Some or all of the index data may not have been added to the summary yet
– The search range may be greater than the acceleration's summary range
• If used with an unaccelerated data model, summariesonly=t produces no results

Accelerated data model:
| tstats count from datamodel=AccButtercup_Games_Online_Sales summariesonly=t

Unaccelerated data model:
| tstats count from datamodel=Buttercup_Games_Online_Sales summariesonly=t
Data Model Field Names When Using tstats
• To use a data model field with tstats, you'll need to reference its location in the data model with dot notation (owner.fieldName)
• Use the datamodel command to return details of the data model and its objects
  – Then, note the owner for the field you want to access

| datamodel AccButtercup_Games_Online_Sales

| tstats sum(http_request.price) from datamodel=AccButtercup_Games_Online_Sales
datamodel Notation When Using tstats
• If a data model has more than one accelerated root dataset, you
must specify the dataset you want by using dot notation
datamodel.dataset
• Fields follow dot notation as well: dataset.fieldname
| tstats sum(http_request.price) as tsales from datamodel=AccButtercup_Games_Online_Sales.http_request
  where (http_request.action=purchase AND http_request.status=200) by http_request.product_name
Field and Dataset Notation with tstats Example
Scenario: The Online Sales manager launched a new campaign yesterday. Provide her with the total sales for yesterday.

| tstats sum(http_request.price) as tsales from datamodel=AccButtercup_Games_Online_Sales.http_request
  where (http_request.action=purchase AND http_request.status=200) by http_request.product_name
| sort - tsales
| eval tsales="$".tostring(tsales,"commas")
| rename http_request.product_name as Product, tsales as "Daily Sales"
| fields Product, "Daily Sales"
Searching Unaccelerated Data Models with tstats
• tstats can search unaccelerated data models
• However, searches run the same as a normal search with no performance benefit
• A best practice is to use tstats with accelerated data models

| tstats count from datamodel=Buttercup_Games_Online_Sales by host
| sort -count
3 results scanning 272,746 events in 4.885 sec.

| tstats count from datamodel=AccButtercup_Games_Online_Sales by host
| sort -count
3 results scanning 272,746 events in 0.745 sec.
tstats Command: span Option
• If you group by _time, use span (e.g., span=3m) to group into time buckets
• If you don't specify a span, the value set by the time picker determines the range

Search Time Range    Default Span
5 minutes            5 seconds
15 minutes           10 seconds
60 minutes           1 minute
4 hours              5 minutes
24 hours             30 minutes
7 days               1 day
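For illustration, grouping results into hourly buckets with an explicit span might look like this (the index here is just an assumption for the example):

| tstats count where index=web by _time span=1h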
tstats Command: statsFunction
Scenario: ITOps wants a list of all source types by index.

| tstats values(sourcetype) as sourcetype by index

Most functions available for stats can be used with tstats.
tstats Command: Wildcards
• tstats does not support wildcard fields or field-list, however the
wildcard can be used in the where clause to search on field values
• You can specify:
| tstats count where host=w* by source
| sort -count
• But not these:
| tstats count(source*)
| tstats count where host=w* by source*
tstats vs stats for Indexed Fields
Scenario: IT is doing resource planning and wants the event load for the security index. Count the events for all time by source, sourcetype, and host. Sort descending on count and format with commas.

When working with a massive amount of data and using indexed fields, consider using tstats.

tstats (security index):
| tstats count as events where index=security by source, sourcetype, host
| sort -events
| eval events = tostring(events,"commas")
11 results by scanning 971,466 events in 0.08 sec

stats (security index):
index=security
| stats count as events by source, sourcetype, host
| sort -events
| eval events = tostring(events, "commas")
11 results by scanning 971,016 events in 1.59 seconds

tstats (all indexes):
| tstats count as events where index=* by source, sourcetype, host
| sort -events
| eval events = tostring(events,"commas")
101 results by scanning 12,945,032 events in 0.39 sec

stats (all indexes):
index=*
| stats count as events by source, sourcetype, host
| sort -events
| eval events = tostring(events, "commas")
101 results by scanning 12,944,443 events in 58.8 seconds
stats to tstats Search Optimization
• Any stats search that can be converted to use tstats instead is
converted automatically
• Any datamodel search using the stats command is converted
automatically to use tstats
• For example,
  search index=_internal | stats count
  is automatically converted to
  | tstats count where index=_internal

Note: stats to tstats optimization was introduced in Splunk Enterprise 7.3. datamodel/stats to tstats optimization was introduced in Splunk Enterprise 8.0.
• Greatly increases speed of searches that rely solely on indexed
fields or simple counts
Comparison of Data Summary Creation Methods
Report Acceleration:
• Uses automatically created summaries to speed completion times for qualified reports
• Easier to create than summary indexes and backfills automatically
• Depending on the defined time span, periodically ages out data
• Can correct gaps and overlaps from the UI "rebuild" feature
• Cannot create a "data-cube" and report on smaller subsets

Summary Indexing:
• Useful for speeding up searches that don't qualify for report acceleration
• Can persist after underlying events have been frozen by controlling retention period or index size
• Backfill is a manual (scripted) process

Data Model Acceleration:
• Uses automatically created summaries to speed completion times for pivots
• Takes the form of time-series index (tsidx) files
Module 12 Knowledge Check
• True or False: Index tsidx files and data model acceleration
summary tsidx files exist inside buckets in the index.
• True or False: The tstats command needs to come first in the
search pipeline because it is a generating command.
• True or False: tstats can only search accelerated data models.
• To search a data model acceleration summary with tstats, you
must use the ___ clause.
Module 12 Knowledge Check
• False: Index tsidx files exist inside buckets in the index, but data model acceleration summary tsidx files exist alongside buckets in the index.
• True: The tstats command needs to come first in the search
pipeline because it is a generating command.
• False: tstats can search accelerated data models, unaccelerated data models, and index tsidx files.
• To search a data model acceleration summary with tstats, you
must use the from clause.
Lab Exercise 12
Time: 20 minutes
Tasks:
• Display the number of indexed events by month for the last 365
days with the number and time formatted
• Explore the accelerated Vendor Sales data model
• Display a listing of the APAC vendors with retail sales of more than
$200 for the previous week
Course Wrap-Up
Appendixes
• Appendix A: More Regex
• Appendix B: Creating New Choropleth Maps
Up Next: Certification
• At this point, you are eligible to complete two certifications
• If you complete Advanced Searching & Reporting and Creating
Dashboards with Splunk you can get Advanced Power User Certified
• Check out the Splunk Certification Handbook for information about the
exams and the Certification Exam Study Guide for some sample
questions and other study material
(Certification path: Fundamentals 1 → Splunk User Certification; Fundamentals 2 → Splunk Power User Certification; Fundamentals 3 + Advanced Searching & Reporting + Creating Dashboards with Splunk → Splunk Core Advanced Power User Certification)
Up Next: Advanced Searching and Reporting
• For power users and app developers
• Understand the underlying Splunk architecture and use that
knowledge to improve the efficiency of your searches
• Use additional commands and functions
• Append results from a search to the results of another search
• Work with multivalue fields and advanced transactions
• Incorporate subsearches
• And more!
Recommendation
Minimum 6 months’ experience using the Splunk search language before taking this class.
Up Next: Advanced Searching and Reporting
• The lispy value determines which tokens to look for in the .tsidx
• lispy uses a variant of prefix notation where the operator appears before the operands
• Example search: index=web 76.169.7.252
– Square brackets group the expression
– The IP address has been segmented into 4 tokens, with the AND operator appearing before the tokens
– index=web tells Splunk to only look for buckets associated with web's .tsidx files
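A rough sketch of the lispy expression this search produces (illustrative only; the exact value is recorded in search.log and the token order can vary):
[ AND 169 252 7 76 index::web ]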
Go Faster with Our Welcoming Community & Ecosystem
• 102K+ Questions Answered
• 1900+ Apps on Splunkbase
• 130+ User Groups
• 2000+ Partners
Community
• Splunk Community Portal: splunk.com/en_us/community.html
– Splunk Answers: answers.splunk.com
– Splunk Apps: splunkbase.com
– Splunk Blogs: splunk.com/blog/
– Splunk Live!: splunklive.splunk.com
– .conf: conf.splunk.com
• Slack User Groups: splk.it/slack
• Splunk Dev Google Group: groups.google.com/forum/#!forum/splunkdev
• Splunk Docs on Twitter: twitter.com/splunkdocs
• Splunk Dev on Twitter: twitter.com/splunkdev
• IRC Channel: #splunk on the EFNet IRC server
SAVE THE DATE!
• October 18-21, 2021
• Las Vegas, Nevada
Splunk How-To Channel
• Check out the Splunk Education How-To channel on YouTube:
splk.it/How-To
• Free, short videos on a variety of Splunk topics
Support Programs
• Web
– Documentation: dev.splunk.com and docs.splunk.com
– Wiki: wiki.splunk.com
• Splunk Lantern
– Guidance from Splunk experts
– lantern.splunk.com
• Global Support
– Support for critical issues, a dedicated resource to manage your account – 24 x 7 x 365
– Web: splunk.com/index.php/submit_issue
– Phone: (855) SPLUNK-S or (855) 775-8657
• Enterprise Support
– Access customer support by phone and manage your cases online 24 x 7 (depending on support contract)
Appendix A: More Regex
Regex Pattern Matching Characters
Regex provides many wildcards and matching rules, such as:
Pattern  What it matches                                     Example   Matches
.        Any character                                       c.t       cat, c1t, c#t, c_t
\d       Any digit                                           c\dt      c1t, c0t
\D       Any non-digit                                       c\Dt      cat, c#t, c_t
\w       Any “word” character (alphanumeric or underscore)   c\wt      cat, c1t, c_t
\W       Any non-“word” character                            c\Wt      c#t
\s       Any whitespace character (Unicode separator)        c\sa\st   c a t
\S       Any non-whitespace character                        \S\S\S    cat, dog, pig, gnu
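As a quick illustration of how these character classes appear in SPL, a rex extraction along the following lines could pull a user name and a numeric code out of raw events. The index, sourcetype, field names, and event format here are hypothetical, not from the course data:
index=main sourcetype=my_app_log
| rex "user=(?<user_name>\w+)\s+code=(?<error_code>\d+)"
| stats count by user_name, error_code ```index, sourcetype, field names, and event format are hypothetical```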
Regex Quantifiers
• Quantifiers can be added to any character
Quantifier  What it matches                                                Examples  Matches
*           Zero or more of the previous character (.* matches anything)   c.*t      ct, cat, c####t
                                                                            ca*t      ct, cat, caaaat
+           One or more                                                     c.+t      cat, c####t, caat
{n}         Exactly n occurrences                                           c\d{2}t   c11t, c23t
?           Zero or one occurrence                                          c.?t      ct, cat, c#t
• By default, quantifiers are “greedy”—they match as many characters
as possible
• Adding ? after a count (*?, +?, ??) makes it “non-greedy”—matching
as few characters as possible
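For instance, a non-greedy quantifier is useful when you only want to capture up to the first occurrence of a delimiter. In the sketch below (hypothetical index, sourcetype, and field), .+? stops at the first ? in the uri value, whereas a greedy .+ would run to the last one:
index=web sourcetype=access_combined
| rex field=uri "^(?<base_path>.+?)\?"
| stats count by base_path ```index, sourcetype, and field names are hypothetical```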
Regex Groupings
• You can also use parentheses just for grouping (without performing the capture) by using (?: )
• Can be useful when used with | (the alternation/OR operator)
Example: (?:invalid|wrong)
– (?: tells the regex engine “Do not capture this group”
– invalid|wrong matches “invalid” or “wrong”
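Combined with the match() function from Module 2, a non-capturing group like this can flag events; a minimal sketch with hypothetical index, sourcetype, and field name:
index=security sourcetype=linux_secure
| eval failed_login=if(match(_raw, "(?:invalid|wrong)"), "yes", "no")
| stats count by failed_login ```index, sourcetype, and field name are hypothetical```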
Regex Examples
• \sMID\s\d{6} looks for MID surrounded by white space, followed by 6 digits
Sample event: Info Start MID 245040 ICID 743983
• \[....\].*for looks for [, then any 4 characters, then ], followed by anything, then for (“greedy” version)
Sample event: www3 sshd[2348] Failed password for apache for
• \[....\].*?for same as above, but not “greedy,” so it stops at the first for
Sample event: www3 sshd[2348] Failed password for apache for
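To turn the first pattern into a search-time extraction, a rex search along these lines could capture the message ID as a field (the index, sourcetype, and field name are hypothetical):
index=mail sourcetype=esa_logs
| rex "\sMID\s(?<message_id>\d{6})"
| stats count by message_id ```index, sourcetype, and field name are hypothetical```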
Multiple Solutions
• There are multiple ways to write regex and get the same results
• For example, the search presented in Module 2 can be written in different ways and achieve the same results:
index=network sourcetype=cisco_wsa_squid
| eval proper_ip_address=if(match(src,"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"), "true", "false")

index=network sourcetype=cisco_wsa_squid
| eval proper_ip_address=if(match(src,"^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$"), "true", "false")

index=network sourcetype=cisco_wsa_squid
| eval proper_ip_address=if(match(src,"^\d+\.\d+\.\d+\.\d+$"), "true", "false")
All three versions return the same results after piping to stats.
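For example, piping any of the three versions to stats (shown here with the simplest one, a minimal sketch) counts events by whether src is formatted like an IP address:
index=network sourcetype=cisco_wsa_squid
| eval proper_ip_address=if(match(src,"^\d+\.\d+\.\d+\.\d+$"), "true", "false")
| stats count by proper_ip_address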
Appendix B: Creating New Choropleth Maps
Choropleth Maps
• Uses shading to show relative metrics for predefined geographic regions
• Splunk ships with two:
– geo_us_states, United States
– geo_countries, countries of the world
• You can import other choropleth maps or create your own
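As a quick illustration, the built-in geo_us_states geometry can drive a choropleth map with the geom command; a sketch only, with a hypothetical index, sourcetype, and State field:
index=sales sourcetype=vendor_sales
| stats count by State
| geom geo_us_states featureIdField="State" ```index, sourcetype, and field name are hypothetical```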
Choropleth Terminology
• KML (Keyhole Markup Language): type of XML developed by Google and others
• KMZ: a zipped KML file
• Polygon: the specific KML tag that Splunk uses to define its choropleth map data
How Choropleth Maps Work
• Choropleth map data serves two purposes:
1. Defines polygons to produce the colored map
2. Provides a method to determine within which polygon a given latitude/longitude is located
• Splunk can use a choropleth KML file as a lookup
Finding Other KML Choropleth Data Files
• Census bureau sites for the US, UK, and Australia
• Lots of other free KML/KMZ files available online
Converting Other File Types to KML
• You can also convert choropleth files to KML from other formats, such as Shapefile
• Mapping systems have been around for over 20 years, and some formats are not so easy to work with
Note
For complete details, check the Splunk blog article Use Custom Polygons in Choropleth Maps.
Build Your Own Choropleth Files using Online Tools
• Google Earth (http://earth.google.com)
• Sketchup (http://www.sketchup.com)
• Other online point-and-click tools (for example, http://www.birdtheme.org/useful/v3tool.html)
• Make sure:
– Shapes being created are polygons, not polylines
– Polygons are closed (start and end at the same coordinate)
– No carriage returns in the coordinates list (Splunk won't accept them)
Using KML Files in Splunk
1. Upload the KML/KMZ file into Splunk as a lookup file
2. In the search, indicate an events data source that contains either featureId (location name) or latitude and longitude
– If the file contains only lat/long, you can use lookup to find the location name (e.g., | lookup my_geo_map latitude longitude)
3. Use a transforming command to aggregate data by location name
– For example, | stats count by featureId
4. Optionally, select and configure a visualization
5. Create the choropleth map using the geom command
– For example, | geom my_geo_map (see the end-to-end sketch below)
Note
For complete details, refer to
https://docs.splunk.com/Documentation/SplunkCloud/latest/Viz/ChoroplethGenerate
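Putting the steps together, an end-to-end search might look like the following sketch. The lookup name my_geo_map, the data source, and the latitude/longitude field names are placeholders; featureId is the field a geospatial lookup typically returns:
index=sales sourcetype=store_transactions
| lookup my_geo_map latitude longitude
| stats count by featureId
| geom my_geo_map ```my_geo_map, the index, sourcetype, and lat/long field names are placeholders```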