Splunk Q & A Final Document


SPLUNK INTERVIEW Q & A

Ph.No:9100005757

Splunk Interview
Questions and
Answers Complete
Material

PROVOKE TRAININGS www.provoketrainings.com



PROVOKE TRAININGS
Flat No: 102, B-Block, Balaji Towers, Prime
Hospital Lane, Ameerpet, Hyd-38
----------------------------------------------------
Splunk Questions and Answers
Q. Tell me about yourself?
I am Prasad (say your name). I have a total of 6.4 years of experience in the IT industry: 4.4 years in Splunk administration and development, and the remaining 2 years in middleware technologies such as WebLogic Server and WebSphere Server. Presently I am working at CDK Global (say your company), located in Hyderabad.

I have good experience with Splunk components like the Forwarder, Indexer, Search Head, Deployment Server, Cluster Master, License Master, and Deployer.

I have good experience with configuration files like inputs.conf, outputs.conf, props.conf, transforms.conf, web.conf, and indexes.conf.

I have good experience with the data-aging concept: hot, warm, cold, frozen, and thawed buckets.

I have good experience in onboarding data and decommissioning servers.

I have good experience onboarding data from different data sources, like agent based, rsyslog feed, HTTP Event Collector, DB Connect app, REST API, and SFTP.

I have good experience with SPL commands like stats, eval, table, dedup, sort, and fields.

I have good experience creating dashboards, alerts, and reports.

That is a summary of my technical experience.

Q. What are your day-to-day activities?

ANSWER – First I check all my mails, then I check the ServiceNow ticketing tool and start my work based on the incident tickets. Before that I perform application and server health checks to confirm whether the application and servers are up. Then I start my work: onboarding data, building dashboards, alerts, and reports. At the end of the day I report my work to the team lead.

Q. In how many ways can we onboard data into Splunk? Or
Q. What are the data sources in Splunk? Or
Q. What are the types of data onboarding in Splunk?


Agent Based onboarding


Rsyslog Feed
HEC(HTTP Event Collector)
REST API
SFTP
DB Connect
Script Based
Q. How do you onboard data into Splunk? Or how do you ingest logs into Splunk?
We have several ways to onboard data into Splunk, like agent based,
rsyslog feed, HEC, and DB Connect; in my project about 80% of onboarding is agent based.
My manager gives me an Excel sheet, in which the following details are mentioned:

server IP,

server name,

source file path,

source file name,

index name,

source type,

inputs app name,

inputs server class,

parsing app name,

parsing server class.

First I connect to the deployment server through the CLI in PuTTY.

After that I go to the relevant path:

cd /opt/splunk/etc/deployment-apps/

Under deployment-apps I create an inputs app.

Under the inputs app I create a local folder.

Under the local folder I create an inputs.conf file.


In the inputs.conf file I mention the monitor stanza:
an opening bracket ([), then monitor, a colon (:), two slashes (//), then the source file path, a slash, the source file name, and the closing bracket.

After that I mention index = test; first I validate in a test index, and once everything is fine I move from the test index to the production index.

After that, sourcetype = <source type name>.

These are the details I mention in inputs.conf.

Based on the requirement I also mention a few optional parameters in inputs.conf, such as

_TCP_ROUTING,

disabled = 1, and

ignoreOlderThan, as shown in the sample stanza below.
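A minimal inputs.conf sketch of such a stanza (the path, index, sourcetype, and output group here are placeholders, not values from the sheet):

[monitor:///var/log/app/application.log]
index = test
sourcetype = app_logs
disabled = 0
ignoreOlderThan = 30d
_TCP_ROUTING = primary_indexers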

After that I connect to the deployment server through the GUI,

then go to Settings --> Forwarder Management,

then click on Server Classes,

then click the New Server Class button on the right side

and create the server class.

After that we add the inputs app and check the "Restart SplunkD" checkbox.

Then we add the server name to the client whitelist.

After that we reload the server class with this command:

splunk reload deploy-server -class <server class name>

After that I go to the search head.

On the search head I check the events.

Based on the event timestamps I create the props.conf file.

In the props.conf file we need to mention:

TZ = US/Eastern

TIME_PREFIX

TIME_FORMAT

SHOULD_LINEMERGE = false

LINE_BREAKER, where we mention a regex: for example \d for a digit, \w for a word character, or \s for whitespace, depending on what each event starts with.

TRUNCATE = 999999

A sample of this props code is shown below.
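A minimal props.conf sketch for such a source (the sourcetype name, time format, and line-breaker pattern are illustrative assumptions):

[app_logs]
TZ = US/Eastern
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}
TRUNCATE = 999999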

I push this props code to the heavy forwarder, and after that we reload the heavy forwarder server class.

Then I check the events again. Once everything is fine, I move from the test index to the prod index and reload the inputs server class again.

This is the process we follow in my organization.

Q. What is the Splunk architecture in your project?

My current project's Splunk architecture is around:

50 indexers, 12 search heads, 1 deployment server, 1 license master,

1 cluster master, and 1 deployer. The daily license volume of data is 600 GB per day.

Q. Is your architecture single-site or multi-site?

Single-site architecture.

Q. What is Replication Factor and Search Factor?

Real-time explanation:
My replication factor is 3 and search factor is 2.

Q. You have 50 indexers; why is the replication factor 3?

Replication factor 3 means the cluster keeps 3 copies of each bucket of data, distributed across the 50 indexers. Of those three copies, one holds only raw data, and the other two hold raw data + TSIDX files (TSIDX means time-series index files); those two searchable copies are why the search factor is 2.

SF is always less than or equal to RF.

Q. What is the major difference between UF and HF?

The UF has only forwarding capability, whereas the HF has the capabilities of forwarding, parsing, filtering, and masking the data.

Q. What is the management port in Splunk?

8089

Q.What are common port numbers used by Splunk?


Splunk Web Port: 8000

Splunk Management Port: 8089

Splunk Network port: 514

Splunk Index Replication Port: 8080

Splunk Indexing Port: 9997

KV store: 8191

Q. How do you identify which port Splunk is running on?

Go to $SPLUNK_HOME/bin and run the following command: ./splunk show web-port
To know the management port, run this command: ./splunk show splunkd-port

Q. How do you identify which machine is not reporting to Splunk?

Log in to the deployment server and check the deployment client, i.e. the universal forwarder, and look at its phone-home interval. If the last phone home is longer ago than usual, e.g. 24 hours ago or 3 days ago, that means the machine is no longer reporting to Splunk.

Q. How do you deploy an app to the search heads?

Through the deployer.
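In practice, the deployer pushes apps to the search head cluster members with the apply shcluster-bundle command; a typical invocation looks like this (the host name and credentials are placeholders):

./splunk apply shcluster-bundle -target https://sh1.example.com:8089 -auth admin:changeme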

Q. How do you detect a brute force attack in Splunk?

Check for attempts to gain access to a system by using multiple accounts with multiple passwords, i.e. a high number of failed logins from the same source in a short window.
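A minimal detection sketch (the index, sourcetype, and threshold here are assumptions that would vary per environment):

index=security sourcetype=linux_secure "Failed password"
| stats count by src_ip, user
| where count > 10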
Q. What kind of data have you configured in Splunk?
It is purely application logs: WebLogic Server, WebSphere Server, Apache Tomcat server, JBoss, and Zabbix server logs.

Q. What is epoch time, and how do you convert epoch time into standard time?

Epoch time is UNIX-based time in Splunk. Epoch time is converted to standard time using this function: | eval Time = strftime(EpochTimeField, "%Y-%m-%d %H:%M:%S")

Q. How will you troubleshoot a Splunk performance issue/error?

Look through splunkd.log for diagnostic and error metrics. We can also go to the Monitoring Console app and check the resource utilization (CPU, memory, etc.) of the different server components.
We can also install the Splunk-on-Splunk app from splunkbase.com and monitor the health of different Splunk instances.


Q. How do you optimize a Splunk query in real time?

There are a lot of techniques: base searches for dashboards, filtering as early as possible, avoiding wildcards, and preferring inclusion over exclusion. For example, search specifically for status=50* rather than | search NOT status=50*.

Use summary indexes to speed up search operations.

Use report acceleration to speed up report execution time.

Use data models, which can be reused within lots of other saved searches such as dashboards and reports.

Q. Explain how data ages in Splunk.

In Splunk we have the hot bucket, warm bucket, cold bucket, frozen bucket, and thawed bucket. Fresh data first lands in the HOT bucket, which is searchable.

When a hot bucket rolls, the data moves to a warm bucket, which is also searchable. Later the data rolls from warm to cold; cold buckets are also searchable. Eventually data rolls from cold to frozen, and frozen data is not searchable. The indexer deletes frozen data by default, but you can also archive it. Archived frozen data can later be restored into a thawed bucket.

Q. Which bucket is not searchable?

The frozen bucket. (Thawed buckets are data restored from the frozen archive and are searchable again.)


Q. How does Splunk prevent duplicate indexing of logs/data?

Q. What is the fishbucket? How does it work?
Splunk prevents duplicate data by means of the fishbucket index. The fishbucket mainly consists of records of the last ingested data: for each monitored file it stores a CRC of the file's beginning and a seek pointer to the last byte read. So if the last entry from a particular source was read at 4:18 PM CST, the seek pointer is kept there, and the next entry from the same source is picked up after it.

Q. What are the most important configuration files in Splunk?

✔ inputs.conf

✔ outputs.conf

✔ transforms.conf

✔ props.conf

✔ indexes.conf

✔ web.conf

✔ limits.conf

✔ authentication.conf

✔ authorize.conf

✔ collections.conf

Q. How do you extract an IP address from logs?

| rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"

Q. What is the time zone property in Splunk?

TZ = US/Eastern, which we mention in the props.conf file.
Q.What are the different types of lookups available in
Splunk?(HCL Vikas Gopal Question)
File Based lookup
CSV Lookup
KV Store Lookup

Q. Does props.conf play any role at the universal forwarder level? (HCL Vikas Gopal question)
Yes, by using event boundaries: EVENT_BREAKER (with EVENT_BREAKER_ENABLE) applies on universal forwarders, while line breaking (LINE_BREAKER) happens at the indexer or heavy forwarder.


Q. How do you filter out unwanted data in Splunk?

Yes, we can do it: we can route the specific unwanted data to the null queue, using the transforms.conf and props.conf files. Data routed to the null queue does not count against the daily license quota.

props.conf
[WinEventLog:System]
TRANSFORMS-null = null_queue_filter

transforms.conf
[null_queue_filter]
REGEX = (?m)^EventCode=(592|593)
DEST_KEY = queue
FORMAT = nullQueue

Q. Index-time field extraction vs. search-time field extraction?
Index-time fields are fields which are extracted at the time of parsing the data in Splunk.
They are stored in the index (TSIDX files) and hence occupy disk space.
Index-time fields can be accessed using the tstats command.

Search-time fields are created at search time, for example using the eval command in Splunk.

They do not occupy disk space.

These fields show no results when you try to group by them using the tstats command.

They can only be invoked using stats and similar commands.

Q. What is Splunk Workload Management? Have you worked on this?
Workload management lets you allocate compute resources (CPU and memory) to search and ingest workloads by defining workload pools and rules under Settings > Workload Management. It should not be confused with workflow actions, which automate field-level actions and are created by going to Settings > Fields > Workflow actions.

Q. How do you fix events showing a future time in Splunk?

By using the DATETIME_CONFIG parameter, which we can mention in the props.conf file, like:

DATETIME_CONFIG = CURRENT

Q. How do you make Splunk index binary files as text?

First find out the inputs app and sourcetype name; then we can create a props.conf file in the inputs app. Here we need to mention:
[<source type name>]
NO_BINARY_CHECK = true
Then reload the inputs server class.
Q. Have you worked on HEC (HTTP Event Collector)?

Yes, I have worked on the HTTP Event Collector; it is a token-based HTTP input. First we need to generate the token:

go to Settings -> Data Inputs -> HTTP Event Collector,

then click on New Token. Here we can mention the Name, Source Name Override, and Description, then check "enable indexer acknowledgement". Once the token is generated, we can also configure the token on the backend.

We can go directly to the relevant path, like:

cd /opt/splunk/etc/deployment-apps/splunk_httpinputs/local

[http://bs_pcf_lab_usw2]

disabled = 0

index = <index name>

token = <token value>

HEC onboarding steps:

1. Create an index in indexes.conf.

2. Create the token on the GUI of the pre-prod deployment server.

3. Write the code in the inputs app:

a. stanza name

b. index name

c. token

4. Push to the heavy forwarder, then restart when possible.

5. Provide the source with the URL, token name, and index name; the Kafka team will set up the token from their end.
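Once the token is in place, a client can post events to HEC over HTTPS; a typical test looks like this (the host, default port 8088, and token value are placeholders for your environment):

curl -k https://hec.example.com:8088/services/collector/event \
  -H "Authorization: Splunk 12345678-1234-1234-1234-123456789012" \
  -d '{"event": "hello world", "index": "test", "sourcetype": "app_logs"}'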

Q. What is metadata in Splunk?

Metadata is data about data.

In Splunk there are 4 default fields: _time, host, source, and sourcetype.

_time – The timestamp of the event.

host – The host name or IP of the machine from which the event originated.

source – The file name or data stream from which I am getting the data.

sourcetype – The format/classification of the data ingested into Splunk. Example: csv, json, xml, txt, etc.

Q.Explain file precedence in Splunk?

1. System local directory — highest priority


2. App local directories
3. App default directories
4. System default directory — lowest priority


Q. What are the knowledge objects?

Tags, event types, field extractions, lookups, reports, alerts, data models, saved searches, transactions, workflow actions, and fields: these are knowledge objects.

Q. How do you troubleshoot when logs are not coming into Splunk?
First we need to check whether the log file is being generated. If the log is generating but not reaching Splunk, then I start troubleshooting. First I check the internal logs: index=_internal host=<host name> source=<file name>, then hit enter. The events may clearly show a message like "Last time we saw this initcrc, file name was different. You may wish to use a larger initCrcLen for this sourcetype or a CRC salt on this source."

Then I increase initCrcLen from 1024 to 2048, reload the server class, and check the data again. If data is coming, fine; if data is still not coming I increase initCrcLen from 2048 to 4096, and so on. Finally I use crcSalt = <SOURCE>.

Then I check in the monitor stanza whether the source file path is correct, and

check the splunkd logs to know the exact issue.
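A sketch of what the tuned monitor stanza might look like (the path and values are illustrative):

[monitor:///var/log/app/application.log]
index = test
sourcetype = app_logs
initCrcLen = 2048
crcSalt = <SOURCE>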

Q. What is the tstats command in Splunk? Or, what is the difference between stats and tstats?
The tstats command works only on index-time fields. Like the stats command, it shows the data in tabular format. It is very fast compared to stats, but using tstats you can only group by index-time fields, not by search-time fields such as those created with the eval command.
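For comparison, both of these produce a count by sourcetype, but the tstats version reads only the index files and is typically much faster:

| tstats count where index=_internal by sourcetype

index=_internal | stats count by sourcetype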

Q. What is the REST API?

REST endpoints are paths to specific locations or directories for accessing different types of knowledge objects. For example, using | rest /servicesNS/-/-/saved/searches you can get a list of all reports and alerts. Similarly, by running | rest /servicesNS/-/-/data/ui/views you can get a list of all dashboards, and so on.

Q. Have you migrated any knowledge objects from one environment to another?
Yes, you can do that with the help of the REST APIs.

Q. Where can we find the path of a dashboard?

Use: | rest /servicesNS/-/-/data/ui/views | search eai:acl.app=* label="Dashboard_title" | table eai:acl.app label
Or on disk:
$SPLUNK_HOME/etc/apps/xta_app/local/data/ui/views/

Q. An alert didn't trigger. Reason? How to troubleshoot?

Run the following command: | rest /services/search/jobs "Alert Name". This will tell you when the alert last ran. You can also run the following command if you have admin permissions: index=_audit "Alert Name". This will tell you how long the alert took to run and when it was last executed. Run index=_internal to get the diagnostic metrics for the same alert name.
You can also run | rest /servicesNS/-/-/saved/searches | search cron_schedule = "0 *" (give the wildcard cron schedule for the alert and check if there are a lot of concurrent saved searches running at the same interval of time; try moving the schedules of other alerts and reports 1-2 minutes ahead or behind).

Or

1. Check triggered alerts from


Settings > Knowledge > Searches, reports and alerts > Alerts column
OR
Activity > Triggered alerts

2. Check python.log for any error/warning message related to the saved search/alert you want to trigger.

3. Also, you may need to check for skipped searches. Maybe during the skipped-search window you were running into your maxconcurrent limit, which is why this search was skipped multiple times and you did not receive the alert.

Example of the log is as below:

INFO SavedSplunker -
savedsearch_id="nobody; SystemManage; SVaccount-authfail-emailsend", user="abcd",
app="", savedsearch_name="", priority=, status=skipped, reason="maxconcurrent limit
reached", scheduled_time=1498555860, window_time=0

In case if you see the above info message in logs, you should increase the limit for the
maximum number of concurrent searches in limits.conf

Q. Where are alerts and reports stored?


/opt/splunk/ ..... /saved/searches.

Q. Difference between a dashboard and a form?

A dashboard is a view. A form is a dashboard that incorporates user inputs (text boxes, dropdowns, time pickers) into the dashboard code; in Simple XML it uses the <form> root element instead of <dashboard>, and the form's input tokens can be referenced throughout the searches, including when integrating with third-party apps via low-level APIs.



Q. How do you add a new indexer to a cluster?

On the new peer, go to Settings > Indexer Clustering > Enable clustering > Peer node and give the master URI. Since it is a new cluster member, you then run splunk apply cluster-bundle on the master so that the configuration bundle is synced to this peer as well.

Q. If my indexer is down, how do I troubleshoot?

If one of the indexer cluster members is down, follow a simple process of restarting Splunk. After that go to /opt/splunk and see if you can loop through this directory without any error like "File or directory doesn't exist". If this error persists again and again, then check the _internal logs in the search bar and see what kind of exception has caused the peer node to go down. Alternatively you can go to /opt/splunk/var/log/splunk/splunkd.log and check the latest 10,000 entries. A third way would be to go to the Monitoring Console app, check the status of the down peer, and see what diagnostic metrics are there. A fourth way would be to go to Settings > Health report manager and see the status of the indexer cluster; if the status for several parameters is RED, that means there may be an issue on the server backend as well, and it is time to involve the server teams, since it might be a server crash issue.

Normally when an indexer cluster member having searchable copies goes down, the master node takes care of bucket fixing: non-searchable raw copies on the surviving peers are converted to searchable copies (tsidx files), and the master tries to keep the cluster in line with the replication and search factors you've set up.

Q. What is maintenance mode?

Also called halt mode, because it prevents any bucket replication within the indexer cluster. For example, when you are upgrading Splunk from 7.x to 8.x you need to enable maintenance mode. To enable maintenance mode, go to the master node and run: splunk enable maintenance-mode. After the maintenance activity is over, you can run: splunk disable maintenance-mode.

Q. What does maintenance mode do?

Maintenance mode halts all bucket fixups, meaning that if there is a corrupt bucket it will not be fixed back to normal. Also, maintenance mode will not act on conditions like "replication factor is not met" or "search factor is not met". It also prevents the timely rolling of hot buckets to warm buckets.

Q. What is search affinity in Splunk?

In a multisite cluster, search affinity refers to configuring search heads so that they query only their local site, i.e. the site nearest to them. For example, if you have a multisite cluster across 2 sites, say Rochelle and Hudson, and a user searches from Rochelle, all the search requests go to the indexers in the Rochelle zone, and similarly for the Hudson site. Setting up search affinity helps reduce network latency.


Q. 50% of the search heads are down. What will happen? How do you resolve it?
Run splunk show shcluster-status to see whether the captain is also down. If so, you need to set up a static captain as follows: ./splunk edit shcluster-config -mode captain -captain_uri https://SHURL:8089 -election false. If you have 4 SH members and 2 went down, your default replication factor of 3 will not be met. In this case you can re-instantiate the SH cluster with the RF set to 2, with a command like: ./splunk init shcluster-config -auth username:password -mgmt_uri https://shheadURI:8089 -replication_port 9000 -replication_factor 2 -conf_deploy_fetch_url https://DeployerURL:8089 -secret deployerencryptedpassword -shcluster_label labelName.

Q. How do you check the KV store status?

We need to run this command from the bin folder:

./splunk show kvstore-status

Q. 10 people run the same query at a time, but only 6 of them get results. Why?
It depends on the number of vCPUs that your infrastructure supports. Say you have 3 search head members with 2 vCPUs each; that means only 2*3 = 6 concurrent searches can run at a time. You need to increase your throughput by adding more CPUs for concurrent processing.

Q. If the deployment server goes down, how do you resolve it? What is the impact?
The main purpose of the DS is to distribute apps and updates to a group of non-clustered Splunk instances. If the DS goes down, all the deployment clients polling the DS will not get the latest set of apps and updates.

Q. index=abc | delete ... After deleting the data, how can we retrieve it?
The delete command only makes data unsearchable from the index; it never reclaims disk space. The raw data still exists on disk, so it can be recovered from there even after you run the delete command.

Q. What is the difference between top and head?

top gives you a list of the most common field values, along with a percentage of how frequently each value appears compared to the other values. head just gives the first few results of the query. For example, there is a field called price which has the values 20, 30, 40, 50, 60, 70, 80, 90, 20, 30, 40, 20. When you run | top price, the first row will show the value 20, because 20 appears the maximum number of times (3) among all the price values, and it will also show the percentage of occurrences. Similarly, if you run | head 5, it will simply return the first 5 results, e.g. 20, 30, 40, 50, 60.

Q. What is the dispatch directory for?

The dispatch directory holds the artifacts of all scheduled saved searches and ad hoc searches that run.

Q. Explain a few transforming commands in SPL.

Transforming commands are used for transforming event data into a different format: stats, chart, timechart, rare, top, etc.

Q. What is the difference between stats, eventstats, and streamstats?
The stats command gives you everything in tabular format; you cannot reuse the fields it produces in a later part of the search alongside the original events.

For example, if you do | stats sum(price) as Sum_Price by Product_Name and later you do | table price Product_Name, you will see NULL values for the price field. eventstats is helpful in these cases: eventstats adds the corresponding output to each event, so you can reuse both the original fields and the aggregated field in later parts of the search.

For example, with | eventstats sum(price) as Sum_Price by Product_Name, a later | table price Product_Name will still show the actual values of the price field, unlike with stats.

streamstats gives a running calculation for any field specified, and it also keeps the original value of the field intact. For example, the price field has the values 20, 30, 40, 50. After you do | streamstats sum(price) as Sum_Price by Product_Name, you will see that in the first row the output is 20, in the second it is 50 (20+30), in the third it is 90 (50+40), and so on. A later | table price Product_Name will still show the actual values of the price field.

Q. How do you delete indexed data?

The | delete command makes the data unsearchable from a search head. There is also the clean command, which is run from the CLI and removes the entire data from an index.
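For example, a typical clean invocation looks like this (run with Splunk stopped; the index name is a placeholder):

./splunk stop
./splunk clean eventdata -index myindex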

Q. What is a lookup? How is it useful and used in Splunk?

A lookup is a knowledge object in Splunk. Within our SPL code, if we need to reference an external file we can do that using a lookup. Lookup files can be added to Splunk by going to Settings > Lookups > Lookup table files.
Lookups are also useful for performing several types of joins, like inner, outer, etc.

Q. How do you change the retention period on the indexer? Which config file?

The retention period can be changed by editing indexes.conf.
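A minimal sketch of a retention setting in indexes.conf (the index name and the 90-day value are illustrative):

[my_index]
frozenTimePeriodInSecs = 7776000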

Q.fillnull command?
Replaces null values with a specified value.

Example:

For the current search results, fill all empty fields with NULL.

| fillnull value=NULL

Q.What is Chart command in Splunk?


Chart command is used to visualise the data in 2-D. Using the chart command we can
group by using only 2 fields.

Example: index=<index name> | chart count by Code, price

Q.What is TimeChart command in Splunk?


Timechart command is used to visualise the data in 2D. Using the timechart command
we can group by only one field.

Example: index=<index name> | timechart count by Code

Q.What are the Boolean operators in Splunk?


AND

OR

NOT

Q. Static captain and dynamic captain?

A static captain in a search head cluster is one which doesn't change. We configure it by disabling elections, e.g. ./splunk edit shcluster-config -mode captain -election false on the member that should be captain; the corresponding settings live in the [shclustering] stanza of server.conf.
A dynamic captain is one which can change over time through captain elections; this is the default behavior (election = true). The preferred_captain setting marks members that should be preferred during elections.

Q. What are bloom filters?

Splunk Enterprise uses bloom filters to decrease the time it requires to retrieve events from the index. This strategy is especially effective when you search for rare terms. Bloom filters work at the index bucket level.

Q. What kind of data can Splunk read?

Structured and unstructured data, except binary files.


Q. How do you identify how much data is ingested?

index=_internal source=*license_usage.log type=Usage earliest=-1d@d latest=now | stats sum(b) as bytes by idx | eval GB=round(bytes/1024/1024/1024,2) | table idx GB

Q. Which indexer is down? How do you identify it?

To identify which indexer is down we can again run a simple search: index=_internal source="*splunkd.log*" "*Connection failure*". By running this you will get the indexer IP which is having the connection failure.

Q. What is the difference between apps and add-ons? Or
Q. Have you worked on apps and add-ons? Or
Q. What apps and add-ons have you installed and configured in your org?

- Apps are a full-fledged Splunk experience: they contain options for creating dashboards, reports, alerts, lookups, event types, tags, and all other kinds of knowledge objects. Add-ons, on the other hand, perform a limited set of functionalities; for example, the Windows add-on can only get data from Windows-based systems, Unix-based add-ons can get data from specific Unix-based servers, and so on.

We have installed apps like DB Connect, and add-ons like the Windows add-on for infrastructure and the Unix add-on for getting data from Unix-based systems:

the Splunk App for DB Connect and the Splunk app/add-on for Linux/Unix.

Q. What is the knowledge bundle in the search head?

The knowledge bundle is the set of knowledge objects (lookups, field extractions, configurations) that a search head distributes to its search peers (the indexers), so that distributed searches run with consistent knowledge. In a search head cluster, the captain also replicates configuration changes to every search head member whenever a change takes place on one or more of them.

Q. Is there any way to segregate and discard unwanted data from a single file before it reaches the index queue?
Yes: on the HF we create transforms and props entries to discard the data.

transforms.conf

[discardingdata]

REGEX = (?i)error

DEST_KEY = queue

FORMAT = nullQueue

props.conf

[<sourcetype>]

TRANSFORMS-abc = discardingdata

Q. Am I able to mask customer-sensitive data before it reaches the index queue?
Yes, we can, by using masking rules in props and transforms on the HF.
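A minimal masking sketch using SEDCMD in props.conf on the HF (the sourcetype and the 16-digit card-number pattern are illustrative assumptions):

[app_logs]
SEDCMD-mask_card = s/\d{12}(\d{4})/XXXXXXXXXXXX\1/g

This replaces the first 12 digits of a 16-digit number with X characters before the event is written to the index.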

Q. I need to ingest five 10 GB files. How much disk space do I need on my indexer?
5*10 = 50 GB of actual data.

As a rule of thumb, the compressed raw data takes roughly 15% of the incoming data, and the index (tsidx) files take roughly 10% to 110% of it (about 35% on average), so the total is commonly estimated at about half the raw size:

50 GB * ~0.5 = roughly 25 GB of disk space required.

Q. I have a log file in which I want to send a few datasets to index A and the remaining ones to index B. Is that possible?
Yes, you can do this, but you need to write index-overriding rules on the HF and route some data to index A and the rest to index B based on keywords.
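A sketch of such an index override (the stanza names, keyword regex, and index names are illustrative; _MetaData:Index is the key that controls the destination index):

props.conf
[app_logs]
TRANSFORMS-route = route_to_A

transforms.conf
[route_to_A]
REGEX = keywordA
DEST_KEY = _MetaData:Index
FORMAT = indexA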

Q. What is the dispatch directory, and can we take control over it?
The dispatch directory stores the artifacts of whatever searches run from the search head. We have a shell script that cleans up the dispatch directory on a schedule.

Q. Where can I check the health of my cluster?

On the cluster master's indexer clustering dashboard.

Q.How to check search head cluster status?


./splunk show shcluster-status -auth admin:abc123

Q. Is it possible to exclude an index from replication?

Yes, we can exclude it, via the repFactor setting in indexes.conf:

repFactor = auto (replicate)

repFactor = 0 (exclude)


Q. How do searches get processed in an indexer cluster architecture?
The end user runs the query on the search head; the SH contacts the master, and the master tells the SH which indexers it has to go to.

Q. Difference between a standalone search head and a clustered search head?
Standalone SH: it won't replicate the Splunk knowledge objects.

SHC: in a SHC the Splunk knowledge objects are replicated across members.

Q. What will happen if your license expires on a particular calendar day?
If the license is expired we will not be able to search, but there is no effect on indexing; data continues to be indexed. We get warning messages, and after 5 license violations within 30 days, search is blocked until the violations age out or a new license is applied.

Q.How to troubleshoot splunk performance issues?


We need to check metrics.log file.

Q. What are alerts in Splunk?

An alert is a saved search that triggers an action when a condition is met. In Splunk we have three types of alerts: real-time alerts, scheduled alerts, and rolling-window alerts.

Recently I created three alerts. The first triggers when Windows CPU usage is greater than 95%:

eventtype=perfmon_windows earliest_time=-2m object=Processor counter="% Processor Time" host="*" host!=*AOS* host!=*ARS* host!="vmw-tools-c2w01" | sort -_time, -Value | dedup host | eval value=round(Value) | eval type="CPU" | where value>=95 | table _time, host, type, value

The second triggers when Windows free memory is less than 10%:

eventtype=perfmon_windows object=Memory counter="% Committed Bytes In Use" host="*" | stats max(Value) as Value by host | sort -Value | dedup host | eval Value=round(Value) | rename Value as value | where value >= 90 | table host, object, value

The third triggers when Windows free disk space is less than 10%:

index=windows eventtype=hostmon_windows Type=Disk host="*" FileSystem="*" DriveType="*" host!=*AOS* host!=*ARS* | dedup host, Name | eval FreeSpacePct=round(FreeSpaceKB/TotalSpaceKB*100) | eval TotalSpaceGB=round(TotalSpaceKB/1024/1024) | eval FreeSpaceGB=round(FreeSpaceKB/1024/1024) | search FreeSpacePct="*" TotalSpaceGB="*" | dedup host, Name, DriveType, TotalSpaceGB, FreeSpaceGB, FreeSpacePct | search FreeSpacePct<=10 | table host, Name, DriveType, TotalSpaceGB, FreeSpaceGB, FreeSpacePct

Q. What are the phases of Splunk?

There are three phases in Splunk: the first, second, and third phases.

First phase: data is collected and moved by the forwarder.

Second phase: the forwarder sends the data to the indexer. The indexer is the heart of Splunk: it parses, sorts, and stores the data.

Third phase: the end user runs queries on the search head, but the data itself always comes from the indexers.

Q. What is a lookup in Splunk?

The lookup command adds fields by looking at a value in an event, referencing a lookup table, and adding the fields of the matching rows in the lookup table to your event. There are two common lookup commands:
inputlookup and outputlookup.

inputlookup reads a lookup table, e.g. to enrich the data.
For example: | inputlookup IDC_FORD_Trend.CSV

outputlookup writes search results to a static lookup table, or KV store collection.
For example: | outputlookup IDC_FORD_Trend.CSV

Q. What are the default fields for every event in Splunk?

There are about 5 fields that are default, and they are tagged onto every event in Splunk.
They are host, source, sourcetype, index, and timestamp (_time).

Q.Which is latest splunk version in use?


The latest version is 8.1.1 but currently my company using 7.3.9

Q.By Default Splunk username and password?


Admin and changeme


Q. How do you reset the Splunk admin password?

1. Move the $SPLUNK_HOME/etc/passwd file to $SPLUNK_HOME/etc/passwd.bak.

2. Restart Splunk. After the restart you should be able to log in using the default credentials (admin/changeme). Then set a new password:
splunk edit user admin -password newPassword -auth admin:changeme

Q. Where is the Splunk default configuration stored?

$SPLUNK_HOME/etc/system/default
Q. Who are the biggest direct competitors to Splunk?
Logstash, Loggly, LogLogic, Sumo Logic, etc.
Q. What is the command to enable Splunk to start at boot?
$SPLUNK_HOME/bin/splunk enable boot-start
Q. How do you disable Splunk boot-start?
$SPLUNK_HOME/bin/splunk disable boot-start

Q. What is the eval command?

The eval command creates or modifies fields using expressions. I used it recently with the strftime and strptime functions while creating a dashboard.

Q. What is the fishbucket, or what is the fishbucket index?

The fishbucket is an internal index that keeps seek pointers and CRCs for monitored files, so that Splunk does not re-index the same data (see the fishbucket answer earlier in this document).


Q. Difference between the stats and eventstats commands?

Stats – This command produces summary statistics of all existing fields in your search results and stores them as values in new fields.
Eventstats – It is the same as the stats command, except that the aggregation results are added in order to every event, and only if the aggregation is applicable to that event. It computes the requested statistics like stats, but aggregates them onto the original raw data.

Q. Knowledge objects in Splunk?

I will explain 3 main knowledge objects: the Splunk timechart, data models, and alerts.

Splunk timechart: The timechart command generates a table of summary statistics. This table can then be formatted as a chart visualization, where your data is plotted against an x-axis that is always a time field. Use the timechart command to display statistical trends over time. You can split the data with another field as a separate series in the chart. Timechart visualizations are usually line, area, or column charts.

When you use the timechart command, the x-axis represents time. The y-axis can be any other field value, count of values, or statistical calculation of a field value.

Example:-

index=_internal "group=thruput" | timechart avg(instantaneous_eps) by


processor

Data models: You may have a large amount of unstructured data which is critical to your business, and you want an easy way to access that information without using complex search queries. This can be done using data models, as they present your data in a sorted and hierarchical manner. The key benefit of data models is that they are a combination of multiple knowledge objects such as lookups, event types, fields, and more.
Step 1: Go to Settings-> Data Models.

Step 2: Click ‘New Data Model’ to create a new data model.

Step 3: Specify a ‘Title’ to your data model. You can use any character in the title,
except an asterisk. The data model ‘ID’ field will get filled automatically as it is a
unique identifier. It can only contain letters, numbers, and underscores. Spaces
between characters are not allowed.
Step 4: Choose the ‘App’ you are working on currently. By default, it will be
‘home’.
Step 5: Add a ‘Description’ to your data model.

Step 6: Click ‘Create’ and open the new data model in the Data Model Editor.

Q. Index-time processing vs. search-time processing?

Index-time processing is the processing of data that happens before the event is actually indexed. Examples of this are data fields which get extracted as and when the data comes into the index, like source, host, and timestamp.
Following are the processes that occur during index time:
● 1. Default field and sourcetype customization
● 2. Index-time field extraction
● 3. Event timestamping
● 4. Event line breaking
● 5. Event segmentation

Search-time processing is the processing of data that happens while a search is


running. Examples of this are any kind of searches or alerts or reminders or
lookups.
Following are the processes which occur during search time:
● 1.Event segmentation (also happens at index time)

● 2.Event type matching
● 3.Search-time field extraction
● 4.Field aliasing
● 5.Field lookups from external data sources
Basic Commands:

1. Search – The search command in Splunk is used to search for data, which is stored as key-value pairs.
2. Stats – We use the stats command to gather statistics about any field or set of fields. The output is always shown in tabular format.
3. Rename – The rename command is used to give a field or set of fields another name.
4. Table – The table command is used to show fields in tabular format.

Getting started with SPL (Splunk Search Processing Language) [5 commands and 3 operators]

Commands –

1. Table – The table command is used for displaying multiple field name(s).
Example, index="ajeet" sourcetype="csv" |table host, sourcetype, source.
2. Dedup – Dedup command is used for removing duplicate field values.
Example, index="ajeet" sourcetype="csv"
| dedup host, sourcetype, source
| table host, sourcetype, source

Example, index="ajeet" sourcetype="csv"


| dedup categoryId
| table categoryId
3. Search – Search command is used for searching field names. When we use
search command, we are explicitly telling Splunk search Engine to only search
for specified field names. This way I am ensuring that my Time complexity is
minimised.
Example, index="ajeet" sourcetype="csv"
| search categoryId=*
| table categoryId
4. Sort – The sort command is used to sort the results in ascending or descending order. By default, the sort command sorts the results in ascending order. If we apply a '-' operator, the results get sorted in descending order.
Example 1, index="ajeet" sourcetype="csv"
| search price=*
| sort price
| dedup price
| table price
Example 2, index="ajeet" sourcetype="csv"
| search price=*
| sort - price
| dedup price
| table price
Example 3, index="ajeet" sourcetype="csv"
| search categoryId=*
| sort - categoryId
| dedup categoryId
| table categoryId

5. Rename – Rename command is used to give field another meaningful name.
Example, index="ajeet" sourcetype="csv"
| search product_name=*
| dedup product_name
| table product_name
| rename product_name as "PRODUCT NAME"

Operators –

1. OR – OR is used to display one or more field values.


Example, index="ajeet" sourcetype="csv"
| search categoryId=STRATEGY OR categoryId=ARCADE
| dedup categoryId
| table categoryId
2. AND – AND is used to display field values when all the conditions match.
Example, index="ajeet" sourcetype="csv"
| search product_name="Benign Space Debris" AND categoryId=ARCADE
3. NOT – NOT is used to exclude certain Field-value pairs.
Example, index="ajeet" sourcetype="csv"
| search categoryId=* NOT categoryId=TEE
| dedup categoryId
| table categoryId


6. Stats – Stats command is used for gathering statistics based on a particular


field or set of fields.
Example, index="ajeet" sourcetype="csv"
| stats count by categoryId, Code, price, product_name


7. Chart – Chart command is used to visualise the data in 2-D. Using the chart command we can group by using only 2 fields.
Example, index=raja1
| chart count by Code, price

8. Timechart – Timechart command is used to visualise the data in 2D. Using


the timechart command we can group by only one field.

index=raja1

| timechart count by Code

Aggregate Functions –

1. Sum – Sum is used to sum all the values of a specific field


Example, index=raja1
price=*
| stats sum(price) as "Total Sum"


2. Avg – Avg is used to give average of all values in a particular field.


Example, index=raja1
price=*
| stats avg(price) as "Average Price"

3. Max- Max is used to give the maximum value of a field.


Example, index=raja1
price=*
| stats max(price) as "Maximum Price"

4. Min – Min will give minimum value of a field.


Example, index=raja1
price=*
| stats min(price) as "Minimum Price"

5. Mode – Mode will give the most repetitive value of a field.


Example, index=raja1
price=*
| stats mode(price) as "Repetitive Price Value"

Commands –

Addtotals – Addtotal command will give the total of a particular row.


Example, index=raja1
price=*
| table price
| addtotals

Addtotals col=t – This command will give you both the row and the column total.
Example, index=raja1
price=*
| table price
| addtotals col=t

Addcoltotals – Addcoltotals will give you the column total of a particular field.
Example, index=raja1
price=*
| table price
| addcoltotals

Addcoltotals row=t – Will give both the row and column total.
Example, index=raja1
price=*
| table price
| addcoltotals row=t

Q.Addtotals Addcoltotals difference?


## Addtotals ##
The addtotals command computes the arithmetic sum of all numeric fields for each
search result

index=onlinestore | chart count by clientip category_id usenull=f | addtotals


fieldname="Products Total"

## Addcoltotals ##
The addcoltotals command appends a new result to the end of the search result set.
The result contains the sum of each numeric field or you can specify which fields to
summarize.

index=onlinestore | chart count by clientip category_id usenull=f | addcoltotals


labelfield=clientip label="Mytotal"

Q.How to retrieve the top 10 values and last 10 values?


### Top & Rare ###
index=onlinestore | top limit=10 category_id

index=onlinestore | rare limit=10 category_id

Q. Difference between append, appendpipe, and appendcols?
https://answers.splunk.com/answers/144351/what-are-the-differences-between-append-appendpipe.html
Append
● If your required data present in two datasets then use subsearch to merge it
together
● Append is the command to merge your subsearch results to your first datasets
● Subsearch always run first and append the data to main query
● Do not use append in real time searches. Results are partial and may be
inaccurate
● It just append your subsearch result to first result set. Does not format your
output

Appendcols

- Appends the fields of the subsearch results with the input search results.

Appendcols Specific usecase

index="mydata" sourcetype="mydata" | stats dc(clientip) by category_id | append [


search index="mydata" sourcetype="mydata" | top 1 clientip by category_id] | table
category_id,clientip,dc(clientip),count

index="mydata" sourcetype="mydata" | stats dc(clientip) by category_id |
appendcols [ search index="mydata" sourcetype="mydata" | top 1 clientip by
category_id] | table category_id,clientip,dc(clientip),count

Appendpipe

- Append the output of reporting commands

- Same dataset can subject into postprocess in a single query

- Single dataset undergoes two evaluations based on one master search

index=mydata | stats count by action clientip | appendpipe [stats sum(count) as


count by action | eval customers= "ALL USERS"] | search customers="ALL USERS" |
table action count

Q.Stats and Chart difference?


### Stats and Chart ###

index=onlinestore | stats count by clientip category_id --------- > Tabular view

index=onlinestore | chart count by clientip category_id usenull=f ---------- > Matrix view

Q. strptime and strftime examples?

index=_internal | head 1 | eval str="2016-08-21 10:00:00" | eval en="2016-08-21 12:00:00" | eval start=strptime(str,"%Y-%m-%d %H:%M:%S") | eval end=strptime(en,"%Y-%m-%d %H:%M:%S") | eval duration=end-start | table str en start end duration

index=_internal | head 1 | eval StartTime="2016-08-15 10:00:00" | eval Runtime="01:30:00" | eval startepoch=round(strptime(StartTime,"%Y-%m-%d %H:%M:%S")) | convert dur2sec(Runtime) as mytime | eval endepoch=startepoch+mytime | eval end=strftime(endepoch,"%Y-%m-%d %H:%M:%S") | table StartTime startepoch mytime endepoch end

Q.what is data model?


Data model is nothing but hierarchically-structured search-time mapping of semantic
knowledge about one or more datasets.

Q.What is earliest and latest?


earliest: Specify the earliest time for the time range of your search.

latest: Specify the latest time for the time range of your search.

Example:

For example#1:- to start your search an hour ago, use either

earliest=-h (or) earliest=-60m

For example#2:-earliest=-7d@w1 latest=@w6

For example#3:-earliest="11/5/2012:20:00:00" latest="11/12/2012:20:00:00"

Q. transaction?
The transaction command finds transactions based on events that meet various constraints.

Example:

| transaction host cookie maxspan=30s maxpause=5s

Top command: returns the top values, e.g. the 10 most common values for a field:

| top limit=10 referer


Q.Head command?

Returns the first N number of specified results in search order.

Example:

|head limit=10

Q.geostats command?
generate statistics to display geographic data and summarize the data on maps.

Example:

| geostats latfield=eventlat longfield=eventlong avg(rating) by gender

Q.iplocation command?
Extracts location information from IP addresses by using 3rd-party databases.

Example:

sourcetype=access_* | iplocation clientip

Q.transpose command?
Returns the specified number of rows (search results) as columns.

Example:

index=_internal | stats count by sourcetype | sort -count | transpose 3

Q. Join?
To combine two queries we can use join: if I have a query A and another query B and want to combine A and B on a common field, I use the JOIN command.
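A small sketch (the index names and the user field are illustrative):

index=web_logs | join type=inner user [ search index=hr_data | fields user, department ] | table user, department, uri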

Q. How do you clear the Splunk search history?

Delete the following file on the Splunk server:

$SPLUNK_HOME/var/log/splunk/searches.log

Q.What Is Dispatch Directory?


$SPLUNK_HOME/var/run/splunk/dispatch contains a directory for each search that is
running or has completed. For example, a directory named 1434308943.358 will contain
a CSV file of its search results, a search.log with details about the search execution, and
other stuff. Using the defaults (which you can override in limits.conf), these directories
will be deleted 10 minutes after the search completes – unless the user saves the search
results, in which case the results will be deleted after 7 days.

Q. How would you handle/troubleshoot a Splunk license violation warning?
A license violation warning means Splunk has indexed more data than our purchased license quota. We have to identify which index/sourcetype has recently received more data than the usual daily volume. On the Splunk license master we can check the pool-wise available quota and identify the pool for which the violation is occurring. Once we know the pool that is receiving more data, we identify the top sourcetypes receiving more data than usual. Once the sourcetype is identified, we find the source machine that is sending the huge number of logs, establish the root cause, and troubleshoot accordingly.

Q. What is the difference between the Splunk SDK and the Splunk App Framework?
Splunk SDKs are designed to allow you to develop applications from the ground up; they do not require Splunk Web or any components from the Splunk App Framework. These are licensed separately from the Splunk software and do not alter it. The Splunk App Framework resides within Splunk's web server and permits you to customize the Splunk Web UI that comes with the product and develop Splunk apps using the Splunk web server. It is an important part of the features and functionality of the Splunk software, and it does not license users to modify anything in the Splunk software.

Q. What is the difference between apps and add-ons?

Apps in Splunk are full-fledged Splunk artifacts: you can create and save all knowledge objects, and you can also create saved searches.

An app is a way of localising your data and preventing people from other application teams from making use of it.

Apps in Splunk can be created directly using the Splunk GUI.

Add-ons perform a specific set of functionalities, like the Windows and Linux add-ons for getting data from Windows and Linux servers respectively.

Add-ons are normally imported from Splunkbase, a repository for Splunk-facilitated apps and add-ons.

Compared to apps, add-ons exhibit only limited functionality, and they can be grouped for one-to-many use. For example, a Linux add-on installed on a universal forwarder can only receive data from Linux servers, not Windows servers.

Q. How do you integrate network device logs into Splunk?

Data from network devices like switches and routers is sent via syslog over either the TCP or UDP protocol.


Q. What is a tsidx error?

This error normally occurs when _raw files cannot be converted into searchable buckets (tsidx files).

=========================
1. Explain your roles and responsibilities in your organization.
- Taking care of scheduled maintenance activities.
- Creating user-facing knowledge objects like dashboards, reports, alerts, static and dynamic lookups, and event types, and doing field extractions.

- Troubleshooting issues related to the production environment, like a dashboard not showing data. In this case we basically check from the raw logs whether the format of the data has changed or not.
- Been part of mass password update activities for database-related inputs, because if a database password change happens we need to change the connection password configured in our DB Connect application.

2. Explain the architecture of the Splunk components in your organization. Single-site or multi-site cluster?
- We have a multisite cluster across both Rochelle and Hudson. Each of these sites contains 40 indexers. The deployment has 1 cluster master, 1 license master, more than 10,000 forwarders installed on clients, 1 deployment server configured to manage those forwarders (the deployment server consists of 3 kinds of apps), 7 search heads in a cluster, and 1 deployer.

3. Questions on ports will be asked. Can you change the default port on which a Splunk component runs, and if yes, how?
Yes, it is configurable like this:

1. Log into Splunk Web as the admin user.


2. Click Settings in the top-right of the interface.
3. Click the Server settings link in the System section of the screen.
4. Click General settings.
5. Change the value for either Management port or Web port, and
click Save.
Alternatively, you can also go /bin folder and run the following command -

splunk set web-port newportnumber

4. Have you worked on data onboarding (very crucial, and a BAU one)? What kind of data have you onboarded? What process do you follow while onboarding data? Have you also worked on data normalization?
Yes. We need to log in to the machine, basically the client. If the universal forwarder is not already installed there, we need to install one. Later on we need to go to ./bin and run the following command:

./splunk add monitor SOURCEFILENAME -index INDEXNAME. For example, say we need to monitor a file like /var/log/introspection/resource_usage.log. To monitor such a file we run a command like this: ./splunk add monitor /var/log/introspection/resource_usage.log -index Ashu (Ashu is the name of the index where the data will be stored).

Types of data onboarded: application logs, webserver logs.

Yes, I did work on data normalization. Data normalization, as the name states, is the process of removing redundant/duplicate data; it also comprises logically grouping data together.
In Splunk we make use of tags at search time to normalize data. There is one thing we need to take care of while normalizing data: data normalization should only be done at search time, not index time. It is a technique adopted for faster data retrieval and shorter search execution time, so it is better to do it once the data is already stored in the indexers.

Pointer: Explain the types of data we can ingest into Splunk. The common ones they expect us to answer are flat files (log files, text files, etc.) and syslog onboarding. You can also talk about database and CSV onboarding.

5. What are the important configs on Universal


forwarder . Explain them and also explain what all
params you define while writing them?

Universal forwarder doesn't have any GUI. So everything that we need to configure is by
logging to the UF through admin credentials. Once you login to it using this
path opt/splunkforwarder/bin we need to need to run following command to add
indexerIP/hostname where it will be forwarding data to. The command is this, ./splunk
add forward-server indexerIP:9997

Once this is done, we need to add sourcefile names that needs to be added. Example has
already been explained in #4, /splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu

Inputs.conf - All the source names added via ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu will be visible under inputs.conf

Outputs.conf - All the indexer names added via ./splunk add forward-server
indexerIP:9997 will be visible under outputs.conf
Pointers: inputs.conf , outputs.conf
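For illustration, a minimal inputs.conf/outputs.conf pair on the UF could look like this (the index and output group names are just examples):

# inputs.conf - what to collect
[monitor:///var/log/introspection/resource_usage.log]
index = Ashu
disabled = false

# outputs.conf - where to send it
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexerIP:9997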

6. Have you worked on Heavy forwarders? What's the importance of them? What are the important configuration files you have on a HF?
- Heavy forwarders are used for data pre-processing, meaning selective data
forwarding and removing unwanted values as well.
These are the important configs of a Heavy forwarder -

When you open transforms.conf, these are the CONFIG parameters which are
configurable -

DEST_KEY
REGEX
FORMAT
7. Have you worked on data transformation? For example, can you achieve the below scenarios?


a. How will you mask the sensitive data before it's indexed?

Open props.conf and configure a SEDCMD class -

SEDCMD-<class> = s/<regex>/<replacement>/flags

● regex is a Perl language regular expression


● replacement is a string to replace the regular expression match.
● flags can be either the letter g to replace all matches or a number to replace a
specified match.
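As a hedged example, masking a 16-digit card number for a hypothetical sourcetype my_app could look like this:

# props.conf - keep the last 4 digits, mask the rest
[my_app]
SEDCMD-mask_card = s/\d{12}(\d{4})/XXXXXXXXXXXX\1/g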
b. Can you change/replace hostname with new host ?

Using the above class and using the replacement parameter - replacement is a string to
replace the regular expression match
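Another common approach (not SEDCMD) is a transform that rewrites the host metadata; a sketch, where the stanza name and new host value are hypothetical:

# transforms.conf - rewrite the host field on matching events
[set_new_host]
REGEX = .
DEST_KEY = MetaData:Host
FORMAT = host::new_host_name

# props.conf - apply the transform to a sourcetype
[my_app]
TRANSFORMS-sethost = set_new_host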

c. How can you filter out unwanted data from a data source and
drop it before it gets indexed so that I will save on licensing cost?

This is done by means of a heavy forwarder using the same configs -

DEST_KEY
REGEX
FORMAT
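For example, dropping DEBUG events before indexing could be sketched like this (the sourcetype and regex are assumptions):

# props.conf
[my_app]
TRANSFORMS-drop_debug = drop_debug

# transforms.conf - send matching events to the null queue
[drop_debug]
REGEX = \sDEBUG\s
DEST_KEY = queue
FORMAT = nullQueue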

8. How have you onboarded syslog data? Can you explain that?

Yes, using the unencrypted SYSLOG service and a universal forwarder. Alternatively, we
can also use daemon processes like, Collectd and Statsd to transmit data using UDP.
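A minimal network input for syslog could look like this in inputs.conf (the port and index are examples):

# inputs.conf - listen for syslog on UDP 514
[udp://514]
sourcetype = syslog
index = network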

9. Why is the sourcetype and source definition so important?

Sourcetype is used as a data classifier whereas source contains the exact path from
where the data needs to be onboarded

10. What is a license master? How does licensing in Splunk work? How do you create a license master and a license pool?
License master is a splunk instance which is used for monitoring splunk data volume on
a daily basis. This is how we configure a license master -

Login to any particular indexer - Go to settings > under System > Licensing > Add
License File (Mainly an XML based licensing file is added)

11. How much data is applicable for license cost – the entire data size being ingested, or only the compressed raw data after indexing?


Licensing cost is calculated on the entire data size that is ingested. Compression has nothing
to do with license usage; compression is done to save disk space.

12. What is the data compression ratio – Raw:Index?

Normally data is 38-45% compressed.


We can check compressed data by running |dbinspect command.

14. What is props.conf and transforms.conf? How do you write stanzas and relate them?
Props.conf is a configuration file that controls how data is parsed at index time (line
breaking, timestamp extraction, and so on) and is where you reference transforms against a
sourcetype, source, or host stanza. Transforms.conf specifies the actual transformation -
which events/parameters/fields get rewritten, routed, or excluded - using parameters such as:
DEST_KEY
REGEX
FORMAT

15. Questions on regex will be asked – a common one would be: could you tell me the regex for an IP address?
Regex in Splunk is handled with the help of the rex, regex and erex commands. No one will
ever ask you to write the regex for IP addresses from memory.
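Still, a simple extraction of an IPv4-looking address with rex could be sketched as:

index=main | rex field=_raw "(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" | table src_ip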

16. What is the Deployment server used for? What are server classes and apps? How do you deploy base configuration (inputs.conf, outputs.conf) from the DS?
Deployment server is a Splunk instance that deployment clients (indexers, Universal
forwarders, Heavy forwarders, etc.) poll for configuration updates.
Server classes are used for grouping servers - for example, to group all the UNIX based
servers I can create a class called UNIX_BASED_SERVERS and group all servers under it.
Similarly, for Windows based servers I can create a WINDOWS_BASED_SERVERS class.
Apps are basically sets of stanzas which are deployed to the members of a server class.

When we set up server classes and assign them apps, we need to restart
splunkd, a system level process - once this is done, any new app updates will be
automatically sent to all servers.
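A minimal serverclass.conf sketch for such a grouping (class, whitelist and app names are illustrative):

# serverclass.conf on the deployment server
[serverClass:UNIX_BASED_SERVERS]
whitelist.0 = unixhost*

[serverClass:UNIX_BASED_SERVERS:app:unix_inputs]
restartSplunkd = true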

18. What is a summary index? How do you create accelerated reports? Is licensing cost applicable on a summary index?

Summary index contains summarised or brief data. We create accelerated reports by
enabling the report acceleration option. Kindly remember that report acceleration should only
be done on data coming from a summary index, not on data coming from the application
or main index.

Summary index data doesn't count against licensing volume.
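As an illustration, a scheduled search can feed a summary index with the collect command (the index names and search are assumptions):

index=web status>=500 | stats count by host | collect index=my_summary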

19. Name some default Splunk indexes and their use?


Main - The default index. While adding a monitor using this command -
./splunk add monitor SOURCEFILENAME - if we don't mention any index name the data
will automatically go into this index

_audit - All search related information - scheduled searches as well as ad-hoc searches

_introspection - System wide resource data, including memory and CPU usage

_internal - Splunk's own internal logs (e.g. splunkd.log), including error events such as database connectivity failures

21. What do you know about Splunk CIM? What is Splunk data normalization?

CIM is the Common Information Model used by Splunk. It acts as a common standard
for data coming from different sources.
Data normalization has already been explained above.

24. What is dispatch directory for?

- The dispatch directory stores the artifacts (results and logs) of all scheduled saved searches and ad-hoc searches.

25. How do you check any config file's consistency? (Explain the btool command)
Btool is a command line tool that shows the merged, on-disk configuration settings as
Splunk will read them, which makes it useful for debugging configuration layering. Note that
its output reflects what is on disk, not necessarily what a running splunkd has loaded.
To run and debug things we can use this command -

./splunk btool inputs list --debug

26. How do you configure a Search Head cluster? Explain the Deployer?
Search head clustering is a detailed process and requires a lot of pre-requisites to be in place.

This is how it is configured -

PRE-REQUISITES FOR SEARCH HEAD CLUSTERING ————

● System Requirements -


● Each member must run on its own VM
● All machines must run the same OS (need to clarify whether a version difference also matters)
● All members must run the same version of splunk Enterprise
● Members must be connected over a high speed network
● There must be at least 3 members deployed to start a SH cluster

● Replication Factor ———

Replication factor must be met for all the scenarios

Other system requirements -

● Updates to dashboards, reports, new saved searches created are always subject
to Captain - The captain takes care of all of this
Before we Configure search head clustering, we need to configure a deployer because
Deployer IP is required to create a search head cluster

A bit about Deployer -

DEPLOYER -

Distributes apps and other configurations to SH cluster members


Can be colocated with deployment server if no. of deployment clients < 50
Can be colocated with Master node
Can be colocated with Monitoring console
Can service only one SH cluster
The cluster uses security keys to communicate/authenticate with SH members

Configure a Deployer —

Go to /opt/splunk/etc/system/local and vi server.conf

After this add the below stanza -

[shclustering]
pass4SymmKey = password
shcluster_label = cluster1

Restart Splunk since a change has been made in .conf files

While setting up Search head clustering we first have to create a Deployer as above -

When it comes to setting up a SH clustering, the first thing we need to do is login to that
particular Search head and run the command by going to bin as follows -

./splunk init shcluster-config -auth admin:password -mgmt_uri IPaddressofSHinHTTPSFollowedByMGMTPort -replication_port 9000 -replication_factor 3 -conf_deploy_fetch_url DeployerIpaddress:8089 -secret passwordofdeployer -shcluster_label clusterName


27. What are orphaned searches and reports? How do you find them and change the owner/delete them?

Scheduled saved searches owned by user names that are no longer part of
the Splunk ecosystem (for example, people who have left the company) are called orphaned
searches. This happens because there is no role associated within Splunk for that particular user.

With the recent upgrade of Splunk to 8.0.1 the problem with orphaned searches has largely been
resolved. But if you still see the orphaned searches warning appearing under Messages
in your search head, you can follow this guideline on how to resolve it:
https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Resolveorphanedsearches

28. Explain different roles and their capabilities in Splunk?
- User - Can only read from Splunk artifacts, e.g. reports, dashboards, alerts and
so on. Doesn't have edit permissions.
- Power user - Can create dashboards, alerts and reports, and has edit permissions
- Admin - Has access to all production servers, can do server restarts, take care of
maintenance activities and so on. The power user and normal user roles are subsets of the
admin role

31. What is the tstats command and how does it work? Explain what tsidx is?
The tstats command works only on index-time fields. Like the stats command it shows
the data in tabular format. It is very fast compared to stats, but with
tstats you can only group by index-time fields, not the search-time fields
created using the eval command.

TSIDX files are time series index files. When raw data comes to the indexer it gets
converted into tsidx files, which are the actual searchable files from the Splunk search head
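For example, counting events per sourcetype using only indexed fields:

| tstats count where index=web by sourcetype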

32. What are the stages of buckets in Splunk? How do you achieve a data retention policy in Splunk?
Buckets are the directories in Splunk which stores data.

Different stages are -

- Hot Bucket - Contains newly incoming data. Once a hot bucket reaches a
particular size threshold, it rolls to a warm bucket. This bucket is the only one actively
written to, and it is searchable from the search head
- Warm bucket - Data rolled from hot buckets comes to this bucket. This bucket is not
actively written into but is still searchable. Once the indexer reaches the maximum number
of warm buckets it maintains, the oldest warm buckets roll to cold
- Cold - Contains data rolled from warm buckets, typically on cheaper storage. The data is
still searchable from the search head. After cold buckets reach the retention threshold,
they roll to frozen buckets

- Frozen - Once the data is in Frozen buckets, it is either archived or deleted.
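Retention itself is set per index in indexes.conf; a hedged sketch (the index name and values are examples):

# indexes.conf - roll data to frozen (delete or archive) after 90 days
[my_index]
frozenTimePeriodInSecs = 7776000
maxTotalDataSizeMB = 500000
# optional archive location instead of deletion:
# coldToFrozenDir = /data/archive/my_index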

41. License warning? Queue and pipeline?

In case the daily license limit is exhausted, there will be warnings coming on the search
heads that you've exceeded the daily license volume and you either need to upgrade your
license or stop ingesting the data.

Each user authenticated to Splunk has a limited search quota - normal users
have around 25 MB whereas power users have around 50-125 MB. Once this threshold is
exceeded for a particular time, the user's searches will start getting queued.

42. Phonehome interval ? Server class ? Token ?

Phonehome interval is the interval at which a particular deployment client polls your
deployment server (shown in the DS as, e.g., 2 seconds ago, 10 seconds ago, etc.).

Server class are group of servers coming from the same flavour or same geographic
location. Ex, to combine all windows based servers we will create a windows based
server class. Similarly, to combine all Unix based servers we will create a unix based
server class.

Token is a placeholder for a set of values for a particular variable. Example, Name =
$Token1$. Now here Name field can have multiple values like, Naveen, Ashu, Ajeet etc.
The value that a particular token will hold completely depends upon the selection. Tokens
are always enclosed between $$, like the example above.

43. List the ways to find whether a forwarder is not reporting to the Deployment Server?

Check if the Forwarder host name/Ip Address is not under the blacklist panel in
Deployment server.

44. Can SF be 4? What data issues have you fixed?

Yes, the search factor can be 4 if the replication factor is at least 4 (SF can never exceed RF)

45. What is throttle? Dashboard? 2 types of dashboards?
Throttling is suppressing an alert for a specific interval of time. This is normally done on
a per-search-result basis.

A dashboard is a kind of view which contains different panels, and each panel shows
different metrics.

2 types of dashboards - for example, dynamic form-based dashboards and real-time dashboards


46. License master data has exceeded? What will happen?

If the license volume has been exceeded you will start seeing warnings on the search head.
Indexing continues, but after 5 warnings in a rolling 30-day window search is blocked; the
data itself is still being indexed.

47. What is Data models and Pivot tables?


Data models are a hierarchical representation of data. They show the data in a more
structured and organised format. Pivot tables are subsets of a data model - an
interface where users can create reports and alerts without much involvement with the SPL
language.

48. Default indexes created during Indexer installation?


Default indexes are - main, default, summary, _internal, _introspection, _audit

49. How to onboard only JSON files ?


Set the sourcetype as JSON (e.g. _json), or set INDEXED_EXTRACTIONS = json in props.conf

50. How does Splunk software handle data?


It breaks raw data into a set of events. Each event is assigned 5 default values - host,
source, sourcetype, timestamp, index

52. How will you make an indexer not searchable for a user?

53. Which config file will you change so that RF & SF to


be same in multicluster environment ?

Indexes.conf

54. How to pull yesterday's data from the DB if the server was down?
If there was a connection problem between the database and DB Connect in Splunk and it
has now been resolved, we can run a SQL query using functions like sysdate-1 (for an
Oracle DB) or to_date(), again for Oracle and other DBMSs.

55. What are accelerated reports?

Report acceleration is subject to summary indexing. We cannot do report
acceleration on data coming directly from application indexes. Report acceleration is
done so that a report executes quickly at its scheduled time; it basically
minimises the info_max_time.


56. Push an app from the deployer to the search heads in a search head cluster?

This is done from the deployer with this command -
./splunk apply shcluster-bundle -target https://SHmemberIP:8089 -auth admin:password
(The init shcluster-config command shown above initializes a cluster member; it does not push apps.)

58. How to create indexes?
From the splunk web, we can navigate to Settings > Indexes > New

59. How do you read data from third layer of bucket?

Buckets reside in $SPLUNK_HOME/var/lib/splunk/defaultdb/db (hot/warm) and
$SPLUNK_HOME/var/lib/splunk/defaultdb/colddb (cold) for the main index.

With maxDataSize = auto_high_volume, Splunk caps buckets at 10 GB on a 64-bit OS.
The third layer of bucket refers to the cold bucket, which is still searchable.

60. A user is not able to search with a particular index and raises a request to fix the issue. What will you do?

we need to go to settings > Access controls > Roles > YourUserRole > Indexes and
check if the user has read access to index.

61. Forwarder types? Uses?


2 types of forwarders - Universal Forwarder and Heavy Forwarder.

Universal forwarders are basically agents installed on the client, i.e. the servers
from which we are getting the data. They don't have any pre-processing capability.
Heavy forwarders, in turn, have pre-processing, routing and filtering capabilities.

62. What are alerts? How is the data moved?

Alerts are saved searches in Splunk. They are used for notifying application/server
owners etc. about erroneous conditions that may occur

65. How do you reset admin password ?


Stop Splunk Enterprise
Find the passwd file for your instance ($SPLUNK_HOME/etc/passwd) and rename it to
passwd.bk

Create a file named user-seed.conf in your $SPLUNK_HOME/etc/system/local/ directory.
In the file add the following text:
[user_info]
PASSWORD = NEW_PASSWORD
In the place of "NEW_PASSWORD" insert the password you would like to use.
Start Splunk Enterprise and use the new password to log into your instance from Splunk
Web.
If you previously created other users and know their login details, copy and paste their
credentials from the passwd.bk file into the passwd file and restart Splunk.

67. How do you monitor the entire health of the system? Which component do you log in to, to see the dashboard?

The Monitoring Console application - this is a pre-built application

68. How do you update the indexer when there is a new index to be added?
Login to Splunk web, go to setting > indexes >new

SV Reddy Answers
1. Explain about your roles and responsibilities in your
organization -
- Taking care of scheduled maintenance activities
- Creating user based knowledge objects like, Dashboards, reports, alerts, static and
dynamic lookups, eventtypes, doing field extractions.
- Troubleshooting issues related to production environment like Dashboard not
showing up the data - In this case we basically check from the raw logs if the format
of the data has changed or not.
- Been part of mass password update activities for DATABASE related inputs because
if the DATABASE password change happens we need to change the connection
password created in our BBConnect application

2. Explain the Architecture of Splunk components in your organization. Single site/multi site cluster?

- We have a multisite cluster both at Rochelle and Hudson. Each of these clusters
contains 40 indexers each. Each of the cluster has 1 cluster master, 1 deployment server,
more than 10000 forwarders installed on clients, 1 deployment server configured to
receive data from forwarders - Deployment server consists of 3 kinds of apps, 7 search
heads in a cluster, 1 deployer

3. Questions on ports will be asked. Can you change the default port on which a Splunk component runs, and if yes, how?
Yes, it is configurable like this -

1. Log into Splunk Web as the admin user.


2. Click Settings in the top-right of the interface.
3. Click the Server settings link in the System section of the screen.
4. Click General settings.
5. Change the value for either Management port or Web port, and
click Save.
Alternatively, you can also go to the /bin folder and run the following command -

splunk set web-port newportnumber

4. Have you worked on data onboarding (very crucial and a BAU one)? What kind of data have you onboarded? What process do you follow while onboarding data? Have you also worked on data normalization?
Yes. We need to log in to the machine, i.e. the client. If the Universal Forwarder is
not already installed there, we need to install one. Then we go to the /bin directory and
run the following command -

./splunk add monitor SOURCEFILENAME -index INDEXNAME. Example, we need to
monitor a file like this - /var/log/introspection/resource_usage.log. To monitor such file
we need to run a command like this, ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu(Ashu is the name of the index
where the data will be stored).

Types of data onboarded - application logs, webserver logs.
Yes, I have worked on data normalization. Data normalization, as the name states, is the
process of removing redundant/duplicate data; it also comprises logically
grouping data together.
In Splunk we make use of tags at search time to normalize data. There is one thing
we need to take care of while normalizing data: it should only be done at
search time and not index time. It is a technique adopted for faster data
retrieval and shorter search execution time, so it is better to do it once the data is
stored in the indexers.

Pointer: Explain the types of data we can ingest in Splunk. The common ones they expect us to
answer are flat files (log files, text files etc.) and syslog onboarding. You can also talk about
database and CSV onboarding.

5. What are the important configs on a Universal forwarder? Explain them and also explain what params you define while writing them.
Universal forwarder doesn't have any GUI, so everything we need to configure is done by
logging in to the UF with admin credentials. Once logged in, from the
path opt/splunkforwarder/bin we run the following command to add the
indexer IP/hostname it will be forwarding data to: ./splunk
add forward-server indexerIP:9997

Once this is done, we need to add the source file names to be monitored. An example has
already been explained in #4: ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu

Inputs.conf - All the source names added via ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu will be visible under inputs.conf

Outputs.conf - All the indexer names added via ./splunk add forward-server
indexerIP:9997 will be visible under outputs.conf
Pointers: inputs.conf , outputs.conf

6. Have you worked on Heavy forwarders? What's the importance of them? What are the important configuration files you have on a HF?
- Heavy forwarders are used for data pre-processing, meaning selective data
forwarding and removing unwanted values as well.

These are the important configs of a Heavy forwarder -

When you open transforms.conf, these are the CONFIG parameters which are
configurable -

DEST_KEY
REGEX
FORMAT

7. Have you worked on data transformation? For example, can you achieve the below scenarios?
a. How will you mask the sensitive data before it's indexed?
Open props.conf and configure a SEDCMD class -
SEDCMD-<class> = s/<regex>/<replacement>/flags

● regex is a Perl language regular expression


● replacement is a string to replace the regular expression match.
● flags can be either the letter g to replace all matches or a number to replace a
specified match.

b. Can you change/replace hostname with new host ?

Using the above class and using the replacement parameter - replacement is a string to
replace the regular expression match

c. How can you filter out unwanted data from a data source and drop it before it gets
indexed so that I will save on licensing cost?

This is done by means of a heavy forwarder using the same configs -

DEST_KEY
REGEX
FORMAT

8. How have you onboarded syslog data? Can you explain that?
Yes, using the unencrypted SYSLOG service and a universal forwarder. Alternatively, we
can also use daemon processes like, Collectd and Statsd to transmit data using UDP.

9. Why is the sourcetype and source definition so important?
Sourcetype is used as a data classifier whereas source contains the exact path from
where the data needs to be onboarded

10. What is a license master? How does licensing in Splunk work? How do you create a license master and a license pool?
License master is a splunk instance which is used for monitoring splunk data volume on
a daily basis. This is how we configure a license master -

Login to any particular indexer - Go to settings > under System > Licensing > Add
License File (Mainly an XML based licensing file is added)

11. How much data is applicable for license cost – the entire data size being ingested, or only the compressed raw data after indexing?
Licensing cost is calculated on the entire data size that is ingested. Compression has nothing
to do with license usage; compression is done to save disk space.

12. What is the data compression ratio – Raw:Index


Normally data is 38-45% compressed.
We can check compressed data by running |dbinspect command.


13. Have you worked on Apps and Add-ons? What apps and add-ons have you installed and configured in your org?
- Apps are full-fledged packages for Splunk Enterprise. They contain options for creating
dashboards, reports, alerts, lookups, eventtypes, tags and all other kinds of knowledge
objects. Add-ons, on the other hand, perform a limited set of functionalities - for example,
the Windows add-on can only get data from Windows based systems, Unix based add-ons
can get data from specific Unix based servers, and so on.

We have installed apps like, DBCONNECT and Add ons like, Windows app for
infrastructure and Unix apps for getting data from Unix based systems

Pointer: Talk about Splunk App for DB connect and Splunk app/addon for linux/unix .
These are 2 common apps/addons that you should know.
Would be a good deal if you can talk about Splunk app for AWS (cloud integration)

14. What is props.conf and transforms.conf? How do you write stanzas and relate them?
Props.conf is a configuration file that controls how data is parsed at index time (line
breaking, timestamp extraction, and so on) and is where you reference transforms against a
sourcetype, source, or host stanza. Transforms.conf specifies the actual transformation -
which events/parameters/fields get rewritten, routed, or excluded - using parameters such as:
DEST_KEY
REGEX
FORMAT

15. Questions on regex will be asked – a common one would be: could you tell me the regex for an IP address?

Regex in Splunk is handled with the help of the rex, regex and erex commands. No one will
ever ask you to write the regex for IP addresses from memory.

16. What is the Deployment server used for? What are server classes and apps? How do you deploy base configuration (inputs.conf, outputs.conf) from the DS?
Deployment server is a Splunk instance that deployment clients (indexers, Universal
forwarders, Heavy forwarders, etc.) poll for configuration updates.
Server classes are used for grouping servers - for example, to group all the UNIX based
servers I can create a class called UNIX_BASED_SERVERS and group all servers under it.
Similarly, for Windows based servers I can create a WINDOWS_BASED_SERVERS class.
Apps are basically sets of stanzas which are deployed to the members of a server class.

When we set up server classes and assign them apps, we need to restart
splunkd, a system level process - once this is done, any new app updates will be
automatically sent to all servers.


17. How will you troubleshoot a Splunk performance issue/error?
Look through splunkd.log for diagnostic and error metrics. We can also go to the
Monitoring Console app and check the resource utilization of different server
components, e.g. CPU and memory utilisation.
We can also install the Splunk-on-Splunk app from splunkbase.com and monitor the health of
different Splunk instances

18. What is a summary index? How do you create accelerated reports? Is licensing cost applicable on a summary index?
Summary index contains summarised or brief data. We create accelerated reports by
enabling the report acceleration option. Kindly remember that report acceleration should only
be done on data coming from a summary index, not on data coming from the application
or main index.

Summary index data doesn't count against licensing volume.

19. Name some default Splunk indexes and their use?


Main - The default index. While adding a monitor using this command -
./splunk add monitor SOURCEFILENAME - if we don't mention any index name the data
will automatically go into this index

_audit - All search related information - scheduled searches as well as ad-hoc searches

_introspection - System wide resource data, including memory and CPU usage

_internal - Splunk's own internal logs (e.g. splunkd.log), including error events such as database connectivity failures

20. What is epoch time and how do you convert epoch time into standard time? (function in Splunk)
Epoch time is UNIX-based time in Splunk. Epoch time is converted to standard time
using this function - |eval Time = strftime(EpochTimeField, "%y-%m-%d %H:%M:%S")

21. What do you know about Splunk CIM? What is Splunk data normalization?
CIM is the Common Information Model used by Splunk. It acts as a common standard
for data coming from different sources.
Data normalization has already been explained above.

22. What is the file precedence Splunk follows? Could you explain that?
System local directory (highest precedence)
Then, app local directories

Then, app default directories
System default directories (lowest precedence)

23. How does Splunk prevent duplicate indexing of logs/data? What is the fishbucket? How does it work?
Splunk prevents duplicate data by means of the fishbucket index, which mainly
consists of records of the last ingested data. Say the last entry from a particular
source was pushed at 4:18 PM CST - Splunk keeps a pointer there (a seek
pointer). The next entry from the same source is then read from after that point.

24. What is dispatch directory for?


- Whatever searches you run on the search head are stored in the backend in the dispatch
directory. By default, artifacts of scheduled saved searches are deleted after twice the
schedule interval, and ad-hoc search artifacts expire after 10 minutes.

25. How do you check any config file's consistency? (Explain the btool command)
Btool is a command line tool that shows the merged, on-disk configuration settings as
Splunk will read them, which makes it useful for debugging configuration layering. Note that
its output reflects what is on disk, not necessarily what a running splunkd has loaded.
To run and debug things we can use this command -

./splunk cmd btool inputs list --debug

26. How do you configure a Search Head cluster? Explain the Deployer?
Search head clustering is a detailed process and requires a lot of pre-requisites to be in place.

This is how it is configured -

PRE-REQUISITES FOR SEARCH HEAD CLUSTERING ————

● System Requirements -

● Each member must run on its own VM
● All machines must run the same OS (need to clarify whether a version difference also matters)
● All members must run the same version of splunk Enterprise
● Members must be connected over a high speed network
● There must be at least 3 members deployed to start a SH cluster

● Replication Factor ———

Replication factor must be met for all the scenarios

Other system requirements -

● Updates to dashboards, reports, new saved searches created are always subject
to Captain - The captain takes care of all of this

Before we Configure search head clustering, we need to configure a deployer because
Deployer IP is required to create a search head cluster

A bit about Deployer -

DEPLOYER -
Distributes apps and other configurations to SH cluster members
Can be colocated with deployment server if no. of deployment clients < 50
Can be colocated with Master node
Can be colocated with Monitoring console
Can service only one SH cluster
The cluster uses security keys to communicate/authenticate with SH members

Configure a Deployer —

Go to /opt/splunk/etc/system/local and vi server.conf

After this add the below stanza -

[shclustering]
pass4SymmKey = password
shcluster_label = cluster1

Restart Splunk since a change has been made in .conf files

While setting up Search head clustering we first have to create a Deployer as above -

When it comes to setting up a SH clustering, the first thing we need to do is login to that
particular Search head and run the command by going to bin as follows -

./splunk init shcluster-config -auth admin:password -mgmt_uri IPaddressofSHinHTTPSFollowedByMGMTPort -replication_port 9000 -replication_factor 3 -conf_deploy_fetch_url DeployerIpaddress:8089 -secret passwordofdeployer -shcluster_label clusterName

27. What are orphaned searches and reports? How do you find them and change the owner/delete them?

Scheduled saved searches with invalid owners are considered "orphaned". They cannot
be run because Splunk cannot determine the roles to use for the search context.

By default this check is enabled. If we want to disable it, edit limits.conf, look for
the [system_checks] stanza, and set orphan_searches to disabled. See:
https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Resolveorphanedsearches

28. Explain different roles and their capabilities in Splunk.


- User - Can only read from Splunk artifacts, e.g. reports, dashboards, alerts and
so on. Doesn't have edit permissions.
- Power user - Can create dashboards, alerts and reports, and has edit permissions
- Admin - Has access to all production servers, can do server restarts, take care of
maintenance activities and so on. The power user and normal user roles are subsets of the
admin role

29. What is a Lookup? How is it useful and used in Splunk?
A lookup is a knowledge object in Splunk. Within our SPL code, if we need to reference
an external file we can do that using a lookup. Lookup files can be added to Splunk by
going to Settings > Lookups > Add lookup files.
Lookups are also useful for performing several types of joins, e.g.
inner, outer, etc.
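For instance, enriching events from a hypothetical lookup file owners.csv (with columns host and owner):

index=main | lookup owners.csv host OUTPUT owner | stats count by owner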

30. Explain a few transforming commands in SPL?

Transforming commands are used for transforming event data into a different format,
such as a chart or table.

Below are some of the examples -

stats
chart
timechart
rare
top etc
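For example, a timechart that buckets web events per hour by status (the index and field names are assumptions):

index=web | timechart span=1h count by status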

31. What is the tstats command and how does it work? Explain what tsidx is?
The tstats command performs statistical queries on indexed fields in tsidx files.

TSIDX files are time series index files. When raw data comes to the indexer it gets
converted into tsidx files, which are the actual searchable files from the Splunk search head

32. What are the stages of buckets in Splunk? How do you achieve a data retention policy in Splunk?
Buckets are the directories in Splunk which stores data.

Different stages are -

-> Hot Bucket - Contains newly incoming data. One the Hot bucket size reaches a
particular threshold, it gets converted to a cold bucket. This bucket is also searchable
from the search head

-> Warm bucket - Data rolled from hot bucket comes to this bucket. This bucket is not
actively written into but is still searchable. Once the indexer reaches maximum number
of cold buckets maintained by it, it gets rolled to warm buckets


-> Cold - Contains data rolled from warm buckets. The data is still searcahble from the
search head. It's mainly for the backup purpose in case one or more hot or warm
buckets are unsearchable. After cold buckets reaches a threshold, they gets converted to
Frozen buckets
-> Frozen - Once the data is in Frozen buckets, it is either archived or deleted. In this
stage we are not able searchable anything.

33. How will you configure an indexer cluster master?
configured -

Enable the master


To enable an indexer as the master node:
1. Click Settings in the upper right corner of Splunk Web.
2. In the Distributed environment group, click Indexer clustering.
3. Select Enable indexer clustering.
4. Select Master node and click Next.
5. There are a few fields to fill out:

● Replication Factor. The replication factor determines how many copies of data the
cluster maintains. The default is 3. For more information on the replication factor,
see Replication factor. Be sure to choose the right replication factor now. It is
inadvisable to increase the replication factor later, after the cluster contains
significant amounts of data.
● Search Factor. The search factor determines how many immediately searchable
copies of data the cluster maintains. The default is 2. For more information on the
search factor, see Search factor. Be sure to choose the right search factor now. It
is inadvisable to increase the search factor later, once the cluster has significant
amounts of data.
● Security Key. This is the key that authenticates communication between the
master and the peers and search heads. The key must be the same across all
cluster nodes. The value that you set here must be the same that you
subsequently set on the peers and search heads as well.
● Cluster Label. You can label the cluster here. The label is useful for identifying the
cluster in the monitoring console. See Set cluster labels in Monitoring Splunk
Enterprise.
6. Click Enable master node.
The message appears, "You must restart Splunk for the master node to become active.
You can restart Splunk from Server Controls."
7. Click Go to Server Controls. This takes you to the Settings page where you can initiate
the restart.
Important: When the master starts up for the first time, it will block indexing on the
peers until you enable and restart the full replication factor number of peers. Do not
restart the master while it is waiting for the peers to join the cluster. If you do, you will
need to restart the peers a second time.


34. What are the replication and search factors in the CM?
The search factor is the number of searchable copies of the data (the index/tsidx files)
the cluster maintains - these point into the raw data.
The replication factor is the number of copies of the raw data (the compressed rawdata
of the actual events) the cluster maintains.

35. Questions on knowledge objects will be asked (alerts, reports, tags, eventtypes etc.)
You already know about them

36. What is a Splunk workflow action? Have you worked on this?

Workflow actions let you launch automated interactions from fields in search results,
for example an HTTP GET/POST request or a secondary search.
We can create workflow actions by going to Settings > Fields > Workflow actions

40. Static captain and dynamic captain

To switch to a static captain, reconfigure each cluster member to use a static captain:
1. On the member that you want to designate as captain, run this CLI command:
splunk edit shcluster-config -mode captain -captain_uri <URI>:<management_port>
-election false
2. On each non-captain member, run this CLI command:
splunk edit shcluster-config -mode member -captain_uri <URI>:<management_port>
-election false
Note the following:

● The -mode parameter specifies whether the instance should function as a captain or
solely as a member. The captain always functions as both captain and a member.
● The -captain_uri parameter specifies the URI and management port of the captain
instance.
● The -election parameter indicates the type of captain that this cluster uses. By
setting -election to "false", you indicate that the cluster uses a static captain.
You do not need to restart the captain or any other members after running these
commands. The captain immediately takes control of the cluster.
To confirm that the cluster is now operating with a static captain, run this CLI command
from any member:
splunk show shcluster-status -auth <username>:<password>
The dynamic_election flag will be set to 0.


Revert to the dynamic captain


When the precipitating situation has resolved, you should revert the cluster to control by
a single, dynamic captain. To switch to dynamic captain, you reconfigure all the
members that you previously configured for static captain. How exactly you do this
depends on the type of scenario you are recovering from.
This topic provides reversion procedures for the two main scenarios:

● Single-site cluster with loss of majority, where you converted the remaining
members to use static captain. Once the cluster regains a majority, you should
convert the members back to dynamic.

● Two-site cluster, where the majority site went down and you converted the members
on the minority site to use static captain. Once the majority site returns, you should
convert all members to dynamic.

Return single-site cluster to dynamic captain

In the scenario of a single-site cluster with loss of majority, you should revert to dynamic
mode once the cluster regains its majority:

1. As members come back online, convert them one-by-one to point to the static captain:
splunk edit shcluster-config -election false -mode member -captain_uri
<URI>:<management_port>
Note the following:

● The -captain_uri parameter specifies the URI and management port of the static
captain instance.
You do not need to restart the member after running this command.
As you point each rejoining member to the static captain, it attempts to download the
replication delta. If the purge limit has been exceeded, the system will prompt you to
perform a manual resync, as explained in "How the update proceeds."
Caution: During the time that it takes for the remaining steps of this procedure to
complete, your users should not make any configuration changes.
2. Once the cluster has regained its majority, convert all members back to dynamic
captain use. Convert the current, static captain last. To accomplish this, run this
command on each member:
splunk edit shcluster-config -election true -mgmt_uri <URI>:<management_port>
Note the following:

● The -election parameter indicates the type of captain that this cluster uses. By
setting -election to "true", you indicate that the cluster uses a dynamic captain.
● The -mgmt_uri parameter specifies the URI and management port for this member
instance. You must use the fully qualified domain name. This is the same value that
you specified when you first deployed the member with the splunk init command.
You do not need to restart the member after running this command.

3. Bootstrap one of the members. This member then becomes the first dynamic captain.
It is recommended that you bootstrap the member that was previously serving as the
static captain.
splunk bootstrap shcluster-captain -servers_list
"<URI>:<management_port>,<URI>:<management_port>,..." -auth
<username>:<password>

A dynamic captain is one which keeps changing with the passage of time. To prefer a member
as captain, we edit server.conf and set the parameter preferred_captain = true.

41. License warning? Queue and pipeline?

In case the daily license limit is exhausted, there will be warnings coming on the search
heads that you've exceeded the daily license volume and you either need to upgrade your
license or stop ingesting the data.

Each user authenticated to Splunk has a limited search quota - normal users
have around 25 MB whereas power users have around 50-125 MB. Once this threshold is
exceeded for a particular time, the user's searches will start getting queued.

The queues in the indexing pipeline are:
Input Queue
Parsing Queue
Merging Queue
Typing Queue
Indexing Queue
Null Queue

42. Phonehome interval ? Server class ? Token ?


Phonehome interval is the interval at which a particular deployment client polls your
deployment server (shown in the DS as, e.g., 2 seconds ago, 10 seconds ago, etc.).

Server class are group of servers coming from the same flavour or same geographic
location. Ex, to combine all windows based servers we will create a windows based
server class. Similarly, to combine all Unix based servers we will create a unix based
server class.

Token is a placeholder for a set of values for a particular variable. Example, Name =
$Token1$. Now here Name field can have multiple values like, Naveen, Ashu, Ajeet etc.

The value that a particular token will hold completely depends upon the selection. Tokens
are always enclosed between $$, like the example above.

43. List the ways to find whether a forwarder is not reporting to the Deployment Server?
Check if the Forwarder host name/Ip Address is not under the blacklist panel in
Deployment server.

44. Can SF be 4? What data issues have you fixed?

Yes - if the search factor is 4, the replication factor must be equal to 4 or more than 4

45. What is throttle? Dashboard? 2 types of dashboards?

Throttling is suppressing an alert's results for a specific time period. This is normally
done on a per-search-result basis.

A dashboard is a kind of view which contains one or more rows; each row contains one or
more panels, and each panel shows different metrics.

There are three kinds of dashboards typically created with Splunk:
Dynamic form-based dashboards
Real-time dashboards
Dashboards as scheduled reports

46. License master data has exceeded? What will happen?

If the license volume has been exceeded you will start seeing warnings on the search head.
Indexing of the logs is not affected at first, but after 5 warnings within a rolling 30-day
window search is blocked (the data itself continues to be indexed).

47. What is Data models and Pivot tables?


Data models are a hierarchical representation of data. They show the data in a more
structured and organised format. Pivot tables are subsets of a data model - an
interface where users can create reports and alerts without much involvement with the SPL
language.

48. Default indexes created during Indexer installation?


Default indexes are - main, default, summary, _internal, _introspection, _audit, history,
_thefishbucket,_telemetry

49. How to onboard only JSON files ?


In props.conf we need to use below attribute

INDEXED_EXTRACTIONS = json

TRUNCATE = 10000

50. How does Splunk software handle data?


It breaks raw data into a set of events. Each event is assigned 5 default values - host,
source, sourcetype, timestamp, index

51. What is Knowledge bundle in Search head ?


A knowledge bundle is basically an app bundle used for sending regular updates
to all search head members in a cluster. The captain of the search head cluster
distributes the knowledge bundle to every search head member whenever a change on one
or more search heads takes place.

52. How will you make an indexer not searchable for a user? (Question wrong)
I don't know how to do it offhand, but I will ask someone

53. Which config file will you change so that RF & SF to


be same in multicluster environment ?
Indexes.conf

54. How to pull yesterday's data from the DB if the server was down?
If there was a connection problem between the database and DB Connect in Splunk and it
has now been resolved, we can run a SQL query using functions like sysdate-1 (for an
Oracle DB) or to_date(), again for Oracle and other DBMSs.

55. What are accelerated reports?


Report acceleration is subject to summary indexing. We cannot do report
acceleration on data coming directly from application indexes. Report acceleration is
done so that a report executes quickly at its scheduled time; it basically
minimises the info_max_time.

56. Push an app from the deployer to the search heads in a search head cluster

This is done from the deployer:

./splunk apply shcluster-bundle

The SH cluster member is initialized using the command below:

./splunk init shcluster-config -auth admin:password -mgmt_uri <https://IP of
SH:8089>-replication_port 9000 -replication_factor 3 -conf_deploy_fetch_url
DeployerIpaddress:8089 -secret passwordofdeployer -shcluster_label clusterName

57. How to delete the indexed data ?


The |delete command marks the matching events as unsearchable from the search head (the data remains on disk but is never returned).
Or
From on particular indexer
./splunk clean eventdata -index <indexName>

58. How to create indexes ?


From the splunk web, we can navigate to Settings > Indexes > New
OR
./splunk add index <index_name>

OR
Via configuration files -- indexes.conf

59. How do you read data from third layer of bucket?


Buckets reside in $SPLUNK_HOME/var/lib/splunk/defaultdb/db (hot/warm) and
$SPLUNK_HOME/var/lib/splunk/defaultdb/colddb (cold) for the main index.

With maxDataSize = auto_high_volume, Splunk caps buckets at 10 GB on a 64-bit OS.
The third layer of bucket refers to the cold bucket, which is still searchable.

60. A user is not able to search with a particular index and raises a request to fix the issue. What will you do?
we need to go to settings > Access controls > Roles > YourUserRole > Indexes and
check if the user has read access to index.

61. Forwarder types? Uses?


2 types of forwarders - Universal Forwarder and Heavy Forwarder.

Universal forwarder agents are basically installed on the client, i.e. where we are
getting the data from. They consume very little CPU and memory and don't have a web UI.
Heavy forwarders run the full enterprise version of the Splunk software; they do parsing
(masking, index routing, sourcetype routing and dropping garbage data), i.e.
pre-processing, routing and filtering, and they have a web UI.

62. What are alerts ? How moves the data ?


Alerts are saved searches in Splunk. They are used so that if anything goes wrong (the
search condition is met), the appropriate team is notified immediately.


63. How do you identify how much data is ingested, and which indexer is down?
index=_internal idx=* earliest="-1d@d" latest=now() |eval Size =
b/(1024/1024/1024)|table Size

to identify which Indexer is down we can again run a simple command - index=_internal
source="*splunkd.log*""*Connection failure*" - By running this command you will get to
know the indexerIP which is having connection failure.

64. Different configuration files you worked with ?


inputs.conf
outputs.conf
props.conf
transforms.conf
server.conf
serverclass.conf
indexes.conf

65. How do you reset admin password ?


Stop Splunk Enterprise
Find the passwd file for your instance ($SPLUNK_HOME/etc/passwd) and rename it to
passwd.bk
Create a file named user-seed.conf in your $SPLUNK_HOME/etc/system/local/ directory.
In the file add the following text:
[user_info]
PASSWORD = NEW_PASSWORD
In the place of "NEW_PASSWORD" insert the password you would like to use.
Start Splunk Enterprise and use the new password to log into your instance from Splunk
Web.
If you previously created other users and know their login details, copy and paste their
credentials from the passwd.bk file into the passwd file and restart Splunk.

66. How to identify which port splunk is running on ?


Go to /bin and run the following command - ./splunk show web-port
to know the management port, run this command - ./splunk show splunkd-port

67. How do you monitor the entire health of the system? Which component do you log in to, to see the dashboard?
The Monitoring Console application - this is a pre-built application

68. How do you update the indexer when there is a new index to be added?
Login to Splunk web, go to setting > indexes >new


69. How do you identify which machine is not reporting to Splunk?
Login to the deployment server and check the deployment client (i.e. the Universal
Forwarder) phone-home interval - if the last phone home is longer ago than usual,
e.g. 24 hours ago or 3 days ago, that machine is no longer reporting to Splunk

70. How you deploy app in SHcluster ?


First we need to download the app from Splunkbase and copy it to
$SPLUNK_HOME/etc/shcluster/apps on the deployer, and then push it to the search head
members from the deployer:
splunk apply shcluster-bundle

71. Change the retention period on an indexer? What is the config file?

The retention period can be changed by editing indexes.conf:
frozenTimePeriodInSecs = <seconds>

72. Which bucket is not searchable?

Once data reaches the frozen bucket, it is no longer searchable

====================================================================================


Ashutosh Bhatt Answers


1. Explain about your roles and responsibilities in your
organization -

- Taking care of scheduled maintenance activities

- Creating user based knowledge objects like, Dashboards, reports, alerts, static and
dynamic lookups, eventtypes, doing field extractions.

- Troubleshooting issues related to production environment like a dashboard not showing
up the data - In this case we basically check from the raw logs if the format of the data
has changed or not.

- Been part of mass password update activities for DATABASE related inputs because if
the DATABASE password change happens we need to change the connection password
created in our BBConnect application

2. Explain the Architecture of Splunk components in your organization. Single site/multi site cluster?

- We have a multisite cluster both at Rochelle and Hudson. Each of these clusters
contains 40 indexers each. Each of the cluster has 1 cluster master, 1 deployment server,
more than 10000 forwarders installed on clients, 1 deployment server configured to
receive data from forwarders - Deployment server consists of 3 kinds of apps, 7 search
heads in a cluster, 1 deployer

3. Questions on ports will be asked. Can you change the default port on which a Splunk component runs, and if yes, how?

Yes, it is configurable like this -

1. Log into Splunk Web as the admin user.
2. Click Settings in the top-right of the interface.
3. Click the Server settings link in the System section of the screen.
4. Click General settings.
5. Change the value for either Management port or Web port, and click Save.

Alternatively, you can also go to the /bin folder and run the following command -

splunk set web-port newportnumber

4. Have you worked on data onboarding (very crucial and a BAU one)? What kind of data have you onboarded? What process do you follow while onboarding data? Have you also worked on data normalization?

Yes. We need to log in to the machine, i.e. the client. If the Universal Forwarder is
not already installed there, we need to install one. Then we go to the /bin directory and
run the following command -

./splunk add monitor SOURCEFILENAME -index INDEXNAME. Example, we need to
monitor a file like this - /var/log/introspection/resource_usage.log. To monitor such file
we need to run a command like this, ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu(Ashu is the name of the index
where the data will be stored).

Types of data On-boarded - Application logs, webserver logs.

Yes, I have worked on data normalization. Data normalization, as the name states, is the
process of removing redundant/duplicate data; it also comprises logically
grouping data together.
In Splunk we make use of tags at search time to normalize data. There is one thing
we need to take care of while normalizing data: it should only be done at
search time and not index time. It is a technique adopted for faster data
retrieval and shorter search execution time, so it is better to do it once the data is
stored in the indexers.


Pointer: Explain the types of data we can ingest in Splunk. The common ones they expect us to
answer are flat files (log files, text files etc.) and syslog onboarding. You can also talk about
database and CSV onboarding.

5. What are the important configs on a Universal forwarder? Explain them and also explain what params you define while writing them.

Universal forwarder doesn't have any GUI, so everything we need to configure is done by
logging in to the UF with admin credentials. Once logged in, from the
path opt/splunkforwarder/bin we run the following command to add the
indexer IP/hostname it will be forwarding data to: ./splunk
add forward-server indexerIP:9997

Once this is done, we need to add the source file names to be monitored. An example has
already been explained in #4: ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu

Inputs.conf - All the source names added via ./splunk add monitor
/var/log/introspection/resource_usage.log -index Ashu will be visible under inputs.conf

Outputs.conf - All the indexer names added via ./splunk add forward-server
indexerIP:9997 will be visible under outputs.conf

Pointers: inputs.conf , outputs.conf
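A minimal pair of UF config sketches tying the two files together (the indexer IP and group name are placeholders):

inputs.conf:
[monitor:///var/log/introspection/resource_usage.log]
index = Ashu

outputs.conf:
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = 192.168.1.10:9997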

6. Have you worked on Heavy Forwarders? What's the importance of them? What are the important configuration files you have on a HF?

- Heavy forwarders are used for data pre-processing, meaning selective data forwarding and removing unwanted values.

These are the important configs of a heavy forwarder -

When you open transforms.conf, these are the parameters which are configurable -

DEST_KEY

REGEX

FORMAT

7. Have you worked on data transformation? For example, can you achieve the below scenarios?

a. How will you mask the sensitive data before it is indexed?

Configure a SEDCMD class in props.conf (SEDCMD is a props.conf setting, applied under the relevant sourcetype stanza) -

SEDCMD-<class> = s/<regex>/<replacement>/flags

where regex is a Perl-style regular expression, replacement is a string to replace the regular expression match, and flags can be either the letter g to replace all matches or a number to replace a specified match.

b. Can you change/replace the hostname with a new host?

Yes, using the above class and its replacement parameter - replacement is a string to replace the regular expression match.
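An alternative sketch using a props/transforms pair to rewrite the host metadata field at parse time (stanza names and the new hostname are illustrative):

transforms.conf:
[override_host]
REGEX = .
DEST_KEY = MetaData:Host
FORMAT = host::new-hostname

props.conf:
[my_sourcetype]
TRANSFORMS-host = override_host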

c. How can you filter out unwanted data from a data source and drop it before it gets indexed, so that I save on licensing cost?

This is done by means of a heavy forwarder, using the same transforms.conf parameters - DEST_KEY, REGEX and FORMAT - to route matching events to the null queue, as sketched below.
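A minimal null-queue sketch that drops events containing DEBUG before they reach the index (stanza and sourcetype names are illustrative):

transforms.conf:
[drop_debug]
REGEX = (?i)debug
DEST_KEY = queue
FORMAT = nullQueue

props.conf:
[my_sourcetype]
TRANSFORMS-drop = drop_debug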


8. How have you onboarded syslog data? Can you explain that?

Yes, using the unencrypted syslog service and a universal forwarder. Alternatively, we can also use daemon processes like collectd and statsd to transmit data using UDP.
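If Splunk listens for syslog directly, the input is a simple stanza in inputs.conf - a sketch using the standard syslog UDP port (the index name is an assumption):

[udp://514]
sourcetype = syslog
index = network
connection_host = ip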

9. Why are the sourcetype and source definitions so important?

Sourcetype is used as a data classifier, whereas source contains the exact path from which the data is onboarded.

10. What is a license master? How does Splunk licensing work? How do you create a license master and a license pool?

The license master is a Splunk instance which is used for monitoring Splunk data volume on a daily basis. This is how we configure a license master -

Log in to any particular indexer - go to Settings > System > Licensing > Add License File (mainly an XML-based license file is added).

11. How much data is applicable for license cost - the entire data size being ingested, or only the compressed raw data after indexing?

Licensing cost is calculated on the entire data size that is ingested. Compression has nothing to do with license usage; compression is done to save disk space.

12. What is the data compression ratio – Raw:Index

Normally data is compressed by around 38-45%.

We can check compressed sizes by running the |dbinspect command.
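A hedged SPL sketch for comparing raw size against size on disk with dbinspect (the field names come from dbinspect's output; the index name is an example):

| dbinspect index=main
| stats sum(rawSize) AS raw_bytes sum(sizeOnDiskMB) AS disk_mb
| eval raw_mb = raw_bytes / 1024 / 1024
| eval pct_of_raw = round(disk_mb / raw_mb * 100, 1)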


13. Have you worked on apps and add-ons? What apps and add-ons have you installed and configured in your org?

- Apps are full-fledged packages for Splunk Enterprise. They contain options for creating dashboards, reports, alerts, lookups, eventtypes, tags and all other kinds of knowledge objects. Add-ons, on the other hand, perform a limited set of functionalities: for example, the Windows add-on can only get data from Windows-based systems, Unix-based add-ons can get data from specific Unix-based servers, and so on.

We have installed apps like DB Connect, and add-ons like the Windows app for infrastructure and Unix apps for getting data from Unix-based systems.

Pointer: Talk about Splunk DB Connect and the Splunk app/add-on for Linux/Unix - these are 2 common apps/add-ons that you should know.

It would be a good deal if you can also talk about the Splunk App for AWS (cloud integration).

14. What are props.conf and transforms.conf? How do you write stanzas and relate them?

Props.conf is a configuration file used for selective indexing, mainly for data pre-processing; we write the source/sourcetype stanza in props.conf. Transforms.conf specifies which events/parameters/fields need to be excluded or rewritten, using parameters such as DEST_KEY, REGEX and FORMAT (see the full filtering example under question 7c above for how a props stanza references a transforms stanza).

15. Questions on regex will be asked - a common one would be: could you tell me the regex for an IP address?

Regex in Splunk is done with the help of the rex, regex and erex commands. It is unlikely anyone will actually ask you to write the regex for IP addresses from memory.
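Still, a commonly used sketch for extracting an IPv4 address with rex (the field name is an assumption, and the pattern does not validate octet ranges):

... | rex field=_raw "(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"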

16. What is the deployment server used for? What are server classes and apps? How do you deploy base configuration (inputs.conf, outputs.conf) from the DS?

The deployment server is a Splunk instance that distributes apps and configuration to deployment clients - indexers, universal forwarders, heavy forwarders etc. - which poll it at a regular interval.

Server classes are used for grouping different servers: if I have to group all the Unix-based servers I can create a class called UNIX_BASED_SERVERS and group all such servers under it. Similarly, for Windows-based servers I can create a WINDOWS_BASED_SERVERS class and group all those servers under it.

Apps are basically sets of stanzas which are deployed to the members of a server class.

When we set up server classes and assign them apps, we need to restart splunkd, a system-level process. Once this is done, any new app updates will automatically be sent to all servers.
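A minimal serverclass.conf sketch on the deployment server (the class name, host pattern and app name are illustrative):

[serverClass:UNIX_BASED_SERVERS]
whitelist.0 = unix-host-*

[serverClass:UNIX_BASED_SERVERS:app:unix_inputs]
restartSplunkd = true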

17. How will you troubleshoot a Splunk performance issue/error?

Look through splunkd.log for diagnostic and error metrics. We can also go to the Monitoring Console app and check the resource utilization of different server components: CPU, memory utilization etc.

We can also install the Splunk-on-Splunk app from splunkbase.splunk.com and monitor the health of different Splunk instances.

18. What is a summary index? How do you create accelerated reports? Is licensing cost applicable to a summary index?

A summary index contains summarised or brief data. We create accelerated reports by enabling the report acceleration option. Kindly remember that report acceleration should only be done on data coming from a summary index, not on data coming from the application or main index.

A summary index doesn't count towards licensing volume.
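A hedged sketch of populating a summary index from a scheduled search with the collect command (the index names are assumptions; the summary index must already exist):

index=web sourcetype=access_combined status=5*
| stats count BY host
| collect index=web_summary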

19. Name some default Splunk indexes and their use?

main - The default index. While adding a monitor using ./splunk add monitor SOURCEFILENAME, if we don't mention any index name the data automatically goes into this index.

_audit - All search-related information: scheduled searches as well as ad-hoc searches.

_introspection - System-wide data, including memory and CPU data.

_internal - Splunk's own internal logs (splunkd.log and friends), including error-specific data - for example, database connectivity failures.

20. What is epoch time, and how do you convert epoch time into standard time? (function in Splunk)

Epoch time is the UNIX-style timestamp used in Splunk. Epoch time is converted to standard time using the strftime function: |eval Time = strftime(EpochTimeField, "%Y-%m-%d %H:%M:%S")

21. What do you know about the Splunk CIM? What is Splunk data normalization?

CIM is the Common Information Model used by Splunk. The CIM acts as a common standard applied to data coming from different sources.

Data normalization has already been explained above.

22. What is the file precedence Splunk follows? Could you explain that?

- The system local directory has the highest precedence
- Then, app local directories
- Then, app default directories
- And last, the system default directory

23. How does Splunk prevent duplicate indexing of logs/data? What is the fishbucket? How does it work?

Splunk prevents duplicate data by means of the fishbucket index. The fishbucket mainly consists of records of the last ingested data: for each source it keeps a seek pointer (and checksum). So if the last entry from a particular source was pushed at 4:18 PM CST, the pointer sits there, and the next entry from the same source is read from after that point.

24. What is the dispatch directory for?

- The dispatch directory stores the artifacts of all scheduled saved searches and ad-hoc searches.

25. How do you check config file consistency? (Explain the btool command)

btool shows the merged, on-disk configuration that Splunk will use, taking layering and file precedence into account, so it is the standard way to check config consistency. One caveat: it reads the files on disk, which may differ from what a running splunkd instance has actually loaded.

To debug a particular config, we can run a command like this -

./splunk btool inputs list --debug
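A couple of other commonly used invocations (the app name is an example):

./splunk btool props list --debug --app=search
./splunk btool check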

26. How do you configure a search head cluster? Explain the deployer.

Search head clustering is a detailed process and requires a lot of pre-requisites to be in place.

This is how it is configured -

PRE-REQUISITES FOR SEARCH HEAD CLUSTERING -

● System requirements -
● Each member must run on its own VM
● All machines must run the same OS (need to clarify whether version differences also matter)
● All members must run the same version of Splunk Enterprise
● Members must be connected over a high-speed network
● There must be at least 3 members deployed to start a SH cluster

● Replication factor -

The replication factor must be met in all scenarios.

Other system requirements -

● Updates to dashboards, reports and newly created saved searches are always subject to the captain - the captain takes care of all of this.

Before we configure search head clustering, we need to configure a deployer, because the deployer IP is required to create a search head cluster.

A bit about the deployer -

DEPLOYER -
Distributes apps and other configurations to SH cluster members
Can be colocated with a deployment server if the no. of deployment clients is < 50
Can be colocated with the master node
Can be colocated with the monitoring console
Can service only one SH cluster
The cluster uses a security key to communicate with/authenticate SH members

Configure a deployer -

Go to /opt/splunk/etc/system/local and vi server.conf

Then add the below stanza -

[shclustering]
pass4SymmKey = password
shcluster_label = cluster1

Restart Splunk, since a change has been made in .conf files.

While setting up search head clustering, we first have to create a deployer as above.

When it comes to setting up SH clustering, the first thing we need to do is log in to that particular search head, go to bin, and run the command as follows -

./splunk init shcluster-config -auth admin:password -mgmt_uri https://<SH-IP>:<mgmt-port> -replication_port 9000 -replication_factor 3 -conf_deploy_fetch_url https://<DeployerIP>:8089 -secret passwordofdeployer -shcluster_label clusterName
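Once init has been run on every member, a captain is bootstrapped from one of them - a sketch, with the member URIs as placeholders:

./splunk bootstrap shcluster-captain -servers_list "https://sh1:8089,https://sh2:8089,https://sh3:8089" -auth admin:password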

27. What are orphaned searches and reports? How do you find them and change the owner/delete them?

Scheduled saved searches owned by users who are no longer part of the Splunk ecosystem or have left the company are called orphaned searches. It happens because there is no role associated within Splunk for that particular user any more.

With the upgrade of Splunk to 8.0.1 the problem with orphaned searches has largely been resolved. But if you still see the orphaned-searches warning under Messages on your search head, you can follow this guideline on how to resolve it:

https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Resolveorphanedsearches

28. Explain the different roles and their capabilities in Splunk.

- User - Can only read Splunk artifacts such as reports, dashboards and alerts; no edit permissions.

- Power user - Can create dashboards, alerts and reports, and has edit permissions.

- Admin - Has access to all production servers, can do server restarts, takes care of maintenance activities and so on. The power user and normal user roles are subsets of the admin role.

29. What is a lookup? How is it useful and used in Splunk?

A lookup is a knowledge object in Splunk. If we need to reference an external file from our SPL we can do that using a lookup. Lookup files can be added to Splunk by going to Settings > Lookups > Add lookup files.

Lookups are also useful for performing several types of joins: inner, outer etc.
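A one-line usage sketch, enriching events from an uploaded CSV (the file and field names are assumptions):

... | lookup user_info.csv user_id OUTPUT full_name department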

30. Explain a few transforming commands in SPL.

Transforming commands are used for transforming event data into a different format; this may include converting it into a chart, table, etc.

Below are some of the examples -

stats
chart
timechart
rare
top
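For example, a transforming search with timechart (the index and sourcetype are illustrative):

index=web sourcetype=access_combined
| timechart span=1h count BY status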

31. What is the tstats command and how does it work? Explain what tsidx is.

The tstats command works only on index-time fields. Like the stats command, it shows the data in tabular format. It is very fast compared to stats, but with tstats you can only group by indexed fields, not the search-time fields created with the eval command.

TSIDX files are time-series index files. When raw data comes to an index it gets converted into tsidx files; these are the files that actually make the data searchable from a Splunk search head.

32. What are the stages of buckets in Splunk? How do you achieve a data retention policy in Splunk?

Buckets are the directories in Splunk which store data.

The different stages are -

- Hot bucket - Contains newly incoming data and is the only bucket actively written to. Once a hot bucket reaches a particular size threshold, it rolls to a warm bucket. Hot buckets are searchable from the search head.

- Warm bucket - Data rolled from hot buckets comes here. A warm bucket is not actively written to but is still searchable. Once the indexer reaches the maximum number of warm buckets it maintains, the oldest warm buckets roll to cold.

- Cold - Contains data rolled from warm buckets, typically kept on cheaper storage. The data is still searchable from the search head. Once cold buckets reach the retention threshold, they roll to frozen.

- Frozen - Once the data is in frozen buckets, it is either archived or deleted, and it is no longer searchable.
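Retention itself is controlled per index in indexes.conf - a minimal sketch (the index name and values are examples):

[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# roll buckets to frozen after ~90 days
frozenTimePeriodInSecs = 7776000
# cap the total index size at ~500 GB
maxTotalDataSizeMB = 500000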


33. How will you configure the indexer cluster master?

The cluster master, or master node, maintains a particular cluster. This is how it is configured -

Enable the master

To enable an indexer as the master node:
1. Click Settings in the upper right corner of Splunk Web.
2. In the Distributed environment group, click Indexer clustering.
3. Select Enable indexer clustering.
4. Select Master node and click Next.
5. There are a few fields to fill out:


Replication Factor.The replication factor determines how many copies of data the
cluster maintains. The default is 3. For more information on the replication factor,
see Replication factor. Be sure to choose the right replication factor now. It is
inadvisable to increase the replication factor later, after the cluster contains
significant amounts of data.



Search Factor. The search factor determines how many immediately searchable
copies of data the cluster maintains. The default is 2. For more information on the
search factor, see Search factor. Be sure to choose the right search factor now. It is
inadvisable to increase the search factor later, once the cluster has significant
amounts of data.



Security Key. This is the key that authenticates communication between the master
and the peers and search heads. The key must be the same across all cluster nodes.
The value that you set here must be the same that you subsequently set on the peers
and search heads as well.



Cluster Label. You can label the cluster here. The label is useful for identifying the
cluster in the monitoring console. See Set cluster labels in Monitoring Splunk
Enterprise.


6. Click Enable master node.

The message appears: "You must restart Splunk for the master node to become active. You can restart Splunk from Server Controls."
7. Click Go to Server Controls. This takes you to the Settings page where you can initiate the restart.
Important: When the master starts up for the first time, it blocks indexing on the peers until you enable and restart the full replication-factor number of peers. Do not restart the master while it is waiting for the peers to join the cluster; if you do, you will need to restart the peers a second time.
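The equivalent server.conf sketch on the master node (the key and label are placeholders; the same settings appear in CLI form later in this document):

[clustering]
mode = master
replication_factor = 3
search_factor = 2
pass4SymmKey = yoursecretkey
cluster_label = cluster1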

34. What are the replication and search factors in the CM?

The replication factor is the number of copies of raw data the cluster maintains; the search factor is the number of those copies that are immediately searchable (i.e. that also carry the tsidx index files).

35. Questions on knowledge objects will be asked (alerts, reports, tags, eventtypes etc.)

You already know about them

36. What is Splunk workload management? Have you worked on this?

Workload management lets you allocate CPU and memory resources to search and ingest workloads through workload pools and rules. It should not be confused with workflow actions, which automate low-level tasks on field values and are created by going to Settings > Fields > Workflow actions.

40. Static captain and dynamic captain


A static captain in a search head cluster is one that doesn't change; it is used mainly when dynamic election is not possible. You configure it by disabling election, e.g. ./splunk edit shcluster-config -mode captain -captain_uri https://<SH-URI>:8089 -election false (and -mode member on the other members).
A dynamic captain is one that can change over time via captain elections; this is the normal mode, with election enabled on all members.

41. License warning? Queues and pipelines

In case the daily license limit is exhausted, warnings will appear on the search heads saying you've exceeded the daily license volume and you either need to upgrade your license or stop ingesting data.

Each user authenticated to Splunk also has a limited search quota - normal users have around 25 MB, whereas power users have around 50-125 MB. Once this threshold is exceeded, that user's searches start getting queued.

42. Phone-home interval? Server class? Token?

The phone-home interval is the interval at which a particular deployment client polls your deployment server; the forwarder management UI shows the last poll, e.g. "2 seconds ago", "10 seconds ago" etc.

A server class is a group of servers of the same flavour or from the same geographic location. For example, to combine all Windows-based servers we create a Windows-based server class; similarly, to combine all Unix-based servers we create a Unix-based server class.

A token is a placeholder for the value of a particular variable. Example: Name = $Token1$. Here the Name field can have multiple values (Naveen, Ashu, Ajeet etc.); the value a particular token holds depends entirely on the selection. Tokens are always enclosed between dollar signs, as in the example above.

43. List the ways of finding out whether a forwarder is not reporting to the deployment server.
Check whether the forwarder's hostname/IP address appears under the blacklist panel in the deployment server, and check its last phone-home time under forwarder management.

44. Can the SF be 4? What data issues have you fixed?

Yes, the search factor can be 4 if the replication factor is at least 4 (the SF can never exceed the RF).

45. What is throttling? What is a dashboard? What are the 2 types of dashboards?

Throttling is suppressing an alert for a specific interval of time; this is normally done on a per-search-result basis.

A dashboard is a kind of view which contains different panels, and each panel shows different metrics.

2 types of dashboards - the usual distinction is between static dashboards and form-based (dynamic) dashboards, where user inputs feed token values into the panels.

46. The license master's daily volume has been exceeded - what will happen?

You will start seeing warnings on the search heads. Indexing continues, but after repeated violations (5 warnings within a rolling 30-day window) searching is blocked until the violation is resolved.

47. What are data models and pivot tables?

Data models are a hierarchical representation of data; they show the data in a more structured and organised format. Pivot tables are subsets of a data model: an interface where users can create reports and alerts without much involvement with the SPL language.

48. Default indexes created during indexer installation?

Default indexes include main, summary, history, _internal, _introspection, _audit and _thefishbucket.

49. How to onboard only JSON files?

Monitor only the *.json files and set the sourcetype to the built-in _json sourcetype (or any sourcetype with INDEXED_EXTRACTIONS = json).
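A minimal inputs.conf sketch (the path and index are examples):

[monitor:///var/log/app/*.json]
sourcetype = _json
index = app_json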

50. How does Splunk software handle data?

It breaks raw data into a set of events. Each event is assigned 5 default values: host, source, sourcetype, timestamp and index name.

51. What is the knowledge bundle in a search head?

The knowledge bundle is the set of knowledge objects (lookups, props, transforms and related configuration) that a search head distributes to its search peers (the indexers) so that distributed searches run with consistent knowledge. (Configuration updates between search head cluster members, by contrast, are replicated by the cluster and pushed by the deployer.)

52. How will you make an indexer not searchable for a user?
I don't know how to do it, but I will ask someone. (Restricting the user's role to specific indexes, as in question 60 below, is one approach.)

53. Which config file will you change so that the RF & SF are the same in a multi-cluster environment?
indexes.conf

54. How to pull yesterday's data from the DB if the server was down?
If there was a connection problem between the database and DB Connect in Splunk and it has now been resolved, we can run a SQL query using functions like sysdate-1 (for an Oracle DB) or to_date(), again for Oracle and other DBMSs.

55. What are accelerated reports?

Report acceleration is tied to summary indexing; we cannot do report acceleration on data coming directly from application indexes. Report acceleration is done so that a report executes quickly at its scheduled time - it basically minimises info_max_time.

56. How do you push an app from the deployer to the search heads in a search head cluster?

Place the app under $SPLUNK_HOME/etc/shcluster/apps on the deployer, then push the configuration bundle with:

./splunk apply shcluster-bundle -target https://<any-member>:8089 -auth admin:password

(The init shcluster-config command quoted earlier initialises cluster members; it is not the command that pushes apps.)

57. How to delete indexed data?

The |delete command makes the data unsearchable from a search head (it does not reclaim disk space). To actually remove the data from an index, stop Splunk and run the CLI command ./splunk clean eventdata -index <indexname>.

58. How to create indexes?

From Splunk Web, navigate to Settings > Indexes > New.

59. How do you read data from the third layer of buckets?

Buckets reside in $SPLUNK_HOME/var/lib/splunk/defaultdb/db.

By default Splunk sets the bucket size to 10 GB on a 64-bit OS. The third layer of buckets refers to cold buckets, which are still searchable, so no special action is needed.

60. A user is not able to search a particular index and raises a request to fix the issue. What will you do?
Go to Settings > Access controls > Roles > <the user's role> > Indexes, and check whether the role has read access to that index.

61. Forwarder types? Uses?

There are 2 types of forwarders: the Universal Forwarder and the Heavy Forwarder.

Universal forwarders are basically agents installed on the clients, i.e. the servers from which we are getting the data. They don't have any pre-processing capability. Heavy forwarders, in turn, have pre-processing, routing and filtering capabilities.

62. What are alerts?

Alerts are saved searches in Splunk. They are used for notifying application/server owners etc. about erroneous conditions that may occur.

63. How do you identify how much data is ingested, and which indexer is down?

For ingested volume, query the license usage log:

index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=now
| eval GB = b/1024/1024/1024
| stats sum(GB) AS ingested_GB by idx

To identify which indexer is down, run a simple search - index=_internal source="*splunkd.log*" "*Connection failure*" - which surfaces the indexer IP that is having connection failures.

64. Different configuration files you have worked with?

inputs.conf
outputs.conf
props.conf
transforms.conf

65. How do you reset the admin password?

Stop Splunk Enterprise.
Find the passwd file for your instance ($SPLUNK_HOME/etc/passwd) and rename it to passwd.bk.
Create a file named user-seed.conf in your $SPLUNK_HOME/etc/system/local/ directory, and add the following text:
[user_info]
PASSWORD = NEW_PASSWORD
In place of "NEW_PASSWORD", insert the password you would like to use.
Start Splunk Enterprise and use the new password to log in to your instance from Splunk Web.
If you previously created other users and know their login details, copy and paste their credentials from the passwd.bk file into the new passwd file and restart Splunk.
66. How to identify which port Splunk is running on?
Go to /bin and run: ./splunk show web-port
To know the management port, run: ./splunk show splunkd-port

67. How do you monitor the overall health of the system? Which component do you log in to to see the dashboards?

The Monitoring Console application - this is a pre-built application.

68. How do you update the indexer when there is a new index to be added?

Log in to Splunk Web and go to Settings > Indexes > New.

69. How do you identify which machine is not reporting to Splunk?

Log in to the deployment server and check the deployment client (i.e. the universal forwarder) and its phone-home interval. If the last phone home is longer ago than usual - e.g. 24 hours ago, 3 days ago - that machine is no longer reporting to Splunk.

70. How do you deploy an app to the SHs?

Through the deployer - the deployer configuration has already been explained above.

71. How do you change the retention period on an indexer? Which config file?

The retention period can be changed by editing indexes.conf (frozenTimePeriodInSecs).

72. Which bucket is not searchable?

Frozen buckets. (Thawed buckets are frozen data that has been restored into the thawed path specifically to make it searchable again.)

Ashutosh Answers in 2021


How to add a new indexer to a cluster? ------------- Go to Settings > Indexer Clustering > Enable clustering > peer node, and give the master URI and secret. Since it is a new cluster member, run splunk apply cluster-bundle on the master so the configuration bundle is synced to this peer as well.

If my indexer is down, how do I troubleshoot? ------------------- If one of the indexer cluster members is down, start with a simple Splunk restart. After that go to /opt/splunk and see whether you can loop through the directory without errors like "file or directory doesn't exist". If errors persist, check the _internal logs in the search bar to see what exception caused the peer to go down. Alternatively, go to /opt/splunk/var/log/splunk/splunkd.log and check the latest 10,000 entries. A third way is to go to the Monitoring Console app and check the status and diagnostic metrics of the down peer. A fourth way is to go to Settings > Health report manager and check the indexer clustering status; if several parameters are RED, there may be an issue on the server backend too, and it's time to involve the server teams, since it might be a server crash as well.

Normally, when an indexer cluster member holding searchable copies goes down, the master node initiates bucket fixing: replicated raw copies on the surviving peers are made searchable (tsidx files are built) so the cluster gets back to the replication and search factors you've set.

What is search affinity in Splunk? - In a multisite cluster, search affinity refers to setting up search heads so that they query only their local site, i.e. the site nearest to them. For example, with a multisite cluster across two sites, Rochelle and Hudson, if a user searches from Rochelle, all the search requests go to the indexers in the Rochelle site, and similarly for Hudson. Setting up search affinity helps reduce network latency.

Any experience in creating custom apps ? - NO

What is maintenance mode? - Also called halt mode, because it prevents bucket replication within the indexer cluster. For example, when you are upgrading Splunk from 7.x to 8.x you need to enable maintenance mode. To enable it, go to the master node and run: splunk enable maintenance-mode. After the maintenance activity is finished, run: splunk disable maintenance-mode.

What does maintenance mode do? - Maintenance mode halts all bucket fixups, meaning that if there is a corrupt bucket it will not be fixed back to normal. It also stops checking conditions like "replication factor not met" or "search factor not met", and prevents the timely rolling of hot buckets to warm.

50% of the search heads are down - what will happen, and how do you resolve it? ---------- Run splunk show shcluster-status to see whether the captain is also down. If so, you can set up a static captain as follows: ./splunk edit shcluster-config -mode captain -captain_uri https://<SH-URL>:8089 -election false. If you have 4 SH members and 2 went down, the default replication factor of 3 can no longer be met; in that case you can re-initialise the SH cluster with the RF set to 2, like this: ./splunk init shcluster-config -auth username:password -mgmt_uri https://<SH-URI>:8089 -replication_port 9000 -replication_factor 2 -conf_deploy_fetch_url https://<Deployer-URL>:8089 -secret deployerencryptedpassword -shcluster_label labelName.

What challenges have you faced? ----------- HERE YOU CAN GIVE ANY EXAMPLE.

How to upgrade the version from scratch? ----------------- Enable maintenance mode and take a backup of all Splunk artifacts to a repository or backup server. Install the newer package using the wget utility on Linux machines, or the Windows installer on Windows. Keep using the Monitoring Console and health reports to check the status of your Splunk instances.

What is a base search and a child or post-process search? - Base and post-process searches are used for optimising dashboard run time: one search is executed once and its results are reused by multiple panels of the dashboard. To create a base search, declare <search id="basesearchID"> at dashboard level, then use <search base="basesearchID"> in every panel that builds on it, as sketched below.
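A minimal Simple XML sketch of the pattern (the index, queries and panel contents are illustrative):

<dashboard>
  <label>Status overview</label>
  <search id="base">
    <query>index=web | stats count by status</query>
  </search>
  <row>
    <panel>
      <table>
        <search base="base">
          <query>search status=5* | sort - count</query>
        </search>
      </table>
    </panel>
  </row>
</dashboard>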

What is the average time taken to ingest 500 GB of data? - It depends on how the ingestion is happening.

If the deployment server went down, how do you resolve it, and what is the impact? - The main purpose of the DS is to distribute apps and updates to a group of non-clustered Splunk instances. If the DS goes down, the deployment clients polling it will not get the latest set of apps and updates; data forwarding itself continues.

If the cluster master went down, how do you resolve it, and what is the impact? - The cluster master is responsible for managing all the indexer cluster members. If the CM goes down, replication fixups between the indexer members will not happen and coordination breaks; user searches still land on the indexer members, using the last known cluster state. As a remedy, restart splunkd on the cluster master and look for its internal logs on the other cluster members. To resync configuration across members, run splunk apply cluster-bundle from the master so that all the members have the same set of configuration.

After that you can run the |rest /servicesNS/-/-/data/ui/views and |rest /servicesNS/-/-/saved/searches REST calls against the members to see whether all the default artifacts are the same across all member nodes.

How many servers do we need to ingest 300 GB of data? How can you segregate the data? - You can make use of the https://splunk-sizing.appspot.com/ website to estimate the storage and bandwidth you may require.

Segregation of data should always happen based on SOURCETYPE.

10 people run the same query at a time, but only 6 get results - why? - It depends on the number of vCPUs your infrastructure supports. If you have 3 search head members with 2 vCPUs each, only 2*3 = 6 concurrent searches can run at a time. You need to increase throughput by adding more CPUs for concurrent processing.

How to check KV store status? - ./splunk show kvstore-status

What is the SDK framework? - Software Development Kit. The SDKs are frameworks provided by Splunk for creating custom apps that can be used for third-party tool integration, connecting to Kafka clusters and similar things. As Splunk puts it, the SDK framework is for creating Splunk apps, for developers.

Hot to warm rolling conditions? - Based on the hot bucket's retention policy and the maximum size of each bucket. You can use the |dbinspect index=<indexname> command to get bucket info for any index. Alternatively, you can inspect indexes.conf for the hot and warm bucket settings.

What is the background process to ingest data into Splunk? - Install a UF on the machine using the wget utility against the Splunk downloads site. Once done, loop through the /opt/splunkforwarder directory; if you can, the UF package is successfully installed. Run splunk enable boot-start so that the UF always starts at boot time.

You can also use statsd and collectd as background processes to ingest data into Splunk. Both statsd and collectd use UDP to transmit data, compared to the UF, which uses TCP.

What is the role of the captain, and how can we define the captain? - The captain takes care of replication and of managing searches efficiently between the search head members. A captain can be defined as follows: ./splunk bootstrap shcluster-captain -servers_list "https://shmemberURI:8089,https://othermemberURI:8089"

When you run the apply-bundle command, what is the background process? - splunkd

How to onboard AD logs into Splunk? - Typically with the Splunk Add-on for Windows: install a UF on the domain controllers and enable the Windows event log (e.g. Security) inputs.

How to onboard data through a Splunk add-on? - Settings > Data inputs > Continuously monitor > filename > Ingest using the add-on (select the add-on name from the dropdown).

How to onboard syslog data from scratch? - Point the devices at a syslog-ng (or rsyslog) server so that all the syslog data is written to files there, then install the UF package on the syslog-ng server and ingest those files into the Splunk indexers. (HTTP Event Collector or the REST API can be used as alternatives for sources that can POST events directly.)

How to ingest data from routers into Splunk? - SAME APPROACH AS ABOVE

Event times show a future timestamp - what is the reason, and how do you fix it? - Look at which timestamp column from the log files is being used to ingest the data; it might contain future dates (e.g. future migration dates or DR activities). Fix timestamp recognition in props.conf on the parsing tier with TIME_PREFIX and TIME_FORMAT so the correct field is used, and consider MAX_DAYS_HENCE to reject events dated too far in the future.

====

How to optimize a Splunk query in real time? - There are lots of techniques: base searches for dashboards, filtering as early as possible, avoiding wildcards, and preferring inclusion over exclusion. For example, search specifically for status=50* rather than |search NOT status=50*.

Use summary indexes to speed up search operations.

Use report acceleration to speed up report execution time.

Use data models, which can be reused within lots of other saved searches, dashboards and reports.

index=abc | delete - after deleting the data, how can we retrieve it? ------- The delete command only makes data unsearchable from the index; it never reclaims disk space, so the data still exists on disk even after you run delete.

Difference between a dashboard and a form? - A dashboard is a view. A form is a dashboard that takes user inputs, wrapping the dashboard code under a form name; you can refer to the form name when calling low-level APIs for Splunk integration with third-party apps.

Where are alerts and reports stored? - Under /opt/splunk/ ..... /saved/searches (on disk they live as stanzas in savedsearches.conf).

An alert didn't trigger - reasons, and how to troubleshoot? - Run the following command: |rest /services/search/jobs "Alert Name". This tells you when the alert last ran. If you have admin permissions, you can also run index=_audit "Alert Name" to see what time the alert took to run and when it was last executed, and index=_internal to get the diagnostic metrics for the same alert name.

You can also run |rest /servicesNS/-/-/saved/searches | search cron_schedule="0 *" (give the wildcard cron schedule for the alert) and check whether a lot of concurrent saved searches run at the same interval. Try moving the schedules of other alerts and reports 1-2 minutes ahead or behind.

What is the difference between top and head? - top gives you the most common field values along with a percentage of how frequently each appears compared to the other values. head just gives the first few results of the query. Example: a field called price has the values 20, 30, 40, 50, 60, 70, 80, 90, 20, 30, 40, 20. Running | top price puts 20 in the first row, because 20 appears the maximum number of times (3) among all price values, and also shows the percentage of occurrences. Running ... | head 5 instead returns the first five events, whose price values here are 20, 30, 40, 50, 60.

What is the REST API? - REST endpoints are paths to specific locations for accessing different types of knowledge objects. For example, |rest /servicesNS/-/-/saved/searches lists all reports and alerts; similarly, |rest /servicesNS/-/-/data/ui/views lists all dashboards, and so on.

Have you migrated knowledge objects from one environment to another? - Yes; you can do it with the help of the REST endpoints explained above.

Where can we find the path (app) of a dashboard? ---------- | rest /servicesNS/-/-/data/ui/views | search label="Dashboard_title" | table "eai:acl.app" label

What is the difference between stats, eventstats and streamstats? - stats gives you everything in tabular format, and you cannot reuse the original fields after the aggregation: if you do |stats sum(price) as Sum_Price by Product_Name and later |table price Product_Name, you will see NULL values for the price field. eventstats is helpful here: it adds the corresponding output to each event, so you can reuse both the aggregate and the original fields later in the search. After |eventstats sum(price) as Sum_Price by Product_Name, a later |table price Product_Name still shows actual price values. streamstats gives a running calculation for the specified field while also keeping the original field values. Example: price has the values 20, 30, 40, 50. After |streamstats sum(price) as Sum_Price by Product_Name, the first row's Sum_Price is 20, the second is 50 (20+30), the third is 90 (50+40), and so on; a later |table price Product_Name still shows the actual price values.
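A tiny self-contained sketch you can paste into a search bar to see streamstats in action (makeresults fabricates the rows; the field names are arbitrary):

| makeresults count=4
| streamstats count AS row
| eval price = case(row=1, 20, row=2, 30, row=3, 40, row=4, 50)
| streamstats sum(price) AS running_total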

SV Reddy First Time Questions 2019


## Carbentech, soluginex,CTS,data crasted
1. How do you ensure data availability across the Splunk infrastructure?
Ans: To ensure high availability of data in the environment, if one peer goes down it should not affect the end user. So in an indexer cluster, when one peer goes down, another peer steps in and serves the data to the end user.
2. What are the replication factor and search factor?
RF: the number of copies of raw data; it is equal to or less than the number of peers in the cluster. The RF you choose depends on how much failure tolerance you need.
SF: the number of copies of index files; it is equal to or less than the RF.
3. What will happen if a peer goes down?
Every 60 seconds the master sends a heartbeat to all peers, and every peer must reply to that heartbeat; a peer that stops replying is treated as down.
4. What will happen if the master goes down?
A peer tries to call the master when it gets data from the UF. If the master does not respond, the peer waits 60 seconds and tries again, up to 3 times. After that the peer carries on using the master's previously supplied state for up to 24 hours; once that history expires, the peer acts as a standalone indexer.
5. Difference between a valid and a complete cluster?
A valid cluster is one where the search factor is met (enough searchable copies exist).
A complete cluster is one where both the replication factor and the search factor are met.

6. Write a multisite replication factor for 5 indexers with RF 4 (see the sketch below).
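A hedged server.conf sketch on the master for such a setup (the site layout and origin counts are assumptions):

[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,total:4
site_search_factor = origin:1,total:2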


7. How to implement indexer clustering

On the master (server.conf):
[clustering]
mode = master
replication_factor = 3
search_factor = 2
pass4SymmKey = apple

On each peer (server.conf):
[replication_port://9080]

[clustering]
master_uri = https://1.1.1.1:8089
mode = slave
pass4SymmKey = apple

On the search head (server.conf):
[clustering]
master_uri = https://1.1.1.1:8089
mode = searchhead
pass4SymmKey = apple

Via the GUI:
Settings > Indexer clustering > Enable clustering > enter the master IP:8089, credentials and the secret key.
5. What is the best way to take a peer down permanently?
Run ./splunk offline --enforce-counts on the peer, so the cluster first reallocates its buckets and the peer is removed cleanly.

6. How to upgrade the cluster?

First upgrade the master, then immediately enable maintenance mode. Then upgrade the peers one by one, then upgrade the search head, and finally disable maintenance mode.
./splunk enable maintenance-mode
./splunk disable maintenance-mode

7. Where can I check the health of my cluster?

On the indexer master dashboard (Settings > Indexer clustering) we can see it.

8. Is it possible to exclude an index from replication?

Yes, we can exclude it, per index in indexes.conf:
repFactor = auto (replicate)
repFactor = 0 (exclude)
9. How to push common configuration to the peers in a single shot?
Place the apps under $SPLUNK_HOME/etc/master-apps on the master and run ./splunk apply cluster-bundle

10. How do searches get processed in an indexer cluster architecture?

The end user runs the query on the search head; the SH contacts the master, and the master tells the SH which indexers to go to.

11. Difference between a standalone search head and a clustered search head
SH: it won't replicate the Splunk knowledge objects.
SHC: the cluster replicates the Splunk knowledge objects across members.


12. How to restart all the peers in one shot?

By using the rolling restart command (from the master):
./splunk rolling-restart cluster-peers

13. What changes do we make in the conf files for multisite? (See the multisite server.conf sketch under question 6 above: multisite, available_sites, site_replication_factor, site_search_factor.)

14. Do we need to declare site info on the master and search head as well?

Yes - every node in a multisite cluster declares its own site (site = site1 under [general] in server.conf); the master additionally sets multisite = true, and search heads declare a site to get search affinity.

15. Is it okay to have the search head double as the indexer master?

Strictly no; they should be separate components, because the master performs very critical activities. It tracks and monitors all the components in the cluster: what each indexer is doing, when an indexer got data, when an index was replicated, and when a search head searched the data - all of this is recorded on the indexer master. Because the master's duties are so critical, don't assign it any other activities; if you do, it tends to stop working properly within a few days.

16. Can a single server act as master as well as peer?

Strictly no, for the same reasons as above: the master's activities are critical, and it should not be combined with any other role.

17. Can we have the master on 6.4 and peers on 6.3?

No, it won't be supported; in a cluster environment all components should run the same OS as well as the same version of the Splunk software.
18. Can we have the master on Windows and peers on a combination of *nix flavours?
No, it won't be supported; in a cluster environment all components should run the same OS as well as the same version of the Splunk software.

19. How to maintain the license for a standalone indexer?
Install the license directly on that indexer; it acts as its own license master.

20. Do we need to include a license for the search head?
No need; the license volume is for indexing only.

21. Are we able to convert a HF with a forwarder license?
Yes; if we use the forwarder license on a HF, indexing on the HF is automatically disabled.

22. How to maintain licenses in a clustered architecture?
By using a license master, which manages and controls the license slaves. From the license master we can define pools, add licenses and manage licensing.

23. What is a license pool and its allocation?


A license pool manages license slaves; it assigns a particular share of the license volume to a group of indexers.
24. What is the workflow of Splunk licensing?
License slaves report their indexing volume to the license master, which tracks the daily total against the licensed quota of each pool and raises warnings on violations.

25. What will happen if your license expires on a particular calendar day?
If the license is expired, we cannot search, but there is no effect on indexing the data on the indexer. We get 5 warning messages within 30 days, after which searching stays blocked until the violation is resolved.

##

1. How to connect standalone indexers with a search head

Via GUI: on the SH, go to Settings > Distributed search > Search peers > add the indexer IP:8089 and the indexer credentials, then restart.

Via configuration:
We list the indexers in the distsearch.conf file:
/opt/splunk/etc/system/local
vi distsearch.conf
[distributedSearch]
servers = 1.1.1.1:8089,2.2.2.2:8089
:wq!
Then we copy the search head's trusted.pem key and paste it on each indexer under:
/opt/splunk/etc/auth/distServerKeys

2. I want to deploy a common change to all of my forwarders. Is it possible to have a central control server?
Yes - the DS provides centralized management of forwarders. Instead of frequently logging in to each application server to create and modify the inputs and outputs files, with the DS we just push the configuration files to the deployment clients. It is also called forwarder management.

3. What are the pre-requisites for a Splunk installation?

- Root access
- 1 GB of space (df -h)
- A supported OS (uname -a)
- Default port availability (netstat -an | grep 8000 or 8089)
4. Is it possible to change the Splunk default ports?
Yes - in /opt/splunk/etc/system/local:
vi web.conf
[settings]
httpPort = 8001
mgmtHostPort = 127.0.0.1:8090
:wq!

Note: change this after installation and before starting the Splunk services.

5. Is it OK to load the indexer and search head capability on a single component?
Yes - an indexer has searching capability, but it will impact indexer performance, so it is good practice to create a standalone SH.

6. What is the major difference between a UF and a HF?

The UF has only forwarding capability, while the HF has forwarding, filtering and data-masking capabilities.

7. What is the phone-home interval? Are we able to tune it?

By default, every 60 seconds the deployment client contacts the deployment server; this process is called phoning home, and the interval is the phone-home interval.
Yes, we are able to tune it in deploymentclient.conf.
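A minimal deploymentclient.conf sketch raising the interval to 2 minutes (the DS host is a placeholder):

[deployment-client]
phoneHomeIntervalInSecs = 120

[target-broker:deploymentServer]
targetUri = ds.example.com:8089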

8. Are we able to disable the web interface?

Yes, 3 methods are available: via the UI, via web.conf, or via a CLI command.

Via GUI: Settings > Server settings > General settings > set Run Splunk Web to No.

Via CLI:
./splunk disable webserver
./splunk enable webserver

Via configuration:
Go to web.conf and add this setting under [settings]:
startwebserver = 0
0 means the web interface is disabled; 1 means it is enabled.

9. How to restrict specific users from seeing a few index datasets?

While creating roles, enable access only to the required indexes.

10. What are Splunk capabilities?
Capabilities are the granular permissions (e.g. schedule_search, edit_user) that are bundled into roles.

11. What is the difference between sourcetype overriding and sourcetype renaming?
Sourcetype overriding assigns a different sourcetype to a subset of the data - it is done by writing rules in props.
Sourcetype renaming completely renames an existing sourcetype.

12. Are we able to collect traps from network devices?

Yes, we are able to collect traps from SNMP devices.

13. Is it possible to filter at the inputs stage?

Yes - use whitelist and blacklist in inputs.conf.
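A small inputs.conf sketch - whitelist/blacklist are regexes matched against the file path (the path, patterns and index are examples):

[monitor:///var/log/app]
whitelist = \.log$
blacklist = debug
index = app_logs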


14. Shall I use the receiving port as the replication port on indexers?

No - the replication port should be unique.
15. What are the repository and target repository locations?
The repository location is on the deployment server: the folder holding the apps (inputs and outputs files) ready to be pushed to deployment clients -
/opt/splunk/etc/deployment-apps
The target repository location is where the folders land on the client side - on the UF it is
/opt/splunkforwarder/etc/apps

16. What is the dispatch directory, and are we able to take control over it?
The dispatch directory is where the artifacts of whatever you run from the search bar get stored:
/opt/splunk/var/run/splunk/dispatch
It can be cleaned up from /opt/splunk/bin:
./splunk cmd splunkd clean-dispatch /tmp -24h@h
Using limits.conf we can control the retention of these artifacts (ad-hoc searches default to 10 minutes):
## Global behaviour via limits.conf
[search]
ttl = 600
# default - 10 mins
[subsearch]
ttl = 300
# default - 5 mins

17. I have a log file in which a few datasets should go to index A and the remaining few to index B. Is that possible?
Yes, you can do this. You need to write index-overriding rules on the HF and route some data to index A and the rest to index B based on keywords, as sketched below.
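A hedged props/transforms sketch routing events containing keywordB to indexB (the names are illustrative; non-matching events keep their default index):

transforms.conf:
[route_to_b]
REGEX = keywordB
DEST_KEY = _MetaData:Index
FORMAT = indexB

props.conf:
[my_sourcetype]
TRANSFORMS-route = route_to_b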

18. What is metadata in Splunk?

Metadata is data about data - the fields which identify my data: source / sourcetype / host.

19. What are the basic troubleshooting steps if you do not receive your data at the indexer end?
First we check the communication between the UF and the indexer.
Then we check the monitor stanza: is the path correct, and does the index exist?
Then we check the splunkd logs to find the exact issue.

20. If an index or sourcetype is not created on the indexer but is referred to in inputs, will Splunk create it automatically?

Splunk will not create any index automatically. If the index is not present, it will throw an error (if you don't mention any index name in the monitor stanza, the data goes to the default index, main).
Splunk will create the sourcetype automatically - it strips it from the last part of the source.

##

1. What kind of data is Splunk able to read?

Structured as well as unstructured data, and it expects timestamps.
It cannot read binary files.

2. How does data traverse from the application environment to the Splunk infrastructure?

Explain the forwarding process (setting up the forwarder and enabling receiving on the indexer), the TCP channel (why tcpout matters) and the receiving port (the indexer port).

3. Is there any way I can ensure that all of my data reached the Splunk infrastructure?
With the useACK mechanism, the forwarder sends data in blocks (about 64 KB) to the indexer and waits for an acknowledgement; if no acknowledgement arrives within the timeout, it resends the data to another indexer in the same group.

By default useACK is false; we have to enable it in the outputs.conf file.
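A minimal outputs.conf sketch enabling acknowledgements across two indexers (the IPs are placeholders):

[tcpout:primary_indexers]
server = 10.0.0.1:9997,10.0.0.2:9997
useACK = true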

4. How does the forwarder deal with failure?

Through auto load balancing (autoLB): if one indexer is not available, it looks for an alternative.
By default autoLB is true.
5. Is there any way to ignore my old historical data before it gets into the Splunk receiver?
During onboarding, we check with the application team whether there are historical log files. If there are, we can skip that data with the ignoreOlderThan attribute in inputs.conf.
6. Is there any way to segregate and discard a few unwanted events from a single file before they reach the index queue?
On the HF, we create transforms and props rules to discard the data:
transforms.conf
[discardingdata]
REGEX = (?i)error
DEST_KEY = queue
FORMAT = nullQueue
props.conf
[Sourcetype]
TRANSFORMS-abc = discardingdata


7. Am I able to mask customer-sensitive data before it reaches the index queue?

Yes, we can, using masking rules in props and transforms on the HF.

8. I need to ingest five 10 GB files. How much disk space do I need on my indexer?

5 x 10 GB = 50 GB of actual data.
Disk needed = compressed raw data (about 10% of the actual data, roughly 5 GB) + index files (roughly 6.5 GB here), i.e. about 12.5 GB - around a quarter of the actual data size.

9. What is the default path of an index?

/opt/splunk/var/lib/splunk/

10. How to set the index retention policy?

frozenTimePeriodInSecs:
data is frozen (deleted or archived) once it reaches this age, even if coldDB still has free space
maxTotalDataSizeMB:
the maximum total size of the index

11. What happens if the cluster master goes down?

12. How does the cluster master know if a peer is down?
13. When do hot buckets roll over to warm?
14. What is the difference between offline and stop for a peer?
15. Why do we need an indexer cluster?
16. What is the difference between a search head and an indexer?
17. What is the difference between an indexer cluster and a search head cluster?
18. How to apply bundles in a cluster?
19. How to restart peers in a cluster?
20. How do you know the retention of data in buckets?

Edureka Splunk Interview Questions and Answers


The questions covered in this blog post have been shortlisted after collecting inputs
from many industry experts to help you ace your interview. In case you want to
learn the basics of Splunk then, you can start off by reading the first blog in my
Splunk tutorial series: What Is Splunk? All the best!

Q1. What is Splunk? Why is Splunk used for analyzing machine


data?
This question will most likely be the first question you will be asked in any Splunk
interview. You need to start by saying that:

● Splunk is a platform which allows people to get visibility into machine data,
that is generated from hardware devices, networks, servers, IoT devices and
other sources
● Splunk is used for analyzing machine data because it can give insights into
application management, IT operations, security, compliance, fraud detection,
threat visibility etc

Q2. Explain how Splunk works.


This is a sure-shot question because your interviewer will judge this answer of yours
to understand how well you know the concept. The Forwarder acts like a dumb agent
which will collect the data from the source and forward it to the Indexer. The Indexer
will store the data locally in a host machine or on cloud. The Search Head is then
used for searching, analyzing, visualizing and performing various other functions on
the data stored in the Indexer.


You can find more details about the working of Splunk here: Splunk Architecture:
Tutorial On Forwarder, Indexer And Search Head.


Q3. What are the components of Splunk?


Splunk Architecture is a topic which will make its way into any set of Splunk
interview questions. As explained in the previous question, the main components of
Splunk are Forwarders, Indexers and Search Heads. You can then mention that
another component called Deployment Server(or Management Console Host)
will come into the picture in case of a larger environment. Deployment servers:

● Act like an antivirus policy server for setting up Exceptions and Groups, so
that you can map and create different set of data collection policies each for
either a windows based server or a linux based server or a solaris based
server
● Can be used to control different applications running in different operating
systems from a central location
● Can be used to deploy the configurations and set policies for different
applications from a central location.

Making use of deployment servers is an advantage because configurations, path naming conventions and machine naming conventions, which are independent of every host/machine, can be easily controlled using the deployment server.

Q4. Why use only Splunk? Why can’t I go for something that is
open source?
This kind of question is asked to understand the scope of your knowledge. You can
answer that question by saying that Splunk has a lot of competition in the market
for analyzing machine logs, doing business intelligence, for performing IT operations
and providing security. But, there is no one single tool other than Splunk that can do
all of these operations and that is where Splunk comes out of the box and makes a
difference. With Splunk you can easily scale up your infrastructure and get
professional support from a company backing the platform. Some of its competitors
are Sumo Logic in the cloud space of log management and ELK in the open source
category. You can refer to the below table to understand how Splunk fares against
other popular tools feature-wise. The detailed differences between these tools are
covered in this blog: Splunk vs ELK vs Sumo Logic.

Q5. Which Splunk Roles can share the same machine?


This is another frequently asked Splunk interview question which will test the
candidate’s hands-on knowledge. In case of small deployments, most of the roles


can be shared on the same machine which includes Indexer, SearchHead and
LicenseMaster. However, in case of larger deployments the preferred practice is to
host each role on stand alone hosts. Details about roles that can be shared even in
case of larger deployments are mentioned below:

● Strategically, Indexers and Search Heads should have physically dedicated


machines. Using Virtual Machines for running the instances separately is not
the solution because there are certain guidelines that need to be followed for
using computer resources and spinning multiple virtual machines on the same
physical hardware can cause performance degradation.
● However, a License master and Deployment server can be implemented
on the same virtual box, in the same instance by spinning different Virtual
machines.
● You can spin another virtual machine on the same instance for hosting the
Cluster master as long as the Deployment master is not hosted on a
parallel virtual machine on that same instance because the number of
connections coming to the Deployment server will be very high.
● This is because the Deployment server not only caters to the requests
coming from the Deployment master, but also to the requests coming from
the Forwarders.

Q6. What are the unique benefits of getting data into a Splunk
instance via Forwarders?
You can say that the benefits of getting data into Splunk via forwarders are
bandwidth throttling, TCP connection and an encrypted SSL connection for
transferring data from a forwarder to an indexer. The data forwarded to the indexer
is also load balanced by default and even if one indexer is down due to network
outage or maintenance purpose, that data can always be routed to another indexer
instance in a very short time. Also, the forwarder caches the events locally before
forwarding it, thus creating a temporary backup of that data.

Q7. Briefly explain the Splunk Architecture


The original blog includes an image giving a consolidated view of the architecture of
Splunk (not reproduced here). You can find the detailed explanation in this link: Splunk
Architecture: Tutorial On Forwarder, Indexer And Search Head.


Q8. What is the use of License Master in Splunk?


License master in Splunk is responsible for making sure that the right amount of
data gets indexed. Splunk license is based on the data volume that comes to the
platform within a 24hr window and thus, it is important to make sure that the
environment stays within the limits of the purchased volume.
Consider a scenario where you get 300 GB of data on day one, 500 GB of data the
next day and 1 terabyte of data some other day and then it suddenly drops to 100
GB on some other day. Then, you should ideally have a 1 terabyte/day licensing
model. The license master thus makes sure that the indexers within the Splunk
deployment have sufficient capacity and are licensing the right amount of data.

Q9. What happens if the License Master is unreachable?


In case the license master is unreachable, then it is just not possible to search the
data. However, the data coming in to the Indexer will not be affected. The data will
continue to flow into your Splunk deployment, the Indexers will continue to index
the data as usual however, you will get a warning message on top your Search head
or web UI saying that you have exceeded the indexing volume and you either need
to reduce the amount of data coming in or you need to buy a higher capacity of
license.
Basically, the candidate is expected to answer that the indexing does not stop; only
searching is halted.

Q10. Explain ‘license violation’ from Splunk perspective.


If you exceed the data limit, then you will be shown a ‘license violation’ error. The
license warning that is thrown up will persist for 14 days. With a commercial license
you can have 5 warnings within a 30-day rolling window before your Indexer’s
search results and reports stop triggering. The free version, however, allows
only 3 warnings.

Q11. Give a few use cases of Knowledge objects.


Knowledge objects can be used in many domains. Few examples are:
Physical Security: If your organization deals with physical security, then you can
leverage data containing information about earthquakes, volcanoes, flooding, etc to
gain valuable insights
Application Monitoring: By using knowledge objects, you can monitor your
applications in real-time and configure alerts which will notify you when your
application crashes or any downtime occurs
Network Security: You can increase security in your systems by blacklisting certain
IPs from getting into your network. This can be done by using the Knowledge object
called lookups
Employee Management: If you want to monitor the activity of people who are
serving their notice period, then you can create a list of those people and create a
rule preventing them from copying data and using them outside
Easier Searching Of Data: With knowledge objects, you can tag information,
create event types and create search constraints right at the start and shorten them
so that they are easy to remember, correlate and understand, rather than writing
long search queries. Those constraints where you put your search conditions and
shorten them are called event types.
These are some of the operations that can be done from a non-technical perspective
by using knowledge objects. Knowledge objects are the actual application in
business, which means Splunk interview questions are incomplete without
Knowledge objects. In case you want to read more about the different knowledge
objects available and how they can be used, read this blog: Splunk Tutorial On
Knowledge Objects


Q12. Why should we use Splunk Alerts? What are the different options while setting up Alerts?
This is a common question aimed at candidates appearing for the role of a Splunk
Administrator. Alerts can be used when you want to be notified of an erroneous
condition in your system. For example, send an email notification to the admin when
there are more than three failed login attempts in a twenty-four hour period.
Another example is when you want to run the same search query every day at a
specific time to give a notification about the system status.
Different options that are available while setting up alerts are:

● You can create a web hook, so that you can write to hipchat or github. Here,
you can write an email to a group of machines with all your subject, priorities,
and body of the message


● You can add results, .csv or pdf or inline with the body of the message to
make sure that the recipient understands where this alert has been fired, at
what conditions and what is the action he has taken
● You can also create tickets and throttle alerts based on certain conditions like
a machine name or an IP address. For example, if there is a virus outbreak,
you do not want every alert to be triggered because it will lead to many
tickets being created in your system which will be an overload. You can
control such alerts from the alert window.

You can find more details about this topic in this blog: Splunk alerts.
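
As a rough illustration, a scheduled alert ultimately lives as a stanza in savedsearches.conf; the search, schedule, threshold and recipient below are all made-up values:

[Too many failed logins]
search = index=security "failed password" | stats count
enableSched = 1
cron_schedule = */15 * * * *
dispatch.earliest_time = -15m
dispatch.latest_time = now
alert_type = number of events
alert_comparator = greater than
alert_threshold = 3
actions = email
action.email.to = admin@example.com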

Q13. Explain Workflow Actions


Workflow actions is one such topic that will make an appearance in any set of Splunk
interview questions. Workflow actions are not familiar to the average Splunk user and
can be answered only by those who understand them completely. So it is important that
you answer this question aptly.
You can start explaining Workflow actions by first telling why it should be used.
Once you have assigned rules, created reports and schedules then what? It is not
the end of the road! You can create workflow actions which will automate certain
tasks. For example:

● You can do a double click, which will perform a drill down into a particular list
containing user names and their IP addresses and you can perform further
search into that list
● You can do a double click to retrieve a user name from a report and then pass
that as a parameter to the next report
● You can use the workflow actions to retrieve some data and also send some
data to other fields. A use case of that is, you can pass latitude and longitude
details to google maps and then you can find where an IP address or location
exists.

(The original blog includes a screenshot of the window where you can set up workflow actions.)


Q14. Explain Data Models and Pivot


Data models are used for creating a structured hierarchical model of your data. It
can be used when you have a large amount of unstructured data, and when you
want to make use of that information without using complex search queries.
A few use cases of Data models are:

● Create Sales Reports: If you have a sales report, then you can easily create
the total number of successful purchases, below that you can create a child
object containing the list of failed purchases and other views
● Set Access Levels: If you want a structured view of users and their various
access levels, you can use a data model
● Enable Authentication: If you want structure in the authentication, you can
create a model around VPN, root access, admin access, non-root admin
access, authentication on various different applications to create a structure
around it in a way that normalizes the way you look at data.
So when you look at a data model called authentication, it will not matter to
Splunk what the source is, and from a user perspective it becomes extremely
simple because as and when new data sources are added or when old ones
are deprecated, you do not have to rewrite all your searches and that is the
biggest benefit of using data models and pivots.

On the other hand with pivots, you have the flexibility to create the front views of
your results and then pick and choose the most appropriate filter for a better view of
results. Both these options are useful for managers from a non-technical or


semi-technical background. You can find more details about this topic in this blog:
Splunk Data Models.


Q15. Explain Search Factor (SF) & Replication Factor (RF)


Questions regarding Search Factor and Replication Factor are most likely asked when
you are interviewing for the role of a Splunk Architect. SF & RF are terminologies
related to Clustering techniques (Search head clustering & Indexer clustering).

● The search factor determines the number of searchable copies of data


maintained by the indexer cluster. The default value of search factor is 2.
However, the Replication Factor in case of Indexer cluster, is the number of
copies of data the cluster maintains and in case of a search head cluster, it is
the minimum number of copies of each search artifact, the cluster maintains
● Search head cluster has only a Search Factor whereas an Indexer cluster has
both a Search Factor and a Replication Factor
● Important point to note is that the search factor must be less than or equal to
the replication factor
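
For an indexer cluster, both factors are set on the cluster master in server.conf; a minimal sketch using the common defaults (RF 3, SF 2):

[clustering]
mode = master
replication_factor = 3
search_factor = 2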

Q16. Which commands are included in ‘filtering results’ category?


There will be a great deal of events coming to Splunk in a short time. Thus it is a
little complicated task to search and filter data. But, thankfully there are commands
like ‘search’, ‘where’, ‘sort’ and ‘rex’ that come to the rescue. That is why, filtering
commands are also among the most commonly asked Splunk interview questions.
Search: The ‘search’ command is used to retrieve events from indexes or filter the
results of a previous search command in the pipeline. You can retrieve events from
your indexes using keywords, quoted phrases, wildcards, and key/value expressions.
The ‘search’ command is implied at the beginning of any and every search operation.
Where: The ‘where’ command however uses ‘eval’ expressions to filter search
results. While the ‘search’ command keeps only the results for which the evaluation
was successful, the ‘where’ command is used to drill down further into those search
results. For example, a ‘search’ can be used to find the total number of nodes that
are active but it is the ‘where’ command which will return a matching condition of an
active node which is running a particular application.
Sort: The ‘sort’ command is used to sort the results by specified fields. It can sort
the results in a reverse order, ascending or descending order. Apart from that, the
sort command also has the capability to limit the results while sorting. For example,
you can execute commands which will return only the top 5 revenue generating
products in your business.
Rex: The ‘rex’ command basically allows you to extract data or particular fields from
your events. For example if you want to identify certain fields in an email id:
abc@edureka.co, the ‘rex’ command allows you to break down the results as abc
being the user id, edureka.co being the domain name and edureka as the company
name. You can use rex to breakdown, slice your events and parts of each of your
event record the way you want.
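
A few illustrative one-line searches tying these commands together (the index and field names are invented):

index=web_logs status=404
index=web_logs | where status >= 500
index=sales | sort - revenue | head 5
index=mail | rex field=email "(?<user_id>[^@]+)@(?<domain>.+)"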

Q17. What is a lookup command? Differentiate between inputlookup & outputlookup commands.


Lookup command is that topic into which most interview questions dive into, with
questions like: Can you enrich the data? How do you enrich the raw data with
external lookup?
You will be given a use case scenario, where you have a csv file and you are asked
to do lookups for certain product catalogs and asked to compare the raw data &
structured csv or json data. So you should be prepared to answer such questions
confidently.
Lookup commands are used when you want to receive some fields from an
external file (such as CSV file or any python based script) to get some value of an
event. It is used to narrow the search results as it helps to reference fields in an
external CSV file that match fields in your event data.
An inputlookup basically takes an input, as the name suggests. For example, it
would take the product price or product name as input and then match it with an
internal field like a product id or an item id. Whereas, an outputlookup is used to
generate an output from an existing result set and write it to a lookup file. Basically,
inputlookup is used to enrich the data and outputlookup is used to build that
enrichment information.
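
Illustrative usage, assuming a lookup file named product_catalog.csv exists:

| inputlookup product_catalog.csv | search price > 100
index=sales | stats count by product_id | outputlookup product_counts.csv
index=sales | lookup product_catalog.csv product_id OUTPUT product_name price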


Q18. What is the difference between the ‘eval’, ‘stats’, ‘chart’ and ‘timechart’ commands?
‘Eval’ and ‘stats’ are among the most common as well as the most important
commands within the Splunk SPL language and they are used interchangeably in the
same way as ‘search’ and ‘where’ commands.

● At times ‘eval’ and ‘stats’ are used interchangeably however, there is a subtle
difference between the two. While ‘stats‘ command is used for computing
statistics on a set of events, ‘eval’ command allows you to create a new field
altogether and then use that field in subsequent parts for searching the data.
● Another frequently asked question is the difference between ‘stats’, ‘charts’
and ‘timecharts’ commands. The difference between them is mentioned in the
table below.

Stats: A reporting command used to present data in a tabular format. You can use
multiple fields to build the table.

Chart: Displays the data in the form of a bar, line or area graph; it can also generate
a pie chart. It takes only 2 fields, one each on the X and Y axis.

Timechart: Allows you to build line graphs over a time axis. It takes only 1 field,
since the X-axis is fixed as time.
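
Rough one-line equivalents of the three commands (index and field names invented):

index=web | stats count by host, status
index=web | chart count over host by status
index=web | timechart span=1h count by host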

Q19. What are the different types of Data Inputs in Splunk?


This is the kind of question which only somebody who has worked as a Splunk
administrator can answer. The answer to the question is below.

● The obvious and the easiest way would be by using files and directories as
input


● Configuring Network ports to receive inputs automatically and writing scripts


such that the output of these scripts is pushed into Splunk is another
common way
● But a seasoned Splunk administrator, would be expected to add another
option called windows inputs. These windows inputs are of 4 types: registry
inputs monitor, printer monitor, network monitor and active directory monitor.

Q20. What are the default fields for every event in Splunk?
There are 5 default fields which come attached with every event in Splunk.
They are host, source, sourcetype, index and timestamp.

Q21. Explain file precedence in Splunk.


File precedence is an important aspect of troubleshooting in Splunk for an
administrator, developer, as well as an architect. All of Splunk’s configurations are
written within plain text .conf files. There can be multiple copies present for each of
these files, and thus it is important to know the role these files play when a Splunk
instance is running or restarted. File precedence is an important concept to
understand for a number of reasons:

● To be able to plan Splunk upgrades


● To be able to plan app upgrades
● To be able to provide different data inputs and
● To distribute the configurations to your splunk deployments.

To determine the priority among copies of a configuration file, Splunk software first
determines the directory scheme. The directory schemes are either a) Global or b)
App/user.
When the context is global (that is, where there’s no app/user context), directory
priority descends in this order:

1. System local directory — highest priority


2. App local directories
3. App default directories
4. System default directory — lowest priority

When the context is app/user, directory priority descends from user to app to
system:

1. User directories for current user — highest priority


2. App directories for currently running app (local, followed by default)
3. App directories for all other apps (local, followed by default) — for exported
settings only
4. System directories (local, followed by default) — lowest priority
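
You can verify which copy of a setting wins using Splunk's btool utility; for example, to print the merged inputs configuration along with the file each winning line comes from:

/opt/splunk/bin/splunk btool inputs list --debug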

Q22. How can we extract fields?


You can extract fields from either event lists, sidebar or from the settings menu via
the UI.


The other way is to write your own regular expressions in props.conf configuration
file.
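
A minimal search-time extraction sketch in props.conf (the sourcetype and field names are invented):

[my_sourcetype]
EXTRACT-email_parts = (?<user_id>[^@]+)@(?<domain>\S+)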

Q23. What is the difference between Search time and Index time
field extractions?
As the name suggests, Search time field extraction refers to the fields extracted
while performing searches, whereas fields extracted when the data comes to the
indexer are referred to as Index time field extraction. You can set up index time
field extraction either at the forwarder level or at the indexer level.
Another difference is that Search time field extraction's extracted fields are not part
of the metadata, so they do not consume disk space. Whereas index time field
extraction's extracted fields are a part of metadata and hence consume disk space.

Q24. Explain how data ages in Splunk?


Data coming in to the indexer is stored in directories called buckets. A bucket moves
through several stages as data ages: hot, warm, cold, frozen and thawed. Over
time, buckets ‘roll’ from one stage to the next stage.

● The first time when data gets indexed, it goes into a hot bucket. Hot buckets
are both searchable and are actively being written to. An index can have
several hot buckets open at a time
● When certain conditions occur (for example, the hot bucket reaches a certain
size or splunkd gets restarted), the hot bucket becomes a warm bucket
(“rolls to warm”), and a new hot bucket is created in its place. Warm buckets
are searchable, but are not actively written to. There can be many warm
buckets
● Once further conditions are met (for example, the index reaches some
maximum number of warm buckets), the indexer begins to roll the warm
buckets to cold based on their age. It always selects the oldest warm bucket
to roll to cold. Buckets continue to roll to cold as they age in this manner


● After a set period of time, cold buckets roll to frozen, at which point they are
either archived or deleted.

The bucket aging policy, which determines when a bucket moves from one stage to
the next, can be modified by editing the attributes in indexes.conf.
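
The attributes most commonly tuned for this aging policy, shown here as a sketch with illustrative values:

[test]
# maximum hot bucket size before it rolls to warm
maxDataSize = auto_high_volume
# maximum number of warm buckets before the oldest rolls to cold
maxWarmDBCount = 300
# age in seconds at which cold buckets roll to frozen
frozenTimePeriodInSecs = 15552000
# archive frozen buckets here instead of deleting them
coldToFrozenDir = /archive/test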


Q25. What is summary index in Splunk?


Summary index is another important Splunk interview question from an
administrative perspective. You will be asked this question to find out if you know
how to store your analytical data, reports and summaries. The answer to this
question is below.
The biggest advantage of having a summary index is that you can retain the
analytics and reports even after your data has aged out. For example:

● Assume that your data retention policy is only for 6 months but, your data
has aged out and is older than a few months. If you still want to do your own
calculation or dig out some statistical value, then during that time, summary
index is useful
● For example, you can store the summary and statistics of the percentage
growth of sale that took place in each of the last 6 months and you can pull
the average revenue from that. That average value is stored inside summary
index.

But the limitations with summary index are:

● You cannot do a needle in the haystack kind of a search


● You cannot drill down and find out which products contributed to the revenue
● You cannot find out the top product from your statistics
● You cannot drill down and nail which was the maximum contribution to that
summary.

That is the use of Summary indexing and in an interview, you are expected to
answer both these aspects of benefit and limitation.
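
Summary data is typically written by a scheduled search using the collect command; a sketch (the index and field names are invented):

index=sales earliest=-1mon@mon latest=@mon | stats sum(revenue) AS monthly_revenue | collect index=summary_sales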

Q26. How to exclude some events from being indexed by Splunk?


You might not want to index all your events in Splunk instance. In that case, how
will you exclude the entry of events to Splunk.
An example of this is the debug messages in your application development cycle.
You can exclude such debug messages by putting those events in the null queue.
These null queues are put into transforms.conf at the forwarder level itself.
If a candidate can answer this question, then he is most likely to get hired.
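
Note that null-queue routing happens at parsing time, so it takes effect on a heavy forwarder or indexer. A sketch of the two stanzas (the sourcetype and regex are examples):

props.conf:

[my_sourcetype]
TRANSFORMS-drop_debug = setnull

transforms.conf:

[setnull]
REGEX = DEBUG
DEST_KEY = queue
FORMAT = nullQueue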

Q27. What is the use of Time Zone property in Splunk? When is it


required the most?
Time zone is extremely important when you are searching for events from a security
or fraud perspective. If you search your events with the wrong time zone then you
will end up not being able to find that particular event altogether. Splunk picks up
the default time zone from your browser settings. The browser in turn picks up the


current time zone from the machine you are using. Splunk picks up that timezone
when the data is input, and it is required the most when you are searching and
correlating data coming from different sources. For example, you can search for
events that came in at 4:00 PM IST, in your London data center or Singapore data
center and so on. The timezone property is thus very important to correlate such
events.
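
If the events themselves carry no usable zone information, the time zone can also be pinned per sourcetype in props.conf (the sourcetype and zone are examples):

[my_sourcetype]
TZ = Asia/Kolkata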

Q28. What is Splunk App? What is the difference between Splunk


App and Add-on?
Splunk Apps are considered to be the entire collection of reports, dashboards, alerts,
field extractions and lookups.
Splunk Apps minus the visual components of a report or a dashboard are Splunk
Add-ons. Lookups, field extractions, etc are examples of Splunk Add-on.
Any candidate knowing this answer will be the one questioned more about the
developer aspects of Splunk.

Q29. How to assign colors in a chart based on field names in


Splunk UI?
You need to assign colors to charts while creating reports and presenting results.
Most of the time the colors are picked by default. But what if you want to assign
your own colors? For example, if your sales numbers fall below a threshold, then you
might need that chart to display the graph in red color. Then, how will you be able to
change the color in a Splunk Web UI?
You will have to first edit the panels built on top of a dashboard and then modify the
panel settings from the UI. You can then pick and choose the colors. You can also
write commands to choose the colors from a palette by inputting hexadecimal values
or by writing code. But, Splunk UI is the preferred way because you have the
flexibility to assign colors easily to different values based on their types in the bar
chart or line chart. You can also give different gradients and set your values into a
radial gauge or water gauge.
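
In Simple XML dashboards the same effect can be achieved declaratively; a sketch where the field names and hex colors are examples:

<option name="charting.fieldColors">{"sales": 0xDC4E41, "profit": 0x53A051}</option>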

Q30. What is sourcetype in Splunk?


Now this question may feature at the bottom of the list, but that doesn’t mean it is
the least important among other Splunk interview questions.
Sourcetype is a default field which is used to identify the data structure of an
incoming event. Sourcetype determines how Splunk Enterprise formats the data
during the indexing process. Source type can be set at the forwarder level for
indexer extraction to identify different data formats. Because the source type
controls how Splunk software formats incoming data, it is important that you assign
the correct source type to your data. It is important that even the indexed version of
the data (the event data) also looks the way you want, with appropriate timestamps
and event breaks. This facilitates easier searching of data later.
For example, the data may be coming in the form of a csv, such that the first line is a
header, the second line is a blank line and then from the next line comes the actual
data. Another example where you need to use sourcetype is if you want to break
down a date field into 3 different columns of a csv, one each for day, month and year,
and then index it. Your answer to this question will be a decisive factor in you getting
recruited.
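
For the csv example above, a hedged props.conf sketch (the sourcetype, field names and time format are invented):

[my_csv]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = date
TIME_FORMAT = %Y-%m-%d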


Splunk April Month Interview Questions

( Lockdown Period )

==================================

Optum (UHG) First Round Anil Interview Questions (23/04/2020)

Panel: Pawan and 2 other members

1. Tell me about yourself?

2. Day-to-day activities

3. When do you create a server class?

4. After creating a server class in the deployment server, do you restart or reload?

5. Do you add it manually via the GUI or in serverclass.conf?

6. After pushing the server class, how does the forwarder know which server class it
belongs to? Using which config file will the forwarder check with the deployment server
which apps it needs?

7. Have you created deploymentclient.conf? What will be in it?

8. Can you explain inputs.conf again (about the monitor stanza)?

9. What is the difference between a NAS storage path and the actual (local) storage path?

10. Explain props.conf and the importance of the time zone setting.

11. Where do you create an index?

12. Do you create indexes regularly or only occasionally?

13. Give a general idea about indexes.conf.

14. Describe the buckets.

15. Which bucket is writeable capabil…

Bharath Accolite Screening Questions (23/04/2020)

Interview Panel: Keerthi

1. Explain the Windows event log process and Windows performance data?

2. What command/code is executed by the forwarder?

3. How do you use the collection interval?

4. What is DB Connect?

5. What is a rising column?

6. What is the HTTP Event Collector?

7. Difference between Splunk event indexes vs. metrics indexes?

8. Difference between the stats and tstats commands?

9. If you need to migrate a data set, how do you do it? A data join?

10. What are lookups?

11. What is the difference between a CSV store and a KV store?

12. How do you handle case sensitivity?

13. What is a key in a KV store?

14. How to integrate additional HTML visualizations?

Questions from an MNC company in Bangalore (given by Satish)

1. What are the queues in Splunk?

2. Replication factor and search factor

3. How to extract the host from the universal forwarder itself?

4. How would the universal forwarder know that all the events have reached the
indexer, and which attribute do we use in the configuration?

5. Regular expressions: they gave scenarios to extract fields and we had to do it
on the board

6. How to send events to indexers from two machines (scenario based)

7. Bucket concepts

8. Site affinity and how it is helpful

9. Suppose there are 4 indexers and we don't know which one is down. Need to find
out which is down and bring the server up. (Scenario based)

10. Transaction command and definition

11. How does Splunk find duplicate events?

12. Where are Splunk search artifacts saved?

13. How to on-board data into Splunk.

Lowe’s first round interview questions (Charan), Bangalore

1. Tell me about yourself

2. Could you tell me the architecture?

3. Is it single-site or multi-site, and can you explain multi-site briefly?

4. Do you know clustering?

5. How many types of clustering are there?

6. Why are we using clustering?

7. What are search head clustering and indexer clustering?

8. How do we initialise it?

9. What is the captain and how is it initialised?

10. What is the deployer and why are we using it?

11. What are RF and SF?

12. What is the replication port?

13. Is the captain static or dynamic?

14. Where is it replicated?

15. Where do we have to create config files?

16. What are local and default?

17. How to check the cluster status?

18. What is the role of the license in clustering, is the license applicable for clustering,
and where do we have to configure it?

19. In indexer clustering, if one of the indexers is down, how do we know that and where
do we have to check?

20. What is the data age concept?

21. Could you tell me the naming convention of buckets?

22. Where do we have to configure it?

23. What are macros?

24. What is field extraction and field aliasing?

25. Do you know alerts and how to configure them?

26. What is a cron job and how to set it?

27. Why are we using the secret key in clustering and where is it replicated?

28. After initialising the clustering, is the secret key encrypted or unencrypted?

29. Why are we using the secret key and is it the same for the deployer?

30. What is the stats command and what is its use?

31. What is the eval command and its use?

32. What is the difference between stats and eventstats?

33. What are the methods of data ingestion (DB Connect app)?

34. What is the DB Connect app and how to set it up?

35. What type of data are you onboarding?

36. How are you onboarding it?

37. What are scheduled alerts and real-time alerts?

38. How to push the files to the search head, and what is the command?

39. What are lookups and their types?

40. What is the use of the license master?

41. How many forwarders are used in your architecture?

Accenture first interview questions (Charan), Bangalore

1. Tell me about yourself

2. Could you tell me your architecture?

3. Day-to-day activities

4. Do you know clustering?

5. What is clustering and why are we using it?

6. How are we initialising the clustering?

7. The replication and search factors depend on the indexers

8. So what RF and SF are you initialising?

9. What is a bloom filter?

10. How do you change the password and create a password in the CLI?

11. What is the data age concept?

12. How to implement that?

13. Which log files are you sending?

14. Retention period and where do we have to configure it?

15. Have you faced any troubleshooting issues with the license?

16. Could you tell me about that issue?

17. Which type of data are you onboarding?

18. What are RF and SF?

19. Do you have any experience with alerts?

20. I have 100 indexers; what are RF and SF?

=======================================================

Write the multisite replication factor for 5 indexers with RF as 4 (see the sketch after this list)

What is the best way to take a peer down permanently?

What are the changes to make in the conf files for multisite?

Do we need to declare the site info in the Master and Search Head as well?

How to maintain the license for a standalone indexer?

What is the workflow of Splunk licensing?

What are Splunk capabilities?

What happens if the cluster master is down?

How does the cluster master know if a peer is down?

When do hot buckets roll over to warm?

What is the difference between offline and stop for a peer?

Why do we need an indexer cluster?

What is the difference between a search head and an indexer?

What is the difference between an indexer cluster and a search head cluster?

How to apply bundles in a cluster?

How to restart peers in a cluster?

How do you know the retention of data in buckets?
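
As a hedged sketch for the multisite question at the top of this list (the site names and factor splits are illustrative, not the only valid answer), the cluster master's server.conf for a 2-site cluster of 5 indexers with a total replication factor of 4 might look like:

[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,total:4
site_search_factor = origin:1,total:2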

1. What is the difference between an index and an indexer?

2. What is the difference between a forwarder and a universal forwarder?

3. What is the difference between a universal forwarder and a heavy forwarder?

4. What is the use of a regex formula?

5. How to extract fields?

6. How many ways are there to extract fields?

7. What are the default ports?

8. Retention period?

9. What is the default bucket for Splunk?

10. File precedence

11. If I make changes to any files at the Splunk level, what is the first thing we do?

12. What is the difference between a power user, an admin and a user?

13. What is the workflow of Splunk and its architecture?

14. Components of Splunk?

15. What is the home directory of Splunk?

16. Can you explain the Splunk directory structure?

17. What are the default fields in Splunk?

18. How many ways can we ingest the data?

19. How many ways can we ingest data with a monitor?

20. What are the configuration files?

21. How to reset the fishbucket?

22. How to use a wildcard in Splunk?

23. What is a blacklist in Splunk?

24. What is the use of regular expressions?

25. Can you write one regular expression?

26. What is field extraction?

27. How many ways can we extract the fields?

28. Search time and index time?

29. Which is the best practice in field extraction?

30. Why is search time the best practice?

31. Which information is stored in props.conf?

32. What is fine-tuning data?

33. What is troubleshooting in Splunk?

34. What command is used to troubleshoot?

35. What is the difference between top, rare and stats?

36. What is the difference between stats and eval?

37. What is the difference between head and top?

38. What is the difference between fast mode, smart mode and verbose mode?

39. Which mode is best practice?

40. Is "All time" used in real time? If yes, why?

41. What are knowledge objects in Splunk?

42. What are data models in Splunk?

43. What is a lookup?

44. Where are lookups stored?

45. Is the license applied to lookups? Why?

46. Does Tableau interact with Splunk?

47. What is clustering? Why clustering?

48. Types of clustering?

49. If I upload the data, how many types of indexing are there?

50. What is the replication port?

51. Steps for Splunk clustering configuration?

52. What is the maximum retention period at your company?

53. What is the maximum number of buckets inside an index?

54. Which language is used by Splunk?

55. What is meant by phoning home?

56. Does Splunk store any form of data files?

57. What is an input? Absolute path.

58. What is an output? Destination index?

59. What are the alternative tools to Splunk?

SHYAM

1. How many servers do we need to ingest 300 GB of data? How can you
segregate the data?

2. If the cluster master went down, how do you resolve it? What is the impact?

3. If the deployment server went down, how do you resolve it? What is the impact?

ASHISH MALVIYA (Accenture)

1. Difference b/w UF and HF


2. Search head captain? Why do we need it?
3. Ten people are searching the same query at a time but only 6 members get the
results and the rest do not; why?
4. An alert didn't trigger? Reason?
5. 50% of the search heads are down? What will happen? How to resolve it?
6. Data age concepts? Which buckets are searchable? Which are writable?
7. Explain DB Connect?
8. How many types of licenses are there?
9. The license key expired; then what happens?

10. stats, tstats, eventstats?
11. Fields and field aliases?
12. How to check the permissions of a file?
13. How to check the size of a directory?

ROHIT YADAV (AIRBUS)

14. What are RF and SF?


15. How to upgrade Splunk?
16. How to check the KV store status?
17. What is maintenance mode?
18. What does maintenance mode do?
19. What is a rolling restart? Advantages? How many peers does it restart at a time?
20. Knowledge bundle? Validation bundle?
21. Use of the btool command?
22. Different types of queues?
23. Max throughput?
24. How to change permissions for a dashboard? Backend file name?
25. Access buckets? Threshold value?

WIPRO

26. What is Splunk? Why do we use it?


27. Architecture of Splunk?
28. How to configure indexes?
29. Benefits of sending the data to Splunk?
30. What is the purpose of the license master? What happens when the license is
expired?
31. What is the average time taken to ingest 500 GB of data?
32. Different kinds of options to set alerts?
33. What are workflows?
34. What are RF and SF?
35. Lookup command?
36. Scripting languages?
37. Different types of data inputs?
38. File precedence?
39. Data age concepts?
40. Summary index? What are its limitations?
41. Splunk add-ons?
42. Difference b/w apps and add-ons?
43. How to avoid duplicate log indexing?
44. Data models? Workflows?
45. Have you created any dashboards?
46. Dashboard updates?
47. How do you generate alerts?
48. Common ports used in Splunk?
49. Is search case sensitive or case insensitive?
50. What are tags? How do we use them?
51. Command for extractions?
52. Table command?

53. REST API?
54. SDK framework?
55. Indexing is not done yet; then how do we write queries?

GOURAV (WIPRO)

56. Queues in Splunk? Explain?


57. What is the difference b/w RF and SF?
58. Configuration ratio (compression ratio)?
59. Pipelines?
60. Fishbucket concept?
61. Credit card number masking?
62. btool?
63. Migration of data from one index to another index?
64. index=abc | delete …. After deleting the data, how can we retrieve it?
65. Hot-to-warm rolling conditions?
66. maxWarmDBCount?
67. Difference b/w stats and tstats?
68. Difference b/w index time fields and search time fields?
69. Post-process search?
70. Base search and child search?

RAVI

71. Can we break the events in the UF or not?


72. What is the use of configuration files?
73. Distributed environment? Cluster environment?
74. What is the background process to ingest data into Splunk?
75. The forwarder is fine but data is not coming to the index? Reasons?
76. We need to add an index to the cluster? How?
77. Is cold searchable or not?
78. What is the role of the captain? How can we define the captain?
79. What are RF and SF? What is the error message if both are the same?
80. Difference b/w deployment server and deployer?
81. Apply bundle? Background process?
82. Any experience in creating custom apps?
83. What is the directory structure? What are the folders in it?
84. File precedence?

85. Where can we find the path of a dashboard?
86. Difference b/w a dashboard and a form?
87. Versions?
88. Process to upgrade the version?

SONY

89. Splunk components?


90. UF installation?
91. UF troubleshooting?
92. Explain props.conf?
93. Events are breaking incorrectly; then what will we do?
94. Boolean operations?
95. fillnull?
96. Epoch time to human readable?
97. Challenging thing you faced?
98. Splunk components?
99. Log issue troubleshooting?
100. btool
101. File precedence?
102. Explain props.conf?
103. Can props.conf be moved to the UF or not?
104. Regex for an IP address?
105. Boolean operations?
106. Index time and search time extraction?
107. Epoch time to human readable?
108. fillnull command?
109. stats / tstats / eventstats?
110. Search modes?
111. Duplicate files?
112. Fishbucket?
113. An alert is not triggered; then what is the troubleshooting process?
114. Where are alerts and reports stored?
115. Have you migrated any knowledge objects from one environment to
another environment?
116. Data age concepts?
117. Dispatch directory?
118. Difference between add-ons and apps?
119. How to install apps? (DB Connect)
120. Did you ingest any data in AWS?
121. Lookups?
122. A dashboard is private; how to share it with other teams?
123. Logs are missing? Troubleshooting?
124. The server is down; then what steps do you take?
NVISH

1. Components of Splunk

2. Use of the License Master

3. Knowledge objects

4. Search factor and replication factor

5. File precedence

6. Commands for filtering

7. The rex command

8. Types of data inputs

9. Data age concepts

10. Time zone

11. How to motivate youngsters to work on weekends

8K MILES

1. Daily activities

2. Ways of onboarding the data

3. Onboarding network devices (Rubrik)

4. Data from a router and how to push the logs

5. Which config files do you use to push the router logs directly to Splunk?

6. Masking

7. Architecture

8. Challenges you faced

9. Troubleshooting

10. Logs created with a future date and time; how to resolve?

11. UF installation

12. What configuration file is used to connect the forwarder to the deployment server?

13. Dashboards explanation

14. stats, tstats, eventstats

15. Alerts explanation

16. AWS add-ons

17. DB Connect

Accenture 3rd Round

Panel: Senthil Kumar Loganathan

1. Tell me about yourself

2. Which port is used for indexing?

3. What about the management port?

4. Have you worked on parsing data with regular expressions? For example, you have
firewall logs that are not in the correct format and you need to extract the IP address
and domain name.

5. What are the common performance issues in Splunk? Tell me the top 3 issues.

6. What are the common performance issues in Splunk on the admin side and the
development side?

7. I have 1 TB of data for ingesting. What are the recommendations for the Splunk
architecture (how many indexers, search heads)?

8. How many users can search the data on a search head effectively?

9. Disaster recovery in Splunk?

10. Do you have experience in the cloud?

11. How to automate pushing the logs on a heavy forwarder?

12. How will you troubleshoot universal forwarder agent issues?

13. After troubleshooting, a universal forwarder agent restart is done, but the issue is
still not solved; what is the next step we must take?

14. What is a bucket?

15. Is there any physical difference b/w hot, warm, cold, frozen and thawed buckets?

16. Why do we require small hot buckets, medium warm buckets and large cold/frozen/
thawed buckets? Do you have any brief idea behind that?

17. How can I increase the hot bucket size?

18. What is a frozen bucket and what is its use?

19. Can we search the data in a frozen bucket? (No.) Do you know the reason?

20. The data is archived, but for investigation purposes we need that data after 30
days. How will that data be restored?

21. A dashboard should be viewed only by a particular user. How will that happen?

22. What is the meaning of the search and replication factors?

23. What is the biggest advantage of using Splunk (a SIEM tool)?

24. What do we call Splunk?

25. From a business perspective, what are the other SIEM tools?

26. Have you tried ELK?

27. The license master server is not reachable. What will be the impact on the Splunk
instance?

28. Did you work on Splunk use cases?

29. Brute force attack in Splunk?

Check for attempts to gain access to a system by using multiple accounts with
multiple passwords.

Metmox (06-10-2020) Akshay
Panel: Vijay

1. Describe yourself

2. Give an explanation of the Splunk components

3. Explain knowledge objects

4. Explain any use cases you used

5. How do you optimize the search performance on the search head?

6. Explain an alert you created in your firm (you explained the gim fire Exception)

7. Explain configuration files

Metmox (Akshay), Panel: Bhargavi
1. Explain your technical stuff

2. Explain AWS & TCP in detail

3. What type of work do you do with AWS?

4. Do you have access to the database? Where is your Splunk located?

5. Explain the onboarding process

6. Do you have an SOP prepared for this?

7. If your client does not provide the SOP, are you capable of doing the onboarding?

8. Have you integrated Fir…


Value Labs (12-11-2020), Panel: Vamsi Krishna Konjati
1. Can you explain the current business followed by your company with regard to your duties?

2. Do you have any experience with ITSI?

3. How to create services in the UI?

4. Do you have experience on-prem or in the cloud?

5. What is Splunk?

6. What are the important components of Splunk?

7. What is the main difference between UF and HF?

8. What are the important config files in Splunk?

9. If I gave you a file that contains a password, and you need to ingest the data into Splunk,
what is the process? I am asking from scratch.

10. Can you explain buckets in Splunk?

11. What is the difference between a Splunk app and a Splunk add-on?

12. What command do you use to create a field?

Sony (30/10/2020)
Panel: Nallamathu Ravi Kumar

1. Can you brief me about the recent project that you have done and your roles and responsibilities?

2. Can you brief me about the project architecture they are using in Splunk?

3. What type of data are you onboarding into Splunk?

4. Components of Splunk?

5. What is the main difference between UF and HF?

6. Can we break the events in the UF?

7. What are the configuration files you worked on?

8. What is the main purpose of the configuration files?

9. What is the difference between a distributed and a standalone environment in Splunk?

10. Can you explain the background process by which Splunk ingests data?

11. How is the raw data backed up and where does that process happen?

12. Can you explain the troubleshooting?


WIPRO - Sai Kiran, Panel: Nitesh


1. Have you worked on Splunk Cloud?

2. Have you worked on a distributed environment and a clustered environment?

3. What are the search factor and replication factor?

4. Which syslog server did you work with?

5. Why can't we install the universal forwarder on the syslog server?

6. Your server is running; have you gone through the vulnerability part? Did you fix any
vulnerabilities in your current project?

7. What is Splunk Enterprise Security?

8. How many types of authentication are there in Splunk?

9. Have you configured LDAP and SAML?

10. Have you worked on the deployment server?

11. Just help me with simple steps to set up our environment: 3 indexers, 2 search heads, 50 UFs
and 50 Windows servers --> to bring this data into a Splunk index…

