Network Assurance and Management Course

ORHAN ERGUN LLC NETWORK ASSURANCE AND MANAGEMENT

SNMP
Simple Network Management Protocol

Topology:

 Components (NMS and SNMP Agent)
 Versions (1, 2c and 3)
 MIB and Object IDs
 Getting the NMS (Zabbix) ready
 Version 2c Configuration on IOS-XE and IOS-XR
 Version 3 Security Levels
 Version 3 Configuration

Task 01:
 Get the Zabbix node ready to act as the NMS for the network
 Configure SNMP v2c with the community public on Edge-Router-254, CSR1000v-1 and
XRv9k-2
 All traps should be sent to the NMS


Solution:
SNMP stands for Simple Network Management Protocol. As the name implies, it is simple to
understand, configure and work with!
SNMP has two major components:
 NMS: The Network Management Station is a server that collects the necessary
information from the infrastructure devices (routers, switches, servers etc.) or makes
changes on these devices.
 SNMP Agent: This component runs on the infrastructure devices (routers, switches
etc.) and serves the NMS. For example, it reads the information the NMS requested
from a database and sends it to the NMS, and when an event happens it informs the NMS.
If you take a look at the topology, there is a Zabbix node. This node is going to be our NMS, and
all the routers and switches are running an SNMP Agent.
SNMP has 3 versions:
 Version 1: This is the original version and it is considered obsolete. It only supports
32-bit counters, which is a limitation in today’s networks where links run at gigabits
per second and above. Security is the other major problem with the early versions
(1 and 2c): they simply put a community string (something like a password, but not as
good as one!) into the messages to add a little bit of security. Maybe this idea was good
in those days (30 years ago), but nowadays? Not at all! On many devices there is a
“public” community by default, and if you don’t change it or add ACLs to limit the NMS
IPs, anyone can GET the information from your entire MIB (we will talk about it soon, it
is something like a database).
Maybe you think of changing that default community (“public”) to something complex,
but trust me, the community is sent in clear text, so anyone can do a simple packet
capture and find that complex community!
We will not configure Version 1 at all, no one is willing to use it anymore.
 Version 2: This version introduced a new security system, but for reasons that are
beyond the scope of this article it was set aside and v2c was introduced instead (when
we talk about SNMP v2 we are actually talking about v2c).
In version 2c, they came up with a standard approach that could be used with any
vendor. The improvements include 64-bit counter support, improvements to the MIB
structure and some new messages (GETBULK and INFORM). The community string
from the old-fashioned version 1 was added back to this version as well.
Even nowadays this is the most widely used version, even without good security!


The reason is simplicity in every aspect: dealing with the MIB and dealing with the
configuration.
 Version 3: This is a totally different approach to SNMP with many security
improvements, such as true Authentication (MD5 and SHA) and Privacy (AES, DES
etc.). In this version we can authenticate the user, put them in a group, give them
access to only a portion of the MIB (by using Views) and finally encrypt the message
contents with an encryption algorithm.
This was only an introduction to the different versions, we will talk about them in detail
throughout the tasks.
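
To put the 32-bit versus 64-bit counter difference into numbers, here is a minimal Python sketch. It assumes the counter counts octets and that the link is running at full line rate; the function name seconds_to_wrap is made up for this illustration.

```python
def seconds_to_wrap(counter_bits: int, link_bps: float) -> float:
    """Seconds until an octet counter of the given width rolls over at full line rate."""
    max_octets = 2 ** counter_bits       # the counter wraps after this many octets
    octets_per_second = link_bps / 8     # 8 bits per octet
    return max_octets / octets_per_second


YEAR = 365 * 24 * 3600

for name, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    short = seconds_to_wrap(32, bps)
    long_years = seconds_to_wrap(64, bps) / YEAR
    print(f"{name}: 32-bit wraps in {short:.1f} s, 64-bit in about {long_years:.0f} years")
```

At 1 Gbps a 32-bit octet counter can wrap in roughly 34 seconds, so an NMS that polls once a minute cannot tell how many times the counter rolled over between two polls. That is the practical reason the 64-bit counters matter.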
The SNMP v2c configuration on Cisco routers and switches is very easy and
straightforward:
Edge-Router-254:

snmp-server community public RO SNMP_RO_ACCESS_LIST
snmp-server trap-source Ethernet0/1
snmp-server enable traps
snmp-server host 192.168.0.3 version 2c public
ip access-list standard SNMP_RO_ACCESS_LIST
 permit 192.168.0.3

As you can see, we have specified a community with “public” as the string. RO stands for
Read Only, which allows the NMS to only read the information on the device. The other option
is RW (Read Write), which allows the NMS to read and also make changes on the device. RW is
not recommended because of the security problems in 2c, and most implementations use RO to
just collect the information from the devices.
The named Access Control List at the end of the first command allows only specific devices to
poll the information from the device (in this case our NMS IP address is 192.168.0.3).
The trap-source command specifies the interface used to send the TRAPs. For example, if you
shut down an interface, that information is sent to the NMS immediately to let it know that
something happened on the device. Otherwise the NMS only polls the information at specific
intervals and would not be notified until the next poll.

Then we have enabled sending traps for all possible entries. It can be enabled for individual
entries as well:


Edge-Router-254(config)#snmp-server enable traps ?

aaa_server Enable SNMP AAA Server traps

atm Enable SNMP atm traps

auth-framework Enable SNMP CISCO-AUTH-FRAMEWORK-MIB traps

bfd Allow SNMP BFD traps

bgp Enable BGP traps

bstun Enable SNMP BSTUN traps

bulkstat Enable Data-Collection-MIB Collection notifications

ccme Enable SNMP ccme traps

cef Enable SNMP CEF traps

cnpd Enable NBAR Protocol Discovery traps

config Enable SNMP config traps

config-copy Enable SNMP config-copy traps

config-ctid Enable SNMP config-ctid traps

cpu Allow cpu related traps

dial Enable SNMP dial control traps

diameter Allow Diameter related traps

dlsw Enable SNMP dlsw traps

dnis Enable SNMP DNIS traps

ds1 Enable SNMP DS1 traps

dsp Enable SNMP dsp traps

eigrp Enable SNMP EIGRP traps

entity Enable SNMP entity traps

entity-ext Enable SNMP entity extension traps

ethernet Enable SNMP Ethernet traps

event-manager Enable SNMP Embedded Event Manager traps

firewall Enable SNMP Firewall traps

flowmon Enabel SNMP flowmon notifications

frame-relay Enable SNMP frame-relay traps

fru-ctrl Enable SNMP entity FRU control traps

gdoi Enable SNMP GDOI traps

hsrp Enable SNMP HSRP traps

ike Enable IKE traps

ipmobile Enable SNMP ipmobile traps

ipmulticast Enable SNMP ipmulticast traps


ipsec Enable IPsec traps

ipsla Enable SNMP IP SLA traps

isdn Enable SNMP isdn traps

isis Enable IS-IS traps

l2tun Enable SNMP L2 tunnel protocol traps

lisp Enable SNMP LISP MIB traps

memory Enable SNMP Memory traps

mempool Enable SNMP memory pool traps

mpls Enable SNMP MPLS traps

msdp Enable SNMP MSDP traps

mvpn Enable Multicast Virtual Private Networks traps

nhrp Enable SNMP NHRP traps

ospf Enable OSPF traps

ospfv3 Enable OSPFv3 traps

pfr Enable SNMP PfR traps

pim Enable SNMP PIM traps

pki Enable SNMP PKI Traps

pppoe Enable SNMP pppoe traps

pw Enable SNMP PW traps

resource-policy Enable CISCO-ERM-MIB notifications

rf Enable all SNMP traps defined in CISCO-RF-MIB

rsvp Enable RSVP flow change traps

snmp Enable SNMP traps

srst Enable SNMP srst traps

stun Enable SNMP STUN traps

syslog Enable SNMP syslog traps

trustsec-sxp Enable SNMP CISCO-TRUSTSEC-SXP-MIB traps

tty Enable TCP connection traps

voice Enable SNMP voice traps

vrfmib Allow SNMP vrfmib traps

vrrp Enable SNMP vrrp traps

waas Enable WAAS traps

xgcp Enable XGCP protocol traps

<cr>


This was a sample SNMP version 2c configuration on the IOS device.

The configuration on IOS-XE is the same:
CSR1000v-1:

snmp-server community public RO
snmp-server source-interface informs GigabitEthernet1
snmp-server enable traps
snmp-server location 1st floor DC
snmp-server host 192.168.0.3 informs version 2c public

This time we have not used any ACL to limit the NMS IP addresses (which is not recommended,
always try to use ACLs).
This Agent on IOS-XE is going to send Informs instead of Traps to the NMS (Informs are just
like Traps, but they expect an acknowledgement from the NMS).
You can also specify more information about the device, such as where it is placed, the
admin contact info etc.
Let’s take a look at the IOS-XR configs:
XRv9k-2:

snmp-server host 192.168.0.3 informs version 2c public
snmp-server community public RO
snmp-server traps
snmp-server ifindex persist

As you can see, the command syntax is almost the same as IOS.
We have also added the ifindex persist command. This feature provides an interface index
(ifIndex) value that is retained and reused when the router reboots.
That was all about the SNMP v2c configuration on the devices. As I mentioned before, it is very
simple to configure and work with!
Let’s get the NMS ready:
There are many Network Management Servers out there from different vendors that do the
same job, but with different GUIs and features. We will configure 3 of them in this lab. The
first one is Zabbix, which is a free Linux-based solution.
In EVE-NG there is no template for this node by default. In the SNMP video training we have
explained how to add this node to EVE-NG (please refer to those videos).


There is a DHCP server running on Edge-Router-254, which will provide the IP address, DNS
and default router information to this node. For simplicity we are going to use the dynamic IP
address on the NMS (in a real environment make sure to configure static values).
Just power on the Zabbix node and click on it. In the VNC console use these default username
and password to log in to the device:
Username: root
Password: zabbix

It shows you the Web GUI default username and password:

Username: Admin
Password: zabbix


If you enter the ip address command, you can find the IP address of the eth0 interface.
Let’s log in to the Web GUI. You can use any Windows node in this lab (Test-PC, Syslog-Server
etc.) to open a web browser and go to the Zabbix Web GUI. These are the Windows 10 and
Server 2016 nodes.

For example, I logged in to the Syslog-Server node (Win Server 2016) and entered the Zabbix
IP address.
 Windows 10 default credentials:
Username: User
Password: Test123
 Windows Server 2016 credentials:
Username: Administrator
Password: Test123
You need to add the Cisco IOS device template to Zabbix, by default it does not have any
template for Cisco devices.


The reason we add them is that the NMS will GET the values of Object IDs inside the MIB
(Management Information Base).
The MIB is a hierarchically structured database which we can walk through to search for an
Object ID (for example, interfaces have their own Object IDs in the MIB, and we can find detailed
information related to counters, link status etc. inside it).
You can get the official Cisco IOS template from the Zabbix website:

The templates may have some dependencies as well, for example:


Just click on the link, then download and import them before adding the Cisco IOS template.

After adding the templates, you can add the devices to the inventory:


For example we add CSR1000v-1:

{$SNMP_COMMUNITY} refers to the default community value in Zabbix for this template. It is
“public” by default. You can put any specific value in this box as well, for example
“communityX_123”, in which case make sure to define the same RO community value on the
SNMP Agent as well.
Associate the desired template with the device on the Templates tab:

We select the Net Cisco IOS SNMPv2 template that we imported recently.



Save the device configuration and that is all!

In the Monitoring > Hosts section you can see that those devices are available:

Click on the name of one of the devices and go to Graphs:


You can find graphs related to the CPU, Interfaces and Memory:

More graphs can be seen as well, for example the temperature of the chassis if you use
physical devices. This also depends on the template that you are using (which Object ID
values it can get from the MIB of the devices).
If you click on the Latest Data tab in the Monitoring section, it shows you the latest
data collected from the device:

Let’s take a look at this data:


This is the information related to Ethernet0/0 of the Edge-Router-254, such as its counters,
state and name.
Let’s shut down eth0/0 and do a packet capture on the eth0 interface of the Zabbix node:
 Before shutdown:

The NMS is getting the SNMP information from the Agents. Take a look at the Object IDs, the
long numbers: 1.3.6.1.2.1.31.1.1.1….. !


A human cannot remember them. That is why templates have been created and added to the
NMS, to request this specific information from the MIB.
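
As a rough illustration of what a template does with those long numbers, the Python sketch below translates a numeric OID under 1.3.6.1.2.1.31.1.1.1 (the ifXTable entry of the IF-MIB) into a readable name plus interface index. Only a handful of columns are listed, and the helper names parse_oid and describe are made up for this example; a real NMS loads the full MIB definitions instead of a hand-written table.

```python
# Numeric prefix of the IF-MIB ifXTable entry; only a few of its columns are listed.
IFX_ENTRY = (1, 3, 6, 1, 2, 1, 31, 1, 1, 1)
IFX_COLUMNS = {
    1: "ifName",
    6: "ifHCInOctets",
    10: "ifHCOutOctets",
    15: "ifHighSpeed",
    18: "ifAlias",
}


def parse_oid(text: str) -> tuple:
    """Turn '1.3.6.1...' (with or without a leading dot) into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


def describe(oid_text: str) -> str:
    """Return 'columnName.ifIndex' for known ifXTable OIDs, else the OID unchanged."""
    oid = parse_oid(oid_text)
    prefix_len = len(IFX_ENTRY)
    if oid[:prefix_len] == IFX_ENTRY and len(oid) == prefix_len + 2:
        column, if_index = oid[prefix_len], oid[prefix_len + 1]
        if column in IFX_COLUMNS:
            return f"{IFX_COLUMNS[column]}.{if_index}"
    return oid_text


print(describe("1.3.6.1.2.1.31.1.1.1.6.2"))  # 64-bit input octet counter of ifIndex 2
print(describe("1.3.6.1.2.1.1.5.0"))         # outside our small table, printed unchanged
```

The last number of the OID is the interface index (ifIndex), which is why keeping that index stable across reboots matters to the NMS.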
Let’s analyze one of these GET messages:

The messages use UDP with port number 161. They include the community (in clear text) and
also refer to the Object IDs.
You can find more information about the SNMP MIB OIDs on the Cisco website (just google it).
There is an online tool out there by Cisco to look up MIBs and OIDs.
 Let’s shut down the interface:
Edge-Router-254:

interface e0/0
 shutdown

This time a trap will be sent to the NMS immediately:


This Trap makes sure that the NMS gets the latest information about an Object immediately.


Task 02:
 Configure SNMPv3 on Switch 2 (the SW for subnet 192.168.1.0/24).
 Configure the NMS to use SNMPv3 for this device
 Use the AuthPriv Security Level

Solution:
SNMPv3 provides major security improvements to SNMP.
There are different Security Levels:
 NoAuthNoPriv
 AuthNoPriv
 AuthPriv
NoAuthNoPriv: In this model there is no Authentication and no Privacy at all (never use it
unless you have a good reason to do so!).
AuthNoPriv: This provides Authentication by a username and an MD5 or SHA password but no
Privacy, which means the packets will not be encrypted at all.
AuthPriv: The strongest Security Level for SNMP. In this model SNMPv3 provides
Authentication as well as Encryption.
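
The three levels are simply the valid combinations of two switches, authentication and privacy. The Python sketch below models that idea; the class and function names are made up for illustration, while noauth, auth and priv are the keywords IOS uses for these levels in the snmp-server group command.

```python
from enum import Enum


class SecurityLevel(Enum):
    # value = (authentication enabled, privacy enabled)
    NO_AUTH_NO_PRIV = (False, False)
    AUTH_NO_PRIV = (True, False)
    AUTH_PRIV = (True, True)


def level_for(auth: bool, priv: bool) -> SecurityLevel:
    """Map the two security switches to an SNMPv3 security level."""
    if priv and not auth:
        # Encryption without authentication is not a valid SNMPv3 combination.
        raise ValueError("privacy requires authentication")
    return SecurityLevel((auth, priv))


def ios_keyword(level: SecurityLevel) -> str:
    """Keyword for the level in 'snmp-server group <name> v3 <keyword>' on IOS."""
    return {
        SecurityLevel.NO_AUTH_NO_PRIV: "noauth",
        SecurityLevel.AUTH_NO_PRIV: "auth",
        SecurityLevel.AUTH_PRIV: "priv",
    }[level]


print(ios_keyword(level_for(auth=True, priv=True)))  # priv
```

This is why there are three levels and not four: a message cannot be encrypted unless it is also authenticated.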

In this example we will configure the strongest one, which is the third Security Level (AuthPriv).
First of all, we need to define a view. By using Views we can specify which section of the
MIB will be accessible to a specific Group.

SW2(config)#do sh snmp mib

dot1xPaeSystem.1

dot1xPaePortEntry.2

dot1xPaePortEntry.3

dot1xPaePortEntry.4

dot1xPaePortEntry.5

dot1xAuthConfigEntry.1


dot1xAuthConfigEntry.2

dot1xAuthConfigEntry.3

dot1xAuthConfigEntry.4

dot1xAuthConfigEntry.5

dot1xAuthConfigEntry.6

dot1xAuthConfigEntry.7

dot1xAuthConfigEntry.8

dot1xAuthConfigEntry.9

dot1xAuthConfigEntry.10

dot1xAuthConfigEntry.11

dot1xAuthConfigEntry.12

dot1xAuthConfigEntry.13

dot1xAuthConfigEntry.14

dot1xAuthStatsEntry.1

dot1xAuthStatsEntry.2

dot1xAuthStatsEntry.3

--More--

--More--
This is the most difficult and confusing part of the SNMPv3 configuration, and I think that is
one of the reasons people still prefer using SNMPv2c! Maybe it is not as simple as the name
implies!

SW2:

interface Vlan1
 ip address 192.168.1.22 255.255.255.0
 no shutdown
snmp-server group Admins v3 priv read ALL
snmp-server view ALL iso included
snmp-server user Navid Admins v3 auth md5 PASSWORD123 priv aes 128 ABC123ABC123ABC123ABC123 access NMS
snmp-server host 192.168.0.71 informs version 3 priv Navid
ip access-list standard NMS
 permit 192.168.0.3

Now we have configured the device to work with Version 3.


First of all we need an interface on the subnet, so we have created an SVI.

We also defined a view named ALL to make everything in the MIB accessible.
We have defined a group named Admins with the third Security Level (priv stands for AuthPriv)
and gave it read access to the ALL view.
You also have to create a user, in this case named Navid. We specified this user to be part of
the Admins group. For authentication we are using MD5 with the password PASSWORD123,
and for encryption AES 128 is being used.
That’s all you need to do on the switch.
Let’s do the configuration on the Zabbix node:

Just like the previous example, we need to download the Cisco IOS Switches SNMPv3 template
from the Zabbix website and upload it to Zabbix.
Then we add a device with SNMPv3.
{$SNMP_SECNAME}: The username field, referring to the default value of the template (you
can use a specific username in this box as well).
{$SNMP_AUTH}: Password of the specific user.


{$SNMP_PRIV}: Encryption Key.

Then we have linked the CiscoSwitchInterfaceSNMPv3 template (downloaded from the Zabbix
website) to this device.
NOTE: This SW node is a virtual device and we don’t expect the template to work with it. You
need a physical box to test this SNMPv3 template.


Task 03:
 Configure Cacti as the NMS for the same devices in the lab.

Solution:
There are many Network Management Stations out there from many companies. Cacti is one of
them, and it is free and widely used nowadays.
You can install it on Windows as well as on Linux operating systems.
In this example we will install it on Debian 10 Linux:
Just power on the node and set the networking parameters.
In this example we will set the IP address 192.168.0.71/24 on the Debian box.
The default credentials of the node we are using in the training are:
Username: root
Password: Test123
 Make sure this node has internet access
 Open the Terminal and enter the apt update command to refresh the package lists
 Enter apt install cacti
 Follow the instructions to set the password (refer to the Cacti part video in the training
for more info)
 Open the Firefox browser on the Debian 10 box and enter http://127.0.0.1/cacti
 Use the username Admin and the password specified during the installation


This is the main Dashboard of Cacti.

You don’t need to add any SNMP templates for Cisco IOS devices, Cacti includes them by
default.
Let’s make sure that our devices send Traps to this new node and that the ACLs include the IP
address of this new NMS:
Edge-Router-254:

snmp-server host 192.168.0.71 version 2c public
ip access-list standard SNMP_RO_ACCESS_LIST
 permit 192.168.0.71
 permit 192.168.0.3

CSR1000v-1:

snmp-server host 192.168.0.71 version 2c public

XRv9k-2:

snmp-server host 192.168.0.71 traps version 2c public
snmp-server community public RO IPv4 NMS
ipv4 access-list NMS
 10 permit ipv4 host 192.168.0.3 any
 20 permit ipv4 host 192.168.0.71 any
commit

Now everything is ready on the devices, let’s define them on Cacti:


Save it and Add other devices as well:


Now click on one of the devices, then click Create Graphs for this Device:

Check the desired items and click on Create:


Go to the Graphs section and click on a graph name:

As an example, the Ethernet0/0 graph of Edge-Router-254:


Task 04:
 Remove the Cacti node from the lab
 Add a Windows Server 2016 node instead
 Install and Configure the PRTG NETWORK MONITOR as an NMS

Solution:
Let’s install a paid solution as well. PRTG is a paid NMS, but we can test the free demo version,
which is fully functional for 30 days.
Download the free version on the Windows Server 2016 node (the free trial key is included):
https://www.paessler.com/prtg?gclid=EAIaIQobChMIhuLrzMmY8gIV2EaRBR2PzgevEAAYASAAEgISkvD_BwE
Click on the installer and follow the steps (make sure you have internet access for the license
activation phase).
Enter http://127.0.0.1 in Firefox on the Win Server 2016 node and log in to the PRTG Web GUI:
Username: prtgadmin
Password: prtgadmin
Just like in the other solutions, the default community value is set to “public”. In this lab we are
okay with that, so there is no need to change it.


From the Devices section we right-clicked on Network Infrastructure and added a new
Group named Routers:


NOTE: By default the SNMPv2 community is public, you don’t need to change it for this lab.
Go to the Devices tab, right-click on Routers (the new group we just created) and add a new
Device:

Click on OK.
After the device is created, click on the Run Auto Discovery button next to the device name:

PRTG will automatically scan the Sensors for this device.
Just wait a couple of minutes and you can see the sensor information.
Add the other devices using the same steps.
And we are done!


PRTG will draw beautiful Graphs and Diagrams for each of the Objects:


SysLog

Topology:

 Logging Targets
 Syslog Messages, Severities and Facilities
 Configuration on IOS-XE and IOS-XR
 Implementing Solarwinds Kiwi Syslog Server

Task 01:
 Do the Syslog configuration only on the IOS-XE box (CSR1000v)
 SysLog messages should have a buffer size of 8192 bytes
 Logs with Severity levels 0 to 6 should be sent to Virtual Terminals (test it)
 Set up the Kiwi Syslog Server, and configure CSR1000v-1 and XRv9k-2 to send the logs
with Informational Severity level and above to this server (Kiwi IP: 192.168.0.4).
 IOS-XR should be configured to use RFC 5424

Solution:
SysLog is used for System Logging. You have seen a lot of Syslog messages generated by
Cisco devices since the first day you logged in to a device console!


SysLog is one of the most important parts of Network Assurance and Monitoring. We can see
exactly what has happened on the devices, such as an OSPF neighborship going down, an
interface flapping, a user logging in to the device etc.
If you take a look at the above output, it is generated by the System (SYS). Many engineers call
this part the Facility, but it is not the facility! By default, all Cisco routers and switches
generate SysLog messages with the Facility of Local7. Instead, the logs include the process
that actually created the log (in this case SYS). There is a number (5) next to SYS, which is
the Severity level of the message, or in other words, how important this log is!
5 is for Notifications:

Normal level, but a significant condition! A user just logged in to the console.
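
To make the difference between the process name, the Severity and the Facility concrete, here is a small Python sketch that splits the %PROCESS-SEVERITY-MNEMONIC part of a Cisco log line and computes the priority (PRI) value that a syslog packet carries, which is facility × 8 + severity. The sample line is one of the log lines shown in this lab; the function name parse_cisco and the regular expression are written for this illustration only and will not cover every message format.

```python
import re

SEVERITIES = ["emergencies", "alerts", "critical", "errors",
              "warnings", "notifications", "informational", "debugging"]
LOCAL7 = 23  # numeric syslog facility code for local7

# Matches the %PROCESS-SEVERITY-MNEMONIC: part of a Cisco log line.
CISCO_TAG = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+):\s*(.*)")


def parse_cisco(line: str) -> dict:
    """Split a Cisco log line into process, severity, mnemonic and message text."""
    match = CISCO_TAG.search(line)
    if match is None:
        raise ValueError("no %PROCESS-SEVERITY-MNEMONIC tag found")
    process, severity, mnemonic, text = match.groups()
    severity = int(severity)
    return {
        "process": process,
        "severity": severity,
        "severity_name": SEVERITIES[severity],
        "mnemonic": mnemonic,
        "text": text,
        # PRI value on the wire: facility * 8 + severity
        "pri": LOCAL7 * 8 + severity,
    }


sample = ("*Aug 5 01:05:54.461: %SYS-5-CONFIG_I: "
          "Configured from console by orhan on vty0 (192.168.0.254)")
print(parse_cisco(sample))
```

For this sample the process is SYS, the Severity is 5 (notifications) and, with the Local7 facility, the PRI value is 23 × 8 + 5 = 189.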
By default, logs of all Severity levels are sent to the Console and also to the Buffer, so the
device keeps track of them until you reboot it:

We can find some of this information as well as the buffered logs using the show logging command:


Edge-Router-254(config)#do sh logging

Syslog logging: enabled (0 messages dropped, 14 messages rate-limited, 0 flushes, 0 overruns, xml disabled,
filtering disabled)

No Active Message Discriminator.

No Inactive Message Discriminator.

Console logging: level debugging, 55 messages logged, xml disabled,

filtering disabled

Monitor logging: level debugging, 0 messages logged, xml disabled,

filtering disabled

Buffer logging: level debugging, 67 messages logged, xml disabled,

filtering disabled

Exception Logging: size (4096 bytes)

Count and timestamp logging messages: disabled

Persistent logging: disabled

No active filter modules.

Trap logging: level informational, 71 message lines logged

Logging to 192.168.0.4 (udp port 514, audit disabled,

link up),

3 message lines logged,

0 message lines rate-limited,

0 message lines dropped-by-MD,

xml disabled, sequence number disabled

filtering disabled

Logging Source-Interface: VRF Name:

Log Buffer (4096 bytes):

on Interface Ethernet2/2, changed state to down

*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet2/3, changed state to down

*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/0, changed state to down

*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/1, changed state to down

*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/2, changed state to down

*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/3, changed state to down

*Aug 4 20:26:00.247: %LINEPROTO-5-UPDOWN: Line protocol on Interface NVI0, changed state to up

--More--


By default the buffer size is set to 4096 bytes. We can adjust it, and we can also specify which
Severity levels are logged:
CSR1000v-1:

logging buffered 8192
logging monitor informational
logging host 192.168.0.4

We changed the logging buffer size to 8192 bytes.

We have configured Informational and above (lower in numbering) as the level for sending logs
to the Virtual Terminals when we Telnet or SSH to the device.
By using the logging host command we specified the Kiwi Syslog Server IP address, so the
system will send the Syslog messages to this host (192.168.0.4).
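
The “and above (lower in numbering)” rule can be sketched in a few lines of Python. This only models the comparison, it is not how the device is implemented, and the function name is_sent is made up for the example.

```python
LEVELS = {"emergencies": 0, "alerts": 1, "critical": 2, "errors": 3,
          "warnings": 4, "notifications": 5, "informational": 6, "debugging": 7}


def is_sent(message_severity: int, configured_level: str) -> bool:
    """True if a message of this severity passes a 'logging ... <level>' threshold."""
    return message_severity <= LEVELS[configured_level]


# With 'logging monitor informational', severities 0 to 6 pass and 7 is dropped.
passed = [sev for sev in range(8) if is_sent(sev, "informational")]
print(passed)  # [0, 1, 2, 3, 4, 5, 6]
```

So a level keyword always means that level plus every more severe (lower numbered) level, which matches the 0 to 6 requirement of the task.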
First of all let’s Telnet to the CSR1000v:

Edge-Router-254#telnet 192.168.1.1

Trying 192.168.1.1 ... Open

User Access Verification

Username: orhan

Password:

CSR1000v-1#terminal monitor

CSR1000v-1#conf t

Enter configuration commands, one per line. End with CNTL/Z.

CSR1000v-1(config)#int lo 10

*Aug 5 01:03:03.064: %LINEPROTO-5-UPDOWN: Line protocol on Interface Loopback10, changed state to up

CSR1000v-1(config-if)#

When you open a Virtual Terminal session to a Cisco router, it will not show you any Syslog
messages by default. In order to see the log messages we need to enter the terminal monitor
command in Privileged EXEC mode (each time you SSH or Telnet to the device).
We can get the Syslog configuration information as well as see the Syslog messages buffered in
the device memory using the show logging command:


CSR1000v-1#show logging

Syslog logging: enabled (0 messages dropped, 3 messages rate-limited, 0 flushes, 0 overruns, xml disabled,
filtering disabled)

No Active Message Discriminator.

No Inactive Message Discriminator.

Console logging: level debugging, 91 messages logged, xml disabled,

filtering disabled

Monitor logging: level informational, 2 messages logged, xml disabled,

filtering disabled

Buffer logging: level debugging, 18 messages logged, xml disabled,

filtering disabled

Exception Logging: size (4096 bytes)

Count and timestamp logging messages: disabled

Persistent logging: disabled

No active filter modules.

Trap logging: level informational, 92 message lines logged

Logging to 192.168.0.4 (udp port 514, audit disabled,

link up),

15 message lines logged,

0 message lines rate-limited,

0 message lines dropped-by-MD,

xml disabled, sequence number disabled

filtering disabled

Logging Source-Interface: VRF Name:

Log Buffer (8192 bytes):

*Aug 5 00:29:50.185: %SYS-5-LOG_CONFIG_CHANGE: Buffer logging: level debugging, xml disabled, filtering
disabled, size (8192)

*Aug 5 00:30:02.486: %SYS-5-LOG_CONFIG_CHANGE: Console logging: level informational, xml disabled, filtering
disabled

*Aug 5 00:30:56.488: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 192.168.0.4 port 0 CLI Request Triggered

*Aug 5 00:30:57.488: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 192.168.0.4 port 514 started - CLI
initiated

*Aug 5 00:31:25.654: %VRRP-6-STATECHANGE: Gi2 Grp 2 state Master -> Init

*Aug 5 00:31:27.658: %LINK-5-CHANGED: Interface GigabitEthernet2, changed state to administratively down

*Aug 5 00:31:28.659: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2, changed state to down

*Aug 5 00:31:36.784: %LINK-3-UPDOWN: Interface GigabitEthernet2, changed state to up

*Aug 5 00:31:36.797: %VRRP-6-STATECHANGE: Gi2 Grp 2 state Init -> Backup

*Aug 5 00:31:37.785: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2, changed state to up

*Aug 5 00:31:40.406: %VRRP-6-STATECHANGE: Gi2 Grp 2 state Backup -> Master

*Aug 5 00:52:45.568: %SYS-5-LOG_CONFIG_CHANGE: Monitor logging: level informational, xml disabled, filtering
disabled

*Aug 5 00:57:43.431: %SYS-5-LOG_CONFIG_CHANGE: Console logging: level debugging, xml disabled, filtering
disabled

*Aug 5 01:02:35.407: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: orhan] [Source: 192.168.0.254]
[localport: 23] at 01:02:35 UTC Thu Aug 5 2021

*Aug 5 01:03:03.064: %LINEPROTO-5-UPDOWN: Line protocol on Interface Loopback10, changed state to up

*Aug 5 01:05:54.461: %SYS-5-CONFIG_I: Configured from console by orhan on vty0 (192.168.0.254)

*Aug 5 01:06:03.976: %SYS-6-LOGOUT: User orhan has exited tty session 1(192.168.0.254)

We also see the timestamps in every Syslog message. You can change the options using the
service timestamps log command in global configuration mode.

NOTE: Make sure to configure an NTP server in order to have a synchronized clock.

NETWORK ASSURANCE AND MANAGEMENT 36


ORHAN ERGUN LLC NETWORK ASSURANCE AND MANAGEMENT

Let’s configure the IOS-XR box as well:

XRv9k-2-1:

logging format rfc5424
logging console debugging
logging 192.168.0.4 vrf default severity info
commit

On IOS-XR devices, Syslog messages are not sent to the Console by default, but we can
enable it manually.
IOS-XR devices support RFC 5424, which has a structured message format. By default
IOS-XR uses RFC 3164, which is the older and simpler format.
NOTE: IOS and IOS-XE use RFC 3164.
NOTE: VRF default refers to the Global Routing Table. If you are configuring Syslog for the
MGMT VRF you can specify it here.
On IOS and IOS-XE we can specify the VRF after the logging host x.x.x.x command as an
argument.
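
To give a feeling for the difference between the two formats, the Python sketch below builds the same message once in the older RFC 3164 layout and once in the RFC 5424 layout. It follows the general header layout described in the two RFCs and is not meant to reproduce the exact bytes an IOS-XR device sends; the host name, the application tag demo and the message text are placeholders chosen for the example.

```python
from datetime import datetime, timezone

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_rfc3164(pri: int, when: datetime, host: str, tag: str, msg: str) -> str:
    """Older layout: <PRI>Mmm dd hh:mm:ss HOST TAG: MSG (the day is space-padded)."""
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {tag}: {msg}"


def format_rfc5424(pri: int, when: datetime, host: str, app: str, msg: str) -> str:
    """Structured layout: <PRI>1 TIMESTAMP HOST APP PROCID MSGID STRUCTURED-DATA MSG."""
    stamp = when.isoformat(timespec="milliseconds")
    # '-' is the nil value for the fields left empty here (PROCID, MSGID, STRUCTURED-DATA).
    return f"<{pri}>1 {stamp} {host} {app} - - - {msg}"


when = datetime(2021, 8, 5, 1, 3, 3, 64000, tzinfo=timezone.utc)
print(format_rfc3164(189, when, "XRv9k-2", "demo", "interface state changed"))
print(format_rfc5424(189, when, "XRv9k-2", "demo", "interface state changed"))
```

The RFC 5424 line carries a version number, a full timestamp with year, fraction of a second and time zone, and dedicated fields for structured data, which is what makes it easier for a Syslog server to parse.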
Not let’s download the Kiwi Syslog Server installer from their website (Free Trial version
works for 14 days) and install it on the Windows Server 2016:
https://www.solarwinds.com/kiwi-syslog-server/registration
NOTE: Make sure you enable .NET Framework 2.0 and 3.5 on Windows Server 2016.

Run the installer file and simply click Next through the prompts until the installation process finishes.


NOTE: After installation, we can open the Kiwi Syslog Server Console. To get access to the
Web GUI, you also need to enable the secure (HTTPS) web server (refer to the link below):
https://support.solarwinds.com/SuccessCenter/s/article/Enable-SSL-support-for-Kiwi-Web-Access?language=en_US
For testing and lab purposes the Windows-based Console works fine. For production use
cases, make sure to enable SSL for Web GUI access.
Web based GUI:

Let’s shut down CSR1000v-1’s GigabitEthernet2 and see the logs on the Kiwi Syslog Server:

Let’s check the Kiwi Syslog Server:


The same messages are sent to the server. Even if we reload the router, Kiwi Syslog
Server keeps them in its database.


NetFlow and IPFIX

Topology:

• NetFlow v9 vs IPFIX
• Configuration on Cisco and Non-Cisco devices
• Implementing ManageEngine NetFlow Analyzer

Task 01:
• Install the ManageEngine NetFlow Analyzer Demo version (IP address: 192.168.0.1)
• Configure NetFlow version 9 on Edge-Router-254 and Internal-Router
• The flows should be exported to the NetFlow Analyzer on port number 9996
• NetFlow should be enabled in both directions on Internal-Router Ethernet0/2
• NetFlow should be enabled in the ingress direction on Edge-Router-254 e0/1
• Enable IPFIX on CSR1000v’s GigabitEthernet1 in both directions (NetFlow Analyzer IP
address 192.168.0.1 and port number 9996).


Solution:
NetFlow, as its name implies, is a protocol for collecting information about the flows coming into
or going out of a router interface. We can collect detailed information about each flow
(such as protocol, TCP/UDP ports, DSCP value, etc.) and send it to the NetFlow Analyzer
for analysis.
Let’s download the software from the ManageEngine website (Windows and Linux editions are
available). In this example we will install the 30-day trial version on Windows Server 2016:
https://www.manageengine.com/products/netflow/download-free.html
The installation process is straightforward (Next, Next!).
Specify the Web Server port and the NetFlow listener port during installation (after installation,
make sure you open these ports on the Windows Firewall):

Done! You can access the Web GUI using port 8060:


Default username and password: admin/admin


IPFIX is an enhanced, standardized version of Cisco’s NetFlow v9, and it is sometimes called
NetFlow version 10.
The following excerpt from the ManageEngine website briefly explains the differences:
Source: https://www.manageengine.com/products/netflow/ipfix-monitoring.html
What is IPFIX?

IP Flow Information Export or IPFIX is an extended version of NetFlow v9, standardized by the Internet
Engineering Task Force (IETF). It supports variable length fields like HTTP hostname or HTTP URL as well
as enterprise-defined fields. IPFIX allows you to collect and analyze flow data from layer 3 devices and
firewalls with an IPFIX collector and IPFIX analyzer.

With NetFlow Analyzer's IPFIX monitoring and reporting features, you can diagnose and troubleshoot network
issues and generate customized reports. You can plan your future bandwidth needs to optimize usage with these
one-minute granularity reports. NetFlow Analyzer helps you generate and schedule custom bill plans,
and sends email and SMS-based alerts in case of threshold violations.

NetFlow vs IPFIX.

IPFIX is an industry standardized version of NetFlow. IPFIX, often referred to as NetFlow v10, is a more
relevant option when it comes to working with data or devices that are not built by Cisco itself. While
NetFlow also provides multiple options for this, they're simply more time-consuming and complicated to use.

One of the major differences between NetFlow and IPFIX is that IPFIX allows a vendor ID to be specified. This
allows the vendor to add proprietary information to the flow and export any data they want. IPFIX also allows
variable length fields, making HTTP host and URL export easier.
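
To make the comparison above concrete, here is a minimal sketch of how a collector could tell the two export formats apart and spot the IPFIX-only features mentioned in the excerpt. The function names are assumptions made for illustration; the byte layouts follow the NetFlow v9 and IPFIX message headers (20 and 16 bytes respectively).

```python
import struct


def parse_export_header(data):
    """Decode the header of a NetFlow v9 or IPFIX export packet."""
    version = struct.unpack("!H", data[:2])[0]
    if version == 9:
        # NetFlow v9: version, count, sysUpTime, unix_secs, sequence, source_id
        _, count, _uptime, secs, seq, source_id = struct.unpack("!HHIIII", data[:20])
        return {"protocol": "NetFlow v9", "count": count,
                "export_time": secs, "sequence": seq, "domain": source_id}
    if version == 10:
        # IPFIX: version, message length, export time, sequence, observation domain ID
        _, length, secs, seq, domain = struct.unpack("!HHIII", data[:16])
        return {"protocol": "IPFIX", "length": length,
                "export_time": secs, "sequence": seq, "domain": domain}
    raise ValueError(f"unsupported export version {version}")


def parse_ipfix_field_specifier(data):
    """Decode one IPFIX template field specifier (4 or 8 bytes)."""
    element_id, length = struct.unpack("!HH", data[:4])
    enterprise = None
    if element_id & 0x8000:                      # enterprise bit set: vendor-defined field
        enterprise = struct.unpack("!I", data[4:8])[0]
        element_id &= 0x7FFF
    return {"id": element_id,
            "variable_length": length == 0xFFFF,  # 65535 marks a variable-length field
            "enterprise": enterprise}


ipfix_header = struct.pack("!HHIII", 10, 16, 1628123456, 1, 0)
print(parse_export_header(ipfix_header)["protocol"])
```

The version field is why IPFIX is informally called "NetFlow v10", and the enterprise bit plus the 65535 length marker are the mechanisms behind the vendor ID and variable-length fields described above.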

Let’s configure NetFlow on the IOS devices:


Edge-Router-254:

ip flow-export version 9
ip flow-export destination 192.168.0.1 9996
interface Ethernet0/1
 ip flow ingress

Internal-Router:

ip flow-export version 9
ip flow-export destination 192.168.0.1 9996
interface Ethernet0/2
 ip flow ingress
 ip flow egress

This is the older (traditional NetFlow) form of configuration. Next, I will configure IPFIX on the
IOS-XE device using the newer flow exporter and flow monitor syntax (Flexible NetFlow), so you
can see the difference:


CSR1000v-1:

flow exporter TEST
 destination 192.168.0.1
 transport udp 9996
 export-protocol ipfix
flow monitor TEST
 exporter TEST
 record netflow ipv4 original-input
interface GigabitEthernet1
 ip flow monitor TEST unicast input
 ip flow monitor TEST unicast output

And this is the End Result:


CSR1000v-1#show flow monitor TEST

Flow Monitor TEST:

Description: User defined

Flow Record: netflow ipv4 original-input

Flow Exporter: TEST

Cache:

Type: normal (Platform cache)

Status: allocated

Size: 200000 entries

Inactive Timeout: 15 secs

Active Timeout: 1800 secs


Trans end aging: off

Stats:

protocol distribution

CSR1000v-1#show flow monitor TEST cache

Cache type: Normal (Platform cache)

Cache size: 200000

Current entries: 11

High Watermark: 375

Flows added: 830

Flows aged: 819

- Inactive timeout ( 15 secs) 819

IPV4 SOURCE ADDRESS: 192.168.1.1

IPV4 DESTINATION ADDRESS: 192.168.0.1

TRNS SOURCE PORT: 53988

TRNS DESTINATION PORT: 9996

INTERFACE INPUT: Null

FLOW SAMPLER ID: 0

IP TOS: 0x00

IP PROTOCOL: 17

ip source as: 0

ip destination as: 0

ipv4 next hop address: 192.168.1.253

ipv4 source mask: /32

ipv4 destination mask: /24

tcp flags: 0x00

interface output: Gi1

counter bytes: 40776

counter packets: 98

timestamp first: 04:31:24.139

timestamp last: 04:34:26.321

IPV4 SOURCE ADDRESS: 104.16.123.96


IPV4 DESTINATION ADDRESS: 192.168.2.11

TRNS SOURCE PORT: 443

TRNS DESTINATION PORT: 50630

INTERFACE INPUT: Gi1

FLOW SAMPLER ID: 0

IP TOS: 0x00

IP PROTOCOL: 6

ip source as: 0

ip destination as: 0

ipv4 next hop address: 192.168.2.11

ipv4 source mask: /0

ipv4 destination mask: /24

tcp flags: 0x11

interface output: Gi2

counter bytes: 184

counter packets: 4

timestamp first: 04:34:22.019

timestamp last: 04:34:22.021

--More--

As you can see from the above output, the device monitors the flow information and exports
it to the NetFlow Analyzer.
Edge-Router-254#show ip cache flow | begin Pro

Protocol         Total    Flows   Packets Bytes  Packets Active(Sec) Idle(Sec)
--------         Flows     /Sec     /Flow  /Pkt     /Sec     /Flow     /Flow
TCP-Telnet           7      0.0        14    45      0.0      11.4      13.0
TCP-WWW           1442      0.0      1777   783     87.0      16.0       9.1
TCP-other         3970      0.1       112   697     15.1       7.3      10.8
UDP-DNS           2562      0.0         1    69      0.1       0.2      15.4
UDP-NTP           1208      0.0         1    76      0.0       0.7      15.4
UDP-other         5226      0.1         1   126      0.2       0.2      15.4
ICMP              1561      0.0         8    75      0.4      35.7      15.1
IGMP                18      0.0        12    41      0.0       2.7      15.5
IP-other            18      0.0       176    80      0.1    1596.8       6.5
Total:           16012      0.5       189   764    103.1       8.6      13.7


SrcIf  SrcIPaddress   DstIf  DstIPaddress   Pr SrcP DstP  Pkts
Et0/1  192.168.0.1    Et0/0  40.81.120.44   11 F891 0DD8     1
Et0/1  192.168.0.1    Null   224.0.0.252    11 F30F 14EB     2
Et0/1  192.168.0.4    Et0/0  40.81.120.44   11 CF7D 0DD8     1
Et0/1  192.168.2.11   Et0/0  13.32.69.98    06 C624 01BB    13
Et0/1  192.168.0.1    Et0/0  51.103.5.186   06 C2AF 01BB     1
Et0/1  192.168.0.1    Et0/0  51.103.5.186   06 C2AF 01BB     1
Et0/1  192.168.1.1    Et0/0  10.20.20.254   01 0000 0800    91
Et0/1  192.168.0.1    Et0/0  185.51.200.2   11 FF16 0035     1
Et0/1  192.168.0.253  Null   224.0.0.5      59 0000 0000    67
SrcIf  SrcIPaddress   DstIf  DstIPaddress   Pr SrcP DstP  Pkts
Et0/1  192.168.2.11   Et0/0  23.58.222.72   06 C621 01BB     3
Et0/1  192.168.0.1    Et0/0  169.254.1.100  11 0089 0089    12
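
One detail that is easy to miss in the show ip cache flow output above: the Pr, SrcP and DstP columns are printed in hexadecimal. The following sketch (the function name and the small protocol table are assumptions made for illustration) converts one row into decimal values:

```python
# A few IP protocol numbers that appear in the output above.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP", 89: "OSPF"}


def decode_flow_row(row):
    """Convert one flow row of 'show ip cache flow' to decimal fields."""
    src_if, src_ip, dst_if, dst_ip, proto, src_port, dst_port, packets = row.split()
    proto_num = int(proto, 16)          # Pr column is hexadecimal
    return {
        "src_if": src_if,
        "src_ip": src_ip,
        "dst_if": dst_if,
        "dst_ip": dst_ip,
        "protocol": PROTOCOLS.get(proto_num, str(proto_num)),
        "src_port": int(src_port, 16),  # SrcP column is hexadecimal
        "dst_port": int(dst_port, 16),  # DstP column is hexadecimal
        "packets": int(packets),        # Pkts column is decimal
    }


print(decode_flow_row("Et0/1 192.168.2.11 Et0/0 13.32.69.98 06 C624 01BB 13"))
```

Read this way, the 01BB rows are HTTPS (TCP 443), the 0035 row is DNS (UDP 53), and the row with Pr 59 towards 224.0.0.5 is OSPF (protocol 89).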

Other vendors also support IPFIX because it is an open standard.

For example, I have configured IPFIX on my home router (a MikroTik RouterOS device):


Device Administration with ISE 3.0


AAA and TACACS+
Topology:

• What AAA and TACACS+ are and why we need them
• Implementing Cisco ISE 3.0 and doing the Device Admin configuration (step by step)
• AAA configuration (for Device Admin) on IOS-XE and IOS-XR

Task 01:
• Enable the ISE 3.0 Device Admin feature
• There should be two accounts: orhan-admin and navid-operator
• The Admin account should be able to execute any command on the devices
• The Operator account should only execute show commands on the devices
• Configure the AAA for Device Admin on IOS-XE (CSR1000v-1)
• Configure the AAA for Device Admin on IOS-XR (XRv9k-2)


Solution:
AAA stands for Authentication, Authorization and Accounting.
• Authentication: We want to know who is trying to do something
• Authorization: We want to know what someone is allowed to do
• Accounting: We want to know what someone has done
The idea is very simple, and we can apply it in two major ways:
• Device Administration: Someone is trying to log in to a router or a switch. Are they the
person we think they are? Do they even have permission to log in to our devices?
(Authentication) Are they going to execute specific config commands, or do they work
as an operator who can run only some low-level show commands?
(Authorization) What have they done in the past? Did they execute a specific
command that caused a network outage? Did they shut down an important interface?
(Accounting)
• Network Access: Let’s imagine you work at an ISP and a customer gets Internet
service from your company. Is the customer’s device allowed to enter our network? For example,
their perimeter router tries to set up a PPPoE connection with the BRAS using the
username and password you have provided (Authentication). After login, you
want to keep track of their traffic usage, connect and disconnect times, etc. (Accounting).
In this lab, we are going to use AAA for the first one (Device Administration).
TACACS+ is going to be used for this purpose. In TACACS+, the three AAA functions work
separately, which means we can use Authentication, Authorization and Accounting
independently of each other. The protocol was designed specifically for this purpose.
What about RADIUS? In RADIUS, the first A (Authentication) is combined with Authorization:
the authorization attributes are returned together with the authentication reply, so there is no
separate per-command authorization. Once you are authenticated, you can normally do anything
(access can be limited using other protocols or technologies, but here we assume RADIUS is
working alone). In other words, RADIUS effectively gives us Authentication and Accounting.
Other differences between the two protocols: RADIUS uses UDP (ports 1645/1646
or 1812/1813) and does not encrypt the message content (only the password field is
encrypted), whereas TACACS+ uses TCP (port 49) and encrypts the packet body.
As a first step, let’s get the ISE node ready. Just follow the setup instructions (during the
installation it will ask you to enter “setup” to start the initial setup process).
Then we log in to the Web GUI:


Enable Device Admin feature (By default it is not enabled):

Click on the node name:


Enable Device Admin Service and Save:

Let’s add the Network devices:


Add the devices that you want, with their IP Addresses and TACACS Authentication:

The shared secret is very important: it is used when the device contacts ISE. The same
shared secret must also be defined on our devices.


We have defined our two devices.


Let’s define User Identity Groups:

I have created two groups: Admins and Operators. We will put the users in these groups.
Create Users:


We need to create a Shell Profile:

It is okay to assign Privilege level 15 to everyone, because we will do Authorization for the
users. Even if someone has Priv 15, as long as they are limited to show commands they cannot
execute any config commands with that Privilege 15.


Now it is time to create the command sets:

The first one is Permit All Commands for Admins:

The Second command set is for Operators:


The operators can only execute “exit” and any “show” command (“exit” is necessary to
allow the user to close the session and leave the Virtual Terminal).
You can be more specific with command sets and allow only certain arguments for the show
command. This is how to do that:

With this command set, the user can only run the “exit”, “show ip cef” and “show ip route”
commands.
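
The idea behind a command set can be sketched in a few lines of code. This is only a simplified illustration under stated assumptions: the data structure, the wildcard matching and the function name are made up here and are not how ISE evaluates command sets internally.

```python
import fnmatch

# Hypothetical representation of the restricted operator command set:
# (command, argument pattern). An empty pattern means "no arguments".
OPERATOR_SET = [
    ("exit", ""),
    ("show", "ip cef*"),
    ("show", "ip route*"),
]


def is_permitted(command_line, command_set):
    """Return True if the command line matches a permitted entry; deny otherwise."""
    parts = command_line.split(None, 1)
    if not parts:
        return False
    command = parts[0]
    arguments = parts[1] if len(parts) > 1 else ""
    for allowed_command, argument_pattern in command_set:
        if command == allowed_command and fnmatch.fnmatchcase(arguments, argument_pattern):
            return True
    return False  # anything not explicitly permitted is denied


for line in ["show ip route", "show running-config", "configure terminal", "exit"]:
    print(line, "->", "permit" if is_permitted(line, OPERATOR_SET) else "deny")
```

The important design point is the default deny at the end: the operator gets exactly what is listed and nothing else, which is why “exit” has to be listed explicitly.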
The last step is to create the Device Admin Policies:

Here we will bind Groups, Command Sets and Shell profile together in the Authorization
section:


You can do this by adding a new Rule.


And Save it.
We are done with the ISE node!
Let’s configure the devices:
First of all, make sure you set a local username and password with Priv 15 as well as an enable
password (if the AAA server is not available, the device will use these locally defined
values):
CSR1000v-1:

username orhan privilege 15 secret 9 $9$l8Rh70lks/UNwE$IxsxSoBTutY5NfERzym4qaL7uiYratD/lI9MXnWMJFk
enable secret 9 $9$.04osPdoeeDVyk$4yfENqyaYTeRjUnPuLarwX12VIc/1oP5Hlbf3wUSc3.

XRv9k-2:

username orhan
 group root-lr
 secret 10 $6$KYGuRdYS5dr/R...$6BkIqnNHVdpRv5fNuiohSGymGjlT2coJ81xbu7ibN8h8QUnw6xSdrCFkWeaCLjksrU/ac9J/Kq/iGd8IZ9ME5.

Define the TACACS+ server:


CSR1000v-1:

aaa new-model
ip tacacs source-interface GigabitEthernet1
tacacs server ISE
 address ipv4 192.168.0.70
 key Orhan123
aaa group server tacacs+ LAB-TACACS-SERVERS
 server name ISE

XRv9k-2:

tacacs source-interface GigabitEthernet0/0/0/1 vrf default
tacacs-server host 192.168.0.70 port 49
 key 7 013C140C5A05575D72
aaa group server tacacs+ LAB-TACACS-SERVERS
 server 192.168.0.70

We just defined the TACACS+ server and specified the shared secret as well as source interface
(On the ISE 3.0 we have defined the devices with these interface IP addresses).
Let’s do the AAA configuration on the IOS-XE:

CSR1000v-1:
aaa authentication login AAA group LAB-TACACS-SERVERS local
aaa authentication enable default group LAB-TACACS-SERVERS enable
aaa authorization config-commands
aaa authorization exec AAA group LAB-TACACS-SERVERS local
aaa authorization commands 0 AAA group LAB-TACACS-SERVERS local
aaa authorization commands 1 AAA group LAB-TACACS-SERVERS local
aaa authorization commands 15 AAA group LAB-TACACS-SERVERS local
aaa accounting commands 15 AAA start-stop group LAB-TACACS-SERVERS
aaa accounting exec AAA start-stop group LAB-TACACS-SERVERS
line vty 0 530
 authorization commands 0 AAA
 authorization commands 1 AAA
 authorization commands 15 AAA
 authorization exec AAA
 accounting commands 15 AAA
 login authentication AAA
 accounting exec AAA
 transport input telnet ssh
!


We have defined an authentication login list named AAA (when someone logs in, the router tries
to authenticate them using the available TACACS+ servers; if no server is available, it tries the
locally defined usernames).

We do the same for enable.

We want to authorize every command the user executes in exec and config mode
whenever the privilege level is 0, 1 or 15 (15 alone would have been enough in this case).
We also want accounting for the commands the user executes, as well as for when the user
enters the exec shell.
As the final step, we force the Virtual Terminal lines to use these AAA lists.
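
The "group LAB-TACACS-SERVERS local" method order can be sketched as follows. The function and the fake servers are hypothetical and only model the fallback logic: the first server that answers gives the final verdict, and the local database is consulted only when no server responds at all.

```python
def authenticate(username, password, tacacs_servers, local_users):
    """Model of a 'group <servers> local' method list.

    tacacs_servers: list of callables returning True/False, or raising
    TimeoutError when the server is unreachable (hypothetical stand-ins).
    local_users: dict of locally defined usernames and passwords.
    """
    for server in tacacs_servers:
        try:
            # A reachable server decides; a reject does NOT fall through to local.
            return server(username, password)
        except TimeoutError:
            continue  # try the next server in the group
    # No server answered: fall back to the local user database.
    return local_users.get(username) == password


def reachable_ise(username, password):
    return (username, password) == ("orhan-admin", "ise-pass")


def unreachable_ise(username, password):
    raise TimeoutError("no response from TACACS+ server")


local_db = {"orhan": "local-pass"}
print(authenticate("orhan", "local-pass", [reachable_ise], local_db))    # server rejects
print(authenticate("orhan", "local-pass", [unreachable_ise], local_db))  # local fallback
```

This is why the local username still matters: it is only useful when the TACACS+ servers are down, not as a way around a rejected login.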
Let’s test:
Edge-Router-254#ssh -l orhan-admin 192.168.1.1
Password:
CSR1000v-1#show aaa session
Total sessions since last reload: 9
Session Id: 4001
Unique Id: 11
User Name: *not available*
IP Address: 0.0.0.0
Idle Time: 0
CT Call Handle: 0
Session Id: 4003
Unique Id: 13
User Name: *not available*
IP Address: 0.0.0.0
Idle Time: 0
CT Call Handle: 0
Session Id: 4017
Unique Id: 19
User Name: orhan-admin
IP Address: 192.168.0.254
Idle Time: 0
CT Call Handle: 0
CSR1000v-1#conf t
Enter configuration commands, one per line. End with CNTL/Z.
CSR1000v-1(config)#int lo 12
CSR1000v-1(config-if)#ip add 192.168.12.12 255.255.255.0
CSR1000v-1(config-if)#shu
CSR1000v-1(config-if)#shutdown
CSR1000v-1(config-if)#do sh user
Line User Host(s) Idle Location
0 con 0 idle 00:06:10
* 1 vty 0 orhan-admi idle 00:00:00 192.168.0.254

Interface User Mode Idle Peer Address

The orhan-admin user can execute any command on the device.


Let’s test the navid-operator user:


Edge-Router-254#ssh -l navid-operator 192.168.1.1
Password:
CSR1000v-1#show user
Line User Host(s) Idle Location
0 con 0 idle 00:08:07
* 1 vty 0 navid-oper idle 00:00:00 192.168.0.254

Interface User Mode Idle Peer Address

CSR1000v-1#show ip int br | ex unas


Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 192.168.1.1 YES NVRAM up up
GigabitEthernet2 192.168.2.1 YES NVRAM up up
Loopback11 192.168.100.11 YES manual administratively down down
Loopback12 192.168.12.12 YES manual administratively down down

CSR1000v-1#conf t
Command authorization failed.

CSR1000v-1#show protocols
Global values:
Internet Protocol routing is enabled
GigabitEthernet1 is up, line protocol is up
Internet address is 192.168.1.1/24
GigabitEthernet2 is up, line protocol is up
Internet address is 192.168.2.1/24
GigabitEthernet3 is down, line protocol is down
GigabitEthernet4 is down, line protocol is down
Loopback10 is up, line protocol is up
Loopback11 is administratively down, line protocol is down
Internet address is 192.168.100.11/24
Loopback12 is administratively down, line protocol is down
Internet address is 192.168.12.12/24
CSR1000v-1#

Everything works fine.


Let’s configure the IOS-XR box:
XRv9k-2:
aaa accounting commands AAA start-stop group LAB-TACACS-SERVERS
aaa authorization exec AAA group LAB-TACACS-SERVERS none
aaa authorization commands AAA group LAB-TACACS-SERVERS
aaa authentication login AAA group LAB-TACACS-SERVERS local
aaa accounting update periodic 30
vty-pool default 0 4 line-template vty
line template vty
 accounting commands AAA
 authorization commands AAA
 login authentication AAA
!

In IOS-XR we don’t have any VTY lines by default; you need to create them manually.
The command syntax is almost the same as on IOS-XE, with a few differences that you can
see in the config above.

Let’s test it:


Edge-Router-254#ssh -l orhan-admin 192.168.1.2
Password:

RP/0/RP0/CPU0:XRv9k-2#conf t
Thu Aug 5 23:14:55.427 UTC
Current Configuration Session Line User Date Lock
00001000-00003c5a-00000000 con0_RP0_C orhan Thu Aug 5 23:11:01 2021
RP/0/RP0/CPU0:XRv9k-2(config)#int lo 12
RP/0/RP0/CPU0:XRv9k-2(config-if)#ipv4 address 10.12.12.12/24
RP/0/RP0/CPU0:XRv9k-2(config-if)#commit
Thu Aug 5 23:15:26.429 UTC

Edge-Router-254#ssh -l navid-operator 192.168.1.2


Password:

RP/0/RP0/CPU0:XRv9k-2#show ip int br
Thu Aug 5 23:16:39.433 UTC

Interface IP-Address Status Protocol Vrf-Name


Loopback12 10.12.12.12 Up Up default
MgmtEth0/RP0/CPU0/0 unassigned Shutdown Down default
GigabitEthernet0/0/0/0 192.168.2.2 Shutdown Down default
GigabitEthernet0/0/0/1 192.168.1.2 Up Up default
GigabitEthernet0/0/0/2 unassigned Shutdown Down default
GigabitEthernet0/0/0/3 unassigned Shutdown Down default
RP/0/RP0/CPU0:XRv9k-2#conf t
Command authorization failed
% Incomplete command.
RP/0/RP0/CPU0:XRv9k-2#conf
Command authorization failed
% Incomplete command.
RP/0/RP0/CPU0:XRv9k-2#

Let’s check the TACACS Logs on ISE:


MPLS and Segment-Routing OAM


MPLS/SR Operations and Management
Topology:

• MPLS OAM
• Segment-Routing OAM

NOTE: This lab does not have any tasks. It is a step-by-step case study and walk-through of
the technology.
All devices are fully configured (both MPLS LDP and Segment-Routing are configured, but SR is
preferred for forwarding the traffic).
NOTE: This lab is resource-intensive because of the CSR1000v and XRv9k nodes in the SP
network (you need at least 24 CPU cores and 64 GB of memory to run all of the nodes; if you
don’t have enough resources, you can power off some devices).


With MPLS encapsulation, we are dealing with LSPs (Label Switched Paths). For example, if you
ping from XRv9k-1 to XRv9k-2’s Loopback0 IP address, the packet is label switched inside the
Service Provider network. This is different from normal IP forwarding: we are effectively
tunneling the traffic towards the destination. That is the power of MPLS and the reason we can
provide multiple services to customers. The devices in the middle (the P devices) do not care
about the original content of the encapsulated packet; they just label switch the packets based
on the information in the label stack.
There is a rule when we talk about customer services and their quality:
Customers are the first to notice when something bad happens in the SP network, for
example if LSPs are broken, if MPLS IP is not enabled on some links, or if something is
wrong with the Control Plane or Data Plane. They notice these errors even before the SP
engineers do, because they are the actual end users being served by the Service
Provider.
This is the main reason for having tools that let SP engineers make sure their
network works fine.
MPLS and SR OAM are great tools that let the SP staff manage and assure their
MPLS networks. With them they can find errors related to the data path and even SR policies,
inconsistencies between the Control and Data Plane, etc.
For many years we have been using the ping and traceroute commands to troubleshoot our
IP networks, but MPLS requires going beyond them and using better tools designed
specifically for that purpose.
Let’s imagine we are running MPLS IP with LDP in the SP network. LDP is distributing the labels,
and the devices are pushing, swapping and popping labels to provide the LSPs. For
some reason, MPLS IP is disabled on one of the interfaces along the LSP. Do you think a normal
traceroute can find the problem? The answer is NO!
LDP creates, advertises and learns labels for the prefixes that are learned
from the IGP. There must be a route in the routing table for LDP to do its job. In the
above case, when MPLS IP is disabled on a link or the LDP neighborship has not formed for
some reason, the packet is IP forwarded instead of label switched, so the normal traceroute
command cannot find the broken LSP. In a normal trace we send packets towards routable IP
destinations; to find problems in MPLS LSPs, we need trace packets whose destination IP
address is not routable/forwardable. MPLS Ping and Trace solve this problem simply by using
an address from 127.0.0.0/8 as the destination address. Whenever a router wants to fall back
to IP forwarding due to a lack of labels, it cannot forward the packet, because 127.0.0.0/8 is
the localhost range and is not a valid IP destination. The router drops the packet and replies to
the source that something went wrong with the Label Switched Path (there is no label-switched
outgoing interface).
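
The reasoning above can be sketched as a tiny forwarding decision. This is a deliberately simplified model under stated assumptions: the label table, the function name and the returned strings are made up for illustration and do not represent a real router's forwarding code.

```python
import ipaddress

LOOPBACK_RANGE = ipaddress.ip_network("127.0.0.0/8")


def forward(dst_ip, in_label, lfib):
    """Decide what a transit router does with a labeled packet.

    lfib is a hypothetical label table: incoming label -> outgoing label,
    or None when no label is available towards the next hop (broken LSP).
    """
    out_label = lfib.get(in_label)
    if out_label is not None:
        return ("label-switch", out_label)
    # No outgoing label: the router falls back to plain IP forwarding ...
    if ipaddress.ip_address(dst_ip) in LOOPBACK_RANGE:
        # ... which is impossible for 127.0.0.0/8, so the break is reported.
        return ("drop-and-reply", None)
    return ("ip-forward", None)


broken_lfib = {16002: None}   # label 16002 has no outgoing label on this router

print(forward("10.255.255.2", 16002, broken_lfib))  # normal ping/trace: break is hidden
print(forward("127.0.0.1", 16002, broken_lfib))     # MPLS echo: break is detected
```

The same broken label entry gives two different outcomes depending only on the destination address, which is exactly why the MPLS OAM tools use the 127.0.0.0/8 range.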
By using the MPLS OAM tools, we can also find inconsistencies between the Route-Processor
information and the Line-Card information. For example, we can force the remote router to
process the MPLS Trace packet with its Route-Processor instead of the Line-Card processor.
Let’s jump into the lab and see these things in action.
First of all, we need to enable MPLS OAM on all of the devices. The command is simple: just
enter mpls oam in global configuration mode. That’s all:
On all SP Core and Aggregation Routers (IOS-XE and IOS-XR boxes):

mpls oam

As an example (IOS-XR):

All IOS-XR Login Credentials:


Username: orhan Password: orhan123
Example (IOS-XE):

Let’s use the normal Ping and MPLS Ping commands and capture the packets:
RP/0/RP0/CPU0:XRv9k-1#ping 10.255.255.2

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.255.255.2, timeout is 2 seconds:

!!!!!


Take a look at the destination, 10.255.255.2, and notice that the ICMP message is encapsulated
directly inside IP.
Let’s use MPLS Ping instead and set the target FEC address to 10.255.255.2/32:
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32

Fri Aug 6 00:42:07.799 UTC

Sending 5, 100-byte MPLS Echos to 10.255.255.2/32,

timeout is 2 seconds, send interval is 0 msec:

Codes: '!' - success, 'Q' - request not sent, '.' - timeout,

'L' - labeled output interface, 'B' - unlabeled output interface,

'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,

'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,

'P' - no rx intf label prot, 'p' - premature termination of LSP,

'R' - transit router, 'I' - unknown upstream index,

'X' - unknown return code, 'x' - return code 0

Type escape sequence to abort.

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 10/28/48 ms

The destination IP is 127.0.0.1! This makes sure that if some router has a broken labeled
output interface, we can detect it: the router is not going to IP forward a packet destined to
127.0.0.1 anywhere.
MPLS Ping and Trace use UDP datagrams with source and destination port 3503,
and the message is an MPLS Echo, not the normal ICMP Echo request.

It contains important information for troubleshooting MPLS-related issues.

NOTE: The reply messages are forwarded as normal IP packets, so MPLS Ping and Trace
test the LSP in one direction only; the reply can come back over any path and does not need to
be label switched.
At the moment we are using Segment-Routing SIDs (enabled in IS-IS) to do the label switching.
We can force the router to test the LDP-generated labels instead:
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32 fec-type ldp

Fri Aug 6 00:49:21.934 UTC

Sending 5, 100-byte MPLS Echos to 10.255.255.2/32,

timeout is 2 seconds, send interval is 0 msec:

Codes: '!' - success, 'Q' - request not sent, '.' - timeout,

'L' - labeled output interface, 'B' - unlabeled output interface,

'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,

'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,

'P' - no rx intf label prot, 'p' - premature termination of LSP,

'R' - transit router, 'I' - unknown upstream index,

'X' - unknown return code, 'x' - return code 0

Type escape sequence to abort.

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 7/11/25 ms


SR is preferred for forwarding packets in the Data Plane, but we can test the LDP-generated LSP
as well.
To test load sharing, we can also use different destination addresses (by default
it uses 127.0.0.1):
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32 fec-type ldp destination 127.0.0.11

Fri Aug 6 00:52:47.592 UTC

Sending 5, 100-byte MPLS Echos to 10.255.255.2/32,

timeout is 2 seconds, send interval is 0 msec:

Codes: '!' - success, 'Q' - request not sent, '.' - timeout,

'L' - labeled output interface, 'B' - unlabeled output interface,

'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,

'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,

'P' - no rx intf label prot, 'p' - premature termination of LSP,

'R' - transit router, 'I' - unknown upstream index,

'X' - unknown return code, 'x' - return code 0

Type escape sequence to abort.

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 15/20/30 ms


Let’s capture Trace and MPLS Trace:


RP/0/RP0/CPU0:XRv9k-1#traceroute 10.255.255.2

Fri Aug 6 00:59:52.779 UTC

Type escape sequence to abort.

Tracing the route to 10.255.255.2

1 10.1.5.5 [MPLS: Label 16002 Exp 0] 19 msec

10.1.3.3 15 msec

10.1.5.5 28 msec

2 10.3.4.4 [MPLS: Label 16002 Exp 0] 25 msec 22 msec

10.5.6.6 8 msec

3 10.2.4.2 14 msec *

10.2.6.2 14 msec


It is a UDP datagram with destination 10.255.255.2 and destination ports starting from 33434,
and it does not give us much information about the LSP.
RP/0/RP0/CPU0:XRv9k-1#traceroute mpls ipv4 10.255.255.2/32 verbose

Fri Aug 6 01:02:50.624 UTC

Tracing MPLS Label Switched Path to 10.255.255.2/32, timeout is 2 seconds

Codes: '!' - success, 'Q' - request not sent, '.' - timeout,

'L' - labeled output interface, 'B' - unlabeled output interface,

'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,

'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,

'P' - no rx intf label prot, 'p' - premature termination of LSP,

'R' - transit router, 'I' - unknown upstream index,

'X' - unknown return code, 'x' - return code 0

Type escape sequence to abort.

0 10.1.5.1 10.1.5.5 MRU 1500 [Labels: 16002 Exp: 0]

L 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 16002 Exp: 0] 44 ms, ret code 8

L 2 10.5.6.6 10.2.6.2 MRU 1500 [Labels: implicit-null Exp: 0] 15 ms, ret code 8

! 3 10.2.6.2 23 ms, ret code 3

We can see that the downstream labeled output interfaces are working fine (the 'L' codes), that
implicit-null is used on the last hop (the reason is PHP), and the exact router IP addresses on
each interface. Another thing that is very important in MPLS is MTU; as you can see, the
routers report the actual link MRU (Maximum Receive Unit).
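
The "ret code" values at the end of each hop line are the MPLS echo reply return codes defined in RFC 8029. As a small sketch (the regular expression and function name are assumptions made for illustration, and only a subset of the codes is listed), the hop lines from the verbose trace can be decoded like this:

```python
import re

# Subset of the RFC 8029 echo reply return codes.
RETURN_CODES = {
    0: "No return code",
    1: "Malformed echo request received",
    2: "One or more of the TLVs was not understood",
    3: "Replying router is an egress for the FEC",
    4: "Replying router has no mapping for the FEC",
    5: "Downstream Mapping Mismatch",
    8: "Label switched at stack-depth",
    9: "Label switched but no MPLS forwarding at stack-depth",
    11: "No label entry at stack-depth",
}

HOP_LINE = re.compile(r"^(?P<code>\S)\s+(?P<hop>\d+)\s+(?P<address>\S+).*ret code (?P<ret>\d+)$")


def parse_hop(line):
    """Extract hop number, replying address and return code from one trace line."""
    match = HOP_LINE.match(line.strip())
    if match is None:
        return None
    ret = int(match.group("ret"))
    return {
        "hop": int(match.group("hop")),
        "address": match.group("address"),
        "code": match.group("code"),
        "meaning": RETURN_CODES.get(ret, f"return code {ret}"),
    }


print(parse_hop("L 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 16002 Exp: 0] 44 ms, ret code 8"))
print(parse_hop("! 3 10.2.6.2 23 ms, ret code 3"))
```

In the trace above, the two transit hops answer with code 8 (the packet was label switched) and the last hop answers with code 3 (it is the egress for the FEC), which is what a healthy LSP looks like.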


This is the packet content, and it carries detailed information about everything!
Another useful command tests ECMP multipath:
RP/0/RP0/CPU0:XRv9k-1#traceroute mpls multipath ipv4 10.255.255.2/32
Fri Aug 6 01:07:41.413 UTC
Starting LSP Path Discovery for 10.255.255.2/32
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
LL!
Path 0 found,
output interface GigabitEthernet0/0/0/0 nexthop 10.1.3.3
source 10.1.3.1 destination 127.0.0.0
LL!
Path 1 found,
output interface GigabitEthernet0/0/0/1 nexthop 10.1.5.5
source 10.1.5.1 destination 127.0.0.0

Paths (found/broken/unexplored) (2/0/0)


Echo Request (sent/fail) (6/0)
Echo Reply (received/timeout) (6/0)
Total Time Elapsed 381 ms


ECMP also works fine. Path discovery works by changing the destination IP sequentially within
127.0.0.0/8 (127.0.0.0, 127.0.0.1 and so on…) until each path has been exercised.
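
The idea behind multipath discovery can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_ecmp_hash` is a hypothetical stand-in for the router's real load-balancing hash (which is platform specific and not shown in this lab), and `discover_paths` is our own helper name.

```python
import ipaddress


def toy_ecmp_hash(dst: str, n_paths: int) -> int:
    """Hypothetical hash: pick an ECMP path from the destination IP only."""
    return int(ipaddress.ip_address(dst)) % n_paths


def discover_paths(n_paths: int, max_probes: int = 256) -> dict:
    """Walk 127.0.0.0, 127.0.0.1, ... until every path has been hit once.

    Returns {path_index: first 127/8 destination that hashed to that path}.
    """
    base = int(ipaddress.ip_address("127.0.0.0"))
    found = {}
    for offset in range(max_probes):
        dst = str(ipaddress.ip_address(base + offset))
        path = toy_ecmp_hash(dst, n_paths)
        found.setdefault(path, dst)  # remember only the first hit per path
        if len(found) == n_paths:
            break
    return found


paths = discover_paths(2)
print(paths)
```

Because the probes are addressed to 127.0.0.0/8, they can never be IP-routed by accident if the LSP is broken; only the label stack carries them.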
Let’s shut down the g0/0/0/0 and g0/0/0/1 interfaces of XRv9k-4 and also disable MPLS IP (LDP)
on CSR1k-6 Gig1 and Gig2:

RP/0/RP0/CPU0:XRv9k-4#show ip int br | ex unassi
Fri Aug 6 01:13:29.915 UTC
Interface                 IP-Address      Status   Protocol   Vrf-Name
Loopback0                 10.255.255.4    Up       Up         default
GigabitEthernet0/0/0/0    10.2.4.4        Down     Down       default
GigabitEthernet0/0/0/1    10.4.7.4        Down     Down       default
GigabitEthernet0/0/0/2    10.3.4.4        Up       Up         default
GigabitEthernet0/0/0/4    10.4.5.4        Up       Up         default
GigabitEthernet0/0/0/5    10.4.6.4        Up       Up         default

CSR1k-6(config-if)#int g1
CSR1k-6(config-if)#no mpls ldp igp autoconfig
CSR1k-6(config-if)#
*Aug 6 01:14:34.607: %LDP-5-SP: 10.255.255.2:0: session hold up initiated
CSR1k-6(config-if)#int g2
CSR1k-6(config-if)#no mpls ldp igp autoconfig
CSR1k-6(config-if)#
*Aug 6 01:14:53.405: %LDP-5-SP: 10.255.255.7:0: session hold up initiated

Now our LSP towards 10.255.255.2/32 is broken. Let’s test it with normal Ping and
Traceroute:
RP/0/RP0/CPU0:XRv9k-1#ping 10.255.255.2
Fri Aug 6 01:16:32.352 UTC
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.255.255.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 5/11/20 ms
RP/0/RP0/CPU0:XRv9k-1#trace 10.255.255.2
Fri Aug 6 01:16:37.975 UTC
Tracing the route to 10.255.255.2
1 10.1.5.5 [MPLS: Label 16002 Exp 0] 11 msec 4 msec 3 msec
2 10.5.6.6 [MPLS: Label 16002 Exp 0] 7 msec 4 msec 3 msec
3 10.2.6.2 12 msec * 13 msec


SR labels are still being used, so let’s completely disable SR on CSR1k-6:


CSR1k-6:
router isis
no segment-routing mpls
!

RP/0/RP0/CPU0:XRv9k-1#ping 10.255.255.2
Fri Aug 6 01:21:43.592 UTC
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.255.255.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/8/22 ms

RP/0/RP0/CPU0:XRv9k-1#trace 10.255.255.2
Fri Aug 6 01:21:46.001 UTC
Type escape sequence to abort.
Tracing the route to 10.255.255.2
1 10.1.5.5 [MPLS: Label 16002 Exp 0] 9 msec 3 msec 21 msec
2 10.5.6.6 [MPLS: Label 600013 Exp 0] 7 msec 5 msec 4 msec
3 10.2.6.2 8 msec * 13 msec

Interestingly, we still have reachability, but in fact there is no label switching between
CSR1k-6 and XRv9k-2.
As you can see, we cannot detect LSP problems with the normal Ping and Trace commands.
Let’s check it with MPLS Ping and Trace:
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32 fec-type ldp
Fri Aug 6 01:26:55.245 UTC
Sending 5, 100-byte MPLS Echos to 10.255.255.2/32,
timeout is 2 seconds, send interval is 0 msec:
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
BBBBB
Success rate is 0 percent (0/5)

Code 'B' means an unlabeled output interface: a label is missing on some device for that prefix! Let’s find the exact hop:


RP/0/RP0/CPU0:XRv9k-1#trace mpls ipv4 10.255.255.2/32 fec-type ldp verbose
Fri Aug 6 01:26:07.365 UTC
Tracing MPLS Label Switched Path to 10.255.255.2/32, timeout is 2 seconds
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
0 10.1.5.1 10.1.5.5 MRU 1500 [Labels: 500013 Exp: 0]
L 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 600013 Exp: 0] 15 ms, ret code 8
B 2 10.5.6.6 10.2.6.2 MRU 1500 [No Label] 15 ms, ret code 9

Interesting, right? We found exactly where the problem is in our MPLS data plane.
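
The same check can be automated once the hop codes are available. The sketch below is a minimal Python illustration, assuming the hops have already been extracted as (hop number, code, address) tuples; `CODE_MEANINGS` repeats a few entries of the legend printed by the router, and `first_broken_hop` is a hypothetical helper name.

```python
# A few entries of the legend printed by "traceroute mpls".
CODE_MEANINGS = {
    "!": "success",
    "L": "labeled output interface",
    "B": "unlabeled output interface",
    "F": "no FEC mapping",
    ".": "timeout",
}


def first_broken_hop(hops):
    """Return the first hop whose code is neither 'L' nor '!', or None.

    hops is a list of (hop_number, code, address) tuples.
    """
    for number, code, address in hops:
        if code not in ("L", "!"):
            return number, code, address
    return None


# Hops 1 and 2 of the LDP trace above (hop 0 carries no result code).
ldp_trace = [(1, "L", "10.1.5.5"), (2, "B", "10.5.6.6")]

broken = first_broken_hop(ldp_trace)
if broken is not None:
    number, code, address = broken
    print(f"LSP breaks at hop {number} ({address}): {CODE_MEANINGS[code]}")
```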
Let’s try Segment Routing as the FEC-Type instead of LDP:
RP/0/RP0/CPU0:XRv9k-1#trace mpls ipv4 10.255.255.2/32 verbose
Fri Aug 6 01:29:18.581 UTC
Tracing MPLS Label Switched Path to 10.255.255.2/32, timeout is 2 seconds
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
0 10.1.5.1 10.1.5.5 MRU 1500 [Labels: 16002 Exp: 0]
F 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 600013 Exp: 0] 7 ms, ret code 4

We have completely disabled Segment Routing on CSR1k-6, so there is no mapping for SID
16002, and the reply is "no FEC mapping" for that prefix.
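
The numeric "ret code" values in these traces are the LSP Ping return codes defined in RFC 8029. A small lookup table makes the output easier to read; the sketch below lists only the codes that appear in this lab plus a few neighbors, with wording paraphrased from the RFC, and `describe_ret_code` is our own helper name.

```python
# LSP Ping return codes (paraphrased from RFC 8029); not the complete list.
RETURN_CODES = {
    0: "no return code",
    1: "malformed echo request received",
    2: "one or more TLVs not understood",
    3: "replying router is an egress for the FEC",
    4: "replying router has no mapping for the FEC",
    5: "downstream mapping mismatch",
    8: "label switched at stack depth",
    9: "label switched but no MPLS forwarding at stack depth",
}


def describe_ret_code(code: int) -> str:
    """Translate a numeric return code into text, tolerating unknown values."""
    return RETURN_CODES.get(code, f"return code {code} (not in this table)")


# Return codes seen in the traces of this section.
for seen in (8, 3, 9, 4):
    print(seen, "=", describe_ret_code(seen))
```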
Another great feature of MPLS OAM is the ability to force the Route Processor to reply to these
MPLS Ping and Trace messages; in that way we can find inconsistencies between the Control Plane
and the Data Plane.
NOTE: It is not possible to test this feature in a virtual environment, because these devices do
not have separate Supervisor and Line Cards, but the command is:
RP/0/RP0/CPU0:XRv9k-1#trace mpls ipv4 10.255.255.2/32 reply mode router-alert


Using Segment Routing OAM, we can test our policies. For example, a device may push some
Segment IDs onto the Label Stack in order to steer the packet towards some Node or Link.
Let’s test that as well. For example, I want to steer the packet this way (for no particular
reason, just testing):

How can we test that the LSP works fine?


RP/0/RP0/CPU0:XRv9k-1#traceroute mpls nil-fec labels 16003,16004,16007,16006 output interface g0/0/0/1 nexthop 10.1.5.5
Fri Aug 6 01:45:34.079 UTC
Tracing MPLS Label Switched Path with Nil FEC with labels [16003,16004,16007,16006], timeout is 2 seconds
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
0 10.1.5.1 MRU 1500 [Labels: 16003/16004/16007/16006/explicit-null Exp: 0/0/0/0/0]
L 1 10.1.5.5 MRU 1500 [Labels: 16004/16007/16006/explicit-null Exp: 0/0/0/0] 16 ms
L 2 10.3.5.3 MRU 1500 [Labels: implicit-null/16007/16006/explicit-null Exp: 0/0/0/0] 9 ms
L 3 10.3.4.4 MRU 1500 [Labels: implicit-null/16006/explicit-null Exp: 0/0/0] 12 ms
L 4 10.4.7.7 MRU 1500 [Labels: 16006/explicit-null Exp: 0/0] 8 ms
L 5 10.4.7.4 MRU 1500 [Labels: implicit-null/explicit-null Exp: 0/0] 8 ms
! 6 10.4.6.6 8 ms

The packet is traversing our forced path steered by a series of Segments.


You can also use different Segment IDs such as Anycast SID and Adjacency SIDs.
NOTE: The First label (16003) is the outer label in the stack.
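
To see why the stack shrinks hop by hop, here is a simplified Python model of a segment list being consumed. It is a sketch under loud assumptions: each node's prefix SID is taken to be 16000 plus the router number (consistent with 16002 for 10.255.255.2), the visited routers are read from the last octet of the addresses in the trace, and PHP and the explicit-null label are ignored, so the model pops a label only when the packet arrives at the node that owns it.

```python
def walk_segment_list(labels, visited_nodes, sid_base=16000):
    """Return [(node, remaining_stack)] after the packet arrives at each node.

    labels: label stack, outer label first.
    visited_nodes: router numbers in the order the packet reaches them.
    sid_base: assumed SRGB base, so node N owns label sid_base + N.
    """
    stack = list(labels)
    trace = []
    for node in visited_nodes:
        # A node consumes the top label only if that label is its own SID.
        if stack and stack[0] == sid_base + node:
            stack.pop(0)
        trace.append((node, tuple(stack)))
    return trace


# Label stack from the command above and the routers seen in its output
# (10.1.5.5, 10.3.5.3, 10.3.4.4, 10.4.7.7, 10.4.7.4, 10.4.6.6).
segments = [16003, 16004, 16007, 16006]
visited = [5, 3, 4, 7, 4, 6]

steps = walk_segment_list(segments, visited)
for node, remaining in steps:
    print(f"router {node}: remaining stack {list(remaining)}")
```

Router 5 and the second visit to router 4 are pure transit hops in this model: they forward on the top label without consuming it.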

Cisco SD-WAN Monitoring


Walkthrough of the Monitoring features of vManage
Topology:

NOTE: You can use any lab which has Cisco SD-WAN Controllers and Edge devices; you don’t
have to use this one. Controller redundancy is optional in this case.
NOTE: This section is just a case study and a walkthrough of the vManage Monitoring features;
we don’t have any tasks.
NOTE: Basic knowledge of how the Cisco SD-WAN solution works, as well as how to onboard
the controllers and edge devices, is required. For more information please refer to our
SD-WAN course on the website:
https://orhanergun.net/courses/self-paced-sdwan-training/


Cisco’s SD-WAN solution (formerly Viptela SD-WAN) is a great SD-WAN solution. Like every
SD-WAN technology, it achieves its goal by building a Fabric (Overlay) on top of the Underlay
network. This overlay makes it possible to have Application awareness.
A Single Pane of Glass (vManage) is used to monitor all the devices that participate in this
solution (Controllers as well as Edge devices). vManage has great monitoring tools to help
administrators monitor and troubleshoot their SD-WAN network.
vManage makes a Control connection with each of the devices and continuously gets
information about almost anything from the devices!
We will do this part on vManage, and you will learn what features it has in terms of
Monitoring, or better said, Network Assurance.
There is a CA node in this lab; we will use a browser on it to log in to the Web-based GUI of
one of our vManages (in this lab the vManages are in a Cluster, their configurations are in
sync, and all of them have an identical view of the SD-WAN Network).
Let’s go to the vManage-1 dashboard:

In the main dashboard we can see an overview of the SD-WAN Network: 4 vSmarts, 3 WAN Edge
devices and 2 vBonds are online, as well as 3 vManages in a cluster.
7 Control connections are up. If you click on that number, it shows you the details of the devices:


It also shows us that we have 3 sites with Full WAN Connectivity (11, 12 and 13).

We can also see the SLA probe results (from BFD) used for Application Aware Routing.

Let’s go to the Monitor -> Network section:

You can find a list of all devices ready to be monitored:


By clicking on a device we are redirected to the detailed monitoring menu of that specific
device. For example, let’s look at cEdge-1:

This page is the System Status dashboard. We can find information about the device hardware
(these are virtual devices, so we don’t expect any Module or Temperature sensors
etc… ).

You can see the Real Time data as well as Historical data (1h, 3h, 7 days etc…).
If you enable DPI (Deep Packet Inspection) you can get application-related information, such
as the percentage of traffic for each application.
There is no data in this part because we have not configured its related policies.
The Interface section gives you information about the device interfaces (Physical as well as Logical):

You can see which VPN a specific interface is part of, the status of that interface, its MTU,
and its TX and RX rates.
In the QoS section, if you configure any Queues and apply them as Localized policies to the
WAN Edge devices, you can see the TX rate as well as the amount of dropped traffic.


In the WAN section we can see the status of the TLOCs of the WAN Edge device as well as the
current Tunnels (GRE and IPSec) created with Remote devices:

This is the list of IPSec Tunnels this device created with Remote WAN Edge devices.


If you configure any security features such as Firewall, IPS or URL Filtering, you can find the
monitoring for each of these features in the Security Monitoring part:

Control Connections can be monitored as well:

It draws a logical topology of those connections over the different transports. For example,
using the public-internet transport, this device has formed 3 control connections: to two
vSmarts and to the first vManage.
NOTE: We have also used vSmart High Availability in this lab and forced the WAN Edge devices
to connect to specific vSmarts that are in a Controller Group (14 in this example).


These are the Events that happened on the device, with their severity:

NOTE: As you noticed, with Cisco SD-WAN you don’t need to configure separate SNMP or Syslog
servers; everything is placed in a single box (vManage). The same is true of SD-Access: the
DNA-C is the SNMP NMS, Syslog Server and NetFlow Analyzer as well.
In the ACL Logs section you can see the logs generated whenever an entry in an Access Control
List is hit.
The most interesting part of Cisco SD-WAN Monitoring is the Troubleshooting feature.
Because we are dealing with a Fabric, it is not possible to examine an exact flow without
using some built-in tools.
There are two types of troubleshooting tools:


Connectivity and Traffic.

For example, let’s jump into Device Bringup, which is a Connectivity troubleshooting tool:

It shows us exactly what happened during the device bringup process:

The device was authorized by vBond, there were no pending software updates to apply, the
router configuration was synchronized, Control Plane Connectivity (OMP Peering) was
established to the vSmarts, and finally Data Plane Connectivity was established.
The other tool (Control Connections) shows us a live view of the control connections that
this device currently holds:


Using the Ping tool we can send ICMP, TCP and UDP echo requests to remote devices (for
all VPNs: Service-side VPNs such as 100, Transport-side VPN 0, and 512 for MGMT).
Let’s test the Service side, for example:

We have pinged Host-2 on vEdge-1, and it seems we have 5 percent packet loss (maybe because
of device ICMP rate limiting or some packet loss in the EVE-NG interface manager).
There is also a Traceroute tool:

It shows us graphical output of the path to Host-2.

Another great troubleshooting tool is Simulate Flows:

It shows us exactly what happens when we send multiple packets of a specific application to the
destination. In this case the router is doing ECMP and using all TLOCs to connect to the Remote
TLOCs.
If you apply an Application Aware Routing policy to the Data Plane and want to visualize
exactly what is happening in the data plane, you can use this tool.

The Tunnel Health troubleshooting tool draws a chart of the SLA of each tunnel:


At the moment there is about 2% packet loss for a tunnel.
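
Whether 2% loss matters depends on the SLA class the traffic is matched against. The following Python sketch shows the kind of comparison Application Aware Routing performs between BFD measurements and an SLA class. All names and numbers here are hypothetical (the `voice` thresholds, the tunnel names and their latency and jitter values); only the 2% loss figure comes from the chart above.

```python
from dataclasses import dataclass


@dataclass
class SlaClass:
    name: str
    max_loss_pct: float
    max_latency_ms: float
    max_jitter_ms: float


@dataclass
class TunnelStats:
    name: str
    loss_pct: float
    latency_ms: float
    jitter_ms: float


def meets_sla(tunnel: TunnelStats, sla: SlaClass) -> bool:
    """A tunnel is compliant only if loss, latency and jitter are all in range."""
    return (
        tunnel.loss_pct <= sla.max_loss_pct
        and tunnel.latency_ms <= sla.max_latency_ms
        and tunnel.jitter_ms <= sla.max_jitter_ms
    )


def compliant_tunnels(tunnels, sla):
    """Names of the tunnels that currently satisfy the SLA class."""
    return [t.name for t in tunnels if meets_sla(t, sla)]


# Hypothetical SLA class and measurements, for illustration only.
voice = SlaClass("voice", max_loss_pct=1.0, max_latency_ms=150.0, max_jitter_ms=30.0)
tunnels = [
    TunnelStats("public-internet", loss_pct=2.0, latency_ms=40.0, jitter_ms=5.0),
    TunnelStats("mpls", loss_pct=0.0, latency_ms=20.0, jitter_ms=2.0),
]

print(compliant_tunnels(tunnels, voice))
```

With these assumed thresholds the tunnel with 2% loss falls out of the `voice` class even though its latency and jitter are fine, which is exactly the situation the Tunnel Health chart helps you spot.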


You can also change the Chart options to draw the chart for Latency and Jitter:

The last tool provides Application Aware Routing visualization: you can see how a specific
tunnel behaves with respect to the SLA of each application. It simulates this for you using the
current BFD probes in terms of packet loss, delay or jitter, for example for Apple
Updates:


The last option is Real Time, where we can execute show commands on all devices without going
to their CLIs one by one:


For example, the above output is the same as that of the show omp peers command.
Let’s check the routing table:

So far we have explored the individual monitoring sections for each of the devices. What about
the vManage Audit Logs?


Cisco SD-Access Assurance


DNA-C Assurance and Network Telemetry
Topology:

 DNA-C NDP and Assurance


 Network Telemetry

NOTE: At the moment, Cisco SD-Access technology cannot be virtualized on any Network
Emulation platform (EVE-NG, GNS3 etc…); the only way to have a working SD-Access lab is to
have the main Physical components such as the DNAC Appliance, Border and Edge nodes.
NOTE: In this lab we are using our own Physical Rack. If you want access to this lab,
please refer to the link below:
https://sdarack.orhanergun.net/
NOTE: This part is only a case study and does not have any Tasks. We will only go through the
DNAC Assurance feature and Network Telemetry.


Two major components, or let’s say Engines, are running behind the scenes in DNA-Center. The
first one is called APIC-EM, which in general terms is responsible for provisioning the devices
and doing the automation to bring up the SDA fabric. The second one is NDP (Network Data
Platform); this module gathers information from all devices that are part of the SDA
Network (using SNMP, Syslog, IPFIX etc… ) and provides the Assurance part of the SD-Access
solution.
In this lab the SDA Fabric is fully configured:

As you can see, our nodes are participating in the SDA Fabric (Control Plane, Border Node and
Edge node).


Two hosts are successfully onboarded to the SDA Fabric:

This was just a review of the current network configuration to see whether it is working.


Maximal Network Telemetry is not enabled by default (for any Edge node). Enabling it helps
DNAC collect detailed NetFlow (IPFIX) information from the end nodes, so that DNAC can
analyze that flow information and show you Network Telemetry information in
the Assurance part as well.

There are three profiles by default: Disable Telemetry, Optimal Visibility and Maximal
Visibility.


Maximal Telemetry means gathering and analyzing Syslog and IPFIX (Application Visibility)
information:

In order to enable Maximal Visibility for our Edge device, go to the Site View, select the
device, and from the Action menu select Maximal Visibility:


NOTE: On 9300 Edge interfaces (interfaces that are connected to the End Hosts) there should
be a lan tag in the description, otherwise DNAC will not apply the IPFIX configuration to that
interface:
9300#show run int g1/0/10
Building configuration...

Current configuration : 482 bytes
!
interface GigabitEthernet1/0/10
 description lan To Server1
 switchport access vlan 1023
 switchport mode access
 device-tracking attach-policy IPDT_MAX_10
 ip flow monitor dnacmonitor input
 ip flow monitor dnacmonitor output
 load-interval 30
 no macro auto processing
 dot1x timeout tx-period 7
 dot1x max-reauth-req 3
 source template DefaultWiredDot1xClosedAuth
 spanning-tree portfast
 service-policy input DNA-MARKING_IN
 service-policy output DNA-dscp#APIC_QOS_Q_OUT
end

NOTE: All of this configuration has been pushed by DNAC; we have not done any of it
manually.
NOTE: You can enter the interface description either from the CLI (manually) or in the
Host Onboarding section of the DNAC Fabric; just make sure there is a lan tag in the
description.
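
Before enabling Maximal Visibility on many switches, you may want to check which interfaces would be skipped. The Python sketch below scans a saved running configuration for interfaces whose description lacks the tag. It assumes the tag is the lowercase word `lan` as in the example above (the exact matching rule DNAC uses is not shown in this lab), and the second interface in `SAMPLE_CONFIG` as well as the helper name `interfaces_missing_lan_tag` are hypothetical.

```python
import re


def interfaces_missing_lan_tag(config: str) -> list:
    """Return interface names whose description does not contain the word 'lan'."""
    descriptions = {}
    current = None
    for line in config.splitlines():
        stripped = line.strip()
        if stripped.startswith("interface "):
            current = stripped.split(None, 1)[1]
            descriptions[current] = ""  # no description seen yet
        elif current and stripped.startswith("description "):
            descriptions[current] = stripped.split(None, 1)[1]
    return [
        name
        for name, text in descriptions.items()
        if re.search(r"\blan\b", text) is None
    ]


# First interface is from the lab; the second one is made up for the example.
SAMPLE_CONFIG = """\
interface GigabitEthernet1/0/10
 description lan To Server1
 switchport mode access
!
interface GigabitEthernet1/0/11
 description To Printer
 switchport mode access
"""

print(interfaces_missing_lan_tag(SAMPLE_CONFIG))
```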
Let’s check Netflow configuration on the 9300 (pushed by DNAC as well):
flow record dnacrecord
 match ipv4 version
 match ipv4 protocol
 match application name
 match connection client ipv4 address
 match connection server ipv4 address
 match connection server transport port
 match flow observation point
 collect timestamp absolute first
 collect timestamp absolute last
 collect flow direction
 collect connection initiator
 collect connection client counter packets long
 collect connection client counter bytes network long
 collect connection server counter packets long
 collect connection server counter bytes network long
 collect connection new-connections
flow exporter dnacexporter
 destination 192.168.222.111
 source Tunnel0
 transport udp 6007
 export-protocol ipfix
 option interface-table timeout 10
 option vrf-table timeout 10
 option sampler-table
 option application-table timeout 10
 option application-attributes timeout 10
flow monitor dnacmonitor
 exporter dnacexporter
 cache timeout inactive 10
 cache timeout active 60
 record dnacrecord
ip flow monitor dnacmonitor input
ip flow monitor dnacmonitor output
ip flow monitor dnacmonitor input
ip flow monitor dnacmonitor output
snmp-server enable traps flowmon
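
To verify that every Edge node exports to the right collector, the exporter section can be pulled out of a saved configuration. This is a minimal Python sketch, assuming the configuration text looks like the output above; `parse_flow_exporter` and the keys of the returned dictionary are our own naming.

```python
def parse_flow_exporter(config: str, name: str) -> dict:
    """Extract destination, source, UDP port and protocol of one flow exporter."""
    settings = {}
    inside = False
    for line in config.splitlines():
        words = line.split()
        if not words:
            continue
        if words[:2] == ["flow", "exporter"]:
            inside = words[2:3] == [name]  # entering the exporter we want?
            continue
        if words[0] == "flow":  # a flow record or flow monitor section starts
            inside = False
            continue
        if not inside:
            continue
        if words[0] == "destination":
            settings["destination"] = words[1]
        elif words[0] == "source":
            settings["source"] = words[1]
        elif words[:2] == ["transport", "udp"]:
            settings["udp_port"] = int(words[2])
        elif words[0] == "export-protocol":
            settings["protocol"] = words[1]
    return settings


# Exporter and monitor sections copied from the configuration above.
SAMPLE = """\
flow exporter dnacexporter
 destination 192.168.222.111
 source Tunnel0
 transport udp 6007
 export-protocol ipfix
 option interface-table timeout 10
flow monitor dnacmonitor
 exporter dnacexporter
 cache timeout inactive 10
 cache timeout active 60
 record dnacrecord
"""

exporter = parse_flow_exporter(SAMPLE, "dnacexporter")
print(exporter)
```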

Let’s jump into DNA-C’s Assurance part:

In the main dashboard you get an Overall Health view of the Network. It shows us the
Devices with their roles and their health level, which is green (Healthy), as well as the Wired
and Wireless Clients onboarded to the Fabric.
Issues are also listed at the bottom of the page:


Currently there are no issues.


NOTE: Using AI in NDP, DNA-C can suggest how to fix these issues step by step.

Network Health provides detailed information about the overall network.


In the Client Health section we can find more Detailed information about the Clients:


As you can see, the Client Domain Username (Active Directory Username) is shown here as
well (we are using dot1x for client Authentication and Authorization, ISE is integrated with
both DNAC and the Active Directory Domain Controller, and Micro Segmentation
is in place).

If you click on one of the usernames, you get detailed information (Client 360 view):


So far we have enabled Application Visibility on 9300:

We can find Application Health information as well:


If something goes wrong in the network, you can find it in the Issues section. For example, if
some node is not reachable or some Client cannot be onboarded, it gives you a step-by-step
guide to solving the issue.


If DNAC finds any Threats, they are shown in the Rogue Management section:

Using the search feature you can search for almost anything in the network, and it will
redirect you to the related pages for that entity:
