Network Assurance and Management Course
SNMP
Simple Network Management Protocol
Topology:
Task 01:
Get the Zabbix node ready for being an NMS for the network
Configure SNMP v2c with community: public on Edge-Router-254, CSR1000v-1 and
XRv9k-2
All Traps should be sent to the NMS
Solution:
SNMP stands for Simple Network Management Protocol. As the name implies, it is simple to
understand, configure and work with!
SNMP has two major components:
NMS: The Network Management Station, which is a server that collects the
necessary information from infrastructure devices such as routers, switches and
servers, or makes changes on these devices.
SNMP Agent: This component runs on the infrastructure devices (routers,
switches, etc.) and serves the NMS. For example, it reads the information the NMS
requests from a database and sends it to the NMS, and when some event
happens it informs the NMS.
If you take a look at the topology, there is a Zabbix node. This node is going to be our NMS, and
all the routers and switches are running an SNMP Agent.
SNMP has 3 versions:
Version 1: The original version, now considered obsolete. It supports only 32-bit
counters, which are too small for today's networks with links faster than
a gigabit per second. Security is another major problem
with the initial versions (1 and 2c): they simply put a community string (something
like a password, but not as good as one!) into the messages in an attempt to add a
little security. Maybe this idea was good enough in those days (30 years
ago), but nowadays? Not at all! On most devices there is a “public” community by default,
and if you don’t change it or add ACLs to limit the NMS IPs, everyone can GET
the information from your entire MIB (we will talk about it soon; it is something like a
database).
You might think of changing that default community (“public”) to something complex,
but trust me, anyone can do a simple packet capture and find that complex community!
We will not configure Version 1 at all; no one is willing to use it anymore.
Version 2: This version introduced a new security system, but for reasons that are
beyond the scope of this article I will not go through it. That system was dropped and
v2c was introduced instead (when we talk about SNMP v2 we are actually talking about v2c).
In version 2c, they came up with a standard approach that could be used with any vendor,
with improvements like 64-bit counter support, the community string from the old-fashioned
version 1 added back, improvements to the MIB structure and some new
messages (GETBULK and INFORM).
Even nowadays this is the most widely used version, even without good security!
The reason is simplicity in every aspect: dealing with the MIB and dealing with
configuration.
Version 3: This is a totally different SNMP approach with many security
improvements, such as true Authentication (MD5 and SHA) and Privacy (AES,
DES, etc.). In this version we can authenticate the user, put them in a group, give
them access to only a portion of the MIB (by using Views) and finally encrypt the message
contents with an encryption algorithm.
This was only an introduction to the different versions; we will talk about them in detail
throughout the tasks.
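To make the 32-bit counter limitation concrete, here is a small Python sketch (not part of the lab) that estimates how quickly an octet counter wraps when a link runs at full line rate. The function name and the link speeds are only illustrative assumptions.

```python
def wrap_seconds(counter_bits: int, link_bps: float) -> float:
    """Seconds until an octet counter of the given width wraps at line rate."""
    max_octets = 2 ** counter_bits        # the counter rolls over after this many octets
    octets_per_second = link_bps / 8      # interface octet counters count bytes, not bits
    return max_octets / octets_per_second


for name, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    seconds_32 = wrap_seconds(32, bps)
    years_64 = wrap_seconds(64, bps) / (3600 * 24 * 365)
    print(f"{name}: 32-bit counter wraps in {seconds_32:.1f} s, 64-bit in {years_64:.0f} years")
```

At 1 Gbps a 32-bit octet counter can wrap in roughly 34 seconds, which is shorter than many polling intervals, so the NMS cannot tell how many times it wrapped between two polls.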
The SNMP v2c configuration on Cisco routers and switches is very easy and
straightforward:
Edge-Router-254:
permit 192.168.0.3
As you can see, we have specified a community with “public” as the string. RO stands for
Read Only, which allows the NMS to only read information from the device. There is another
option, RW (Read Write), which allows the NMS to read and also make changes on
the device; RW is not recommended because of the security problems in 2c. Most
implementations use RO just to collect information from the devices.
The named Access Control List at the end of the first command allows only
specific devices to poll information from the device (in this case our NMS IP address is
192.168.0.3).
We have used trap-source to set the interface used to send the TRAPs (for example, if you
shut down an interface, that information is immediately sent to the NMS to let it know that
something happened on the device; otherwise the NMS polls the information at specific
intervals and is not notified until the next poll).
Then we have enabled sending traps for all possible entries; it can be enabled for individual
entries as well:
<cr>
This time we have not used any ACL to limit the NMS IP addresses (which is not recommended;
always try to use ACLs).
This Agent on IOS-XE is going to send Informs instead of Traps to the NMS (Informs are just
like Traps, but they expect an acknowledgement from the NMS).
You can also specify more information about the device, such as where it is placed and the
admin contact info.
Let’s take a look at the IOS-XR configs:
XRv9k-2:
snmp-server traps
As you can see, the command syntax is almost the same as IOS.
We have also added the ifindex persist command. This feature provides an interface index (ifIndex)
value that is retained and reused when the router reboots.
That was all about SNMP v2c configuration on the devices. As I mentioned before, it is very
simple to configure and work with!
Let’s get the NMS ready:
There are many Network Management Servers out there from different vendors that do
the same job but with different GUIs and features. We will configure 3 of them in this lab; the
first one is Zabbix, which is a free Linux-based solution.
In EVE-NG there is no template for this node by default. In the SNMP video training we have
explained how to add this node to EVE-NG (please refer to those videos).
There is a DHCP server running on Edge-Router-254, which will provide IP address, DNS
and default router information to this node. For simplicity we are going to use the dynamic IP
address on the NMS (in a real environment make sure to configure static values).
Just power on the Zabbix node and click on it. In the VNC console, use these default username
and password to log in to the device:
Username: root
Password: zabbix
If you enter the ip address command, you can find the IP address of the eth0 interface.
Let’s log in to the Web GUI. You can use any Windows node in this lab to open a web browser
and enter the Zabbix web GUI (for example Test-PC or Syslog-Server); these are the
Windows 10 and Server 2016 nodes.
For example, I logged in to the Syslog-Server node (Win Server 2016) and entered the Zabbix
IP address.
Windows 10 default credentials:
Username: User
Password: Test123
Windows Server 2016 credentials:
Username: Administrator
Password: Test123
You need to add the Cisco IOS device template to Zabbix; by default it does not have any
template for Cisco devices.
The reason we add them is that the NMS will GET the values of Object IDs inside the MIB
(Management Information Base).
The MIB is a hierarchically structured database that we can walk through to search for an Object
ID (for example, interfaces have their own Object IDs in the MIB, and we can find detailed
information related to counters, link status, etc. inside it).
You can get the official Cisco IOS template from the Zabbix website:
Just click on the link, download the files and import them before adding the Cisco IOS template.
After adding the templates, you can add the devices to the inventory:
You can find graphs related to the CPU, interfaces and memory:
More graphs can be seen as well, for example the temperature of the chassis if you use
physical devices. This also depends on the template that you are using (which Object ID
values it can get from the MIB of the devices).
If you click on the Latest Data tab in the Monitoring section, it shows you the latest
data collected from the device:
This is the information related to Ethernet0/0 of Edge-Router-254, such as its counters,
state and name.
Let’s shut down eth0/0 and do a packet capture on the eth0 interface of the Zabbix node:
Before shutdown:
The NMS is getting the SNMP information from the Agents. Take a look at the Object IDs: long
numbers such as 1.3.6.1.2.1.31.1.1.1….. !
A human cannot remember them; that is why templates have been created and added to the
NMS, to request this specific information from the MIB.
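As a rough illustration of what a template does with those long numbers, the Python sketch below maps a few OIDs under 1.3.6.1.2.1.31.1.1.1 (the ifXTable entry of the standard IF-MIB) to readable names. The table is a tiny hand-written excerpt and the function name is hypothetical; a real NMS template carries far more objects.

```python
# Tiny hand-written excerpt of IF-MIB's ifXTable columns (illustrative, not complete).
IFX_ENTRY = "1.3.6.1.2.1.31.1.1.1"
IFX_COLUMNS = {
    1: "ifName",
    6: "ifHCInOctets",
    10: "ifHCOutOctets",
    15: "ifHighSpeed",
    18: "ifAlias",
}


def describe_oid(oid: str) -> str:
    """Translate an ifXTable instance OID into 'columnName.ifIndex'; otherwise return it unchanged."""
    prefix = IFX_ENTRY + "."
    if not oid.startswith(prefix):
        return oid
    parts = oid[len(prefix):].split(".")
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        return oid
    column, if_index = int(parts[0]), int(parts[1])
    name = IFX_COLUMNS.get(column)
    return f"{name}.{if_index}" if name else oid


print(describe_oid("1.3.6.1.2.1.31.1.1.1.6.1"))
```

The last number of such an OID is the interface index (ifIndex), which is why a stable ifIndex across reboots matters to the NMS.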
Let’s analyze one of these GET messages:
The messages are UDP using port number 161. Each one includes the community (in clear text) and
also refers to the Object IDs.
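To see why the community is readable in a capture, the following Python sketch hand-encodes a minimal SNMPv2c GetRequest using basic BER rules. It is only an illustration under simplifying assumptions (a single varbind, non-negative integers, helper names invented here); it builds the bytes but does not send them anywhere.

```python
def ber_len(n: int) -> bytes:
    """BER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def tlv(tag: int, value: bytes) -> bytes:
    """Wrap a value in a tag-length-value triplet."""
    return bytes([tag]) + ber_len(len(value)) + value


def ber_int(n: int) -> bytes:
    """Non-negative INTEGER in minimal two's-complement form."""
    return tlv(0x02, n.to_bytes(n.bit_length() // 8 + 1, "big"))


def ber_oid(oid: str) -> bytes:
    arcs = [int(a) for a in oid.split(".")]
    body = bytearray([arcs[0] * 40 + arcs[1]])    # the first two arcs share one byte
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))     # continuation bit on all but the last byte
            arc >>= 7
        body.extend(reversed(chunk))
    return tlv(0x06, bytes(body))


def snmp_v2c_get(community: str, oid: str, request_id: int = 1) -> bytes:
    varbind = tlv(0x30, ber_oid(oid) + tlv(0x05, b""))   # OID plus a NULL placeholder value
    pdu = tlv(0xA0, ber_int(request_id) + ber_int(0) + ber_int(0) + tlv(0x30, varbind))
    # The version field value 1 identifies SNMPv2c; the community follows as a plain OCTET STRING.
    return tlv(0x30, ber_int(1) + tlv(0x04, community.encode()) + pdu)


packet = snmp_v2c_get("public", "1.3.6.1.2.1.1.1.0")   # sysDescr.0
print(packet.hex())
print(b"public" in packet)
```

The community string sits in the message as ordinary bytes, so anyone who captures the packet can read it without any decoding effort.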
You can find more information about the SNMP MIB OIDs on the Cisco website (just google it);
there is an online tool from Cisco for looking up MIBs and OIDs.
Let’s shutdown the interface:
Edge-Router-254:
interface e0/0
shutdown
This Trap makes sure that the NMS gets the latest information about an Object immediately.
Task 02:
Configure SNMPv3 on the Switch 2 (SW for subnet 192.168.1.0/24).
Configure the NMS to do SNMPv3 for this device
Use AuthPriv Security Level
Solution:
SNMPv3 provides major security improvements to SNMP.
There are different Security Levels:
NoAuthNoPriv
AuthNoPriv
AuthPriv
NoAuthNoPriv: In this model there is no Authentication and no Privacy at all (never use it unless
you have a good reason to do so!).
AuthNoPriv: This provides Authentication by username and an MD5 or SHA password but no
Privacy, which means the packets will not be encrypted at all.
AuthPriv: The strongest level of security for SNMP. In this model SNMPv3 provides
Authentication as well as Encryption.
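The three levels can be summarized with a short Python sketch. The function name is made up for illustration; the point is that privacy without authentication is not one of the defined levels.

```python
def security_level(auth: bool, priv: bool) -> str:
    """Name the SNMPv3 security level for a user's auth/priv settings."""
    if priv and not auth:
        # Encryption without authentication is not a defined SNMPv3 level.
        raise ValueError("privacy requires authentication")
    if auth and priv:
        return "authPriv"
    return "authNoPriv" if auth else "noAuthNoPriv"


for auth, priv in [(False, False), (True, False), (True, True)]:
    print(auth, priv, security_level(auth, priv))
```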
In this example we will configure the strongest one, which is Security Level 3 (AuthPriv):
First of all, we need to define a view. By using Views we can specify which section of the
MIB will be accessible to a specific Group.
dot1xPaeSystem.1
dot1xPaePortEntry.2
dot1xPaePortEntry.3
dot1xPaePortEntry.4
dot1xPaePortEntry.5
dot1xAuthConfigEntry.1
dot1xAuthConfigEntry.2
dot1xAuthConfigEntry.3
dot1xAuthConfigEntry.4
dot1xAuthConfigEntry.5
dot1xAuthConfigEntry.6
dot1xAuthConfigEntry.7
dot1xAuthConfigEntry.8
dot1xAuthConfigEntry.9
dot1xAuthConfigEntry.10
dot1xAuthConfigEntry.11
dot1xAuthConfigEntry.12
dot1xAuthConfigEntry.13
dot1xAuthConfigEntry.14
dot1xAuthStatsEntry.1
dot1xAuthStatsEntry.2
dot1xAuthStatsEntry.3
--More--
--More--
This is the most difficult and confusing part of the SNMPv3 configuration, and I think that is
one of the reasons people still prefer using SNMPv2c! Maybe it is not as simple as the name
implies!
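The view idea becomes less confusing once you think of it as prefix matching on the OID tree. The Python sketch below is a simplified model, assuming each view entry is a subtree marked included or excluded and that the longest matching subtree decides; it ignores subtree masks, and the view contents are hypothetical.

```python
def oid_tuple(oid: str) -> tuple:
    """Turn a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def in_view(oid: str, view: list) -> bool:
    """view is a list of (subtree_oid, included) pairs; the longest matching subtree decides."""
    target = oid_tuple(oid)
    best_len, allowed = -1, False      # no matching subtree means the OID is not in the view
    for subtree, included in view:
        prefix = oid_tuple(subtree)
        if target[:len(prefix)] == prefix and len(prefix) > best_len:
            best_len, allowed = len(prefix), included
    return allowed


# Hypothetical view: everything under mib-2 (1.3.6.1.2.1) except the system group (1.3.6.1.2.1.1).
NMS_VIEW = [("1.3.6.1.2.1", True), ("1.3.6.1.2.1.1", False)]

print(in_view("1.3.6.1.2.1.2.2.1.10.1", NMS_VIEW))   # an interface counter
print(in_view("1.3.6.1.2.1.1.5.0", NMS_VIEW))        # sysName.0
```

A group that is bound to such a view can only read the objects that fall inside it, regardless of what else the agent exposes.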
SW2:
interface Vlan1
no shutdown
snmp-server user Navid Admins v3 auth md5 PASSWORD123 priv aes 128 ABC123ABC123ABC123ABC123 access NMS
permit 192.168.0.3
Just like in the previous example, we need to download the Cisco IOS Switches SNMPv3 template
from the Zabbix website and upload it to Zabbix.
Then we add a device with SNMPv3.
{$SNMP_SECNAME}: The Username field, referring to the default values of the template (you
can use a specific username in this box as well).
{$SNMP_AUTH}: Password of the specific User.
Then we have linked the CiscoSwitchInterfaceSNMPv3 template (downloaded from the Zabbix
website) to this device.
NOTE: This SW node is a virtual device, so we don’t expect the template to work with it; you need
a physical box to test this SNMPv3 template.
Task 03:
Configure Cacti as the NMS for the same devices in the lab.
Solution:
There are many Network Management Stations out there from many companies. Cacti is one of
them; it is also free and widely used nowadays.
You can install it on Windows as well as Linux operating systems.
In this example we will install it on Debian 10 Linux:
Just power on the node and set the networking parameters.
In this example we will set the IP address 192.168.0.71/24 on the Debian box.
The default credentials of the node we are using in the training are:
Username: root
Password: Test123
Open the Terminal and enter the apt update command to refresh the package lists.
Make sure this node has internet access
Enter apt install cacti
Follow the instructions to set the password (refer to the Cacti part video in the training
for more info)
Open the Firefox browser on the Debian 10 box and enter http://127.0.0.1/cacti
Use the Username Admin and the Password specified during the installation
permit 192.168.0.71
permit 192.168.0.3
CSR1000v-1:
XRv9k-2:
commit
Now click on one of the devices, then click Create Graphs for this Device:
Task 04:
Remove the Cacti node from the lab
Add a Windows Server 2016 node instead
Install and Configure the PRTG NETWORK MONITOR as an NMS
Solution:
Let’s install a paid solution as well. PRTG is a paid NMS, but we can test the free demo version,
which is fully functional for 30 days.
Download the free version on the Windows Server 2016 node (it includes the free trial key):
https://www.paessler.com/prtg?gclid=EAIaIQobChMIhuLrzMmY8gIV2EaRBR2PzgevEAAYASAAEgISkvD_BwE
Click on the installer and follow the steps (make sure you have internet access for the license
activation phase).
Enter http://127.0.0.1 in Firefox on the Win Server 2016 node and log in to the PRTG Web GUI:
Username: prtgadmin
Password: prtgadmin
Just like in other solutions, the default community value is set to “public”; in this lab we are okay
with that, so there is no need to change it.
From the Devices section we right-clicked on Network Infrastructure and added a new
group named Routers:
NOTE: By default the SNMPv2 community is public, you don’t need to change it for this lab.
Go to the Devices tab, right-click on Routers (the new group we just created) and add a new
device:
Click on OK.
After the device is created, click on the Run Auto Discovery button next to the device name:
NETWORK ASSURANCE AND MANAGEMENT 28
ORHAN ERGUN LLC NETWORK ASSURANCE AND MANAGEMENT
PRTG will automatically scan the sensors for this device.
Just wait a couple of minutes and you can see the sensor information.
Add the other devices using the same steps.
And we are done!
PRTG will draw beautiful graphs and diagrams for each of the objects:
SysLog
Topology:
Logging Targets
Syslog Messages, Severities and Facilities
Configuration on IOS-XE and IOS-XR
Implementing Solarwinds Kiwi Syslog Server
Task 01:
Do the Syslog configuration only on IOS-XE box (CSR1000v)
Syslog messages should have a buffer size of 8192 bytes
Logs with Severity levels 0 to 6 should be sent to Virtual Terminals (test it)
Set up the Kiwi Syslog Server, and configure CSR1000v-1 and XRv9k-2 to send logs
with Informational Severity level and above to this server (Kiwi IP: 192.168.0.4).
IOS-XR should be configured to use RFC 5424
Solution:
Syslog is used for system logging. You have seen a lot of Syslog messages generated by
Cisco devices from the first day you logged in to a device console!
Syslog is one of the most important parts of Network Assurance and Monitoring. We can see exactly
what has happened on the devices, for example an OSPF neighborship went down, an
interface is flapping, or a user logged in to the device.
If you take a look at the above output, it is generated by System (SYS). Many
engineers call this part the Facility, but it is not the facility! All Cisco routers and switches
generate Syslog messages with the Facility Local7 by default. Instead, the logs include the process
that actually created the log (in this case SYS). There is a number (5) next to SYS; it indicates
the Severity level of the message, or in other words, how important this log is!
5 is for Notifications:
Normal but significant condition! A user just logged in to the console.
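The difference between the process name in the message and the real syslog Facility can be sketched in Python. The regular expression below is a simplified assumption about the %PROCESS-SEVERITY-MNEMONIC tag, and the function names are invented for this example; the PRI formula (facility * 8 + severity) and the code 23 for Local7 come from the syslog standard.

```python
import re

# Cisco keyword for each severity level, index 0 to 7.
SEVERITIES = ["emergencies", "alerts", "critical", "errors",
              "warnings", "notifications", "informational", "debugging"]
LOCAL7 = 23  # syslog facility code for local7

CISCO_TAG = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+):")


def parse_cisco_tag(line: str):
    """Return (process, severity, mnemonic) from a '%PROC-SEV-MNEMONIC:' tag, or None."""
    match = CISCO_TAG.search(line)
    if not match:
        return None
    return match.group(1), int(match.group(2)), match.group(3)


def pri(severity: int, facility: int = LOCAL7) -> int:
    """PRI value carried in the syslog packet: facility * 8 + severity."""
    return facility * 8 + severity


def passes(severity: int, threshold: str) -> bool:
    """A logging threshold admits that level and every more severe (numerically lower) one."""
    return severity <= SEVERITIES.index(threshold)


line = "*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet2/3, changed state to down"
process, severity, mnemonic = parse_cisco_tag(line)
print(process, SEVERITIES[severity], mnemonic, pri(severity))
print(passes(severity, "informational"))
```

With Facility Local7 and Severity 5, the PRI value sent to a Syslog server is 189, while the process name (LINEPROTO, SYS and so on) travels only inside the message text.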
By default, logs of all Severity levels are sent to the Console and also to the Buffer, so the
device keeps track of them until you reboot it:
We can find some of this information as well as the buffered logs using the show logging command:
Edge-Router-254(config)#do sh logging
Syslog logging: enabled (0 messages dropped, 14 messages rate-limited, 0 flushes, 0 overruns, xml disabled,
filtering disabled)
filtering disabled
filtering disabled
filtering disabled
link up),
filtering disabled
*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet2/3, changed state to down
*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/0, changed state to down
*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/1, changed state to down
*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/2, changed state to down
*Aug 4 20:26:00.164: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet3/3, changed state to down
--More--
By default the buffer size is set to 4096 bytes; we can adjust it, or we can specify which Severity
levels are logged:
CSR1000v-1:
Edge-Router-254#telnet 192.168.1.1
Username: orhan
Password:
CSR1000v-1#terminal monitor
CSR1000v-1#conf t
CSR1000v-1(config)#int lo 10
CSR1000v-1(config-if)#
By default, when you open a Virtual Terminal session to a Cisco router, it does not show you any
Syslog messages. In order to see the log messages we need to enter the terminal monitor
command in Priv Exec mode (each time you SSH or Telnet to the device).
We can get the Syslog configuration as well as see the Syslog messages buffered in
the device memory using the show logging command:
CSR1000v-1#show logging
Syslog logging: enabled (0 messages dropped, 3 messages rate-limited, 0 flushes, 0 overruns, xml disabled,
filtering disabled)
filtering disabled
filtering disabled
filtering disabled
link up),
filtering disabled
*Aug 5 00:29:50.185: %SYS-5-LOG_CONFIG_CHANGE: Buffer logging: level debugging, xml disabled, filtering
*Aug 5 00:30:02.486: %SYS-5-LOG_CONFIG_CHANGE: Console logging: level informational, xml disabled, filtering
disabled
*Aug 5 00:30:56.488: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 192.168.0.4 port 0 CLI Request Triggered
*Aug 5 00:30:57.488: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 192.168.0.4 port 514 started - CLI
initiated
*Aug 5 00:31:28.659: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2, changed state to down
*Aug 5 00:52:45.568: %SYS-5-LOG_CONFIG_CHANGE: Monitor logging: level informational, xml disabled, filtering
disabled
*Aug 5 00:57:43.431: %SYS-5-LOG_CONFIG_CHANGE: Console logging: level debugging, xml disabled, filtering
disabled
*Aug 5 01:06:03.976: %SYS-6-LOGOUT: User orhan has exited tty session 1(192.168.0.254)
We also see timestamps in every Syslog message; you can change the options using the service
timestamps log command in global configuration mode.
NOTE: Make sure to set an NTP server in order to have a synchronized clock.
commit
On IOS-XR devices, Syslog messages are not sent to the Console by default; we can
enable it manually.
IOS-XR devices support RFC 5424, which has a structured message format. By default IOS-XR
uses RFC 3164, which is the older and simpler format.
NOTE: IOS and IOS-XE use RFC 3164.
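To compare the two formats, here is a Python sketch that renders the same event in the general RFC 3164 layout and in the RFC 5424 layout. It follows the header fields defined by the RFCs; the hostname, tag and message text are sample values, and the exact strings a Cisco device emits will differ in detail.

```python
from datetime import datetime, timezone


def rfc3164(pri: int, when: datetime, host: str, tag: str, msg: str) -> str:
    # BSD style: "Mmm dd hh:mm:ss" with the day space-padded and no year or time zone.
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {host} {tag}: {msg}"


def rfc5424(pri: int, when: datetime, host: str, app: str, msg: str,
            procid: str = "-", msgid: str = "-", sd: str = "-") -> str:
    # Version "1", a full ISO timestamp, then named header fields; "-" marks an absent field.
    return f"<{pri}>1 {when.isoformat()} {host} {app} {procid} {msgid} {sd} {msg}"


when = datetime(2021, 8, 5, 0, 31, 28, tzinfo=timezone.utc)
text = "Line protocol changed state to down"
print(rfc3164(189, when, "XRv9k-2", "LINEPROTO", text))
print(rfc5424(189, when, "XRv9k-2", "LINEPROTO", text))
```

The RFC 5424 line carries a version number, a year and time zone in the timestamp, and dedicated slots for process ID, message ID and structured data, which makes it easier for a collector to parse reliably.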
NOTE: VRF default refers to the Global Routing Table. If you are configuring Syslog for the
MGMT VRF you can specify it here.
On IOS and IOS-XE we can specify the VRF as an argument after the logging host x.x.x.x
command.
Now let’s download the Kiwi Syslog Server installer from the SolarWinds website (the free trial
version works for 14 days) and install it on Windows Server 2016:
https://www.solarwinds.com/kiwi-syslog-server/registration
NOTE: Make sure you enable .NET Framework 2.0 and 3.5 on Windows Server 2016.
Click on the installer file and simply click Next until the installation process finishes.
NOTE: After installation we can get to the Kiwi Syslog Server Console. In order to get access to
the Web GUI you also need to enable the HTTP Secure server (refer to the link below):
https://support.solarwinds.com/SuccessCenter/s/article/Enable-SSL-support-for-Kiwi-Web-Access?language=en_US
For testing and labbing purposes the Windows-based Console works fine. For production use
cases make sure to enable SSL to have Web GUI access.
Web based GUI:
I just want to shut down CSR1000v-1’s GigabitEthernet2 and see the logs on the Kiwi Syslog server:
The same messages are being sent to the server. Even if we reload the router, the Kiwi Syslog
server keeps them in its database.
Topology:
NetFlow v9 vs IPFIX
Configuration on Cisco and Non-Cisco devices
Implementing ManageEngine NetFlow Analyzer
Task 01:
Install the ManageEngine Netflow Analyzer Demo version (IP address: 192.168.0.1)
Configure Netflow version 9 on Edge-Router-254 and Internal-Router
The flows should be exported and sent to the NetFlow Analyzer with port number 9996
Netflow should be enabled in both directions on Internal-Router Ethernet0/2
Netflow should be enabled in ingress direction on Edge-Router-254 e0/1
Enable IPFIX on CSR1000v’s GigabitEthernet1 in both directions (Netflow Analyzer IP
address 192.168.0.1 and port number 9996).
Solution:
NetFlow, as its name implies, is a protocol to collect information about the flows coming in to or
going out of a router interface. We can collect detailed information about the flows
(such as protocol, TCP/UDP port, DSCP value, etc.) and send it to the NetFlow Analyzer for
analysis.
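Conceptually, a flow is just a set of packets that share the same key fields. The Python sketch below groups packets by a classic seven-field key and counts packets and bytes per flow. It is a simplified model of a flow cache, and the sample packets and interface name are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int      # 6 = TCP, 17 = UDP
    src_port: int
    dst_port: int
    tos: int
    input_if: str


def aggregate(packets):
    """packets: iterable of (FlowKey, byte_count). Returns {FlowKey: [packet_count, byte_count]}."""
    cache = defaultdict(lambda: [0, 0])
    for key, size in packets:
        cache[key][0] += 1
        cache[key][1] += size
    return dict(cache)


# Hypothetical sample traffic: two SNMP polls and one unrelated TCP packet.
snmp_poll = FlowKey("192.168.0.3", "192.168.0.254", 17, 50000, 161, 0, "Ethernet0/1")
web = FlowKey("192.168.1.10", "192.168.0.1", 6, 51000, 8060, 0, "Ethernet0/1")
flows = aggregate([(snmp_poll, 85), (web, 60), (snmp_poll, 87)])
for key, (count, size) in flows.items():
    print(key.src_ip, "->", key.dst_ip, "proto", key.protocol, count, "packets", size, "bytes")
```

The router exports these per-flow summaries, not the packets themselves, which is what keeps the export traffic small.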
Let’s download the software from the ManageEngine website (they have Windows and Linux
editions). In this example we will install it on Windows Server 2016 (30-day trial version):
https://www.manageengine.com/products/netflow/download-free.html
The installation process is straightforward (Next, Next!):
Specify the Web Server port and NetFlow listen port during installation (after installation
make sure you open these ports on the Windows Firewall):
Done! You can access the Web GUI using port 8060:
IP Flow Information Export or IPFIX is an extended version of NetFlow v9, standardized by the Internet
Engineering Task Force (IETF). It supports variable length fields like HTTP hostname or HTTP URL as well
as enterprise-defined fields. IPFIX allows you to collect and analyze flow data from layer 3 devices and
firewalls with an IPFIX collector and IPFIX analyzer.
With NetFlow Analyzer's IPFIX monitoring and reporting features, you can diagnose and troubleshoot network
issues and generate customized reports. You can plan your future bandwidth needs to optimize usage with these
one-minute granularity reports. NetFlow Analyzer helps you generate and schedule custom bill plans,
and sends email and SMS-based alerts in case of threshold violations.
NetFlow vs IPFIX.
IPFIX is an industry standardized version of NetFlow. IPFIX, often referred to as NetFlow v10, is a more
relevant option when it comes to working with data or devices that are not built by Cisco itself. While
NetFlow also provides multiple options for this, they're simply more time-consuming and complicated to use.
One of the major differences between NetFlow and IPFIX is that IPFIX allows a vendor ID to be specified. This
allows the vendor to add proprietary information to the flow and export any data they want. IPFIX also allows
variable length fields, making HTTP host and URL export easier.
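One visible difference on the wire is the export header itself: both protocols start with a 16-bit version field, which is 9 for NetFlow v9 and 10 for IPFIX, and the rest of the header differs in layout. The Python sketch below reads that header with the standard struct module; the function name and the returned dictionary keys are illustrative choices, and the sample datagrams are synthetic.

```python
import struct


def parse_export_header(datagram: bytes) -> dict:
    """Read the fixed export header of a NetFlow v9 (20 bytes) or IPFIX (16 bytes) datagram."""
    if len(datagram) < 2:
        raise ValueError("datagram too short")
    (version,) = struct.unpack("!H", datagram[:2])
    if version == 9:
        # version, record count, sysUptime, UNIX seconds, sequence number, source ID
        _, count, _uptime, _secs, sequence, source_id = struct.unpack("!HHIIII", datagram[:20])
        return {"protocol": "NetFlow v9", "records": count,
                "sequence": sequence, "source_id": source_id}
    if version == 10:
        # version, total message length, export time, sequence number, observation domain ID
        _, length, _export_time, sequence, domain_id = struct.unpack("!HHIII", datagram[:16])
        return {"protocol": "IPFIX", "length": length,
                "sequence": sequence, "domain_id": domain_id}
    raise ValueError(f"unsupported export version {version}")


# Synthetic headers, just to exercise the parser.
v9_header = struct.pack("!HHIIII", 9, 2, 123456, 1628121600, 3, 1)
ipfix_header = struct.pack("!HHIII", 10, 16, 1628121600, 7, 1)
print(parse_export_header(v9_header))
print(parse_export_header(ipfix_header))
```

A collector listening on one port (9996 in this lab) can use that first field to decide how to decode each incoming datagram.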
ip flow-export version 9
interface Ethernet0/1
ip flow ingress
Internal-Router:
ip flow-export version 9
interface Ethernet0/2
ip flow ingress
ip flow egress
This is the old form of configuration. I will configure IPFIX on the IOS-XE device so you will
see the difference:
CSR1000v-1:
destination 192.168.0.1
export-protocol ipfix
exporter TEST
interface GigabitEthernet1
Cache:
Status: allocated
Stats:
protocol distribution
Current entries: 11
IP TOS: 0x00
IP PROTOCOL: 17
ip source as: 0
ip destination as: 0
counter packets: 98
IP TOS: 0x00
IP PROTOCOL: 6
ip source as: 0
ip destination as: 0
counter packets: 4
--More--
As you can see from the above output, the device monitors the flow information and sends
it to the NetFlow Analyzer.
Edge-Router-254#show ip cache flow | begin Pro
Task 01:
Enable the ISE 3.0 Device Admin feature
There should be two accounts: orhan-admin and navid-operator
The Admin account should be able to execute any command on the devices
The Operator account should only execute show commands on the devices
Configure the AAA for Device Admin on IOS-XE (CSR1000v-1)
Configure the AAA for Device Admin on IOS-XR (XRv9k-2)
Solution:
AAA stands for Authentication, Authorization and Accounting.
Authentication: We want to know who is trying to do something
Authorization: We want to know what someone can do
Accounting: We want to know what someone has done
The idea is very simple, and we can implement it in 2 major ways:
Device Administration: Someone is trying to log in to a router or a switch. Are they the
person we think they are? Do they even have permission to log in to our devices?
(Authentication). Are they going to execute some specific config commands, or do they just
work as an operator who can run only some low-level show commands?
(Authorization). What have they done in the past? Did they execute a specific
command that caused a network outage? Did they shut down an important interface?
(Accounting)
Network Administration: Let’s imagine you work at an ISP and a customer gets internet
service from your company. Is their device allowed to enter our network? For example,
their perimeter router tries to set up a PPPoE connection with the BRAS. You have provided
a username and password for them (they try to do Authentication). After login you
want to keep track of their traffic usage, connect and disconnect times, etc. (you do
Accounting).
In this lab, we are going to use AAA for the first one (Device Administration).
TACACS+ is going to be used for this purpose. In TACACS+ all three As work separately, which
means we can have Authentication, Authorization and Accounting individually. This protocol
is specially designed for this purpose.
What about RADIUS? In RADIUS the first A (Authentication) is combined with Authorization!
When you get authenticated, normally you can do anything (the access can be limited using
other protocols or technologies, but we imagine RADIUS is working alone!). In other words, we
effectively have Authentication and Accounting in RADIUS.
Other differences between these two protocols: RADIUS uses UDP (ports 1645 and 1646,
or 1812 and 1813) and the message content is not encrypted (only the Password field is
encrypted), while TACACS+ uses TCP (port 49) and the packet contents are encrypted.
As a first step let’s get the ISE node ready. Just follow the setup instructions (during the
installation it will ask you to enter “setup” to start the initial setup process).
Then we log in to the Web GUI:
Add the devices that you want, with their IP Addresses and TACACS Authentication:
The shared secret is very important; it is used when the device contacts ISE. This
shared secret will also be defined on our devices.
I have created two groups: Admins and Operators. We will put the users in these groups.
Create Users:
It is okay to provide Privilege level 15 to everyone, because we will do Authorization for the
users. Even if someone has Priv 15 but is limited to show commands only, they cannot
run any config commands with that Privilege 15.
The operators can only execute “exit” and any “show” command (“exit” is necessary to
allow the user to close the session and exit the Virtual Terminal).
You can be more specific with command sets and allow only some arguments for the show
command. This is how to do that:
With this command set, the user can only run the “exit”, “show ip cef” and “show ip route”
commands.
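The effect of such a command set can be modelled with a few lines of Python. This is only a conceptual sketch: the rule list mirrors the three commands named above, matching is done on whole leading words without abbreviation handling, and ISE’s real matching of commands and arguments works differently in detail.

```python
# Hypothetical stand-in for the command set described above: each rule is a permitted prefix.
OPERATOR_COMMAND_SET = [("exit",), ("show", "ip", "cef"), ("show", "ip", "route")]


def is_permitted(command_line: str, command_set=OPERATOR_COMMAND_SET) -> bool:
    """Permit a command only if its leading words equal one of the rules; deny everything else."""
    words = tuple(command_line.lower().split())
    return any(words[:len(rule)] == rule for rule in command_set)


for command in ["show ip route 10.0.0.0", "show running-config", "configure terminal", "exit"]:
    print(f"{command!r}: {'permit' if is_permitted(command) else 'deny'}")
```

The important design point is the default deny: anything that does not match a rule is rejected, no matter which privilege level the user holds.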
The last step is to create the Device Admin Policies:
Here we will bind Groups, Command Sets and Shell profile together in the Authorization
section:
XRv9k-2:
username orhan
group root-lr
secret 10
$6$KYGuRdYS5dr/R...$6BkIqnNHVdpRv5fNuiohSGymGjlT2coJ81xbu7ibN8h8QUnw6xSdrCFkWeaCLjksrU/ac9J/Kq/iGd8IZ9ME5.
CSR1000v-1:
aaa new-model
key Orhan123
XRv9k-2:
key 7 013C140C5A05575D72
server 192.168.0.70
We just defined the TACACS+ server and specified the shared secret as well as the source interface
(on ISE 3.0 we have defined the devices with these interface IP addresses).
Let’s do the AAA configuration on the IOS-XE:
CSR1000v-1:
aaa authentication login AAA group LAB-TACACS-SERVERS local
aaa authentication enable default group LAB-TACACS-SERVERS enable
aaa authorization config-commands
aaa authorization exec AAA group LAB-TACACS-SERVERS local
aaa authorization commands 0 AAA group LAB-TACACS-SERVERS local
aaa authorization commands 1 AAA group LAB-TACACS-SERVERS local
aaa authorization commands 15 AAA group LAB-TACACS-SERVERS local
aaa accounting commands 15 AAA start-stop group LAB-TACACS-SERVERS
aaa accounting exec AAA start-stop group LAB-TACACS-SERVERS
line vty 0 530
authorization commands 0 AAA
authorization commands 1 AAA
authorization commands 15 AAA
authorization exec AAA
accounting commands 15 AAA
login authentication AAA
accounting exec AAA
transport input telnet ssh
!
We have defined an Authentication login list named AAA (it tries to authenticate anyone who
logs in using the available TACACS+ servers; if no server is available, the router tries the
locally defined usernames).
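The order of methods in the list matters, and so does the reason a method fails. The Python sketch below models the usual behaviour: the next method is tried only when the current one cannot answer at all (for example, no server is reachable), while an explicit reject is final. All function names and the local user database are hypothetical.

```python
def authenticate(username, password, methods):
    """Try each method in order; move on only when a method cannot answer at all."""
    for method in methods:
        try:
            return method(username, password)   # True (accept) or False (reject) is final
        except ConnectionError:
            continue                            # server unreachable: fall through to the next method
    return False


def tacacs_down(username, password):
    raise ConnectionError("no TACACS+ server reachable")


def tacacs_rejects(username, password):
    return False


LOCAL_USERS = {"orhan": "LocalPass1"}           # hypothetical local user database


def local(username, password):
    return LOCAL_USERS.get(username) == password


print(authenticate("orhan", "LocalPass1", [tacacs_down, local]))      # server down, local is used
print(authenticate("orhan", "LocalPass1", [tacacs_rejects, local]))   # server rejected, local is not used
```

This is why a local account is a safety net for outages only: it does not override a rejection that the TACACS+ server has already returned.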
CSR1000v-1#conf t
Command authorization failed.
CSR1000v-1#show protocols
Global values:
Internet Protocol routing is enabled
GigabitEthernet1 is up, line protocol is up
Internet address is 192.168.1.1/24
GigabitEthernet2 is up, line protocol is up
Internet address is 192.168.2.1/24
GigabitEthernet3 is down, line protocol is down
GigabitEthernet4 is down, line protocol is down
Loopback10 is up, line protocol is up
Loopback11 is administratively down, line protocol is down
Internet address is 192.168.100.11/24
Loopback12 is administratively down, line protocol is down
Internet address is 192.168.12.12/24
CSR1000v-1#
In IOS-XR by default we don’t have any VTY lines; you need to create them manually.
The command syntax is almost the same as IOS-XE, with a few differences that you can
see in the above config command sets.
RP/0/RP0/CPU0:XRv9k-2#conf t
Thu Aug 5 23:14:55.427 UTC
Current Configuration Session Line User Date Lock
00001000-00003c5a-00000000 con0_RP0_C orhan Thu Aug 5 23:11:01 2021
RP/0/RP0/CPU0:XRv9k-2(config)#int lo 12
RP/0/RP0/CPU0:XRv9k-2(config-if)#ipv4 address 10.12.12.12/24
RP/0/RP0/CPU0:XRv9k-2(config-if)#commit
Thu Aug 5 23:15:26.429 UTC
RP/0/RP0/CPU0:XRv9k-2#show ip int br
Thu Aug 5 23:16:39.433 UTC
MPLS OAM
Segment-Routing OAM
NOTE: This lab does not have any tasks. It is just a step-by-step case study and walk-through of
the technology.
All devices are fully configured (both MPLS LDP and Segment Routing are configured, but SR is
preferred for forwarding the traffic).
NOTE: This lab is resource-intensive because of the CSR1000v and XRv9k nodes in the SP
network (you need at least 24 CPU cores and 64 GB of memory to run all of the nodes; if you
don’t have enough resources you can power off some devices).
With MPLS encapsulation, we are dealing with LSPs (Label Switched Paths), For example if you
want to to do a ping from XRv9k-1 to XRv9k-2’s loopback 0 IP address, the packet will be Label
switched inside the service Provider network, it is different than the normal IP Forwarding,
We are somehow Tunneling the traffic towards the destination, that is the power of MPLS, the
reason we can provide multiple services to the customers. The devices in the middle (The P
devices) do not care about the original content of the encapsulated packet, they are just label
switching the packets with the information inside the Label Stack.
There is a rule when we talk about customer services and their quality:
The customers are the first to notice when something bad happens in the SP network, for
example if LSPs are broken, if MPLS IP is not enabled on some links, or if something is
wrong with the control plane or data plane. Customers notice these errors even before the SP
engineers do, because they are the actual end users being served by the Service Provider.
This is the main reason to have tools that let the SP engineers make sure their
network works fine.
MPLS and SR OAM are great tools that let the SP staff manage and assure their
MPLS networks. They can find errors related to the data path and even SR policies,
inconsistencies between the control and data planes, and so on.
For many years we have used the ping and traceroute commands to troubleshoot our
IP networks, but MPLS needs tools that go beyond these and are specifically
designed for that purpose.
Let’s imagine we are running MPLS IP with LDP in the SP network. LDP is distributing the labels,
and the devices are pushing, swapping and popping labels to provide the LSPs. For
some reason MPLS IP is disabled on one of the interfaces in the LSP. Do you think a normal
traceroute can find the problem? The answer is NO!
LDP creates, advertises and learns labels for the prefixes that are learned
from the IGP. There must be a route in the routing table to allow LDP to do its job. In the
above case, when MPLS IP is disabled on a link or for some reason the LDP neighborship has not
formed, the packet is IP forwarded instead of label switched, so the traceroute command
cannot find the broken LSP. In a normal trace we send packets towards a routable IP
destination. To find problems in MPLS LSPs, the probe packets must not carry a
routable/forwardable destination IP address. MPLS Ping and Trace solve this problem
simply by using an address from 127.0.0.0/8 as the destination. Whenever a router would
have to fall back to IP forwarding due to a lack of labels, it cannot forward the packet:
127.0.0.0/8 is the localhost range and is not a valid IP destination. The
router drops the packet and replies to the source that something went wrong with the Label
Switched Path (there is no label-switched outgoing interface).
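The reasoning above can be illustrated with a small Python sketch. It is only a toy model of one transit router's decision, not how any real forwarding plane is implemented; the function name and the three result strings are made up for illustration.

```python
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def transit_decision(dst_ip: str, has_outgoing_label: bool) -> str:
    """Toy forwarding decision for a packet that arrived on an LSP."""
    if has_outgoing_label:
        return "label-switch"      # the LSP is intact at this hop
    if ipaddress.ip_address(dst_ip) in LOOPBACK:
        return "drop-and-reply"    # 127/8 cannot be IP forwarded
    return "ip-forward"            # the LSP break stays hidden


# Same broken hop, two different probe destinations.
for dst in ("10.255.255.2", "127.0.0.1"):
    print(dst, "->", transit_decision(dst, has_outgoing_label=False))
```

With a routable destination the broken hop silently falls back to IP forwarding, while the 127/8 destination forces the router to report the break.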
By using the MPLS OAM tools we can also find inconsistencies between the Route Processor
information and the line card information. For example, we can force the remote router to
process the MPLS Trace packet with its Route Processor instead of the line card processor.
Let’s jump into the lab and see these things in action.
First of all we need to enable MPLS OAM on all of the devices. The command is simple: just
enter mpls oam in global configuration mode. That’s all.
On all SP Core and Aggregation routers (IOS-XE and IOS-XR boxes):
mpls oam
As an example (IOS-XR):
Let’s use the normal Ping and MPLS Ping commands and capture the packets:
RP/0/RP0/CPU0:XRv9k-1#ping 10.255.255.2
!!!!!
Take a look at the destination, 10.255.255.2, and note that the ICMP message is encapsulated
directly inside IP.
Let’s use MPLS Ping instead and set the target FEC address to 10.255.255.2/32:
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32
!!!!!
The destination IP is 127.0.0.1. This makes sure that if some router has a broken labeled
output interface we can detect it, because the router is not going to IP forward a packet
destined to 127.0.0.1 anywhere.
MPLS Ping and Trace use UDP datagrams with a source and destination port of 3503,
and the message is an MPLS Echo, not the normal ICMP Echo Request.
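To make the difference from an ICMP Echo more concrete, here is a hedged Python sketch that packs the fixed 32-byte header of an MPLS Echo Request as I understand it from RFC 8029. It only builds bytes and does not send anything. The function name is hypothetical, the timestamps are left at zero, and a real request would also carry a Target FEC Stack TLV after this header.

```python
import struct

MPLS_ECHO_UDP_PORT = 3503   # UDP port used by MPLS Ping and Trace
ECHO_REQUEST = 1            # Message Type: MPLS Echo Request
REPLY_VIA_UDP = 2           # Reply Mode: reply via an IPv4/IPv6 UDP packet

# Version, Global Flags, Message Type, Reply Mode, Return Code, Return Subcode,
# Sender's Handle, Sequence Number, then four 32-bit timestamp words.
HEADER = struct.Struct("!HHBBBBIIIIII")


def build_echo_header(handle: int, sequence: int) -> bytes:
    """Pack the fixed header; timestamps are zeroed in this sketch."""
    return HEADER.pack(1, 0, ECHO_REQUEST, REPLY_VIA_UDP, 0, 0,
                       handle, sequence, 0, 0, 0, 0)


header = build_echo_header(handle=0x1234, sequence=1)
print(len(header), header.hex())
```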
!!!!!
SR is preferred to forward the packets in the data plane, but we can also test the
LDP-generated LSP.
To test load sharing we can also use different destination addresses (by default
it uses 127.0.0.1):
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32 fec-type ldp destination 127.0.0.11
!!!!!
10.1.3.3 15 msec
10.1.5.5 28 msec
10.5.6.6 8 msec
3 10.2.4.2 14 msec *
10.2.6.2 14 msec
It is a UDP datagram with destination 10.255.255.2 and a destination port starting from 33434,
and it does not give us much information about the LSP.
RP/0/RP0/CPU0:XRv9k-1#traceroute mpls ipv4 10.255.255.2/32 verbose
L 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 16002 Exp: 0] 44 ms, ret code 8
L 2 10.5.6.6 10.2.6.2 MRU 1500 [Labels: implicit-null Exp: 0] 15 ms, ret code 8
We can see that the downstream labeled output interface is working fine, that
implicit-null is being used (the reason is PHP), and the exact router IP addresses on
that interface. Another thing that is very important in MPLS is the MTU; as you can see, the
routers report the actual link MRUs (Maximum Receive Unit).
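If you want to post-process this output, the per-hop lines are regular enough to parse. The following Python sketch is one possible approach; the regular expression is written against the two hop lines shown above only, and the field names (responder, downstream) are my own labels for the two addresses, not official terminology.

```python
import re

# Matches lines such as:
# L 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 16002 Exp: 0] 44 ms, ret code 8
HOP_RE = re.compile(
    r"^(?P<code>\S)\s+(?P<hop>\d+)\s+(?P<responder>\S+)\s+(?P<downstream>\S+)"
    r"\s+MRU\s+(?P<mru>\d+)\s+\[Labels:\s+(?P<labels>\S+)\s+Exp:\s+(?P<exp>\S+)\]"
    r"\s+(?P<rtt>\d+)\s+ms,\s+ret code\s+(?P<ret>\d+)$"
)


def parse_hop(line: str):
    """Return a dict for one verbose hop line, or None if it does not match."""
    match = HOP_RE.match(line.strip())
    if match is None:
        return None
    hop = match.groupdict()
    for key in ("hop", "mru", "rtt", "ret"):
        hop[key] = int(hop[key])
    hop["labels"] = hop["labels"].split("/")   # a label stack is printed as a/b/c
    return hop


sample = [
    "L 1 10.1.5.5 10.5.6.6 MRU 1500 [Labels: 16002 Exp: 0] 44 ms, ret code 8",
    "L 2 10.5.6.6 10.2.6.2 MRU 1500 [Labels: implicit-null Exp: 0] 15 ms, ret code 8",
]
for line in sample:
    print(parse_hop(line))
```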
This is the packet content, and it contains detailed information about everything!
Another useful command tests the ECMP multipath:
RP/0/RP0/CPU0:XRv9k-1#traceroute mpls multipath ipv4 10.255.255.2/32
Fri Aug 6 01:07:41.413 UTC
Starting LSP Path Discovery for 10.255.255.2/32
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no rx label,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
LL!
Path 0 found,
output interface GigabitEthernet0/0/0/0 nexthop 10.1.3.3
source 10.1.3.1 destination 127.0.0.0
LL!
Path 1 found,
output interface GigabitEthernet0/0/0/1 nexthop 10.1.5.5
source 10.1.5.1 destination 127.0.0.0
ECMP also works fine; the destination IP is changed sequentially for each path (127.0.0.0,
127.0.0.1 and so on…).
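The idea of walking through 127/8 destinations until every equal-cost path has been exercised can be sketched in Python as below. This is a deliberately simplified model: the CRC-based hash is a stand-in for whatever load-balancing hash a real router uses, and real implementations learn the usable address ranges from the downstream mapping information instead of probing blindly. The next-hop addresses are taken from the output above; everything else is hypothetical.

```python
import ipaddress
import zlib

NEXT_HOPS = ["10.1.3.3", "10.1.5.5"]   # the two next hops seen in the lab output


def pick_next_hop(src: str, dst: str) -> str:
    """Toy ECMP hash: map a (source, destination) pair to one next hop."""
    key = f"{src}-{dst}".encode()
    return NEXT_HOPS[zlib.crc32(key) % len(NEXT_HOPS)]


def discover_paths(src: str, max_probes: int = 256) -> dict:
    """Try 127.0.0.0, 127.0.0.1, ... and remember one destination per next hop."""
    found = {}
    base = ipaddress.ip_address("127.0.0.0")
    for offset in range(max_probes):
        dst = str(base + offset)
        found.setdefault(pick_next_hop(src, dst), dst)
        if len(found) == len(NEXT_HOPS):
            break
    return found


print(discover_paths("10.255.255.1"))
```

The source address passed to discover_paths is an assumed loopback for XRv9k-1 and is not taken from the lab.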
Let’s shut down the g0/0/0/0 and g0/0/0/1 interfaces of XRv9k-4 and disable MPLS IP
on CSR1k-6 Gig1 and Gig2:
CSR1k-6(config-if)#int g1
CSR1k-6(config-if)#no mpls ldp igp autoconfig
CSR1k-6(config-if)#
*Aug 6 01:14:34.607: %LDP-5-SP: 10.255.255.2:0: session hold up initiated
CSR1k-6(config-if)#int g2
CSR1k-6(config-if)#no mpls ldp igp autoconfig
CSR1k-6(config-if)#
*Aug 6 01:14:53.405: %LDP-5-SP: 10.255.255.7:0: session hold up initiated
Now our LSP towards 10.255.255.2/32 is broken. Let’s test it with normal Ping and
Traceroute:
RP/0/RP0/CPU0:XRv9k-1#ping 10.255.255.2
Fri Aug 6 01:16:32.352 UTC
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.255.255.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 5/11/20 ms
RP/0/RP0/CPU0:XRv9k-1#trace 10.255.255.2
Fri Aug 6 01:16:37.975 UTC
Tracing the route to 10.255.255.2
1 10.1.5.5 [MPLS: Label 16002 Exp 0] 11 msec 4 msec 3 msec
2 10.5.6.6 [MPLS: Label 16002 Exp 0] 7 msec 4 msec 3 msec
3 10.2.6.2 12 msec * 13 msec
RP/0/RP0/CPU0:XRv9k-1#ping 10.255.255.2
Fri Aug 6 01:21:43.592 UTC
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.255.255.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/8/22 ms
RP/0/RP0/CPU0:XRv9k-1#trace 10.255.255.2
Fri Aug 6 01:21:46.001 UTC
Interesting: we have reachability, but in fact there is no label switching between CSR1k-6
and XRv9k-2.
As you can see, we cannot detect LSP problems with the normal Ping and Trace commands.
Let’s check it with MPLS Ping and Trace:
RP/0/RP0/CPU0:XRv9k-1#ping mpls ipv4 10.255.255.2/32 fec-type ldp
Fri Aug 6 01:26:55.245 UTC
BBBBB
Success rate is 0 percent (0/5)
A label is missing on some device for that prefix! Let’s find the exact hop:
Interesting, right? We found exactly where the problem is in our MPLS data plane.
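The one-character result codes printed by MPLS Ping and Trace can also be decoded mechanically. The Python sketch below copies the legend printed by the router earlier in this lab into a dictionary and summarizes a result string; the function name is my own.

```python
from collections import Counter

# Legend as printed by the router in the traceroute mpls multipath output.
RESULT_CODES = {
    "!": "success", "Q": "request not sent", ".": "timeout",
    "L": "labeled output interface", "B": "unlabeled output interface",
    "D": "DS Map mismatch", "F": "no FEC mapping", "f": "FEC mismatch",
    "M": "malformed request", "m": "unsupported tlvs", "N": "no rx label",
    "P": "no rx intf label prot", "p": "premature termination of LSP",
    "R": "transit router", "I": "unknown upstream index",
    "X": "unknown return code", "x": "return code 0",
}


def summarize(result: str) -> dict:
    """Count how often each result code appears in a string such as 'BBBBB'."""
    summary = Counter()
    for char in result.strip():
        summary[RESULT_CODES.get(char, f"unrecognized code {char!r}")] += 1
    return dict(summary)


print(summarize("BBBBB"))   # the broken LDP LSP above
print(summarize("LL!"))     # one healthy path from the multipath trace
```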
Let’s try Segment Routing as the FEC-Type instead of LDP:
RP/0/RP0/CPU0:XRv9k-1#trace mpls ipv4 10.255.255.2/32 verbose
Fri Aug 6 01:29:18.581 UTC
We have completely disabled Segment Routing on CSR1k-6, so there is no mapping for SID
16002, and it replies with no FEC mapping for that prefix.
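As a reminder of where a label such as 16002 comes from, here is a minimal Python sketch of the usual prefix-SID arithmetic: label = SRGB base + SID index. It assumes the default Cisco SRGB of 16000 to 23999 and that XRv9k-2 was configured with index 2; the lab may just as well use an absolute SID value, so treat the index as an illustration only.

```python
SRGB_BASE = 16000   # assumed: default SRGB start on Cisco platforms
SRGB_SIZE = 8000    # assumed: default SRGB covers 16000-23999


def prefix_sid_label(index: int, base: int = SRGB_BASE, size: int = SRGB_SIZE) -> int:
    """Translate a prefix-SID index into the MPLS label a node would program."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} is outside the SRGB of size {size}")
    return base + index


print(prefix_sid_label(2))   # 16002
```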
Another great feature of MPLS OAM is the ability to force the Route Processor to reply to
these MPLS Ping and Trace messages. In that way we can find inconsistencies between the
control plane and the data plane.
NOTE: It is not possible to test this feature in a virtual environment, because these devices
do not have separate supervisor and line cards, but the command is:
RP/0/RP0/CPU0:XRv9k-1#trace mpls ipv4 10.255.255.2/32 reply mode router-alert
Using Segment Routing OAM, we can test our policies. For example, a device pushes some
Segment IDs onto the label stack in order to steer the packet towards some node or link.
Let’s test that as well. For example, I want to steer the packet this way (for no particular
reason, just testing):
NOTE: You can use any lab which has Cisco SD-WAN controllers and edge devices; you don’t
have to use this one. Controller redundancy is optional in this case.
NOTE: This section is just a case study and walkthrough of the vManage monitoring features; we
don’t have any tasks.
NOTE: A basic knowledge of how the Cisco SD-WAN solution works, as well as how to on-board
the controllers and edge devices, is required. For more information please refer to our
SD-WAN course on the website:
https://orhanergun.net/courses/self-paced-sdwan-training/
Cisco’s SD-WAN solution (formerly Viptela SD-WAN) is a great SD-WAN solution. To achieve its
goal, every SD-WAN technology builds a fabric (overlay) on top of the underlay
network. This overlay makes it possible to have application awareness.
A single pane of glass (vManage) is used to monitor all the devices that
participate in this solution (controllers as well as edge devices). vManage has great
monitoring tools to help administrators monitor and troubleshoot their SD-WAN network.
vManage builds a control connection with each of the devices and continuously collects
information about almost everything from the devices.
We will do this part on the vManage device, and you will learn what features it has in terms of
monitoring, or better said, network assurance.
There is a CA node in this lab; we will use a browser on it to log in to the web-based GUI of
one of our vManages (in this lab the vManages are in a cluster, their configurations are in
sync, and all of them have an identical view of the SD-WAN network).
Let’s go to the vManage-1 dashboard:
In the main dashboard we can see an overview of the SD-WAN network: 4 vSmarts are online,
along with 3 WAN Edge devices, 2 vBonds and 3 vManages in a cluster.
7 control connections are up. If you click on the number, it shows the details of the devices:
It also shows that we have 3 sites with full WAN connectivity (11, 12 and
13).
We can also see the SLA probe results (from BFD) used for Application Aware Routing.
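BFD probes measure loss, latency and jitter per tunnel, and Application Aware Routing compares those measurements with the thresholds of an SLA class. A minimal Python sketch of that comparison follows. The class names, the threshold values and the measurements are all invented for illustration and are not read from this lab; only the public-internet transport name appears in the lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    max_loss_pct: float
    max_latency_ms: float
    max_jitter_ms: float


@dataclass(frozen=True)
class TunnelStats:
    name: str
    loss_pct: float
    latency_ms: float
    jitter_ms: float


def compliant_tunnels(stats, sla):
    """Return the names of tunnels whose probe results satisfy every threshold."""
    return [
        t.name for t in stats
        if t.loss_pct <= sla.max_loss_pct
        and t.latency_ms <= sla.max_latency_ms
        and t.jitter_ms <= sla.max_jitter_ms
    ]


sla = Sla(max_loss_pct=1.0, max_latency_ms=150.0, max_jitter_ms=30.0)  # hypothetical SLA class
tunnels = [
    TunnelStats("public-internet", loss_pct=0.0, latency_ms=40.0, jitter_ms=5.0),
    TunnelStats("mpls", loss_pct=3.0, latency_ms=20.0, jitter_ms=2.0),
]
print(compliant_tunnels(tunnels, sla))   # ['public-internet']
```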
Clicking on a device takes us to the detailed monitoring menu of that
specific device. For example, let’s look at cEdge-1:
This page is the System Status dashboard. We can find information about the device hardware
(these are virtual devices, so we don’t expect any module or temperature sensors,
etc.).
You can see real-time data as well as historical data (1h, 3h, 7 days, etc.).
If you enable DPI (Deep Packet Inspection), you can get application-related information, such
as the percentage of traffic for each application.
There is no data in this part because we have not configured the related policies.
The Interface section gives you information about the device interfaces (physical as well as
logical):
You can see which VPN a specific interface is part of, the status of that interface, its
MTU, and the TX and RX rates.
In the QoS section, if you configure queues and apply them as localized policies to the
WAN Edge devices, you can see the TX rate as well as the amount of dropped traffic.
In the WAN section we can see the status of the TLOCs of the WAN Edge device as well as the
current tunnels (GRE and IPsec) created with remote devices.
This is the list of IPsec tunnels this device has created with remote WAN Edge devices.
If you configure any security features such as Firewall, IPS or URL Filtering, you can find the
monitoring for each of these features in the Security Monitoring part:
It draws a logical topology of the control connections over the different transports. For
example, using the public-internet transport, this device has formed 3 different control
connections to two vSmarts and the first vManage.
NOTE: We have also used vSmart high availability in this lab and forced the WAN Edge devices
to connect to specific vSmarts that are in a Controller Group (14 in this example).
These are the events that happened on the device, with their severity:
NOTE: As you have noticed, with Cisco SD-WAN you don’t need to configure SNMP or Syslog
servers; everything is placed in a single box (vManage). The same is true for SD-Access:
DNA-C is the SNMP NMS, Syslog server and NetFlow analyzer as well.
In the ACL Logs section you can see the logs generated whenever an entry in
the Access Control List is hit.
The most interesting part of Cisco SD-WAN monitoring is the Troubleshooting feature.
Because we are dealing with a fabric, it is not possible to examine flows exactly without
using some built-in tools.
There are two types of troubleshooting tools:
Using the Ping tool we can send ICMP, TCP and UDP echo requests to remote devices (for
all VPNs: service-side VPNs such as 100, the transport-side VPN 0, and 512 for management).
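The VPN numbering mentioned above is easy to encode if you script against these tools. A small Python sketch under the assumption that every VPN ID other than 0 and 512 is a service-side VPN (platform-specific upper limits are deliberately not checked); the function name is hypothetical.

```python
def vpn_role(vpn_id: int) -> str:
    """Classify a Cisco SD-WAN VPN ID by its role."""
    if vpn_id < 0:
        raise ValueError("VPN IDs are non-negative")
    if vpn_id == 0:
        return "transport"
    if vpn_id == 512:
        return "management"
    return "service"


for vpn in (0, 100, 512):
    print(vpn, vpn_role(vpn))
```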
Let’s test the service side, for example:
We have pinged Host-2 on vEdge-1, and it seems we have 5 percent packet loss (maybe because
of device ICMP rate limiting or some packet loss in the EVE-NG interface manager).
There is also a Traceroute tool:
It shows us a graphical output of the path to Host-2.
It shows us exactly what happens when we send multiple packets of a specific application to the
destination. In this case the router is doing ECMP and using all of its TLOCs to connect to the
remote TLOCs.
If you apply an Application Aware Routing policy to the data plane and want to visualize
exactly what is happening in the data plane, you can use this tool.
The Tunnel Health troubleshooting tool draws a chart of the SLA of each tunnel:
The last tool provides Application Aware Routing visualization. You
can see how a specific tunnel behaves with respect to the SLA of each application; it simulates
this for you using the current BFD probes in terms of packet loss, delay and jitter, for example
for Apple Updates:
The last option is Real Time: we can execute show commands on all devices without going to
their CLI one by one:
For example, the above output is the same as that of the show omp peers command.
Let’s check the routing table:
So far we have explored the individual monitoring sections for each device. What about the
vManage audit logs?
NOTE: At the moment, Cisco SD-Access cannot be virtualized on any network
emulation platform (EVE-NG, GNS3, etc.); the only way to have a working SD-Access is to have the
main physical components such as the DNAC appliance, Border and Edge nodes.
NOTE: In this lab we are using our own physical rack. If you want access to this lab,
please refer to the link below:
https://sdarack.orhanergun.net/
NOTE: This part is only a case study and does not have any tasks. We will only go through the
DNAC Assurance feature and network telemetry.
Two major components, or let’s say engines, run behind the scenes in DNA Center. The
first one is called APIC-EM, which in general terms is responsible for provisioning the devices
and doing the automation to bring up the SDA fabric. The second one is NDP (Network Data
Platform); this module gathers information from all devices that are part of the SDA
network (using SNMP, Syslog, IPFIX, etc.) and provides the Assurance part of the SD-Access
solution.
In this lab the SDA fabric is fully configured:
As you can see, our nodes participate in the SDA fabric (Control Plane, Border node and
Edge node).
There are three profiles by default: Disable Telemetry, Optimal Visibility and Maximal
Visibility.
Maximal Visibility gathers and analyzes the Syslog and IPFIX (Application Visibility)
information:
To enable Maximal Visibility for our Edge device, go to the Site View, select the
device, and from the Action menu select Maximal Visibility:
NOTE: On 9300 Edge interfaces (interfaces that are connected to the end hosts) there should
be a lan tag in the description; otherwise the IPFIX configuration will not be applied to that
interface by DNAC:
9300#show run int g1/0/10
Building configuration...
NOTE: All of this configuration has been pushed by DNAC; we have not done any of it
manually.
NOTE: You can enter the interface description either from the CLI (manually) or configure it in
the Host on-boarding section of the DNAC fabric. Just make sure there is a lan tag in the
description.
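Before enabling telemetry it can be handy to check which access interfaces already carry the tag. The Python sketch below scans a running-config excerpt and reports, per interface, whether the description contains the word lan. Two things are assumptions here: that the tag is the standalone keyword lan matched case-insensitively, and the sample configuration itself, which is invented and not taken from the 9300 in this lab.

```python
import re

LAN_TAG = re.compile(r"\blan\b", re.IGNORECASE)   # assumed form of the tag


def lan_tagged_interfaces(config: str) -> dict:
    """Map each interface name to True if its description carries the lan tag."""
    tagged = {}
    current = None
    for line in config.splitlines():
        stripped = line.strip()
        if line.startswith("interface "):
            current = line.split(None, 1)[1]
            tagged[current] = False            # no description seen yet
        elif current and stripped.startswith("description "):
            description = stripped[len("description "):]
            tagged[current] = bool(LAN_TAG.search(description))
    return tagged


# Hypothetical excerpt of a running configuration.
SAMPLE_CONFIG = """\
interface GigabitEthernet1/0/10
 description lan host port
!
interface GigabitEthernet1/0/11
 description uplink to border
!
interface GigabitEthernet1/0/12
!
"""
print(lan_tagged_interfaces(SAMPLE_CONFIG))
```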
Let’s check the NetFlow configuration on the 9300 (pushed by DNAC as well):
flow record dnacrecord
match ipv4 version
match ipv4 protocol
match application name
match connection client ipv4 address
match connection server ipv4 address
match connection server transport port
match flow observation point
collect timestamp absolute first
collect timestamp absolute last
collect flow direction
collect connection initiator
collect connection client counter packets long
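In a Flexible NetFlow record, match statements define the key fields that identify a flow, and collect statements define the non-key fields that are exported with it. As a small illustration, the Python sketch below splits the record shown above into those two groups; the helper name is my own and the record text is copied from the output above.

```python
FLOW_RECORD = """\
flow record dnacrecord
 match ipv4 version
 match ipv4 protocol
 match application name
 match connection client ipv4 address
 match connection server ipv4 address
 match connection server transport port
 match flow observation point
 collect timestamp absolute first
 collect timestamp absolute last
 collect flow direction
 collect connection initiator
 collect connection client counter packets long
"""


def split_fields(record: str) -> dict:
    """Group the record's statements into key (match) and non-key (collect) fields."""
    fields = {"match": [], "collect": []}
    for line in record.splitlines():
        words = line.split()
        if words and words[0] in fields:
            fields[words[0]].append(" ".join(words[1:]))
    return fields


fields = split_fields(FLOW_RECORD)
print("key fields:", fields["match"])
print("non-key fields:", fields["collect"])
```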
In the main dashboard you get an overall health view of the network. It shows the
devices with their roles and their health level, which is green (healthy), as well as the wired
and wireless clients on-boarded to the fabric.
Issues are listed at the bottom of the page:
In the Client Health section we can find more detailed information about the clients:
As you can see, the client domain username (Active Directory username) is shown here as
well (we are using dot1x for client authentication and authorization, ISE is integrated with
DNAC and with the Active Directory domain controller, and micro-segmentation
is in place).
If you click on one of the usernames, you get detailed information (Client 360 view):
If something goes wrong in the network, you can find it in the Issues section. For example, if
some node is not reachable or some client cannot be on-boarded, it gives you a step-by-step
guide to solving the issue.
If DNAC finds any threats, they are shown in the Rogue Management section:
Using the search feature you can search for almost anything in the network, and it will redirect
you to the related pages for that entity: