All-in-One ENCOR 350-401 V1.0 Exam Cert Guide
1st Edition
Contents at a Glance
Chapter 1 Architecture
Chapter 2 Virtualization
Chapter 3 Infrastructure
Chapter 4 Network Assurance
Chapter 5 Security
Chapter 6 Automation
2020 © CCIEin8Weeks.com 5
Table of Contents
Virtualization ...................................................................................................................................................................................... 49
Automation .......................................................................................................................................................................................... 49
Performance, Scalability, and High Availability .................................................................................................................. 49
Performance ........................................................................................................................................................................................ 49
Scalability and High Availability ............................................................................................................................................... 51
Security Implications, Compliance, and Policy .................................................................................................................... 52
Workload Migration ........................................................................................................................................................................ 54
Explain the Working Principles of the Cisco SD-WAN Solution ..................................................................................... 55
SD-WAN Control and Data Planes Elements ........................................................................................................................ 56
vBond Orchestrator .......................................................................................................................................................................... 56
vManage ............................................................................................................................................................................................... 56
vSmart Controller ............................................................................................................................................................................. 57
vEdge Devices ................................................................................................................................................................................... 57
Further Reading ................................................................................................................................................................................. 60
Explain the Working Principles of the Cisco SD-Access Solution .................................................................................. 61
SD-Access Control and Data Planes Elements ..................................................................................................................... 61
SD-Access Control Plane .............................................................................................................................................................. 61
SD-Access Data Plane .................................................................................................................................................................... 62
Software-Defined Access (or SD-Access) .............................................................................................................................. 63
Traditional Campus Interoperating with SD-Access........................................................................................................... 65
Further Reading ................................................................................................................................................................................. 65
Describe Concepts of Wired and Wireless QoS ...................................................................................................................... 65
QoS Components .............................................................................................................................................................................. 66
QoS Policy ........................................................................................................................................................................................... 66
Differentiate Hardware and Software Switching Mechanisms ....................................................................................... 67
Process and CEF Switching .......................................................................................................................................................... 68
Software-based CEF ........................................................................................................................................................................ 68
Hardware-based CEF ...................................................................................................................................................................... 69
MAC Address Table and TCAM................................................................................................................................................ 69
FIB vs. RIB ......................................................................................................................................................................................... 70
Chapter Summary ................................................................................................................................................................................. 72
Muhammad Afaq Khan started his professional career at Cisco TAC San Jose and passed his
first CCIE in 2002 (#9070). He held multiple technical and management positions at Cisco San
Jose HQ over his 11 years of tenure at the company before moving into cloud software and data
center infrastructure IT industries.
He has worked at startups as well as Fortune 100 companies in senior leadership positions over
his career. He is also a published author (Cisco Press, 2009) and holds multiple patents in the
areas of networking, security, and virtualization. Currently, he is a founder at Full Stack
Networker and a vocal advocate for network automation technologies and NetDevOps.
Preface
Congratulations! You have taken your first step towards preparing for and passing the Enterprise Network Core Technologies (ENCOR) 350-401 V1.0 exam.
Did you just purchase a copy? Interested in getting access to a complimentary ENCOR Exam
Quiz? Register here1 and send us an email at support@cciein8weeks.com to get started.
This study guide is dedicated to all those souls who will never settle for less than they can be, do,
share, and give!
1 https://bit.ly/2UhExXh
• ENCOR exam consists of topics from six domains of knowledge, i.e., Architecture,
Virtualization, Automation, Infrastructure, Network Assurance, and Security. It went live
on February 24, 2020.
• ENCOR serves a triple purpose: it is the CCNP Enterprise Core exam as well as the qualification exam for both the CCIE Infrastructure Lab and the CCIE Wireless Lab. CCIE recertification requirements are now different from the initial qualification requirements.
• It is the mandatory Core exam for the CCNP Enterprise track. You become CCNP
Enterprise certified when you pass one of the professional Concentration exams in
addition to ENCOR.
• It obsoletes both old CCNP R&S exams (300-101 and 300-115) and CCIE written exams
for both R&S (400-101) and Wireless (400-351) tracks.
• It is a 120-minute exam that costs $400 (USD) per attempt, which is significantly cheaper than the $450 per attempt for the older 400-series qualification or “written” exams.
• Each successful attempt at ENCOR recertifies your CCNP for three years, which is the same as today. However, the CCNP recertification cost changes from $400 (passing one core exam) to $900 (passing three concentration exams), i.e., you pay more than twice as much in the new format. There are other possible exam combinations for recertification, including Continuing Education (CE) credits.
• Each successful attempt at ENCOR plus any one of the professional track concentration exams recertifies your CCIE Enterprise for three years. However, the recertification exam cost changes from $450 (one exam) to $700 (two exams). There are other possible exam combinations for recertification, including Continuing Education (CE) credits.
Let’s now double-click into each of those areas and the actual underlying topics that were either removed from or added to the new exam.
Key Differences Between CCIE Routing and Switching Written and ENCOR
Exams
• Beyond the addition of wireless topics (which makes sense now that ENCOR doubles up as both the Enterprise and the Wireless qualification exam), network fundamentals topics are pretty much gone. Thumbs up!
• Layer 2, Layer 3, and VPN technologies have only seen removal and no additions. If you
compare ENCOR with 400-101 V5.1 blueprint, you will be shocked to see that protocols
or technologies such as VLANs, most multicast, RIP, IS-IS, iBGP, MPLS/MPLS VPNs,
DMVPN and even most topics related to OSPF and eBGP have been eliminated.
• Security topics are a net gainer by a significant margin (+15%, as we noted above). However, most security topics are Cisco proprietary, and the blueprint lacks some crucial security technologies and solutions such as Cloud Access Security Broker (or CASB).
• IP or Infra services topics have mostly shrunk, but Cisco still managed to keep Flexible NetFlow and DNA Center along, so a thumbs down!
• Finally, I liked how Cisco chucked away the IoT topics (good one!) but was super surprised to see the removal of SDN, Kubernetes, and container topics. Cisco also added a lot of proprietary SD-WAN (aka the Viptela solution).
The new CCIE Enterprise Lab exam blueprint includes five sections.
Looking at the actual exam topics line items, I can’t help but notice that about 90% of the exam
is Cisco proprietary. In contrast, 10-15% within Infra automation and programmability consist of
open standard and evolving topics. Now, if you recall, in the older format, Evolving
Technologies were only part of the CCIE written exam but no-show in the CCIE R&S lab. So
given the context, the inclusion of automation in the Lab exam is a huge step forward, and I
applaud this change.
The new CCNP Enterprise certification 2 track obsoletes the current CCNP R&S, Wireless, and
CCDP certifications. Unlike the old CCNP, in the newer format, CCNA is no longer required as
a pre-qualification.
The CCNP Enterprise certification requires you to pass one Core and one Concentration exam before you can become certified. The ENCOR 350-401 is the mandatory Core exam; besides that, there are six Concentration (or elective) exams available that you can choose from. Unlike the older ROUTE/SWITCH/TSHOOT exams, you have plenty of choices in the newer CCNP Enterprise.
This study guide includes all of the topics from Cisco's official exam blueprint for Implementing Cisco Enterprise Network Core Technologies, i.e., ENCOR 350-401. As you may already have noticed on the "Contents at a Glance" page, this guide has been formatted around Cisco's official ENCOR 350-401 exam topics or curriculum. The benefit? Well, as you read through the various topics, you will know exactly where you are within your learning journey.
2 https://bit.ly/2uTqg8n
All topics are carefully covered with core concepts, code snippets (where applicable), and topic summaries to help you master the skills so you can confidently face the pressures of the Cisco exam as well as its real-world application.
The ENCOR exam contains subject matter from six domains, which also happen to be the six
chapters in this study guide.
1. Architecture
2. Virtualization
3. Infrastructure
4. Network Assurance
5. Security
6. Automation
This guide is for anyone who's studying for the Cisco ENCOR 350-401 V1.0 exam. I strongly suggest taking a methodical approach to exam preparation, i.e., start with a target date for when you would like to sit the actual exam and then work backward to see what kind of study plan would work for you. To help further, I have put together an 80-hour learning plan3 consisting entirely of public resources, something that you can download and follow.
3 https://bit.ly/3130UB1
CCIEin8Weeks.com carries the supplemental resources (sold separately) that go hand in hand
with this study guide to further ensure your exam success.
• Exam Prep Bundle4 that covers all bodies of knowledge tested on the ENCOR Exam
• 6x Practice Quizzes (one for each section as per the official curriculum)
• 1x Practice Exam (to help you prepare to face the pressure of a real Cisco exam)
• Hands-on Labs with cloud-hosted IDE for immediate Python code execution
• Code snippets hosted as GitHub Gists that you can clone/fork for modification
4 https://bit.ly/36RWiim
CHAPTER 1 ARCHITECTURE
This chapter covers the following exam topics from Cisco’s official 350-401 V1.0 Enterprise Network Core Technologies (ENCOR)5 exam blueprint.
5 https://bit.ly/3b7cn7o
Before we get into the specifics of network design discussion for an enterprise network, it
behooves us to look at the big picture and the very fundamentals of network design.
The network is simply a resource, and a means to an end. Every enterprise network is laid out to
facilitate the applications running on top of it. The network will meet its goals if enterprise
applications can run in a reliable and performant manner. With the increasing adoption of cloud
applications (or SaaS apps such as CRM or HRM), i.e., applications that are hosted by the
providers (such as Salesforce) in their own data centers as opposed to being on-premise, the role
of the network changes again. In the new world of cloud apps, the network still has to provide reliable and performant access to those off-premise apps, and even more so maintain the necessary user experience, security, compliance, and visibility and control with the help of solutions such as a Cloud Access Security Broker (or CASB6).
While you have to build your network for current requirements, it must also be able to evolve. Think in a modular fashion: your core design choices (for example, 2-tier versus 3-tier architecture) stay the same while other parts of the network evolve, much like Lego building blocks. Whether you are designing for only on-premise or
6 https://en.wikipedia.org/wiki/Cloud_access_security_broker
everything off-prem (SaaS/PaaS), your design will still need to be performant, resilient, and scalable.
Enterprise network design such as 2-Tier, 3-Tier, and Fabric Capacity planning
An enterprise campus network can span a single building or a group of buildings spread out over a large geographic area, much like a college campus, but still in close proximity. The primary goal of campus design is to deliver the fastest speeds (say, 1 or 10 Gbps) and a variety of access options (LAN, WLAN) to the endpoints.
Campus network design can be organized around four core principles, i.e., hierarchy, modularity, resiliency, and flexibility.
In 1999, Cisco pioneered campus network design with a hierarchical design model that uses a layered approach. Hierarchical network design helps break down otherwise complex, flat networks into multiple smaller, more manageable network tiers or layers. Each layer is focused on a specific set of requirements and roles. With this design, network designers can pick the most suitable platform and software features for each layer. As we discussed earlier, regardless of how a network was designed, the ability to modify an existing design, i.e., without rip and replace, is of utmost importance. There can be many underlying reasons for such modifications, e.g., the addition of newer services, more bandwidth, and so on.
When you think of network design, you’re likely thinking about the much-discussed three-tier or three-layer design. The three-layer design is best suited to large enterprise campus networks. Those three layers are:
1. Core
2. Distribution
3. Access
• Core layer provides high-speed, high-bandwidth transport and acts as the backbone that ties the network building blocks together
• Distribution layer provides policy-driven connectivity and boundary control between the access and core layers. It is the boundary between the Layer 2 and Layer 3 domains.
• Access layer provides users access to the network
Each layer in the 3-tier architecture provides a distinct function and thus relies on a unique set of
features. The access layer is not just about connectivity but also feature richness up and down the
OSI stack.
The distribution layer is about the connectivity as well as policy, convergence, QoS and HA
features.
The core layer is about high-speed, high-bandwidth connectivity and less about features. It acts as the backbone for the network and glues all the network building blocks together. It also acts as an aggregation point for the distribution layer.
Two-layer design is a modified three-layer design where the core has been collapsed into the
distribution layer. The main motivation for the collapsed core has to do with cost and the
operational simplicity that it brings. It is best suited for small to medium-sized networks.
It is worth noting that the above discussion is about enterprise campus design and not enterprise
data center. The campus is where end-users connect to the network whereas the data center
provides connectivity to the servers and devices such as load balancers and storage arrays. Let
me summarize the key differences between the two network designs before we move on.
When designing an enterprise network, network engineers should try to include redundancy at
each layer. Let’s first discuss the broad HA and redundancy considerations.
• You should try to use Hot Standby Router Protocol (HSRP) or Gateway Load Balancing Protocol (GLBP) with sub-second timers for redundancy at the default gateway
• Avoid daisy-chaining switches and use StackWise and chassis-based solutions instead
• Avoid protecting against double failures and over-engineering with three or more
redundant links
• An L2 access design should be implemented with HSRP or GLBP, with the distribution layer as the L2/L3 boundary; it is known to provide sub-second convergence
• L3-based access, i.e., using a routing protocol in the access layer and L3 p2p routed links between the access and distribution switches, is recommended for sub-200-millisecond failover
Let’s now discuss some specific redundancy considerations by each campus network layer.
The purpose of default gateway redundancy, or first-hop redundancy, is to protect against a single node failure so that traffic from end hosts can continue flowing through an active default gateway device after a small, sub-second convergence.
In the hierarchical design that we have discussed so far, distribution switches define the L2/L3
network boundary and act as the default gateway to the entire L2 domain facing the access layer.
Without some form of redundancy in place, default gateway failure could result in a massive
outage.
HSRP, VRRP, and GLBP are three popular first-hop redundancy protocols for implementing
default gateway redundancy. HSRP and GLBP are Cisco proprietary, whereas VRRP is an IETF
standard based protocol defined in RFC 3768 and RFC 5798.
HSRP and VRRP are the recommended protocols and can provide sub-second failover with some tuning for redundant distribution switches. If you are using Cisco switches, best practices indicate that you would be better off using the feature-rich HSRP; however, VRRP is a must when your design requires vendor interop.
The configuration snippet below shows how you can use HSRP in an enterprise campus
deployment and achieve sub-second failover times.
interface Vlan100
description Data VLAN for Access-Switch
ip address 10.1.1.1 255.255.255.0
ip helper-address 10.1.2.1
standby 1 ip 10.1.1.2
standby 1 timers msec 200 msec 750
standby 1 priority 150
standby 1 preempt
standby 1 preempt delay minimum 180
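For designs that require vendor interop, the VRRP equivalent of the HSRP snippet above might look like the following sketch. The group number, addresses, and timer values here are illustrative assumptions rather than recommended values; note that, unlike HSRP, VRRP preemption is enabled by default on Cisco IOS.

interface Vlan100
description Data VLAN for Access-Switch
ip address 10.1.1.1 255.255.255.0
ip helper-address 10.1.2.1
vrrp 1 ip 10.1.1.2
vrrp 1 timers advertise msec 250
vrrp 1 priority 150
vrrp 1 preempt delay minimum 180

With VRRP, the master owns the virtual IP; the timers advertise command tunes the advertisement interval toward sub-second failover, and backup routers can learn the master's timers automatically.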
The STP/RSTP root should be the same device as the HSRP primary for a given subnet or VLAN. Without consolidating the HSRP primary and STP root on a single device, the link between the distribution switches can end up as a transit path where traffic to/from the default gateway takes multiple L2 hops. It is also recommended that the preemption delay be set to 150% of the time that it takes for the switch to boot up from scratch.
HSRP preemption needs to be coordinated with the switch boot time and overall connectivity to the rest of the network. If preemption and neighbor adjacency occur before the switch has L3 connectivity to the core, traffic will be blackholed until complete L3 connectivity is restored.
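One way to shrink this blackholing window, beyond tuning the preempt delay, is to tie the HSRP priority to upstream reachability with object tracking, so the switch only holds the active role when it actually has a path to the core. The track number, tracked route, and decrement value below are illustrative assumptions, not values from a specific design.

track 10 ip route 0.0.0.0 0.0.0.0 reachability
interface Vlan100
standby 1 track 10 decrement 60

If the tracked route disappears from the routing table, the HSRP priority drops by 60, allowing the peer distribution switch with working core connectivity to take over as the active gateway.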
GLBP protects traffic against device or circuit failure much like HSRP or VRRP, but in addition it also allows packet load sharing among a group of redundant routers. Before GLBP, you could only implement HSRP or VRRP hacks to get load balancing to work. For example, you could configure the distribution devices as alternate root switches and divide and direct traffic from the VLANs onto both. Yet another hack would have been to use multiple HSRP groups on a single interface and use DHCP to alternate between the default gateways. As you can see, none of these hacks are clean, and they could very easily become an administrative nightmare.
HSRP uses a virtual IP and MAC pair that is always assumed by the active router, whereas GLBP uses one virtual IP address with multiple virtual MAC addresses.
interface Vlan100
description Data VLAN for Access-Switch
ip address 10.1.1.1 255.255.255.0
ip helper-address 10.1.2.1
glbp 1 ip 10.1.1.2
glbp 1 timers msec 250 msec 750
glbp 1 priority 150
glbp 1 preempt delay minimum 180
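A configurable detail worth knowing is GLBP's load-balancing method, which defaults to round-robin across the group's virtual MAC addresses. As a sketch (the group number matches the snippet above; the choice of method is situational), you could instead pin each client to the same forwarder:

glbp 1 load-balancing host-dependent

Host-dependent balancing keeps a given host mapped to the same virtual MAC, which helps when stateful devices sit in the path, while the weighted option distributes traffic in proportion to configured weights.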
Let’s now wrap up the FHRP discussion with a side by side comparison table.
Today, most network devices can provide a level of intra-box high availability, i.e., in the form of redundant supervisors on platforms such as the Cisco Catalyst 6500, 4500, and Nexus 7K. When you have redundant supervisors, the box can also support Stateful Switchover (SSO), which ensures that the standby supervisor blade contains state information from the active blade and can thus switch over and become primary to assume the L2 forwarding function.
The Cisco Catalyst 6500 and N7K switches support L3 Non-Stop Forwarding (NSF), which allows redundant supervisors to assume L3 forwarding functions without tearing down and rebuilding L3 neighbor adjacencies in the event of a primary supervisor failure.
Now, in a hierarchical network design, the core and distribution nodes are connected via L3 p2p
links which means distribution or core related failures are about loss of link, i.e. if a supervisor
fails on a non-redundant device, the links fail and the network simply re-converges through the
second core or distribution device within sub-200 milliseconds when using EIGRP or OSPF.
With redundant supervisors, links are not dropped during an SSO/NSF convergence event if a supervisor were to fail. There will be a momentary interruption to traffic during SSO; once SSO is completed, NSF follows suit. This will obviously result in some downtime during re-convergence too.
If the L2/L3 boundary is in the access layer, i.e. a routed access design, then SSO/NSF can
provide an increased level of HA. If your access layer design is switched (or L2), then you
should consider using dual redundant supervisors with SSO.
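As a rough sketch of how SSO and NSF are enabled on a redundant-supervisor IOS platform (the exact commands and defaults vary by platform and software release, so treat this as an assumption to verify against your platform's documentation):

redundancy
mode sso
router ospf 1
nsf

Here mode sso enables stateful switchover between the supervisors, and nsf under the OSPF process enables Cisco NSF so that the box keeps forwarding on the existing FIB while OSPF adjacencies are gracefully restarted rather than torn down.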
Today, the campus WLAN is used to provide voice, video and data connectivity for employees,
internet access for guests, and connectivity for various sensors, cameras and Internet of Things
(or IoT) devices in general.
There are plenty of design principles and gotchas when it comes to WLAN deployment, so let’s
go over some of the high order bits before we dive into the actual WLAN deployment models.
• Coverage versus capacity: You have to provide adequate coverage because it's wireless, but when coming up with your design, be sure to focus on capacity. You can't simply add capacity by adding more APs later.
• Application throughput: Voice, video, and data all have different design requirements, so
be sure to understand the latency, jitter, and packet loss requirements for each interactive
application that will be used on your network
• Disable lower-rate 802.11 standards; if you can't, try to isolate them as much as possible
• Network security is a big deal when it comes to wireless, as RF is not something that you
can control, unlike wired connectivity. It is recommended to use WPA2-Enterprise with
policy server, AES encryption, and stringent rules for BYOD and guest networks.
• High availability: There are three main Cisco recommended methods of WLAN HA, i.e.
SSO, N+1, and WLAN controller link aggregation.
• Multicast support: Voice and video solutions frequently use multicast for efficient
delivery of traffic. Multicast in remote sites uses the underlying WAN and LAN support
for multicast traffic.
• Guest authentication: There are many ways to authenticate guest access; however, the common approaches include local and central WebAuth and CMX-based (aka Mobility Services Engine) guest onboarding.
Centralized (Local-Mode) Model
This model is primarily recommended for large site deployments. There are multiple benefits to centralized deployments.
You can connect the WLAN controller in a variety of ways, i.e., directly to a DC services module or block, or to a separate services block within the campus core or distribution layer in your 3-tier design. Wireless traffic is usually tunneled via the Control and Provisioning of Wireless Access Points (CAPWAP) protocol, which operates between the WLAN controller and the AP.
Thanks to its centralized nature, policy application is straightforward in this model because the
controller is your single point for managing L2 security and network policies thus providing a
unifying policy application across wireless and wired.
In terms of direct customer benefits, local mode delivers on the following requirements.
• It enables fast roaming so wireless users can roam all they want between floors and
buildings all over the campus.
• It can support rich media services with Call Admission Control (CAC) and multicast with
Cisco VideoStream.
• It allows for centralized policy enforcement where you can subject your traffic to the
firewall (L4-L7 inspection), network access control and classification.
For larger centralized or local-mode deployments, it is recommended that you use the Cisco
8540 or 5520 WLAN controllers. For smaller sites, Cisco recommends that you use a 3504
WLAN controller as a local on-site controller.
Distributed Model
The distributed WLAN architecture or design is where the wireless traffic load is distributed across various access points, so it is the opposite of the centralized or local-mode design as far as traffic handling is concerned. In this mode, you must reconfigure your access layer every time you add an AP. This model uses distributed forwarding, as data between the client and destination flows directly through the AP without the intermediary step of being tunneled through a controller, as is the case with the centralized approach. In this model, APs are referred to as fat APs.
Distributed WLCs are commonly connected to the distribution layer within the campus network, and in that case, Cisco doesn't recommend using an L2 connection to hook up the WLC, as that would require adding access layer features such as HSRP to the distribution layer. Cisco strongly recommends connecting the WLC via L3, which allows the WLAN configuration to be isolated on a single device, much like other access layer routing devices.
Controller-less Model
The controller-less deployment still uses controller functionality, but in a virtual or hosted fashion. With the rise of controller-less APs, you don't need a physical controller to drive a bunch of APs; you simply move the controller function, i.e., the control and management planes, to another entity on or off-site.
Controller-based Model
It is worth noting a little bit of WLAN history here. Remember, WLAN started with standalone APs with no centralized coordination of any sort whatsoever, so those APs were independent in every way you can imagine. You would need to log into each of them and hand-configure pretty much everything.
Second-generation APs were controller-based, or you could also call them dependent or tethered, i.e., they couldn't operate on their own without the centralized controller. WLAN controllers used to be expensive, so unless you could throw more controllers at the problem to achieve your desired HA, you pretty much had no choice but to live with the controller being a single point of failure.
Third-generation APs were also controller-based, but tunneled rather than tethered, which solved the single-point-of-failure issue since you could now build a tunnel all the way back to a controller regardless of its physical location. This still left a drawback: the controller remained in the forwarding path. You still needed expensive redundant controllers, and it didn't scale well in larger deployments.
Cloud-based Model
In this model, configuration, optimization, and mobility control are centralized and delivered to you as a cloud service from the provider's hosted data center.
This model provides significant benefits over legacy approaches primarily due to less hardware
to buy, install and maintain.
Cisco FlexConnect is the go-to model for multiple small remote sites or branches that connect
into a central site. FlexConnect provides a cost-effective solution where network engineers can
control remote APs from headquarters through the WAN.
A Cisco AP operating in FlexConnect mode can switch client data traffic with dot1q VLAN tags in order to segment traffic. This mode of operation is also known as FlexConnect local switching.
Optionally, in FlexConnect mode, you can also tunnel traffic back to the centralized controller
for example for guest access. FlexConnect can be deployed in either a shared or dedicated
controller model. Cisco WLAN 8500, 5500, and 3500 series controllers support both shared and dedicated modes of operation.
Before you use shared mode, make sure that your deployment meets the following requirements.
For HA purposes, you can deploy a pair of controllers in SSO configuration or N+1 arrangement
if you desire cross-site resiliency. Alternatively, you can also deploy dual resilient controllers
configured in an N+1 manner using the Cisco vWLC.
SD-Access Wireless is the fabric-enabled wireless solution that also fully integrates with a wired
SD-Access design. The primary benefit of SD-Access Wireless is that customers can have a
unified policy and experience across both wired and wireless mediums. In this design, the fabric
WLCs communicate wireless client information to the fabric control plane, and the fabric APs
encapsulate traffic into the VXLAN data path.
In the early days of WLAN deployments, the focus used to be mostly on providing maximum Wi-Fi coverage with the minimum AP count possible; fast forward to today, and coverage uniformity and cell-to-cell overlap are now the major areas of design consideration. This change has been mostly driven by applications such as voice and video, which have a lower tolerance for jitter and roaming delays.
Cisco’s best practices in designing and deploying location-aware WLANs include the following
components.
Further Reading
Cisco Campus LAN and WLAN Design Guide 7
First, we need to establish a definition of what's on-premise and what's in the cloud. On-premise
or "on-prem" refers to private data centers that companies either house in their own facilities or at a
colocation provider such as Equinix. Generally, when people refer to "cloud", they mean the
public cloud. It is a model where a cloud service provider, e.g. Amazon AWS or Microsoft
Azure, makes computing resources available to you.
As per the RightScale Cloud Survey8, the percentage of enterprises that have a multi-cloud
strategy grew to 84 percent, up from 81 percent in 2018, which is no surprise given the rise of
cloud adoption over the last 8 years or so. The worldwide public cloud services market is
projected to grow 17.5 percent in 2019 to an impressive total of $214.3 billion, up from $182.4
billion in 2018, according to the Gartner cloud forecast9.
As cloud adoption picked up, purchasing and maintaining on-premise infrastructure went from
an investment to a liability. While cloud and on-premise are two different deployment models,
7 https://bit.ly/31lkI2O
8 https://bit.ly/37OWnF1
9 https://gtnr.it/36QlxSi
behind the scenes, enterprises still have the singular goal of implementing a lean and agile IT
infrastructure that meets or exceeds the company's needs while optimizing cost. There are
three major types of infrastructure, or raw material if you like, that can be deployed on-premise
or used in the cloud, i.e.
• Networking infrastructure (routers, switches, firewalls, load balancers, and so on)
• Computing (x86 or ARM)
• Data Storage (traditional arrays or HCI)
Likewise, there are four popular cloud deployment models (public, private, hybrid, and multi-cloud) and three service models (IaaS, PaaS, and SaaS), covered in turn below.
There is no one-size-fits-all model to help you figure out whether on-premise or cloud is better
for your organization, so you'll need to perform due diligence to determine what would work
best. There are many ways we can slice and dice the two paradigms, so let’s start with some of
the key design components that are relevant to all enterprise deployments.
• Cost
• Security
• Agility and Scalability
• HA and Fault-tolerance
• Customization
• Compliance
On-Premise vs. Cloud
• Cost: on-premise is CAPEX-heavy with lots of upfront costs; cloud is OPEX, pay-as-you-go
In a nutshell, the cloud is here to stay (and grow), however, on-premise is not going away
anytime soon either. Let’s go over the core cloud concepts in a little more detail.
Cloud computing is the result of well-thought-out infrastructure built by the providers, in the
same way that electricity, water, and gas are the result of decades of infrastructure development
by the utility companies. Cloud computing is made available through network connections in the
same way that public utilities have been made available through networks of pipes and wires. All
clouds are scalable (resources are added as demand rises) and elastic (resources grow or shrink
as demand rises or falls).
As per Gartner10, AWS, Azure, and Google Cloud have about 47%, 22%, and 8% public cloud
market share respectively.
10 https://bit.ly/37SKIVE
Cisco’s definition of cloud outlines the following four aspects as must-haves for a cloud service,
i.e.
• On-demand means resources follow the demand pattern; they are provisioned and de-
provisioned as demand increases and decreases respectively
• At-scale means the cloud provider has enough supply of resources to meet demand from all
its customers, i.e. it provides cloud services at scale
• Multitenant means cloud services are inherently multi-tenant out of the box
• Elastic means that the corresponding cloud services will grow or shrink based on the
customer’s demand patterns
Before we dive into “design considerations” for each cloud deployment model, let’s first
understand each type of cloud that exists out there.
Public Cloud
The public cloud is defined as computing services offered by third-party providers, such as
AWS, over the public Internet, making them available to anyone who wants to use or purchase
them. Cloud services may be free or sold on-demand, i.e. pay-as-you-go, allowing customers to
pay only for the CPU cycles, storage, or bandwidth they use. Public cloud users simply
sign up for a service, use the resources made available to them, and pay for what they used
within a given amount of time.
As per the CLOUD VISION 2020 survey, digital transformation, IT agility and DevOps are the
top drivers for public cloud adoption.
Technically speaking, a public cloud is a pool of virtual resources that include computing,
storage, and networking, all developed from commodity hardware owned and managed by a
third-party provider such as AWS or Azure, that is automatically provisioned and allocated
among multiple customers in a multi-tenant fashion through a self-service interface. It’s an
economically compelling way to scale out workloads that experience unexpected demand
fluctuations.
A public cloud is the simplest form of all cloud deployments: a customer that needs more
resources, platforms such as servers or storage, or services simply pays a public cloud vendor
by the hour or the minute to get access to what's needed when it's needed. Infrastructure,
computing power, storage, and cloud applications are decoupled from the underlying hardware by
the vendor with the help of virtualization, orchestrated mostly by open-source management and
automation software. Connectivity to a public cloud generally happens via the Internet
(encrypted, obviously) but also through dedicated low-latency network connections available at
large colocation data centers, such as AWS Direct Connect11.
Private Cloud
The private cloud offers computing services either over the Internet or over a private
internal network such as a WAN. In terms of service offering, a private cloud is no different from a
public cloud, but it is dedicated to the needs and goals of a single enterprise as opposed to being
shared and multi-tenant. It is worth mentioning that a private cloud is defined by this dedicated
use rather than by its location: it can be located on-premise (internal) or off-premise (hosted).
11 https://aws.amazon.com/partners/directconnect/
Due to single-tenant provisioning and dedicated use of resources, private clouds deliver a higher
degree of control and customization, as well as a higher level of security and privacy. Private
cloud also provides better service SLAs and data security when hosted on-premise, in what is
known as an internal cloud.
One drawback of an internal cloud is that the company's central IT department is held
accountable for the cost of managing the cloud, leading to staffing, management, and
maintenance expenses similar to traditional data center ownership. Private clouds can
be either self-managed or provider-managed, e.g. by Rackspace.
Virtual private cloud (or VPC) is a private cloud carved out inside a public cloud for the sole
purpose of being used by a single tenant. It provides isolation of data both in transit and at-rest
resulting in enhanced security and data control.
The cloud provider will let you provision a cloud router and a firewall so you can connect remote or
on-premise resources to a VPC. AWS provides features such as security groups, ACLs, and flow
logs that capture information about the IP traffic going to and from network interfaces within
your VPC.
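As an illustration of what flow logs contain, the sketch below splits one record into named fields. The field list follows the default version-2 VPC flow log format; the sample record itself is fabricated for illustration:

```python
# Field names of the default (version 2) VPC flow log record format.
FIELDS = ["version", "account_id", "interface_id", "srcaddr", "dstaddr",
          "srcport", "dstport", "protocol", "packets", "bytes",
          "start", "end", "action", "log_status"]

def parse_flow_log(line):
    """Split one space-separated flow log record into a dict keyed by field name."""
    return dict(zip(FIELDS, line.split()))

# Fabricated sample record: TCP (protocol 6) traffic to port 443 that was accepted.
rec = parse_flow_log("2 123456789010 eni-0a1b2c3d 10.0.0.5 203.0.113.7 "
                     "49152 443 6 12 3480 1620000000 1620000060 ACCEPT OK")
print(rec["action"], rec["dstport"])  # ACCEPT 443
```

Filtering such parsed records by `action` is a quick way to spot traffic your security groups or ACLs are rejecting.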
Hybrid Cloud
A hybrid cloud combines the benefits of public and private clouds by allowing data and
applications to be shared between them. When workload demand changes, hybrid cloud
computing allows businesses the ability to seamlessly scale using public cloud and thus handle
overflow or demand bursts without giving third-party service providers access to the totality of
their data.
Hybrid cloud architecture is a best-of-both-worlds approach: it allows enterprises
to run critical workloads in the private cloud and lower-risk workloads in the public cloud, and to
allocate resources from either environment as desired in an automated fashion via APIs. It's a
setup that minimizes data exposure and allows medium to large enterprises to maintain a
scalable, elastic, and secure portfolio of IT resources and services.
Using a hybrid cloud helps companies eliminate the CAPEX investment needed to handle
short-term or seasonal spikes in demand, and lets them free up on-premise resources for more
sensitive data or applications when needed. In summary, hybrid cloud computing
delivers flexibility, scalability, elasticity and cost efficiencies with the lowest possible risk of
data exposure.
As per Cisco, there are five major challenges involved with deploying and managing a hybrid
cloud, i.e.
• Cloud management
• OPEX
• Security
• No common ground
• Lack of expertise within IT
Multi-cloud
Multi-cloud is not yet another cloud model per se, but a cloud deployment approach made up of
multiple cloud services, from multiple cloud service providers, public or private.
Unlike the hybrid cloud, multi-cloud by definition refers to the presence of multiple clouds of the
same type, e.g. two private clouds or two public clouds. The drivers behind the trend are
avoiding vendor lock-in, cost savings, performance, better defenses against Distributed Denial of
Service (DDoS) attacks, improved reliability, and the existence of shadow IT.
IDC predicted that more than 85% of enterprise IT organizations would commit to
multi-cloud architectures by 2018. Cisco also said that a small number of gigantic hyper-scale
data centers would hold just over half of all data center servers, and account for 69% of all data
center processing power and 65% of all data stored in data centers. Primarily based on multi-
cloud adoption, Gartner12 also expects that 80% of enterprises will have shut down their traditional data
centers by 2025, up from just 10 percent today.
Let us compare the two major cloud models side by side in the critical areas of capacity, control,
cost, service SLAs, security, and customization.
• Cost: in a public cloud, everything is OPEX; a private cloud involves CAPEX plus OPEX
12 https://gtnr.it/31ehcHm
Infrastructure as a Service (IaaS) is a cloud service model where compute, storage, and
networking resources and capabilities are owned and hosted by a service provider and offered to
customers on-demand. Customers can self-provision and self-manage this infrastructure using a
web-based self-service interface that serves as a single pane of management for the overall
environment.
Public cloud providers also facilitate REST API access to the infrastructure so IT departments
can manage those off-premise resources using their existing tools. IaaS solutions also form the
basis for PaaS and SaaS service models. Compared to SaaS and PaaS, IaaS users are responsible
for managing applications, data, runtime, middleware, and OSes. Providers still manage the
lower layers of virtualization (i.e. the hypervisor), physical servers, disk drives, storage volumes,
and network connectivity.
Platform as a Service (PaaS) allows developers to build, run, and manage applications in the
cloud. PaaS provides a framework that developers can use to build customized applications.
PaaS makes the dev/test and deployment of applications quick and cost-effective.
Popular examples of PaaS include Google App Engine (GAE), Microsoft Azure .NET, and
Heroku. GAE allows up to 25 free applications and unlimited paid ones; you can write your apps in
Java, Python, Go, PHP, and many other languages.
The cloud offers several unique benefits to developers and service users as opposed to using on-
premise, namely scalability, availability, collaboration, and portability.
Software as a service (or SaaS) is a way of delivering centrally hosted applications over the
Internet, as a service as opposed to locally installed. SaaS applications run on a SaaS provider’s
servers.
With SaaS, you don't have to install or maintain software; you simply access it via the Internet.
This allows enterprise IT teams to reclaim both the CAPEX and OPEX that would otherwise have
been spent managing the complex software and hardware supporting applications hosted on-
premise. The provider manages not only access to the application, but also its security,
availability, and performance. SaaS business applications are accessed by end-users using a web
browser over secure SSL transport. Popular examples of SaaS include Web email platforms such
as Gmail, CRM, HRM, and Unified Communication software such as Cisco WebEx.
Let us compare the three major cloud service models, side by side, across the critical areas of
cost, control, application software management, upgrades, security and data controls.
• Software stack security and data: mostly customer-managed (IaaS); a mix of provider- and
customer-managed (PaaS); mostly provider-managed (SaaS)
Consolidation
Consolidation is about breaking down resource silos and, as a result, reducing the number of
physical data centers. Consolidated resources include servers, storage, and networking.
Virtualization
Virtualization is the foundation technology behind a cloud where compute, storage and
networking resources can be dynamically partitioned, provisioned and assigned to applications.
Virtualized pools of compute and storage can be allocated on-demand and elastically far more
easily than their physical variants, for example a cloud server versus a bare-metal
server.
Automation
Service automation enters once resources are consolidated and virtualized. It allows an
intelligent network fabric to rapidly and automatically find and respond to application resource
requirements, i.e. provision processing, storage, and security resources on-demand.
Performance
To benchmark cloud performance, let's first break it down into its most fundamental components,
i.e. compute, storage, and networking, plus the overall SLA that your service provider guarantees
you depending on the service model. The performance of your instances in a cloud,
public or private, is bounded by the actual underlying hardware. In the case of public cloud, you
are sharing resources in a multi-tenant fashion, however, that doesn’t mean that you will not
receive your allocated resources. Now, if your application is consistently trying to request more
than what your chosen instance allows for, then your application performance may degrade, and
your only recourse could be to upgrade to a larger compute instance. On AWS, you can use
CloudWatch metrics to monitor your CPU load if your workload is running continuously
“hot”.
There is a lesser-known metric called CPU steal time, which is the percentage of time a vCPU
must wait for a real CPU. You can see your VM's steal time using the Linux "top" command. If
your VM consistently displays a high %st (steal time), CPU cycles are being taken away from
your VM to serve others. You may be using more than your share of CPU resources, or the
physical server where the VM is located may be oversubscribed. If steal time remains high, try
giving your VM more CPU resources or moving it to a different physical server. Applications that
need short bursts of CPU resources, e.g. Web apps, will see serious performance degradation
during periods of high steal time.
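The %st value reported by top can be derived from two samples of the cumulative CPU counters in /proc/stat. The minimal sketch below uses fabricated counter values for illustration:

```python
def steal_percent(before, after):
    """Compute CPU steal percentage between two samples of
    /proc/stat-style cumulative jiffy counters (dicts of name -> jiffies)."""
    delta = {k: after[k] - before[k] for k in before}
    total = sum(delta.values())  # all jiffies elapsed between samples
    return 100.0 * delta["steal"] / total if total else 0.0

# Two hypothetical samples taken one second apart on a busy VM:
t0 = {"user": 1000, "system": 300, "idle": 8000, "iowait": 50, "steal": 100}
t1 = {"user": 1040, "system": 310, "idle": 8030, "iowait": 52, "steal": 120}

print(round(steal_percent(t0, t1), 1))  # 19.6
```

A sustained value this high would be a strong signal to resize the instance or move it to a less contended host.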
You can monitor and benchmark cloud performance either by using the cloud provider's built-in
tools or by using a third-party tool such as CloudHarmony.
• Service SLA, monthly billing cycle: must be equal to or greater than four 9s (99.99%), i.e.
roughly 4 minutes of downtime per month
When you move your on-premise application into a PaaS cloud, you need to ensure that it
remains performant and secure. There are several considerations you will need to factor into your
decision that may be incrementally or entirely different from how you manage application
performance on-premise, inside your own data center.
For PaaS and SaaS, overall service SLA uptime is still useful but not enough on its own. With
PaaS and SaaS, you must consider the application performance because service SLA (or uptime)
is insufficient to guarantee an excellent end-user experience.
• Service SLA, monthly billing cycle: must be equal to or greater than four 9s (99.99%), i.e.
roughly 4 minutes of downtime per month
SaaS is one of the most mature segments of the cloud computing market, where the likes of
Salesforce, Workday, and WebEx dominate the CRM, HRM/HCM, and Unified Communications
verticals respectively. SaaS provides faster time to market, reduces TCO, and comes with
seamless upgrades. However, those benefits don't hold much value if your SaaS applications are
unable to provide a quality user experience. In the case of SaaS, your cloud provider can't dictate
what constitutes acceptable performance for your business.
• Service SLA, monthly billing cycle: must be equal to or greater than three 9s (99.9%), i.e.
roughly 43 minutes of downtime per month
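These downtime budgets follow directly from the SLA percentage. For a 30-day billing month:

```python
def monthly_downtime_minutes(sla_percent, days=30):
    """Maximum allowed downtime per month for a given SLA percentage."""
    total_minutes = days * 24 * 60  # minutes in the billing month
    return total_minutes * (1 - sla_percent / 100.0)

print(round(monthly_downtime_minutes(99.99), 1))  # four 9s  -> 4.3 minutes
print(round(monthly_downtime_minutes(99.9), 1))   # three 9s -> 43.2 minutes
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why SLA negotiations hinge on that digit.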
Cloud scalability refers to adding incremental resources as the workload increases, whereas
elasticity refers to both provisioning and de-provisioning (shrinking) resources as the workload
increases or decreases. You can achieve elasticity in AWS by configuring auto-scaling for your
EC2 instances.
With most IaaS cloud providers, you should be able to configure auto-scaling in a variety of
different ways, i.e.
• You can use the auto-scaling group to maintain a minimum or a specified number of
running instances at all times
• You can use manual scaling to specify the maximum, minimum, or desired capacity of
your auto-scaling group
• You can also perform scaling by calendar schedule, i.e. scaling actions are performed
automatically as a function of time and date
• You can also do scaling by policy, where you define the parameters that in turn control
the scaling
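A scale-by-policy rule of the kind described above ultimately clamps a desired instance count between the group's minimum and maximum. A simplified sketch follows; the function name, thresholds, and defaults are illustrative, not any provider's API:

```python
def desired_capacity(current, avg_cpu, minimum=2, maximum=10,
                     scale_out_at=70, scale_in_at=30):
    """Return the new instance count for a CPU-based scaling policy:
    add an instance above the high-water mark, remove one below the
    low-water mark, and always stay within [minimum, maximum]."""
    if avg_cpu > scale_out_at:
        current += 1
    elif avg_cpu < scale_in_at:
        current -= 1
    return max(minimum, min(maximum, current))

print(desired_capacity(4, avg_cpu=85))  # hot: scale out to 5
print(desired_capacity(4, avg_cpu=20))  # idle: scale in to 3
print(desired_capacity(2, avg_cpu=20))  # already at the minimum: stays 2
```

The clamping at the end is what makes the first bullet (maintaining a minimum number of instances) hold even while a policy drives scale-in.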
High availability in the cloud is achieved by creating multiple instances of your application and
directing traffic (or load) to them via elastic load balancing. For fault-tolerance, you can also
place those instances in different Availability Zones (or AZs) within or across
regions.
The IaaS cloud deployment model provides you with the most control over your application's
scalability and high availability, as opposed to PaaS and SaaS, where you naturally depend on
your provider. With PaaS and SaaS, infrastructure such as compute, storage, and
networking is pre-provisioned for you. In the case of PaaS, you can create multiple instances in
separate geographies with shared storage; in the case of SaaS, your provider configures
and maintains scalability and HA for you.
Security has long been the #1 issue cited as hindering cloud adoption. IT teams must
consider the needs for security and compliance based on the business and government policies
for the given vertical (e.g. financial vs. technology) and geography. The Cloud Security Alliance (or
CSA) publishes best practices for providing security assurance and compliance for cloud
computing.
Cloud security primarily refers to either security to the cloud or security for the cloud. Keep in
mind that any security solution technically can be delivered from the cloud. Cloud security
threats fall into four main categories.
• Malware and ransomware
• Gaps in visibility and coverage
• Compromised accounts and malicious insiders
• Data breaches and compliance
Cloud security requires a coordinated approach across the three main pillars of networks,
endpoints, and the cloud. Each of these pillars has a specific set of deployment form-factors that
can be utilized based on an organization’s needs.
Depending on the cloud deployment model, responsibility for security in the cloud is shared
between your central IT department as well as your cloud provider.
Many security threats are associated with the data itself, i.e.
• Confidentiality
• Access controls
• Integrity
Likewise, there are numerous laws and regulations out there relating to the storage and use of
data. In the US, these privacy protection laws include the following.
• PCI-DSS
• HIPAA
• SOX
• FISMA
To meet those compliance requirements, cloud providers need to maintain the following.
• Business continuity and data recovery
• Log and audit trail
• Geo-fencing and data placement (including GDPR in Europe)
Workload Migration
Cloud workload migration is entirely dependent on your cloud deployment model, i.e. IaaS, PaaS,
or SaaS. As per the Cloud Vision 2020 survey, it is estimated13 that over 80% of enterprise
workloads will be in the cloud by 2020.
In the case of IaaS, your workload consumes resources in the form of compute and storage units,
whereas a PaaS/SaaS workload is about the software stack. The workload here refers to one of
the following entities.
• Physical machine
• Virtual machine (VM)
• Container
• Application + Data
The following are the common drivers for on-premise to cloud workload migration.
• Decreasing OPEX (ability to match supply and demand in an automated fashion)
• Improving workforce productivity (no wait for infrastructure and out-of-the-box services)
• Eliminating CAPEX (no hardware refresh)
• Improved scalability and HA (multiple zones, regions across the world)
• Faster Time to Market (TTM)
Workload migration to the cloud needs to be driven by the business value that would have been
gained by the cloud user (e.g. better user experience) as well as the central IT (e.g. reduced cost,
avoiding vendor lock-in, improved HA, etc.). You need to gain a thorough understanding of
which migration strategy will be best suited for your existing infra and apps.
13 https://bit.ly/38RtdVM
The following are the common migration strategies for on-premise to cloud workload migration
broken down by the cloud deployment models.
The global SD-WAN market size is expected to grow from $1B in 2018 to $4.1B by 2023. The
number of cloud-based applications increases traffic in the network and SD-WAN provides
better cloud connectivity than MPLS does. As a result, more and more enterprises are expected
to opt for SD-WAN. On the other hand, the deployment of SD-WAN also enables network
operators such as AT&T and Verizon to save capital and reduce OPEX. Today, Cisco (formerly
Viptela) and VMware (formerly VeloCloud) are two major players in the SD-WAN market.
Cisco SD-WAN is a cloud-delivered WAN overlay architecture that builds on SDN principles
(proven in the data center) and extends them to meet the unique design requirements of the WAN.
The overall Cisco solution is broken into orchestration, management, control, and data planes.
vBond Orchestrator
vBond is a software-based component that is responsible for the initial authentication of vEdge
devices and orchestrates vSmart and vEdge connectivity. It also plays an important role in
ensuring that vEdge devices sitting behind a NAT device can function.
It is what makes up the orchestration plane and delivers the zero-touch provisioning. When a
vEdge device boots up the very first time, vBond is responsible for onboarding the device into
the SD-WAN fabric.
vManage
vManage is what makes up the management plane within the Cisco SD-WAN architecture, and it
provides the GUI through which you configure the SD-WAN solution. Network
engineers can perform configuration, provisioning, monitoring, and even troubleshooting via
vManage. vManage supports single- and multi-tenant dashboards, for enterprise customers
and service providers respectively.
vManage is what you use to configure and maintain Cisco SD-WAN devices and required
connectivity among all solution components in the underlay as well as the overlay networks.
vSmart Controller
The vSmart controller is a software-based component responsible for the centralized SD-WAN
control plane, much like what a traditional SDN controller does in a data center or campus
SDN LAN fabric.
The vSmart controller maintains a secure connection to each vEdge device and distributes routes
and policy information via the Overlay Management Protocol (or OMP), working much like a BGP
route reflector. It also helps orchestrate secure data plane connectivity among the vEdge
devices (the remote branches) by distributing the crypto keying information without the use of
IKE.
The Cisco vSmart controller is, in a sense, the brain of the solution. You still create your policies
within vManage, but the vSmart controller is what enforces them. You can also think of the vSmart
controller as the sole controller of traffic traversal within the SD-WAN fabric.
vEdge Devices
vEdge devices are mostly Cisco routers that make up the SD-WAN fabric by forwarding traffic.
Cisco WAN edge routers come in both physical and virtual form factors and are chosen by
network engineers based on the actual branch requirements, i.e. connectivity type, throughput or
bandwidth, and other functional needs.
vEdge routers are responsible for encryption and QoS, as well as running routing protocols such
as BGP and OSPF. As of this writing, for the access layer, we have the Cisco ISR1K,
ISR4K, and vEdge 100/1000 as possible choices for vEdge devices. For the distribution layer,
we can choose between the vEdge 2000/5000 and ASR1K platforms. In terms of virtual vEdge
platforms, you can choose among the ISRv, CSR1Kv, and vEdge Cloud.
Now that we have covered the Cisco SD-WAN components and their functions within the
SD-WAN fabric, let us compare the two solutions, i.e. the age-old traditional WAN versus SD-
WAN.
Cisco SD-WAN virtual IP fabric transforms a complex legacy network into an easy-to-manage,
scalable network in five steps:
1. Separate transport from the service side of the network.
2. Centralize routing intelligence and enable segmentation.
vEdge devices and the vSmart controller connect via a TLS-based control plane. You can configure
certificates within vManage for encrypted and authenticated communication. Each vEdge device
can send traffic directly to another without the two exchanging any reachability information
themselves. To provide scale, the SD-WAN solution uses the proprietary Overlay Management
Protocol (or OMP), which carries QoS, routing policy, multicast, and IPsec key information.
The Cisco SD-WAN solution supports IPsec ESP and GRE encapsulations for its overlay network.
No IKE is needed since key exchange is handled by OMP; the absence of IKE helps speed
up vEdge-to-vEdge tunnel setup time. The solution can classify traffic based on ports, protocols,
IP addresses, and IP DSCP values, as well as on applications. Policies are
configured via the vManage dashboard and, once done, are communicated to the vSmart controller,
which in turn communicates them to the vEdge devices. If a vSmart controller goes down, affected
vEdge devices continue with the last known good configuration.
Zero-touch provisioning relies on vBond, which allows vEdge devices to connect to the vSmart
controller without any prior configuration. Each vEdge device contains an SSL certificate, and all
vEdge devices must be trusted by vManage before they can be managed by the vSmart
controller.
For brownfield deployments, the solution allows integration with VRFs/VLANs by using a 4-byte
shim header known as the label, which is part of each packet in the overlay and functions as a sort
of membership ID.
Here are the resource requirements for the vBond, vEdge, vManage, and vSmart controller
components. Please note that the vBond, vManage, and vSmart controller requirements depend on
the number of vEdge devices that need to be provisioned.
Cisco provides complete hardware and software installation guidelines and requirements14.
• SSD (GB): 10 / 10 / 20 / 16
• Bandwidth (Mbps): 1 / Up to 2 / 25 / 2
• Hypervisor: ESXi / KVM
Further Reading
SD-WAN Cloud Scale Architecture15
14 https://bit.ly/31eQSwP
15 https://bit.ly/2vAp4XM
Cisco SD-Access defines a single network fabric that extends from the access layer to the cloud.
Thanks to policy-based automation that can be configured for users, devices and things, you
don’t have to compromise on data visibility or security within the network.
The integration between SD-Access and SD-WAN requires carrying the access layer metadata, in
the form of SGT tags, inside the IPsec packets sent over the SD-WAN overlay. Cisco
DNA Center does for SD-Access what vManage does for SD-WAN. SD-Access border routers
perform both the WAN edge and SD-Access terminations. Having said that, the WAN edge and SD-
Access functions are governed by vManage and Cisco DNA Center respectively. vManage and
DNA Center are two different products, but they are integrated via their APIs to achieve a
consistent and comprehensive experience.
The SD-Access fabric uses VXLAN in addition to the Locator/ID Separation Protocol (or LISP),
which serves as the control plane protocol and performs endpoint-to-location mappings. LISP
allows routing based on an Endpoint Identifier (or EID) and a Routing Locator (or RLOC), which
represent the endpoint's IP or MAC address and the location's IP address respectively. The LISP
architecture requires mapping servers that store and resolve
EIDs to RLOCs. The RLOCs are part of the network underlay routing domain, whereas EIDs
operate independently of location. Much like DNS name resolution, the source
RLOC queries the mapping servers to get the destination RLOC IP for traffic encapsulation.
LISP instance IDs are used to maintain independent topologies (or shall we say VRFs), and they
are mapped to VXLAN VNIs from the data plane perspective.
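Conceptually, the map-server lookup works like a DNS query. The toy sketch below uses made-up EID and RLOC values to illustrate the resolution step:

```python
# Toy LISP map-server: EID (endpoint prefix) -> RLOC (fabric edge loopback).
# All addresses are illustrative, not from any real deployment.
MAP_SERVER = {
    "10.1.1.10/32": "192.0.2.1",   # endpoint behind fabric edge 1
    "10.1.1.20/32": "192.0.2.2",   # same subnet, different edge (stretched)
}

def resolve_rloc(eid):
    """Resolve an EID to the RLOC used as the VXLAN tunnel destination."""
    return MAP_SERVER.get(eid)     # None models a negative map-reply

print(resolve_rloc("10.1.1.10/32"))  # 192.0.2.1
print(resolve_rloc("10.1.1.20/32"))  # 192.0.2.2
```

Note that the two endpoints share a subnet yet resolve to different RLOCs, which is exactly the subnet-stretching behavior discussed next.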
Decoupling endpoint identity from location makes it possible to have IPs from a given subnet
behind multiple L3 gateways, a phenomenon known as subnet stretching, in contrast to the 1:1
nature of traditional IP routing.
SD-Access uses an overlay network with a fabric data plane by way of Virtual Extensible LAN
(or VXLAN) encapsulation. VXLAN encapsulates and transports L2 frames across the underlay,
where each overlay is identified by a VXLAN Network Identifier (or VNI). VXLAN also carries the
SGTs required for segmentation.
Using VXLAN, you tunnel L2 frames in IP/UDP across an L3 network, where each overlay is termed
a VXLAN segment. As defined in RFC 7348, the VXLAN VNI is a 24-bit field, which means you
can have up to 16M VXLAN segments. The SD-Access fabric uses a modified VXLAN header where
16 reserved bits are carved out for carrying SGT tags (2^16 = 64K groups), using what's
known as the VXLAN-GPO format.
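To make the header layout concrete, the sketch below packs a 24-bit VNI and a 16-bit SGT into the 8-byte VXLAN-GPO header. The flag byte shown is an assumption of a minimal header with only the G (group policy present) and I (valid VNI) bits set; real fabric encapsulation carries additional policy bits.

```python
import struct

def vxlan_gpo_header(vni, sgt):
    """Build an 8-byte VXLAN-GPO header: a flags byte with the G and I
    bits set, a reserved byte, a 16-bit Group Policy ID (the SGT),
    a 24-bit VNI, and a trailing reserved byte."""
    assert 0 <= vni < 2**24 and 0 <= sgt < 2**16
    flags = 0x88                         # G bit (0x80) + I bit (0x08)
    # !BxHI = flags byte, zero pad byte, 16-bit SGT, then VNI shifted
    # left 8 bits so the last byte of the 32-bit word stays reserved.
    return struct.pack("!BxHI", flags, sgt, vni << 8)

hdr = vxlan_gpo_header(vni=0x123456, sgt=100)
print(hdr.hex())  # 8800006412345600
```

Reading the hex dump back: `88` carries the flags, `0064` is SGT 100, and `123456` is the VNI, showing how the 16 formerly reserved bits now transport group policy.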
Before we move on, it is worth noting that Cisco SD-Access 1.0 solution breaks down the overall
solution into five basic layers, i.e.
The “software-defined” connotation means that decisions about how traffic is steered among the
sites in the WAN are determined by policy, which helps the WAN adapt to real-time conditions
as opposed to being dictated by a pre-configured, fixed, or static configuration. The
“wide area” part refers to the fact that the sites/networks you are looking to connect aren't local
to one another. A related term, SD-Access, refers to the edge or
access part of the network, bringing in users in a software-defined fashion over wired,
wireless, or VPN transport.
Cisco’s SD-Access is an intent-based network solution for the enterprise. The intent-based
network is envisioned as a single system that provides the translation and validation of business
goals (referred to as “intent”) into actual network policies and insights. SD-Access provides
automated services such as network segmentation, QoS, analytics for everything that connects
into the network, for example, a user, device, or application traffic.
SDA is needed because while network requirements have changed, the underlying technologies and operations have not, leading to slower service rollouts. As per Cisco, the following are the direct benefits of the SD-Access solution.
• Automated policy and provisioning for the management of wired/wireless network
• Automated group-based policies and network segmentation
• Faster issue resolution and capacity planning
• Open programmable interfaces for integration with third-party solutions
SDA is implemented with Cisco DNA Center, which brings together all the elements of design and policy definition for a wired and wireless network. We can divide the overall solution into two layers, i.e.
• SDA Fabric
• DNA Center
2020 © CCIEin8Weeks.com 64
The SDA network fabric is made up of an overlay and an underlay. This provides a clear separation of responsibilities and simplifies deployment and operations, since policy changes only affect the overlay. The underlay consists of all the physical devices, such as switches, routers, and wireless LAN controllers, glued together with a layer 3 routing protocol.
Cisco DNA Center LAN automation uses an IS-IS routed design. IS-IS is protocol-agnostic, can work with loopback interfaces (without requiring an address on each L3 link), and supports an extensible TLV format for future use cases.
SDA fabric overlay is the logical or virtual topology and consists of three main components, i.e.
• Fabric data plane, VXLAN and Group Policy Option (GPO)
• Fabric control plane, LISP performs the mapping and resolution of users and devices
associated with VXLAN VTEPs
• Fabric policy plane, with Scalable Group Tags (SGT), business goals or intent are
translated into a network policy
SDA can instantiate logical network policy depending on what's supported by the network fabric. Those services include security, QoS, and application visibility, among others. Cisco DNA Center acts as the central management plane for building and maintaining the SDA fabric. DNA Center provides two key functions, i.e. automation and assurance.
You can define and manage SDA group-based policies using DNA Center automation. It also
integrates with Cisco ISE to provide host onboarding and policy enforcement. Network
assurance helps quantify network availability along with a comprehensive set of network
analytics. DNA Center assurance collects a variety of telemetry, in the form of SNMP, NetFlow
and Syslog as well as using NETCONF and streaming telemetry.
The Cisco SDA solution is supported on much of the Cisco switching product line, including the Nexus 7000 and Catalyst 3650 series switches.
SD-Access fabric edge nodes are much like the access layer in a traditional campus LAN topology. The edge nodes use an L3 access design, with the addition of a few functions such as endpoint registration, mapping of a user to the virtual network, anycast L3 gateway, LISP forwarding, and VXLAN encapsulation/decapsulation.
Further Reading
SD-Access Design Guide16
The concept of converged access includes both wired and wireless QoS components. The primary role of QoS in mixed voice, video, and data networks is to help minimize packet loss and buffer overruns during instantaneous, sub-second congestion. QoS, when applied at the campus edge, also helps maintain a consistent user experience across the entire network.
There are a few key design principles related to QoS configuration within a campus network.
16 https://bit.ly/37Q7y03
QoS Components
There are seven main components of any QoS configuration, regardless of whether that’s for the
campus or WAN edge.
1. Traffic classification. It classifies each incoming IP packet into a specific class based on either the packet header or the payload. This classification can be done with an ACL.
2. Assignment to traffic Queues (Queuing). It assigns incoming packets to a queue for
handling as a function of the traffic class.
3. Policing. Policing is a way of ensuring that no traffic exceeds the maximum rate (in
bits/second) that you’ve configured.
4. Priority Queuing. It uses an LLQ priority queue on an interface. It can either be based on
standard priority queuing (also known as Low Latency Queuing or LLQ) or hierarchical
priority queuing (also known as H-QoS).
5. Shaping. To manage two networks with differing speeds, you can configure shaping. It uses buffering and a token bucket algorithm to regulate data flows. When bursty traffic exceeds the configured shaped rate, packets are queued and transmitted after a small delay.
6. Marking and mutation. Marking is used on traffic to convey information to the
downstream devices in the network.
7. Trust. It enables traffic to pass through a networking device with DSCP or CoS or IPP
encoding done by the endpoints.
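Policing (component 3) and shaping (component 5) both rest on a token bucket. The Python sketch below, with made-up rates, shows the conformance decision; a policer would drop or remark a nonconforming packet, whereas a shaper would buffer it and send it later:

```python
class TokenBucket:
    """Minimal token-bucket sketch of the rate-limiting idea behind policing/shaping.

    Tokens accrue at `rate` bytes/sec up to `burst`; a packet conforms when
    enough tokens are available. A policer drops or remarks nonconforming
    packets, while a shaper would queue them instead (not modeled here).
    """
    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0          # convert bits/sec to bytes/sec
        self.burst = burst_bytes
        self.tokens = burst_bytes           # start with a full bucket
        self.last = 0.0

    def conforms(self, size_bytes: int, now: float) -> bool:
        # refill tokens for the elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True
        return False
```

With an 8 kbps rate and a 1500-byte burst, a second back-to-back 1500-byte packet half a second later would not conform, matching the intuition that bursts beyond the contract are policed or delayed.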
QoS Policy
Cisco routers and switches use two primary concepts to achieve QoS configuration as defined by
the network engineers within the Cisco MQC framework.
• Class-maps
• Policy-maps
A class map is a way to name or label a specific traffic flow and separate it from the rest. A policy map, on the other hand, specifies which traffic classes to act on and what actions to take on each.
Policy maps are not effective until they are attached to a port or an interface. They can be applied to a physical port or interface, as well as to VLAN or tunnel interfaces.
IP packet switching is the process by which two end hosts communicate with each other. Much like the network layer and IP addresses, the data link layer has its own link-layer addresses, known as MAC addresses for Ethernet. If two end hosts are on the same IP subnet, they do not need a default gateway or a router to communicate with each other. One end host can use ARP to find the other's MAC address on an Ethernet segment, and then simply transmit the packet to the destination host. However, when the destination host is on a different subnet than the source host, the source host will simply send that packet off to the default gateway or router. If the router knows how to reach that destination IP subnet, via a static route or a dynamic routing protocol, the router will rewrite the L2 information and send the packet off to its destination host using the appropriate outgoing interface.
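That first forwarding decision a host makes can be sketched in a few lines of Python (all addresses here are hypothetical):

```python
import ipaddress

def next_hop(dst_ip: str, local_prefix: str, gateway: str) -> str:
    """Decide where a host sends a packet: directly (same subnet) or via gateway."""
    if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(local_prefix):
        return dst_ip      # same subnet: ARP for the destination and send directly
    return gateway         # different subnet: hand off to the default gateway
```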
In the earliest days of networking, Cisco routers switched packets from incoming to outgoing
interfaces using process switching which was slow due to the CPU overhead involved.
Eventually, Cisco streamlined the process with fast switching and then finally CEF switching.
Process switching is the switching mechanism in which a general-purpose CPU, e.g. a PowerPC or x86 processor, on a router is used to switch packets. In classic IOS, there is an input_process that runs on the CPU for processing all incoming packets. Today, process switching is limited to a handful of specific scenarios, such as the following, while everything else gets CEF switched, whether in software (i.e. CPU) or hardware (i.e. network processor) depending on the platform.
• Packets sourced or destined to the router i.e. traffic destined to the control plane such as
routing protocol packets
• Packets that are too complex for the hardware to handle, for example, packets with IP
options set
• Packets that require extra information such as ARP resolution etc.
The routing table, also known as Routing Information Base (or RIB) is built from information
gained from either directly connected interfaces and/or static or dynamic routing protocols.
Cisco Express Forwarding (or CEF) is a Cisco proprietary switching method developed back in
the 1990s to keep pace with the modern high capacity and low latency networks. Today, it is the
default switching method across all Cisco routers, switches, and even appliances. CEF can be
done in software or hardware. CEF can be implemented on both centralized (e.g. ISR 4K or
ASR1K) as well as the distributed (e.g. Cisco ASR9K or Nexus 7K) forwarding platforms.
Please note that the concepts of centralized and distributed forwarding are orthogonal to whether a platform is software-based or hardware-based.
Software-based CEF
Software-based CEF implies CEF processing done using a general-purpose processor as opposed
to using an ASIC or a network processor. CEF consists of two major components, i.e.
Forwarding Information Base (or FIB) and Adjacency table.
The FIB is built directly from the routing table and contains next-hop IP addresses for each
destination IP prefix. It is updated when a routing or topology change occurs.
The adjacency table contains the MAC addresses and egress interfaces of all directly connected next hops and is populated from the ARP table (for Ethernet media).
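A toy Python model ties the two structures together: a longest-prefix match against the FIB picks the next hop, and the adjacency table supplies the L2 rewrite. Prefixes, next hops, MACs, and interface names are all hypothetical:

```python
import ipaddress

# Hypothetical FIB (prefix -> next hop) and adjacency table (next hop -> MAC, egress)
FIB = {
    "10.0.0.0/8":  "192.168.12.2",
    "10.1.0.0/16": "192.168.13.3",
    "0.0.0.0/0":   "192.168.12.2",
}
ADJACENCY = {
    "192.168.12.2": ("0000.0c12.0002", "Gi0/1"),
    "192.168.13.3": ("0000.0c13.0003", "Gi0/2"),
}

def cef_lookup(dst: str):
    """Longest-prefix match in the FIB, then L2 rewrite info from the adjacency."""
    addr = ipaddress.ip_address(dst)
    best = max((ipaddress.ip_network(p) for p in FIB
                if addr in ipaddress.ip_network(p)),
               key=lambda n: n.prefixlen)
    next_hop = FIB[str(best)]
    mac, egress = ADJACENCY[next_hop]
    return next_hop, mac, egress
```

A destination in 10.1.0.0/16 matches both 10.0.0.0/8 and the more specific /16; the /16 wins, which is exactly the longest-match behavior CEF implements in hardware or software.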
Hardware-based CEF
Hardware-based CEF is where forwarding is done with the help of ASIC(s) or network
processor(s). It can be either centralized (e.g. ASR1K) or distributed (e.g. ASR9K or CRS-1).
The primary advantage of distributed forwarding is that the packet throughput is improved even
more so by offloading forwarding tasks to the egress line card(s).
Ternary Content Addressable Memory (or TCAM) is a type of CAM that can operate with 0, 1, and X, where X is a "don't care" bit that matches either a 0 or a 1, giving it more flexibility in searching through memory locations within the CAM. When a frame is received on a switch port with TCAM, a copy of the first 200 bytes of the packet is sent to the forwarding controller, which helps perform the actual lookups within the TCAM. Those 200 bytes are enough to make all necessary forwarding decisions based on VLANs, egress ports, etc.
Each TCAM entry comprises three components, i.e. Value, Mask, and Result. The X value that I referred to earlier is organized by the mask, where each unique mask can represent up to eight values. The mask/value pairs are evaluated simultaneously in order to find the best (longest) match within a single TCAM lookup operation. In the case of an ACL, once a source/destination mask pair reaches eight values, a new mask pair is created so another eight values can be stored.
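In software, the value/mask/result idea looks like the sketch below. Real TCAM hardware evaluates every entry in parallel in one lookup; this Python loop only emulates the outcome, with hypothetical entries:

```python
def tcam_match(key: int, entries):
    """Sketch of a single TCAM lookup over (value, mask, result) entries.

    Mask bits set to 1 must match exactly; 0 bits are the 'X' (don't-care)
    state. Hardware checks all entries simultaneously; here we collect the
    hits and return the result of the most specific (longest) mask.
    """
    hits = [(mask, result) for value, mask, result in entries
            if (key & mask) == (value & mask)]
    if not hits:
        return None
    # prefer the most specific entry, i.e. the mask with the most 1-bits
    return max(hits, key=lambda h: bin(h[0]).count("1"))[1]
```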
TCAM resources are typically partitioned into separate regions for different lookup types, i.e.
• Layer 2
• Layer 3
• QoS Access Control Elements (or individual permit/deny statements)
• Security Access Control Elements (or individual permit/deny statements)
• IPv6
There are some notable differences between high-end and low-end switching platforms when it
comes to their use and size of TCAMs. For example, higher-end switches come with larger
TCAM sizes and do not make use of Switching Database Manager (or SDM) templates.
The MAC address table, also known as the CAM table, is used on switches to find the egress port for frame forwarding. The MAC address table timeout is five minutes by default on Cisco switches, so an entry is held in the table only for that amount of time before the timeout expires and the entry is removed.
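The aging behavior can be sketched as follows; the caller supplies timestamps, so no real clock is involved:

```python
class MacTable:
    """Sketch of a switch CAM table with the default 300-second aging timer."""
    AGING = 300.0

    def __init__(self):
        self.table = {}                       # mac -> (port, last_seen)

    def learn(self, mac: str, port: str, now: float):
        self.table[mac] = (port, now)

    def lookup(self, mac: str, now: float):
        entry = self.table.get(mac)
        if entry is None:
            return None                       # unknown MAC: frame would be flooded
        port, last_seen = entry
        if now - last_seen > self.AGING:      # stale entry: age it out
            del self.table[mac]
            return None
        return port
```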
FIB is used for forwarding but is derived from the combination of RIB and adjacency table so
that L2 information in each outgoing frame can be rewritten.
The RIB and FIB compare as follows.
• Architecture: the RIB is the IP routing table (best administrative distance only); the FIB is the CEF table.
• Data structure: the RIB is a repository of routes; the FIB is a repository of interface IDs and next-hop information for each destination prefix.
• Plane of operation: the RIB belongs to the routing plane; the FIB to the forwarding plane.
The RIB can be local to a routing protocol, as is the case with OSPFv217. The OSPFv2 local RIB acts as the primary state management data structure for SPF computation, which minimizes churn within the global RIB and leads to fewer packet drops. The global RIB is updated only when routes are added, deleted, or changed. By default, the global RIB is used to compute inter-area routes, NSSA routes, and forwarding addresses for type 5 and 7 LSAs.
17 https://bit.ly/2ScHxkS
Chapter Summary
• The access layer is not just about connectivity but also feature richness up and down the
OSI stack.
• vSmart controller is a software-based component and is responsible for the centralized
SD-WAN control plane much like what a traditional SDN controller would do in a data
center or campus SDN LAN fabric.
• vEdge devices and the vSmart controller connect via a TLS-based control plane
• SDA network fabric is made up of an overlay and an underlay.
• Ternary Content Addressable Memory (or TCAM) is a type of CAM that can operate with 0, 1, and X (hence "ternary"), where X matches either a 0 or a 1
• The MAC address table, also known as the CAM table, is used on switches to find the egress port for frame forwarding.
• The global RIB is updated only when routes are added, deleted or changed.
CHAPTER 2 VIRTUALIZATION
This chapter covers the following exam topics from Cisco's official 350-401 V1.0 Enterprise Network Core Technologies (ENCOR) exam blueprint.
Without virtualization, IT organizations would be forced to deploy many more servers to keep pace with today's high storage and processing demands. Each of these servers may be operating at only a fraction of its capacity, thus wasting precious resources.
Virtualization enables IT organizations to run multiple applications, across different operating systems, on a single server. This results in economies of scale and better efficiency.
The Type-1 hypervisor runs on the physical hardware of the host machine. It doesn't have to load an underlying operating system first. With direct access to the underlying hardware and no other software in between, these are the most efficient hypervisors, running directly atop bare metal.
A Type-2 hypervisor is typically installed on top of an existing operating system and is known as a hosted hypervisor. Type-2 hypervisors are generally not used for data center computing and are reserved for client or end-user environments, where performance and security are of lesser concern.
Before we move on, it is worth noting that the terms hypervisor and Virtual Machine Monitor (or VMM) are often used interchangeably, but they do not refer to exactly the same piece of software. A hypervisor (or more precisely, a Type-1 hypervisor) includes the VMM as well as the device model. You can think of the VMM as the software responsible for setting up VMs and handling I/O access for the guest OS. The VMM ensures that guest OS execution behaves pretty much identically whether running on top of the VMM or on bare metal. It is also responsible for efficient program execution as well as managing all hardware resources.
The device model is the other part of the hypervisor; it provides I/O interfaces for VMs by way of I/O virtualization. The VMM delegates I/O requests to the correct device model. You can think of vNICs and vHBAs as examples of device models. A device model can be either software-based (e.g. virtIO drivers) or hardware-based/hardware-assisted (e.g. SR-IOV, which allows a physical PCIe function to be partitioned into multiple virtual PCIe functions). With software-based solutions, I/O virtualization techniques simply use virtualized CPUs (or vCPUs) alongside the VMs.
Virtual Machine
The virtual machine (or VM) is comprised of a set of configuration and specification files. It
comes to life using the physical resources of the underlying host. Each VM is self-contained and
is completely independent of other VMs. Multiple VMs can be installed on the same physical
server, which enables several operating systems and applications to run on one physical server or
host. A thin layer of hypervisor software decouples the virtual machines from the host. It also
takes care of dynamically allocating the required resources to each virtual machine.
A virtual machine consists of several types of files that are stored on the supported storage
device. The key files that are part of any virtual machine are the virtual disk file, NVRAM
setting file, configuration file, and the log file. You can configure a virtual machine using the given virtualization software; you do not need to edit any of the key files manually.
A virtual machine may contain extra files, if you add Raw Device Mappings (RDMs) or if one or
more snapshots exist.
Every virtual machine has virtual devices that provide the same functionality as physical devices, with additional benefits in terms of portability, manageability, and security. The key properties of a virtual machine are the following.
• Encapsulation
• Hardware Independence
• Isolation
• Partitioning
There are numerous benefits of compute virtualization including but not limited to the following.
• Improved security
• Easier administration
• Cost savings
• Consolidation and centralization of physical servers
• Faster TTM
A container image is a lightweight, portable and executable package of software that consists of
code, runtime libraries, system tools and libraries etc. Containers are available for both Linux
and Windows based apps. There are a variety of container technologies that exist today.
• Docker containers
• Java containers
• Unikernels
• LXD (based on liblxc; its purpose is to manage LXC containers with added capabilities, such as snapshots and live migration)
• OpenVZ
• Rkt
• Windows Server containers
• Hyper-V containers
It is worth noting that LXC (Linux Containers) is an OS-level virtualization mechanism for running multiple isolated Linux systems (or containers) on a control host using a single Linux kernel. LXD isn't a repackaging of LXC; it was built on top of LXC to provide a new, better user
experience. Technically speaking, LXD uses LXC through liblxc and its Go binding to create
and manage the containers.
Docker is an open platform for developers and sysadmins to build, ship, and run distributed
applications, whether on laptops, data center VMs, or the cloud. Docker can build images
automatically by reading the instructions from a Dockerfile. A Dockerfile is a text file that
contains all the commands a user could call on the command line to assemble an image. In
Docker, everything is based on Images. An image is a combination of a file system and
parameters. A container is a runtime instance of an image. Dockerfile is used to build the image
when you run “docker build”.
A container image is an executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, and settings.
Kubernetes
Kubernetes can be thought of as any of the following.
• a container platform
• a microservices platform
• a portable cloud platform
The legacy way to deploy applications is to install them on a host using the operating-system package manager. This has the disadvantage of entangling the applications' executables, configuration, libraries, and lifecycles with each other and with the underlying host OS.
Let’s now go over some of the key Kubernetes concepts. A container is the smallest unit in the
Kubernetes terminology. The main purpose of Kubernetes is to manage and deploy containers. It
is also worth noting that Kubernetes management is not just limited to Docker containers.
A node is the host where containers run, much like a physical server is the host where VMs reside. A pod is a management unit in the Kubernetes world. It comprises one or more containers and has its own IP address and storage namespaces. All containers running inside a pod share those networking and storage resources. When a pod is deleted, it is gone forever. A pod is defined using a YAML file.
A deployment is how you handle HA in Kubernetes. While a pod by itself is mortal, with a "deployment" Kubernetes can ensure that the desired number of pods is always up and running. Again, a "deployment" is defined using a YAML file.
A Kubernetes service, a micro-service, is an abstraction that defines a logical set of pods and the policy that dictates how to access them. The Kubernetes architecture also includes something known as a label. This isn't your MPLS label; here, a label is a semantic tag that can be attached to Kubernetes objects to mark them as part of a group. Labels are assigned as key/value pairs. Kubernetes annotations are like labels, but they allow you to attach arbitrary key/value information to an object. Unlike labels, annotations are free-form and can contain less structured data; you can think of them as a way of attaching rich metadata to an object.
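Label selection can be sketched in Python with hypothetical pods; a Kubernetes service picks its backing pods in essentially this fashion:

```python
# Hypothetical pods with labels, and a selector that picks a subset,
# mirroring how a Kubernetes service selects pods by label (sketch only).
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1",  "labels": {"app": "db",  "tier": "backend"}},
]

def select(pods, selector):
    """Return pods whose labels contain every key/value pair in the selector."""
    return [p["name"] for p in pods
            if all(p["labels"].get(k) == v for k, v in selector.items())]
```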
Virtual Switching
There may be one or more virtual switches within one host system. A virtual network adaptor (such as a vNIC) assigned to a virtual machine connects to one of these virtual switches.
There are three well known virtual switches: VMware virtual switch (available in standard &
distributed packaging), Cisco Nexus 1000V, and Open vSwitch (or OVS). Standard or
standalone VMware vSwitch is meant to run on a single ESX host, whereas a distributed vSwitch
can manage up to 500 hosts.
Open vSwitch was created by Nicira, which was subsequently acquired by VMware in 2012. OVS has its roots in the open-source community and in Linux-based hypervisors such as KVM and Xen. OVS is now also included within the OpenStack distribution. OVS doesn't come with a controller, unlike Cisco's VSM (for the Nexus 1000V) or vCenter (for the VMware distributed virtual switch); Open vSwitch is managed by third-party controllers and managers (e.g. the OpenStack Neutron plug-in or the OpenDaylight SDN controller).
Intel DPDK implements a run-to-completion model for packet processing, where all resources must be allocated before calling data plane applications, which run as execution units on logical processing cores. With LXC and KVM, the TUN/TAP subsystem creates a virtual Ethernet interface attached to a process.
With the rise of 10Gbps and beyond, packet rates tend to exceed what a host is capable of when
I/O virtualization is software-based and the hypervisor (or VMM) permanently resides in the
data path between the physical NIC and the vNIC. To keep up with increasing speeds, a
combination of software and hardware features is used. It is important to understand that while server virtualization helped decouple the application software stack from the underlying hardware, it led to increased CPU utilization and a reduction in I/O throughput, thus putting a limit on VM concentration per server.
Like with bare-metal hosts and NICs, in a virtualized environment, getting packets off the wire for processing is also interrupt-driven. When traffic is received by the NIC, an interrupt request is sent to the host CPU, which must then stop what it's doing to help retrieve the data. "CPU" here refers to the CPU core running the hypervisor as well as the CPU cores allocated to the VM that the data is destined to. Now, that's a double hit. Historically speaking, two major techniques were developed to address this bottleneck, i.e.
• Virtual Machine Device Queues (or VMDq)
• Single Root I/O Virtualization (or SR-IOV)
VMDq allows the hypervisor to assign a queue in the physical NIC for each VM, thus
eliminating the need for the first interrupt destined to the hypervisor. With this approach, only
the CPU that’s running the VM is interrupted and packets are then copied directly into VM user
space memory. This means packets are only touched once, resulting in higher throughput and
lower CPU utilization. In case you were wondering how frames are classified to the correct VMs, there is no magic; it happens via the MAC address or VLAN tag found inside the incoming frame. However, because the VMM still manages the I/O queues in the NIC for each VM, the handoff still requires extra CPU utilization. This is where SR-IOV comes in.
SR-IOV is an extension of the PCI Express (or PCIe) specification. Unlike VMDq's queue-per-VM approach, SR-IOV goes one step further and creates a virtual function (or VF) that behaves as if each VM has its own physical NIC. Each VF is allocated a descriptor that knows where the corresponding VM's user space memory is located. SR-IOV data transfer happens via Direct Memory Access (or DMA). This approach results in several benefits, i.e. it bypasses the virtual switch and the VMM, thus providing interrupt-free operation for extremely fast data I/O.
Network virtualization is about combining hardware and software network resources and functions into a software-based entity that can be managed by virtualization management tools. It can be divided into two main categories, i.e. external and internal virtualization.
External virtualization is about combining many networks into a virtual unit spanning multiple servers or switches, whereas internal virtualization refers to providing network connectivity to VMs or containers located on a single physical server. External network virtualization units are carved out as VLANs, while internal virtualization uses pseudo-interfaces such as the Virtualized Network Interface Card (or vNIC) to emulate a physical network.
Now, in the world of external virtualization, several network-based techniques are used to
provide virtualization or separation of traffic carried over a single physical network. Let’s review
some of those before we move to virtual switches.
802.1Q, or dot1q, is the standard that brought us VLANs on top of IEEE 802.3 Ethernet; it defines VLAN tagging for Ethernet frames.
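As a quick illustration, the 4-byte 802.1Q tag can be packed in Python; the 12-bit VLAN ID field is what caps you at roughly 4K VLANs:

```python
import struct

def dot1q_tag(pcp: int, dei: int, vlan_id: int) -> bytes:
    """Build the 4-byte 802.1Q tag: TPID 0x8100 followed by the TCI
    (3-bit priority, 1-bit DEI, 12-bit VLAN ID)."""
    assert 0 <= vlan_id < 4096, "VLAN ID is a 12-bit field"
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)
```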
Virtual Routing and Forwarding (or VRF) is a way to segment IP traffic with or without L3
VPNs (such as MPLS VPNs). It is also known as VRF-Lite when used without L3 VPNs.
Virtual Extensible VLAN (or VXLAN) is used to create virtual overlays on top of a physical
underlay. VXLAN uses MAC in IP/UDP tunneling to extend L2 segments over IP networks.
VXLAN uses a flood-and-learn mechanism, much like Ethernet itself. Ethernet VPN (or EVPN) is a variation where VXLAN is used along with BGP to accomplish routing between endpoints.
Network Virtualization using GRE (or NVGRE) uses MAC-in-GRE encapsulation to extend L2 segments over a routed network. It uses a subnet identifier within the GRE header to accomplish the L2 extension. NVGRE and VXLAN target the same use case.
Overlay Transport Virtualization (or OTV) uses MAC in IP encapsulation to create L2 overlays
and provides support for multicast traffic.
VRF
Virtual Routing and Forwarding (or VRF) is a feature universally available on all modern routers and L3 switches that allows multiple instances of a routing table. VRF achieves virtualization of the networking device at the network layer, or L3. However, in order to create a VPN, you need to interconnect the individual devices that are using VRFs.
The type of data path virtualization technology to use varies depending on how far the VRFs and corresponding devices are from each other. If the virtualized devices are directly connected, L2 adjacent, or a single hop away, you have no choice but to use link- or interface-level virtualization, e.g. an 802.1Q tag or L2-based labeling in the form of a VRF-to-VLAN mapping.
If the virtualized devices are multiple hops away from each other, you need to use some form of tunneling to realize end-to-end data path virtualization. MPLS VPNs and GRE tunnels are two options that can be used to realize end-to-end data path isolation. When VRFs are configured without MPLS, the setup is known as VRF-lite. This form of VRF configuration requires that VRF-lite interfaces be L3, i.e. routed.
With or without MPLS, you need to understand the three device roles that the entire VRF configuration revolves around, i.e. Customer Edge (or CE), Provider Edge (or PE), and Provider (or P) devices. CE routers advertise and learn routes via the CE-PE link. PE routers exchange routing information with CE routers via static routes or a dynamic routing protocol such as BGP or RIP. PE routers only maintain routes for the VPNs to which they are directly connected, not all of the provider's VPNs. Each VPN on a PE is mapped to a corresponding VRF. Finally, P routers, also known as core routers, reside only within the provider's cloud and thus never directly attach to any of the CE routers.
With VRF-lite, multiple customers can share one CE router and one CE-PE physical link. This
shared CE router keeps separate VRF tables for each customer. This is where a Multi-VRF CE device takes on some PE-like functions, i.e. the ability to maintain separate VRF tables.
• CE router receives a packet, depending on the input interface the packet was received on,
the router will use the corresponding routing table to look up the destination and forward
it onto the PE.
• Ingress PE receives the packet from CE, it performs a VRF lookup and if the route is
found, the PE router will add an MPLS label and send it into the MPLS cloud.
• Now, when egress PE receives a packet from the network, it would strip off the MPLS
label but use it to identify the correct VPN/VRF routing table.
• Finally, when the destination CE router receives a packet from the PE, it uses the input
interface to look up the correct VRF routing table. The packet is forwarded into the VPN
if a matching route is found.
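The first and last steps above hinge on choosing a routing table by ingress interface, which can be sketched in Python (interfaces, VRF names, and prefixes are hypothetical):

```python
import ipaddress

# Hypothetical per-VRF routing tables keyed by the customer-facing interface,
# sketching the Multi-VRF CE lookup step described above.
VRF_BY_INTERFACE = {"Gi0/0": "CUST-A", "Gi0/1": "CUST-B"}
VRF_TABLES = {
    "CUST-A": {"10.1.0.0/16": "PE-link-A"},
    "CUST-B": {"10.1.0.0/16": "PE-link-B"},   # overlapping prefixes are fine
}

def vrf_forward(in_interface: str, dst: str):
    """Pick the VRF from the ingress interface, then look up the route there."""
    vrf = VRF_BY_INTERFACE[in_interface]
    addr = ipaddress.ip_address(dst)
    for prefix, next_hop in VRF_TABLES[vrf].items():
        if addr in ipaddress.ip_network(prefix):
            return vrf, next_hop
    return vrf, None                           # no route in this VRF
```

Note that both customers can use 10.1.0.0/16 without conflict, which is the whole point of per-VRF tables.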
Configuring Multi-VRF CE
PE Configuration
There are plenty of Cisco IOS CLIs that you can use to verify a VRF configuration.
show ip protocols vrf <vrf-name> displays routing protocol information related to a VRF.
show ip route vrf <vrf-name> displays IP routing table information related to a VRF.
show ip vrf displays information about the VRF instances configured by the admin.
Further Reading
Network Virtualization and Path Isolation 18
Generic routing encapsulation (GRE) is a tunneling protocol that allows you to transport packets of one protocol over or within another protocol by way of encapsulation. GRE tunnel endpoints encapsulate a payload within an IP packet, and the two endpoints are identified by a tunnel source and a tunnel destination. All intermediary L3 devices simply forward these encapsulated packets based on the outer IP header. Once the packets arrive at the other tunnel endpoint, the GRE headers are removed and the packets are forwarded based on the inner protocol headers. GRE tunnels can be either point-to-point (p2p) or point-to-multipoint (p2mp). GRE tunnels can carry both multicast and broadcast traffic, such as routing protocol packets from OSPF or EIGRP. GRE tunnels are not secure, i.e. the encapsulated data remains in cleartext; however, they can be secured using what is known as GRE over IPsec.
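A minimal Python sketch of the encapsulation step, using only the basic 4-byte GRE header (no checksum/key/sequence options) and omitting the outer IP header that carries the tunnel source and destination:

```python
import struct

def gre_encapsulate(inner_packet: bytes, inner_protocol: int = 0x0800) -> bytes:
    """Prepend a minimal 4-byte GRE header to an inner packet.

    The 16-bit protocol type identifies the inner payload (0x0800 = IPv4).
    The C/K/S option bits and version are left at zero for simplicity.
    """
    flags_and_version = 0x0000
    gre_header = struct.pack("!HH", flags_and_version, inner_protocol)
    return gre_header + inner_packet
```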
Unlike GRE, IPsec provides secure tunneling between two endpoints. IPsec can be configured in tunnel mode as well as transport mode (where the original IP header is retained rather than encapsulated). In its most basic form, there are three major components when it comes to configuring an IPsec tunnel, i.e.
• interesting traffic, identified by a crypto ACL
• a transform set, which selects the IPsec protocols and algorithms
• a crypto map, which ties the pieces together and is applied to an interface
18 https://bit.ly/2S5Imfm
In Cisco IOS jargon, identification of interesting traffic takes place using an ACL. You can use an extended access list to identify the source and destination networks for encryption. Once that's done, you pick your IPsec protocol suite and the corresponding encryption and authentication algorithms in the form of a transform set. There are two possible IPsec protocols, Encapsulating Security Payload (or ESP) and Authentication Header (or AH). The former allows you to use both encryption (packet confidentiality) and authentication (packet integrity); with AH, however, your options are limited to authentication, or packet integrity, only. The most commonly used encryption and authentication algorithms are AES (e.g. AES-128 or AES-256) and SHA-1 or SHA-2, respectively. Finally, you tie everything together using a Cisco IOS data structure known as the crypto map. Your Cisco device will start encrypting traffic that matches the crypto ACL as soon as the crypto map is applied to an interface.
Today, Cisco IOS supports a plethora of ways to configure IPsec depending on the use case. Let
me summarize.
GRE can be configured in both point-to-point (p2p) and point-to-multipoint (p2mp) modes with
IPsec. Here is the breakdown of a preferred GRE/IPsec configuration for various topologies.
On R1:
On R3:
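The R1 and R3 configurations did not survive formatting in this copy. As a hedged sketch, a p2p GRE tunnel protected with an IPsec profile (tunnel protection) might look like this on R1, with R3 mirroring it; all names and addresses are illustrative assumptions.

```
crypto isakmp policy 10
 encryption aes
 authentication pre-share
 group 14
crypto isakmp key MYSECRET address 203.0.113.2
crypto ipsec transform-set TSET esp-aes esp-sha-hmac
 mode transport                 ! transport mode avoids a second IP encapsulation
crypto ipsec profile GRE-PROT
 set transform-set TSET
interface Tunnel0
 ip address 10.0.0.1 255.255.255.252
 tunnel source 192.0.2.1
 tunnel destination 203.0.113.2
 tunnel protection ipsec profile GRE-PROT
```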
To verify and troubleshoot an IPsec or GRE/IPsec configuration, you can use any of the
following Cisco IOS CLIs.
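The command list itself was lost in formatting; commonly used IOS verification commands include the following (annotations are mine):

```
show crypto isakmp sa      ! IKE phase 1 security associations
show crypto ipsec sa       ! phase 2 SAs with encrypt/decrypt packet counters
show crypto map            ! crypto map contents, peers, and matched ACLs
show interface tunnel 0    ! GRE tunnel line/protocol state
debug crypto isakmp        ! negotiation troubleshooting (use with care)
```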
LISP
Unlike IP, Locator/ID Separation Protocol (or LISP) is both a network architecture and a
protocol that utilizes two namespaces, i.e. Endpoint Identifiers (EIDs) and Routing Locators
(RLOCs).
Decoupling the EID and RLOC functions results in several advantages over traditional IP
routing, such as smaller routing state, better multihoming, minimal end-customer changes, and
ingress traffic engineering. The LISP architecture is made possible with the help of several roles,
i.e. the Ingress and Egress Tunnel Routers (ITR/ETR), the Map Server (MS), and the Map
Resolver (MR).
The following are some of the key use cases for LISP.
LISP is supported on Cisco devices running IOS XE, IOS XR, and NX-OS.
Further Reading
LISP FAQ19
VXLAN
Network segmentation has been around forever with 802.1Q VLANs and L2 extensions.
However, VLANs are limited to about 4K in number. Virtual eXtensible LAN (or VXLAN), on
the other hand, uses MAC-in-IP encapsulation to extend L2 networks across L3 clouds in the
form of overlays. It uses a 24-bit VXLAN Network Identifier (VNI) address space, which allows
for 16M VNIs or VXLAN segments. VXLAN tunnels are set up on top of the underlying
network, which by the way must be able to accommodate the extra VXLAN overhead by
supporting larger MTUs.
VXLAN offers several benefits beyond the 16M VNI address space, i.e.
19 https://bit.ly/38RW506
A VXLAN fabric consists of an underlay and an overlay network. The underlay network is an IP
routed network that uses well-known routing protocols such as OSPF, BGP, or EIGRP, along
with the rest of the time-tested best practices for IP routed networks. The VXLAN overlay must
handle broadcast, unknown unicast, and multicast (or BUM) traffic, and the handling of BUM
traffic is where the role of the VXLAN control plane comes into play. The most scalable
control-plane option is to use standards-based MP-BGP EVPN with the VXLAN data plane. The
EVPN address family was designed to work with multiple data planes, including MPLS for
E-LAN services, Provider Backbone Bridges (PBB) or PBB-VPLS, and Network Virtualization
Overlays (NVO3) such as VXLAN, NVGRE, and MPLSoGRE for L2/L3 DCI. The use of MP-
BGP minimizes flooding and supports Integrated Routing and Bridging (IRB) in the overlay.
VXLAN uses VTEP devices to map tenant VMs to VXLAN segments and to perform VXLAN
encapsulation and de-encapsulation. Each VTEP has two interfaces: an L2 interface for the local
LAN segment that functions as a bridge, and an L3 or IP interface facing the transport
(underlay) network. The overlay control plane is formed across the VTEPs.
The IP address of this interface must be unique, since it identifies the VTEP device on the
transport network; all encapsulated packets (MAC in UDP) are sourced from and destined to this
IP address. The VTEP device also helps discover remote VTEPs for its segments and learns
remote MAC-to-VTEP mappings.
Further Reading
VXLAN and Cisco Nexus 9K switches20
20 https://bit.ly/37RLpOR
Chapter Summary
CHAPTER 3 INFRASTRUCTURE
This chapter covers the following exam topics from Cisco's official 350-401 V1.0 Enterprise
Network Core Technologies (ENCOR) exam blueprint.
Layer 2
• Troubleshoot static and dynamic 802.1q trunking protocols
• Troubleshoot static and dynamic EtherChannels
• Configure and verify common Spanning Tree Protocols (RSTP and MST)
Layer 3
• Compare routing concepts of EIGRP and OSPF (advanced distance vector vs. link state,
load balancing, path selection, path operations, metrics)
• Configure and verify simple OSPF environments, including multiple normal areas,
summarization, and filtering (neighbor adjacency, point-to-point and broadcast network
types, and passive interface)
• Configure and verify eBGP between directly connected neighbors (best path selection
algorithm and neighbor relationships)
Wireless
• Describe Layer 1 concepts, such as RF power, RSSI, SNR, interference noise, band and
channels, and wireless client devices capabilities
• Describe AP modes and antenna types
• Describe access point discovery and join process (discovery algorithms, WLC selection
process)
• Describe the main principles and use cases for Layer 2 and Layer 3 roaming
• Troubleshoot WLAN configuration and wireless client connectivity issues
IP Services
• Describe Network Time Protocol (NTP)
• Configure and verify NAT/PAT
• Configure first-hop redundancy protocols, such as HSRP and VRRP
• Describe multicast protocols, such as PIM and IGMP v2/v3
Layer 2
We can divide the overall troubleshooting of trunks into the two most common areas, i.e. trunk
links not forming and VLAN leaking.
A trunk is an L2 feature that uses 802.1Q VLAN tags to multiplex (or trunk) traffic from
multiple VLANs on a single physical link. A trunk can be configured either manually by the
network admin, or a protocol such as DTP or VTP can be used to facilitate the negotiation of
trunking and the propagation of VLAN configuration within a switching domain.
Symptoms
A switch access port is behaving like a trunk port by accepting traffic from a VLAN different
from the one for which it is configured.
Solution
To troubleshoot a trunk link not forming or VLAN leaking, you can proceed with the following
two approaches.
• You can use the show interfaces trunk command on both local and remote peer links to
ensure that there is no native VLAN mismatch. If the native VLAN does not match on
both sides, VLAN leaking can take place.
• You can use the show interfaces trunk command to check trunk status and figure out
whether it is established or not. Cisco switches try to negotiate trunking using Dynamic
Trunking Protocol (or DTP) by default, which you can avoid by statically configuring the
trunk.
You can check trunk status by using the show interfaces trunk command on peer switches. In the
output below, you can notice that the native VLANs are misconfigured and do not match on both
sides. When this occurs, you will also notice %CDP-4-NATIVE_VLAN_MISMATCH log
messages.
Please note that in case of a native VLAN mismatch, traffic for all VLANs other than the native
ones will successfully propagate over the trunk. Traffic associated with the native VLANs (20 or
30) will not propagate over the trunk link.
Symptom
One side of the trunk is configured with trunk mode off and the other with trunk mode on. With
this configuration, the trunk link will not come up.
Solution
Trunk links are usually configured with switchport mode trunk command. Cisco switch trunk
ports use DTP to negotiate the mode of the trunk. The two sides must have compatible trunk
modes before a trunk link will come up.
To investigate trunk mode incompatibility, you can start with show interfaces trunk to find the
link that is not part of an active trunk. If a link doesn't show up in the output, you can prod
further using the show interfaces <interface> switchport command. You want to make sure that
the trunk administrative mode is compatible with the other side. To resolve, you will need to go
to each switch and add switchport mode trunk to both interfaces.
Name: Fa0/3
Switchport: Enabled
Administrative Mode: trunk
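To statically set compatible modes and take DTP out of the equation entirely, a minimal sketch (the interface name is illustrative):

```
interface FastEthernet0/3
 switchport trunk encapsulation dot1q   ! required on platforms that also support ISL
 switchport mode trunk
 switchport nonegotiate                 ! stop sending DTP frames
```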
Symptom
Traffic from a VLAN will not go over a trunk unless the VLAN is allowed on that trunk. When
that happens, you will notice missing traffic or perhaps no traffic at all.
Solution
To allow a VLAN on a trunk, you need to issue the switchport trunk allowed vlan <vlan #>
command on the corresponding switches. You can reveal the list of allowed VLANs using the
show interfaces trunk command.
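A short sketch of allowed-VLAN manipulation (the VLAN numbers are illustrative):

```
interface FastEthernet0/3
 switchport trunk allowed vlan 10,20,30
 switchport trunk allowed vlan add 40      ! append without retyping the full list
 switchport trunk allowed vlan remove 30   ! prune a single VLAN
```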
VTP is a Cisco proprietary protocol that helps reduce VLAN administration overhead. It
achieves this goal by using various protocol modes and versions. You can configure a switch to
operate in one of four modes, i.e.
• Server: In this mode, you can create, modify and delete VLANs for the entire VTP
domain. VTP servers advertise their VLAN configuration to other switches in the same
VTP domain. It is the default VTP mode.
• Client: VTP clients are pretty much like VTP servers, but you can’t create, modify or
delete VLANs on a VTP client.
• Transparent: In this mode, a switch does not advertise its own VLAN configuration or
synchronize to received advertisements; however, transparent switches pass through or
forward the advertisements they receive.
• Off: It is similar to VTP transparent mode but with one more restriction, i.e. switches
configured in off mode do not forward VTP advertisements.
VTP can be implemented in three versions, i.e. V1, V2, or V3. For all practical purposes, you
will only be using V3. V3 provides support for extended VLANs and the ability to be configured
on a per-port basis.
In this case, you need to first make sure that all relevant links are indeed configured as trunks
and their trunking types match. Make sure that their VTP domains match and verify that the VTP
server switch is configured as a VTP server.
To resolve this issue, you need to ensure that the newly added switch has a configuration
revision number that is lower than the current revision number found in the VTP domain.
Sometimes, the propagation may not start until you make some modifications to the VLAN
database, such as the creation or deletion of VLANs.
A switch in VTP client mode will lose its VLAN database upon a power cycle; as a result, its
trunk ports will move to the inactive state. To resolve this issue, you need to repopulate the
VLAN database from the VTP server, which will move all ports that were previously members
of VLANs advertised by the VTP server back into the active state. You can trigger this by
temporarily switching to VTP transparent mode and adding the VLAN to which the uplink port
is assigned to the VLAN database.
EtherChannel allows bonding or bundling of up to 8 Ethernet links into one logical link. If one
link within the bundle fails, the switch will automatically move traffic to the other functional
links. As far as STP is concerned, it sees an EtherChannel as one logical link, thus avoiding any
potential conflict with respect to bridging loops. EtherChannels are also known as Port Channels.
EtherChannels are not to be confused with trunks, since trunks are used to allow traffic from
more than one VLAN on a given port (trunk), whereas EtherChannels are simply a bundle of two
or more ports that provide higher bandwidth, load balancing, and link-level redundancy. Unlike
trunks, EtherChannels can be configured as L2 as well as L3 links.
An EtherChannel can be configured manually (aka statically) by the network admin, or a
dynamic protocol can be used to perform link negotiation. There are two common protocols, i.e.
the Cisco proprietary Port Aggregation Protocol (PAgP) and the IEEE 802.3ad Link Aggregation
Control Protocol (LACP). For all practical purposes, you will always be configuring LACP.
For EtherChannel troubleshooting, you can always use the show etherchannel summary
command as a starting point to prod EtherChannel status. You can verify the channel negotiation
mode by using the show etherchannel port command.
For an EtherChannel to work, each side must have the same switchport mode, i.e. access or
trunk, the same native and trunked VLANs, port speed and duplex, etc.
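As a sketch, bundling two trunk ports with LACP, configured identically on both switches (interface and channel numbers are illustrative):

```
interface range GigabitEthernet0/1 - 2
 switchport mode trunk
 channel-group 1 mode active    ! active initiates LACP; passive only responds
interface Port-channel1
 switchport mode trunk
```

Using active mode on both ends is the common practice, since at least one side must initiate negotiation for the bundle to form.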
Configure and Verify Common Spanning Tree Protocols (RSTP and MST)
IEEE 802.1w or RSTP provides rapid convergence of the IEEE 802.1D spanning tree protocol.
Let us outline the main differences between the two protocol variants.
STP vs. RSTP:
• Fundamentals: In STP, only the root sends out BPDUs in a stable switched topology; in
RSTP, all switches or bridges send out BPDUs (every 2s).
• Port states: STP has Disabled, Blocking, Listening, Learning, and Forwarding; RSTP has
Discarding, Learning, and Forwarding.
• Port roles: STP has Root (Forwarding), Designated (Forwarding), and Non-Designated
(Blocking); RSTP has Root (Forwarding), Designated (Forwarding), Alternate
(Discarding), and Backup (Discarding).
• Timers: STP uses Hello (2s), Max Age (20s = 10 x 2s), and Forward Delay (15s); RSTP
replaces timer-based transitions with a proposal and agreement process (< 1s).
• FSM transition duration: up to 50s in STP; RSTP uses a Request Link Query mechanism
to seek out failures.
• Topology change: in STP, only the root can inform others of a topology change; in
RSTP, every bridge can generate a TC and inform its neighbors.
RSTP Configuration
Configuring RSTP is like configuring STP, except for the features that are already incorporated
into RSTP, for example, UplinkFast or BackboneFast.
To configure RSTP, all your switches need to be configured with the following command in the
global configuration mode.
We can divide the overall STP configuration into the following steps.
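The command listing did not survive formatting in this copy; a minimal rapid-PVST+ sketch on Cisco IOS switches (the VLAN number is illustrative):

```
! Enable rapid per-VLAN spanning tree globally on every switch
spanning-tree mode rapid-pvst
! Optionally nominate this switch as the root bridge for a VLAN
spanning-tree vlan 10 root primary
```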
RSTP Verification
In order to display or verify STP or RSTP configuration, you can use one or more of the
following commands.
show spanning-tree
MST Configuration
Cisco MST or MSTP is based on the IEEE 802.1s standard. It offers faster convergence than
802.1D STP and operates on the concept of an instance, as opposed to an all-VLANs or
per-VLAN scope. MSTP uses RSTP for rapid convergence but allows VLANs to be grouped into
instances, where each instance has an independent STP topology.
Switches with the same MST configuration are known to be in an MST region. You can
configure a Cisco switch for a region using spanning-tree mst configuration global mode
command.
MST configuration can be broken down into a few steps as shown below.
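The step-by-step listing was not preserved; a minimal MST configuration sketch (region name, revision, and VLAN mappings are illustrative, and must match on all switches in the region):

```
spanning-tree mode mst
spanning-tree mst configuration
 name REGION1
 revision 1
 instance 1 vlan 10,20
 instance 2 vlan 30,40
! Optionally make this switch the root for instance 1
spanning-tree mst 1 root primary
```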
MST Verification
To display and verify MST operations, you can use one or more of the following Cisco CLIs.
Layer 3
Compare routing concepts of EIGRP and OSPF (advanced distance vector vs. link state, load
balancing, path selection, path operations, metrics)
EIGRP
Typical distance vector protocols use two pieces of information to make the best path selection
to a destination network, i.e. the distance in L3 hop count and the vector, which is the next hop.
Once EIGRP builds a neighbor relationship, it builds a topology table; unlike RIP or IGRP, it
doesn't rely on the routing table to hold all of the information it needs to operate, and it then
installs routes from the topology table into the routing table (much like a distance-vector
protocol). You can display the contents of the topology table using the show ip eigrp topology
command. The topology table contains both distance and vector information for each destination
prefix that EIGRP knows about.
EIGRP uses minimum bandwidth and total delay to compute the routing metric. These
bandwidth and delay values are what network admins have configured on the interfaces. EIGRP
calculates the final metric using bandwidth, delay, and a set of 5 coefficients known as K values.
An EIGRP-speaking router will not form a neighbor relationship with another L3 device if the K
values don't match.
EIGRP metric = ([K1 * bandwidth + (K2 * bandwidth) / (256 - load) + K3 * delay] * [K5 /
(reliability + K4)]) * 256
Where the default K values are K1=1, K2=0, K3=1, K4=0 and K5=0 (with K5=0, the reliability
term is omitted rather than multiplied in), which simplifies the metric formula to 256 *
(bandwidth + delay).
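As a sanity check on the simplified formula, here is a small Python sketch (my own illustration, not from the text) using the classic Cisco input scaling, where bandwidth enters the formula as 10^7 divided by the minimum path bandwidth in kbps, and delay as the sum of interface delays in tens of microseconds:

```python
def eigrp_metric(min_bw_kbps, total_delay_tens_usec,
                 k1=1, k2=0, k3=1, k4=0, k5=0, load=1, reliability=255):
    """EIGRP composite metric; defaults reduce it to 256 * (bandwidth + delay)."""
    bw = 10**7 // min_bw_kbps                       # scaled bandwidth term
    metric = k1 * bw + (k2 * bw) // (256 - load) + k3 * total_delay_tens_usec
    if k5 != 0:                                     # K5=0 means skip the reliability term
        metric = metric * k5 // (reliability + k4)
    return metric * 256

# FastEthernet path: 100,000 kbps and 100 usec delay (10 tens-of-usec)
# bandwidth term = 10^7/100000 = 100, delay term = 10, metric = 110 * 256
print(eigrp_metric(100_000, 10))     # 28160
# GigabitEthernet path: 1,000,000 kbps and 10 usec delay (1 ten-of-usec)
print(eigrp_metric(1_000_000, 1))    # 2816
```

The two printed values match the well-known default metrics for FastEthernet and GigabitEthernet interfaces.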
EIGRP also establishes specific jargon when it comes to path selection, and those terms include
feasible distance, reported distance, and feasible successor.
Feasible distance (FD), or the best path, is the best metric along a path to a destination network.
This includes the metric to the neighbor advertising the given path. Reported distance (RD) is the
total metric along the path to a destination network as advertised by an upstream neighbor.
Finally, a feasible successor (FS) is a path whose reported distance is less than the feasible
distance of the current successor.
EIGRP supports multiple equal-cost paths out of the box, and you can configure the limit using
the maximum-paths command. Additionally, EIGRP also supports unequal-cost load balancing,
which can be configured using the variance command. Unequal-cost paths follow the logic that
any route with a feasible distance less than 'n' times the successor route's feasible distance can
be included in the multiple paths.
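A hedged configuration sketch of equal- and unequal-cost load balancing (the AS number and values are illustrative):

```
router eigrp 100
 network 10.0.0.0
 maximum-paths 4   ! number of equal-cost paths to install (4 is the default)
 variance 2        ! also install feasible successors with FD < 2x the successor FD
```

Note that variance only considers feasible successors, i.e. paths that already satisfy the feasibility condition.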
Further Reading
EIGRP Configuration Guide21
OSPF
Open Shortest Path First (OSPF) protocol is defined in RFC 2328 and is based on link-state
technology, where a link is an interface on an L3 device such as a router. The state of the link is
an attribute of that interface and its relationship to its neighbors. The interface attributes relevant
to OSPF include the IP address, subnet mask, the type of the network, the routers that are also
connected to that network, and so on.
Unlike distance vector protocols that use Bellman-Ford, OSPF uses the shortest path first (SPF,
or Dijkstra) algorithm to determine the shortest path to all known destination networks with the
help of a graph.
When an OSPF router boots up, it generates a link-state advertisement (or LSA) for that router
which represents the state of links. All routers exchange their LSAs via flooding mechanism.
Once an exchange is completed, every router ends up with a database that is used to calculate the
shortest path to each destination. Each OSPF router uses the Dijkstra SPF algorithm to derive the
shortest path tree and the result of this calculation is stored in the routing table (or RIB). The
algorithm places each router at the root of a tree and then calculates the shortest path to each
destination network based on the cumulative cost needed to reach that destination.
EIGRP vs. OSPF:
• Best path selection algorithm: EIGRP uses the DUAL FSM; OSPF uses Dijkstra SPF.
• Administrative distance: EIGRP 90; OSPF 110.
• Metric: EIGRP uses bandwidth, load, delay, and reliability; OSPF uses cost.
21 https://bit.ly/2OhmqfZ
OSPF uses interface cost as its metric, which is inversely proportional to the bandwidth of that
interface, i.e. higher bandwidth means lower cost. By default, the cost is derived from the
reference bandwidth (100 Mbps) divided by the interface bandwidth, unless you modify it using
the ip ospf cost <value> command.
OSPF uses flooding to exchange link-state advertisements between routers, and all routers within
an area have an identical link-state database. Routers that have interfaces in multiple areas,
including the backbone, are known as Area Border Routers (ABRs). Routers that act as a
gateway between OSPF and other routing protocols or other instances of the OSPF process are
known as Autonomous System Boundary Routers (ASBRs). The OSPF finite state machine
(FSM) includes eight different states: down, attempt, init, two-way, exstart, exchange, loading,
and full.
OSPF addresses three classes of network types, point to point (p2p), point to multipoint (p2mp)
and broadcast.
Further Reading
OSPF Design Guide22
Configure and verify simple OSPF environments, including multiple normal areas,
summarization, and filtering (neighbor adjacency, point-to-point and broadcast network types,
and passive interface)
Multiple Areas
22 https://bit.ly/31jSOUx
The OSPF process ID is a numeric value that is only locally significant to the router, i.e. it is
never sent to other routers and thus doesn't have to match the process IDs on other routers.
Technically, you can also run multiple OSPF processes on the same router.
The network command is a way of assigning an interface to an area, whereas the wildcard mask
is used for ease of configuration so that you can put a bunch of interfaces into an area with one
line. The area-id is the area number you want the interface to be in. It can be configured in
simple number format, such as 0, 1, or 2, or in the form of an IP address, say 0.0.0.0 or 1.1.1.1.
RTA#
interface fa0/0
ip address 182.21.11.1 255.255.255.0
interface fa0/1
ip address 182.21.12.2 255.255.255.0
interface fa0/2
ip address 108.21.1.1 255.255.255.0
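The router ospf stanza for RTA did not survive in this copy; based on the interface addresses above, it might look like the following (the process ID and area assignments are illustrative assumptions):

```
router ospf 101
 network 182.21.11.0 0.0.0.255 area 1
 network 182.21.12.0 0.0.0.255 area 2
 network 108.21.1.0 0.0.0.255 area 0
```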
Route Summarization
Summarization is about consolidating multiple routes into one single advertisement. In OSPF,
this is normally done at the ABRs. You can configure summarization between any two areas,
however, it is recommended to summarize towards the backbone area so it can inject those
summaries into other areas. Summarization is highly effective if the network addresses assigned
are contiguous.
• Inter-area routes
• External routes
Inter-area route summarization is done on ABRs and applies to routes from within the AS, i.e. it
doesn’t apply to routes coming into the OSPF domain from external sources. You can use area
<area-id> range <address> <mask> command to configure inter-area summary. Here, area-id
refers to the area containing networks that are to be summarized.
In this topology, RTB is summarizing the range of subnets from 108.21.64.0 to 108.21.95.0 into
one range, i.e. 108.21.64.0 255.255.224.0 (/19), into the backbone area. Likewise, RTC is
summarizing 108.21.96.0/19 into the backbone.
RTB#
router ospf 101
area 1 range 108.21.64.0 255.255.224.0
RTC#
router ospf 101
area 1 range 108.21.96.0 255.255.224.0
External route summarization is relevant to external routes only, i.e. ones that are injected into
OSPF via redistribution. Much like inter-area summarization, a contiguous address range makes
it straightforward.
You will need to use the summary-address <ip-address> <mask> command on the ASBR(s)
doing the redistribution into OSPF. This command will have no effect if configured on a router
that is not performing redistribution into the OSPF domain.
In the above topology, RTA and RTD are injecting external routes into OSPF. RTA is injecting
subnets within the range of 108.21.64-95 and RTD is injecting subnets 108.21.96-127.
RTA#
router ospf 101
summary-address 108.21.64.0 255.255.224.0
RTD#
router ospf 101
summary-address 108.21.96.0 255.255.224.0
redistribute bgp 20 metric 1000 subnets
Route Filtering
Unlike RIP or other distance vector protocols, OSPF has built-in controls over route propagation.
OSPF routes are allowed or denied into different OSPF areas based on the area type, such as
backbone or stub areas. OSPF ABRs limit the advertisement of different types of routes into
different OSPF areas depending on the type of the associated LSA. For example, an OSPF ABR
bordering an OSPF stub area prevents the advertisement of external routes into the stub area.
The ABR of a stub or totally-stubby area advertises a default route into the area as an inter-area
route. However, an ABR of a totally-stubby area prevents advertisement of any inter-area routes,
including any external routes, into that area. One common use of route filtering is when
performing mutual redistribution.
• With the passive-interface command configured, OSPF doesn't send hellos on an
interface. This means that the device won't discover any neighbors on that interface.
• Most of the available filtering tools do not filter out or remove routes from the LS
database.
• The route filters have no impact on the presence of routes in routing tables beyond the
local router, i.e. they are only locally significant.
A distribute-list in works on any OSPF router and prevents routes from being added to the local
routing table, but the routes still get added to the LS database, i.e. the downstream neighbors will
still have those routes. A distribute-list out, on the other hand, works on an ASBR to filter routes
being redistributed into OSPF from other protocols.
RTE#
interface fa0/1
ip address 213.25.15.130 255.255.255.192
interface fa0/0
ip address 213.25.15.2 255.255.255.192
router rip
network 213.25.15.0
RTC#
interface fa0/0
ip address 213.25.15.67 255.255.255.192
interface fa0/1
ip address 213.25.15.1 255.255.255.192
router rip
redistribute ospf 101 metric 2
passive-interface Ethernet0
network 213.25.15.0
RTA#
interface fa0/0
ip address 213.25.15.68 255.255.255.192
router rip
redistribute ospf 101 metric 1
network 213.25.15.0
If you were to do a show ip route on RTC, you would find two paths to the 213.25.15.128
destination network. This occurs because RTC advertised the route to RTA via OSPF and RTA
advertised it back via RIP. In order to fix this issue, the most effective way is to use a
distribute-list on RTA to deny the 213.25.15.0 network from being put back into RIP.
RTA#
interface fa0/0
ip address 213.25.15.68 255.255.255.192
router rip
redistribute ospf 101 metric 1
network 213.25.15.0
distribute-list 1 out ospf 101
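The distribute-list above references access-list 1, which must also be defined; a sketch consistent with denying 213.25.15.0 from being advertised back into RIP:

```
access-list 1 deny 213.25.15.0 0.0.0.255
access-list 1 permit any
```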
Further Reading
OSPF Configuration Guide23
Configure and verify eBGP between directly connected neighbors (best path selection algorithm
and neighbor relationships)
BGP is an exterior gateway protocol (or EGP), so it was created from the ground up to perform
interdomain routing. A BGP router establishes a connection to each of its neighbors using TCP
(port 179). A BGP router can establish two types of sessions, external or internal. If the two BGP
peers reside in two different domains or autonomous systems (ASs), the session is known as an
external BGP or eBGP session. If the two BGP peers are in the same AS, then it is said to be an
internal BGP or iBGP session.
23 https://bit.ly/2GMcKGh
By default, BGP establishes a peer relationship using the IP address of the interface closest to the
peering router. However, BGP peers can use neighbor update-source command to source BGP
packets from another interface if so desired.
BGP routers can receive multiple paths to the same destination, but a BGP router runs the best
path selection algorithm and installs only the best route in the routing table. BGP assigns the first
valid path as the current best path and then compares the best path with the next path in the list,
until it reaches the end of the list.
In brief, the best-known attributes compared in order are: highest weight, highest local
preference, locally originated paths, shortest AS path, lowest origin code, lowest MED, eBGP
over iBGP, lowest IGP metric to the next hop, and finally lowest router ID. However, keep in
mind that a lot of these tests can be overridden or even bypassed altogether using the various
BGP knobs and options available on a Cisco router.
Neighbor Relationships
iBGP peers are supposed to be in the same AS; hence there is no requirement for them to be
directly connected. BGP assumes that intra-AS routing is already taken care of by an IGP. iBGP
speakers must either be fully meshed, or a route reflector must be used.
iBGP Configuration
#R1
interface fa0/1
ip address 10.20.10.1 255.255.255.0
!
router bgp 400
neighbor 10.20.10.2 remote-as 400
#R2
interface fa0/0
ip address 10.20.10.2 255.255.255.0
!
router bgp 400
neighbor 10.20.10.1 remote-as 400
eBGP Configuration
Unlike iBGP, eBGP peers are typically required to be directly connected (unless eBGP multihop
is configured).
#R1
interface fa0/1
ip address 10.20.10.1 255.255.255.0
!
router bgp 300
neighbor 10.20.10.2 remote-as 400
#R2
interface fa0/0
ip address 10.20.10.2 255.255.255.0
!
router bgp 400
neighbor 10.20.10.1 remote-as 300
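To confirm that the session above establishes and routes are exchanged, the standard IOS verification commands apply (annotations are mine):

```
show ip bgp summary      ! neighbor table; a prefix count in State/PfxRcd means Established
show ip bgp neighbors    ! detailed per-neighbor session information
show ip bgp              ! the BGP table; '>' marks the best path
show ip route bgp        ! BGP-learned routes installed in the RIB
```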
Further Reading
BGP Configuration Guide24
24 https://bit.ly/38Z9xiF
Wireless
Describe Layer 1 concepts, such as RF power, RSSI, SNR, interference noise, band and
channels, and wireless client devices capabilities
RF Power
The power of a radio signal as a function of its ratio to another standard or reference value is
measured in decibels (dB). The two common examples are dBm (the dB value is relative to
1 mW) and dBW (the dB value is relative to 1 W).
You can calculate the power in dB by using the following formula:
power (dB) = 10 * log10(signal / reference)
Where log10 is the logarithm base 10, signal is the power of the signal, and reference is the
reference or standard power, both expressed in the same unit (e.g. mW for dBm). Below is a list
of commonly used dB values for power estimates.
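The dB arithmetic is easy to verify in a few lines of Python (my own illustration, not from the text): doubling the power adds about 3 dB, and a tenfold increase adds exactly 10 dB.

```python
from math import log10

def to_dbm(power_mw):
    """Convert a power in milliwatts to dBm (dB relative to 1 mW)."""
    return 10 * log10(power_mw / 1.0)

print(to_dbm(1))    # 0.0   (1 mW is the 0 dBm reference)
print(to_dbm(100))  # 20.0  (100 mW = 20 dBm)
print(to_dbm(2))    # ~3.01 (doubling power adds ~3 dB)
```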
The radiated or transmitted power is rated in either dBm or W. Power that comes off an antenna
is measured as effective isotropic radiated power (EIRP), i.e. the transmitter output power plus
the antenna gain, minus any cable loss. EIRP is used by regulatory organizations such as the
FCC or Europe's ETSI to determine and measure power limits for WLAN equipment.
RSSI
Received Signal Strength Indicator or RSSI is a measurement of received signal i.e. how well (or
not) your device can receive a signal from an access point or a wireless router. The higher the
RSSI, the stronger the signal.
It is usually written in negative form hence the closer the value is to zero, the stronger the
received signal is.
Contrary to popular understanding, SNR is not a ratio but the difference in dB between the
received signal and the ambient noise level or noise floor. For example, if the client receives a
signal of -50 dBm and the noise floor is at -100 dBm, then the SNR is 50 dB.
In 802.11 WLANs, a lower SNR leads to data corruption and retransmissions, which in turn
impact both data throughput and latency.
Interference Noise
There are two types of interference when it comes to Wi-Fi: interference caused by non-Wi-Fi
sources and interference caused by other Wi-Fi sources (other wireless routers or access points).
If there is so much interference or noise caused by Wi-Fi devices that no one can receive a
usable signal, that phenomenon is known as co-channel interference (CCI). CCI is about
everyone trying to use the same frequency in a given space.
Adjacent channel interference (ACI) occurs when transmissions are sent on an adjacent or
partially overlapping channel. ACI is the result of bleed-over onto an overlapping channel, which
causes noise and interference. In practice, ACI is much worse than CCI.
Wi-Fi uses several frequency bands within the radio spectrum, specifically the unlicensed ISM
bands. These bands have been internationally agreed upon to be usable without acquiring a
license. The 2.4 GHz band provides longer range but lower capacity (a max of 3 non-overlapping
channels), whereas the 5 GHz band provides shorter range but higher capacity (a max of 23
non-overlapping channels).
Wireless designs need to keep in mind the capabilities of both access points and wireless client
devices. As per Cisco Meraki, a location is classified as high density if you are expecting more
than 30 clients to connect to an AP.
Various aspects of a client must be understood and used for capacity planning and configuration
of features on the APs.
• Per-connection bandwidth
• Aggregate throughput used by all clients in a coverage area
• Maximum bandwidth needed per cell
• CCI esp. in a high-density environment
• Channel reuse and directional antennas
In the Cisco Unified Wireless Network architecture, access points are known as Lightweight
APs (LAPs). A Cisco LAP can support up to eight different modes of operation. You can view a
list of the supported modes by issuing the config ap mode ? command on the WLC.
• Local
• Bridge
• FlexConnect
• Monitor
• Reap
• Rogue
• SE-Connect
• Sniffer
The most common and well-known AP mode is Local, which is also the default mode of
operation. In local mode, the LAP maintains a CAPWAP tunnel to its associated controller. All
client traffic is centrally forwarded by the controller, which is the reason why LAPs are
sometimes known as dumb APs. Without a connection to a controller, a LAP will not forward
any traffic.
Monitor mode is a feature designed to allow LWAPP-enabled APs to exclude themselves from
dealing with data traffic between clients and the infrastructure; instead, LAPs act as dedicated
sensors for location-based services (LBS). In this mode, APs cannot serve clients.
Bridge mode is used for bridging the wireless and wired infrastructure together. It is one of the
oldest modes around.
SE-Connect mode allows you to connect to LAP using the Cisco Spectrum Expert to gather
statistics. It is used for troubleshooting purposes only.
Sniffer mode is similar to SE-Connect and is used for troubleshooting purposes only.
Rogue Detector mode, yet again like the SE-Connect and Sniffer modes, doesn't serve any
clients; it is used for security purposes against rogue APs.
Describe access point discovery and join process (discovery algorithms, WLC selection process)
The LAPs must first discover the WLCs and register with them before the LAPs can service
wireless clients. WLCs manage LAP configurations and firmware, making LAP deployments
zero-touch. For the WLC to be able to manage the LAP, the LAP needs to discover the
controller and register with the WLC. After the LAP has registered with the WLC, LWAPP
messages are exchanged and the newly registered AP initiates a firmware download from the
WLC.
During the LAP provisioning, the following WLAN-specific configurations take place.
• SSID
• Security
• Data rate, radio channels, and power levels
During LAP registration with the WLC, the following sequence of events takes place.
LAPs can perform discovery in either native (L2) mode or L3 mode. A LAP uses the L3
discovery method if the L2 method fails or is not available. You can use the debug lwapp events
enable command to display the sequence of events outlined above.
Thu Aug 15 00:24:40 2019: 00:0b:85:51:5a:e0 Successfully added NPU Entry for
AP 00:0b:85:51:5a:e0 (index 48)
Switch IP: 0.0.0.0, Switch Port: 0, intIfNum 2, vlanId 0
AP IP: 0.0.0.0, AP Port: 0, next hop MAC: 00:0b:85:51:5a:e0
Thu Aug 15 00:24:40 2019: 00:0b:85:51:5a:e0 Successfully transmission of
LWAPP Join-Reply to AP 00:0b:85:51:5a:e0
Thu Aug 15 00:24:40 2019: 00:0b:85:51:5a:e0 Register LWAPP event for
AP 00:0b:85:51:5a:e0 slot 0
Thu Aug 15 00:24:40 2019: 00:0b:85:51:5a:e0 Register LWAPP event for
AP 00:0b:85:51:5a:e0 slot 1
LAPs can also use L3 LWAPP discovery, where the L3 algorithm uses different options (such as
local subnet broadcast, DHCP option 43, and DNS resolution) to discover WLCs and build a list
of controllers. Once the LAP has a list, it can attempt to join a WLC.
Further Reading
LAP Registration to WLC [25]
Describe the main principles and use cases for Layer 2 and Layer 3 roaming
Mobility, or roaming, is a wireless client’s ability to maintain its association while moving from
one AP to another. Once a client is successfully associated with an AP, the WLC controlling that
AP adds an entry into its database that contains the client’s IP and MAC addresses. When the
client moves to a new AP, the WLC updates its entry with the new AP information behind the
scenes. Obviously, the process is a little more complicated when a client moves from an AP
joined to one WLC to an AP joined to another controller (i.e. the case of inter-controller roaming).
Layer 2 roaming occurs when a client leaves an AP for another where both APs are configured in
the same subnet and most likely attached to the same WLC.
[25] https://bit.ly/2uSYAk0
Layer 3 roaming covers the case where a client roams between APs that are joined to two
different WLCs and the newly joined AP maps the client to a different subnet/VLAN. A Cisco
WLC can either transparently change the client’s subnet or build what is known as a mobility
tunnel to facilitate Layer 3 roaming.
The first step is to accurately describe the problem, which requires asking critical questions and
documenting the answers to them. Let me share a few example questions that are worth asking.
Does the issue manifest itself only on certain WLC firmware version(s)?
Is the issue faced with only specific types of wireless client or software versions?
Are there any clients that are not running into this issue?
Does the issue only happen with a specific wireless mode, such as 802.11n mode or 802.11ac
mode?
Is the issue related to a specific SSID, configured with or without certain WLAN security?
Did it ever work? If yes, what configurations or combinations are known to work?
You can collect the WLC’s running configuration using the “show run-config” CLI via SSH (or
even Telnet); however, you may want to disable paging first by using the “config paging disable”
command. If your WLC has a very large number of APs joined to it, you can exclude AP
configurations by using the “show run-config no-ap” command. You can also collect AP Group
VLAN configuration using “show wlan apgroups” command. Please note that if your WLC has a
large number of APs joined, it may take tens of minutes to collect the full run-config.
If you need to collect current logs from the WLC, you can do so by executing the following
commands.
show msglog
show traplog
The next step is to collect some debugs from the WLC; you can use the following commands to
accomplish the task.
Show Commands
show time
Debug Commands
Depending on the issue at hand, you may also want to collect the following debug commands.
Before you start the show and/or debug collection, it is highly recommended that you first
execute the following set of commands.
Show Commands
For show commands, be sure to collect them at least twice i.e. before and after the test is
completed.
term len 0
show clock
show tech
show logging
more event.log
Debug Commands
After you’ve collected the show commands, you can go ahead and enable the following debug
commands and rerun your test.
debug dot11 {d0|d1} trace print clients mgmt keys rxev txev rcv xmt txfail ba
Now, if you’re running wave 2 APs that run Click OS (or AP-COS) such as 1800, 2800, 3800 or
4800 model access points, you need to use the following show and debug commands.
term len 0
show clock
show tech
show log
Please note that debug CLIs are not consistent across the board for wave 2 APs, i.e. the 1800
and 2800/3800 models use two different sets of debug commands. You can disable all debugs
using “config ap client-trace stop” on all wave 2 APs.
term mon
Last but not least, you may also want to collect network topology and client-side information to
complete your root cause analysis.
Gathering client-side information is crucial to complete and round out your data collection
process. You may want to consider collecting the following client-related information.
• OS version
• WLAN adapter model and driver version
• Supplicant, if any used
• Security configuration
• Any configuration or setting deviation from client adapter or OS defaults
Broadly speaking, there are a few common threads when troubleshooting client-side
connectivity issues. Here is a list of client-side issues that you may run into during your day job
as a network engineer.
Idle timeout
Session timeout
WLAN changes
Manual deletion from WLC
Authentication timeout
AP Radio Reset
IP Services
NTP is designed to synchronize the time on a network. It uses UDP to transport packets. An NTP
network receives its time from an authoritative time source such as an atomic clock attached to a
time server. NTP distributes this time across the network.
NTP client makes a transaction with its server each polling interval. It uses the concept of a
stratum to describe the distance in hops between a machine and an authoritative time source.
Devices running NTP prefer another device that has the lowest stratum number. Generally, it is
possible to achieve 10ms drift over long distances (WAN) and 1ms for LAN.
Client/server mode is the most common Internet use case. In this setup, a client or dependent
server can be synchronized to a group member, but no group member can synchronize to the
client or dependent server.
Symmetric active/passive mode is useful for configurations where a group of low-stratum peers
operate as backups for each other.
In broadcast or multicast mode, clients listen for NTP messages sent to a broadcast or multicast
address, which allows them to use a single configuration to associate with multiple servers.
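These modes map to a handful of IOS commands. A minimal sketch (the server and peer addresses and the stratum value are illustrative):
Router(config)#ntp server 192.168.1.1
Router(config)#ntp peer 192.168.1.2
Router(config)#ntp master 3
Router(config-if)#ntp broadcast client
You can then verify synchronization using the “show ntp status” and “show ntp associations” commands.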
Network Address Translation (NAT) comes in many different forms, but all variations translate
IP addresses, with or without the help of TCP/UDP ports. The two broad families are NAT and
Port Address Translation (PAT).
NAT can be configured in two ways: static and dynamic. Static NAT is the simplest form of
NAT, where only one-to-one translation of IP addresses is involved. With static NAT, translations
stay in the translation table forever and never time out once they are configured by the network
admin. In order to remove entries from the translation table, you have to remove the static NAT
statements from the configuration.
Dynamic NAT is like static NAT in the sense that it is still one-to-one NAT, i.e. between an
inside local and an inside global address. However, the mapping of an inside local to an inside
global address happens dynamically. For dynamic NAT to work, you must set up a pool of inside
global IP addresses. A dynamic entry stays in the translation table only as long as there is some
traffic; in the absence of traffic, the entry times out.
What if you had more local addresses than global addresses? Enter PAT. PAT allows a specific
UDP or TCP port on a global address to be translated to a specific port on a local address. Static
PAT, much like static NAT, is where you specify the translation rules within the configuration.
Dynamic PAT (or NAT overload) is a way to hide an entire RFC 1918 IP address space behind a
single public, globally routable IP address (or a few global IP addresses as opposed to one!).
Static NAT
Router(config)#interface fa0/0
Router(config-if)#ip nat inside
Router(config)#interface fa0/1
Router(config-if)#ip nat outside
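To complete the static NAT example, you also need the one-to-one translation rule itself (the inside local and inside global addresses are illustrative):
Router(config)#ip nat inside source static 10.1.1.5 200.1.1.5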
Dynamic NAT
Router(config)#interface fa0/0
Router(config-if)#ip nat inside
Router(config)#interface fa0/1
Router(config-if)#ip nat outside
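For dynamic NAT, the interface roles above are combined with an ACL that selects the inside local addresses and a pool of inside global addresses (all values are illustrative):
Router(config)#access-list 1 permit 10.1.1.0 0.0.0.255
Router(config)#ip nat pool MYPOOL 200.1.1.10 200.1.1.20 netmask 255.255.255.0
Router(config)#ip nat inside source list 1 pool MYPOOL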
After some traffic matches those NAT rules, you can observe the resulting entries with the
“show ip nat translations” command.
Static PAT
Router(config)#interface fa0/0
Router(config-if)#ip nat inside
Router(config)#interface fa0/1
Router(config-if)#ip nat outside
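The PAT rules themselves could look as follows; the first line is a static PAT (port forwarding) entry, and the remaining lines form a NAT overload rule that hides the whole inside network behind the outside interface address (addresses and ports are illustrative):
Router(config)#ip nat inside source static tcp 10.1.1.5 80 200.1.1.5 8080
Router(config)#access-list 1 permit 10.1.1.0 0.0.0.255
Router(config)#ip nat inside source list 1 interface fa0/1 overload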
After some traffic matches the PAT rules, you can again observe the translations with the
“show ip nat translations” command.
HSRP Configuration
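A minimal HSRP configuration that matches the verification output below (addresses and priority are illustrative):
R1(config)#interface f0/0
R1(config-if)#ip address 10.1.1.1 255.255.255.0
R1(config-if)#standby 1 ip 10.1.1.100
R1(config-if)#standby 1 priority 110
R1(config-if)#standby 1 preempt
R2(config)#interface f0/0
R2(config-if)#ip address 10.1.1.2 255.255.255.0
R2(config-if)#standby 1 ip 10.1.1.100
R2(config-if)#standby 1 preempt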
HSRP Verification
R1#sh standby
FastEthernet0/0 - Group 1
State is Active
Preemption enabled
R2#sh standby
FastEthernet0/0 - Group 1
State is Standby
Preemption enabled
VRRP Configuration
Configuration on R1 (master)
R1(config)#interface f0/0
R1(config-if)#ip address 10.1.1.1 255.255.255.0
R1(config-if)#no shutdown
R1(config-if)#vrrp 123 ip 10.1.1.100
R1(config-if)#vrrp 123 preempt
Configuration on R2 (backup)
R2(config)#interface f0/0
R2(config-if)#ip address 10.1.1.2 255.255.255.0
R2(config-if)#no shutdown
R2(config-if)#vrrp 123 ip 10.1.1.100
R2(config-if)#vrrp 123 priority 90
VRRP Verification
VRRP status on R1
VRRP status on R2
R2#show vrrp brief
Interface Grp Pri Time Own Pre State Master addr Group addr
Fa0/0 123 90 3648 Y Backup 10.1.1.1 10.1.1.100
Multicast is about bandwidth conservation based on efficient one-to-many traffic routing and
forwarding. Unicast IP communication takes place between two hosts, whereas multicast
transmission takes place between one sender (aka source) and multiple receivers (aka group).
There are two protocols involved in making the source to group multicast transmission happen.
PIM is used between routers so that they can track the source to group multicast transmissions.
IGMP, on the other hand, is used between end hosts on a LAN segment and the routers on that
LAN so that host-to-group memberships can be tracked (to avoid sending traffic down interfaces
where no receivers are located).
IP multicast addresses are carved out of the Class D address space, which means group addresses
fall in the range of 224.0.0.0 to 239.255.255.255. A multicast address is chosen at the source.
There are two main multicast delivery models which differ only in the way receivers receive
multicast traffic, ASM and SSM.
For the ASM delivery mode, a multicast receiver host can use any version of IGMP to join a
multicast group. This group is denoted as G in the routing table state notation (*, G). When a
receiver joins this group, it means that it wants to receive multicast traffic sent by any source to
group G.
SSM is a multicast delivery model that is best suited for one-to-many applications such as voice
and video broadcasts. In this mode, an IP multicast receiver host must use IGMPv3 to subscribe
to a stream denoted as (S, G), which means that the receiving host is only interested in receiving
traffic sent by source S to group G.
PIM can operate in two modes, dense (PIM-DM) or sparse (PIM-SM) mode. PIM dense mode
uses a push model to flood multicast traffic to every part of the network. In this mode, the router
assumes that all other routers want to forward multicast packets for a group.
PIM sparse mode uses a pull model to deliver multicast traffic. Only network segments with
active receivers that have explicitly shown interest or requested the data will receive the traffic.
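In Cisco IOS, PIM sparse mode is enabled per interface, and a sparse-mode domain additionally needs a rendezvous point (RP); a minimal sketch (the RP address is illustrative):
Router(config)#ip multicast-routing
Router(config)#interface fa0/0
Router(config-if)#ip pim sparse-mode
Router(config-if)#exit
Router(config)#ip pim rp-address 10.0.0.1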
IGMPv1 allows an end host to explicitly announce its willingness to receive a multicast traffic
stream. This version only has two messages, Membership Query and Membership Report. The
Membership Query is always sent to 224.0.0.1.
IGMPv2 adds multiple enhancements over IGMPv1. It allows the Membership Query to be both
general (i.e. sent to 224.0.0.1) and group-specific (sent to a multicast group), and it introduces a
Leave Group message so hosts can signal that they are leaving a group. The general Membership
Query is used to find out all multicast groups that the end hosts are subscribed to.
Both IGMPv1 and IGMPv2 are inadequate for scenarios where an end host is interested in
receiving multicast traffic for a group only from specific sources rather than from any sender.
IGMPv3 caters to that use case with its source-specific multicast feature.
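Enabling SSM on a Cisco IOS router involves reserving the default SSM group range (232.0.0.0/8) and running IGMPv3 on receiver-facing interfaces; a minimal sketch:
Router(config)#ip multicast-routing
Router(config)#ip pim ssm default
Router(config)#interface fa0/0
Router(config-if)#ip pim sparse-mode
Router(config-if)#ip igmp version 3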
Chapter Summary
• A trunk is an L2 feature that uses 802.1Q VLAN tags to multiplex (or trunk) traffic
from multiple VLANs on a single physical link
• IEEE 802.1w or RSTP provides rapid convergence of the IEEE 802.1D spanning tree
protocol.
• OSPF uses flooding to exchange link-state advertisements between routers, and all routers
within an area have an identical link-state database
• The OSPF process ID is a numeric value only locally significant to the router, i.e. it is
never sent to the other routers and thus doesn’t have to match with process IDs on other
routers either.
• By default, BGP establishes a peer relationship using the IP address of the interface
closest to the peering router
• iBGP speakers must either be fully meshed, or a route reflector must be used.
• Received Signal Strength Indicator or RSSI is a measurement of received signal i.e. how
well (or not) your device can receive a signal from an access point or a wireless router
• SNR is not a ratio but a difference in dBs between the received signal and the ambient
noise level or noise floor
• The most common and well-known AP mode is known as Local.
• Layer 3 roaming covers the case where a client roams between APs joined to two different
WLCs and the newly joined AP maps the client to a different subnet/VLAN.
This chapter covers the following exam topics from Cisco’s official 350-401 V1.0 Enterprise
Network Core Technologies (ENCOR) exam blueprint.
• Diagnose network problems using tools such as debugs, conditional debugs, trace route,
ping, SNMP, and syslog
• Configure and verify device monitoring using syslog for remote logging
• Configure and verify NetFlow and Flexible NetFlow
• Configure and verify SPAN/RSPAN/ERSPAN
• Configure and verify IPSLA
• Describe Cisco DNA Center workflows to apply network configuration, monitoring, and
management
• Configure and verify NETCONF and RESTCONF
Cisco IOS CLI is packed with tools (much like Linux) that allow you to view and troubleshoot
the current state of the overall system. This is true regardless of the actual IOS variant, i.e.
whether that’s the now-defunct classic IOS, IOS XE, or IOS XR. Primarily, there are two tools
within Cisco IOS that stand out more than all others: the show and debug commands.
While show commands provide you with a one-time snapshot of the current system state for a
specific area, such as “show ip route” for routing, debug commands provide real-time details of
the system’s inner workings (e.g. debug ip packet) until they are explicitly turned off by the
network admin. For this reason alone, you need to run debugs with caution.
Debug commands that operate on every packet, as opposed to specific events, can be intense for
the router’s CPU.
You can minimize the debug CPU overhead by narrowing down the scope of the debug
command, a feature that’s known as conditional debugging. Conditional debugging allows you to
filter out the debug information shown by the interface and/or the conditions that you specify.
You can start a debug command simply by just entering and executing the command for your
specific area of interest such as IP routing, OSPF or BGP or NetFlow, etc. Now, regardless of the
number of debug commands you’re running at any moment, Cisco has made it easier to shut
them all down by using one single command, i.e. “no debug all” or “undebug all”. You can also
display a set of all debugs that are configured on a given system using the “show debug”
command. Under the default configuration, debug output is sent only to the console port. Cisco
recommends that you disable this default by using the “no logging console” command. You can
turn on debug output on an SSH or Telnet session by using the “terminal monitor” command. If
you don’t witness any command output despite entering the “terminal monitor” command, be
sure to verify that “no logging on” command has not been used. You can also log debug
messages to a memory buffer using the “logging buffered” command.
Before you use debug command(s), Cisco recommends that you configure millisecond-level
timestamping on the router, for debug and logs, by using the following commands.
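Cisco’s commonly recommended timestamping commands are:
Router(config)#service timestamps debug datetime msec localtime show-timezone
Router(config)#service timestamps log datetime msec localtime show-timezone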
When using debug on a platform that runs Cisco IOS XE, keep in mind the following additional
guidelines.
• show debug condition (shows all conditional debugs)
The following example shows how to enable debugging for packets matching an ACL 101 and
destination IPv4 address 100.1.1.1 on interface Gi0/1/2, and to enable conditional debug for the
CEF-MPLS feature:
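The exact conditional debug CLI varies by IOS XE platform and release; as a rough sketch (using the interface, ACL, and destination address from the description above, and assuming a platform that supports the debug platform condition syntax), the first example could look like this:
Router#debug platform condition interface Gi0/1/2 ipv4 access-list 101 both
Router#debug platform condition interface Gi0/1/2 ipv4 100.1.1.1/32 both
Router#debug platform condition start
When finished, stop and clear the conditions with “debug platform condition stop” and “clear platform condition all”. The CEF-MPLS feature debug command itself is platform-specific.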
The following example shows how to enable debug for packets matching an ACL 200, matching
mpls packets with a label of 20, on interface Gi0/1/2, and to enable conditional debug for the
CEF-MPLS feature:
Configure and verify device monitoring using syslog for remote logging
Remote logging via Syslog is a crucial tool for root cause analysis when it requires debug
captures from a router over long periods. There is virtually no limit to how many debug log
entries you can generate and send to remote Syslog servers.
Configuring Syslog
It is Cisco’s recommended best practice to configure timestamping on all the routers involved in
debugging since it makes it easier to corroborate log entries across routers.
The logging trap <level #> command specifies, by severity level, which messages are sent to the
Syslog server. The default severity is informational (level 6), which includes everything except
debug output (level 7). You can configure it all the way up to level 7, which essentially means
sending everything, including debug messages, to Syslog, so obviously proceed with caution.
The logging facility <facility-type> command lets you specify the facility value used in the
Syslog messages. The default is local7, whereas possible values go from local0 to local7.
Router#config terminal
Enter configuration commands, one per line. End with CNTL/Z.
Router(config)#logging 192.168.1.10
Router(config)#service timestamps debug datetime localtime show-timezone msec
Router(config)#service timestamps log datetime localtime show-timezone msec
Router(config)#logging facility local4
Router(config)#logging trap warning
Router(config)#end
Verifying Syslog
Router#show logging
Syslog logging: enabled (0 messages dropped, 0 flushes, 0 overruns)
Console logging: level debugging, 79 messages logged
Monitor logging: level debugging, 0 messages logged
Buffer logging: disabled
Trap logging: level warnings, 80 message lines logged
Logging to 192.168.1.10, 57 message lines logged
NetFlow is a Cisco IOS feature that provides detailed statistics for packets flowing through a
router, captured from incoming or outgoing packets. A flow is identified as a unidirectional
stream of packets between a source and a destination, based on a combination of network-layer
information (such as source/destination IP addresses), transport-layer information (such as
source/destination TCP or UDP ports), the L3 protocol type, Type of Service (ToS), and input
interface.
Typical use cases for NetFlow include network or application monitoring, application profiling,
network planning, aid in DDoS mitigation, and data warehousing.
Ingress IP packets include IP to IP and IP to MPLS packets, whereas Egress IP packets likewise
include IP to IP and MPLS to IP packets.
configure terminal
!
interface ethernet 0/0
ip flow ingress
!
configure terminal
!
interface ethernet 1/0
ip flow egress
NetFlow data export destination is a host that’s running an application that can collect NetFlow
data. NetFlow data export version 9 is a flexible format and supports new fields and record types
needed to accommodate BGP next-hop and MPLS protocol support.
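A typical classic NetFlow export configuration is brief (the collector address and UDP port are illustrative):
Router(config)#ip flow-export destination 192.168.1.50 9996
Router(config)#ip flow-export version 9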
To verify that NetFlow is working properly, you can use any of the following show commands.
To verify whether NetFlow data export is operational, you can use the “show ip flow export”
command.
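Commonly used commands for this purpose include:
Router#show ip cache flow
Router#show ip flow interface
Router#show ip flow export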
The newer form of NetFlow is known as Flexible NetFlow (or FNF). It allows you to understand
network behavior for more specific applications by allowing you to define new flow keys, such
as packet length or MAC address.
Flexible NetFlow consists of several components including flow records, flow monitors, flow
exporters, and flow samplers. Flow records are a combination of key and non-key fields. Flow
records are assigned to flow monitors to define the flow data cache. You can either use pre-
defined flow records or create your own. Flow monitors are applied to interfaces and carry data
collected within the cache. Flow exporters export the flow monitor cache to external NetFlow
collection hosts. Flow samplers are used to reduce the load on the device by sampling packets (1
out of n). Samplers are combined with flow monitors as they are applied to an interface.
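Putting these components together, a minimal Flexible NetFlow sketch looks as follows (the record, exporter, and monitor names, plus the collector address and port, are illustrative):
Router(config)#flow record MY-RECORD
Router(config-flow-record)#match ipv4 source address
Router(config-flow-record)#match ipv4 destination address
Router(config-flow-record)#collect counter bytes
Router(config-flow-record)#exit
Router(config)#flow exporter MY-EXPORTER
Router(config-flow-exporter)#destination 192.168.1.50
Router(config-flow-exporter)#transport udp 9996
Router(config-flow-exporter)#export-protocol netflow-v9
Router(config-flow-exporter)#exit
Router(config)#flow monitor MY-MONITOR
Router(config-flow-monitor)#record MY-RECORD
Router(config-flow-monitor)#exporter MY-EXPORTER
Router(config-flow-monitor)#exit
Router(config)#interface fa0/0
Router(config-if)#ip flow monitor MY-MONITOR input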
Current entries: 4
High Watermark: 4
Flows added: 101
Flows aged: 97
- Active timeout ( 1800 secs) 3
- Inactive timeout ( 15 secs) 94
- Event aged 0
- Watermark aged 0
- Emergency aged 0
IPV4 DESTINATION ADDRESS: 192.168.1.2
ipv4 source address: 100.10.11.1
trns source port: 25
trns destination port: 25
counter bytes: 72840
counter packets: 1821
All SPAN features allow traffic passing through ports or VLANs to be copied to another port on
the same switch (local SPAN), to a different switch within the same bridging domain (remote
SPAN), or across an L3 cloud (ERSPAN). You will typically connect a packet sniffer or an IPS
to the SPAN destination port for analysis.
SPAN or local SPAN supports a session within one switch. It copies traffic from one or more
source port(s) or VLANs to a destination port. With Remote SPAN or RSPAN, source ports and
source VLANs can be located anywhere in the switching domain. ERSPAN supports source
ports or VLANs and even destinations on different switches even across L3 networks by way of
a GRE tunnel. Technically, ERSPAN is RSPAN over GRE.
This example shows how to set up SPAN session 10 for monitoring source port traffic to a
destination port.
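A sketch of such a local SPAN session (interface numbers are illustrative):
Switch(config)#monitor session 10 source interface Gi1/0/1
Switch(config)#monitor session 10 destination interface Gi1/0/2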
This example shows how to configure RSPAN session 10 to monitor multiple source interfaces
and configure the destination as RSPAN VLAN 101.
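A sketch of the source-switch side of such an RSPAN session (interface numbers are illustrative):
Switch(config)#vlan 101
Switch(config-vlan)#remote-span
Switch(config-vlan)#exit
Switch(config)#monitor session 10 source interface Gi1/0/3 - 4
Switch(config)#monitor session 10 destination remote vlan 101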
Switch> enable
Switch# configure terminal
Switch(config)# monitor session 100 type erspan-source
Switch(config-mon-erspan-src)# source interface GigabitEthernet1/0/2 rx
Switch(config-mon-erspan-src)# source interface GigabitEthernet1/0/3 - 8 tx
Switch(config-mon-erspan-src)# source interface GigabitEthernet1/0/4
Switch(config-mon-erspan-src)# destination
Switch(config-mon-erspan-src-dst)# erspan-id 101
Switch(config-mon-erspan-src-dst)# origin ip address 172.16.0.1
Switch(config-mon-erspan-src-dst)# ip prec 5
Switch(config-mon-erspan-src-dst)# ip ttl 32
Switch(config-mon-erspan-src-dst)# mtu 1700
Switch(config-mon-erspan-src-dst)# vrf 100
Switch(config-mon-erspan-src-dst)# no shutdown
Switch(config-mon-erspan-src-dst)# end
Cisco IP SLA sends probe traffic across the network to simulate actual network data and collects
network performance information in real time. IP SLA measurements can be used for
troubleshooting and for network planning and design purposes.
IP SLA is a layer 3 feature and can be configured end to end to best reflect the metrics close to
what an end user might experience.
IP SLA Configuration
The following ICMP echo operation will start immediately and run indefinitely.
ip sla 6
icmp-echo 172.29.139.134 source-ip 172.29.139.132
frequency 300
request-data-size 28
tos 160
timeout 2000
tag SFO-RO
ip sla schedule 6 life forever start-time now
The UDP echo operation measures end-to-end response time between a Cisco device and any
other device configured with an IP address. The following UDP echo operation starts
immediately and runs indefinitely (the destination port and addresses are illustrative, mirroring
the ICMP example above).
ip sla 5
udp-echo 172.29.139.134 5000 source-ip 172.29.139.132
frequency 300
ip sla schedule 5 life forever start-time now
IP SLA Verification
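Commonly used verification commands here include (the operation numbers match the examples above):
Router#show ip sla configuration 6
Router#show ip sla statistics 6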
You can use Cisco Digital Network Architecture (DNA) Center to manage both traditional and
SD-Access networks. It allows network admins to deploy quickly, reduce risks, and lower costs.
It stores all the software images and software maintenance updates (or SMUs) for the devices in
your network. Its capabilities span three areas.
• Policy
• Automation
• Analytics
The Cisco DNA Center network plug and play (PnP) application allows automation of device
configuration and software deployment. The PnP agent runs on the network device, while the
PnP server can be hosted on premises or in the cloud (via PnP Connect). The PnP protocol uses
HTTPS/XML. It allows you to perform the following tasks.
The Network Plug and Play > Devices window shows all network devices that the Network Plug
and Play tool is onboarding and provisioning.
A workflow defines a network device provisioning process as a sequence of actions, such as
applying a device configuration or installing a software image. A workflow is applied to a device
when it is in the claimed state and has finished booting up.
Creating a Workflow
1. Start with DNA Center home page and click on Network Plug and Play
2. Click on Workflows, and this is where you can create a new workflow or edit an existing
one
3. By default, image and configuration tasks are included in a new workflow. You can
delete tasks or add new ones, and you can also change the order of task execution.
4. Once done with the tasks, click Add to create the new workflow.
Cisco DNA Center Assurance provides a comprehensive solution to ensure better and more
consistent service levels. It provides reactive as well as proactive network monitoring and
troubleshooting.
Data collection is about streaming in a variety of network telemetry and contextual data in real
time. Once data is ingested, data correlation and analytics can be performed. Data is stored
within the DNA Center database and can be utilized by Assurance as well as other DNA Center
applications such as Capacity Planning. Assurance provides time-series analysis, graph-based
data modeling, and a system management portal.
It is no secret that the traditional methods of managing networks today are either SNMP or the
device’s CLI. While those methods are useful, they have their inherent limitations too. To boot,
CLIs are proprietary, and there is no structured way to ingest and extract data to or from them.
Likewise, SNMP is also useful but doesn’t make a distinction between configurational and
operational data.
What’s common between the two methods is that neither offers a programmatic, standards-based
interface into the device where you can write, replace, or edit configurations.
Cisco IOS XE supports Yet Another Next Generation (YANG) data modeling language, which
can be used with the Network Configuration Protocol (NETCONF) to provide both
programmable and automated network operations. NETCONF is an XML-based protocol that
client applications can use to request information from and make configuration changes to a
device.
NETCONF makes a distinction between configuration and state data. It uses three different data
stores, i.e., candidate, running and startup configuration, and uses RPC for messages and SSH
(TCP 830) for secure transport.
NETCONF is session-oriented and stateful, unlike REST and RESTCONF, which are stateless.
The NETCONF protocol contains four layers.
• Secure Transport (e.g. SSH)
• Messages (XML-encoded RPCs and notifications)
• Operations (each device and platform supports a given set of operations)
• Content (the configuration and state data itself)
NETCONF Operations layer defines a set of base protocol operations invoked as RPC methods
with XML-encoded parameters. As per the RFC, the Message layer provides a simple, transport-
independent framing mechanism for encoding RPCs and notifications.
The Secure Transport layer provides a communication path between the client and the server;
NETCONF can be layered over any transport protocol that provides a set of basic requirements.
The Content layer is outside the scope of the NETCONF RFC.
• <get>, like “show” CLI, it retrieves running configuration and device state information
• <get-config>, like “show run” CLI, it retrieves all or part of a specified configuration
• <edit-config>, like “config terminal” commands, it loads all or part of a configuration to
the specified configuration datastore
• <delete-config>, like deleting a stored configuration, it simply deletes a configuration datastore
NETCONF Example
<copy-config>
<source>
<url>file://my-candidate-config.cfg</url>
</source>
<target>
<candidate/>
</target>
</copy-config>
NETCONF Configuration
NETCONF defines three datastores (running, candidate, and startup); implementations may also
define additional datastores if they so prefer.
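On IOS XE, enabling NETCONF-YANG takes essentially two steps; a minimal sketch (the username and secret are illustrative):
Router(config)#username admin privilege 15 secret MySecret123
Router(config)#netconf-yang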
NETCONF Verification
You can utilize any of the following show commands to verify your NETCONF configuration.
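Commonly used show commands for this purpose on IOS XE include:
Router#show netconf-yang sessions
Router#show netconf-yang statistics
Router#show platform software yang-management process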
RESTCONF
RESTCONF is an IETF standard and describes how to map a YANG spec to a RESTful
interface. Keep in mind that RESTful APIs are not meant to replace NETCONF, but rather add a
simplified and standard interface that follows REST’s well-known principles. RESTCONF is
essentially the addition of a REST API, which is very popular, to NETCONF. YANG models are
used when you use RESTCONF, and thus the URLs, HTTP verbs, and request bodies are
automatically derived from the associated YANG model.
RESTCONF is a stateless protocol and uses structured data with XML or JSON encodings and
YANG to provide REST-like APIs so you can programmatically configure your network device.
RESTCONF uses HTTPS methods to provide CREATE, READ, UPDATE, and DELETE (also
famously known as CRUD) operations on a YANG-based datastore.
Unlike NETCONF, RESTCONF supports both XML and JSON encoding formats. It is worth
noting that RESTCONF not only adapts the YANG specification to a RESTful API interface; it
can also function as a southbound API when network devices support RESTCONF interfaces
directly and expose them to an upstream orchestrator.
The RESTCONF protocol, much like a REST API, supports the GET, POST, PUT, PATCH, and
DELETE operations, which map onto NETCONF operations. For example, the RESTCONF
DELETE method corresponds to the NETCONF <delete-config> operation, and a RESTCONF
GET (e.g. GET http://csr1kv/restconf/api/config/native) corresponds to <get-config>.
RESTCONF Configuration
You can configure the RESTCONF interface to access NETCONF datastores in three simple
steps.
1. Configure AAA (remote or local) for NETCONF/RESTCONF connections (username admin
privilege 15 secret <pwd>)
2. Enable the RESTCONF interface (restconf command)
3. Enable the HTTPS server (ip http secure-server command)
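Putting the three steps together (the username and secret are illustrative):
Router(config)#username admin privilege 15 secret MySecret123
Router(config)#restconf
Router(config)#ip http secure-server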
RESTCONF Verification
Number of sessions : 1
session-id : 19
transport : netconf-ssh
username : admin
source-host : 2001:db8::1
login-time : 2018-10-26T12:37:22+00:00
in-rpcs : 0
in-bad-rpcs : 0
out-rpc-errors : 0
out-notifications : 0
global-lock : None
Chapter Summary
• NetFlow data export destination is a host that’s running an application that can collect
NetFlow data.
• Flexible NetFlow consists of several components including flow records, flow monitors,
flow exporters, and flow samplers.
• Technically, ERSPAN is RSPAN over GRE.
• IP SLA is a layer 3 feature and can be configured end to end to best reflect the metrics
close to what an end-user might experience.
• NETCONF is a network management protocol designed specifically for transactional
network management and can be considered a successor to SNMP.
• RESTCONF is an IETF standard and describes how to map a YANG spec to a RESTful
interface.
• Unlike NETCONF, RESTCONF supports both XML and JSON encoding formats
CHAPTER 5 SECURITY
This chapter covers the following exam topics from the Cisco’s official 350-401 V1.0 Enterprise
Network Core Technologies (ENCOR) exam blueprint.
There are four types of TTY lines, and you can display what's supported on your device by using
the show line command.
1. CTY
2. TTY
3. AUX
4. VTY
CTY is your console port. On any router, it appears in the configuration as line con <#>.
TTY lines are asynchronous lines used for inbound and outbound remote access via a
modem or terminal connection; they appear as line <#> on a router. AUX is the auxiliary port,
and you can spot it in the configuration as line aux <#>. VTY lines are the virtual terminal lines of
a router and are used primarily to handle inbound Telnet or SSH connections. You can configure
them by using the line vty 0 <#> command.
Configuring a password on a line is straightforward. You must enter global configuration mode,
then line configuration mode, and then configure the password.
router#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
router(config)#line con 0
router(config-line)#password cciein8weeks
router(config-line)#login
If you want to configure local user-specific passwords, all you need to do is add username
and password pairs and turn on local authentication using the login local command.
router#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
router(config)#username user1 password password1
router(config)#username user2 password password2
router(config)#line vty 0 4
router(config-line)#login local
To enable authentication, authorization, and accounting (AAA) authentication for line logins,
you can use the login authentication command in line configuration mode. You also need to
configure AAA services.
router#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
router(config)#aaa new-model
router(config)#aaa authentication login my-auth-list tacacs+
router(config)#tacacs-server host 192.168.1.101
router(config)#tacacs-server key cciein8weeks
router(config)#line 1 8
router(config-line)#login authentication my-auth-list
ACLs
Access lists (or ACLs) are stateless L2 to L4 packet filters that specify what is permitted and
denied inbound or outbound to and from a network. By comparison, firewalls are stateful packet
filters and can operate on L2 to L7 packet headers or data.
ACL Configuration
router#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
router(config)#access-list 101 deny icmp any any
router(config)#access-list 101 permit ip any any
router(config)#^Z
router#
*Mar 9 00:43:29.832: %SYS-5-CONFIG_I: Configured from console by console
Note that issuing no access-list 101 deny icmp any any would remove the entire numbered ACL,
not just that one entry; after doing so, show access-list would return no output for ACL 101.
ACL Verification
router#show access-list
Extended IP access list 101
deny icmp any any
permit ip any any
router#
Further Reading
ACL Configuration Guide26
CoPP
You can divide the traffic a networking device handles, whether destined to it or transiting
through it, into four distinct logical planes.
1. Data plane (traffic that is neither sourced by nor destined to the device, i.e., transit traffic)
2. Control plane (traffic sourced by or destined to the device itself, used for the
creation and operation of the network, such as BGP, OSPF, and ARP)
3. Management plane (technically the same as control plane traffic, but used for network
management, such as TFTP, SSH, SNMP, FTP, NTP, etc.)
4. Services plane (a special case of data plane traffic in which the router modifies
the packet header or payload, such as GRE, QoS, NAT, etc.)
26 https://bit.ly/2S8cM0B
CoPP is the router processor protection mechanism, and thus applies to all packets that get
punted to the router processor. CoPP is about protecting the punt path, not just the control plane.
Router#show running-config
Building configuration...
.
. ---<skip>---
!
class-map match-all Catch-All-IP
match access-group 124
class-map match-all Management
match access-group 121
class-map match-all Normal
match access-group 122
class-map match-all Undesirable
match access-group 123
class-map match-all Routing
match access-group 120
!
policy-map CCIEin8Weeks_CoPP
class Undesirable
police 8000 1500 1500 conform-action drop exceed-action drop
class Routing
police 1000000 50000 50000 conform-action transmit exceed-action transmit
class Management
police 100000 20000 20000 conform-action transmit exceed-action drop
class Normal
police 50000 5000 5000 conform-action transmit exceed-action drop
class Catch-All-IP
police 50000 5000 5000 conform-action transmit exceed-action drop
class class-default
police 8000 1500 1500 conform-action transmit exceed-action transmit
You can verify CoPP configuration by using the show policy-map control-plane command.
Router#show policy-map control-plane
Control Plane

Service-policy input: CCIEin8Weeks_CoPP

Class-map: Routing (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: access-group 120
police:
cir 1000000 bps, bc 50000 bytes
conformed 0 packets, 0 bytes; actions:
transmit
exceeded 0 packets, 0 bytes; actions:
transmit
conformed 0 bps, exceed 0 bps
Class-map: Management (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: access-group 121
police:
cir 100000 bps, bc 20000 bytes
conformed 0 packets, 0 bytes; actions:
transmit
exceeded 0 packets, 0 bytes; actions:
drop
conformed 0 bps, exceed 0 bps
Class-map: Normal (match-all)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: access-group 122
police:
cir 50000 bps, bc 5000 bytes
conformed 0 packets, 0 bytes; actions:
transmit
exceeded 0 packets, 0 bytes; actions:
drop
conformed 0 bps, exceed 0 bps
Class-map: Catch-All-IP (match-all)
50461 packets, 24038351 bytes
5 minute offered rate 4000 bps, drop rate 0 bps
Match: access-group 124
police:
cir 50000 bps, bc 5000 bytes
conformed 50444 packets, 24031001 bytes; actions:
transmit
exceeded 17 packets, 7350 bytes; actions:
drop
conformed 4000 bps, exceed 0 bps
Class-map: class-default (match-any)
16785 packets, 1183331 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any
police:
cir 8000 bps, bc 1500 bytes
conformed 16658 packets, 1175711 bytes; actions:
transmit
exceeded 127 packets, 7620 bytes; actions:
transmit
conformed 0 bps, exceed 0 bps
Router#
There are a few security principles to keep in mind when using RESTful APIs.
• Least privilege: Grant only the necessary permissions, and revoke them when no longer in use
• Separation of privileges: Grant permissions to an entity based on a combination of
conditions, depending on the type of resource
• Least common mechanism: Avoid state (or fate) sharing among different components
• Fail-safe defaults: Unless permission is explicitly granted, it should be implicitly denied
• Always use HTTPS
• Store password hashes, never plain-text passwords
• Never expose sensitive information in URLs
• Use OAuth where possible
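Two of these principles can be sketched in a few lines of Python: store a salted password hash rather than the password itself, and verify by re-deriving the hash. This is an illustrative sketch (the password and iteration count are arbitrary), not tied to any specific API:

```python
import hashlib
import os

def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2; store these, never the password."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest

# Hash once at enrollment time...
salt, digest = hash_password("cciein8weeks")

# ...then verify later by re-deriving with the stored salt and comparing.
ok = hash_password("cciein8weeks", salt)[1] == digest
bad = hash_password("wrong-password", salt)[1] == digest
print(ok, bad)
```

A real deployment would additionally compare digests with a constant-time function such as `hmac.compare_digest`.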
EAP
Extensible Authentication Protocol (or EAP) is a wireless authentication framework that supports a
plethora of authentication methods. By using EAP along with a RADIUS server, an AP
enables a wireless client device and a RADIUS server to perform mutual authentication. Each
protocol that uses EAP encapsulates EAP messages within its own protocol messages.
There are over three dozen authentication methods that are supported by EAP. They include
EAP-MD5, EAP-TLS, EAP-TTLS, and EAP-IKEv2, among others. The encapsulation of EAP
over 802.1X is known as EAP over LAN or EAPOL.
There are three main steps to configuring EAP with Cisco APs.
1. Pick either Network EAP (for Cisco clients only) or Open Authentication with EAP
(Non-Cisco wireless clients or combination of Cisco and non-Cisco)
2. Define authentication server
3. Define client authentication methods
You can define the authentication server by going to the AP's Server Manager tab and performing
the following actions.
• On AP’s Encryption Manager, specify your WEP encryption and make it mandatory
• Choose your encryption key size (e.g. 128-bit)
• Apply your configuration
• Select your desired SSID
• Under “Authentication methods accepted”, check the box that says Open and choose
“with EAP” from the drop-down menu.
• Apply your configuration. If you have Cisco wireless clients, you will also need to check
the box that says Network-EAP.
You can also configure your authentication method using the CLI.
AP(config)#interface dot11radio 0
AP(config-if)#encryption mode wep mandatory
AP(config-if)#end
AP#write memory
EAP Verification
To verify, you can use the show radius server-group all command, which shows all configured
RADIUS groups on the AP.
WebAuth
Web authentication is an L3 security feature that directs the controller not to allow IP traffic
from a wireless client until that client has provided a valid username and password. It is a simple
form of authentication that doesn’t need a supplicant or client software as such. Web auth is
typically used to provide guest access mostly on open Wi-Fi hotspots.
The web authentication process involves a set of actions that must take place for web auth to be
successful.
• The user opens a browser and points it to a URL, which causes the client to send out a DNS
request. The WLC passes the DNS request to the DNS server, and as a result, name
resolution is completed.
• Once a client has the IP address, it tries to open a TCP connection by sending a TCP
SYN.
• The WLC has rules for the client and acts like a proxy server for the web URL. It sends back
a SYN-ACK, and the client replies with the final ACK, completing the TCP handshake.
• The client sends out HTTP GET for the web URL, but WLC intercepts and redirects the
client to the default page for the WLC.
• Once the client is served the login page, the client can provide username and password
and log in.
On the network side, you need to configure the VLAN interface, WLC for web auth, add a
WLAN instance, and finally configure an authentication type.
From WLC GUI, go to Controller tab and then choose an interface name and VLAN ID. Click
Apply to get the rest of the form where you need to put an IP address, mask, gateway IP and a
primary DHCP server. Click Apply when you’re done.
This step is about configuring WLC for internal web authentication, which is the default auth
type on Cisco WLC.
After you have enabled internal web auth, click on the WLAN tab, and choose NEW. Configure
this page with type: WLAN, Profile Name: <name>, SSID: <ssid> and ID: 1. Click Apply and
hop over to Edit <your-profile-name> option. On this page, you need to enable your WLAN
interface and select interface. You can leave everything to default values.
Click on the Security tab, select Layer 2, and set Security to none. Move to the Layer 3 tab,
select Web Policy box, and choose the “Authentication” option and “Web Policy.” Click apply
when you’re done.
Local authentication is the most straightforward way to configure user authentication. You have
to add Local Net Users within the Security tab of the WLC GUI. When you're doing so, make sure
that you select your WLAN profile.
RADIUS can also be configured using the WLC GUI. You can use an external Windows
RADIUS server via Cisco ACS.
Traditionally, you configure a WLAN with PSK security, where all devices share the same PSK.
Identity PSK is a feature that allows multiple PSKs, one per client, to be configured on the
same SSID. When a client authenticates to a wireless network, the WLC checks with a RADIUS
server to see if the MAC address exists in the auth policy. If it does, the RADIUS server replies
with an ACCESS-ACCEPT message that includes the PSK as a Cisco-AVPair.
IPSK requires that you have either a Microsoft NPS (AD is a must) or use Cisco ISE, which
makes it easier since client MACs can be assigned to Endpoint Identity Groups or EIGs and not
have to be created as users.
For PSK configuration, you need to log into WLC GUI, go to corresponding WLAN > Security
> Layer 2 tab, and enable MAC Filtering. Now hop over to WLANs > Advanced Tabs and
enable the “Allow AAA Override” option. You’ll also need to select NAC State as “ISE NAC” if
you are using the Cisco ISE server.
Threat defense
The Cisco Cyber Threat Defense (or CTD) solution combines several elements to provide
visibility into various types of cyber threats, i.e.
• Generating network-wide security telemetry (from NetFlow export on Cisco switches,
routers, and appliances)
• Aggregating, normalizing, and analyzing NetFlow telemetry data (provided by Lancope
StealthWatch)
• Providing contextual information (from Cisco ISE)
Endpoint Security
The Cisco endpoint solution is about protecting your PCs, Macs, servers, and even mobile devices.
There are a few key traits of any modern endpoint solution, i.e.
1. Cloud or on-premise deployment options
2. Prevention capabilities
3. Integrated sandboxing capabilities
4. Continuous monitoring and recording
5. Rapid time to detection
6. Automated Response
Cisco's AMP for Endpoints solution meets and exceeds all of the above traits on just about
any OS out there today, including Microsoft Windows (7, 8, 10, Server), macOS, Linux,
Android, and iOS.
Next-generation firewall
Cisco's next-generation firewall offering is built around Firepower Threat Defense (FTD), which
combines traditional stateful firewalling with application visibility and control, next-generation
IPS, URL filtering, and AMP-based malware protection.
TrustSec, MACsec
The Cisco TrustSec security architecture builds secure networks by creating clouds of trusted
network devices. Each device in a cloud is authenticated by its neighbors. Cisco TrustSec uses
EAP-FAST for authentication.
MACsec, defined in the IEEE 802.1AE standard, is an L2 hop-by-hop encryption framework that
provides data confidentiality and integrity. The MACsec Key Agreement (MKA) protocol provides
sessions and encryption keys. Cisco devices support the SAP GCM cipher for data encryption.
WebAuth enables network admins to control network access and enforce policy based on the
identity of a user. WebAuth helps prevent unauthorized access for clients that do not support
IEEE 802.1X authentication (i.e., lack an 802.1X supplicant). Cisco switches can fall back to web
authentication when the end host does not have an 802.1X supplicant. 802.1X is always preferred
for managed assets where you can install a supplicant. It works at Layer 2, whereas web auth
functions at Layer 3.
Both web auth and MAB can be configured as fallback mechanisms for 802.1X. If 802.1X
authentication times out, the switch attempts MAB. If MAB fails, the switch attempts to
authenticate with Web auth.
Chapter Summary
• Unlike ACLs, firewalls are stateful packet filters and can operate from L2 to L7 packet
headers or data.
• Extensible Authentication Protocol (or EAP) is a wireless authentication protocol that
supports a plethora of authentication methods.
• Web authentication is an L3 security feature that directs the controller not to allow IP
traffic from a wireless client until that client has provided a valid username and password.
• Cisco TrustSec uses EAP-FAST for authentication.
• Both web auth and MAB can be configured as fallback mechanisms for 802.1X.
CHAPTER 6 AUTOMATION
This chapter covers the following exam topics from Cisco's official 350-401 V1.0 Enterprise
Network Core Technologies (ENCOR) exam blueprint.
Syntax
Python programs can be written in plain text using any text editor, such as TextEdit on
macOS or Notepad on Windows. It is recommended to use the .py extension for Python
programs. You can run a Python program from the command line. Unlike C/C++, there are no
curly braces or semicolons in Python; it uses indentation to delimit blocks instead.
Python variables can be local or global in scope. They follow the standard nomenclature of an
alphanumeric name starting with a letter, and they are case sensitive. Python's core data types
include the following.
• Boolean
• Integer, long, float
• String
• List
• Object
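The types listed above can be illustrated with a short, self-contained snippet; the variable names and values are illustrative:

```python
# One example of each Python type listed above
is_up = True                  # Boolean
mtu = 1500                    # integer
load = 0.75                   # float
hostname = "csr1kv"           # string
vlans = [10, 20, 30]          # list

class Interface:              # a minimal object (class instance)
    def __init__(self, name):
        self.name = name

gi1 = Interface("GigabitEthernet1")
print(type(is_up).__name__, type(mtu).__name__, type(vlans).__name__, gi1.name)
```

Note that the separate long type existed only in Python 2; in Python 3 all whole numbers are int.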
Python statements include print, input (raw_input in Python 2), import, and many others. Below
are some examples of Python statements and expressions.
Operators
Strings
Arrays
Conditionals
Functions
A function is a block of code that executes only when it is called. You can pass data into a
function, or call it without any parameters, depending on the function itself. A Python function,
much like in C/C++, can also return data as a result.
To call the above function, all you have to do is add the name of the function followed by
parentheses, i.e.
myfunction()
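The function definition itself was rendered as an image in the book; a minimal sketch of what a function like the one called above could look like (the name matches the call, the body is illustrative):

```python
def myfunction():
    """Print a greeting and return it so the caller can reuse the value."""
    message = "Hello from myfunction"
    print(message)
    return message

# Calling the function: its name followed by parentheses
result = myfunction()
```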
Now, let’s put our Python knowledge to work with some REST API calls using Python requests
library.
The HTTP POST method sends data to a web server. The type of the body of the POST request is
indicated by the Content-Type header. Here we're creating a new user (em@example.com); the web
server will create the user, including assigning it an ID, which you can see in the response. Please
note the status code 200 (OK) in the response.
Python Code
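The original listing was rendered as an image; here is a stdlib-only sketch of the POST described above (the book uses the requests library; the endpoint URL is illustrative, and the request is built but not sent):

```python
import json
import urllib.request

# Hypothetical endpoint; the book's actual URL is not shown in the text.
url = "https://api.example.com/users"
payload = {"email": "em@example.com"}

# Content-Type tells the server how to parse the request body.
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it and return the response.
print(req.method, req.data)
```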
Code Output
The GET method requests a representation of the given resource thus it is only used to retrieve
data. In the example below, we retrieve metadata for user #1571 (Elon Musk).
Python Code
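The original listing was an image; a stdlib-only sketch of the GET described above (the user ID matches the text, the URL is illustrative, and the request is built but not sent):

```python
import urllib.request

# Hypothetical endpoint for user #1571; the real URL is not shown in the text.
url = "https://api.example.com/users/1571"

# GET is the default method; Accept asks the server to return JSON.
req = urllib.request.Request(url, headers={"Accept": "application/json"})
# urllib.request.urlopen(req) would send it; json.load() would parse the body.
print(req.get_method(), req.full_url)
```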
Code Output
The HTTP PATCH method is for applying a partial update to a web resource. The type of the
request body is indicated by the Content-Type header (a server advertises the patch formats it
accepts in the Accept-Patch response header). In this example, we patch and add a phone number
for our user #1571 (Elon Musk).
Python Code
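The original listing was an image; a stdlib-only sketch of the PATCH described above (URL and phone number are illustrative; the request is built but not sent):

```python
import json
import urllib.request

# PATCH carries only the fields being changed -- here, a phone number.
url = "https://api.example.com/users/1571"
changes = {"phone": "+1-555-0100"}

req = urllib.request.Request(
    url,
    data=json.dumps(changes).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PATCH",
)
print(req.method, req.data)
```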
Code Output
The HTTP PUT method is for creating a new resource (if one doesn't exist) or replacing a given
web resource that already exists. In this example, we replace our older user (#1571) with a new
one (Jessica Alba). Notice that fields, such as the phone number, that were not explicitly provided
by our code did not change as a result of the PUT request.
Python Code
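The original listing was an image; a stdlib-only sketch of the PUT described above (URL and field names are illustrative; the request is built but not sent):

```python
import json
import urllib.request

# PUT replaces the resource, so the body carries the complete new representation.
url = "https://api.example.com/users/1571"
new_user = {"name": "Jessica Alba", "gender": "female"}

req = urllib.request.Request(
    url,
    data=json.dumps(new_user).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)
print(req.method, req.data)
```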
Code Output
The HTTP DELETE method is for deleting an existing resource. In this example, we simply delete our
user Jessica Alba (or #1571).
Python Code
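The original listing was an image; a stdlib-only sketch of the DELETE described above (URL is illustrative; the request is built but not sent):

```python
import urllib.request

# DELETE needs no body; the target resource is identified by the URL alone.
url = "https://api.example.com/users/1571"
req = urllib.request.Request(url, method="DELETE")
print(req.method, req.full_url)
```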
Code Output
We can also verify that the resource (user) was indeed deleted by issuing an HTTP GET which
results in status code 404 (Not Found).
Python Code
Code Output
Further Reading
Python and REST APIs27
Cisco DevNet has divided the overall Cisco portfolio into eight Dev Centers, one representing
each technology group.
• Cloud
• Collaboration
• Data Center (UCS, NX-OS, IOS XE, etc.)
• IoT and edge computing
• Networking (IOS XR, IOS XE, NX-OS)
• Security
• Wireless and mobile
• Application developers
27 https://bit.ly/2W7MejI
SDK stands for Software Development Kit; it is a package that integrates libraries, documentation,
and code examples. It differs from an API, which is a documented set of URIs where
developers only need a reference guide and a resource address to get started.
How SDKs and APIs compare:
• An SDK is a set of tools that can be used to develop or create applications; an API is an
interface that allows software applications to interact with each other.
• An SDK contains all APIs; an API is purpose-built for a specific use or feature.
• SDKs are more robust, with tons of utilities; APIs are lightweight and fast.
• SDKs are primarily used to create new applications; APIs are primarily used for adding
specific functions to an existing application.
Cisco provides a wide range of SDKs for different Cisco platforms; here is a non-exhaustive list.
Currently, Cisco has provided us with three programmability solutions that we can use to interact
with Cisco devices and platforms. In terms of SDKs and toolkits, we have access to the
following frameworks.
• Cobra SDK
• NX Toolkit
• ACI Toolkit
The key component within Cisco Application Centric Infrastructure (or ACI) is the Application
Policy Infrastructure Controller (or APIC). APIC supports the deployment, management, and
monitoring of applications with a unified operations model for physical and virtual infrastructure
components. The ACI policy model is based on promise theory, which is about scalable control
of objects rather than the top-down paradigm.
APIC is aware of both the configuration commands and the state changes for the underlying
objects. With promise theory, the underlying objects handle configuration state changes initiated
by the APIC in the form of desired state changes. With this approach, objects are also
responsible for sending exceptions and faults back to the control system (call it a bottom-up
approach if you will), in addition to enabling greater scale by allowing the methods of objects to
request state changes from one another.
The Cobra SDK comes as two installable .egg files with the corresponding installable packages.
Going by the GitHub repo (datacenter/cobra), the project seems a bit stagnant; the last commit
took place over two years ago.
NX Toolkit
The NX Toolkit is a set of Python libraries that allow configuration of Cisco Nexus 9K and 3K
series switches. It is based on NX-API REST, which is used in ACI and Cisco NX-OS, and
provides a framework for network programmability and automation.
You can clone the nxtoolkit.git repo from GitHub or just download it as a zip file. Alternatively,
you can do a docker pull on dockercisco/nxtoolkit. Going by the GitHub repo, it seems
nothing has been added to it by Cisco over the past four years.
ACI Toolkit
ACI toolkit is a collection of python libraries that allow the configuration of the APIC controller.
It is based on the REST API. The overall toolkit model is divided into three components.
In the application topology object model, the Tenant is the root class within the toolkit; this is
where all the configuration takes place as far as application topology is concerned. AppProfile is
what contains the actual configuration. The Endpoint Group (EPG) is the object where you define
what happens when an endpoint is connected to the ACI fabric. A Context is pretty much like a
VRF in Cisco jargon. EPGs provide or consume the network services defined in Contracts
(Taboos, as the name implies, are network services that can never be provided or consumed by
the EPGs). A FilterEntry defines the traffic profile to which either a contract or taboo applies.
Much like the application topology object model, there is also a top-down APIC policy taxonomy
that lays out the scope of the various ACI objects.
There is also an ACI toolkit interface model (physical and virtual) and a physical topology model
(pod, node, link) as part of the ACI toolkit object model.
The YANG Development Kit (or YDK) is an SDK that provides APIs generated from YANG
models, reducing complexity by abstracting away protocol and encoding details.
The SDK consists of a core package that defines services and providers. In addition to the core
package, there are a number of other modules that provide the YANG models. YDK is currently
supported for IOS XE, IOS XR, and NX-OS.
Looking at the GitHub repo, the project appears alive and kicking; several commits were checked
in during 2019.
For SDK-based programmability, we will simply use the easiest method Cisco makes available to
us: the on-box Guest Shell on Cisco IOS XE 16.7.x (otherwise known as Fuji). Please note that
you can use the Python library on-box or off-box, where off-box access happens via the
SSH/NETCONF interface.
Guest Shell is a virtualized Linux-based (LXC) container environment that runs side by side with
IOS XE; that separation allows secure execution of scripts (such as Python) and other
software packages. CSR1000v running IOS XE 16.7.x uses a CentOS 7.x minimal rootfs.
Enabling the Guest Shell depends on what platform and IOS XE version you're using. If you're
using a version before Fuji, you first need to configure "iox", followed by "guestshell enable",
along with VirtualPortGroup IP addresses. Once that's set up, you can use the pre-installed
Python 2.7.x or install a Python 3.x package and run it using the "guestshell run python3" command.
Now, for CSR1000v and IOS XE Fuji and later, you can enable and configure Guest Shell using
the following steps.
• Run "iox" and enable Guest Shell using the "guestshell enable" command, along with
"app-hosting appid guestshell"
• Configure Virtual Port Group by assigning it an IP address
• Optionally, you can configure NAT so that your Guest Shell container stack can access
the internet and download other installable packages
CSR1000v Configuration
CSR1Kv(config)# iox
interface VirtualPortGroup0
ip address 172.16.30.1 255.255.255.0
ip nat inside
ip nat inside source list GS_NAT interface GigabitEthernet1 overload
ip access-list standard GS_NAT
 permit 172.16.0.0 0.0.255.255
guestshell enable
Python 2.7 (pre-installed version) can be launched using the “guestshell run python <script.py>”
command.
Now, let's put together a few lines of Python and execute a script to retrieve some IOS XE
specific information. In the example below, we use the CLI module to configure a new loopback
interface (lo100) and then retrieve system clock information.
Instead of using cli.cli function, you can also use cli.clip function to directly display information
on the Guest Shell console.
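The original listing was an image; a minimal sketch of on-box CLI module usage for the loopback example described above (the IP address is illustrative, and the cli module exists only inside IOS XE Guest Shell, so the import is guarded):

```python
# Configuration lines for the new loopback described in the text (lo100)
commands = [
    "interface loopback100",
    "ip address 10.100.100.1 255.255.255.255",  # illustrative address
]

try:
    from cli import cli, configure  # available only inside IOS XE Guest Shell
    configure(commands)             # push the config lines to IOS XE
    print(cli("show clock"))        # retrieve system clock information
except ImportError:
    print("Not running inside IOS XE Guest Shell")
```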
Two main data encoding formats are used in APIs: XML and JSON (pronounced Jay-sun).
JavaScript Object Notation (or JSON) is both a human-friendly and machine-readable format and
sends data in name-value pairs. JSON is best known for its curly-brace syntax. It is popular
because it is easy to read and maps naturally to the Python dictionary data structure.
JSON Example
{
"persons": [
{
"name": "Jeff Bezos",
"gender": "male"
},
{
"name": "Elon Musk",
"gender": "male"
}
]
}
JSON is language-agnostic and is documented as its own data encoding standard. It supports
primitive types such as strings and numbers, along with nested lists and objects.
Python includes a native json package that you can use to both encode and decode data. You
can use "import json" to import the package and parse JSON data into a Python dictionary
or list. You can parse a JSON file with json.load() into a Python dictionary, which is organized
in key-value pairs. You can also read and write JSON strings using the json.loads() and
json.dumps() methods, respectively.
JSON Document
[
{
"gender": "male",
"name": "Jeff Bezos"
},
{
"gender": "male",
"name": "Elon Musk"
},
{
"gender": "female",
"name": "Jessica Alba"
}
]
Python Code
Code Output
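The original listing was an image; a sketch that parses the JSON document above with the stdlib json package:

```python
import json

# The JSON document shown above, as a Python string
doc = '''
[
  {"gender": "male", "name": "Jeff Bezos"},
  {"gender": "male", "name": "Elon Musk"},
  {"gender": "female", "name": "Jessica Alba"}
]
'''

persons = json.loads(doc)             # JSON string -> Python list of dicts
for person in persons:
    print(person["name"], "-", person["gender"])

print(json.dumps(persons, indent=2))  # Python objects -> JSON string
```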
Simple Network Management Protocol (or SNMP) is an IP protocol responsible for collecting
and organizing, in a tree-like fashion, information about the devices being managed. It can also
be used to modify those objects to change a device’s behavior. It uses UDP ports 161 and 162.
SNMP has been around for over 30 years. Over this time, it has been the de-facto way to monitor
networks. It worked great when networks were small and polling a device every 15-30 minutes
met operational requirements. SNMP MIBs are a type of data model: each defines a collection of
information organized in a hierarchical format and is used along with SNMP. SNMP worked well
for monitoring devices every few minutes, but it never caught on for configuration management
purposes due to custom or proprietary MIBs.
In addition to SNMP, there has always been the network command-line interface or CLI. Access
to the CLI happens via console, Telnet, or SSH, and it has been the de-facto way of managing
the configuration of networking devices for the past 20+ years.
Expect and TCL scripting, custom parsers, and screen scrapers were the best the industry had
to offer over that time; call it a poor man's approach to automation when there was no better
way. That approach is now unacceptable as the rate of change continues to increase, there are
more devices, and higher demands are being placed on the network.
Now, if you put it all together with where we are concerning business requirements today,
you can summarize the following as the desired requirements for today's configuration
management.
• Easier to use API-based management interfaces that can be consumed with open source
tools (REST, RESTCONF, NETCONF, gRPC)
• Decouple configuration from operational data
• Both human and machine friendly for consumption
• Support variety of transport protocols (HTTP/S, SSL, SSH)
• Support variety of data encoding formats (JSON and XML)
Model-driven programmability harnesses the power of models, i.e., matching devices'
capabilities to standardized models, making it easier to configure networking devices. A data
model is a structured method to describe any object; e.g., a driver's license describes an
individual in a structured way.
The most common data modeling language today is Yet Another Next Generation (YANG),
defined in RFC 6020 (YANG 1.0) and RFC 7950 (YANG 1.1). Data models written in YANG are
exposed through programmable interfaces such as NETCONF and RESTCONF.
NETCONF uses remote procedure calls and notifications. A YANG module defines the hierarchies
of data that can be used for NETCONF operations. There are two main families of YANG data
models.
• Open YANG models (developed by standards bodies and community groups such as the IETF
or OpenConfig; they are created to be independent of the underlying platform)
• Native models (developed by vendors such as Cisco or Juniper; they can only be
used on the devices they are created for)
The terms used in YANG model are defined in the RFC. Here are some commonly used terms.
• Anyxml (a data node that can contain an unknown chunk of XML data)
• Augment (adds new scheme nodes to an existing schema node)
• Container (an interior data node that only contains child nodes)
• Data model, node, and tree (model defines how data such as a node in the schema tree is
instantiated. The data tree contains the instantiated tree of configuration and state data)
• Leaf (node with no child nodes)
• Leaf-list (like leaf node but defines a set of nodes rather than one)
• Module (defines the hierarchy of nodes that can be used for NETCONF operations)
• State data (non-configuration data, such as status or statistics)
There are four types of nodes in the YANG data tree.
• Leaf nodes
• Leaf-list nodes
• Container nodes
• List nodes
A YANG module contains a sequence of statements. Each statement starts with a keyword,
followed by an optional argument, and then either a semicolon or a block of sub-statements
enclosed within braces.
There are four main data-definition statements in a YANG module, i.e., container, list, leaf, and
leaf-list.
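A minimal sketch of a YANG module using all four statement types (the module and node names are illustrative, not taken from any standard model):

```
module example-interfaces {
  namespace "http://example.com/example-interfaces";
  prefix exif;

  container interfaces {            // interior node holding child nodes
    list interface {                // multiple entries, keyed by name
      key "name";
      leaf name { type string; }    // a single value with no children
      leaf enabled { type boolean; }
      leaf-list dns-server {        // a set of values rather than one
        type string;
      }
    }
  }
}
```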
NETCONF
The NETCONF protocol uses XML-based data encoding for both the configuration and the
protocol messages. It provides a small set of operations to both manage device configurations
and get device state information. The basic protocol provides the following configuration related
operations.
• Retrieve
• Configure
• Copy
• Delete
The following list includes operations that you can perform using the NETCONF protocol.
• Get (retrieve running configuration and device state information)
• Get-config (retrieve all or part of a configuration datastore)
• Edit-config (modify all or part of a configuration datastore)
• Copy-config (create/replace an entire configuration datastore)
• Delete-config (delete a configuration datastore)
• Lock (lock the configuration datastore)
• Unlock (release a configuration lock)
• Close-session (graceful session closure)
• Kill-session (force termination of a session)
There are some similarities and some big differences between NETCONF and SNMP protocols.
SNMP
• No concept of a transaction
• Lacks backup and restore of device configuration
• Limited industry support for configuration MIBs (but plenty of monitoring)
NETCONF
• Transaction-based configuration changes
• Supports backup and restore of device configuration
• Standardized, model-based configuration via YANG
The NETCONF service uses SSH for transport; the default port for NETCONF-over-SSH is TCP
830. You can use a terminal to log into a NETCONF server or device and issue NETCONF
commands directly (or use a Python script, something we'll discuss shortly).
When the login is completed, the NETCONF server will respond with a <hello> message listing
capabilities for the device. The NETCONF session remains open after the initial response.
<capability>urn:ietf:params:netconf:capability:validate:1.1</capability>
......truncated for brevity.......
<capability>
urn:ietf:params:netconf:capability:notification:1.1
</capability>
</capabilities>
<session-id>1315</session-id></hello>]]>]]>
<?xml version="1.0"?>
<rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"
message-id="1331">
<get-config>
<source>
<running/>
</source>
</get-config>
</rpc>]]>]]>
RESTCONF
RESTCONF can use either XML or JSON encoding for structured data. HTTP verbs such
as GET, POST, PUT, PATCH, and DELETE are directed at a RESTCONF API to access data
resources within the YANG data models. A network device can serve both NETCONF and
RESTCONF at the same time.
RESTCONF vs. NETCONF:
• Datastores: RESTCONF supports only the running datastore (edits are committed
immediately), whereas NETCONF supports both the running and candidate datastores.
• Locking: RESTCONF doesn’t support obtaining and releasing a datastore lock, whereas
NETCONF does.
• Transactions: neither protocol supports transactions across multiple devices.
• Validation: RESTCONF performs implicit validation for edit operations, whereas
NETCONF uses explicit validation.
https://<host>/restconf/<resource-type>/<yang-module:resource>
Request
Response
HTTP/1.1 200 OK
Date: Thu, 26 Mar 2020 20:56:30 GMT
Server: device.fullstacknetworker.com
Content-Type: application/yang-data+json
{
"ietf-restconf:restconf" : {
"data" : {},
"operations" : {},
"yang-library-version" : "2020-01-10"
}
}
As noted earlier, the RESTCONF protocol supports two media types, i.e. application/yang-
data+xml and application/yang-data+json.
If you’re using a Python script, you can construct your headers as follows.
headers = {
    'Content-Type': 'application/yang-data+json',
    'Accept': 'application/yang-data+json'
}
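With headers like these, a RESTCONF GET can be sketched with the requests library. The host, credentials, and the ietf-interfaces path below are placeholders for illustration; only the /restconf/data URL layout and the yang-data media types come from the text above.

```python
# Sketch of a RESTCONF GET using the requests library.
# Host, username, and password are hypothetical placeholders.

def restconf_url(host, path):
    """Build a RESTCONF data-resource URL of the form
    https://<host>/restconf/data/<yang-module:resource>."""
    return "https://{}/restconf/data/{}".format(host, path)

def get_interfaces(host, username, password):
    """Fetch the ietf-interfaces tree as JSON (needs a live device)."""
    import requests  # imported lazily so the sketch stays importable offline
    headers = {
        'Content-Type': 'application/yang-data+json',
        'Accept': 'application/yang-data+json'
    }
    url = restconf_url(host, "ietf-interfaces:interfaces")
    resp = requests.get(url, auth=(username, password),
                        headers=headers, verify=False)
    resp.raise_for_status()
    return resp.json()

print(restconf_url("10.1.1.1", "ietf-interfaces:interfaces"))
```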
Besides RESTCONF/NETCONF APIs, IOS XE also offers Guest Shell access on or off box. To
use Guest Shell, you need IOS XE 16.5 or later, and you can enable it with a few CLIs.
• iox
• If you’re running IOS XE 16.8 or later, you also need to configure the Guest Shell
virtual interface.
However, Guest Shell is something that we’ve discussed earlier, and it is outside the scope of
this section.
Enabling YANG and API access on Cisco IOS XE devices boils down to two CLIs, so it
couldn’t be simpler.
• Privilege level 15 for NETCONF communication (we’re using a local user, but you can
also use AAA RADIUS):
o username admin privilege 15 secret fsn123
• Enabling YANG with NETCONF
o netconf-yang
Now, let’s configure our priv 15 user and YANG/NETCONF. SSH is enabled automatically
when you enable NETCONF. NETCONF uses SSH port 830 for transport.
In order to connect and execute our NETCONF API calls, we can use ncclient and minidom
python libraries.
Let’s now write a Python script that uses NETCONF to fetch the router’s hostname.
Python Code
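Since the original listing is not reproduced here, the following is a minimal sketch with ncclient, assuming a hypothetical device address and the priv-15 user configured earlier. Only the parsing helper runs offline; fetch_hostname needs a live device.

```python
# Sketch: fetch the hostname over NETCONF with ncclient, then parse
# the reply with xml.dom.minidom. Device details are hypothetical.
from xml.dom.minidom import parseString

HOSTNAME_FILTER = '''
<filter>
  <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
    <hostname/>
  </native>
</filter>'''

def extract_hostname(reply_xml):
    """Pull the <hostname> text out of a get-config reply."""
    dom = parseString(reply_xml)
    return dom.getElementsByTagName('hostname')[0].firstChild.nodeValue

def fetch_hostname(host, username, password):
    """Connect to the device and return its hostname (needs a live device)."""
    from ncclient import manager  # lazy import; pip install ncclient
    with manager.connect(host=host, port=830, username=username,
                         password=password, hostkey_verify=False,
                         device_params={'name': 'default'}) as m:
        reply = m.get_config(source='running', filter=HOSTNAME_FILTER)
        return extract_hostname(reply.xml)

sample = ('<rpc-reply><data>'
          '<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">'
          '<hostname>CSR1</hostname></native></data></rpc-reply>')
print(extract_hostname(sample))  # CSR1
```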
Code Output
Python Code
Now, we try to download the running configuration using a NETCONF API call. While the output is natively
in XML form, we convert it to JSON and display it as well.
Before we move on from NETCONF examples, let us go over the device handlers that are
supported by ncclient. You may notice that in all our code we used the ‘default’ device handler,
which behaves the same as “csr” for the Cisco CSR1000v device.
Juniper: device_params={‘name’:’junos’}
Cisco CSR: device_params={‘name’:’csr’}
Cisco Nexus: device_params={‘name’:’nexus’}
Huawei: device_params={‘name’:’huawei’}
Alcatel Lucent: device_params={‘name’:’alu’}
H3C: device_params={‘name’:’h3c’}
HP Comware: device_params={‘name’:’hpcomware’}
Let me note down some code choices that we have made when using ncclient and why.
• Mostly, we’re using the following template to connect into the device via NETCONF
• xmltodict comes in handy to parse XML into a dictionary, making it compatible with
JSON encoding for further processing. Remember, JSON is a serialization format, which
makes it easy to sort and transmit data between systems, whereas a dictionary is a data
structure to use inside your code. However, they both represent objects as name/value
pairs.
• We’ve already discussed xml.dom.minidom library, which is one of the two popular
methods to process XML format.
• XML to dictionary conversion can be done with a single line of code. “xml_data” below
represents raw XML data.
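With xmltodict, the conversion really is one line: d = xmltodict.parse(xml_data). In case that library isn’t handy, the same idea can be sketched with the standard library; the simplified converter below is for illustration only and ignores XML attributes and repeated tags.

```python
# Stdlib sketch of XML-to-dict conversion (xmltodict.parse does this,
# and more, in one line). Attributes and repeated tags are ignored.
import json
import xml.etree.ElementTree as ET

def xml_to_dict(xml_data):
    """Convert raw XML text into nested dicts keyed by tag name."""
    def convert(elem):
        children = list(elem)
        if not children:
            return elem.text
        out = {}
        for child in children:
            out[child.tag] = convert(child)
        return out
    root = ET.fromstring(xml_data)
    return {root.tag: convert(root)}

xml_data = "<device><hostname>CSR1</hostname><os>IOS-XE</os></device>"
d = xml_to_dict(xml_data)
print(json.dumps(d))  # {"device": {"hostname": "CSR1", "os": "IOS-XE"}}
```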
• You can also add or delete IOS XE configuration using Manager object, however in both
cases, you need to first construct an XML configuration template.
RESTCONF uses REST APIs and HTTPS port 443 for transport. To configure RESTCONF, you
also need to enable HTTPS server and restconf and add a priv 15 user.
To connect and execute our RESTCONF API calls, we can use various python libraries.
• netmiko
• requests (REST get/post call etc.)
• json (encoding support)
Code Output
Here, we modify the IP address on the GigabitEthernet3 interface using an HTTP PUT method.
First, we import the necessary libraries, optionally disable certificate warnings, and then set our
variables for authentication into our target CSR1000v instance. We set our HTTP request header
to application/yang-data+json and define our functions. Finally, we call our functions or
methods, take user input, and change the IP address on the interface as requested. Please note
that port 9443/HTTPS is being used for RESTCONF.
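The original script is shown as a screenshot, so here is a minimal sketch of the same idea, assuming the standard ietf-interfaces/ietf-ip YANG models; the host, credentials, and helper names are hypothetical, while the port 9443 placement follows the description above.

```python
# Sketch: change GigabitEthernet3's IP via a RESTCONF HTTP PUT.
# build_ipv4_payload is a hypothetical helper; the payload shape follows
# the standard ietf-interfaces / ietf-ip YANG models.

def build_ipv4_payload(if_name, ip, netmask):
    """Return the yang-data+json body for a single-interface PUT."""
    return {
        "ietf-interfaces:interface": {
            "name": if_name,
            "type": "iana-if-type:ethernetCsmacd",
            "ietf-ip:ipv4": {
                "address": [{"ip": ip, "netmask": netmask}]
            }
        }
    }

def put_interface_ip(host, user, pwd, if_name, ip, netmask):
    """PUT the new address to the device (needs a live CSR1000v)."""
    import requests  # lazy import; pip install requests
    url = ("https://{}:9443/restconf/data/"
           "ietf-interfaces:interfaces/interface={}".format(host, if_name))
    headers = {'Content-Type': 'application/yang-data+json',
               'Accept': 'application/yang-data+json'}
    return requests.put(url, auth=(user, pwd), headers=headers,
                        json=build_ipv4_payload(if_name, ip, netmask),
                        verify=False)

print(build_ipv4_payload("GigabitEthernet3", "10.10.10.1", "255.255.255.0"))
```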
Cisco DNA Center is at the center of Cisco’s intent-based networking initiative. Cisco customers
and partners can use APIs, integration flows, events and notification services and Cisco DNA
Center SDK to create applications above and beyond the features it natively provides.
• Design
• Policy
• Provision
• Assurance
• Platform
The DNA Design component allows you to design your network using workflows, and lets you
import existing network designs and device images from APIC-EM (Application Policy
Infrastructure Controller Enterprise Module) and Cisco Prime Infrastructure into DNA Center.
DNA Policy is about user and device profiles that help deliver on secure access as well as
segmentation. Application policies ensure consistent network performance based on business
requirements.
DNA Provision allows you to use policy-based automation to deliver services to the network
based on business priority, and simplifies device deployment. It is the module responsible for
delivering zero-touch deployment.
DNA Assurance enables networking elements to stream telemetry for ensuring application
performance and user connectivity in real-time.
DNA Platform allows developers to directly access the DNA through the developer toolkit or
SDK. To review APIs, you can click on Platform > Developer Toolkit.
The DNA Center appliance hosts the SDN controller, analytics engine, and telemetry storage. At
the time of writing, a 44-core DNA appliance (DN2-HW-APL) is listed for USD 88.6K in
Cisco’s GPL. It must be installed and run on the bundled bare-metal server; at the time of
writing, there is no virtual appliance package available.
Let’s now look at the various aspects of DNA dashboard and types of data it provides.
Let’s now get our hands dirty with DNA center SDK by using the python client module.
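The SDK session itself is shown in screenshots; as a hedged sketch, the dnacentersdk client is typically used along the lines below. The base URL and credentials are placeholders, and summarize_devices is a hypothetical helper for formatting the response.

```python
# Sketch of the DNA Center SDK (dnacentersdk). The connection details
# are placeholders; summarize_devices is a hypothetical formatter.

def summarize_devices(device_list_response):
    """Turn a get_device_list()-style response dict into short strings."""
    rows = []
    for dev in device_list_response.get('response', []):
        rows.append("{} ({})".format(dev.get('hostname'),
                                     dev.get('managementIpAddress')))
    return rows

def list_devices(base_url, username, password):
    """Connect to DNA Center and list devices (needs a live controller)."""
    from dnacentersdk import DNACenterAPI  # lazy import; pip install dnacentersdk
    api = DNACenterAPI(base_url=base_url, username=username,
                       password=password, verify=False)
    return api.devices.get_device_list()

sample = {'response': [{'hostname': 'edge-1',
                        'managementIpAddress': '10.10.20.81'}]}
print(summarize_devices(sample))  # ['edge-1 (10.10.20.81)']
```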
Creating a site.
The toolkit provides information about all the API calls that are part of the SDK. However, before
we proceed further, let me summarize the types of APIs that are available with DNA Center.
• Northbound APIs
• Southbound APIs
• Eastbound APIs
• Westbound APIs
Northbound APIs, also known as Intent APIs, provide support for policy-based abstraction of
business intent, allowing focus on business outcomes as opposed to the mechanics of how. They
are RESTful APIs that allow the use of the GET, POST, PUT, and DELETE HTTP methods with
JSON encoding to discover and control the network. All API calls require a security token that
identifies the privileges of the authenticated user making the RESTful API calls. You must
obtain a security token before you can make use of any of the DNA Center APIs.
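The token flow can be sketched as follows. The /dna/system/api/v1/auth/token path and the X-Auth-Token header follow Cisco's published Intent API convention, while the host and credentials are placeholders.

```python
# Sketch: obtain a DNA Center security token, then carry it on later calls.
# Host and credentials are placeholders.

def auth_headers(token):
    """Headers carried by every Intent API call after authentication."""
    return {'X-Auth-Token': token, 'Content-Type': 'application/json'}

def get_token(host, username, password):
    """POST basic-auth credentials and return the issued token
    (needs a live DNA Center)."""
    import requests  # lazy import; pip install requests
    url = "https://{}/dna/system/api/v1/auth/token".format(host)
    resp = requests.post(url, auth=(username, password), verify=False)
    resp.raise_for_status()
    return resp.json()['Token']

print(auth_headers("example-token")['X-Auth-Token'])  # example-token
```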
Southbound APIs allow managing non-Cisco infrastructure by way of an SDK that allows the
creation of packages for 3rd-party devices. A package contains a mapping of Cisco DNA Center
features to other vendors’ southbound protocols. In summary, southbound APIs are a gateway to
multivendor support within Cisco DNA Center.
Events and Notifications provide the ability to establish a notification handler so that when
specific DNA Center events are triggered, such as Software Image Management (SWIM) events,
3rd-party systems can take actions in response. Notifications can also be generated for internal
events, such as an Assurance event causing an external ITSM system to open a ticket.
Westbound APIs, also known as Integration APIs, are provided so that other 3rd-party systems
such as ITSM can be integrated with Cisco DNA Center. Using integration APIs, you can
implement change management and approval and pre-approval chains. They also allow
integration for reporting and analytics capabilities such as capacity planning, asset management,
compliance control, and auditing. Finally, integration APIs support the IT4IT reference
architecture, so if you are using an external system that supports it, you can optimize your
end-to-end IT value chain.
Further Reading
Cisco DNA Center User Guides 28
Cisco DNA Center Maintain and Operate Guides 29
28 https://bit.ly/2w2rWNH
29 https://bit.ly/2TLz7mB
vManage is the centralized management platform for configuring, monitoring and maintaining
Cisco SD-WAN (formerly Viptela) devices. Once installed, you can access vManage using a
web browser. When you log into vManage, the dashboard provides a bird’s eye view of the
network.
30 https://bit.ly/3cOxpbA
vManage provides REST API access for retrieving real-time and configuration state information
about the SD-WAN overlay.
• Certificate Management
• Configuration
• Device and Inventory
• Monitoring including Real-Time
• Troubleshooting
You can access the vManage RESTful API documentation by going to the server itself.
https://vManage-ip-address:8443/apidocs
You can construct your API call URLs using the following format.
https://vmanage-ip-address:8443/dataservice/blah
For example, you could list all network devices attached to your SD-WAN overlay.
https://vmanage-ip-address:8443/dataservice/device
Likewise, you can get health status of hardware device components such as CPU or memory by
sending in the following URL.
https://vmanage-ip-address:8443/dataservice/device/hardware/environment?deviceId=system-ip-address
Much like APIC, you can send your REST API calls via Postman, Python, or another OOP
language of your choice. Again, if you use Postman and your vManage server is using a
self-signed certificate, you need to disable “SSL certificate verification”. For Python, if you use
the requests library for sending your HTTP methods, you can add “verify=False” to your
POST or GET requests. Even with verify=False, you will notice a warning message like the
one below, which you can ignore.
Another thing to keep in mind is that every time you successfully authenticate using the
vManage admin username and password, the server issues a session token or cookie that you can
use to send in more requests.
Let me share examples from Postman and Python connecting to vManage servers via REST API
calls.
Postman
Python Code
Code Output
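Since those listings are screenshots, here is a hedged sketch of the flow: authenticate against j_security_check, reuse the session cookie, then hit /dataservice/device. The server address and credentials are placeholders.

```python
# Sketch: authenticate to vManage and list devices via the REST API.
# Server and credentials are placeholders.

def dataservice_url(host, resource):
    """Build https://<host>:8443/dataservice/<resource>."""
    return "https://{}:8443/dataservice/{}".format(host, resource)

def list_sdwan_devices(host, username, password):
    """Log in (session cookie) and GET the device inventory
    (needs a live vManage server)."""
    import requests  # lazy import; pip install requests
    session = requests.session()
    login_url = "https://{}:8443/j_security_check".format(host)
    session.post(login_url,
                 data={'j_username': username, 'j_password': password},
                 verify=False)
    resp = session.get(dataservice_url(host, "device"), verify=False)
    resp.raise_for_status()
    return resp.json()['data']

print(dataservice_url("vmanage.example.com", "device"))
```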
Interpret REST API Response Codes and Results in Payload Using Cisco
DNA Center and RESTCONF
A REST request asks a RESTful service to perform an action; the request must carry all the
information that the service needs to fulfill it.
On the DNA Center platform, you can go to Operational Tools > Network Discovery. On
DevNet, click on POST /discovery; against your DNA Center platform, the POST starts a
discovery process.
Request
{
"snmpMode": "string",
"netconfPort": "string",
"preferredMgmtIPMethod": "string",
"name": "string",
"globalCredentialIdList": [
"string"
],
"httpReadCredential": {
"port": "integer",
"secure": "boolean",
"username": "string",
"password": "string",
"comments": "string",
"credentialType": "string",
"description": "string",
"id": "string",
"instanceUuid": "string"
},
"httpWriteCredential": {
"port": "integer",
"secure": "boolean",
"username": "string",
"password": "string",
"comments": "string",
"credentialType": "string",
"description": "string",
"id": "string",
"instanceUuid": "string"
},
"parentDiscoveryId": "string",
"snmpROCommunityDesc": "string",
"snmpRWCommunityDesc": "string",
"snmpUserName": "string",
"timeout": "integer",
"snmpVersion": "string",
"ipAddressList": "string",
"cdpLevel": "integer",
"enablePasswordList": [
"string"
],
"ipFilterList": [
"string"
],
"passwordList": [
"string"
],
"protocolOrder": "string",
"reDiscovery": "boolean",
"retry": "integer",
"snmpAuthPassphrase": "string",
"snmpAuthProtocol": "string",
"snmpPrivPassphrase": "string",
"snmpPrivProtocol": "string",
"snmpROCommunity": "string",
"snmpRWCommunity": "string",
"userNameList": [
"string"
],
"discoveryType": "string"
}
Response
{
"response": {
"taskId": "any",
"url": "string"
},
"version": "string"
}
Embedded Event Manager (EEM) is a software component within Cisco IOS XE, IOS XR, and
NX-OS that makes network admins’ lives easier by tracking various events and taking actions on them.
EEM can use various event detectors and actions to provide notifications. Some examples of
event detectors are:
• SNMP
• Syslog
• Counter
• CLI events
• Timers
• IP SLA
Examples of the EEM actions that you can take based on those events include:
• Sending an email
• Executing an OS command
• Generating SNMP traps
• Reloading the router
Verification
You can verify using either a debug or a show command. For debug and show CLIs, the most
common ones are debug event manager action cli and show event manager policy registered.
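To make the event/action pairing concrete, here is a small hypothetical EEM applet sketch (the applet name, interface, and messages are invented for illustration) that matches a syslog pattern and runs CLI commands in response:

```
event manager applet IFDOWN-DEMO
 event syslog pattern "Interface GigabitEthernet1, changed state to down"
 action 1.0 syslog msg "Gi1 down, collecting diagnostics"
 action 2.0 cli command "enable"
 action 3.0 cli command "show ip interface brief"
```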
• Version control
• Design patterns
• Testing
Configuration management tools include Puppet, Chef and Ansible and they are well known in
the DevOps circles. These tools enable you to automate applications, infrastructure, and
networks to a high degree without the need to do any manual programming.
Puppet is written in Ruby and refers to its automation instruction set as Puppet manifests. The
major point to note here is that Puppet is agent-based. Agent-based means a software agent
needs to be installed on all devices you want to manage with Puppet. “Devices” here refers to
servers, routers, switches, firewalls, and the like. It is often not possible to load an agent on many
networking devices, so this requirement limits the number of devices that can be used with
Puppet out of the box. The agent requirement raises the barrier to deployment for Puppet as far as
networking is concerned. That said, with some investment and cultural change, the DevOps
virtuous cycle brings with it the benefits of improved scalability, reliability, maintainability, and
faster release rollouts with higher quality.
Chef, another popular configuration management tool, follows much the same model as
Puppet. Chef is also written in Ruby, uses a declarative model, and is agent-based. Chef refers to
its automation instructions as recipes; when grouped, they are known as cookbooks.
The two notable differences between Puppet/Chef and Ansible are that Ansible is written in
Python and that it is natively agentless. Being agentless significantly lowers the barrier to
deployment from an automation perspective.
Ansible can integrate and automate any device using any API. For example, integrations can use
REST APIs, NETCONF, SSH, or even SNMP, if desired. Ansible sets of tasks (instructions) are
known as playbooks. Each playbook is made up of one or more plays, where each play consists
of individual tasks.
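To illustrate the playbook/play/task hierarchy, a minimal hypothetical playbook sketch could look like the following; the inventory group, module choice, and config lines are invented for illustration.

```yaml
---
# One playbook containing one play; the play contains two tasks.
- name: Push a small config change
  hosts: routers            # hypothetical inventory group
  gather_facts: no
  connection: network_cli
  tasks:
    - name: Set SNMP read-only community
      ios_config:
        lines:
          - snmp-server community public RO

    - name: Save the configuration
      ios_config:
        save_when: modified
```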
SaltStack is yet another configuration management tool. It can be configured with or without an
agent. However, in most deployments it is configured with agents, where minions connect to
masters. The connection between master and minion uses ZeroMQ for data transfer. Once the
master and minions are connected, you can run commands on the master that trickle down to any
or all minions. You can also define states, which you can use to apply and validate configuration
on minions.
Ease of Use:
• Chef: steep learning curve
• Puppet: steep learning curve with DSL/Ruby
• Ansible: easier
• SaltStack: steep learning curve

Pricing:
• Chef: free (Chef Basics), $72/node afterwards
• Puppet: free (Open Source) up to 10 nodes, $120/node afterwards
• Ansible: free (CLI) with unlimited nodes; Ansible Tower free up to 10 nodes, paid
afterwards with or without support
• SaltStack: free (open source); Enterprise Edition is paid
When using configuration management tools, from a RESTful service standpoint, an operation
(or service call) is idempotent when clients can make the same call repeatedly while producing
the same result.
Version control systems enable efficient collaboration for developers contributing to a software
project. Imagine yourself working as a developer in a team setting without a version control
system: you’re working with shared folders containing the whole project. In a situation like this,
multiple people on your team may end up working on the same file at the same time, potentially
causing many unpredictable issues.
The first version control system was developed around the 1970s. Since then, many VCSs have
been developed, and today there are many options available for organizations that want to use
version control. Git was originally built by Linus Torvalds to support Linux kernel development.
Apache Subversion (SVN) was created as an alternative to CVS. SVN uses atomic operations,
meaning that either all changes made to the source code are applied or none are applied. No
partial changes are allowed, avoiding many potential issues. A drawback of SVN is its slower
speed due to its centralized nature.
Git is a distributed version control system that keeps track of every modification to the code. If
a mistake is made, developers can look back and compare earlier versions of the code to help fix
the mistake, minimizing disruption to all team members.
Version control protects source code from errors that may have serious consequences if not
handled properly. Git may be a little daunting for new users, with all its jargon and the learning
curve around commits, but it is powerful once you understand it.
Remote Repository: A remote repository is where the files of the project reside, and it is also
where all other local copies are pulled from. It can be stored on an internal private server or
hosted on a public repository such as GitHub or BitBucket.
Local Repository: A local repository is where snapshots, or commits, are stored on each
person's local machine.
Staging Area: The staging area is where all the changes you want to commit are placed. You,
as a project member, decide which files Git should track. For example, you can add selected
files to the staging area first and then commit them so they become part of your local repository,
without including every file in your project.
Working Directory: A Working Directory is a directory that is controlled by git. Git will track
differences between your working directory and local repository, and between your local
repository and the remote repository.
Last but not least, let’s compare Git and SVN side by side.
You can put Git through its paces in two different ways: self-host a Git server, or let a provider
host it for you, i.e. Git-as-a-service. Self-hosted servers require you to use your own server (or a
VM) and require a little more knowledge than provider-hosted counterparts, but they are mostly
lightweight software and free. There are also some hybrid Git servers like GitLab, which can be
self-hosted (for free) or web-hosted (for a cost, generally billed per user).
For self-hosted variants, I’d strongly suggest you consider using GitLab, Gitea (one that I am
using, and you will see it throughout this guide) or Gogs. In case you’re curious, they are all
freeware. For web-hosted, you’ve GitHub, and then plenty of GitHub-like alternatives such as
GitLab, BitBucket, SourceForge, Launchpad, etc.
• Local
• Centralized
• Distributed
A local version control system tracks files within a local file system; it is like making a copy of a
file before you modify it. This approach allows you to revert to the previous version, but it
obviously doesn’t provide most of the benefits that we have discussed above.
A centralized version control system uses a client/server model. The repo is stored on a
centralized system. Every time an individual wants to make a change to a file, they need to get a
working copy of the files from the central repo onto their local machine. In a centralized system,
only one person at a time is allowed to modify a particular file. When you need to make changes,
you check out the file, which places a lock on it, and then you check it back in when you are done.
Distributed version control, unlike the centralized variant, is based on a peer-to-peer model. In
this model, when an individual wants to make a change to a file, they must first copy or clone
the full repo to their local machine. Multiple people can clone remote repos to their local
machines. Each individual can work on any file without needing any locking; when they are
done making changes, they just push the repo to the remote (or main, or hosted) service.
Before we proceed, it is important to understand what a Git repository, or repo, is. If you recall,
the purpose of Git is to help you manage a software project under version control by
manipulating the graph of commits. With Git, a project is nothing but a set of files that
change over the lifetime of the project, and Git stores this information in a data structure called a
repository, or repo.
A Git client must be installed on the client machine before someone can clone a repo. Git
clients are widely available for Windows, macOS, and Linux/Unix. It is worth noting that some
clients offer a GUI, but the main focus remains on the command-line interface (CLI).
A Git repo is stored in the same root directory as the project itself, but inside a folder (or
sub-directory, if you will) known as .git. A Git repo contains, among other things, the
components described below.
You can use the git status, git ls-files, and git worktree list --porcelain commands to display Git
worktree details.
Now, let’s review some common source code operations that you can perform with Git. You can
think of all of these Git operations as subcommands that you add to the “git” command.
The Git repo resides in a hidden .git directory; it holds metadata such as the files, commits,
and logs that contain the commit history. The working directory is the folder that contains the
working copy of the files on the local machine. The staging area stores the information that
an individual wants to add, update, or delete in the repo. It is not a folder or a directory; the
staging area is just an index file located in the .git directory.
Each of the three stages mentioned is associated with one of three states within Git.
• Committed (the version of the file that has been saved in the repo)
• Modified (the file that has changed but has not yet been added to the staging area)
• Staged (the modified file that is ready to be committed to the local repo)
Clone Operation
As the name implies, the “clone” option allows you to clone or duplicate a repo to your local
machine. Please note that repo in question could be located on your local file system or a remote
server. You can access the remote repo via HTTP/HTTPS or SSH protocols.
You can find the complete link of your GitHub repo as shown below.
Add/remove Operations
Git add allows you to add a file to the staging area, while Git remove (git rm) works on the
working directory as well as the staging index.
In the example below, the “git add -u” command only adds currently tracked files (the modified
ones) to the staging area and checks whether any of the files have been deleted. It doesn’t add
any new files to the staging area. Once “git add” is completed, you can view the results using the
“git show” command as shown below.
Git remove is the exact opposite of Git add: with add you track a file for changes, whereas with
remove you untrack it.
Commit Operation
The git commit command records all of the file changes in the local repo along with a hash that
serves as a tag identifying the commit. These commits can later be pushed to a repo, merged
from a repo, and so on.
Please note the difference between “git add” and “git commit”: the former simply adds your
modified files to a queue to be committed later, whereas the latter commits the files in the index
to the local repo.
The git commit command can be used with flags such as -a or -m. “-a” automatically stages all
modified (tracked) files, whereas “-m” lets you supply a commit message inline. Last but not
least, you can combine staging and committing by using both flags together, i.e. “git commit
-am”.
In the following example, a git commit is taking place for the changes (two lines containing
Read and Write SNMP strings) that were previously only added (“git add -u”).
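Since that example is shown as a screenshot, a similar sequence can be sketched as a short shell session; the file name, messages, and identity below are invented for illustration, and -a only stages files Git already tracks.

```shell
# Sketch: -m supplies the message inline; -a stages tracked, modified files.
set -e
rm -rf /tmp/git-demo && mkdir -p /tmp/git-demo && cd /tmp/git-demo
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo User"
echo "snmp-server community public RO" > router.cfg
git add router.cfg                  # new files must be staged explicitly
git commit -qm "initial commit"
echo "snmp-server community private RW" >> router.cfg
git commit -qam "update SNMP strings"   # -a + -m in one step
git log --oneline
```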
You can view commit hashtag using the “git log” command as shown below. Note the line that
starts with “commit” and includes “06f9d8ebefd1e3eca754a9c43826a00387e4e160”.
The git push command publishes your already committed changes to a branch; in this case, we’re
pushing them to “dev_branch”. While it is outside the scope of git push or pull, committing
changes essentially kicks off our Continuous Integration and Continuous Delivery (CICD)
pipeline, as shown in the Gitea and Drone (a self-hosted CD platform) GUI screenshots.
The git pull command fetches and merges remote changes into your local repo. Note that the new
file persons.xml was pulled from the master branch of the GitHub repo.
Like we did before, you can verify the changes by peeking inside the git logs.
Branch
Git stores its data as a series of snapshots, so when you make a commit, it stores the commit
object which contains within it a pointer to the snapshot of the content (or changes) you staged.
A branch in Git jargon is nothing but a movable pointer to one of the commits; the default branch
name in Gitea is “master” (same on GitHub, by the way). The name “master” traces its roots
to “git init”. Branches enable Git users to work on code independently without affecting the
main code in the repo.
Let’s visualize the relationship among commit objects, commits, snapshots, and the master
branch. Git knows your current or active branch by way of HEAD, which points to the master
branch.
In this example shown below, our changes were committed to dev_branch and were merged into
master after they were successfully tested using our Drone-based CICD pipeline.
Let’s now add a new branch named “test_branch” by using the “git branch test_branch” command.
Now, let’s say we want to make “test_branch” the active branch, which can be done by using
“git checkout test_branch”.
Now, if you make a change and commit it to test_branch, test_branch will point to a new
commit 4 (which added “Richard Branson” to the persons.xml file), while master will continue to
point to commit 3.
Git uses merge as a way to put forked history back together again; fork simply refers to cloning
or copying a branch. However subtle, there is a difference between forking and branching: the
former is an independent copy of the repo, whereas the latter is simply a way of adding some
code, say a feature, that will eventually be merged back into the main branch.
In the real world, there are two primary reasons why you’d fork a repo, i.e.
Now, in order to demonstrate a merge, we’re going to run through the following set of actions.
Diff Operation
The git diff command shows file differences or changes that are not yet staged (or tracked or
added), changes between files in the staging area and the current version, and even differences
between branches.
In the example below, we made changes to the parse-json.py script, highlighted in red (-) and
green (+) along with the corresponding line items or contents.
In the example below, we requested the diff between two of our branches, test_branch, and
anothertest_branch.
A merge conflict results from competing changes; say developer A deletes a file in branch A and
developer B edits the same file in branch B, where both branches are part of one repo. This
scenario forces a decision on whether to delete or keep the removed file in a new commit. You
can display the merge conflict using the “git status” command.
You’d need to resolve the conflict before a commit will be allowed. You have two choices: either
keep the file from branch B (git add &lt;file&gt;) or remove it (git rm &lt;file&gt;). Now, before we move
on, it is important to summarize the various Git operations we have covered in this section.
GitOps
Now, before we move on, there is one more topic worthy of mention in this section, and
that is GitOps. It is a way of implementing continuous deployment for cloud-native applications.
The central idea of GitOps is having a Git repository that acts as the single source of truth,
holding declarative descriptions of the infrastructure currently desired in the production
environment, along with an automated process that makes the production environment match the
state stored in the repo.
There are two ways to implement GitOps, i.e. push- and pull-based deployments. Both methods
are similar except for how the deployment pipeline works. With push-based deployment, a
traditional build tool such as Jenkins or Travis CI is triggered by an external event, such as new
code being pushed to the software repo. With pull-based deployment, there is a new role, an
operator, which takes over the role of the pipeline by continuously comparing the desired state in
the environment repo with the actual state of the deployed infrastructure.
Further Reading
Introduction to Git31
Git Reference32
Now, let’s say we want to perform a small configuration change (say, change the SNMP
RW strings or add a static route) through a DevOps pipeline using Git, Ansible, and a build
server known as Drone.
There are four major steps that you’d need to perform to accomplish this.
• Creating Configuration Change
• Building New Configuration
31 https://bit.ly/3aR1Agi
32 https://git-scm.com/doc
During this stage, you are creating a configuration change. To accomplish this task, you
will utilize a Source Control Management (SCM) tool such as GitHub, Gogs, Gitea, or
another similar tool of your choice.
For this discussion, let’s say you’re using Gitea, a self-hosted Git service available
free of cost. Gogs and Gitea are open-source, single-binary Go implementations of a
GitHub-like Git repository hosting system that you can use privately or in a team setting. Both
are light on system requirements.
There are two sets of tools that are utilized during this stage. You have a build server that runs
through your Continuous Integration and Continuous Delivery (CICD) pipeline as defined in the
pipeline file i.e. .drone.yml.
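As a sketch of what such a pipeline file might contain, the step names, images, and commands below are hypothetical, while the kind/type/steps layout follows Drone’s 1.x pipeline format:

```yaml
kind: pipeline
type: docker
name: network-config-ci

steps:
  - name: lint
    image: python:3.8
    commands:
      - pip install yamllint
      - yamllint configs/

  - name: deploy
    image: python:3.8
    commands:
      - pip install ansible
      - ansible-playbook -i inventory site.yml
```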
There are many open-source options for build and integration tooling; you can use Drone,
Jenkins, or even Travis CI, which is now free for open-source and private projects. Drone in
particular is a popular continuous integration and delivery platform built in Go. It integrates with
many version control services such as GitHub, GitLab, Gitea, and Gogs.
A Drone agent watches for code changes and will automatically build and test them as soon as
they are committed. Drone is primarily distributed as a Docker image, so you will need to
use Docker Compose to manage the CI server containers by creating a “docker-compose.yml”
file. To monitor code changes and trigger the build and test stages, Drone will need access
to your source code repository inside Gitea or Gogs. The Drone Docker image is a unified
container that can be run in a few different ways.
It is recommended to run one container that operates as the Drone server, which
coordinates repository access, hosts the web UI, and serves the API. In addition, you can
run another container with the same settings as a Drone agent, which is responsible for
building and testing software from the configured repositories. The “drone-server” service
starts the main Drone server container listening on default port 8000, and likewise the
“drone-agent” service uses the same image but is started with the “agent” command.
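The server/agent pair just described could be defined in a docker-compose.yml roughly like this. This is a sketch, not the book’s exact file: the image tag and environment variable names follow older (0.8.x era) Drone releases, and server.env/agent.env are the hypothetical .env files mentioned above.

```yaml
# Hypothetical docker-compose.yml sketch for a Drone server/agent pair.
# Check the Drone docs for the exact settings your version expects.
version: '2'

services:
  drone-server:
    image: drone/drone:0.8
    ports:
      - "8000:8000"        # web UI and API on the default port 8000
    volumes:
      - ./drone:/var/lib/drone
    env_file: server.env   # e.g. DRONE_GITEA=true, DRONE_GITEA_URL, DRONE_SECRET

  drone-agent:
    image: drone/drone:0.8
    command: agent         # same image, started with the "agent" command
    depends_on:
      - drone-server
    env_file: agent.env    # e.g. DRONE_SERVER, DRONE_SECRET
```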
You will also need to set up the server and agent environments using their respective .env
files, as well as a systemd unit using a service file, before you can fire up Drone. Neither
Gitea nor Gogs supports OAUTH2 in this setup, so you will be prompted for a user ID and
password each time you kick off the CICD pipeline; a little annoying, but not too bad.
You can additionally configure nginx or Apache as a reverse proxy server so that Drone sends
requests through the proxy, and use SSL to secure communication between Drone and your
version control system.
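A minimal nginx reverse-proxy fragment for this could look like the following. The server name, certificate paths, and upstream port are placeholders (port 8000 matches the Drone server default mentioned earlier).

```nginx
# Hypothetical nginx snippet terminating SSL in front of the Drone server.
server {
    listen 443 ssl;
    server_name drone.example.com;

    ssl_certificate     /etc/ssl/certs/drone.crt;
    ssl_certificate_key /etc/ssl/private/drone.key;

    location / {
        proxy_pass http://127.0.0.1:8000;   # Drone server's default port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```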
Broadly speaking, there are three types of tests involved during CICD pipeline execution:
Unit, Integration, and Production testing. Unit testing is done using a single node, which
can be a router, switch, or firewall, whereas integration and production testing use a more
realistic simulated network topology with multiple nodes that closely mimics the real network.
To create test networks, you can use either Cisco VIRL or GNS3. Cisco VIRL integration with
Drone is simple, as Cisco provides a plugin that you can tap into.
There are multiple ways to deploy your configuration; however, the most widely used and
preferred tool is Ansible. An Ansible deployment consists of YAML data files as well as
Jinja2 templates containing variables that are filled in from your YAML data.
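To make the YAML-plus-Jinja2 relationship concrete, here is a small sketch using the SNMP community strings from the example later in this section. The file paths and variable names are illustrative conventions, not the book’s exact files.

```yaml
# group_vars/all.yml -- hypothetical Ansible variable file
snmp_ro_community: NewSecureReadJuly4th
snmp_rw_community: 4NewSecureWriteJuly4th
```

```jinja
{# snmp.j2 -- hypothetical Jinja2 template rendered by Ansible #}
snmp-server community {{ snmp_ro_community }} RO
snmp-server community {{ snmp_rw_community }} RW
```

When Ansible renders the template, each `{{ ... }}` placeholder is replaced with the corresponding value from the YAML data, producing the IOS CLI lines to push to the device.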
To use GNS3 with Drone, you can integrate GNS3 by calling the GNS3 server RESTful APIs
directly from the .drone.yml pipeline file, along with the project information, to start and
stop your simulated network topologies. You can then use Ansible playbooks to deploy your
configurations into the simulated networks.
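As a sketch, the GNS3 API calls could be wrapped in pipeline steps like those below. The server address, project ID, and playbook name are placeholders, and the endpoint paths follow the GNS3 v2 REST API; verify them against the GNS3 server documentation for your version.

```yaml
# Hypothetical .drone.yml steps driving a GNS3 topology over its REST API.
  - name: start-topology
    image: curlimages/curl
    commands:
      - curl -X POST http://gns3-server:3080/v2/projects/PROJECT_ID/open
      - curl -X POST http://gns3-server:3080/v2/projects/PROJECT_ID/nodes/start

  - name: deploy-and-test
    image: python:3.8
    commands:
      - pip install ansible
      - ansible-playbook deploy_config.yml   # push configs into the simulated nodes

  - name: stop-topology
    image: curlimages/curl
    commands:
      - curl -X POST http://gns3-server:3080/v2/projects/PROJECT_ID/nodes/stop
```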
Now, let’s put our Ansible, YAML, and CICD knowledge to use and commit a real-life IOS
configuration change. In this example, I will modify the SNMP RW community strings on my
router, check in my changes, and kick off my CICD pipeline. CICD will run unit and integration
testing before my changes can be committed into the master branch and the new IOS
configuration is allowed into production.
I am using Gitea within a Docker container that’s running inside my VirtualBox VM, which in
turn is running on my Ubuntu Linux distro. You can run Gitea on just about any platform for
which Go provides a standalone binary, which includes Linux, Windows, and macOS/OS X.
Installing Gitea inside a Docker container is straightforward. After starting the setup via
docker-compose, Gitea should be available in your browser so you can finalize the
installation. You can connect to it using http://localhost:3000 and follow the installation
wizard. If you are coming from Gogs, you can refer to Gitea’s migration documentation for
details. You can do a “docker container ls” to get the details of your installation.
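A minimal docker-compose.yml for Gitea might look like this sketch (image tag, host ports, and volume path are illustrative; port 3000 matches the wizard URL above):

```yaml
# Hypothetical docker-compose.yml for a self-hosted Gitea instance.
version: '2'

services:
  gitea:
    image: gitea/gitea:latest
    ports:
      - "3000:3000"   # web UI, i.e. http://localhost:3000
      - "222:22"      # SSH for git push/pull
    volumes:
      - ./gitea:/data # persist repositories and settings
    restart: always
```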
When you log in, you will come to the following Gitea dashboard.
As you can notice, Gitea has a dashboard and separate links for issues, pull requests, and
explore. The dashboard features a summary of your config/code integration activity. In my
case, I have a dev_branch where I make all my changes, and my Drone server then facilitates
the testing.
Let’s now go through an example of the integration part of that NetDevOps pipeline. The
handoff between Gitea (SCM) and Drone (CI/CD) happens when I commit my config change into
dev_branch. As we previously discussed, since Gitea doesn’t support OAUTH2 for a seamless
handover between the CI and CD stages, I have to manually enter my user password each time
a commit is permitted to proceed from Gitea to Drone.
As a team member addresses any errors, automated testing tools such as Drone (along with
GNS3 or Cisco VIRL) will report at this stage whether the errors were successfully fixed and
when the config change is accepted. This continuous feedback loop dramatically increases a
NetDevOps member’s productivity for making frequent network changes while decreasing the
likelihood of network downtime.
Unless a network change is purely housekeeping, Continuous Delivery benefits the business
groups that are supposed to gain from the resulting network changes, because as soon as a
config is successfully accepted and tested in the CI stage, it is deployed to the given
networking devices. Optionally, business groups can verify that changes meet their
expectations and provide feedback to the NetDevOps team, who then address that feedback in
this stage.
Now let’s go back to making our first network change. In my example, I will be modifying the
read and write SNMP community strings to showcase my NetDevOps Gitea/Drone CICD pipeline.
I manually edited my all.yml file, which contains the global variables used in my Ansible
playbooks, and changed the SNMP community string values to NewSecureReadJuly4th and
4NewSecureWriteJuly4th. This was done before issuing git add and git commit. git add adds
your modified files to the staging area to be committed later, whereas git commit commits
the files that have been added and creates a new revision with a log message. If you do not
add any files, git will not commit anything. You can combine both actions with “git commit -a”.
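The edit/add/commit sequence just described can be reproduced with a few commands. This sketch uses a throwaway local repo (so there is no remote to push to); the file and branch names mirror the running example, while the old community string value is made up.

```shell
# Throwaway demo repo mirroring the workflow described above.
rm -rf /tmp/netdevops-demo && mkdir -p /tmp/netdevops-demo && cd /tmp/netdevops-demo
git init -q
git config user.email "netdevops@example.com" && git config user.name "NetDevOps"

# Baseline commit on the default branch, then a working branch for changes.
echo "snmp_rw_community: OldWriteString" > all.yml
git add all.yml && git commit -q -m "initial config variables"
git checkout -q -b dev_branch

# Edit the variable file, stage it, and commit with a log message.
echo "snmp_rw_community: 4NewSecureWriteJuly4th" > all.yml
git add all.yml                      # stage the modified file
git commit -q -m "rotate SNMP RW community string"
# "git commit -a" would have staged and committed tracked files in one step.

git log --oneline                    # shows both revisions
# A final "git push origin dev_branch" would trigger the CICD pipeline;
# it is omitted here because this demo repo has no remote configured.
```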
Now, the final step before Gitea triggers the Drone CI involves pushing the config change
using git push, which pushes your changes to the remote repository. You can see what happens
once I issue git push in the screenshot below.
With the push above, my changes went into dev_branch, which triggered the CI stage in both
Gitea as well as Drone, my build server. You can see 4 objects, since my change calls for the
removal of the two SNMP CLIs that set the READ and WRITE community strings and then replaces
them with the newer values.
Let’s now look at what happened on Gitea server after my git push.
You can see “-” and “+” denoting what was removed versus added in the configuration; it also
shows +2/-2 to summarize the same information. 10.0.10.1 is my switch IP address where these
changes are to be tested and then committed.
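The Gitea diff view just described corresponds to output of roughly this shape. This is illustrative only: the file path follows Ansible convention and the old string values are invented, but the new values and the +2/-2 summary match the example.

```diff
--- a/group_vars/all.yml
+++ b/group_vars/all.yml
@@ -1,2 +1,2 @@
-snmp_ro_community: OldReadString
-snmp_rw_community: OldWriteString
+snmp_ro_community: NewSecureReadJuly4th
+snmp_rw_community: 4NewSecureWriteJuly4th
```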
Let’s now look at what happened on the Drone server after my git push.
If you follow the commit message, “FullStackNetworker #1”, you can see that there was an
initial build->test cycle for dev_branch and, after successful completion, it was repeated
for the master branch before the change was finally deployed to my switches.
Let’s zoom into dev_branch first on the Drone server. On the right side, it shows the steps
taken around the Gitea/Drone handoff as part of the “Clone” stage within my CICD pipeline
configuration. My pipeline consists of the build start stage, followed by integration
testing, merging the change into master, and then notifying the NetDevOps team via Slack.
You can see that for 4 lines of change, the entire pipeline for dev_branch and the master
branch takes only about 3 minutes or so. If your config changes are larger, or your pipeline
is longer or more complex, it will take longer. The Drone server simply works as an
orchestrator and follows the flow of control within your pipeline file.
Below is a screenshot from my Slack channel, which shows that the NetDevOps team was
notified of the dev_branch and master build completion and the pipeline’s successful
execution.
Chapter Summary