NX-OS Port Profiles

As I become more familiar with NX-OS, I frequently find features that are meant to make life easier for us network admins and engineers. I’ve been informed that the days of CLI jockeys are rapidly coming to an end, and rightly so, but even with my best DevOps attempts I still find myself having to manually edit configs frequently. One of my least favorite tasks is adding a new VLAN to our ESX cluster, because there are just so many interfaces to touch. There must be a better way! Turns out, there is at least one better way (of many, I’m sure): port profiles.

Port Profile Overview

Port profiles are interface configuration templates that can be assigned to ports that have the same configuration requirements. If you’ve ever found yourself copying and pasting interface configurations on a box, then port-profiles can help you.

The limit to the number of ports that can inherit a profile is platform dependent: my Nexus 7700s show a limit of 16384, while my Nexus 9300s show 512.

Creating a port profile

Let’s walk through a really simple example of a port profile.

  1. First, we create the profile, and in so doing, define the type of interface to which the profile will be applied:

    NX9K(config)# port-profile type ?
      ethernet          Ethernet type
      interface-vlan    Interface-Vlan type
      port-channel      Port-channel type
    

    For this example we’ll create an Ethernet type. Note that on the Nexus 7Ks we can also use the loopback and tunnel types.

  2. Next we define the commands that will be applied to every interface

    NX9K(config)# port-profile type ethernet MY-TEST-PROFILE
    NX9K(config-port-prof)# switchport
    NX9K(config-port-prof)# switchport mode trunk
    NX9K(config-port-prof)# switchport trunk allowed vlan 10,20,30,40,50,100
    NX9K(config-port-prof)# spanning-tree port type edge trunk
    NX9K(config-port-prof)# no shutdown
    
  3. Lastly we change the state of the profile to enabled.

    NX9K(config-port-prof)# state enabled
    

That’s all you need to do to create a profile. We can review the configuration on our profile by using the show port-profile command:

NX9K# show port-profile

SHOW PORT_PROFILE

port-profile MY-TEST-PROFILE
 type: Ethernet
 description:
 status: enabled
 max-ports: 512
 inherit:
 config attributes:
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 10,20,30,40,50,100
  spanning-tree port type edge trunk
  no shutdown
evaluated config attributes:
 switchport
 switchport mode trunk
 switchport trunk allowed vlan 10,20,30,40,50,100
 spanning-tree port type edge trunk
 no shutdown
assigned interfaces:

This output gives us nearly all the info we need — the type of profile we created, the commands that it contains, commands that are actually being applied (evaluated), and any interfaces that are assigned to use this profile. At this point we haven’t assigned an interface so let’s do that now.

Assigning profiles to interfaces

To assign our newly created profile, we use the inherit port-profile interface sub-command:

interface eth101/1/1
  inherit port-profile MY-TEST-PROFILE

And that’s it! Very easy stuff here.

Now the best part comes days or months later when you need to modify the ports. You simply add the new command(s) to the profile, and all assigned interfaces automatically get the updated config.

Viewing interface and port profile configurations

The only thing to remember down the road is that your interfaces no longer show the actual configuration. A standard show running-config interface only shows the inherit command:

interface Ethernet101/1/16
    inherit port-profile MY-TEST-PROFILE

There are two ways you can see the commands as applied to each interface. First, you can display the full interface config using the command show port-profile expand-interface name PROFILE_NAME:

NX9K# sh port-profile expand-interface name MY-TEST-PROFILE

port-profile MY-TEST-PROFILE
 Ethernet101/1/16
  switchport mode trunk
  switchport trunk allowed vlan 10,20,30,40,50,100
  spanning-tree port type edge
  no shutdown

Or, you can use the command show run interface INTERFACE expand-port-profile:

NX9K# sh run int eth101/1/16 expand-port-profile

interface Ethernet101/1/16
 switchport mode trunk
 switchport trunk allowed vlan 10,20,30,40,50,100
 spanning-tree port type edge
 no shutdown

The difference here is that show port-profile expand-interface shows you all interfaces with that profile assigned, whereas show run interface displays only the single interface.

Inheritance

Another great feature of port-profiles is that they are inheritable. This allows you to modularize your configurations and reference them by profile name within other profiles. I came across a good example of this in a presentation from Cisco about using profile inheritance on the Nexus 1000V. In their example, they were applying the same switchport mode and vlan access settings but wanted to apply varying QoS policies. So in their example they had the following profiles:

port-profile WEB
 switchport mode access
 switchport access vlan 100
 no shut

port-profile WEB-GOLD
 inherit port-profile WEB
 service-policy output GOLD

port-profile WEB-SILVER
 inherit port-profile WEB
 service-policy output SILVER

interface Eth1/1
 inherit port-profile WEB-GOLD

interface Eth1/2
 inherit port-profile WEB-SILVER

The end result is that all assigned interfaces are configured as access ports in VLAN 100, but the QoS policy differs. Only 4 levels of inheritance are supported, so don’t go too crazy here.
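To make the inheritance idea concrete, here is a minimal Python sketch of how the evaluated command list for a child profile could be assembled. This is only my mental model, not how NX-OS implements it: the PROFILES dictionary, the evaluated_commands function, and the way MAX_LEVELS counts levels are all assumptions made for illustration.

```python
# Hypothetical model of the profiles above: name -> (parent profile or None, own commands).
PROFILES = {
    "WEB": (None, ["switchport mode access", "switchport access vlan 100", "no shut"]),
    "WEB-GOLD": ("WEB", ["service-policy output GOLD"]),
    "WEB-SILVER": ("WEB", ["service-policy output SILVER"]),
}

MAX_LEVELS = 4  # limit quoted in the text; how the platform counts levels is an assumption


def evaluated_commands(name, profiles=PROFILES):
    """Return the parent's commands first, followed by the profile's own commands."""
    chain = []
    current = name
    while current is not None:
        if current in chain:
            raise ValueError(f"inheritance loop at {current}")
        chain.append(current)
        if len(chain) > MAX_LEVELS:
            raise ValueError(f"more than {MAX_LEVELS} levels of inheritance")
        current = profiles[current][0]

    commands = []
    for profile in reversed(chain):  # walk from the root profile down to the child
        commands.extend(profiles[profile][1])
    return commands


if __name__ == "__main__":
    for command in evaluated_commands("WEB-GOLD"):
        print(command)
```

Calling evaluated_commands("WEB-SILVER") returns the same three WEB commands followed by the SILVER service policy, which is the "same base, different QoS" result described above.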

Things to remember

As you begin to work with profiles, there are some important things to remember as it relates to order of precedence in the commands that will take effect on the interface. Taken straight from the documentation:

The system applies the commands inherited by the interface or range of interfaces according to the following guidelines:

  • Commands that you enter under the interface mode take precedence over the port profile’s commands if there is a conflict. However, the port profile retains that command in the port profile.

  • The port profile’s commands take precedence over the default commands on the interface, unless the port-profile command is explicitly overridden by the default command.

  • When a range of interfaces inherits a second port profile, the commands of the initial port profile override the commands of the second port profile if there is a conflict.

  • After you inherit a port profile onto an interface or range of interfaces, you can override individual configuration values by entering the new value at the interface configuration level. If you remove the individual configuration values at the interface configuration level, the interface uses the values in the port profile again.

  • There are no default configurations associated with a port profile.
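The precedence rules above boil down to "the first source that defines a setting wins". Here is a rough Python sketch of that idea, assuming each command can be reduced to a setting name and a value. The effective_config function and the example settings are hypothetical and only meant to illustrate the ordering.

```python
def effective_config(interface_cmds, inherited_profiles):
    """Merge settings so that the first source defining a setting wins.

    interface_cmds: settings entered directly under the interface.
    inherited_profiles: profile settings, in the order they were inherited.
    """
    effective = {}
    for source in [interface_cmds, *inherited_profiles]:
        for setting, value in source.items():
            # setdefault keeps an existing value, so earlier sources take precedence
            effective.setdefault(setting, value)
    return effective


if __name__ == "__main__":
    # Illustrative values only, not taken from a real device.
    first_profile = {"switchport mode": "trunk", "spanning-tree port type": "edge trunk"}
    second_profile = {"switchport mode": "access", "mtu": "9216"}
    interface_level = {"mtu": "1500"}

    print(effective_config(interface_level, [first_profile, second_profile]))
    # Removing the interface-level override falls back to the profile value.
    print(effective_config({}, [first_profile, second_profile]))
```

In the first call the interface keeps its own mtu and the trunk mode from the first profile; in the second call, with the interface-level value removed, the mtu from the profile applies again.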

One other important detail from the documentation: a checkpoint is created any time you enable, modify, or inherit a profile, so the system can roll back to a good configuration if there are errors. A profile will never be partially applied; if there are errors, the config is backed out.

So go out and make your life easier — try out port profiles today!

SNMP polling interval granularity

I recently had a need to increase the granularity of SNMP monitoring on some critical network interfaces, and I thought I’d share what I’ve learned so far.

The reason behind the change was an issue where we briefly maxed out the bandwidth on one of our WAN circuits. For most companies this would be no big deal, especially since the burst in traffic was less than a minute in duration. However, because our customers are very sensitive to latency, this caused some noticeable delays.

The problem with interval monitoring is that you just don’t know what happens between the polls. The whole idea of interval monitoring is to get an approximation of what’s happening, so by design you’re going to miss some of the details. But how do you know what you miss?

Take a look at this picture to get an idea of what I’m talking about:

SNMP Granularity Example

As you can see, the longer the time between polls, the more detail is missed. This might not be a problem for you. I would say that the goal of most monitoring solutions is to look for traffic *trends*, and not necessarily sub-minute bursts. But if you need the extra detail, then it becomes quite a challenge, and there are limited solutions depending on the size of your budget.
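To put some numbers on this, here is a small Python sketch with made-up traffic: a 5 Mbps baseline with a 20-second burst at 100 Mbps. The average_mbps function and the sample values are assumptions for illustration only, not data from my circuits.

```python
def average_mbps(samples, interval):
    """Average per-second samples over consecutive windows of `interval` seconds."""
    averages = []
    for start in range(0, len(samples), interval):
        window = samples[start:start + interval]
        averages.append(sum(window) / len(window))
    return averages


# Made-up traffic: 20 s at 5 Mbps, a 20 s burst at 100 Mbps, then 20 s at 5 Mbps.
per_second = [5.0] * 20 + [100.0] * 20 + [5.0] * 20

if __name__ == "__main__":
    for interval in (1, 10, 60):
        peak = max(average_mbps(per_second, interval))
        print(f"{interval:>2}s polling: highest reported rate {peak:.1f} Mbps")
```

With one-second samples the burst shows up at the full 100 Mbps, while the single 60-second average works out to roughly 37 Mbps, which would never look like a saturated circuit on a graph.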

If money is no object, then you can buy one of the fancy network traffic monitors by Corvil or Netscout. These are awesome boxes that record every drop of data you feed through them, so that you can replay the exact issue over and over. You pay handsomely for that functionality.

cPacket Networks and Riverbed also have some cool products, but price is definitely a factor there as well.

The only way to get this kind of granularity on the cheap is to leverage SNMP and pump up the granularity.

IOS SNMP Configuration

Disclaimer: Do not apply these commands to your production environment until you have tested and fully understand the potential impact

By my rough estimation and testing, most Cisco devices (IOS, IOS XE, and NX-OS) will, by default, update their internal SNMP statistics roughly every 10 seconds. But there’s a hidden command in IOS that will let you adjust the interval at which IOS updates the internal statistics database.

snmp-server hc poll <interval>

The interval for this command is in hundredths of seconds. So to update at 1 second intervals, you need:

snmp-server hc poll 100

I have used this command successfully on Cisco ISR G2 routers, 7200 routers, and 4900M switches.

This does not work on IOS XE or NX-OS devices, including ASR1000s and Nexus 7Ks or 3Ks.

Network Management Systems

You might recall that I’m a big fan of Solarwinds for network monitoring. Unfortunately, the lowest interval you can configure for statistics collection in Solarwinds NPM is 1 minute. So I had to look elsewhere for a product that could poll more frequently.

Enter SevOne. Their platform allows you to configure ‘high-speed pollers’ which can be tuned down to 1 second intervals. The folks at SevOne are great to work with, and I received a lot of good guidance as I was configuring my instance.

This isn’t meant to be a review or comparison of the two products — SevOne solved a very specific problem for me and at a price point that was 1/10th the cost of the next closest solution, so props to them.

The other possible solutions are Cacti or other RRDtool based statistics/graphing systems. Since these are so flexible, I would expect that you could grab stats down to one second with these as well, although I haven’t verified this.

Request for Comments

There’s some discussion out there about the problems you can run into if you poll too frequently, one of which is polling more frequently than the agent updates. I definitely observed this problem as I was working on this, and saw the wild numbers you can get. If you have any thoughts on why polling more frequently would be bad, or might lead you to the wrong conclusions, please feel free to comment.
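Here is a rough Python sketch of the "polling faster than the agent updates" problem, assuming a steady 8 Mbps flow and an agent that only refreshes its octet counter every 10 seconds. The agent_counter and polled_rates functions are hypothetical, and the model ignores counter wrap and polling jitter.

```python
def agent_counter(t, true_bps=8_000_000, refresh=10):
    """Octet counter as exposed by an agent that refreshes every `refresh` seconds."""
    last_refresh = (t // refresh) * refresh
    return last_refresh * true_bps // 8


def polled_rates(poll_interval, duration=30):
    """Rates in bps computed from consecutive counter deltas, as an NMS would."""
    rates = []
    previous = agent_counter(0)
    for t in range(poll_interval, duration + 1, poll_interval):
        current = agent_counter(t)
        rates.append((current - previous) * 8 // poll_interval)
        previous = current
    return rates


if __name__ == "__main__":
    print("10s polling:", polled_rates(10))
    print(" 1s polling:", polled_rates(1))
```

Polling every 10 seconds reports the true 8 Mbps each time. Polling every second reports zero for most samples and then a single spike of ten times the real rate whenever the agent refreshes, which matches the wild numbers I saw.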

Cisco EZVPN with IOS Router and ASA

I had an interesting request come across my desk, where I needed to configure a site-to-site VPN for some internet connected devices, but the devices were not allowed to connect internally to our network. So basically, I needed to tunnel the internet traffic back to our headend without allowing access to the internal network. The remote location also wouldn’t have a static IP. Having used EZVPN in the past, I figured this would be another great use case. Unfortunately I spent way too many hours trying to find a good example of how to get this setup working, so I figured I’d share my config for anyone else who may be struggling with a similar setup.

Diagram

EZVPN with IOS and ASA

IOS Router Config (EZVPN Client)

crypto ipsec client ezvpn ez
 connect auto
 group MyTunnelGroup key MySecretKey
 mode client
 peer 10.10.10.1
 username MyVPNUser password MyPassword
 xauth userid mode local
!
interface Fa0/0
 description WAN Interface
 ip address dhcp
 crypto ipsec client ezvpn ez
!
interface Fa0/1
 description LAN Interface
 ip address 192.168.0.1 255.255.255.0
 crypto ipsec client ezvpn ez inside
!

The first section defines the properties for the EZVPN connection, and there are 3 items that need special attention:

  1. The group and key you configure here will match the TunnelGroup name and IKEv1 key you configure on the ASA
  2. The username and password are also defined on the ASA. This is the actual user that is being authenticated.
  3. The xauth mode needs to be configured as local so the router doesn’t have to prompt for credentials.

Other items to note:

  1. There are three modes for EZVPN: Client, Network Extension, and Network Extension Plus. If this were a true L2L VPN, I’d use Network Extension or Network Extension Plus so that there was direct IP-IP connectivity between hosts on either side of the VPN. Since I don’t need that, I’m configuring Client mode, which is similar to a PAT for all client traffic.
  2. The peer IP will be the outside address of your EZVPN server.

ASA Configuration (EZVPN Server)

access-list EZVPN-ACL standard deny 10.0.0.0 255.0.0.0
access-list EZVPN-ACL standard permit any4
!
group-policy MyGroupPolicy internal
group-policy MyGroupPolicy attributes
 dns-server value 8.8.8.8
 vpn-access-hours none
 vpn-simultaneous-logins 3
 vpn-idle-timeout 30
 vpn-session-timeout none
 vpn-filter value EZVPN-ACL
 vpn-tunnel-protocol ikev1
 group-lock none
 split-tunnel-policy tunnelall
 split-tunnel-all-dns enable
 vlan none
 nac-settings none
!
username MyVPNUser password MyPassword
username MyVPNUser attributes
 vpn-group-policy MyGroupPolicy
!
tunnel-group MyTunnelGroup type remote-access
tunnel-group MyTunnelGroup general-attributes
 default-group-policy MyGroupPolicy
tunnel-group MyTunnelGroup ipsec-attributes
 ikev1 pre-shared-key MySecretKey

The Tunnel Group defines the preshared key for the connection that was referenced in the group MyTunnelGroup key MySecretKey command on the client. The Tunnel Group config also points to a Group Policy that will control the policy for the tunnel. I created a new policy, but you could also use the default DfltGrpPolicy if it fit your needs.

Conclusion

The beautiful thing about EZVPN is that all of the policy aspects are controlled at the Server side. So while the current requirement is to block access to internal resources, I could easily change that on the server side without worrying about messing up the config on the client and bringing the tunnel down.

IT Crisis Management: Responding to and dealing with network outages

I’ve worked in the financial industry for almost a decade now, and in that time I’ve seen my fair share of network/system outages. When I started, I heard plenty of stories about “RGE’s” or “Resume Generating Events” as they were called around the office, and was jokingly advised to always keep my resume up to date. That’s not bad advice by itself, but I wanted to share my opinion on dealing with outages and the aftermath.

When I started, I lived in fear of what would happen if something went down during my watch. My heart skipped a beat each time an alert popped up on our monitoring system. And the first time an outage occurred, I was sure it was going to cost me my job.

Thankfully, that didn’t happen.

Over the years, not only have I grown less fearful of outages, I’ve learned that there are two phases to an outage: The outage itself (including the resolution), and the post-mortem.

The Outage

Obviously, if something is going wrong, the primary objective should be restoring service as soon as possible. Key to a rapid resolution is having clear channels of communication with other teams and knowing the most important people to bring into the loop during an issue. This is called being transparent about issues. Don’t wait to open channels of communication to partner teams or managers. In the financial industry information is key. Clients might be unhappy that you’re having problems, but they will be downright livid if you had an issue and you didn’t warn them. This can also have liability issues attached, so be sure to think out these things with your management.

Also vitally important is a strong understanding of the network, how things connect to each other, and having a good monitoring system in place. It almost goes without saying, but if you don’t know how things relate to each other, and you have no way to monitor your systems, you’ve got some other issues to deal with.

The Post Mortem

You’ve solved the issue and things are working smoothly again. Disaster mitigated! What happens after the outage is, in my experience, just as important as resolving the issue itself. I believe this is where you can really shine and set yourself apart.

A mentor of mine once told me that what people really want after an outage is the answers to three questions:

  1. What happened?
  2. What did you do to fix it?
  3. When did you know about it?

And he followed up by saying that out of the three questions, the last was really the most important. His opinion, and one I share now, is that accidents happen, equipment breaks, circuits get the backhoe treatment, etc. and you generally can’t totally avoid outages. What you can do (besides practicing good operational principles) is respond promptly to your alerting system or reports from users, and show initiative by investigating on the first call and not on the 10th. Act quickly and openly.

Depending on the nature of your team, it’s likely that only a few people will understand the true technical cause of an issue. As you communicate up the chain, the gruesome details will be lost. It’s like the telephone game:

Engineer to manager: "We started taking excessive CRC errors on an interface without the line actually going down."

Manager to CTO: "We had a line problem."

CTO to CEO: "It broke."

Ok, so maybe that’s a little too simplified, but you get the idea. You don’t speak the same language as the upper echelon of management, and conversely, they don’t speak your low-level technical language. All they care about is when did we know about it, and how soon did we fix it.

Everything Else

This is not meant to be an exhaustive description of everything that should happen before/during/after an outage. I also don’t want to get into the details of what constitutes good operational principles, but I’m thinking of things like exercising caution when implementing changes, not being cavalier about production environments, and only making changes after following proper change management procedures, and then only during approved change windows.

I’m also aware that there are different cultures around the human side of outages. Some companies or managers will roast you and leave you for dead no matter how small the misstep, while others will have a more forgiving approach. That’s another topic as well.

Just remember that no matter how deep your bureaucracy, or how carefully you plan, there will always be unexpected issues. Just be sure that you do everything you can to be responsive and transparent about the issues you face.

Have thoughts on this subject? I’d love to hear your comments!

FHRP Filtering on Cisco ASR1001 with OTV

I’m finally getting the chance to deploy OTV and LISP in a live environment and wanted to share one of the issues I’ve run into.

As I mentioned in my post about OTV Traffic Flow Considerations, using HSRP (or VRRP/GLBP) at each site has the potential to cause traffic to “trombone” through the network in a sub-optimal path. Because of this behavior, FHRP filtering should be configured on your OTV routers to ensure that the HSRP device on each side of the overlay becomes an active gateway for the network. The ASR1001 is supposed to have this built-in.

Here’s the topology:

Production OTV Diagram

The Problem

After I set up OTV and LISP, I noticed that I had spotty connectivity to my host inside the overlay. A continuous ping revealed that I was missing a ping or two almost every 60 seconds. When I looked at the route for that host, the age was always less than 1 minute. Since these routes are redistributed into OSPF, I went back to the OTV/LISP routers and tried to see what was happening.

On the OTV/LISP routers, I could see that the local LISP routes were also being inserted and withdrawn regularly, which meant that LISP thought the EID was moving to the other router. Since the LISP mapping system is in charge of communicating EID-to-RLOC mapping changes, I ran debug lisp control-plane map-server and observed the following output (abbreviated):

Oct  2 11:09:41.623 EDT: LISP: Processing received Map-Notify message from 10.5.7.82 to 10.5.8.82
...
Oct  2 11:09:41.623 EDT: LISP-0: Local dynEID MOBILE-VMS IID 0 prefix 10.78.1.245/32, Received map notify (rlocs: 1/1).
Oct  2 11:09:41.623 EDT: LISP-0: Local dynEID MOBILE-VMS IID 0 prefix 10.78.1.245/32, Map-Notify contains new locator 10.5.7.82, dyn-EID moved (rlocs: 1/1).

Since I hadn’t moved the VM across the overlay, it surprised me to see that LISP thought the VM was moving. After banging my head on the wall with that issue, I started looking lower in the stack at OTV.

During normal operation, the OTV routing table on the local OTV router (router closest to the host) should look like this:

SAV-OTVRTR2#sh otv route
...
OTV Unicast MAC Routing Table for Overlay1

Inst VLAN BD     MAC Address    AD    Owner  Next Hops(s)
----------------------------------------------------------
0    800  800    0000.0c07.ac4e 40    BD Eng Gi0/0/0:SI800
...

Note the route for 0000.0c07.ac4e, which is the MAC for HSRP group 78. This is an FHRP address, so should it even be showing up? Since it was there, I assumed that the FHRP filtering must only prevent the route from being advertised to OTV neighbors.
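If you want to double-check the group number, the HSRP version 1 virtual MAC is 0000.0c07.acXX, where XX is the group number in hex. A quick Python sketch of the conversion in both directions (the function names are my own, purely for illustration):

```python
def hsrp_v1_mac(group):
    """Build the HSRPv1 virtual MAC (0000.0c07.acXX) for a group number 0-255."""
    if not 0 <= group <= 255:
        raise ValueError("HSRPv1 group must be between 0 and 255")
    return f"0000.0c07.ac{group:02x}"


def hsrp_v1_group(mac):
    """Extract the HSRPv1 group number from a virtual MAC in dotted notation."""
    digits = mac.replace(".", "").lower()
    if len(digits) != 12 or not digits.startswith("00000c07ac"):
        raise ValueError(f"{mac} is not an HSRPv1 virtual MAC")
    return int(digits[10:], 16)


if __name__ == "__main__":
    print(hsrp_v1_group("0000.0c07.ac4e"))
    print(hsrp_v1_mac(78))
```

Hex 4e is decimal 78, which lines up with the HSRP group on this VLAN.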

But during one of the blips, I noticed this:

 Inst VLAN BD     MAC Address    AD    Owner  Next Hops(s)
----------------------------------------------------------
0    800  800    0000.0c07.ac4e 30    ISIS   RAD-OTVRTR2

So not only was the HSRP MAC showing up with FHRP Filtering enabled, but it was also still being advertised across the network. This shouldn’t be.

The Solution – for now

I opened a TAC case and consulted with Cisco about the issue. They agreed that it was “odd” that the HSRP information was leaking across the overlay and recommended I put in an ACL to block FHRP information:

mac access-list extended otv_filter_fhrp
 deny   0000.0c07.ac00 0000.0000.00ff host 0000.0000.0000
 deny   0000.0c9f.f000 0000.0000.0fff host 0000.0000.0000
 deny   0007.b400.0000 0000.00ff.ffff host 0000.0000.0000
 deny   0000.5e00.0100 0000.0000.00ff host 0000.0000.0000
 permit host 0000.0000.0000 host 0000.0000.0000

…and apply the ACL to the OTV Inside interface.
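To see what those wildcard masks cover, here is a small Python sketch that checks a source MAC against the four deny entries. The ranges correspond to the HSRPv1, HSRPv2, GLBP, and VRRP virtual MAC blocks. It only models the source-address half of each entry, and the helper names are hypothetical.

```python
def mac_to_int(mac):
    """Convert a dotted MAC such as 0000.0c07.ac4e to an integer."""
    return int(mac.replace(".", ""), 16)


# (base address, wildcard mask) pairs copied from the deny lines of the ACL.
DENY_ENTRIES = [
    ("0000.0c07.ac00", "0000.0000.00ff"),  # HSRPv1
    ("0000.0c9f.f000", "0000.0000.0fff"),  # HSRPv2
    ("0007.b400.0000", "0000.00ff.ffff"),  # GLBP
    ("0000.5e00.0100", "0000.0000.00ff"),  # VRRP
]


def is_denied(src_mac):
    """Return True if the source MAC matches any deny entry.

    A wildcard bit of 1 means "don't care", so only the bits where the
    wildcard is 0 are compared against the base address.
    """
    value = mac_to_int(src_mac)
    for base, wildcard in DENY_ENTRIES:
        care_bits = ~mac_to_int(wildcard) & 0xFFFFFFFFFFFF
        if value & care_bits == mac_to_int(base) & care_bits:
            return True
    return False


if __name__ == "__main__":
    for mac in ("0000.0c07.ac4e", "0000.5e00.0101", "0050.56ab.cdef"):
        print(mac, "denied" if is_denied(mac) else "permitted")
```

The HSRP MAC from the OTV route table falls inside the first entry, while an ordinary host MAC drops through to the final permit.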

You might notice that OTV automatically adds another ACL:

Extended IP access list otv_fhrp_filter_acl
    10 deny udp any any eq 1985 3222 (57416 matches)
    20 deny 112 any any
    30 permit ip any any (51921 matches)

This ACL blocks the UDP ports used for HSRP and GLBP, as well as IP Protocol 112, VRRP. This must be the portion that is added by default, but it doesn’t seem to be sufficient.

Conclusion

I asked Cisco why the extra ACL was necessary when the documentation indicates that FHRP filtering is built in and enabled by default. As soon as I hear something I’ll provide an update. As far as I know the Nexus 7K still requires you to manually configure these ACLs, but it seems that, for now, so do the ASRs.

 

Update: I heard back from Cisco TAC about my issue and they think my problem stems from the fact that I’m trying to use the same physical hardware for both the L2 bridging and the L3 gateway:

Due to the ASR1k architecture, it is recommended that you move FHRP off the ASR. It is unlike N7k architecture where we can keep FHRP on the same device and use a mix of MACLs, VACLs, etc to filter out the virtual MAC from going across the overlay. The only way to really prevent the virtual MAC from being learned across the overlay is to prevent the ASR from ever learning it in the first place.

With regard to the default OTV FHRP filtering, TAC confirmed that the otv_fhrp_filter_acl is added when OTV is configured. However, it doesn’t attempt to prevent L2 information from being learned; it only attempts to block actual HSRP communication across the overlay.

Redistributing Anyconnect VPN addresses into OSPF on Cisco ASA

I’m a big fan of the Cisco Anyconnect VPN client due to its easy configuration, and the relative ease of deployment to end users. When you deploy an Anyconnect VPN on your ASA, one of the important tasks is to decide how to advertise the VPN assigned addresses into the rest of your network. Fortunately, this is easy to accomplish using route redistribution.

Basic Setup

In this example, my VPN pool will be assigned from the 192.168.254.128/25 range, and I will redistribute these routes into OSPF. Notice that the ASA automatically creates a static host route for a connected client:

ASA# sh route | i 192.168.254
S    192.168.254.154 255.255.255.255 [1/0] via 1.1.1.1, Outside

So we have the building blocks for what we need, now let’s look at the configuration.

There are several different ways to accomplish this task, but I’ll demonstrate what I typically use.

Redistributing into OSPF

First, we’ll create a prefix list to match the address pool for our Anyconnect clients:

prefix-list VPN_PREFIX seq 1 permit 192.168.254.128/25 le 32

This prefix list entry matches any route that falls inside 192.168.254.128/25 and has a prefix length from 25 up to 32 bits (that is what le 32 adds). This works great, because our client routes will all be /32.
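If the ge/le matching logic is hard to picture, here is a minimal Python sketch of how I understand a single prefix-list entry to be evaluated, using the standard ipaddress module. The prefix_list_permits function is my own illustration and not something the ASA exposes.

```python
import ipaddress


def prefix_list_permits(route, entry, ge=None, le=None):
    """Check a route against one prefix-list entry.

    Without ge/le the route must match the entry exactly. With ge and/or le
    the route must fall inside the entry and have a prefix length within
    the allowed range.
    """
    entry_net = ipaddress.ip_network(entry)
    route_net = ipaddress.ip_network(route)
    if ge is None and le is None:
        return route_net == entry_net
    low = entry_net.prefixlen if ge is None else ge
    high = 32 if le is None else le
    return route_net.subnet_of(entry_net) and low <= route_net.prefixlen <= high


if __name__ == "__main__":
    entry = "192.168.254.128/25"
    for route in ("192.168.254.154/32", "192.168.254.128/25", "192.168.254.10/32"):
        print(route, prefix_list_permits(route, entry, le=32))
```

The /32 for a connected client and the /25 itself both match when le 32 is present, a host outside the pool does not, and dropping le 32 leaves only the exact /25 match.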

Next we’ll create a route-map that we can reference inside OSPF:

route-map VPN_POOL permit 1
    match ip address prefix-list VPN_PREFIX

And finally, we’ll enable redistribution in OSPF:

router ospf 1
    redistribute static subnets route-map VPN_POOL

If we look at the routing table on another router in our network, we should see the route:

RTR#sh ip route | i 192.168.254
O E2     192.168.254.154/32 [110/20] via 10.5.2.6, 00:05:03, Vlan85

Advertising the subnet instead of individual host routes

If you like to keep your routing tables uncluttered, you might be inclined to only redistribute the entire VPN prefix, instead of the /32 routes. The important thing to remember here is that OSPF will not redistribute a route that is not already in the routing table.

We’ll simply add a static route for the VPN prefix:

route outside 192.168.254.128 255.255.255.128 1.1.1.1

Without any other modifications, we will now see routes like this in our network:

RTR#sh ip route | i 192.168.254
O E2     192.168.254.128/25 [110/20] via 10.5.2.6, 00:07:28, Vlan85
O E2     192.168.254.154/32 [110/20] via 10.5.2.6, 00:07:28, Vlan85

But we want to get rid of the /32 routes. So we have two options now:

  1. Modify the prefix-list to match only the /25 route
  2. Modify the OSPF redistribution command to ignore subnets.

Option 1: Modify the prefix-list

We’ll change the prefix list so we don’t even consider subnets with different masks:

no prefix-list VPN_PREFIX seq 1 permit 192.168.254.128/25 le 32
prefix-list VPN_PREFIX seq 1 permit 192.168.254.128/25

Our redistribution command still has the subnets keyword, but since the prefix list no longer allows longer prefix lengths, we end up with just the one route.

Option 2: Modify the OSPF redistribution command

You can also remove the subnets keyword from the redistribution command:

router ospf 1
    redistribute static route-map VPN_POOL

This way it doesn’t matter if the prefix-list matches longer routes, OSPF just won’t redistribute them.

Final Configuration

In the end we have a configuration that looks something like this:

route outside 192.168.254.128 255.255.255.128 1.1.1.1
!
prefix-list VPN_PREFIX seq 1 permit 192.168.254.128/25
!
route-map VPN_POOL permit 1
 match ip address prefix-list VPN_PREFIX
!
router ospf 1
 redistribute static route-map VPN_POOL

The ASA will still show all of the /32 routes, plus the /25 route:

ASA# sh route | i 192.168.254
S    192.168.254.154 255.255.255.255 [1/0] via 1.1.1.1, Outside
S    192.168.254.128 255.255.255.128 [1/0] via 1.1.1.1, Outside

But routers inside the network will only see the /25 route:

RTR#sh ip route | i 192.168.254
O E2     192.168.254.128/25 [110/20] via 10.5.2.6, 01:45:03, Vlan85

I didn’t talk about modifying any of the OSPF metrics as the routes are being injected, but that would be something to consider if you do this in your environment.

IOS CLI historical interface graphs

I was researching something for a project recently and came across a feature I hadn’t seen before:  historical interface graphs.

With this feature, you can enable up to 72 hours of traffic statistics on your interfaces, and you can view this data via the CLI, similar to how ‘show proc cpu history’ works.

Check it out:

Interface-history-cli

I’d never seen this before, so I was quite excited. If you’re like me, and haven’t tried this out yet, here is how you configure it:

interface Gig0/0
    history {bps | pps} [filter]

The filter can include a lot of different items, including:

  • input-drops
  • input-error
  • output-drops
  • output-errors
  • overruns
  • pause-input
  • pause-output
  • crcs

and the list goes on. You can see a full table of supported filters in the IOS Command Reference. I found this worked on my 7200s, ASR1Ks, and ISRs.

A glimpse into LISP Control-Plane traffic

In our lab we were able to configure LISP and verify connectivity between our two hosts. One thing I noticed was the loss of the first two ICMP packets. Let’s walk through how LISP functions and examine what was happening behind the scenes.

Upon receipt of the first packet, the SITE1 router (acting as an ITR) checked the LISP map cache to see if it already had an RLOC mapping for the destination EID (172.16.30.101):

SITE1#sh ip lisp map-cache
LISP IPv4 Mapping Cache for EID-table default (IID 0), 2 entries

0.0.0.0/0, uptime: 00:10:16, expires: never, via static send map-request
  Negative cache entry, action: send-map-request

This negative cache entry tells the router that it needs to send a map request to see if there’s an RLOC mapping available. The router will send a map request for EID 172.16.30.101/32 to the Map resolver at 192.168.255.3, and drop the initial data packet (ping #1) from our host:

LISP-Map-Reply

The Map Resolver/Map Server checks the namespaces that have registered, and looks for the RLOC address of the ETR that is authoritative for the EID prefix:

MR_MS#sh lisp site name SITE2
Site name: SITE2
Allowed configured locators: any
Allowed EID-prefixes:
  EID-prefix: 172.16.30.0/24
    First registered:     00:31:21
    Routing table tag:    0
    Origin:               Configuration
    Merge active:         No
    Proxy reply:          No
    TTL:                  1d00h
    State:                complete
    Registration errors:
    Authentication failures:   0
    Allowed locators mismatch: 0
    ETR 192.168.100.2, last registered 00:00:50, no proxy-reply, map-notify
                        TTL 1d00h, no merge, hash-function sha1, nonce 0x0524E21F-0x489614BB
                        state complete, no security-capability
                        xTR-ID 0xDC4A8044-0x87093251-0x602669CB-0x5B720F12
                        site-ID unspecified
    Locator        Local  State      Pri/Wgt
    192.168.100.2  yes    up          10/50 

Once the Map Server determines the RLOC for the authoritative ETR, it forwards the Map-Request message.  The ETR receives the forwarded Map-Request, and responds with a Map-Reply:

LISP-Map-Reply

Once the SITE1 router receives the reply, it updates the local cache:

SITE1#sh ip lisp map-cache
LISP IPv4 Mapping Cache for EID-table default (IID 0), 3 entries

0.0.0.0/0, uptime: 00:10:35, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
0.0.0.0/1, uptime: 00:09:25, expires: 00:05:34, via map-reply, forward-native
  Negative cache entry, action: forward-native
172.16.30.0/24, uptime: 00:09:22, expires: 23:50:38, via map-reply, complete
  Locator        Uptime    State      Pri/Wgt
  192.168.100.2  00:09:22  up          10/50

Now that the router has a complete LISP cache, it can encapsulate packets in LISP headers and send them on their way.
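
The forwarding decision the router makes against this cache can be sketched in a few lines of Python. This is only a toy model of a longest-prefix match over the three entries shown above; the `MAP_CACHE` table, the `lookup` helper and the `encapsulate` action label are my own names for illustration, not anything IOS exposes.

```python
import ipaddress

# Map-cache entries copied from the SITE1 output: prefix -> (action, RLOC).
# "encapsulate" is a label invented here for the "complete" entry.
MAP_CACHE = {
    "0.0.0.0/0": ("send-map-request", None),
    "0.0.0.0/1": ("forward-native", None),
    "172.16.30.0/24": ("encapsulate", "192.168.100.2"),
}


def lookup(destination):
    """Return (prefix, action, rloc) for the longest matching prefix."""
    address = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in MAP_CACHE
        if address in ipaddress.ip_network(prefix)
    ]
    # The most specific (longest) prefix wins, as in a routing table.
    best = max(matches, key=lambda network: network.prefixlen)
    action, rloc = MAP_CACHE[str(best)]
    return str(best), action, rloc


print(lookup("172.16.30.101"))  # matches the /24 learned via map-reply
print(lookup("192.168.5.101"))  # only the default entry matches
```

The SITE2 host now hits the specific /24 entry and is encapsulated towards 192.168.100.2, while anything that only matches 0.0.0.0/0 would trigger a new map request.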

LISP-Data-Packet-Header

LISP Data Packet Payload

In this setup, it’s interesting to note that we lost the first two ICMP packets to the control-plane process. The first packet was dropped by the SITE1 router as it went through the Map Request/Map Reply process to build the local cache. The second packet actually made it through to the other host, but the response was dropped by the SITE2 router as it also had to build the local cache. You can see some of that below:

Ping response sequence
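
The two lost packets can be reproduced with a small toy model. The sketch below assumes that each xTR drops the first packet towards an unknown EID prefix and that the mapping is resolved before the next ping arrives; the function name and the status strings are made up for illustration.

```python
def simulate_pings(count):
    """Toy model of the ping sequence between the SITE1 and SITE2 hosts."""
    site1_cache = set()  # EID prefixes the SITE1 router has resolved
    site2_cache = set()  # EID prefixes the SITE2 router has resolved
    results = []
    for seq in range(1, count + 1):
        # Echo request: SITE1 acts as the ITR towards 172.16.30.0/24.
        if "172.16.30.0/24" not in site1_cache:
            site1_cache.add("172.16.30.0/24")  # assume the map-reply arrives in time
            results.append((seq, "request dropped at SITE1"))
            continue
        # Echo reply: SITE2 acts as the ITR towards 192.168.5.0/24.
        if "192.168.5.0/24" not in site2_cache:
            site2_cache.add("192.168.5.0/24")
            results.append((seq, "reply dropped at SITE2"))
            continue
        results.append((seq, "reply received"))
    return results


for seq, outcome in simulate_pings(5):
    print(seq, outcome)
```

With five pings, the model loses sequence numbers 1 and 2 and delivers 3 through 5, which lines up with the host output from the lab.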

Once the caches have been built, subsequent attempts are 100% successful:

[root@SITE1 ~]# ping -c 5 172.16.30.101
PING 172.16.30.101 (172.16.30.101) 56(84) bytes of data.
64 bytes from 172.16.30.101: icmp_seq=1 ttl=255 time=1.62 ms
64 bytes from 172.16.30.101: icmp_seq=2 ttl=255 time=1.41 ms
64 bytes from 172.16.30.101: icmp_seq=3 ttl=255 time=1.36 ms
64 bytes from 172.16.30.101: icmp_seq=4 ttl=255 time=1.48 ms
64 bytes from 172.16.30.101: icmp_seq=5 ttl=255 time=1.33 ms

--- 172.16.30.101 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 1.338/1.442/1.626/0.114 ms

Conclusion

I think there are a few important items to remember about the LISP forwarding process. These probably seem obvious, but I still want to point them out:

  1. The mapping system is really more of a ‘director’, in that it doesn’t actually know the answers to queries, but knows who to talk to in order to find out.
  2. In this lab, LISP control-plane packets always used the same source and destination port: UDP/4342.
  3. LISP data-plane packets are always destined for UDP/4341, but the source port will change.
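
To illustrate the last point, here is a rough sketch of why the data-plane source port moves around. The LISP specification suggests deriving it from a hash of the inner packet's flow fields so that load-balancing in the core can spread flows across paths. The `lisp_source_port` helper, the CRC32 hash and the 49152-65535 range are my own illustrative choices and are not what any particular router implements.

```python
import zlib

LISP_DATA_PORT = 4341     # destination port for LISP data packets
LISP_CONTROL_PORT = 4342  # well-known port for LISP control packets


def lisp_source_port(src_ip, dst_ip, protocol, src_port=0, dst_port=0):
    """Pick a UDP source port from a hash of the inner flow (hypothetical helper)."""
    flow = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    # Map the hash into the dynamic port range 49152-65535.
    return 49152 + (zlib.crc32(flow) % 16384)


# Two different inner flows between the lab hosts.
icmp_flow = lisp_source_port("192.168.5.101", "172.16.30.101", "icmp")
ssh_flow = lisp_source_port("192.168.5.101", "172.16.30.101", "tcp", 50000, 22)
print(icmp_flow, "->", LISP_DATA_PORT)
print(ssh_flow, "->", LISP_DATA_PORT)
```

The destination port stays fixed at 4341 for every flow, while the source port is stable per flow but differs between flows.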

Cisco has a website dedicated to LISP and you can find a great deal of information there, including the devices and software versions that support LISP.  I’d highly recommend checking it out:  Cisco LISP

Simple LISP Lab

We’re going to start with a very simple topology:

LISP-topology

Instead of using one of our standard routing protocols to advertise the host networks between the SITE1 and SITE2 routers, we’ll rely on LISP. OSPF will be used to advertise the loopback on the MS/MR router within the Core network. We’ll start configuring LISP assuming the basic network is ready to go.

First, let’s set up LISP on the SITE1 router:

router lisp
 locator-set SITE1
  192.168.100.1 priority 10 weight 50
  exit
 !
 database-mapping 192.168.5.0/24 locator-set SITE1
 !
 ipv4 itr map-resolver 192.168.255.3
 ipv4 itr
 ipv4 etr map-server 192.168.255.3 key secretkey
 ipv4 etr
!

Now we’ll configure LISP on the SITE2 router:

router lisp
 locator-set SITE2
  192.168.100.2 priority 10 weight 50
  exit
 !
 database-mapping 172.16.30.0/24 locator-set SITE2
 !
 ipv4 itr map-resolver 192.168.255.3
 ipv4 itr
 ipv4 etr map-server 192.168.255.3 key secretkey
 ipv4 etr
!
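
Since the two site configurations differ only in the site name, the RLOC and the EID prefix, they lend themselves to a template. The sketch below is one way to render them with Python; `render_xtr_config` is a hypothetical helper, and it only emits the commands already used in this lab.

```python
def render_xtr_config(site_name, rloc, eid_prefix,
                      mapping_system="192.168.255.3", key="secretkey"):
    """Render the lab's xTR configuration for one site as a string."""
    lines = [
        "router lisp",
        f" locator-set {site_name}",
        f"  {rloc} priority 10 weight 50",
        "  exit",
        " !",
        f" database-mapping {eid_prefix} locator-set {site_name}",
        " !",
        f" ipv4 itr map-resolver {mapping_system}",
        " ipv4 itr",
        f" ipv4 etr map-server {mapping_system} key {key}",
        " ipv4 etr",
        "!",
    ]
    return "\n".join(lines)


print(render_xtr_config("SITE1", "192.168.100.1", "192.168.5.0/24"))
print(render_xtr_config("SITE2", "192.168.100.2", "172.16.30.0/24"))
```

Adding a third site would then only need a new name, RLOC and EID prefix, plus a matching site entry on the MS/MR router.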

Finally, we’ll configure the Map Server and Map Resolver functionality on the MS/MR router:

router lisp
 site SITE1
  authentication-key secretkey
  eid-prefix 192.168.5.0/24
  exit
 !
 site SITE2
  authentication-key secretkey
  eid-prefix 172.16.30.0/24
  exit
 !
 ipv4 map-server
 ipv4 map-resolver
 exit

Once this is configured, our LISP infrastructure should be complete. Let’s check:

SITE1#sh ip lisp
  Instance ID:                      0
  Router-lisp ID:                   0
  Locator table:                    default
  EID table:                        default
  Ingress Tunnel Router (ITR):      enabled
  Egress Tunnel Router (ETR):       enabled
  Proxy-ITR Router (PITR):          disabled
  Proxy-ETR Router (PETR):          disabled
  Map Server (MS):                  disabled
  Map Resolver (MR):                disabled
  Delegated Database Tree (DDT):    disabled
  Map-Request source:               192.168.5.1
  ITR Map-Resolver(s):              192.168.255.3
  ETR Map-Server(s):                192.168.255.3 (00:00:55)
  xTR-ID:                           0x3D4C0900-0x95932BE0-0x2F5AF1F6-0x4C919E94
  ...

And over on SITE2:

SITE2#sh ip lisp
  Instance ID:                      0
  Router-lisp ID:                   0
  Locator table:                    default
  EID table:                        default
  Ingress Tunnel Router (ITR):      enabled
  Egress Tunnel Router (ETR):       enabled
  Proxy-ITR Router (PITR):          disabled
  Proxy-ETR Router (PETR):          disabled
  Map Server (MS):                  disabled
  Map Resolver (MR):                disabled
  Delegated Database Tree (DDT):    disabled
  Map-Request source:               172.16.30.1
  ITR Map-Resolver(s):              192.168.255.3
  ETR Map-Server(s):                192.168.255.3 (00:00:16)
  xTR-ID:                           0xDC4A8044-0x87093251-0x602669CB-0x5B720F12
  ...  

On the MS/MR router, we can see that the ETRs have registered with the LISP mapping system. Without this, LISP wouldn’t know the EID-to-RLOC mapping for each EID.

OTV-RTR3#sh lisp site
LISP Site Registration Information

Site Name      Last      Up   Who Last             Inst     EID Prefix
               Register       Registered           ID
SITE1          00:00:14  yes  192.168.100.1                 192.168.5.0/24
SITE2          00:00:17  yes  192.168.100.2                 172.16.30.0/24
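
If you want to check registrations from a script rather than by eye, the table above is simple enough to parse. This is a minimal sketch that assumes one registration per line and an empty instance-ID column, exactly as in this lab output; `parse_lisp_sites` is a name I made up, and other layouts would need a more careful parser.

```python
SHOW_LISP_SITE = """\
Site Name      Last      Up   Who Last             Inst     EID Prefix
               Register       Registered           ID
SITE1          00:00:14  yes  192.168.100.1                 192.168.5.0/24
SITE2          00:00:17  yes  192.168.100.2                 172.16.30.0/24
"""


def parse_lisp_sites(output):
    """Return {site: {"up": bool, "rloc": str, "eid_prefix": str}}."""
    sites = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have five fields; the third is assumed to be "yes" or "no".
        if len(fields) == 5 and fields[2] in ("yes", "no"):
            name, _last_register, up, rloc, prefix = fields
            sites[name] = {"up": up == "yes", "rloc": rloc, "eid_prefix": prefix}
    return sites


for name, info in parse_lisp_sites(SHOW_LISP_SITE).items():
    print(name, info)
```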

Now, to show that LISP is working correctly, let’s first show that we don’t have a route to the opposite site:

SITE1#sh ip route 172.16.30.101
% Network not in table
SITE1#ping 172.16.30.101
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.30.101, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

SITE2# sh ip route 192.168.5.101
% Network not in table
SITE2#ping 192.168.5.101
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.5.101, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

And since we didn’t configure a default route, there isn’t anything to fall back on.

Let’s test by pinging the SITE2 host:

[root@SITE1 ~]# ping 172.16.30.101
PING 172.16.30.101 (172.16.30.101) 56(84) bytes of data.
64 bytes from 172.16.30.101: icmp_seq=3 ttl=255 time=1.26 ms
64 bytes from 172.16.30.101: icmp_seq=4 ttl=255 time=4.57 ms
64 bytes from 172.16.30.101: icmp_seq=5 ttl=255 time=1.31 ms
^C
--- 172.16.30.101 ping statistics ---
5 packets transmitted, 3 received, 40% packet loss, time 4421ms
rtt min/avg/max/mdev = 1.268/1.374/1.466/0.086 ms

Looks good except for the two we lost at the beginning. Let’s try the reverse ping now:

[root@SITE2 ~]# ping 192.168.5.101
PING 192.168.5.101 (192.168.5.101) 56(84) bytes of data.
64 bytes from 192.168.5.101: icmp_seq=1 ttl=255 time=1.49 ms
64 bytes from 192.168.5.101: icmp_seq=2 ttl=255 time=1.31 ms
64 bytes from 192.168.5.101: icmp_seq=3 ttl=255 time=1.26 ms
64 bytes from 192.168.5.101: icmp_seq=4 ttl=255 time=4.57 ms
64 bytes from 192.168.5.101: icmp_seq=5 ttl=255 time=1.31 ms
^C
--- 192.168.5.101 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4001ms
rtt min/avg/max/mdev = 1.178/1.278/1.398/0.090 ms

So we have two-way communication across our network, using LISP to locate and reach the opposite site. Next time we’ll dig a little deeper on what’s happening behind the scenes.

Cisco LISP

With technologies like OTV, we need a method to optimize traffic destined for our mobile virtualized hosts, by tracking the location and updating our underlying routing system. One possible solution is known as LISP.

LISP stands for Locator/ID Separation Protocol, and its function is to allow you to separate the location component of an IP address from the identity portion of the address. Or as the Cisco LISP configuration guide states, LISP “implements the use of two namespaces instead of a single IP address.” What does this mean?

With LISP you are introduced to two new concepts:

  • Endpoint Identifiers (EID)
  • Routing Locators (RLOC)

The endpoint identifier (EID) is the address used to identify a specific host — this is the same as the IP addresses you use today, and it is said to be in the LISP namespace. The Routing Locator (RLOC) is the address of a router that is part of the normal routing domain but is connected to the LISP namespace and the non-LISP namespace. The RLOC is said to be part of the non-LISP namespace.

One significant difference with LISP is that you no longer have to advertise the EID address space into the normal routing domain. Instead, you rely on LISP to provide mappings between EIDs and RLOCs, and you route based on the RLOC address.

I like to think of LISP as a DNS-like system for routing, because an important part of LISP is a mapping system that maintains EID-to-RLOC mappings. Just like you use DNS to query for name-to-IP mappings, a LISP router performs a query against a LISP mapping server to find out the RLOC that should be used to reach the desired EID.

Components of a LISP system

A complete LISP infrastructure will consist of many parts:

  • Ingress Tunnel Routers (ITR)
  • Egress Tunnel Routers (ETR)
  • XTR
  • Map Server (MS)
  • Map Resolver (MR)
  • Proxy Ingress Tunnel Routers (PITR)
  • Proxy Egress Tunnel Routers (PETR)
  • PXTR

ITR

The ingress tunnel router receives unencapsulated IP packets from the EID namespace, and is responsible for performing lookups to identify EID-to-RLOC mappings for destination addresses. If the packet is destined for an EID in another LISP namespace, the ITR will encapsulate each packet with a LISP header and route the packet towards the identified RLOC. If the packet is destined for a non-LISP address, the packet is routed without any LISP modifications.

ETR

The egress tunnel router receives LISP encapsulated packets from the non-LISP portion of the network, removes the LISP header, and delivers the unencapsulated packets to the EID. The ETR is also responsible for keeping the Mapping system up to date with EID mappings and responding to Mapping system requests.
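
To make the ITR and ETR roles a bit more concrete, here is a rough Python sketch of adding and removing the LISP data header. It follows my reading of the 8-byte header layout in the LISP specification (a flags byte, a 24-bit nonce, then a 32-bit word for locator-status bits), leaves out the outer IP and UDP headers entirely, and uses function names of my own choosing.

```python
import struct

LISP_HEADER = struct.Struct("!II")  # two 32-bit words, network byte order
N_BIT = 0x80  # flag in the first byte: a nonce is present


def encapsulate(inner_packet, nonce):
    """ITR side: prepend an 8-byte LISP data header to the inner packet."""
    first_word = (N_BIT << 24) | (nonce & 0xFFFFFF)
    # Second word (locator-status bits) is left at zero in this sketch.
    return LISP_HEADER.pack(first_word, 0) + inner_packet


def decapsulate(lisp_packet):
    """ETR side: strip the LISP header and return (nonce, inner_packet)."""
    first_word, _second_word = LISP_HEADER.unpack_from(lisp_packet)
    nonce = first_word & 0xFFFFFF
    return nonce, lisp_packet[LISP_HEADER.size:]


inner = b"original IP packet from the EID host"
wire = encapsulate(inner, nonce=0x123456)
print(len(wire) - len(inner))  # size of the added LISP header in bytes
print(decapsulate(wire)[1] == inner)
```

A real ITR would also wrap this in an outer UDP and IP header addressed to the RLOC, which is what actually gets routed across the core.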

LISP XTR

An XTR is a router that performs both ETR and ITR functions.

Map Server

The map server receives EID registrations from ETRs, and responds to map request messages that are forwarded from map resolvers.

Map Resolver

The map resolver receives encapsulated map request messages from ITRs and forwards them to Map Servers that are authoritative for the EID namespace being queried.
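
The split between the two roles can be sketched as a pair of toy Python classes. This is only a model of the behavior described above, with class and method names I invented: the Map Server stores registrations and picks the ETR that should receive a request, while the Map Resolver just hands ITR requests over to it. The prefixes and RLOCs are the ones from the lab post.

```python
import ipaddress


class MapServer:
    """Toy Map Server: stores ETR registrations and forwards requests."""

    def __init__(self):
        self.registrations = {}  # EID prefix -> RLOC of the registering ETR

    def register(self, eid_prefix, etr_rloc):
        self.registrations[ipaddress.ip_network(eid_prefix)] = etr_rloc

    def forward_map_request(self, eid):
        """Return the RLOC of the ETR that should receive the map request."""
        address = ipaddress.ip_address(eid)
        # First match is enough here because the lab prefixes do not overlap.
        for prefix, etr_rloc in self.registrations.items():
            if address in prefix:
                return etr_rloc
        return None  # no site has registered this EID


class MapResolver:
    """Toy Map Resolver: passes ITR map requests on to the Map Server."""

    def __init__(self, map_server):
        self.map_server = map_server

    def handle_map_request(self, eid):
        return self.map_server.forward_map_request(eid)


map_server = MapServer()
map_server.register("192.168.5.0/24", "192.168.100.1")
map_server.register("172.16.30.0/24", "192.168.100.2")
map_resolver = MapResolver(map_server)
print(map_resolver.handle_map_request("172.16.30.101"))
```

Note that neither class builds the final answer; in a real deployment the ETR at the returned RLOC is the one that sends the map reply back to the ITR.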

PITR/PETR/PXTR

The Proxy ITR/Proxy ETR allows non-LISP sites to communicate with LISP sites, and vice versa, by performing ITR and ETR functionality. They can be deployed separately, or together. If deployed together, the device is referred to as a PXTR.

Conclusion

It took me a little while to wrap my head around the LISP concept, and one big help was working with it in a lab environment.  In the next post, I’ll walk through a lab scenario to demonstrate LISP in action.