SDN

Today's traditional network paradigm: the control plane and the data plane reside within the same physical device.
Control plane: routing protocols, spanning tree, syslog, AAA, NetFlow, CLI, SNMP. These are handled by the switch CPU, which deals with traffic on the order of thousands of packets per second.
Data plane: Layer 2 switching, Layer 3 switching, MPLS forwarding, VRF forwarding, QoS marking, classification, policing, NetFlow flow collection, security access control lists. These are handled by dedicated hardware ASICs at millions or billions of packets per second.
Over the years this network paradigm remained mostly intact, until around 2012, when SDN arrived.
SDN definition: SDN is an approach to building computer networks that separates and abstracts elements of these systems. The separation here refers to the separation of the control plane and the data plane.
In the SDN paradigm, not all processing happens inside the same device. The control plane is pulled out of the physical device and run somewhere else in the network. The device is still likely to keep a local control plane, in which case the two control planes have different functions.
Where did SDN come from? From Stanford University.
An important point to keep in mind is that OpenFlow does not equal SDN. It is one of the tools in the SDN toolbox.
What is OpenFlow?
OpenFlow is a communications protocol that gives access to the forwarding plane of a network switch or router over the network. (It is not a Layer 2 protocol: the controller and the switch talk over a TCP or TLS connection.)
1) Original motivation:
– The research community's desire to experiment with new control paradigms
2) Base assumption:
– Providing a reasonable abstraction for control requires the control system topology to be decoupled from the physical network topology (as in the top-down approach)
– Starting point: data-plane abstraction, i.e. separate the control plane from the devices that implement the data plane
3) OpenFlow was designed to facilitate separation of the control and data planes in a standardized way
4) The current spec is both a device model and a protocol
– OpenFlow device model: an abstraction of a network element (switch/router); currently (version <= 1.3.0) focused on forwarding-plane abstraction
– OpenFlow protocol: a communications protocol that provides access to the forwarding plane of an OpenFlow device
The control plane and the data plane communicate through the OpenFlow protocol.

Four Parts of OpenFlow
1) Controller: Resides on a server and provides the control plane function for the network
2) OpenFlow agent: Resides on a network device (router/switch) and fulfils requests from the controller
3) Northbound APIs: Enable applications to interface with the controller
4) OpenFlow protocol: The protocol that the controller and the agent use to communicate
The controller acts as the abstraction layer.

Two OpenFlow switch models
1) OpenFlow-only switch: All control plane functionality is pulled out of the switch and sits on the OpenFlow controller. The switch has two components: 1) the flow table and 2) OF interfaces. All computation happens on the OpenFlow controller, which pushes the results down to the flow table. When a packet comes in, the switch makes its decision based on the flow table.
Reactive:
1) First packet of a flow triggers the controller to insert flow entries
2) Efficient use of the flow table
3) Every flow incurs a small additional flow setup time
4) If the control connection is lost, the switch has limited utility
Proactive:
1) Controller pre-populates the flow table in the switch
2) Zero additional flow setup time
3) Loss of the control connection does not disrupt traffic
4) Essentially requires aggregated (wildcard) rules

2) OpenFlow Hybrid Switch
Here we have the OpenFlow controller and, at the same time, a control plane on the switch itself, so some decisions are made by the switch's local control plane. In this mode the switch has two types of interface: OF (OpenFlow) interfaces and regular switch interfaces.
Reactive Switch Operation
1) Data enters switch
2) Lookup key compared to Flow table(Lookup key created using the header of the packet)
3) If Match, Forward to Switch forwarding Engine
4) If no Match, Forward to controller
5) Controller injects new Flow Entry
6) Switch Forwards Data
Proactive Switch Operation
1) OF controller programs switch Flow Table
2) Data Enters switch
3) Lookup Key Compared to Flow Table
4) If No match, DROP
5) If match, Forward to switch forwarding engine
6) Switch Forwards data.
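The two step lists above can be sketched as a toy model in Python. This is only an illustration, not OpenFlow itself: the Switch class, the toy_controller function, and the use of (source, destination) as the lookup key are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy switch: a flow table keyed by a lookup key built from packet headers."""
    mode: str                                   # "reactive" or "proactive"
    flow_table: dict = field(default_factory=dict)

    def handle(self, pkt, controller=None):
        key = (pkt["src"], pkt["dst"])          # lookup key (assumed: src/dst only)
        action = self.flow_table.get(key)
        if action is not None:
            return action                       # match: hand to the forwarding engine
        if self.mode == "proactive":
            return "drop"                       # no match in proactive mode: drop
        # Reactive mode: ask the controller, which injects a new flow entry.
        action = controller(key)
        self.flow_table[key] = action
        return action


def toy_controller(key):
    """Hypothetical controller policy: send every new flow out of port 1."""
    return "forward:port1"


pkt = {"src": "10.0.0.1", "dst": "10.0.0.2"}

reactive = Switch(mode="reactive")
first = reactive.handle(pkt, toy_controller)    # miss: controller is consulted
second = reactive.handle(pkt, toy_controller)   # hit: served from the flow table

proactive = Switch(mode="proactive",
                   flow_table={("10.0.0.1", "10.0.0.2"): "forward:port2"})
hit = proactive.handle(pkt)
miss = proactive.handle({"src": "10.0.0.9", "dst": "10.0.0.2"})
```

In the reactive case only the first packet of a flow pays the controller round trip; in the proactive case there is no round trip at all, but anything the controller did not anticipate is dropped.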

Flow Table
Each entry has three fields:
1) Header fields: the match fields, such as source IP, destination IP, MAC address, VLAN ID, etc.
2) Counters: track the packets and bytes that matched the entry
3) Actions: what to do with matching packets
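As a rough sketch of how the three fields work together, the Python below models a flow entry with match fields, counters and actions. The field names (dst_ip, vlan) and the rule that a missing field acts as a wildcard are assumptions for illustration, not taken from the OpenFlow specification.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    match: dict              # header fields; a field that is absent acts as a wildcard
    actions: list            # what to do with matching packets
    packet_count: int = 0    # counters
    byte_count: int = 0

    def matches(self, headers):
        return all(headers.get(name) == value for name, value in self.match.items())


def lookup(table, headers, size):
    """Return the actions of the first matching entry and update its counters."""
    for entry in table:
        if entry.matches(headers):
            entry.packet_count += 1
            entry.byte_count += size
            return entry.actions
    return None              # table miss


table = [
    FlowEntry(match={"dst_ip": "10.0.0.2", "vlan": 10}, actions=["output:2"]),
    FlowEntry(match={"vlan": 20}, actions=["drop"]),   # wildcard on all but VLAN
]

result = lookup(table, {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "vlan": 10}, size=100)
no_match = lookup(table, {"vlan": 30}, size=64)
```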

Who controls OpenFlow?
The Open Networking Foundation (ONF).

Deployment models in SDN terms…
1) Classic SDN
2) Hybrid SDN
Both models have a controller, but only in the hybrid model does the device keep its own control plane. In either model, the protocol between the controller and the data plane is not only OpenFlow: it can be one of several other protocols (PCEP, BGP-LS, etc.), and there can be vendor-specific mechanisms as well (for example Cisco onePK).

Controller/Agent: Old concept, new apps…
1) Networking already leverages a great breadth of agents and controllers. Current agent-controller pairs always serve a specific task (or set of tasks) in a specific domain.
2) System design: there is a trade-off between agent-controller and fully distributed control. Control loop requirements differ per function/service and deployment domain.
Examples: 1) Session border control 2) Wireless LAN control 3) Path computation.

Controller and Agents
1) Some network-delivered functionality benefits from logically centralized coordination across multiple network devices.
– Functionality is typically domain-, task- or customer-specific
– Typically multiple controller/agent pairs are combined for a network solution
2) Controller
– Process on a device interacting with a set of devices using a set of APIs or protocols
– Offers a control interface/API
3) Agent
– Process on a device that delivers a task/domain-specific function

The aim of the OpenDaylight project is to create an open source controller.
Project OpenDaylight goals
1) Code: To create a robust, extensible, open source code base that covers the major common components required to build an SDN solution
2) Acceptance: To get broad industry acceptance amongst vendors and users
3) Community: To have a thriving and growing technical community contributing to the code base, using the code in commercial products, and adding value above, below and around.
Cisco's XNC is part of the OpenDaylight effort.

Beyond SDN: Full Network Programmability
Fully distributed control plane: optimized for reliability.
We are now moving towards a hybrid control plane: distributed control combined with logically centralized control for optimized behaviour (e.g. reliability and performance).
Apart from hybrid SDN and classic SDN, we have programmable APIs, a simple way to talk to a router/switch programmatically. At Cisco this is done through onePK. What does talking to the device mean? Take a routing protocol that builds the RIB: through programmable APIs I can influence the RIB. For example, OSPF says to take route A from 1 to 2, but through the API you can tell the router/switch not to take that route and to use another one instead, because the other route is worth more to me given my business needs.

onePK for Rapid Application development
1) Development Environment
– Language of choice
– Programmatic interface
– Rich data delivery via API
2) Comprehensive service sets
– Better Apps
– New Service
– Monetization Opportunity
3) Deploy (onePK)
– On a server blade
– On an external server
– Directly on the device
4) Consistent Platform Support
– IOS
– IOS-XE
– NX-OS
– IOS XR

What about NFV and overlay networks?
An overlay network can be created and torn down without changing the underlying physical network.
Network Functions Virtualization (NFV)
– NFV initiative announced at the SDN and OpenFlow World Congress.
– Leverages cloud technology to support virtualizing specific network functions.
The idea is that every component we have in the network can be virtual.

Overlay Networks
The idea here is to control the network through virtual switches. What does that really mean? You start with a physical switched network and add an overlay on top of it. The overlay provides the base of the logical network, and on it we can build logical switch devices that overlay the physical network. The underlying physical network carries the data traffic for the overlay network. Multiple overlay networks can exist at the same time. The overlay provides logical network constructs for different tenants.

Overlay Encapsulations and forwarding
1) Virtual overlays in the SDN context usually refer to host-based encapsulation and forwarding
– Extended L2 connectivity and scalability
– Secure segmentation (multi-tenant environments, etc.)
2) Stateless tunnelling mechanisms
– No static tunnel setup required
– Frame formats are recognized by hosts and treated as tunnelled frames
3) Three popular hypervisor-based overlay technologies:
– Virtual Extensible Local Area Network (VXLAN)
– Network Virtualization using Generic Routing Encapsulation (NVGRE)
– Stateless Transport Tunnelling (STT)
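To make "host-based encapsulation" a little more concrete, here is a small Python sketch that builds and parses the 8-byte VXLAN header as laid out in RFC 7348 (one flags byte with the I bit set, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names are made up for this example, and the outer Ethernet/IP/UDP headers that a real host would add are left out.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08   # the "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit segment ID (VNI)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vxlan_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header."""
    _, word2 = struct.unpack("!II", header[:8])
    return word2 >> 8


hdr = vxlan_header(5000)
tenant_frame = b"inner ethernet frame"   # placeholder payload
encapsulated = hdr + tenant_frame        # what the host would hand to UDP
```

The 24-bit VNI is what allows far more segments than the 12-bit VLAN ID, which is the scalability point made in the list above.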

…and how does OpenStack fit into SDN?
To understand OpenStack, let us first define cloud computing…
Cloud computing provides a set of resources and services through the internet.
What are these resources?
Applications / servers / networking / runtimes / virtualization / storage / databases / security
Which resources you manage inside the cloud defines the following…
private cloud
infrastructure as a service (IaaS)
platform as a service (PaaS)
software as a service (SaaS)
How do these differ from one another?
The main differentiation criterion is whether a resource is managed by you or managed by the vendor.
With IaaS, compute, storage, networking and virtualization resources are managed by the vendor (this defines them as an IaaS provider).
Where does OpenStack come into the picture? OpenStack lets the provider manage these resources.
Based on this, the four OpenStack components covered here are:
1) OpenStack Compute (Nova): Allows the administrator to create and manage virtual machines using various machine images.
2) OpenStack Object Store (Swift): Provides the ability to store objects. Basically it is the component responsible for managing storage and for reading/writing objects to that storage.
3) OpenStack Image Service (Glance): The component responsible for managing the different operating system images (Windows, Linux, etc.) that Nova uses to create virtual machines.
4) OpenStack Neutron service: Allows the administrator to create and manage virtual networks. This is the piece that has relevance to our SDN story.

Cisco ONE (Open Network Environment) framework:
SDN and OpenFlow talk about the control plane and the data plane, but Cisco considers five aspects: 1) management and orchestration 2) network services 3) control plane 4) data plane 5) transport
We saw two control plane models:
1) Fully distributed control plane
2) Hybrid control plane: distributed control combined with logically centralized control for optimized behaviour.

ONE
1) Platform APIs
– onePK
– OpenStack
– REST
2) Controllers and Agents
– Cisco XNC controller
– Open Daylight
– OpenFlow
– Chef/puppet
3) Virtual overlays
– N1KV
– VXLAN
– NVGRE
Platform APIs
Full-duplex, multi-layer/multi-plane APIs
Management: Workflow management, network configuration and device models (Network Models - Interface (OMI))
Orchestration: L2 segments, L3 segments, service chains, multi-domain (WAN, LAN, DC) (OpenStack Quantum API)
Network service: Topology, positioning, analytics, multi-layer path control, demand engineering (positioning (ALTO), path control (PCE))
Control: Routing policy, discovery, VPN, subscriber, AAA/logging, switching, addressing, … (Interface to the Routing System, I2RS)
Forwarding: L2/L3 forwarding control, interfaces, tunnels, enhanced QoS, … (OpenFlow protocol)
Device/Transport: Device configuration, life-cycle management, monitoring, HA, … (Network Functions Virtualization, NFV)
Not all networking APIs are created the same
Classes of networking APIs by scope
1) Classify networking APIs based on their scope
– API scopes: location independent; area; particular place; specific device
– Alternate approaches such as device/network/service APIs are difficult to associate with use cases
– The location where an API is hosted can differ from the scope of the API
2) Different network planes could implement different flavours of APIs, based on the associated abstractions.
Below are a few API classes:
Utility: covers authentication, location
Area/set: covers routing-related information
Element: e.g. get the interface statistics

onePK use case:
Custom routing: The customer asks to route based on their own metric, say cost in $. onePK first provides topology discovery and route information, and the application then defines its own path.

Challenges with the conventional Approach
1) High cost of conventional matrix switches makes scaling unaffordable
2) Filtering and forwarding are statically configured, not event driven
3) Tool compatibility is limited to off-the-shelf tools.

Hence, Cisco replaces the matrix network with Nexus 3000 switches, a controller, and the Monitor Manager controller application. Cisco XNC gathers information from the network.

Virtual Overlays
Nexus/Catalyst -> vSwitch (Nexus 1000v)
ASR/ISR/CRS -> vRouter (CSR 1000v)
Identity/policy, ISE -> vISE
Firewall, ASA -> vFW (ASA 1000v)
WAAS -> vWAAS
Email Security, ESA -> vESA
Wireless LAN Controller -> vWLC
Security Gateway -> VSG
Video Cache -> vVideoCache
Web Security, WSA -> vWSA
Network Analysis, NAM -> vNAM
IOS/XR RR -> vRouteReflector

Virtual overlay network
1) Example: Virtual overlay networks and services with Nexus 1000v
2) Large scale L2 domains: Tens of thousands of virtual ports
3) Common APIs: including OpenStack Quantum APIs for orchestration
4) Scalable DC segmentation and addressing: VXLAN
5) Virtual service appliances and service chaining/traffic steering
VSG, vWAAS, vPATH
6) Multi-hypervisor platform support: ESX, hyper-V, OpenSource Hypervisors
7) Physical and virtual: VXLAN to VLAN gateway

Current Industry approaches and challenges
Traditional network model: existing infrastructure model and existing application model.
Today's SDN model:
1) Lack of transparency and visibility
2) Virtual domain attempts to replicate physical network constructs, e.g. LAN emulation
3) Per-hypervisor integration overhead
4) Multiple management points
Application centric: centralized automation, security and application profiles
1) Simplification, complete network automation and programmability
2) Software flexibility with hardware-based performance and integrated visibility
3) Bypasses first-generation SDN limitations on the way to an application centric infrastructure
4) Extensible to storage and compute.

onePK
It is a software development kit that allows you to write code. You need APIs to communicate between the router and the application, so you can think of onePK as a collection of APIs.

Where do you run onePK?
You can run it on the router/switch itself using process hosting. If you have a platform such as an ISR that has a blade, you can host the onePK application on that blade. You can also run it on a physical or virtual server.
1) Process hosting: the advantage is low latency.
2) Blade hosting: low latency is an advantage here as well.
3) End-node hosting: supported by all platforms.

Configure IOS for onePK (unsecured mode)
#username user1 password pass1 // all applications must authenticate with a username and password
#onep // enter onePK config mode
#transport socket // socket = TCP 15001
#start // start onePK
#history on // activate API history trails

Configure IOS for onePK (encrypted)
#username user1 password pass1
#onep
#transport tls // use TLS = TCP 15002
#transport tls localcert <trustpoint name> // can use a local certificate or a certificate authority
#start
#history on

#show onep status // check whether onePK is enabled
#show onep statistics session all // shows how much traffic onePK is generating
#show onep history session all // shows what onePK has been doing

Additional onePK IOS commands
# session max 2 // specify the maximum number of sessions that can connect to the device; the range is 1 to 32.
# cpu threshold rising 60 falling 40 interval 10 // Use the cpu threshold command to limit the CPU used by onePK applications. When the rising threshold is exceeded, API requests are rejected with the error code ONEP_ERR_RESOURCE_BUSY.
When CPU utilization drops back to the falling threshold, API requests are served again.
# onep stop session all // disconnect all onePK applications on a network device.
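The rising/falling pair in the cpu threshold command is a hysteresis band. The Python below is a minimal sketch of that behaviour as these notes describe it. The CpuGate class is hypothetical, and the exact comparison (strictly above rising, at or below falling) is an assumption, not documented IOS behaviour.

```python
class CpuGate:
    """Reject API requests once CPU exceeds `rising`; resume at or below `falling`."""

    def __init__(self, rising=60, falling=40):
        self.rising = rising
        self.falling = falling
        self.rejecting = False

    def update(self, cpu_percent):
        """Feed one CPU sample (taken once per interval) and return the new state."""
        if cpu_percent > self.rising:
            self.rejecting = True
        elif cpu_percent <= self.falling:
            self.rejecting = False
        # Between the two thresholds the previous state is kept.
        return self.rejecting

    def handle_request(self):
        return "ONEP_ERR_RESOURCE_BUSY" if self.rejecting else "OK"


gate = CpuGate(rising=60, falling=40)
states = [gate.update(cpu) for cpu in (30, 65, 50, 40, 55)]
```

Note that the 50% sample is rejected while the 55% sample is served: what matters is which threshold was crossed last, which keeps the device from flapping between the two states.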

Using onePK APIs
onePK functions are grouped in service sets:
Data path: Provides packet delivery service to the application: copy, punt, inject
Policy: Provides filtering, classification, actions; applies policies to interfaces on network elements
Routing: Read RIB routes, add/remove routes, receive RIB notifications
Element: Get element properties, CPU/memory statistics, network interfaces, element and interface events
Discovery: L2 topology and local service discovery
Developer: Debug capability, CLI extension that allows an application to extend/integrate its CLIs with the network element.

Punting and injecting packets (C)
TRY(rc, onep_dpss_register_for_packets(
ne1,
dpss,
targ_left,
interesting_class, // defines traffic of interest
ONEP_DPSS_ACTION_PUNT, // action to take on interesting traffic
encrypt_callback, // where traffic goes next
(void *)intf_left,
&reg_handle), "register for packet");

Example: Custom encryption
Problem: Customers want custom encryption on specific traffic types. Value proposition: punt traffic of interest, encrypt, and re-inject.
1) Policy APIs on the ingress router are set to punt telnet and syslog to the app
2) App encrypts punted traffic and re-injects it into the data path
3) Policy APIs on the egress router punt telnet and syslog to the app
4) App decrypts punted traffic and re-injects it into the data path
5) Traffic that does not match the policy passes through unencrypted.

Revenue: Pay-as-you-go QoS
1) Customer buys pre-pay QoS package for cloud service
2) First packet for new session appears on ingress PE and is relayed to master server
3) Master server verifies pre-pay account and applies QoS
4) Ingress PE detects end of session and relays this to the server.
5) Server removes policy, bills customer for duration of session.

DMVPN and IPsec

DMVPN:
What is DMVPN?
1) Point-to-multipoint layer 3 overlay VPN
– Logical hub and spoke topology
– Direct spoke to spoke traffic is supported
2) DMVPN uses a combination of…
– Multipoint GRE tunnels(mGRE)
– Next hop resolution protocol(NHRP)
– IPsec Crypto profiles(do encryption)
– Routing

Why use DMVPN?
1) Independent of SP access method
– The only requirement is IP connectivity (if there is IP connectivity between the sites, we can form a DMVPN)
2) Routing policy is not dictated by the SP (service provider)
– e.g. MPLS L3VPN restrictions
DMVPN runs both an underlay and an overlay routing protocol. The overlay protocol is internal to our network; the underlay protocol gets us across the service provider.
3) Highly scalable
– If properly designed

How DMVPN works
1) DMVPN allows on-demand full mesh IPsec tunnels with minimal configuration through usage of…
– Multipoint GRE tunnels(mGRE)
– Next hop resolution protocol(NHRP)
– IPsec crypto profiles
– Routing
2) Reduces the need for n*(n-1)/2 static tunnel configurations
– Uses one mGRE interface for all connections
– Tunnels are created on demand between nodes
– Encryption is optional
3) Creates on-demand tunnels between nodes
– Initial tunnel mesh is hub-and-spoke (always on)
– Traffic patterns trigger spoke-to-spoke tunnels
– Solves the management scalability problem
This is similar to ARP on Ethernet: some device has to tell us, when we want to build a tunnel towards X, which address X resolves to. In DMVPN that device is the hub, also called the next-hop server.
For spoke-to-spoke traffic, the hub is always used for the control plane but not necessarily for the data plane.
4) Maintains tunnels based on traffic patterns
– Spoke-to-spoke tunnel is on-demand
– Spoke-to-spoke tunnel lifetime is based on traffic
5) Require two IGPs: Underlying and Overlay
– IPv4/IPv6 supported for both passenger and transport
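The n*(n-1)/2 figure mentioned above grows quickly, which is the scalability argument for mGRE. A short Python calculation shows the difference; the function names are made up for the example, and a single hub is assumed.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Static point-to-point tunnels needed for a full mesh of `sites` routers."""
    return sites * (sites - 1) // 2


def dmvpn_always_on_tunnels(sites: int) -> int:
    """Always-on tunnels with one hub: one spoke-to-hub tunnel per spoke."""
    return sites - 1


comparison = {n: (full_mesh_tunnels(n), dmvpn_always_on_tunnels(n))
              for n in (5, 50, 500)}
for n, (mesh, dmvpn) in comparison.items():
    print(f"{n} sites: {mesh} static tunnels vs {dmvpn} always-on DMVPN tunnels")
```

With DMVPN each router also carries a single tunnel interface regardless of the number of sites, and the spoke-to-spoke tunnels exist only while traffic needs them.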

How DMVPN works – Hub and Spokes
1) Two main components
– DMVPN hub/ NHRP server(NHS)
– DMVPN spokes /NHRP clients(NHC)
2) Spokes/clients register with hub/server
– Spokes manually specify the hub's address
– Sent via NHRP registration request
– Hub dynamically learns each spoke's VPN address and NBMA address
The VPN address is the address inside the tunnel (overlay address); the NBMA address is the address used to route the encapsulated packet (underlay address).
NHRP binds the overlay IP to the underlay IP.
3) Spokes establish tunnels to the hub
– Exchange IGP routing information over the tunnel
In general we can use any routing protocol, but most of the time people prefer not to use OSPF.

How DMVPN works – Spoke to Spoke
1) Spoke1 knows Spoke2's routes via IGP
– Learned via tunnel to hub
– Next-hop is Spoke2's VPN IP for DMVPN phase 2
– Next-hop is hub's VPN IP for DMVPN phase 3
(Destination A) Spoke1 ---- IPv4 ---- Hub (next-hop server) (Destination C) ---- IPv4 ---- Spoke2 (Destination B)
Once the underlay (static, BGP, EIGRP) is established, we can form the mGRE tunnel.
When the hub sends multicast traffic to the spokes, it replicates it as unicast. Now we run EIGRP inside the GRE tunnel.
Spoke1 forms an adjacency with the hub, and likewise the hub forms an adjacency with Spoke2. Now everyone knows about A, B and C.
When spokes want to route packets to each other, do they send them to the hub for forwarding, or do they send traffic directly to each other on demand?
It depends on the phase and the routing design.
Spokes do not form routing adjacencies with each other; they learn each other's routes via the hub. The next-hop value of the route then controls whether traffic is sent up to the hub and back down, or directly between the spokes.
Either way we need some resolution between the two types of address of a spoke: the underlay address (public) and the overlay address (private).
2) Spoke1 asks for Spoke2's real address
– Maps next-hop (VPN) IP to tunnel source (NBMA) IP
– Sent via NHRP resolution request
3) Spoke-to-spoke tunnel is formed
– Hub only used for control plane exchange
– Spoke-to-spoke data plane may flow through hub initially

NHRP important Messages
1) NHRP registration request
– Spokes register their NBMA and VPN IP to NHS
– Required to build the spoke-to-hub tunnels
2) NHRP resolution request
– Spoke queries for the NBMA-to-VPN mapping of other spokes
– Required to build spoke-to-spoke tunnels
3) NHRP redirect
– NHS response to a spoke-to-spoke data-plane packet that transits it
– Similar to IP redirects: sent when the packet's in/out interface is the same
– Used only in DMVPN phase 3 to build spoke-to-spoke tunnels
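The registration and resolution messages can be sketched as a toy model in Python. This is a simplification: the NextHopServer and Spoke classes are invented for the example, the hub answers resolution requests straight from its own cache, and the 169.254.100.x addresses for the spokes are assumed to follow the same pattern as the hub's address used in these notes.

```python
class NextHopServer:
    """Toy NHS (hub): learns VPN -> NBMA bindings from registration requests."""

    def __init__(self):
        self.cache = {}

    def register(self, vpn_ip, nbma_ip):
        self.cache[vpn_ip] = nbma_ip            # NHRP registration request

    def resolve(self, vpn_ip):
        return self.cache.get(vpn_ip)           # reply to an NHRP resolution request


class Spoke:
    def __init__(self, vpn_ip, nbma_ip, hub_vpn_ip, hub_nbma_ip):
        self.vpn_ip = vpn_ip
        self.nbma_ip = nbma_ip
        # The static "ip nhrp map" entry: the only binding a spoke starts with.
        self.cache = {hub_vpn_ip: hub_nbma_ip}

    def register_with(self, nhs):
        nhs.register(self.vpn_ip, self.nbma_ip)

    def tunnel_endpoint(self, next_hop_vpn_ip, nhs):
        """Return the NBMA address to encapsulate towards, resolving if needed."""
        if next_hop_vpn_ip not in self.cache:
            nbma = nhs.resolve(next_hop_vpn_ip)
            if nbma is None:
                return None
            self.cache[next_hop_vpn_ip] = nbma  # dynamic entry for the other spoke
        return self.cache[next_hop_vpn_ip]


hub = NextHopServer()
r1 = Spoke("10.1.0.1", "169.254.100.1", "10.1.0.5", "169.254.100.5")
r2 = Spoke("10.1.0.2", "169.254.100.2", "10.1.0.5", "169.254.100.5")
r1.register_with(hub)
r2.register_with(hub)

endpoint = r1.tunnel_endpoint("10.1.0.2", hub)
```

The spoke needs exactly one static mapping (towards the hub); every other binding is learned on demand, which is the ARP-like behaviour described earlier.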

Basic DMVPN configuration
The first requirement for DMVPN is basic underlay connectivity between the routers, so check that first. R5 is the hub in this case.
"ip nhrp map multicast dynamic": needed if you want to send EIGRP hellos from the hub to the spokes. The hub has to replicate the packet once per spoke; we call this pseudo-multicast. Multicast is not supported natively, since we encapsulate multicast inside unicast.
In phase 1 the spokes use point-to-point GRE tunnels rather than multipoint GRE tunnels.

Hub: R5
interface Tunnel0
ip address 10.1.0.5 255.255.255.0 //overlay address
ip nhrp authentication donttell
ip nhrp map multicast dynamic
ip nhrp network-id 99
tunnel source gig1.100
tunnel mode gre multipoint
tunnel key 100000

Spoke: R1
interface Tunnel0
ip address 10.1.0.1 255.255.255.0 //change only this IP when configuring the other spokes, if the source interface is the same
ip nhrp authentication donttell
ip nhrp network-id 99
//ip nhrp map <private> <public>
ip nhrp map 10.1.0.5 169.254.100.5 // maps the hub's private address to its public address. 169.254.100.5 is the public IP address configured on R5's interface.
ip nhrp map multicast 169.254.100.5 //the outer destination address used when we send multicast traffic
ip nhrp nhs 10.1.0.5
tunnel source gig1.100
tunnel destination 169.254.100.5 //since phase 1 is point-to-point on the spokes, we need to specify the destination address
tunnel key 100000

Test: On R5, Ping 10.1.0.5 success

At this point R5 does not know the destination for a given spoke. That is the point of the NHRP configuration: NHRP dynamically tells us which address to put in the outer header when we build the encapsulation.
On R5: show monitor capture 1 buffer
R1 sends an NHRP registration request to R5. The packet carries R1's private address (10.1.0.1) and the public address (169.254.100.1) to use as the encapsulation destination when reaching it.
"show ip nhrp" on R5 shows this mapping.
"show dmvpn" on R5 shows all the spokes that have registered.

We will configure rip
On r5, r1,r2,r3,r4:
router rip
version 2
network 10.1.0.0
network 150.1.0.0
no auto-summary

The next hop in the routing protocol determines how we encapsulate the packets.
For example, to reach the 150 network the next hop is 10.1.0.1; the router then queries NHRP, which says to use the 169.x address as the tunnel destination.
Disable split horizon on the hub so that the spokes can see the prefixes advertised by other spokes.
The tunnel key is used when multiple tunnels are present; it acts as a tag or identifier.
If we do not configure a tunnel destination on the spokes (and use mGRE instead), we are running phase 2 or 3.
If we have multiple clouds on a hub, we need the NHRP network ID to tell them apart.
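The two-step lookup described above (route lookup gives the overlay next hop, NHRP gives the underlay address) can be sketched in Python as follows. The tables are hand-written stand-ins for the RIB and the NHRP cache on the hub, and the /24 prefix length for the 150.1.1.x network is an assumption, since the notes do not give the mask.

```python
import ipaddress

# Stand-in for the RIB on R5: prefix -> overlay (tunnel) next hop.
rib = {
    ipaddress.ip_network("150.1.1.0/24"): "10.1.0.1",   # learned from R1 (mask assumed)
}

# Stand-in for the NHRP cache on R5: overlay address -> underlay (NBMA) address.
nhrp_cache = {
    "10.1.0.1": "169.254.100.1",
}


def encapsulation_target(destination):
    """Return (overlay next hop, underlay tunnel destination) for a packet."""
    address = ipaddress.ip_address(destination)
    candidates = [prefix for prefix in rib if address in prefix]
    if not candidates:
        return None                                      # no route
    best = max(candidates, key=lambda prefix: prefix.prefixlen)   # longest match
    next_hop = rib[best]
    nbma = nhrp_cache.get(next_hop)
    if nbma is None:
        return None                                      # next hop not resolved yet
    return next_hop, nbma


target = encapsulation_target("150.1.1.1")
```

This is why the next-hop value matters so much in the phase discussion that follows: whichever overlay address the routing protocol installs as next hop decides which NBMA address ends up in the outer header.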

DMVPN phase 1,2,3
1) DMVPN can be deployed in three “phases”
– DMVPN phase 1
– DMVPN phase 2
– DMVPN phase 3
2) DMVPN phase affects
– Spoke to spoke traffic patterns
– Supported routing designs
– Scalability

DMVPN Phase 1
1) mGRE on hub and point-to-point GRE on spokes
– NHRP still required for spoke registration to hub
– No spoke-to-spoke tunnels
2) Routing
– Summarization/default routing at hub is allowed
– Next-hop on spokes is always changed by the hub

DMVPN phase 2
1) mGRE on hub and spokes
– NHRP required for spoke registration to hub
– NHRP required for spoke-to-spoke resolution
– Spoke-to-spoke tunnel triggered by spoke
2) Routing
– Summarization/default routing at hub is NOT allowed
– Next-hop on spokes is always preserved by the hub
– Multi-level hierarchy requires hub daisy-chaining

DMVPN phase 3
1) mGRE on hub and spokes
– NHRP required for spoke registration to hub
– NHRP required for spoke-to-spoke resolution
2) When a hub receives and forwards a packet out of the same interface
– Sends an NHRP redirect message back to the packet source to update its routing table
– Forwards the original packet down to the spoke via the RIB
The spokes then show a new route type, NHRP, in their routing table.
3) Routing
– Summarization/default routing at hub is allowed
  – Results in NHRP routes for spoke-to-spoke tunnels
  – With no summary, NHO (next-hop override) is performed for the spoke-to-spoke tunnel
  – Next-hop is changed from hub IP to spoke IP
– Next-hop on spokes is always changed by the hub
– Because of this, NHRP resolution is triggered by the hub
– Multi-level hierarchy works without daisy-chaining
Phase 2 with OSPF means running network type broadcast.
Phase 3 with OSPF means running network type point-to-multipoint.

DMVPN Phase 1 with IGP protocols
R5 would be hub.

Hub : R5
interface tunnel0
ip address 155.1.0.5 255.255.255.0
ip nhrp authentication donttell
ip nhrp map multicast dynamic
ip nhrp network-id 99
tunnel source gig1.100
tunnel mode gre multipoint
tunnel key 100000
Spokes R1 to R4 (the IP address needs to change for R2 to R4)
interface tunnel0
ip address 155.1.0.1 255.255.255.0
ip nhrp authentication donttell
ip nhrp map 155.1.0.5 169.254.100.5
ip nhrp map multicast 169.254.100.5
ip nhrp network-id 99
ip nhrp nhs 155.1.0.5
tunnel source gig1.100
tunnel destination 169.254.100.5
tunnel key 100000

On R5: show dmvpn
We need to make sure that the spokes are registered and the mappings are correct.
We are going to enable RIP on R5, R8 and R10 and send a default route out of the tunnel interface.
The direct links interconnecting the routers have been shut down.
Execute the commands below on all the routers.

router rip
version 2
network 155.1.0.0
network 150.1.0.0
no auto-summary

We started a ping to 10.0.0.100 on R5 and captured the traffic.
In the pcap we can see that the RIP multicast update has source IP 169.254.100.5 and destination 169.254.100.3; inside is the RIP packet with GRE encapsulation.
The "ip nhrp map multicast dynamic" command tells the hub to encapsulate multicast packets in unicast and send a copy to every host that has dynamically registered. Multicast is not replicated from spoke to spoke in phases 1 to 3. There are workarounds, such as static tunnels, but by design multicast is not supposed to be replicated between the spokes, because we do not want them to form control plane adjacencies with each other. The main point of DMVPN is that the hub hosts the control plane for the routing protocol and for IPsec; when we actually forward data, it can go directly between the spokes, or through the hub if the destination is behind the hub.
Apply the following Tcl script on all routers to test reachability.
tclsh
foreach VAR {
150.1.1.1
150.1.2.2
150.1.3.3
150.1.4.4
150.1.5.5
150.1.6.6
150.1.7.7
150.1.8.8
150.1.9.9
150.1.10.10
} { ping $VAR source loopback0 }
We can see some reachability problems. Looking at the routing table from R2's perspective, it has reachability to R8 and R10 but not to R3. RIP is a distance vector protocol: R3 sends its update to R5, and R5 does not reflect it back to R2 because of split horizon. In a real design we want to minimize the control plane, so we can configure a default route on R5:
"default-information originate route-map TUNNEL0"
"route-map TUNNEL0"
"set interface Tunnel0"
This is a conditional default route: originate a default route, but only out of the tunnel. The devices behind me, R8 and R10, have specific reachability information, and I do not want them to default to me.
Now all spokes can reach everywhere. If we want to give the spokes specific reachability instead, we have to disable split horizon: configure "no ip split-horizon" on R5's tunnel interface.
In a real design we can send only the default:
on R5:
router rip
distribute-list prefix DEFAULT-ONLY out tunnel0
On the spokes, we then have only one route in the routing table, i.e. the default route towards R5.
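The split horizon behaviour that causes the reachability problem can be shown with a few lines of Python. This is a toy model of the rule only (do not advertise a route out of the interface it was learned on); the advertise function and the /32 prefix lengths for the loopbacks are assumptions for the example.

```python
def advertise(routes, out_interface, split_horizon=True):
    """Return the prefixes a router advertises out of `out_interface`.

    `routes` maps prefix -> interface the route was learned on
    (None for locally connected prefixes).
    """
    return sorted(
        prefix
        for prefix, learned_on in routes.items()
        if not (split_horizon and learned_on == out_interface)
    )


# Routes on the hub (R5): both spoke loopbacks were learned over Tunnel0.
hub_routes = {
    "150.1.5.5/32": None,        # R5's own loopback
    "150.1.2.2/32": "Tunnel0",   # learned from R2
    "150.1.3.3/32": "Tunnel0",   # learned from R3
}

with_split_horizon = advertise(hub_routes, "Tunnel0")
without_split_horizon = advertise(hub_routes, "Tunnel0", split_horizon=False)
```

Because every spoke sits behind the same tunnel interface, split horizon on the hub hides all spoke prefixes from all other spokes, so the fix is either a default route or disabling split horizon on that interface.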

Remove all routing configuration on all routers.
ODR:
Enable CDP on all routers and tunnel interfaces. On R5: "router odr"
The spokes advertise their connected routes in CDP packets. The spokes learn a default route from the hub, and there is reachability between them.

With OSPF, all the routers should be in the same area. Because of this, if there is an issue on one router, all the other routers need to rerun the SPF algorithm.
EIGRP:
On all router
router eigrp 1
network 150.1.0.0
network 155.1.0.0
All routers communicate only through the tunnel interface.
We have the same problem with EIGRP that we had with RIP. Do the spokes need to know all the prefixes? In phase 1 they only need to reach the hub; as long as the hub has all the specific routes, they can reach anywhere.
On r5:interface tunnel0
ip summary-address eigrp 1 0.0.0.0 0.0.0.0
Now all the spokes have only a default route.
When traffic is sent from spoke to spoke, the hub first has to de-encapsulate the traffic, encapsulate it again, and send it out to the other spoke; in a real design this is processing-intensive. Hence, phase 1 is not used when we need to send a lot of traffic between spokes.
In phase 1, we can either send a default route to the spokes or disable split horizon to have specific routes on all spokes.

on R5: int tunnel0
no ip summary-address eigrp 1 0.0.0.0 0.0.0.0
no ip split-horizon eigrp 1

BGP:
We have one of two cases. We can run eBGP between the spokes and the hub; if the network is internal and not routing to the internet, use private AS numbers on the overlay tunnel.
The other option is to use one AS everywhere, in which case the hub is typically a route reflector (RR). In DMVPN, the hub does not need to manually track the spokes:
each spoke simply says "this is the address of the hub, it is my next-hop server, and I am going to register with it". So a potential disadvantage of running BGP is that we need to manually specify the neighbours: when we add a spoke, we have to update the neighbor statements on the hub. BGP does have a command for dynamic neighbours (see the configuration guide).

Remove previous eigrp configuration.
On R5:
router bgp 1
neighbor DMVPN_SPOKES peer-group
neighbor DMVPN_SPOKES remote-as 1
neighbor DMVPN_SPOKES route-reflector-client
bgp listen range 155.1.0.0/24 peer-group DMVPN_SPOKES

On all routers:
route-map loopback0
match interface loopback0
router bgp 1
redistribute connected route-map loopback0

On R1, R4 :
router bgp 1
neighbor 155.1.0.5 remote-as 1

On the spokes:
show ip route: We can see the other spokes’ prefixes, but the next hop is not the RR (R5), since a route reflector does not change the next-hop value for data plane traffic.
So here the next hop is the originating spoke’s IP and not the hub.

DMVPN phase 2
Each spoke registers with the hub and informs it of its mapping between the public (NBMA or underlay) address and the private (tunnel) address.
The spokes run mGRE, so they can trigger spoke-to-spoke communication. The big issue in phase 2 is routing design, because summarization and default routing are not allowed.

On R5:
It has tunnel mode mGRE.
On R1:
The “tunnel destination 169.254.100.5” command would make this a phase 1 setup.
With OSPF, we have only two practical options: run the network type as broadcast or as point-to-multipoint.
We could use non-broadcast or point-to-multipoint non-broadcast, but then we would defeat the purpose of DMVPN, since we would need to track the spokes manually.

On all routers:
no router bgp 1
router ospf 1
network 150.1.0.0 0.0.255.255 area 0
network 155.1.0.0 0.0.255.255 area 0
Once we configure this, the hub starts to complain that it has difficulty forming adjacencies, and this is related to the network type.
On R1:
show ip ospf interface tunnel0
OSPF treats the GRE tunnel as point-to-point, and the limitation of this network type is that we can have only one neighbor. When the hub receives hellos from multiple spokes, each new hello preempts the previous one. This repeats over and over, and the adjacencies never form.

On R5:
conf t
interface tunnel0
ip ospf network point-to-multipoint

Now we can see that the adjacencies have formed.
All routers are in area 0, so we cannot summarize on the hub. One workaround is to filter the specific routes out of the RIB and forward on a default route instead. We can do this as follows:
On R5:
router ospf 1
default-information originate always

On R2:
show ip route ospf
We can see that we have a default route along with all the specific prefixes.
There is a hack we can use:
On R2:
access-list 1 permit 0.0.0.0
router ospf 1
distribute-list 1 in
The distribute list controls which routes move from the OSPF database into the RIB, and here the only route allowed through is the default. If we do show ip route, we see only the default route.
This is very dangerous to do in an OSPF design, because R2 will continue to flood the type 1 LSA information that it knows to the other routers. We could end up in a situation where I advertise a route to you that I am not actually using, you transit through me to get there, and a loop forms in the data plane. We can do this when we have only a single path to the hub and need to use OSPF: we filter the routes out of the RIB and use only the default for forwarding. The end result would be the same.
Let’s say we don’t want to send all traffic through the hub. Assume the underlay network is the internet and that physically R2 is in New York, R3 is in Chicago, and R5 is in LA. If we want to go from New York to Chicago, it would not make sense to go to LA first and then to Chicago. This is the advantage of running phase 2 or phase 3: the overlay does not know what the topology looks like, but the underlay does, because when we send a packet to the internet, BGP delivers it to whatever the destination is. Here we ideally want traffic to go directly from R2 to R3, and we can do this with phase 2 and phase 3.

on all spokes:
interface tunnel0
no tunnel destination
tunnel mode gre multipoint
end
Now the spokes will dynamically discover the destination. This is the default command shown in the configuration guide.
Order of operations:
NHRP happens before routing.
With IPsec, the crypto session is created first, then NHRP happens, and then routing.

On R2:
router ospf 1
no distribute-list 1 in (it was saving space in the forwarding table)
Now we do need the specific routes, and the big implication is that the next-hop value must be preserved by the hub, which means summarization is not supported.
show ip route
All of the spokes are reachable via the hub. When we send traffic to these destinations, the end result is the same as in phase 1. Even though it is an mGRE tunnel, the next hop is wrong: when the next hop points to the hub, the spoke won’t do resolution for the other address. This is the difference between phase 2 and phase 3. In phase 3 this is fixed in the back end by triggering the NHRP resolution automatically. Then we don’t need the specific routes, and the hub can send an NHRP redirect that says: this is the spoke you are actually trying to reach.
In this case, OSPF point-to-multipoint is not supported in phase 2 because it changes the next-hop value. How can we run phase 2 with OSPF? Think about which network type doesn’t modify the next hop. Answer: one that has a DR. Books that say there is no DR preemption are wrong: the DR can be preempted if the wait timer expires before the adjacencies form. This means that if we configure network type broadcast with the hub intended as DR, and one of the spokes comes up and elects itself DR, we bring down the entire cloud. This is the reason you would not want to run OSPF over DMVPN.

Change the network type on all the routers:
interface tunnel0
ip ospf network broadcast
end

On R5:
show ip ospf interface tunnel0
We have to make sure that none of the spokes is elected DR, which in this case they won’t be, but only because R5 happens to have the highest router ID.

On R2:
router ospf 1
router-id 255.2.2.2
clear ip ospf process
Now R4 became DR. What happened is that R4 was BDR before, so when the DR went away the BDR elected itself DR.
If we look at the end result with show ip route ospf,
we haven’t solved the next-hop issue; it still points toward the hub.
If we look at the routing table on R3, it doesn’t know any prefix on the tunnel, since only the DR advertises the prefixes. Since R2 is the current DR, it cannot hear any multicast from R1, R3, or R4.
This is a supported design, but we must set priority 0 on the spokes in order to prevent this.

On R5:
interface tunnel 0
ip ospf priority 255
This doesn’t fix the problem, though. According to the DR election process in the RFC, priority only comes into play if no one has already been elected. If one of the spokes elects itself first, then it doesn’t matter what the priority of the hub is. So we need to set the spokes to priority 0.
In point-to-multipoint the next hop is not preserved. However, in phase 2 the next hop must be preserved by the hub, so the hub has to be the DR with network type broadcast or non-broadcast.
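The DR reasoning above can be sketched in Python. This is a deliberately simplified model and not the full RFC election: it ignores the BDR and the wait timer. The function name elect_dr, the router names, and the hub router ID 150.1.5.5 are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Each router is (ospf_priority, router_id). All values are illustrative assumptions.
Router = Tuple[int, str]


def rid_key(router_id: str) -> Tuple[int, ...]:
    """Turn a dotted router ID into a tuple so it compares numerically."""
    return tuple(int(octet) for octet in router_id.split("."))


def elect_dr(routers: Dict[str, Router], current_dr: Optional[str] = None) -> Optional[str]:
    """Simplified DR choice: no preemption, and priority 0 is never eligible."""
    eligible = {name: r for name, r in routers.items() if r[0] > 0}
    if current_dr in eligible:
        return current_dr  # an existing, eligible DR keeps the role
    if not eligible:
        return None
    # Otherwise the highest priority wins, with the router ID as tie-breaker.
    return max(eligible, key=lambda name: (eligible[name][0], rid_key(eligible[name][1])))


# Hub has priority 255, but the spoke already elected itself: the spoke stays DR.
hub_only_priority = {"R5": (255, "150.1.5.5"), "R2": (1, "255.2.2.2")}
print(elect_dr(hub_only_priority, current_dr="R2"))

# Spoke set to priority 0: it is not eligible, so the hub becomes DR.
spokes_zero = {"R5": (255, "150.1.5.5"), "R2": (0, "255.2.2.2")}
print(elect_dr(spokes_zero, current_dr="R2"))
```

The second case is why the notes insist on priority 0 on the spokes rather than only raising the hub priority.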

On the spokes:

interface tunnel0
ip ospf network broadcast
ip ospf priority 0
end
On R5: show ip ospf interface tunnel0: It says I am the DR.
On R5 we capture the traffic, and on R2 we generate a traceroute toward R3. We can see that the traffic goes directly to R3. We also see the first packet lost, because resolution was in progress. R2 asks the next-hop server for the actual resolution of 150.1.3.3, and R5 replies that it is 169.254.100.3, or whatever the NBMA address is.
We see the same thing in the capture.

In phase 1, it doesn’t matter whether the network type is point-to-multipoint, broadcast, or non-broadcast. In phase 2, it must be either broadcast or non-broadcast in order to preserve the next-hop value, which means the hub needs to be the DR and the spokes should be set to priority 0.
If we have multiple physical links to the hub, we should source the tunnel on the hub from the loopback, because there are multiple eBGP routes to the internet. If we are doing redundancy based on multiple hubs, we can probably just point to the physical link. If there are multiple routes to the same hub, source the tunnel from the loopback as opposed to the physical link.

“no ip next-hop-self eigrp” is needed on the hub when we configure EIGRP.
It says that when an EIGRP update goes from a spoke to the hub and the hub reflects it back down, the hub should not change the next-hop value to itself. Again, the issue is that if a spoke sees the hub as the next hop, it doesn’t perform NHRP resolution for the remote spoke, and the data plane has to go up through the hub as opposed to spoke to spoke. In phase 3, NHRP solves this problem natively.

On all routers:
router eigrp 1
network 155.1.0.0
network 150.1.0.0

On R5:
interface tunnel 0
no ip split-horizon eigrp 1
no ip next-hop-self eigrp 1
This means that not only do we send the spokes’ routes back down to the other spokes, but when we do, we do not change the next-hop value to ourselves.
On the spokes: show ip route eigrp: the next-hop value is preserved.

Phase 3 Configuration
In phase 2, the hub sends the resolution request back down to the remote spoke, so that R1 and R2 resolve each other and exchange their mappings.
In phase 3, a new route is installed based on which spokes are trying to reach each other. When we execute show ip route, we can see an NHRP route for the spoke-to-spoke path.

On R5:
interface tunnel0
ip summary-address eigrp 1 0.0.0.0 0.0.0.0
This means EIGRP is going to suppress all of the specific routes.

On R2: show ip route: it has only a single default route in its routing table.
ping 150.1.1.1 succeeds, but now the hub is in the data plane, since the next hop points toward the hub.
In phase 3, we need two simple commands: one on the spokes and one on the hub.
The hub says: if someone sends me an NHRP resolution request, I will redirect it to the destination spoke and tell the spokes to resolve each other directly. For this we use the “ip nhrp redirect” command.
On R5:
interface tunnel 0
ip nhrp redirect

On the spokes we need to install the updates that are pushed from the NHRP process. The command is “ip nhrp shortcut”.

On R1, R2:
interface tunnel0
ip nhrp shortcut (we could configure this command on the hub too, but that is not needed)

On R2: show ip route 150.1.1.1: we don’t have a specific route. If we look at the forwarding table:
show ip cef 150.1.1.1: it says we are going to use the default route, with 155.1.0.5 as the next hop.
Since we are doing NHRP shortcut, NHRP generates a request for the mapping between the destination and its NBMA address.
In phase 3 the response from the hub is for the actual (i.e. final) destination, not for the next hop. In the previous cases the response was for the tunnel interface address.
Now on R2, show ip route 150.1.1.1
shows that the route is being learned from R1, whereas previously it showed no route. R5 has sent a redirect, since the destination is not owned by R5.
In phase 2 we were trying to resolve the next-hop value, which was an address on the tunnel interface. In phase 3, we are trying to resolve the final destination. The end result is that a spoke has just a default route to the hub, but any spoke-to-spoke traffic triggers NHRP resolution, which installs a more specific route directly between the spokes.
In phase 2 you need the exact routes everywhere, even though a spoke might never use some of them. Let’s say that R2 wants to send traffic to R3 but not to R1. In phase 3, R2 doesn’t need to know the route to R1, because it is not sending traffic there. In phase 2 there is still a routing entry on R2, since NHRP doesn’t know which destinations we are trying to reach.
Now we can run point-to-multipoint, as the next-hop value is no longer tied directly to the route.
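A small Python sketch of why the shortcut works: forwarding uses the longest matching prefix, so once a more specific entry is installed next to the default route, spoke-to-spoke traffic stops going through the hub. The function name next_hop and the R1 tunnel address 155.1.0.1 are assumptions for illustration; 155.1.0.5 is the hub tunnel address used in these notes.

```python
from ipaddress import ip_address, ip_network
from typing import Dict


def next_hop(table: Dict[str, str], destination: str) -> str:
    """Return the next hop of the longest prefix that matches the destination."""
    addr = ip_address(destination)
    matches = [(ip_network(prefix), hop) for prefix, hop in table.items()
               if addr in ip_network(prefix)]
    if not matches:
        raise LookupError(f"no route to {destination}")
    best = max(matches, key=lambda match: match[0].prefixlen)
    return best[1]


# R2 before any spoke-to-spoke traffic: only the summarized default via the hub.
r2_routes = {"0.0.0.0/0": "155.1.0.5"}
before = next_hop(r2_routes, "150.1.1.1")

# After the redirect and resolution, a more specific entry is installed.
# 155.1.0.1 is an assumed tunnel address for R1.
r2_routes["150.1.1.1/32"] = "155.1.0.1"
after = next_hop(r2_routes, "150.1.1.1")

print(before, after)
```

Destinations that have not triggered a resolution, such as 150.1.3.3 here, still follow the default route to the hub.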

On all routers:
no router eigrp 1
router ospf 1
network 155.1.0.0 0.0.255.255 area 0
network 150.1.0.0 0.0.255.255 area 0
interface tunnel 0
ip ospf network point-to-multipoint (it works in phase 1 and phase 3)
In this case, the routing table says we are supposed to go via R5, but the CEF table shows that a direct entry for the remote spoke is present.
If we need to send traffic from all spokes to all spokes all the time, phase 2 is the better option; otherwise we can go with phase 3.

In DMVPN, the hub is in the path only for the control plane.

GetVPN:
Instead of creating an individual tunnel for each of the spokes, we create a group tunnel and share a group encryption and decryption key.

————————————————————————————————————————————————————-

IPsec Overview
1)Virtual Private Network(VPN)
– Extension of private network over a public network
– VPN doesn’t necessarily imply encryption
2) Examples:
– Layer 2 VPNs
— Ethernet VLANs, QinQ, frame relay PVCs, ATM PVCs, VPLS,
– Layer 3 VPNs
— GRE, MPLS layer 3 VPN, IPsec
Except for IPsec, none of the protocols above provides encryption.

What is IPsec?
1) IPsec is a standards based security framework
2) Lots of RFCs
– 2408, 2409, 4302, 4303, 5996

IPsec Features
1) IPsec offers four main features
a) Data origin authentication
– Who did the packet come from?
b)Data Integrity
– Was the packet changed in the transit path?
c)Data Confidentiality
– Can anyone read the packet in the transit path?
We can encrypt the payload.
d) Anti-replay
– Did I already receive this packet?

Why Use IPsec VPNs?
1) Does not need static SP provisioning like MPLS
2)Independent of SP access method
– IPv4/IPv6 transport is the only requirement
3) Allows both site-to-site and remote access
– Always on VS dial-on-demand
4) Offers data protection
– Main motivation for IPsec

How IPsec Works
1) IPsec is a network Layer protocol(Layer 3)
– Different from SSL(7) or 802.1AE MACSEC(2)
The name IPsec indicates that it is an IP security protocol. This means we cannot tunnel Ethernet natively inside IPsec, and IPsec doesn’t run directly over ATM or Ethernet; it needs IPv4 or IPv6 for the actual data plane transport.
2) Main goal is to encrypt and authenticate IPv4 or IPv6 packets
– Uses symmetric ciphers for encryption(ex 3DES)
– Keyed hashing for authentication(ex MD5)
3) Used to create P2P tunnels between endpoints
– GETVPN is an exception, which can use P2MP
A limitation of IPsec is that a full mesh needs n(n-1)/2 tunnels, which scales poorly.
IPsec does not carry multicast; it is a point-to-point unicast IP tunnel.
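To see how quickly n(n-1)/2 grows, here is a short Python calculation. The function names are made up for this illustration, and the hub-and-spoke count (n-1) is included only for comparison.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Point-to-point tunnels needed so that every site peers with every other site."""
    if sites < 0:
        raise ValueError("sites must be non-negative")
    return sites * (sites - 1) // 2


def hub_and_spoke_tunnels(sites: int) -> int:
    """Tunnels needed when every site only peers with one hub (one of the sites)."""
    return max(sites - 1, 0)


for sites in (4, 10, 50, 100):
    print(sites, full_mesh_tunnels(sites), hub_and_spoke_tunnels(sites))
```

With 100 sites a full mesh already needs 4950 tunnels, against 99 for hub and spoke.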

How IPsec Tunnels Work
1) Tunnels are dynamically negotiated through IKE
– Main goal is to NOT define crypto keys manually (IPsec peers will figure them out automatically in the back end)
2) IPsec uses two data structures to build a tunnel
– Security Association(SA)
— An agreement of IPsec parameters
— Maintains encryption and authentication keys on peers
– Security Parameter Index(SPI)
— Field in packet header to select SA on receiver
— Analogous to VLAN header or MPLS label
The SA is built in the control plane between the peers, and the SPI is carried in the actual packet header so that when a packet comes in, the receiver can figure out which tunnel it belongs to.
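The SA/SPI relationship can be pictured as a lookup table on the receiver. The Python sketch below is only an illustration: the class names, the SPI values, and the idea of keying purely on the SPI are assumptions, and a real implementation stores much more state per SA.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SecurityAssociation:
    """The negotiated parameters for one direction of a tunnel (illustrative fields)."""
    peer: str
    encryption: str
    integrity: str


class InboundSaTable:
    """Receiver-side table: the SPI from the packet header selects the SA."""

    def __init__(self) -> None:
        self._by_spi: Dict[int, SecurityAssociation] = {}

    def install(self, spi: int, sa: SecurityAssociation) -> None:
        self._by_spi[spi] = sa

    def lookup(self, spi: int) -> Optional[SecurityAssociation]:
        return self._by_spi.get(spi)


table = InboundSaTable()
table.install(0x1A2B3C4D, SecurityAssociation("155.1.58.8", "aes-192", "sha1-hmac"))

matched = table.lookup(0x1A2B3C4D)   # packet carrying a known SPI
unknown = table.lookup(0xDEADBEEF)   # packet carrying an SPI we never negotiated
print(matched, unknown)
```

A packet whose SPI is not in the table cannot be matched to any tunnel, which is why a stale SPI on one side breaks the data plane.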

ISAKMP vs IKE
1) ISAKMP/IKE are the negotiation protocols used to form SAs
2) Internet security association and key management protocol(ISAKMP)
– ISAKMP is the framework
– Says that authentication and keying should occur
3) Internet key exchange(IKE)
– IKE is the actual implementation
– Defines how authentication and keying occurs
4) In general, ISAKMP/IKE terms are interchangeable

IKEv1 vs IKEv2 negotiation
1) IKEv1 was the original implementation
– Spread across lots of RFCs
– Problems with interoperability with some features
2) IKEv2 adds new improvements
– Simplified RFC and packet formats, better interoperability
– More flexible designs
— e.g. use pre-shared keys on one side but certificates on the other
– Supports more secure “Suite B” algorithms
Not all platforms support IKEv2.
IKE only refers to the control plane negotiation. Once the tunnel is formed there is no difference in the data plane, which is ESP or AH. In the control plane, it is IKE and ISAKMP.

IPsec Tunnel Negotiation with IKE
1) Goal of IPsec exchange is to establish SAs
– Occurs through two main negotiation phases using IKE
2) IPsec Phase 1
– Authenticate endpoint and build a secure tunnel for further negotiation
– Result is called the ISAKMP SA
3) IPsec Phase 2
– Establish the tunnel used to protect the actual data traffic
– Result is called the IPsec SA

(Phase 1) Negotiating the ISAKMP SA
1) During IKE phase 1, peers negotiate four main parameters
– Authentication method
– Diffie-Hellman group
— 1/2/5/…
– Encryption type
— DES/3DES/AES
– Hash algorithm
— MD5/SHA1

IKE authentication
1) Authentication is mainly done in two ways
– Pre-shared key(PSK) (the issues are scaling and the effort needed to change the password)
– X.509 certificates(PKI)
2) PSK is easy, but PKI is scalable
– PSKs are difficult to maintain and change
– PKI allows easy revocation and also hierarchy
3) IKEv2 adds improved authentication
– e.g. EAP methods
– It supports arbitrary authentication
FlexVPN means IKEv2.

IKE Diffie-Hellman Group
1) DH is the method to exchange crypto keys
– i.e. Alice and Bob agree on a prime number…
2) DH group number determines strength of keys
– Higher group is better, but at expense of CPU
3) Result of DH is what 3DES, AES, etc. use as their symmetric keys.
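A toy Diffie-Hellman exchange in Python shows the idea. The prime, generator, and secrets are tiny textbook values chosen for readability and are not a real DH group; real groups use very large primes, and the shared value is fed into key derivation rather than used directly as the cipher key.

```python
# Public values that Alice and Bob agree on (toy sizes, for illustration only).
p = 23  # prime modulus
g = 5   # generator

# Private values that never leave each peer.
alice_secret = 6
bob_secret = 15

# Each side sends only its public value across the network.
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# Each side combines its own secret with the other side's public value.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

print(alice_public, bob_public, alice_shared, bob_shared)
```

Both peers end up with the same shared value even though it was never sent over the wire.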

IKE encryption
1) Encryption algorithm used to protect the traffic
– DES, 3DES, AES-128,AES-256 etc.
– Higher is better but at the expense of CPU
– Some can be hardware accelerated
— e.g. AES crypto offload card
2) IKEv2 introduces new enhancements
– RFC 4307

IKE Hashing
1) One way hash used to authenticate the packet
– If hashes match, the packet was not modified in transit.
– Higher is better, but again at the expense of CPU.
2) Hashes supported depends on IKE version
– IKEv1 MD5 & SHA-1
– IKEv2 SHA-256, SHA-384, etc.
If the hashes are equal, we say the original packet has not been changed.
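The hash comparison can be sketched with Python’s standard hmac module. This shows keyed hashing in general, not the exact ESP/AH packet processing; the key, the packet bytes, and the function names are made up for the example.

```python
import hashlib
import hmac


def make_tag(key: bytes, packet: bytes) -> bytes:
    """Sender side: compute a keyed hash (HMAC-SHA256) over the packet."""
    return hmac.new(key, packet, hashlib.sha256).digest()


def verify(key: bytes, packet: bytes, received_tag: bytes) -> bool:
    """Receiver side: recompute the hash and compare in constant time."""
    return hmac.compare_digest(make_tag(key, packet), received_tag)


shared_key = b"example-shared-key"        # assumed key, normally derived via IKE
packet = b"payload from 155.1.79.0/24"    # made-up packet contents
tag = make_tag(shared_key, packet)

print(verify(shared_key, packet, tag))                # unmodified packet
print(verify(shared_key, packet + b"tampered", tag))  # modified in transit
print(verify(b"wrong-key", packet, tag))              # sender used a different key
```

Only the first check passes: a modified packet or a different key produces a different hash.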

ISAKMP Policies
1) Combinations of IKE params are ISAKMP policy
– IKE initiator sends all its policies through a proposal
– IKE responder checks received policies against its own
– First match is used, based on lowest local priority value
– Else, Connection is rejected.
2) After phase 1 completes, an encrypted tunnel exists between the peers
– Phase 2 negotiation can now be hidden from devices in transit.
We encrypt twice: once to protect the negotiation, and a second time for the actual data plane.
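The matching rule can be modeled in Python as follows. This is a simplified sketch of the “first match by lowest local priority” rule as stated in these notes, not a description of any vendor’s exact implementation; the type and function names are assumptions.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class IsakmpPolicy(NamedTuple):
    authentication: str
    encryption: str
    hash_alg: str
    dh_group: int


def select_policy(
    proposals: List[IsakmpPolicy], local: Dict[int, IsakmpPolicy]
) -> Optional[Tuple[int, IsakmpPolicy]]:
    """Walk the local policies from lowest priority value up; return the first one offered."""
    offered = set(proposals)
    for priority in sorted(local):
        if local[priority] in offered:
            return priority, local[priority]
    return None  # no offer accepted, so the connection is rejected


# Initiator proposal and responder policies (values are illustrative).
initiator_proposals = [IsakmpPolicy("pre-share", "aes-192", "sha384", 5)]
responder_policies = {
    20: IsakmpPolicy("pre-share", "3des", "md5", 2),
    10: IsakmpPolicy("pre-share", "aes-192", "sha384", 5),
}

print(select_policy(initiator_proposals, responder_policies))

mismatched = [IsakmpPolicy("pre-share", "aes-256", "sha384", 5)]
print(select_policy(mismatched, responder_policies))
```

A single differing attribute, such as the key length in the second proposal, is enough for no policy to match.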

Negotiating the IPsec SA
1) In phase 2, peers agree on more parameters…
2) Security protocol
– Encapsulating security payload(ESP) or authentication Header(AH)
These security protocols are not compatible with each other, since they use different packet formats.
3) Encapsulation mode
– Tunnel mode or transport mode
4) Encryption
– DES,3DES,AES, etc
5) Authentication
– MD5,SHA, SHA-256,SHA-512 etc.
6) Combination of these is called the IPsec transform set

IPsec security protocols
1) IPsec supports two data plane encapsulations
– Authentication Header(AH)
– Encapsulating Security Payload(ESP)

AH vs ESP
1) Authentication Header(AH)
– IP protocol number 51
– Data origin authentication includes IP header
– Data Integrity
2) Encapsulating Security Payload(ESP)
– IP protocol number 50
– Data origin authentication excludes IP header
– Data Integrity
– Data Encryption
– Anti-replay
99% of the time we don’t use AH, since it doesn’t support encryption.
3) AH only supports authentication
– Rarely used for VPNs due to this reason
– OSPFv3 for IPv6 can use AH for neighbor authentication
4) ESP supports authentication and encryption
– Preferred method of IPsec VPNs
– Most remote access clients only supports ESP

Tunnel Vs Transport
1) AH & ESP supports two modes of encapsulation
2) Transport
– Original IP header retained
– Payload and layer 4 header authenticated/encrypted with ESP
– Complete packet authenticated with AH
– Typically used in host to host IPsec
3) Tunnel
– Adds new IP header
– Original header & payload authenticated/encrypted with ESP
– Complete packet authenticated with AH
– Typically used between IPsec gateways or host to IPsec gateway.

Transport vs Tunnel mode packet format
1) Authentication Header(AH)
– RFC 4302
2) Encapsulating Security Payload(ESP)
– RFC 4303

Negotiating the IPsec SA
1) Phase 2 negotiation also includes
2) Proxy Identities
– Defines what traffic will be protected
– Also called “Proxy ACLs”
This tells us what traffic goes into the tunnel.
3) Security Association(SA) lifetimes
– How often should we re-key
4) Perfect forward secrecy(PFS)
– Should we re-negotiate DH before we re-key
PFS decides whether, when we need to re-key the tunnel, we run DH again to get a new key or base the new key on the previous run.

IPsec Proxy Identities
1) Defines what traffic goes into the IPsec tunnel
– i.e. the “interesting traffic” to trigger the tunnel
2) Proxy ACLs should be mirror image
– Take a tunnel from peer A to peer B
– Peer A says traffic is from X to Y
– Peer B says traffic is from Y to X
3) Like any other parameter mismatch, misconfiguration will cause tunnel failure.
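A quick way to reason about the mirror-image requirement is to compare the two entries programmatically. The Python sketch below uses the standard ipaddress module; the ProxyEntry type, the function name, and the example subnets are assumptions for illustration.

```python
from ipaddress import ip_network
from typing import NamedTuple


class ProxyEntry(NamedTuple):
    source: str
    destination: str


def is_mirror(a: ProxyEntry, b: ProxyEntry) -> bool:
    """True when b's source/destination are exactly a's destination/source."""
    return (ip_network(a.source) == ip_network(b.destination)
            and ip_network(a.destination) == ip_network(b.source))


# Peer A protects X -> Y, peer B protects Y -> X.
peer_a = ProxyEntry("155.1.79.0/24", "155.1.108.0/24")
peer_b = ProxyEntry("155.1.108.0/24", "155.1.79.0/24")
print(is_mirror(peer_a, peer_b))

# A different mask on one side is enough to break the match.
peer_b_wrong_mask = ProxyEntry("155.1.108.0/25", "155.1.79.0/24")
print(is_mirror(peer_a, peer_b_wrong_mask))
```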

SA lifetimes
1) ISAKMP/IPsec SA both have finite lifetime
– Before expiration, SA is re-keyed
– Lifetime can be in time or bits.
– Lower value of initiator or receiver is negotiated.
2) What do we re-key with?
– Previous DH exchange or new DH exchange?
– Depends on PFS

Perfect Forward Secrecy(PFS)
1) Without PFS, IPsec SA rekeying is done from the initially negotiated master key.
– One compromised key theoretically means all keys are compromised
2) PFS causes new DH exchange prior to rekeying
– Makes IPsec SA keys independent from previous ones
– More secure but at the expense of CPU

IPsec Control plane vs Data plane
1) All traffic is unicast IPv4/IPv6
2) IPsec control plane(ISAKMP)
– UDP 500
– UDP 4500 if going through NAT
3) IPsec Data plane
– ESP(50) or AH(51)
– ESP over UDP 4500 if going through NAT
4) Some platform allow custom ports
– Ex. ASA firewall

IPsec VPN configuration with Crypto Maps
Crypto Map based IPsec Overview
1) “Legacy” method of IOS IPsec configuration
– Still the most common method though
2) Used to form on-demand IPsec tunnels
– Session initiated only when interesting traffic detected
3) No dynamic routing support through tunnel(Limitation)
– Not without additional encapsulation such as GRE
A crypto map doesn’t have an interface that can be installed in the routing table.

How crypto maps work
1) Crypto map is a data-plane filter
– Matching traffic triggers an ISAKMP session to start
2) Traffic is matched using ACLs
– ACLs define proxy IDs for IPsec phase 2
– E.g. what should be encrypted
The session is triggered by matching the ACL. This is what we call the proxy identity or proxy ACL, which is negotiated along with the IPsec transform set in phase 2.
3) Allows for granular control over VPN traffic
– e.g. Only send TCP port 12345 over the VPN.

Applying Crypto Maps
1) Crypto maps apply to physical(sub) interfaces
– Only one crypto map per interface
– Always outbound with respect to traffic direction
2) One crypto map can apply to multiple interfaces
– Entries processed top-down until ACL match occurs
– Order is important
3) Tunnel source defaults to interface IP
– Can be changed using crypto-map local-address

Crypto-maps order of operations
1)Encryption applies after routing
– Static routing may be required
2) Encryption applies after NAT
– NAT exemption may be required
3) Rule of thumb is that crypto, routing, and NAT processes are always independent of each other.
We need to decide whether NAT happens before or after crypto. If NAT happens first, we may need to exempt the traffic from NAT when it is going over the crypto tunnel.
WEB server
|
A—–R1—–Internet—–R2—-B
We want to run encryption when traffic goes between R1 and R2.
We want NAT translation when traffic from A hits R1 on its way to the web server. But if the traffic is going over the tunnel and we want crypto from A to B, we need to edit the NAT config and say: do not NAT the traffic from A to B. Otherwise it would get translated to the outside interface address and crypto would not be applied. If we are doing GRE over IPsec or a VTI tunnel, we don’t need to worry about this, because the traffic is selected by routing and not by the proxy ACL of a crypto map.

High Level Configuration Steps
1) Define phase 1 ISAKMP policy: what are the hashing algorithm, DH group, and encryption?
2) Define phase 2 IPsec policy:
parameters such as the IPsec transform set, where the tunnel is going (i.e. who the endpoint of the tunnel is), what can go inside the tunnel (the ACL), and how the traffic is treated (the encryption and authentication policy)
3) Apply Crypto Map
4) Generate the interesting traffic

Defining phase 1 ISAKMP policy
1) Peers must agree on four main attributes
– Authetication
– Encryption
– Hash
– DH group
– Lifetime (Lower value is negotiated)
2) Policy is processed top-down by the responder until a match occurs
– Based on policy priority
– Lower priority value has higher precedence
– Order is important

Defining phase 2 IPsec Policy
1) IPsec policy defines three main attributes
2) Who?
– Define peer address, hostname, or FQDN
3) What?
– Define proxy ACL
4) How?
– Define Transform set

Applying the crypto Map
1) Crypto Map applies to the link level
– On interface closest to the destination
— i.e.how do I route there?
– Multiple routes mean multiple interface

Default phase 1 and 2 policies
1) IOS includes default fallback policies
2) Default ISAKMP policies
– Active until user-configurable policies are created
– can be disabled with “no crypto isakmp default policy”
3) Default transform-sets
– Active if a user-configured set is not applied
– Can be disabled with “no crypto ipsec transform-set default”

Our goal is to establish tunnel between R7 and R8 and we are going to use this to encrypt traffic that is coming from R9’s interface and going towards R10.(155.1.79.0/24 to 155.1.108.0/24)
Define ISAKMP policy:
On R7:
crypto isakmp policy 10
authentication pre-share
R7 has two interfaces, so it has two possible source IP addresses to reach R8.
crypto isakmp key cisco1234 address 155.1.58.8

On R8:
crypto isakmp key cisco1234 address 150.1.7.7 (we can also put 0.0.0.0, i.e. a wildcard)
This means that on R7 we are sourcing the tunnel from the loopback.
crypto isakmp policy 10 (the number doesn’t need to match; it is locally significant)
authentication pre-share
encryption aes 192

On R7:
encryption aes 192
group 5(higher number would cause high cpu.)
hash sha384

On R8:
group 5
hash sha384

on R7:
show run | se isakmp
The four parameters match on both sides.

Define phase 2 IPsec policy:
On R7:
crypto map MAP1 10 ipsec-isakmp
set peer 155.1.58.8
crypto map MAP1 local-address lo0 (since we are sourcing the tunnel from the loopback, we have to configure this command)

On R8:
crypto map MAP1 10 ipsec-isakmp
set peer 150.1.7.7 (loopback)

Now create proxy ACL
On R7:
ip access-list extended R9_TO_R10
permit ip 155.1.79.0 0.0.0.255 155.1.108.0 0.0.0.255

On R8
ip access-list extended R10_TO_R9
permit ip 155.1.108.0 0.0.0.255 155.1.79.0 0.0.0.255
crypto map MAP1 10 ipsec-isakmp
match address R10_TO_R9

On R7:
crypto map MAP1 10 ipsec-isakmp
match address R9_TO_R10

We need to define whether we are using ESP or AH and which authentication and encryption algorithms it uses. This is defined in the transform set:
crypto ipsec transform-set ESP_AES_192_SHA1 esp-aes 192 esp-sha-hmac
The default mode is tunnel:
mode tunnel

On R8:
crypto ipsec transform-set ESP_AES_192_SHA1 esp-aes 192 esp-sha-hmac
mode tunnel
crypto map MAP1 10 ipsec-isakmp
set transform-set ESP_AES_192_SHA1

On R7:
crypto map MAP1 10 ipsec-isakmp
set transform-set ESP_AES_192_SHA1

show run brief | s crypto | isakmp | access-list: shows all the related configuration
On R7 we have to apply the crypto map on both interfaces, but on R8 we need to apply it on a single interface.
On R7:
interface gig1.67
crypto map MAP1
interface gig1.37
crypto map MAP1
Direction is always outbound.

On R8:
Interface gig1.58
crypto map MAP1

We will run a packet capture on R5, which is connected toward R8.
On R7:
debug crypto isakmp: shows the phase 1 negotiation
ping 155.1.108.8 from 155.1.79.7: this traffic matches the crypto ACL.
We couldn’t see any output in the debug. This means it is a routing problem. If the router never finds an outgoing interface, it never triggers the crypto process, and “show crypto isakmp sa” shows no output because the crypto process didn’t start.
We can see that some interfaces are not part of EIGRP, so we enable it on all interfaces.
We checked the routing and it is working. The crypto map is a data plane filter: every time a packet goes out of the interface, it is checked against the proxy ACL to determine whether or not a session should be initiated. So if we ping R10 from R7 itself, that does not trigger the crypto map, because the source IP address of the ping is not inside the ACL. We turn off the debug on R7.
On R9: ping works…
On R7: show crypto isakmp sa: state is QM_IDLE and status is ACTIVE, which is the good output.
show crypto ipsec sa: shows that packets are being encapsulated and de-encapsulated, encrypted and decrypted (ESP works), and digested and verified (authentication works).

Now export the pcap on R5.
In the pcap, phase 1 appears as main mode or aggressive mode (VPN client).
Quick mode is phase 2.
In the data plane an observer doesn’t know which encryption algorithm or SHA variant is in use, which makes it harder for an attacker to brute force.

configuration verification
1) show crypto isakmp [default] policy
– Verify custom or default phase 1 policies
2) Show crypto isakmp key
– Verify pre-shared-key
3) Show crypto ipsec transform-set <name>
– Verify custom or default phase 2 policy
4) show crypto debug-condition
– Verify condition/filter for debugging
5) show crypto map interface <intf>
– Verify crypto map configuration
6) show crypto isakmp sa
– Show the result of phase 1 negotiation
7) debug crypto isakmp
– Show the step-by-step phase 1 negotiation
8) show crypto ipsec sa
– Show the result of phase 2 negotiation
9) debug crypto ipsec
– Show the step-by-step phase 2 negotiation

IPsec Verification and troubleshooting
1)Lots of ways to make minor mistakes.
2)99% of them break your configuration
3)Understanding how to verify and troubleshoot is a must.
4)Understanding how to interpret the debug output is a must.

Phase 1 Verification and troubleshooting
1) Show crypto isakmp sa[detail]
– State should be “QM_IDLE” and status “ACTIVE”
2) Debug crypto isakmp
– Show negotiation process
3) debug crypto condition peer ipv4 <ip>
– Restricts debugging output to relevant peer
4) Is authentication working?
– Check PSK/CA
– Failure means corrupted packets
If the debug shows a malformed packet, it means the PSK is mismatched.
5) Do IKEv1 SA attributes match?
– Debug will show failure to negotiate attribute
The debug will show a reject if any of the four parameters mismatch.
6) Result should be one bidirectional ISAKMP SA
It is a connection-oriented session. Let’s say R7 is the initiator and R8 is the responder. R7 sends a UDP packet with source and destination port 500, and R8 replies to that packet. Let’s say there is a firewall in the middle, i.e. R3 acting as a firewall. Because the UDP packet went out, the firewall will allow the UDP reply to come back. Even though it uses UDP, ISAKMP is a connection-oriented protocol.
Phase 2 has two unidirectional SAs: one SA from R7 out to R8 and a separate SA from R8 back to R7. That means that if anyone is filtering in the middle, say a firewall, a plain ESP-based IP tunnel is normally not going to succeed. That is why the traffic can be sent over UDP port 4500 for NAT traversal, because the firewall can inspect that traffic. There can be cases where, with filtering in the data plane, the phase 1 tunnel succeeds but the phase 2 tunnel fails, because phase 1 is one bidirectional SA and phase 2 is two unidirectional SAs.

Phase 2 Verification and troubleshooting
1) Phase 1 ISAKMP/IKEv1 “ACTIVE” first
2) Show crypto ipsec sa [peer <IP>]
– Are packets getting…
–En/Decrypted?
–En/Decapsulated?
– Check both sides
3) debug crypto ipsec
– shows negotiation of IPsec SA’s
4) Did transform sets match?
5) Are ACLs mirror images?
6) Result should be two SA’s per ACL entry (inbound and outbound)
– Unidirectional packet counters usually means a data plane problem in the transit path
– e.g. someone is filtering ESP.
Tunnel is going from R7 to R8 for the traffic between R9 to R10. First we are going to delete the tunnel, and see what happend between initiator and responder and then we will do some changes and see what goes wrong.
To delete phase 2 tunnel: “clear crypro sa”
To delete phase 1 tunnel: “clear crypto isakmp”
if we want to do delete completely, we have to go to the link level
show run int gig1.37
inter gig1.37
no crypto map MAP1
crypto map MAP1
int gig1.67
no crypto map MAP1
(we get a log message stating that crypto is off)
crypto map MAP1

On R8:
logging buffered 7
logging buffered 9999999
clear log
debug crypto isakmp
debug crypto ipsec
logging console 6 (we don’t want to see all the output on the console; we send it to the buffer)
Normally the responder has more information in its debug than the initiator.
On R7:
Logging buffered 7
logging buffered 9999999
logging console 6
end
clear log
debug crypto isakmp
debug crypto ipsec
On R9:
Ping 155.1.108.10: Success, hence the tunnel is up.
On R7: show log. It says R7 sent the SA request on UDP port 500, then the attributes, and then the authentication with the pre-shared key. The IPsec proposal is also present in the debug output.
The main thing to look for in the debug output when moving from phase 1 to phase 2 is a line that says “IKE_P1_COMPLETE”. After that we are negotiating IPsec, but the output still shows it as ISAKMP negotiation.
Phase 1 is for ISAKMP and phase 2 is for IPsec.
How is the SPI value calculated? It is a random number, but it must match in the data plane. If a tunnel has been working for a long time and suddenly stops, first check whether the SPIs are the same as the ones present in the table. To clear an SPI that is present on the router: “clear crypto sa spi”
We will now make some changes to the configuration and, without looking at the running configuration, track down the issues.
1) Check whether we can reach the other end of the tunnel. On R9: ping 155.1.108.10 fails, so we don’t have reachability. Check show ip route 155.1.108.10.
2) Was ISAKMP (phase 1) negotiated properly? On R7: show crypto isakmp sa says the status is ACTIVE (deleted) and the state is MM_NO_STATE. This says we tried to start negotiation but for some reason it was rejected; it could be a transport problem or an attribute problem. On R8: show crypto isakmp sa shows nothing. To get more information, run “debug crypto isakmp” on R7 and send one ping packet from R9 to trigger the tunnel. The debug output shows MM_NO_STATE, which means there is a negotiation problem, but we don’t know what it is. This is the initiator’s perspective and it does not give a clear idea. On R8 the debug says the proposed key length does not match the policy and the hash does not match in another policy (multiple policies are configured), and finally it says no offer accepted, which means none of the policies match. Issue show run | section isakmp and make sure all attributes match. We copy and paste the configuration from one router to the other, start debug crypto isakmp again, and send one ping packet from R9. On R7 it now says retransmitting phase 1 MM_KEY_EXCH. It also says the atts are accepted, which means all attributes are accepted and we have moved on to the authentication phase. show crypto isakmp sa says the state is MM_KEY_EXCH, which indicates an authentication failure. On R8 it says the IKE message failed its sanity check or is malformed. The peers do not exchange the password itself; they exchange a hash.
3) Check the authentication issue: show crypto isakmp key shows what the key is; make sure the keys match on both sides. In this case it is a typo. On R7: crypto isakmp key cisco1234 address 155.1.58.8. Now the debug says SA authentication status: authenticated, which means phase 1 is complete, but phase 2 did not form an SA.
4) Run “debug crypto ipsec” and send the ping again. In the debug, the router tries to negotiate the proxy ACL and says the proposal is not chosen, it is deleting the SPI, and the phase 2 SA policy is not acceptable. This means something in the crypto map (proxy ACL, peer address, or transform set) is mismatched and the proposal was rejected because of it. show run | section access-list shows the ACLs are not correct, so we modify the ACL. Now we get a different debug output: transform proposal not supported for identity. Issue “show crypto ipsec transform-set” on both sides; the two are different, so we make them match. show crypto ipsec sa now shows that the SPIs are formed. But compare the output on R8 with R7: on R8 there are no hits on the counters. One side is encapsulating but the other side is not decapsulating. In other words, R7 is sending the packets somewhere but they are not being received. This indicates an issue in the transport. It is not a routing problem; with a routing problem we would not have negotiated phase 1. There could be a filter configured in the transit path. One thing we can do to check is send all traffic through NAT traversal; we will see this later. On R8 we could configure a dynamic crypto map with 0.0.0.0, which accepts all incoming crypto information. On R7: traceroute 155.1.58.8 goes hop by hop to make sure nothing is filtering in the data plane. We can also run debug ip icmp on R7 and send a packet from R9. We get ICMP unreachable (administratively prohibited) from 155.1.45.5. What does this mean? Administratively prohibited means blocked by an ACL. We remove the ACL on R5.
Suggestion: Don’t use the running config to figure out the issues.
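
The “no offer accepted” result in step 2 can be pictured as a simple comparison of the four phase 1 attributes. The sketch below is a simplified model in Python, not the router’s actual logic: the names Policy and first_matching_policy are hypothetical, the attribute values are made up for illustration, and the lifetime attribute is ignored.

```python
from collections import namedtuple

# The four phase 1 attributes that must agree between the peers.
Policy = namedtuple("Policy", "encryption hash auth group")


def first_matching_policy(offered, local):
    """Return the first offered policy that equals a local one, else None.

    None corresponds to the 'no offer accepted' message on the responder.
    """
    for proposal in offered:
        if proposal in local:
            return proposal
    return None


r7_offers = [Policy("3des", "md5", "pre-share", 5)]
r8_local = [
    Policy("aes", "sha", "pre-share", 5),
    Policy("3des", "sha", "pre-share", 5),
]
# Every local policy differs in at least one attribute, so nothing matches.
print(first_matching_policy(r7_offers, r8_local))

# After copying the policy from R7 to R8 the proposal is accepted.
r8_local.append(Policy("3des", "md5", "pre-share", 5))
print(first_matching_policy(r7_offers, r8_local))
```

The point of the model is that a single differing attribute is enough for a policy to be skipped, which is why the responder debug lists one mismatch per configured policy before giving up.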

GRE over IPsec
In this section
1) GRE over IPsec Tunnels
– GRE over IPsec vs IPsec over GRE
– GRE over ESP transport vs Tunnel Mode
– GRE over IPsec and fragmentation problems
– IP MTU and TCP adjust MSS

Why use GRE and IPsec Together?
1) Crypto maps have no interface in routing table
– Implies that dynamic IGP routing isn’t supported
– Static routing is hard to scale
– BGP could be used, but overcomplicates the design.
2) Simple solution, add a Tunnel interface
– Run routing inside of a GRE tunnel
– Encrypt the GRE tunnel inside of IPsec
We take a GRE tunnel and run our routing inside GRE, then put the entire GRE packet inside IPsec, which means the entire tunnel is encrypted.

Where Does the Crypto Map Apply?
1) If crypto map attached at physical interface
– GRE over IPsec(IPsec is transport)
– GRE encapsulation first, encryption second
– Proxy-ACL with single entry “permit gre host A host B”
2) If crypto map attached at tunnel interface level(bad design and it gives less throughput)
– IPsec over GRE(GRE is transport)
– Encryption first, GRE second
– Proxy-ACL with end-to-end entries
– Bad design

GRE over ESP transport vs Tunnel mode
1) Original packet:
Original IP header | TCP/UDP | Data
2) Plain GRE:
GRE IP header | GRE | Original IP header | TCP/UDP | Data
3) GRE over ESP in transport mode:
New IP header | ESP header | GRE | Original IP header | TCP/UDP | Data | ESP trailer | ESP auth
– Authenticated: ESP header through ESP trailer
– Encrypted: GRE through ESP trailer
4) GRE over ESP in tunnel mode (there is an additional GRE IP header; the new IP header and the GRE IP header are equal):
New IP header | ESP header | GRE IP header | GRE | Original IP header | TCP/UDP | Data | ESP trailer | ESP auth
– Authenticated: ESP header through ESP trailer
– Encrypted: GRE IP header through ESP trailer
IP header = 20 bytes
TCP header = 20 bytes (UDP header = 8 bytes)
If the MTU is 1500 bytes, then the TCP data payload (MSS) is 1460 bytes.

GRE over IPsec & Fragmentation problems
1) Overhead is negligible, why do we care?
– Extra 20 bytes is only about 1% overhead on a 1500 byte MTU
The overhead is not the issue. The issue is:
2) DF bit isn’t copied between headers when we do multiple encapsulations
– PMTUD is now broken
– Router must encrypt, then fragment
– Throughput is now out the window
The end result is that path MTU discovery is broken.
3) How do we fix it?
– Offload fragmentation to the end host
In a real design R9 and R10 would have private internal addresses. They wouldn’t be able to advertise their private addresses in BGP, so we need some sort of encapsulation. When we run GRE and IPsec together we have two separate routing protocols: the underlay, which has the routes to the public network, and the overlay, which is our internal routing protocol. For simplicity, the underlay protocol is EIGRP and the overlay running inside the GRE tunnel is OSPF (between R7 and R8 it is EIGRP, and between R9 and R7 it is OSPF). The crypto process and the routing process are 100% unrelated. We have to sort out routing before crypto even starts.
We configure the underlay routing on all devices except R9 and R10.
router eigrp 1
network 155.1.0.0 0.0.255.255
network 150.1.0.0 0.0.255.255

On R7: ping 150.1.8.8 source 150.1.7.7: Success
Now configure the GRE tunnel between R7 and R8. We look at it first as a clear text configuration. On R7:
interface tunnel0
tunnel source lo0
tunnel destination 150.1.8.8
ip address 78.0.0.7 255.255.255.0
ip ospf 1 area 0
interface gig1.79(towards R9)
ip ospf 1 area 0

on R8:
interface tunnel0
tunnel source lo0
tunnel destination 150.1.7.7(r7 loopback)
ip address 78.0.0.8 255.255.255.0
ip ospf 1 area 0
int gig1.108 (enable it on inside interface)
ip ospf 1 area 0

On R9:
Router ospf 1
network 0.0.0.0 255.255.255.255 area 0

On R10
Router ospf 1
network 0.0.0.0 255.255.255.255 area 0
On R7 and R8, show ip ospf neighbor shows a full adjacency over the GRE tunnel. The network type is point-to-point, so there is no DR election.
show ip route ospf: we are learning the routes for the overlay network, which are used to get traffic between R9 and R10. Because this is a GRE tunnel, we have to think about recursive routing errors. Since we run the underlay protocol in the transit network and the overlay protocol inside GRE, we need to make sure the tunnel source and destination do not get advertised in the overlay protocol. In this case we run EIGRP as the underlay and advertise the loopbacks between R7 and R8. We have to make sure this information is not advertised into the GRE tunnel, so that the routers do not prefer a route to the tunnel destination through the tunnel itself. Route recursion to the tunnel destination must point to the physical interface and not to the tunnel. Here that is not a problem, since we only enabled OSPF on the tunnel and the inside link and did not enable it on the loopback. Even if we enabled it on the loopback, OSPF has a higher administrative distance (110) than EIGRP (90), so the EIGRP route would still be preferred.
On R9:
ping 150.1.10.10 source 150.1.9.9: Success. If we trace the path, it goes through the GRE tunnel endpoint R7; R7 encapsulates the packet inside GRE, it goes to R8, and then out to the final destination. We will take a pcap and see the difference between native GRE, ESP in tunnel mode, and ESP in transport mode. The pcap is taken on R5.
In the pcap we see the OSPF hellos that go between R7 and R8. The ping packet is an IP packet with a 20 byte IP header; the source is R7’s loopback and the destination is R8’s loopback. Then we have the GRE header (IP protocol 47), which says the payload is IP; GRE is 4 bytes. So we have an additional 24 bytes of overhead on the actual payload. ping 150.1.10.10 source 150.1.9.9 size 1500 df-bit is lost. ping 150.1.10.10 source 150.1.9.9 size 1476 df-bit: Success. The allowed frame size is 1518. The 18 bytes are Ethernet overhead (14 + 4 for 802.1q).
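
The 1476 and 1518 figures follow directly from the header sizes. A small Python sketch of the arithmetic, assuming an IPv4 header without options and a basic GRE header without key or sequence fields (the function names are made up for illustration):

```python
IP_HEADER = 20        # bytes, IPv4 header without options
GRE_HEADER = 4        # bytes, basic GRE header
ETHERNET_HEADER = 14  # bytes
DOT1Q_TAG = 4         # bytes


def max_payload_through_gre(link_ip_mtu):
    """Largest inner IP packet that fits in one GRE packet on the link."""
    return link_ip_mtu - IP_HEADER - GRE_HEADER


def tagged_frame_size(ip_packet_len):
    """Ethernet frame size with an 802.1q tag for a given IP packet."""
    return ip_packet_len + ETHERNET_HEADER + DOT1Q_TAG


print(max_payload_through_gre(1500))  # largest ping size that passes with df-bit
print(tagged_frame_size(1500))        # full-size frame on the tagged link
```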
The IPsec tunnel never times out, assuming we run a dynamic routing protocol that sends periodic hello packets.
Up to this point we have configured the plain text tunnel; now we configure encryption using a crypto map.
L3 MTU is 1500 and L2 MTU is 1518, which allows Ethernet encapsulation with a .1q header. If you enable dot1q tunneling, the L2 MTU needs to be 1522, which from the configuration point of view is 1504.
What is the difference between regular MTU, IP MTU, and MPLS MTU? We will see it later.
1) Phase 1 ISAKMP SA
– Encryption
– Auth(psk or PKI)
– Hash
– Group
2) Phase 2 IPsec SA
– Who is tunnel going to(peer address) ?
– What is going inside tunnel(acl)?
– How is the traffic being treated (IPsec transform-set)?
Instead of matching end to end, i.e. R9 and R10, we just match the GRE tunnel that goes from R7 to R8 and back. This cuts down the control plane information the routers need to maintain.
on R7:
crypto isakmp policy 10
authentication pre-share
encryption 3des
hash md5
group 5
crypto isakmp key password address 150.1.8.8 (the peer address is R8’s loopback, because R8 sources its crypto from the loopback)

On R8:
crypto isakmp policy 10
authentication pre-share
encryption 3des
hash md5
group 5
crypto isakmp key password address 150.1.7.7

What goes into the tunnel is GRE (protocol 47), specifically from the loopback of R7 to the loopback of R8.

On R8:
ip access-list ex GRE
permit gre any any (if we have more tunnels then we have to specify the R7 and R8 addresses)

On R7:
ip access-list ex GRE
permit gre any any
exit
crypto ipsec transform-set ESP_AES_SHA esp-aes esp-sha-hmac
mode tunnel
crypto map GRE_AND_IPSEC 10 ipsec-isakmp
match address GRE
set transform-set ESP_AES_SHA
set peer 150.1.8.8
crypto map GRE_AND_IPSEC local-address loopback0
On R8:
crypto ipsec transform-set ESP_AES_SHA esp-aes esp-sha-hmac
mode tunnel
crypto map GRE_AND_IPSEC 10 ipsec-isakmp
match address GRE
set transform-set ESP_AES_SHA
set peer 150.1.7.7
crypto map GRE_AND_IPSEC local-address loopback0

On R7 and R8:
conf t
logging console 6
logging buffered 7
logging buffered 9999999
end
clear log
debug crypto isakmp
debug crypto ipsec

On R7:
int gig1.67
crypto map GRE_AND_IPSEC
int gig1.37
crypto map GRE_AND_IPSEC

On R8:
int gig1.58
crypto map GRE_AND_IPSEC

show crypto ipsec sa
We can see plaintext MTU 1438, path MTU 1500, and IP MTU 1500. The plaintext MTU accounts for all the header encapsulation. When we send a packet, the router takes the original packet (the ping), puts it in GRE, and GRE goes inside IPsec, so we have more overhead because we have more encapsulation. From the data plane point of view, on R9 ping 150.1.10.10 source 150.1.9.9 size 1477 df-bit is not successful and we get must fragment (response M). ping 150.1.10.10 source 150.1.9.9 size 1476 df-bit goes through. In the capture we now see that the packets are just ESP encapsulated and all payloads are encrypted. The ping has to be fragmented, and the end hosts do not know it because they cannot see that ESP encapsulation is happening in the middle. When we do GRE and IPsec with a crypto map, the problem is that the DF bit does not get copied from the IP header to the outer GRE header and then to the final ESP header. So the MTU looks larger than it really is. All packets have to be software switched if the router needs to fragment. Hence the solution is to offload the fragmentation to the hosts.

IP MTU & TCP adjust MSS

1) How do we offload fragmentation to host?
– Lower IP MTU on GRE tunnel to account for ESP overhead
– Actual overhead varies based on crypto algorithm
– e.g. AES uses 16-byte blocks
– Good rule of thumb is about 1400 bytes for a normal MTU
– Jumbo MTU rule of thumb about 9000 bytes
2) What if host doesn’t implement PMTUD(path MTU discovery)?
– Edit their MSS in TCP SYN & SYN ACK
– TCP adjust MSS is IP MTU – 20 Bytes IP – 20 Bytes TCP
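
The adjust MSS rule in the last bullet is plain subtraction. A minimal Python sketch, assuming IPv4 and TCP headers without options (the function name is made up for illustration):

```python
def tcp_adjust_mss(ip_mtu, ip_header=20, tcp_header=20):
    """MSS that lets a full TCP segment fit in one packet of size ip_mtu."""
    return ip_mtu - ip_header - tcp_header


for mtu in (1500, 1400):
    print(mtu, tcp_adjust_mss(mtu))
```

With the 1400 byte rule of thumb for the tunnel IP MTU, this gives an MSS of 1360.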

On R7 & R8
int tunnel 0
ip mtu 1400

Now if we send a packet with size 1476 we get a must fragment (M) response. The clear text packet goes inside GRE, and the GRE tunnel now enforces IP MTU 1400 and says you can’t send that particular packet. If the ping size is 1400, the ping with the DF bit set is allowed. In the pcap for size 1400 the captured packet is 1514 bytes. Remember the L2 overhead is 14 bytes of Ethernet header plus 4 bytes of dot1q header.

On R7 & R8
int tunnel 0
ip mtu 1408

Now we ping with size 1408 and the DF bit set. We still see a packet size of 1514 in the pcap, which is confusing since we varied the payload size but the output of encryption is the same. This is because of the AES block size. Depending on the input to encryption you may end up with the same output packet size, and there is no really great way to calculate it exactly because it depends on the individual packet. Hence we need to pick a general value.

On R7 & R8
int tunnel 0
ip mtu 1416
Send a packet of size 1416 and look at the pcap: now the packets are getting fragmented. Even though the difference is only 8 bytes, a lot more overhead is added.

One way we can optimize this: leave the MTU value at 1416 and run ESP in transport mode, which saves one header.

On R7 and R8
crypto ipsec transform-set ESP_AES_SHA esp-aes esp-sha-hmac
mode transport
Clear crypto isakmp
clear crypto sa

On R7
int gig1.37
no crypto map GRE_AND_IPSEC
crypto map GRE_AND_IPSEC
int gig1.67
no crypto map GRE_AND_IPSEC
crypto map GRE_AND_IPSEC
On R8
int gig1.58
no crypto map GRE_AND_IPSEC
crypto map GRE_AND_IPSEC

It’s still showing tunnel mode. Transport mode is only allowed if the tunnel runs from this router to the other router; it cannot be used for end application traffic. The traffic has to be sourced locally and destined to the other endpoint. What actually controls that is the proxy ACL. Our proxy ACL says any GRE traffic, regardless of source and destination. When the peers exchange the proxy identities in phase 2, this is what validates the transform set for transport mode. So even though we configured transport mode, the router is not using it and falls back to tunnel mode.
What we need to do here is edit the ACL.
On R8
ip access-list ex GRE
permit gre host 150.1.8.8 host 150.1.7.7
no 10 (get rid of the old entry at sequence number 10)
end
On R7
ip access-list ex GRE
permit gre host 150.1.7.7 host 150.1.8.8
no 10 (get rid of the old entry at sequence number 10)

clear crypto sa
clear ipsec sa
Now the setting is transport mode.
When we send a ping with size 1416, the packet is not fragmented. Because of the AES block size we are saving more bytes.
Q) In tunnel mode, when packets were getting fragmented, why was the packet size 1514 and not 1518 bytes? It has to do with the block size calculation. With a 16 byte block, 16 bytes of input gives 16 bytes of output, but 17 bytes of input gives 32 bytes of output, because we have to add padding.
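
The packet sizes seen in the captures can be reproduced with a short Python sketch. It assumes the transform set used above (esp-aes esp-sha-hmac) with the usual ESP field sizes: 8 byte ESP header, 16 byte AES-CBC IV, 2 byte ESP trailer, and a 12 byte truncated HMAC-SHA1 value; it also assumes no NAT traversal. The function name esp_packet_len is made up for illustration.

```python
IP_HEADER = 20
GRE_HEADER = 4
ESP_HEADER = 8    # SPI + sequence number
AES_IV = 16       # AES-CBC initialization vector
AES_BLOCK = 16    # AES block size
ESP_TRAILER = 2   # pad length + next header, encrypted with the payload
ESP_ICV = 12      # truncated HMAC-SHA1 authentication data


def esp_packet_len(gre_payload, mode):
    """IP packet length after GRE and ESP, for 'tunnel' or 'transport' mode."""
    if mode == "tunnel":
        # The whole GRE packet, including its IP header, is encrypted.
        protected = IP_HEADER + GRE_HEADER + gre_payload
    elif mode == "transport":
        # The GRE IP header is reused as the outer header and stays in clear.
        protected = GRE_HEADER + gre_payload
    else:
        raise ValueError("mode must be 'tunnel' or 'transport'")
    # Round the encrypted part up to a whole number of AES blocks.
    blocks = (protected + ESP_TRAILER + AES_BLOCK - 1) // AES_BLOCK
    return IP_HEADER + ESP_HEADER + AES_IV + blocks * AES_BLOCK + ESP_ICV


for size in (1400, 1408, 1416):
    print(size, esp_packet_len(size, "tunnel"), esp_packet_len(size, "transport"))
```

Under these assumptions, sizes 1400 and 1408 both produce a 1496 byte packet in tunnel mode (1514 on the wire with the 18 bytes of Ethernet and dot1q overhead), size 1416 jumps to 1512 and no longer fits in 1500, while transport mode keeps 1416 at 1480.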

What if end hosts are not running PMTUD?
This is why in a typical crypto config on tunnels you see not only the ip mtu command but also ip tcp adjust-mss. MSS adjust basically lets the router proxy the three-way handshake. When a ping or application packet comes in on R7, say with “ip mtu 1400” configured, and it goes above that threshold, R7 replies with ICMP must fragment: you set the DF bit but I must fragment, so R7 drops the packet. Now take a TCP session from R9 (R9 is the web client and R10 is the web server). R9 sends a TCP SYN, and inside the SYN it tells the server the maximum segment size it supports, because the larger the segment, the more efficient the communication. Most applications take the MSS as MTU minus 40 bytes (20 bytes IP header and 20 bytes TCP header). Assuming the MTU is 1500, the MSS is 1460. The problem is that if they agree on 1460 and a 1500 byte IP packet arrives at R7, R7 needs to fragment. So R7 and R8 listen for the three-way handshake and edit the value down to something lower: if the MSS is greater than my configured adjust MSS value, I swap my number in. The TCP server thinks the value is lower, and we are offloading the fragmentation to the end hosts. As an example, on R7 we capture inbound on the link towards R9:
ip access-list ex TCP
permit tcp any any
monitor capture 1 access-list TCP
monitor capture 1 int gi1.79 both

On R9, R10.
Ip tcp mss 1450
This affects TCP sessions that the router itself originates, such as BGP and MSDP.
Now R9 sends a TCP SYN: telnet 150.1.10.10 /source-interface lo0
In the pcap, if they negotiate 1450 then the actual IP packet is 1490 (1450 + 20 IP + 20 TCP). We know we cannot support anything above 1400, so because of IPsec R7 and R8 would have to fragment.
What we can do instead, on R7 and R8:

int tunnel0
ip tcp adjust-mss 1000

If we look at the packets again, the MSS is 1000 and not 1450. The router looks at the TCP handshake and makes the decision; it is done on a per-session basis.
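
The rewrite rule the router applies to the SYN and SYN ACK can be sketched in one line of Python. The function name clamp_mss is made up for illustration; the logic is only the “swap my number in if theirs is greater” rule described above.

```python
def clamp_mss(advertised_mss, adjust_mss):
    """MSS left in the SYN after the router applies ip tcp adjust-mss."""
    # The option is only ever lowered, never raised.
    return min(advertised_mss, adjust_mss)


print(clamp_mss(1450, 1000))  # host advertised 1450, router rewrites it
print(clamp_mss(536, 1000))   # already below the configured value, left alone
```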

IPsec Virtual Tunnel Interfaces(VTI)
1) Tunnel interface with direct Ipsec encapsulation
-Conceptually similar with GRE, but without additional GRE overhead
2) Two VTI variations
3) Static VTI(SVTI)
– Used for site-to-site VPN
4) Dynamic VTI(DVTI)
– Used for remote-access VPN, also called Easy VPN
We are going to see static VTI

GRE over IPsec vs IPsec VTI
1) GRE over IPsec
– More overhead
— Arguably negligible
– Multiprotocol encapsulation
— ex. ipv4, ipv6, IS-IS etc
– On-demand VPN
– Usually triggered by IGP
– Line protocol based on route to destination
2) IPsec VTI
– Less overhead
– single protocol encap
– IPv4 over IPv4 IPsec only
– IPv6 over IPv6 IPsec only
– Always-on
– No interesting traffic needed
– Line protocol based on IPsec phase 2 completion
A GRE tunnel stays up, and stays in the routing table, as long as there is a route to the tunnel destination. GRE doesn’t run keepalives by default. We could use object tracking, IP SLA, or EEM scripting. By default the tunnel stays up even if you don’t have reachability to the final destination.
With IPsec VTI it is different. Line protocol is based on IPsec phase 2 completion, which means that if the VTI tunnel is up we do have end-to-end reachability. The limitation of VTI is that it is single protocol encapsulation.
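
The two line protocol rules can be written side by side as a small Python model. The function and parameter names are hypothetical; the model only restates the behaviour described in the list above.

```python
def gre_line_protocol_up(route_to_destination,
                         keepalive_configured=False,
                         keepalive_answered=False):
    """GRE: up when a route to the tunnel destination exists.

    Keepalives are off by default; when configured they must be answered.
    """
    if not route_to_destination:
        return False
    if keepalive_configured:
        return keepalive_answered
    return True


def vti_line_protocol_up(phase2_sa_established):
    """VTI: up only once IPsec phase 2 has completed with the peer."""
    return phase2_sa_established


# A default route is enough to bring GRE up towards a peer that does not exist.
print(gre_line_protocol_up(route_to_destination=True))
print(gre_line_protocol_up(True, keepalive_configured=True, keepalive_answered=False))
print(vti_line_protocol_up(phase2_sa_established=False))
```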

IPsec VTI configuration
1) Phase 1 steps are identical to crypto map based tunnel
2) For phase 2…
3) Tunnel already defines the Who
– i.e. Tunnel destination
4) Tunnel already defines the what
– i.e. ip any any
5) How is the traffic treated?
– Defined by crypto IPsec profile
We have an extra piece of configuration called the IPsec profile.

Crypto IPsec profiles
1) IPsec profile is essentially a stripped down version of crypto map
– Contains only IPsec phase 2 negotiation parameters
– i.e IPsec transform set
2) Does not contains peer address or proxy ACL
– Peer is the tunnel destination
– Proxy ACL is non-configurable: “permit ip any any” for a VTI, GRE between the tunnel endpoints for a GRE tunnel
3) IPsec profile can apply to both a GRE tunnel and an IPsec VTI tunnel
– Tunnel MTU is automatically adjusted for ESP overhead.

We cannot have VTI and a crypto map at the same time. They are mutually exclusive because the router would not know the order of operations. First remove the old crypto map configuration from the router. Now our tunnel becomes a simple plain text tunnel.
On R8: ping 78.0.0.7 succeeds. show ip route: we are learning routing information over the tunnel, which again was the main motivation for using GRE inside IPsec instead of a regular crypto map: the crypto map has no interface in the routing table, so we cannot route dynamically. An IPsec VTI does have an interface in the routing table, which removes the need for the extra GRE encapsulation.
So the configuration up to this point is the same:
show run int tun0
interface tunnel0
ip address 78.0.0.8 255.255.255.0
ip ospf 1 area 0
tunnel source loopback0
tunnel destination 150.1.7.7
end

We have removed the ip tcp adjust-mss and ip mtu commands as well.
On R7:
show run int tun0
interface tunnel0
ip address 78.0.0.7 255.255.255.0
ip ospf 1 area 0
tunnel source loopback0
tunnel destination 150.1.8.8
end

So we keep the previous phase 1 configuration; for phase 1 there is no difference between the VTI config and the crypto map config.

show run | sec crypto
crypto isakmp policy 10
authentication pre-share
encryption 3des
hash md5
group 5
crypto isakmp key password address 150.1.8.8
crypto ipsec profile VTI_PROFILE
set transform-set ESP_AES_SHA (the transform-set we used previously)

On R8:
crypto ipsec profile VTI_PROFILE
set transform-set ESP_AES_SHA
inter tunnel0
tunnel mode ipsec ipv4
(ETH)(IP)(ESP)(payload)(ESP trailer)
tunnel protection ipsec profile VTI_PROFILE
show ip int tunnel0 shows tunnel transport MTU 1500 bytes. Once the tunnel is up we will check what the MTU value is.

On R7:
inter tunnel0
tunnel mode ipsec ipv4
tunnel protection ipsec profile VTI_PROFILE
show crypto isakmp sa: phase 1 is working

On r5 capture the traffic.
On R9, ping 150.1.10.10 source 150.1.9.9: in the capture we just see ESP packets, the same as with GRE. But when we compare the exact payload size for the GRE tunnel vs VTI, VTI is a little lower, because it does not add the extra GRE encapsulation: it has the one outer IP header but not the additional 4 bytes of GRE. 4 bytes is not a huge difference, but technically it is less.
ping 150.1.10.10 source 150.1.9.9 df-bit size 1500: the DF bit is copied to ESP, so the router knows where to enforce fragmentation. The problem we ran into previously was that an IP packet with DF set has the DF bit copied into GRE, but it is not copied from GRE to ESP, which breaks path MTU discovery. When you use IPsec VTI you don’t need to set the IP MTU because the router accounts for the overhead automatically. Where fragmentation starts in VTI depends on which crypto algorithm we are using. Size 1438 goes through without fragmentation.
show ip int tunnel0: it indicates MTU 1438. When the router does the negotiation, it knows the effective MTU based on the transform-set. Since IPsec now encapsulates directly without going through GRE, the router knows we are using AES and, based on the AES overhead, the end result is 1438, which means the TCP MSS is 40 bytes lower than this. If we still want to enforce that, we go to the VTI and say:
inter tunnel0
ip tcp adjust-mss 1398(40 bytes lower than 1438)
we only need to run this command if path MTU discovery is broken.
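
The 1438 and 1398 values can be derived by hand. The Python sketch below is a worked calculation under stated assumptions, not the router’s actual algorithm: ESP tunnel mode with esp-aes esp-sha-hmac, an 8 byte ESP header, 16 byte IV, 16 byte AES blocks, 2 byte ESP trailer, and a 12 byte authentication value. The function names are made up for illustration.

```python
OUTER_IP = 20
ESP_HEADER = 8    # SPI + sequence number
AES_IV = 16
AES_BLOCK = 16
ESP_TRAILER = 2   # pad length + next header
ESP_ICV = 12      # truncated HMAC-SHA1


def vti_plaintext_mtu(path_mtu=1500):
    """Largest inner IP packet that still fits after ESP encapsulation."""
    room = path_mtu - OUTER_IP - ESP_HEADER - AES_IV - ESP_ICV
    # The encrypted part is a whole number of AES blocks and includes the trailer.
    return (room // AES_BLOCK) * AES_BLOCK - ESP_TRAILER


def mss_for(mtu):
    """TCP MSS for an MTU, assuming 20 byte IP and 20 byte TCP headers."""
    return mtu - 40


mtu = vti_plaintext_mtu(1500)
print(mtu, mss_for(mtu))
```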
The main change here is that we don’t need crypto-map
Final config
crypto isakmp policy 10
authentication pre-share
encryption 3des
hash md5
group 5
crypto isakmp key password address 150.1.8.8
crypto ipsec transform-set ESP_AES_SHA esp-aes esp-sha-hmac
crypto ipsec profile VTI_PROFILE
set transform-set ESP_AES_SHA
interface tunnel0
tunnel mode ipsec ipv4
tunnel protection ipsec profile VTI_PROFILE
ip address 78.0.0.7 255.255.255.0
ip tcp adjust-mss 1398

show ip route: the VTI installs routes like a normal interface does, but the overhead is less.

Let’s say we want to run OSPFv3 routing. On R9, show ipv6 int brief shows that IPv6 is configured on R9.
On R9:
int lo0
ipv6 ospf 1 area 0
int gig1.79
ipv6 ospf 1 area 0
ipv6 unicast-routing
On R7:
ipv6 unicast-routing
int gig1.79
ipv6 ospf 1 area 0
int tunn0
ipv6 enable (it will enable ipv6 link local address)
ipv6 ospf 1 area 0
On R10:
int lo0
ipv6 ospf 1 area 0
int gig1.108
ipv6 ospf 1 area 0
ipv6 unicast-routing

On R8:
ipv6 unicast-routing
int tunnel0
tunnel mode gre ip (it’s the default mode)
ipv6 enable
ipv6 ospf 1 area 0
int gig1.108
ipv6 ospf 1 area 0
On R7:
int tunnel0
tunnel mode gre ip
Now we can see that we not only have an OSPFv2 adjacency but also an OSPFv3 adjacency.

On R9:
show ipv6 route: we see the OSPFv3 routes, so we should be able to reach the loopback of R10. ping 2001:150:10:10::10: Success. This is a normal GRE config and we are putting GRE inside IPsec, but not with a crypto map; here we apply a crypto IPsec profile, which references the IPsec transform-set, and the ISAKMP policy is configured globally. It is the same type of example as the earlier GRE over IPsec, but we skip the step of having a crypto map. In show crypto ipsec sa, the main difference is what the proxy ACL defines. In this case the proxy ACL says (150.1.7.7/255.255.255.255/47/0): it is GRE (47), specifically from R7 to R8.

On R9: Ping 150.1.10.10 & ping 2001:150:10:10::10 success
If we change the mode to ipsec ipv4:
On R7:
int tunnel0
tunnel mode ipsec ipv4
On R8:
int tunnel0
tunnel mode ipsec ipv4
On R9: ping 150.1.10.10: Success; ping 2001:150:10:10::10: Fail.
This is because VTI is single IP protocol encapsulation. VTI is IPv4 in IPv4 or IPv6 in IPv6; it is not multiprotocol.
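
The ping results above can be summarized as a lookup of which payload protocols each tunnel mode carries. This Python sketch only restates the comparison from these notes; the table and function name are made up for illustration.

```python
# Payload protocols each tunnel mode can carry, as listed in these notes.
CARRIED = {
    "gre ip": {"ipv4", "ipv6", "is-is"},  # multiprotocol
    "ipsec ipv4": {"ipv4"},               # IPv4 over IPv4 IPsec only
    "ipsec ipv6": {"ipv6"},               # IPv6 over IPv6 IPsec only
}


def tunnel_carries(tunnel_mode, payload):
    """True if the given tunnel mode can carry the payload protocol."""
    return payload in CARRIED[tunnel_mode]


print(tunnel_carries("gre ip", "ipv6"))      # OSPFv3 and IPv6 pings work
print(tunnel_carries("ipsec ipv4", "ipv6"))  # IPv6 ping fails over an IPv4 VTI
```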

On R7:
ip route 0.0.0.0 0.0.0.0 gig1
int tunnel1234
tunnel dest 1.2.3.4
tunnel source lo0
We see a log message stating that the tunnel1234 interface is up even though the destination 1.2.3.4 doesn’t exist. GRE only checks whether we have a route to the destination; if yes, the line protocol is up. We could configure “keepalive 1” (seconds). Now the tunnel1234 state goes down. So keepalive is not the default behaviour. By default GRE does not care whether the destination is really reachable; the tunnel is up based on the routing table.

If one router runs VTI and the router at the other end runs a crypto map, they still form a tunnel between them. Likewise, if one side runs a dynamic crypto map or Easy VPN and the other runs dynamic VTI, you can terminate clients on either side, because from the negotiation point of view it is the same thing.
VTI frame format:
New IP header | ESP | Original IP | TCP | Data | ESP trailer | ESP auth

IPsec over DMVPN
DMVPN Review
1) DMVPN is P-to-M layer 3 overlay VPN
– Logical hub and spoke topology
– Direct spoke to spoke traffic is supported.
2) DMVPN uses a combination of…
– Multipoint GRE tunnels(mGRE)
– Next hop resolution protocol(NHRP)
– IPsec crypto profiles
– Routing
3) DMVPN typically implies IPsec but doesn’t require it
– DMVPN on its own is an mGRE routing technique
– Can be combined with IPsec to encrypt mGRE
4) IPsec over DMVPN is same logic as IPsec over GRE
– Configured as crypto ipsec profile on Tunnel
– Only difference is dynamic spoke-to-spoke tunnels

DMVPN order of operations
1) Crypto first
2) NHRP second
3) Routing third
If there is a problem with IPsec, the spokes never register with the hub, they cannot run routing protocols, and they cannot send any traffic. So from a verification and troubleshooting point of view, when we look at a full DMVPN config we always look at IPsec first. To quickly check whether the problem is in IPsec while DMVPN itself is fine, remove the crypto profile from the tunnel and see whether the tunnel forms as clear text GRE. Once DMVPN routing is working, reapply the crypto profile and start the troubleshooting steps related to IPsec.
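
The troubleshooting order can be expressed as a tiny Python helper. This is a hypothetical sketch: the layer names and the status dictionary are made up, and the health values are assumed to come from manual checks such as show crypto isakmp sa, show dmvpn, and show ip route.

```python
# DMVPN order of operations: each layer depends on the one before it.
DMVPN_ORDER = ("crypto", "nhrp", "routing")


def first_failure(status):
    """Return the first unhealthy layer in DMVPN order, or None if all are up."""
    for layer in DMVPN_ORDER:
        if not status.get(layer, False):
            return layer
    return None


# With broken IPsec, NHRP and routing are down too, but crypto is the root cause.
print(first_failure({"crypto": False, "nhrp": False, "routing": False}))
print(first_failure({"crypto": True, "nhrp": True, "routing": False}))
```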

Ipsec and DMVPN phase 1/2/3
1) Spoke to hub tunnels are always nailed up
– True for all DMVPN phases
– Implies hub always has IPsec SA for all spokes
2) Spoke to spoke tunnels are on demand
– True for DMVPN phase 2 and 3
– Implies IPsec SA is established on demand between spokes
DMVPN, NHRP, and basic routing configuration is done, but none of the crypto configuration is. First verify the basic configuration. R5 is the DMVPN hub, which also means R5 is the NHRP NHS. When spokes want to resolve each other, they first need to register with the hub and then ask it for resolution of the other spokes. So first see whether NHRP is working on the hub.
On R5: show dmvpn tells us whether the spokes registered with the hub with their NBMA addresses and whether we have the correct mapping to the tunnel addresses. In the output they are properly registered.
On R1: show dmvpn and show ip nhrp: we know about the hub statically because we manually configured it as the NHS. R5 is configured with tunnel mode gre multipoint. It means it is running phase 2 or 3.
On R5:
crypto isakmp policy 100
authentication pre-share
encryption aes 192
group 16
hash sha512
Here we are going to form more than one tunnel; the first tunnels form from the spokes to the hub. For authentication, we could specify an individual password for each tunnel, or we could specify a wildcard key that covers multiple peers at the same time. The disadvantage of the wildcard is that if we want to change the key we need to change it everywhere, and if one of the spokes is compromised then technically the entire network is compromised. In real scenarios we could use PKI and certificates.
Crypto isakmp key DMVPN_KEY address 169.254.100.1
Crypto isakmp key DMVPN_KEY address 169.254.100.2
Crypto isakmp key DMVPN_KEY address 169.254.100.3
Crypto isakmp key DMVPN_KEY address 169.254.100.4
The important point here is that crypto happens first. This means that when we look at the ISAKMP identifier and the IPsec identifier, the packets are coming from the underlay network and not the overlay network. In other words, the key is tied to the NBMA address.
On All router:
crypto isakmp policy 100
authentication pre-share
encryption aes 192
group 16
hash sha512
Crypto isakmp key DMVPN_KEY address 169.254.100.1
Crypto isakmp key DMVPN_KEY address 169.254.100.2
Crypto isakmp key DMVPN_KEY address 169.254.100.3
Crypto isakmp key DMVPN_KEY address 169.254.100.4
(Crypto isakmp key DMVPN_KEY address 0.0.0.0) this is the wild card
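
The choice between per-peer keys and the wildcard can be pictured as a lookup keyed on the peer’s NBMA (underlay) address. The Python sketch below is an illustrative model only: lookup_psk is a made-up name, and it assumes that a specific entry is preferred over the 0.0.0.0 wildcard.

```python
def lookup_psk(keys, peer_nbma):
    """Pick the pre-shared key for a peer: exact NBMA address first, then wildcard.

    keys maps an address string to a key; '0.0.0.0' stands for the wildcard.
    Returns None if neither an exact entry nor a wildcard exists.
    """
    if peer_nbma in keys:
        return keys[peer_nbma]
    return keys.get("0.0.0.0")


per_peer = {"169.254.100.1": "DMVPN_KEY", "169.254.100.2": "DMVPN_KEY"}
wildcard = {"0.0.0.0": "DMVPN_KEY"}

print(lookup_psk(per_peer, "169.254.100.2"))  # explicit entry
print(lookup_psk(per_peer, "169.254.100.9"))  # unknown spoke, no key
print(lookup_psk(wildcard, "169.254.100.9"))  # wildcard matches any peer
```

The last line shows why the wildcard is convenient and also why it is risky: any peer address, including one that was never planned, ends up with the same key.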
Phase 1 is complete; now we do the phase 2 profile. With a profile the tunnel defines where the connection is going, so we don’t need to set a peer address, and it also defines what goes inside the tunnel, which in this case is all GRE traffic.
On R5:
crypto ipsec transform-set DMVPN_TRASFORM esp-3des esp-md5-hmac (this should be the same on all routers)
crypto ipsec profile DMVPN_PROFILE
set transform-set DMVPN_TRASFORM
On all spokes:
crypto ipsec transform-set DMVPN_TRASFORM esp-3des esp-md5-hmac (this should be the same on all routers)
crypto ipsec profile DMVPN_PROFILE
set transform-set DMVPN_TRASFORM

On R5: debug crypto isakmp
debug crypto ipsec
tunnel protection ipsec profile DMVPN_PROFILE

On R1:
tunnel protection ipsec profile DMVPN_PROFILE
On R5 we saw ISAKMP negotiation in debug output.
show crypto isakmp sa: the state is QM_IDLE, so the ISAKMP negotiation is done. We haven’t configured phase 2 or phase 3 of DMVPN yet, so just look at the normal traffic pattern from spoke to hub and back. Capture the traffic on R5. On R2: ping 150.1.5.5 source lo0: success. In the capture, before the traffic reaches the tunnel it comes in on the public interface as ESP, and the source and destination addresses are the public addresses, i.e. 169.254.100.X. If we move the capture point to the tunnel interface then we see the payload, because that capture happens after ESP decryption. If you are in the transit path you cannot see what is going on, because the whole communication between the devices is encrypted.
Spoke-to-spoke traffic: on R2: traceroute 150.1.1.1 source 150.1.2.2
on R2: int tunnel0
no ip split-horizon eigrp 1
no ip next-hop-self eigrp 1
In phase 2 we don’t want the next hop of the spoke routes to change, so we execute the commands above. With OSPF, we could use network type broadcast. With BGP, we could do it by configuring a route reflector.
on R2: traceroute 150.1.1.1 source 150.1.2.2: Now a new on-demand tunnel forms between R1 and R2. There is one more change we haven’t made yet that is used in most real designs: you don’t want ESP in tunnel mode. We usually go to the transform-set and set the mode to transport, which saves some encapsulation overhead.
On all routers:
crypto ipsec transform-set DMVPN_TRASFORM esp-3des esp-md5-hmac
mode transport
clear crypto sa: this forces phase 2 to rekey.
show crypto ipsec sa: it shows we are running transport mode.
If there are a lot of tunnels and we use stronger hash and encryption algorithms, convergence time slows down.

Scaling IPsec over DMVPN
1) As the DMVPN cloud grows, IPsec state grows
– Linear IPsec SA state from hub to spokes, spoke-to-spoke scale is on-demand.
If there are 1000 spokes, the hub has to store 1K keys.
2) Scaling ISAKMP authentication
– PSK is supported but hard to manage
– Wildcard PSKs are a bad idea
– PKI is the preferred solution
3) Scaling IPsec SAs
– DMVPN over GETVPN
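A quick back-of-the-envelope sketch of the scaling argument above, in Python. The function names are made up for these notes, and the model is deliberately simple: it counts peers and tunnels, not individual SAs, and the spoke-to-spoke figure is a worst case that assumes every pair of spokes has an on-demand tunnel at the same moment.

```python
def hub_peers(spokes):
    """Peers the hub keeps IPsec state for: one per registered spoke."""
    return spokes

def worst_case_spoke_tunnels(spokes):
    """Spoke-to-spoke tunnels if every pair of spokes talked at once."""
    return spokes * (spokes - 1) // 2

# Hub state grows linearly, the full spoke mesh grows quadratically.
for n in (10, 100, 1000):
    print(n, hub_peers(n), worst_case_spoke_tunnels(n))
```

This is why the hub side is the part that is usually sized first, while the spoke-to-spoke side depends on how much traffic actually flows between spokes.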

GETVPN
1) Group Encrypted Transport (GET)
– Transport, not tunnel encryption
2) Normal IPsec is point-to-point
– ISAKMP SA and IPsec SA per tunnel
3) GETVPN is any-to-any
– Shared GDOI & IPsec SA for all group members
– i.e. everyone uses the same encryption and decryption keys
– Means IPsec state does not grow as group members grow.

 

CCIE LAB practice

1) When we configure the VTP domain name on the server, it is automatically propagated over trunk links to other devices if they are part of the NULL VTP domain. If a VTP domain has already been configured on them, they will ignore VTP updates from the other domain.

2) Switches use the IP address of the lowest-numbered physical interface as the source of all VTP messages; if that interface does not have an IP address, the loopback 0 interface is used. This behaviour can be changed with the “vtp interface loopback1” global config command.

3)

1) Remove all the configuration from the physical interfaces.
2) Configure the interface port-channel.
3) Execute the “no switchport” command, then configure the IP address.
4) Now, configure the physical interfaces with the “no switchport” command.
5) Assign the port-channel ID created in step 2 using the channel-group interface configuration command.
6) Issue shut and no shut on the physical interfaces.

Now you can see the L3 EtherChannel is up.
To confirm:
show etherchannel summary | B summary
The flag “RU” should be present for the specific channel number.

Etherchannel

4) MST accepts instance IDs from 0 to 4094, although a switch typically supports only up to 65 instances at once. Once the spanning-tree mode is changed to MST and the MST configuration mode is entered, instance 0 is created automatically and all VLANs are mapped to that instance. By default, all VLANs that are not statically mapped to a given instance are assigned to instance 0; instance 0 is the catch-all instance.
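The catch-all behaviour of instance 0 can be pictured with a small Python sketch. The function name and the sample mapping are invented for illustration; the only rule being modelled is "a VLAN that is not statically mapped falls into instance 0".

```python
def mst_instance_for(vlan, mapping):
    """Return the MST instance a VLAN belongs to.

    mapping: dict of instance ID -> collection of statically mapped VLANs.
    Any VLAN not found in the mapping falls into instance 0.
    """
    for instance, vlans in mapping.items():
        if vlan in vlans:
            return instance
    return 0  # catch-all instance

# Hypothetical static mapping: instance 1 = VLANs 10-19, instance 2 = VLANs 30 and 40.
mapping = {1: range(10, 20), 2: {30, 40}}
print(mst_instance_for(15, mapping))
print(mst_instance_for(99, mapping))
```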

5) !!NOTE!! Always run “show frame-relay map” when starting a lab and after configuration is complete to verify layer 2 connectivity. If there are 0.0.0.0 frame-relay mappings, save the configuration and reload. It is the only way to get rid of them.

5)

Frame-relay troubleshooting
1) Check that DTE and DCE are properly configured using show controller <interface> | in clock
2) Check that LMI has been exchanged between the routers using show fram lmi | in Num
3) Check the map status using show frame map

Frame-relay can be configured in two different ways: multipoint and point-to-point. There is ONLY one way to configure f-r in a p-2-p manner, and that’s through a p-2-p sub-interface, whereas multipoint can be configured in two ways:
1) Perform the entire configuration directly under the main interface.
2) Configure a sub-interface in a multipoint manner.
If the entire f-r config was performed without the use of a sub-interface, then this is a multipoint interface. In a multipoint f-r config, two conditions must be met before an IP address is reachable:
A: The destination IP address must be in the routing table with a valid next hop.
B: There must be a frame-relay mapping for that destination.
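Conditions A and B above can be sketched as a small Python check. Everything here is a simplified model for these notes: the function name, the routes dictionary (prefix -> next hop, with None meaning directly connected) and the fr_map dictionary (IP -> DLCI) are assumptions, and for a connected route the mapping is looked up for the destination itself.

```python
import ipaddress

def fr_multipoint_reachable(dest, routes, fr_map):
    """Check both conditions for reachability on a multipoint f-r interface."""
    dest_ip = ipaddress.ip_address(dest)
    candidates = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes.items()
        if dest_ip in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return False  # condition A fails: no route to the destination
    # Longest-prefix match picks the route that would actually be used.
    _, next_hop = max(candidates, key=lambda c: c[0].prefixlen)
    hop = next_hop or dest  # connected route: the destination is the hop
    return hop in fr_map    # condition B: a mapping must exist

routes = {"10.1.1.0/24": None, "192.168.2.0/24": "10.1.1.2"}
fr_map = {"10.1.1.2": 102}
print(fr_multipoint_reachable("10.1.1.2", routes, fr_map))
print(fr_multipoint_reachable("10.1.1.3", routes, fr_map))
```

The second call fails only because of condition B: the route exists (connected), but there is no mapping for 10.1.1.3.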

6) When configuring the f-r mapping from one spoke to another spoke, the “broadcast” keyword should not be used; if this keyword is used, the hub router will receive redundant routing traffic.

7) When F-R is configured in a p-2-p manner it’s important to understand the following two behaviours:

A: There is no need to disable inverse-arp, because inverse-arp is disabled when f-r is configured in a p-2-p manner.
B: There is no need for F-R mappings, because there can only be one other router on the other end of the PVC; therefore, all IP addresses (including the local router’s IP address) are reachable as long as the destination IP address is in the routing table with a valid next-hop IP address.

8) There may be a requirement to configure F-R multipoint without using the frame-relay map command.

In this case the solution is PPP. PPP is configured on the DLCIs; when PPP is configured, a host route is injected into the routing table, and this host route provides NLRI for the next hop’s address.
EX. frame-relay interface-dlci 101 ppp virtual-template1
interface virtual-template1
ip address <assign local interface IP>

How do these routers communicate?
When running PPP a host route is injected by IPCP; if the routing table of a router is checked, you will see that the next hop is reachable via the local router’s virtual-template interface. Since the VCs are configured as P2P, any packet the local router puts on the virtual-template is received by one and ONLY one router on the other side of the DLCI.

9) Before the RIP routing protocol accepts routes from a given neighbor, it makes sure that the source IP address of the advertising router is in the same IP address space as the link that connects the two routers. If the routers that have to exchange routing information are in different IP address spaces, the source validation MUST be disabled using the “no validate-update-source” command.

10) If the offset-list references 0 instead of an access-list number, the offset value applies to all the routes received through the specified interface.

11) OSPF passive interface: This works differently from distance vector protocols like RIP, where routes will still be received but not sent. To get the same ‘passive-interface’ effect as distance vector protocols in OSPF (i.e. receive routes but don’t send routes), use “ip ospf database-filter all out” under the interface.

12)

– Unconditional OSPF Default Route
> This advertises a default route into the OSPF domain, regardless of whether the local router can reach areas outside the OSPF domains, or not.
> With no additional configuration options, the default route is advertised as an External Type 2 (E2) route with metric 1.
> Configured with “default-information originate always” under the OSPF process.
– Conditional OSPF Default Route
> Configured with “default-information originate” but without the ‘always’ keyword.
> This advertises a default route into the OSPF domain, but only if the advertising router has a non-OSPF default route in its routing table.
> The non-OSPF default route could be any of the following:
>> A static default route with the next-hop pointing outside the OSPF domain.
>> A static default route based on IP SLA measurements (example: http://routing-bits.com/2009/03/10/ospf-default-route-alternative/).
>> Or a BGP advertised default route.
> The “default-information originate” command without the always option is functionally equivalent to redistributing a default route into OSPF.
> With no additional configuration options, the default route is advertised as an E2 route with a metric of 1.
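The difference between the two variants boils down to one condition, which a tiny Python sketch makes explicit. The function name and the returned dictionary are only illustrative; the E2 type and metric 1 are the defaults stated in the notes above.

```python
def ospf_default_route(always, has_non_ospf_default):
    """Model of `default-information originate [always]`.

    always: True when the 'always' keyword is configured.
    has_non_ospf_default: True when the routing table holds a default
    route that was not learned via OSPF (static, BGP, ...).
    Returns the advertised default route, or None if nothing is advertised.
    """
    if always or has_non_ospf_default:
        return {"prefix": "0.0.0.0/0", "type": "E2", "metric": 1}
    return None

print(ospf_default_route(always=True, has_non_ospf_default=False))
print(ospf_default_route(always=False, has_non_ospf_default=False))
```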

13) If we want to change the MD5 authentication key between two OSPF peers without tearing down the adjacency, create the new key and apply it on both adjacent routers. OSPF selects the latest key for authentication (automatic rollover to the new key) without tearing down the adjacency.

14) In OSPF, whenever we create a summary route, a NULL route is automatically added to avoid forwarding loops in the network. To remove the null route, execute the command “no discard-route internal/external”.

15) The default cost of the injected default route in OSPF can be changed using “area XX default-cost CC”, where CC is the new default cost.

16)

In order to filter any prefix from the routing table we create a prefix list and apply it with a distribute-list. The “distribute-list in” command can be used on a given router regardless of which LSA type the route comes from; this command ONLY filters the prefix(es) from the local router’s routing table and NOT the database.
To filter any prefix on an ABR, we can configure a filter-list on the ABR.
To filter routes learned from LSA types 1 and 2 on the ABR, use the command area <#> range <network> <mask> not-advertise
To filter a route, we can also use the distance command to set the AD to 255.
The “distribute-list out” command MUST be configured on the ASBR or else it will not have any effect whatsoever. This command filters LSA type 5s or 7s. An alternative to this command is “summary-address <prefix> <mask> not-advertise”, which should be configured on the ASBR or the router that generates the LSA 5.
To block all outgoing LSAs on a specific interface, use the command “ip ospf database-filter all out”.
In the point-to-multipoint network type, we can filter the LSAs for a specific router using the command “neighbor <ip-address> database-filter all out”.

17) In OSPF, if we want to redirect traffic without using the bandwidth, ip ospf cost, PBR or distance commands, we can execute the command “max-metric router-lsa” on the transit router (the secondary path). This command causes the router to originate its router LSAs with a maximum metric of 0xFFFF, so that other routers do not prefer this router as a transit hop in their paths to a given network.

18)

There are some additional optional non-transitive attributes that can be used when RRs are configured: originator-id, cluster-id and cluster-list.
Originator-id: This attribute is created by the RR; it is the router-id of the router that originated the prefix. It’s created to avoid routing loops: an RR will not advertise a route back to the originator of the prefix, and if the originator of a prefix receives an update with its own router-id, it will ignore that prefix.
Cluster and cluster-id: An RR (or RRs) and its clients are collectively known as a cluster. Each cluster must be uniquely identified, and the cluster-id is typically the router-id of the RR unless specifically configured.
Cluster-list: This attribute is analogous to the AS-path attribute, and it keeps track of the cluster-ids in the same way that the AS-path attribute keeps track of the AS numbers. When the RR reflects a prefix, it prepends its cluster-id to that prefix’s cluster-list; if an RR receives an update and sees its own cluster-id in the cluster-list, it will ignore that update.
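The two loop-prevention checks can be sketched in a few lines of Python. This is a simplified model for these notes, not a BGP implementation: updates are plain dictionaries, and the function names reflect and accept_update are made up for illustration.

```python
def reflect(update, source_router_id, cluster_id):
    """Return a copy of the update as a route reflector would reflect it."""
    reflected = dict(update)
    # Originator-id is set once, by the first RR that reflects the route.
    reflected.setdefault("originator_id", source_router_id)
    # The RR puts its own cluster-id at the front of the cluster-list.
    reflected["cluster_list"] = [cluster_id] + list(update.get("cluster_list", []))
    return reflected

def accept_update(update, router_id, cluster_id):
    """Apply the originator-id and cluster-list loop checks."""
    if update.get("originator_id") == router_id:
        return False  # our own route came back to us
    if cluster_id in update.get("cluster_list", []):
        return False  # the update already passed through our cluster
    return True

route = {"prefix": "10.0.0.0/8"}
reflected = reflect(route, source_router_id="1.1.1.1", cluster_id="9.9.9.9")
print(reflected)
print(accept_update(reflected, router_id="1.1.1.1", cluster_id="2.2.2.2"))
```

The last call returns False because the router with router-id 1.1.1.1 is the originator of the prefix.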

19) Legacy custom queueing: Because queueing is always outbound, no direction can be specified when custom queueing is applied to the interface. Queue 0 is like a priority queue; traffic in this queue is always sent first.

20) CBWFQ: Don’t forget to change the default max-reserved-bandwidth of 75% for the interface before applying the service-policy. “max-reserved-bandwidth” is only a configuration limitation!

A few points from Daniel’s and CCIETOBE blogs

Scaling PEs in MPLS VPN – Route Target Constraint (RTC)

The way this feature works is that the PE advertises to the RR which RTs it intends to import. The RR then implements an outbound filter, sending only routes matching those RTs to the PE. This is much more efficient than the default behavior. Obviously the RR still needs to receive all the routes, so no filtering is done towards the RR. To enable this feature a new Subsequent Address Family Identifier (SAFI) called rtfilter is used.

The scenario is that PE1 is located in a large PoP where there are already plenty of customers. It currently has 255 customers. PE2 is located in a new PoP and so far only one customer is connected there. It’s unnecessary for the RR to send all routes for all of PE1’s customers to PE2 because it does not need them.

In this case we have 255 routes but what if it was 1 million routes? That would be a big waste of both processing power and bandwidth, not to mention that the RR would have to format all the BGP updates. These are the benefits of enabling RTC:

  • Eliminating waste of processing power on PE and RR and waste of bandwidth
  • Less VPNv4 formatted Updates
  • BGP convergence time is reduced
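The outbound filter the RR builds with RTC can be pictured with a short Python sketch. It is only a model of the idea: the function name, the route dictionaries and the route-target strings are hypothetical, and real RTC works on BGP updates in the rtfilter address family rather than on Python lists.

```python
def routes_for_pe(rr_routes, imported_rts):
    """Return only the VPNv4 routes carrying at least one RT the PE imports."""
    wanted = set(imported_rts)
    return [route for route in rr_routes if wanted & set(route["rts"])]

# Hypothetical RR table: two customers behind PE1, one behind PE2.
rr_routes = [
    {"prefix": "10.1.0.0/16", "rts": ["65000:1"]},
    {"prefix": "10.2.0.0/16", "rts": ["65000:2"]},
    {"prefix": "10.3.0.0/16", "rts": ["65000:3"]},
]

# PE2 only has a VRF that imports 65000:3, so it receives one route.
print(routes_for_pe(rr_routes, ["65000:3"]))
```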

Conclusion

Route Target Constraint is a powerful feature that lessens the load on both your Route Reflectors and PE devices in an MPLS VPN enabled network. It can also help BGP converge faster. Support is needed on both the PE and the RR, and the BGP session is torn down when enabling it, so it has to be done during a maintenance window.

Portfast

Even if portfast is enabled under the interface it will still lose its portfast status if BPDUs are received.

STP Convergence:

What happens when the root port is shut down? In theory, when the switch detects loss of carrier on the link, it should look at the alternate BPDU and start taking that port through the different port states. This should take around 30 seconds. The timing is almost perfect. The port goes through listening and learning at 15 seconds each before it goes to forwarding almost exactly 30 seconds after the port was shut down.

What happens when there is an indirect failure? The switch has to expire the root BPDU before it believes other BPDUs with worse cost. This should take around 20 seconds. By default Maxage will be set to 20 seconds. So it took almost 20 seconds for the BPDU to expire. Then the port goes through the ordinary state changes. Roughly 48.5 seconds after the filter was applied the port went into forwarding. For passive failures when running PVST+ the maximum recovery time should be 50 seconds.
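The two recovery times above follow directly from the default timers, as this small Python sketch shows. The function names are made up for these notes; the defaults are the 15-second forward delay and 20-second MaxAge mentioned in the text.

```python
def direct_failure_recovery(forward_delay=15):
    """Root port goes down: listening + learning on the alternate port."""
    return 2 * forward_delay

def indirect_failure_recovery(max_age=20, forward_delay=15):
    """Indirect failure: wait for the stored BPDU to age out first."""
    return max_age + 2 * forward_delay

print(direct_failure_recovery())    # 30 seconds with default timers
print(indirect_failure_recovery())  # 50 seconds with default timers
```

The measured 48.5 seconds sits just under the 50-second maximum because the stored BPDU had already aged a little when the failure happened.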

Now let’s look at PVST+ with Uplinkfast configured. The theory is that when a root port fails, the alternate port should bypass the listening and learning states and go directly to forwarding. Let’s try this out. It took only 2 seconds from realizing the port was down to putting the alternate port into forwarding. For PVST+ this is a great enhancement.

Tiebreakers with routes from different OSPF process

If a router receives the same prefix from two routers in different OSPF processes, which path should it use to forward packets to the destination? The tiebreaker is the lowest process number.

If everything else is the same then the tiebreaker is the lowest process number. For EIGRP it is the lowest AS number, so maybe Cisco chose to make it comparable.

 

Redistributing between OSPF and BGP:

R3(config)#router bgp 254
R3(config-router)#redistribute ospf 3
This redistributes only OSPF intra-area and inter-area routes into BGP. To redistribute the external routes into BGP as well, we need the command below:
redistribute ospf 3 match internal external 1 external 2
By default, redistribution of iBGP routes into an IGP is disabled. Issue the “bgp redistribute-internal” command to enable redistribution of iBGP routes into the IGP. Take precautions and use route maps so that only specific routes are redistributed into the IGP.

PPPoE

PPP uses Link Control Protocol (LCP) to establish a session between a user’s computer and an ISP. LCP is responsible for determining if the link is acceptable
for data transmission. LCP packets are exchanged between multiple network points to determine link characteristics including device identity, packet size, and
configuration errors.

PPPoE:
Two stages
1) Discovery stage: When a Host wishes to initiate a PPPoE session, it must first perform Discovery to identify the Ethernet MAC address of the peer and establish a PPPoE SESSION_ID.
The steps consist of the Host broadcasting an Initiation packet, one or more Access Concentrators sending Offer packets, the Host sending a unicast Session Request packet and the selected Access Concentrator sending a Confirmation packet. When the Host receives the Confirmation packet, it may proceed to the PPP Session Stage. When the Access Concentrator sends the Confirmation packet, it may proceed to the PPP Session Stage.
PPPoE Active Discovery Initiation (PADI) packet(Broadcast)
PPPoE Active Discovery Offer (PADO) packet(Unicast)
PPPoE Active Discovery Request (PADR) packet(Unicast)
PPPoE Active Discovery Session-confirmation (PADS) packet(Unicast)
PPPoE Active Discovery Terminate (PADT) packet (Unicast): either side can send it at any time.

2) Session stage: everything is unicast.
The Maximum-Receive-Unit (MRU) option MUST NOT be negotiated to a larger size than 1492. Since Ethernet has a maximum payload size of 1500 octets, the PPPoE header is 6 octets and the PPP Protocol ID is 2 octets, the PPP MTU MUST NOT be greater than 1492.
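The 1492 figure is just the Ethernet payload minus the two headers, which a short Python sketch spells out. The constant and function names are made up for these notes; the sizes are the ones given in the paragraph above.

```python
ETHERNET_PAYLOAD = 1500  # maximum Ethernet payload in octets
PPPOE_HEADER = 6         # PPPoE header in octets
PPP_PROTOCOL_ID = 2      # PPP Protocol ID field in octets

def pppoe_max_mtu(ethernet_payload=ETHERNET_PAYLOAD):
    """Largest PPP MTU/MRU that still fits in one Ethernet frame."""
    return ethernet_payload - PPPOE_HEADER - PPP_PROTOCOL_ID

print(pppoe_max_mtu())  # 1492
```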

MPLS

The OSPF and BGP router-ID is not the same kind of value as the LDP router-ID. This might be confusing, but you can configure an IP that doesn’t exist on the router as the OSPF or BGP router-ID. To be clear, in OSPF and BGP the router-ID is not really an IP address; it only looks like one and is just a 32-bit value, which is why we don’t need reachability to it. In LDP, however, we must have reachability to our neighbor’s router-ID because it is an IP address.

RIP and MPBGP

When there is a PE and a CE router, with RIP running on the CE router, and we redistribute routes from BGP into RIP on the PE router, we need to use the command below to preserve the RIP metrics.

“redistribute bgp <asn> metric transparent”

When RIP routes are redistributed into BGP, the route metric is stored in the BGP MED value. When BGP routes are redistributed into RIP, and the transparent keyword used, the MED value is copied back as the RIP metric. Without the transparent keyword, the metric value specified is applied to all the routes.

For the technical implementation with “metric transparent” mentioned above, make sure your BGP MED value is never larger than 15. Otherwise RIP will consider the route unreachable and the PE will not announce it to the CE.
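The MED-to-RIP-metric behaviour can be modelled with a few lines of Python. This is a sketch under the assumptions stated in these notes: the function name is made up, a metric above 15 is treated as unreachable, and without the transparent keyword the configured metric is applied to every route.

```python
RIP_MAX_METRIC = 15  # anything larger is unreachable in RIP

def rip_metric_from_bgp(med, transparent, configured_metric=None):
    """Return the RIP metric for a BGP route redistributed into RIP.

    Returns None when the route would not be announced (no usable
    metric, or a metric larger than 15).
    """
    metric = med if transparent else configured_metric
    if metric is None or metric > RIP_MAX_METRIC:
        return None  # unreachable, so the PE does not announce it
    return metric

print(rip_metric_from_bgp(med=3, transparent=True))
print(rip_metric_from_bgp(med=20, transparent=True))
print(rip_metric_from_bgp(med=20, transparent=False, configured_metric=5))
```

The second call returns None, which is the case where a MED copied from an OSPF or EIGRP metric is too large for RIP.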

One way of ending up with larger MED values is to use OSPF or EIGRP at another site, whose metric is copied into the BGP MED during redistribution.

If only RIP is used, with only single-homed sites, then this should work properly.