Archives for posts with tag: WAN

IPv4 MTU issues can be hard to spot initially, but there is a solution, and it's called Path MTU Discovery (RFC 1191). The RFC describes it as “a technique for using the Don’t Fragment (DF) bit in the IP header to dynamically discover the PMTU of a path”.

Further to that, the RFC states: “The basic idea is that a source host initially assumes that the PMTU of a path is the (known) MTU of its first hop, and sends all datagrams on that path with the DF bit set. If any of the datagrams are too large to be forwarded without fragmentation by some router along the path, that router will discard them and return ICMP Destination Unreachable messages with a code meaning ‘fragmentation needed and DF set’” (Type 3, Code 4).

The unfortunate issue is that the message sent back doesn’t always say what the MTU is. RFC 1191 added a Next-Hop MTU field to the ICMP message, but older or misbehaving devices may not populate it.

A colleague of mine, who is a Windows 7 expert, has reliably informed me that Windows 7 has PMTUD enabled by default.

The important point to focus on is the ICMP unreachable (Type 3, Code 4). Put simply, if you don’t receive an ICMP message back with the “fragmentation needed” code, your PC will assume the MTU is fine and continue to send packets at the same size, even though somewhere along the path those packets are potentially being dropped.

There can be a number of reasons for this, including firewalls blocking the message, ICMP unreachables being disabled on an interface, or a transparent host between the two endpoints (often seen in service provider networks) with a lower MTU value.

I recently ran into an issue where IP connectivity between two sites looked fine: ping, traceroute and SSH were all working, but certain applications and protocols were not, most notably HTTPS.

Below I will explain how to spot this issue.

Take a look at the diagram below. I have deliberately used a transparent device, as it's what you are most likely to see in an L3VPN (MPLS) network. The last-mile provider delivers a layer 2 path (perhaps an L2TPv3 pseudowire) from CE to PE, and the underlying hops are hidden from us. From the service provider's perspective the routers are directly connected.

This is where an MTU issue might occur. For this scenario I have reduced the MTU quite significantly for effect.


Let's say, for example, you have a perfectly functioning network where the MTU is fine along the path. Initially you can send a ping with 1460 bytes and you will get a reply. Let's increase this to something we know is too big (1550 bytes). In a healthy network you will receive an ICMP Type 3 back: the “packet needs to be fragmented but DF set” message.
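For reference, those ping sizes relate to the interface MTU by simple header arithmetic: an unfragmented IPv4 ping carries the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header. A quick sketch (assuming a standard 1500-byte Ethernet MTU and no IP options):

```python
# Maximum ICMP echo payload that fits in one unfragmented IPv4 packet:
# the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header.
IP_HEADER = 20
ICMP_HEADER = 8

def max_ping_payload(mtu: int) -> int:
    return mtu - IP_HEADER - ICMP_HEADER

print(max_ping_payload(1500))  # 1472: the classic unfragmented limit on Ethernet
print(max_ping_payload(1000))  # 972: what would fit through the reduced-MTU hop
```

This is why a ping of 1473 or more payload bytes with DF set fails on a plain 1500-byte Ethernet path.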


Now let's try that through our network, where the MTU is set lower but the sending device doesn't know about it.


At first you think it's OK, because you can ping along the path and get a reply, and SSH works too. Now let's try pinging with different packet sizes. Remember, your PC doesn't receive the ICMP message this time, so all you get is a “request timed out” message.


The reason for that is the packet is being dropped and the ICMP message isn't being returned. If I ping with a size lower than 1000 bytes, I get a reply.
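The manual size-hunting above can be automated. The sketch below is not a real network probe, just a toy Python model of the idea: probe() stands in for a ping with DF set, the assumed 1000-byte bottleneck mirrors this lab, and a binary search homes in on the path MTU.

```python
# Toy model of PMTUD-style probing: binary-search the largest packet that
# survives the path. probe() pretends to be "ping with DF set"; here the
# hypothetical path silently drops anything over 1000 bytes, as in the lab.
PATH_MTU = 1000  # assumed bottleneck for this sketch

def probe(size: int) -> bool:
    """Pretend ping: True if a DF-set packet of this size gets through."""
    return size <= PATH_MTU

def find_path_mtu(lo: int = 68, hi: int = 1500) -> int:
    # 68 is the IPv4 minimum MTU; 1500 is the first-hop Ethernet MTU.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe(mid):
            lo = mid       # fits: the real path MTU is at least this big
        else:
            hi = mid - 1   # dropped: the real path MTU is smaller
    return lo

print(find_path_mtu())  # 1000
```

About eleven probes narrow 68..1500 down to the exact value, far quicker than guessing sizes by hand.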


Now the question: why would HTTPS not work? Well, in some cases web applications or your client might set the Don't Fragment bit in the IP header of the SYN request. This means the packet must not be fragmented, so when we send it over our network with the bad MTU in the path, the packet is dropped and the sending device never receives the ICMP message. It never knows that it has to reduce the packet size. The packet capture below shows where the DF bit is set.


I had a look through RFC 2246 (TLS 1.0) and it doesn't specify that the DF bit should be set. It's most likely a vendor or OS specific behaviour, so your observed results may differ from vendor to vendor.


IPERF is a great tool for testing bandwidth between two computers, either on the local LAN or across a WAN.

I won't do a deep dive into the mechanics of TCP and UDP packets.

One side is the client and one side is the server. The -w switch sets the TCP receive window size; for the purpose of this discussion it's 64KB.

Here are the commands you need to get started. From the server side you only need the following:

iperf -s -w 64KB -i 5

What this does is set the device as a server, set the TCP window size to 64KB, and set the interval at which to display progress.
Hit return and it will begin to listen for any connections, see below.
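Why does the window size matter? As a rough rule, the window a TCP connection needs to keep a link full is the bandwidth-delay product (bandwidth times round-trip time). The figures below are illustrative, not from the post:

```python
# Rough sketch: bytes in flight needed to fill a link of a given speed
# at a given round-trip time (the bandwidth-delay product).
def window_for_full_rate(bandwidth_bps: float, rtt_s: float) -> float:
    """TCP window (bytes) needed to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8

# A 100 Mbps link with 20 ms RTT needs about 250 KB in flight, so a
# 64 KB window like the one above would cap throughput on such a path.
print(window_for_full_rate(100e6, 0.020))  # 250000.0 bytes
```

This is worth keeping in mind when interpreting iperf results over a high-latency WAN: a small -w value, not the link, may be the bottleneck.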

IPERF Server


Now on the client side, this is where you set all your parameters.

iperf -c [ip address of server] -w 64KB -t 40 -i 5

What this does is send the server data for 40 seconds and give you an update on the progress every five seconds, like below.

IPERF Client


You could also send the server a set amount of data, say 100MB, with the following command:

iperf -c [ip address of server] -w 64KB -n 100M -i 5

Another trick you can do is run parallel client threads to simulate more than one device using the link. Use the command below
to simulate 3 parallel threads:

iperf -c [ip address of server] -w 64KB -n 100M -i 5 -P 3

If you add the -u switch, iperf will send UDP packets instead of TCP and give you some valuable statistics such as jitter and packet loss.
This time don't set a window size on the client or server; instead you set the bandwidth on the client side. Don't set this to more than your actual bandwidth, otherwise you'll drop packets.

iperf -c [ip address of server] -u -t 40 -i 5 -b [b/width in Megabits]

iperf -s -u -i 5

See below outputs from client and server side.





For help on iperf just type iperf --help to see all the switches.


This lab is set up for Multicast Source Discovery Protocol (MSDP). I will also be applying Source Active (SA) filtering on our Rendezvous Points (RPs). On this occasion I will use Auto-RP to discover the RP in each domain, then use MSDP to share SA messages about our multicast sources. The SA filtering will be applied to R2 and R3. When using MSDP, set up a multicast boundary: imagine a line between domains where you create an access-list which permits or denies the multicast addresses you specify.

The reason I am using Auto-RP is to show the significance of the multicast boundary: we will block the Auto-RP groups (224.0.1.39 and 224.0.1.40), which guarantees the boundary between our domains, though the boundary is not limited to only that purpose.

The main commands you will need are “ip msdp peer [ip address] connect-source [interface]”; for SA filtering, “ip msdp sa-filter {in | out} [peer ip] list [access-list]”; and for the multicast boundary, under the interface, “ip multicast boundary [access-list] {in | out} filter-autorp”. My previous blog “Multicast Notes” provides a little more detail, along with some example access-lists.

MSDP Topology



Attached are the configs for each router (Router Configs). As I'm using GNS3, I had to enter the commands “no ip route-cache” and “no ip mroute-cache” on each PIM interface, and also add “no ip cef”. You won't need this on physical routers.

Verify our MSDP peers with the command “show ip msdp summary”. From R3 this shows we have 2 MSDP peers, both in the Up state. All is looking good at this point.

R3#sh ip msdp summary
MSDP Peer Status Summary
Peer Address     AS   State   Uptime/   Reset  SA     Peer Name
                              Downtime  Count  Count
                 2    Up      00:09:29  0      0      ?
                 6    Up      00:10:19  0      0      ?

We can also use the command “show ip msdp peer” from R3 to show us if any SA filtering is being applied, and how many SA messages have been received from the peer. In our example we have received 1 SA message from our peer, and we are filtering inbound SA messages; both can be seen in the output below.

MSDP Peer (?), AS 2
Connection status:
State: Up, Resets: 0, Connection source: Loopback1 (
Uptime(Downtime): 00:17:41, Messages sent/received: 18/19
Output messages discarded: 0
Connection and counters cleared 00:19:31 ago
SA Filtering:
Input (S,G) filter: 101, route-map: none
Input RP filter: none, route-map: none
Output (S,G) filter: none, route-map: none
Output RP filter: none, route-map: none
Input filter: none
Peer ttl threshold: 0
SAs learned from this peer: 1
Input queue size: 0, Output queue size: 0
Message counters:
RPF Failure count: 0
SA Messages in/out: 1/0
SA Requests in: 0
SA Responses out: 0
Data Packets in/out: 0/0

To display any SA messages the router has received, we use the command “show ip msdp sa-cache”. This displays the source tree entry for any multicast source our neighbour is aware of, bearing in mind that all SA messages are flooded to MSDP neighbours unless blocked by an SA filter.

R3#sh ip msdp sa-cache
MSDP Source-Active Cache – 1 entries
(,, RP, BGP/AS 2, 00:01:50/00:05:04, Peer

Now let's consider SA filtering. As per my multicast notes blog, I will use this to filter messages sent from R2 to R3. The scenario is as follows: R5 is the source for two multicast groups, but R7 only needs to receive traffic from one of them. So on R2, which is the nearest RP to R5, I will filter outbound SA messages to R3. Alternatively, we could apply an inbound filter on R3 instead. The configuration on R2 is as follows.

access-list 101 permit ip host host
ip msdp sa-filter out list 101

Let’s test this with a ping to the multicast address.

R5#ping rep 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to, timeout is 2 seconds:
Reply to request 0 from, 28 ms
Reply to request 0 from, 156 ms
Reply to request 1 from, 52 ms
Reply to request 1 from, 180 ms
Reply to request 2 from, 48 ms
Reply to request 2 from, 176 ms

R5#ping rep 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to, timeout is 2 seconds:
Reply to request 0 from, 52 ms
Reply to request 1 from, 48 ms
Reply to request 2 from, 40 ms

So as you can see above, R7 replies to the first group but not the second, so the filter works as expected.
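The effect of an outbound SA filter can be modelled as a simple permit check over (source, group) pairs. This is only a sketch of the logic, with made-up addresses standing in for the lab's (the real ACL entries are on R2 above):

```python
# Toy model of MSDP SA filtering: an access list of permitted
# (source, group) pairs; any SA message not on the list is dropped
# before being sent to the peer. Addresses here are hypothetical.
permit_list = {("10.0.5.5", "239.1.1.1")}  # stands in for ACL 101

def sa_filter_out(sa_messages):
    """Return only the SA messages the outbound filter permits."""
    return [sa for sa in sa_messages if sa in permit_list]

learned = [("10.0.5.5", "239.1.1.1"), ("10.0.5.5", "239.2.2.2")]
print(sa_filter_out(learned))  # only the permitted (S,G) survives
```

Because the second (S,G) entry never leaves R2, R3 never learns that source, and receivers behind R3 cannot join that tree via MSDP.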

I also used an inbound SA filter on R2, which does the opposite of an outbound filter: the commands below only accept SA messages from the peer for the specified (S,G) entry.

ip msdp sa-filter in list 102
access-list 102 permit ip host host

Below you can see some packet captures on the routers showing the Source Active messages being sent from R2 to R3 and from R3 to R6.

MSDP SA from R2 -> R3


MSDP SA from R3 -> R6



This lab is set up for a static Rendezvous Point (RP). You might use this in a small environment with only one or two multicast sources. A static RP is managed by manually configuring each router with “ip pim rp-address x.x.x.x”. As you can probably tell, manually updating each router with a new RP could be time consuming on larger networks, so we will cover other ways to assign or elect the RP in future blogs.

Static RP topology


Attached are the configs for each router (Router Configs). As I'm using GNS3, I had to enter the commands “no ip route-cache” and “no ip mroute-cache” on each PIM interface, and also add “no ip cef”. You won't need this on physical routers.

Now let's confirm our configuration with some well-known commands.

From R4 we need to confirm the IGMP memberships. “show ip igmp groups” will show all the IGMP joins the router has received from hosts wanting to join the multicast stream.

R4#sh ip igmp groups
IGMP Connected Group Membership
Group Address    Interface                Uptime    Expires   Last Reporter   Group Accounted
                 FastEthernet0/0          00:21:26  stopped

Next we confirm the RP with the command “show ip pim rp”. Let's do this from R2; it will show us all RPs and the multicast group addresses they are responsible for.

R2#sh ip pim rp
Group:, RP:, v2, uptime 00:03:55, expires never
Group:, RP:, v2, uptime 00:24:12, expires never

Next, from R1, we initiate some traffic with a ping to the group address; you should get a reply. Now we can run the command “show ip mroute summary” from R2. This will show us the shared tree and source tree for the multicast source.

R2#sh ip mroute summary
IP Multicast Routing Table
(*,, 00:02:47/stopped, RP, OIF count: 0, flags: SPF
(,, 00:02:47/00:00:46, OIF count: 1, flags: FT

This confirms our multicast setup is working correctly.

If you are struggling with source and shared trees, check out this blog.



This command might save you from a whole world of trouble!

For example, you've got a scheduled change coming up on your WAN router that's a little risky: you might just cut yourself off from the router halfway through the change. What do you do to mitigate this risk?

Well, if you don't have OOB (out-of-band) access to the router, you can use the following command:

#reload in 20

What this does is reload the router in 20 minutes, reverting back to the last saved config. So if you've been cut off, just wait out the time you specified and the router will reload with the old config.

To confirm how much time you have before the reload, use the command

#show reload

Once you've completed your changes and they are successful, just issue the command

#reload cancel

You can also schedule a reload for a specific time with the command

#reload at 09:45

This reloads your router at the time you specify, but it obviously relies on your clock/NTP settings being correct, so check those first.

Plan ahead!!


I was asked a question some time ago; it went something like this: “Robert, why does it take so long for me to open a file over the network? It's taking over 5 minutes.” So I went through the usual troubleshooting steps: what is the file? Where is the file? How big is the file? It quickly dawned on me that he was opening a file somewhere in the region of 150 megabytes, so my reply was “I'm sorry, but this is the normal time you can expect when downloading a 150-megabyte file over the WAN.” The user's reply was “Robert, we have a 50Mbps link at each end, surely this should take like 3 seconds?”

Firstly, let me explain the difference between megabytes and megabits. Megabytes are what you and I refer to as file sizes: you might say to your friend “my flash drive is 2 gigabytes”, which is 2048 megabytes, or the song you just bought off iTunes is 6 megabytes. We are familiar with this and accept it as the standard unit for measuring file size, and assume the same applies to your home broadband speed………………WRONG!

Your ISP will give you a broadband connection of, say, 10 megabits per second (Mbps), which is in fact only 1.25 megabytes per second. This is because bandwidth is measured in megabits per second, not megabytes.

There are 8 megabits in 1 megabyte, so just divide your broadband speed by 8 to get your speed in megabytes per second.

So, to answer the user's question, downloading 150 megabytes at 50Mbps should take 24 seconds, right? Let's work it out.

50Mbps / 8 = 6.25 megabytes per second; 150 megabytes / 6.25 = 24 seconds ………….. WRONG!
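That naive sum, in code, using the file size and link speed from the example:

```python
# Naive transfer-time estimate that ignores latency entirely.
link_mbps = 50   # link speed in megabits per second
file_mb = 150    # file size in megabytes

link_mbytes_per_s = link_mbps / 8            # 6.25 MB/s
naive_seconds = file_mb / link_mbytes_per_s
print(naive_seconds)  # 24.0 -- yet the real transfer took over 5 minutes
```

The gap between this 24 seconds and the observed 5+ minutes is what the rest of this post explains.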

We haven't thought about latency and how it affects your TCP traffic flow. When you open up a command prompt and ping across the WAN to your file server, you can see something called the RTT; see the example below.

Pinging with 32 bytes of data:
Reply from bytes=32 time=32ms TTL=44
Reply from bytes=32 time=32ms TTL=44
Reply from bytes=32 time=32ms TTL=44
Reply from bytes=32 time=32ms TTL=44

Ping statistics for
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 32ms, Maximum = 32ms, Average = 32ms

The time value (32ms above) is your RTT, or round trip time: the time it takes a packet to reach the destination and come back again. The further away the host, the higher the RTT.

How does latency affect download speed? Well, we have to think about how TCP works when transferring files over the WAN. Your default TCP window size is 64 kilobytes, which means that for every 64 kilobytes of data sent, the receiving end has to send an acknowledgement. So for a 150-megabyte file there are 2400 acknowledgements.
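The acknowledgement count above is just the file size divided by the window size (on this simplified one-window-per-ack model):

```python
# Number of 64 KB windows (and hence acknowledgements, on this
# simplified model) needed to move a 150 MB file.
file_kb = 150 * 1024   # 150 megabytes expressed in kilobytes
window_kb = 64
print(file_kb // window_kb)  # 2400
```

Each of those 2400 round trips pays the full RTT, which is where the time goes.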

So you can now see why latency affects throughput. You could increase the default TCP window size on both the sending and receiving devices, but this is not a cure-all: if there is any packet loss on your link, you will have to resend any data that gets lost on the way. Guaranteed delivery through retransmission is fundamental to how TCP works.

So what can you do to speed things up? You could bring the two countries closer together, or work out a way of making the speed of light faster?

Luckily there are technologies out there that fall under the category of WAN acceleration or WAN optimization. I won't mention any particular company, but there are quite a few, and some even optimize UDP traffic.

They use techniques such as de-duplication, compression, caching, CIFS optimization, latency mitigation and so on. Together these can make the connection between two offices appear a lot quicker and even reduce the amount of traffic sent over the link by as much as 50%. Both ends of the connection must have an accelerator, usually connecting back to a main bridgehead or control server. Each accelerator typically contains some sort of hard drive, perhaps an SSD for faster read/write times, which stores data previously accessed over the link.

The calculation for working out the maximum TCP throughput of a WAN link, regardless of its bandwidth, is as follows.

TCP window size in bits / latency in seconds = throughput in bits per second; divide by 1048576 (2^20, taken here as the bits in a megabit) to get Mbps.

So here’s an example to answer the users question. The latency was 145ms at the time.

524288 bits / 0.145 = 3615779 bits per second.

Let's turn that into Mbps: 3615779 / 1048576 = 3.4Mbps

Divide this by 8 to get megabytes per second: 3.4Mbps / 8 = 0.425 MBps

So take the file size: 150 megabytes / 0.425 MBps = 352 seconds, which is over 5 minutes.
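The same chain of arithmetic in code; carried at full precision rather than rounding to 3.4Mbps along the way, it comes out at about 348 seconds, in the same 5-to-6-minute ballpark:

```python
# The window-size / RTT throughput ceiling from the post, end to end.
window_bits = 64 * 1024 * 8   # 64 KB TCP window in bits = 524288
rtt_s = 0.145                 # 145 ms round trip time

bps = window_bits / rtt_s             # ~3.6 million bits per second
mbps = bps / 1048576                  # post's convention: 2**20 bits per "megabit"
mbytes_per_s = mbps / 8
seconds = 150 / mbytes_per_s          # time to move a 150 MB file
print(round(mbps, 1), round(seconds))  # 3.4 348
```

Doubling the window (or halving the RTT) doubles this ceiling, which is exactly the lever WAN accelerators and window scaling pull on.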

This calculation is based on optimal network conditions with zero packet loss and zero jitter.