IPv4 MTU issues can be hard to spot initially, there is a solution and its called Path MTU Discovery (RFC1191). The RFC describes it as the following “a technique for using the Don’t Fragment (DF) bit in the IP header to dynamically discover the PMTU of a path”
Further to that the RFC states “The basic idea is that a source host initially assumes that the PMTU of a path is the (known) MTU of its first hop, and sends all datagrams on that path with the DF bit set. If any of the datagrams are too large to be forwarded without fragmentation by some router along the path, that router will discard them and return ICMP Destination Unreachable messages with a code meaning “fragmentation needed and DF set” (Type 3, code 4)
The unfortunate issue is that the message that’s sent back doesn’t actually say what the MTU is.
A colleague of mines who is a Windows 7 expert, has reliably informed me that by default Windows 7 has PMTUD enabled.
The important point to focus on is the ICMP unreachable (Type 3, code 4). To put this quite simply, if you don’t receive an ICMP message back with the code for fragmentation needed then, your PC will assume that the MTU is fine and continue to send the packets even though somewhere in the path the packets are potentially being dropped.
There can be a number of reasons for this, including firewalls blocking the message, ICMP unreachable disabled on an interface, a transparent host between 2 endpoints (Often done in service provider networks) that has a lower MTU value.
I recently ran into an issue where IP connectivity between 2 sites looked to be fine, ping, traceroute and SSH were all working, but certain applications and protocols were not, most notably HTTPS.
Below I will explain how to spot this issue.
Take a look at the diagram below, i have deliberately used a transparent device as its most likely what you might see in a L3VPN (MPLS) network. The last mile provider provides a layer 2 path (perhaps a L2TPv3) from CE to PE and the underlying hops are hidden from us. From the service provider perspective the routers are directly connected.
This is perhaps where an MTU issue has occurred. For this scenario I have reduced it quite significantly for effect.
Lets say for example you have a perfectly functioning network where MTU is fine along the path. Initially you can send a ping with 1460bytes and you will get a reply. Lets increase this to something we know is to big (1550bytes). This works great in a perfectly functioning network where you receive an ICMP type 3, you will get the “packet needs to be fragmented but DF set” message.
Now lets try that through our network where the MTU is set lower but the sending device doesn’t know about it.
At first you think its OK because you can ping along the path and get a reply, you try SSH and it works too. Now lets try to ping with different MTU sizes. Remember your PC doesn’t receive the ICMP message this time, so what happens is you get a “request timed out” message.
The reason for that is the packet is being dropped and the ICMP message isn’t being returned. If I ping with an MTU that is lower than the 1000 i get a reply.
Now the question, why would HTTPS not work? well in some cases web applications or your client might set the Do Not Fragement bit in the IP header SYN request. This means the packet should not be fragmented, so when we send this on our network with the bad MTU in the path, the packet is dropped and the sending device never receives the ICMP message. It never knows that it has to reduce the MTU value. The packet capture below shows where the DF bit is set.
I had a look through the RFC2246 for TLS1.0 and it doesn’t specify that the DF bit should be set. It’s most likely a vendor or O/S specific setting, so your observed results may differ from vendor to vendor.
RH