Jephe Wu - http://linuxtechres.blogspot.com
Objective: use free, open source tools to troubleshoot slowness over an Internet VPN and pinpoint the router where packets are being lost.
Environment: OpenBSD and FreeBSD as VPN firewalls. When accessing servers such as web servers and Oracle database servers through the Internet VPN, we experienced very slow connections.
Steps:
The following is the VPN network diagram:
10.0.5.x__10.0.5.1||1.2.3.4++++++++++++++++5.6.7.8||10.0.6.1__10.0.6.X
1. Check latency and packet loss from host 1.2.3.4 to 5.6.7.8
a. Check whether you can ping from 1.2.3.4 to 5.6.7.8. If you can, note the latency and the packet loss rate.
The latency reported by ping may not be accurate: routers often rate-limit or deprioritize ICMP and give priority to TCP traffic, so a high ping latency may reflect how ICMP is handled rather than the path itself.
b. If ping is blocked, use tcptraceroute or traceroute -T (on CentOS 5), or tracetcp on Windows (http://tracetcp.sourceforge.net/). The latency measured this way is more accurate because the probes are real TCP packets from sender to receiver, although the replies from intermediate hops are still ICMP TTL exceeded messages.
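For the ping check in step a, the two numbers that matter are in the summary ping prints at the end. The following is a minimal Python sketch, not part of the original procedure, that pulls the loss rate and average latency out of that summary. The function name parse_ping_summary and the sample text are my own assumptions; the regular expressions target the usual Linux (iputils) and BSD summary wording and may need adjusting for other ping versions.

```python
import re

# Matches both common summary styles, for example:
#   "20 packets transmitted, 18 received, 10% packet loss, time 19027ms"   (Linux)
#   "20 packets transmitted, 18 packets received, 10.0% packet loss"       (BSD)
LOSS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)
# "rtt min/avg/max/mdev = ..." (Linux) or "round-trip min/avg/max/stddev = ..." (BSD)
RTT_RE = re.compile(
    r"(?:rtt|round-trip) min/avg/max/\w+ = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(text):
    """Return sent, received, loss percentage and average RTT (ms) if present."""
    result = {}
    loss = LOSS_RE.search(text)
    if loss:
        result["sent"] = int(loss.group(1))
        result["received"] = int(loss.group(2))
        result["loss_pct"] = float(loss.group(3))
    rtt = RTT_RE.search(text)
    if rtt:
        result["avg_ms"] = float(rtt.group(2))
    return result


# Hypothetical sample output, made up for illustration only.
SAMPLE = """\
--- 5.6.7.8 ping statistics ---
20 packets transmitted, 18 received, 10% packet loss, time 19027ms
rtt min/avg/max/mdev = 41.210/45.532/60.104/4.870 ms
"""

if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

Feeding it the saved output of a longer ping run (a few hundred packets) gives a more stable loss figure than a handful of probes.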
How do tcptraceroute and tracetcp work?
When you issue a command like 'tracetcp www.redhat.com:443', Wireshark captures the traffic below. For each hop, the tool sends 3 TCP SYN packets, with the TTL starting at 1 and increasing by one per hop. When a probe reaches the final destination and gets a SYN/ACK reply from the destination host, the tool immediately tears down the connection.
13 3.732592000 192.168.100.20 184.85.48.112 TCP 20527 > https [SYN] Seq=0 Win=16383 Len=0 (time to live is 1 in ip header)
14 3.734182000 192.168.100.1 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
15 4.232399000 192.168.100.20 184.85.48.112 TCP 24043 > https [SYN] Seq=0 Win=16383 Len=0 (time to live is 1 in ip header)
16 4.241227000 192.168.100.1 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
17 4.732323000 192.168.100.20 184.85.48.112 TCP 13233 > https [SYN] Seq=0 Win=16383 Len=0 (time to live is 1 in ip header)
18 4.735323000 192.168.100.1 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
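To show how the per-hop latency falls out of a capture like this, here is a small Python sketch that pairs each outgoing SYN with the ICMP reply that follows it and prints the round trip time. It is only an illustration: the function name probe_rtts_ms is hypothetical, and the parsing assumes the whitespace-separated packet list layout shown above (packet number, time, source, destination, protocol, info).

```python
# The six capture lines shown above, copied as text.
CAPTURE = """\
13 3.732592000 192.168.100.20 184.85.48.112 TCP 20527 > https [SYN] Seq=0 Win=16383 Len=0
14 3.734182000 192.168.100.1 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
15 4.232399000 192.168.100.20 184.85.48.112 TCP 24043 > https [SYN] Seq=0 Win=16383 Len=0
16 4.241227000 192.168.100.1 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
17 4.732323000 192.168.100.20 184.85.48.112 TCP 13233 > https [SYN] Seq=0 Win=16383 Len=0
18 4.735323000 192.168.100.1 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
"""


def probe_rtts_ms(capture):
    """Pair each SYN probe with the next ICMP reply; return RTTs in milliseconds."""
    rtts = []
    syn_time = None
    for line in capture.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        timestamp, protocol = float(fields[1]), fields[4]
        if protocol == "TCP" and "[SYN]" in line:
            syn_time = timestamp  # remember when the probe left
        elif protocol == "ICMP" and syn_time is not None:
            rtts.append(round((timestamp - syn_time) * 1000, 3))
            syn_time = None  # each probe is matched at most once
    return rtts


if __name__ == "__main__":
    for rtt in probe_rtts_ms(CAPTURE):
        print(f"{rtt} ms")
```

For the first hop (192.168.100.1) this gives roughly 1.6 ms, 8.8 ms and 3.0 ms for the three probes, which is the same kind of figure tracetcp prints per hop.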
2. Use mtr or pathping to check the packet loss rate
You can install mtr (http://en.wikipedia.org/wiki/MTR_%28software%29) on Linux/FreeBSD/OpenBSD to check the packet loss rate along the traced path. On Windows, winmtr and pathping provide similar functionality.
MTR relies on ICMP Time Exceeded (type 11) packets coming back from routers, or ICMP Echo Reply packets when the packets have hit their destination host.
2082 191.558234 192.168.100.20 184.85.48.112 ICMP Echo (ping) request
2083 191.590458 203.117.34.14 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
2084 191.673071 192.168.100.20 184.85.48.112 ICMP Echo (ping) request
2085 191.804462 192.168.100.20 184.85.48.112 ICMP Echo (ping) request
2086 191.874170 198.32.176.127 192.168.100.20 ICMP Time-to-live exceeded (Time to live exceeded in transit)
........
2090 191.804462 192.168.100.20 184.85.48.112 ICMP Echo (ping) request
2091 174.139518 184.85.48.112 192.168.100.20 ICMP Echo (ping) reply
Note: mtr sends ICMP echo requests toward the destination host with the TTL incremented from 1, and uses the replies from each hop to calculate the round trip time and packet loss rate.
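The bookkeeping behind an mtr report can be sketched in a few lines of Python. This is not mtr's actual code, and it does not send any packets; it only shows how per-hop loss and average round trip time follow from a list of probe results. The function name summarize_probes, the tuple layout and the sample addresses (taken from the documentation ranges) are all assumptions made for illustration.

```python
from collections import defaultdict


def summarize_probes(probes):
    """Aggregate probe results per hop.

    probes: iterable of (ttl, responder_ip, rtt_ms); responder_ip and rtt_ms
    are None when no reply came back for that probe.
    Returns a list of (ttl, responder_ip, loss_pct, avg_rtt_ms) in hop order.
    """
    sent = defaultdict(int)
    rtts = defaultdict(list)
    hosts = {}
    for ttl, responder, rtt_ms in probes:
        sent[ttl] += 1
        if rtt_ms is not None:
            rtts[ttl].append(rtt_ms)
            hosts[ttl] = responder
    report = []
    for ttl in sorted(sent):
        received = len(rtts[ttl])
        loss_pct = 100.0 * (sent[ttl] - received) / sent[ttl]
        avg_ms = sum(rtts[ttl]) / received if received else None
        report.append((ttl, hosts.get(ttl, "???"), loss_pct, avg_ms))
    return report


# Hypothetical probe results: hop 2 answered only one of two probes.
PROBES = [
    (1, "192.0.2.1", 1.0), (1, "192.0.2.1", 3.0),
    (2, "198.51.100.1", 10.0), (2, None, None),
    (3, "203.0.113.9", 20.0), (3, "203.0.113.9", 22.0),
]

if __name__ == "__main__":
    for ttl, host, loss_pct, avg_ms in summarize_probes(PROBES):
        print(ttl, host, f"{loss_pct:.1f}%", avg_ms)
```

A hop that never answers shows up with 100% loss and no average, which corresponds to the "???" rows in an mtr report.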
3. The importance of having no/low packet loss, and how to read the mtr report for packet loss
a. Packet loss kills throughput.
b. A slower connection with zero packet loss can easily outperform a faster connection with some packet loss.
c. Packet loss on the last hop, the destination, is what matters most. Loss can also occur on the return path, which may be totally different from the outgoing path.
d. Sometimes routers in between do not send ICMP "TTL expired in transit" messages. You will then see 3 asterisks for that hop, which is normal.
e. Some routers specifically block (or down-prioritize) ICMP echo requests, or do the same for packets whose TTL has reached 0. These routers (or the final destination) might show 100% packet loss.
f. A router may also be programmed to limit the number of responses it sends to ICMP packets in an effort to mitigate DoS attacks.
g. Just because you see a hop with high loss doesn't mean it's slowing down "real" traffic; it may only be throwing away ICMP.
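Points c to g boil down to one rule of thumb: loss at an intermediate hop only matters if it carries on through the later hops to the destination. A rough Python sketch of that rule follows. The function name interpret_loss, the 1% threshold and the wording of the verdicts are my own assumptions, and the result is a hint for where to look, not a diagnosis.

```python
def interpret_loss(loss_pct_by_hop, threshold=1.0):
    """Flag hops whose loss exceeds the threshold and suggest what it may mean.

    loss_pct_by_hop: loss percentages in path order; the last entry is the
    destination. Returns a list of (hop_number, verdict), hop numbers from 1.
    """
    verdicts = []
    for index, loss in enumerate(loss_pct_by_hop):
        if loss <= threshold:
            continue
        later = loss_pct_by_hop[index + 1:]
        if not later:
            verdict = "loss at destination: likely affects real traffic"
        elif all(x > threshold for x in later):
            verdict = "loss persists to destination: likely real loss from here on"
        else:
            verdict = "loss does not persist: likely ICMP rate limiting or filtering"
        verdicts.append((index + 1, verdict))
    return verdicts


if __name__ == "__main__":
    # Hypothetical reports: hop 2 drops ICMP only; hops 3 and 4 show real loss.
    print(interpret_loss([0.0, 60.0, 0.0, 0.0]))
    print(interpret_loss([0.0, 0.0, 10.0, 12.0]))
```

Keep in mind that a destination which blocks ICMP will also show 100% loss on the last hop, so it is worth confirming with a TCP-based trace before blaming the path.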
References:
a. http://help.rr.com/hmsfaqs/e_packetloss.aspx
b. http://library.linode.com/linux-tools/mtr/ (how to read an mtr report)
4. How to configure the VPN firewall for ICMP traffic with the OpenBSD or FreeBSD packet filter (pf)
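# allow and log all outgoing traffic
pass out log quick
# let ICMP replies and errors (echo reply, time exceeded, unreachable) come back in
pass in log quick on $ext inet proto icmp all icmp-type { echorep, timex, unreach }
# allow UDP from the remote VPN peer
pass in log quick on $ext inet proto udp from 1.2.3.4 to $ext keep state
# allow ping (echo request) from the remote VPN peer only
pass in log quick on $ext inet proto icmp from 1.2.3.4 to $ext icmp-type echoreq keep state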
References:
ICMP filtering on the firewall -
http://www.richweb.com/icmp_filter
How to troubleshoot packet loss and latency for Internet VPN