Definitions of some terms that describe network performance.
Analogies to human transportation vehicles, like cars or buses, are sprinkled throughout.
Bandwidth is a measure of how many units of data can pass
between two machines on a network per unit time. Examples include
"megabits per second" and "gigabytes per hour". It is analogous to how rapidly cars can pass through an
intersection in a given
unit of time.
Latency is a measure of how long it takes for a single
packet to get from machine A to machine B. It is
somewhat analogous to how long it takes the first vehicle at a red
light to
accelerate back up to full speed from a full stop. The more your
protocol relies on "send a packet, wait for an acknowledgement
packet, repeat as needed", the more latency will slow things
down. NFS is a protocol that suffers badly from high latency (at least until it
becomes intent-based or similar). High network latency is
somewhat analogous to a street
light that changes color frequently, which means more time spent
waiting for vehicles to accelerate.
MTU is the largest size of a packet as it passes across a
single network link. It is somewhat analogous to
the maximum number of passengers a vehicle can carry: motorcycles, cars, buses,
planes...
Path MTU is the largest size of a packet as it passes
across all network links between two machines on a network. It is
somewhat analogous to the smallest vehicle capacity encountered
on someone's commute. For example, if you start out by taking a
van pool to a train station, and from the final train station take
a bus to your office, then the "Path MTU" might be the maximum
number of passengers possible in the van pool - because it would
typically have the smallest maximum number of passengers.
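To put that in code, here is a minimal sketch (the per-link MTU values are made up) showing that the Path MTU is simply the smallest MTU of any link along the route:

    # Hypothetical per-link MTUs along a route from machine A to machine B:
    # 9000 = jumbo-frame ethernet, 1500 = standard ethernet, 1492 = a PPPoE link.
    link_mtus = [9000, 1500, 1492, 9000]

    # The Path MTU is limited by the most restrictive link.
    path_mtu = min(link_mtus)
    print(path_mtu)  # 1492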
Jumbo frames are larger MTUs (and perhaps Path
MTUs as well), beyond what the traditional ethernet standard
indicates should be used. You might think of jumbo frames
as being kind of like a train or an ocean liner.
Block size is the size of the chunks of data as they are written by
an application and passed into an operating system's network stack.
The operating system is then free to chop up your blocks into smaller
pieces, to shrink them down to the Path MTU, or to aggregate your
packets into fewer, larger packets to bring them up to the Path MTU.
(Older systems may use the plain MTU instead of the Path MTU, and let
network equipment further along the path sort out MTU changes as
needed, but this is often going to be slower). Block size is a little bit like how
quickly people can board a vehicle. For example, you might have a
car that allows 2 or 4 people to get in at a time, or a bus that
only allows one person to get in at a time. So even though the
bus can transport more people at a time, people board it more
slowly. So if you have a super mag-lev train that goes ultra
fast, the trip might still be slow if only one person can board or
exit the train at a time. Some operating
systems will tend to emit only packets whose data portion is at
most the size of your block size - hence an
application that writes lots of tiny blocks may not be able to
make good use of larger (Path)
MTUs. Ironically, protocols that perform well with standard
ethernet frames may suddenly appear to be poor performers with
jumbo frames, if the application does not know how to write blocks
that are larger than the data portion of a standard
ethernet frame. For example, with standard ethernet frames,
rsh will sometimes outperform NFS. However, NFS is sometimes
better able to make use of jumbo frames, so it may be the faster of
the two on networks with jumbo frames.
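To make the block-size idea concrete, here is a hedged Python sketch (the host, port, and sizes are placeholders, not anything from the text above) contrasting many tiny writes with a few large ones; the kernel sees the same total data either way, but on some systems it can only build large packets out of large writes:

    import socket

    # Placeholder destination for the sketch; not a real service.
    HOST, PORT = "192.0.2.10", 5001
    payload = b"x" * (1 << 20)          # 1 MiB of test data

    def send_in_blocks(block_size):
        """Hand the same payload to the kernel in writes of the given size."""
        with socket.create_connection((HOST, PORT)) as s:
            for offset in range(0, len(payload), block_size):
                s.sendall(payload[offset:offset + block_size])

    send_in_blocks(64)     # many tiny writes: hard to fill large (Path) MTUs
    send_in_blocks(65536)  # large writes: the kernel can carve out MTU-sized packets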
The Nagle algorithm is used on TCP sockets to improve performance,
but it won't always be faster. It only pertains to small packets.
When Nagle is enabled, your small TCP packets won't be sent ASAP
after your application passes them into the kernel's TCP handling.
Instead, if the packet is small, the kernel will hold your data in a
buffer and wait a little while to see if more is
coming. After either enough data arrives in that buffer (filling
out a full MSS worth of data) or the previously sent data is
acknowledged, your data will be sent to its destination. You can think of this as being a bit like a stop light that
has a sensor to tell it how many cars are waiting, and is optimized to
wait a little bit longer before going green if there aren't that many cars
waiting for it from that direction.
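As a small, hedged example (the address is a placeholder), this is how an application that cares more about per-packet latency than about coalescing small writes can turn the Nagle algorithm off:

    import socket

    # Nagle is on by default for TCP sockets; TCP_NODELAY disables it, so
    # small writes go out immediately instead of being buffered briefly.
    s = socket.create_connection(("192.0.2.10", 5001))   # placeholder address
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    s.sendall(b"small, latency-sensitive message")
    s.close()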
Assessing network performance
Reliability
Everyone's first network analysis tool: ping. It comes with just about every network
operating system conceived.
traceroute is also fairly common, and can often show the first
hop that is having problems between two machines.
pchar (for more on pchar, see the section on pchar under
"Bandwidth", below)
mtr and xmtr are very nice for determining which link in a
multi-router connection is dropping packets, or is unpingable, but
I believe that pchar does this, plus much more. However, mtr and
xmtr are more likely to come with a linux distribution.
Bandwidth measurement
A tool that reports bandwidth, hop by hop, can be very useful, but
these tools are a bit heuristic, so don't trust them overmuch.
However, when such tools are working as intended, they can give you the
bandwidth of each hop between two machines, across (potentially)
multiple routers (IE, if there are n machines in the path - 2
endpoints and n-2 routers - then such a tool should give you
information about all n-1 links, as well as summary information).
pathchar, the original, by Van Jacobson. Please mention it if you
determine otherwise, but it appears that pathchar used to be available
in source form; now you can only get precompiled binaries for a few
platforms.
pchar is an easy build on linux and solaris, and
perhaps other *ixes as well. So far, it's worked well on most of
the networks I've tried it on, but not the optiputer.
You can find patches for pchar to get it building on a modern (or archaic) linux
here.
From the author of pchar:
If you Google for papers and/or software by Constantinos Dovrolis or
Allen Downey, I believe they both have some alternatives, as well as
a more coherent explanation of what problems you're probably seeing.
What I dug up in response so far: This
is an excellent link about a handful of pchar-like programs.
pipechar - this one seems to work well on our "optiputer" network,
but it also seems to be binary-only.
clink - Source is available, but it seems very linux-specific, at least in its current
incarnation
Another excellent
URL. "bing" looked especially interesting to my eye.
iperf is very nice, but tends to lean toward the "theoretical
result" side - IE, it'll often report numbers quite a lot faster than
real-world performance. To use it, build an iperf binary on each of
two machines, then run "iperf -s" on machine A, and "iperf -c A" on the
other. You can run it multiple times, or increase the run
length, to get more stable numbers.
ntop can give a breakdown of what sorts of traffic are on the
LAN adjacent to the machine that ntop is running on.
My "reblock" program can measure
real-world performance pretty well, especially if you give
it a large block size:
My "pnetcat" program can be used for testing
network performance with varied blocksizes and window sizes, via TCP or
UDP. Because it's in python, it's extremely easy to tweak for exactly the
sort of testing your situation requires.
Latency
ping, mtr and pchar can all measure latency in some
sense, but how much latency actually matters is
very protocol-specific (a rough model of why is sketched after the examples below).
Protocols that are not very subject to latency slowdown:
Imagine a protocol that is able to stream large packets of
data and expects confirmation packets only when there is an
error.
Although not an ethernet or IP protocol, zmodem (which is
for serial transmission) is a good example of a
protocol that is not slowed down much by high latencies. It
has very little "back and forth" unless there are errors.
VNC is another example of a protocol
that is not very subject to slowdown due to high latency - at
least, not relative to raw X11 on a high latency network. This
is because VNC pre-renders the highly-back-and-forth X11
traffic down to a simple, in-memory bitmap, and then only
passes the
portions of that bitmap that have changed in a series of
bitmap-rectangles via the network. How much VNC helps by
reducing the impact of latency can be somewhat toolkit-specific.
For example, VNC seems to
pay off for Motif applications (like old versions of
Netscape), while simple GTK+ applications appear to be less
subject to latency issues.
Protocols that are slowed down by latency a lot:
Imagine a protocol that sends a huge number of tiny packets and
waits until receiving an acknowledgement packet before sending
the next tiny packet.
NFS with a small rsize and wsize is a good example of a
protocol that will be slowed down greatly by high latencies.
As mentioned above, raw X11 can be very slow when used over
a high latency network link, compared to VNC.
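Here is the rough model referred to above - a hedged, back-of-the-envelope sketch (all of the numbers are invented) of why acknowledging every small chunk multiplies the cost of latency:

    # Total transfer time is roughly serialization time plus one round trip
    # per request/acknowledgement exchange.  A toy model, not a benchmark.
    def transfer_time(total_bytes, bandwidth_bytes_per_s, rtt_s, bytes_per_round_trip):
        round_trips = total_bytes / bytes_per_round_trip
        return total_bytes / bandwidth_bytes_per_s + round_trips * rtt_s

    # 100 MB over a 100 Mbit/s link (12.5 MB/s) with a 50 ms round-trip time:
    chatty   = transfer_time(100e6, 12.5e6, 0.05, 8192)  # wait for an ack every 8 KB
    streamed = transfer_time(100e6, 12.5e6, 0.05, 1e6)   # wait for an ack every 1 MB
    print(round(chatty), "vs", round(streamed))          # roughly 618 vs 13 seconds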
Finding a network link's path MTU
tracepath is a very convenient way of determining the "Path MTU"
between two machines.
A more effective way of finding your Path MTU than tracepath is
to select some protocol that you want to see using your desired
MTU size, send a lot of data across that protocol, and analyze the
traffic with ethereal, tethereal, tcpdump, snoop, or whatever you
prefer. You can expect some of the packets to be well below your
desired Path MTU, but if some of the packets are of the desired
size, then you're at least theoretically able to get the Path MTU
you want. Beyond that, there may still be application issues.
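As a hedged, Linux-specific sketch (the numeric constants come from <linux/in.h>, since the Python socket module does not always expose them by name, and the address is a placeholder), you can also ask the kernel directly for its current Path MTU estimate on a connected socket:

    import socket

    IP_MTU_DISCOVER = 10   # setsockopt: control per-socket path MTU discovery
    IP_PMTUDISC_DO  = 2    # always set the Don't Fragment bit
    IP_MTU          = 14   # getsockopt: read the kernel's current path MTU

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    s.connect(("192.0.2.10", 9))      # "connecting" a UDP socket just picks a route
    try:
        s.send(b"x" * 9000)           # an oversized datagram fails locally with DF set
    except OSError:
        pass
    print(s.getsockopt(socket.IPPROTO_IP, IP_MTU))   # kernel's current Path MTU estimate
    s.close()

Keep in mind that the value reported is only the kernel's current estimate; it may start out as the first-hop MTU and shrink later as ICMP "fragmentation needed" messages come back from routers along the path.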
Improving network performance
First assess where the network bottlenecks are (see above, perhaps
especially under "pchar")
Then measure how well your required application is performing,
possibly by having it write data to localhost and measuring
throughput (see
above, perhaps especially under "ntop", but "reblock" may be
helpful in some cases as well, particularly if you're working with
pipes)
Is it the application that's slow, or the network?
If pchar (or another measure) shows that much better throughput is
possible than your application is getting, then you likely need to
tune that application
You may be able to improve application performance by
just convincing it to use larger block sizes
Another possibility, particularly if your network
performance is suffering from high latency, might be to
modify the application to use "intent-based" data transfers.
"Intent-based" transfers are basically a series of "do
something; if it works, do the next something", where you
bundle up a bunch of these conditional somethings into a
single, large block. You may not end up actually
performing all of the somethings, which means some data is
transferred needlessly in a sense, but the reduction in round
trips can still be a net performance gain (see the sketch after this list).
Perhaps especially if you have old equipment involved that
doesn't understand Path MTU discovery, you may find that making
your application write data in integral multiples of the
data portion of your Path MTU will speed up performance quite a
bit - but some would call this "outdated thinking". (For example,
with a 1500-byte Path MTU and typical 20-byte IP and 20-byte TCP
headers, the data portion of a TCP segment is 1460 bytes.) Anyway,
if you want to do this, a sniffer like ethereal is probably the
best way to figure out what block size to use.
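Here is the sketch referred to above - a purely illustrative, hypothetical example (the "server" object and its methods are not a real API) of collapsing a chatty sequence of round trips into one intent-based bundle:

    # Chatty version: one round trip (and one dose of latency) per step.
    def copy_file_chatty(server, path, data):
        if server.exists(path):          # round trip 1
            server.remove(path)          # round trip 2
        handle = server.create(path)     # round trip 3
        server.write(handle, data)       # round trip 4
        server.close(handle)             # round trip 5

    # Intent-based version: bundle the conditional steps into one large request;
    # the server performs each step only if the previous one succeeded.
    def copy_file_intent(server, path, data):
        server.submit([
            ("remove_if_exists", path),
            ("create", path),
            ("write", path, data),
            ("close", path),
        ])                               # one round trip for the whole bundle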
If pchar (or another measure) is showing much worse throughput than
the application you require gets on the first hop (or
better, than it gets using "localhost" as the
destination), then you probably have a network performance issue
Examine each network hop, and see if you can improve the
slowest hop. You can reapply this procedure iteratively, until
the network isn't slower than the application, or the budget
dictates that further improvements aren't practical.
If you are on a gigabit network (or better), you may be
able to squeeze out better performance by turning on Jumbo
frames. This is sometimes a no-extra-budget performance
boost, especially if your application is already using large
blocks. However, it may turn out that one or more machines
in your network path do not support jumbo frames, in which
case you would need to upgrade those pieces of equipment.
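As a hedged, Linux-specific sketch, you can at least check what MTU each local interface is currently configured for; actually raising it (for example with "ip link set dev eth0 mtu 9000") requires root and support from every NIC and switch in the path:

    import glob, os

    # Read the configured MTU of each local interface from sysfs (Linux only).
    # 1500 is standard ethernet; roughly 9000 is a common jumbo-frame setting.
    for path in sorted(glob.glob("/sys/class/net/*/mtu")):
        iface = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            print(iface, f.read().strip())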
Some nice links from prg on comp.os.linux.networking: