Comparison of the Performance of Various MCA Ethernet Adapters


Introduction

During the heyday of Microchannel-based PS/2 systems, Token Ring was the network technology of choice. It was heavily pushed by IBM, and its performance was notably better than that of 10 Mbps Ethernet. As a result, almost every used PS/2 machine you get from a big decommissioned installation comes with a Token Ring adapter. This is fine if you have enough other equipment to build your own home network, but sooner or later you may need an Ethernet card for your PS/2. But which one is the best? Basically I would say: the best one is the one you can get hold of, since no new Microchannel hardware is made these days and Microchannel Ethernet boards may be a rare item where you live.

Of course, there are differences in performance between the available boards, but I am not aware of anyone having quantified them so far. This small article is intended to fill that gap. Your mileage may vary significantly: the performance of an Ethernet board depends not only on the board's hardware, but also on the host CPU's performance and the quality of the driver implementation. The netio tool I used for the performance evaluation is, however, available for various platforms, so you might want to make your own comparisons.

Test Setup

As I mentioned, I use netio to measure the performance of an Ethernet card. Netio's way of operation is simple: you start it on one machine in server mode, and on another machine with the server's IP address as argument. The client instance then opens a TCP connection to the server and sends packets of various sizes at the maximum possible speed; the achieved rate is printed after each run. The results netio gets are very close to the theoretical maximum of the network link, since no disk accesses are involved in the tests, as opposed to e.g. FTP transfers, which are also very popular for performance tests. During a single run, the netio client transmits about 32 MBytes to the server and in turn receives about one MByte (mostly TCP ACKs). Netio is available for download, so you can make your own measurements.
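To make the principle concrete, here is a minimal Python sketch of such a memory-to-memory TCP throughput test. It is not netio's actual code: the function names (`run_sink`, `measure_throughput`, `demo`), the 4 MByte transfer size and the use of the loopback interface are assumptions chosen only so the example runs on a single machine.

```python
import socket
import threading
import time


def run_sink(listener, result):
    """Server side: accept one connection and count bytes until EOF."""
    conn, _ = listener.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["received"] = total


def measure_throughput(host, port, packet_size, total_bytes):
    """Client side: send total_bytes in writes of packet_size bytes.

    Returns (bytes sent, rate in KBytes/s). The clock stops once the
    last write has been handed to the kernel, so data still sitting in
    socket buffers makes the figure slightly optimistic.
    """
    payload = b"\x00" * packet_size
    sent = 0
    with socket.create_connection((host, port)) as sock:
        start = time.perf_counter()
        while sent < total_bytes:
            sock.sendall(payload)
            sent += packet_size
        elapsed = max(time.perf_counter() - start, 1e-9)
    return sent, sent / 1024 / elapsed


def demo(packet_size=4096, total_bytes=4 * 1024 * 1024):
    """Run sink and sender against each other on the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    sink = threading.Thread(target=run_sink, args=(listener, result))
    sink.start()
    sent, rate = measure_throughput("127.0.0.1", port, packet_size, total_bytes)
    sink.join()
    listener.close()
    return sent, result["received"], rate


if __name__ == "__main__":
    for size in (1024, 2048, 4096, 8192, 16384, 32768):
        sent, received, rate = demo(size)
        print(f"{size // 1024:2d}K packets: {rate:10.1f} KBytes/s")
```

On a real test the sink would run on the server machine and the sender on the machine with the card under test; the loopback figures only show that the loop itself works, not what any network card can do.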

The server I used for all runs is an AMD K6-3/400-based machine with 128 MBytes of RAM and a DEC Tulip-based PCI network card. Running against e.g. a DEC Alpha AXP/150 with an EISA Etherlink III, this machine easily reaches data rates of 1100..1200 KBytes/s on a 10 Mbps Ethernet, very close to the theoretical maximum. So when we vary the client side, we can be sure that the server is not limiting the performance.
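For orientation, the theoretical maximum can be estimated from the standard Ethernet framing overhead. The following Python sketch assumes full-size frames (1500-byte MTU), plain 20-byte IP and TCP headers, and ignores the ACK traffic flowing in the opposite direction; whether netio counts a KByte as 1000 or 1024 bytes is also an assumption, so both figures are printed.

```python
LINE_RATE_BITS = 10_000_000      # 10 Mbps Ethernet
MTU = 1500                       # largest IP packet in one Ethernet frame
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble + header + FCS + inter-frame gap
IP_HEADER = 20                   # IPv4 header without options
TCP_HEADER = 20                  # TCP header without options


def max_tcp_payload_rate(tcp_options=0):
    """Upper bound for TCP payload in bytes/s with full-size frames."""
    payload = MTU - IP_HEADER - TCP_HEADER - tcp_options
    wire_bytes = MTU + ETH_OVERHEAD  # bytes of wire time per frame
    return LINE_RATE_BITS / 8 * payload / wire_bytes


if __name__ == "__main__":
    # 12 bytes of options corresponds to TCP timestamps being enabled
    for label, options in (("no TCP options", 0), ("TCP timestamps", 12)):
        rate = max_tcp_payload_rate(options)
        print(f"{label}: {rate / 1000:.1f} KB/s (1000-byte KB), "
              f"{rate / 1024:.1f} KB/s (1024-byte KB)")
```

Under these assumptions the bound lands a little below 1200 KBytes/s, which fits the 1100..1200 KBytes/s range quoted above.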

For all tests, both the server and the clients run Linux with a 2.2 kernel. Since netio also runs on other platforms, you might want to make your own tests if Linux is not your favourite OS (bah ;-) )

When comparing Ethernet boards, the maximum possible data rate might not be the only interesting result. Except for some high-end server models, all Ethernet interface boards place the burden of transferring data between the card's buffers and main memory on the CPU. Depending on the card's design, this steals more or less of your CPU's time. I therefore made all measurements on three different machines: a PS/2 56SX (i386SX-20), a PS/2 77 (i486SX-33) and a Pentium 90 (P90) system.

The cards I compared (i.e. the ones I have available!) are those listed in the result tables below.

All netio runs were done 10 times and averaged.

Measurement Results

OK, let's go for it. First the results in the PS/2 56SX:

Board                  1K (KB/s)   2K (KB/s)   4K (KB/s)   8K (KB/s)  16K (KB/s)  32K (KB/s)CPU Load (%)
3C523 (old)                415.3       546.1       585.5       589.2       600.6       606.7        98.4
3C523                      407.4       568.1       620.5       630.1       638.8       655.9        98.2
3C527                      429.0       557.8       597.5       608.0       608.7       627.0        98.8
3C529                      446.6       622.7       667.3       667.5       673.3       685.5        98.3
NE/2                       365.8       492.4       534.6       541.2       542.2       563.6        98.1
DE-320                     413.3       576.0       635.0       637.8       643.5       657.7        98.4
SMC 8013                   414.7       377.1       581.5       589.9       586.3       585.1        98.0
Ethernet Adapter/A         399.2       431.8       582.3       588.4       587.4       614.3        98.5
SKnet                      365.3       469.8       481.8       493.5       515.6       513.3        98.8
LAN Adapter/A              424.7       555.2       601.6       604.0       615.3       637.5        98.8
EtherExpress               417.4       542.1       597.5       599.5       597.8       627.7        98.5
DE-210                     411.1       518.9       562.2       559.2       544.7       573.9        98.4

As one can see from the CPU load results, the slow CPU is the limiting factor in all cases, and the values are far away from the theoretical maximum throughput. Older (read: ISA-derived) designs like the SKnet and NE/2 perform significantly worse than the others, an indication that transferring data from/to these cards incurs more wait states than on designs that exploit the higher speed of the MCA bus compared to ISA. The winner in this scenario is the 3C529, closely followed by the 3C523, 3C527, DE-320 and LAN/A. Though not dramatically slower, the old, 386-only 3C523 measurably lags behind the newer revision. The performance of the EtherExpress is similar to that of the old 3C523. One reason might be that both use the same Ethernet controller chip, an Intel 82586; the Intel card, however, uses programmed I/O instead of shared memory for data transfer - an example showing that PIO need not be slower than direct access to the card's packet buffer memory. Furthermore, there is no noticeable difference between the 8013 and the 8003 - if there are differences between an 8-bit and a 16-bit card, they don't show up on this platform...

Given that the i386SX-20 was the limiting factor in the first setup, we may expect substantially better results from an i486SX-33, and the tests on the 77 confirm this:

Board                  1K (KB/s)   2K (KB/s)   4K (KB/s)   8K (KB/s)  16K (KB/s)  32K (KB/s)CPU Load (%)
3C523                     1040.2      1082.9      1081.2      1081.5      1081.6      1079.1        44.6
3C527                     1032.0      1074.0      1073.7      1069.8      1075.5      1075.1        49.1
3C529                      899.9       950.4       950.0       947.6       945.3       945.5        28.1
NE/2                       722.3       864.0       885.3       896.9       904.7       908.8        99.1
DE-320                    1043.2      1087.3      1083.4      1082.6      1085.6      1087.9        44.7
SMC 8013                   882.0      1082.0      1079.6      1077.8      1074.3      1080.7        66.2
Ethernet Adapter/A         879.5      1081.5      1083.6      1081.0      1082.2      1082.1        77.8
SKnet                      778.7       797.8       796.3       801.0       794.5       805.7        41.1
LAN Adapter/A             1068.3      1088.0      1088.0      1088.0      1088.8      1086.5        24.5
EtherExpress               957.4      1088.3      1085.1      1088.0      1088.8      1087.9        53.0
DE-210                     957.4       923.9       942.7       945.7       939.0       943.5        98.9

Now we're starting to talk at last! The CPU load for the SKnet board shows that it has already reached its maximum performance of about 800 KBytes/s. The NE/2, in contrast, is capable of more, but it still manages to hog the CPU completely - not very nice in a server! The DE-320, however, demonstrates that this is not an inherent property of the NE2000 design: its performance and CPU load are similar to those of the 3C523. Like the SKnet board, the 3C529 has reached its limit. At 950 KBytes/s this is not bad, but other boards can do better. Interestingly enough, its predecessor, the 3C523, performs better (though with a substantially higher CPU load). The clear winner is the LAN/A: its data rate is at the top of the field, and its CPU load is even lower than that of the slower 3C529. The EtherExpress also performs well, but with a measurably higher CPU load than the 3C523 or LAN/A, plus lower performance for small packets.
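One way to put throughput and CPU load into a single figure is to divide one by the other. The short Python sketch below does this for the 32K column of the model 77 table; the metric itself (KBytes/s delivered per percent of CPU load) is my own illustration, and it assumes that the single CPU load value per board applies to the 32K run as well.

```python
# (32K throughput in KB/s, CPU load in %) copied from the model 77 table
RESULTS_77 = {
    "3C523": (1079.1, 44.6),
    "3C527": (1075.1, 49.1),
    "3C529": (945.5, 28.1),
    "NE/2": (908.8, 99.1),
    "DE-320": (1087.9, 44.7),
    "SMC 8013": (1080.7, 66.2),
    "Ethernet Adapter/A": (1082.1, 77.8),
    "SKnet": (805.7, 41.1),
    "LAN Adapter/A": (1086.5, 24.5),
    "EtherExpress": (1087.9, 53.0),
    "DE-210": (943.5, 98.9),
}


def kb_per_cpu_percent(rate, load):
    """Throughput delivered per percent of CPU load (illustrative metric)."""
    return rate / load


def rank_by_efficiency(results):
    """Return (board, efficiency) pairs, most efficient board first."""
    scored = [(board, kb_per_cpu_percent(rate, load))
              for board, (rate, load) in results.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for board, score in rank_by_efficiency(RESULTS_77):
        print(f"{board:20s} {score:6.1f} KB/s per % CPU")
```

Such a ratio is only meaningful while the CPU is not saturated; for boards sitting near 99% load it mostly reflects the CPU, not the card.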

The WD8003 and 8013 do not differ significantly in their performance; the 8003's CPU load is higher, however - a sign that the cards have reached their maximum internal transfer rate, but that it takes the CPU more effort to stuff the data through the 8003's slower bus interface.

What does this somewhat surprising result tell us? Everybody's darling, the 3C529, is not the fastest card (at least under Linux); cards that use shared memory or a good PIO interface for data transfer are the faster choice. The 3C529 is still not a bad choice, however: the difference matters mainly to people who want to drive it to the peak.

Another result is that the busmastering 3C527 produces more CPU load than some non-busmastering boards! One explanation could be that controlling this board is more complex (not probable). Another reason might be that the driver does not exploit the board's busmastering capabilities, i.e. received frames are written to buffers kept by the driver, and the driver then copies the data into the operating system's buffers. This might sound awkward (and in fact it is, since it voids the advantages of busmaster operation), but it is sometimes unavoidable, either due to the way the kernel interface works or due to buffer alignment constraints... None of this should obscure the fact that the 3C527 delivers good performance and deserves the designation 'High Performance Adapter'.

So let's see if a P90 can crank even more out of the cards:

Board                  1K (KB/s)   2K (KB/s)   4K (KB/s)   8K (KB/s)  16K (KB/s)  32K (KB/s)CPU Load (%)
3C523                     1000.9      1026.7      1014.6      1013.1      1015.3      1016.1         5.1
3C527                     1030.6      1074.5      1074.4      1072.4      1076.7      1077.7         8.8
3C529                      929.0       959.7       959.3       956.8       959.1       959.0         5.1
NE/2                      1046.5      1106.5      1105.9      1098.3      1100.1      1102.3         9.3
DE-320                    1061.7      1109.3      1109.3      1106.0      1106.3      1108.2         6.6
SMC 8013                  1048.9      1093.7      1093.4      1087.4      1089.5      1086.7         6.0
Ethernet Adapter/A        1040.7      1107.5      1107.6      1104.6      1105.3      1105.5         5.6
SKnet                      793.6       804.2       808.9       803.5       818.9       799.9         5.2
LAN Adapter/A             1099.9      1106.7      1107.7      1105.1      1107.6      1108.5         4.0
EtherExpress              1070.3      1081.8      1081.1      1080.9      1081.8      1082.2         4.7
DE-210                     887.8       950.2       926.1       912.8       917.7       921.1        29.4

Only the NE/2 shows a significant gain and now delivers performance comparable to the LAN/A, yet with a substantially higher CPU load, while the 3C523 has finally reached its ceiling. The same is true for the EtherExpress, but its limit is a few KBytes/s higher. These values confirm the 3C529's limit of around 950 KBytes/s. Once again, the LAN/A shows the best performance paired with the lowest CPU overhead. Except for the NE/2 (and the DE-210!), overhead is not an issue for any card. Frustratingly, the 3C527, which should have the lowest CPU load, delivers one of the worst results - busmastering alone is no guarantee of a low CPU load; everything else (including the software implementation) has to cooperate! It remains to be seen whether IBM's EtherStreamer MC/32, another busmaster Ethernet adapter, could do better. I also have such an MC/32, but there is no Linux driver for it and IBM isn't very cooperative in releasing programming information :-(
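How much each board gains from the faster CPU can be read off by comparing the two tables column by column. The Python sketch below does this for the 32K column only; picking that column is an arbitrary choice for illustration, and the function names are my own.

```python
# 32K throughput in KB/s: (model 77 / i486SX-33, P90), copied from the tables
RATE_32K = {
    "3C523": (1079.1, 1016.1),
    "3C527": (1075.1, 1077.7),
    "3C529": (945.5, 959.0),
    "NE/2": (908.8, 1102.3),
    "DE-320": (1087.9, 1108.2),
    "SMC 8013": (1080.7, 1086.7),
    "Ethernet Adapter/A": (1082.1, 1105.5),
    "SKnet": (805.7, 799.9),
    "LAN Adapter/A": (1086.5, 1108.5),
    "EtherExpress": (1087.9, 1082.2),
    "DE-210": (943.5, 921.1),
}


def percent_gain(before, after):
    """Relative change from 'before' to 'after' in percent."""
    return (after - before) / before * 100.0


def gains(table):
    """Map each board to its percentage gain from the 486 to the P90."""
    return {board: percent_gain(r486, rp90)
            for board, (r486, rp90) in table.items()}


if __name__ == "__main__":
    for board, gain in sorted(gains(RATE_32K).items(),
                              key=lambda item: item[1], reverse=True):
        print(f"{board:20s} {gain:+6.1f} %")
```

In this column the NE/2 is the only board whose rate changes by more than a few percent, which is what one would expect once the other cards are already limited by the wire or by their own hardware rather than by the CPU.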

Beyond 10 Mbits Per Second

Fast Ethernet boards for the Microchannel bus are extremely rare. In fact, the only board that is halfway widespread is the Olicom 2335. This board is based on the National Semiconductor DP83800 controller chip, a chip originally developed for the ISA bus. It therefore does not benefit from the MCA bus's higher transfer rates - the speed difference is 'adapted' via wait states. Additionally, the board does not allow busmaster operation, something rather disturbing for an interface that potentially has to transfer more than 10 MBytes per second:

Packet Size (KByte)   Rate (KB/s)
1                          2665.2
2                          3013.8
4                          3032.7
8                          2960.9
16                         2879.5
32                         2937.4

The 'counterpart' in this case is a VIA Rhine-based PCI board, again in a 400 MHz AMD K6 machine. Of course the rates are higher than for any 10 MBit board, but they are still somewhat disappointing. Judging from the CPU load measurements with the 10 MBit boards, one can conclude that a Pentium 90 CPU should be capable of saturating a 100 MBit line. The poor results are entirely due to the DP83800's bus interface: an ISA-style interface maxes out at about 3 MBytes/s, and any additional CPU performance is wasted in wait states :-(
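To put these rates into perspective, they can be expressed as a fraction of the raw Fast Ethernet line rate of 12.5 million bytes per second. The Python sketch below does so; it ignores framing and protocol overhead, and treating a KByte as 1024 bytes is an assumption about netio's units.

```python
FAST_ETHERNET_BITS = 100_000_000  # raw line rate of Fast Ethernet in bits/s

# Olicom 2335 results from the table above: packet size (KByte) -> KB/s
OLICOM_2335 = {1: 2665.2, 2: 3013.8, 4: 3032.7, 8: 2960.9,
               16: 2879.5, 32: 2937.4}


def line_rate_fraction(kb_per_s, bytes_per_kb=1024):
    """Fraction of the raw 100 Mbit line rate used by a given data rate."""
    return kb_per_s * bytes_per_kb / (FAST_ETHERNET_BITS / 8)


if __name__ == "__main__":
    for size, rate in OLICOM_2335.items():
        print(f"{size:2d}K packets: {line_rate_fraction(rate):6.1%} of line rate")
```

Even the best run stays below a quarter of what the wire could carry.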

Conclusions

And the winner is...? From the numbers above, I would give the performance crown to the LAN/A: its performance is always among the best, and its CPU load is the lowest of all adapters tested. Believe me, I would say this even if I hadn't written the Linux LAN/A driver myself ;-)

Second place goes to the 3Com boards: while the 3C523 offers slightly better performance, the 3C529 has onboard 10baseT and a smaller form factor going for it (handy e.g. in the P70 portable PS/2!). I wouldn't make too much of the 3C529's lower performance: the rate is still more than you will ever get on a loaded 10 Mbps wire.

It's difficult to say whether the SKnet, the DE-210 or the NE/2 is the worst board: the SKnet's mere 800 KBytes/s is already a notable loss, but at least it doesn't put as much load on the CPU as the NE/2 or the DE-210. The NE/2 is like a lemon: if you squeeze it hard enough, you will get what you want ;-) I wouldn't advise using any of them in a server...

The 3C529 and, even more so, the DE-320, EtherExpress and 8013 are good cards where the length of the card is an issue, as in the P70/75 portables: they fit into the 'half length' slots and still offer good performance.


©2000 Alfred Arnold, alfred@ccac.rwth-aachen.de