Extending the Cache of the PS/2 Cached SCSI Controller


Preface

SCSI has been the disk interface of choice for most of IBM's Microchannel PS/2 machines; the exceptions are the first generation, which used ST-506 or ESDI disks, and some late desktop models with IDE. Since the Microchannel was not successful in the mass market, IBM quickly targeted these machines at the high-end and server market. This, and the fact that IBM has always put an emphasis on 'balanced systems' whose I/O performance grows with the CPU's power, drove the development of SCSI controllers with separate cache memory.

Spock

One common example is the 'SCSI Controller with Cache', also known by its code name 'Spock'. It is a 32-bit busmaster with 512 Kbytes of local cache RAM, organized as two 30-pin SIMMs of 256Kx9. The RAM is controlled by a local 80188 or 80186 microprocessor. A little-known fact is that the cache can be extended to 2 Mbytes by replacing the SIMMs with 1M types. So you simply take two SIMMs out of your junkbox, put them in, and...typically get a glorious error message and nothing works.

IBM SIMMs

Of course it would have been too simple for IBM to take 'industry standard' SIMMs. They used SIMMs with a different parity pinout and additional presence detect pins. The latter allow the computer to detect a SIMM's size automatically, without having to probe all possible sizes. In detail, there are the following differences between the standard and IBM pinout:

Pin   Industry Standard   IBM
2     Data CAS            Data & Parity CAS
24    Not Connected       Presence Detect
26    Parity Data Out     Presence Detect
28    Parity CAS          Not Connected
29    Parity Data In      Parity Data In/Out

The two presence detect pins allow the following combinations:

Pin 24     Pin 26     Capacity
open       open       no module inserted
open       grounded   512Kx9
grounded   open       256Kx9
grounded   grounded   1Mx9
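
The table above is effectively a two-bit lookup. As a minimal sketch of how a host could interpret it (the function name and the True/False encoding of "grounded" are my own illustration, not anything taken from IBM's firmware):

```python
# Hypothetical helper: decode the two IBM presence detect pins.
# True means the pin is grounded on the module, False means it is left open.
PRESENCE_DETECT = {
    (False, False): None,       # no module inserted
    (False, True): "512Kx9",
    (True, False): "256Kx9",
    (True, True): "1Mx9",
}

def decode_presence(pin24_grounded, pin26_grounded):
    """Return the module organisation for a pin 24 / pin 26 combination."""
    return PRESENCE_DETECT[(pin24_grounded, pin26_grounded)]

if __name__ == "__main__":
    # A converted 1M module must ground both pins to be recognised as 1Mx9.
    print(decode_presence(True, True))
```

This also shows why an unmodified industry-standard SIMM is not recognised correctly: its pin 24 is not connected and its pin 26 carries parity data instead of a fixed level.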

These modules have not only been used for the cached SCSI controller, but also as main memory in some early PS/2 machines, namely the ISA-based Model 30-286 and the MCA-based Models 50/60 (not the newer 50Z, which already uses 72-pin modules). Take care, however, to use the correct type of module for this purpose; see below!

Warm Up Your Soldering Iron!

So to convert an industry-standard SIMM to an IBM-style SIMM, the following has to be done:

In my experience, these modifications are simpler if you use 1M SIMMs built with 3 chips instead of 9; the three chips occupy less space on the module's PCB, and it is easier to get access to the connections. However, if you are going to modify modules to extend a Model 30 or 50, you will have to use 9-chip modules; the memory controllers of these old systems do not provide enough refresh address lines for the higher-capacity 1Mx4 chips on a 3-chip module, and you will experience strange memory errors with them. Not all 9-chip modules can be modified; the high chip density leads to a complex layout, and some of the traces that have to be cut might be located in inner layers.

Shown below are two SIMMs; one unmodified and a SIMM that has been modified according to the instructions above.

Unmodified and modified SIMM

These modules are typical for 3-chip industry-standard SIMMs. They consist of two 1Mx4 chips for the data (the 514400 chips) and an extra 1Mx1 chip (the 531000 in this case) for the extra parity bit. This in sum makes the usual 1Mx9 people are talking about...the names of the individual chips may vary wildly from module to module since the DRAM manufacturers are quite creative at this :-)
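
To make the '1Mx9' arithmetic explicit, here is a small sketch that adds up the chip widths. The 3-chip layout is the one described above; the 9-chip layout (nine 1Mx1 chips) is my assumption of the usual arrangement, and all names are purely illustrative:

```python
# Each chip is described as (words, bits_per_word).
MEG = 1 << 20
THREE_CHIP = [(MEG, 4), (MEG, 4), (MEG, 1)]   # two 1Mx4 data chips + one 1Mx1 parity chip
NINE_CHIP = [(MEG, 1)] * 9                    # assumed: nine 1Mx1 chips

def module_organisation(chips):
    """Return (words, bits_per_word) of a module built from the given chips."""
    depths = {words for words, _ in chips}
    if len(depths) != 1:
        raise ValueError("all chips on a module must have the same depth")
    width = sum(bits for _, bits in chips)
    return depths.pop(), width

if __name__ == "__main__":
    for name, chips in (("3-chip", THREE_CHIP), ("9-chip", NINE_CHIP)):
        words, width = module_organisation(chips)
        print(f"{name}: {words // MEG}Mx{width}")
```

Both variants come out as 1Mx9; they differ only in how the nine bits are spread over the chips.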

Luckily, there are ANSI standards for RAM chip pinouts, so I can give you a schematic rewiring plan here, both for 3-chip and 9-chip modules. The red lines show connections to be added, while the fat double red lines symbolize connections to be cut. Of course, there are more traces on a SIMM than the ones shown, but I left the ones out that need not be modified to make things a bit simpler.

Schematic Rewiring Plan for 3-Chip Modules

Schematic Rewiring Plan for 9-Chip Modules

CAUTION!!! The pinouts of the individual chips apply only to your SIMMs if the chips are housed in the same case type (called 'SOJ') as for my modules!!!

The Reward

After two modules have been modified in this manner, it is time to insert them into the SCSI controller and power it up. Of course, such operations always bear a risk and - as you might guess - I disclaim all responsibility for controllers that break...if this is your one-and-only Microchannel SCSI controller, it might be wiser to buy the correct modules from IBM (or scavenge them from a Model 30-286 or Model 50/60). If everything went right, your PS/2 should boot up as usual and report an error 174 (equipment change).

After the usual autoconfiguration, it's time to make some measurements. I used cthdben, a DOS program from the German computer magazine c't that measures the data rate for linear and random accesses. Since Spock's cache is write-through, improvements can only be expected for read accesses with a working set smaller than the cache size. The PS/2 used for the measurements is a PS/2 8595-AKF upgraded to a 486DX2 CPU (66 MHz). The disk is an old HP 1 Gig monster (5.25 inch, full height) that maxes out at approximately 2 Mbytes/s. The first figure shows the average data rate for random accesses with 1/10th of the total file size:

Results for Random Accesses

For file sizes smaller than 256 Kbytes or larger than 2 MBytes, the results do not differ: either the working set fits into both cache sizes, or it is too large for both. The differences in the range between these margins are remarkable, however: they result both from the RAM-to-RAM data rate being higher than the disk-to-RAM rate and from the lower access time (the half rotation needed on average before the first data sector comes by is eliminated, as is the SCSI protocol overhead). One can extrapolate that for a larger cache, the curve could well extend into the range of 5 to 10 Mbytes per second. However, Spock has no provisions for 4 Mbyte modules; to be honest, I was already surprised that a 2 MByte cache works. The 80188 processor only has a 1 MByte address space, so there must be some sort of bank switching mechanism on the controller's memory interface; this wouldn't have been necessary for a 512K-only design.
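
The bank switching argument is plain address arithmetic. The sketch below is only a lower bound under my own simplifying assumption that the processor's whole 1 MByte address space could serve as the cache window (in reality the firmware and I/O need room too, so the real window is unknown to me and probably smaller); all names are illustrative:

```python
KIB = 1024
CPU_ADDRESS_BITS = 20                       # 80188: 1 MByte address space
CPU_ADDRESS_SPACE = 1 << CPU_ADDRESS_BITS

def cache_bytes(simm_depth_words, simm_count=2):
    """Usable cache size: each x9 word holds one data byte plus a parity bit."""
    return simm_depth_words * simm_count

def min_banks(cache_size, window=CPU_ADDRESS_SPACE):
    """Smallest number of banks needed to reach cache_size through window."""
    return -(-cache_size // window)          # ceiling division

if __name__ == "__main__":
    for label, depth in (("256Kx9", 256 * KIB), ("1Mx9", 1024 * KIB)):
        size = cache_bytes(depth)
        print(f"2 x {label}: {size // KIB} Kbytes, at least {min_banks(size)} bank(s)")
```

With the stock modules a single window would suffice, while two 1M modules need at least two banks, which is why the 2 MByte configuration working at all hints at a bank switching mechanism.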

Looking at linear accesses, the differences are not as impressive, but still visible:

Results for Linear Accesses

Though the cache cannot avoid repeated disk accesses to the same disk sectors, there is still a gain from the cache due to its read-ahead capability: the controller reads data in larger chunks than the main CPU actually requests, and when the next chunk is requested, the controller can already deliver its beginning without overhead. Operation with no or too little cache is not as bad as in the random case, as the drive's heads only have to move by one cylinder.

Summary

These are only synthetic benchmarks; your mileage in practical applications may vary. Modern operating systems like Linux or NT already have dynamic disk caches that automatically use free main memory, and of course copying in main memory is even faster than transferring data over the MCA bus. I'd say it is a nice extension if you have 30-pin SIMMs that would otherwise collect dust and if you enjoy hardware hacking. If you had to buy the modules, better invest the money in main memory ;-)


© Alfred Arnold, alfred@ccac.rwth-aachen.de 2001