After using my V240 as a test bed for the interrupt problems that we were seeing, and eventually finding and fixing them, I thought that I should finally work on Sun V2x0 environmental monitoring. I ended up finding and fixing a few other bugs along the way.

I already knew that the devices most likely to be sensors were at addresses 0x2e and 0x4e, but I did check all the devices in case the Sun i2c bridge had altered the addresses. Using a modified i2cscan again, I determined that there was an ADM1026 at address 0x2e and an LM75 at address 0x4e. I didn't manage to make any permanent changes, but did make the ALOM think that the configuration card had been removed, which required a power cycle.

We already had code to add missing DIMMs for SPARCle, so adding the missing devices was straightforward. With this, our LM75 driver should have attached, but it failed because it was unable to write to the chip. Modifying the driver not to write was simple, and we could set device properties on the V240 (and V210, which Martin Husemann kindly tested) to mark the devices as "no-write".

The next step was a driver for the ADM1026 chip. Whilst writing this, I noticed that I would sometimes see bogus values read from the chip (0x00, 0xba, 0xbb, and 0xbc were common, but others were possible). I haven't tracked down what causes this. It doesn't appear to be timing related, as adding a delay between reads doesn't fix it. The workaround is to read the register twice and compare the values. If they differ, read another register, then read the original register twice again and compare. Testing the ADM1026 on the V440, I noticed a similar problem. I also noticed that the fan speeds on the V240 didn't match the speeds reported by the ALOM for one set of fans, because one of the fan divisor registers wasn't correctly set. Device properties to the rescue again.
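The read-twice workaround can be modelled in a few lines of Python. This is only a sketch of the logic, not the driver code: read_reg stands in for whatever performs a single i2c register read, the register numbers in the usage note are made up, and the retry limit (max_tries) is an assumption, since nothing above says how many attempts are made.

```python
from typing import Callable, Dict, List, Optional


def read_stable(read_reg: Callable[[int], int], reg: int,
                other_reg: int, max_tries: int = 5) -> int:
    """Read `reg` twice and accept the value only if both reads agree.

    On a mismatch, read `other_reg` (value discarded) and try the pair
    again. `max_tries` is an assumed bound so the loop cannot spin forever.
    """
    for _ in range(max_tries):
        first = read_reg(reg)
        second = read_reg(reg)
        if first == second:
            return first
        read_reg(other_reg)  # dummy read of a different register
    raise OSError(f"register {reg:#04x}: no two matching reads")


class FlakyChip:
    """Toy chip model: returns queued bogus values, then the real ones."""

    def __init__(self, values: Dict[int, int],
                 glitches: List[Optional[int]]) -> None:
        self.values = values            # register -> true contents
        self.glitches = list(glitches)  # per-read override, None = good read
        self.reads = 0

    def read(self, reg: int) -> int:
        self.reads += 1
        bogus = self.glitches.pop(0) if self.glitches else None
        return self.values[reg] if bogus is None else bogus
```

With FlakyChip({0x20: 0x42, 0x21: 0x10}, [0xba]), the first read of register 0x20 returns 0xba, the pair disagrees, and read_stable() recovers the real value on the second pair. Two identical bogus reads in a row would still slip through, so this reduces the problem rather than eliminating it.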

Whilst looking at the dbcool driver (to check if ADM1026 support should be merged there), I noticed that direct configuration wasn't supported. It was simple to add, although the Red Sun Blade 2500 had the ADM1031 chips at addresses which the driver was probing anyway. I also discovered a problem with our PCF8584 driver when writing, and this turned out to be the cause of the LM75 failing to attach. Fixing it meant that I could remove the device property that I thought I needed for the LM75 on the V240. It also meant that I could actually write to the intended i2c device instead of inadvertently writing to a different one. As a test, I turned the hardware locator on and off using the ALOM, checked the GPIO registers, and can now turn it on and off from userland:

Locator ON:

/tmp/i2cscan -w /dev/iic0 0x22 0x07 0x1f	# set port to output
/tmp/i2cscan -w /dev/iic0 0x22 0x03 0x5c	# set logic level = 0
    

Locator OFF:

/tmp/i2cscan -w /dev/iic0 0x22 0x03 0xdc	# set logic level = 1
/tmp/i2cscan -w /dev/iic0 0x22 0x07 0x9f	# set port to input
    

These addresses are PCA9555 GPIOs and a driver for them should be straightforward. However, another driver or userland program that has hardware-specific information would be needed to handle this, along with keyswitch position, PSU status, etc. Collecting information about some values to report and set should be possible (if tedious) by reading the GPIO registers after each physical change (and also comparing with the ALOM output on machines that have those).
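The four writes above differ only in bit 7 (0x1f vs 0x9f, 0x5c vs 0xdc), so the locator appears to be pin 7 of PCA9555 port 1, active low. The following Python sketch derives the values from the current register contents instead of hard-coding them. Registers 0x03 (output port 1) and 0x07 (configuration port 1) are from the PCA9555 register map; LOCATOR_BIT and the function names are my own labels, and the bit assignment is an inference from the commands, not something documented here.

```python
from typing import List, Tuple

# PCA9555 command bytes for port 1.
OUTPUT_PORT1 = 0x03   # output latch, port 1
CONFIG_PORT1 = 0x07   # direction, port 1 (1 = input, 0 = output)

# Assumption: inferred from the hard-coded values in the commands above.
LOCATOR_BIT = 0x80


def locator_writes(on: bool, config: int, output: int) -> List[Tuple[int, int]]:
    """Return the (register, value) writes, in order, to switch the locator.

    `config` and `output` are the current contents of the two registers,
    so all other pins on the port are left as they were.
    """
    if on:
        return [
            (CONFIG_PORT1, config & ~LOCATOR_BIT & 0xFF),  # pin -> output
            (OUTPUT_PORT1, output & ~LOCATOR_BIT & 0xFF),  # drive low = on
        ]
    return [
        (OUTPUT_PORT1, output | LOCATOR_BIT),  # drive high = off
        (CONFIG_PORT1, config | LOCATOR_BIT),  # pin -> input
    ]


def as_commands(writes: List[Tuple[int, int]], addr: int = 0x22) -> List[str]:
    """Format the writes as command lines in the style used above."""
    return [f"/tmp/i2cscan -w /dev/iic0 {addr:#04x} {reg:#04x} {val:#04x}"
            for reg, val in writes]
```

Starting from the "off" state (configuration 0x9f, output 0xdc), locator_writes(True, 0x9f, 0xdc) reproduces the two "Locator ON" writes, and as_commands() turns them back into the same command lines.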

As an example of hardware-specific information, a simple awk script can change the envstat output:

taco# sh /tmp/sunenvstat
Model: SUNW,Sun-Fire-V240
                        Current  CritMax  WarnMax  WarnMin  CritMin  Unit
[adm1026hm0]
               F0.RS:      5720                                       RPM
               F1.RS:      5769                                       RPM
               F2.RS:      5921                                       RPM
               fan 3:         0                                       RPM
         MB.P0.F1.RS:     16463                                       RPM
         MB.P1.F0.RS:     17307                                       RPM
         MB.P0.F0.RS:     15697                                       RPM
         MB.P1.F1.RS:     16463                                       RPM
            internal:    25.000                                      degC
        MB.P0.T_CORE:    50.000                                      degC
        MB.P1.T_CORE:    41.000                                      degC
        MB.BAT.V_BAT:     2.906                                         V
        V3.3 standby:     3.348                                         V
           V3.3 main:     3.348                                         V
                V5.0:     4.995                                         V
        MB.P0.V_CORE:     1.488                                         V
                V+12:    11.750                                         V
                V-12:    -3.375                                         V
           MB.V_+1V5:     1.512                                         V
           MB.V_+2V5:     2.496                                         V
          MB.V_VCCTM:     2.543                                         V
       MB.V_GBE_CORE:     1.207                                         V
       MB.V_GBE_+2V5:     2.508                                         V
              V3.0 5:     0.000                                         V
            MB.V_VTT:     1.250                                         V
        MB.P1.V_CORE:     1.484                                         V
[lmtemp0]
            MB.T_ENC:    13.000                                      degC
    

so that it matches the names that the ALOM uses for the sensors. As another test, Michael Lorenz kindly checked that it was now possible to alter the limit values with the dbcool driver on his SB2500. He also discovered that the machine will power off if the CPU temperature limit is exceeded. Presumably, the THERM output of the ADM1031 is either monitored by the firmware, or connected to the PSU.
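Coming back to the sensor renaming: the awk script itself isn't shown, so here is a rough Python stand-in for the idea, a per-model table that maps driver sensor names to ALOM names and rewrites envstat-style lines while keeping the right-aligned layout. The driver-side names in the table ("fan 0", "fan 1") are guesses for illustration; only the ALOM-side names appear in the output above.

```python
import re
from typing import Dict, Iterable, List

# Hypothetical mapping: the left-hand names are assumed, not taken from
# the real driver. Only the right-hand (ALOM) names appear in the output.
ALOM_NAMES: Dict[str, Dict[str, str]] = {
    "SUNW,Sun-Fire-V240": {
        "fan 0": "F0.RS",
        "fan 1": "F1.RS",
    },
}

# Leading spaces, sensor name, colon, then the rest of the line.
SENSOR_LINE = re.compile(r"^(\s*)(.+?):(\s.*)$")


def rename_sensors(model: str, lines: Iterable[str]) -> List[str]:
    """Replace known sensor names for `model`, preserving column alignment."""
    names = ALOM_NAMES.get(model, {})
    out = []
    for line in lines:
        m = SENSOR_LINE.match(line)
        if m and m.group(2) in names:
            # Right-justify the new name in the space the old one used.
            width = len(m.group(1)) + len(m.group(2))
            line = names[m.group(2)].rjust(width) + ":" + m.group(3)
        out.append(line)
    return out
```

Lines whose sensor name isn't in the table (such as "fan 3" or "internal" above) pass through unchanged, as do device headers like [adm1026hm0].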

