Re: [Freeipmi-users] Decoding ram errors on supermicro
From: Tom Hetmer
Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
Date: Tue, 04 Dec 2018 11:39:03 +0100
Sure. It seems there's a similar ticket already:
https://github.com/chu11/freeipmi-mirror/issues/19
Yep, that's the code. ipmitool and a few others decode it too.
We have a *lot* of Supermicros, so I can help with testing if needed, but we
don't get that many CRC errors though :)
So I guess we'd have to wait till one pops up. But I hope the 'ver 2' method
from ipmiutil works fine.
We used ipmitool in our monitoring before. It was accurate but slow, which is
why I rewrote it all to use freeipmi.
Thanks!
Best,
Tom Hetmer
CDN77 Operations
address@hidden / +44 (0) 20 3514 2399 / www.cdn77.com
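
For reference, the 'ver 2' arithmetic from the ipmiutil comment quoted below can be written out as a small function. This is only a sketch in Python: the function name decode_ver2 and the P<cpu>_DIMM<pair><dimm> output format (modelled on the comment's "P1_DIMMB1" example) are assumptions for illustration, not anything freeipmi or ipmiutil provides.

```python
def decode_ver2(data2: int, data3: int) -> str:
    """Apply the 'ver 2' arithmetic from the ipmiutil comment to two event data bytes.

    decode_ver2 is a hypothetical helper name; only the arithmetic comes from the comment.
    """
    for name, value in (("data2", data2), ("data3", data3)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must be a single byte, got {value!r}")

    cpu = (data3 & 0x03) + 1                               # cpu:  (data3 & 0x03) + 1
    pair = chr((data2 >> 4) + 0x40 + (data3 & 0x03) * 3)   # pair: letter, 0x41 is 'A'
    dimm = chr((data2 & 0x0F) + 0x27)                      # dimm: character, 0x31 is '1'
    return f"P{cpu}_DIMM{pair}{dimm}"


print(decode_ver2(0x2A, 0x80))  # the comment's own example bytes, 2A 80
print(decode_ver2(0x3A, 0x81))  # the bytes from the X10DRH event quoted below
```

For the comment's example (2A 80) this prints P1_DIMMB1, matching the comment. For the 3Ah/81h event it prints P2_DIMMF1, while the web interface shows DIMMG1(CPU2), so the formula taken literally may need a per-board adjustment and is worth checking against real events.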
----- Original message -----
> From: "Albert Chu" <address@hidden>
> To: "Tom Hetmer" <address@hidden>, address@hidden
> Date: 12/03/18 21:06
> Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
>
> Hi Tom,
>
> Thanks for the pointer to ipmiutil's code. I assume you found this
> comment:
>
> ---
> /* ver 2 method: 2A 80 = P1_DIMMB1 */
> /* SuperMicro says:
>  * pair: %c (data2 >> 4) + 0x40 + (data3 & 0x3) * 3, (='B')
>  * dimm: %c (data2 & 0xf) + 0x27,
>  * cpu: %x (data3 & 0x03) + 1);
>  */
> ---
>
> I can definitely add it to my todo list.
>
> Would you mind writing up an issue on github here?
>
> https://github.com/chu11/freeipmi-mirror
>
> Al
>
> On Mon, 2018-12-03 at 17:55 +0100, Tom Hetmer wrote:
> > Hi,
> >
> > it'd be good if freeipmi supported decoding Supermicro ECC errors.
> >
> >
> > Manufacturer: Supermicro
> > Product Name: X10DRH LN4
> >
> > E.g. freeipmi:
> > 1,Dec-01-2018,06:37:53,Sensor #0,Memory,Critical,Uncorrectable memory error ; OEM Event Data2 code = 3Ah ; OEM Event Data3 code = 81h
> >
> > Web interface:
> > 1 | 12/01/2018 | 06:37:53 | Memory | Uncorrectable ECC (@DIMMG1(CPU2)) | Asserted
> >
> >
> > Something like this worked for me (stolen from ipmiutil):
> >
> > // Assumes $data2 and $data3 hold the OEM event data bytes as
> > // two-digit hex strings, e.g. "3A" and "81".
> > $cpu = (hexdec($data3) & 0x03) + 1;
> >
> > $NPAIRS = 26;
> > $rgpairs = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
> >
> > // Combine both bytes into one value (0x3A81 for the event above).
> > $bdata = hexdec($data2.$data3);
> > $pair = (($bdata & 0xF0) >> 4) - 1;
> >
> > // Clamp to 1..$NPAIRS so that $pair - 1 is always a valid index.
> > if ($pair < 1) $pair = 1;
> > if ($pair > $NPAIRS) $pair = $NPAIRS;
> > $pair = $rgpairs[$pair - 1];
> >
> > $dimm = $bdata & 0x0F;
> >
> >
> > $dimm may be incorrect: the original code subtracts 9 from it, but on this
> > board that gave the wrong result, so I changed it to get the right one.
> > We'll see if it keeps producing the right values.
> >
> > Best,
> > Tom Hetmer
> >
> >
> > CDN77 Operations
> > address@hidden / +44 (0) 20 3514 2399 / www.cdn77.com
> >
> > _______________________________________________
> > Freeipmi-users mailing list
> > address@hidden
> > https://lists.gnu.org/mailman/listinfo/freeipmi-users
> --
> Albert Chu
> address@hidden
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
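
To tie the quoted pieces together, here is a rough Python sketch that pulls the two OEM codes out of the freeipmi line quoted above and then mirrors the arithmetic of the PHP snippet. The names oem_bytes and decode_like_snippet are hypothetical, the regular expression assumes the "OEM Event DataN code = XXh" wording seen in that one sample line, and the clamp bounds are an assumption chosen so the letter lookup always stays in range.

```python
import re

# The sample line from the thread, joined back onto one line.
SEL_LINE = (
    "1,Dec-01-2018,06:37:53,Sensor #0,Memory,Critical,Uncorrectable memory error ; "
    "OEM Event Data2 code = 3Ah ; OEM Event Data3 code = 81h"
)

# Assumed wording, taken from the single sample above.
OEM_BYTE = re.compile(r"OEM Event Data([23]) code = ([0-9A-Fa-f]{2})h")


def oem_bytes(line: str) -> dict:
    """Return {2: data2, 3: data3} as integers for the OEM codes found in the line."""
    return {int(idx): int(value, 16) for idx, value in OEM_BYTE.findall(line)}


def decode_like_snippet(data2: int, data3: int) -> tuple:
    """Mirror the quoted PHP arithmetic; returns (pair letter, dimm, cpu)."""
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    bdata = (data2 << 8) | data3             # same value as the two hex strings concatenated
    pair = ((bdata & 0xF0) >> 4) - 1         # high nibble of the low byte, minus one
    pair = min(max(pair, 1), len(letters))   # assumed clamp so pair - 1 is a valid index
    cpu = (data3 & 0x03) + 1
    dimm = bdata & 0x0F                      # no "- 9" adjustment, as in the snippet
    return letters[pair - 1], dimm, cpu


codes = oem_bytes(SEL_LINE)
pair, dimm, cpu = decode_like_snippet(codes[2], codes[3])
print(f"DIMM{pair}{dimm}(CPU{cpu})")
```

For the sample event this prints DIMMG1(CPU2), which agrees with the web interface entry quoted above. That is a single data point on one board, so it says little about other DIMM slots or other Supermicro models.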