Re: [Freeipmi-users] Troubleshooting inconsistent results from server
From: Albert Chu
Subject: Re: [Freeipmi-users] Troubleshooting inconsistent results from server
Date: Thu, 07 Jan 2016 17:09:34 -0800
Hi Brian,
I think the over-LAN and inband problems are two separate issues. The
over-LAN problem is likely some configuration/networking issue. Can you
at least ipmiping the node?
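
For reference, ipmiping elicits a reply with an IPMI Get Channel Authentication Capabilities request, the same request that appears in the --debug dump quoted further down. The following is only a minimal Python sketch, not FreeIPMI code: it rebuilds that request from the field values in the dump and recomputes the two checksums (C8h and AAh). The function names are made up for illustration, and probe() assumes the BMC listens on the standard RMCP UDP port 623.

```python
import socket

RMCP_PORT = 623  # standard RMCP / IPMI-over-LAN UDP port (assumed for the BMC)


def checksum(data: bytes) -> int:
    """IPMI 8-bit checksum: two's complement of the byte sum."""
    return (-sum(data)) & 0xFF


def build_auth_caps_request(rq_seq: int = 0x23) -> bytes:
    """Rebuild the IPMI 1.5 request using the values shown in the debug dump."""
    # RMCP header: version 6h, reserved 0h, sequence FFh, message class 7h (IPMI)
    rmcp = bytes([0x06, 0x00, 0xFF, 0x07])
    rs_addr, net_fn, rs_lun = 0x20, 0x06, 0x00
    rq_addr, rq_lun = 0x81, 0x00
    cmd = 0x38                     # Get Channel Authentication Capabilities
    channel, privilege = 0x0E, 0x03
    head = bytes([rs_addr, (net_fn << 2) | rs_lun])
    body = bytes([rq_addr, ((rq_seq & 0x3F) << 2) | rq_lun, cmd, channel, privilege])
    msg = head + bytes([checksum(head)]) + body + bytes([checksum(body)])
    # Session header: authentication type 0, sequence number 0, session ID 0, length
    session = bytes([0x00]) + bytes(4) + bytes(4) + bytes([len(msg)])
    return rmcp + session + msg


def probe(host: str, timeout: float = 2.0) -> bool:
    """Send one request; return True if any datagram arrives before the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_auth_caps_request(), (host, RMCP_PORT))
            sock.recvfrom(1024)
            return True
        except OSError:  # covers timeouts and name-resolution failures
            return False
```

Calling probe("node1ipmi") against the failing node and getting False would be consistent with the dump, which shows the request going out and no response coming back.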
As for the inband issue, it sounds very much like this issue:
http://www.gnu.org/software/freeipmi/freeipmi-faq.html#Why-am-I-seeing-so-many-_0027internal-IPMI-error_0027-or-_0027driver-busy_0027-messages_003f
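
To put a number on how often the inband reads fail, the manual repetition in the quoted message could be scripted. This is a rough Python sketch under a few assumptions: ipmi-sensors is on the PATH and run with sufficient privileges, the failure is identified by the "internal IPMI error" text seen in the quoted output, and the helper names summarize() and run_trials() are hypothetical.

```python
import subprocess
import time

ERROR_MARKER = "internal IPMI error"  # text taken from the quoted failing runs


def summarize(output: str) -> tuple:
    """Return (sensor rows printed, whether the error marker appeared)."""
    rows = 0
    failed = False
    for line in output.splitlines():
        if ERROR_MARKER in line:
            failed = True
            continue
        first_field = line.split("|", 1)[0].strip()
        if first_field.isdigit():  # data rows start with a numeric record ID
            rows += 1
    return rows, failed


def run_trials(count: int = 5, pause: float = 20.0) -> list:
    """Run ipmi-sensors `count` times, pausing between runs as in the report."""
    results = []
    for i in range(count):
        proc = subprocess.run(["ipmi-sensors"], capture_output=True, text=True)
        results.append(summarize(proc.stdout + proc.stderr))
        if i < count - 1:
            time.sleep(pause)
    return results
```

Each result pairs the number of sensors read before the run stopped with a failure flag, which makes it easier to compare node1 against node2.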
Al
On Thu, 2016-01-07 at 19:43 -0500, Brian LaFlamme wrote:
> I have a Dell C6100 blade server with 2 identical nodes, and I'm trying to
> troubleshoot some odd behavior with ipmi. One node (node2) works perfectly
> over LAN and over its local web interface. A separate node (node1) doesn't
> work well at all.
>
> Locally, I get very inconsistent results. E.g., running 'bmc-config
> --checkout' on node1 usually ends prematurely without any error message,
> resulting in an incomplete config file. Sometimes it completes. In
> contrast, I always get a complete config file on node2.
>
> Here is an attempt to run a simple command locally on node1 a few times to
> demonstrate the inconsistency. The first time it runs to completion, the
> next few times it dies with an error. I paused for 20+ seconds between
> each command to make sure I wasn't overloading anything.
>
> address@hidden:~# ipmi-sensors
> ID | Name | Type | Reading | Units | Event
> 2 | FCB FAN1 | Fan | 5500.00 | RPM | 'OK'
> 3 | FCB FAN2 | Fan | 5500.00 | RPM | 'OK'
> 4 | FCB FAN3 | Fan | 5500.00 | RPM | 'OK'
> 5 | FCB FAN4 | Fan | 5500.00 | RPM | 'OK'
> 6 | PEF Action | System Event | N/A | N/A | 'OK'
> 7 | WatchDog2 | Watchdog 2 | N/A | N/A | 'OK'
> 8 | AC Pwr On | Power Unit | N/A | N/A | 'OK'
> 9 | ACPI Pwr State | System ACPI Power State | N/A | N/A | 'Legacy ON state'
> 10 | FCB Ambient1 | Temperature | 20.00 | C | 'OK'
> 11 | FCB Ambient2 | Temperature | 21.00 | C | 'OK'
> 12 | CPU1Status | Processor | N/A | N/A | 'OK'
> 13 | CPU2Status | Processor | N/A | N/A | 'OK'
> 14 | PS 12V | Voltage | 12.09 | V | 'OK'
> 15 | PS 5V | Voltage | 5.10 | V | 'OK'
> 16 | MLB TEMP 2 | Temperature | 63.00 | C | 'OK'
> 17 | MLB TEMP 3 | Temperature | 52.00 | C | 'OK'
> 18 | Processor 1 Temp | Temperature | 60.00 | C | 'OK'
> 19 | MLB TEMP 1 | Temperature | 62.00 | C | 'OK'
> 20 | Processor 2 Temp | Temperature | 66.00 | C | 'OK'
> 21 | STBY 3.3V | Voltage | 3.35 | V | 'OK'
> 22 | PS Current | Current | 38.00 | A | 'OK'
> 23 | SEL Fullness | Event Logging Disabled | N/A | N/A | 'Log Area Reset/Cleared'
> 24 | PCI BUS | Critical Interrupt | N/A | N/A | 'OK'
> 25 | Memory | Memory | N/A | N/A | 'OK'
> 26 | VCORE 1 | Voltage | 1.04 | V | 'OK'
> 27 | VCORE 2 | Voltage | 0.87 | V | 'OK'
> 30 | NM Capability | OEM Reserved | N/A | N/A | N/A
> 33 | Security | Platform Security Violation Attempt | N/A | N/A | 'OK'
> 34 | PSU 1 AC Status | Power Unit | N/A | N/A | N/A
> 35 | PSU 2 AC Status | Power Unit | N/A | N/A | N/A
> 36 | PSU 1 Present | Power Supply | N/A | N/A | N/A
> 37 | PSU 2 Present | Power Supply | N/A | N/A | N/A
> 38 | PSU 2 POUT | Current | N/A | A | N/A
> 39 | PSU 1 POUT | Current | N/A | A | N/A
> address@hidden:~# ipmi-sensors
> ID | Name | Type | Reading | Units | Event
> 2 | FCB FAN1 | Fan | 5500.00 | RPM | 'OK'
> 3 | FCB FAN2 | Fan | 5500.00 | RPM | 'OK'
> 4 | FCB FAN3 | Fan | 5500.00 | RPM | 'OK'
> 5 | FCB FAN4 | Fan | 5500.00 | RPM | 'OK'
> 6 | PEF Action | System Event | N/A | N/A | 'OK'
> 7 | WatchDog2 | Watchdog 2 | N/A | N/A | 'OK'
> 8 | AC Pwr On | Power Unit | N/A | N/A | 'OK'
> ipmi_sensor_read: internal IPMI error
> address@hidden:~# ipmi-sensors
> ID | Name | Type | Reading | Units | Event
> 2 | FCB FAN1 | Fan | 5500.00 | RPM | 'OK'
> ipmi_sensor_read: internal IPMI error
> address@hidden:~# ipmi-sensors
> ID | Name | Type | Reading | Units | Event
> 2 | FCB FAN1 | Fan | 5500.00 | RPM | 'OK'
> 3 | FCB FAN2 | Fan | 5500.00 | RPM | 'OK'
> 4 | FCB FAN3 | Fan | 5500.00 | RPM | 'OK'
> 5 | FCB FAN4 | Fan | 5500.00 | RPM | 'OK'
> 6 | PEF Action | System Event | N/A | N/A | 'OK'
> 7 | WatchDog2 | Watchdog 2 | N/A | N/A | 'OK'
> 8 | AC Pwr On | Power Unit | N/A | N/A | 'OK'
> 9 | ACPI Pwr State | System ACPI Power State | N/A | N/A | 'Legacy ON state'
> ipmi_sensor_read: internal IPMI error
>
> Also, I get no response from node1 over LAN, whereas node2 works perfectly
> (not shown).
>
> address@hidden:~# ipmi-sensors -h node1ipmi -u root -p XXX --debug
> node1ipmi: =====================================================
> node1ipmi: IPMI 1.5 Get Channel Authentication Capabilities Request
> node1ipmi: =====================================================
> node1ipmi: RMCP Header:
> node1ipmi: ------------
> node1ipmi: [ 6h] = version[ 8b]
> node1ipmi: [ 0h] = reserved[ 8b]
> node1ipmi: [ FFh] = sequence_number[ 8b]
> node1ipmi: [ 7h] = message_class.class[ 5b]
> node1ipmi: [ 0h] = message_class.reserved[ 2b]
> node1ipmi: [ 0h] = message_class.ack[ 1b]
> node1ipmi: IPMI Session Header:
> node1ipmi: --------------------
> node1ipmi: [ 0h] = authentication_type[ 8b]
> node1ipmi: [ 0h] = session_sequence_number[32b]
> node1ipmi: [ 0h] = session_id[32b]
> node1ipmi: [ 9h] = ipmi_msg_len[ 8b]
> node1ipmi: IPMI Message Header:
> node1ipmi: --------------------
> node1ipmi: [ 20h] = rs_addr[ 8b]
> node1ipmi: [ 0h] = rs_lun[ 2b]
> node1ipmi: [ 6h] = net_fn[ 6b]
> node1ipmi: [ C8h] = checksum1[ 8b]
> node1ipmi: [ 81h] = rq_addr[ 8b]
> node1ipmi: [ 0h] = rq_lun[ 2b]
> node1ipmi: [ 23h] = rq_seq[ 6b]
> node1ipmi: IPMI Command Data:
> node1ipmi: ------------------
> node1ipmi: [ 38h] = cmd[ 8b]
> node1ipmi: [ Eh] = channel_number[ 4b]
> node1ipmi: [ 0h] = reserved1[ 3b]
> node1ipmi: [ 0h] = get_ipmi_v2.0_extended_data[ 1b]
> node1ipmi: [ 3h] = maximum_privilege_level[ 4b]
> node1ipmi: [ 0h] = reserved2[ 4b]
> node1ipmi: IPMI Trailer:
> node1ipmi: --------------
> node1ipmi: [ AAh] = checksum2[ 8b]
>
>
> Additional details (this info is identical to the working node)
>
>
> address@hidden:~# bmc-info
> Device ID : 37
> Device Revision : 1
> Device SDRs : unsupported
> Firmware Revision : 1.30
> Device Available : yes (normal operation)
> IPMI Version : 2.0
> Sensor Device : supported
> SDR Repository Device : supported
> SEL Device : supported
> FRU Inventory Device : supported
> IPMB Event Receiver : supported
> IPMB Event Generator : supported
> Bridge : unsupported
> Chassis Device : supported
> Manufacturer ID : Inventec Enterprise System Corp. (20569)
> Product ID : 52
> Auxiliary Firmware Revision Information : 6D6E0001h
>
> GUID : f790edd1-a000-0061-756d-502032343435
>
> System Firmware Version : 5442A170
> System Name :
> Primary Operating System Name :
> Operating System Name :
>
> Channel Information
>
> Channel Number : 0
> Medium Type : IPMB (I2C)
> Protocol Type : IPMB-1.0
> Active Session Count : 0
> Session Support : session-less
> Vendor ID : Intelligent Platform Management Interface forum (7154)
>
> Channel Number : 1
> Medium Type : 802.3 LAN
> Protocol Type : IPMB-1.0
> Active Session Count : 0
> Session Support : multi-session
> Vendor ID : Intelligent Platform Management Interface forum (7154)
>
> Channel Number : 6
> Medium Type : IPMB (I2C)
> Protocol Type : IPMB-1.0
> Active Session Count : 0
> Session Support : session-less
> Vendor ID : Intelligent Platform Management Interface forum (7154)
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory