Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: inter
From: Albert Chu
Subject: Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal error
Date: Tue, 18 Jul 2017 12:15:47 -0700
There are clearly some communication problems with the motherboard,
leading to the "internal IPMI errors". Many times we send a request and
don't even see a response. In at least one earlier case, the response
wasn't even a fully formed packet.
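
To make "not a fully formed packet" concrete: in trace 4 b) quoted below, the response has net_fn 5h (the request's 4h with the low bit set, as expected for a response) but cmd 0h instead of the request's 2Dh. A minimal sketch of that consistency check follows. The struct and function names are hypothetical and for illustration only; they are not FreeIPMI types or calls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical struct for this sketch; field widths follow the debug dump. */
struct kcs_msg {
    uint8_t lun;    /* 2 bits */
    uint8_t net_fn; /* 6 bits */
    uint8_t cmd;    /* 8 bits */
};

/* First KCS byte: net_fn in the upper 6 bits, lun in the lower 2 bits. */
static uint8_t kcs_pack_header(const struct kcs_msg *m)
{
    return (uint8_t)(((m->net_fn & 0x3F) << 2) | (m->lun & 0x03));
}

/* A response should echo the request's lun and cmd, and its net_fn
   should be the request's net_fn with the low bit set. */
static bool kcs_response_matches(const struct kcs_msg *req,
                                 const struct kcs_msg *rsp)
{
    return rsp->net_fn == (uint8_t)(req->net_fn | 0x01)
        && rsp->lun == req->lun
        && rsp->cmd == req->cmd;
}
```

With the values from trace 4 b) (request lun 0h, net_fn 4h, cmd 2Dh; response lun 0h, net_fn 5h, cmd 0h), this check fails on the cmd field.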
But this made me realize what the problem might be.
When you run IPMI commands (e.g. ipmi-sensors), are you using one of the
kernel device drivers (e.g. Linux defaults to /dev/ipmi0) as your
communication driver?
The default ipmimonitoring-sensors example happens to use the KCS
driver, which is separate from and unrelated to the kernel one. It may
be conflicting with the kernel device driver. Effectively, they are both
communicating with the BMC but not sharing a lock.
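
As a generic illustration of why "not sharing a lock" matters (this is not how FreeIPMI or the kernel driver actually serialize access; the helper names and lock-file paths are made up for the sketch): an advisory lock only protects a resource if every party takes the same lock. Two parties that each lock a different file both succeed and never see each other.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive, non-blocking advisory lock on path.
   Returns an open fd that holds the lock, or -1 if it is unavailable. */
static int try_exclusive_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        close(fd); /* someone else already holds this lock */
        return -1;
    }
    return fd;
}

/* Drop the lock and close the descriptor. */
static void release_lock(int fd)
{
    if (fd >= 0) {
        flock(fd, LOCK_UN);
        close(fd);
    }
}
```

If two callers both use the same path, the second try_exclusive_lock() fails until the first caller releases. If they use different paths, both succeed at once, which is roughly the situation of two drivers talking to the BMC independently.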
If you are using /dev/ipmi0 and you change the ipmimonitoring example
to use the IPMI_MONITORING_DRIVER_TYPE_OPENIPMI driver, things will
probably work out.
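
A rough sketch of that decision, assuming the only signal you need is whether the kernel device node exists. The enum and function here are hypothetical stand-ins so the snippet compiles without the FreeIPMI headers; in the real example you would use the library's IPMI_MONITORING_DRIVER_TYPE_OPENIPMI constant in place of the KCS one.

```c
#include <unistd.h>

/* Hypothetical stand-ins for the library's driver type constants
   (IPMI_MONITORING_DRIVER_TYPE_KCS / IPMI_MONITORING_DRIVER_TYPE_OPENIPMI). */
enum sketch_driver {
    SKETCH_DRIVER_KCS,
    SKETCH_DRIVER_OPENIPMI
};

/* Prefer the kernel (OpenIPMI) driver when its device node is present,
   so that all BMC traffic goes through one path. F_OK only tests that
   the node exists, not that the caller is allowed to open it. */
static enum sketch_driver pick_driver(const char *kernel_device)
{
    if (access(kernel_device, F_OK) == 0)
        return SKETCH_DRIVER_OPENIPMI;
    return SKETCH_DRIVER_KCS;
}
```

Calling pick_driver("/dev/ipmi0") on a machine with the kernel driver loaded would select the OpenIPMI path; hard-coding the driver type in the example works just as well if you already know which one your other tools use.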
Al
On Tue, 2017-07-18 at 11:43 -0700, Sohan Chowdary Kollu wrote:
> I am using version 1.5.5.
>
> Below are the packet details along with the errors. Except for the 3rd
> scenario, all of the errors are very frequent.
>
>
> 1)
>
> Failed right away (first sdr request in the trace)
>
>
> Get SDR Repository Info Request
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ Ah] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 20h] = cmd[ 8b]
>
> (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 314):
> ipmi_sdr_cache_open: internal IPMI error
>
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
> 2)
>
> a) Failed right away (first sdr request in the trace)
>
> =====================================================
>
> Get SDR Repository Info Request
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ Ah] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 20h] = cmd[ 8b]
>
> (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve,
> 223): ipmi_sdr_cache_create: internal IPMI error
>
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
> b) Failed after going through some sdr requests
>
> =====================================================
>
> Get SDR Request
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ Ah] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 23h] = cmd[ 8b]
>
> [ 8820h] = reservation_id[16b]
>
> [ 82h] = record_id[16b]
>
> [ 25h] = offset_into_record[ 8b]
>
> [ 10h] = bytes_to_read[ 8b]
>
> (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve,
> 223): ipmi_sdr_cache_create: internal IPMI error
>
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
> 3)
>
> Failed right away (first sdr request in the trace). I have seen this
> only twice.
>
>
> =====================================================
>
> Get SDR Repository Info Request
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ Ah] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 20h] = cmd[ 8b]
>
> (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 336):
> ipmi_sdr_cache_open: internal IPMI error
>
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
> 4)
>
> a) Failed at Reading Request
>
> =====================================================
>
> Get Sensor Reading Request
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ 4h] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 2Dh] = cmd[ 8b]
>
> [ B0h] = sensor_number[ 8b]
>
> (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
> ipmi_sensor_read: internal IPMI error
>
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id,
> 1449): ipmi_sdr_cache_iterate: error returned in callback
>
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
> b) Failed at Reading Response
>
> =====================================================
>
> Get Sensor Reading Request
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ 4h] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 2Dh] = cmd[ 8b]
>
> [ 90h] = sensor_number[ 8b]
>
> =====================================================
>
> Get Sensor Reading Response
>
> =====================================================
>
> KCS Header:
>
> ------------
>
> [ 0h] = lun[ 2b]
>
> [ 5h] = net_fn[ 6b]
>
> IPMI Command Data:
>
> ------------------
>
> [ 0h] = cmd[ 8b]
>
> (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
> ipmi_sensor_read: internal IPMI error
>
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id,
> 1449): ipmi_sdr_cache_iterate: error returned in callback
>
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
> Thanks
>
>
>
> On Mon, Jul 17, 2017 at 11:46 PM, Albert Chu <address@hidden>
> wrote:
> Hi,
>
>
> What version of FreeIPMI are you using? The line numbers
> don't quite line up with the master branch.
>
>
> Also, could you set IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS
> and show the IPMI packet that occurs right before the error
> line?
>
>
> Thanks,
>
>
>
> Al
>
>
> On Mon, Jul 17, 2017 at 4:28 PM, Sohan Chowdary Kollu
> <address@hidden> wrote:
> Hi Albert,
>
> Thanks for the quick response. I have set the flags for
> debugging and found it failing at one of the three
> instances below in different runs.
>
> 1) (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
> ipmi_sensor_read: internal system error
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id, 1449):
> ipmi_sdr_cache_iterate: error returned in callback
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 2) (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 314):
> ipmi_sdr_cache_open: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 3) (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve, 223):
> ipmi_sdr_cache_create: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
>
>
> Thanks
>
>
>
> On Mon, Jul 17, 2017 at 2:34 PM, Albert Chu
> <address@hidden> wrote:
> The "internal error" indicates some logical error that the library
> doesn't know how to handle. Given it's coming from
> ipmi_monitoring_sensor_readings_by_record_id and it occurs when you run
> the program back to back, I would bet there is some internal IPMI issue
> on your system. Perhaps it's a new error code or something like that
> that I do not handle gracefully.
>
> To try and debug, could you set the flag "IPMI_MONITORING_FLAGS_DEBUG |
> IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS" when calling
> ipmimonitoring_init() in the example code? Hopefully that'll be enough
> to figure out the issue.
>
> Al
>
> On Mon, 2017-07-17 at 13:03 -0700, Sohan Chowdary Kollu wrote:
> > Hi,
> >
> > I am executing the ipmimonitoring-sensors.c example provided in the
> > freeipmi library. It sometimes throws an internal error. The issue is
> > reproducible when I execute the program back to back a couple of times.
> > I need to wait approximately 30 seconds or more after the last execution
> > for the program to run properly.
> >
> >
> > This is the error: ipmi_monitoring_sensor_readings_by_record_id:
> > internal error
> >
> >
> >
> > I ran some of the commands in a terminal back to back, including
> > ipmi-sensors with the group option, ipmimonitoring, etc. None of them
> > threw any errors. The error occurs only when I use the API.
> >
> >
> > Has anyone faced this issue before? If yes, can you tell me how to
> > avoid it?
> >
> >
> >
> >
> > Thanks,
> > Sohan
>
> >
> _______________________________________________
> > Freeipmi-devel mailing list
> > address@hidden
> >
> https://lists.gnu.org/mailman/listinfo/freeipmi-devel
>
> --
> Albert Chu
> address@hidden
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
>
>
>
>
>
>
> --
> Thanks,
> Sohan
>
>
>
>
>
>
>
>
> --
> Thanks,
> Sohan
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory