Re: [Freeipmi-devel] E1000 patch and other stuff
From: Albert Chu
Subject: Re: [Freeipmi-devel] E1000 patch and other stuff
Date: Tue, 10 Feb 2004 15:22:02 -0800
> 4) Tomorrow, Jason and I are going to packet sniff the switch, and
> make sure that packets are coming through the switch to the ethernet
> card when a system is halted.
Jason set up a packet mirror, and we saw packets leave the switch toward
a halted node, but the node never replied. So I think we can eliminate
the switch as a potential problem. I think this is a Linux/e1000 driver
problem.
Jason noticed that there are a ton of packets flying around on the
management network. He said he wouldn't be surprised if the ethernet
card is dropping packets due to the *volume* of packets on thunder
(which would explain why we don't see the halting problem on tdev).
I noticed that the packet drop rate to a halted node was far lower today
(20%-30%) than before (70%-95%). I turned off Gratuitous ARPs on all of
thunder today, so perhaps the card is behaving strangely above a certain
volume of packets.
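
A minimal sketch of how a drop rate like the percentages above could be
computed, assuming probes to the halted node are simply counted as sent
versus answered. The function name and the sample counts are
hypothetical; they are not the actual measurements from thunder.

```python
def drop_rate(sent: int, answered: int) -> float:
    """Return the percentage of probes that received no reply."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= answered <= sent:
        raise ValueError("answered must be between 0 and sent")
    return 100.0 * (sent - answered) / sent


# Hypothetical counts: 100 probes sent to a halted node, 75 answered.
print(f"{drop_rate(100, 75):.1f}% dropped")
```

Comparing this figure before and after a change (such as turning off
Gratuitous ARPs) only means something if the number of probes and the
probing interval are kept the same across runs.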
Al
--
Albert Chu
address@hidden
Lawrence Livermore National Laboratory
----- Original Message -----
From: Albert Chu <address@hidden>
Date: Monday, February 9, 2004 3:38 pm
Subject: [Freeipmi-devel] E1000 patch and other stuff
> 1) I've verified that commenting out the e1000_suspend call in
> e1000_notify_reboot fixes the power control halting problem on
> thunder. Finally!!! Thanks AB.
>
> 2) Calling e1000_remove or pci_unregister_driver instead of
> e1000_suspend will not work, because this is the same code path taken
> as "rmmod e1000". We get the "unregister_netdevice: waiting for eth0 to
> become free. Usage count = 2" bug. Whatever state the ethernet card is
> in at this point, we have difficulty doing power control (although I
> noticed that packet drops weren't as severe, so I was able to "sneak
> in" a power control operation) ...
>
> 3) I still cannot get the halting problem or the "rmmod e1000" problem
> to occur on tdev. This raises my suspicion that a race or scalability
> problem is the "real bug" ...
>
> 4) Tomorrow, Jason and I are going to packet sniff the switch, and
> make sure that packets are coming through the switch to the ethernet
> card when a system is halted. If that is the case, we can eliminate the
> switch as a factor in the "real bug" ...
>
> Al
>
> --
> Albert Chu
> address@hidden
> Lawrence Livermore National Laboratory