The fix is very simple.
FILE: src/e1000_main.c
----------------------
static int
e1000_notify_reboot(struct notifier_block *nb, unsigned long event, void *p)
{
        struct pci_dev *pdev = NULL;
        switch(event) {
        case SYS_DOWN:
        case SYS_HALT:
        case SYS_POWER_OFF:
                pci_for_each_dev(pdev) {
                        if(pci_dev_driver(pdev) == &e1000_driver)
=>                              e1000_suspend(pdev, 3);
                                ^^^^^^^ CAUSE OF BUG ^^^^^^
                }
        }
        return NOTIFY_DONE;
}
We want the NIC to stay in a usable state even after the kernel halts,
because the BMC shares the NIC. You can either comment out the
e1000_suspend call or replace it with
        pci_unregister_driver(&e1000_driver);
or call e1000_remove(pdev) directly.
Call trace:
  pci_unregister_driver
    -> e1000_remove
       -> e1000_smbus_arp_enable(adapter, TRUE);
       -> e1000_phy_hw_reset(&adapter->hw);
          /* Returns the PHY to the power-on reset state */
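To make the first option (skipping e1000_suspend on halt) concrete, here is
a rough userspace sketch of the notifier logic. It is not the driver code:
struct pci_dev, e1000_suspend, the nics[] table, the keep_nic_up switch and
e1000_notify_reboot_sketch are all stand-ins I made up so the snippet
compiles outside the kernel, and the loop over nics[] stands in for
pci_for_each_dev plus the pci_dev_driver check.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel constants (illustrative only). */
enum { SYS_DOWN = 1, SYS_HALT = 2, SYS_POWER_OFF = 3 };
#define NOTIFY_DONE 0

/* Hypothetical, stripped-down replacement for the kernel's struct pci_dev. */
struct pci_dev {
    const char *name;
    int suspended;      /* set to 1 once the stub suspend has run */
};

/* Stub for the driver's e1000_suspend(): only records the power-down. */
static void e1000_suspend(struct pci_dev *pdev, unsigned int state)
{
    (void)state;
    pdev->suspended = 1;
}

/* Pretend these are the devices bound to e1000_driver. */
static struct pci_dev nics[] = { { "eth0", 0 }, { "eth1", 0 } };
#define NUM_NICS (sizeof(nics) / sizeof(nics[0]))

/* Hypothetical switch: 1 = leave the NIC powered so the BMC can use it. */
static int keep_nic_up = 1;

static int e1000_notify_reboot_sketch(unsigned long event)
{
    size_t i;

    switch (event) {
    case SYS_DOWN:
    case SYS_HALT:
    case SYS_POWER_OFF:
        for (i = 0; i < NUM_NICS; i++) {
            /* The suspend call is what the original code always made. */
            if (!keep_nic_up)
                e1000_suspend(&nics[i], 3);
        }
        break;
    }
    return NOTIFY_DONE;
}
```

With keep_nic_up left at 1 the handler still returns NOTIFY_DONE but never
touches the devices, which is the same effect as commenting out the
e1000_suspend line in the real driver.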
-ab
,----[ Albert Chu <address@hidden> ]
| I just did the following experiment:
|
| - Forced e1000 to *not* load by turning it off in /etc/modules.conf
| - Boot stock RHEL3 kernel
|
| and the halting problem was gone. So it looks like the e1000 driver
| is the cause, although I'm still not 100% if it is the root cause.
| I'll begin looking at redhat's e1000 driver, to see if there is
| anything fishy about it ... Have you guys gotten the Intel driver to
| work??
`----