
ProLiant Server May Unexpectedly Reboot And Display Event ID 57 Error Messages

11 May

We have seen this issue on a number of our ProLiant servers since last year and have been fixing it as we come across them. You may be interested in this one too.

Advisory: (Revision) Integrated Lights-Out 2 (iLO 2) Firmware Version 1.81 (Or Earlier) And iLO 2 Management Controller Driver Version 1.11.1.0 (Or Earlier) – ProLiant Server May Unexpectedly Reboot And Display Event ID 57 Error Messages

Update: Tests in our environment have shown that despite the firmware and driver update from HP, we continue to see unexpected reboots. HP has acknowledged that this is an issue and will get back to us.


Posted on May 11, 2010 in Windows

 


5 responses to “ProLiant Server May Unexpectedly Reboot And Display Event ID 57 Error Messages”

  1. Shaps

    October 31, 2010 at 4:32 am

    I have been banging my head on the wall trying to get this resolved on an ML350 G6 tower. I had HP replace the system board, thinking that any bad parts on the board (iLO, I’m looking at you) would be swapped out and hopefully the problem would go away. Not so much: we’re on the latest firmware and drivers (SBS 2008) and this awful problem goes on. In this instance, though, it doesn’t actually reboot; rather, it goes into a strange unresponsive state where all one can do is ping the NIC. iLO is still fully functional, however. And in case this might make one think that it’s not iLO, it didn’t happen until iLO was plugged in and configured (the SAME day it happened, in fact). Perhaps replacing the backplane is next?

     
    • saltwetfish

      October 31, 2010 at 9:07 am

      What do you mean by unresponsive state? Does your server OS (Windows? Unix?) freeze? Black screen?
      What external cards do you have?
      How many times has it frozen since you configured the iLO? It does sound strange that configuring the iLO would cause this, though…

       
  2. Shaps

    October 31, 2010 at 2:59 pm

    Sorry, more details:
    Windows SBS 2008 Standard (64-bit), 4GB RAM, one Intel Xeon 2.2GHz(?) proc; latest firmware/drivers from HP. When it “goes down” I’m usually remote but it’s always the same: I can ping the configured NIC but can’t log in to the box. The few times that I’ve been onsite when it’s happened, it’s always been a black screen on the server where I can’t do anything but reboot from iLO on another computer. It just happened again overnight, and HP has been utterly useless as far as finding the problem. It’s only a matter of time before all these reboots cause the array to start having its own problems.

    This particular server had the same problem whether iLO was configured/plugged in or not; it’s just that it’s way more frequent now that iLO actually is configured and plugged in.

     
    • saltwetfish

      October 31, 2010 at 6:16 pm

      So you are saying that after replacing the motherboard, you still have the exact same issue? It’s possible that the replacement board is also faulty. I have had that from HP before, though only around 2-3 times in my whole career.

      – OK, so you can only ping the NIC? I am assuming it’s a static IP, right?
      – Could you, say, access the event log or c$ remotely when that happened?
      – After you rebooted, did you get any unexpected reboot events in the event logs?
      – Was there a gap in time between events in the event logs? i.e. during the time when the screen was black, was Windows still logging events?
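The last question above (whether Windows kept logging while the screen was black) can be checked mechanically once event timestamps have been exported from the log. The following is only a minimal sketch in Python: the function name `find_log_gaps`, the 30-minute threshold, and the sample timestamps are all assumptions made for illustration, not anything taken from the server in question.

```python
from datetime import datetime, timedelta

def find_log_gaps(timestamps, max_quiet=timedelta(minutes=30)):
    """Return (start, end) pairs where consecutive events are more than
    max_quiet apart. The threshold is an arbitrary assumption; a busy
    server normally logs something far more often than that."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_quiet:
            gaps.append((earlier, later))
    return gaps

# Hypothetical timestamps, e.g. parsed from an exported System log.
events = [
    datetime(2010, 10, 31, 1, 0),
    datetime(2010, 10, 31, 1, 10),
    datetime(2010, 10, 31, 4, 30),
    datetime(2010, 10, 31, 4, 35),
]

for start, end in find_log_gaps(events):
    print(f"no events logged between {start} and {end}")
```

A long silent stretch that ends at the forced reboot suggests the OS had stopped writing to the log; if events keep appearing during the black screen, the OS was still running and the problem is more likely limited to the console or logon path.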

       
  3. Shaps

    October 31, 2010 at 6:26 pm

    Right, replaced the system board, problem continues. I wouldn’t be surprised if it was another faulty iLO part, just b/c of how this problem behaves. I’ve done everything on the HP advisory but the problem continues. I have seen the 57 errors in the event log but they’re not always around the time the server goes unresponsive.

    When this happens, the only thing I can do with the server is ping the NIC’s static IP. The screen is black, I can’t wake it up with the keyboard or mouse, and I can only reboot with iLO from another box. There is always a gap in the event log, which to me seems to point to an HP problem, simply b/c Windows doesn’t know what’s going on and therefore can’t write anything in the logs. Thanks again for your help. I’m really just looking for ideas as to what we haven’t tried yet b/c man is the client pissed…
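The observation that the Event ID 57 errors are "not always around the time the server goes unresponsive" can be made more concrete by listing which logged errors fall within a window of each hang. This is a rough sketch in Python under stated assumptions: `events_near`, the 15-minute window, and all timestamps below are hypothetical, and since the log itself has a gap, the hang times would have to be estimates (for example, the last event before the gap).

```python
from datetime import datetime, timedelta

def events_near(event_times, incident, window=timedelta(minutes=15)):
    """Return the event times within `window` of `incident`, sorted.
    The window size is an assumption chosen for illustration."""
    return sorted(t for t in event_times if abs(t - incident) <= window)

# Hypothetical data: times of Event ID 57 entries and estimated hang times.
event_57_times = [
    datetime(2010, 10, 30, 22, 5),
    datetime(2010, 10, 31, 2, 40),
]
hang_times = [
    datetime(2010, 10, 31, 2, 45),
    datetime(2010, 10, 31, 14, 0),
]

for hang in hang_times:
    nearby = events_near(event_57_times, hang)
    print(hang, "->", nearby if nearby else "no Event ID 57 nearby")
```

If most hangs have no Event ID 57 entry nearby, that would support treating the hangs and the advisory's reboot issue as two separate problems.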

     
