I don’t often do much in the way of hardware, but recently I had some problems with the RAID array controller on a couple of our ProLiant DL180 G5 servers.
We took a power outage to our building and the generator that was supposed to come on never did. The UPS was only specced to cover the gap between a power outage and the generator coming on, so after a few minutes all of the servers powered down. Not necessarily what I wanted to come into on Monday morning, but it did highlight a couple of needs that I had expressed over the past couple of years.
When I powered everything back up, only 2 servers came up with any large problems. Both seemed to be issues with the Smart Array controller: the server would get to that point in the POST, sit at "SmartArray E200 initializing" for a few minutes and then fail to boot. I contacted support and they had me do a few things:
1. Boot from the Easy Setup CD, go to Maintenance and run Array diagnostics – Array diagnostics couldn’t find any installed array controllers.
2. Update the flash ROM for the server – there is a download for the server that creates a bootable USB stick, which you can use to update a server that won’t boot into the OS.
3. Reseat the cache module on the controller card – there’s a chip on the Smart Array controller card that at first glance looks like a chubby piece of memory.
4. Upgrade the storage firmware using the Maintenance CD – there is another download on the HP site that takes their Maintenance CD and creates a bootable USB drive. You can then update/replace some of the drivers on the CD. Once you boot off the USB stick, it can automatically detect any updates and apply them. It also failed to see that a storage controller was installed.
5. Boot with the cache removed from the SA E200 – they then had me remove the cache module and boot the server with the slot on the controller empty.
6. Move the Smart Array card to another slot, clear the CMOS, upgrade the firmware with Smart Update Manager – I considered this their Hail Mary before they replaced the controller. They had me move the controller card to another slot on the server, clear the CMOS (this is done by holding down a button on the motherboard labeled CMOS; when you power back up, the system date and time will need to be reset) and then try to update using the USB key from step 4.
After all this we were still in the same place we started, so they sent out a new Smart Array E200 controller card.
I replaced the controller card and was still having the exact same problem.
I got back on the phone with support after quickly running through the above 6 things again. I was trying to save myself from having to run back and forth to the server when support asked me to retry them. Since I had already exhausted all the prompts on their screen (I guess), they considered this an odd single-case issue (funny that I had two servers doing the same thing) and had to get special instructions, which amounted to replacing the cache module.
The new cache module came the next day, and once it was installed the server booted normally.
Oddly, for the second server, the next support person I got insisted it was the motherboard, since booting with the cache module removed did not make any difference, and set up a tech to be dispatched with a motherboard.
I spoke with the tech and he confirmed my suspicion that the motherboard was most likely not faulty and that it was more likely the cache module. He explained that it used to be that removing the cache module would allow the server to boot, but he has recently found that if the server shipped with the cache module installed, the servers seem to expect it to be there and if it’s not (or it’s faulty) the controller can’t initialize.
So he came out the next day with a new cache module, installed it and the server worked fine.
We had the same exact problem. After reading your article I went back and pointed it out to the HP tech. The HP tech changed the cache module and the server came back just fine. This was after 18 hours of working on the server and replacing almost everything in it.
Thank you for the article, it saved me a lot of grief.
Great, glad it could help. It was one of the more frustrating hardware issues I’ve dealt with. Lots of testing with few results until they replaced the cache module…
Thanks for this, had the problem today, great to be able to tell the HP tech guys what part we needed. I actually left the cache memory out while I was on the phone with them for the part number, plugged it back in 15 mins later and the server booted up.
Going to get a replacement part anyway but hopefully a hotfix for anyone desperate!
Good tip, I’ll update the post with that. Wish I had known to try that when mine crashed. Thanks.
Thanks! We had exactly the same problem with a ProLiant ML110 G5 today. During POST we got:
Expansion ROM initialization failed – PCI Mass Storage Controller on Motherboard
Bus:02, Device:08, Function:00
HP Support replaced the E200 controller and the motherboard twice with no success. Support said that the server should boot without the E200 cache module, which is definitely not true. It cost us 16 hours before HP sent us a new cache module. Installed it and everything was fine!
Thanks for this article
Still surprised that in 3 months, HP support hasn’t figured this out. Their field techs seem to know this, just surprised that the phone support hasn’t gotten that prompt on their screen yet.
We had this same issue on an ML350 G5. A new system board made no difference, and pulling the old cache module and booting without it made no difference either. It still gave the 1783 controller boot error. Swapping the cache module for a new one fixed the server and it booted right up. Thanks for this article!!!
Hi, thanks a lot! I have the same server, an HP DL180 G5 with the E200 controller. I had a power outage and got the same problem; on your advice I bought the cache module (100€) and it works fine!!
Thanks very much!