The ‘Power on virtual machine’ task shows Completed; however, the server never gets past ‘Starting’ on the VMware BIOS splash screen.
Initial Troubleshooting Steps:
1) I tried powering the guest off and on, to no avail.
2) I tried restarting the ccagent process on the host. That did not help in my case, but it may be worth trying in yours:
a) Connect to the ESX host’s Service Console via PuTTY.
b) Find the ccagent process with the command: ps -ef | grep ccagent
c) Determine the PPID (the root PID) of ccagent and type “kill -9 PPID”, substituting the actual number, to kill the process. The process will restart automatically.
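Steps b) and c) can be sketched as a small helper. This is only a rough illustration, assuming the usual `ps -ef` column order (UID, PID, PPID, ...); the function name find_ccagent_root is made up for this example and is not a VMware tool. It prints the PID of the ccagent process whose parent is not itself a ccagent process.

```shell
# Print the PID of the top-level ccagent process from `ps -ef` style input.
# Assumption: column 2 is the PID and column 3 is the PPID.
# find_ccagent_root is a hypothetical helper name, not part of ESX.
find_ccagent_root() {
    awk '
        # Remember PID -> PPID for every ccagent line, skipping the
        # grep/awk processes that may show up in the listing.
        /ccagent/ && !/grep/ && !/awk/ { parent[$2] = $3 }
        END {
            for (p in parent)
                if (!(parent[p] in parent))   # parent is not a ccagent process
                    print p
        }
    '
}

# Usage on the Service Console (check the PID before killing anything):
#   ps -ef | find_ccagent_root
#   kill -9 <PID printed above>
```

Printing the PID rather than killing it directly is deliberate, so you can confirm it is the right process first.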
Cause:
I worked on this case with VMware support. We reviewed the properties of the guest and discovered about 8 RDM LUN mappings in a ‘dead’ status that were mysteriously attached to the guest (click ‘Manage Paths’ to see the ‘dead’ status). These inaccessible RDM mappings were preventing the guest from booting.
How did these erroneous RDM mappings get there? According to VMware support, they have to be added manually by someone, but I don’t think anyone on the IT staff did that. The only possibility I can think of is that the RDMs resulted from some NetApp SnapManager for Exchange (SME) issues the customer had been experiencing. SME mounts snapshots during its verification process, and if that process fails the snapshots are often left ‘stuck’. I mentioned this to VMware support, and they said this type of thing would only be attached at the host level, not the guest level, so it remains a mystery how these RDM mappings got there.
Resolution:
1) I clicked ‘Manage Paths’ to determine which RDM mappings had a status of ‘dead’.
2) I took a screenshot of each RDM LUN path before removing it.
3) I removed only the RDM mappings that showed as ‘dead’.
4) The virtual machine automatically proceeded through the VMware BIOS / POST screen.
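If you prefer a text record alongside the screenshots from step 2, the check for ‘dead’ paths can be roughly sketched from the Service Console as well. This is only an illustration: the helper name list_dead_paths is made up, and it assumes the path listing marks bad paths with the word “dead”. The exact output format of esxcfg-mpath differs between ESX versions, so check what your host prints first.

```shell
# Read a path listing on stdin and print only the lines that mention a
# dead path. Matching is case-insensitive on the whole word "dead".
# list_dead_paths is a hypothetical helper name, not a VMware command.
list_dead_paths() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -i -w 'dead' || true
}

# Usage on the Service Console (output format varies by ESX version):
#   esxcfg-mpath -l | list_dead_paths > dead-paths.txt
```

This only reports paths; removing the mappings from the guest is still done in the VM’s properties as described above.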
Thank you very much. You have saved my day.
Thanks for this post.
Regards
Just thank you, a lot!
It just saved my maintenance window!
In my case, a LUN was removed from the storage and someone forgot to remove it from the VM.
Ty!
100 points for this post, which made it really easy to spot this issue. Fixed just by removing the dead RDM mappings.
Had exactly this problem today, a mysterious RDM appeared and halted a server boot. Removed it as described and problem solved.
Thanks for the fix. Saved my bacon…
That solved it for me too. Thank you for this post!
Thanks for this post.
It helped me with a server that wouldn’t start again after a patching restart.
I checked the “Hard Disk” items and found exactly the same conditions explained.
After I applied the recommendation, server booted up successfully.
Thanks again.
Thanks for the post. This helped with a backup crash we had, which left multiple raw LUN mappings behind.