vSphere host disconnects from vCenter: host not responding
When vSphere 4 became available I was keen to upgrade all of my hosts and vCenter servers to start testing features like Fault Tolerance, Site Recovery Manager, and vCenter Linked Mode. The host upgrade went smoothly using VMware Update Manager. Similarly, the upgrade of all vCenter servers was also without issue. A short while after, I needed to add RAM to a group of hosts, and all was fine until I booted the first host after upgrading its RAM.
When the host had finished booting it naturally tried to reconfigure HA as necessary, but within two minutes it appeared as disconnected from vCenter. Initially I was a little stumped, because I instinctively tried to SSH onto the host and was able to do so. This immediately ruled out any form of network or connectivity problem. For the sake of speed I decided to restart the host, just in case a service had failed while it was booting up. After the host restarted it appeared in vCenter as usual. I then noticed an error on the host and checked the Alarms tab. The alarm triggered was:
Host connection and power state
The alarm was weird since I had only upgraded the RAM on the host. I couldn't see how the alarm related to the host disconnecting from the vCenter server. I assumed I had disturbed one of the power cables on the host, but a quick check revealed both PSUs in the host had a live power feed. In any case, this alarm doesn't actually monitor the power.
I decided to use the vSphere client to connect directly to the host. As soon as it connected I received an information message explaining the host was being managed by the vCenter server, but the IP address listed (for the vCenter server) was 127.0.0.1. This was clearly incorrect.
When a host is added to vCenter, it uses the vCenter IP address stored in the following vCenter setting:
Administration --> vCenter Server Settings --> Runtime Settings --> vCenter Server Managed IP
A quick look at this setting on the vCenter server revealed that the Managed IP was blank.
I assumed that, because the Managed IP was blank, hosts were trying to connect to themselves instead of the vCenter server.
I used nano to open /etc/opt/vmware/vpxa/vpxa.conf on the host to check the vCenter server address recorded there.
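For anyone who would rather script this check than read the file by eye, here is a minimal Python sketch. It assumes vpxa.conf is XML and stores the vCenter address in a `<serverIp>` element, which is how I understand the file to be laid out. The sample string is an illustrative stand-in, not the actual contents of my file, and the function names are my own.

```python
import ipaddress
import xml.etree.ElementTree as ET

# Illustrative stand-in for a vpxa.conf fragment (an assumption, not real output).
SAMPLE = "<config><vpxa><serverIp>127.0.0.1</serverIp></vpxa></config>"


def managed_server_ip(vpxa_xml):
    """Return the text of the first <serverIp> element, or None if absent or empty."""
    root = ET.fromstring(vpxa_xml)
    node = root.find(".//serverIp")
    if node is None or node.text is None:
        return None
    return node.text.strip() or None


def is_suspect(ip):
    """True when the address is missing, not a valid IP, loopback, or unspecified."""
    if ip is None:
        return True
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not parseable as an IP address; treat as suspect in this sketch.
        return True
    return addr.is_loopback or addr.is_unspecified


if __name__ == "__main__":
    ip = managed_server_ip(SAMPLE)
    print("serverIp:", ip, "(suspect)" if is_suspect(ip) else "(looks plausible)")
```

To run it against a real host you would read the file contents (for example with `open("/etc/opt/vmware/vpxa/vpxa.conf").read()`) and pass them to `managed_server_ip`. A loopback result such as 127.0.0.1 is the symptom described above.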
After changing the vCenter Server Managed IP address using the vSphere client, I disconnected from both the vCenter server and the host. I then reconnected to the host using the vSphere client, and it was now being managed by the vCenter server with the correct IP address. A quick check on the vCenter server itself revealed that the Host connection and power state alarm had cleared. I also verified that the vpxa.conf on the host was reporting the correct IP address.
I assume that this issue may have been caused during the upgrade from ESX 3.5 to vSphere 4. I had never experienced this problem before and it has not recurred since.