Monday 23 March 2009

Creating a backup Deployment Manager Node

I’m going to create a backup deployment manager node, because having the WebSphere deployment manager only on the deployment server is NOT good enough.  I always thought that it was, but it ain’t.  You need to create a solid backup of the DMgr01 functionality.

Firstly I’ve gone to one of my clustered production web servers and executed “AddProfile”.  This is done with menu options, or you can just run “D:\IBM\WebSphere\AppServer\bin\ProfileCreator\pctWindows.exe”.  This will walk you through the profile creation.  Make note of the SOAP port that is assigned to the new profile.  I called the profile backup_Dmgr01.

The profile is a full set of WAS binaries that you can install additional webServers under.

You then need to federate the new profile to the existing cell.  I kept getting the following error, both when I ran addNode from the new profile’s bin directory and when I tried to add the node from the ND screens, pointing at the new profile’s default SOAP port 8882:

[23/03/09 20:52:51:964 EST] 0000000a AdminTool     E   ADMU0111E: Program exiting with error: com.ibm.websphere.management.exception.AdminException: ADMU0036E: The Deployment Manager cannot lookup by name host nsgshsjdnwb03.xxxxx.local at address 165.254.0.1
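The error is about a name lookup, so a quick sanity check is to ask the operating system what a host name resolves to and compare it with the address in the message. The following is only a rough sketch in Python, not anything WebSphere ships: `check_lookup` is a hypothetical helper name, and the `resolver` parameter exists purely so the check can be exercised without a real network.

```python
import socket


def check_lookup(host, reported_ip, resolver=socket.gethostbyname_ex):
    """Return (ok, addresses).

    ok is True when reported_ip is one of the IPv4 addresses that
    `host` resolves to according to the resolver.
    """
    try:
        # gethostbyname_ex returns (canonical_name, aliases, ipv4_addresses)
        _name, _aliases, addresses = resolver(host)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError
        return False, []
    return reported_ip in addresses, addresses


# Example call (run on the deployment manager box, substituting the host
# name and address from your own ADMU0036E message):
#   ok, addresses = check_lookup("node.example.local", "10.0.0.5")
#   print(ok, addresses)
```

If `ok` comes back False, the machine running the check does not map that name to the address the node reported, which is the kind of mismatch the message complains about.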

I did a quick Google search and found that you need to delete the localhost entry from the hosts file on the server (WTF???).  Of course, that is what I would have guessed!

http://www.ibm.com/developerworks/websphere/library/techarticles/0812_howes/0812_howes.html

Above is the link to the fix.

Of course you know that on Windows, this file is C:\WINDOWS\system32\drivers\etc\hosts

addNode.bat DeploymentManagerHostName 8879

Hmm, so that did not work.  Then I added a manual entry for the fully qualified host name to the hosts file on the application server…  That did not work either.

Then I added a fully qualified host entry to the hosts file on the deployment manager node. 
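With this many manual edits to hosts files on two machines, it helps to see exactly which addresses a file maps to a given name. Here is a small Python sketch for that; `hosts_entries` is a made-up helper name, and the sample host names are placeholders rather than the real servers.

```python
def hosts_entries(text, name):
    """Return the IP addresses that a hosts-file body maps to `name`.

    Matching is case-insensitive, and anything after a '#' is
    treated as a comment and ignored.
    """
    matches = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # skip blank and comment-only lines
        ip, *names = line.split()
        if name.lower() in (n.lower() for n in names):
            matches.append(ip)
    return matches


# Example call on Windows (path as given above):
#   with open(r"C:\WINDOWS\system32\drivers\etc\hosts") as f:
#       print(hosts_entries(f.read(), "localhost"))
```

Running it for both `localhost` and the fully qualified name on each box shows at a glance whether an edit actually took, and whether a name is mapped to more than one address.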

I tried ipconfig /flushdns.  This did not work.

I’ve finally worked it out, bit embarrassing…  The web application server is clustered, so it must be picking up the internal cluster address, not the externally resolved IP address.

So, WebSphere is doing some funky query on the IP addresses of the machine, and the machine is returning the local one, not the proper IP.  I need to change the sequence of the adapters on a clustered Windows 2003 machine…
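To see what order the machine hands its own addresses back in, something like the Python sketch below can be run on the clustered node. It is a guess at the symptom, not a description of what WebSphere does internally: I am assuming that whatever comes first in the list is what gets picked up. `local_addresses` and `first_is_expected` are hypothetical names, and `resolver` is injectable only so the logic can be tried without touching the network.

```python
import socket


def local_addresses(resolver=socket.gethostbyname_ex, hostname=None):
    """Return the IPv4 addresses this machine's host name resolves to, in order."""
    host = hostname or socket.gethostname()
    # gethostbyname_ex returns (canonical_name, aliases, ipv4_addresses)
    _name, _aliases, addresses = resolver(host)
    return addresses


def first_is_expected(addresses, expected_ip):
    """True when the first address in the list is the one you want advertised."""
    return bool(addresses) and addresses[0] == expected_ip


# Example call on the node (the address is a placeholder for the public IP):
#   addrs = local_addresses()
#   print(addrs, first_is_expected(addrs, "10.0.0.5"))
```

If the internal cluster address shows up first, that would be consistent with the adapter order being the problem.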

I might put the brakes on tonight, attack again tomorrow.
