Monday 20 October 2008

WAS and ports and locks and stuff

Well, I've just spent the morning recovering from a doozie of a problem. The JAS.INI for a web server was changed to have the [WEB DEVELOPMENT] section inserted with the WebDevelopment=true line added too. Holy moly - what a f*ck up!. This had the affect of letting people in, but they could not run reports. There were errors in the logs about:

20 Oct 2008 10:03:51,979 [SEVERE] MOIRS - [METADATA] Cause exception com.jdedwards.database.base.JDBException: [DATABASE_CONNECT_FAILED] Database Connection failed for DataSource SPEC_MSDE.
20 Oct 2008 10:03:51,979 [SEVERE] MOIRS - [METADATA] JDESpec error code = JDBj_SELECT_FAILURE com.peoplesoft.pt.e1.base.metadata.dataAccess.JDESpecException: JDBj select to spec RDB table failed for Spec Type RDASPEC.
20 Oct 2008 10:03:51,995 [SEVERE] MOIRS - [JAS] SubmitUBE.launchUBE() : Exception thrown during UBE submission! | Report Name : R0010P, Version Name : XJDE0001, Server Name : null, User Name : MOIRS com.jdedwards.system.kernel.LaunchUBEException: JDBj select to spec RDB table failed for Spec Type RDASPEC.
com.jdedwards.system.kernel.LaunchUBEException: JDBj select to spec RDB table failed for Spec Type RDASPEC.
at com.jdedwards.jas.SubmitUBEXML.packageSpecsAndSubmit(Unknown Source)
at com.jdedwards.jas.SubmitUBE.launchUBE(Unknown Source)
at com.jdedwards.jas.JDEOWDirect.jdeOWDirect(Unknown Source)
at com.jdedwards.jas.JDEOWDirect.processServerRequest(Unknown Source)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java(Compiled Code))
at java.lang.reflect.Method.invoke(Method.java(Compiled Code))
at com.jdedwards.jas.ServletProxy.callServerComponent(Unknown Source)
at com.jdedwards.jas.ServletProxy.servletProxy(Unknown Source)
at com.jdedwards.jas.ServletProxy.dispatch(Unknown Source)
at com.jdedwards.jas.net.NetConnection.callJAS(Unknown Source)
at com.jdedwards.jas.net.NetManager.processRequest(Unknown Source)


So, to find the problem I'm thinking, why is only this cluster member trying to load serialized objects from MSDE??? It could not be the manifest, it had to be the jdbj.ini (or jas.ini as in this case). I checked the [BOOTSTRAP] DS settings, I thought that it was going to be this, and it was not. I then stumbled on the JAS.INI and found the Web-Development setting.

Ok, great, I know why everything is f*cked, now.. Easy fix. Send everyone a SAW message. Restart the JVM - bing bang bong. The perfect crime...

But the instance would not start again, I kept getting the errors below:

[20/10/08 10:30:03:478 EST] 0000001f ChannelFramew E CHFW0029E: Error initializing chain DCS because of exception com.ibm.wsspi.channel.framework.exception.RetryableChannelException: Address already in use: bind
[20/10/08 10:30:03:509 EST] 0000001d TCPPort E TCPC0003E: TCP Channel TCP_5 initialization failed. The socket bind failed for host nsgshsjdnwb02.news.newslimited.local and port 9352. The port may already be in use.
[20/10/08 10:30:03:509 EST] 0000001d ChannelFramew E CHFW0029E: Error initializing chain DCS-Secure because of exception com.ibm.wsspi.channel.framework.exception.RetryableChannelException: Address already in use: bind
[20/10/08 10:30:08:494 EST] 0000001f TCPPort E TCPC0003E: TCP Channel TCP_5 initialization failed. The socket bind failed for host nsgshsjdnwb02.news.newslimited.local and port 9352. The port may already be in use.
[20/10/08 10:30:08:494 EST] 0000001f ChannelFramew E CHFW0029E: Error initializing chain DCS because of exception com.ibm.wsspi.channel.framework.exception.RetryableChannelException: Address already in use: bind
[20/10/08 10:30:08:525 EST] 0000001d TCPPort E TCPC0003E: TCP Channel TCP_5 initialization failed. The socket bind failed for host nsgshsjdnwb02.news.newslimited.local and port 9352. The port may already be in use.


So, I use the netstat command to find the pid that is locking the port:

netstat -b -v | findstr 9352

It tells me launchJVM.exe is doing this. Note that before being this smart, I tried restarting the nodeagent, then the network deployment node agent - of course none of this worked.

So, WTF??? This WEB-DEVELOPMENT section in the JAS.INI must control the LaunchJVM.exe process. This was locking all of my ports and making the machine a cluster f&ck. I did not want to bounce the entire machine, because half my users were happy in Prod.

Anyway, we are all happy now and I've got faith in windows netstat.

No comments: