Friday 27 March 2009

Continue the backup ND process

Remember that I’m following: http://www.ibm.com/developerworks/websphere/library/techarticles/0611_ashok/0611_ashok.html 

Apart from the problems I’ve had with network identification, everything has gone well.  I’m doing part 2 now: configuring the new “backup_dmgr01” profile as a backup of the config.

The custom property can be set on the administrative console (Figure 2) by navigating to System Administration => Node Agents => nodeagent => File Synchronization Service => Custom Properties => New.

From the ND admin console, go to the location above and create the following new property:

  • Name: recoveryNode
  • Value: true
  • Description: (optional)

Save and sync the config, and then stop and start your new profile's node agent.

This will send the network and disks berserk, as all of the applications are copied from your deployment node to the new profile.

In my case the additional java process used about 1:09 minutes of CPU and it was all done.

Now it’s time to create a script that will start the new deployment manager.  Part C.

  1. Run ./startNode.[sh | bat] -script. This command will generate start_nodeagent.sh (or .bat if Windows). You must then edit the script to point to the deployment manager configuration rather than the nodeagent.

  2. Rename the script to start_backup_dmgr.sh.

  3. Change the contents of the file; the key change is at the end of the file in the exec call:

In my case, the end of the script looked like:

"D:\IBM\WebSphere\AppServer\profiles\backup_dmgr01\config" "nsgshsjdndply02Cell01" "nsgshsjdndply02CellManager01" "Dmgr01" ""

Now I just have to wait and test it…

Tuesday 24 March 2009

MSDE corruptions, the new TAM corruption

Update packages built on top of a new full package were failing.  We were getting the following message on the deployment server (which has been working like clockwork).  I said clock!

[screenshot of the error message]

The logs say things like:

3328/744 FOREIGN_THREAD                             Tue Mar 24 20:24:21.669000                   Jdekprnt.c1086

            KNT0000119 - PRTTable_GetDefaultPrinterInfo failed to find printer:   User: JDEDBA Host: WinClient

3328/744 WRK:Starting jdeCallObject                  Tue Mar 24 20:24:22.638000                   pkgInit.cpp739

            PKGINT0023 - Package: PY812CU02. PathCode: PY812. Update parent package. Wait for BUSBUILD. Data by PathCode. Update Package.

But the kicker was:

3328/744 WRK:Starting jdeCallObject                  Tue Mar 24 20:25:15.233001                   Jdbodbc.c8330

            ODB0000164 - STMT:00 [HY000][823] [Microsoft][ODBC SQL Server Driver][SQL Server]I/O error (bad page ID) detected during read at offset 0x00000052e72000 in file 'D:\JDEdwards\E812\PY812\package\PY812CF\pkgspec\SPEC__PY812CF.mdf'.

So, I tried a query through Enterprise Manager and was greeted with the same errors:

Server: Msg 823, Level 24, State 2, Line 1

I/O error (bad page ID) detected during read at offset 0x00000052f74000 in file 'D:\JDEdwards\E812\PY812\package\PY812CF\pkgspec\SPEC__PY812CF.mdf'.

Connection Broken
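
If you want to hand the damaged location to a DBA, the offset in a message 823 error can be turned into a page number. Below is a minimal Python sketch, assuming the standard 8 KB SQL Server data page size. The function name and the regular expression are my own illustration and are not part of any JDE or SQL Server tooling.

```python
import re

PAGE_SIZE = 8192  # SQL Server data pages are 8 KB

# Matches the "at offset 0x... in file '...'" part of the error text above.
ERROR_823 = re.compile(r"at offset (0x[0-9a-fA-F]+) in file '([^']+)'")


def locate_bad_page(message):
    """Return (file_path, page_number) for an error 823 message, or None."""
    match = ERROR_823.search(message)
    if match is None:
        return None
    offset = int(match.group(1), 16)
    return match.group(2), offset // PAGE_SIZE


if __name__ == "__main__":
    msg = (r"I/O error (bad page ID) detected during read at offset "
           r"0x00000052e72000 in file "
           r"'D:\JDEdwards\E812\PY812\package\PY812CF\pkgspec\SPEC__PY812CF.mdf'.")
    print(locate_bad_page(msg))
```

The two offsets reported above (0x52e72000 in the JDE log and 0x52f74000 in Enterprise Manager) work out to two different pages, so more than one page of the file appears to be affected.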

Therefore, it seems that MSDE corruptions might be the new TAM corruption of old!  Woopie.

Continuation of last night

So I put in a call to the client's Windows server guys, who, as it seems, came up with the goods.

The instructions are below:

Open ‘Network Connections’
Select Advanced from the main menu
Select Advanced Settings
From the Adapters and Bindings tab folder change the network binding order for the public network so it will be above the private network.
Run ipconfig /all again and check whether the public network is displayed first.

Note that the order in the ipconfig output did not change for me. I rebooted the server and the ipconfig output was still the same... BUT I was able to federate my second node into the cell. So watch this space for the final installment of backup deployment management.

Monday 23 March 2009

Creating a backup Deployment Manager Node

I’m going to create a backup deployment node, as having the WebSphere deployment manager on the deployment server is NOT good enough.  I always thought that it was, but it ain't.  You need to create a solid backup of the DMgr01 functionality.

Firstly I’ve gone to one of my clustered production web servers and executed “AddProfile”.  This is done with menu options, or you can just run “D:\IBM\WebSphere\AppServer\bin\ProfileCreator\pctWindows.exe”.  This will walk you through the profile creation.  Make note of the SOAP port that is assigned to the new profile.  I called the profile backup_Dmgr01.

The profile is a full set of WAS binaries that you can install additional webServers under.

You then need to federate the new profile to the existing cell.  I kept getting the following error (I tried addNode from the new profile bin dir and I also tried to add it from the ND screens, pointing to the new SOAP default port 8882):

[23/03/09 20:52:51:964 EST] 0000000a AdminTool     E   ADMU0111E: Program exiting with error: com.ibm.websphere.management.exception.AdminException: ADMU0036E: The Deployment Manager cannot lookup by name host nsgshsjdnwb03.xxxxx.local at address 165.254.0.1

I did a quick Google search and found that you need to delete the localhost entry from the hosts file on the server (WTF???). Of course, that is what I would have guessed!

http://www.ibm.com/developerworks/websphere/library/techarticles/0812_howes/0812_howes.html

Above is the link to the fix.

Of course you know that on Windows, this file is C:\WINDOWS\system32\drivers\etc\hosts

addNode.bat DeploymentManagerHostName 8879

Hmm, so that did not work. I then added a manual entry for the fully qualified host name to the hosts file on the application server…  That did not work either.

Then I added a fully qualified host entry to the hosts file on the deployment manager node. 

I tried ipconfig /flushdns; this did not work either.

I’ve finally worked it out, bit embarrassing…  The web application server is part of a cluster, so it must be picking up the cluster address (the internal one), not the externally resolved IP address.

So, WebSphere is doing some funky query on the IP addresses of the machine, and the machine is returning the local one, not the proper IP.  I need to change the sequence of the adapters on a clustered windows 2003 machine…
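
A quick way to see what a machine reports for its own name is to ask the resolver directly and label each address it returns. This is a minimal Python sketch of that check; it is my own illustration and not what WebSphere does internally. The helper names `resolve_ipv4` and `describe` are made up for this example.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the IPv4 addresses the local resolver gives for hostname, in order."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        addr = info[4][0]
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def describe(addr):
    """Label an address as loopback, link-local, private or public."""
    ip = ipaddress.ip_address(addr)
    # Loopback and link-local are tested first because Python also
    # reports both of them as private.
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    if ip.is_private:
        return "private"
    return "public"


if __name__ == "__main__":
    name = socket.getfqdn()
    try:
        for addr in resolve_ipv4(name):
            print(name, addr, describe(addr))
    except OSError as exc:
        print("could not resolve", name, exc)
```

If the first address printed on the clustered box is the internal cluster one, that is probably the address the deployment manager is being handed as well.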

I might put the brakes on tonight, attack again tomorrow.

Monday 16 March 2009

Summary of manifest knowledge

Hi Everyone,

I thought that I’d mention a thing or two about package manifests.  Please feel free to correct me if I’m wrong.

Please read this carefully, hopefully it will explain some of the mysteries of packages in 812 and above.

Background

· The quasi web manifest (F989999) tells the web runtime engine where to get specifications from for real-time generation of objects. This is needed because the web server does not have a spec.ini

· The comparison of the quasi web manifest (F989999) and the package manifest (F98770) tells the web engine whether to truncate the current serialized objects

· A quasi manifest record is written to the F989999 table for each pathcode. This tells the web engine what full package was used to generate the objects.

· This quasi record, plus the addition of OCMs (to find default logic source), then F96511 to find full package for logic source, then F98770FULLPACKAGENAME package manifest allows the web engine to determine if it is in synch or not.

· You can see this “quasi web manifest” record in F989999 using UTB, where wboid = “MANIFEST-MANIFEST”

· Oracle syntax:  Select utl_raw.cast_to_varchar2(DBMS_LOB.SUBSTR(wbjpo, 2000, 1)) from dv812.f989999 where WBoID = 'MANIFEST-MANIFEST';

· Only server package deploys affect what is in the F98770 and F96511 tables

· Only generations affect what is written in the F989999 table

· You need to keep the above two in synch, or the system WILL do it for you

· If the manifest record (F989999) does not equal what is found in the F98770 manifest table, then the web engine might decide to truncate the web objects

· It’s harder to work out what is in the F98770 table, as it contains the base package and all of the updates.  You need to use the P98770 application
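
The lookup chain in the bullets above can be pictured as three dictionary lookups followed by a comparison. This is only a toy Python model of my understanding: the real tables have many more columns, the real check is internal to the web engine, and the names `LOGICSRV` and `PY812CF+PY812CU02` are placeholders invented for the example.

```python
def deployed_manifest(environment, ocm, f96511, f98770):
    """Follow environment -> logic server -> full package -> package manifest."""
    logic_server = ocm[environment]        # OCM: default logic source
    full_package = f96511[logic_server]    # F96511: full package for that server
    return f98770[full_package]            # F98770: manifest for that package


def out_of_sync(web_manifest, environment, ocm, f96511, f98770):
    """True when the F989999 record differs from the deployed package manifest."""
    return web_manifest != deployed_manifest(environment, ocm, f96511, f98770)


if __name__ == "__main__":
    ocm = {"PY812": "LOGICSRV"}                  # hypothetical server name
    f96511 = {"LOGICSRV": "PY812CF"}
    f98770 = {"PY812CF": "PY812CF+PY812CU02"}    # base package plus updates
    print(out_of_sync("PY812CF", "PY812", ocm, f96511, f98770))
```

When `out_of_sync` returns True in this model, that is the situation where the web engine might decide to truncate the serialized objects.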

Generating package manifest from gen.bat.

In your generation jas.ini

[WEB DEVELOPMENT]

WebDevelopment=1

· You will not get the E1 login screen if the setting above is active. Once you hit connect, you will get a smaller login box instead.

· Gen.bat does not use values in F96511 and F98770 to determine what to write in F989999

· Just uses spec.ini on generation machine to write MANIFEST-MANIFEST record to F989999

· Need to manually choose “generate manifest” from generator screen.

· This will generate from your currently deployed local MSDE / SSE package, as named in spec.ini

· IF you DO NOT SEE the manifest option in the generate menu bar, the WebDevelopment Option is NOT enabled, please read the section below

Do not enable this setting for the web server jas.ini, things will stop working!!!

Default setting or

[WEB DEVELOPMENT]

WebDevelopment=0

· This is the default setting

· When using gen.bat, you will be prompted with the standard E1 login splash screen if this is the setting

· You will not see the “generate->manifest” menu option

· In this mode, the generation machine determines the active deployed package for the current session (OCM + F96511 + F98770)

· This is the only information that is written to the F989999 MANIFEST-MANIFEST entry.

· If you only generate 1 object, this is what is written to the F989999 table for the manifest

· IF you have not deployed your server package, and use this option – guess what??  EVEN if you have taken the client package on the generation machine, it will NOT generate from your local specs, it will generate from central objects and the package manifest.  It will generate the old currently deployed server objects!!!

· This option does not talk to the local MSDE / SSE specs, everything is done to and from the central objects database
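
To check which of the two modes a given jas.ini is in before running gen.bat, the setting can be read with a few lines of Python. This is a minimal sketch, assuming the file parses as a plain INI file (a real jas.ini may contain lines that Python's `configparser` rejects). The function name is my own.

```python
import configparser


def web_development_enabled(ini_text):
    """Return True when [WEB DEVELOPMENT] WebDevelopment=1 is set in the text."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    # Option names are matched case-insensitively; a missing section or
    # option falls back to "0", which is the default setting.
    value = parser.get("WEB DEVELOPMENT", "WebDevelopment", fallback="0")
    return value.strip() == "1"


if __name__ == "__main__":
    sample = "[WEB DEVELOPMENT]\nWebDevelopment=1\n"
    print(web_development_enabled(sample))
```

To use it against a real file, read the file contents into a string first and pass that in.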

manifest

This is a fairly handy post, if I do say so myself.

This firstly shows you how to convert a BLOB to readable text in Oracle SQL*Plus, and also shows you how to display manifest information for E1 8.12 and above using SQL.

select
utl_raw.cast_to_varchar2(DBMS_LOB.SUBSTR(wbjpo, 2000, 1)) from dv812.f989999 where WBoID = 'MANIFEST-MANIFEST';

This will reveal something like:

¿¿ sr 9com.jdedwards.base.spec.manifest.SerializedObjectManifest 1¿? xpw
¿ Dcom.jdedwards.base.spec.manifest.SerializedObjectManifest/2005-10-07 DV812 
DV812IF ¿F¿ ¿F¿  0¿¿  0¿¿
nsgshsjdnwb01 ~ MOIRS  0¿¿
n
sgshsjdnwb01 ~ MOIRSx

If you search through it, you'll find the important bits:

Note that if you've taken the planner ESU, you can just run the following application (P98770) which will tell you the same information. But if you can't bring the web up - then you might need to run my SQL above.
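
If you have the raw BLOB bytes rather than the SQL*Plus output, the readable parts can be pulled out in the same way the Unix `strings` tool would do it. This is a minimal Python sketch that keeps runs of printable ASCII; it does not decode the Java serialization format, and the function name and the minimum run length of 4 are my own choices.

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


if __name__ == "__main__":
    # Made-up bytes that imitate the shape of the output above.
    sample = b"\x00\x05DV812\x00\x07DV812IF\x00\x0dnsgshsjdnwb01\x00ab\x00"
    for run in printable_runs(sample):
        print(run)
```

Short fragments are dropped, so what remains is mostly the pathcode, package and machine names.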

Thursday 5 March 2009

apache redirect from port 80 to another

NameVirtualHost E1DEFAULT:80

<VirtualHost E1DEFAULT:80>
ServerName E1DEFAULT
RedirectMatch permanent ^/$ http://PROPERSERVERNAME:81/jde/owhtml/index.html
</VirtualHost>
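
The pattern `^/$` only matches a request for the bare root, so deeper URLs on port 80 are left alone. This small Python sketch shows which paths would be caught. Apache uses PCRE rather than Python's `re` module, but for a pattern this simple I would expect the two to behave the same. The function name is my own.

```python
import re

ROOT_ONLY = re.compile(r"^/$")  # same pattern as in the RedirectMatch line


def is_redirected(path):
    """Return True when the request path would match the redirect pattern."""
    return ROOT_ONLY.search(path) is not None


if __name__ == "__main__":
    for path in ["/", "/index.html", "/jde/owhtml/index.html"]:
        print(path, is_redirected(path))
```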