Monday 21 July 2014

JDE SQL Packages again

Wow, these things are really painful (sometimes).

They are great when they work, because they seem to expedite query execution, nice… maintain themselves – nice… but become very large – not nice, can become slow – not nice, do not delete themselves – not nice.

So, a rather large client of mine was getting lots of web instability.  All they could come up with is that they did a build recently and promoted some menus..  Menu’s were being particularly troublesome too.  We were getting things like the following in the logs:

18 Jul 2014 07:48:32,984 [SEVERE]  - [BASE]            com.jdedwards.database.base.JDBException: [SQL_EXCEPTION_OCCURRED] An SQL exception occurred: [SQL0901] SQL system error.. com.jdedwards.database.base.JDBException: [SQL_EXCEPTION_OCCURRED] An SQL exception occurred: [SQL0901] SQL system error..
18 Jul 2014 07:48:40,237 [SEVERE]  - [JDBJ]            SQLException occured in the SQLPhysicalConnection.select(): | Table or View Name = F983051 - Data Source[0] = Versions - PD910 java.sql.SQLException: [SQL0901] SQL system error.
18 Jul 2014 07:48:40,237 [SEVERE]  - [BASE]            com.jdedwards.database.base.JDBException: [SQL_EXCEPTION_OCCURRED] An SQL exception occurred: [SQL0901] SQL system error.. java.sql.SQLException: [SQL0901] SQL system error.
18 Jul 2014 07:48:40,237 [SEVERE]  - [BASE]            com.jdedwards.database.base.JDBException: [SQL_EXCEPTION_OCCURRED] An SQL exception occurred: [SQL0901] SQL system error.. com.jdedwards.database.base.JDBException: [SQL_EXCEPTION_OCCURRED] An SQL exception occurred: [SQL0901] SQL system error..
18 Jul 2014 07:49:07,444 [SEVERE]  - [JDBJ]            SQLException occured in the SQLPhysicalConnection.select(): | Table or View Name = F983051 - Data Source[0] = Versions - PD910 java.sql.SQLException: [SQL0901] SQL system error.
18 Jul 2014 07:49:07,444 [SEVERE]  - [BASE]            com.jdedwards.database.base.JDBException: [SQL_EXCEPTION_OCCURRED] An SQL exception occurred: [SQL0901] SQL system error.. java.sql.SQLException: [SQL0901] SQL system error.

But see from the above, it’s talking about versions, so we are having lots of menu issues – but the table is the versions table.

An incorrect assumption that we made in the beginning is that there must be a problem with the SQL package in the versions data source.  Alas this is not correct. JDBC is not like ODBC with the placement of SQL Packages for JDE, essentially the SQL package seems to be placed in the library that is first accessed by the connection – this seems to be often control tables to find the menus.  So you’ve got all sorts of access plans being stored in the control tables SQL Package.  Or what it might be is stored in the library where that particular connection went to first – still control tables in my scenario.

Despite the above being in central objects, we could not delete the central objects SQL packages, because they were locked – locked by the 6 production JVM’s no less.  Another reason why we thought that this was the panacea that we were looking for.

Of course we delete these SQL package with an outage and then started to get the errors again after some time of the system being up (you know why now, the new connections [due to JDBJ connection pooling were going to a different library first and therefore starting to use the corrupt SQL package and bombing out!]. 

image

See the above on the menus being broken.

Remember that this can get really bad, that is when you need to build the AS/400 cross reference.  I’ve been there a number of times.  This is when you cannot query the existence of a able through the catalogs (QSYS2/SYSTABLES), but the file exists.  It means that the AS/400 catalogs are broken and the only way out is a rebuild – yuk!

But remember that you can TEST the database cross reference file by following step 1 below.  At least this means that you’ll know whether you need to organise some downtime.  If you need the downtime, then step 2 is your next step.  I’d delete every NOT Q SQL package and the one that is listed in step 2.

  1. Corrupted Database Cross Reference file
    Use command GO LICPGM, then option 5 (prepare for install), then option 1 on Verify System Objects.  The following results can come up:
    • Task to prepare for install successfully completed - - - is a good result
    • Error occurs during prepare for install - - - rebuild DB Cross Reference file or run RCLSTG *DBXREF from a restricted state to rebuild the system-cross reference files.  On an AS400, use commands: ENDSBS *ALL   and  RCLSTG SELECT(*DBXREF)
  2. Corrupted Database Host Servers SQL package
    • Delete IBM SQL package QZDAPKG from QGPL library - - - QZDAPKG on an AS400 can become unusable because of corruption, size limitations, or an incompatibility between releases.  All QZDASOINIT, QZDASSINIT, and QZDAINIT jobs hold a lock on QZDAPKG.  To delete QZDAPKG use these steps:
      • ENDHOSTSVR *DATABASE
      • ENDPJ QUSRWRK QZDASOINIT *IMMED
      • ENDPJ QUSRWRK QZDASSINIT *IMMED
      • ENDPJ QSERVER QZDAINIT *IMMED
      • DLTSQLPKG QGPL/QZDAPKG
      • STRHOSTSVR *DATABASE
      • STRPJ QSERVER QZDAINIT
  3. Obtain an AS400 Joblog which should provide further diagnostic information.

What’s my wrap – similar to last time.  Unpredictable behaviour and no changes made in the application, reach for your closest SQL Package and remove it (not if it starts with Q – be VERY careful with those!).

No comments: