Thursday 5 November 2009

JDESPECRESULT_DUPLICATEKEY in metadata kernel logs

We’ve been getting UBEs failing randomly.

In the UBE Log we have:

Failed to load job spec cache for job 103247 report R09801_ZJDE0002. Cannot load specs for UBE. Please remove the cache directories for this report under the /PD812/spec/runtimeCache directory and rerun the report. If errors persist, check the metadata kernel logs for more information (object could be missing from package, failed to access package database, etc.)

Hmm, not really that helpful.  But the metadata kernel logs are much better:

79420/6683 WRK:Metadata job 804 Thu Nov 5 00:17:40.634859 ServerDispatch.cpp1419
MD_INFO: Previous UBE Job Failed (R03B551_SMNG051S_103114, Pathcode= isTemplate=<1>)
479420/515 WRK:Metadata job 791 Thu Nov 5 00:17:45.898359 specmisc.c2448
CreateJobCache() - Failed to find version record for R03B551_+ in RDASpec source repository for job 0.
479420/515 WRK:Metadata job 791 Thu Nov 5 00:17:45.899167 ServerDispatch.cpp728
Job Cache Template creation for report_version = R03B551_NNWN050S Failed, job = 103118
479420/6426 WRK:Metadata job 803 Thu Nov 5 00:17:50.790448 specmisc.c3386
copySpecs() - jdeSpecInsert() failed with error code, JDESPECRESULT_DUPLICATEKEY. Record was selected from another repository using the key, R03B551.+.

So, we see the duplicate key problem.  The duplicate is hit while F98761 is being converted to TAM in the runtimeCache dir.
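If you have a pile of metadata kernel logs to wade through, a quick script can pull out the offending keys. This is only a rough sketch: the pattern is built from the one copySpecs() line quoted above, so other message layouts (or lines wrapped by the terminal) will not match, and `duplicate_keys` is just a name I made up.

```python
import re
from collections import Counter

# Built from the single copySpecs() message shown in this post (assumption:
# other releases may word the message differently).
DUP_KEY = re.compile(r"JDESPECRESULT_DUPLICATEKEY\..*using the key, (\S+)\.\s*$")


def duplicate_keys(lines):
    """Count how often each spec key shows up in a DUPLICATEKEY message."""
    counts = Counter()
    for line in lines:
        match = DUP_KEY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feed it the lines of a log file (for example `duplicate_keys(open(path))`) and the keys that turn up most often point at the reports that are colliding.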

I had a sniff around with TAM Browse, thinking that there might be a corruption in the base object, but found nothing.  It all looked good.  All of the correct constraints were on the tables too, so no duplicates could have hit the database.

Then Oracle Support came up with the goods.

You CANNOT run the same REPORT|VERSION combo at the same time!!!!  WTF?  What happened to concurrent applications, concurrent UBEs?  Their magical solution is to ensure that jobs are not run at the same time (for the first time).  HUH????  This is madness.  So you deploy a full package, take JDE down…  Wait 10 mins…  Scheduler is ready to fire off a bunch of jobs…  There might be some waiting jobs already…  They all want to run at the same time down multi-threaded queues and corrupt each other.  So the jobs you run more often are more prone to the corruption!  Nice work!

SAR is 8933751

NEWS FLASH

It gets worse.  You cannot run parallel reports until a runtimeCache has been established for the report.  So say you want to launch 4 versions of a report at the same time from the scheduler (after a full package deploy): you cannot.  They will fight each other to write the runtimeCache specs for the base report (/PD812/spec/runtimeCache/R0010P/rdaspec.ddb etc.) and will write corrupt specs, fail, or both!
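A cheap way to see whether a report is still "cold" is to look for its spec file under runtimeCache. The sketch below assumes the layout from the path above (one directory per base report holding rdaspec.ddb); `has_runtime_cache` is my own helper name, and a half-written file would still count as present, so treat it as a hint and not as proof that the cache is healthy.

```python
from pathlib import Path


def has_runtime_cache(cache_root, report):
    """Return True if <cache_root>/<report>/rdaspec.ddb exists.

    Assumes the directory layout seen in this post; only that one
    file is checked, and its contents are not validated.
    """
    return (Path(cache_root) / report / "rdaspec.ddb").is_file()
```

For example, `has_runtime_cache("/PD812/spec/runtimeCache", "R0010P")` would tell you whether R0010P has been through its first run since the last full package.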

Remember that a full package deploy will / should delete all runtimeCache specs.

So all UBEs need to be single-threaded until each has been run at least once.  Nice business rules!
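If you have to live with this, one workaround is to split the scheduled jobs into a warm-up wave that goes down a single-threaded queue and a second wave that can go parallel. The following is only a sketch of that idea, not anything Oracle supplied: `plan_waves` and `warm_reports` are names I invented, the version names in the example are made up, and I am assuming that one completed run per REPORT|VERSION is enough to make the rest safe.

```python
def plan_waves(jobs, warm_reports):
    """Split (report, version) jobs into a serial wave and a parallel wave.

    jobs: iterable of (report, version) tuples in launch order.
    warm_reports: set of base reports believed to have a runtimeCache already.
    The first occurrence of each cold REPORT|VERSION goes into the serial
    wave; everything else is left for the parallel wave.
    """
    seen = set()
    serial, parallel = [], []
    for report, version in jobs:
        combo = (report, version)
        if report in warm_reports or combo in seen:
            parallel.append(combo)
        else:
            seen.add(combo)
            serial.append(combo)
    return serial, parallel
```

With jobs `[("R0010P", "V1"), ("R0010P", "V2"), ("R0010P", "V1")]` and nothing warm, the first two land in the serial wave and the repeat of V1 waits for the parallel wave. The parallel wave must not be released until the serial wave has finished.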
