108/1168 Tue Dec 11 21:49:02.125002 jdbodbc.C7611
ODB0000164 - STMT:00 [08S01][10054][2] [Microsoft][SQL Server Native Client 11.0]TCP Provider: An existing connection was forcibly closed by the remote host.
108/1168 Tue Dec 11 21:49:02.125003 jdbodbc.C7611
ODB0000164 - STMT:01 [08S01][10054][2] [Microsoft][SQL Server Native Client 11.0]Communication link failure
108/1168 Tue Dec 11 21:49:02.125004 JDB_DRVM.C998
JDB9900401 - Failed to execute db request
108/1168 Tue Dec 11 21:49:02.125005 JTP_CM.C1335
JDB9900255 - Database connection to F98611 (PJDEENT02 - 920 Server Map) has been lost.
108/1168 Tue Dec 11 21:49:02.125006 JTP_CM.C1295
JDB9900256 - Database connection to (PJDEENT02 - 920 Server Map) has been re-established.
108/1168 Tue Dec 11 21:49:02.125007 jdbodbc.C2702
ODB0000020 - DBInitRequest failed - lost database connection.
108/1168 Tue Dec 11 21:49:02.125008 JDB_DRVM.C908
JDB9900168 - Failed to initialize db request
Who loves spending the morning fixing jobs from the night before and moving batch queues and UBEs until things are back to normal? No one!
Here is something that may help. I must admit I have to thank an amazing colleague for this: it's not my SQL, but I do like it.
What you need to do is write a basic shell script (say, one that lives on the enterprise server) that runs this:
select count (*) from SY910.F91300
where SJSCHJBTYP = '1'
and SJSCHSTTIME > (select
((extract(day from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*86400+
extract(hour from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*3600+
extract(minute from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*60+
extract(second from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00')))/60)-60 current_utime_minus_1hour
from dual);
If you get 1, that is good. If you get 0, that is bad: the control record has not been updated in the last hour, and you probably need to recycle your scheduler kernel (that control record should change at least every 15 minutes).
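The subquery is the hard-to-read part of that statement. It works out the current Unix time in minutes and subtracts 60, which suggests SJSCHSTTIME is stored as minutes since 1 January 1970 UTC. As a rough cross-check, the same cutoff can be computed at the shell. The function name `cutoff_minutes` is made up for illustration, and this is only a sketch of the arithmetic, not something that ships with JD Edwards.

```shell
# Hypothetical helper: given a Unix time in seconds, print the cutoff the
# query compares SJSCHSTTIME against (epoch minutes minus 60).
# Shell arithmetic truncates to whole minutes; the SQL keeps the fraction.
cutoff_minutes() {
    echo $(( $1 / 60 - 60 ))
}

# Cutoff for "now", i.e. one hour ago expressed in epoch minutes.
cutoff_minutes "$(date +%s)"
```

Comparing that number with the SJSCHSTTIME value on the control record is a quick way to see how stale the record really is.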
So, if you have a script that runs that, you can tell if the kernel is updating the control record...
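A minimal sketch of such a script, assuming an Oracle database reachable through `sqlplus`. `DB_CONNECT` and the two function names are placeholders of mine, not anything from the original setup, so treat this as a starting point rather than the finished article.

```shell
#!/bin/sh
# Sketch of the scheduler heartbeat check.
# DB_CONNECT is a hypothetical sqlplus connect string
# (for example user/password@tnsalias).

# Run the query from the post and print only the count.
scheduler_heartbeat_count() {
    sqlplus -s "$DB_CONNECT" <<'EOF'
whenever sqlerror exit failure
set heading off feedback off pagesize 0
select count(*) from SY910.F91300
where SJSCHJBTYP = '1'
and SJSCHSTTIME > (select
((extract(day from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*86400+
extract(hour from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*3600+
extract(minute from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*60+
extract(second from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00')))/60)-60 current_utime_minus_1hour
from dual);
exit
EOF
}

# Succeed when the count is 1 or more, i.e. the control record was
# updated within the last hour. Anything else (0, empty, an error
# message) counts as a failure.
heartbeat_ok() {
    count=$(printf '%s' "$1" | tr -d '[:space:]')
    [ "$count" -ge 1 ] 2>/dev/null
}
```

A cron entry could then run something like `heartbeat_ok "$(scheduler_heartbeat_count)" || echo "scheduler kernel looks stale"` every 15 minutes or so and raise an alert on failure.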
Then you can grep through the logs to find the PID of the scheduler kernel and kill it from the OS. Then I wrote a little executable that gives the scheduler kernel a kick in the pants (starts a new one) - and BOOM! You have a resilient JD Edwards scheduler.
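The grep-and-kill step might look something like the sketch below. It assumes the kernel logs are named `jde_<pid>.log`; the log directory and the text that identifies the scheduler kernel are passed in as arguments because they differ between installs, and both function names are hypothetical. The little executable that starts a new kernel is not shown here.

```shell
# Print the PID of a live process whose jde_<pid>.log mentions the pattern.
# $1 = log directory, $2 = text that identifies the scheduler kernel
# (both are assumptions; check your own logs for the right string).
find_kernel_pid() {
    log_dir=$1
    pattern=$2
    for f in $(grep -l -- "$pattern" "$log_dir"/jde_*.log 2>/dev/null); do
        # Pull the PID out of the file name.
        pid=$(basename "$f" | sed -n 's/^jde_\([0-9][0-9]*\)\.log$/\1/p')
        # Skip logs left behind by processes that are gone
        # (or that this user is not allowed to signal).
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "$pid"
            return 0
        fi
    done
    return 1
}

# Kill the scheduler kernel if one is found; starting a replacement
# is left to whatever restart mechanism you already have.
kill_kernel() {
    pid=$(find_kernel_pid "$1" "$2") || {
        echo "no live scheduler kernel found" >&2
        return 1
    }
    kill "$pid"
}
```

PIDs get reused, so it is worth confirming with `ps` that the PID really belongs to a JDE kernel before wiring `kill_kernel` into anything automatic.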