Friday, 18 January 2019

Using AI and image recognition with JD Edwards

Introduction

This blog post is hopefully going to demonstrate how Fusion5 (quite specifically William in my team) have been able to exploit some really cool AI cloud constructs and link them in with JD Edwards.  We’ve been looking around for a while for use cases for proper AI and JD Edwards, and we think that we have something pretty cool.

I want to also point out that a lot of people are claiming AI when what they are doing is not AI.  I think that true AI is able to do evaluations (calculations) based upon a set of parameters that it has not necessarily seen before.  It's not comparing the current image to millions of other images; it has been trained on MANY other images and in its internals it has the ability to apply that reference logic.  The model that has been built from all of its training can be run offline - it's essentially autonomous.  This is a critical element in the understanding of AI.


We are using JD Edwards orchestration to call out to an external web service that we've written (a web hook).  This web hook has been programmed to call a number of different AI models to interpret images that have been attached to JD Edwards data.  So, if you use generic media object attachments - this mechanism can be used to interpret what is actually in those images.  This can greatly increase a JD Edwards customer's ability to react to situations that need attention.


For example if you used JD Edwards for health and safety incidents and you wanted some additional help in making sure that the images being attached did not contain certain critical objects – and perhaps if they do, you’d raise the severity or send a message based upon the results… You could also analyse frames of video with the same logic and detect certain objects.
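To make that severity idea concrete, here is a minimal sketch.  The label names, thresholds and severity codes below are my own illustrative assumptions, not JD Edwards fields or our production rules:

```python
# Hypothetical label thresholds - tune these to your own incident policy.
CRITICAL_LABELS = {"syringe": 0.5, "graffiti": 0.8}

def escalate_severity(scores, current_severity, high_severity=3):
    """Raise an incident's severity when any critical label is detected
    with enough confidence.  scores is a {label: confidence} dict, the
    same shape as the model results shown later in this post."""
    for label, threshold in CRITICAL_LABELS.items():
        if scores.get(label, 0.0) >= threshold:
            return max(current_severity, high_severity)
    return current_severity
```

The same check works per video frame if you extract frames and score each one.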


We’ve decided to test the cloud and are using different models for our object detection.  We are using Google, Microsoft and AWS to see which is better or worse at object detection.

Object detection vs. Object Recognition

Note that there is a large difference between object detection and object recognition - stolen from https://dsp.stackexchange.com/questions/12940/object-detection-versus-object-recognition:
Object Recognition: which object is depicted in the image?

  • input: an image containing unknown object(s)
    Possibly, the position of the object can be marked in the input, or the input might be only a clear image of (not-occluded) object.
  • output: position(s) and label(s) (names) of the objects in the image
    The positions of objects are either acquired from the input, or determined based on the input image.
    When labelling objects, there is usually a set of categories/labels which the system "knows" and between which the system can differentiate (e.g. object is either dog, car, horse, cow or bird).
Object detection: where is this object in the image?
  • input: a clear image of an object, or some kind of model of an object (e.g. duck) and an image (possibly) containing the object of interest
  • output: position, or a bounding box of the input object if it exists in the image (e.g. the duck is in the upper left corner of the image)
We’ve spent time training our algorithms to look for certain objects in images - so we are using object recognition.  We’ve trained 3 separate algorithms with the same 200 training images (we know that this is tiny, but the results are surprising!)


My PowerPoint skills are really being shown off in the above diagram of what has been done.

At this stage we’ve used a couple of the major public cloud providers and have created our own models that are specifically designed to detect objects that we are interested in, namely trains, graffiti and syringes.  This is quite topical in a public safety environment.
We’ve created an orchestration and a connector that are able to interrogate JDE and send the various attachments to the AI models and have some verification of what is actually in the images.  Note that this could easily be put into a schedule or a notification to ensure that this was being run for any new images that are uploaded to our system.

Testing

Let’s scroll deep into Google Images for train graffiti.  The reason I scroll deep is that these algorithms were trained on 70 pics of trains, 70 pics of graffiti and 40 pics of syringes.  I want to ensure that I'm showing the algorithm something that it has never seen before.


And attach this to an address book entry in JD Edwards as a URL type attachment.



In this instance we are using the above parameters: 300 as the AN8 key for the ABGT structure, and we only want type 5 attachments.


William has written an orchestration which can run through the media objects (F00165) for ANY attachments.  We're currently processing image attachments, but really - this could be anything.

To call our custom orchestration, our input JSON looks like this:

{
   "inputs" : [ {
     "name" : "Object Name 1",
     "value" : "ABGT"
   }, {
     "name" : "Generic Text Key 1",
     "value" : "300"
   }, {
     "name" : "MO Type 1",
     "value" : "5"
   }, {
     "name" : "Provider",
     "value" : "customVision"
   }, {
     "name" : "ModelName",
     "value" : ""
   } ]
}
The provider in this instance is a model that we have trained in Custom Vision in Azure.  We've trained this model with the same images as the Google model and our completely custom model.
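As a rough illustration of what the web hook does with a payload like the one above - the provider names match the orchestration's "Provider" input, but everything else here is a simplified assumption, not our production code:

```python
import json

def parse_inputs(payload):
    """Flatten the orchestration's name/value input pairs into a dict."""
    return {item["name"]: item["value"] for item in json.loads(payload)["inputs"]}

def route_provider(params, providers):
    """Pick the scoring function for the requested provider.
    providers maps a Provider value ("customVision", "autoML",
    "customModel") to a callable taking an image URL and returning
    a {label: confidence} dict."""
    try:
        return providers[params["Provider"]]
    except KeyError:
        raise ValueError("unknown provider: %s" % params.get("Provider"))
```

The real scorer callables would POST the image URL to the relevant cloud endpoint; that part is provider-specific and omitted here.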

Let’s run it and see what it thinks.  Remember that the orchestration is calling a Service Request which is essentially running a Lambda function through a connection.  There is some basic authentication in the header to ensure that the right people are calling it.  The Lambda function is capable of extending HOW it interprets the photo and how many different AI engines and models it will call and return.
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9162581,
     "train" : 0.599198341,
     "syringe" : 0.00253078272
   } ]
}
Not too bad: it’s 91% sure that there is graffiti, 60% sure that there is a train and pretty sure that there are no syringes.  Let’s try Google now.
A simple change to the provider parameter allows us to use Google next.  Note I did have some issues with my screen shots, so they might reference some different pictures, but the run revealed these results.

{
   "inputs" : [ {
     "name" : "Object Name 1",
     "value" : "ABGT"
   }, {
     "name" : "Generic Text Key 1",
     "value" : "300"
   }, {
     "name" : "MO Type 1",
     "value" : "5"
   }, {
     "name" : "Provider",
     "value" : "autoML"
   }, {
     "name" : "ModelName",
     "value" : ""
   } ]
}
Results
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9999526739120483,
     "train" : 0.8213397860527039
   } ]
}
Similar: it thinks that there is a 99.99% chance of graffiti and an 82% chance of a train - more certain than Microsoft.


Finally, let’s try a hosted model that we are running on google cloud:
We drive that with the following parameters

{
   "inputs" : [ {
     "name" : "Object Name 1",
     "value" : "ABGT"
   }, {
     "name" : "Generic Text Key 1",
     "value" : "300"
   }, {
     "name" : "MO Type 1",
     "value" : "5"
   }, {
     "name" : "Provider",
     "value" : "customModel"
   }, {
     "name" : "ModelName",
     "value" : "multi_label_train_syringe_graffiti"
   } ]
}
And the output is:
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9984090924263,
     "syringe" : 7.345536141656339E-4,
     "train" : 0.9948076605796814
   } ]
}

So it’s very certain there is a train and graffiti, and very certain there is no syringe.

What does this mean?

We are able to do some advanced image recognition over native JD Edwards attachments using some pretty cool cloud constructs.  We’ve trained these models with limited data and have some great results - although we should really try some images without trains or graffiti (trust me, this does also work).  We are paying a fraction of a cent for some massive compute to be able to load our models and process our specific AI needs.

You could be on premise or in the cloud and STILL use all of these techniques to understand your unstructured data better.  This is all done with a single orchestration.


Fusion5 have the ability to create custom models for you and find actionable insights that you are interested in and ensure that this information is ALERTING users to ACT.

What does our AI algorithm think of me?

As you know, it's been trained (excuse the pun) to look for trains, graffiti and syringes.  What if I pass in a picture of me?



Let’s process this for a joke: "https://fusion5.com.au/media/1303/shannonmoir-300.jpg"
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     }, {
       "Media Object Sequence Number" : "2",
       "GT File Name" : "https://fusion5.com.au/media/1303/shannonmoir-300.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9162581,
     "train" : 0.599198341,
     "syringe" : 0.00253078272
   }, {
     "graffiti" : 0.100777447,
     "syringe" : 0.006329027,
     "train" : 0.00221525226

   } ]
}
Above is an example of processing a picture of “yours truly”, to see what Azure thinks…
A 10% chance of graffiti and no syringe or train… not bad…

Great, so if this was attached to an incident in JDE, you might want to still raise the priority of the case, but not because there is graffiti or a train!

What’s next

We (Innovation team at Fusion5) are going to integrate this into JD Edwards and increase the priority of incidents based upon what is seen in the images.  

These algorithms can be trained to look for anything we want in images and we can automatically react to situations without human intervention.  

Another very simple extension of what you see here is using AI to rip all of the text out of an image, OCR if you like.  It’d be super simple to look through ALL images and convert them to text attachments or searchable text.
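As a sketch of that OCR idea, this builds the request body for Google Cloud Vision's `images:annotate` TEXT_DETECTION call.  The endpoint URL and auth are left out, and the wiring into an orchestration is up to you - treat this as indicative shape only:

```python
def build_ocr_request(image_url):
    """Request body for a TEXT_DETECTION (OCR) call against
    https://vision.googleapis.com/v1/images:annotate - POST this
    as JSON with your API key or OAuth token attached."""
    return {
        "requests": [{
            "image": {"source": {"imageUri": image_url}},
            "features": [{"type": "TEXT_DETECTION"}],
        }]
    }
```

The response's `textAnnotations` can then be stored back as a searchable text attachment against the same record.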


Imagine that you wanted to verify a label or an ID stamp on an object, this could all be done through AI very simply!


Friday, 21 December 2018

Technical Debt - again

In my last post I showed you some advanced reporting over ERP analytics data to understand what applications are being used, how often they are being used and who is using them.  This is the start of understanding your JD Edwards modifications and therefore technical debt.

At Fusion5 we are doing lots of upgrades all of the time, so we need to understand our clients' technical debt.  We strive to make every upgrade more cost efficient and easier.  This is easier said than done, but let me mention a couple of ways in which we do this:

Intelligent and consistent use of category codes for objects.  One of the codes is specifically about retrofit and needs to be completed when the object is created.  This is "retrofit needed" - sounds simple, I know.  But if you create something bespoke that never needs to be retrofitted, the best thing you can do is mark it like that.  Lots of time will then be saved looking at this object in the future (again and again).

Replace modifications with configuration.  UDO's have made this better and easier and continue to do so.  If you are retrofitting and you think - hey - I could do this with a UDO - please do yourself a favour: configure a UDO and don't touch the code!  Security is also an important concept for developers to understand completely.  Because - guess what?  You can use security to force people to enter something into the QBE line - you don't need to use code.  (Application Query Security)



  1. Everyone needs to understand UDO's well.  We all have a role in simplification.
If you don't know what EVERY one of these are - you need to know!

OCM's can be used to force keyed queries.  Wow!!!  Did you know that you can create a specific OCM that forces people to only use keyed fields for QBE - awesome.  So simple.  I know that there is code out there that enforces this.  This is like the above tip for security.



System enhancement knowledge.  This is harder (takes time), but knowledge of how modules are enhanced over time is going to hopefully retire some custom code.  Oracle do a great job of giving us the power to find this, you just need to know where to look:



Compare releases


Calculate the financial impact.  Once you know all of this, you can start to use a calculator like the one Fusion5 have developed; this is going to assist you in understanding your technical debt and doing research around it.  We have developed a comprehensive suite of reports that allow you to slice and dice your modification data and understand which modifications are going to cost you money and which ones will not.  Here are a couple of screen grabs.  All we need to create your personalised and interactive dashboard is the results of a couple of SQL statements that we provide (or you run our agent - though people don't like running agents).


You can see that I have selected 5 system codes and can see the worst-case and best-case estimates for the retrofit of those 5 system codes.  I can see how often the apps are used and therefore make an appropriate finance-based decision on whether they should be kept or not.  You are able to see the cost estimates by object type, system code and more.  Everything can also be downloaded for Excel analysis.
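The arithmetic behind that kind of dashboard is simple enough to sketch.  The hour ranges and rate below are made-up illustrations, not our actual estimating model:

```python
# Illustrative best/worst-case retrofit hours per object type -
# these numbers are placeholders, not Fusion5's real figures.
HOURS = {"APPL": (4, 12), "UBE": (6, 16), "NER": (2, 8)}

def retrofit_estimate(objects, hourly_rate):
    """objects is a list of (system_code, object_type) pairs; returns
    {system_code: (best_cost, worst_cost)} in currency units."""
    totals = {}
    for system_code, obj_type in objects:
        best, worst = HOURS.get(obj_type, (0, 0))
        lo, hi = totals.get(system_code, (0, 0))
        totals[system_code] = (lo + best * hourly_rate, hi + worst * hourly_rate)
    return totals
```

Cross that with the usage counts from ERP analytics and the keep-or-drop decision becomes a spreadsheet exercise.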







Wednesday, 19 December 2018

To keep a modification or not–that be the question

The cost of a modification grows and grows.  If you look at your modifications, especially if you are modifying core objects – retrofit is going to continue to cost you money going forward.

How can you work out how often your modified code (or custom code, for that matter) is being used?

One method is to use object identification, but this is only part of the story.

You’ll see below that ERP analytics is able to provide you with things like number of sessions, number of unique users, average time on page and total time on page for each of your JD Edwards applications.  This can be broken down by application, form or version - which can assist you in finding out more.

With this information, you can see how often your modifications are used and for how long, and make a call on whether they are worth their mettle.


image

Our reporting suite allows you to choose date ranges and also system codes to further refine the analysis.

image


You are then able to slice and dice your mods (note that we can determine modified objects too, but this is using data blending with data studio) to give you a complete picture:

image


Of course, we can augment this list with batch and then calculate secondary objects from cross reference to begin to build the complete picture.  You want to narrow down both retrofit and testing if you can.


image


See below for how we look at queue concurrency and wait times to work out job scheduling opportunities and efficiencies.

image

Thursday, 13 December 2018

JDE scheduler problems

Who loves seeing logs like this for their scheduler kernel?

108/1168     Tue Dec 11 21:49:02.125002        jdbodbc.C7611
       ODB0000164 - STMT:00 [08S01][10054][2] [Microsoft][SQL Server Native Client 11.0]TCP Provider: An existing connection was forcibly closed by the remote host.
108/1168     Tue Dec 11 21:49:02.125003        jdbodbc.C7611
       ODB0000164 - STMT:01 [08S01][10054][2] [Microsoft][SQL Server Native Client 11.0]Communication link failure

108/1168     Tue Dec 11 21:49:02.125004        JDB_DRVM.C998
       JDB9900401 - Failed to execute db request

108/1168     Tue Dec 11 21:49:02.125005        JTP_CM.C1335
       JDB9900255 - Database connection to F98611 (PJDEENT02 - 920 Server Map) has been lost.

108/1168     Tue Dec 11 21:49:02.125006        JTP_CM.C1295
       JDB9900256 - Database connection to (PJDEENT02 - 920 Server Map) has been re-established.

108/1168     Tue Dec 11 21:49:02.125007        jdbodbc.C2702
       ODB0000020 - DBInitRequest failed - lost database connection.

108/1168     Tue Dec 11 21:49:02.125008        JDB_DRVM.C908
       JDB9900168 - Failed to initialize db request

Who loves spending the morning fixing jobs from the night before and moving batch queues and UBE's until things are back to normal?  No one!

Here is something that may help.  Now, I must admit I've got to thank an amazing colleague for this - not my SQL - but I do like it.

What you need to do is write a basic shell script (say, on the enterprise server) that runs this:

select count (*) from SY910.F91300
    where SJSCHJBTYP = '1'
    and SJSCHSTTIME > (select
                      ((extract(day from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*86400+
                        extract(hour from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*3600+
                        extract(minute from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00'))*60+
                        extract(second from (current_timestamp-timestamp '1970-01-01 00:00:00 +00:00')))/60)-60 current_utime_minus_1hour
                        from dual);

If you get a 1 that is good; if you get a 0 that is bad - you probably need to recycle your scheduler kernel (that control record should change every 15 mins at least).

So, if you have a script that runs that, you can tell if the kernel is updating the control record...

Then you can grep through the logs to find the PID of the scheduler kernel and kill it from the OS.  Then I have a little executable that gives the scheduler kernel a kick in the pants (starts a new one) - and BOOM!  You have a resilient JD Edwards scheduler.
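Pulling those pieces together, the watchdog logic looks roughly like this.  The log-line format I parse matches the `pid/tid  timestamp ...` lines shown earlier in this post, but check it against your own jde.log before relying on it:

```python
import re

def scheduler_is_alive(sql_count):
    """Interpret the count(*) from the F91300 check above:
    >= 1 means the control record was touched in the last hour,
    0 means the scheduler kernel is probably stuck."""
    return int(str(sql_count).strip()) >= 1

def find_kernel_pid(log_text):
    """Pull the process id out of a 'pid/tid  timestamp ...' log line,
    as seen in the scheduler kernel logs above."""
    m = re.search(r"^(\d+)/\d+\s", log_text, re.MULTILINE)
    return int(m.group(1)) if m else None
```

The shell wrapper then just runs the SQL, feeds the count to the first function, and kills/restarts the kernel using the PID from the second.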




Tuesday, 11 December 2018

Real-time session information for ALL your JDE users

This post is based upon another YouTube clip, which explains the kind of real-time information that you can extract from JD Edwards using ERP analytics.

Of course, this is just Google Analytics with some special tuning which is specific to JDE.

This clip shows you how you can see actual activity in JDE, not just server manager – people logged in.  What I find is that the actual load on the system has nothing really to do with what SM reports.  SM reports some artificially high numbers – those which have not timed out.  This can include many hours of inactivity.  What GA (Google Analytics) reports on is those which have interacted with the browser in the last 5 minutes.  It also gives you realtime pages per minute and pages per second.  Sometimes I wonder how you can run a site (or at least do load testing) without some of these metrics.  I often see 120 people in server manager and 35 people online with GA.
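The difference between the two counts comes down to the activity window.  A toy version of the GA-style count - the 5-minute window mirrors what GA reports, but the data shape here is invented for illustration:

```python
from datetime import datetime, timedelta

def active_users(last_interaction, now, window_minutes=5):
    """Count users whose last browser interaction falls inside the
    window - roughly what Google Analytics calls 'active' - as opposed
    to Server Manager counting every session that hasn't timed out."""
    window = timedelta(minutes=window_minutes)
    return sum(1 for t in last_interaction.values() if now - t <= window)
```

Run the same data with a multi-hour window and you get a Server-Manager-style number, which is why the two figures diverge so much.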

Anyway, enjoy the vid – if you have questions, please reach out.


Thursday, 6 December 2018

64 bits, not butts!

I've been to Denver and chatted to the team about 64 bit, and they are all pretty nonchalant about the process.  Very confident too, as we all know it's been baked into the tools for some time, just getting it into the BSFN's.

Honestly though, how many of your kernels or UBE's need to address more than 2GB of RAM (or 3GB with PAE blah blah), not many I hope!  If you do, there might be some other issues that you have to deal with first.

To me it seems pretty simple too, we activate 64 bit tools and then build a full package using 64 bit compile directives.  We then end up with 64bit pathcode specific dll's or so's and away we go.

The thing is, don't forget that you need to slime your code to ensure that it is 64bit ready, what does this mean?  I again draw an analogy between char and wchar, remember the unicode debacle?  Just think about that once again.  If you use all of the standard JDE malloc's and reallocs - all good, but if you've ventured into the nether-regions of memory management (as I regularly do), then there might be a little more polish you need to provide.

This is a good guide with some great samples of problems and rectifications of problems, quite specifically for JDE:
https://www.oracle.com/webfolder/technetwork/tutorials/jdedwards/White%20Papers/jde64bsfn.pdf

In the simplest form, I'll demonstrate 64 bit vs 32 bit with the following code and the following output.

#include <stdio.h>

int main(void)
{
  int i = 0;
  int *d;
  printf("hello world\n");
  printf("number %d %zu\n", i, sizeof(i));
  d = &i;
  printf("number %d %zu\n", *d, sizeof(d));
  return 0;
}

giving me the output

[holuser@docker ~]$ cc hello.c -m32 -o hello
[holuser@docker ~]$ ./hello
hello world
number 0 4
number 0 4
[holuser@docker ~]$ cc hello.c -m64 -o hello
[holuser@docker ~]$ ./hello
hello world
number 0 4
number 0 8

Wow - what a difference, hey?  Can't get 32 bit to compile?  Then you are going to need to run this as root:

yum install glibc-devel.i686 libgcc.i686 libstdc++-devel.i686 ncurses-devel.i686 --setopt=protected_multilib=false

The size of the basic pointer is 8 bytes - you can address way more memory.  This is the core of the change to 64 bit and everything flows from the size of the base pointers.

Basically, the addresses are 8 bytes, not 4 - which changes arithmetic and a whole heap of downstream things.  So when doing pointer arithmetic and cool things, your code is going to be different.

The sales glossy from Oracle is good; I say get to 64 bit if you can.

1.     Moving to 64-bit enables you to adopt future technology and future-proof your environments. If you do not move to 64-bit, you incur the risk of facing hardware and software obsolescence. The move itself to 64-bit is the cost benefit.
2.     Many vendors of third-party components, such as database drivers and Java, which JD Edwards EnterpriseOne requires, are delivering only 64-bit components. They also have plans in the future to end or only provide limited support of 32-bit components.
3.     It enables JD Edwards to deliver future product innovation and support newer versions of the required technology stack.
4.     There is no impact to your business processes or business data. Transitioning to 64-bit processing is a technical uplift that is managed with the JD Edwards Tools Foundation.

This was stolen directly from https://www.oracle.com/webfolder/technetwork/tutorials/jdedwards/64bit/64_bit_Brief.pdf
  
Okay, so now we know the basics of 64 vs 32 - we need to start coding around it and fixing our code.  You'll know pretty quick if there are problems, the troubleshooting guide and google are going to be your friend. 

Note that there are currently 294 ESUs and 2219 objects that are related to BSFN compile and function problems - the reach is far.

These are divided into the following categories:


So there might be quite a bit of impact here.

Multi foundation is painful at the best of times; this is going to be tough if clients want to do it over a weekend.  I recommend new servers with 64 bit and getting rid of the old ones in one go.  Oracle have done some great work to enable this to be done gradually, but I think just bash it into prod on new servers once you have done the correct amount of testing.

This is great too https://docs.oracle.com/cd/E84502_01/learnjde/64bit.html