Friday, 18 January 2019

Using AI and image recognition with JD Edwards

Introduction

This blog post will hopefully demonstrate how Fusion5 (quite specifically William in my team) has been able to exploit some really cool cloud AI constructs and link them in with JD Edwards.  We’ve been looking around for a while for proper AI use cases in JD Edwards, and we think that we have something pretty cool.

I also want to point out that a lot of people are claiming AI when what they are doing is not AI.  I think that true AI is able to do evaluations (calculations) based upon a set of parameters that it has not necessarily seen before.  It is not comparing the current image to millions of other images; it has been trained on MANY other images and has internalised the ability to apply that reference logic.  The model built from all of its training can be run offline; it is essentially autonomous.  This is a critical element in understanding AI.


We are using JD Edwards orchestration to call out to an external web service that we’ve written (a webhook).  This webhook has been programmed to call a number of different AI models to interpret images that have been attached to JD Edwards data.  So, if you use generic media object attachments, this mechanism can be used to interpret what is actually in those images.  This can greatly improve a JD Edwards customer’s ability to react to situations that need attention.
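
To make that concrete, here is a minimal sketch of the dispatch logic such a webhook might contain.  This is an illustration, not our production code: the event shape and the classify_* helper names are assumptions, standing in for calls to each provider’s prediction API (sketches of those helpers appear later in this post).

import json

# Sketch of an AWS Lambda handler that routes an image to one of several
# AI providers.  The request body shape and helper names are assumptions.
def handler(event, context):
    body = json.loads(event.get("body", "{}"))
    image_url = body["imageUrl"]                    # URL of the JDE media object attachment
    provider = body.get("provider", "customVision")
    model_name = body.get("modelName", "")

    dispatch = {
        "customVision": classify_custom_vision,     # Azure Custom Vision
        "autoML": classify_automl,                  # Google AutoML Vision
        "customModel": classify_custom_model,       # our hosted model on Google Cloud
    }
    scores = dispatch[provider](image_url, model_name)   # e.g. {"graffiti": 0.91, ...}
    return {"statusCode": 200, "body": json.dumps(scores)}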


For example, say you use JD Edwards for health and safety incidents and want additional help making sure that the attached images do not contain certain critical objects; if they do, you might raise the severity or send a message based upon the results.  You could also analyse frames of video with the same logic and detect certain objects.


We’ve decided to test the cloud and are using different models for our object detection.  We are using Google, Microsoft and AWS to see which is better or worse at object detection.

Object detection vs. object recognition

Note that there is a large difference between object detection and object recognition.  The definitions below are stolen from https://dsp.stackexchange.com/questions/12940/object-detection-versus-object-recognition
Object Recognition: which object is depicted in the image?

  • input: an image containing unknown object(s)
    Possibly, the position of the object can be marked in the input, or the input might be only a clear image of a (non-occluded) object.
  • output: position(s) and label(s) (names) of the objects in the image
    The positions of objects are either acquired from the input, or determined based on the input image.
    When labelling objects, there is usually a set of categories/labels which the system "knows" and between which the system can differentiate (e.g. the object is either a dog, car, horse, cow or bird).
Object detection: where is this object in the image?
  • input: a clear image of an object, or some kind of model of an object (e.g. a duck), and an image (possibly) containing the object of interest
  • output: the position, or a bounding box, of the input object if it exists in the image (e.g. the duck is in the upper left corner of the image)
We’ve spent time training our algorithms to look for certain objects in images, so we are using object recognition.  We’ve trained 3 separate algorithms with the same 200 training images (we know that this is tiny, but the results are surprising!).


My PowerPoint skills are really being shown off in the above diagram of what has been done.

At this stage we’ve used a couple of the major public cloud providers and have created our own models that are specifically designed to detect objects that we are interested in, namely trains, graffiti and syringes.  This is quite topical in a public safety environment.
We’ve created an orchestration and a connector that can interrogate JDE, send the various attachments to the AI models, and verify what is actually in the images.  Note that this could easily be put into a schedule or a notification to ensure it runs for any new images uploaded to our system.

Testing

Let’s scroll deep into Google Images for train graffiti.  The reason I scroll deep is that these algorithms were trained on 70 pictures of trains, 70 pictures of graffiti and 40 pictures of syringes.  I want to ensure that I’m showing the algorithm something that it has never seen before.


And attach this to an address book entry in JD Edwards as a URL type attachment.



In this instance we are using the above parameters: 300 as the address number (AN8) under the ABGT media object structure, and we only want type 5 (URL) attachments.


William has written an orchestration which can run through the media objects (F00165) for ANY attachments.  We're currently processing image attachments, but really - this could be anything.

To call our custom orchestration, our input JSON looks like this:

{
   "inputs" : [ {
     "name" : "Object Name 1",
     "value" : "ABGT"
   }, {
     "name" : "Generic Text Key 1",
     "value" : "300"
   }, {
     "name" : "MO Type 1",
     "value" : "5"
   }, {
     "name" : "Provider",
     "value" : "customVision"
   }, {
     "name" : "ModelName",
     "value" : ""
   } ]
}
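
For readers who want to try something similar, this is roughly how an orchestration with those inputs could be invoked over the AIS orchestrator endpoint.  The server URL, orchestration name and credentials here are placeholders, and the exact REST path varies by tools release:

import requests

# Placeholder AIS server and orchestration name -- adjust for your install.
AIS_URL = "https://ais.example.com/jderest"
ORCHESTRATION = "checkMediaObjectImages"

payload = {
    "inputs": [
        {"name": "Object Name 1",      "value": "ABGT"},
        {"name": "Generic Text Key 1", "value": "300"},
        {"name": "MO Type 1",          "value": "5"},
        {"name": "Provider",           "value": "customVision"},
        {"name": "ModelName",          "value": ""},
    ]
}

# Basic authentication with a JDE proxy user; AIS also supports tokens.
response = requests.post(f"{AIS_URL}/orchestrator/{ORCHESTRATION}",
                         json=payload,
                         auth=("JDEUSER", "JDEPASSWORD"),
                         timeout=60)
response.raise_for_status()
print(response.json())   # the Data Requests and Connectors sections shown below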
The provider in this instance is a model that we have trained in Custom Vision on Azure.  We’ve trained this model with the same images as the Google model and as our completely custom model.
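
For reference, the Custom Vision side of the webhook might look something like the sketch below, using Microsoft’s Prediction 3.0 REST endpoint.  The endpoint region, project ID, iteration name and key are all placeholders:

import requests

# Placeholder Azure Custom Vision project details.
CV_ENDPOINT = "https://southcentralus.api.cognitive.microsoft.com"
CV_PROJECT = "00000000-0000-0000-0000-000000000000"
CV_ITERATION = "Iteration1"
CV_KEY = "your-prediction-key"

def classify_custom_vision(image_url, _model_name=""):
    url = (f"{CV_ENDPOINT}/customvision/v3.0/Prediction/{CV_PROJECT}"
           f"/classify/iterations/{CV_ITERATION}/url")
    resp = requests.post(url,
                         headers={"Prediction-Key": CV_KEY},
                         json={"Url": image_url},
                         timeout=30)
    resp.raise_for_status()
    # Flatten the response into {tag: probability}, e.g. {"graffiti": 0.916, ...}
    return {p["tagName"]: p["probability"] for p in resp.json()["predictions"]}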

Let’s run it and see what it thinks.  Remember that the orchestration is calling a Service Request which is essentially running a Lambda function through a connection.  There is some basic authentication in the header to ensure that the right people are calling it.  The Lambda function controls HOW it interprets the photo and how many different AI engines and models it will call and return.
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "
https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9162581,
     "train" : 0.599198341,
     "syringe" : 0.00253078272
   } ]
}
Not too bad: it’s 91% sure that there is graffiti, 60% sure that there is a train, and pretty sure that there are no syringes.  Let’s try Google now.
A simple change to the Provider parameter allows us to use Google next.  Note that I had some issues with my screenshots, so they might reference different pictures, but the run produced these results.

{
   "inputs" : [ {
     "name" : "Object Name 1",
     "value" : "ABGT"
   }, {
     "name" : "Generic Text Key 1",
     "value" : "300"
   }, {
     "name" : "MO Type 1",
     "value" : "5"
   }, {
     "name" : "Provider",
     "value" : "autoML"
   }, {
     "name" : "ModelName",
     "value" : ""
   } ]
}
Results
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "
https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9999526739120483,
     "train" : 0.8213397860527039
   } ]
}
Similar: it thinks there is a 99.99% chance of graffiti and an 82% chance of a train, so it is more certain than Microsoft.
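
For completeness, here’s roughly what the AutoML leg of the webhook might look like.  AutoML Vision’s predict endpoint takes image bytes rather than a URL, so this sketch downloads the image first; the project, model ID and access token are placeholders:

import base64
import requests

# Placeholder Google Cloud project, AutoML model ID and OAuth access token
# (e.g. from `gcloud auth print-access-token`).
GCP_PROJECT = "my-gcp-project"
AUTOML_MODEL = "ICN0000000000000000000"
GCP_TOKEN = "ya29.placeholder-token"

def classify_automl(image_url, _model_name=""):
    image_bytes = requests.get(image_url, timeout=30).content
    url = (f"https://automl.googleapis.com/v1/projects/{GCP_PROJECT}"
           f"/locations/us-central1/models/{AUTOML_MODEL}:predict")
    body = {"payload": {"image": {
        "imageBytes": base64.b64encode(image_bytes).decode("ascii")}}}
    resp = requests.post(url, json=body,
                         headers={"Authorization": f"Bearer {GCP_TOKEN}"},
                         timeout=30)
    resp.raise_for_status()
    # e.g. {"graffiti": 0.99995, "train": 0.82133}
    return {p["displayName"]: p["classification"]["score"]
            for p in resp.json()["payload"]}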


Finally, let’s try a hosted model that we are running on Google Cloud.
We drive that with the following parameters:

{
   "inputs" : [ {
     "name" : "Object Name 1",
     "value" : "ABGT"
   }, {
     "name" : "Generic Text Key 1",
     "value" : "300"
   }, {
     "name" : "MO Type 1",
     "value" : "5"
   }, {
     "name" : "Provider",
     "value" : "customModel"
   }, {
     "name" : "ModelName",
     "value" : "multi_label_train_syringe_graffiti"
   } ]
}
And the output is:
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "
https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9984090924263,
     "syringe" : 7.345536141656339E-4,
     "train" : 0.9948076605796814
   } ]
}

So it’s very certain there is a train and graffiti, but very certain there is no syringe.
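
If you’re wondering what this third leg might look like: assuming the hosted model is served through Google’s AI Platform prediction service, a sketch could be as below.  The project, token and the instance/response schema are assumptions about how the model was exported:

import base64
import requests

# Placeholder project and token; the instance and response schema are
# assumptions about how the hosted model was exported.
GCP_PROJECT = "my-gcp-project"
GCP_TOKEN = "ya29.placeholder-token"

def classify_custom_model(image_url, model_name="multi_label_train_syringe_graffiti"):
    image_bytes = requests.get(image_url, timeout=30).content
    url = (f"https://ml.googleapis.com/v1/projects/{GCP_PROJECT}"
           f"/models/{model_name}:predict")
    body = {"instances": [
        {"image_bytes": {"b64": base64.b64encode(image_bytes).decode("ascii")}}]}
    resp = requests.post(url, json=body,
                         headers={"Authorization": f"Bearer {GCP_TOKEN}"},
                         timeout=30)
    resp.raise_for_status()
    # Assumes the model returns parallel label/score lists per instance.
    prediction = resp.json()["predictions"][0]
    return dict(zip(prediction["labels"], prediction["scores"]))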

What does this mean?

We are able to do some advanced image recognition over native JD Edwards attachments using some pretty cool cloud constructs.  We’ve trained these models with limited data and are getting some great results, although we should really try some images without trains or graffiti (trust me, that does also work).  We are paying a fraction of a cent for massive compute to load our models and process our specific AI needs.

You could be on premises or in the cloud and STILL use all of these techniques to understand your unstructured data better.  This is all done with a single orchestration.


Fusion5 has the ability to create custom models for you, find the actionable insights that you are interested in, and ensure that this information is ALERTING users to ACT.

What does our AI algorithm think of me?

As you know, it’s been trained (excuse the pun) to look for trains, graffiti and syringes.  What if I pass in a picture of me?



Let’s process this for a joke: "https://fusion5.com.au/media/1303/shannonmoir-300.jpg"
{
   "Data Requests" : [ {
     "Data Browser - F00165 [Media Objects storage]" : [ {
       "Media Object Sequence Number" : "1",
       "GT File Name" : "
https://c1.staticflickr.com/6/5206/5350489683_e7cdca43ba_b.jpg"
     }, {
       "Media Object Sequence Number" : "2",
       "GT File Name" : "
https://fusion5.com.au/media/1303/shannonmoir-300.jpg"
     } ]
   } ],
   "Connectors" : [ {
     "graffiti" : 0.9162581,
     "train" : 0.599198341,
     "syringe" : 0.00253078272
   }, {
     "graffiti" : 0.100777447,
     "syringe" : 0.006329027,
     "train" : 0.00221525226

   } ]
}
Above is an example of processing a picture of “yours truly” to see what Azure thinks: a 10% chance of graffiti and essentially no syringe or train.  Not bad.

Great, so if this was attached to an incident in JDE, you might still want to raise the priority of the case, but not because there is graffiti or a train!

What’s next

We (the Innovation team at Fusion5) are going to integrate this into JD Edwards and increase the priority of incidents based upon what is seen in the images.

These algorithms can be trained to look for anything we want in images and we can automatically react to situations without human intervention.  

Another very simple extension of what you see here is using AI to rip all of the text out of an image: OCR, if you like.  It’d be super simple to look through ALL images and convert them to text attachments or searchable text.
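
As a taste of how simple that could be, here is a hedged sketch using Google Cloud Vision’s TEXT_DETECTION feature (the API key is a placeholder; the other clouds offer equivalents):

import requests

# Placeholder API key for Google Cloud Vision.
VISION_KEY = "placeholder-api-key"

def extract_text(image_url):
    body = {"requests": [{
        "image": {"source": {"imageUri": image_url}},
        "features": [{"type": "TEXT_DETECTION"}],
    }]}
    resp = requests.post(
        f"https://vision.googleapis.com/v1/images:annotate?key={VISION_KEY}",
        json=body, timeout=30)
    resp.raise_for_status()
    annotations = resp.json()["responses"][0].get("textAnnotations", [])
    # The first annotation contains the full extracted text block.
    return annotations[0]["description"] if annotations else ""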


Imagine that you wanted to verify a label or an ID stamp on an object; this could all be done through AI very simply!

