NJP

What's new in Washington DC for Document Intelligence (AI Academy)

Import · Mar 21, 2024 · video

Hello everybody, and welcome to AI Academy. Good morning, good afternoon, or good evening, depending on where you're joining us from. Before we get started: as always, today we're going to cover everything that's available in the product, but in case you have questions or want to know about our roadmap, we can share some details. For that we have to share the Safe Harbor notice here, meaning that if we make any statement regarding what's coming in the product in the future, you always have to check with your account team before making purchasing decisions.

If you are new to AI Academy, what is it? AI Academy is a session where we bring you content and fresh ideas so that you have a better understanding of, and practical guidance on, our AI products. This session is recorded and will be available afterward, but if you are here with us today, it's a great opportunity to ask your questions and have our experts answer them. You can ask your question in the Q&A panel; that's the easiest way for us to make sure we don't miss any questions.

In addition to the session today, we also have additional content to share with you. We have a generative AI and intelligence Community Forum where you can find a lot of content, ask your questions, and see previously answered questions, as well as an FAQ and a lot more. As I mentioned, the session is recorded and is then available on the Now Community YouTube channel.

As a matter of fact, there's some specific content that's going to be relevant for the session today. On the community there is an article called the Document Intelligence quick start guide. It's the best way to get started if you don't know about Document Intelligence; this is the place I recommend starting. Last year I did a session on getting started with Document Intelligence; it was a previous AI Academy, and the recording is available on YouTube. And finally, all the questions that I get asked about Document Intelligence, I put
them all in the FAQ, so that whenever you have a question and you want to check whether it was already answered, you just have to go to the FAQ.

All right, our agenda for today: we are going to start with an overview, then we'll spend most of the session on an instance so that I can show you what we are talking about, and finally we'll take questions.

So today we're going to talk about what's new in the Washington DC release for Document Intelligence. My name is l Sanchez and I'm one of the product managers working on Document Intelligence. In Washington DC we released Document Intelligence version 4.0 and version 4.1. It's an application that's available via a specific subscription, typically linked to a Pro or Enterprise license; check with your account team if you're interested in knowing more about that.

What are the main highlights of the Washington DC release for Document Intelligence? The first one is that we are now able to extract information by selecting data directly from the document, with something that we call the draw tool; I'll spend more time on that later. We can also send better input to the model when you don't need all the pages in the document: you can add a filter so that you only feed Document Intelligence the pages that are relevant for extraction. And finally, we have increased some of the limits so that you can use Document Intelligence in more use cases.

Let's go into a bit more detail. First, the draw tool. What is the draw tool? As I mentioned, it's our tool to extract tables faster by selecting the data directly from the document itself, and I'll show that in the exercise later. Why do we want to do that? Because it can save time and reduce errors when we extract data from complex documents. There are still a lot of documents today that have long tables, and the process of extracting that data was still pretty tedious, so that's what we're addressing with the draw tool. How is that done? The draw tool is
available during manual extraction in the Document Intelligence workspace. Again, I'll show that in the instance very soon.

Next is the page filter. As I mentioned, it allows us to limit the pages that we send for extraction. Why do we want to do that? So we can send better input to the model when not all the pages are required. There are still a lot of cases where we get a long document, but if we're only trying to extract from the first pages, it makes sense to add a filter there. This also addresses cases where multiple documents are combined in the same file and we're only looking at extracting some of the pages in that file. How is that done? We can use a parameter in the attachment filter on the document task, and again, I'll show that in a second.

There are other enhancements as well in these versions. As I mentioned, we increased limits: we can now process more tasks on a daily basis, as well as deal with a slightly bigger file size for documents. We can also more easily support blank pages in document classification. Document classification is a feature that we released last year, if you've seen a previous AI Academy on that, and now we can identify a blank page in a document so that it's not used to classify the document. We also made some enhancements regarding accessibility, so that everybody can use our tool. We have a new pretrained model that is available specifically for IT Asset Management, specifically Software Asset Management; if you're interested in that, I recommend checking with your account team, specifically the ITAM specialist. And finally, we also made some improvements to our pretrained model that extracts data from invoices, which is part of our Accounts Payable Operations product.

I will show you in an instance now. Before we go there, I would just like to start a quick poll. We are also working on improving our Document Intelligence dashboard, and I figured that I could ask you all: what would you like to see in that new
dashboard? And then we'll go into the instance.

The first thing I want to show you is the draw tool. We have cases with documents like this one; I'll show you an example. It's a document with a pretty long table, and there might even be another part of the table on another page in my document. If I have to extract all those values manually to put them into my other system, whether it's an ERP system or ServiceNow, it can take a lot of time and it can be prone to error. So I'm going to use the draw tool to make that an easier process.

For that, I created a use case. I'll navigate to my admin experience for Document Intelligence and open my use case. In my use case here, when I created a new field, I selected a table, because I'm trying to extract a table. So it's a table field, and once I created it, it looks like this: there's the name of my table, the ServiceNow table I map it to, and then the different columns that I'm trying to extract.

Then I loaded a new document task, and I can now open it in Document Intelligence. My document is open here, and I'm going to extract my table. I have my single fields here at the top and then my table. I can click on the table and open it here. For now I don't have any values, and I'm going to extract all the values from the document.

With the table selected, I see that I have a new icon available here, and it says draw tool. I'm going to click on that, and then I see that my cursor changes to a cross, and I can just drag on my document to draw around my table. Now it's detecting my table and matching it with the right columns. I have item, I have quantity, I have line total. In this case it didn't pick up line total, so I'm just going to help it get to the right column: that one is line total. I also see that it detected that column in the middle, because it's here, but I don't need to extract that one, so in that case I'll just say
that I don't need to extract the column, so it's going to be grayed out. I'll make sure that it detected the top here as a header, and that all the rows and columns are selected appropriately. If not, I can always add more by going here at the top to add a line or to add a column. But in this case the detection was pretty good, so I'm just going to click on extract data, and if I open the bottom panel here, I see that all the values were extracted correctly. So that's the first step to make that process easier.

Now, there is even more. If I have another table, or if my table continues onto a different page, I can go to my other page, click on the draw tool again, and select that second table. I proceed with the same actions: selecting the right columns, graying out columns that I don't need, and then clicking on extract data again. Now I see that I have all the values from my first table. By the way, if I click on a value, it will direct me to where it is in the table so that I can do my validation. And then, further down here, I should have my second one. There we go: I've also extracted my second table here as a second piece of my data. And that's it for the draw tool. When I'm done, I can submit my document, and all the values are extracted.

All right, so that was the draw tool. Now let's talk about the page range filter. As we saw on that document, I had four different pages. That happens a lot with documents: sometimes they're scanned, sometimes they're combined together, but we don't need to extract all the pages. In this case, I'm going to use the page range filter to process only the first two pages of my document, so that I just have to extract from those two pages.

The way I'll do that is I'm going to go to my integrations and create a new integration. I'm going to keep the type Process Task and click on save. So that's the out-of-the-box flow
that we use: when a record in the target table is created, it's brought over to Document Intelligence to do the extraction. I'll open that flow, and here I'm going to make a slight modification. Because I know that I only need to extract the first two pages, I'm going to add that as a filter. So I will start editing that flow. Again, that's the out-of-the-box flow that's provided, and it's filled out based on my specific Document Intelligence use case. Whenever a new quick extract record is created, I'll take the attachments and process them.

I'm going to add a few things here. The first thing is that I'm going to look up that attachment. I'm going to use the Look Up Attachment action and say that whenever I get the quick extract record here, I'm going to get the sys_id of that attachment, and I'm going to input that into my DocIntel "create a document task" action. So I'm going to add to the filter here: now that I have the attachment, I want the sys_id, so that it filters on the sys_id, and then I'm going to add the page range.

The page range looks like this: it's two brackets, and inside those brackets is my page range. It could be a range, for example from page one to page three. It can also be single pages, like comma five. And it could be a combination of any of that, for example if I wanted to go further like that. So it's a combination of single pages and page ranges that is used as a filter. In this case, it's page one to page two.

I'm going to click on done here, and then I can save that and activate the flow. Once the flow is activated, I can go to my quick extract, which is the table that's linked here, and create a new record. I will add my attachment to it and save it. Now I will navigate to my list of document tasks, and it should have created a task just now. There we go. And if I add the attachment filter column, I see that it added that filter with the page range there. And so I'm going to
wait for that task to process, and then I'm going to look at it in Document Intelligence.

Obviously, this case here is very static: I entered my page range based on what I knew of the document. That could be my use case; if I know that I always need to extract the first page, I can do it that way. But if I want something a bit more dynamic, this is where I could pair it with the document classifier: run my document through a classification process first to know what's inside of it, and then use that as my input for the page range filter. So that's another way to do it.

Now that it's processed, I'm going to open it, and you see what happened here: instead of the four pages, I only have the two pages. So I limited the pages that I sent to Document Intelligence, which makes extraction faster. It also makes the model better, because it is not trying to detect the text from a lot more pages. And that was it for the page range filter.

I'm going to pause here and take a few questions.

Somebody is asking if Document Intelligence interacts with the Now Assist plugin. No, it does not, not today. Today those are two separate products.

Somebody is asking: once we've done it once, is there a way to automate that? Yes, absolutely. What we've seen today with the draw tool is really the experience of reviewing the document manually, but the more you do that, the more the AI model learns, and the more it can automate that. So there is a stage where, when you've done that on enough documents, you can automate the extraction.

Can the draw tool be used when the data is available in Excel? We don't support Excel files, but if you're using a PDF of an Excel file, that's something you could use. At the same time, if you have the data in Excel, you might look at a different process, maybe CSV or something that doesn't have to rely on OCR.

Someone is asking if what we see today with the draw tool
can only be written to a custom table, or can it be loaded into an existing table? No, you can absolutely use that on any table you want, whether it's a table that already exists or a custom table. I showed it here with custom tables because I want to show that it's a product that works with any type of workflow, but we have integrations with the CSM tables and with other product lines, so it definitely works on existing tables as well.

Somebody is asking which is the faster way of importing data: the draw tool or a transform map? Those are two completely different processes. The draw tool can only be used on documents. You still have to use a transform map and import set when you're dealing with data that is in a format like CSV; it's still the best way to do that. Now, if your data is in a document, then you want to use Document Intelligence.

Somebody is asking if we can extract from scanned PDFs. Absolutely, we do support scanned PDFs. It runs OCR on the file and can then extract the data.

And then the last question: what types of documents does the draw tool support? We support PDF, JPEG, and PNG documents.

All right, and with that, I'm going to show you the last feature, which is part of document classification. We mentioned that we can use document classification to do our classification before feeding that to the page range filter. So I'm going to show you the last new feature, which is the detection of blank pages. This is my use case for a document classifier. I loaded a document, and if I open that document in my agent experience and go to that page, I see I have a blank page here. If I change my document and split it into different pages by using the mix category, I can classify each of the different pages, and I see that for my blank page here, it's actually recommending that it's a blank page. So I can use that to make sure it's not being used in my model.

So that was it for the exercise.
If you have any other questions, I recommend again going to our generative AI and intelligence Community Forum; you can use the resources we have there.

One last question, and then I'm going to end the session. Somebody's asking where the draw tool is available. The draw tool is available as part of the Document Intelligence product, which is available on a separate subscription; check with your account team for your specific case.

All right, and that's it for the questions today, so I'm going to end the session. As a reminder, we're going to post the recording on YouTube afterward, and we have another session of AI Academy in two weeks. Hope to see you all there. With that, have a good day.
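
For readers following along afterward, the page range described in the demo (a range such as one to three, single pages such as five, or a combination of both) can be illustrated with a small sketch. This is not ServiceNow code and not the product's implementation. The function names `parse_page_range` and `select_pages` are hypothetical, and the sketch assumes the inner expression is written as comma-separated pages and hyphenated ranges, for example `1-3,5`. The surrounding brackets and the rest of the attachment filter are not reproduced here, because the exact syntax is not spelled out in the recording.

```python
def parse_page_range(expr: str) -> list[int]:
    """Expand an expression like "1-3,5" into a sorted list of page numbers.

    Assumed format (not confirmed by the recording): comma-separated items,
    each either a single 1-based page or a hyphenated inclusive range.
    """
    pages: set[int] = set()
    for part in expr.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1-2,"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


def select_pages(document_pages: list[str], expr: str) -> list[str]:
    """Keep only the pages named by expr, ignoring pages past the end."""
    wanted = parse_page_range(expr)
    return [document_pages[p - 1] for p in wanted if p <= len(document_pages)]


if __name__ == "__main__":
    four_pages = ["page 1", "page 2", "page 3", "page 4"]
    # Mirrors the demo: a four-page document limited to its first two pages.
    print(select_pages(four_pages, "1-2"))
```

In the demo the filter value was typed in by hand for a known document layout; the same expression could instead be built from classification results when the relevant pages vary from file to file.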

View original source

https://www.youtube.com/watch?v=a0fllfx_fmg