Overview:

  • Learn how to train the system to read an invoice table, including its lines (descriptions) and columns.


This type of annotation consists of line-based and column-based annotation techniques. The purpose of these techniques is to train the system to identify the lines and columns on the invoice correctly.


Table-based annotation

Use this type of annotation at the table (line) level of the invoice, for example when the data extracted from this part of the document is incorrect or missing.


Usually, the system defines the table automatically by drawing a blue frame around it:


In the above example, the system defined the table correctly and therefore extracted the information successfully.


Some invoice layouts are difficult for the system, and even for the human eye, to read, so the system may identify the table incorrectly or miss it altogether. This is OK: you can train the system to define the table correctly next time.


For example, in the image below the system picked up some extra data that should not appear at the line level:



You can draw an entirely new table or fix an existing one:


Drawing a new table:


You can easily draw a new frame around the table. To do so, please:


  1. Click the “Clean” button. It automatically summarizes all the lines into a single line. Skip this step if you need all the lines, and proceed to the next step.
  2. Click the “Table” button and move the cursor to the left. You will see a blue dot on the left side of the screen.

Use this dot to draw a new frame around the table:

  1. Click and release the cursor, then start drawing the frame around the invoice table. When finished, release the cursor to fix the frame in place. The frame should include the invoice lines only.
  2. Click “Annotate”, and keep the annotation window open until processing completes so that you can see the result.


The system should now extract the data from the table you defined:


If descriptions, quantities, or amounts are still not picked up from the correct column after table-based annotation, you can apply column-based annotation.
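
To picture what drawing a frame achieves, here is a minimal conceptual sketch in Python. It is not Ocerra's implementation: the `Box`, `TextLine`, and `lines_inside_frame` names and the sample coordinates are all hypothetical. It only illustrates the idea that text outside the frame is kept away from the line level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangle in page coordinates (hypothetical units)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, other: "Box") -> bool:
        # True when `other` lies entirely inside this rectangle.
        return (
            self.left <= other.left
            and self.top <= other.top
            and self.right >= other.right
            and self.bottom >= other.bottom
        )


@dataclass(frozen=True)
class TextLine:
    """One line of recognized text and where it sits on the page."""
    text: str
    box: Box


def lines_inside_frame(lines, frame):
    """Keep only the text lines that fall fully inside the drawn frame."""
    return [line for line in lines if frame.contains(line.box)]


# Hypothetical page: one stray note above the table and two real invoice lines.
page = [
    TextLine("Payment due in 30 days", Box(40, 100, 500, 120)),
    TextLine("Widget A 2 10.00", Box(40, 200, 500, 220)),
    TextLine("Widget B 1 25.00", Box(40, 230, 500, 250)),
]

# The frame covers the invoice lines only, as recommended in the steps above.
frame = Box(30, 190, 520, 260)
table_lines = lines_inside_frame(page, frame)
print([line.text for line in table_lines])
```

In this sketch the stray note sits above the frame, so it is excluded, which mirrors why the frame should include the invoice lines only.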


Column-based annotation

This type of annotation shows the system how to read a column from the invoice table correctly. To do so, you map each field to the correct column on the invoice. You will find an icon for this next to the corresponding column:



For example, to train the system to extract the correct Quantity column from the invoice:

  • Click the icon next to the Quantity column in Ocerra, and then click the header of that column in the original document on the left.
  • Click the "Annotate" button.
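
As a rough mental model, column-based annotation can be thought of as a mapping from an Ocerra field to a header on the invoice. The Python sketch below is an illustration under that assumption only. The `apply_column_mapping` function, the field names, and the sample headers are hypothetical and do not describe Ocerra's internals.

```python
def apply_column_mapping(rows, mapping):
    """Re-read each table row using a field -> invoice header mapping.

    rows:    list of dicts keyed by the headers printed on the invoice
    mapping: dict of target field name -> invoice header (hypothetical names)
    """
    return [
        {field: row.get(header) for field, header in mapping.items()}
        for row in rows
    ]


# Hypothetical invoice table as it appears on the original document.
invoice_rows = [
    {"Item": "Widget A", "Units": "2", "Line total": "20.00"},
    {"Item": "Widget B", "Units": "1", "Line total": "25.00"},
]

# Collect every column that needs fixing first...
mapping = {
    "Description": "Item",
    "Quantity": "Units",
    "Amount": "Line total",
}

# ...then apply them all in a single step.
extracted = apply_column_mapping(invoice_rows, mapping)
print(extracted[0])
```

The mapping is built in full before it is applied, in the same way that you can map several columns before clicking "Annotate".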


Tip: If you have several columns to fix, map them all first and then click “Annotate” once:


 


We use a combination of the best data extraction tools available on the market, including our own engine built in-house. Still, some invoice layouts may not be fully trainable.


The beauty of Ocerra is that the data processing screen is fully editable. You can add data manually or use the value-based annotation technique to add it quickly.


Please refer to this video for a detailed explanation of the invoice annotation process.