Thursday, 24 March 2022

Amazon Textract

 

  • A fully managed document analysis service for detecting and extracting information from scanned documents.
  • Returns extracted data as key-value pairs (e.g., Name: John Doe).
  • Supports virtually any type of document.
  • Can detect text written in the standard English alphabet and ASCII symbols.
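
To illustrate the key-value output mentioned above, here is a minimal sketch of pairing keys with values from an Analyze Document style response. The hand-written `sample_response` and the helper name `extract_key_values` are assumptions for illustration only. The block layout (`KEY_VALUE_SET` blocks linked through `VALUE` and `CHILD` relationships) follows the documented response shape, but real responses carry many more fields.

```python
def extract_key_values(response):
    """Pair KEY and VALUE blocks from an AnalyzeDocument-style response dict."""
    # Index every block by its Id so relationships can be followed quickly.
    blocks = {block["Id"]: block for block in response["Blocks"]}

    def text_of(block):
        # Join the WORD children of a block into a single string.
        words = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                for child_id in rel["Ids"]:
                    child = blocks[child_id]
                    if child["BlockType"] == "WORD":
                        words.append(child["Text"])
        return " ".join(words)

    pairs = {}
    for block in blocks.values():
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                value_text = " ".join(text_of(blocks[v]) for v in rel["Ids"])
        pairs[text_of(block)] = value_text
    return pairs


# Hypothetical, heavily trimmed response for a form containing "Name: John Doe".
sample_response = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w2", "w3"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Name:"},
        {"Id": "w2", "BlockType": "WORD", "Text": "John"},
        {"Id": "w3", "BlockType": "WORD", "Text": "Doe"},
    ]
}

print(extract_key_values(sample_response))  # {'Name:': 'John Doe'}
```

This sketch ignores checkboxes and other selection elements, which a fuller parser would also need to handle.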

Common Use Cases:

  • Building search indexes
  • Importing documents into a business application
  • Building automated document processing solutions
  • Text extraction for Natural Language Processing (NLP) Applications
  • Maintaining document compliance

Concepts

  • Amazon Textract returns a confidence score for each identified element, indicating the probability that the prediction is correct.
  • Low-confidence predictions can be routed to Amazon Augmented AI (A2I) for human review.
  • Asynchronous operations allow you to process multipage PDF documents.
  • Detect Document Text API
    • Uses optical character recognition (OCR) technology to extract printed text and handwriting from a document.
  • Analyze Document API
    • Extracts printed text and handwriting, plus structured data such as tables and key-value pairs from forms.
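
The confidence-score idea above can be sketched roughly as follows. The threshold of 90, the helper names `split_by_confidence` and `detect_lines`, and the sample data are assumptions for illustration. `detect_lines` expects a client such as `boto3.client("textract")` to be passed in, and calls its `detect_document_text` operation. Items in the review list are candidates for a human review workflow such as A2I; how they are submitted is not shown here.

```python
def split_by_confidence(response, threshold=90.0):
    """Split LINE blocks into confident results and ones needing review.

    Textract reports Confidence on a 0-100 scale; the threshold here is an
    arbitrary illustrative value, not a recommended setting.
    """
    confident, needs_review = [], []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks in this sketch
        target = confident if block["Confidence"] >= threshold else needs_review
        target.append((block["Text"], block["Confidence"]))
    return confident, needs_review


def detect_lines(client, image_bytes, threshold=90.0):
    """Run synchronous text detection on image bytes, then split by confidence.

    `client` is assumed to be a Textract client, e.g. boto3.client("textract").
    """
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return split_by_confidence(response, threshold)


# Hypothetical, trimmed response used to exercise the helper without AWS access.
sample_detection = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 1001", "Confidence": 99.2},
        {"BlockType": "LINE", "Text": "Tota1 due", "Confidence": 61.5},
    ]
}

ok, review = split_by_confidence(sample_detection)
print(ok)      # [('Invoice 1001', 99.2)]
print(review)  # [('Tota1 due', 61.5)]
```

Passing the client in as a parameter keeps the helper easy to test with a stub and leaves credentials and region configuration to the caller.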

Pricing

  • You only pay for what you use.
  • Charges differ between the Detect Document Text API and the Analyze Document API, with the latter being the more expensive.
