Google Vision: detect text in PDFs synchronously with PHP

The Vision API now supports online (synchronous) small batch annotation of PDF, TIFF and GIF files for all features. The relevant documentation is Small batch file annotation online.

Let’s see how we can do this with PHP.

Context

You need PHP >= 7.4 and the following Composer packages:

google/cloud-vision
google/cloud-storage

Code

How to upload the file in the storage

Soon.
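
Until that part is written, here is a minimal sketch of one way to do it with the google/cloud-storage package. It assumes that Composer's autoloader is already loaded and that application default credentials are configured. The function name uploadToBucket, the bucket name and the paths are hypothetical placeholders.

```php
<?php
// Assumption: vendor/autoload.php has already been required elsewhere.
use Google\Cloud\Storage\StorageClient;

/**
 * Uploads a local file to a bucket and returns its gs:// URI.
 * Hypothetical helper: the name and parameters are illustrative only.
 */
function uploadToBucket(
    StorageClient $storage,
    string $bucketName,
    string $localPath,
    string $objectName
): string {
    $handle = fopen($localPath, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $localPath");
    }

    // Bucket::upload() accepts a stream and the target object name.
    $object = $storage->bucket($bucketName)->upload($handle, ['name' => $objectName]);

    // gcsUri() formats the location as gs://<bucket>/<object name>.
    return $object->gcsUri();
}

// Usage (placeholder names):
// $path = uploadToBucket(new StorageClient(), 'my-bucket', '/tmp/file.pdf', 'path/to/my/file.pdf');
```

The returned URI is the kind of value that $path holds in the text detection code.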

Text detection

Even for PDFs we are going to use ImageAnnotatorClient, the client for the service that performs Google Cloud Vision API detection tasks over client images and returns the entities detected in them.

use Google\Cloud\Vision\V1\AnnotateFileRequest;
use Google\Cloud\Vision\V1\Feature;
use Google\Cloud\Vision\V1\Feature\Type;
use Google\Cloud\Vision\V1\GcsSource;
use Google\Cloud\Vision\V1\ImageAnnotatorClient;
use Google\Cloud\Vision\V1\ImageContext;
use Google\Cloud\Vision\V1\InputConfig;

$path = "gs://mystorage.com/path/to/my/file.pdf";

/* If you know it, you can give a hint about the language of the document */
$context = new ImageContext();
$context->setLanguageHints(['it']);

/* Here's the annotator described before */
$imageAnnotator = new ImageAnnotatorClient();

/* We create an AnnotateFileRequest instance to annotate one single file */
$file_request = new AnnotateFileRequest();

/* We express our input file in terms of a GcsSource instance that represents the Google Cloud Storage location */
$gcs_source = (new GcsSource())
    ->setUri($path);

/* Let's specify the feature we need. You can find the options below */
$feature = (new Feature())
    ->setType(Type::DOCUMENT_TEXT_DETECTION);

/* Let's specify the file info: a PDF in that location */
$input_config = (new InputConfig())
    ->setMimeType('application/pdf')
    ->setGcsSource($gcs_source);

/* Some configurations: the language hint and the pages of the file to annotate (numbered from 1, at most 5 per request) */
$file_request = $file_request->setInputConfig($input_config)
    ->setFeatures([$feature])
    ->setImageContext($context)
    ->setPages([1]);

/* Annotate the file with a synchronous batch request. This is the 1.x-style call; newer major versions of the package may expect a request object instead. */
$result = $imageAnnotator->batchAnnotateFiles([$file_request]);

/* One file was sent, so we take the first file response; one page was requested, so we take its first page response */
$file_response = $result->getResponses()[0];
$res = $file_response->getResponses()[0];

/* Finally!!! The annotations! */
$annotations = $res->getFullTextAnnotation();

/* Clean up resources such as threads */
$imageAnnotator->close();
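
The object returned by getFullTextAnnotation() is a TextAnnotation. Its getText() method gives the whole recognized text as one string, and the structure below it goes pages, blocks, paragraphs, words, symbols. As a rough sketch of how to walk that hierarchy, the hypothetical helper wordsByPage() collects the words of each page. It only relies on the getter names, so it carries no type hints from the library.

```php
<?php
/**
 * Collects the words of a full-text annotation, page by page.
 * Hypothetical helper: $annotation is expected to be the object
 * returned by getFullTextAnnotation().
 */
function wordsByPage($annotation): array
{
    $pages = [];
    foreach ($annotation->getPages() as $page) {
        $words = [];
        foreach ($page->getBlocks() as $block) {
            foreach ($block->getParagraphs() as $paragraph) {
                foreach ($paragraph->getWords() as $word) {
                    // A word is stored as a sequence of symbols (characters).
                    $text = '';
                    foreach ($word->getSymbols() as $symbol) {
                        $text .= $symbol->getText();
                    }
                    $words[] = $text;
                }
            }
        }
        $pages[] = $words;
    }

    return $pages;
}

// Usage with the variable from the code above:
// if ($annotations !== null) {
//     echo $annotations->getText();
//     print_r(wordsByPage($annotations));
// }
```

The null check is there because a page without any recognized text may come back without a full-text annotation.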

Features

In your request you can set the type of annotation you want to perform on the file. For the available types, check the reference or the features list documentation.

Some examples are:

  • Face detection
  • Landmark detection
  • Logo detection
  • Label detection
  • Text and document text detection
  • ..
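
Since setFeatures() takes an array, several of these can go into a single request. Here is a small sketch, assuming Composer's autoloader is loaded; buildFeatures() is a hypothetical helper, not part of the library.

```php
<?php
// Assumption: vendor/autoload.php has already been required elsewhere.
use Google\Cloud\Vision\V1\Feature;
use Google\Cloud\Vision\V1\Feature\Type;

/**
 * Builds one Feature object per requested type constant.
 * Hypothetical helper: the name is illustrative only.
 */
function buildFeatures(array $types): array
{
    return array_map(
        function (int $type): Feature {
            return (new Feature())->setType($type);
        },
        $types
    );
}

// Usage with the request from the text detection code:
// $file_request->setFeatures(buildFeatures([
//     Type::LABEL_DETECTION,
//     Type::LOGO_DETECTION,
// ]));
```

Each feature fills its own field in the page response, for example getLabelAnnotations() and getLogoAnnotations() instead of getFullTextAnnotation().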
