OCR app

Basic information

This topic provides you with general information about the OCR API.

About the OCR API

This documentation shows you how to use the programming interface of the OCR app in your own applications. The OCR app provides text recognition functions for digital pages.

Functional scope of the OCR app

The OCR app contains interfaces for recognizing text in documents. It has the following functions:

  • Full text recognition
  • Recognition of hOCR information

Using the API functions

In this section you will learn about the different options for using the interfaces of the OCR app for your requirements.

General information

This chapter provides general information on using the API.

Requirements

To use the provided interfaces, you require an app key. To get this key, please contact d.velop support.

Restrictions

The OCR app can process a maximum of 500 pages per request. If the document provided contains more than 500 pages, OCR is not performed and the job is marked as containing errors.

Jobs created with the OCR app are temporary and are stored for a maximum of one day. Presigned URLs that are provided by the OCR app are valid for a maximum of 5 minutes.

Supported formats

The OCR app supports the following input formats:

  • application/pdf
  • image/png
  • image/jpeg
  • image/tiff
  • image/bmp

The following target formats are supported:

  • text/plain
  • text/html;format=hocr
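
The format lists above can be checked on the client before a job is created. The following is a minimal sketch; the helper name check_formats and the constant names are hypothetical and not part of the OCR API.

```python
# Formats copied from the lists above. The names below are illustrative only.
SUPPORTED_SOURCE_TYPES = frozenset({
    "application/pdf",
    "image/png",
    "image/jpeg",
    "image/tiff",
    "image/bmp",
})
SUPPORTED_TARGET_TYPES = frozenset({"text/plain", "text/html;format=hocr"})


def check_formats(source_mime_type: str, target_mime_type: str) -> None:
    """Raise ValueError if either format is missing from the documented lists."""
    if source_mime_type not in SUPPORTED_SOURCE_TYPES:
        raise ValueError(f"unsupported source format: {source_mime_type}")
    if target_mime_type not in SUPPORTED_TARGET_TYPES:
        raise ValueError(f"unsupported target format: {target_mime_type}")
```

A static check like this can go stale, so the features resource of the app remains the authoritative source for the available combinations.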

Determining the available interfaces and formats

To use the interfaces of the OCR app, you must first determine which interfaces are provided. To do so, send an HTTP GET request to the root resource of the app:

Request

GET /ocr
Accept: application/hal+json

Response

{
    "_links": {
        "converter": {
            "href": "/ocr/convert/features"
        }
    }
}

Follow the converter link relation to determine the available interfaces:

Request

GET /ocr/convert/features
Accept: application/hal+json

Response

{
    "async": [
        {
            "_links": {
                "convert": {
                    "href": "/ocr/recognition/"
                }
            },
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/plain"
        },
        {
            "_links": {
                "convert": {
                    "href": "/ocr/recognition/"
                }
            },
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/html;format=hocr"
        },
        {
            "_links": {
                "convert": {
                    "href": "/ocr/recognition/"
                }
            },
            "sourceMimeType": "image/png",
            "destinationMimeType": "text/plain"
        },
        .....
    ]
}
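
Selecting the interface for a given pair of formats can be sketched as a lookup over the async list. This example assumes the response has already been parsed into a Python dictionary; the function name find_convert_href is hypothetical.

```python
from typing import Optional


def find_convert_href(features: dict, source: str, destination: str) -> Optional[str]:
    """Return the convert link for a source/destination pair, or None if absent."""
    for feature in features.get("async", []):
        if (feature.get("sourceMimeType") == source
                and feature.get("destinationMimeType") == destination):
            return feature["_links"]["convert"]["href"]
    return None


# Sample data shaped like the first two entries of the response above.
sample = {
    "async": [
        {
            "_links": {"convert": {"href": "/ocr/recognition/"}},
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/plain",
        },
        {
            "_links": {"convert": {"href": "/ocr/recognition/"}},
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/html;format=hocr",
        },
    ]
}

href = find_convert_href(sample, "application/pdf", "text/html;format=hocr")
print(href)  # prints /ocr/recognition/
```

Reading the link from the response instead of hard-coding the path keeps the client working if the route changes.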

Performing OCR

Once you have determined the required interface as described in the section Determining the available interfaces and formats, you can begin OCR.

To do so, you first create a job in the OCR app by executing an HTTP POST request for the determined interface. You can specify your desired target format in the Accept header.

Request

POST /ocr/recognition
Accept: text/plain

{
    "sourceMimeType": "application/pdf",
    "callbackUri": "/my-app/callback",
    "languageCodes": [
        "deu",
        "eng"
    ],
    "appKey": "my-custom-app-key"
}

Notes on the properties

| Property | Type | Description | Required |
|---|---|---|---|
| sourceMimeType | string | MIME type of the file to be processed. You can find the supported formats in the section Supported formats. | Yes |
| callbackUri | string | Specifies the URI that is called when OCR is completed. | No |
| languageCodes | string[] | Specifies the languages to be used for recognition by the OCR module. At present, German (deu) and English (eng) are supported. If you omit this parameter, both languages are used. | No |
| appKey | string | Key required to use the interface. If you do not have an app key yet, contact d.velop support. | Yes |
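
The job request can be assembled as follows. This is a sketch under assumptions: base_url stands for the address of your system, the Content-Type header value application/json is assumed because the body is JSON, and the function name build_job_request is hypothetical. Authentication is not covered here.

```python
import json
import urllib.request
from typing import Optional, Sequence


def build_job_request(
    base_url: str,
    convert_href: str,
    target_mime_type: str,
    source_mime_type: str,
    app_key: str,
    callback_uri: Optional[str] = None,
    language_codes: Optional[Sequence[str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an OCR job."""
    # Required properties according to the table above.
    body = {"sourceMimeType": source_mime_type, "appKey": app_key}
    # Optional properties are only included when they are set.
    if callback_uri is not None:
        body["callbackUri"] = callback_uri
    if language_codes:
        body["languageCodes"] = list(language_codes)
    return urllib.request.Request(
        url=base_url.rstrip("/") + convert_href,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": target_mime_type,          # desired target format
            "Content-Type": "application/json",  # assumption, see lead-in
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to send the request.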

After you execute the request shown above, you receive the following response.

Response

{
    "_links": {
        "self": {
            "href": "/ocr/recognition/my-job-id/"
        },
        "upload": {
            "href": "https://presigned-upload-url"
        }
    },
    "jobId": "my-job-id",
    "sourceMimeType": "application/pdf",
    "status": 0
}

Notes on the properties

| Property | Description |
|---|---|
| _links.upload | Upload URL for uploading the file for which OCR is to be performed. |
| jobId | Unique identifier for the created job. |
| status | Status of the job. A breakdown of the possible statuses is shown in the section Description of the available job statuses and page statuses. |

After you successfully create a job, you must then upload your source file to the OCR app by executing an HTTP PUT request for the route provided under the upload link relation:

Request

PUT https://presigned-upload-url
Content-Type: application/pdf

<Binary-Content>
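
The upload step can be sketched as follows, assuming the job creation response has been parsed into a dictionary. The function name build_upload_request is hypothetical. Because presigned URLs are valid for a maximum of 5 minutes, the request should be sent soon after the job is created.

```python
import urllib.request


def build_upload_request(job: dict, content: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT request that uploads the source file."""
    # The upload link relation comes from the job creation response.
    upload_url = job["_links"]["upload"]["href"]
    return urllib.request.Request(
        url=upload_url,
        data=content,
        # Reuse the MIME type that was declared when the job was created.
        headers={"Content-Type": job["sourceMimeType"]},
        method="PUT",
    )
```

The returned object can be passed to urllib.request.urlopen to perform the upload.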

After the file is successfully uploaded, OCR begins. Your app is notified through the callbackUri you specified when recognition is complete. If you did not specify a callback URI while creating the job, you can retrieve the job with the URL provided in the self link relation. In both cases, you receive the following JSON:

Response

{
    "_links": {
        "self": {
            "href": "/ocr/recognition/my-job-id"
        }
    },
    "jobId": "my-job-id",
    "sourceMimeType": "application/pdf",
    "status": 2,
    "message": "",
    "pageResults": [
        {
            "_links": {
                "outputfile": {
                    "href": "https://presigned-outputfile-url"
                }
            },
            "index": 0,
            "status": 2,
            "message": ""
        }
    ]
}

Notes on the properties

| Property | Description |
|---|---|
| jobId | Unique identifier for the job. |
| status | Status of the job. All the possible statuses are described in the section Description of the available job statuses and page statuses. |
| pageResults | List of the results for each page. |
| pageResults._links.outputfile | URL for downloading the respective page results. |
| pageResults.index | The page index. |
| pageResults.status | The page processing status. All the possible statuses are described in the section Description of the available job statuses and page statuses. |
| pageResults.message | Error message if an error occurs during recognition. |
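
If you work without a callback, retrieving the job through the self link relation can be sketched as a polling loop. In this example, fetch_job is a hypothetical callable that you supply; it requests the self URL and returns the parsed JSON. The interval and attempt limit are arbitrary assumptions, and the status numbers are those from the section Description of the available job statuses and page statuses.

```python
import time
from typing import Callable, Dict

# Statuses 0 (WaitingForFile) and 1 (InProgress) mean the job is not done yet.
NOT_FINAL = (0, 1)


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Call fetch_job until the job leaves statuses 0 and 1, then return it."""
    for _ in range(max_attempts):
        job = fetch_job()
        if job["status"] not in NOT_FINAL:
            return job
        time.sleep(interval_seconds)
    raise TimeoutError("job did not reach a final status")


def output_urls(job: dict) -> Dict[int, str]:
    """Map page index to download URL for pages with status 2 (Finished)."""
    return {
        page["index"]: page["_links"]["outputfile"]["href"]
        for page in job.get("pageResults", [])
        if page["status"] == 2
    }
```

Because the output file URLs are presigned, download the page results promptly after the job is returned.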

Description of the available job statuses and page statuses

Job status

| Status | Name | Description |
|---|---|---|
| 0 | WaitingForFile | The job was created successfully. The system is waiting for the source file to be uploaded so that recognition can begin. |
| 1 | InProgress | Processing is in progress. |
| 2 | Finished | Processing was performed successfully. |
| 3 | Error | An error occurred during processing. |
| 4 | PartialSuccess | Only part of the pages provided in the source file could be processed successfully. |

Page status

| Status | Name | Description |
|---|---|---|
| 2 | Finished | Page recognition was performed successfully. |
| 3 | Error | An error occurred during page recognition. |
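
In client code, the numeric statuses can be mapped to named constants. The following sketch mirrors the two tables above; the class names, member names, and the helper is_final are hypothetical.

```python
from enum import IntEnum


class JobStatus(IntEnum):
    WAITING_FOR_FILE = 0
    IN_PROGRESS = 1
    FINISHED = 2
    ERROR = 3
    PARTIAL_SUCCESS = 4


class PageStatus(IntEnum):
    FINISHED = 2
    ERROR = 3


def is_final(status: int) -> bool:
    """Return True once a job status indicates that processing has ended."""
    return JobStatus(status) in {
        JobStatus.FINISHED,
        JobStatus.ERROR,
        JobStatus.PARTIAL_SUCCESS,
    }
```

A status value that is not listed in the tables raises ValueError in this sketch, which makes unexpected responses visible early.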