OCR App

Basic information

This topic provides you with general information about the OCR API.

About the OCR API

This documentation describes how to use the programming interface of the OCR app in your own developments. The OCR app provides text recognition functions for digital pages.

Scope of function of the OCR app

The OCR app contains interfaces for recognizing text in documents. It has the following functions:

Full text recognition
Recognition of hOCR information

Using the API functions

In this section you will learn about the different options for using the interfaces of the OCR app for your requirements.

General information

This chapter provides general information on using the API.

Requirements

To use the provided interfaces, you require an app key. To get this key, please contact d.velop support.

Restrictions

The OCR app can process a maximum of 500 pages per request. If the document provided contains more than 500 pages, OCR is not performed and the job is marked as containing errors.

Jobs created with the OCR app are temporary and are stored for a maximum of one day. Presigned URLs that are provided by the OCR app are valid for a maximum of 5 minutes.
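
The limits above can be captured as constants so that a client fails fast before contacting the service. The following Python sketch is illustrative only: all names are hypothetical, and it assumes that the 5-minute validity of a presigned URL is counted from the moment the client receives the response, which this documentation does not specify.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Limits taken from the Restrictions section.
MAX_PAGES_PER_REQUEST = 500
JOB_RETENTION = timedelta(days=1)
PRESIGNED_URL_LIFETIME = timedelta(minutes=5)


def check_page_count(page_count: int) -> None:
    """Raise ValueError if a document exceeds the documented page limit."""
    if page_count > MAX_PAGES_PER_REQUEST:
        raise ValueError(
            f"{page_count} pages exceed the limit of {MAX_PAGES_PER_REQUEST}"
        )


def presigned_url_expired(received_at: datetime,
                          now: Optional[datetime] = None) -> bool:
    """Return True once the assumed validity window has elapsed.

    Assumption: the window starts when the response was received.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - received_at >= PRESIGNED_URL_LIFETIME
```

A client could call check_page_count before creating a job and request the job again through its self link when presigned_url_expired returns True.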

Supported formats

The OCR app supports the following input formats:

application/pdf
image/png
image/jpeg
image/tiff
image/bmp

The following target formats are supported:

text/plain
text/html;format=hocr
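
As a minimal sketch, the two lists above can be used to validate a format pair on the client side before a job is created. The function name is hypothetical, and a hard-coded list can become outdated; the features resource of the app remains the authoritative source.

```python
# Formats copied from the lists above.
SOURCE_MIME_TYPES = frozenset({
    "application/pdf",
    "image/png",
    "image/jpeg",
    "image/tiff",
    "image/bmp",
})
TARGET_MIME_TYPES = frozenset({
    "text/plain",
    "text/html;format=hocr",
})


def validate_formats(source_mime_type: str, target_mime_type: str) -> None:
    """Raise ValueError if either format is not in the documented lists."""
    if source_mime_type not in SOURCE_MIME_TYPES:
        raise ValueError(f"Unsupported source format: {source_mime_type}")
    if target_mime_type not in TARGET_MIME_TYPES:
        raise ValueError(f"Unsupported target format: {target_mime_type}")
```

For example, validate_formats("application/pdf", "text/html;format=hocr") returns without an error, while a source format of "image/gif" raises a ValueError.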

Determining the available interfaces and formats

To use the interfaces of the OCR app, you must first determine which interfaces are provided. To do so, send an HTTP GET request to the root resource of the app:

Request

GET /ocr
Accept: application/hal+json

Response

{
    "_links": {
        "converter": {
            "href": "/ocr/convert/features"
        }
    }
}
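
The request and response shown above could be handled in Python roughly as follows. This is a sketch using only the standard library; the base URL is a placeholder for your own system, the function names are hypothetical, and authentication is not covered.

```python
import urllib.request


def build_root_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the root resource."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/ocr",
        headers={"Accept": "application/hal+json"},
        method="GET",
    )


def converter_href(root_document: dict) -> str:
    """Read the converter link relation from the parsed root response."""
    return root_document["_links"]["converter"]["href"]
```

The request can be sent with urllib.request.urlopen, and the response body can be parsed with json.loads before it is passed to converter_href.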

Follow the converter link relation to determine the available interfaces:

Request

GET /ocr/convert/features
Accept: application/hal+json

Response

{
    "async": [
        {
            "_links": {
                "convert": {
                    "href": "/ocr/recognition/"
                }
            },
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/plain"
        },
        {
            "_links": {
                "convert": {
                    "href": "/ocr/recognition/"
                }
            },
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/html;format=hocr"
        },
        {
            "_links": {
                "convert": {
                    "href": "/ocr/recognition/"
                }
            },
            "sourceMimeType": "image/png",
            "destinationMimeType": "text/plain"
        },
         .....
    ]
}
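
Selecting the matching interface from this response amounts to a lookup by source and destination format. The following sketch assumes the response has already been parsed into a dictionary; the function name is hypothetical.

```python
from typing import Optional


def find_convert_href(features: dict,
                      source_mime_type: str,
                      destination_mime_type: str) -> Optional[str]:
    """Return the convert link for the given format pair, or None."""
    for entry in features.get("async", []):
        if (entry.get("sourceMimeType") == source_mime_type
                and entry.get("destinationMimeType") == destination_mime_type):
            return entry["_links"]["convert"]["href"]
    return None
```

A result of None means that the app does not offer the requested combination, so no job should be created for it.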

Performing OCR

Once you have determined the required interface as described in the section Determining the available interfaces and formats, you can start OCR.

First, create a job in the OCR app by sending an HTTP POST request to the determined interface. Specify the desired target format in the Accept header.

Request

POST /ocr/recognition
Accept: text/plain

{
   "sourceMimeType": "application/pdf",
   "callbackUri": "/my-app/callback",
   "languageCodes": [
      "deu",
      "eng"
   ],
   "appKey": "my-custom-app-key"
}

Notes on the properties

Property	Type	Description	Required
sourceMimeType	string	MIME type of the file to be processed. You can find the supported formats in the section Supported formats.	Yes
callbackUri	string	Specifies the URI that is called when OCR is completed.	No
languageCodes	string[]	Specifies the languages to be used for recognition by the OCR module. At present, German (deu) and English (eng) are supported. If you omit this parameter, both languages are used.	No
appKey	string	Key required to use the interface. If you do not have an app key yet, contact d.velop support.	Yes
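
The job creation request could be assembled as in the following sketch. The function name and the base URL are hypothetical, and the Content-Type header application/json is an assumption that does not appear in the request shown above. Optional properties are only sent when a value is supplied.

```python
import json
import urllib.request
from typing import Optional, Sequence
from urllib.parse import urljoin


def build_job_request(base_url: str,
                      convert_href: str,
                      app_key: str,
                      source_mime_type: str,
                      target_mime_type: str,
                      callback_uri: Optional[str] = None,
                      language_codes: Optional[Sequence[str]] = None,
                      ) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an OCR job."""
    body = {"sourceMimeType": source_mime_type, "appKey": app_key}
    if callback_uri is not None:
        body["callbackUri"] = callback_uri
    if language_codes:
        body["languageCodes"] = list(language_codes)
    return urllib.request.Request(
        urljoin(base_url, convert_href),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": target_mime_type,          # desired target format
            "Content-Type": "application/json",  # assumption, see above
        },
        method="POST",
    )
```

Passing the convert link from the features response as convert_href avoids hard-coding the route in the client.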

After you execute the request shown above, you receive the following response.

Response

{
    "_links": {
        "self": {
            "href": "/ocr/recognition/my-job-id/"
        },
        "upload": {
            "href": "https://presigned-upload-url"
        }
    },
    "jobId": "my-job-id",
    "sourceMimeType": "application/pdf",
    "status": 0
}

Notes on the properties

Property	Description
_links.upload	URL for uploading the file on which OCR is to be performed.
jobId	Unique identifier for the created job.
status	Status of the job. A breakdown of the possible statuses is shown in the section Description of the available job statuses and page statuses.
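
A client typically needs only a few values from this response. The sketch below extracts them into a small data class; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreatedJob:
    job_id: str
    self_href: str   # route for retrieving the job later
    upload_url: str  # presigned URL for uploading the source file
    status: int


def parse_created_job(document: dict) -> CreatedJob:
    """Extract the relevant values from the parsed job creation response."""
    links = document["_links"]
    return CreatedJob(
        job_id=document["jobId"],
        self_href=links["self"]["href"],
        upload_url=links["upload"]["href"],
        status=document["status"],
    )
```

Because the upload URL is presigned and short-lived, it is best used immediately after the job is created.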

After you successfully create a job, upload your source file to the OCR app by sending an HTTP PUT request to the URL provided under the upload link relation:

Request

PUT https://presigned-upload-url
Content-Type: application/pdf

<Binary-Content>
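
The upload can be sketched with the Python standard library as follows. The function name is hypothetical; the example assumes that the Content-Type header must match the sourceMimeType given when the job was created, as in the request shown above.

```python
import urllib.request


def build_upload_request(upload_url: str,
                         content: bytes,
                         source_mime_type: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that uploads the source file."""
    return urllib.request.Request(
        upload_url,
        data=content,
        headers={"Content-Type": source_mime_type},
        method="PUT",
    )
```

The file content can be read with open(path, "rb").read(), and the request can be sent with urllib.request.urlopen.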

After the file is successfully uploaded, OCR begins. Your app is notified through the specified callbackUri when recognition is complete. If you did not specify a callbackUri when creating the job, you can retrieve the job using the URL provided in the self link relation. In both cases, you receive the following JSON:

Response

{
    "_links": {
        "self": {
            "href": "/ocr/recognition/my-job-id"
        }
    },
    "jobId": "my-job-id",
    "sourceMimeType": "application/pdf",
    "status": 2,
    "message": "",
    "pageResults": [
        {
            "_links": {
                "outputfile": {
                    "href": "https://presigned-outputfile-url"
                }
            },
            "index": 0,
            "status": 2,
            "message": ""
        }
    ]
}

Notes on the properties

Property	Description
jobId	Unique identifier for the job.
status	Status of the job. All the possible statuses are described in the section Description of the available job statuses and page statuses.
pageResults	List of the results for each page.
pageResults._links.outputfile	URL for downloading the respective page results.
pageResults.index	The page index.
pageResults.status	The page processing status. All the possible statuses are described in the section Description of the available job statuses and page statuses.
pageResults.message	Error message if an error occurs during recognition.
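
To work with the page results, a client can separate the pages that were recognized successfully from those that failed. The following sketch assumes the job JSON has already been parsed; the function name is hypothetical, and the status values are those listed in the page status table.

```python
from typing import Dict, Tuple

PAGE_FINISHED = 2  # page status "Finished"


def split_page_results(job: dict) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Return (output URLs by page index, error messages by page index)."""
    output_urls: Dict[int, str] = {}
    errors: Dict[int, str] = {}
    for page in job.get("pageResults", []):
        index = page["index"]
        if page["status"] == PAGE_FINISHED:
            output_urls[index] = page["_links"]["outputfile"]["href"]
        else:
            errors[index] = page.get("message", "")
    return output_urls, errors
```

The output URLs are presigned, so the page results should be downloaded soon after the job is retrieved.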

Description of the available job statuses and page statuses

Job status

Status	Name	Description
0	WaitingForFile	The job was created successfully. The system is waiting for the source file to be uploaded so that recognition can begin.
1	InProgress	Processing is in progress.
2	Finished	Processing was performed successfully.
3	Error	An error occurred during processing.
4	PartialSuccess	Only part of the pages provided in the source file could be processed successfully.

Page status

Status	Name	Description
2	Finished	Page recognition was performed successfully.
3	Error	An error occurred during page recognition.
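
The status codes can be modeled as enumerations, which also makes polling without a callbackUri easier to read. The following is a sketch under stated assumptions: the names are hypothetical, the statuses Finished, Error and PartialSuccess are assumed to be final, and fetch_job stands for any function of yours that retrieves and parses the job from its self link.

```python
import time
from enum import IntEnum
from typing import Callable


class JobStatus(IntEnum):
    WAITING_FOR_FILE = 0
    IN_PROGRESS = 1
    FINISHED = 2
    ERROR = 3
    PARTIAL_SUCCESS = 4


class PageStatus(IntEnum):
    FINISHED = 2
    ERROR = 3


# Assumption: these job statuses do not change anymore once reached.
FINAL_JOB_STATUSES = frozenset({
    JobStatus.FINISHED,
    JobStatus.ERROR,
    JobStatus.PARTIAL_SUCCESS,
})


def wait_for_job(fetch_job: Callable[[], dict],
                 interval_seconds: float = 2.0,
                 max_attempts: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_job repeatedly until the job reaches a final status."""
    for attempt in range(max_attempts):
        job = fetch_job()
        if JobStatus(job["status"]) in FINAL_JOB_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("Job did not reach a final status in time")
```

The interval and the number of attempts are arbitrary defaults and should be adapted to the size of the documents being processed.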