OCR App
Basic information
This topic provides you with general information about the OCR API.
About the OCR API
This documentation shows you how to use the programming interface of the OCR app for your own developments. The OCR app provides text recognition functions for digital pages.
Scope of function of the OCR app
The OCR app contains interfaces for recognizing text in documents. It has the following functions:
- Full text recognition
- Recognition of hOCR information
Using the API functions
In this section you will learn about the different options for using the interfaces of the OCR app for your requirements.
General information
This chapter provides general information on using the API.
Requirements
To use the provided interfaces, you require an app key. To get this key, please contact d.velop support.
Restrictions
The OCR app can process a maximum of 500 pages per request. If the document provided contains more than 500 pages, OCR is not performed and the job is marked as containing errors.
Jobs created with the OCR app are temporary and are stored for a maximum of one day. Presigned URLs that are provided by the OCR app are valid for a maximum of 5 minutes.
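The limits above can be captured as constants on the client side. The following is a minimal sketch, not part of the OCR app: the helper names are hypothetical, and it assumes that your application already knows the page count of the document and the time at which a presigned URL was issued.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Values taken from the restrictions described above.
MAX_PAGES_PER_REQUEST = 500
JOB_LIFETIME = timedelta(days=1)
PRESIGNED_URL_LIFETIME = timedelta(minutes=5)


def within_page_limit(page_count: int) -> bool:
    """Return True if a document with this many pages can be submitted."""
    return 0 < page_count <= MAX_PAGES_PER_REQUEST


def presigned_url_expired(issued_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once a presigned URL has reached its maximum age.

    Conservative assumption: the URL is treated as expired at exactly 5 minutes.
    """
    now = now or datetime.now(timezone.utc)
    return now - issued_at >= PRESIGNED_URL_LIFETIME
```

Checking the page count before creating a job avoids a job that is only marked as failed after the upload.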
Supported formats
The OCR app supports the following input formats:
- application/pdf
- image/png
- image/jpeg
- image/tiff
- image/bmp
The following target formats are supported:
- text/plain
- text/html;format=hocr
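A client can reject unsupported combinations before sending a request. The sketch below simply mirrors the two lists above; the function name check_conversion is hypothetical, and the authoritative list is still the one returned by the app itself.

```python
# MIME types copied from the lists above.
SUPPORTED_SOURCE_TYPES = frozenset({
    "application/pdf",
    "image/png",
    "image/jpeg",
    "image/tiff",
    "image/bmp",
})
SUPPORTED_TARGET_TYPES = frozenset({
    "text/plain",
    "text/html;format=hocr",
})


def check_conversion(source_mime_type: str, target_mime_type: str) -> None:
    """Raise ValueError if the source or target format is not listed as supported."""
    if source_mime_type not in SUPPORTED_SOURCE_TYPES:
        raise ValueError(f"Unsupported source format: {source_mime_type}")
    if target_mime_type not in SUPPORTED_TARGET_TYPES:
        raise ValueError(f"Unsupported target format: {target_mime_type}")
```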
Determining the available interfaces and formats
To use the interfaces of the OCR app, you must first determine which interfaces are provided. To do so, perform an HTTP GET request for the root resource of the app:
Request
GET /ocr
Accept: application/hal+json
Response
{
  "_links": {
    "converter": {
      "href": "/ocr/convert/features"
    }
  }
}
Follow the specified converter link relation to determine the available interfaces:
GET /ocr/convert/features
Accept: application/hal+json
Response
{
  "async": [
    {
      "_links": {
        "convert": {
          "href": "/ocr/recognition/"
        }
      },
      "sourceMimeType": "application/pdf",
      "destinationMimeType": "text/plain"
    },
    {
      "_links": {
        "convert": {
          "href": "/ocr/recognition/"
        }
      },
      "sourceMimeType": "application/pdf",
      "destinationMimeType": "text/html;format=hocr"
    },
    {
      "_links": {
        "convert": {
          "href": "/ocr/recognition/"
        }
      },
      "sourceMimeType": "image/png",
      "destinationMimeType": "text/plain"
    },
    .....
  ]
}
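Once the features response has been parsed into a dictionary, the matching interface can be looked up by its source and target format. This is a minimal sketch; the function name find_convert_href is hypothetical, and the field names are taken from the sample response above.

```python
from typing import Optional


def find_convert_href(features: dict, source: str, destination: str) -> Optional[str]:
    """Return the convert link for the given format pair, or None if not offered."""
    for entry in features.get("async", []):
        if (entry.get("sourceMimeType") == source
                and entry.get("destinationMimeType") == destination):
            return entry["_links"]["convert"]["href"]
    return None


# Usage with a shortened copy of the sample response:
features = {
    "async": [
        {
            "_links": {"convert": {"href": "/ocr/recognition/"}},
            "sourceMimeType": "application/pdf",
            "destinationMimeType": "text/plain",
        },
    ]
}
href = find_convert_href(features, "application/pdf", "text/plain")
```

Following the link from the response, rather than hard-coding the path, keeps the client independent of the concrete route.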
Performing OCR
Once you have determined the required interface as described in the section Determining the available interfaces and formats, you can begin OCR.
To do so, first create a job in the OCR app by sending an HTTP POST request to the determined interface. Specify the desired target format in the Accept header.
Request
POST /ocr/recognition
Accept: text/plain
{
  "sourceMimeType": "application/pdf",
  "callbackUri": "/my-app/callback",
  "languageCodes": [
    "deu",
    "eng"
  ],
  "appKey": "my-custom-app-key"
}
Notes on the properties
Property | Type | Description | Required |
---|---|---|---|
sourceMimeType | string | MIME type of the file to be processed. You can find the supported formats in the section Supported formats. | Yes |
callbackUri | string | Specifies the URI that is called when OCR is completed. | No |
languageCodes | array of strings | Specifies the languages to be used for recognition by the OCR module. At present, German (deu) and English (eng) are supported. If you omit this parameter, both languages are used. | No |
appKey | string | Key required to use the interface. If you do not have an app key yet, contact d.velop support. | Yes |
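The request body can be assembled from these properties as follows. This is a sketch under the assumption that optional properties may simply be omitted, as the table suggests; the function name build_job_request is hypothetical.

```python
import json
from typing import Optional, Sequence


def build_job_request(
    source_mime_type: str,
    app_key: str,
    callback_uri: Optional[str] = None,
    language_codes: Optional[Sequence[str]] = None,
) -> str:
    """Serialize the job creation body; optional properties are left out if unset."""
    body = {
        "sourceMimeType": source_mime_type,  # required
        "appKey": app_key,                   # required
    }
    if callback_uri is not None:
        body["callbackUri"] = callback_uri
    if language_codes:
        body["languageCodes"] = list(language_codes)
    return json.dumps(body)


payload = build_job_request(
    "application/pdf",
    "my-custom-app-key",
    callback_uri="/my-app/callback",
    language_codes=["deu", "eng"],
)
```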
After you execute the request shown above, you receive the following response.
Response
{
  "_links": {
    "self": {
      "href": "/ocr/recognition/my-job-id/"
    },
    "upload": {
      "href": "https://presigned-upload-url"
    }
  },
  "jobId": "my-job-id",
  "sourceMimeType": "application/pdf",
  "status": 0
}
Notes on the properties
Property | Description |
---|---|
_links.upload | Upload URL for uploading the file for which OCR is to be performed. |
jobId | Unique identifier for the created job. |
status | Status of the job. A breakdown of the possible statuses is shown in the section Description of the available job statuses and page statuses. |
After you successfully create a job, upload your source file to the OCR app by sending an HTTP PUT request to the URL provided under the upload link relation:
Request
PUT https://presigned-upload-url
Content-Type: application/pdf
<Binary-Content>
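The upload step could be sketched with Python's standard library as shown below. The function name build_upload_request is hypothetical. The sketch only constructs the request; sending it is left to the caller, and the placeholder URL from the example above is not a real endpoint.

```python
import urllib.request


def build_upload_request(upload_url: str, content: bytes, mime_type: str) -> urllib.request.Request:
    """Build the PUT request that uploads the source file to the presigned URL."""
    return urllib.request.Request(
        upload_url,
        data=content,
        method="PUT",
        headers={"Content-Type": mime_type},
    )


request = build_upload_request(
    "https://presigned-upload-url",  # value of _links.upload.href
    b"%PDF-1.7 ...",                 # placeholder bytes; use the real file content
    "application/pdf",
)
# To actually upload the file:
# with urllib.request.urlopen(request) as response:
#     response.read()
```

Because the presigned URL is only valid for a short time, build and send the request right after the job has been created.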
After the file is successfully uploaded, OCR begins. Your app is notified at the specified callbackUri when recognition is complete. If you did not specify a callbackUri while creating the job, you can retrieve the job with the URL provided in the self link relation. In both cases, you receive the following JSON:
Response
{
  "_links": {
    "self": {
      "href": "/ocr/recognition/my-job-id"
    }
  },
  "jobId": "my-job-id",
  "sourceMimeType": "application/pdf",
  "status": 2,
  "message": "",
  "pageResults": [
    {
      "_links": {
        "outputfile": {
          "href": "https://presigned-outputfile-url"
        }
      },
      "index": 0,
      "status": 2,
      "message": ""
    }
  ]
}
Notes on the properties
Property | Description |
---|---|
jobId | Unique identifier for the job. |
status | Status of the job. All the possible statuses are described in the section Description of the available job statuses and page statuses. |
pageResults | List of the results for each page. |
pageResults._links.outputfile | URL for downloading the respective page results. |
pageResults.index | The page index. |
pageResults.status | The page processing status. All the possible statuses are described in the section Description of the available job statuses and page statuses. |
pageResults.message | Error message if an error occurs during recognition. |
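A parsed job result can be separated into download URLs and error messages per page. The following is a minimal sketch: the function name split_page_results is hypothetical, and it assumes that only pages with status 2 (Finished) carry an outputfile link.

```python
from typing import Dict, Tuple

PAGE_FINISHED = 2  # page status "Finished"


def split_page_results(job: dict) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Return (output URLs by page index, error messages by page index)."""
    outputs: Dict[int, str] = {}
    errors: Dict[int, str] = {}
    for page in job.get("pageResults", []):
        index = page["index"]
        if page["status"] == PAGE_FINISHED:
            outputs[index] = page["_links"]["outputfile"]["href"]
        else:
            errors[index] = page.get("message", "")
    return outputs, errors


# Usage with a result in the shape of the sample response, plus one failed page:
job = {
    "jobId": "my-job-id",
    "status": 4,
    "pageResults": [
        {
            "_links": {"outputfile": {"href": "https://presigned-outputfile-url"}},
            "index": 0,
            "status": 2,
            "message": "",
        },
        {"index": 1, "status": 3, "message": "example error text"},
    ],
}
outputs, errors = split_page_results(job)
```

Download the output files promptly, because the presigned URLs expire after a few minutes.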
Description of the available job statuses and page statuses
Job status
Status | Name | Description |
---|---|---|
0 | WaitingForFile | The job was created successfully. The system is waiting for the source file to be uploaded so that recognition can begin. |
1 | InProgress | Processing is in progress. |
2 | Finished | Processing was performed successfully. |
3 | Error | An error occurred during processing. |
4 | PartialSuccess | Only part of the pages provided in the source file could be processed successfully. |
Page status
Status | Name | Description |
---|---|---|
2 | Finished | Page recognition was performed successfully. |
3 | Error | An error occurred during page recognition. |
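In client code, the numeric statuses are easier to handle as named constants. The sketch below maps both tables to enumerations; the class and function names are hypothetical. Treating Finished, Error and PartialSuccess as final states is an assumption derived from their descriptions, not something the tables state explicitly.

```python
from enum import IntEnum


class JobStatus(IntEnum):
    WAITING_FOR_FILE = 0
    IN_PROGRESS = 1
    FINISHED = 2
    ERROR = 3
    PARTIAL_SUCCESS = 4


class PageStatus(IntEnum):
    FINISHED = 2
    ERROR = 3


# Assumption: a job in one of these states will not change any more.
FINAL_JOB_STATUSES = frozenset({
    JobStatus.FINISHED,
    JobStatus.ERROR,
    JobStatus.PARTIAL_SUCCESS,
})


def is_final(status: int) -> bool:
    """Return True if the numeric job status denotes a completed job."""
    return JobStatus(status) in FINAL_JOB_STATUSES
```

A client that polls the self link instead of using a callback can use is_final to decide when to stop polling.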