Document OCR

Alice Onboarding is a complete solution designed for onboarding, but it is flexible enough to support other use cases through the same API and SDKs without any additional complexity.

This guide shows how to build a document OCR use case with Alice Onboarding.

Document OCR flow

The document OCR process can be divided into 3 steps:

  1. User creation: your backend registers a new user.

  2. Document adding: the user uploads documents to the Alice Onboarding platform.

  3. Report processing: your backend gets the document reports and checks the results.

Alice Document OCR flow

1. User creation

Create a user from your backend in the Alice Onboarding API.

curl --request POST \
--url https://apis.alicebiometrics.com/onboarding/user \
--header 'Authorization: Bearer <BACKEND_TOKEN>' \
--header 'Content-Type: multipart/form-data' \
--form email=example@example.com
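# Using the Python SDK
from alice import Config, Onboarding

api_key = "<ADD-YOUR-API-KEY-HERE>"  # placeholder: replace with your own API key

config = Config(api_key=api_key)
onboarding = Onboarding.from_config(config)
# Register a new user and keep the returned identifier for later requests
user_id = onboarding.create_user().unwrap_or_throw()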

2. Document adding

On your frontend, create a document associated with that user and add a photo of each document side (add_front and add_back). You can do this manually by calling the API, or automatically by using the capture functionality of the SDKs.

curl --request POST \
--url https://apis.alicebiometrics.com/onboarding/user/document \
--header 'Authorization: Bearer <USER_TOKEN>' \
--header 'Content-Type: multipart/form-data' \
--form type=<DOCUMENT_TYPE> \
--form issuing_country=<ISSUING_COUNTRY>

curl --request PUT \
--url https://apis.alicebiometrics.com/onboarding/user/document \
--header 'Authorization: Bearer <USER_TOKEN>' \
--header 'Content-Type: multipart/form-data' \
--form document_id=<DOCUMENT_ID> \
--form side=<DOCUMENT_SIDE> \
--form image=@/path/to/doc/image.jpeg \
--form manual=true \
--form source=file
let userToken = "<ADD-YOUR-USER-TOKEN-HERE>"

let config = OnboardingConfig.builder()
    .withUserToken(userToken)
    .withAddDocumentStage(ofType: .idcard, issuingCountry: "ESP")

let onboarding = Onboarding(self, config: config)
onboarding.run { result in
    switch result {
    case let .success(userStatus):
        print("userStatus: \(String(describing: userStatus))")
    case let .failure(error):
        print("failure: \(error.localizedDescription)")
    case .cancel:
        print("User has cancelled the onboarding")
    }
}
val userToken = "<ADD-YOUR-USER-TOKEN-HERE>"

val config = OnboardingConfig.builder()
    .withUserToken(userToken)
    .withAddDocumentStage(type = DocumentType.IDCARD, issuingCountry = "ESP")

val onboarding = Onboarding(this, config)
onboarding.run(ONBOARDING_REQUEST_CODE)

...

override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)
    if (requestCode == ONBOARDING_REQUEST_CODE) {
        if (resultCode == Activity.RESULT_OK) {
            // Onboarding finished: read the user status returned by the SDK
            val userInfo = data!!.getStringExtra("userStatus")
        } else if (resultCode == Activity.RESULT_CANCELED) {
            // The user cancelled the onboarding
        }
    }
}
String userToken = "<ADD-YOUR-USER-TOKEN-HERE>"

OnboardingConfig config = OnboardingConfig.CREATOR.builder().withUserToken(userToken);
config.withAddDocumentStage(DocumentType.IDCARD, "ESP");

Onboarding onboarding = new Onboarding(
        this,
        config
);

onboarding.run(ONBOARDING_REQUEST_CODE);

...

@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    if (requestCode == ONBOARDING_REQUEST_CODE) {
        switch (resultCode) {
            case Activity.RESULT_OK:
                Log.d("ONBOARDING_RESULT", data.getStringExtra("userStatus"));
                break;
            case Activity.RESULT_CANCELED:
                Log.d("ONBOARDING_RESULT", "Onboarding canceled");
                break;
            case ONBOARDING_ERROR:
                Log.d("ONBOARDING_RESULT", data.getParcelableExtra("onboardingError").toString());
                break;
        }
    }
}
const ONBOARDING_CONFIG = {
    "stages": [
        {"stage": "addDocument", "type": "idcard", "issuingCountry": "ESP"},
    ]
}

<Onboarding
    userToken={userToken}
    config={ONBOARDING_CONFIG}
    onSuccess={(userStatusJson) => console.log("onSuccess:" + userStatusJson) }
    onFailure={(failureJson) => console.log("onFailure:" + failureJson) }
    onCancel={(value) => console.log("onCancel:" + value) }
/>
let userToken = "<ADD-YOUR-USER-TOKEN-HERE>"
let config = new aliceonboarding.OnboardingConfig()
    .withUserToken(userToken)
    .withAddDocumentStage(aliceonboarding.DocumentType.IDCARD)

function onSuccess(userInfo) {console.log("onSuccess: " + userInfo)}
function onFailure(error) {console.log("onFailure: " + error)}
function onCancel() { console.log("onCancel")}

new aliceonboarding.Onboarding("alice-onboarding-mount", config).run(onSuccess, onFailure, onCancel);

3. Report processing

From your backend, you can retrieve the fields read from the document with a Get Report request to the Alice Onboarding API.

curl --request GET \
--url https://apis.alicebiometrics.com/onboarding/user/report \
--header 'Authorization: Bearer <BACKEND_TOKEN_WITH_USER_ID>' \
--header 'Content-Type: multipart/form-data'
report = onboarding.create_report(
    user_id=user_id, verbose=True
).unwrap_or_throw()

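If the document’s results do not meet your criteria (see How to accept a document), you should invalidate (void) the user’s document and ask the user to capture it again.

The report is available in two versions (0 and 1). You can request a specific version with the Alice-Report-Version header or with the report_version argument of the Python SDK.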

curl --request GET \
--url https://apis.alicebiometrics.com/onboarding/user/report \
--header 'Authorization: Bearer <BACKEND_TOKEN_WITH_USER_ID>' \
--header 'Content-Type: multipart/form-data' \
--header 'Alice-Report-Version: <0,1>'
report = onboarding.create_report(
    user_id=user_id,
    report_version=<ReportVersion.V0, ReportVersion.V1>
).unwrap_or_throw()

How to accept a document

In this section we show you how to analyze the reading results of a document and define your acceptance criteria.

0. Get the document report

The document report collects all the information needed to accept or reject a document.

{
   "created_at": "2021-03-02T09:32:57.344383",
   "document_reports": {},
   "selfie_reports": {},
   "user_summary": {
      "authorized": false,
      "created_at": "2021-03-02T09:13:08",
      "documents": {
         "status": {},
         "summary": {
            "67101f93-5525-48b9-aced-67bf557a6382": {
               "completed": true,
               "created_at": "2021-03-02T09:13:15",
               "face_validation": {},
               "fields": {},
               "issuing_country": "ESP",
               "status": {},
               "type": "idcard",
               "voided": false
            }
         },
         "uploaded_documents": ["67101f93-5525-48b9-aced-67bf557a6382"]
      },
      "fields": {},
      "selfie": {},
      "user_id": "deb78277-af90-47b6-b3b7-461129a819bf"
   },
   "version": 0
}
{
 "created_at": "2021-07-20T15:16:33.845112",
 "documents": [
      {
       "checks": [],
       "created_at": "2021-03-02T09:13:15",
       "id": "67101f93-5525-48b9-aced-67bf557a6382",
       "meta": {},
       "sides": {},
       "summary_fields": []
      }
 ],
 "events": [],
 "id": "8d2ce06e-a80e-47a3-8095-ab7a677675a8",
 "selfies": [],
 "summary": {},
 "user_id": "deb78277-af90-47b6-b3b7-461129a819bf",
 "version": 1
}
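
Both report versions identify documents by ID, but in different places. The sketch below assumes the report has already been parsed into a Python dict shaped like the examples above; the helper name uploaded_document_ids is hypothetical and not part of the Alice SDK.

```python
from typing import Any, Dict, List


def uploaded_document_ids(report: Dict[str, Any]) -> List[str]:
    """Return the document IDs found in a V0 or V1 report (hypothetical helper)."""
    if report.get("version") == 1:
        # V1: "documents" is a list of document reports, each with an "id".
        return [doc["id"] for doc in report.get("documents", [])]
    # V0: IDs are listed under user_summary.documents.uploaded_documents.
    documents = report.get("user_summary", {}).get("documents", {})
    return list(documents.get("uploaded_documents", []))


# Trimmed-down versions of the two example reports above.
report_v0 = {
    "version": 0,
    "user_summary": {
        "documents": {
            "uploaded_documents": ["67101f93-5525-48b9-aced-67bf557a6382"]
        }
    },
}
report_v1 = {
    "version": 1,
    "documents": [{"id": "67101f93-5525-48b9-aced-67bf557a6382"}],
}

print(uploaded_document_ids(report_v0))
print(uploaded_document_ids(report_v1))
```

Branching on the version field keeps the rest of your acceptance logic independent of which report version you request.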

1. Check if the document is complete

The first step is to check whether the document is complete. This means that every side was successfully uploaded.

If it is incomplete, you should invalidate (void) the document and ask the user to capture it again.

 "documents": {
    "status": {},
    "summary": {
       "67101f93-5525-48b9-aced-67bf557a6382": {
          "completed": true,
          "created_at": "2021-03-02T09:13:15",
          "face_validation": {},
          "fields": {},
          "issuing_country": "ESP",
          "status": {},
          "type": "idcard",
          "voided": false
       }
    },
    "uploaded_documents": ["67101f93-5525-48b9-aced-67bf557a6382"]
}
{
   "created_at": "2021-07-20T15:16:33.845112",
   "documents": [
        {
         "checks": [],
         "created_at": "2021-03-02T09:13:15",
         "id": "67101f93-5525-48b9-aced-67bf557a6382",
         "meta": {
              "completed": true,
              "issuing_country": "AUTO",
              "type": "passport",
              "voided": false
         },
         "sides": {},
         "summary_fields": []
        }
   ],
   "events": [],
   "id": "8d2ce06e-a80e-47a3-8095-ab7a677675a8",
   "selfies": [],
   "summary": {},
   "user_id": "deb78277-af90-47b6-b3b7-461129a819bf",
   "version": 1
 }
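
As a minimal sketch of this completeness check, assuming the report is available as a Python dict shaped like the examples above, the completed flag can be read as follows. The function name is_document_complete is an illustrative assumption, not an SDK call.

```python
from typing import Any, Dict


def is_document_complete(report: Dict[str, Any], document_id: str) -> bool:
    """Return True if the given document is marked as completed (hypothetical helper)."""
    if report.get("version") == 1:
        # V1: the flag lives in the "meta" object of each document report.
        for doc in report.get("documents", []):
            if doc.get("id") == document_id:
                return bool(doc.get("meta", {}).get("completed", False))
        return False  # unknown document ID
    # V0: the flag lives in the per-document summary, keyed by document ID.
    summary = report.get("user_summary", {}).get("documents", {}).get("summary", {})
    return bool(summary.get(document_id, {}).get("completed", False))


doc_id = "67101f93-5525-48b9-aced-67bf557a6382"
report_v0 = {
    "version": 0,
    "user_summary": {"documents": {"summary": {doc_id: {"completed": True}}}},
}
report_v1 = {
    "version": 1,
    "documents": [{"id": doc_id, "meta": {"completed": False}}],
}

print(is_document_complete(report_v0, doc_id))
print(is_document_complete(report_v1, doc_id))
```

A document that is missing from the report is treated as incomplete here, which is a conservative choice for this sketch.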

2. Define and check your document alerts

Our API implements alerts and checks that you can use to accept or reject a document. For instance: the document has expired, or the document data is inconsistent between front and back.

If the document raises an alert you consider blocking, or fails a check you require, you should invalidate (void) the document and ask the user to capture it again.

2.1 Report V0 (alerts)

Report V0 includes Alert elements. They are collected in the status object of each Document summary. See the Doc-level alerts section for the available alerts.

Report V0 | Expired document alert example
"user_summary": {
   "authorized": false,
   "created_at": "2021-03-02T09:13:08",
   "documents": {
      "status": {},
      "summary": {
         "67101f93-5525-48b9-aced-67bf557a6382": {
            "completed": true,
            "created_at": "2021-03-02T09:13:15",
            "face_validation": {},
            "fields": {},
            "issuing_country": "ESP",
            "status": {
               "alert": [
                  {
                     "code": 4302,
                     "message": "Expired document"
                  }
               ],
               "info": []
            },
            "type": "idcard",
            "voided": false
         }
      },
      "uploaded_documents": ["67101f93-5525-48b9-aced-67bf557a6382"]
   }
}
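
One possible way to test a V0 document summary against your own list of blocking alerts is sketched below. It assumes the summary is a Python dict shaped like the example above. The code 4302 comes from that example; the helper names and the idea of a caller-supplied set of blocking codes are assumptions.

```python
from typing import Any, Dict, Iterable, Set

EXPIRED_DOCUMENT = 4302  # alert code taken from the example above


def document_alert_codes(document_summary: Dict[str, Any]) -> Set[int]:
    """Collect the alert codes raised on a V0 document summary."""
    status = document_summary.get("status", {})
    return {alert["code"] for alert in status.get("alert", [])}


def has_blocking_alert(document_summary: Dict[str, Any],
                       blocking_codes: Iterable[int]) -> bool:
    """Return True if any alert you consider blocking is present."""
    return bool(document_alert_codes(document_summary) & set(blocking_codes))


expired_summary = {
    "completed": True,
    "status": {
        "alert": [{"code": 4302, "message": "Expired document"}],
        "info": [],
    },
}

print(has_blocking_alert(expired_summary, [EXPIRED_DOCUMENT]))
```

Which codes count as blocking is your decision; the sketch only shows where to look for them.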

2.2 Report V1 (checks)

Report V1 includes Check elements. They are collected in the checks array of the Document Report. See the Document-level checks section for the available checks.

Each check has a value between 0 and 100. If its value is greater than or equal to 50, the condition is met.

The checklist may be different depending on the document type and/or issuing country. Therefore, the absence of a certain check should never be a reason for rejection.

Report V1 | Unexpired document check example
{
 "created_at": "2021-07-20T15:16:33.845112",
 "documents": [
      {
       "checks": [
            {},
            {},
            {
             "detail": "The document has not expired",
             "key": "unexpired_document",
             "value": 100
            },
            {},
            {},
            {}
       ],
       "created_at": "2021-03-02T09:13:15",
       "id": "67101f93-5525-48b9-aced-67bf557a6382",
       "meta": {},
       "sides": {},
       "summary_fields": []
      }
 ],
 "events": [],
 "id": "8d2ce06e-a80e-47a3-8095-ab7a677675a8",
 "selfies": [],
 "summary": {},
 "user_id": "deb78277-af90-47b6-b3b7-461129a819bf",
 "version": 1
}
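
The rule above (a check is met at 50 or more, and a missing check is not a reason for rejection) could be applied as in this sketch. It assumes the Document Report is a Python dict shaped like the example; failed_checks and required_keys are hypothetical names.

```python
from typing import Any, Dict, Iterable, List

CHECK_THRESHOLD = 50  # a check is met when its value is >= 50


def failed_checks(document_report: Dict[str, Any],
                  required_keys: Iterable[str]) -> List[str]:
    """Return the required checks that are present but not met.

    Checks that are absent from the report are ignored, because the
    checklist varies with document type and issuing country.
    """
    values = {
        check["key"]: check["value"]
        for check in document_report.get("checks", [])
        if "key" in check and "value" in check  # skip empty or elided entries
    }
    return [
        key for key in required_keys
        if key in values and values[key] < CHECK_THRESHOLD
    ]


document_report = {
    "checks": [
        {},
        {
            "detail": "The document has not expired",
            "key": "unexpired_document",
            "value": 100,
        },
    ]
}

print(failed_checks(document_report, ["unexpired_document"]))
```

An empty result means that none of the checks you require has failed, so the document can move on to the field checks.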

3. Define and check your key fields

First, decide which common document fields are relevant to your use case. See the Fields by document type section.

Most customers choose at least these 5 fields:

  • first_name

  • last_name

  • birth_date

  • expiration_date

  • id_number/license_number/passport_number
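
As an illustration, the five fields above can be expressed as a small presence check. The field names come from the list; the helper missing_key_fields and the rule that any one of the three identifier fields satisfies the last item are assumptions for this sketch.

```python
from typing import Iterable, List

COMMON_FIELDS = ["first_name", "last_name", "birth_date", "expiration_date"]
# Any one of these identifier fields is enough, depending on the document type.
ID_FIELDS = {"id_number", "license_number", "passport_number"}


def missing_key_fields(read_field_names: Iterable[str]) -> List[str]:
    """Return the key fields that were not read from the document."""
    names = set(read_field_names)
    missing = [field for field in COMMON_FIELDS if field not in names]
    if not names & ID_FIELDS:
        missing.append("id_number/license_number/passport_number")
    return missing


print(missing_key_fields(["first_name", "last_name", "birth_date", "id_number"]))
```

The input is simply the list of field names found in the report, whichever report version you use.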

3.1. Report V0

You will find all the fields read by Alice OCR technology in the fields section of the Document summary.

A Field contains a value and a Field status, which holds the reading confidence score, the Field Info and the Field Alert.

A field is considered well read if either of these two conditions is met:

  • The field is checked (2301 code in field’s info)

  • The field has a score above the recommended threshold (0.7) and is not unchecked (4001 code not in field’s alerts)

If a key field meets neither condition, you should invalidate (void) the document and ask the user to capture it again.

Report V0 | Read field example
"user_summary": {
   "authorized": false,
   "created_at": "2021-03-02T09:13:08",
   "documents": {
      "status": {},
      "summary": {
         "67101f93-5525-48b9-aced-67bf557a6382": {
            "completed": true,
            "created_at": "2021-03-02T09:13:15",
            "face_validation": {},
            "fields": {
               "id_number": {
                  "status": {
                     "alert": [],
                     "info": [
                        {
                           "code": 2301,
                           "message": "Field is checked"
                        }
                     ],
                     "score": 0.91
                  },
                  "value": "99999999R"
               }
            },
            "issuing_country": "ESP",
            "status": {},
            "type": "idcard",
            "voided": false
         }
      },
      "uploaded_documents": ["67101f93-5525-48b9-aced-67bf557a6382"]
   }
}
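
The two V0 conditions can be sketched as a single predicate. This assumes a Field is a Python dict shaped like the example above. The codes 2301 and 4001 and the 0.7 threshold come from the text; the function name is hypothetical.

```python
from typing import Any, Dict

FIELD_CHECKED = 2301     # info code: the field is checked
FIELD_UNCHECKED = 4001   # alert code: the field is unchecked
SCORE_THRESHOLD = 0.7    # recommended threshold


def is_field_well_read_v0(field: Dict[str, Any]) -> bool:
    """Apply the two V0 acceptance conditions to a single field."""
    status = field.get("status", {})
    info_codes = {item["code"] for item in status.get("info", [])}
    alert_codes = {item["code"] for item in status.get("alert", [])}
    # Condition 1: the field is checked.
    if FIELD_CHECKED in info_codes:
        return True
    # Condition 2: score above the threshold and no "unchecked" alert.
    score = status.get("score", 0.0)
    return score > SCORE_THRESHOLD and FIELD_UNCHECKED not in alert_codes


id_number = {
    "status": {
        "alert": [],
        "info": [{"code": 2301, "message": "Field is checked"}],
        "score": 0.91,
    },
    "value": "99999999R",
}

print(is_field_well_read_v0(id_number))
```

Run this predicate on each of your key fields and void the document if any of them fails.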

3.2. Report V1

You will find all the fields read by Alice OCR technology in the summary_fields section of the Document Report. A Document Field contains a name, a value, a reading score and an array of checks.

A field is considered well read if either of these two conditions is met:

  • The checked_field check IS in the array of checks and its value is greater than or equal to 50.

  • The checked_field check IS NOT in the array of checks but the field’s score is greater than or equal to 70.

If a key field meets neither condition, you should invalidate (void) the document and ask the user to capture it again.

Report V1 | Read field example
{
 "created_at": "2021-07-20T15:16:33.845112",
 "documents": [
    {
     "checks": [],
     "created_at": "2021-03-02T09:13:15",
     "id": "67101f93-5525-48b9-aced-67bf557a6382",
     "meta": {},
     "sides": {},
     "summary_fields": [
        {
         "checks": [
            {
               "detail": "The field passes its checksum",
               "key": "checked_field",
               "value": 100
            }
         ],
         "name": "id_number",
         "score": 91,
         "value": "99999999R"
        }
     ]
    }
 ],
 "events": [],
 "id": "8d2ce06e-a80e-47a3-8095-ab7a677675a8",
 "selfies": [],
 "summary": {},
 "user_id": "deb78277-af90-47b6-b3b7-461129a819bf",
 "version": 1
}
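
The V1 rule can be sketched in the same way, assuming a Document Field is a Python dict shaped like the entry in summary_fields above. The thresholds 50 and 70 come from the text; the function name is hypothetical.

```python
from typing import Any, Dict

CHECK_THRESHOLD = 50   # minimum value for the checked_field check
SCORE_THRESHOLD = 70   # minimum score when checked_field is absent


def is_field_well_read_v1(field: Dict[str, Any]) -> bool:
    """Apply the two V1 acceptance conditions to a single field."""
    for check in field.get("checks", []):
        if check.get("key") == "checked_field":
            # The check is present, so its value alone decides the outcome.
            return check.get("value", 0) >= CHECK_THRESHOLD
    # The check is absent, so fall back to the reading score.
    return field.get("score", 0) >= SCORE_THRESHOLD


id_number = {
    "checks": [
        {
            "detail": "The field passes its checksum",
            "key": "checked_field",
            "value": 100,
        }
    ],
    "name": "id_number",
    "score": 91,
    "value": "99999999R",
}

print(is_field_well_read_v1(id_number))
```

Note that when the checked_field check is present but below 50, the field is rejected regardless of its score, which follows the wording of the two conditions.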