Retrieve Text From an Image Using OCR


Your application can retrieve text from an existing image file using the selected OCR engine.

The DTWAIN_ExecuteOCR function lets the application specify an image file to process. After the file has been processed successfully, the application calls one of the following functions to retrieve the text and related information:

DTWAIN_GetOCRText

DTWAIN_GetOCRTextInfoHandle

DTWAIN_GetOCRTextInfoLong

DTWAIN_GetOCRTextInfoLongEx

DTWAIN_GetOCRTextInfoFloat

DTWAIN_GetOCRTextInfoFloatEx

Here are the typical steps required to process an image using the OCR engine:

1. Before calling DTWAIN_ExecuteOCR, set the file type of the image to be processed. This is done by calling DTWAIN_SetOCRCapValues with the DTWAIN_OCRCV_IMAGEFILETYPE capability.

2. Call DTWAIN_ExecuteOCR. If the call succeeds, retrieve the text by calling DTWAIN_GetOCRText.
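Once the text has been copied into a caller-supplied buffer, it is often convenient to treat it as an ordinary C string. The following is a minimal sketch of one way to do that. TerminateOcrText is a hypothetical helper and not part of DTWAIN. It assumes the engine reports the number of characters it wrote and that the buffer is not guaranteed to be NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of DTWAIN.
   Clamps the size reported by the OCR engine to the buffer capacity,
   writes a terminating NUL, and returns the resulting string length. */
size_t TerminateOcrText(char *buffer, size_t capacity, long reportedSize)
{
    size_t len;

    assert(buffer != NULL && capacity > 0);

    /* Treat a negative (error) size as "no text" */
    if (reportedSize < 0)
        reportedSize = 0;

    /* Leave room for the terminating NUL */
    len = (size_t)reportedSize;
    if (len > capacity - 1)
        len = capacity - 1;

    buffer[len] = '\0';
    return len;
}
```

With a 1001-byte buffer and a request for up to 1000 bytes, the clamp never truncates valid text, and the buffer can then be printed with a single "%s" instead of a character-by-character loop.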

Here is a small sample program that selects an OCR engine and starts OCR processing on a single-page BMP file. After the image file has been processed, the program prints the text that the OCR engine generated, followed by the position of each character.

The sample is written in C, but it can be translated easily to other languages.

#include <stdio.h>
#include "dtwain.h"

void GetOCRText( )
{
    DTWAIN_OCRENGINE SelectedEngine;

    /* Initialize DTWAIN Library */
    DTWAIN_SysInitialize( );

    /* Initialize the OCR interface */
    DTWAIN_InitOCRInterface( );

    /* Select the default OCR engine */
    SelectedEngine = DTWAIN_SelectDefaultOCREngine( );

    if ( SelectedEngine != 0 )
    {
        /* assume we know the file format of the file is BMP */
        DTWAIN_ARRAY arr = DTWAIN_ArrayCreate( DTWAIN_ARRAYLONG, 1 );

        if ( arr )
        {
            /* Set the OCR engine to process a BMP file */
            DTWAIN_ArraySetAtLong( arr, 0, DTWAIN_BMP );
            DTWAIN_SetOCRCapValues( SelectedEngine, DTWAIN_OCRCV_IMAGEFILETYPE, DTWAIN_CAPSET, arr );
            DTWAIN_ArrayDestroy( arr );
        }

        /* Now start the OCR processing by getting a single page */
        if ( DTWAIN_ExecuteOCR( SelectedEngine, "myfile.bmp", 0, 0 ) )
        {
            char OCRText[1001];
            LONG OCRTextXPosInfo[1000];   /* positions are LONG values, one per character */
            LONG OCRTextYPosInfo[1000];
            LONG OCRTextHandle;
            LONG ActualSize = 0;          /* number of characters actually returned */
            int curChar;

            /* Retrieve up to 1,000 bytes of text. The last argument is assumed
               to receive the number of characters actually returned. */
            DTWAIN_GetOCRText( SelectedEngine, 0, OCRText, 1000, &ActualSize );

            /* Output the text data generated by the OCR engine */
            for ( curChar = 0; curChar < ActualSize; ++curChar )
                printf( "%c", OCRText[curChar] );

            /* Get a handle to the other data items we desire to know about */
            OCRTextHandle = DTWAIN_GetOCRTextInfoHandle( SelectedEngine, 0 );

            /* Get the (x,y) location of each character */
            DTWAIN_GetOCRTextInfoLongEx( OCRTextHandle, DTWAIN_OCRINFO_CHARXPOS, OCRTextXPosInfo, 1000 );
            DTWAIN_GetOCRTextInfoLongEx( OCRTextHandle, DTWAIN_OCRINFO_CHARYPOS, OCRTextYPosInfo, 1000 );

            /* Output where each character is located in the image that was OCR-ed */
            for ( curChar = 0; curChar < ActualSize; ++curChar )
                printf( "Character %d has position (%ld, %ld)\n", curChar,
                        (long)OCRTextXPosInfo[curChar], (long)OCRTextYPosInfo[curChar] );
        }
    }

    /* Destroy the TWAIN interface */
    DTWAIN_SysDestroy( );
}
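The per-character positions retrieved in the sample can be combined into higher-level layout information. As a minimal sketch, the helper below computes the smallest rectangle that contains every reported character position. TextBounds and ComputeTextBounds are hypothetical names and are not part of DTWAIN. The sketch assumes that the x and y arrays hold one coordinate per character, in the same order and in the same units, and it uses plain long values rather than the Windows LONG type so that it compiles without the DTWAIN headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of DTWAIN */
typedef struct {
    long left;
    long top;
    long right;
    long bottom;
} TextBounds;

/* Hypothetical helper, not part of DTWAIN.
   Computes the smallest rectangle containing all character positions.
   Returns 1 on success, or 0 if there are no characters to examine. */
int ComputeTextBounds(const long *xPos, const long *yPos, size_t count, TextBounds *out)
{
    size_t i;

    assert(xPos != NULL && yPos != NULL && out != NULL);

    if (count == 0)
        return 0;

    /* Start with the first character, then widen as needed */
    out->left = out->right = xPos[0];
    out->top = out->bottom = yPos[0];

    for (i = 1; i < count; ++i) {
        if (xPos[i] < out->left)   out->left = xPos[i];
        if (xPos[i] > out->right)  out->right = xPos[i];
        if (yPos[i] < out->top)    out->top = yPos[i];
        if (yPos[i] > out->bottom) out->bottom = yPos[i];
    }
    return 1;
}
```

In the sample, this helper could be called with the two position arrays and the number of characters returned, for example to crop or highlight the region of the image that contains recognized text.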