Using OCR and TWAIN
The DTWAIN_AcquireFile and DTWAIN_AcquireFileEx functions allow an application to acquire images from a TWAIN device and store those images as files. One of the available file types is text (single-page and multi-page).
The OCR engine is responsible for generating the text files. Before calling DTWAIN_AcquireFile or DTWAIN_AcquireFileEx, you must select the OCR engine and set the file type that DTWAIN will use as the intermediate file when invoking the OCR engine.
Example:

void GetOCRText( )
{
    DTWAIN_OCRENGINE SelectedEngine;
    DTWAIN_SOURCE SelectedSource;
    const char *filename = "MyText.txt";

    /* Initialize DTWAIN Library */

    /* Initialize the OCR interface */

    /* Select the default OCR engine */
    SelectedEngine = DTWAIN_SelectDefaultOCREngine( );

    if ( SelectedEngine != 0 )
    {
        /* Assume we know that the OCR engine can process BMP files, so make
           sure that DTWAIN saves the image to a temporary BMP file for
           processing */
        DTWAIN_ARRAY arr = DTWAIN_ArrayCreate( DTWAIN_ARRAYLONG, 1 );
        if ( arr )
        {
            /* Set the OCR engine to process a BMP file */
            DTWAIN_ArraySetAtLong( arr, 0, DTWAIN_BMP );
            DTWAIN_SetOCRCapValues( SelectedEngine, DTWAIN_OCRCV_IMAGEFILETYPE,
                                    DTWAIN_CAPSET, arr );
            DTWAIN_ArrayDestroy( arr );
        }

        /* Now select the default TWAIN Source */
        SelectedSource = DTWAIN_SelectSource( );

        /* Start the acquisition process only if a source was selected */
        if ( SelectedSource != 0 )
        {
            DTWAIN_AcquireFile( SelectedSource, filename, DTWAIN_TEXT,
                                DTWAIN_USENAME, DTWAIN_PT_BW, 1, TRUE, TRUE,
                                NULL );
        }
    }

    /* Destroy the TWAIN interface */
}
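
The example above assumes that the OCR engine accepts BMP files. When that is not known in advance, an application could compare the formats the engine reports against its own preference order before setting the intermediate file type. The following is a minimal sketch of that selection step only. The FMT_* identifiers and the pick_intermediate_format function are hypothetical names invented for this illustration and are not part of the DTWAIN API; how the list of supported formats is obtained from the engine is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format identifiers used only by this sketch. A real
   application would use the file type constants defined by DTWAIN. */
enum { FMT_NONE = -1, FMT_BMP = 100, FMT_TIFF = 200, FMT_JPEG = 300 };

/* Return the first entry of `preferred` that also appears in `supported`,
   or FMT_NONE when the two lists have no format in common.
   `preferred` is ordered from most to least desirable. */
long pick_intermediate_format( const long *supported, size_t n_supported,
                               const long *preferred, size_t n_preferred )
{
    size_t i, j;

    /* A NULL list is acceptable only when its length is zero */
    assert( supported != NULL || n_supported == 0 );
    assert( preferred != NULL || n_preferred == 0 );

    for ( i = 0; i < n_preferred; ++i )
    {
        for ( j = 0; j < n_supported; ++j )
        {
            if ( preferred[i] == supported[j] )
                return preferred[i];
        }
    }
    return FMT_NONE;
}
```

In a DTWAIN program, the value chosen this way would take the place of DTWAIN_BMP in the array passed to DTWAIN_SetOCRCapValues, and the acquisition would be skipped when no common format is found.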