I am trying to OCR some documents that contain transliterated Sanskrit: Latin characters carrying a set of about 10 diacritic marks, above and below the letters, that are not standard.
I need advice on how to make my OCR software (Readiris Pro) learn the right characters.
First, do I have the right product for what I need to do?
Readiris 12.0 (build f3) - i64
Model Name: Mac Pro
Model Identifier: MacPro4,1
Processor Name: Quad-Core Intel Xeon
Processor Speed: 2.66 GHz
System Version: Mac OS X 10.6.8 (10K549)
Kernel Version: Darwin 10.8.0
Boot Volume: Macintosh HD
Last Modified: 6/28/10 8:01 AM
64-Bit (Intel): Yes
Get Info String: Readiris 12.0.5 (build 1ef) Copyright 1987-2009, I.R.I.S.
Location: /Applications/Readiris Pro 12/Readiris.app
SAMPLE TEXT:
"The following brief statement regarding Vasubandhu’s religious view is limited to information obtained from his Sūtra-commentaries translated into Chinese by Bodhiruai during the first half of the sixth century A.D. These include T.1519 (SPU), T.1522 (Daśabhūmika-sūtra-śāstra) & T.1524 (SU), T. 1525 (Gayāśirṣa-sūtra-tīkā), T.1526 (Ratnācuḍaparipṛcchā-sūtra-catardharma-upadeśa), T.1532
(Viśeṣacintiparipṛcchā-sūtra-upadeśa) and T.1533 (Dharma-cakrapravartana-sūtra-upadeśa)."
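
To make clear which characters the OCR engine would have to learn, here is a minimal Python sketch that lists the distinct accented letters in an excerpt of the sample above and splits each into its base letter and diacritic mark. It assumes the text is available as Unicode; the names SAMPLE, diacritic_letters and describe are made up for this illustration and have nothing to do with Readiris.

```python
import unicodedata

# Excerpt of words taken from the sample text above (illustrative only).
SAMPLE = (
    "Sūtra Daśabhūmika-sūtra-śāstra Gayāśirṣa-sūtra-tīkā "
    "Ratnācuḍaparipṛcchā Viśeṣacintiparipṛcchā"
)

def diacritic_letters(text):
    """Return a sorted list of the distinct non-ASCII letters in text."""
    # NFC folds base letter + combining mark into one precomposed character
    # where such a character exists, so each accented letter is counted once.
    text = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in text if ord(ch) > 127 and ch.isalpha()})

def describe(ch):
    """Return (base letter, list of combining mark names) for one character."""
    # NFD splits a precomposed character into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    marks = [unicodedata.name(m) for m in decomposed[1:]]
    return base, marks

for ch in diacritic_letters(SAMPLE):
    base, marks = describe(ch)
    print(ch, "U+%04X" % ord(ch), base, marks)
```

Running the same function over a full page of correctly typed text would give the complete inventory of letters that need training, which may be smaller or larger than my estimate of about 10.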