OCR documents that contain transliteration of Sanskrit

Leo Rivers
Posts: 498
Joined: Sun Jul 17, 2011 4:52 am

Post by Leo Rivers »

Greetings

I am trying to OCR some documents that contain transliterated Sanskrit: Latin characters carrying a set of about 10 diacritic marks above and below them, which are not standard.

To make my OCR software (Readiris Pro) learn the right characters, I need advice.

First, do I have the right product for what I need to do?
----
Readiris 12.0 (build f3) - i64
----
Model Name:    Mac Pro
Model Identifier:    MacPro4,1
Processor Name:    Quad-Core Intel Xeon
Processor Speed:    2.66 GHz
System Version:    Mac OS X 10.6.8 (10K549)
Kernel Version:    Darwin 10.8.0
Boot Volume:    Macintosh HD

Readiris:
  Version:    12.0
  Last Modified:    6/28/10 8:01 AM
  Kind:    Universal
  64-Bit (Intel):    Yes
  Get Info String:    Readiris 12.0.5 (build 1ef) Copyright 1987-2009, I.R.I.S.
  Location:    /Applications/Readiris Pro 12/Readiris.app

SAMPLE TEXT:

"The following brief statement regarding Vasubandhu’s religious view is limited to information obtained from his Sūtra-commentaries translated into Chinese by Bodhiruci during the first half of the sixth century A.D. These include T.1519 (SPU), T.1522 (Daśabhūmika-sūtra-śāstra) & T.1524 (SU), T.1525 (Gayāśirṣa-sūtra-tīkā), T.1526 (Ratnācuḍaparipṛcchā-sūtra-catardharma-upadeśa), T.1532 (Viśeṣacintiparipṛcchā-sūtra-upadeśa) and T.1533 (Dharma-cakrapravartana-sūtra-upadeśa)."
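
For anyone who wants to see exactly which marks the OCR engine would have to learn, here is a minimal sketch in Python. It takes a few words copied from the sample text, decomposes each accented letter into its base letter plus combining marks (Unicode NFD), and lists the distinct marks. The function name `combining_marks` and the shortened `sample` string are just illustrative choices, not anything Readiris provides.

```python
import unicodedata

# A few words copied from the sample text above (not the full passage).
sample = "Sūtra Daśabhūmika-sūtra-śāstra Viśeṣacintiparipṛcchā Ratnācuḍaparipṛcchā"


def combining_marks(text):
    """Return the sorted Unicode names of the combining marks used in text."""
    # NFD splits each precomposed letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    return sorted(
        {unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)}
    )


marks = combining_marks(sample)
for name in marks:
    print(name)
```

On these few words the list comes down to three marks (macron, acute accent, dot below); the full set used in Sanskrit transliteration is larger, so running it over a whole document would give a more complete inventory.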
kirtu
Former staff member
Posts: 6997
Joined: Mon Jan 18, 2010 5:29 pm
Location: Baltimore, MD

Re: OCR documents that contain transliteration of Sanskrit

Post by kirtu »

Well, when you scan the document, does it scan correctly?

Kirt
“Where do atomic bombs come from?”
Zen Master Seung Sahn said, “That’s simple. Atomic bombs come from the mind that likes this and doesn’t like that.”

"Even if you practice only for an hour a day with faith and inspiration, good qualities will steadily increase. Regular practice makes it easy to transform your mind. From seeing only relative truth, you will eventually reach a profound certainty in the meaning of absolute truth."
Kyabje Dilgo Khyentse Rinpoche.

"Only you can make your mind beautiful."
HH Chetsang Rinpoche
Kaji
Posts: 242
Joined: Thu Aug 23, 2012 11:16 am
Location: Perth

Re: OCR documents that contain transliteration of Sanskrit

Post by Kaji »

Would installing the Sanserif Pali font (http://www.dharanipitaka.net/2011/2008/ ... nspali.ttf) help?
Namas triya-dhvikānāṃ sarva tathāgatānām!
viniketa
Posts: 820
Joined: Tue Jul 03, 2012 2:39 am
Location: USA

Re: OCR documents that contain transliteration of Sanskrit

Post by viniketa »

Leo - I'm not familiar with Readiris; I typically use ABBYY FineReader (and I use a PC). However, you likely need a 'language pack' for Readiris that includes all Extended Latin characters. English does not; French, Spanish, and German do.
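
A quick way to see which characters need that extended coverage is sketched below. It assumes, purely for illustration, that a basic Western European character set is roughly ISO-8859-1 (Latin-1); what a given Readiris or FineReader language pack actually covers is something to check in the product itself. The helper name `outside_latin1` is hypothetical.

```python
import unicodedata


def outside_latin1(text):
    """Return the distinct characters of text that Latin-1 cannot encode."""
    missing = []
    # NFC keeps each accented letter as a single precomposed character.
    for ch in sorted(set(unicodedata.normalize("NFC", text))):
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# Two titles copied from the sample text in the first post.
sample = "Daśabhūmika-sūtra-śāstra Viśeṣacintiparipṛcchā"
missing = outside_latin1(sample)
for ch in missing:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Every accented letter in these two titles falls outside Latin-1, so whatever recognition set is chosen has to reach into the Latin Extended ranges of Unicode.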

Hope this helps.

:namaste:
If they can sever like and dislike, along with greed, anger, and delusion, regardless of their difference in nature, they will all accomplish the Buddha Path. ~ Sutra of Complete Enlightenment