OCR documents that contain transliteration of Sanskrit

Post by Leo Rivers » Thu Apr 26, 2012 2:47 pm


I am trying to OCR some documents that contain transliterated Sanskrit: Latin characters carrying a set of about 10 diacritic marks, placed above or below the letters, that fall outside the standard Latin alphabet.

I need advice on how to make my OCR software (Readiris Pro) learn the right characters.

First, do I have the right product for what I need to do?
Readiris 12.0 (build f3) - i64

Model Name: Mac Pro
Model Identifier: MacPro4,1
Processor Name: Quad-Core Intel Xeon
Processor Speed: 2.66 GHz
System Version: Mac OS X 10.6.8 (10K549)
Kernel Version: Darwin 10.8.0
Boot Volume: Macintosh HD

Version: 12.0
Last Modified: 6/28/10 8:01 AM
Kind: Universal
64-Bit (Intel): Yes
Get Info String: Readiris 12.0.5 (build 1ef) Copyright 1987-2009, I.R.I.S.
Location: /Applications/Readiris Pro 12/Readiris.app


"The following brief statement regarding Vasubandhu’s religious view is limited to information obtained from his Sūtra-commentaries translated into Chinese by Bodhiruci during the first half of the sixth century A.D. These include T.1519 (SPU), T.1522 (Daśabhūmika-sūtra-śāstra) & T.1524 (SU), T.1525 (Gayāśirṣa-sūtra-tīkā), T.1526 (Ratnācuḍaparipṛcchā-sūtra-catardharma-upadeśa), T.1532 (Viśeṣacintiparipṛcchā-sūtra-upadeśa) and T.1533 (Dharma-cakrapravartana-sūtra-upadeśa)."
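
One way to pin down exactly which accented characters the OCR engine has to learn is to inventory them from a hand-typed sample like the passage above. This is only a minimal Python sketch, assuming the sample is available as Unicode text. The function name `diacritic_inventory` is made up for illustration, and nothing here is specific to Readiris.

```python
import unicodedata
from collections import Counter


def diacritic_inventory(text):
    """Count each non-ASCII letter in text after NFC normalization.

    NFC folds "base letter + combining mark" pairs into a single
    precomposed character where one exists, so both spellings of the
    same letter are counted together.
    """
    normalized = unicodedata.normalize("NFC", text)
    return Counter(ch for ch in normalized if ord(ch) > 127 and ch.isalpha())


# Two titles copied from the sample passage.
sample = "Daśabhūmika-sūtra-śāstra, Viśeṣacintiparipṛcchā-sūtra-upadeśa"

# Print code point, character, official Unicode name, and frequency.
for ch, count in sorted(diacritic_inventory(sample).items()):
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}  x{count}")
```

Running this over a few typed pages should give the full list of special characters to look for in the scans, which is also a handy checklist when training or proofreading the OCR output.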

Post by kirtu » Thu Apr 26, 2012 4:33 pm

Well, when you scan the document, does it scan correctly?


Post by Kaji » Fri Aug 24, 2012 12:41 pm

Would installing the Sanserif Pali font (http://www.dharanipitaka.net/2011/2008/ ... nspali.ttf) help?

Post by viniketa » Fri Aug 24, 2012 1:08 pm

Leo, I'm not familiar with Readiris; I typically use ABBYY FineReader (and I use a PC). However, you likely need a 'language pack' for Readiris that includes all Extended Latin characters. The English pack does not; the French, Spanish, and German packs do.
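
To judge whether a given character set is likely to cover Sanskrit transliteration, it can help to check which Unicode block each letter lives in. The sketch below assumes the documents use the IAST scheme with precomposed characters (the thread does not say which scheme is in use). The helper names are hypothetical, and it says nothing about what any Readiris or ABBYY language pack actually contains.

```python
import unicodedata

# Code point ranges of the relevant blocks, as defined by the Unicode Standard.
BLOCKS = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended-B", 0x0180, 0x024F),
    ("Latin Extended Additional", 0x1E00, 0x1EFF),
]

# Lowercase IAST letters that carry a diacritic (assumed scheme).
IAST_DIACRITICS = "āīūṛṝḷṅñṭḍṇśṣṃḥ"


def block_of(ch):
    """Return the name of the Unicode block containing ch, or 'other'."""
    cp = ord(ch)
    for name, lo, hi in BLOCKS:
        if lo <= cp <= hi:
            return name
    return "other"


def group_by_block(chars):
    """Group characters by Unicode block, normalizing to NFC first."""
    groups = {}
    for ch in unicodedata.normalize("NFC", chars):
        groups.setdefault(block_of(ch), []).append(ch)
    return groups


for name, members in group_by_block(IAST_DIACRITICS).items():
    print(f"{name}: {' '.join(members)}")
```

Under these assumptions only ñ falls in Latin-1 Supplement; the other letters sit in Latin Extended-A or Latin Extended Additional, so it is worth checking a pack's actual character list rather than relying on its language name.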

Hope this helps.

