OCR documents that contain transliteration of Sanskrit

Postby Leo Rivers » Thu Apr 26, 2012 2:47 pm

Greetings

I am trying to OCR some documents that contain transliterated Sanskrit, that is, Latin characters carrying a set of about 10 diacritic marks above and below them that are not part of the standard alphabet.

To make my OCR software (Readiris Pro) learn the right characters, I need advice.

First, do I have the right product for what I need to do?
----
Readiris 12.0 (build f3) - i64
----
Model Name:    Mac Pro
Model Identifier:    MacPro4,1
Processor Name:    Quad-Core Intel Xeon
Processor Speed:    2.66 GHz
System Version:    Mac OS X 10.6.8 (10K549)
Kernel Version:    Darwin 10.8.0
Boot Volume:    Macintosh HD

Readiris:
  Version:    12.0
  Last Modified:    6/28/10 8:01 AM
  Kind:    Universal
  64-Bit (Intel):    Yes
  Get Info String:    Readiris 12.0.5 (build 1ef) Copyright 1987-2009, I.R.I.S.
  Location:    /Applications/Readiris Pro 12/Readiris.app

SAMPLE TEXT:

"The following brief statement regarding Vasubandhu’s religious view is limited to information obtained from his Sūtra-commentaries translated into Chinese by Bodhiruai during the first half of the sixth century A.D. These include T.1519 (SPU), T.1522 (Daśabhūmika-sūtra-śāstra) & T.1524 (SU), T.1525 (Gayāśirṣa-sūtra-tīkā), T.1526 (Ratnācuḍaparipṛcchā-sūtra-catardharma-upadeśa), T.1532 (Viśeṣacintiparipṛcchā-sūtra-upadeśa) and T.1533 (Dharma-cakrapravartana-sūtra-upadeśa)."
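For anyone who wants to see exactly which special letters are involved, here is a minimal Python sketch that lists the distinct non-ASCII letters in a few titles copied from the sample text above. The names `SAMPLE` and `diacritic_inventory` are made up for illustration; only the standard `unicodedata` module is used, and nothing here is specific to Readiris.

```python
import unicodedata

# A few titles copied from the sample text above.
SAMPLE = (
    "Daśabhūmika-sūtra-śāstra Gayāśirṣa-sūtra-tīkā "
    "Ratnācuḍaparipṛcchā Viśeṣacintiparipṛcchā"
)

def diacritic_inventory(text):
    """Return (char, code point, Unicode name) for each distinct non-ASCII letter."""
    # NFC folds "base letter + combining mark" into one precomposed letter
    # wherever Unicode defines one, so each letter is counted once.
    composed = unicodedata.normalize("NFC", text)
    letters = {ch for ch in composed if ord(ch) > 127 and ch.isalpha()}
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in sorted(letters)]

for ch, code_point, name in diacritic_inventory(SAMPLE):
    print(ch, code_point, name)
```

A list like this could serve as a checklist of the characters the OCR engine has to be taught, or to compare against the engine's output after a test scan.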
Leo Rivers
 
Posts: 250
Joined: Sun Jul 17, 2011 4:52 am

Re: OCR documents that contain transliteration of Sanskrit

Postby kirtu » Thu Apr 26, 2012 4:33 pm

Well, when you scan the document, does it scan correctly?

Kirt
Kirt's Tibetan Translation Notes

"Only you can make your mind beautiful."
HH Chetsang Rinpoche
kirtu
Former staff member
 
Posts: 4570
Joined: Mon Jan 18, 2010 5:29 pm
Location: Baltimore, MD

Re: OCR documents that contain transliteration of Sanskrit

Postby Kaji » Fri Aug 24, 2012 12:41 pm

Would installing the Sanserif Pali font (http://www.dharanipitaka.net/2011/2008/ ... nspali.ttf) help?
Namas triya-dhvikānāṃ sarva tathāgatānām!
Kaji
 
Posts: 232
Joined: Thu Aug 23, 2012 11:16 am
Location: Perth

Re: OCR documents that contain transliteration of Sanskrit

Postby viniketa » Fri Aug 24, 2012 1:08 pm

Leo - I'm not familiar with Readiris; I typically use ABBYY FineReader (and I use a PC). However, you likely need a 'language pack' for Readiris that includes all Extended Latin characters. The English pack does not; the French, Spanish, and German ones do.
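A rough way to test whether a given character repertoire covers the letters in the sample text is sketched below in Python. This rests on a loud assumption: an OCR language pack is not a text encoding, and ISO-8859-1 (`latin-1`) is used here only as a stand-in for a typical Western European character set. The names `IAST_LETTERS` and `not_encodable` are hypothetical. For reference, letters such as ā and ś sit in the Unicode Latin Extended-A block, and letters such as ṣ and ḍ in Latin Extended Additional.

```python
import unicodedata

# Letters with diacritics that occur in the sample text earlier in the thread.
IAST_LETTERS = "āīūśṣḍṛ"

def not_encodable(text, encoding):
    """Return the distinct characters of `text` that `encoding` cannot represent."""
    missing = set()
    # Normalise first so each accented letter is a single code point.
    for ch in unicodedata.normalize("NFC", text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.add(ch)
    return sorted(missing)

# latin-1 is only a proxy for a Western European repertoire (assumption).
for encoding in ("latin-1", "utf-8"):
    print(encoding, not_encodable(IAST_LETTERS, encoding))
```

Whatever pack is chosen, it is worth checking its documented character list against these specific letters rather than relying on the language name alone.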

Hope this helps.

:namaste:
If they can sever like and dislike, along with greed, anger, and delusion, regardless of their difference in nature, they will all accomplish the Buddha Path. ~ Sutra of Complete Enlightenment
viniketa
 
Posts: 819
Joined: Tue Jul 03, 2012 2:39 am
Location: USA

