IR Paper: Comparison of Tibetan ngram and Word Retrieval

Post by kirtu » Wed Dec 21, 2011 1:51 pm

Here's an information retrieval/computational linguistics paper by Hackett and Oard: "Comparison of Word-Based and Syllable-Based Retrieval for Tibetan"

Abstract: Tibetan retrieval based on automatically segmented words is compared with the use of overlapping syllable n-grams using a known-item retrieval evaluation. The optimal span of fixed-length n-grams is found to be 2 syllables, and indexing words is found to be as effective as indexing syllable bigrams.
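
For anyone wondering what "overlapping syllable n-grams" look like in practice, here is a rough sketch in Python. It is not the authors' code. It assumes syllables can be found by splitting on the tsheg (U+0F0B), with the shad (U+0F0D) and whitespace also treated as boundaries; the function names `syllables` and `syllable_ngrams` are made up for illustration.

```python
import re

TSHEG = "\u0f0b"  # Tibetan intersyllabic mark
# Assumption: tsheg, shad (U+0F0D) and whitespace all count as syllable boundaries.
BOUNDARY = re.compile(r"[\u0f0b\u0f0d\s]+")


def syllables(text):
    """Split Tibetan text into syllables, dropping empty pieces."""
    return [s for s in BOUNDARY.split(text) if s]


def syllable_ngrams(text, n=2):
    """Return overlapping n-grams of syllables, rejoined with a tsheg.

    With n=2 these are the syllable bigrams the abstract refers to.
    Text with fewer than n syllables yields an empty list.
    """
    syls = syllables(text)
    return [TSHEG.join(syls[i:i + n]) for i in range(len(syls) - n + 1)]


if __name__ == "__main__":
    # Four syllables, so three overlapping bigrams are printed.
    for gram in syllable_ngrams("བཀྲ་ཤིས་བདེ་ལེགས།", n=2):
        print(gram)
```

Each bigram would then be indexed as if it were a term, which avoids needing a word segmenter at all. The word-based alternative in the paper depends on automatic segmentation, which this sketch does not attempt.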


Kirt
"Set your heart on virtue: Virtue's outcome is delight".
Dharmapada 9:3
“All beings are Buddhas, but obscured by incidental stains. When those have been removed, there is Buddhahood.”
Hevajra Tantra