On Mon, Feb 28, 2011 at 9:16 AM, Peter Ruwoldt <email@example.com> wrote:
> I am wanting to develop a Pitjantjatjara spell checker for
Well, I know nothing about that language, and I'm not sure whether you
really intended to write what you did.
I don't think it is necessary to develop a new spell-checker, but
instead it is enough to create a corresponding dictionary for the
existing spellchecker, namely hunspell.
> [...] I have a list of words in a unicode text file where each word is on a
> new line. There are about 2300 lines/words.
This is a rather short list, very likely not enough for automatic
affix creation. I guess that this list doesn't include inflected forms
of the words anyway (i.e. past and future forms, genitives, etc.).
> I have no idea what to do next and I would appreciate clues for the next steps.
You need to have a good understanding of how words are formed in the language.
For example, if the plural of a word is (almost) always formed by appending
an "s" to the word, then you should create an affix rule for that, etc.
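To make the idea of an affix rule concrete, here is a minimal sketch in Python. It assumes an English-style "append s" plural purely for illustration (it is not a Pitjantjatjara rule), and the flag name S and the helper expand() are made up for this example. AFF_TEXT shows how such a rule would look in a Hunspell .aff file; expand() only mimics what the rule accepts for entries written the way a .dic file lists them.

```python
# Illustrative only: an English-style plural rule, not a Pitjantjatjara rule.
# In a Hunspell .aff file, a suffix rule with the (arbitrary) flag S that
# strips nothing ("0"), appends "s", and applies to any word (".") reads:
AFF_TEXT = """SET UTF-8
SFX S Y 1
SFX S 0 s .
"""


def expand(dic_entries):
    """Return every form accepted for entries such as 'cat/S' or 'sheep'.

    This is a toy re-implementation of the single rule above, not Hunspell.
    """
    forms = set()
    for entry in dic_entries:
        word, _, flags = entry.partition("/")
        forms.add(word)            # the base form is always accepted
        if "S" in flags:           # entry carries the plural flag
            forms.add(word + "s")  # so the suffixed form is accepted too
    return forms


print(sorted(expand(["cat/S", "sheep"])))  # ['cat', 'cats', 'sheep']
```

A word without the flag ("sheep" here) gets no suffixed form, which is how exceptions to the rule are kept out of the dictionary.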
But without knowing the language's specifics, it is hard to give a
concrete path. But then again, 2300 words is really short. With that list,
you can just save that list as a dictionary without any
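Saving a plain word list as a Hunspell dictionary could look roughly like the Python sketch below. It assumes the word list is UTF-8 encoded (a "unicode text file" might also be UTF-16, in which case the encoding argument needs changing); the function names and the file names in the usage comment are hypothetical. A .dic file starts with a line giving the (approximate) number of entries, followed by one word per line, and a companion .aff file declares the encoding.

```python
from pathlib import Path


def build_dic(words):
    """Return .dic file content: entry count on line 1, then one word per line."""
    # Drop blank lines and duplicates; sort for a stable, diff-friendly file.
    unique = sorted({w.strip() for w in words if w.strip()})
    return "\n".join([str(len(unique)), *unique]) + "\n"


def write_dictionary(wordlist_path, out_stem):
    """Write out_stem.dic and a minimal out_stem.aff next to each other."""
    # "utf-8-sig" reads plain UTF-8 and also strips a leading BOM if present.
    words = Path(wordlist_path).read_text(encoding="utf-8-sig").splitlines()
    Path(out_stem + ".dic").write_text(build_dic(words), encoding="utf-8")
    # No affix rules yet: the .aff file only declares the character encoding.
    Path(out_stem + ".aff").write_text("SET UTF-8\n", encoding="utf-8")


# Hypothetical usage:
# write_dictionary("wordlist.txt", "my_dict")
```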
But to develop a dictionary, you need not only a list of correct words
but also a (possibly automatically generated) list of misspelled
words, so that you can do quality checks on your modifications.
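Such a quality check can be sketched in a few lines of Python. The function quality_check() and the sample words are made up for illustration, and the accepts argument stands for whatever spell checker is being tested; here a plain set lookup takes its place.

```python
def quality_check(accepts, correct_words, misspelled_words):
    """Return (missed, leaked) for a checker function accepts(word) -> bool.

    missed: correct words the checker wrongly rejects
    leaked: misspelled words the checker wrongly accepts
    """
    missed = [w for w in correct_words if not accepts(w)]
    leaked = [w for w in misspelled_words if accepts(w)]
    return missed, leaked


# Toy stand-in for a dictionary after adding an over-broad plural rule:
# it now also accepts the wrong form "sheeps".
known = {"cat", "cats", "sheep", "sheeps"}

missed, leaked = quality_check(
    lambda w: w in known,
    correct_words=["cat", "cats", "sheep"],
    misspelled_words=["catt", "sheeps"],
)
print(missed)  # []
print(leaked)  # ['sheeps']
```

Running the same check after every change to the word list or affix rules shows at once whether a new rule lets wrong forms through or blocks correct ones.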