On 06/24/2012 05:01 PM, webmaster-Kracked_P_P wrote:
> BUT, I would like to make sure all of my .oxt dictionaries have the words/terms we use every day
> in articles and email support for LibreOffice and other open source related "items".
If you have the disk capacity, then:
* Download the Wikipedia article database;
* Run a script that writes each word it finds into a file;
* Manually go through the list to pick out misspellings;
* Merge the "correct words" list into your existing wordlist;
* Merge the "known misspellings" list into the autocorrect list.
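
A rough sketch of the two scripted steps above (collecting words, then merging the reviewed list), in Python. The function names, the word pattern, and the one-word-per-line file layout are my own assumptions for illustration; a real Wikipedia dump would need its markup stripped first, and an .oxt dictionary needs more than a plain wordlist.

```python
import re
from collections import Counter

# Assumed definition of a "word": runs of letters, optionally joined by
# an apostrophe or hyphen. Digits and underscores are excluded.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def count_words(lines):
    """Count every word found in an iterable of text lines (case is kept)."""
    counts = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(line))
    return counts


def write_candidates(counts, path):
    """Write one 'word<TAB>count' line per word, most frequent first."""
    with open(path, "w", encoding="utf-8") as handle:
        for word, count in counts.most_common():
            handle.write(f"{word}\t{count}\n")


def merge_wordlist(existing, reviewed):
    """Merge manually reviewed words into an existing wordlist, without duplicates."""
    return sorted(set(existing) | set(reviewed))
```

Writing the counts next to the words is only a convenience for the manual pass: words that occur once or twice are usually the first place to look for misspellings.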
Two potential issues with this approach:
* Names of individuals, organizations, and things are included;
* Foreign words are included.
Whilst there are ways to eliminate both of those problems, the usual
result, when doing so using scripts, is that legitimate words in the
target language are removed along with the foreign words and proper
nouns. As one example, the Afrikaans dictionary omitted the word "die"
for several years, because the script that was used to eliminate
non-Afrikaans words read that word as the English "die".
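
To make the "die" problem concrete, here is a minimal Python sketch of how a naive foreign-word filter drops a legitimate word. The function names and the tiny word sets are made up for illustration; the allowlist variant is just one possible safeguard I am assuming, not how the Afrikaans dictionary was actually fixed.

```python
def naive_foreign_filter(candidates, foreign_words):
    """Drop every candidate that also appears in the foreign wordlist."""
    return [w for w in candidates if w.lower() not in foreign_words]


def filter_with_allowlist(candidates, foreign_words, allowlist):
    """Same filter, but words on a hand-checked allowlist are always kept."""
    return [
        w for w in candidates
        if w.lower() in allowlist or w.lower() not in foreign_words
    ]


# Hypothetical sample data: "die" is valid in both Afrikaans and English.
candidates = ["die", "kat", "the"]
english = {"die", "the"}

lost = naive_foreign_filter(candidates, english)
kept = filter_with_allowlist(candidates, english, allowlist={"die"})
```

With the naive filter, "die" disappears together with "the". The allowlist keeps it, but someone still has to build that allowlist by hand, which is the manual work the script was meant to avoid.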
jonathon