2011 Archives by date, by thread · List index


Hi all,

Rather than try to list all the issues here I thought it might help if
I provided a script that tries to find errors in the files.

I have made some assumptions about the file format to look for the
common errors I found:
1. A line that starts with one or more characters followed by a |, then
only digits to EOL, is a word definition.
2. A line that starts with either ( or - is a synonym definition.
This may not be a valid assumption, as I've seen lines starting with
interj that were definitely synonym definitions.  I am not sure what
interj means in th_ro_RO_v2.dat, so I have special-cased interj and
prep to also count as synonym lines.
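
To make the two assumptions concrete, here is a rough Python sketch of
the line classification (this is not the attached Perl script; the name
classify_line and the exact regular expression are only illustrative,
and I have assumed the word itself never contains a |):

```python
import re

# Assumption 1: one or more characters, a "|", then only digits to end of line.
# Excluding "|" from the word part is an extra assumption of this sketch.
WORD_RE = re.compile(r"^([^|]+)\|(\d+)$")

# Assumption 2: synonym lines start with "(" or "-"; "interj" and "prep"
# are special-cased as described above (a loose prefix match).
SYNONYM_PREFIXES = ("(", "-", "interj", "prep")


def classify_line(line):
    """Return (kind, word, expected_count); word and count are None
    unless kind is "word"."""
    line = line.rstrip("\r\n")
    match = WORD_RE.match(line)
    if match:
        return ("word", match.group(1), int(match.group(2)))
    if line.startswith(SYNONYM_PREFIXES):
        return ("synonym", None, None)
    return ("other", None, None)
```

The word rule is tried first, in the same order as the list above, so a
line such as "(x)|12" would be treated as a word definition in this
sketch.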

With these assumptions the script compares the expected number of
synonyms with the actual number of synonyms and complains if they
don't match (with word and line numbers displayed for the definition).

It will also complain if it finds the same word more than once and
will print out both lines on which the suspect word was found.
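
The two checks can be sketched roughly as follows, in Python rather
than the Perl of the actual script.  The function name check_thesaurus
and the message wording are made up for illustration, and I have
assumed the number after the | is the count of synonym lines that
follow the word definition:

```python
import re

# Same line-format assumptions as described above.
WORD_RE = re.compile(r"^([^|]+)\|(\d+)$")
SYNONYM_PREFIXES = ("(", "-", "interj", "prep")


def check_thesaurus(lines):
    """Return a list of informational messages for an iterable of lines."""
    messages = []
    first_seen = {}   # word -> line number of its first definition
    current = None    # (word, line_no, expected) for the open definition
    actual = 0        # synonym lines counted since that definition

    def mismatch(entry, found):
        # Build a message if the open definition has the wrong count.
        if entry is not None and found != entry[2]:
            word, line_no, expected = entry
            return (f"{word} (line {line_no}): expected {expected} "
                    f"synonym lines, found {found}")
        return None

    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        match = WORD_RE.match(line)
        if match:
            # A new word definition closes the previous one.
            message = mismatch(current, actual)
            if message:
                messages.append(message)
            word, expected = match.group(1), int(match.group(2))
            if word in first_seen:
                messages.append(f"{word}: defined on line "
                                f"{first_seen[word]} and again on line {line_no}")
            else:
                first_seen[word] = line_no
            current, actual = (word, line_no, expected), 0
        elif line.startswith(SYNONYM_PREFIXES):
            actual += 1

    # The last definition in the file still needs checking.
    message = mismatch(current, actual)
    if message:
        messages.append(message)
    return messages
```

It could be run on a file with something like
check_thesaurus(open(path, encoding=enc)), where the encoding has to
match whatever the .dat file declares on its first line.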

The script finds no issues in a number of dictionaries, but it outputs
this many informational lines for the following dictionaries in my
libreoffice build tree:
    138 th_ca_ES_v3.dat
   1092 th_de_AT_v2.dat
   1101 th_de_CH_v2.dat
   1092 th_de_DE_v2.dat
      2 th_hu_HU_v2.dat
      6 th_ne_NP_v2.dat
   2582 th_ro_RO_v2.dat
      8 th_ru_RU_v2.dat
     15 th_sk_SK_v2.dat

I hope this helps.  The perl script is LGPL/MPL.

Regards
Steven Butler

