
Re: [libreoffice-l10n] Help with Lightproof localisation for LibreOffice



Hi there,

I am the owner of the Tamil localisation effort, and am creating the
grammar checker for LibreOffice using Lightproof. I am having trouble
matching the diacritic marks that are so common in Tamil. For example --

\b(த[ா-ௌ]*\S*)\b

will match தாலம but not தாலம்

I would like to match the whole word, including the diacritic mark; but I'm
not sure how to trap it.

I would appreciate any pointers if you have faced a similar problem for
your language and solved it.

Cheers,
Elanjelian

Hi,

AFAIK this is a known bug in Python 2: it doesn't support Unicode completely. So you need to switch to Python 3 to process your language without this kind of problem.

I'm not sure when Lightproof will deliver Python 3 support; László can probably tell you more.
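
In the meantime, here is a minimal sketch of one possible workaround. It assumes Python 3 and the standard `re` module, and it has not been tested inside Lightproof itself, so whether a Lightproof rule accepts this exact syntax is an assumption. The idea is to avoid `\b` altogether and delimit the word with whitespace lookarounds, so the match does not depend on whether the trailing pulli (U+0BCD) counts as a word character. The names `WORD_STARTING_WITH_TA` and `find_words` are made up for the illustration.

```python
import re

# Whitespace-delimited word starting with the Tamil letter TA (U+0BA4).
# (?<!\S) and (?!\S) stand in for \b: they only ask that the word is
# bounded by whitespace or by the start/end of the text, so combining
# marks at the end of the word stay inside the match.
WORD_STARTING_WITH_TA = re.compile(r"(?<!\S)(\u0ba4\S*)(?!\S)")


def find_words(text):
    """Return every whitespace-delimited word that starts with U+0BA4."""
    return WORD_STARTING_WITH_TA.findall(text)


# The word from the original question, ending in the pulli (U+0BCD).
sample = "\u0ba4\u0bbe\u0bb2\u0bae\u0bcd"

# The same word inside a short phrase; the second word of three.
phrase = "\u0b85\u0ba4\u0bc1 " + sample + " \u0b86\u0b95\u0bc1\u0bae\u0bcd"

if __name__ == "__main__":
    print(find_words(sample))
    print(find_words(phrase))
```

One limitation of this sketch: because `\S` accepts anything that is not whitespace, punctuation attached to the word (a trailing full stop, for example) ends up in the match and would have to be stripped separately.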


References:
[libreoffice-l10n] Help with Lightproof localisation for LibreOffice, Elanjelian Venugopal <tamiliam@gmail.com>