On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote:
I have two questions: is there a way to have the LibreOffice spelling
checker (Hunspell) also recognize word-breaks using the ICU break iterator
for Khmer so that Cambodians no longer have to add zero-width spaces
manually (as it now seems to work for Thai)? Currently, lines without
zero-width spaces are seen as one long word by the spelling checker in
LibreOffice 3.6. But since the line-breaking is working, it would seem
breaking words for the spelling checker should also be able to work. Should
I submit a bug? How should I proceed?
Sounds like a bug really. I mean, hunspell itself generally doesn't do
the parsing of text into words; the app gives each word to hunspell. And
we're *supposed* to be using the icu breakiterator to split words. I
suspect it's a similar bug to this original one.
So... sure, file a bug, assign it to me (caolanm@redhat.com) and paste a
short two-word example text into the bug and indicate where the word
break should be and I'll add a regression test for it and see if it's a
trivial fix for Khmer too now that we're using the latest-and-greatest
icu.
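
To make the division of labour above concrete, here is a minimal sketch in Python. It is not LibreOffice's code and does not call ICU: the greedy longest-match `segment_words` is only a toy stand-in for what a dictionary-based break iterator does, and `segment_words`, `misspelled`, the two-entry lexicon and the `is_correct` callback (standing in for hunspell) are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Set


def segment_words(text: str, lexicon: Set[str]) -> List[str]:
    """Greedy longest-match split; a toy stand-in for a real break iterator."""
    max_len = max((len(w) for w in lexicon), default=1)
    words: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        # A single unknown character is emitted on its own as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon or size == 1:
                words.append(candidate)
                i += size
                break
    return words


def misspelled(text: str, lexicon: Set[str],
               is_correct: Callable[[str], bool]) -> List[str]:
    """The application splits the text; the checker only ever sees single words."""
    return [w for w in segment_words(text, lexicon) if not is_correct(w)]


# Two Khmer words written with no zero-width space between them.
first = "ភាសា"     # "language"
second = "ខ្មែរ"    # "Khmer"
lexicon = {first, second}
words = segment_words(first + second, lexicon)
```

The point of the sketch is only that the splitting happens before the spelling checker is involved, so a run of text that reaches the checker as one long word indicates a problem in the segmentation step rather than in the dictionary.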
Also, since many other programs do not incorporate ICU's code, is there a
way to make the line breaks "real" when a document is saved in another
format (such as .doc)? By "real" I mean that a zero-width space is
actually added to the text where a line-break should be.
That should at least be theoretically possible, albeit a bit tricky
seeing as the layout code is the bit that knows the width of the page
and does the line breaking, while the export filters don't get to know
that information. There was something similar done in the past IIRC to
pass around soft-page-break information so that export filters could
know where the layout last put the page breaks. I forget the details of
that though.
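
As a rough illustration of the text-level half of that idea, the sketch below inserts U+200B at a list of break offsets. It assumes the offsets have already been obtained from somewhere (the layout code or a break iterator); how an export filter would actually get them is exactly the open question above. `insert_zwsp` and its parameters are hypothetical names, not an existing LibreOffice API.

```python
from typing import Iterable

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def insert_zwsp(text: str, boundaries: Iterable[int]) -> str:
    """Return text with a zero-width space inserted at each interior offset."""
    # Offsets at 0 or len(text) need no marker; duplicates are collapsed.
    cuts = sorted({b for b in boundaries if 0 < b < len(text)})
    pieces = []
    start = 0
    for cut in cuts:
        pieces.append(text[start:cut])
        start = cut
    pieces.append(text[start:])
    return ZWSP.join(pieces)


marked = insert_zwsp("abcdef", [2, 4])
```

The sketch does not check whether a zero-width space is already present at an offset, so running it twice over the same text would double the markers; a real filter would need to guard against that.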
C.