Thanks for your reply, Caolán.
I have submitted a bug and assigned it to you. I really appreciate you
being willing to look into this!
Here's the bug url:
https://www.libreoffice.org/bugzilla/show_bug.cgi?id=52020
Please let me know if there is anything else I can provide. I have a little
working knowledge of ICU (I helped implement the break iterator for Khmer by
providing the dictionary and tests), but I am not a programmer by trade.
There was something similar done in the past IIRC to
pass around soft-page-break information so that export filters could
know where the layout last put the page breaks. I forget the details of
that though.
This would be a very useful feature for Cambodians (and, I would assume, for
Thai users as well, although Thai tends to have more programs that already
support word breaking). Would it be best to do this with an extension rather
than in LibreOffice core?
Thanks again for your time,
Nathan
On Thu, Jul 12, 2012 at 11:10 PM, Caolán McNamara [via Document Foundation
Mail Archive] wrote:
On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote:
I have two questions: is there a way to have the LibreOffice spelling
checker (Hunspell) also recognize word-breaks using the ICU break iterator
for Khmer so that Cambodians no longer have to add zero-width spaces
manually (as it seems to work for Thai now?)? Currently, lines without
zero-width spaces are seen as one long word to the spelling checker in
LibreOffice 3.6. But since the line-breaking is working, it would seem
breaking words for the spelling checker should also be able to work. Should
I submit a bug? How should I proceed?
Sounds like a bug really. I mean, hunspell itself generally doesn't do
the parsing of text into words, the app gives each word to hunspell. And
we're *supposed* to be using the icu breakiterator to split words. I
suspect it's a similar bug to this original one.
So... sure, file a bug, assign it to me ([hidden email]) and paste a
short two word example text into the bug and indicate where the word
break should be and I'll add a regression test for it and see if it's a
trivial fix for Khmer too now that we're using the latest-and-greatest
icu.
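
To illustrate the division of labour described above (the application
splits the text into words and hands each word to the spelling checker),
here is a minimal Python sketch. It is not LibreOffice or Hunspell code:
the boundary offsets are supplied by hand as a stand-in for whatever a word
break iterator would return, and `words_from_boundaries`, `misspelled` and
the `is_known` callback are hypothetical names made up for this example.

```python
from typing import Callable, Iterable, List


def words_from_boundaries(text: str, boundaries: Iterable[int]) -> List[str]:
    """Cut text into segments at the given character offsets."""
    # Always include the start and end; ignore offsets outside the text.
    cuts = sorted({0, len(text), *(b for b in boundaries if 0 < b < len(text))})
    return [text[a:b] for a, b in zip(cuts, cuts[1:])]


def misspelled(text: str, boundaries: Iterable[int],
               is_known: Callable[[str], bool]) -> List[str]:
    """Return the segments that the (hypothetical) checker does not know."""
    return [w for w in words_from_boundaries(text, boundaries)
            if w.strip() and not is_known(w)]


if __name__ == "__main__":
    # Two Khmer words written without a space: "ភាសា" + "ខ្មែរ".
    text = "\u1797\u17b6\u179f\u17b6\u1781\u17d2\u1798\u17c2\u179a"
    toy_dictionary = {text[:4], text[4:]}  # assumption: a two-word dictionary
    # Without boundaries the whole run is one unknown "word".
    print(misspelled(text, [], toy_dictionary.__contains__))
    # With a boundary after the fourth code point both words are found.
    print(misspelled(text, [4], toy_dictionary.__contains__))
```

Without any boundary the whole run is reported as a single unknown word,
which matches the symptom described in the question; with the boundary in
place nothing is reported.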
Also, since many other programs do not incorporate ICU's code, is there a
way to make the line breaks "real" when a document is saved in another
format (such as .doc)? And by "real" I mean that a zero-width space is
actually added to the text where a line-break should be.
That should at least be theoretically possible, albeit a bit tricky
seeing as the layout code is the bit that knows the width of the page
and does the line breaking, while the export filters don't get to know
that information. There was something similar done in the past IIRC to
pass around soft-page-break information so that export filters could
know where the layout last put the page breaks. I forget the details of
that though.
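
As a rough sketch of the "real" breaks idea only: assuming the break
positions had somehow been made available to an export filter (the hard
part described above), writing them into the text could look like the
Python below. `insert_zwsp` and the offsets passed to it are hypothetical;
nothing here reflects how LibreOffice's filters actually work.

```python
from typing import Iterable

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def insert_zwsp(text: str, boundaries: Iterable[int]) -> str:
    """Insert a zero-width space at each break offset inside the text."""
    out = []
    prev = 0
    for b in sorted(set(boundaries)):
        if not 0 < b < len(text):
            continue  # offsets at or beyond the ends need no marker
        out.append(text[prev:b])
        before, after = text[b - 1], text[b]
        # Do not add a marker next to whitespace or an existing ZWSP.
        if not (before.isspace() or after.isspace() or ZWSP in (before, after)):
            out.append(ZWSP)
        prev = b
    out.append(text[prev:])
    return "".join(out)


if __name__ == "__main__":
    # "ភាសា" + "ខ្មែរ" with an assumed break after the fourth code point.
    text = "\u1797\u17b6\u179f\u17b6\u1781\u17d2\u1798\u17c2\u179a"
    marked = insert_zwsp(text, [4])
    print(len(text), len(marked))  # the marked text is one code point longer
```

Skipping positions that already sit next to a space or a zero-width space
keeps the function safe to run on text where some breaks were added by hand.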
C.