

Thanks for your reply, Caolán,
I have submitted a bug and assigned you to it. I really appreciate you
being willing to look into this!
Here's the bug url:
https://www.libreoffice.org/bugzilla/show_bug.cgi?id=52020
Please let me know if there is anything else I can provide. I have a little
working knowledge of ICU: I helped implement the break iterator for Khmer by
providing the dictionary and tests, but I am not a programmer by trade.

> There was something similar done in the past IIRC to
> pass around soft-page-break information so that export filters could
> know where the layout last put the page breaks. I forget the details of
> that though.

This would be a very useful feature for Cambodians (and I would assume Thai
as well, although Thai tends to have more programs that currently support
wordbreaking already) - would it be best to seek to do this with an
extension rather than LibreOffice core?

Thanks again for your time,
Nathan


On Thu, Jul 12, 2012 at 11:10 PM, Caolán McNamara [via Document Foundation
Mail Archive] <ml-node+s969070n3995127h32@n3.nabble.com> wrote:

On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote:
I have two questions: is there a way to have the LibreOffice spelling
checker (Hunspell) also recognize word breaks using the ICU break
iterator for Khmer, so that Cambodians no longer have to add zero-width
spaces manually (as seems to work for Thai now)? Currently, lines without
zero-width spaces are seen as one long word by the spelling checker in
LibreOffice 3.6. But since line breaking is working, it would seem that
breaking words for the spelling checker should be able to work too.
Should I submit a bug? How should I proceed?

Sounds like a bug really. I mean, hunspell itself generally doesn't do
the parsing of text into words; the app gives each word to hunspell. And
we're *supposed* to be using the icu breakiterator to split words. I
suspect it's a similar bug to this original one.

So... sure, file a bug, assign it to me ([hidden email]) and paste a
short two-word example text into the bug and indicate where the word
break should be, and I'll add a regression test for it and see if it's a
trivial fix for Khmer too now that we're using the latest-and-greatest
icu.
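
The division of labour described above (the application splits the text
into words and hands each word to the spelling checker) can be
illustrated with a minimal sketch. Everything in it is an assumption made
for illustration: segment() is a toy greedy longest-match splitter
standing in for the ICU break iterator, misspelled() stands in for the
per-word Hunspell lookup, and the Latin sample text stands in for
unspaced Khmer. It is not how LibreOffice or ICU are actually
implemented.

```python
def segment(text, dictionary):
    """Split unspaced text into words by greedy longest match.

    Toy stand-in for a dictionary-based break iterator (hypothetical,
    not ICU's algorithm). Characters that start no dictionary word are
    emitted as single-character tokens.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                break
        else:
            # No dictionary word starts here: fall back to one character.
            candidate = text[i]
        words.append(candidate)
        i += len(candidate)
    return words


def misspelled(text, break_dictionary, known_words):
    """Return the tokens of `text` that the (toy) spell checker rejects.

    The checker only ever sees individual tokens, never the whole line.
    """
    return [w for w in segment(text, break_dictionary) if w not in known_words]
```

If the segmentation step is skipped, the whole line reaches the checker
as a single token, which matches the "one long word" symptom reported
above.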

Also, since many other programs do not incorporate ICU's code, is there
a way to make the line breaks "real" when a document is saved in another
format (such as a .doc)? And by "real" I mean that a zero-width space
is actually added to the text where a line break should be.

That should at least be theoretically possible, albeit a bit tricky
seeing as the layout code is the bit that knows the width of the page
and does the line breaking, while the export filters don't get to know
that information. There was something similar done in the past IIRC to
pass around soft-page-break information so that export filters could
know where the layout last put the page breaks. I forget the details of
that though.
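
As a rough sketch of the "real" break idea discussed above: assuming the
break positions are already available as character offsets (from the
layout or from a break iterator; how an export filter would obtain them
is exactly the open question), inserting U+200B ZERO WIDTH SPACE is the
easy part. The function name and the offset-list interface are
hypothetical and are not LibreOffice APIs.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def insert_zwsp(text, break_offsets):
    """Return `text` with a zero-width space at each break offset.

    `break_offsets` are character indexes into `text` (assumed to come
    from some segmenter). Offsets at the very start or end of the text,
    and offsets already adjacent to a zero-width space, are skipped so
    that no duplicate or leading/trailing breaks are written.
    """
    pieces = []
    previous = 0
    for offset in sorted(set(break_offsets)):
        if not 0 < offset < len(text):
            continue  # nothing to separate at the edges
        if text[offset - 1] == ZWSP or text[offset] == ZWSP:
            continue  # a break opportunity is already present here
        pieces.append(text[previous:offset])
        previous = offset
    pieces.append(text[previous:])
    return ZWSP.join(pieces)
```

For example, insert_zwsp("thecatsat", [3, 6]) places a zero-width space
after "the" and after "cat" and leaves the visible text unchanged.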

C.

_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice





--
View this message in context: 
http://nabble.documentfoundation.org/Adding-Extension-for-Experimental-Thai-Spelling-tp3735637p3995138.html
Sent from the Dev mailing list archive at Nabble.com.
