

Dotan:

I'm sorry I never stumbled across your essay before, but thanks for an
excellent explanation of how the Unicode® Standard Annex #9/Unicode
Bidirectional Algorithm works *in actual practice*!

So far as I can see, your description is still valid even for the recently
updated version of that algorithm
(http://www.unicode.org/reports/tr9/tr9-35.html). As published, the Unicode
Consortium's text doesn't explain what's happening in a way that would help
an average user, one who is just trying to "type" while mixing multiple
scripts with opposing directionality. It is written for developers.

Unfortunately (in my view anyway), the algorithm itself makes some
assumptions that I find unjustifiable. A primary example is the
categorization of certain "shared characters" (spaces, punctuation and so
forth) as neutral, and accompanying that with the idea that they should
therefore take on the directionality of the paragraph unless and until
surrounded by characters that clearly define them as one directionality or
another.
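
To make the categories concrete, here is a minimal sketch in Python that
looks up the bidirectional class the Unicode character database assigns to
a few characters. The grouping into "strong", "weak" and "neutral" follows
my reading of UAX #9; the helper name bidi_category is my own and purely
illustrative. Note that some of the "shared characters" (comma, digits) are
formally "weak" rather than "neutral" in the standard's terms.

```python
import unicodedata

# Bidi classes as grouped in UAX #9 (my reading; anything else is "weak").
STRONG = {"L", "R", "AL"}
NEUTRAL = {"B", "S", "WS", "ON"}


def bidi_category(ch: str) -> str:
    """Return 'strong', 'neutral' or 'weak' for a single character.

    Hypothetical helper for illustration only. Unassigned code points,
    for which unicodedata returns an empty class, also land in 'weak'.
    """
    cls = unicodedata.bidirectional(ch)
    if cls in STRONG:
        return "strong"
    if cls in NEUTRAL:
        return "neutral"
    return "weak"


if __name__ == "__main__":
    # Latin letter, Hebrew alef, space, exclamation mark, comma, digit
    for ch in ["a", "\u05d0", " ", "!", ",", "1"]:
        print(repr(ch), unicodedata.bidirectional(ch), bidi_category(ch))
```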

This seems to be why, for instance, the cursor jumps around mysteriously
each time a space is typed within a multi-word segment of Hebrew or Arabic
script (regardless of the actual language the script is used for). You said
"In LibreOffice you shouldn't have such an issue" - true enough, but several
such issues remain. It would seem to me that, from a user-interface
perspective at least, such characters should keep the directionality of the
most recently typed character, leaving the cursor where it was before the
space was entered (the space is the most common example, but the same
happens with other such characters). If the next character typed turns out
to have the opposite directionality, then make the correction at that point.
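
A rough sketch of that suggestion in Python, for anyone who wants to
experiment. This is NOT the Unicode Bidirectional Algorithm, only the
heuristic described above, and the function name sticky_directions and its
paragraph_dir parameter are made up for illustration. It assumes that every
character without a strong direction (spaces, punctuation, and also digits)
simply inherits the direction of the most recent strong character, falling
back to the paragraph direction at the very start.

```python
import unicodedata
from typing import List


def sticky_directions(text: str, paragraph_dir: str = "L") -> List[str]:
    """Assign 'L' or 'R' to each character as it would be typed.

    Hypothetical helper: strong characters set the current direction,
    everything else keeps whatever direction was last in effect.
    """
    current = paragraph_dir  # used until the first strong character
    result = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            current = "L"
        elif cls in ("R", "AL"):
            current = "R"
        # Spaces, punctuation, digits etc. leave `current` unchanged.
        result.append(current)
    return result


if __name__ == "__main__":
    # Two Latin letters, a space, two Hebrew letters, a space, one Hebrew letter
    print(sticky_directions("ab \u05d0\u05d1 \u05d2"))
```

The later "correction" step (re-resolving a space once the following
character is known) is deliberately left out; the sketch only shows where
the cursor-side decision would be made at the moment of typing.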

As a matter of principle, assumptions built into algorithms always seem
risky. In this case, the whole idea that one needs to set the
directionality of characters or phrases ahead of time seems particularly
problematic. The obvious counter-argument is the case where a paragraph
begins with a character whose direction isn't the one the writer intended;
that would need to be treated as a special circumstance.

The ultimate objective would seem to be completely removing any barriers to
freely typing in whatever language or script is desired, without needing to
know a lot of special tricks. Unicode (and UTF-8) and OpenType font
technology are both big steps towards this goal, but we're not quite there
yet.

Again, thanks for pointing out your essay!



