

Hi Aleksandr,


2016-04-03 16:17, Aleksandr Andreev wrote:
On Sun, Apr 3, 2016 at 2:52 PM, Rimas Kudelis <rq@akl.lt> wrote:
The reason why I suspect it might belong to fonts is because there is
only one Unicode codepoint I know of serving this exact purpose (U+00AD
SOFT HYPHEN), and OpenType has a feature called "Localized forms", which
is designed exactly for cases like this (where the glyph representation in
a particular language is supposed to differ from the usual one). In
combination, these features seem to provide a means to solve your problem.

You may be right that it belongs to the realm of Font Features
(although that sounds like a terrible design flaw IMHO, given that LO
has no mechanism currently to turn simple OpenType features on and off
IIUC). But it certainly has nothing to do with the Soft Hyphen.

According to the Unicode documentation (p. 268),

Despite its name, U+00AD soft hyphen is not a hyphen, but rather an
invisible format character used to indicate optional intraword breaks.

And on p. 812 of the Standard:

U+00AD soft hyphen (SHY) indicates an intraword break point, where a
line break is preferred if a word must be hyphenated or otherwise
broken across lines. Such break points are generally determined by an
automatic hyphenator. SHY can be used with any script, but its use is
generally limited to situations where users need to override the
behavior of such a hyphenator.

So, the SHY:
* has no visible glyph, despite what some font manufacturers are doing;
* is not a graphic character, but rather a format control character;
* is not supposed to be used by an automatic hyphenator for hyphenation;
* is supposed to be used by a user to *override* the behavior of an
automatic hyphenator.
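
For anyone who wants to check the classification above, here is a minimal
sketch using Python's standard unicodedata module. The helper names
(describe, break_points, strip_shy) are made up for illustration and are
not part of LibreOffice or any library. It only shows that U+00AD is
reported as a format character (general category Cf) rather than as dash
punctuation, and how user-inserted SHYs mark candidate break points.

```python
import unicodedata

SHY = "\u00ad"  # U+00AD SOFT HYPHEN


def describe(ch):
    """Return (Unicode name, general category) for a single character."""
    return unicodedata.name(ch), unicodedata.category(ch)


def break_points(word):
    """Split a word at its soft hyphens (hypothetical helper).

    Each boundary between fragments is a break opportunity that the
    user marked by hand, overriding any automatic hyphenator.
    """
    return word.split(SHY)


def strip_shy(word):
    """Remove soft hyphens, e.g. before comparing or searching text."""
    return word.replace(SHY, "")


if __name__ == "__main__":
    print(describe(SHY))   # ('SOFT HYPHEN', 'Cf'): a format character
    print(describe("-"))   # ('HYPHEN-MINUS', 'Pd'): dash punctuation
    print(break_points("hy\u00adphen\u00adation"))
```

Note that this says nothing about which glyph a renderer should draw at
an actual line break; that is exactly the open question in this thread.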

I see you've done your homework and researched this more thoroughly than
I did. Great! :)

With all the data you shared, I'm even more certain that this belongs to
the locale data, much like quotation characters and number formatting
characters. I'm not sure if this locale property is readily available
for inclusion in locale data though. It might be that Slavonic is a very
rare exception to the common rule of using hyphens for that, and that
this hasn't been accounted for anywhere. At least I couldn't find
anything about this in either the LDML standard or our DTD for
locale definition files
(https://cgit.freedesktop.org/libreoffice/core/plain/i18npool/source/localedata/data/locale.dtd).

Regards,
Rimas


