

On Tue, 30 Jun 2015 17:48:05 +0200
Eike Rathke <erack@redhat.com> wrote:

On Monday, 2015-06-29 20:40:46 +0200, Khaled Hosny wrote:

We already handle this at the text shaping level in VCL for
platforms where HarfBuzz is used.
 
I think we talk about two different things here.

Yes.  Khaled and I are focused on handling text, whether fundamentally
present or generated by field codes and the like.  What you are talking
about makes most sense when there is no relevant user-input text.

My view is from
correct language tag attribution that we need anyway, for document
storage

I don't understand that one.

and spell-checkers

Seems to work for 'unsupported' nod-TH.  Tai Tham script is encountered
and identified as complex (as demonstrated by the choice of font), so the
text is treated as language nod-TH and corrected using the nod-TH
spelling dictionaries.

(Mind you, they're only populated as nod-Lana-TH.  The fun starts when
we want to distinguish what might be called nod-Thai-TH-etymological,
nod-Thai-TH-Chiangmai and nod-Thai-TH-Chiangrai.)

and locale dependent representation.

Presumably for generated text.  Yes, here language and country will in
general be inadequate.

When
I mention "language tag" I'm always talking about BCP 47 language
tags. You, and possibly Richard, have the runtime view and what could
be automatically detected. So, even if detected automatically we'll
have to assign a language tag that for the non-default script of a
language includes the ISO 15924 script code.
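A minimal sketch of that rule, in Python purely for illustration: the script subtag is added only when the script is not the language's default. The function name make_language_tag and the DEFAULT_SCRIPT table are hypothetical; the table entries are assumptions for the example (nod defaulting to Lana follows the use of nod-TH for Tai Tham text in this thread) and are not taken from LibreOffice or any registry.

```python
# Illustrative table only (an assumption, not LibreOffice data):
# the script assumed to be implied by a bare language code.
DEFAULT_SCRIPT = {"en": "Latn", "th": "Thai", "nod": "Lana"}


def make_language_tag(language, script, region=None):
    """Build a BCP 47 style tag from a language code, an ISO 15924
    script code and an optional region code.

    The script subtag is included only when it differs from the
    language's assumed default script, or when no default is known.
    """
    language = language.lower()
    script = script.title()  # ISO 15924 codes are written in title case
    parts = [language]
    if DEFAULT_SCRIPT.get(language) != script:
        parts.append(script)
    if region:
        parts.append(region.upper())
    return "-".join(parts)


print(make_language_tag("nod", "Lana", "TH"))
print(make_language_tag("nod", "Thai", "TH"))
```

Under these assumptions, Tai Tham text in Northern Thai is tagged with the short form, while the same language written in Thai script carries an explicit script subtag.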

<snip> arbitrary "Western"/CTL/CJK classification <snip>

The correct route to go is probably to
assign known scripts to these classes, whether detected automatically
or not,

Which is already being done, though conceivably going directly from
character to class.

and distribute language tags according to their (implied or
not) script over those classes.
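The "directly from character to class" route mentioned above could be sketched roughly as follows, in Python for illustration. The names CLASS_RANGES and char_class are hypothetical, the table covers only a handful of Unicode blocks, and falling back to "Western" for everything else is an assumption of this sketch, not a description of what VCL actually does.

```python
# Hypothetical and deliberately incomplete table of
# (first code point, last code point, script class).
CLASS_RANGES = [
    (0x0590, 0x05FF, "CTL"),  # Hebrew block
    (0x0600, 0x06FF, "CTL"),  # Arabic block
    (0x0E00, 0x0E7F, "CTL"),  # Thai block
    (0x1A20, 0x1AAF, "CTL"),  # Tai Tham block
    (0x3040, 0x30FF, "CJK"),  # Hiragana and Katakana blocks
    (0x4E00, 0x9FFF, "CJK"),  # CJK Unified Ideographs block
]


def char_class(ch):
    """Return the script class for a single character.

    Anything not listed in CLASS_RANGES falls back to "Western",
    which is an assumption made to keep the sketch short.
    """
    cp = ord(ch)
    for first, last, script_class in CLASS_RANGES:
        if first <= cp <= last:
            return script_class
    return "Western"


print(char_class("\u1A20"))  # a Tai Tham letter
```

A real table would need to be keyed on the Unicode Script property and to decide what happens to common-script characters such as digits and punctuation.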

I'm not sure I follow you here.  A supported language tag will have
corresponding strings for automatically generated text, and these
strings will generally imply the font.  The only exception I can think
of is common script text, where perhaps script information will be
required to select the styling.  This just requires a default script
for each supported language code (i.e. minimal BCP 47 tag), though we
could get away with default script class.
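To make the default-script idea concrete, here is a rough Python sketch that resolves a tag to a script class: an explicit script subtag wins, otherwise the language's default script is used. DEFAULT_SCRIPT, SCRIPT_CLASS and script_class_for_tag are hypothetical names, both tables are illustrative assumptions, and the subtag parsing handles only simple tags, not the full BCP 47 grammar.

```python
# Illustrative tables only (assumptions, not LibreOffice data).
DEFAULT_SCRIPT = {"en": "Latn", "th": "Thai", "nod": "Lana", "zh": "Hans"}
SCRIPT_CLASS = {
    "Latn": "Western", "Cyrl": "Western", "Grek": "Western",
    "Thai": "CTL", "Lana": "CTL", "Arab": "CTL", "Hebr": "CTL",
    "Hans": "CJK", "Hant": "CJK",
}


def script_class_for_tag(tag):
    """Return the script class for a language tag, or None if unknown.

    Simplified parsing: the first four-letter alphabetic subtag after
    the language, found before any singleton, is taken as the script.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    for subtag in subtags[1:]:
        if len(subtag) == 1:  # singleton: extensions or private use follow
            break
        if len(subtag) == 4 and subtag.isalpha():
            script = subtag.title()
            break
    if script is None:
        # No explicit script: fall back to the language's default.
        script = DEFAULT_SCRIPT.get(language)
    return SCRIPT_CLASS.get(script)


print(script_class_for_tag("nod-TH"))
print(script_class_for_tag("nod-Thai-TH"))
```

Getting away with a default script class alone would amount to replacing the two tables with a single language-to-class table.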

Richard.
