Hi,

So, having read this again twice I'm still trying to understand, but it
looks somewhat clearer now.

On Friday, 2019-07-12 16:51:48 +0400, ԻԴ | Սամվել Հարությունյան wrote:

Following is our mission:

We are building a GNU/Linux based OS with full support of all Armenian
literary (Գրական) language versions except middle-Armenian.

Namely:
1) Grabar (ISO 639-3 code "xcl")
2) Western Armenian (ISO 639-3 code "hyw")
3) and Eastern Armenian (ISO 639-3 code "hye")

And including both orthographies for Armenian Eastern variant:
3*) Eastern classic
3**) Eastern modern

Note that unfortunately ISO 639-3 does not distinguish between the 2,

It is not the role of ISO 639-3 to distinguish between orthographies. ISO
639 defines codes for *languages*. Distinguishing orthographies is the
job of BCP 47 "Tags for Identifying Languages", see
http://langtag.net/ and https://tools.ietf.org/html/rfc5646

I already mentioned in an earlier mail that the proper way would be to
request language tag subtags for the different orthographies from IANA,
and possibly let one orthography be the default so it's redundant to
specify it in a language tag.

therefore we have decided to use ISO 3166 codes for differentiation, namely:
a) RU - definitely meaning hye(modern), because Armenian diaspora in RF is
mainly educated with hye-modern
b) IR - definitely meaning hye(classic), because Armenian diaspora of Iran
is mainly educated with hye-classic
c) AM - here we are in a dilemma. Official laws of Armenia do not single out
any of the 1,2,3,3*,3** as the official language of the republic. We have
double-checked this with the State Language Committee of RA in writing and
have a written response.

If there is no official view on this then for hy-AM use the most widely
used orthography written by the majority of the people speaking Armenian
in Armenia. I think this is what Seda is and was translating.


Here a little more technical details are due to understand why it is a
problem and what solution has been chosen for it:

As you probably know for LOCALE setting one typically specifies 2 codes - a
language code (typically ISO-639) and a country code (typically ISO 3166).

Language and country code are only the most basic form of a locale tag,
though. Proper language tags know more subtags and ways to combine them,
for example ca-ES-valencia, de-DE-1901, en-GB-oxendict

For such differences the GNU glibc came up with the locale @modifier for
a very few variations, so there's for example ca_ES@valencia and
sr_RS@latin, though these are only a kludge around the underlying problem
that the locale mechanisms don't understand proper language tags
(ca-ES-valencia and sr-Latn-RS). However, LibreOffice does understand
them, and *iff* proper subtags are registered they can be used, i.e. the
ca-valencia and sr-Latn translations do exist.
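
To make the relation between the glibc @modifier kludge and proper
language tags concrete, here is a minimal sketch in Python. The mapping
table and the function name glibc_to_bcp47 are hypothetical and cover only
the two modifiers named above; this is not how glibc or LibreOffice
implement it. Note that BCP 47 (RFC 5646) orders subtags as language,
script, region, variant, so the variant lands after the region.

```python
# Hypothetical table: which BCP 47 subtag each glibc @modifier stands for.
# Only the two modifiers mentioned above are covered.
MODIFIER_TO_SUBTAG = {
    "valencia": ("variant", "valencia"),
    "latin": ("script", "Latn"),
}


def glibc_to_bcp47(locale_name):
    """Convert a glibc-style name like 'sr_RS@latin' to a BCP 47 tag."""
    base, _, modifier = locale_name.partition("@")
    base = base.split(".", 1)[0]  # drop a codeset such as '.UTF-8'
    language, _, region = base.partition("_")

    script = variant = ""
    if modifier:
        if modifier not in MODIFIER_TO_SUBTAG:
            raise ValueError(f"unknown modifier: {modifier}")
        kind, subtag = MODIFIER_TO_SUBTAG[modifier]
        if kind == "script":
            script = subtag
        else:
            variant = subtag

    # BCP 47 order: language, script, region, variant.
    return "-".join(p for p in (language, script, region, variant) if p)


print(glibc_to_bcp47("ca_ES@valencia"))  # ca-ES-valencia
print(glibc_to_bcp47("sr_RS@latin"))     # sr-Latn-RS
```

An orthography variant registered with IANA would slot into the same
position as "valencia" does here.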

So in our GNU/Linux based Operating System, one who resides in RF would
definitely have RU in their LOCALE, and one in Iran would definitely have IR.
And since we care about the diasporas, we do want to support them, and this
is tangential to the problem mentioned in (c) above.

But for somebody who lives in Armenia, we can unambiguously set proper
locales only for hyw_AM and xcl_AM, while for hye_AM (and one could argue
that also for hy_AM) we have lack of clarity in legislation. De-facto the
population of Armenia would probably vote for hye_AM meaning
armenian-eastern with modern orthography.

Which is what I was assuming and postulating.

While a small team of researchers,
including us, would prefer it to mean armenian-eastern with classic
orthography.

That apparently only a minority of Armenians in Armenia speak and write.

Finally, as mentioned in the statement that you have referred to,
the head of the Language Committee told us that currently they are urged by
Min-Diaspora not to give any definitive answer to that, but only promise to
give it after coming to consensus about "unified armenian orthography".

Bottom line, our locale enumeration is as follows:
1) xcl_RU, xcl_IR, xcl_AM - Classic Armenian

Btw, it was pointed out to me that it is Classical Armenian, not
Classic; ISO 639 and the IANA language tag registry say so as well.
Already changed in master.

2) hyw_RU, hyw_IR, hyw_AM - Western Armenian
3) hye_IR - Eastern Armenian with Classic Orthography
4) hye_RU - Eastern Armenian with Modern (aka soviet) Orthography
5) hye_AM - our personal choice for the GNU/Linux OS we are building -
Eastern Armenian with Classic Orthography

This IMHO is wrong. See above about "most widely used".

6) hy_AM - our choice is hye_AM, although one could go as far as arguing
that this should be xcl_AM (e.g. Alexander Qananyan - who is advising us in
this activity)

There is no difference between hy-AM and hye-AM, it is only ISO 639-1 vs
639-3, 'hy' and 'hye' are equivalent, see
https://iso639-3.sil.org/code/hye

As LibreOffice prefers the ISO 639-1 code if available (because doing so
eased interoperability with software not understanding ISO 639-3 codes)
the current translation to Eastern Armenian uses 'hy' instead of 'hye'.
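
As a small illustration of why software sees no difference between hy-AM
and hye-AM, here is a sketch in Python that normalizes the language subtag
to its ISO 639-1 form where one exists. The table is a tiny hand-written
excerpt and the function names are hypothetical; only the hye/hy pair
comes from this discussion, and this is not LibreOffice's actual code.

```python
# Excerpt of ISO 639-3 to ISO 639-1 equivalences. Only hye -> hy is from
# the discussion; por and deu are added for illustration.
ISO_639_3_TO_1 = {"hye": "hy", "por": "pt", "deu": "de"}


def normalize_locale(name):
    """Return the tag with '-' separators and the shortest language code."""
    subtags = name.replace("_", "-").split("-")
    language = subtags[0].lower()
    # Codes without a two-letter equivalent (xcl, hyw) are kept unchanged.
    subtags[0] = ISO_639_3_TO_1.get(language, language)
    return "-".join(subtags)


def same_locale(a, b):
    """True if both names denote the same locale after normalization."""
    return normalize_locale(a) == normalize_locale(b)


print(normalize_locale("hye_AM"))       # hy-AM
print(same_locale("hy-AM", "hye_AM"))   # True
print(same_locale("hy-AM", "xcl-AM"))   # False
```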

'xcl' on the other hand is Classical Armenian, a historical language,
see https://iso639-3.sil.org/code/xcl and not a choice for a living
language translation nor would an xcl-AM locale be (unless that is to be
used with the Classical Armenian of course).


So, to recap, if I understood correctly:
* the current LibreOffice 'hy' translation (assigned locale hy-AM) uses
  the modern orthography
* for some GNU/Linux distribution you are translating modern orthography
  using the hy-RU locale, and classic orthography using the hy-AM locale
* you want LibreOffice to do the same, even though it already uses hy-AM
  for the modern orthography as the more widely used one


Our team needs to put in translations of Armenian Eastern Classic, Armenian
Eastern Modern/Soviet, and Armenian Classic. The other variants will come
later in our roadmap. For these 3 we are using the following codes:

1) Armenian Classic: xcl (any country code)
2) Armenian Eastern Classic: hye_IR, hy_IR, and unless you have strong
objections, also hy_AM and hye_AM
3) Armenian Eastern Modern: hye_RU, hy_RU (we believe that your main
argument is that this should also be denoted by hy_AM and hye_AM?)

Please note that due to orthographic ambiguity between 2/3 the "hy" and
"hye" alone cannot denote a locale (PO file) unambiguously, so country code
needs to be specified.

This is not true. There can be a 'hy' translation as the "default" one
for 'hy-AM' and a 'hy-RU' or 'hy-IR' translation to distinguish, similar
to 'pt' and 'pt-BR' (Portuguese and Portuguese Brazilian).
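
The 'hy' as default plus 'hy-RU' / 'hy-IR' arrangement can be sketched as
a simple truncation fallback, loosely in the spirit of the lookup scheme
of RFC 4647. The set of available translations below is hypothetical, and
the sketch only illustrates the idea; it is not LibreOffice's actual
resolution logic.

```python
def fallback_chain(tag):
    """List the tag and its progressively shorter prefixes."""
    subtags = tag.replace("_", "-").split("-")
    return ["-".join(subtags[:i]) for i in range(len(subtags), 0, -1)]


def pick_translation(requested, available):
    """Return the most specific available translation, or None."""
    for candidate in fallback_chain(requested):
        if candidate in available:
            return candidate
    return None


# Hypothetical set of shipped translations.
available = {"hy", "hy-RU", "hy-IR", "pt", "pt-BR"}

print(pick_translation("hy-AM", available))  # hy
print(pick_translation("hy-RU", available))  # hy-RU
print(pick_translation("pt-BR", available))  # pt-BR
print(pick_translation("pt-PT", available))  # pt
```

With this scheme a user in Armenia gets the default 'hy' translation
without a country-specific file having to exist.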

If you agree with this clarification of the subject of our discussion, then,
please, let's complete it asap and come up with solution that is not only
respecting our own desires, but legislation of Armenia and also future
generation rights.

I repeat: register orthography variant subtags with IANA. That is the
only clean solution.

Specifically - we cannot claim that there is a legal
justification for hy_AM/hye_AM = hy_RU/hye_RU (i.e. denote modern
orthography), nor is there a legal justification for it to denote classic
orthography (hy_IR/hye_IR).

But it is fair to assume, and common sense, that the most widely used
orthography of Armenian as spoken in Armenia should be assigned the
locale of hy-AM.

More precisely/pedantically, as mentioned in
(6) above, there is not even a legal justification to assume that hy = hye.

There is ISO 639 that says so. It doesn't need any further legal
justification.

So
we have following open questions:

1) hy = hye or rather leave it ambiguous and let each Operating System
Vendor make their own choice, and even let User decide?

To me: modern, as the default orthography.

2) hye_AM = classic or modern variant of Eastern orthography ?

To me: modern, same as hy-AM, no difference.

Our preferred answers are:
1) hy = xcl

Definitely wrong. See ISO 639.

2) hye_AM = classic orthography of Eastern Armenian

I doubt it.

Are there any numbers on usage of classic vs modern orthography?

But we realize that our customers will force us to use:
1) hy = hye
2) hy_AM = soviet orthography
3) they would not care about hye_AM or anything else.

Nor does software. hy=hye.

We strongly believe that "politically correct" way of solving the situation
is:
1) ask our customers to set OS LOCALE to hy_RU if they want to use Eastern
Armenian with modern orthography

I think that's wrong, *and* politically incorrect if you expect
Armenians to do so for the most widely used orthography.

2) ask our customers to set OS LOCALE to hy_IR if they want to use Eastern
Armenian with classic orthography
3) temporarily use hy/hye_AM for Eastern Armenian Classic until our
MinEdu/Language Committee will come up with a clarification

I don't see any justification to use hy-AM for Eastern Armenian Classic.
Specifically anything temporary will just lead to confusion, even more
if 'hy' and hy-AM are already used with modern orthography. Also note
that document content already tagged with hy-AM expects the context of
the so far used modern orthography, for example spell-checking and
locale data dependent formatting. Changing that to a different
orthography and even worse, possibly changing it back after some
authority may have made up its mind, will yield unexpected results and
at the end just frustrate the user.

From your argumentation, it seems that you would prefer to use Eastern
Armenian Modern for (3). Can we say that this is the main argument we have
to discuss now, or is there any other point that needs to be revisited as
well?

As said, I think the 'hy' translation and thus default hy-AM locale
assigned should be for the most widely used orthography.

Meanwhile, for some possible solution if you don't like to use hy-IR for
Eastern Armenian Classic and if it is for UI translation only, not
document content attribution, Eastern Armenian Classic could use
a private use subtag, for example hy-x-clssic, see
https://tools.ietf.org/html/bcp47#section-2.2.7

When it comes to document content things get hairy, because as the
terminology indicates, private use tags are private; they are not
suited for information exchange, as parties have to agree on them and
a third party might come up with the same tag and a different meaning or
just not interpret it in the privately agreed manner. See also
https://tools.ietf.org/html/bcp47#section-4.6
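
For illustration, a minimal Python sketch of how a tag such as
hy-x-clssic splits into its public part and its private use part. It only
checks the private use syntax from BCP 47 (the singleton "x" followed by
one or more subtags of 1 to 8 alphanumeric characters); the function name
is hypothetical and the rest of the tag is not validated.

```python
import re

# Private use sequence per BCP 47: "x" then 1..n subtags of 1-8 alphanumerics.
PRIVATE_USE = re.compile(r"x(-[A-Za-z0-9]{1,8})+", re.IGNORECASE)


def split_private_use(tag):
    """Return (public_part, private_use_part_or_None) for a language tag."""
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "x" not in lowered:
        return tag, None
    start = lowered.index("x")
    private = "-".join(subtags[start:])
    if not PRIVATE_USE.fullmatch(private):
        raise ValueError(f"malformed private use sequence: {private}")
    return "-".join(subtags[:start]), private


print(split_private_use("hy-x-clssic"))  # ('hy', 'x-clssic')
print(split_private_use("hy-AM"))        # ('hy-AM', None)
```

Software that does not know the private agreement can at best fall back
to the public part ('hy' here), which is exactly why such tags are
unsuited for document exchange.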

Hence, again, the best solution is to register an orthography variant subtag
with IANA, see http://langtag.net/register-new-subtag.html

  Eike

-- 
GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918  630B 6A6C D5B7 6563 2D3A
