Hi Richard,

On Thursday, 2019-04-18 20:40:01 +0100, Richard Wordingham wrote:

> On Thu, 18 Apr 2019 12:25:11 +0200 Eike Rathke <erack@redhat.com> wrote:
> > What I usually did is look up the language at SIL and the Ethnologue
> > and use the most prevalent script as the implied default script. Which
> > here https://www.ethnologue.com/language/san would lead to Devanagari,
> > but in this case more important is also what MS assigned the LCID for.
> So I shouldn't be misled by the fact that the CTL script I most
> frequently write Sanskrit in is Thai -:) Seriously, though, I believe
> the script of sa-TH is Thai rather than Devanagari, and I am quite sure
> that the script of sa-MM is Mymr.
Your expertise is welcome! If the IANA language tag registry doesn't indicate a Suppress-Script field for a specific language, then nowadays it is indeed better practice to explicitly state the script tag for languages that otherwise could be ambiguous. So that would be sa-Thai-TH and sa-Mymr-MM. Deducing the script from the language-country combo is deprecated, but for backwards compatibility and compatibility with MS it cannot be avoided for existing tags.
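To make the rule concrete, here is a minimal Python sketch of composing such a tag: the script subtag is kept unless it equals the language's Suppress-Script. The function name `make_tag` and the `SUPPRESS_SCRIPT` table are hypothetical; the table is a tiny hand-written stand-in for the registry's Suppress-Script fields, not the registry itself, and nothing here reflects how any particular application actually does it.

```python
# Illustrative stand-in for the Suppress-Script fields of the IANA language
# subtag registry (assumption: only two entries, typed in by hand).
# "sa" is deliberately absent, as it has no Suppress-Script.
SUPPRESS_SCRIPT = {"en": "Latn", "th": "Thai"}


def make_tag(language, script=None, region=None):
    """Compose a language[-Script][-REGION] tag.

    The script subtag is dropped only when it matches the language's
    Suppress-Script entry; otherwise it is stated explicitly.
    """
    lang = language.lower()
    parts = [lang]
    if script:
        script = script.title()  # script subtags are written like "Thai"
        if SUPPRESS_SCRIPT.get(lang) != script:
            parts.append(script)
    if region:
        parts.append(region.upper())  # region subtags are written like "TH"
    return "-".join(parts)


print(make_tag("sa", "Thai", "TH"))  # sa-Thai-TH
print(make_tag("sa", "Mymr", "MM"))  # sa-Mymr-MM
print(make_tag("sa", "Latn"))        # sa-Latn
print(make_tag("th", "Thai", "TH"))  # th-TH (script suppressed)
```

Note that this only covers new tags; it does not model the legacy case where the script has to be deduced from an existing language-country tag.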
> It sounds as though one has to specify the script where there is doubt
> as to what type of script will dominate. Is it an issue if there are
> two competing scripts of the same type, e.g. Thai v. Lanna for Northern
> Thai? A dual script dictionary would correct inefficiently.
Competing in the sense of two different scripts under one language tag? I wouldn't do that, and IMHO it would be wrong.
> > Though with sa-Latn I doubt there's a use case, so I wouldn't call
> > that "correct" in common sense.
> So how do you suggest we tag Sanskrit in Latin script? Within English
> works, it's not uncommon for any Sanskrit quoted precisely to be in the
> Latin script; about half the English language articles in the
> 'International Journal of Sanskrit Research'
> (http://www.anantaajournal.com/) that quote Sanskrit passages quote
> them in the Latin script. Several papers would benefit from the
> application of sa-Latn proofing tools, though I don't deny that
> proofing Sanskrit may be difficult.
I wasn't aware that there is indeed Sanskrit transcribed to Latin ... so then, sa-Latn might make sense.

  Eike

-- 
GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A