Workflow between dev, UX and l10n teams

I really don't see a revision of all existing strings as a
requirement to start reviewing newly added ones.

At some point in time that review has to be done. To minimize the
overall workload, it is easier, and simpler, to do it before reviewing
newly added strings, than afterwards. (For starters, doing it afterwards
means having to review those newly added strings at least twice, and
maybe thrice.)

You're right, but if the choice is between no reviews at all and reviews of new strings only, which would you choose? I'd go for the latter.

Especially if you are a native English speaker and have a style guide
at hand.

Just as the English language has never met a word that it has not
adopted as its own, so it has never met a grammatical construct that it
has not adapted and mutilated. One direct consequence of both those
facets of acquisition is that it is incredibly difficult to write a
sentence in English that is grammatically incoherent, but extremely easy
to write a sentence that grammatically means the opposite of what was
intended.

IOW, that certain string might be "clear, concise and
grammatically, syntactically and typographically correct", but mean
something other than intended, because the vocabulary is usually used to
mean something else elsewhere.

Let's add "semantically correct" or "contextually correct" to my list of requirements then. :slight_smile:

BTW, when you say "style guide", which specific one do you mean?

The one you're looking for, assuming it exists. If not, it could be a combination of Gnome HIG and any American English style guide we (the LibO community) would deem acceptable and meeting our needs (e.g. The Chicago Manual of Style).

I still have the one for French but it's copyrighted Sun anyway (from
2006). Gnome HIG contains a lot of information that we can use easily.

Cheers
Sophie

Hi again,

BTW, when you say "style guide", which specific one do you mean?

The one you're looking for, assuming it exists. If not, it could be a
combination of Gnome HIG and any American English style guide we (the
LibO community) would deem acceptable and meeting our needs (e.g. The
Chicago Manual of Style).

In fact, I just thought that it doesn't even have to be a formal manual: if somebody would be willing to oversee style consistency in our strings, and that style would look acceptable to our en-US users, then why not? Especially if that person would be willing to formalize these rules into a written style manual along the way.

--
Rimas

This document may be of interest:
https://obriend.fedorapeople.org/WritingStyleGuide/
It's only recently (last year) been open sourced and made public.

I am not sure I understand you here (to me, the "otherwise" part reads: "if there is no way to script changes, wait until there is a script available," which would not make sense).

When talking about (developer-side) scripting, is it actually OK to commit modifications to the translations in the translations git sub-repo? My understanding was that such modifications would be overwritten by the next "import commit" (as typically done by Andras, AFAIU from some Pootle database).

- if there is a way to script changes, script them otherwise wait until
there is a script available to commit them

I am not sure I understand you here (to me, the "otherwise" part reads:
"if there is no way to script changes, wait until there is a script
available," which would not make sense).

Sorry for being unclear. I mean that if scripting is possible but the
committer doesn't know how to write the script, or has no time for it,
wait until somebody else writes it. Of course, if it's not scriptable,
then there's no need to wait for a script :slight_smile:

When talking about (developer-side) scripting, is it actually OK to
commit modifications to the translations in the translations git
sub-repo? My understanding was that such modifications would be
overwritten by the next "import commit" (as typically done by Andras,
AFAIU from some Pootle database).

I don't know, maybe Andras or Rimas will.

Cheers
Sophie

Am I right in reading into this, that master is using American English?
And if so, why? Seeing as LibreOffice is, at heart, a European program
surely it should be using English?

Like British English? RP? Let's be specific here...

I mean, English like she is spoke in ENGLAND. There's no such language
as British English, true English is the majority language for the
*southern* end of Britain, and many places elsewhere because the English
have migrated, but it really grates when furriners (I guess you're not
British, I can't tell) assume that everybody here speaks the same
language. And if you're going to incorrectly call it "British English",
then actually "English English" would be far more accurate.

Yes, language is a complex matter, but as I like to put it, "In Britain
the Saxons speak English, the Angles speak Scots, and the Scots speak
Gaelic". It's a mess, agreed, but that's no reason to make the mess even
bigger by using completely bogus descriptions.

(And by sounding off like this, I've validated your comments further
down :slight_smile:

What proportion of developers are native American speakers?

Not that many. It would be great to see more involvement from the US,
but I think that promoting the "this is a European Project" attitude
can really hurt those numbers. LibreOffice is a global project: No one
country or continent should try to claim it for itself.

No I'm not trying to claim that at all. I'm simply saying that the
cultural norms of the project are not American, and we're quite happy to
make them welcome, but I don't want to make life hard for the majority,
just to appease a minority who may - OR MAY NOT - decide to join the party.

Bear in mind that most English variants use English spelling, not
American spelling.

I think the phrase "American English spelling" is clearer -- there are
lots of languages spoken here across the pond, so the phrase "American
spelling" is ambiguous.

At the end of the day, not enforcing en_us as a translation means that
the majority of us (including those of us that speak English rather than
American as our native language) are forced to suffer pain as the
foundations are messed up underneath us.

Whoa there, cowboy! (or whatever the British equivalent is) I think
that British, American, Canadian, etc. English are all pretty
similar, so while I agree that we might have our little differences
about an extra 'u' in color, or whether the big vehicle that picks up
the trash is a Lorry or a Truck, it's not a big deal compared to the
diff between the Englishes and French or Spanish.

I'd be careful here !!! As a European, I really think the official
language should be Spanglianese! (That is, Spanish/Portuguese/Italian).
Those three are effectively modern Latin, are with some difficulties
mutually comprehensible, and are the MAJORITY FIRST language. What would
that do to encourage the Americans in? :slight_smile:

(And I say that as a Brit. Most of my countrymen would be absolutely
horrified by the idea :slight_smile:

And by allowing that *minority*
to avoid suffering, they are enabled to cause unnecessary pain without
even realising what they are doing!

*facepalm*

I know that you're just getting some stuff off your chest, and sure, I
get it: languages can be tough. So we have a couple beers, find
the veritas in the vino, and start speaking French (wait, maybe that's
just what I do). More seriously, I'm trying to get people interested
in LibreOffice in the US, and it's really important for us to make the
project welcoming to users and new contributors.

You want to propose some changes? Sure, great plan. But please check
that your method of delivery doesn't paint the Americans as the
outsiders and buffoons of your diatribe, because the reality is that
we really don't have much going on in the US yet, and there's already
a hesitancy to interact with what is perceived as aloof Europeans. I
think that growth in the US has the potential to give a ton back to
the LibreOffice community in Development, Documentation, QA, and so
forth, but we need to go the extra mile there, not tell people that,
before they've opened a single spreadsheet or triaged a single bug,
they are somehow (?) "causing pain."

Look at it from the other side. You're telling us Europeans that we have
to do it the American way. It comes over far too much that the Americans
think their way is the only way and it really upsets a lot of people.
That's why so many cultures fear and hate America - they see it as a
direct threat to them. Certainly I see America as a serious threat to my
British way of life ...

I've got no problem whatsoever with welcoming Americans. What I do have
a problem with is them demanding that I become "more American", because
not only do I not want to, I absolutely positively hate the idea!

The rule should be simple. Any changes of meaning can be edited directly
in master. If it's non-native English, and so poor that it's hard to
comprehend, then it can be corrected in master. If it's clear
comprehensible English, whether English or Strine or American or
International or whatever, then it's off-limits for changes to master,
and has to be done in Pootle or whatever as a localisation.

I like the general idea, but I am concerned about the feasibility. Notes:

1) will inconsistency of nouns (e.g. color vs. colour), inconsistency
of grammar, etc. within the sources in master make translation harder
for the native-lang teams?

Why should it? English is a polyglot language. A good speaker has to be
able to understand Strine, Texas, Scouse, New Yorker, Geordie, Canuck
etc etc. And decent written English (of whichever variant) is unlikely
to be a problem. Now if you want to throw in Pidgin that's a different
matter !! :slight_smile:

(And as a Brit, you're expecting me to understand American. What's that
saying? "Sauce for the goose is sauce for the Gander"? If you want me to
understand American, I'll demand you speak Scouse!!! :slight_smile:

2) What will the language be for builds w/o langpacks? Just a generic
'English'? (maybe we can call it "LibreOffice English" :slight_smile:

As someone else said, maybe we shouldn't have one! Or the default build
just includes en_us as standard.

3) Who's going to step up to maintain en_US? (I'd love to help, but
I'm working tons of hours as it is)

The same people who are causing all the grief for everybody else by
currently translating/changing all the strings in master? Surely there's
no difference to the amount of work in translating en_us, as there is to
translating master?

Cheers,
--R

Cheers,
Wol

Hi Stephan,

2015.01.28 11:20, Stephan Bergmann wrote:

When talking about (developer-side) scripting, is it actually OK to
commit modifications to the translations in the translations git
sub-repo? My understanding was that such modifications would be
overwritten by the next "import commit" (as typically done by Andras,
AFAIU from some Pootle database).

The process as I see it would be somewhat like the following: when a
big enough scriptable string change is coming up, it should be announced
at least a few days in advance, stating that this change will land on
day X at time Y. Localizers should be allowed to choose whether
or not they want that particular big automatic change transparently
ported to their locales, with a sane default for those who don't voice
their choice by deadline (I suppose that usually the default should be
to perform the change for them as well).

On day X time Y, we close down the affected Pootle project, push its
localizations into git, then somebody who's in charge checks them out of
git and runs the script. When the script finishes its work, the
resulting files are committed back to git, imported back to Pootle and
the project is re-opened for translation. Once that is done, an
announcement should be sent to the L10n list with huge thanks for
everyone's patience and kudos to everybody involved in the process. And
we all live happily ever after. :slight_smile:

Possible risks that I came up with:
* The process of exporting files from Pootle or importing them back
might take hours
    - I don't expect localizers to be too angry about that, because they
can do other stuff meanwhile (such as enjoy their Real Life or work on
other projects).
* The scripts might be buggy
    - Obviously, each time they should be tested in advance with at
least some real data and their result should be validated before committing.
* Some localizers might be too late to the train with their changes
    - Well, first of all, they shouldn't be, but then we could also
have some buffer timeframe to cope with this issue.

On the bright side, I hope the massive changes we are talking about here
will be less and less frequent, making this issue less and less
relevant. From my understanding, we've already replaced three dots with
ellipses and straight quotes with curly ones, so there shouldn't be much
more typography to improve on, should there? :slight_smile:

Rimas

On day X time Y, we close down the affected Pootle project, push its
localizations into git, then somebody who's in charge checks them out of
git and runs the script. When the script finishes its work, the
resulting files are committed back to git, imported back to Pootle and
the project is re-opened for translation. Once that is done, an
announcement should be sent to the L10n list with huge thanks for
everyone's patience and kudos to everybody involved in the process. And
we all live happily ever after. :slight_smile:

[...]

* The scripts might be buggy
     - Obviously, each time they should be tested in advance with at
least some real data and their result should be validated before committing.

Yeah, it's vital that the script will be run in a way that can be tried locally and in advance by any script-writing developer (i.e., that the script will ultimately be run on the git data, not on some pootle data).

On the bright side, I hope the massive changes we are talking about here
will be less and less frequent, making this issue less and less
relevant. From my understanding, we've already replaced three dots with
ellipses and straight quotes with curly ones, so there shouldn't be much
more typography to improve on, should there? :slight_smile:

Typographic changes should rarely be scriptable across all locales anyway, I guess. (Even the ellipsis change might not have been, given there's not only U+2026 HORIZONTAL ELLIPSIS but at least also U+0EAF LAO ELLIPSIS and U+1801 MONGOLIAN ELLIPSIS.)
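To illustrate why a blanket "..." replacement is risky, here is a minimal Python sketch of a per-locale lookup. Only the three code points come from the message above; the locale codes attached to them ("lo", "mn") and the function names are assumptions made for illustration, and whether a given team actually wants these characters is for that team to decide.

```python
import unicodedata

# Hypothetical per-locale table: which character a "..." should become.
# The code points are the ones named in the thread; the locale codes
# mapped to them are an assumption, not project data.
ELLIPSIS_BY_LOCALE = {
    "lo": "\u0eaf",  # LAO ELLIPSIS
    "mn": "\u1801",  # MONGOLIAN ELLIPSIS
}
DEFAULT_ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS


def ellipsis_for(locale: str) -> str:
    """Return the ellipsis character assumed for this locale."""
    return ELLIPSIS_BY_LOCALE.get(locale, DEFAULT_ELLIPSIS)


def replace_dots(text: str, locale: str) -> str:
    """Replace three ASCII dots with the locale's ellipsis character."""
    return text.replace("...", ellipsis_for(locale))


if __name__ == "__main__":
    # Print the code point and official Unicode name chosen per locale.
    for loc in ("de", "lo", "mn"):
        ch = ellipsis_for(loc)
        print(loc, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

A script that hard-codes U+2026 for every locale would skip this lookup entirely, which is the case the message warns about.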

It has always been as clear as mud to me what the *current*
translation workflow is and how as a developer I can fix a translation.

e.g. in the past changing the "Letter" size translation in Spanish to
"Oficio" instead of the literal translation was a pain. And right now I
want to fix a gadzillion Indic translations short-cuts to ascii chars
and not characters that are only available via IM. What I want to do is to
commit to the translations git repo and forget about it. Does that
work ?

C.

Hi Caolán, *,

e.g. in the past changing the "Letter" size translation in Spanish to
"Oficio" instead of the literal translation was a pain. And right now I
want to fix a gadzillion Indic translations short-cuts to ascii chars
and not characters that are only available via IM. What I want to do is to
commit to the translations git repo and forget about it. Does that
work ?

Nope, as on the next export from pootle those changes will be overridden again.

The changes need to be done in pootle. (not necessarily via web-UI,
that would be tedious, but the change needs to be reflected in pootle
to "stick").

But as some languages use offline translation, even doing the change
in pootle might be undone when teams just upload their local copy
again without bothering to check for updates that were done in pootle.

ciao
Christian

Hi :slight_smile:
That is an extremely good question.

Even if this thread doesn't result in anything else i hope that more
people in the LibreOffice community DO ask the L10n team for their
thoughts on issues like either of the 2 main questions Caolan is
asking. I think even just that would be a huge step forwards and it
is very much appreciated. :)) Thanks Caolan! :slight_smile:

Sorry i don't have an answer to the actual questions though! :frowning:
Apols and regards from
Tom :slight_smile:

So IMO there's the root of the problem. Let's say a developer does want
to make LibreOffice comply with the GNOME HIG and stick full colons at
the end of every label that is a label for another widget, and is willing
to fix all the translations at the same time: he can't really do it.
So there really needs to be some way for developers to mass change
translations themselves, and ideally a trivial way.

I think we should not worry too much about the offline "upload and
overwrite" style case. There's no helping that :slight_smile:

C.

2015.01.29 16:24, Christian Lohmaier wrote:

Hi Caolán, *,

e.g. in the past changing the "Letter" size translation in Spanish to
"Oficio" instead of the literal translation was a pain. And right now I
want to fix a gadzillion Indic translations short-cuts to ascii chars
and not characters that are only available via IM. What I want to do is to
commit to the translations git repo and forget about it. Does that
work ?

Nope, as on the next export from pootle those changes will be overridden again.

The changes need to be done in pootle. (not necessarily via web-UI,
that would be tedious, but the change needs to be reflected in pootle
to "stick").

Well if we can't import stuff from git into Pootle, then it's quite a
broken process we have in place...

Until now, I believed that our l10n data can travel both ways, which
would mean that the state of affairs in Pootle can be exported into git,
altered, and imported back into Pootle without losing anything.

In particular, since we're working on .po files, I see it like this:
1. reference strings are changed in en-US templates
2. the state of affairs is exported from Pootle into git
3. the translations are checked out and the changes made in step 1 are
reflected in msgid and, if necessary, in msgstr strings in the checked
out translation files, unless a particular l10n team opts out of this
automation.
4. changed files (more likely, a subset of them) are verified to be
correct and are committed back into git
5. these changed files are imported back from git into Pootle. When that
is done, the project is reopened for further translation and additional
validation of the result.
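Steps 3 and 4 above could be sketched roughly as follows. This is a minimal, hedged Python illustration that works on .po text held in memory; `transform_po`, `run_change` and the `opted_out` set are hypothetical names, not part of any existing LibreOffice or Pootle tooling, and a real run would still need the export from and import into Pootle around it.

```python
from typing import Callable, Dict, Set


def transform_po(text: str, change: Callable[[str], str]) -> str:
    """Apply `change` to msgid/msgstr lines (and their continuation lines).

    Comment lines and other fields such as msgctxt are copied unchanged.
    """
    out = []
    field = ""  # keyword of the entry field we are currently inside
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)
            continue
        if not stripped.startswith('"'):
            # A new field starts here, e.g. msgctxt, msgid, msgstr.
            field = stripped.split(None, 1)[0]
        if field.startswith(("msgid", "msgstr")):
            out.append(change(line))
        else:
            out.append(line)
    return "".join(out)


def run_change(files: Dict[str, str], opted_out: Set[str],
               change: Callable[[str], str]) -> Dict[str, str]:
    """Map locale -> .po text; locales that opted out are left untouched."""
    return {
        locale: text if locale in opted_out else transform_po(text, change)
        for locale, text in files.items()
    }


if __name__ == "__main__":
    sample = 'msgctxt "id...x"\nmsgid "Open..."\nmsgstr "Atidaryti..."\n'
    dots_to_ellipsis = lambda s: s.replace("...", "\u2026")
    result = run_change({"lt": sample, "fr": sample}, {"fr"}, dots_to_ellipsis)
    print(result["lt"])
```

Because the functions only take and return text, any developer can try the change locally on a git checkout before the announced day X.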

But as some languages use offline translation, even doing the change
in pootle might be undone when teams just upload their local copy
again without bothering to check for updates that were done in pootle.

That's why communicating such changes well in advance is necessary – so
that those who work offline can plan for that. In general, it is
possible that they will overwrite something, but in that case, it's just
that particular team losing the benefit of automation, not all of them.
Furthermore, the scripts we would use in each case don't have to be a
secret – a team that works offline will quite likely have enough
resources and knowledge to run them locally.

Rimas

While I understand the concerns that underlie your idea, that process
is so heavy that we are just going to lose "drive-by" contributions
like e.g. the commits I did in August 2014 (which possibly no one
noticed and somebody else redid most of the work again
independently...).

http://cgit.freedesktop.org/libreoffice/translations/commit/?id=d9ae641365f094cc1898d7f614dc8a72a1c6b914
http://cgit.freedesktop.org/libreoffice/translations/commit/?id=34a7cd1e0959023b5fb0fa0e5873bcc67ae026e4
http://cgit.freedesktop.org/libreoffice/translations/commit/?id=1a15415c3fe875ee4193fbdbcbd0ebde3b13b48

Is there a possibility that git and pootle are more-or-less constantly
kept in sync? For example:

1) a git hook (script run automatically each time a push is done) that
   pushes the changes to pootle as soon as they are pushed to git
   (just like we mirror our git repo(s) to freedesktop).

2) the same from the pootle side, as soon as a translator makes a
   change, it is exported to git.

3) There is a theoretical race condition for conflicts (although the
   window could be kept to a few seconds...). In case of merge
   conflict, error out and mail a human for manual merge?
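The conflict case in point 3 amounts to a three-way merge per string. A minimal Python sketch, assuming each side can be represented as a dictionary from string ID to translation; all names here are hypothetical and nothing below talks to git or Pootle:

```python
from typing import Dict, List, Tuple


class MergeConflict(Exception):
    """Both sides changed the same string in different ways."""


def merge_unit(base: str, git_side: str, pootle_side: str) -> str:
    """Three-way merge of one translated string."""
    if git_side == pootle_side:
        return git_side          # same result on both sides
    if git_side == base:
        return pootle_side       # only the translator changed it
    if pootle_side == base:
        return git_side          # only the developer changed it
    raise MergeConflict(f"{git_side!r} vs {pootle_side!r}")


def merge_store(base: Dict[str, str], git_side: Dict[str, str],
                pootle_side: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Merge all strings; return the result plus IDs needing a human."""
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(git_side) | set(pootle_side)):
        b = base.get(key, "")
        try:
            merged[key] = merge_unit(b, git_side.get(key, ""),
                                     pootle_side.get(key, ""))
        except MergeConflict:
            conflicts.append(key)   # would be mailed to a person
            merged[key] = b         # keep the base value until resolved
    return merged, conflicts
```

Only the IDs in the returned conflict list would need manual attention; everything else merges without anyone being asked.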

Interestingly, doing a web search for "pootle git synchronisation"
leads me to http://weblate.org/en/features/ >:-)

Hi,

Interestingly, doing a web search for "pootle git synchronisation"
leads me to http://weblate.org/en/features/ >:-)

Well, once we don't require to be self-hosted anymore,
https://translations.launchpad.net/ certainly would be an option too. However,
IMHO there are very good reasons to be self-hosted with translations.

Best,

Bjoern

Hi,

>> Interestingly, doing a web search for "pootle git synchronisation"
>> leads me to http://weblate.org/en/features/ >:-)

> Well, once we don't require to be self-hosted anymore,
> https://translations.launchpad.net/ certainly would be an option too.
> However, IMHO there are very good reasons to be self-hosted with
> translations.

From glancing at the webpage, Weblate is software that we can host
ourselves.

This is something the l10n team already has on its radar. But please,
before changing our tool and our workflow, let's discuss at Fosdem with
Dwayne from the Pootle team what we can achieve.
Cheers
Sophie

Hi Lionel,

2015.01.30 10:53, Lionel Elie Mamane wrote:

2015.01.28 11:20, Stephan Bergmann wrote:

When talking about (developer-side) scripting, is it actually OK to
commit modifications to the translations in the translations git
sub-repo? My understanding was that such modifications would be
overwritten by the next "import commit" (as typically done by
Andras, AFAIU from some Pootle database).

The process as I see it would be somewhat like the following: when a
big enough scriptable string change is coming up, it should be announced
at least a few days in advance, stating that this change will land on
day X at time Y. (...)
On day X time Y, we close down the affected Pootle project, push its
localizations into git, then somebody who's in charge checks them out of
git and runs the script. When the script finishes its work, the
resulting files are committed back to git, imported back to Pootle and
the project is re-opened for translation. Once that is done, an
announcement should be sent to the L10n list with huge thanks for
everyone's patience and kudos to everybody involved in the process. And
we all live happily ever after. :slight_smile:

While I understand the concerns that underlie your idea, that process
is so heavy that we are just going to lose "drive-by" contributions
like e.g. the commits I did in August 2014 (which possibly no one
noticed and somebody else redid most of the work again
independently...).

http://cgit.freedesktop.org/libreoffice/translations/commit/?id=d9ae641365f094cc1898d7f614dc8a72a1c6b914
http://cgit.freedesktop.org/libreoffice/translations/commit/?id=34a7cd1e0959023b5fb0fa0e5873bcc67ae026e4
http://cgit.freedesktop.org/libreoffice/translations/commit/?id=1a15415c3fe875ee4193fbdbcbd0ebde3b13b48

I'm not sure exactly how such drive-by commits are relevant to this
case. I don't think anyone is taking care to watch for such commits at
the moment and import them into Pootle. I imagine that right now, the
only locale that gets imported into Pootle periodically is the source locale
(en-US). Any changes to other locales, like this change of yours, are
doomed to be overwritten on next export from Pootle, unless you do them
in Pootle itself instead.

Is there a possibility that git and pootle are more-or-less constantly
kept in sync? For example:

1) a git hook (script run automatically each time a push is done) that
   pushes the changes to pootle as soon as they are pushed to git
   (just like we mirror our git repo(s) to freedesktop).

2) the same from the pootle side, as soon as a translator makes a
   change, it is exported to git.

3) There is a theoretical race condition for conflicts (although the
   window could be kept to a few seconds...). In case of merge
   conflict, error out and mail a human for manual merge?

Considering the size of our project and the amount of files, I'm afraid
that both these things would be impractical at the moment, here's why:
1) Exporting files from Pootle takes a lot of time currently. Exporting
only the relevant file on string submit would likely be faster, but
still not fast enough, I'm afraid.
2) Furthermore, even assuming that speed would be acceptable, making a
separate git commit for each string change would blow up the size of
the repository considerably and litter the commit log with thousands of
commit messages. Also, I suspect tree deviations might be unavoidable,
and merges might be required.
3) Regarding importing changes from git into Pootle – it's also slow, it
would likely be faster if import would be done for just the affected files.

Then again – how often do such drive-by commits happen? My guess is not
very often. So I don't think that is the scenario we should tailor for.
I can't open the git commit log page at the moment (perhaps the
repository is already too huge for cgit?), but if among the developers
there are only a handful of exceptions like you, who want to also
contribute to their locale, perhaps the best option for them currently
is to do it using Pootle itself, or by contacting the localizer in
charge? I know that it seems much easier to just fix the problem, but as
you saw yourself, that doesn't quite work in our case.

On the other hand, the massive changes that we are discussing here are a
whole different beast: they are massive and they affect all locales,
because they change many strings in the source locale. And they are
often scriptable. And they drive localizers nuts, if not done properly. :slight_smile:

By the way, I just glanced over your commits, and it seems you mostly
removed spaces in some help SGML tags. Were these spaces breaking anything?
Also, the last commit you linked to mentions BugZilla. I feel obligated
to say that Bugzilla is called Bugzilla and not BugZilla (just like
Firefox is not called FireFox, and Microsoft is not called MicroSoft,
and we are not called Libre Office with a white-space in the middle).
That misnaming should probably be fixed in the source.

Regards,
Rimas