Using OmegaT to translate the guides

Hi,

As I have already written here, we would like to translate the LO Getting Started guide. We want to use OmegaT for that.

In fact, we have already translated about half of the 3.5 version. Since 4.0 is out now, we started to think about how to reuse the 3.5 translation for 4.0. Perhaps this could be done somehow by comparing the English documents. However, we decided to try software with a translation memory: OmegaT. We expect that this will also significantly simplify the translation of new versions in the future, and eventually the translation of other guides as well.

A problem appeared: when OmegaT is used to translate odt files, the program has to keep track of tags. This works fine, but the text becomes too fragmented. For example:

/If you choose <i0/><f1>Load Basic code</f1><f2>, you can edit the macros in LibreOffice.</f2> <f3>T</f3>he changed code is saved in a<f4>n ODF</f4> document but is not retained if you save into a Microsoft Office format/

Here /<i0/><f1>Load Basic code</f1><f2>/ seems to be OK, but the other tags should not be there. They were probably introduced by direct formatting. In fact, in the content.xml file one can see many hundreds of styles that stem from direct formatting. They are applied about 10000 times in the whole guide.
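To check how much direct formatting a given file carries, something like the following rough sketch could be used. It is not the script mentioned in this mail; the function names read_content_xml and count_direct_formatting are made up for illustration. It assumes that direct character formatting shows up as text-family styles under office:automatic-styles in content.xml and is applied through text:span elements.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# ODF namespaces used in content.xml
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def read_content_xml(odt_path):
    """Return the raw content.xml bytes from an .odt package (a zip archive)."""
    with zipfile.ZipFile(odt_path) as package:
        return package.read("content.xml")


def count_direct_formatting(content_xml):
    """Return (automatic text style names, Counter of span uses per style)."""
    root = ET.fromstring(content_xml)
    names = set()
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    if auto is not None:
        for style in auto.findall(f"{{{STYLE}}}style"):
            # Only character-level ("text" family) styles fragment segments.
            if style.get(f"{{{STYLE}}}family") == "text":
                names.add(style.get(f"{{{STYLE}}}name"))
    uses = Counter()
    for span in root.iter(f"{{{TEXT}}}span"):
        name = span.get(f"{{{TEXT}}}style-name")
        if name in names:
            uses[name] += 1
    return names, uses
```

Called as count_direct_formatting(read_content_xml("some-chapter.odt")), where the file name is only a placeholder, the size of the returned set gives the number of automatic character styles and the sum of the counter values gives the number of times they are applied.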

My solution: I wrote a Python script that removes these direct formatting tags. Before that, I first had to convert the plentiful direct formatting to styles (bold and italic to OOoMenuPath, OOoEmphasis, OOoKeystroke, etc.). As a result, the files can be translated using OmegaT.
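For readers who want to see the idea, here is a minimal sketch of the cleaning step. It is not the actual script, only an illustration under assumptions: the function names are hypothetical, and it assumes that every text:span pointing to an automatic text-family style is unwanted and can be dissolved into the surrounding text, while spans with named styles such as OOoMenuPath are kept. Any formatting that matters therefore has to be converted to named styles first, as described above.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def automatic_text_styles(root):
    """Collect the names of automatic styles of the 'text' family."""
    names = set()
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    if auto is not None:
        for style in auto.findall(f"{{{STYLE}}}style"):
            if style.get(f"{{{STYLE}}}family") == "text":
                names.add(style.get(f"{{{STYLE}}}name"))
    return names


def _append_before(parent, index, text):
    """Append text directly before the child at position index."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + text


def unwrap_spans(parent, auto_names):
    """Dissolve spans that use automatic styles, keeping text and children."""
    i = 0
    while i < len(parent):
        child = parent[i]
        is_span = child.tag == f"{{{TEXT}}}span"
        if is_span and child.get(f"{{{TEXT}}}style-name") in auto_names:
            _append_before(parent, i, child.text)
            kids = list(child)
            del parent[i]
            if kids:
                # The span's tail now follows its last child.
                kids[-1].tail = (kids[-1].tail or "") + (child.tail or "")
                for offset, kid in enumerate(kids):
                    parent.insert(i + offset, kid)
            else:
                _append_before(parent, i, child.tail)
            # Do not advance: position i now holds a node not yet examined.
        else:
            unwrap_spans(child, auto_names)
            i += 1


def clean_content(content_xml):
    """Parse content.xml and return the root with automatic spans dissolved."""
    root = ET.fromstring(content_xml)
    unwrap_spans(root, automatic_text_styles(root))
    return root
```

The sketch leaves the now unused style definitions in place and does not write the result back into the odt package; both would be needed in a complete tool.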

I've posted the modified files here: http://ubuntuone.com/1LSDBsRaraP5CHDXMRPjSW
There are three directories inside: the originals, the manually modified files (direct formatting removed, in the 'styles' subdirectory) and the cleaned files (in the 'clean' subdirectory). Tests have not shown any difference between the manually modified and the cleaned files. See the Readme file for details. The styles have a highlighted background, so that one can see which is which.

I believe that the cleaned files could also help other translators. In fact, after the talk given by Leo Moons at the conference in Milan, where I mentioned my work, some people expressed interest. Would it be possible to replace the current GS files with the cleaned ones?

best regards
Milos