On 24-12-2014 21:45, Constantine wrote:
Hi everyone,
I am new here and I hope someone can help me.
I am a translator and my colleagues and I use CAT-software with TMX-files,
glossaries in simple text-files which are in UTF8 encoding and TAB-separated
as well as other tools of course.
Over the years, some of us have collected a huge amount of terminology (with
its definitions) from each other and from various other sources.
This Terminology is unfortunately in simple text-files encoded with UTF8 but
with no TABs or similar as field-separators.
In order to be able to use this Terminology as a glossary or to convert it
into a TMX and then use it efficiently with our CAT-Software (OmegaT
mostly), it has to be in the following format:
Language-A TAB Language-B
or
Language-A ; Language-B
I'll show you an example of our files so you can better understand my problem.
I will use a German-Greek terminology example, because I believe the
different alphabets of the two languages will make it easier to achieve our goal:
Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten (πληθ.) δαπάνες κατεδάφισης
abbruchreif ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen (ουδ.) εταιρεία κατεδάφισης
abbuchen χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό
As a minimum requirement for our goal, it should look like this:
Abbrucharbeiten;(πληθ.) εργασίες κατεδάφισης
Abbruchkosten;(πληθ.) δαπάνες κατεδάφισης
abbruchreif;ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung;(θηλ.) άδεια κατεδάφισης
Abbruchunternehmen;(ουδ.) εταιρεία κατεδάφισης
abbuchen;χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό
or
Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten (πληθ.) δαπάνες κατεδάφισης
abbruchreif ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen (ουδ.) εταιρεία κατεδάφισης
abbuchen χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό
(where the space is a TAB)
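Since the source term is in Latin script and the translation is in Greek, the
field boundary in the target format above can also be located by script
rather than by position. What follows is only a sketch, assuming one entry per
line and that the Greek part starts at the first Greek letter (optionally
preceded by an opening parenthesis, as in "(πληθ.)"). The name
`split_at_greek` and the multi-word sample entry are made up for illustration
and do not come from the original files.

```python
import re

# Whitespace that is directly followed by a Greek letter, optionally after "(".
# U+0370-U+03FF is "Greek and Coptic", U+1F00-U+1FFF is "Greek Extended".
BOUNDARY = re.compile(r"\s+(?=\(?[\u0370-\u03FF\u1F00-\u1FFF])")


def split_at_greek(line, sep="\t"):
    """Replace the whitespace before the first Greek word with sep."""
    match = BOUNDARY.search(line)
    if match is None:  # no Greek text found: leave the line unchanged
        return line
    return line[:match.start()] + sep + line[match.end():]


entries = [
    "Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης",
    "Abbruchsbewilligung (θηλ.) άδεια κατεδάφισης",
    "ab und zu μερικές φορές",  # hypothetical multi-word headword
]
for entry in entries:
    print(split_at_greek(entry, sep=";"))
```

Unlike a plain "first space" rule, this keeps a headword that consists of
several German words together, at the cost of depending on the Greek
character ranges listed in the pattern.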
Does any of you know a way to achieve this in LibreOffice or in any other
editor?
I was thinking of something like:
Search for German-language characters + space
at the beginning of the sentence/line and
replace it with the same word + ";" or TAB
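The search-and-replace idea above can be sketched in a few lines of Python.
This is only a sketch, assuming that each entry sits on a single line and that
the German headword is a single word; the function name
`split_at_first_space` is made up for illustration.

```python
def split_at_first_space(line, sep="\t"):
    """Replace the first space in a line with sep (a TAB by default)."""
    head, space, rest = line.partition(" ")
    if not space:  # the line contains no space: leave it unchanged
        return line
    return head + sep + rest.lstrip(" ")


entries = [
    "Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης",
    "Abbruchkosten (πληθ.) δαπάνες κατεδάφισης",
]
for entry in entries:
    print(split_at_first_space(entry, sep=";"))
```

Entries whose headword consists of more than one word would be split too
early by this rule and would need to be corrected by hand.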
My colleagues and I would be most grateful if any of you could provide a
simple solution or suggestion.
I thank you all in advance for your help and interest.
Constantine
After a lot of responses on how to do this in Writer,
here is a short note on how to do it in Calc ;-)
Open the text file; when the 'Text Import' wizard is shown, do the following:
1) Select character set 'Unicode (UTF-8)'
2) Separator options: 'Separated by', check 'Tab' and 'Space'; other
options should not be checked.
3) At 'Text delimiter', type a space
4) Click 'OK'
5) Insert a column B, and fill it with a semicolon ';'
6) Click 'Save As', type a name, and check 'Edit filter settings'
7) The 'Export Text File' wizard should be shown.
8) Character set: 'Unicode (UTF-8)'
9) Field delimiter: space ' '
10) Text delimiter: <empty> ''
11) Checkboxes: leave only 'Save cell content as shown' checked.
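
For anyone who prefers a script, a rough equivalent of the Calc steps above
can be sketched in Python. It assumes one entry per line with single spaces
between words; the function names `calc_like` and `convert_file` and any file
names passed to them are placeholders, not part of LibreOffice.

```python
def calc_like(line):
    """Mimic the Calc steps: split on spaces/tabs, put ';' in column B."""
    cells = line.split()      # columns A, B, C, ... after the text import
    if len(cells) < 2:        # empty line or single word: nothing to separate
        return line.strip()
    cells.insert(1, ";")      # the inserted column B filled with ';'
    return " ".join(cells)    # export with a space as the field delimiter


def convert_file(src, dst):
    """Read src as UTF-8 and write the converted lines to dst as UTF-8."""
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8", newline="\n") as fout:
        for line in fin:
            fout.write(calc_like(line) + "\n")


print(calc_like("Abbruchkosten (πληθ.) δαπάνες κατεδάφισης"))
```

The output has the form "Language-A ; Language-B", with a space on either
side of the semicolon.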
Context
- [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file (continued)
Re: [libreoffice-users] Creating a dictionary with libreoffice from a simple TXT-file · Luuk