At 13:45 24/12/2014 -0700, Constantine Marberg wrote:
This Terminology is unfortunately in simple text-files encoded with
UTF8 but with no TABs or similar as field-separators. In order to be
able to use this Terminology as a glossary or to convert it into a
TMX and then use it efficiently with our CAT-Software (OmegaT
mostly), it has to be in the following format:
Language-A TAB Language-B
or
Language-A ; Language-B
Does any of you know a way to achieve this in LibreOffice or
any other editor? I was thinking of something like: search for
German-language characters + space at the beginning of the
sentence/line and replace it with the same word + ";" or TAB.
It won't be easy to search for characters in a particular language,
but in all your examples the part you want separated from the
remainder of the line is simply a single word, so you need just to
search for the first space and replace it with a semi-colon or a tab character.
o Paste your material into a LibreOffice text (Writer) document.
o Put the cursor at the beginning of the text.
o Go to Edit | Find & Replace... (or press Ctrl+H).
o In the "Search for" box, enter:
([^ ]*) (.*)
- that's
leftparenthesis-leftbracket-circumflex-space-rightbracket-asterisk-rightparenthesis-space-leftparenthesis-dot-asterisk-rightparenthesis.
o In the "Replace with" box, enter either:
$1;$2 or $1\t$2
- that's dollar-one-semicolon-dollar-two or
dollar-one-backslash-lowercasetee-dollar-two (as preferred).
o Click More Options (labelled Other Options in some versions).
o Ensure "Regular expressions" is ticked.
o Click Replace All.
Don't do this more than once: a second pass would also replace the
next remaining space on each line.
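
If you would rather do this outside LibreOffice, the same replacement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample entries and the function names are invented for the example and are not taken from your terminology files. The pattern is the one given above, anchored to each line.

```python
import re

# Same pattern as in the "Search for" box: the first run of non-space
# characters, one space, then the rest of the line.
FIRST_SPACE = re.compile(r"^([^ ]*) (.*)$")


def split_first_space(line: str, sep: str = "\t") -> str:
    """Replace only the first space in a line with sep."""
    # A function is used for the replacement so that sep is inserted
    # literally, whatever characters it contains.
    return FIRST_SPACE.sub(lambda m: m.group(1) + sep + m.group(2), line, count=1)


def convert(text: str, sep: str = "\t") -> str:
    """Apply split_first_space to every line of text."""
    return "\n".join(split_first_space(line, sep) for line in text.splitlines())


# Hypothetical sample entries, for illustration only.
sample = "Haus house\nStraßenbahn tram or streetcar"
result = convert(sample, ";")
print(result)
```

Lines that contain no space are left unchanged. As with the Find & Replace route, running convert a second time on its own output would turn the next space on a line into a separator too.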
You can copy and paste the resulting material wherever you want it to
go, or you can use File | Save As... and select a plain text format
for "Save as type".
I trust this helps.
Brian Barker