


The question is: what type of dictionary do you really want?

A spell-checking dictionary for the terms,
or
something that maps each term in the Latin character set (English, German) to its Greek character set equivalent, including the definition when available?

That really makes a huge difference.

Making a searchable dictionary for use within LibreOffice may be difficult.

Also, how many terms do you have? Hundreds, thousands? That may also affect what can be done easily.

I used to do spacing alignment on massive text files using C++ and other programming languages, to prepare text input for various documents and for word and term processing. But it may not be easily done within an office package.


On 12/24/2014 03:45 PM, Constantine wrote:
Hi everyone,

I am new here and I hope someone can help me.

I am a translator and my colleagues and I use CAT-software with TMX-files,
glossaries in simple text-files which are in UTF8 encoding and TAB-separated
as well as other tools of course.

Over the years, some of us collected a huge amount of Terminology (with its
definitions) from each other and various other sources.
This Terminology is unfortunately in simple text-files encoded with UTF8 but
with no TABs or similar as field-separators.

In order to be able to use this Terminology as a glossary or to convert it
into a TMX and then use it efficiently with our CAT-Software (OmegaT
mostly), it has to be in the following format:

Language-A TAB Language-B
or
Language-A ; Language-B

I'll show you an example of our files so you can better understand my problem.
I will use a German-Greek terminology example, because I believe the
different scripts of the two languages will make it easier to achieve our goal:

Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten (πληθ.) δαπάνες κατεδάφισης
abbruchreif ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen (ουδ.) εταιρεία κατεδάφισης
abbuchen χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

As a minimum requirement for our goal, it should look like this:

Abbrucharbeiten;(πληθ.) εργασίες κατεδάφισης
Abbruchkosten;(πληθ.) δαπάνες κατεδάφισης
abbruchreif;ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung;(θηλ.) άδεια κατεδάφισης
Abbruchunternehmen;(ουδ.) εταιρεία κατεδάφισης
abbuchen;χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

or

Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten   (πληθ.) δαπάνες κατεδάφισης
abbruchreif     ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung     (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen      (ουδ.) εταιρεία κατεδάφισης
abbuchen        χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

(where the space is a TAB)

Does any of you know a way to achieve this in LibreOffice or in any
other editor?

I was thinking of something like:
    Search for German-language characters + space at the beginning of the line and
    replace them with the same word + ";" or TAB
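
The search-and-replace idea above could be sketched outside LibreOffice roughly as follows. This is only a minimal Python sketch under some assumptions that are not stated in the thread: a new entry is assumed to start with a single Latin-script headword (German umlauts and ß included) followed by one space, and any line that does not start that way (for example one that begins with a Greek letter) is assumed to be a wrapped continuation of the previous entry. The names ENTRY_START, convert and convert_file are made up for this illustration.

```python
import re

# Assumption: a headword is one run of Latin letters (plus umlauts, ß and
# hyphens) at the very start of the line, followed by exactly one space.
ENTRY_START = re.compile(r"^([A-Za-zÄÖÜäöüß][A-Za-zÄÖÜäöüß\-]*) (.+)$")


def convert(lines, sep="\t"):
    """Return one 'headword<sep>rest' string per glossary entry."""
    entries = []  # each item is [headword, rest_of_entry]
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = ENTRY_START.match(line)
        if match:
            # The line starts with a Latin-script word: begin a new entry.
            entries.append([match.group(1), match.group(2)])
        elif entries:
            # Otherwise treat the line as a wrapped continuation and
            # append it to the previous entry.
            entries[-1][1] += " " + line
        # A continuation line with no preceding entry is skipped.
    return [head + sep + rest for head, rest in entries]


def convert_file(src_path, dst_path, sep="\t"):
    """Read a UTF-8 text file and write the separated version as UTF-8."""
    with open(src_path, encoding="utf-8") as src:
        converted = convert(src, sep)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for entry in converted:
            dst.write(entry + "\n")
```

Calling convert_file("glossary.txt", "glossary_tab.txt") would write the TAB-separated form (both file names are placeholders), and passing sep=";" would give the semicolon form. The sketch has known limits: a wrapped line that happens to begin with a German word (such as "das Haus ist ~") would be mistaken for a new entry, and a headword made of several words would be split after its first word, so the result would still need checking by hand.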

I and my colleagues would be most grateful if any of you could provide a
simple solution or suggestion.

I thank you all in advance for your help and interest.

Constantine




--
View this message in context: 
http://nabble.documentfoundation.org/Creating-a-dictionary-with-libreoffice-from-a-simple-TXT-file-tp4133988.html
Sent from the Users mailing list archive at Nabble.com.


