

On 08/17/2017 10:08 AM, Andrej Warkentin wrote:

So I thought this could be used to find (or at least help finding) most missing words in 
dictionaries for all languages.

Back when the OOo dictionary for Afrikaans was created, a program was
run through the dictionary corpus, excluding words that were also found
in the English dictionary.

Which is why three of the most common words in Afrikaans were not
found in that dictionary for at least its first five revisions.

Die man.
Two words which, as a phrase, have completely different meanings when
read in Afrikaans and in English.  «I'll grant that "man" is bad
Afrikaans, but it is the only example I can think of, offhand, that
isn't also off-colour in either or both languages.»
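
A minimal sketch of how that kind of exclusion filter goes wrong, assuming the filter was a plain "drop anything that is also in the English list" step. The function name and the tiny word lists are hypothetical illustrations, not the actual program or data used for the OOo dictionary:

```python
def strip_foreign_words(candidates, other_language_words):
    """Drop every candidate that also appears in the other language's list."""
    other = {w.lower() for w in other_language_words}
    return [w for w in candidates if w.lower() not in other]


# Tiny illustrative lists, not real dictionary data.
afrikaans_corpus_words = ["die", "man", "hond", "boek"]
english_words = ["die", "man", "dog", "book"]

kept = strip_foreign_words(afrikaans_corpus_words, english_words)
print(kept)  # ['hond', 'boek']
```

"die" and "man" are dropped only because the same spellings exist in English, even though both came out of the Afrikaans corpus.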

My question is whether this would be helpful at all, or whether missing words in
dictionaries are no longer a problem.

Once a dictionary has reached a certain size, it starts to include rarely
used words whose spelling is also a common misspelling of another word.
Earlier in this thread somebody mentioned "teh" as one example.

For dictionaries that are under initial construction, this type of tool
can be extremely useful.

don't have much spare time at the moment to work on this so if anyone

My impression is that a Python library module that includes this
functionality already exists.  What would need to be done is to hook
that library up with either a bot that scrapes Wikipedia, etc., or an
extension that reads ODF documents.
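
A rough sketch of the "reads ODF documents" half, using only the Python standard library. This is not the library module referred to above; every function name here is made up for illustration. It assumes the document is an ordinary zipped ODF package whose text lives in content.xml, and that the dictionary is available as a plain list of words:

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Runs of letters, optionally joined by an apostrophe or hyphen.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def text_from_odf(path):
    """Return the text nodes of content.xml from an ODF package (a ZIP file)."""
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("content.xml"))
    return " ".join(chunk for chunk in root.itertext() if chunk.strip())


def missing_word_candidates(text, known_words, min_count=2):
    """List (word, count) pairs seen at least min_count times but not known."""
    known = {w.lower() for w in known_words}
    counts = Counter(m.group(0).lower() for m in WORD_RE.finditer(text))
    return [(word, n) for word, n in counts.most_common()
            if n >= min_count and word not in known]


# Illustrative input only; a real run would call text_from_odf("some.odt").
sample = "Die hond blaf. Die hond slaap. Die kat slaap."
for word, n in missing_word_candidates(sample, ["die", "kat"]):
    print(word, n)
```

The min_count threshold only removes one-off noise. A frequent misspelling such as "teh" would still show up as a candidate, so the output is a list for a human to review, not something to merge into a dictionary automatically.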

Your question might be more appropriate for the upstream project that
provides the dictionaries used by LibreOffice. It might also be appropriate
for the LanguageTool (grammar checking) list.

jonathon


