On 03.05.2016 at 20:35, toki wrote:
On 03/05/2016 15:51, Kruno wrote:
not doubled maintained word lists by multiple maintainers (not
knowing each other)
will not and can not be resolved.
With a central repository for working on dictionaries, it is far
easier for two individuals interested in the same dictionary to find
each other, than if they are working on two different sites, in
different locations.
Yes, I agree, I never argued that. But now we are talking about my point
from my first mail: you build a place for people involved with LO and
provide them with tools to make better dictionaries. I was saying that
it makes more sense to tie it to LO, for LO, than to think that you are
making a repository for Hunspell -- you are making one for LibreOffice and
that's it.
My point was that you can't build a repository for 'official' Hunspell
dictionaries, only for 'official' LO dictionaries. I was just saying
that this was communicated in a slightly blurry and unclear way (to me).
(And you explained it to me in the last part of your mail.)
Whose dictionary to include in that single repository, how to merge
As a practical matter, a repository that only allows for one
dictionary per language, is not viable. At a minimum, you'll have
specialized dictionaries.
[Starting new discussion (sic!)]
Which languages? Half of them don't even have a decent affix file (mine
included, and that's nobody's fault).
I'm not trying to discourage or sabotage (I really hope this comes alive),
but who do you think will build such dictionaries for languages that
don't have them already? Who can maintain that?
Are all those specialized dictionaries sharing an affix file?
That was my second (and last) point: it sounds like the goals are set too
high. Not in terms of what is possible, but in terms of actual interest.
And we are talking about different things here; having two for the same
language was not what I meant. We are starting a new discussion here, so
back on topic:
I was more concerned about how such a system would work. I'm not telling
anyone what to do or how (nor even suggesting) because I don't understand
all of that. I just wanted to know. It sounded so unreal.
[Yes, there is a possibility -- we agree on practically everything, but
you are pressing where it hurts here ;) ]
how to merge affix files with different affix classes (that will be a
mess).
I've seen some tools for automating the creation of affix files.
I don't know how well they work, though.
No, no and - no! No scripts for any natural language unless you already
have a finished dictionary for cross-referencing. No, no way. (And
small and not-so-small languages don't have access to those, or they
simply don't exist.)
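To make the "different affix classes" worry concrete, here is a minimal sketch in Python. It only assumes the usual Hunspell .aff layout (a "SFX flag Y|N count" header followed by rule lines); the function names and the two tiny affix files are made up for illustration and are not part of any existing tool. It lists the flags that two affix files both use but with different rules, which is exactly the case that cannot be merged mechanically.

```python
def affix_classes(aff_text):
    """Map each PFX/SFX flag to the list of rules declared under it."""
    classes = {}
    for line in aff_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[0] not in ("PFX", "SFX"):
            continue
        kind, flag = parts[0], parts[1]
        rules = classes.setdefault(flag, [])
        # Header lines look like "SFX A Y 1"; everything else is a rule line.
        is_header = len(parts) == 4 and parts[2] in ("Y", "N") and parts[3].isdigit()
        if not is_header:
            # Keep affix type, strip, add and condition for comparison.
            rules.append((kind, *parts[2:5]))
    return classes


def conflicting_flags(aff_a, aff_b):
    """Flags used by both files whose rule sets differ."""
    a, b = affix_classes(aff_a), affix_classes(aff_b)
    return sorted(flag for flag in a.keys() & b.keys() if a[flag] != b[flag])


# Hypothetical affix files: both use flag A, but for different suffixes.
AFF_ONE = "SFX A Y 1\nSFX A 0 s .\n"
AFF_TWO = "SFX A Y 1\nSFX A 0 ed [^e]\nSFX B Y 1\nSFX B 0 s .\n"

print(conflicting_flags(AFF_ONE, AFF_TWO))
```

Every word in the second .dic file that carries a conflicting flag would have to be re-tagged by someone who knows the language, which is the "mess" part.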
This goes back to my claim that spell checking without built-in
grammar checking is useless.
>Why do you think that the included dictionary is 'standard' and is better
than the other one?
Any dictionary project has to include the ability to have the same
language in at least two different writing systems --- Braille (^1)
and the standard writing system for the language.
>The other guy will give up his work?
The proposal does not require the other guy to give up his project.
I wouldn't be surprised to see the other guy create a more specialized
dictionary.
* John Doe creates a general purpose dictionary;
* Jane Doe creates a name and places dictionary;
* John Roe creates a scientific terminology dictionary;
* Jane Roe creates a basic words dictionary;
It sure will be easier than it is now.
Who will hunt down all those 'other' guys, telling them 'Yo, dude, leave
that, do this shit!'?
As far as existing spell checking and wordlist projects go, nobody is
going to tell them to "leave that, do this".
Yes, exactly: so again, you can invite people -- people you already
know, people who are already doing this stuff (and those are few).
So having some sort of Bugzilla for missing or wrong words has more
potential for regular users (even integrated into the UI, so that it just
reports to the matching language in that repository).
The dictionary building tool can help the ones already doing it to do it
better.
(Not trying to start a discussion about this.)
What might happen, is that known, existing projects, are offered
space, etc in the proposed repository/incubator, but they will stay
where they currently are, due to how their workflow operates.
How will such a repository resolve competition between two English
dictionaries?
Since you specifically mentioned English, there currently are versions
of English for a dozen locales, plus around half a dozen specialist
dictionaries.
Most users won't choose the English (OED) variant, because it has too
many words in it. Too many words means that words that are wrongly
used get flagged as correctly spelled. The "Eye right withe aye pin"
phenomenon.
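A minimal sketch of that effect, in Python. The two word lists are invented for the example and stand in for a basic and a very large dictionary; no real spell checker works on plain set lookup alone.

```python
def misspelled(text, wordlist):
    """Return the words of text that are not in wordlist (case-insensitive)."""
    return [word for word in text.lower().split() if word not in wordlist]


# Hypothetical word lists: BASIC is small, LARGE also knows rare words.
BASIC = {"i", "write", "with", "a", "pen", "eye", "right", "pin"}
LARGE = BASIC | {"withe", "aye"}

SENTENCE = "Eye right withe aye pin"

print(misspelled(SENTENCE, BASIC))  # the rare words are flagged
print(misspelled(SENTENCE, LARGE))  # nothing is flagged
```

The smaller list still misses "eye", "right" and "pin", because those are ordinary words used wrongly; catching them needs context, not a longer word list.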
How many languages have that problem?
This proposal is about non-technical types being able to _easily_
create viable dictionaries for their specific use-case. It doesn't
matter if that use-case is a dictionary in Pondo, or a dictionary of
people and places in Bharat, or a dictionary in Moon.
Perfectly fine, can't wait.
Although, it will be very complex if it's going to support all of
Hunspell's advanced features (because how does entering text in a text box
differ from editing a txt file?)
Not trying to be rude, again: I really hope this will work!
The other part of the proposal is that even if the original dictionary
creator abandons the dictionary, it can still be maintained, and updated.
That's a plus as big as a skyscraper!
The third part of the proposal is that whilst it is initially for
LibO, the hope is that it becomes the source for dictionaries for
FLOSS projects.
#####
Hypothetical situation. One of Kevin Scannell's students decides that
what the world needs is a dictionary in each of the 2,500 languages
that have been reduced to a writing system. So said student walks
thru Kevin's word lists, and creates a dictionary project for each of
the 2,000 languages that Kevin maintains word lists for. A year later,
said student graduates, and forgets about their dictionaries.
Under the current scenarios, when said student abandons their
dictionaries, the only way other people can update them, is by forking
them --- assuming that the license allows forking.
Under the proposed scenario, if said student creates the dictionaries
in the repository, when said student abandons them, other people can
still update the dictionaries, which can then be distributed to LibO,
etc.
I'll grant that were said student to create 2,000+ dictionaries for
LibO, it would break the UI. However, as far as the proposal goes,
that breakage is irrelevant.
All good. Just wanted to see if I got it all right.
Thanks,
Kruno
Context
- Re: [libreoffice-l10n] English Dictionaries Project - Introduction by Marco Pinto (continued)