Hi,
I'm not sure whether the l10n list is the right place for this mail, but
I'm hoping someone can point me in the right direction. I even tried to
subscribe to the hunspell mailing list, but with no success.
I've read the Hunspell help/documentation PDF from the internet, but I
still have a few questions. From that document I can't be certain about
some specific things that are actually very important for the
morphology of a highly inflected language like Croatian.
Here we go:
Let's say that the .aff file contains:
FLAG long
SFX A1 Y 1
SFX A1 0 a .
SFX A2 Y 1
SFX A2 0 e .
Now, A1 and A2 need to be applied to the word "jezik" (Eng. language). So
if the .dic file contains "jezik/A1", I should expect the forms "jezik" and
"jezika". If the .dic file contains an entry like "jezik/A2", the expected
forms are "jezik" and "jezike". But if the entry is
jezik/A1A2
I can expect the forms "jezik", "jezika" and "jezike", but will hunspell
also try to combine A1 and A2, giving "jezikae" or "jezikea"? The forms
"jezikae" and "jezikea" are not valid in Croatian. I know this can be
prevented by doing
FLAG long
SFX A2 Y 1
SFX A2 0 e [^a]
but the question is: will hunspell combine (agglutinate) suffixes if the
suffix rules don't prevent it explicitly (and so allow it implicitly)?
I wasn't able to settle this by reading the manual at
http://hunspell.sourceforge.net or by looking into other .aff files
from the git repository.
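
To make the question concrete, here is a minimal Python sketch of the two
behaviours I am asking about. It is not Hunspell code and does not claim to
show what Hunspell really does; the names RULES, apply_rule,
independent_forms and chained_forms are made up for illustration, and the
rule table is just a hand-written copy of the two suffix rules above.

```python
import re

# Hypothetical in-memory copy of the suffix rules from the .aff snippet:
# flag -> (characters to strip, characters to add, condition on the word ending)
RULES = {
    "A1": ("", "a", "."),
    "A2": ("", "e", "."),
}


def apply_rule(word, rule):
    """Apply one suffix rule to a word; return None if the rule does not match."""
    strip, add, condition = rule
    # The condition is treated as a regular expression anchored at the word end.
    if not re.search(condition + "$", word):
        return None
    if strip and not word.endswith(strip):
        return None
    base = word[: len(word) - len(strip)] if strip else word
    return base + add


def independent_forms(stem, flags):
    """Each flag is applied to the bare stem only (no combining)."""
    forms = {stem}
    for flag in flags:
        form = apply_rule(stem, RULES[flag])
        if form is not None:
            forms.add(form)
    return forms


def chained_forms(stem, flags):
    """Additionally apply a second, different flag on top of the first one."""
    forms = independent_forms(stem, flags)
    for first in flags:
        middle = apply_rule(stem, RULES[first])
        if middle is None:
            continue
        for second in flags:
            if second == first:
                continue
            form = apply_rule(middle, RULES[second])
            if form is not None:
                forms.add(form)
    return forms


if __name__ == "__main__":
    print(sorted(independent_forms("jezik", ["A1", "A2"])))
    print(sorted(chained_forms("jezik", ["A1", "A2"])))
```

In this sketch independent_forms gives only "jezik", "jezika" and "jezike",
while chained_forms also gives the invalid "jezikae" and "jezikea". What I
want to know is which of the two corresponds to hunspell's handling of an
entry like jezik/A1A2.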
Also, if the .dic file contains only the form "PDF", will the word "pdf" in
LO be underlined as misspelled? Can a .dic file contain two words on one line?
Furthermore, I have trouble understanding the REP section of the .aff
file. Can I do:
REP 1
REP Plitvička_Jezera Plitvička_jezera
Will that automatically correct the capital letter in the word "Jezera"?
I hope someone will give me a hand here...
Kruno
Context
- [libreoffice-l10n] Understanding the hunspell · Krunoslav Šebetić