Sorry, I cannot take credit for the word lists. They are open-source
and come from a Linux repository source for a different type of system
dictionary. There were lists for American English, British English [not
Oxford], Canadian English, French, Spanish, and Italian.
I just converted these lists to something that LO could use. To be
honest, if you know what to do, you can create your own specialty
dictionaries. You just take one from the correct language as a model
and make yours using that model. It also helps to have open-source
components to work from.
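
As a rough illustration of the conversion step, here is a minimal sketch that turns a plain word list into the text of a spelling word-list file. It assumes the Hunspell-style layout that LibreOffice spelling dictionaries use (an entry count on the first line, then one word per line). The function name build_dic is hypothetical, and the matching affix (.aff) file and the extension packaging are not covered.

```python
def build_dic(words):
    """Return the text of a Hunspell-style .dic file for the given words.

    Assumption: the target is a plain word list without affix flags.
    """
    # Drop blank entries and duplicates, then sort for a stable file.
    unique = sorted({w.strip() for w in words if w.strip()})
    # The first line of a .dic file holds the number of entries.
    return "\n".join([str(len(unique))] + unique) + "\n"


if __name__ == "__main__":
    sample = ["ochre", "ocher", "trammel", "ocher", ""]
    print(build_dic(sample), end="")
```

The same approach should work for a specialty list: start from the words you need, write them out in this layout, and pair the result with an affix file taken from the model dictionary of the same language.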
My issue was I could not find any "easy" documentation to help me do the
work. I found a few hints, but that was all. So I had to do some trial
and error for the non-word-list items. I wanted a person to be able to
install all my dictionary files, then use the enable/disable options to
choose which dictionary to use at any given time.
My next "job" will be to see how I can add to the thesaurus data for
some of the words that are not included. The thesaurus I currently use
deals with American, British, and Canadian English words and phrases,
about 145,800 of them. There are a lot of phrases and abbreviations in
the thesaurus. I want to see what can be done to make it include others
that might be needed or wanted by our LibreOffice users.
Below are some of the word listings in the thesaurus I am using. This
is an indexed list so changing the number of lines for a word in this
list will reflect/change the indexed line number of the index file. I
will have to create/use a database style of software to alter the
thesaurus system. For the "1st baron verulam" it starts on line #10337
of the thesaurus data file, while "ocher" starts on line #11507218. If
I add lines to any word[s] between these two, all of the indexes will be
messed up from the first addition on. As for "ocher", it also shows
"ochre" as a different spelling for it.
1st baron verulam|1
(noun)|Bacon|Francis Bacon|Sir Francis Bacon|Baron Verulam|1st Baron Verulam|Viscount St. Albans|statesman (generic term)|solon (generic term)|national leader (generic term)|philosopher (generic term)
aarp|1
(noun)|Association for the Advancement of Retired Persons|AARP|association (generic term)
aas|1
(noun)|Associate in Applied Science|AAS|associate degree (generic term)|associate (generic term)
ocher|3
(adj)|ochre|chromatic (similar term)
(noun)|ochre|orange yellow (generic term)|saffron (generic term)
(noun)|ochre|earth color (generic term)
repeat|7
(noun)|repetition|periodic event (generic term)|recurrent event (generic term)
(verb)|reiterate|ingeminate|iterate|restate|retell|tell (generic term)
(verb)|duplicate|reduplicate|double|replicate|reproduce (generic term)
(verb)|recur|happen (generic term)|hap (generic term)|go on (generic term)|pass off (generic term)|occur (generic term)|pass (generic term)|fall out (generic term)|come about (generic term)|take place (generic term)
(verb)|echo|utter (generic term)|emit (generic term)|let out (generic term)|let loose (generic term)
(verb)|take over|act (generic term)|move (generic term)
(verb)|reprise|reprize|recapitulate|play (generic term)|spiel (generic term)
smut|9
(noun)|carbon black|lampblack|soot|crock|carbon (generic term)|C (generic term)|atomic number 6 (generic term)
(noun)|plant disease (generic term)
(noun)|smut fungus|fungus (generic term)
(noun)|obscenity|vulgarism|filth|dirty word|profanity (generic term)
(noun)|pornography|porno|porn|erotica|creation (generic term)|creative activity (generic term)
(verb)|change (generic term)|alter (generic term)|modify (generic term)
(verb)|stain (generic term)
(verb)|mold (generic term)|mildew (generic term)
(verb)|infect (generic term)|taint (generic term)
snake dance|2
(noun)|file (generic term)|single file (generic term)|Indian file (generic term)
(noun)|ritual dancing (generic term)|ritual dance (generic term)|ceremonial dance (generic term)
trammel|6
(noun)|trammel net|fishnet (generic term)|fishing net (generic term)
(noun)|pothook (generic term)
(noun)|restraint (generic term)|constraint (generic term)
(noun)|shackle|bond|hamper|restraint (generic term)|constraint (generic term)
(verb)|trap|entrap|snare|ensnare|capture (generic term)|catch (generic term)
(verb)|restrict|restrain|limit|bound|confine|throttle|control (generic term)|hold in (generic term)|hold (generic term)|contain (generic term)|check (generic term)|curb (generic term)|moderate (generic term)
trammel net|1
(noun)|trammel|fishnet (generic term)|fishing net (generic term)
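
Since any edit to the data file shifts every later entry, one option is to regenerate the whole index after each change instead of patching it by hand. The sketch below is a hedged illustration, not the tool used for these dictionaries. It assumes the MyThes-style layout that LibreOffice thesauri follow as far as I understand it: the data file starts with an encoding line, each entry is a "word|count" header followed by that many meaning lines (one per physical line), and the index stores the byte offset of each header rather than a line number, with an encoding line and an entry count at the top. The function name build_index is hypothetical.

```python
def build_index(dat_bytes):
    """Rebuild thesaurus index text from the raw bytes of the data file.

    Assumed layout: line 1 is the encoding name; each entry is a
    'word|count' header followed by `count` meaning lines.
    """
    lines = dat_bytes.splitlines(keepends=True)
    encoding = lines[0].decode("ascii").strip()
    offset = len(lines[0])  # byte position just past the encoding line
    entries = []
    i = 1
    while i < len(lines):
        header = lines[i].decode(encoding).rstrip("\r\n")
        if not header.strip():
            # Tolerate stray blank lines between entries.
            offset += len(lines[i])
            i += 1
            continue
        word, count = header.rsplit("|", 1)
        entries.append((word, offset))
        # Advance past the header and its meaning lines, counting bytes.
        span = lines[i : i + 1 + int(count)]
        offset += sum(len(line) for line in span)
        i += len(span)
    entries.sort()
    out = [encoding, str(len(entries))]
    out += [f"{word}|{pos}" for word, pos in entries]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        b"UTF-8\n"
        b"ocher|1\n(noun)|ochre|earth color (generic term)\n"
        b"trammel net|1\n(noun)|trammel|fishnet (generic term)\n"
    )
    print(build_index(sample), end="")
```

With this approach a meaning can be added to any word, provided the count in that word's header is updated to match, and the index is then rebuilt in one pass so that no offset is left stale.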
On 11/10/2011 10:55 PM, Bruce Carlson wrote:
Hi Tom,
I just added the total number of words in these dictionaries and it's three
million, nine hundred and thirty one thousand words.
An unbelievable amount of work for one guy.
I would say Brilliant work.
Cheers,
Bruce.
Hi :)
It is good to see you have all those dictionaries accepted as extensions
listed on one page together
http://extensions.libreoffice.org/extension-center/american-british-canadian-spelling-hyphen-thesaurus-dictionaries/releases/1.0
Good work chap!
Regards from
Tom :)
--
View this message in context:
http://nabble.documentfoundation.org/which-of-my-American-British-and-Canadian-English-dictionaries-are-now-online-tp3491034p3498661.html
Sent from the Users mailing list archive at Nabble.com.
--
For unsubscribe instructions e-mail to: users+help@global.libreoffice.org
Problems?
http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be
deleted