


Sorry, I cannot take credit for the word lists. They are open-source and come from a Linux repository, where they serve as the source for a different type of system dictionary. There were lists for American English, British English [not Oxford], Canadian English, French, Spanish, and Italian.

I just converted these lists to something that LO could use. To be honest, if you know what to do, you can create your own specialty dictionaries. You just take a dictionary for the correct language as a model and build yours from it. It also helps to have open-source components to work from.
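As a rough illustration of that kind of conversion, here is a minimal sketch that turns a plain word list into the text of a Hunspell-style .dic file. It assumes the target is the usual layout of a word count on the first line followed by one word per line; the function name words_to_dic is made up for this example, and a real dictionary also needs a matching .aff file, which is not covered here.

```python
def words_to_dic(words):
    """Return .dic-style text: a count line, then one word per line.

    Assumption: a Hunspell-style word list whose first line holds the
    number of entries. Blank lines and duplicates are dropped.
    """
    unique = sorted({w.strip() for w in words if w.strip()})
    return "\n".join([str(len(unique))] + unique) + "\n"


if __name__ == "__main__":
    # A tiny hypothetical word list with a duplicate and a blank entry.
    print(words_to_dic(["ochre", "ocher", "ochre", ""]), end="")
```

Using an existing dictionary of the same language as the model is still the safer route, since it already carries the right encoding and affix rules.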

My issue was that I could not find any "easy" documentation to help me do the work. I found a few hints, but that was all, so I had to do some trial and error for the non-word-list items. I wanted a person to be able to install all my dictionary files and then use the enable/disable options to choose which dictionary to use at any given time.

My next "job" will be to see how I can add to the thesaurus data for some of the words that are not included. The thesaurus I currently use deals with American, British, and Canadian English words and phrases, about 145,800 of them. There are a lot of phrases and abbreviations in the thesaurus. I want to see what can be done to make it include others that might be needed or wanted by our LibreOffice users.

Below are some of the word listings in the thesaurus I am using. The data file is indexed, so changing the number of lines for any word changes the position of every entry after it, and the index file has to be updated to match. I will have to create or use database-style software to alter the thesaurus. The entry for "1st baron verulam" starts on line #10337 of the thesaurus data file, while "ocher" starts on line #11507218. If I add lines to any word[s] between these two, all of the indexes will be messed up from the first addition on. The "ocher" entry also lists "ochre" as an alternate spelling.

1st baron verulam|1
(noun)|Bacon|Francis Bacon|Sir Francis Bacon|Baron Verulam|1st Baron Verulam|Viscount St. Albans|statesman (generic term)|solon (generic term)|national leader (generic term)|philosopher (generic term)

aarp|1
(noun)|Association for the Advancement of Retired Persons|AARP|association (generic term)
aas|1
(noun)|Associate in Applied Science|AAS|associate degree (generic term)|associate (generic term)

ocher|3
(adj)|ochre|chromatic (similar term)
(noun)|ochre|orange yellow (generic term)|saffron (generic term)
(noun)|ochre|earth color (generic term)

repeat|7
(noun)|repetition|periodic event (generic term)|recurrent event (generic term)
(verb)|reiterate|ingeminate|iterate|restate|retell|tell (generic term)
(verb)|duplicate|reduplicate|double|replicate|reproduce (generic term)
(verb)|recur|happen (generic term)|hap (generic term)|go on (generic term)|pass off (generic term)|occur (generic term)|pass (generic term)|fall out (generic term)|come about (generic term)|take place (generic term)
(verb)|echo|utter (generic term)|emit (generic term)|let out (generic term)|let loose (generic term)
(verb)|take over|act (generic term)|move (generic term)
(verb)|reprise|reprize|recapitulate|play (generic term)|spiel (generic term)

smut|9
(noun)|carbon black|lampblack|soot|crock|carbon (generic term)|C (generic term)|atomic number 6 (generic term)
(noun)|plant disease (generic term)
(noun)|smut fungus|fungus (generic term)
(noun)|obscenity|vulgarism|filth|dirty word|profanity (generic term)
(noun)|pornography|porno|porn|erotica|creation (generic term)|creative activity (generic term)
(verb)|change (generic term)|alter (generic term)|modify (generic term)
(verb)|stain (generic term)
(verb)|mold (generic term)|mildew (generic term)
(verb)|infect (generic term)|taint (generic term)

snake dance|2
(noun)|file (generic term)|single file (generic term)|Indian file (generic term)
(noun)|ritual dancing (generic term)|ritual dance (generic term)|ceremonial dance (generic term)

trammel|6
(noun)|trammel net|fishnet (generic term)|fishing net (generic term)
(noun)|pothook (generic term)
(noun)|restraint (generic term)|constraint (generic term)
(noun)|shackle|bond|hamper|restraint (generic term)|constraint (generic term)
(verb)|trap|entrap|snare|ensnare|capture (generic term)|catch (generic term)
(verb)|restrict|restrain|limit|bound|confine|throttle|control (generic term)|hold in (generic term)|hold (generic term)|contain (generic term)|check (generic term)|curb (generic term)|moderate (generic term)
trammel net|1
(noun)|trammel|fishnet (generic term)|fishing net (generic term)
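
Because every entry in the listing above has a "word|N" header followed by N meaning lines, the index can be regenerated from the data file after each edit instead of being corrected by hand. The following is only a sketch under some assumptions: that the data file begins with a single encoding line, that the index stores the byte offset of each header line, and that the index file consists of the encoding, the entry count, and then "word|offset" lines. The names build_index and format_index are invented for this example. If the index really stores line numbers, a line counter would replace the byte counter.

```python
def build_index(dat_bytes):
    """Return (headword, byte_offset) pairs for a thesaurus data file.

    Assumptions: the first line is an encoding name, and each entry is a
    "word|N" header followed by exactly N meaning lines.
    """
    lines = dat_bytes.splitlines(keepends=True)
    entries = []
    pos = len(lines[0]) if lines else 0  # skip the assumed encoding line
    i = 1
    while i < len(lines):
        header = lines[i].strip()
        if not header:  # tolerate stray blank lines between entries
            pos += len(lines[i])
            i += 1
            continue
        word, _, count = header.rpartition(b"|")
        n = int(count)
        entries.append((word.decode("utf-8"), pos))
        block = lines[i:i + n + 1]  # header plus its N meaning lines
        pos += sum(len(line) for line in block)
        i += n + 1
    return entries


def format_index(encoding, entries):
    """Render the assumed index layout: encoding, count, word|offset lines."""
    rows = [f"{word}|{offset}" for word, offset in sorted(entries)]
    return "\n".join([encoding, str(len(rows))] + rows) + "\n"


if __name__ == "__main__":
    sample = (
        b"UTF-8\n"
        b"aarp|1\n"
        b"(noun)|Association for the Advancement of Retired Persons|AARP|association (generic term)\n"
        b"aas|1\n"
        b"(noun)|Associate in Applied Science|AAS|associate degree (generic term)|associate (generic term)\n"
    )
    print(format_index("UTF-8", build_index(sample)), end="")
```

With something like this, adding a meaning line to one word only requires rerunning the script; every later offset is recomputed in one pass.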




On 11/10/2011 10:55 PM, Bruce Carlson wrote:
Hi Tom,

I just added the total number of words in these dictionaries and it's three
million, nine hundred and thirty one thousand words.
An unbelievable amount of work for one guy.
I would say Brilliant work.

Cheers,
Bruce.


Hi :)
It is good to see you have all those dictionaries accepted as extensions
listed on one page together
http://extensions.libreoffice.org/extension-center/american-british-canadian-spelling-hyphen-thesaurus-dictionaries/releases/1.0
Good work chap!
Regards from
Tom :)

--
For unsubscribe instructions e-mail to: users+help@global.libreoffice.org
Problems?
http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be
deleted





