Fonts, Font Substitutions, Languages, Graphite, Justification, etc.

Documentation Team:

Some time ago, I provided an answer to a user who was having trouble using
the Thai language and Thai fonts in LibreOffice; that thread can be seen in
the Users section at
http://nabble.documentfoundation.org/saving-documents-written-in-Thai-tt4088195.html#a4088266.

I was only able to answer this because I happened to have spent hours
trawling the web to get my own Thai documents to work effectively in Writer.
A few readers suggested gathering tips on this subject in one place, but if
that ever happened, I'm unaware of it.

Of course, with each new release of Writer, something changes (usually
breaks), and I now have troubles displaying and printing Writer documents
with certain fonts that worked well a year ago. If I actually understood
what was going on, it might not be that much of a pain to cure the problem
but, with the apparent lack of documentation for this whole subject of font
rendering, changes in justification and hyphenation using the same font from
version to version, inconsistent support for OpenType features and/or
Graphite features - I'm just thrashing around in the dark. It seems like
these features should either just be removed or implemented correctly (or at
least not inconsistently).

I've already had to switch one large document and several smaller ones to
W*** on W****** because Writer no longer handles graphics in tables (well,
it was acknowledged as a new bug [86578 and others] after enough people
reported it, but no activity seems to be taking place). If I now have to
switch another group of documents away from Writer because font handling
(and by that I mean spacing, kerning, etc.) has become screwy, I guess you
can suspect what I'm tempted to do. A new icon set is a nice thing, but ...

I'm using 64-bit Ubuntu 14.04 LTS, and I know the font handling routines in
the various operating systems (and even different display managers) have
differing capabilities and behaviors, but the issues I see all seem to occur
only in LibreOffice. If there were some documentation available I would be
able to tell if I'm just setting things up improperly for recent changes or
experiencing a "bug."

I apologize (really) if this sounds petulant, but is there any source for
documentation on how this genre of stuff is supposed to work? Or: What OS
libraries are recommended? Which display and/or printing libraries should be
installed or removed? Are there third party kerning or justification
routines that can be substituted or is everything done internally in
LibreOffice? What should be done in templates, and what should be done in
Styles? And which of the substitution mechanisms are best suited for which
purposes?

Thanks.

A few readers suggested gathering tips on this subject in one place, but if that ever happened, I'm unaware of it.

It probably didn't.

Of course, with each new release of Writer, something changes (usually
breaks), and I now have troubles displaying and printing Writer documents
with certain fonts that worked well a year ago. If I actually understood

This is a side effect of release early and release often.

The theory is that the changes, especially things that are broken, are
reported to the appropriate party in a timely manner.

The practice is that the people who are most affected by the breakage
don't have any idea what to report.

but, with the apparent lack of documentation for this whole subject of font rendering, changes in justification and hyphenation using the same font from
version to version, inconsistent support for OpenType features and/or Graphite features

This is where one is usually told: "Read the source, Luke. Read the
source". Advice that doesn't help, when one has no idea what to look
for, or even where to look for it.

(and by that I mean spacing, kerning, etc.) has become screwy, I guess you can suspect what I'm tempted to do.

Ironically, you might be able to solve that yourself.
The fix is an extension called "Typography toolbar".
You'll have to rewrite it for Thai fonts --- which probably is a good
idea for various other reasons.

I apologize (really) if this sounds petulant, but is there any source for
documentation on how this genre of stuff is supposed to work?

AFAIK, not that non-programmers can understand.

And which of the substitution mechanisms are best suited for which purposes?

None of them.

jonathon

Uhh, thanks (I think) ...

I looked at the toolbar and it seems like, if used with styles, it could
help quite a bit, so Thanks for that. It also appears that it might only
work with two specific fonts. I'll experiment a bit and see, but is that
true?

Re: "You'll have to rewrite it for Thai fonts:" Looking at the oxt file, it
refers to StarBasic, but when unarchiving it seems to consist entirely of
headers unless I'm looking in the wrong place. Where would I locate the
source code for the routines the toolbar calls, what language do they use,
and what compiler (or are they also interpreted?) would be required for a
Linux installation?

Re: "any source for documentation on how this genre of stuff is supposed to
work? >> AFAIK, not that non-programmers can understand." While never
primarily a programmer, I wrote my first program in the early 1970s and continued to
do so on and off through my entire career, using a wide variety of
languages. So I would like to at least attempt to understand it - can you
point me to whatever you were referring to?

Thanks again for the response ...

work with two specific fonts. I'll experiment a bit and see, but is that true?

Yes, it only works with those two fonts. Which is why I said you'd have
to rewrite it for Thai fonts.
(BTW, can you recommend one or two _good_ Thai fonts? I can't tell if
the fonts I'm using look bad, because they are bad, or if it has to do
with the point size I'm using. I think I set the font size to 15 points.
I know that below 12 points everything in Thai looks bad.)

Re: "You'll have to rewrite it for Thai fonts:" Looking at the oxt file, it
refers to StarBasic, but when unarchiving it seems to consist entirely of headers unless I'm looking in the wrong place.

I thought it was an FSF.hu project, which means that source code should
be somewhere, if it is not included in the extension.

I can't find an email address for Németh László, otherwise I'd ask him
about source code. I can't tell, but a blog of his on LibreOffice.HU
implies that he is no longer working on it, because LibO has built in
support for what that extension does. But then my ability to read
Hungarian is close to non-existent, so I'm probably mis-understanding
it. (For some reason, the translator extension I use on my web browser,
has decided that the blog is in English!)
If you can read Hungarian, or your browser has a translation extension
that works, you might glean more details on LibreOffice.HU. Németh has
written a couple of blog posts on typography. How relevant they are to
your specific situation, I don't know. :(

Where would I locate the source code for the routines the toolbar calls,
what language do they use, and what compiler (or are they also
interpreted?) would be required for a Linux installation?

Those calls are probably to one or more LibreOffice APIs. Most of the
LibreOffice code is C++, but there is some Python and Java. (LibreOffice
hasn't quite yet banished all Java from the program. They are working on
that, though.)

I don't know what language the typography toolbar extension is written
in. My choice would be Python, but C++ and Java are also candidates. (I
obviously haven't looked at the source code for the extension.)
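
One way to see what an extension actually contains, without guessing: an .oxt file is an ordinary zip archive, and Basic modules inside it are normally stored as .xba files. The following is only a minimal sketch; the suffix-to-description mapping and the function name summarize_oxt are my own assumptions for illustration, not anything taken from the Typography toolbar itself.

```python
import zipfile
from collections import defaultdict

# Rough guesses at what each suffix holds; this mapping is an
# assumption for illustration, not an official list.
KINDS = {
    ".xba": "Basic module",
    ".py": "Python script",
    ".jar": "Java archive",
    ".xcu": "configuration (toolbars, menus)",
}


def summarize_oxt(path_or_file):
    """Group the members of an .oxt (a zip archive) by apparent kind."""
    found = defaultdict(list)
    with zipfile.ZipFile(path_or_file) as oxt:
        for name in oxt.namelist():
            for suffix, kind in KINDS.items():
                if name.lower().endswith(suffix):
                    found[kind].append(name)
    return dict(found)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_oxt.py some-extension.oxt
    for kind, names in summarize_oxt(sys.argv[1]).items():
        print(kind)
        for name in names:
            print("   ", name)
```

If the listing shows .xba files, the macro source is in there as XML-wrapped Basic text and no compiler is needed; if it shows only .xcu files, the extension probably just wires toolbar buttons to code that lives elsewhere.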

If you're going to compile the extension, I'd suggest using the same
process as for LibreOffice.

Re: "any source for documentation on how this genre of stuff is supposed to
work? >> AFAIK, not that non-programmers can understand." So I would like
to at least attempt to understand it - can you point me to whatever you
were referring to?

I should have known you'd ask me that.
And I've forgotten what the tool to use is. :(

Clang (
https://wiki.documentfoundation.org/Development/Clang_Code_Analysis )
might help you figure out why LibreOffice won't compile correctly. (This
isn't the bug tool I mention below.)

What I'm trying to remember, is the name of the tool that can be used to
pull up code related to specific functions. In this instance, typography.

The easy/hard way is to follow the instructions at
https://www.libreoffice.org/about-us/source-code/
and
https://wiki.documentfoundation.org/Development/How_to_build
and then hope/pray/curse, that the source code is commented enough, that
the typography related stuff is findable through using various command
line tools, if you can't find anything with your usual IDE.
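
For what it's worth, the "various command line tools" approach can be sketched in a few lines of Python once the source tree is checked out. This is only a rough sketch under my own assumptions: the function name search_tree and the keyword pattern are hypothetical, and the .cxx/.hxx suffixes are simply the ones LibreOffice uses for its C++ files.

```python
import os
import re


def search_tree(root, pattern, suffixes=(".cxx", ".hxx")):
    """Yield (path, line number, line) for every matching line under root."""
    regex = re.compile(pattern, re.IGNORECASE)
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if not fname.endswith(suffixes):
                continue
            path = os.path.join(dirpath, fname)
            # errors="replace" keeps the scan going past odd encodings
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if regex.search(line):
                        yield path, lineno, line.rstrip()


if __name__ == "__main__":
    import sys

    # Usage: python search_tree.py /path/to/source "kern|justif|hyphen"
    for path, lineno, line in search_tree(sys.argv[1], sys.argv[2]):
        print(f"{path}:{lineno}: {line}")
```

Expect a lot of noise from a keyword search like this; it narrows down where to start reading rather than explaining anything.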

Forget the claim about 26GB. You need at least 100 GB of free disk
space. And don't run anything else on your computer, when doing the
searches. Resource_Hogs-R-Us is their motto.

I know that there is an easier way. I just don't remember what it is, or
how to get there. :(

This is really frustrating. There is/was an online tool, designed for
bug hunting, especially by people reporting bugs, but it can be utilized
for other things, including being semi-abused for this use-case.

http://cgit.freedesktop.org/libreoffice is the Git repository browser.
Not quite what I was wanting, but helps segregate out the unwanted code,
from the wanted code.

I literally meant that the only documentation was the source code. Back
when Sun ran OOo, there was a fair amount of documentation for most of
the source code, but I don't know what happened to it. :(
Documentation written by programmers, for programmers, to explain what
was supposed to happen, and occasionally, functional equivalents, tests,
and other useful things. It probably is still floating around,
somewhere. Nothing useful if one was writing documentation, to explain
how to use the software, though.

jonathon

Jonathon -

Thanks for the response; I get the sense that, like many other areas of
LibreOffice, development relies entirely on a volunteer who happens to have
the time, happens to have the inclination, happens to have familiarity with
the subject matter (in this case, rather arcane I guess), happens to be
familiar with any side effects of his/her efforts, and happens to be a
decent enough coder to not break anything. Coupled with those
characteristics, some knowledgeable user/tester/guinea pig(s) would need to
be available. Given the statistical improbabilities there, I suppose we live
with what we have.

Re: your request "can you recommend one or two _good_ Thai fonts?"

Well, "good" is somewhat dependent on your own esthetics as well as the
purpose for which the fonts are to be used. It would also depend on whether
you will be mixing Thai with English, or even with multiple other
languages/scripts.

Nonetheless, here is a list of some fonts to look at (all are free).

FreeSerif (In a class by itself)

     FreeSerif is a fairly complete Unicode font that includes Thai as well
as most scripts I've ever had a need for. The advantages of this font are a)
it's not at all bad looking, b) there is no need to experiment with matching
sizes across scripts (more about that below), and c) it's free for any
purpose.

     The disadvantage is that, as far as I know, the only matching
sans-serif version (FreeSans) only includes western glyphs.

Droid Sans Thai
Garuda (Sans)
Kinnari (Serif)
Loma (Sans)
Norasi (Serif)
NotoSansThai (part of Google's project)
NotoSerifThai (part of Google's project)
Purisa (Casual)
Sawasdee (Sans)
Tlwg Mono (fixed)
Tlwg Typist (fixed)
Umpush (Sans)
Waree (Sans)

The fonts listed above all include basic Latin glyphs as well as Thai
glyphs. In other words, they are useful for mixing Thai and English (and
limited other western scripts). In some cases (e.g. Droid Sans), you can
obtain similar fonts for other glyph sets/languages and, I suppose, make
your own custom language combinations using FontForge or something similar,
although I haven't tried that. All the fonts listed are free as well. I
should note that Microsoft also offers some suitable fonts that can be found
on the internet, but the licensing/permissions on these is unclear to me, so
I don't use them unless required by a client.
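
To check which Thai-capable fonts are already installed on a Linux system, fontconfig's fc-list command can filter by language. The following is a minimal sketch, assuming fontconfig is installed and fc-list is on the PATH; the function names parse_families and thai_families are made up for illustration.

```python
import subprocess


def parse_families(fc_list_output):
    """Turn `fc-list ... family` output into a sorted list of unique names."""
    families = set()
    for line in fc_list_output.splitlines():
        # One line may carry several comma-separated names for the same font
        for name in line.split(","):
            name = name.strip()
            if name:
                families.add(name)
    return sorted(families)


def thai_families():
    """Ask fontconfig for the families that claim Thai ("th") coverage."""
    out = subprocess.run(
        ["fc-list", ":lang=th", "family"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_families(out)


if __name__ == "__main__":
    for family in thai_families():
        print(family)
```

This only reports what fontconfig believes about language coverage; it says nothing about how well the glyphs are drawn or how they sit next to Latin text.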

Mixing Thai font glyphs with Western glyphs is complicated because of the
way Thai characters are formed. There is no upper case, so the shift key
just gives you access to additional, less frequently used characters, but
there are tall and short letters that can appear to non-Thai speakers as
capitals. Several (though not all) vowels in Thai are symbols placed above
or below the consonants they are joined with. On top of that, Thai not only
has what we would call accents but, being a tonal language, also has tone
marks that can go above some syllables.

Because there are lots of combinations, the placement of some of these
additional glyphs may change depending on how many of them need to go above
the same character at the same time.
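
To make the stacking concrete, here is a small sketch that attaches each combining mark to the character before it, so you can see how many marks land on one consonant. It is a simplification under stated assumptions: it treats every character in Unicode category "Mn" (nonspacing mark) as a stacked mark, which covers the Thai above/below vowels and tone marks but is not a full text-segmentation algorithm. The function name clusters is my own.

```python
import unicodedata


def clusters(text):
    """Split text into (base, marks) pairs.

    Each nonspacing mark (Unicode category "Mn") is attached to the
    character that precedes it.
    """
    result = []
    for ch in text:
        if unicodedata.category(ch) == "Mn" and result:
            base, marks = result[-1]
            result[-1] = (base, marks + ch)
        else:
            result.append((ch, ""))
    return result


if __name__ == "__main__":
    # THO THAHAN + SARA II (vowel above) + MAI EK (tone mark above):
    # one consonant carrying two stacked marks.
    word = "\u0e17\u0e35\u0e48"
    for base, marks in clusters(word):
        names = [unicodedata.name(m) for m in marks]
        print(f"U+{ord(base):04X} carries {len(marks)} mark(s): {names}")
```

A consonant with two marks above it is where a font (or the renderer) has to shift the tone mark upward, which is exactly the case that tends to go wrong.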

Because of these characters, more vertical space is typically required above
and below lines of Thai text; this isn't a big deal if the paragraphs are
either all in English (for instance) or all in Thai, but if the two
languages are mixed in the same line, getting things to look clean takes a
bit of planning when choosing what fonts to use.

The best way to illustrate this is to download the Thai Font Book at
http://ftp.opentle.org/pub/national-fonts/FONTBOOK.PDF. Although the book is
written in Thai (go figure!), the numerous illustrations of how all these
things should work (in way more detail than I'm providing here) are pretty
self-explanatory even if you don't read a word of Thai. Begin on page 13
which shows how the Thai character analogs to what we refer to as ascenders,
descenders, x-height and such things are defined. The very next page shows
how to line up Thai and western characters next to each other in order to
look good. The illustrations are very numerous and very well done.

The bottom line, though, is that if you plan to use multiple languages in a
single document without needing to worry too much about making things go
together, FreeSerif is (again, so far as I am aware) the only go-to font
that will let you do that.

I hope this helps a bit.

P.S. TomD: if you're lurking as usual, can you add this and the parts of my
earlier posting from a year or so ago to whatever resources the doc team
might eventually use to document font usage in LO?? Thanks.

LibreOffice, development relies entirely on a volunteer who happens to have

There are a couple of paid developers. Their coding priority is whatever
the organization that pays them tells them to do.

At least one of the firms that offers paid support automatically
submits all fixes for clients into the LibreOffice code. Sometimes
those bugfixes are for long-standing issues in Bugzilla. Sometimes they are
things that nobody noticed as a problem until their client said: "fix it".

I _think_ that some of the support firms let their developers select a
bug to fix, when their paying clients don't have any bugs that needed to
be fixed yesterday.

characteristics, some knowledgeable user/tester/guinea pig(s) would need to
be available. Given the statistical improbabilities there, I suppose we live

I've forgotten who it is, but somebody (¿QA?, ¿Developer? I really don't
remember) is actively looking for things that have been broken, to test,
either to ensure that the fix worked, or to find out what breaks, the
first build after the code merge is completed.

The critical datapoint here, is that the testing must be completely
automated. No babysitting allowed.

I think it is the same individual that is collecting documents, simply
to open, look at whether or not they "break", close, and go onto the
next one. If/when this type of testing can be automated, it is.

Edge cases that are rare, but break things, are specifically welcome.

Nonetheless, here is a list of some fonts to look at (all are free).

Thanks for the list.

FreeSerif (In a class by itself)

I'll take a look at it.

your own custom language combinations using FontForge or something similar, although I haven't tried that.

I've customized some fonts with FontForge.
Not something I'd like to do with umpteen glyphs, though.

Mixing Thai font glyphs with Western glyphs is complicated because of the
way Thai characters are formed. There is no upper case, so the shift key
just gives you access to additional, less frequently used characters, but
there are tall and short letters that can appear to non-Thai speakers as capitals.

And when those tall letters pop up in the middle of the word, the
non-Thai readers wonder what is wrong with the spelling, or
grammar-checker, try to fix it, and find that they can't, but don't know
what is preventing them from doing so.

Several (though not all) vowels in Thai are symbols placed above or below
the consonants they are joined with. Coupled with the fact that Thai not
only has what we would call accents, but - being a tonal language - has
tone marks that can also go above some syllables.

One of these days, I'm going to figure out how to get styles to only
affect specific glyphs. Then create one style for tone marks, one style
for accents, and one style for the other non-basic glyph stuff.
Basically, give everything their own colour.
It is amazing how the errors pop out, when that colouring is done.
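
As a rough illustration of the classification step that such colouring would need (not of how to apply styles in Writer, which I haven't worked out), the sketch below sorts each character into "tone", "vowel", "other-mark" or "base" and groups consecutive characters of the same kind. The three groupings and the names classify and runs are my own assumptions; the code points are the Thai tone marks (U+0E48 to U+0E4B), the above/below vowel signs, and the remaining Thai combining signs.

```python
from itertools import groupby

# mai ek, mai tho, mai tri, mai chattawa
TONE_MARKS = set("\u0e48\u0e49\u0e4a\u0e4b")
# mai han-akat, sara i, ii, ue, uee (above); sara u, uu (below)
VOWEL_MARKS = set("\u0e31\u0e34\u0e35\u0e36\u0e37\u0e38\u0e39")
# phinthu, maitaikhu, thanthakhat, nikhahit, yamakkan
OTHER_MARKS = set("\u0e3a\u0e47\u0e4c\u0e4d\u0e4e")


def classify(ch):
    """Return the (hypothetical) style bucket a character belongs to."""
    if ch in TONE_MARKS:
        return "tone"
    if ch in VOWEL_MARKS:
        return "vowel"
    if ch in OTHER_MARKS:
        return "other-mark"
    return "base"


def runs(text):
    """Group consecutive characters of the same kind into (kind, text) runs."""
    return [(kind, "".join(chars)) for kind, chars in groupby(text, key=classify)]


if __name__ == "__main__":
    for kind, chunk in runs("\u0e17\u0e35\u0e48"):
        print(kind, [f"U+{ord(c):04X}" for c in chunk])
```

Each run would then get its own character style (and colour); how to apply that automatically inside Writer is the part still to be figured out.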

but if the two languages are mixed in the same line, getting things to
look clean takes a bit of planning when choosing what fonts to use.

Not just in fonts, but also minor modifications of the styles that are used.

One reason I insist on language specific styles, is because that is the
only way to ensure that things stay the way they should, when creating
multilingual content.

The best way to illustrate this is to download the Thai Font Book at
http://ftp.opentle.org/pub/national-fonts/FONTBOOK.PDF.

Thanks. Looks very useful, especially to get an idea of what Thai
esthetic norms are.

descenders, x-height and such things are defined. The very next page shows
how to line up Thai and western characters next to each other in order to
look good. The illustrations are very numerous and very well done.

Looked to me like somebody was really tired of seeing bad alignment of
those writing systems, and wanted people to stop putting out bad work,
when it is as easy to do good work.

together, FreeSerif is (again, so far as I am aware) the only go-to font
that will let you do that.

Code2000 is an ugly pan-unicode font that has fairly sensible fallbacks.
There are a couple of pan-unicode fonts, but they tend to be "ugly".

I did see one that was very pretty, but the price tag wasn't. (US$100,000
per weight.)

earlier posting from a year or so ago to whatever resources the doc team
might eventually use to document font usage in LO?? Thanks.

Almost convinced me to find your material from last year, and add it to
this, then post it to my LibreOffice blog.

I've been meaning to restart it, especially now that I'm writing
material in multiple languages, and writing systems, again.

jonathon

Hi Jonathon:

Re: "Not something I'd like to do with umpteen glyphs, though." I seem to
recall that you can copy a block from one set and drop it into another set
in FontForge, but I'd have to go back and look.

Re: "Looked to me like somebody was really tired of seeing bad alignment of
those writing systems, and wanted people to stop putting out bad work, when
it is as easy to do good work." Rumor has it that this effort, as well as many
other similar technological efforts were actually pushed by the King himself
(his photo is in the book somewhere) - he himself is pretty savvy and is
also a pretty good jazz player besides. Quite a change from his predecessor
from "The King and I."

Re: "Almost convinced me to find your material from last year, and add it to
this, then post it to my LibreOffice blog." The stuff from last year is at
http://nabble.documentfoundation.org/saving-documents-written-in-Thai-tt4088195.html#a4088266
Interestingly enough, I found that you had posted to that same thread; I
didn't realize that you were connected to LO somehow ...

If you do restart your blog and it deals with issues in this area, please
send me a link; I'd like to keep up with this stuff.

I'll look for the Code2000 font and see how it compares ... thanks for that.

Hi :)
I dunno what to do with this info and I can't find the email from a year or
so ago.

Maybe create a wiki-page about it in the FAQ? I could create the wiki-page
if you think that is a good idea. Then you could paste the information in
and I could do some formatting, if you like. What do people think?
Regards from
Tom :)

Hi again Jonathon:

I hadn't heard the term "pan-unicode" that you used, so I looked around a
bit and found the following site:
http://www.fontspace.com/category/pan-unicode

It includes a font named Quivira that appears to be free and also appears to
have a nice set of Thai glyphs (and more glyphs overall than most fonts I've
looked at - which could be good or bad depending on what you're using it for
of course).

I'll take a closer look, but from what I've seen so far, I suspect it should
be added to the list I gave you, and it appears to support a large number of
other languages as well...

P.S. Should any part of this thread be posted to the L10N segment as well?
If so, who would be responsible for doing that? For TomD: I posted a link to
the earlier thread of mine that you couldn't locate in my last reply to
Jonathon just in case you're interested.

-- Frank

At this stage, I'm more interested in fonts that look good.

James Kass (Code2000 creator) paved the way with low-cost/gratis
pan-unicode fonts. I think most of the glyphs are "ugly", but it was all
that was available then. And for some sub-sets, it still is the only thing
that is available.

I'd like to think that typographers are now focusing on "pretty".

jonathon

it is as easy to do good work." Rumor has it that this effort, as well as
many other similar technological efforts were actually pushed by the
King himself.

Makes sense to me. I've read a couple things that imply he thinks that
too many people are putting out "ugly" things, and that makes the
country look bad.

Interestingly enough, I found that you had posted to that same thread;

Doesn't surprise me. I try to follow issues related to multiple writing
systems in the same document.

didn't realize that you were connected to LO somehow ...

I'm not sure I am.

Anyway, the blog title is "Libre Office in a Multi Lingual Environment".
URL is http://libreoffice-environment.blogspot.com/
I haven't made many posts there.

I'll look for the Code2000 font and see how it compares

I don't know if Code2000 is still available.
The creator disappeared from the Internet several years ago.

Unifont is an extremely extensive pan-unicode font.
I don't remember the specific license it is distributed under, but it is
Libre, and Gratis. (Look for it in your *nix repository under fonts.)

jonathon