Interesting problem. Based on my tests, which I detail below, it appears
to be a LibO bug rather than a font problem.
I'm using LibO 3.5.6.2 with Win7 and Adobe Reader.
In the sentence "This is the official version" with Linux Libertine G,
there are two instances of automatic ligatures--the "Th" combination in
"This," and the "ffi" combination in "official." In Adobe Reader, I was
able to find "This" when I did a search, which means that the Reader
recognized the "Th" ligature as a "T" followed by an "h" which is what I
typed into the search box. But, when I tried to search for "official"
the Adobe Reader couldn't find it, which means it did NOT associate my
typing of an "f", "f", "i" with the "ffi" ligature.
Then when I copied and pasted the sentence from the PDF file into a
plain text editor, it placed an "h" before every instance of an "i" just
as was reported. However, this obviously has nothing to do with
ligatures as most of the instances of "i" were NOT included in the
ligatures. In fact, it did not place an "h" before the "i" in the "ffi"
ligature.
For comparison, I ran the same test using Apache OpenOffice 3.4.1 to
see if it is a font issue or a program issue. I'm sorry to report that
it appears to be a program issue. In AOO, I typed the same sentence
using Linux Libertine G, "This is the official version." I then saved it
as a PDF and opened it in Adobe Reader. This time, a search found both
"This" and "official" despite both words containing ligatures. And, when
I copied the sentence into a plain text editor, it copied correctly
without any additions of "h" before "i".
I love the Linux Libertine set of fonts. I use it, not only with LibO
and AOO, but also when I set a document in LaTeX.
I have found that Apache OpenOffice's support for Linux Libertine G
appears to be more complete and polished than LibO's. This may be an
example of that more complete support.
Of course, LibO has its advantages over AOO; for example, it properly
hyphenates American English words, which AOO does not appear to do. It
would be nice if someone could combine the best of both programs into
one complete program (along with the tabbed interface of Lotus Symphony,
yet a third fork of the original OO). But, I won't hold my breath.
Virgil
-----Original Message-----
From: Dan Lewis
Sent: Wednesday, October 31, 2012 7:02 AM
To: users@global.libreoffice.org
Subject: Re: [libreoffice-users] Re: Searchable PDFs from Graphite fonts
On 10/30/2012 11:08 PM, Jonathan Schultz wrote:
I can select the text in the PDF, copy and paste, but get an 'h'
added before most 'i'. I can search, but not if the word is one with
the extra h before i
Steve
That's exactly what I mean. It effectively means no searching.
I tried both Linux Libertine and Linux Biolinum [14 point] on my
3.5.7 version for Ubuntu 64-bit. I cannot replicate the issue with
"added" characters.
Were you using Graphite fonts ('Libertine G'/'Biolinum G')? Those are
the ones where the problem arises. Those are also the fonts that do
ligatures and other lovely typesetting things that make them look so
nice, which is why I want to use them.
Cheers,
Jonathan
Seems to me that you have solved your own problem: it is the
fonts. The search function cannot handle the lovely typesetting
things. As you mentioned, an "i" looks like a "hi" to it. The only real
solution is to not use any of the Graphite fonts in a PDF.
But if you want to search the PDF, have you tried opening it in Draw
and searching for the text there?
--Dan