Hi all,

Thank you for all your enthusiastic help with this.

I also made some progress. Using Infix PDF Editor 5 (Windows trial version under Wine) and/or fontforge (Linux), I could see that the font mapping in the PDF was messed up. The mistaken mappings begin:

> Th -> T
> i -> hi

That is, the PDF converter has broken 'This' at the wrong place into 'T/hi/s' instead of 'Th/i/s'. All successive instances of the 'i' glyph are mapped to 'hi', thus explaining the drunken slurring effect.
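
The effect of the shifted mapping can be reproduced with a toy model. This is only a sketch: the glyph names (`T_h`, `f_f_i`), the `shape` and `extract` helpers, and the dictionary tables are hypothetical stand-ins for the numeric glyph IDs and glyph-to-text table of a real PDF, and nothing here reads an actual file.

```python
# Toy model of PDF text extraction through a glyph-to-text table.
# All names are hypothetical; a real PDF maps numeric glyph IDs instead.

CORRECT = {"T_h": "Th", "i": "i", "f_f_i": "ffi"}
# The broken table as described above: 'Th' -> 'T' and 'i' -> 'hi'.
BROKEN = {**CORRECT, "T_h": "T", "i": "hi"}

LIGATURES = (("ffi", "f_f_i"), ("Th", "T_h"))


def shape(text):
    """Turn text into glyph names, substituting the two ligatures."""
    glyphs = []
    pos = 0
    while pos < len(text):
        for chars, name in LIGATURES:
            if text.startswith(chars, pos):
                glyphs.append(name)
                pos += len(chars)
                break
        else:
            # No ligature matched here: one glyph per character.
            glyphs.append(text[pos])
            pos += 1
    return glyphs


def extract(glyphs, to_unicode):
    """Map glyphs back to text, as a PDF reader does for copy and search."""
    return "".join(to_unicode.get(g, g) for g in glyphs)


sentence = "This is the official version"
print(extract(shape(sentence), CORRECT))  # This is the official version
print(extract(shape(sentence), BROKEN))   # This his the offichial vershion
```

In this model 'This' still comes out intact, because the 'h' dropped from the ligature entry is supplied by the 'i' entry that follows it, while every standalone 'i' elsewhere gains an 'h'.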

I was able to manually repair the mappings using podofobrowser (Opensource, pre-built Windows executable under Wine). Now my real PDF is good. But if I re-create it I need to re-repair it.

I've reproduced the problem with both the Debian and the LO builds of LO. Maybe I should also try installing AOO.

webmaster-Kracked_P_P: would you be able to attach the PDF that you produced from my file so I can look for signs?

My version of the fonts is from January 2012, so that may be a source of the problem. Another possibility, since it's fairly clearly a bug somewhere, is that maybe building for 64 bit architecture makes it disappear.

Unless anyone has a better idea, I'll file this as a LO bug.

Cheers,
Jonathan


On 01/11/12 04:39, VA wrote:
Interesting problem. Based on my tests, which I detail below, it appears
to be a LibO bug rather than a font problem.

I'm using LibO 3.5.6.2 with Win7 and Adobe Reader.

In the sentence "This is the official version" with Linux Libertine G,
there are two instances of automatic ligatures--the "Th" combination in
"This," and the "ffi" combination in "official." In Adobe Reader, I was
able to find "This" when I did a search, which means that the Reader
recognized the "Th" ligature as a "T" followed by an "h" which is what I
typed into the search box. But, when I tried to search for "official"
the Adobe Reader couldn't find it, which means it did NOT associate my
typing of an "f", "f", "i" with the "ffi" ligature.

Then when I copied and pasted the sentence from the PDF file into a
plain text editor, it placed an "h" before every instance of an "i" just
as was reported. However, this obviously has nothing to do with
ligatures as most of the instances of "i" were NOT included in the
ligatures. In fact, it did not place an "h" before the "i" in the "ffi"
ligature.

For comparison, I ran the same test using Apache OpenOffice 3.4.1 to
see if it is a font issue or a program issue. I'm sorry to report that
it appears to be a program issue. In AOO, I typed the same sentence
using Linux Libertine G, "This is the official version." I then saved it
as a PDF and opened it in Adobe Reader. This time, a search found both
"This" and "official" despite both words containing ligatures. And, when
I copied the sentence into a plain text editor, it copied correctly
without any additions of "h" before "i".

I love the Linux Libertine set of fonts. I use it, not only with LibO
and AOO, but also when I set a document in LaTeX.

I have found that Apache OpenOffice's support for Linux Libertine G
appears to be more complete and polished than LibO's. This may be an
example of that more complete support.

Of course, LibO has its advantages over AOO; for example, it properly
hyphenates American English words, which AOO does not appear to do. It
would be nice if someone could combine the best of both programs into
one complete program (along with the tabbed interface of Lotus Symphony,
yet a third fork of the original OO). But, I won't hold my breath.

Virgil



-----Original Message-----
From: Dan Lewis
Sent: Wednesday, October 31, 2012 7:02 AM
To: users@global.libreoffice.org
Subject: Re: [libreoffice-users] Re: Searchable PDFs from Graphite fonts

On 10/30/2012 11:08 PM, Jonathan Schultz wrote:
I can select the text in the PDF, copy and paste, but get an 'h'
added before most 'i'. I can search, but not if the word is one with
the extra h before i
Steve

That's exactly what I mean. It effectively means no searching.

I tried both Linux Libertine and Linux Biolinum [14 point] on my
3.5.7 version for Ubuntu 64-bit. I cannot replicate the issue with
"added" characters.

Were you using Graphite fonts ('Libertine G'/'Biolinum G')? Those are
the ones where the problem arises. Those are also the fonts that do
ligatures and other lovely typesetting things that make them look so
nice, which is why I want to use them.

Cheers,
Jonathan

Seems to me that you have solved your own problem: it is the
fonts. The search function cannot handle the lovely typesetting
things. As you mentioned, an "i" looks like a "hi" to it. The only real
solution is to not use any of the Graphite fonts in a PDF.
But if you want to search the PDF, have you opened it in Draw and
searched for the text in it?

--Dan

