2016 Archives by date, by thread · List index


Hello! One quick unrelated thing before the real post...

The mailing list bot sent out an email that contained a broken link:

Please read

http://wiki.documentfoundation.org/Development/Use_of_MailList before posting.


On to the real email:

Hopefully I'm not bothering you too much! Sorry to intrude on your dev mailing list. This may seem 
a little off topic, but I have been banging my head on this forever, and I know you guys are the 
experts! Perhaps we can all get some help out of this, too!

I have been running into an issue that I know the developers of LibreOffice have likely faced 
before! If any of the developers who have worked on this would be willing to talk about it, I'd 
really appreciate it!

Basically, I have some questions on how to calculate the spacing between lines when parsing and 
rendering a docx file.

My requirement is to exactly match Word, not necessarily the OOXML spec, in the spacing between 
lines in a simple paragraph.

In order to try to do this, I have built a tool to analyze the differences between my layout and 
Word's layout. To do so it does the following:

- First it generates one or more docx files.
- Next it creates PDFs from the docx files. It uses Word to render each docx to "word.pdf", and my 
program to render the same docx to "me.pdf".
- Then it analyzes the resulting PDFs for differences in layout.

So, for example, my tool would:

- Create a document "template.docx" with 1000 "a" characters in a single run of text with the same 
properties.
- Make a "word.pdf" and "me.pdf" from this docx.
- Calculate info from the PDFs, in particular the line spacing in terms of the leading between a 
line's ascent and the previous line's descent (our (Ascent + Descent) values are identical-ish, so 
all that differs is the whitespace between lines). I often think of it as the line's 
whitespace...
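
A minimal sketch of the leading calculation described above, in Python. The Line type, its field 
names, and the convention that y grows downward from the top of the page are assumptions made for 
illustration; they are not taken from the actual tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    # Hypothetical record for one rendered line, all values in points.
    baseline_y: float  # distance from the top of the page down to the baseline
    ascent: float      # extent above the baseline (positive)
    descent: float     # extent below the baseline (positive)


def leadings(lines: List[Line]) -> List[float]:
    """Whitespace between each line's top and the previous line's bottom."""
    gaps = []
    for prev, cur in zip(lines, lines[1:]):
        prev_bottom = prev.baseline_y + prev.descent  # lowest point of the previous line
        cur_top = cur.baseline_y - cur.ascent         # highest point of the current line
        gaps.append(cur_top - prev_bottom)
    return gaps
```

Running this over the lines extracted from "word.pdf" and from "me.pdf" gives two lists of gaps 
that can be compared line by line.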

This tool showed me that the leading varies greatly from font to font.

To show this, I used the tool to make thousands of these comparisons, generating documents for 
every combination of:

- Each font installed on the system
- "a", "y", and a mix of letters and spaces
- Different font sizes
- Different line spacing types (Single, One and a half, and Double)
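
The combinations above can be enumerated roughly like this. It is only a sketch: the font names, 
the mixed sample text, and the font sizes are placeholders, not the values the real run used.

```python
from itertools import product

# Placeholder inputs; the real run covers every font installed on the system.
FONTS = ["Arial", "Times New Roman"]
SAMPLES = {
    "a": "a" * 1000,
    "y": "y" * 1000,
    "mixed": "The quick brown fox " * 50,
}
SIZES_PT = [8, 12, 24]
SPACINGS = ["Single", "One and a half", "Double"]


def test_cases():
    """Yield one document description per font/sample/size/spacing combination."""
    for font, (label, text), size, spacing in product(
        FONTS, SAMPLES.items(), SIZES_PT, SPACINGS
    ):
        yield {
            "font": font,
            "sample": label,
            "text": text,
            "size_pt": size,
            "spacing": spacing,
        }
```

Each yielded description would drive one docx, which is then rendered to "word.pdf" and "me.pdf".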

I was hoping to find groupings, such as "this type of font has 1.3 times my calculation of leading".
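
As a sketch of the kind of grouping I mean: bucket fonts by the ratio of Word's leading to mine. 
The function name and the row layout are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_ratio(rows, precision=1):
    """Group font names by round(word_leading / my_leading, precision).

    rows is an iterable of (font_name, word_leading, my_leading) tuples.
    Fonts where my leading is zero have no defined ratio and go under None.
    """
    groups = defaultdict(list)
    for font, word_leading, my_leading in rows:
        if my_leading == 0:
            key = None
        else:
            key = round(word_leading / my_leading, precision)
        groups[key].append(font)
    return dict(groups)
```

A clean result would be a handful of large buckets, for example many fonts landing under 1.3.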

I was able to conclude far less than I had hoped, and was wondering if you could help me further 
with the issue of calculating line spacing. I'm providing you with a file that is best downloaded 
and then opened using the filters in the header row. Note that it's not totally complete: there 
are missing entries, but I doubt they will be a problem for anyone. I'm going to regenerate it 
soon, but it's pretty slow, so I'm finishing up some changes to it first.

Here is a comparison of our software's layout vs Word's layout for every font installed on my 
system, etc. (attached and linked)
https://drive.google.com/file/d/0BzQpUdPjnJUUclRXVXFkaEh3Mms/view?usp=sharing

I'm not positive, but I believe the issue could be one of the following:

- Word is using a different process than we are to calculate the "leading" of a font. We don't 
parse the font files ourselves, and instead rely on libraries to get font sizing information. 
Perhaps in the "world of font files" I am missing something, and Word is parsing the fonts 
directly and differently.
- Word has some sort of lookup table that handles groups of fonts, or an algorithm, that scales a 
font's leading up or down based on some criteria I am unaware of.
- Word is using an additional criterion besides leading, ascent, and descent to determine line 
spacing.

Please feel free to email me at nathanb@windward.net

Thank you so much for your time!!

I know you guys aren't trying to emulate Word (I very much enjoy Writer), but I'm sure you've had 
complaints from people who open documents made in Word that their friend sent them and see major 
formatting differences, such as different pagination.

Microsoft is famous for having been both open and closed about this spec, thus allowing its 
widespread adoption without proper competition, and I think that openness between communities 
trying to work with it is very helpful. We'd be happy to exchange information about quirks and 
things we find that differ from the spec.
