

Gary,

Thanks for the links and analysis.

I haven't attempted the obvious test, which is to save the 
RTF that fails in LO back to RTF and then compare to see what
got left by the side of the road.
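
A rough sketch of how that comparison could be automated, in Python.
It only tallies control-word names before and after the round trip,
which is a crude approximation of a real diff. The function names
and both sample strings are made up for illustration; they are not
taken from 96.rtf.

```python
import re
from collections import Counter

# A backslash followed by either a control word (letters plus an optional
# signed number) or any single character (a control symbol such as \* or \\).
CONTROL = re.compile(r"\\(?:([a-zA-Z]+)(-?\d+)?|.)", re.DOTALL)

def control_word_counts(rtf_text):
    """Count how often each control-word name occurs in an RTF string."""
    return Counter(m.group(1) for m in CONTROL.finditer(rtf_text)
                   if m.group(1) is not None)

def dropped_control_words(original, resaved):
    """Names used in the original that no longer appear after re-saving."""
    before = control_word_counts(original)
    after = control_word_counts(resaved)
    return sorted(name for name in before if name not in after)

# Hypothetical before/after content, standing in for the real files.
original = r"{\rtf1\ansi{\*\do\dobxpage\dptxbx}{\shp{\*\shpinst\shpleft0}}\par Hello}"
resaved = r"{\rtf1\ansi\par Hello}"
print(dropped_control_words(original, resaved))
# prints ['do', 'dobxpage', 'dptxbx', 'shp', 'shpinst', 'shpleft']
```

Binary (\bin) data and hex-coded payloads are not treated specially
here, so this is a first-pass filter rather than a validator.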

From a superficial look at 96.rtf and the RTF specifications, my
impression is that the specification (and, specifically, the way
96.rtf is coded) makes extensive provision for all manner of
up-/down-level adjustments between different versions and
capabilities of software.  I wonder whether this is being handled
well in the currently-implemented import-export features, but I did
not dig deep enough to determine that one way or the other.

 - Dennis

DEEPER ANALYSIS

I did waste a fun evening becoming re-acquainted with RTF though.  
My first romance with the format was around 1989, when I tricked 
Borland Paradox (MS-DOS version) into producing text files of reports 
that were actually RTF documents.  I could import those into a Xerox
workstation desktop publisher and make nice, paginated documents 
from them.  (I was compiling a glossary built in a database.)

In examining the actual RTF of the 96.rtf example, it was very 
interesting to see how little of the RTF file is actually needed 
to accomplish the result. (There is a ton of overhead material.)

I also started looking through the RTF specifications, using RTF 
Specification 1.9.1 (Office 2007 level).  It reminded me what a 
fascinating format the underlying RTF structure is.  And there's 
sample code in the specification, although I am tempted to see 
how fast I could make a processor of my own to serve as the basis 
for an RTF forensic analysis and validation tool.

The number of control-word (sort of like an XML element tag) details
is immense, of course, but very little is needed to make a simple 
document.
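
To give a feel for how little is needed, here is a small Python
sketch that writes out a bare-bones RTF document.  The helper name,
the font choice and the sample text are my own assumptions for
illustration, and only plain ASCII text is handled.

```python
def minimal_rtf(text):
    """Build a bare-bones RTF document holding one paragraph of ASCII text."""
    # Backslash and braces are structural in RTF, so they must be escaped.
    escaped = text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")
    return ("{\\rtf1\\ansi\\deff0"
            "{\\fonttbl{\\f0 Times New Roman;}}"
            "\\f0\\fs24 " + escaped + "\\par}")

print(minimal_rtf("Hello, RTF {world}"))
# prints {\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}\f0\fs24 Hello, RTF \{world\}\par}
```

Everything beyond the header, one font table entry and the paragraph
itself is optional.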

One important feature: Since the *1987* specification the "\*" 
prefix feature has been used to identify control words whose data 
should be completely ignored if the control word is not recognized 
or supported.  For a control word that is not recognized and that 
lacks the prefix, the content is presumably to be kept in-line 
assuming it is in a place where content is being expressed.  Drawing 
objects tend to be introduced by {\*\do ...}, for example, and those
are used in important ways in 96.rtf.
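
The rule can be sketched in a few lines of Python.  This is a toy
reader under stated assumptions, not how LO or Word do it: the
function name and the `known` set are hypothetical, and \'hh
escapes, \bin data and the usual header tables are not handled.

```python
import re

# Control word (with optional parameter and delimiter space), control
# symbol, brace, or a run of plain text.
TOKEN = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)", re.DOTALL)

def visible_text(rtf, known):
    """Extract text, discarding {\\* ...} groups whose control word is unknown."""
    out = []
    skip_stack = [False]   # one flag per open group: discard its content?
    star_pending = False   # True immediately after a "\*" control symbol
    for m in TOKEN.finditer(rtf):
        word, _param, symbol, brace, text = m.groups()
        if brace == "{":
            skip_stack.append(skip_stack[-1])      # inherit the parent's state
        elif brace == "}":
            if len(skip_stack) > 1:
                skip_stack.pop()
        elif symbol == "*":
            star_pending = True
            continue
        elif word is not None:
            if star_pending and word not in known:
                skip_stack[-1] = True              # ignorable and unrecognized
            # An unknown word without "\*" is dropped, but its text is kept.
        elif symbol is not None:
            if symbol in "\\{}" and not skip_stack[-1]:
                out.append(symbol)                 # escaped literal character
        elif text is not None and not skip_stack[-1]:
            out.append(text.replace("\r", "").replace("\n", ""))
        star_pending = False
    return "".join(out)

# Hypothetical fragment: one ignorable drawing group, one unknown plain group.
sample = r"{\rtf1 Before {\*\do\dobxpage drawing data}{\newword kept} after}"
print(visible_text(sample, known={"rtf"}))
# prints Before kept after
```

A reader that does recognize \do would pass it in `known` and process
the group instead of discarding it.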

There are other interesting facets in the specification.  For 
example, an RTF can have non-Unicode and Unicode-based alternatives
of the same content, for selective use depending on the capabilities 
of the processor that is consuming the RTF.
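
One concrete form of this is the \uN control word, which carries a
Unicode value followed by a fallback for readers that do not
understand it.  The Python sketch below assumes the default of one
fallback character per \uN and a cp1252 code page.  Both are
assumptions, and the function name is invented for illustration.

```python
import re

# \uN, an optional delimiter space, then one fallback: either a \'hh hex
# escape or a single plain character.
UNICODE_ESCAPE = re.compile(r"\\u(-?\d+) ?(\\'[0-9a-fA-F]{2}|[^\\{}])")

def decode(fragment, unicode_capable=True, codepage="cp1252"):
    """Resolve \\uN escapes as either kind of reader would see them."""
    def pick(match):
        if unicode_capable:
            code = int(match.group(1))
            if code < 0:           # N is a signed 16-bit value
                code += 65536
            return chr(code)
        fallback = match.group(2)
        if fallback.startswith("\\'"):
            return bytes([int(fallback[2:], 16)]).decode(codepage)
        return fallback
    return UNICODE_ESCAPE.sub(pick, fragment)

print(decode(r"Cost: 5 \u8364?"))                          # Cost: 5 €
print(decode(r"Cost: 5 \u8364?", unicode_capable=False))   # Cost: 5 ?
```

The same text therefore degrades to "?" on an old reader while a
newer one gets the euro sign.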

In addition, there are Word97-2007 shape objects in {\shp ...} and 
those also figure significantly in 96.rtf.  These also permit an 
optional Word 6.0/95 alternative {*\shprslt ...} and I see that 
those are present as well in 96.rtf.

Finally, although some of OOXML is mapped into RTF (for example, 
MATHML), other parts of OOXML that are newer than the binary formats 
are included as XML and the OOXML specification applies.  (The way 
XML is embedded in RTF is a bit gnarly - the XML is coded in hex 
streams so the RTF parser is not confused.)  This may be one reason
it has not been necessary to update the RTF specification for Office
2010.
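
Undoing the hex coding is the easy part.  A minimal Python sketch,
assuming the payload is simply UTF-8 text written as hex digits with
arbitrary line breaks (real files may wrap the data in a further
container, which this does not attempt to unpack).  The function
name and the sample payload are hypothetical.

```python
def decode_hex_stream(hex_text, encoding="utf-8"):
    """Turn a hex-coded payload (line breaks allowed) back into text."""
    compact = "".join(hex_text.split())   # drop spaces and line breaks
    return bytes.fromhex(compact).decode(encoding)

# Hypothetical payload: "<note>hi</note>" written as hex over two lines.
payload = "3c6e6f74653e\n68693c2f6e6f74653e"
print(decode_hex_stream(payload))
# prints <note>hi</note>
```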

There are many other provisions for up- and down-level compatibility 
and soft adjustment to the capabilities of a given RTF consumer.  It 
is rather remarkable, though whether such material is included 
depends on the quality of the producer, and whether it is exploited 
depends on the quality of the consumer.

-----Original Message-----
From: NoOp [mailto:glgxg@sbcglobal.net] 
Sent: Wednesday, September 21, 2011 17:06
To: users@global.libreoffice.org
Subject: [libreoffice-users] Re: RTF files rendering: huge differencies in LO 3.4.2 and MSO 2010 -- 
bug?

On 09/21/2011 09:02 AM, Arkady wrote:
Hi all!


There's trouble opening a simple RTF file in LO 3.4.2. It opens, but
its rendering is incorrect. The same file is opened correctly by MSO 2010.
Could someone provide feedback on the issue? 

RTF has always been an issue in OOo and LO. While some recommend using
it, OOo & LO opinion has always been otherwise. I'm not sure exactly
why, as Microsoft has had a published RTF spec for many years. But the
spec seems to be a moving target[1]. Perhaps it's because Microsoft
doesn't always follow their own spec? Samples:

1.
<http://social.msdn.microsoft.com/Forums/ar/innovateonoffice/thread/1cc049b9-d63e-4f5e-b98e-e48e8ee78e94>

[1]
http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=7105
[Word 2003: Rich Text Format (RTF) Specification, version 1.8]
http://www.microsoft.com/download/en/details.aspx?id=10725
[Word 2007: Rich Text Format (RTF) Specification, version 1.9.1]



For reference:


Image of LO 3.4.2 with opened file 96.rtf
http://nabble.documentfoundation.org/file/n3355881/file_96_in_libreoffice_342.jpg 


Image of MSO 2010 with opened file 96.rtf
http://nabble.documentfoundation.org/file/n3355881/file_96_in_mso_2010.jpg 


Original RTF file: 
http://nabble.documentfoundation.org/file/n3355881/96.rtf 96.rtf 

My guess is that because it's a drawing it's an issue.

http://en.wikipedia.org/wiki/Rich_Text_Format
<quote>
However, RTF drawing objects are not supported in many RTF
implementations, such as OpenOffice.org[53], LibreOffice, KWord,
Abiword[54] or IBM Lotus Symphony (up to version 1.3 only some limited
support[55]; improved in later versions). When a RTF document with
drawing objects is opened in a software that does not support RTF
drawing objects, they are not displayed at all. Some implementations
will also not display any texts inside drawing objects.[56][57]
Similarly, when a document with drawing objects is saved as RTF in a
software that does not support RTF drawing objects, these are not
preserved in the RTF file. (For example, OpenOffice.org supports drawing
objects in some file formats (e.g. in ODF, SXW, DOC), but do not support
RTF drawing objects.)
...
Each of RTF implementations usually implements only some versions or
subsets of RTF specification. Many of the available RTF converters
cannot understand all new features in the latest RTF specifications.
</quote>

I tested with MS Word 2003 and it opens fine. However with OOo versions
(linux and Windows) 3.3.x - 3.4-dev, and LO 3.3.4 and LO 3.4.3 the issue
is as you show in your .jpg.


Another problematic RTF file (incorrectly rendered table in LO 3.4.2): 
http://nabble.documentfoundation.org/file/n3355881/requisites_table_in_russian.rtf
requisites_table_in_russian.rtf 

You are correct; the second table isn't rendered as a table (ending with
the last data ...@genesis.ru), but is instead converted from table to text.

You have a valid bug/issue, but my guess is that you'll be waiting a
*very* long time before the issue(s) get resolved (if at all). I'm not
posting this to discourage you from using LO, but I would discourage you
from using RTF in general.
...


-- 
For unsubscribe instructions e-mail to: users+help@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted




