

Hi :)
Imagine a schoolkid getting beaten up every day by a bully (that the
parents all seem to think is a good kid) demanding his tuck-shop
money.  The kid goes home hungry.  At home his parents don't give him
supper because they have already paid for his lunch.  He eventually
grumbles about the bully and gets smacked for telling a lie about the
wonderful bully.

How do you convince people of the truth?  What happens if all the kids
gang up together?  It's now the word of the popular bully against the
word of all the rest.
Regards from
Tom :)





On 5 March 2014 09:52, Tom Davies <tomcecf@gmail.com> wrote:
Hi :)
Please file a bug-report with MS Office for failing to implement their
own format as per their ISO specifications.


The file(s) that you have trouble opening are likely to also be
difficult or even impossible to open in many versions of MS Office.
You really need to be using the same version of MS Office as the
creator of the document.  Try it!  You might be amazed.

Microsoft themselves state that the DocX version in 2007 is a
"transitional" version (=not the same as the ISO spec), as is the one
in 2010 and the default ones in 2013 and 365.  In Ms Office 2013 it is
possible to use "Save As" to use their current "strict" version of
DocX which is supposedly a lot closer to their ISO version.
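
One rough way to check which flavour a given file uses is to look at the namespace declared in its main document part. The sketch below is only that, a sketch: it assumes the main part lives at word/document.xml and is stored in an ASCII-compatible encoding, and the function name docx_flavour is made up for illustration.

```python
import zipfile

# Namespace URIs for the two WordprocessingML flavours.
TRANSITIONAL_NS = b"http://schemas.openxmlformats.org/wordprocessingml/2006/main"
STRICT_NS = b"http://purl.oclc.org/ooxml/wordprocessingml/main"

def docx_flavour(docx):
    """Guess whether a .docx is 'strict', 'transitional' or 'unknown'.

    Hypothetical helper: it only searches word/document.xml for the
    namespace URI and does not validate the file in any way.
    """
    with zipfile.ZipFile(docx) as zf:
        data = zf.read("word/document.xml")
    if STRICT_NS in data:
        return "strict"
    if TRANSITIONAL_NS in data:
        return "transitional"
    return "unknown"
```

Calling docx_flavour("some-file.docx") on the problem attachment would at least tell you which of the two variants you are dealing with before filing a report.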


Such files often rely on someone using a non-MS program to open, make
a few changes (such as anonymising information) and then saving into a
more standards-compliant version of DocX or even better into a format
that everyone can use.


Microsoft's various different implementations appear to have large
proprietary components which are apparently copyright protected,
preventing non-MS implementations from being able to read their files
properly.  By attempting to create filters to properly read the
various implementations of DocX non-MS companies put themselves at
risk of court action!

This is NOT an excuse and NOT anyone being defensive about it and it
is NOT an evasive tactic.  It's one of the reasons why various
companies, governments and organisations are moving (or already have
moved) away from MS formats (which appear to fail any of their
promises of interoperability, just as Rtf did (see relevant court
case)) and towards a truly Open Document Format (which has been
successfully implemented by many programs, on many platforms for years
and is not reliant on the whims of any single vendor or company or
commercial interest (unlike DocX)).


The best bet for exchanging files is to refuse to accept DocX and ask
for the file to be sent in a format which everyone can use, either Doc
(the older MS format which MS dropped just after everyone else was
able to implement it) or Odt.  Note that in 10-20 years files in DocX
format are likely to be extremely difficult to open because the
various implementations are not properly written up anywhere.  Each
ODF format IS properly drawn up and the specification is available for
free from OASIS or, for a charge, through the ISO governing body.  The
DocX specification available, for a charge, through the ISO
organisation is not the same as any implementation except the
non-default one in MS Office 2013.


So, your options for opening the file properly are to
1.  Find out which version of MS Office was used to create the file
and on which version of Windows.  Then buy that version of Windows
and that version of MS Office.
2.  Try out various programs until you stumble onto the one that can
open that version of DocX (and cross your fingers that MS doesn't take
them to court because of it)
3.  Ask the person sending the unreadable files to upgrade to a more
recent version of MS Office (this is the advice you would get from the
rest of the MS world)
4.  Ask the person to please send in a more compatible format, such as
Doc or Odt (this is the option the non-MS world generally advises).

Many people are now sending Pdfs instead of editable versions of
documents specifically because this whole formats issue has become
such a nightmare.  If you ask the originator to send as a PDF then you
will probably find that many other people have made the same request
of them.


Just to be clear I am just a volunteer here and do NOT represent any
official viewpoint of TDF or anyone else associated with LibreOffice
or ODF.  On many of the mailing lists here I am on moderation and/or
have often been threatened with being removed from the mailing lists
because my views are unpopular outside of the normal users' peer-led
support mailing list.  It's similar for quite a few of us here.
Regards from
Tom :)



On 5 March 2014 07:59, warp9pnt9 <warp9pnt9@gmail.com> wrote:
From time to time I receive .docx email attachments from local groups, such
as colleges or community organizations, which may display with varying
amounts of success.  Recently, I received such a .docx as a mail attachment
which only displays partial text.

My goal is to gather sufficient information to file a meaningful bug report
in the hopes of corrective action being taken to help make LibreOffice more
robust by properly displaying documents such as this.  If there is an
unreleased fix or existing unresolved bug report, please inform me.

The .docx file in question has 2 visible pictures (on left side of page),
each in some sort of box of their own, and 3 other colored boxes (2 colored
boxes in right side of page and one full width box across bottom third of
page).  One of the colored boxes (topmost rightmost) has text in it that I
can see.  The other 2 boxes (right-middle, and bottom) have no visible text.
This is the problem area.

I used 7-Zip to extract the .docx to files, and skimmed through the
convoluted .xml soup -- using NotePad++ -- and found that the missing text
was locations, times and topics for some guest speaker series.  There may
also be some embedded or inline images, as I recognized some base64 content
mixed in as well.  Perhaps bullet points or backgrounds.  I did not explore
further.
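
For anyone wanting to repeat this inspection without reading the raw XML by eye, a minimal sketch follows. It assumes the text lives in w:t elements of word/document.xml and that the boxes are stored as w:txbxContent text boxes, which may not hold for every file; the function name docx_text_runs is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML (transitional) namespace used by w:t and w:txbxContent.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def docx_text_runs(docx):
    """Return (all_text, textbox_text) lists read from word/document.xml.

    Hypothetical helper: all_text holds every w:t string in document
    order; textbox_text holds only those found inside w:txbxContent.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    all_text = [t.text or "" for t in root.iter(f"{{{W_NS}}}t")]
    textbox_text = [
        t.text or ""
        for box in root.iter(f"{{{W_NS}}}txbxContent")
        for t in box.iter(f"{{{W_NS}}}t")
    ]
    return all_text, textbox_text
```

Comparing textbox_text with what Writer actually shows would make it easy to state in a bug report exactly which strings go missing. Some files store each text box twice (a primary copy and a fallback copy), so duplicates in the output are not necessarily a sign of trouble.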

I would share the offending document, but it is not completely anonymized,
and I have no software with which to open it and anonymize and retain the
same bug.  Perhaps I could manually edit the .xml text areas.  Then there
are the images, which I must decode and save as external files, to
anonymize, re-encode, and then adjust any byte counts.

Finally, not sure if 7-Zip can create the exact same zip container, or if
that matters?  I'd have to get the 7-Zip compression settings right, usually
reduce dictionary and word size options to minimums (has worked for .xpi and
.jar files in the past, why not .docx).  Assuming I could anonymize and
retain the errant behavior, would it then be helpful to determine the source
of the bug?
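
The manual edit-and-rezip route could probably be scripted instead. The following is a rough sketch under several assumptions: the XML parts are UTF-8, the text sits between w:t tags written with the conventional "w:" prefix, and a plain deflate zip is acceptable to the programs reading the result. The name anonymize_docx is made up for illustration. It edits the raw XML with a regular expression rather than re-serialising it, so everything except the visible text stays byte for byte the same, and it leaves images and all non-XML parts untouched.

```python
import re
import zipfile

# Matches <w:t ...>text</w:t>; assumes the conventional "w:" prefix.
W_T = re.compile(r"(<w:t(?:\s[^>]*)?>)([^<]*)(</w:t>)")

def _mask(match):
    # Replace letters, digits and underscores with "x", but leave XML
    # entities such as &amp; intact so the part stays well-formed.
    masked = re.sub(
        r"&[^;]+;|\w",
        lambda m: m.group(0) if m.group(0).startswith("&") else "x",
        match.group(2),
    )
    return match.group(1) + masked + match.group(3)

def anonymize_docx(src, dst):
    """Copy src to dst, masking text in every .xml part (hypothetical helper)."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Keep the original member order; only the text content changes.
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.endswith(".xml"):
                data = W_T.sub(_mask, data.decode("utf-8")).encode("utf-8")
            zout.writestr(info.filename, data, compress_type=zipfile.ZIP_DEFLATED)
```

Because the zipfile module recomputes sizes and checksums for each member, there are no byte counts in the container to adjust by hand. Whether the masked copy still shows the same missing-text behaviour would need to be checked by opening it in Writer before attaching it to a report.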

Can anyone think of any other troubleshooting I might try, either as
LO/Writer options, or raw .xml edits, or both?

Also note and please respect that within the context of this thread, I (and
likely many others) have no interest to debate the wheretos and whyfors of
assigning blame (to someone or something else), or denial that it is a
compatibility problem (belonging to LO), or other evasive tactics.

This is not a mere formatting problem to be ignored, simply because some
text is fonted wrongly, image positioned oddly, and so on.  Text here is not
displaying at all.  If I have to unarchive and muck about in .xml files,
that speaks to me of an issue of compatibility which must be resolved or
worked around in some way with a more graceful degradation of document
quality than to drop things on the floor and walk away with fingers in ears
singing, "la la la can't hear you".

In other similar threads over the years, I have noticed a peculiar tendency
of some people to insist on inserting such defensive and evasive commentary
any time an issue of compatibility is raised.  Please don't waste time or
energy on such responses, nor responding to such responses.  It is not
helpful to this specific issue at hand.

Should a prolonged discussion of such things be so strongly desired by
anyone, feel free to start a new thread on that specific topic, and leave
this one alone.  Perhaps I will then say a few words on the topic in that
thread, sufficient to communicate most of these sentiments.  But I ask
please that such dialog not be included here, for the sake of a streamlined
discussion.




--
View this message in context: 
http://nabble.documentfoundation.org/LO-4-2-1-1-docx-file-partial-text-and-images-tp4100085.html
Sent from the Users mailing list archive at Nabble.com.

--
To unsubscribe e-mail to: users+unsubscribe@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted

