Date: prev next · Thread: first prev next last
2019 Archives by date, by thread · List index


Am 03.06.19 um 21:52 schrieb Luboš Luňák:
On Monday 03 of June 2019, Caolán McNamara wrote:
On Mon, 2019-06-03 at 12:23 +0200, Luboš Luňák wrote:
On Thursday 30 of May 2019, Michael Meeks wrote:
      + some in the zip area - assuming they are threading related.

 Is this about those documents such
as /srv/crashtestdata/files/caolan/opendocument_stack_overflow_2.odt
? How
can I reproduce that problem? If I try to fetch
buildslave@vm138.documentfoundation.org:
an/opendocument_stack_overflow_2.odt ,
it doesn't exist.

I attach two of the examples here. The input name was foo.sample, the
output to odt name appears higher up in the bt during the export.

./instdir/program/soffice.bin --headless --convert-to odt
opendocument_stack_overflow.sample


 Ok, so it's not a problem with my code, my changes just happened to show the 
problem, and the problem is that those documents are broken. If you try to 
unzip the documents, it will complain about incorrect CRC (although it still 
will uncompress them). And what happens is that when we try to save the file, 
apparently only by that point we'll read those zip streams, there will be a 
ZipException about that, and the code in package/ is not exception-safe. So 
ZipOutputStream::writeLOC() gets called but not the matching 
ZipOutputStream::rawCloseEntry().

 But this is actually broken on several levels. If I make the code to catch 
the exception better, I'll need to make it somehow handle the fact that 
writeLOC() prepared for writing en entry, but then there's nothing to write. 
But that's actually not important, since ZipPackageStream::saveChild() will 
still return failure, so ZipPackageFolder::saveContents() will throw an 
exception, making the whole document saving fail. Which in turn means this 
whole save business is irrelevant, as there's just no way to save the 
document, even though we can load it and we can edit it. Which seems rather 
lame.

 Any idea what to do about that? Is it really ok that we just refuse to save 
it? Or should we save it even though the contents may be broken?

IMHO the only sane solution would be to detect the broken CRCs on read and
report a broken file to the user. Eventually we could offer some recovery
option: mark the broken CRCs to be recalculated and keep the stuff or drop the
broken ZIP entries. I guess most users can't make this decision, so I would opt
for optional recovery of the entries, ignoring the CRCs.

Easier solution: we just deny / abort loading the file and tell the user babout
the broken file.

It really strange that the broken CRCs are just detected on write and the
document loads without a problem. Or do we ignore all ZIP entries, which we
don't know, which would be strange too?

Context


Privacy Policy | Impressum (Legal Info) | Copyright information: Unless otherwise specified, all text and images on this website are licensed under the Creative Commons Attribution-Share Alike 3.0 License. This does not include the source code of LibreOffice, which is licensed under the Mozilla Public License (MPLv2). "LibreOffice" and "The Document Foundation" are registered trademarks of their corresponding registered owners or are in actual use as trademarks in one or more countries. Their respective logos and icons are also subject to international copyright laws. Use thereof is explained in our trademark policy.