

On 12/08/11 14:38, planas wrote:
On Fri, 2011-08-12 at 13:52 +1000, Simon Cropper wrote:

Hi Everyone,

I have been doing some follow-up investigations that I thought you might
like to be made aware of.

First, why? Well, after testing the file that I recreated by copying the
text, formulas and styles, I could not cause the error to occur again.
This is what I reported yesterday. I then proceeded with importing the
few images and 'artwork' objects from the old file to the new. After
several saves and sheet manipulations I noticed the file became unstable
again.

The outcome of my investigation was that one of the image files used on
one of the sheets had become corrupted*, and adding this file caused
the new file to become unstable, eventually exhibiting the problem I
mentioned about not being able to save after a sheet was deleted. These
'corrupted' files are relatively benign, with the error only becoming
apparent *once* you attempt to delete the sheet. So somehow the error
caused problems with the broader 'workbook' structure, not the sheet
structure. Although I compared the XML between different versions of the
file, the differences were too varied to isolate the cause (they were
usually style names and definitions; the content was identical).

     * Note I say corrupted, but it rendered OK and only resulted
     in the observed behaviour once a sheet is deleted. I 'deem' it
     corrupted because once it was replaced with a clean version,
     rendered and saved as a new file by GIMP, the problem disappeared.

The error was with the particular corrupt objects. On recreating new
images and inserting them into a file I have not been able to trigger
any problems. If I cut-and-paste from the original 'corrupt' file the
file becomes unstable after a few saves and sheet manipulations.

So the steps for salvage are...
1. Create a new file with exactly the same number of sheets as the
original. Ensure the sheet names are the same.
2. Cut and paste each sheet. Make sure you 'Paste special', limiting the
content being placed in the new file to text, numbers, date & time,
formulas and formats.
3. If you have images in the file, recreate or re-save them using another
package. As mentioned, I used GIMP.
4. Insert new copies of the images into the file. *Don't* cut-and-paste
objects from the old file.
5. If you have any lines, text boxes and artwork, recreate them from
scratch.

During this process...
- Save as a new version (+tabs, +data, +images, +other objects)
following each step.
- Test each version thoroughly before proceeding.
- Only add one object at a time, so if something goes wrong you can
isolate the problem component.
- To check that it is not a bug, try to recreate the problem with a
fresh file.

A couple of quick notes that may be valuable to others...
- ODS files are archives. Use an archive facility to extract the data
inside. Inspect the contents in the folders to see what is different.
- On every 'Save as' the file size changes. This is not due to changes
in the file contents, but rather to changes in how the components of the
file are compressed/archived. If you open a file, add objects, then save,
the file will be a certain size. Open that file and 'Save as' a new name
and the file will be a different size. If you extract both archives, the
contents of all the files and directories are identical. It is just that
the internal archive facility in LO decides the best compaction routine
based on what it encounters.
- The content.xml file can be quite large and has no internal
end-of-line characters. This makes it difficult to open and parse with
various text editors, XML viewers and comparison facilities. To insert an
EOL character after the end of each tag (i.e. >), I used the following
command in the terminal (requires Linux).

     cat content.xml | sed -e 's/>/>\n/g' > content_with_linebreaks.xml

     cat just spews the content of the text file to the standard output.
     I then pipe it to sed, where I used regular expressions to find '>'
     and replace every instance of it with '>\n'. I then compared the
     contents with Diffuse.
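
The extract-and-compare step described in the notes above could be sketched roughly as follows. This is only an illustration, assuming `unzip` and `diff` are installed; the function names and the file and directory names in the usage comment are placeholders and do not come from the original files.

```shell
#!/bin/sh
# Sketch only: helper names and paths are hypothetical.

# Unpack an ODS file (a zip archive) into a directory. Requires unzip.
# $1 = path to the .ods file, $2 = destination directory
extract_ods() {
    mkdir -p "$2" && unzip -o -q "$1" -d "$2"
}

# Recursively compare two extracted trees and list the files that differ.
# Prints nothing and returns 0 when every file is identical.
# $1 = first extracted directory, $2 = second extracted directory
compare_trees() {
    diff -rq "$1" "$2"
}

# Usage (placeholder names):
#   extract_ods old.ods old_extracted
#   extract_ods new.ods new_extracted
#   compare_trees old_extracted new_extracted
```

If `compare_trees` prints nothing, the two archives hold identical contents even when the .ods files themselves differ in size. Any file it does list (for example content.xml) can then be given line breaks with the sed command above and inspected in a diff tool.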


Snip

+1

I wonder how the images got corrupted, interesting.

Good question and something I was not happy just leaving alone.

Once I satisfied myself that this was most likely not a bug but file corruption, the questions that came to mind were...
(1) How did this file become corrupted? and
(2) How extensive is the damage (physical damage, system wide, file specific)?

As mentioned, I currently use Linux, so I needed to find a method to check the file system for errors and see if any blocks or clusters were damaged. I am assuming here that I could have had a power spike or a jolt to my machine that could have damaged the platter of my hard disk.

I did this by running 'fsck.ext4' from a LiveCD on all my hard disks. It took all Saturday to do this. Thankfully no errors were detected. Anyone interested in how I did this can check out my more specific response on the Ubuntu Forums...
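
For readers who want a starting point, a read-only check might look roughly like the sketch below. It is not the exact procedure from the forum post. It assumes the e2fsprogs `fsck.ext4` tool, whose `-f` option forces a check even if the file system looks clean and whose `-n` option opens the file system read-only and answers "no" to every repair prompt. The function name, the `FSCK` override variable and the device name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch only: check_readonly, FSCK and the device name are hypothetical.

# The checker command can be overridden (e.g. FSCK="echo") to preview
# what would be run. The default is the real ext4 checker.
FSCK="${FSCK:-fsck.ext4}"

# Run a forced, read-only check on one unmounted partition.
# $1 = block device to check
check_readonly() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: check_readonly DEVICE" >&2
        return 2
    fi
    # Refuse to touch a mounted device; boot from a LiveCD instead.
    if grep -q "^$dev " /proc/mounts 2>/dev/null; then
        echo "refusing: $dev is mounted" >&2
        return 1
    fi
    # -f: check even if marked clean; -n: read-only, answer no to all.
    $FSCK -f -n "$dev"
}

# Usage (run as root from a LiveCD; placeholder device name):
#   check_readonly /dev/sdXN
```

Because `-n` never writes to the disk, this only reports problems; any actual repair would be a separate, deliberate step.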

http://ubuntuforums.org/showthread.php?p=11152491#post11152491

With the disks ruled out, that left some event occurring while the affected file was open as the likely cause of the corruption. In fact, on reflection, I did have a power blackout 3 weeks ago! I know pretty much exactly when it occurred and so was able to work out exactly what I was doing... as it turns out, I was editing the exact sheet in the corrupted file that contained the corrupted image!

As LO had 'recovered' the file, and all text, formulas and images were present, accounted for and rendering as expected, I never even thought the file was damaged; so much so that 2 weeks later, when the peculiar 'sheet/tab deletion errors' appeared, I did not even connect the two events (power outage and corruption).

Thanks everyone for your help - I appreciate your efforts and suggestions.

--
Cheers Simon

   Simon Cropper
   Principal Consultant
   Botanicus Australia Pty Ltd
   PO Box 160, Sunshine, VIC
   W: www.botanicusaustralia.com.au
