

Hi Mohammed,

        Just a few more aesthetic comments =) on a nice patch.

+    size_t mnPos;
+    size_t mnStreamSize;
+
+    Buffer maInUseBuffer;
+    int mnOffset;

        I'd love to see some doxygen comments:

        size_t mnPos; /// position in stream
        int mnOffset; /// position in maInUseBuffer

        etc. added to this header =) makes it easier to read.

        Also - since we wrote all this code ourselves - we don't need to use
the ALv2 variant header - let's just use the MPLv2 header from
TEMPLATE.SOURCECODE.HEADER in the top-level for the two new files.

        Now I read it again, I'm rather skeptical that:

+        if( !maUsedBuffers.empty() )
+        {
+            pProducedBuffer = maUsedBuffers.front();
+            maUsedBuffers.pop();
+        }

        is helpful. I'm not sure that we can re-use these buffers efficiently
here - perhaps it's better just to allocate new ones in the stream read
method; can you profile with and without that? (would love to get rid of
the extra complexity if possible).

+class UnzippingThread: public salhelper::Thread
+{
+    XBufferedThreadedStream *mxStream;

        I'd use a reference - to clarify lifecycle: so "&mrStream" - but of
course, with std::thread we could use a lambda for this thing I guess
and capture it ...
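	Something like this sketch - the type and function names here are
invented just to show the capture-by-reference idea, they're not the
patch's real API:

```cpp
#include <atomic>
#include <thread>

// Stand-in for the real stream class (hypothetical).
struct UnzipStreamLike
{
    std::atomic<int> mnChunks{0};
    void unzipChunk() { ++mnChunks; }
};

// With std::thread, a lambda can capture the stream by reference, so the
// lifecycle is explicit and no pointer member is needed on the thread.
void startUnzipping(UnzipStreamLike& rStream)
{
    std::thread aThread([&rStream]() {
        rStream.unzipChunk();
    });
    aThread.join(); // the real code would keep the thread and join later
}
```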

        Gotcha - I think I found the bad bit which is this:

commit 4ae705d02df0ddf75b97d0e94add6994626f487e
Author: Kohei Yoshida <kohei.yoshida@collabora.com>
Date:   Fri Jan 13 20:47:46 2017 -0500

    tdf#97597: Ensure that each parsing thread has its own buffer.

        The package/ code is (I think) thread-safe, but is not safe wrt. the
file pointer underlying it. I -think- it assumes that only one
thread reads at a time.

        Kohei's change here - I think - makes each stream fully de-compress
itself into memory before parsing it - which fixed some horrible Calc
problem.

        Hmm - so we're going to need to fix the root cause here I think, not
least because de-compressing the whole stream can be really very
expensive: ODF e.g. is -incredibly- verbose, with the XML taking up far
more memory than the equivalent Calc data structures.

        Sooo ... I think your code is golden; but we're going to need to fix
the package/ code to avoid the bug.

https://bugs.documentfoundation.org/show_bug.cgi?id=97597#c28

        Suggests (input from Kohei appreciated) - that we may need to do a new
'seek' each time we come to reading some bytes from the underlying
package/ stream - to ensure that no other thread is messing with our
underlying stream position while we are reading compressed data from it =)
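	To illustrate the idea - a minimal, purely hypothetical sketch (not
real package/ code) of re-seeking under a per-file mutex before every
read, so each reader carries its own logical position:

```cpp
#include <cstdio>
#include <mutex>

// Hypothetical reader over a shared underlying file. Each reader keeps its
// own position and re-seeks under the lock before every read, so another
// thread cannot disturb the shared file pointer between seek and read.
class PackageStreamReader
{
    std::FILE* mpFile;       // shared underlying file
    std::mutex& mrFileMutex; // one mutex per underlying file
    long mnPos;              // this reader's own position

public:
    PackageStreamReader(std::FILE* pFile, std::mutex& rMutex, long nStart)
        : mpFile(pFile), mrFileMutex(rMutex), mnPos(nStart) {}

    size_t readBytes(char* pDest, size_t nBytes)
    {
        std::lock_guard<std::mutex> aGuard(mrFileMutex);
        std::fseek(mpFile, mnPos, SEEK_SET); // re-seek on every read
        size_t nRead = std::fread(pDest, 1, nBytes, mpFile);
        mnPos += static_cast<long>(nRead);
        return nRead;
    }
};
```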

        I guess we should also write a unit test for this - one that is
heavily threaded - 4-8 threads doing back-to-back reads for quite some
time would probably catch this.
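	The test could have roughly this shape (hypothetical helper, just to
show the thread / iteration structure - the real test would of course
read through package/ streams rather than summing a plain vector):

```cpp
#include <atomic>
#include <numeric>
#include <thread>
#include <vector>

// Stress-test skeleton: several threads hammer the same data source with
// back-to-back reads; every mismatch against the expected result is counted.
int runStressTest(int nThreads, int nIterations)
{
    std::vector<unsigned char> aData(4096);
    std::iota(aData.begin(), aData.end(), 0); // wraps at 256; fine for a checksum
    const long nExpected = std::accumulate(aData.begin(), aData.end(), 0L);

    std::atomic<int> nMismatches{0};
    std::vector<std::thread> aThreads;
    for (int i = 0; i < nThreads; ++i)
        aThreads.emplace_back([&]() {
            for (int j = 0; j < nIterations; ++j)
            {
                long nSum = std::accumulate(aData.begin(), aData.end(), 0L);
                if (nSum != nExpected)
                    ++nMismatches;
            }
        });
    for (auto& rThread : aThreads)
        rThread.join();
    return nMismatches.load();
}
```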

        So - could you look into the package/ impl ?

        Thanks !

                Michael.

-- 
michael.meeks@collabora.com <><, Pseudo Engineer, itinerant idiot
