On 09/24/2011 12:48 PM, Michael Meeks wrote:
I'm poking at an endless hang in the smoketest:

#2  0xb7d24aec in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3  0xb7f1b6c0 in osl_waitCondition ()
from /data/opt/libreoffice/core/solver/unxlngi6.pro/lib/libuno_sal.so.3
#4  0xb72db42a in osl::Condition::wait (this=0xbfffb8c4, pTimeout=0x0)
at /data/opt/libreoffice/core/solver/unxlngi6.pro/inc/osl/conditn.hxx:84
#5  0xb72d9024 in (anonymous namespace)::Test::test (this=0xb7c16008)
at /data/opt/libreoffice/core/smoketestoo_native/smoketest.cxx:200
#6  0xb72d9e2e in CppUnit::TestCaller<<unnamed>::Test>::runTest(void)
(this=0xb73ac0a8)
at /data/opt/libreoffice/core/solver/unxlngi6.pro/inc/cppunit/TestCaller.h:166

        If I were a betting man I'd say this is down to us waiting on a
condition, and not spinning the main-loop; but (to be honest) this
remote-control nonsense is somewhat opaque to me. I see no live
soffice.bin process being controlled. I was slightly amazed to read:

toolkit/source/awt/AsyncCallback::addCallback()

        which seems to do nothing / not fire an exception if
Application::IsInMain() is not true - which is in itself odd.

        I have another quiescent thread:

#2  0xb7d24b44 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib/libc.so.6
#3  0xb7f3f18e in ?? ()
from /data/opt/libreoffice/core/solver/unxlngi6.pro/lib/libuno_sal.so.3
#4  0xb7c28b05 in start_thread (arg=0xb7c0fb70) at pthread_create.c:297
#5  0xb7d16d5e in clone () from /lib/libc.so.6

        So - I'm tempted to say:

     Result result;
     // Shifted to main thread to work around potential deadlocks (i112867):
     com::sun::star::awt::AsyncCallback::create(
         connection_.getComponentContext())->addCallback(
             new Callback(
                 disp, url, css::uno::Sequence< css::beans::PropertyValue >(),
                 new Listener(&result)),
             css::uno::Any());
     result.condition.wait();
     CPPUNIT_ASSERT(result.success);

        should be a timed wait - but only if we fail if the timeout is
triggered (ie. not on the common path). I've committed that at 30
seconds - possibly this needs tweaking to be infinite when under the
debugger.

A timed wait is no solution here. (Timeouts in this kind of code pose at least two problems. For one, they prevent a human from coming back to a hung "make check" after a while and still seeing where it hung, as the build has unhelpfully been forced to move forward. For another, what is typically also needed is proper cleanup, like killing abandoned sub-processes, so manual intervention is needed anyway.) The real solution, instead, is to wait not only on the Result object, but also on the OfficeConnection. Fixed as <http://cgit.freedesktop.org/libreoffice/core/commit/?id=c09b966f94f5a50fe537916398451339f008947d>.
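
As a rough illustration of waiting on two things at once without a timeout, here is a minimal standalone sketch. It is not the code from the commit above and does not use the osl or OfficeConnection APIs. WaitState, Outcome, reportResult, reportOfficeExit and waitForResultOrExit are all hypothetical names, and std::condition_variable stands in for osl::Condition.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state shared between the test thread, the
// dispatch listener, and whatever monitors the controlled office process.
struct WaitState {
    std::mutex mutex;
    std::condition_variable changed;
    bool resultReady = false;  // set once the listener has a dispatch result
    bool success = false;      // the outcome reported by the listener
    bool officeAlive = true;   // cleared once the office process is gone
};

enum class Outcome { Succeeded, Failed, OfficeGone };

// Called by the (hypothetical) listener when the dispatch has finished.
void reportResult(WaitState & s, bool success) {
    {
        std::lock_guard<std::mutex> guard(s.mutex);
        s.resultReady = true;
        s.success = success;
    }
    s.changed.notify_all();
}

// Called by the (hypothetical) process monitor when the office has exited.
void reportOfficeExit(WaitState & s) {
    {
        std::lock_guard<std::mutex> guard(s.mutex);
        s.officeAlive = false;
    }
    s.changed.notify_all();
}

// Blocks with no timeout until a result arrives or the office is gone,
// whichever happens first. A result that is already there takes precedence.
Outcome waitForResultOrExit(WaitState & s) {
    std::unique_lock<std::mutex> lock(s.mutex);
    s.changed.wait(lock, [&s] { return s.resultReady || !s.officeAlive; });
    assert(s.resultReady || !s.officeAlive);
    if (s.resultReady) {
        return s.success ? Outcome::Succeeded : Outcome::Failed;
    }
    return Outcome::OfficeGone;
}
```

In a test, the waiting thread would call waitForResultOrExit() and fail with a clear message on Outcome::OfficeGone. A genuinely stuck but still running office would keep blocking, so the hang remains visible to whoever comes back to inspect it.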

-Stephan
