On 19/06/2020 14:51, Stephan Bergmann wrote:
> On 28/05/2020 22:19, Stephan Bergmann wrote:
>> For now, I have updated 
>> <https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/> to use 
>> the new kill-wrapper timeout feature instead of Jenkins' "Abort the 
>> build if it's stuck" option.  (And am planning to roll it out to other 
>> Linux Jenkins jobs that could benefit from it, once it has proven 
>> sufficiently stable.)
> 
> I have rolled out the kill-wrapper and its timeout feature now also for 
> <https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil_branch/>, 
> <https://ci.libreoffice.org/job/gerrit_linux_gcc_release/>, and 
> <https://ci.libreoffice.org/job/lo_ubsan/>.
Just to note down the semi-obvious somewhere:  One scenario that 
kill-wrapper apparently doesn't prevent is leftover processes after 
Jenkins "has lost the connection" (for whatever reason, maybe a bug in 
Jenkins itself?).
<https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/62736/> had 
gone down with
[...]
[build JUT] linguistic_unoapi
FATAL: command execution failed
java.io.EOFException
        at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2738)
        at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3213)
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:896)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
        at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
        at hudson.remoting.Command.readFrom(Command.java:142)
        at hudson.remoting.Command.readFrom(Command.java:128)
        at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: java.io.IOException: Backing channel 'tb75-lilith' is disconnected.
        at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
        at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:285)
        at com.sun.proxy.$Proxy66.isAlive(Unknown Source)
        at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1147)
        at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1139)
        at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
        at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
        at hudson.model.Build$BuildExecution.build(Build.java:206)
        at hudson.model.Build$BuildExecution.doRun(Build.java:163)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
        at hudson.model.Run.execute(Run.java:1880)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
        at hudson.model.ResourceController.execute(ResourceController.java:97)
        at hudson.model.Executor.run(Executor.java:428)
FATAL: Unable to delete script file /tmp/jenkins3180341342272089625.sh
java.io.EOFException
        at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2738)
        at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3213)
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:896)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
        at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
        at hudson.remoting.Command.readFrom(Command.java:142)
        at hudson.remoting.Command.readFrom(Command.java:128)
        at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@629ec1e9:tb75-lilith": Remote call on tb75-lilith failed. The channel is closing down or has closed down
        at hudson.remoting.Channel.call(Channel.java:991)
        at hudson.FilePath.act(FilePath.java:1069)
        at hudson.FilePath.act(FilePath.java:1058)
        at hudson.FilePath.delete(FilePath.java:1543)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:123)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
        at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
        at hudson.model.Build$BuildExecution.build(Build.java:206)
        at hudson.model.Build$BuildExecution.doRun(Build.java:163)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
        at hudson.model.Run.execute(Run.java:1880)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
        at hudson.model.ResourceController.execute(ResourceController.java:97)
        at hudson.model.Executor.run(Executor.java:428)
Build step 'Execute shell' marked build as failure
Finished: FAILURE
leaving behind some pstree forest of
oosplash─┬─soffice.bin─┬─soffice.bin
         │             └─182*[{soffice.bin}]
         └─{oosplash}
sh───sh───python.bin─┬─oosplash─┬─soffice.bin─┬─soffice.bin
                     │          │             └─294*[{soffice.bin}]
                     │          └─{oosplash}
                     └─2*[{python.bin}]
sh───sh───python.bin───oosplash
sh───sh───gdb-core-bt.sh───gdb
sh───sh───python.bin───oosplash
on tb75, where each of those processes belonged to the above build, as 
demonstrated for each respective PID with
$ cat /proc/$PID/environ | tr '\0' '\n' | grep BUILD_NUMBER
BUILD_NUMBER=62736
That caused later builds like 
<https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/62758/> on 
tb75 to fail with "the test UITest_calc_demo failed".
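
To find such leftovers without checking each PID by hand, the per-process 
check above could be generalized to scan every entry under /proc.  The 
following is only a minimal sketch under assumptions, not part of 
kill-wrapper or of the Jenkins setup: the function names parse_environ and 
find_build_processes are made up for illustration, and it assumes a Linux 
/proc layout and sufficient permissions to read other processes' environ 
files.

```python
from pathlib import Path


def parse_environ(raw: bytes) -> dict:
    """Split the NUL-separated content of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_build_processes(build_number: str, proc_root: str = "/proc") -> list:
    """Return the sorted PIDs whose environment has the given BUILD_NUMBER.

    Hypothetical helper: proc_root is a parameter only so that the scan can
    be pointed at a test directory instead of the real /proc.
    """
    pids = []
    for entry in Path(proc_root).iterdir():
        # Only numeric directory names under /proc correspond to processes.
        if not entry.name.isdigit():
            continue
        try:
            raw = (entry / "environ").read_bytes()
        except OSError:
            # The process exited in the meantime, or we may not read it.
            continue
        if parse_environ(raw).get("BUILD_NUMBER") == build_number:
            pids.append(int(entry.name))
    return sorted(pids)
```

Calling find_build_processes("62736") on the affected machine would then 
list the PIDs still carrying that build's environment.  Processes that 
cleared or replaced their environment would be missed by such a scan.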