Changes between Initial Version and Version 1 of Ticket #113, comment 13


Timestamp:
Aug 5, 2016, 3:49:05 PM (8 years ago)
Author:
dmik

  • Ticket #113, comment 13

    initial v1  
    - I made LIBC return ETIMEOUT on ERROR_TIMEOUT and latest testing shows that this is exactly the case. However, the deadlock detector barfs not on the recursion as I guessed above but on the dead owner. And now I know exactly what's going on and why it doesn't happen all the time.
    + I made LIBC return ETIMEDOUT on ERROR_TIMEOUT and latest testing shows that this is exactly the case. However, the deadlock detector barfs not on the recursion as I guessed above but on the dead owner. And now I know exactly what's going on and why it doesn't happen all the time.

      The `fhForkChild1` callback does its part of the job for setting up inherited file handles in the child process and then releases the `LIBC SYS Filehandle Mutex`. This mutex is a "must complete" one, and when the must complete section is left at mutex release, there is a check for handling lost poke signals (the one marked as a "hack" above). If there are no lost signals, then nothing is done and all works smoothly. However, if there is some signal being poked at that time, then `__libc_Back_signalLostPoke` gets called in the child context. This call, among other things, results in enumerating all of the child's threads to perform some signal work. The enumeration function locks the `LIBC Thread DB Mutex` at its beginning. However, the mutex data at this stage is still a raw copy of the parent's `LIBC Thread DB Mutex` in the locked state (i.e. its state is set to owned and the owner is the parent process). The attempt to lock it in the child process results in a wait cycle since the code thinks the mutex is owned. But in the child process this mutex is not actually owned, so there is no one to release it and this wait cycle lasts forever.
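
      The sequence described above can be illustrated with a small toy model. This is only a sketch in Python, not LIBC code: `ToyMutex`, `fork_raw_copy`, `enumerate_child_threads`, `leave_must_complete_in_child` and the two PID constants are hypothetical names made up for the illustration, and the "wait forever" is reported as a string instead of actually blocking. It assumes, as the comment states, that the child starts with a raw copy of the parent's locked mutex data and that thread enumeration only happens when a lost signal is pending.

```python
from dataclasses import dataclass, replace

PARENT_PID = 100  # hypothetical process IDs, for illustration only
CHILD_PID = 200


@dataclass
class ToyMutex:
    """Toy stand-in for a mutex whose state lives in process memory."""
    name: str
    owner: int = 0  # 0 means "not owned"

    def try_lock(self, pid: int) -> bool:
        # Succeeds only when nobody is recorded as the owner.
        if self.owner == 0:
            self.owner = pid
            return True
        return False

    def unlock(self, pid: int) -> None:
        if self.owner != pid:
            raise RuntimeError("unlock by non-owner")
        self.owner = 0


def fork_raw_copy(mutex: ToyMutex) -> ToyMutex:
    # Models the child getting a raw copy of the parent's mutex data,
    # including the "owned by parent" state.
    return replace(mutex)


def enumerate_child_threads(thread_db: ToyMutex, pid: int) -> str:
    # The enumeration locks the thread DB mutex first.
    if thread_db.try_lock(pid):
        thread_db.unlock(pid)
        return "ok"
    # The recorded owner is not part of this process, so nobody here
    # will ever release this private copy: the real code waits forever.
    return "stuck"


def leave_must_complete_in_child(thread_db: ToyMutex, signal_pending: bool) -> str:
    # Without a lost signal nothing is done; with one, threads are enumerated.
    if not signal_pending:
        return "ok"
    return enumerate_child_threads(thread_db, CHILD_PID)


parent_db = ToyMutex("LIBC Thread DB Mutex")
parent_db.try_lock(PARENT_PID)          # parent holds the mutex at fork time
child_db = fork_raw_copy(parent_db)     # child inherits the locked state

result_no_signal = leave_must_complete_in_child(child_db, signal_pending=False)
result_signal = leave_must_complete_in_child(child_db, signal_pending=True)
print(result_no_signal, result_signal)
```

      In this model the child only gets stuck when a signal is pending at the moment the must complete section is left, which matches the observation that the hang does not happen every time.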