Opened 13 years ago
Closed 11 years ago
#256 closed defect (fixed)
LIBC stream fmutex panic
Reported by: | dmik | Owned by: | Silvan Scherrer |
---|---|---|---|
Priority: | normal | Milestone: | libc-0.6.6 |
Component: | libc | Version: | 0.6.4 |
Severity: | normal | Keywords: | |
Cc: |
Description
An attempt to terminate a multi-threaded console program with Ctrl-C may often cause LIBC to abort with a message like this:
LIBC PANIC!! fmutex deadlock: Owner died! 0x18b0902c: Owner=0x01490002 Self=0x01490001 fs=0x3 flags=0x0 hev=0x0001004a Desc="LIBC stream" pid=0x0149 ppid=0x00ba tid=0x0001 slot=0x00ba pri=0x0200 mc=0x0000 D:\CODING\TESTS\OS2\DOS\SEH\SEH.EXE Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
This happens when a thread gets killed (as a result of the process termination procedure, initiated by the default Ctrl-C handler) while it is inside e.g. printf() and then another thread attempts to use printf() before it is terminated.
The same effect may be also seen when killing process with DosKillProcess.
This is very annoying (especially if there are many threads each of which panics) and it can also cause a hard deadlock (e.g. if printf() is used in an exception handler) that won't ever let thread 1 to exit.
Attachments (4)
Change History (15)
comment:1 by , 13 years ago
by , 13 years ago
Attachment: | stream_panic.diff added |
---|
comment:2 by , 13 years ago
Priority: | highest → normal |
---|---|
Severity: | blocker → normal |
The proposed fix may leave the stream in a bad state, so it's not very helpful per se. I would have to look at how glibc and the freebsd libc handles these kinds of issues, but I'm kind of fearing that they just blame the application and say you-should-not-do-that.
comment:3 by , 13 years ago
I actually checked glibc already. Although they define a cleanup routine that removes the lock, it's not intended for signal handlers it seems. However I don't think you can fully use Linux as a reference here since it doesn't have the thread 1 exitlist hangup problem at all...
Regarding the bad state. I don't see how my patch changes the situation here. It only comes into play when the thread gets killed, but when it happens the state of everything (including streams) is potentially messed up anyway, regardless of what I changed. My change is only to prevent a loud annoying crash or a much more annoying hang leading to an unkillable zombie.
Also, in 99 out of 100 cases, a dying thread means the whole application is being unexpectedly terminated. In this case, the question of the application state isn't relevant at all (as opposed to the crash delay or a hang which requires you to reboot the system).
I understand the "fix the application" attitude, but in this particular case it isn't possible to fix the application; I can't put an exception handler around each printf() call.
by , 13 years ago
Attachment: | stream_panic.3.diff added |
---|
Fixed assertions in kmk in -j >=2 mode plus some optiimizations
comment:5 by , 13 years ago
The stream_abort() assertions in kmk were caused by the exception handler that could unexpectedly unlock the stream.
Note also that the panic may also happen w/o any user interaction -- simply if one thread programmtically kills another thread while it is doing some IO (e.g. prints to the console). There are such apps over there.
by , 13 years ago
Attachment: | stream_panic.4.diff added |
---|
Fixes occasional "stream lock count 0" assertions
comment:6 by , 13 years ago
One more tiny fix which checks in the exception handler if the stream is already unlocked (there is a hole between unlocking and resetting the field containing the FD to unlock).
comment:7 by , 11 years ago
Milestone: | → libc-0.6.6 |
---|
comment:8 by , 11 years ago
Priority: | normal → high |
---|
comment:9 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → accepted |
comment:10 by , 11 years ago
Priority: | high → normal |
---|
I think a better and less invasive solution here is to detect the process-is-being-kill situation (there should be code for that already) and shut up the panic.
comment:11 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
r3849 relaxes the fmutex code when the process is being terminated. May of course just change some panic scenarios to crashes elsewhere if a thread was killed/interrupted in the middle of critical update.
Attached is the patch that fixes this by installing an OS/2 exception handler and unlocking the stream mutex on thread termination.
PS. BTW, I didn't set the priorities that high, they are like that by default.
PPS. I really hope that it will appear in the upcoming 065.