Opened 8 years ago
Last modified 7 years ago
#366 new defect
No fork callback to safely use LIBC heap
Reported by: | dmik | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | new |
Component: | libc | Version: | 0.6.6 |
Severity: | normal | Keywords: | |
Cc: | lewisr, dryeo |
Description
It turns out that even the callbacks registered with __LIBC_PFORKHANDLE::pfnCompletionCallback, which are called last before fork() returns to the user code, are not safe for using LIBC itself: at least, an attempt to create a new log instance with __libc_LogInit aborts the forked child with
LIBC PANIC!! fmutex deadlock: Owner died!
0x2003013c: Owner=0x25930001 Self=0x25940001 fs=0x3 flags=0x0 hev=0x00010004 Desc="LIBC Heap"
pid=0x2594 ppid=0x2592 tid=0x0001 slot=0x00ab pri=0x0200 mc=0x0000 ps=0x0010
D:\CODING\LIBCX\MASTER-BUILD-DEBUG\STAGE\BIN\TST-ANON_MMAP.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
_fmutex operation failed: LIBC Heap request
It looks to me like it can't use its own heap by that time.
A fork callback that is called when it's safe to use LIBC is very desirable. In particular, it would make it possible to solve the infamous _DLL_InitTerm problem in forked children and similar 3rd-party DLL initialization issues, where the only way to initialize the DLL in the forked child is to do it in a delayed fashion, i.e. when a function requiring initialization is called for the first time. This is very annoying, as it basically requires a check in every function that routes execution to the initialization routine if it has not been called yet.
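To illustrate the delayed-initialization workaround described above, here is a minimal sketch. All names (mylib_init, mylib_ensure_init, mylib_add, mylib_mul) are hypothetical and are not part of LIBC or LIBCX; the point is only that every public entry point has to repeat the same guard.

```c
#include <stdbool.h>

/* Hypothetical DLL state; none of these names come from LIBC or LIBCX. */
static bool g_initialized = false;
static int  g_init_count  = 0;

/* One-time setup that, in this scenario, cannot be done at DLL load time
   in a forked child. */
static void mylib_init(void)
{
    g_init_count++;
    g_initialized = true;
}

/* The guard that every public function is forced to call first. */
static void mylib_ensure_init(void)
{
    if (!g_initialized)
        mylib_init();
}

int mylib_add(int a, int b)
{
    mylib_ensure_init();
    return a + b;
}

int mylib_mul(int a, int b)
{
    mylib_ensure_init();
    return a * b;
}

/* Exposed only so the sketch can show that init ran exactly once. */
int mylib_init_count(void)
{
    return g_init_count;
}
```

The guard here is not thread-safe; a real DLL would also need a lock or a once primitive, which adds to the cost of this pattern.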
Attachments (1)
Change History (8)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
Hmm, there is a umForkCompletion callback that does exactly that: releases LIBC heaps in the child. However, it seems to be called after the callback I add in reply to __LIBC_FORK_OP_FORK_PARENT. This is because completion callbacks are called in LIFO order and my callback is installed much later than umForkCompletion (added in reply to __LIBC_FORK_OP_EXEC_PARENT called with priority 0xffffff01). I need to try using __LIBC_FORK_OP_EXEC_PARENT as well and a higher priority for my parent callback, something like 0xfffffff (the highest one).
comment:3 by , 8 years ago
Well, using the highest priority doesn't help for some reason: my callback still gets called first. I need to enable LIBC logging again to see what's going on.
comment:4 by , 8 years ago
Okay, it turns out that fork callbacks (including their priorities) are processed per module, and the module processing order matches the link order: first LIBC callbacks are processed, then LIBCX callbacks, and EXE callbacks always come last. This way, it's impossible to register a completion callback that would be called *after* all LIBC callbacks, and hence it's impossible to use the LIBC Heap from 3rd-party fork callbacks. This looks like a missing feature to me.
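A small model may make the per-module behaviour easier to see. This is only a sketch under assumptions: the types and function below are hypothetical and do not mirror LIBC's real fork structures, and it assumes that a higher priority value runs earlier within a module. It shows that a priority only reorders callbacks inside their own module, so an EXE callback with the highest priority still runs after every LIBC callback.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model types; not LIBC's real fork structures. */
typedef struct {
    const char  *name;
    unsigned int priority;   /* assumption: higher runs earlier in its module */
} ModelCallback;

typedef struct {
    const char    *module;
    ModelCallback *callbacks;
    size_t         count;
} ModelModule;

/* qsort comparator: descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
    unsigned int pa = ((const ModelCallback *)a)->priority;
    unsigned int pb = ((const ModelCallback *)b)->priority;
    return (pa < pb) - (pa > pb);
}

/* Walks the modules in the given (link) order and sorts callbacks only
   within each module. Writes space-separated "module:name" entries to out. */
void model_run_order(ModelModule *mods, size_t nmods, char *out, size_t outsz)
{
    out[0] = '\0';
    for (size_t m = 0; m < nmods; m++) {
        qsort(mods[m].callbacks, mods[m].count,
              sizeof(ModelCallback), by_priority_desc);
        for (size_t c = 0; c < mods[m].count; c++) {
            size_t used = strlen(out);
            snprintf(out + used, outsz - used, "%s%s:%s",
                     used ? " " : "", mods[m].module,
                     mods[m].callbacks[c].name);
        }
    }
}
```

With a LIBC module holding callbacks at priorities 1 and 0xffffff01 and an EXE module holding one callback at 0xffffffff, the model reports both LIBC callbacks before the EXE one.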
A backward-compatible solution would be adding pfnCompletionCallbackEx that takes a boolean flag indicating if the callback should be added to the end of the list (executed last) or to the front of the list (executed first). Or, even more compatible, the enmContext argument of the existing pfnCompletionCallback may accept a __LIBC_FORK_CTX_APPEND flag (added to the LIBC_FORKCTX enum) that will instruct the callback to be added to the end and executed last.
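The proposed behaviour can be sketched with a toy completion list. Everything here is hypothetical (MODEL_CTX_APPEND, model_register, model_run and the trace helpers are illustration names, not LIBC API): by default a callback is inserted at the front, which gives the current LIFO order, while the append flag puts it at the end so that it runs after everything registered so far and after any later default registrations.

```c
#include <stddef.h>

#define MODEL_MAX_COMPLETIONS 16

/* Hypothetical flag standing in for the proposed "execute last" request. */
#define MODEL_CTX_APPEND 0x1u

typedef void (*ModelCompletionFn)(void *arg);

typedef struct {
    ModelCompletionFn fn[MODEL_MAX_COMPLETIONS];
    void             *arg[MODEL_MAX_COMPLETIONS];
    size_t            count;
} ModelCompletionList;

/* Registers a completion callback: front of the list by default (LIFO),
   end of the list with MODEL_CTX_APPEND. Returns 0, or -1 if the list is full. */
int model_register(ModelCompletionList *list, ModelCompletionFn fn,
                   void *arg, unsigned int flags)
{
    if (list->count >= MODEL_MAX_COMPLETIONS)
        return -1;
    size_t pos = (flags & MODEL_CTX_APPEND) ? list->count : 0;
    for (size_t i = list->count; i > pos; i--) {   /* shift the tail right */
        list->fn[i]  = list->fn[i - 1];
        list->arg[i] = list->arg[i - 1];
    }
    list->fn[pos]  = fn;
    list->arg[pos] = arg;
    list->count++;
    return 0;
}

/* Runs all callbacks from front to end, then empties the list. */
void model_run(ModelCompletionList *list)
{
    for (size_t i = 0; i < list->count; i++)
        list->fn[i](list->arg[i]);
    list->count = 0;
}

/* Trace helpers so the execution order can be observed. */
typedef struct { char buf[32]; size_t len; } ModelTrace;
typedef struct { ModelTrace *trace; char tag; } ModelTag;

void model_trace_cb(void *arg)
{
    ModelTag *t = arg;
    if (t->trace->len + 1 < sizeof t->trace->buf) {
        t->trace->buf[t->trace->len++] = t->tag;
        t->trace->buf[t->trace->len]   = '\0';
    }
}
```

In this model, registering a heap-release callback first, then a 3rd-party callback with the append flag, then an EXE callback by default yields the order EXE, heap release, 3rd-party, so the appended callback sees the heap already released.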
comment:5 by , 8 years ago
Cc: | added |
---|
by , 7 years ago
Attachment: | fork_completion_callback.diff added |
---|
comment:6 by , 7 years ago
Attached a patch that solves this problem by letting the callback be called last with a new __LIBC_FORK_CTX_FLAGS_LAST flag. This serves LIBCX needs well: I can finally perform post-fork tasks in the child with LIBC itself fully operational (its Heap not locked, all file handles properly inherited, etc.). This may be useful for other DLLs wanting to support forking too, of course.
The only disadvantage is that the forked child cannot fail gracefully if something goes wrong, because the completion callback is called when the communication between the parent and the child is mostly over. It would be better to have a fully functioning LIBC at __LIBC_FORK_OP_FORK_PARENT / __LIBC_FORK_OP_FORK_CHILD time, but given that LIBC finalizes itself from the completion callback as well, this would require either changing the way LIBC uses callbacks for its own needs or calling completion callbacks per module as well (rather than keeping them in a global list processed at once at the end, after all per-module callbacks, as is done now). Both approaches seem non-trivial to implement, so given that we may fail in the child by simply doing exit(1) or such inside the completion callback (which will result in EINTR from fork() on the parent side and LIBC PANIC bitching in the child), my patch seems fine.
comment:7 by , 7 years ago
Cc: | added |
---|
To be more exact, it fails in _hmalloc, here is the excerpt from the LIBC log:
and here is the console output:
So it's clear that the LIBC Heap fmutex is owned by the parent for the duration of fork(). Perhaps this can be fixed in a way similar to #363.