Opened 4 years ago

Closed 3 years ago

#143 closed defect (fixed)

Crash with rpm -qa --last

Reported by: Lewisr
Owned by:
Priority: major
Milestone:
Component: rpm
Version:
Severity: medium
Keywords:
Cc:

Description

Attempting to query installed packages, listed by date/time of install yields:

LIBC PANIC!!
LIBC fork: Child aborting fork()! rc=0xfffffffc
pid=0x00e3 ppid=0x00e2 tid=0x0001 slot=0x006d pri=0x0200 mc=0x0000 ps=0x0010
C:\USR\BIN\RPM.EXE
LIBC066 0:00001e28
cs:eip=005b:1ed81e28      ss:esp=0053:0012df2c      ebp=0012df34
 ds=0053      es=0053      fs=150b      gs=0000     efl=00012212
eax=00000040 ebx=bb678a80 ecx=18c10000 edx=00000018 edi=00000000 esi=18c10000
Process has been dumped

Details:

rpm-libs-4.8.1-23.oc00.pentium4
rpm-python-4.8.1-23.oc00.pentium4
rpm-4.8.1-23.oc00.pentium4

From sh, however, the following works as expected:

rpm -qa --qf '%{INSTALLTIME} %-40{NAME} %{INSTALLTIME:date}\n' | sort -n | cut -d' ' -f2-

Change History (16)

comment:1 Changed 4 years ago by diver

  • Priority changed from minor to major
  • Severity changed from low to medium

just to be sure, I tried it also and can reproduce this libc panic.

comment:2 Changed 4 years ago by dmik

I confirm this too.

comment:3 Changed 4 years ago by ydario

While the query in the description is working for me with cmd/4os2/sh, running rpm -qa --last is crashing in fork even in current rpm 4.13

comment:4 Changed 4 years ago by lewisr

Same here with 4.13, and the workaround cited in the original ticket continues to function under sh.

Some of the addresses have changed, though I can't be sure that they're relevant:

LIBC PANIC!!
LIBC fork: Child aborting fork()! rc=0xfffffffc
pid=0x00f7 ppid=0x00f6 tid=0x0001 slot=0x00f0 pri=0x0200 mc=0x0000 ps=0x0010
J:\USR\BIN\RPM.EXE
LIBC066 0:00001e28
cs:eip=005b:1d2b1e28      ss:esp=0053:0012df2c      ebp=0012df34
 ds=0053      es=0053      fs=150b      gs=0000     efl=00010212
eax=00000040 ebx=7d656980 ecx=121a0000 edx=00000018 edi=00000000 esi=121a0000
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.

Tried 4os2, JdeBP, ash, bash, YAOS - all similar.

Currently installed builds are:

rpm.pentium4                         4.13.0-3.oc00
rpm-libs.pentium4                    4.13.0-3.oc00
rpm-python.pentium4                  4.13.0-3.oc00

comment:5 Changed 4 years ago by ydario

Option --last is not a real rpm option, but a popt alias as shown in /usr/lib/rpm/rpmpopt-4.13.0-rc1:

rpm	alias --last --qf '%|INSTALLTIME?{%{INSTALLTIME}}:{000000000}| %{NVRA} %|INSTALLTIME?{%{INSTALLTIME:date}}:{(not installed)}|\n' \
	--pipe "LC_NUMERIC=C sort -r -n | sed 's,^[0-9]\+ ,,' | awk '{printf(\"%-45s %-s\n\", $1, substr($0,length($1)+2))}' " \
	--POPTdesc=$"list package(s) by install time, most recent first"

Most of these aliases make use of the --pipe rpm option. This option uses fork(), and that is what crashes rpm.
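
For readers who have not met the alias before, here is a rough Python sketch of what the --pipe stage of --last appears to compute: sort on the leading timestamp (most recent first), strip it, then pad the package name to 45 columns. The name format_last and the sample lines are made up for illustration; this is not rpm code, and ties may come out in a different order than sort -r -n would give.

```python
def format_last(lines):
    # Hypothetical helper, not part of rpm: each input line is expected to
    # look like "<install time> <NVRA> <human-readable date>".
    def install_time(line):
        return int(line.split(" ", 1)[0])

    formatted = []
    # sort -r -n: numeric sort on the leading timestamp, newest first
    for line in sorted(lines, key=install_time, reverse=True):
        # sed stage: drop the timestamp and the space after it
        rest = line.split(" ", 1)[1]
        # awk stage: first field padded to 45 columns, then the rest of the line
        name, _, date = rest.partition(" ")
        formatted.append("%-45s %s" % (name, date))
    return formatted


# Made-up sample input, not taken from a real rpm database
sample = [
    "1400000000 foo-1.0-1.noarch Tue May 13 2014",
    "1500000000 bar-2.0-3.noarch Fri Jul 14 2017",
]
for row in format_last(sample):
    print(row)
```

The sketch only mirrors the text processing; the crash itself is in how rpm starts the pipeline, not in what the pipeline does.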

This crash is fixed in r634.

rpm: replace fork() with popen() when redirecting output. fixes ticket#143.
Committed revision r634.

comment:6 Changed 4 years ago by lewisr

Nice, Yuri; thanks.

Thanks also for the mention of popt. I wasn't aware that that was an alias at all.

See? I hang around with smart people and I learn good and useful things! ;-)

comment:7 Changed 4 years ago by ydario

To allow sh to process all macro commands, quotes must be added after -c.

This breaks some macros because of multiple quotes in the command line. Without surrounding quotes, cmd.exe is used to process the other commands after the first one.

rpm: replace fork() with popen() when redirecting output. fixes ticket#143.
Committed revision r635.

comment:8 Changed 4 years ago by ydario

  • Resolution set to fixed
  • Status changed from new to closed

comment:9 Changed 4 years ago by ydario

rpm: check file handle before closing stuffs. ticket#143.
Committed revision r639.

comment:10 Changed 3 years ago by dmik

Yuri, I think this should be reevaluated as one major fork() crash was fixed (see http://trac.netlabs.org/libc/ticket/363). There is a good chance that the fork code path will work now. In this case we should revert r634, r635, r639.

BTW, these change sets may be responsible for failures we get with RPM when dash is the default shell (Invalid executable format when running scriptlets).

comment:11 Changed 3 years ago by dmik

  • Resolution fixed deleted
  • Status changed from closed to reopened

I'm reopening this as it turned out that the problem described in #204 is actually a regression caused by the above fork-to-popen change. Passing the --pipe portion of the --last popt alias to popen() in the form of sh -c "BLAHBLAH" makes sh expand $0, $1 and some escape chars there, which breaks sh syntax as well as the awk invocation (which expects $0 and $1 to be its own args). The unexpected expansion happens because popen already uses sh to execute what it is given, so the call turns into sh -c "sh -c "BLAHBLAH"", hence the expansion. It could be fixed by removing sh -c from the popen call and relying on the user having EMXSHELL set to sh rather than cmd.exe, but that has its pitfalls, and I also looked closer at why fork failed in the first place.
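
The double expansion can be reproduced on any system with a POSIX sh. Below is a minimal Python sketch, assuming sh is on PATH; run_sh and run_sh_wrapped are hypothetical helper names, and the echo command is only a stand-in for the awk script in the alias.

```python
import subprocess


def run_sh(command):
    # Run `command` through one level of sh, like a plain popen() would.
    result = subprocess.run(["sh", "-c", command], capture_output=True, text=True)
    return result.stdout


def run_sh_wrapped(command):
    # Wrap the command in an extra sh -c "..." first: the outer shell now sees
    # $1 inside double quotes and expands it before the inner shell runs.
    return run_sh('sh -c "' + command + '"')


# Stand-in for the awk script in the alias: a single-quoted string with $1 in it.
command = "echo 'first field is $1'"

print(repr(run_sh(command)))          # $1 survives: single quotes protect it
print(repr(run_sh_wrapped(command)))  # $1 is gone: the outer sh expanded it
```

The single quotes protect $1 only from the shell that actually parses them; once they sit inside a double-quoted string, the outer shell treats them as ordinary characters.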

Unfortunately, the fix from http://trac.netlabs.org/libc/ticket/363 was not enough here. It turned out that the problem was not in RPM and not even in the fork implementation. RPM statically links against NSS3.DLL, which in turn dynamically loads SOFTOKN3.DLL at NSS init time. This is done by PR_LoadLibrary from NSPR, which uses DosLoadModule directly instead of dlopen from kLIBC. That breaks the fork machinery, because modules loaded by DosLoadModule directly are not properly set up for forking: when the fork child process copies its data segments from its parent, it ends up copying the SOFTOKN3.DLL segment to a memory location that hasn't been allocated in the child (due to the missing dlopen call). This results in XCPT_ACCESS_VIOLATION during fork, which eventually causes the child to abort with LIBC PANIC 0xfffffffc (which means -EINTR emitted by the fatal error handler). Pretty hairy chain of events, huh.

And, as expected, making NSPR use dlopen instead of DosLoadModule fixes the problem. I.e. now I have RPM using fork (with the above revisions rolled back) and rpm -qa --last works like a charm here.
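
As a side note on reading the panic code: rc=0xfffffffc is a negative errno value printed as an unsigned 32-bit number. A small Python sketch of the conversion follows; decode_rc is a hypothetical helper name, and the errno lookup uses the numbering of the host running the script, which is only assumed to agree with kLIBC for this value.

```python
import errno


def decode_rc(rc):
    # Reinterpret an unsigned 32-bit value as a signed 32-bit integer.
    return rc - (1 << 32) if rc & 0x80000000 else rc


rc = decode_rc(0xFFFFFFFC)
print(rc)
# Look up the errno name on the host running this script; kLIBC's numbering
# is assumed (not verified here) to agree for this value.
print(errno.errorcode.get(-rc, "unknown"))
```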

comment:12 Changed 3 years ago by dmik

After some more thinking I now think it *is* a defect of the fork implementation and I created a ticket for that: http://trac.netlabs.org/libc/ticket/372. It also turns out that in case of NSPR a more elegant solution is just to use DosLoadModuleEx directly; I committed the fix here: http://trac.netlabs.org/ports/changeset/2034. Note that NSPR/NSS are used a lot, so this problem could have many more failing use scenarios than just RPM.

I will commit the proper RPM code now. Then I will try to revert the other fork->popen changes to make the code cleaner and also to see if it solves the problems of scriptlets not aborting the install and of %{error:} statements not aborting rpmbuild.

Note that using fork means that we get rid of the exact sh specification in some places, which in turn may require either having EMXSHELL or SHELL set to sh or having /@unixroot/bin symlinked to /@unixroot/usr/bin (so that kLIBC finds sh even if EMXSHELL/SHELL isn't set). As this needs to be deployed on the user side as well, we will enhance the os2-base package to put the needed lines in CONFIG.SYS.

BTW, using the kLIBC path rewriter to map /@unixroot/bin to /@unixroot/usr/bin would be a *better* solution than EMXSHELL/SHELL because it would speed up exec/popen/system calls in kLIBC by reducing stat calls and variable checks a lot (the first hit of /@unixroot/bin/sh.exe hard-coded in kLIBC would immediately match). But that's a different topic I suppose.

comment:13 Changed 3 years ago by dmik

I also forgot to mention that fork (as well as select) in kLIBC doesn't work with pipes created with pipe; the usual socketpair hack is necessary to make it work. I may even add a pipe override to LIBCx to simplify porting, as kLIBC pipe is an interface to native OS/2 pipes and I don't remember a single case where we would specifically want to use these (and they will still be available via _std_pipe anyway).

comment:14 Changed 3 years ago by dmik

Well, disregard the last comment, I was actually wrong: pipe works perfectly with fork. I double-checked: native OS/2 pipes get inherited by the forked process (pipe only makes problems for select, like any other OS/2 native file handle). So no LIBCx enhancement for pipe is necessary; it would even break some existing code due to slight semantic differences (pipe handles are unidirectional while socketpair handles are bidirectional).
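
The semantic difference mentioned above can be seen on an ordinary Unix host with a short Python sketch. This shows the host's POSIX behaviour only, not kLIBC or OS/2 pipes, and the function names are made up for illustration.

```python
import os
import socket


def pipe_roundtrip(data):
    # os.pipe() returns (read_end, write_end); data flows one way only.
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, data)
        return os.read(read_end, len(data))
    finally:
        os.close(read_end)
        os.close(write_end)


def pipe_read_end_accepts_writes():
    # Writing to the read end is expected to fail on typical Unix systems.
    read_end, write_end = os.pipe()
    try:
        os.write(read_end, b"x")
        return True
    except OSError:
        return False
    finally:
        os.close(read_end)
        os.close(write_end)


def socketpair_roundtrip(data):
    # Both ends of a socketpair can send and receive.
    left, right = socket.socketpair()
    try:
        left.sendall(data)
        received = right.recv(len(data))
        right.sendall(received.upper())
        reply = left.recv(len(data))
        return received, reply
    finally:
        left.close()
        right.close()
```

Code that relies on a pipe end being read-only or write-only could therefore behave differently if pipe were silently swapped for socketpair.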

comment:15 Changed 3 years ago by dmik

Fork is restored in r1009; this also fixes #204 (requires a new NSPR which will be released soon).

comment:16 Changed 3 years ago by dmik

  • Resolution set to fixed
  • Status changed from reopened to closed

I also restored fork everywhere in RPM and it seems to work smoothly. A new release is being uploaded to the repositories. Closing this. Please try and reopen if necessary.
