Opened 10 months ago

Last modified 3 days ago

#143 new defect

hpijs returns "Failed to execute ghostscript to determine number of input pages!"

Reported by: lewisr Owned by:
Priority: Feedack pending Milestone:
Component: cups Version:
Severity: high Keywords:
Cc: steve53@…

Description

Socket backend used for HP Color LaserJet 3800. Confirmed printing works with the hpcups driver (not preferred). However, selecting the preferred hpijs driver, printing fails with:

D [12/Feb/2017:13:04:17 +0500] [Job 4] backendWaitLoop(snmp_fd=5, addr=0x20037da4, side_cb=0x110f0)
D [12/Feb/2017:13:04:17 +0500] [Job 4] PDF template file doesn\'t have form. It\'s okay.
D [12/Feb/2017:13:04:17 +0500] [Job 4] Filetype: PDF
D [12/Feb/2017:13:04:17 +0500] [Job 4] Storing temporary files in /@unixroot/var/spool/cups/tmp
D [12/Feb/2017:13:04:17 +0500] [Job 4] Process is dying with \"Failed to execute ghostscript to determine number of input pages!
D [12/Feb/2017:13:04:17 +0500] [Job 4] \", exit stat 6
D [12/Feb/2017:13:04:17 +0500] [Job 4] Cleaning up...
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 71 (/@unixroot/usr/lib/cups/filter/bannertopdf.exe) exited with no errors.
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 72 (/@unixroot/usr/lib/cups/filter/pdftopdf.exe) exited with no errors.
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 74 (/@unixroot/usr/lib/cups/filter/foomatic-rip.exe) stopped with status 6.
D [12/Feb/2017:13:04:17 +0500] [Job 4] Hint: Try setting the LogLevel to "debug" to find out more.
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 77 (/@unixroot/usr/lib/cups/backend/socket.exe) exited with no errors.
D [12/Feb/2017:13:04:17 +0500] [Job 4] End of messages
D [12/Feb/2017:13:04:17 +0500] [Job 4] printer-state=3(idle)
D [12/Feb/2017:13:04:17 +0500] [Job 4] printer-state-message="Filter failed"
D [12/Feb/2017:13:04:17 +0500] [Job 4] printer-state-reasons=none

I have tried (via yum-replace-plugin) swapping gnu-ghostscript for latest ghostscript, and the error is consistent with either one.

It is also apparently not possible to run gs.exe from an OS/2 shell; instead this must be done from dash (or a compatible *nix shell). The shell reported in the error log is bash, so I doubt this is an issue.

foomatic-rip?

Attachments (19)

error_log (12.2 KB) - added by lewisr 10 months ago.
CUPS error log trimmed for one job
error_log-2017-09-26 (103.2 KB) - added by lewisr 3 months ago.
Fresh error log (SET SHELL=sh in CONFIG.SYS)
error_log-2017-09-26-socket (177.0 KB) - added by lewisr 3 months ago.
Error log (same configuration, but using socket backend)
gs_debug.log (1.3 KB) - added by lewisr 3 months ago.
GS debug log
foomatic-5SP7AX (39.1 KB) - added by lewisr 3 months ago.
error_log-tmpdir-with-slashes (12.5 KB) - added by lewisr 3 months ago.
TMPDIR set with slashes in master environment
error_log-2017-12-07 (105.5 KB) - added by lewisr 9 days ago.
updated error log (GPL GS 9.18, CUPS 2.1.3)
error_log-2017-12-08 (114.0 KB) - added by lewisr 8 days ago.
TMP and TEMP changed in session (not master env). Printing fails.
error_log-2017-12-08-TempDir (90.7 KB) - added by lewisr 8 days ago.
TempDir uncommented in cups-files.conf. ijs succeeds.
error_log-2017-12-08-TempDir-PS (78.8 KB) - added by lewisr 8 days ago.
TempDir uncommented in cups-files.conf. PS fails (filter error).
foomatic-rip.exe (155.8 KB) - added by diver 6 days ago.
please try the attached foomatic-rip.exe. this should resolve the tempdir issue.
g.py (1.4 KB) - added by diver 5 days ago.
please try this script as well and gimme the output
stdout (660 bytes) - added by diver 5 days ago.
patch usr/share/hplip/base/g.py with the attached one please and try again
error_log-2017-12-12 (57.3 KB) - added by lewisr 5 days ago.
new foomatic-rip.exe and installed g.py patched per previous attachment
gs.sh (396 bytes) - added by diver 4 days ago.
libc_log.zip (120.3 KB) - added by diver 4 days ago.
libc_logs. pid 041 is the falling one
gsos2.exe (176.5 KB) - added by diver 3 days ago.
with this gsos2.exe the crash is gone. please try
error_log-2017-12-13 (3.4 KB) - added by lewisr 3 days ago.
ijs error after test libcx0 (may be unrelated to fixes here)
error_log-2017-12-13-PS (86.2 KB) - added by lewisr 3 days ago.
PS error after test libcx0 (may be unrelated to fixes here)

Download all attachments as: .zip

Change History (74)

comment:1 Changed 10 months ago by lewisr

FWIW, temporary output file is readable PDF.

comment:2 Changed 10 months ago by diver

Please add a debug log.
And gs.exe can only run in shell or libc based apps as it's a link to gsos2.Exe

comment:3 Changed 10 months ago by lewisr

I thought gs.exe was a link. I should have looked more closely. There is a ticket open for 4OS2 for better handling of symlinks. One day, I'll remember that. ;-)

Attaching log (hopefully, it will fit; if not, I'll split it).

Changed 10 months ago by lewisr

CUPS error log trimmed for one job

comment:4 Changed 10 months ago by diver

do you have any "SHELL" env var? If not please add SHELL=sh and try again.

comment:5 Changed 10 months ago by diver

  • Priority changed from major to Feedack pending

hmm no reaction within a week.

comment:6 Changed 3 months ago by diver

is this also still open?

comment:7 Changed 3 months ago by lewisr

Sorry, yes, this is still an issue (just tested now against ArcaOS 5.0.2 pre).

Without setting SHELL= to sh, the job fails as before. The log indicates /bin/bash is the shell (which does not exist):

D [26/Sep/2017:13:44:35 +0500] [Job 7] Added option Font
D [26/Sep/2017:13:44:35 +0500] [Job 7] Parameter Summary
D [26/Sep/2017:13:44:35 +0500] [Job 7] -----------------
D [26/Sep/2017:13:44:35 +0500] [Job 7] Spooler: cups
D [26/Sep/2017:13:44:35 +0500] [Job 7] Printer: HP_Color_LaserJet_3800
D [26/Sep/2017:13:44:35 +0500] [Job 7] Shell: /bin/bash
D [26/Sep/2017:13:44:35 +0500] [Job 7] PPD file: /@unixroot/etc/cups/ppd/HP_Color_LaserJet_3800.ppd
D [26/Sep/2017:13:44:35 +0500] [Job 7] ATTR file:
D [26/Sep/2017:13:44:35 +0500] [Job 7] Printer model: HP Color LaserJet 3800 hpijs pcl3, 3.16.5
D [26/Sep/2017:13:44:35 +0500] [Job 7] Job title: Test Page
D [26/Sep/2017:13:44:35 +0500] [Job 7] File(s) to be printed:
D [26/Sep/2017:13:44:35 +0500] [Job 7] <STDIN>
D [26/Sep/2017:13:44:35 +0500] [Job 7] Ghostscript extra search path (\'GS_LIB\'
): /@unixroot/usr/share/cups/fonts

I'll change the SHELL statement in CONFIG.SYS, reboot, and retry.

Last edited 3 months ago by lewisr (previous) (diff)

comment:8 Changed 3 months ago by lewisr

No change.

D [26/Sep/2017:13:56:20 +0500] [Job 8] Filetype: PDF
D [26/Sep/2017:13:56:20 +0500] [Job 8] Storing temporary files in H:\\var\\temp
D [26/Sep/2017:13:56:20 +0500] [Job 8] Process is dying with \"Failed to execute ghostscript to determine number of input pages!

Changed 3 months ago by lewisr

Fresh error log (SET SHELL=sh in CONFIG.SYS)

Changed 3 months ago by lewisr

Error log (same configuration, but using socket backend)

comment:9 Changed 3 months ago by diver

please update to latest cups and cups-filters. I doubt a lot changed, but it's easier to track the errors as I don't work in the old source anymore.

comment:10 Changed 3 months ago by diver

also as a test, change the symlink of gs.exe to gs in usr/bin

comment:11 Changed 3 months ago by diver

also use the gsos2.exe which is attached to ticket http://trac.netlabs.org/ports/ticket/157 and add the setting as described in http://trac.netlabs.org/ports/ticket/157#comment:19

I wonder if gs is called at all, or if it fails earlier in popen()

comment:12 Changed 3 months ago by lewisr

Updated cups to 2.1.3-8 and cups-filters to 1.13.3-2 (and requires updated as necessary).

In the interim, I had to reinstall this test setup. CUPS selects hpijs pcl3, 3.16.11 (en) for this LaserJet 3800 (socket backend for this test).

Without changing the symlink or using the gsos2.exe attached to ticket #157, I get the same failure as originally described. That is, of course, until the CUPS admin stops responding at all (links on Maintenance and Administration dropdowns for printer to nothing). Downgrading to 2.1.3-7 and 1.13.3-1 gets me a workable printer page (it;s not just restarting the daemon by itself did *not* fix this; I had to downgrade to get access again.

Renaming the symlink from gs.exe to gs does not help.

Installed gsos2.exe from #157 and set master environment as described in ticket:157#comment:19. I can verify that I do have the gsos2.exe attached to that ticket and GS_DEBUG is set to 1, but I get no log file. In fact, I can't run that gsos2.exe from dash (I just return to a prompt).

That version of gsos2.exe is not for gnu-ghostscript (gsdll2.dll is not compatible). Switched to Artifex GS 9.18-5. Also re-updated cups, etc., again. Last test was with latest cups and friends.

Debug log attached.

Last edited 9 days ago by lewisr (previous) (diff)

Changed 3 months ago by lewisr

GS debug log

comment:13 Changed 3 months ago by diver

ok GS clearly gets back with an rc of -101. I need to see why this would be. But this might take some time, as I'm at other tasks right now.

comment:14 Changed 3 months ago by diver

hmm now I'm really lost. as this rc -101 is quit, which is ok and handled in gsos2.exe. I need to try to have such a setup as well, as else we get nowhere.

comment:15 Changed 3 months ago by diver

do you have files in h:\var\temp which have about the same timestamp as your tests? I wonder if there are some or how they look like. If there are any, please attach one of them here.

Changed 3 months ago by lewisr

comment:16 Changed 3 months ago by lewisr

Attached foomatic-5SP7AX, and as stated in comment:1, this is a fully readable PDF.

comment:17 Changed 3 months ago by diver

I might have an idea. Could you change your tmpdir env from h:\var\temp to h:/var/temp and try again? In my limited tests with the file you provided and a backslash as directory separator gs fails.

Changed 3 months ago by lewisr

TMPDIR set with slashes in master environment

comment:18 Changed 3 months ago by lewisr

Nice idea, Silvan. Unfortunately, that's not quite it, either:

D [28/Sep/2017:14:49:30 +0500] [Job 11] Storing temporary files in H:/var/temp
D [28/Sep/2017:14:49:30 +0500] [Job 11] Process is dying with \"Failed to execute ghostscript to determine number of input pages!
D [28/Sep/2017:14:49:30 +0500] [Job 11] \", exit stat 6
D [28/Sep/2017:14:49:30 +0500] [Job 11] Cleaning up...

Full error log for the job, attached.

comment:19 Changed 3 months ago by diver

Then I have no more ideas atm. I will try to do a setup like your in the next weeks.

comment:20 Changed 3 months ago by diver

one thing you could try it to gather libc logs with the libc logging dll. I bet Steven knows exactly how to do it, if you don't. This might shed some light.

As it seems like popen() doesn't end correct. See
http://trac.netlabs.org/ports/browser/cups-filter/trunk/filter/foomatic-rip/pdf.c#L56
that's the falling line.

comment:21 Changed 2 months ago by diver

you might also try the latest 1.17.2 rpm of cups-filters. there I added the errno, which popen() brings us back. This might also shed some lights

comment:22 Changed 2 months ago by diver

any followup?

comment:23 Changed 2 months ago by lewisr

Steve and I are going to spend some quality time with this before the beginning of the week, and I will follow up then. Thanks for the additional suggestion of the newer cups-filtersw, Silvan. I'll be sure that I have that in place before testing.

comment:24 Changed 9 days ago by diver

dos the latest cups, cups-filters and ghostscript combination work better?

comment:25 Changed 9 days ago by lewisr

HPIJS still fails (GPL Ghostscript 9.18):

D [27/Nov/2017:11:37:58 +0500] [Job 4] PDF template file doesn\'t have form. It\'s okay.
D [27/Nov/2017:11:37:58 +0500] [Job 4] PAGE: 1 1
D [27/Nov/2017:11:37:58 +0500] [Job 4] Filetype: PDF
D [27/Nov/2017:11:37:58 +0500] [Job 4] Storing temporary files in C:\\var\\temp
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 330 (/@unixroot/usr/lib/cups/filter/bannertopdf.exe) exited with no errors.
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 332 (/@unixroot/usr/lib/cups/filter/pdftopdf.exe) exited with no errors.
D [27/Nov/2017:11:37:58 +0500] [Job 4] GPL Ghostscript 9.18: Unrecoverable error, exit code 1
D [27/Nov/2017:11:37:58 +0500] [Job 4] Process is dying with \"Unable to determine number of pages, page count: -1
D [27/Nov/2017:11:37:58 +0500] [Job 4] \", exit stat 3
D [27/Nov/2017:11:37:58 +0500] [Job 4] Cleaning up...
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 334 (/@unixroot/usr/lib/cups/filter/foomatic-rip.exe) stopped with status 3.
D [27/Nov/2017:11:37:58 +0500] [Job 4] Hint: Try setting the LogLevel to "debug" to find out more.
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 336 (/@unixroot/usr/lib/cups/backend/ipp.exe) exited with no errors.
D [27/Nov/2017:11:37:58 +0500] [Job 4] End of messages
D [27/Nov/2017:11:37:58 +0500] [Job 4] printer-state=3(idle)
D [27/Nov/2017:11:37:58 +0500] [Job 4] printer-state-message="Filter failed"
D [27/Nov/2017:11:37:58 +0500] [Job 4] printer-state-reasons=none

Still need to get some time with Steve to catch more debug data. Sorry this has taken so long.

Changed 9 days ago by lewisr

updated error log (GPL GS 9.18, CUPS 2.1.3)

comment:26 Changed 8 days ago by diver

I will create a foomatic-rip.exe which does not delete the temp file after failure. Then we can send that file to GS and see if GS has an issue, or GS within cups.

comment:27 Changed 8 days ago by diver

I just see we already did that and found out that the PDF file is valid. So still a mystery to me why GS fails. But it might also be it's not GS that fails, but instead popen(). We have some popen() issues afair.

comment:28 Changed 8 days ago by diver

hmm but I also see you have a slightly different error now. GS is executed, while it failed earlier. Could you try the tempdir \ vs / again please? As GS doesn't like \ from what I see.

comment:29 Changed 8 days ago by lewisr

Okay, I found it for the ijs driver (thanks to your help, Silvan).

Yes, it was the backslash issue. Setting:

TMP=C:/VAR/TEMP
TEMP=C:/VAR/TEMP

in the session and then starting the daemon in the foreground does not get us around it (apparently, we either read this from the master environment, so setting it in a wrapper won't get us what we need, or "/" is converted to "\" when coming from the shell - I can't very easily set the master environment to use slashes when everything else on OS/2 wants backslashes - LOL). Rather, what is necessary is to explicitly set

 TempDir

in cups-files.conf gets us past this. All I did was to uncomment the "default" setting of /@unixroot/var/spool/cups/tmp and restart the daemon.

PS still fails, however. Logs coming momentarily.

Last edited 8 days ago by lewisr (previous) (diff)

Changed 8 days ago by lewisr

TMP and TEMP changed in session (not master env). Printing fails.

Changed 8 days ago by lewisr

TempDir uncommented in cups-files.conf. ijs succeeds.

Changed 8 days ago by lewisr

TempDir uncommented in cups-files.conf. PS fails (filter error).

comment:30 Changed 8 days ago by lewisr

Looking at the earlier log when I did set TMP in the master environment with slashes, either something else changed between versions or perhaps it's not the slashes so much as the drive letter which caused the issue with the ijs driver.

comment:31 Changed 8 days ago by stevenhl

  • Cc steve53@… added

comment:32 Changed 7 days ago by diver

A lot changed in between the versions. And also gs changed. So comparing those makes not much sense 😂. As it looks like the slash issue I will build a fixed foomatic exe. Then we know more.

comment:33 Changed 7 days ago by diver

Ok ic why it then still fails. Hpps python scriot needs a fix also. I will provide that on monday. We get there soon.

Last edited 7 days ago by diver (previous) (diff)

Changed 6 days ago by diver

please try the attached foomatic-rip.exe. this should resolve the tempdir issue.

comment:34 Changed 5 days ago by diver

does the system you test on has a "klibc user management" in place? as in theory it should grab the missing infos from there.

what do the following 2 python scripts print?

# Std Lib
import sys
import os
import os.path
import locale
import pwd
import stat
import re

username = pwd.getpwuid(os.getuid())[0]
pdb = pwd.getpwnam(username)
userhome = pdb[5]

print("username %s, userhome %s %s") % (username, userhome, os.name)
# Std Lib
import sys
import os
import os.path
import locale
import pwd
import stat
import re

if os.name == 'os2':
  username = os.getenv("USER")
  userhome = os.getenv("HOME")
else:
  username = pwd.getpwuid(os.getuid())[0]
  pdb = pwd.getpwnam(username)
  userhome = pdb[5]

print("username %s, userhome %s %s") % (username, userhome, os.name)

Changed 5 days ago by diver

please try this script as well and gimme the output

Changed 5 days ago by diver

patch usr/share/hplip/base/g.py with the attached one please and try again

comment:35 Changed 5 days ago by lewisr

Overall clarification

I didn't look at the actual printer output form the last tests until much later. While we did get printer output (i.e., no failure reported per this initial ticket), the output was :

**** Unable to open the initial device, quitting.

This same output was produced for the ijs and ps drivers. The job is then shown as stopped "Filter failed" in the list.

foomatic-rip.exe

After commenting the TempDir directive from cups-files.conf, output is sent to the printer. the error log says:

D [11/Dec/2017:11:25:40 +0500] [Job 16] envp[13]="TMPDIR=C:\\var\\temp"

so this appears to be working as expected.

Other info

does the system you test on has a "klibc user management" in place? as in theory it should grab the missing infos from there.

Yes, though completely unconfigured from the initial installation. root is UID 0, with no full name (Missing message 193!) and no %HOME% specified. User is UID 100 with a username of User and %HOME% of H:\Home (as set in CONFIG.SYS).

First script pasted above (test1.py):

{0}[c:\] python test1.py
Traceback (most recent call last):
  File "test1.py", line 10, in <module>
    username = pwd.getpwuid(os.getuid())[0]
KeyError: 'getpwuid(): uid not found: 0'

Second script pasted above (test2.py):

{0}[c:\] python test2.py
username User, userhome H:\Home os2

These rather confirm what I just posted about users listed in klusrmgr (guest is also shown with no username and "Missing message 194!" in the list).

g.py as attached

Output:

{0}[h:\home\downloads] python g.py
username User, userhome H:\Home os2

Patched usr/share/hplip/base/g.py

Output:

{0}[c:\usr\share\hplip\base] python g.py
Traceback (most recent call last):
  File "g.py", line 29, in <module>
    from .sixext import PY3
ValueError: Attempted relative import in non-package

Note the following from sixext.py:

"""Utilities for writing code that runs on Python 2 and 3"""
import sys
import types
from .six import *

Looks like we have an extraneous "." perhaps. These scripts are filled with them. Also, the line:

from . import logger

makes no sense, and neither does the line following that:

from . import os_utils

(os_utils should be in base, I think).

logger.py is also broken.

If I've misstated any of this, please advise. My python is still rather limited, so I may be missing something obvious.

Last edited 5 days ago by lewisr (previous) (diff)

comment:36 Changed 5 days ago by diver

You can't run the patched g.py direct. Please try to print something with the patched g.py in place and the foomatic-rip from here.
The klusrmgr has an issue and a new version of that is also almost done.

Last edited 5 days ago by diver (previous) (diff)

comment:37 Changed 5 days ago by lewisr

Hmmm...

Okay. the syntax definitely disagrees with the syntax on a couple Linux boxes I have here, but you're the expert.

Same result:

**** Unable to open the initial device, quitting.

and in the web UI:

stopped "Filter failed"

Log attached.

This was with the installed g.py patched per your attachment.

How did you want me to run the g.py which attached?

Last edited 5 days ago by lewisr (previous) (diff)

Changed 5 days ago by lewisr

new foomatic-rip.exe and installed g.py patched per previous attachment

comment:38 Changed 5 days ago by diver

ok we do progress. The origin issue of not finding the number of pages is solved now.
But now GS wants to start a hpijs server and can't find that. I need to see what the heck this is. It seems to me that this ijs stuff is done a bit too complex.
but as said earlier we get there once.

Last edited 5 days ago by diver (previous) (diff)

comment:39 Changed 4 days ago by diver

I did some debugging and unfortunately it's a fork() issue. fork gives back EINTR. Why I have no idea. you can see that with the attached gs.sh. I used the file which was attached here some weeks ago.
Eventually Steven has an idea

Changed 4 days ago by diver

comment:40 Changed 4 days ago by stevenhl

On possible cause of EINTR is an unhandled signal. Another is that the kLIBC has detected a hang in the fork process and issues a signal to cancel the fork process. Do you have a link to the sources with the failing fork() call?

comment:41 Changed 4 days ago by diver

http://trac.netlabs.org/ports/browser/ghostscript/trunk/ijs/ijs_exec_unix.c#L53
The pid which fork() brings back is -1, so the crash is within fork itself.

comment:42 Changed 4 days ago by stevenhl

FYI, the bulk of the kLIBC fork code is at src/emx/src/lib/sys/libcfork.c.

One possible reason for EINTR is an exception in the child. Since the failure is repeatable, you might want to test with the logging version of libc066.dll which will report the exception in more detail.

comment:43 Changed 4 days ago by diver

I did that already, but it tells me not much to be honest. It might be a memory issue, but that's more a wild guess.

Changed 4 days ago by diver

libc_logs. pid 041 is the falling one

comment:44 Changed 4 days ago by diver

[E:\Trees\tests]sh gs.sh
bww trace before fork()
bww trace after fork() child_pid=-1 errno=4 (Interrupted system call)
GPL Ghostscript 9.18: Can't start ijs server "hpijs"
**** Unable to open the initial device, quitting.

LIBC PANIC!!
LIBC fork: Child aborting fork()! rc=0xfffffffc
pid=0x0041 ppid=0x0040 tid=0x0001 slot=0x008a pri=0x0200 mc=0x0000 ps=0x0010
C:\USR\BIN\GSOS2.EXE
LIBC066 0:00004f6e
cs:eip=005b:1f424f6e      ss:esp=0053:0212beec      ebp=0212bf24
 ds=0053      es=0053      fs=150b      gs=0000     efl=00010246
eax=00058fba ebx=5cd273c0 ecx=02140048 edx=0212bc80 edi=00000040 esi=19120000
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
00059024 01 0d 0018 Asrt: Assertion Failed!!!
00059024 01 0d 0018 Asrt: Function: __libc_spmSocketDeref
00059024 01 0d 0018 Asrt: File:     D:/Coding/libc/libc-0.6/src/emx/src/lib/sys/
sharedpm.c
00059024 01 0d 0018 Asrt: Line:     1681
00059024 01 0d 0018 Asrt: Expr:     0
00059024 01 0d 0018 Asrt: iSocket=2105 already 0! (uRefs=0xffff0000)

this is what libc told me, when using the libc logchk dll
The 2 additional outputs are a printf() right before and after fork()

comment:45 Changed 4 days ago by stevenhl

This is a reference counting issue. The failing code is:

src\lib\sys\sharedpm.c:1681

LIBC_ASSERTM_FAILED("iSocket=%d already 0! (uRefs=%#x)\n", iSocket,

uRefs);

rc = -EBADF;

The assert overrides the EBADF return code.

The assert is triggering because there is at least one too many calls to libc_spmSocketDeref for socket #2105.

I need to poke about in the full log to find the ref/deref sequence.

I wonder if it is possible that socktidy is getting in the way here? I doubt that kLIBC expects other code to close the socket behind the curtains.

comment:46 Changed 4 days ago by stevenhl

OK. The assertion failure is a red herring. The real failure is your basic illegal access exception:

00058fba 01 09 0019 Mesg 0002 forkChlExceptionHandler: forkChlExceptionHandler: ExceptionNum?=0xc0000005 ExceptionAddress?=0x1f424f6e eip=0x1f424f6e ExceptionInfo?={0x000002,0x19120000,0x0051ec,0xfe2356a0} fHandlerFlags=0x0

We need to convert eip=0x1f424f6e to a selector:offset.

comment:47 Changed 4 days ago by stevenhl

OK, the log tells us

libc_logInit: addr 0x1f469840 iObj=0 offObj=0x49840

which means object 0 loads at 1f469840 - 49840 = 1f420000

which means the trap is at

1f424f6e - 1f420000 = 4f6e = 0000:00004f6e

Note that the object numbers reported by kLIBC are off by 1 compared to anything else.

The bad news is that the map tells us that this is a static number and I don't have a fully disassembled logchk version of libc066.

A WAG says the trap is somewhere in the _spawnve processing.

comment:48 follow-up: Changed 4 days ago by stevenhl

Make sense. It's probably

ijs_exec_unix.c:83

status = execvp (argv[0], argv);

that is causing the trap.

Can we add some logging code to verify this?

Has anyone checked for an exceptq report? forkChlExceptionHandler notifies the fork parent passes the exception on to the next handler in the fork child.

comment:49 Changed 4 days ago by stevenhl

No need. It's not execvp. It's been too long since I spent any quality time with Knut's logging code:

00058fba 01 09 0019 Entr 0002 forkBthCopyPagesMMX: pvDst=0x19120000 pvSrc=0x5cd273c0 cb=0x00001000
00058fba 01 09 0019 Mesg 0002 forkChlExceptionHandler: forkChlExceptionHandler: ExceptionNum?=0xc0000005 ExceptionAddress?=0x1f424f6e eip=0x1f424f6e ExceptionInfo?={0x000002,0x19120000,0x0051ec,0xfe2356a0} fHandlerFlags=0x0

I need to find the assembly code for forkBthCopyPagesMMX and decide how to figure out why 0x19120000 is not accessible.

comment:50 in reply to: ↑ 48 Changed 4 days ago by diver

Replying to stevenhl:

Make sense. It's probably

ijs_exec_unix.c:83

status = execvp (argv[0], argv);

that is causing the trap.

Can we add some logging code to verify this?

Has anyone checked for an exceptq report? forkChlExceptionHandler notifies the fork parent passes the exception on to the next handler in the fork child.

yes it's not execv, as it never get there at all. as fork() already crashes and gives back pid -1.
But you already saw that as well according to your comments.

comment:51 Changed 3 days ago by diver

and I removed socktidy, rebooted and tested again. I got the same crash. so socktidy seems not to interfere

comment:52 Changed 3 days ago by diver

it smells like http://trac.netlabs.org/libc/ticket/372
I will use DosLoadModuleEx?() and see what we gain

Changed 3 days ago by diver

with this gsos2.exe the crash is gone. please try

comment:53 Changed 3 days ago by lewisr

ijs output is terribly slow (minutes delay), and while I do now get a test page, it is very poor quality (snapshot coming soon). PS fails with "filter failed" and no output so far (log coming).

Perhaps I need to roll back previous script changes?

comment:54 follow-up: Changed 3 days ago by lewisr

After applying the test libcx0.dll as discussed in today's meeting:

11-29-17  18:53         119,344      0   ___A_  libcx0.dll

Output is much faster, but test page result for ijs and PS is:

*** HEAP RESULT 1

Logs attached.

Changed 3 days ago by lewisr

ijs error after test libcx0 (may be unrelated to fixes here)

Changed 3 days ago by lewisr

PS error after test libcx0 (may be unrelated to fixes here)

comment:55 in reply to: ↑ 54 Changed 3 days ago by diver

Replying to lewisr:

After applying the test libcx0.dll as discussed in today's meeting:

11-29-17  18:53         119,344      0   ___A_  libcx0.dll

Output is much faster, but test page result for ijs and PS is:

*** HEAP RESULT 1

thats because it's a debug libcx. ask @dmik to build a non debug for testing

Logs attached.

Note: See TracTickets for help on using tickets.