Opened 7 years ago

Closed 6 years ago

#143 closed defect (fixed)

hpijs returns "Failed to execute ghostscript to determine number of input pages!"

Reported by: Lewis Rosenthal Owned by:
Priority: Feedback pending Milestone:
Component: cups Version:
Severity: high Keywords:
Cc: steve53@…

Description

Socket backend used for HP Color LaserJet 3800. Confirmed printing works with the hpcups driver (not preferred). However, selecting the preferred hpijs driver, printing fails with:

D [12/Feb/2017:13:04:17 +0500] [Job 4] backendWaitLoop(snmp_fd=5, addr=0x20037da4, side_cb=0x110f0)
D [12/Feb/2017:13:04:17 +0500] [Job 4] PDF template file doesn\'t have form. It\'s okay.
D [12/Feb/2017:13:04:17 +0500] [Job 4] Filetype: PDF
D [12/Feb/2017:13:04:17 +0500] [Job 4] Storing temporary files in /@unixroot/var/spool/cups/tmp
D [12/Feb/2017:13:04:17 +0500] [Job 4] Process is dying with \"Failed to execute ghostscript to determine number of input pages!
D [12/Feb/2017:13:04:17 +0500] [Job 4] \", exit stat 6
D [12/Feb/2017:13:04:17 +0500] [Job 4] Cleaning up...
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 71 (/@unixroot/usr/lib/cups/filter/bannertopdf.exe) exited with no errors.
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 72 (/@unixroot/usr/lib/cups/filter/pdftopdf.exe) exited with no errors.
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 74 (/@unixroot/usr/lib/cups/filter/foomatic-rip.exe) stopped with status 6.
D [12/Feb/2017:13:04:17 +0500] [Job 4] Hint: Try setting the LogLevel to "debug" to find out more.
D [12/Feb/2017:13:04:17 +0500] [Job 4] PID 77 (/@unixroot/usr/lib/cups/backend/socket.exe) exited with no errors.
D [12/Feb/2017:13:04:17 +0500] [Job 4] End of messages
D [12/Feb/2017:13:04:17 +0500] [Job 4] printer-state=3(idle)
D [12/Feb/2017:13:04:17 +0500] [Job 4] printer-state-message="Filter failed"
D [12/Feb/2017:13:04:17 +0500] [Job 4] printer-state-reasons=none

I have tried (via yum-replace-plugin) swapping gnu-ghostscript for latest ghostscript, and the error is consistent with either one.

It is also apparently not possible to run gs.exe from an OS/2 shell; instead this must be done from dash (or a compatible *nix shell). The shell reported in the error log is bash, so I doubt this is an issue.

foomatic-rip?

Attachments (20)

error_log (12.2 KB) - added by Lewis Rosenthal 7 years ago.
CUPS error log trimmed for one job
error_log-2017-09-26 (103.2 KB) - added by Lewis Rosenthal 6 years ago.
Fresh error log (SET SHELL=sh in CONFIG.SYS)
error_log-2017-09-26-socket (177.0 KB) - added by Lewis Rosenthal 6 years ago.
Error log (same configuration, but using socket backend)
gs_debug.log (1.3 KB) - added by Lewis Rosenthal 6 years ago.
GS debug log
foomatic-5SP7AX (39.1 KB) - added by Lewis Rosenthal 6 years ago.
error_log-tmpdir-with-slashes (12.5 KB) - added by Lewis Rosenthal 6 years ago.
TMPDIR set with slashes in master environment
error_log-2017-12-07 (105.5 KB) - added by Lewis Rosenthal 6 years ago.
updated error log (GPL GS 9.18, CUPS 2.1.3)
error_log-2017-12-08 (114.0 KB) - added by Lewis Rosenthal 6 years ago.
TMP and TEMP changed in session (not master env). Printing fails.
error_log-2017-12-08-TempDir (90.7 KB) - added by Lewis Rosenthal 6 years ago.
TempDir uncommented in cups-files.conf. ijs succeeds.
error_log-2017-12-08-TempDir-PS (78.8 KB) - added by Lewis Rosenthal 6 years ago.
TempDir uncommented in cups-files.conf. PS fails (filter error).
foomatic-rip.exe (155.8 KB) - added by Silvan Scherrer 6 years ago.
please try the attached foomatic-rip.exe. this should resolve the tempdir issue.
g.py (1.4 KB) - added by Silvan Scherrer 6 years ago.
please try this script as well and gimme the output
stdout (660 bytes) - added by Silvan Scherrer 6 years ago.
patch usr/share/hplip/base/g.py with the attached one please and try again
error_log-2017-12-12 (57.3 KB) - added by Lewis Rosenthal 6 years ago.
new foomatic-rip.exe and installed g.py patched per previous attachment
gs.sh (396 bytes) - added by Silvan Scherrer 6 years ago.
libc_log.zip (120.3 KB) - added by Silvan Scherrer 6 years ago.
libc_logs. pid 041 is the falling one
gsos2.exe (176.5 KB) - added by Silvan Scherrer 6 years ago.
with this gsos2.exe the crash is gone. please try
error_log-2017-12-13 (3.4 KB) - added by Lewis Rosenthal 6 years ago.
ijs error after test libcx0 (may be unrelated to fixes here)
error_log-2017-12-13-PS (86.2 KB) - added by Lewis Rosenthal 6 years ago.
PS error after test libcx0 (may be unrelated to fixes here)
error_log-2018-02-05 (91.2 KB) - added by Lewis Rosenthal 6 years ago.
Fresh error log, latest exp packages installed

Download all attachments as: .zip

Change History (82)

comment:1 Changed 7 years ago by Lewis Rosenthal

FWIW, temporary output file is readable PDF.

comment:2 Changed 7 years ago by Silvan Scherrer

Please add a debug log.
And gs.exe can only run in shell or libc based apps as it's a link to gsos2.Exe

comment:3 Changed 7 years ago by Lewis Rosenthal

I thought gs.exe was a link. I should have looked more closely. There is a ticket open for 4OS2 for better handling of symlinks. One day, I'll remember that. ;-)

Attaching log (hopefully, it will fit; if not, I'll split it).

Changed 7 years ago by Lewis Rosenthal

Attachment: error_log added

CUPS error log trimmed for one job

comment:4 Changed 7 years ago by Silvan Scherrer

do you have any "SHELL" env var? If not please add SHELL=sh and try again.

comment:5 Changed 7 years ago by Silvan Scherrer

Priority: majorFeedack pending

hmm no reaction within a week.

comment:6 Changed 7 years ago by Silvan Scherrer

is this also still open?

comment:7 Changed 6 years ago by Lewis Rosenthal

Sorry, yes, this is still an issue (just tested now against ArcaOS 5.0.2 pre).

Without setting SHELL= to sh, the job fails as before. The log indicates /bin/bash is the shell (which does not exist):

D [26/Sep/2017:13:44:35 +0500] [Job 7] Added option Font
D [26/Sep/2017:13:44:35 +0500] [Job 7] Parameter Summary
D [26/Sep/2017:13:44:35 +0500] [Job 7] -----------------
D [26/Sep/2017:13:44:35 +0500] [Job 7] Spooler: cups
D [26/Sep/2017:13:44:35 +0500] [Job 7] Printer: HP_Color_LaserJet_3800
D [26/Sep/2017:13:44:35 +0500] [Job 7] Shell: /bin/bash
D [26/Sep/2017:13:44:35 +0500] [Job 7] PPD file: /@unixroot/etc/cups/ppd/HP_Color_LaserJet_3800.ppd
D [26/Sep/2017:13:44:35 +0500] [Job 7] ATTR file:
D [26/Sep/2017:13:44:35 +0500] [Job 7] Printer model: HP Color LaserJet 3800 hpijs pcl3, 3.16.5
D [26/Sep/2017:13:44:35 +0500] [Job 7] Job title: Test Page
D [26/Sep/2017:13:44:35 +0500] [Job 7] File(s) to be printed:
D [26/Sep/2017:13:44:35 +0500] [Job 7] <STDIN>
D [26/Sep/2017:13:44:35 +0500] [Job 7] Ghostscript extra search path (\'GS_LIB\'
): /@unixroot/usr/share/cups/fonts

I'll change the SHELL statement in CONFIG.SYS, reboot, and retry.

Last edited 6 years ago by Lewis Rosenthal (previous) (diff)

comment:8 Changed 6 years ago by Lewis Rosenthal

No change.

D [26/Sep/2017:13:56:20 +0500] [Job 8] Filetype: PDF
D [26/Sep/2017:13:56:20 +0500] [Job 8] Storing temporary files in H:\\var\\temp
D [26/Sep/2017:13:56:20 +0500] [Job 8] Process is dying with \"Failed to execute ghostscript to determine number of input pages!

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-09-26 added

Fresh error log (SET SHELL=sh in CONFIG.SYS)

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-09-26-socket added

Error log (same configuration, but using socket backend)

comment:9 Changed 6 years ago by Silvan Scherrer

please update to latest cups and cups-filters. I doubt a lot changed, but it's easier to track the errors as I don't work in the old source anymore.

comment:10 Changed 6 years ago by Silvan Scherrer

also as a test, change the symlink of gs.exe to gs in usr/bin

comment:11 Changed 6 years ago by Silvan Scherrer

also use the gsos2.exe which is attached to ticket http://trac.netlabs.org/ports/ticket/157 and add the setting as described in http://trac.netlabs.org/ports/ticket/157#comment:19

I wonder if gs is called at all, or if it fails earlier in popen()

comment:12 Changed 6 years ago by Lewis Rosenthal

Updated cups to 2.1.3-8 and cups-filters to 1.13.3-2 (and requires updated as necessary).

In the interim, I had to reinstall this test setup. CUPS selects hpijs pcl3, 3.16.11 (en) for this LaserJet 3800 (socket backend for this test).

Without changing the symlink or using the gsos2.exe attached to ticket #157, I get the same failure as originally described. That is, of course, until the CUPS admin stops responding at all (links on Maintenance and Administration dropdowns for printer to nothing). Downgrading to 2.1.3-7 and 1.13.3-1 gets me a workable printer page (it's not just restarting the daemon by itself did *not* fix this; I had to downgrade to get access again.

Renaming the symlink from gs.exe to gs does not help.

Installed gsos2.exe from #157 and set master environment as described in ticket:157#comment:19. I can verify that I do have the gsos2.exe attached to that ticket and GS_DEBUG is set to 1, but I get no log file. In fact, I can't run that gsos2.exe from dash (I just return to a prompt).

That version of gsos2.exe is not for gnu-ghostscript (gsdll2.dll is not compatible). Switched to Artifex GS 9.18-5. Also re-updated cups, etc., again. Last test was with latest cups and friends.

Debug log attached.

Last edited 6 years ago by Lewis Rosenthal (previous) (diff)

Changed 6 years ago by Lewis Rosenthal

Attachment: gs_debug.log added

GS debug log

comment:13 Changed 6 years ago by Silvan Scherrer

ok GS clearly gets back with an rc of -101. I need to see why this would be. But this might take some time, as I'm at other tasks right now.

comment:14 Changed 6 years ago by Silvan Scherrer

hmm now I'm really lost. as this rc -101 is quit, which is ok and handled in gsos2.exe. I need to try to have such a setup as well, as else we get nowhere.

comment:15 Changed 6 years ago by Silvan Scherrer

do you have files in h:\var\temp which have about the same timestamp as your tests? I wonder if there are some or how they look like. If there are any, please attach one of them here.

Changed 6 years ago by Lewis Rosenthal

Attachment: foomatic-5SP7AX added

comment:16 Changed 6 years ago by Lewis Rosenthal

Attached foomatic-5SP7AX, and as stated in comment:1, this is a fully readable PDF.

comment:17 Changed 6 years ago by Silvan Scherrer

I might have an idea. Could you change your tmpdir env from h:\var\temp to h:/var/temp and try again? In my limited tests with the file you provided and a backslash as directory separator gs fails.

Changed 6 years ago by Lewis Rosenthal

TMPDIR set with slashes in master environment

comment:18 Changed 6 years ago by Lewis Rosenthal

Nice idea, Silvan. Unfortunately, that's not quite it, either:

D [28/Sep/2017:14:49:30 +0500] [Job 11] Storing temporary files in H:/var/temp
D [28/Sep/2017:14:49:30 +0500] [Job 11] Process is dying with \"Failed to execute ghostscript to determine number of input pages!
D [28/Sep/2017:14:49:30 +0500] [Job 11] \", exit stat 6
D [28/Sep/2017:14:49:30 +0500] [Job 11] Cleaning up...

Full error log for the job, attached.

comment:19 Changed 6 years ago by Silvan Scherrer

Then I have no more ideas atm. I will try to do a setup like your in the next weeks.

comment:20 Changed 6 years ago by Silvan Scherrer

one thing you could try it to gather libc logs with the libc logging dll. I bet Steven knows exactly how to do it, if you don't. This might shed some light.

As it seems like popen() doesn't end correct. See
http://trac.netlabs.org/ports/browser/cups-filter/trunk/filter/foomatic-rip/pdf.c#L56
that's the falling line.

comment:21 Changed 6 years ago by Silvan Scherrer

you might also try the latest 1.17.2 rpm of cups-filters. there I added the errno, which popen() brings us back. This might also shed some lights

comment:22 Changed 6 years ago by Silvan Scherrer

any followup?

comment:23 Changed 6 years ago by Lewis Rosenthal

Steve and I are going to spend some quality time with this before the beginning of the week, and I will follow up then. Thanks for the additional suggestion of the newer cups-filtersw, Silvan. I'll be sure that I have that in place before testing.

comment:24 Changed 6 years ago by Silvan Scherrer

dos the latest cups, cups-filters and ghostscript combination work better?

comment:25 Changed 6 years ago by Lewis Rosenthal

HPIJS still fails (GPL Ghostscript 9.18):

D [27/Nov/2017:11:37:58 +0500] [Job 4] PDF template file doesn\'t have form. It\'s okay.
D [27/Nov/2017:11:37:58 +0500] [Job 4] PAGE: 1 1
D [27/Nov/2017:11:37:58 +0500] [Job 4] Filetype: PDF
D [27/Nov/2017:11:37:58 +0500] [Job 4] Storing temporary files in C:\\var\\temp
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 330 (/@unixroot/usr/lib/cups/filter/bannertopdf.exe) exited with no errors.
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 332 (/@unixroot/usr/lib/cups/filter/pdftopdf.exe) exited with no errors.
D [27/Nov/2017:11:37:58 +0500] [Job 4] GPL Ghostscript 9.18: Unrecoverable error, exit code 1
D [27/Nov/2017:11:37:58 +0500] [Job 4] Process is dying with \"Unable to determine number of pages, page count: -1
D [27/Nov/2017:11:37:58 +0500] [Job 4] \", exit stat 3
D [27/Nov/2017:11:37:58 +0500] [Job 4] Cleaning up...
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 334 (/@unixroot/usr/lib/cups/filter/foomatic-rip.exe) stopped with status 3.
D [27/Nov/2017:11:37:58 +0500] [Job 4] Hint: Try setting the LogLevel to "debug" to find out more.
D [27/Nov/2017:11:37:58 +0500] [Job 4] PID 336 (/@unixroot/usr/lib/cups/backend/ipp.exe) exited with no errors.
D [27/Nov/2017:11:37:58 +0500] [Job 4] End of messages
D [27/Nov/2017:11:37:58 +0500] [Job 4] printer-state=3(idle)
D [27/Nov/2017:11:37:58 +0500] [Job 4] printer-state-message="Filter failed"
D [27/Nov/2017:11:37:58 +0500] [Job 4] printer-state-reasons=none

Still need to get some time with Steve to catch more debug data. Sorry this has taken so long.

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-12-07 added

updated error log (GPL GS 9.18, CUPS 2.1.3)

comment:26 Changed 6 years ago by Silvan Scherrer

I will create a foomatic-rip.exe which does not delete the temp file after failure. Then we can send that file to GS and see if GS has an issue, or GS within cups.

comment:27 Changed 6 years ago by Silvan Scherrer

I just see we already did that and found out that the PDF file is valid. So still a mystery to me why GS fails. But it might also be it's not GS that fails, but instead popen(). We have some popen() issues afair.

comment:28 Changed 6 years ago by Silvan Scherrer

hmm but I also see you have a slightly different error now. GS is executed, while it failed earlier. Could you try the tempdir \ vs / again please? As GS doesn't like \ from what I see.

comment:29 Changed 6 years ago by Lewis Rosenthal

Okay, I found it for the ijs driver (thanks to your help, Silvan).

Yes, it was the backslash issue. Setting:

TMP=C:/VAR/TEMP
TEMP=C:/VAR/TEMP

in the session and then starting the daemon in the foreground does not get us around it (apparently, we either read this from the master environment, so setting it in a wrapper won't get us what we need, or "/" is converted to "\" when coming from the shell - I can't very easily set the master environment to use slashes when everything else on OS/2 wants backslashes - LOL). Rather, what is necessary is to explicitly set

 TempDir

in cups-files.conf gets us past this. All I did was to uncomment the "default" setting of /@unixroot/var/spool/cups/tmp and restart the daemon.

PS still fails, however. Logs coming momentarily.

Last edited 6 years ago by Lewis Rosenthal (previous) (diff)

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-12-08 added

TMP and TEMP changed in session (not master env). Printing fails.

Changed 6 years ago by Lewis Rosenthal

TempDir uncommented in cups-files.conf. ijs succeeds.

Changed 6 years ago by Lewis Rosenthal

TempDir uncommented in cups-files.conf. PS fails (filter error).

comment:30 Changed 6 years ago by Lewis Rosenthal

Looking at the earlier log when I did set TMP in the master environment with slashes, either something else changed between versions or perhaps it's not the slashes so much as the drive letter which caused the issue with the ijs driver.

comment:31 Changed 6 years ago by Steven Levine

Cc: steve53@… added

comment:32 Changed 6 years ago by Silvan Scherrer

A lot changed in between the versions. And also gs changed. So comparing those makes not much sense 😂. As it looks like the slash issue I will build a fixed foomatic exe. Then we know more.

comment:33 Changed 6 years ago by Silvan Scherrer

Ok ic why it then still fails. Hpps python scriot needs a fix also. I will provide that on monday. We get there soon.

Last edited 6 years ago by Silvan Scherrer (previous) (diff)

Changed 6 years ago by Silvan Scherrer

Attachment: foomatic-rip.exe added

please try the attached foomatic-rip.exe. this should resolve the tempdir issue.

comment:34 Changed 6 years ago by Silvan Scherrer

does the system you test on has a "klibc user management" in place? as in theory it should grab the missing infos from there.

what do the following 2 python scripts print?

# Std Lib
import sys
import os
import os.path
import locale
import pwd
import stat
import re

username = pwd.getpwuid(os.getuid())[0]
pdb = pwd.getpwnam(username)
userhome = pdb[5]

print("username %s, userhome %s %s") % (username, userhome, os.name)
# Std Lib
import sys
import os
import os.path
import locale
import pwd
import stat
import re

if os.name == 'os2':
  username = os.getenv("USER")
  userhome = os.getenv("HOME")
else:
  username = pwd.getpwuid(os.getuid())[0]
  pdb = pwd.getpwnam(username)
  userhome = pdb[5]

print("username %s, userhome %s %s") % (username, userhome, os.name)

Changed 6 years ago by Silvan Scherrer

Attachment: g.py added

please try this script as well and gimme the output

Changed 6 years ago by Silvan Scherrer

Attachment: stdout added

patch usr/share/hplip/base/g.py with the attached one please and try again

comment:35 Changed 6 years ago by Lewis Rosenthal

Overall clarification

I didn't look at the actual printer output form the last tests until much later. While we did get printer output (i.e., no failure reported per this initial ticket), the output was :

**** Unable to open the initial device, quitting.

This same output was produced for the ijs and ps drivers. The job is then shown as stopped "Filter failed" in the list.

foomatic-rip.exe

After commenting the TempDir directive from cups-files.conf, output is sent to the printer. the error log says:

D [11/Dec/2017:11:25:40 +0500] [Job 16] envp[13]="TMPDIR=C:\\var\\temp"

so this appears to be working as expected.

Other info

does the system you test on has a "klibc user management" in place? as in theory it should grab the missing infos from there.

Yes, though completely unconfigured from the initial installation. root is UID 0, with no full name (Missing message 193!) and no %HOME% specified. User is UID 100 with a username of User and %HOME% of H:\Home (as set in CONFIG.SYS).

First script pasted above (test1.py):

{0}[c:\] python test1.py
Traceback (most recent call last):
  File "test1.py", line 10, in <module>
    username = pwd.getpwuid(os.getuid())[0]
KeyError: 'getpwuid(): uid not found: 0'

Second script pasted above (test2.py):

{0}[c:\] python test2.py
username User, userhome H:\Home os2

These rather confirm what I just posted about users listed in klusrmgr (guest is also shown with no username and "Missing message 194!" in the list).

g.py as attached

Output:

{0}[h:\home\downloads] python g.py
username User, userhome H:\Home os2

Patched usr/share/hplip/base/g.py

Output:

{0}[c:\usr\share\hplip\base] python g.py
Traceback (most recent call last):
  File "g.py", line 29, in <module>
    from .sixext import PY3
ValueError: Attempted relative import in non-package

Note the following from sixext.py:

"""Utilities for writing code that runs on Python 2 and 3"""
import sys
import types
from .six import *

Looks like we have an extraneous "." perhaps. These scripts are filled with them. Also, the line:

from . import logger

makes no sense, and neither does the line following that:

from . import os_utils

(os_utils should be in base, I think).

logger.py is also broken.

If I've misstated any of this, please advise. My python is still rather limited, so I may be missing something obvious.

Last edited 6 years ago by Lewis Rosenthal (previous) (diff)

comment:36 Changed 6 years ago by Silvan Scherrer

You can't run the patched g.py direct. Please try to print something with the patched g.py in place and the foomatic-rip from here.
The klusrmgr has an issue and a new version of that is also almost done.

Last edited 6 years ago by Silvan Scherrer (previous) (diff)

comment:37 Changed 6 years ago by Lewis Rosenthal

Hmmm...

Okay. the syntax definitely disagrees with the syntax on a couple Linux boxes I have here, but you're the expert.

Same result:

**** Unable to open the initial device, quitting.

and in the web UI:

stopped "Filter failed"

Log attached.

This was with the installed g.py patched per your attachment.

How did you want me to run the g.py which attached?

Last edited 6 years ago by Lewis Rosenthal (previous) (diff)

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-12-12 added

new foomatic-rip.exe and installed g.py patched per previous attachment

comment:38 Changed 6 years ago by Silvan Scherrer

ok we do progress. The origin issue of not finding the number of pages is solved now.
But now GS wants to start a hpijs server and can't find that. I need to see what the heck this is. It seems to me that this ijs stuff is done a bit too complex.
but as said earlier we get there once.

Last edited 6 years ago by Silvan Scherrer (previous) (diff)

comment:39 Changed 6 years ago by Silvan Scherrer

I did some debugging and unfortunately it's a fork() issue. fork gives back EINTR. Why I have no idea. you can see that with the attached gs.sh. I used the file which was attached here some weeks ago.
Eventually Steven has an idea

Changed 6 years ago by Silvan Scherrer

Attachment: gs.sh added

comment:40 Changed 6 years ago by Steven Levine

On possible cause of EINTR is an unhandled signal. Another is that the kLIBC has detected a hang in the fork process and issues a signal to cancel the fork process. Do you have a link to the sources with the failing fork() call?

comment:41 Changed 6 years ago by Silvan Scherrer

http://trac.netlabs.org/ports/browser/ghostscript/trunk/ijs/ijs_exec_unix.c#L53
The pid which fork() brings back is -1, so the crash is within fork itself.

comment:42 Changed 6 years ago by Steven Levine

FYI, the bulk of the kLIBC fork code is at src/emx/src/lib/sys/libcfork.c.

One possible reason for EINTR is an exception in the child. Since the failure is repeatable, you might want to test with the logging version of libc066.dll which will report the exception in more detail.

comment:43 Changed 6 years ago by Silvan Scherrer

I did that already, but it tells me not much to be honest. It might be a memory issue, but that's more a wild guess.

Changed 6 years ago by Silvan Scherrer

Attachment: libc_log.zip added

libc_logs. pid 041 is the falling one

comment:44 Changed 6 years ago by Silvan Scherrer

[E:\Trees\tests]sh gs.sh
bww trace before fork()
bww trace after fork() child_pid=-1 errno=4 (Interrupted system call)
GPL Ghostscript 9.18: Can't start ijs server "hpijs"
**** Unable to open the initial device, quitting.

LIBC PANIC!!
LIBC fork: Child aborting fork()! rc=0xfffffffc
pid=0x0041 ppid=0x0040 tid=0x0001 slot=0x008a pri=0x0200 mc=0x0000 ps=0x0010
C:\USR\BIN\GSOS2.EXE
LIBC066 0:00004f6e
cs:eip=005b:1f424f6e      ss:esp=0053:0212beec      ebp=0212bf24
 ds=0053      es=0053      fs=150b      gs=0000     efl=00010246
eax=00058fba ebx=5cd273c0 ecx=02140048 edx=0212bc80 edi=00000040 esi=19120000
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
00059024 01 0d 0018 Asrt: Assertion Failed!!!
00059024 01 0d 0018 Asrt: Function: __libc_spmSocketDeref
00059024 01 0d 0018 Asrt: File:     D:/Coding/libc/libc-0.6/src/emx/src/lib/sys/
sharedpm.c
00059024 01 0d 0018 Asrt: Line:     1681
00059024 01 0d 0018 Asrt: Expr:     0
00059024 01 0d 0018 Asrt: iSocket=2105 already 0! (uRefs=0xffff0000)

this is what libc told me, when using the libc logchk dll
The 2 additional outputs are a printf() right before and after fork()

comment:45 Changed 6 years ago by Steven Levine

This is a reference counting issue. The failing code is:

src\lib\sys\sharedpm.c:1681

LIBC_ASSERTM_FAILED("iSocket=%d already 0! (uRefs=%#x)\n", iSocket,

uRefs);

rc = -EBADF;

The assert overrides the EBADF return code.

The assert is triggering because there is at least one too many calls to libc_spmSocketDeref for socket #2105.

I need to poke about in the full log to find the ref/deref sequence.

I wonder if it is possible that socktidy is getting in the way here? I doubt that kLIBC expects other code to close the socket behind the curtains.

comment:46 Changed 6 years ago by Steven Levine

OK. The assertion failure is a red herring. The real failure is your basic illegal access exception:

00058fba 01 09 0019 Mesg 0002 forkChlExceptionHandler: forkChlExceptionHandler: ExceptionNum?=0xc0000005 ExceptionAddress?=0x1f424f6e eip=0x1f424f6e ExceptionInfo?={0x000002,0x19120000,0x0051ec,0xfe2356a0} fHandlerFlags=0x0

We need to convert eip=0x1f424f6e to a selector:offset.

comment:47 Changed 6 years ago by Steven Levine

OK, the log tells us

libc_logInit: addr 0x1f469840 iObj=0 offObj=0x49840

which means object 0 loads at 1f469840 - 49840 = 1f420000

which means the trap is at

1f424f6e - 1f420000 = 4f6e = 0000:00004f6e

Note that the object numbers reported by kLIBC are off by 1 compared to anything else.

The bad news is that the map tells us that this is a static number and I don't have a fully disassembled logchk version of libc066.

A WAG says the trap is somewhere in the _spawnve processing.

comment:48 Changed 6 years ago by Steven Levine

Make sense. It's probably

ijs_exec_unix.c:83

status = execvp (argv[0], argv);

that is causing the trap.

Can we add some logging code to verify this?

Has anyone checked for an exceptq report? forkChlExceptionHandler notifies the fork parent passes the exception on to the next handler in the fork child.

comment:49 Changed 6 years ago by Steven Levine

No need. It's not execvp. It's been too long since I spent any quality time with Knut's logging code:

00058fba 01 09 0019 Entr 0002 forkBthCopyPagesMMX: pvDst=0x19120000 pvSrc=0x5cd273c0 cb=0x00001000
00058fba 01 09 0019 Mesg 0002 forkChlExceptionHandler: forkChlExceptionHandler: ExceptionNum?=0xc0000005 ExceptionAddress?=0x1f424f6e eip=0x1f424f6e ExceptionInfo?={0x000002,0x19120000,0x0051ec,0xfe2356a0} fHandlerFlags=0x0

I need to find the assembly code for forkBthCopyPagesMMX and decide how to figure out why 0x19120000 is not accessible.

comment:50 in reply to:  48 Changed 6 years ago by Silvan Scherrer

Replying to stevenhl:

Make sense. It's probably

ijs_exec_unix.c:83

status = execvp (argv[0], argv);

that is causing the trap.

Can we add some logging code to verify this?

Has anyone checked for an exceptq report? forkChlExceptionHandler notifies the fork parent passes the exception on to the next handler in the fork child.

yes it's not execv, as it never get there at all. as fork() already crashes and gives back pid -1.
But you already saw that as well according to your comments.

comment:51 Changed 6 years ago by Silvan Scherrer

and I removed socktidy, rebooted and tested again. I got the same crash. so socktidy seems not to interfere

comment:52 Changed 6 years ago by Silvan Scherrer

it smells like http://trac.netlabs.org/libc/ticket/372
I will use DosLoadModuleEx?() and see what we gain

Changed 6 years ago by Silvan Scherrer

Attachment: gsos2.exe added

with this gsos2.exe the crash is gone. please try

comment:53 Changed 6 years ago by Lewis Rosenthal

ijs output is terribly slow (minutes delay), and while I do now get a test page, it is very poor quality (snapshot coming soon). PS fails with "filter failed" and no output so far (log coming).

Perhaps I need to roll back previous script changes?

comment:54 Changed 6 years ago by Lewis Rosenthal

After applying the test libcx0.dll as discussed in today's meeting:

11-29-17  18:53         119,344      0   ___A_  libcx0.dll

Output is much faster, but test page result for ijs and PS is:

*** HEAP RESULT 1

Logs attached.

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-12-13 added

ijs error after test libcx0 (may be unrelated to fixes here)

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2017-12-13-PS added

PS error after test libcx0 (may be unrelated to fixes here)

comment:55 in reply to:  54 Changed 6 years ago by Silvan Scherrer

Replying to lewisr:

After applying the test libcx0.dll as discussed in today's meeting:

11-29-17  18:53         119,344      0   ___A_  libcx0.dll

Output is much faster, but test page result for ijs and PS is:

*** HEAP RESULT 1

thats because it's a debug libcx. ask @dmik to build a non debug for testing

Logs attached.

comment:56 Changed 6 years ago by Silvan Scherrer

any news on that?

comment:57 Changed 6 years ago by Lewis Rosenthal

On my initial test system, this is resolved now with:

cups-libs-2.1.3-10.oc00.i686
cups-filters-1.17.2-3.oc00.i686
cups-filesystem-2.1.3-10.oc00.noarch
cups-2.1.3-10.oc00.i686
cups-filters-libs-1.17.2-3.oc00.i686
cups-client-2.1.3-10.oc00.i686
ghostscript-fonts-6.0-1.oc00.noarch
ghostscript-9.18-7.oc00.i686
gutenprint-libs-5.2.12-1.oc00.i686
gutenprint-cups-5.2.12-1.oc00.i686
gutenprint-5.2.12-1.oc00.i686

but on a fresh install of ArcaOS with a yum update (latest cups), it still fails. Checking for differences, and will follow up.

comment:58 Changed 6 years ago by Lewis Rosenthal

I have been unable to successfully print from the alternate (fresh) partition.

I:

  1. unlocked \usr\* /r
  2. renamed \usr, \etc, \var\cache, \var\spool, \var\run, and \var\log\yum.log
  3. copied the above from the working partition
  4. renamed CONFIG.SYS
  5. copied CONFIG.SYS from the working partition, changing references to the correct drive letter
  6. rebooted the system

Still, printing fails with "filter failed."

Printing from hpijs works fine on the other partition, where we originally experimented.

I removed the HP 3800 from the CUPS setup, and recreated it, too. The failure persists. Until I can find what changed to make it work on the original partition, I can't say that this issue has been fixed.

comment:59 Changed 6 years ago by Silvan Scherrer

eventually a fresh debug log says more.

Changed 6 years ago by Lewis Rosenthal

Attachment: error_log-2018-02-05 added

Fresh error log, latest exp packages installed

comment:60 Changed 6 years ago by Silvan Scherrer

are you really sure everything is updated? as you seem to have the same issue again, which we fixed long ago
D [05/Feb/2018:12:36:08 +0500] [Job 3] Process is dying with \"Failed to execute ghostscript to determine number of input pages! rc=2 (No such file or directory)
D [05/Feb/2018:12:36:08 +0500] [Job 3] \", exit stat 6

comment:61 Changed 6 years ago by Silvan Scherrer

hmm as it seems no follow up, I tend to close it soon.

comment:62 Changed 6 years ago by Silvan Scherrer

Resolution: fixed
Status: newclosed

still no follow up, so please open a new ticket, with recent test info, if it's reproducible.

Note: See TracTickets for help on using tickets.