Opened 13 years ago

Closed 13 years ago

#23 closed defect (fixed)

odininst crashes reproducibly in kernel32.dll

Reported by: herwigb Owned by:
Priority: Feedback Pending Milestone: odinized java
Component: odin Version:
Severity: Keywords:
Cc:

Description

Release build 31.12.2010

01-04-2011  16:50:50  SYS3175  PID 0194  TID 0001  Slot 00e5
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
133cf0c3
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000000  EDX=ffffffff
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:133cf0c3  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a150  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a16c  FLG=00012246

KERNEL32.DLL 0001:0001f0c3

------------------------------------------------------------

01-04-2011  16:50:52  SYS3175  PID 0194  TID 0001  Slot 00e5
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
133d309b
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000030  EBX=00000001  ECX=1500644c  EDX=15006454
ESI=00000030  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:133d309b  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a134  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a184  FLG=00012206

KERNEL32.DLL 0001:0002309b


Change History (30)

comment:1 Changed 13 years ago by dmik

Does the previous build (2010-10-01) crash there too?

comment:2 Changed 13 years ago by herwigb

No, the previous build installs just fine.

comment:3 Changed 13 years ago by dmik

Are you sure? According to your logs, the previous one fails in a place that was not changed. And please give the exact information on what system it is.

comment:4 Changed 13 years ago by dmik

The previous one = the latest one.

comment:5 Changed 13 years ago by dmik

And, also, please try to install the debug WPI from 2010-12-30 (ignoring the crash), then start odininst.exe manually, collect odin*.log files it generates and attach them here.

comment:6 Changed 13 years ago by Silvan Scherrer

Milestone: odinized java

comment:7 Changed 13 years ago by Silvan Scherrer

Priority: changed from critical to Feedback Pending

comment:8 Changed 13 years ago by dmik

Herwig, please try the same steps with the new test debug version you can get here ftp://ftp.dmik.org/tmp/herwig/od20110321.wpi.

comment:9 Changed 13 years ago by herwigb

tX: (1cdd95c) GDI32 exit
tX: (1cdd95c) kernel32 exit 4
tX: (1cdd95e) DebugInfo is not in our list: 0!!!
tX: (1cdd95e) DebugInfo is not in our list: 0!!!
tX: (1cdd95e) _smalloc 14 returned 5b4532c0
tX: (1cdd95e) InitializeDebugInfo DebugInfo: 5B4532C0
tX: (1cde3ba) KERNEL32: DestroySharedHeap 1
tX: (1cde3ba) allocated  5b4532e8 24
tX: (1cde3ba) allocated  5b4532c0 20 at D:\Coding\odin32\src\kernel32\critsection.cpp 95
tX: (1cde3ba) allocated  5b453118 24
tX: (1cde3ba) allocated  5b457020 65568 at D:\Coding\odin32\src\kernel32\heapshared.cpp 240
tX: (1cde3ba) allocated  5b453340 15552
tX: (1cde3ba) allocated  5b453170 320
tX: (1cde3ba) allocated  5b467050 4016
tX: (1cde3ba) KERNEL32: releaseShared 5b457000 69632

No other logs...

comment:10 Changed 13 years ago by dmik

Thanks, but that's not likely, it should create more logs. Search through all of your drive for odin32*.log, odininst.exe changes the working directory AFAIR and will put the logs there. Also, make sure no DLLs from the release version of Odin are picked up. Also, please provide a fresh popuplog.os2 entry for the crash (of the debug version's executable).

comment:11 Changed 13 years ago by herwigb

It does not create more logs as soon as it crashes.

However, I found a way to reproduce the crash 100% of the time on a freshly booted machine here: as soon as I have Seamonkey 1.1.11 and OOo 3.2 running at the same time, OdinInst.EXE crashes.

Closing one of them is enough to make OdinInst.EXE run normally and produce a bigger log.

I also tried using Firefox 3.6 instead of Seamonkey, but that is not sufficient to make OdinInst.EXE crash...

comment:12 Changed 13 years ago by dmik

Can you supply a new POPUPLOG.OS2?

comment:13 Changed 13 years ago by herwigb

03-21-2011  17:14:30  SYS3175  PID 007c  TID 0001  Slot 00ae
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
1050b28e
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=ffffffff  EDX=10fc2830
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:1050b28e  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a28c  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a294  FLG=00012286

ODINCRTD.DLL 0001:0000b28e

------------------------------------------------------------

03-21-2011  17:14:33  SYS3175  PID 007c  TID 0001  Slot 00ae
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0ffebe17
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000030  EDX=10fe7cb0
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0ffebe17  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a298  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a2ac  FLG=00012282

KERNEL32.DLL 0001:0002be17

comment:14 Changed 13 years ago by herwigb

05-31-2011  17:16:31  SYS3175  PID 00dd  TID 0001  Slot 010c
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0ea1f3c9
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000000  EDX=ffffffff
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0ea1f3c9  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a150  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a16c  FLG=00012246

KERNEL32.DLL 0001:0001f3c9

------------------------------------------------------------

05-31-2011  17:16:34  SYS3175  PID 00dd  TID 0001  Slot 010c
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0ea2339b
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000030  EBX=00000001  ECX=0eb95840  EDX=0eb95848
ESI=00000030  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0ea2339b  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a12c  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a17c  FLG=00012206

KERNEL32.DLL 0001:0002339b

comment:15 Changed 13 years ago by herwigb

That was running odininst.exe (release 20110512).

comment:16 Changed 13 years ago by herwigb

Running odininst.exe (debug 20110512) with SET WIN32LOG_ENABLED=1

popuplog.os2 entries:

05-31-2011  17:21:58  SYS3175  PID 00eb  TID 0001  Slot 011a
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0e65b28e
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=ffffffff  EDX=0e8f2030
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0e65b28e  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a28c  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a294  FLG=00012286

ODINCRTD.DLL 0001:0000b28e

------------------------------------------------------------

05-31-2011  17:22:00  SYS3175  PID 00eb  TID 0001  Slot 011a
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0e73c173
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000030  EDX=0e9170a4
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0e73c173  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a298  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a2ac  FLG=00012282

KERNEL32.DLL 0001:0002c173

odin32_0.log:

tX: (21ccf22) GDI32 exit
tX: (21ccf22) kernel32 exit 4
tX: (21ccf26) DebugInfo is not in our list: 0!!!
tX: (21ccf26) DebugInfo is not in our list: 0!!!
tX: (21ccf26) _smalloc 14 returned 5b7532c0
tX: (21ccf26) InitializeDebugInfo DebugInfo: 5B7532C0
tX: (21cd965) KERNEL32: DestroySharedHeap 1
tX: (21cd965) allocated  5b7532e8 24
tX: (21cd965) allocated  5b7532c0 20 at D:\Coding\odin32\src\kernel32\critsection.cpp 95
tX: (21cd965) allocated  5b753118 24
tX: (21cd965) allocated  5b757020 65568 at D:\Coding\odin32\src\kernel32\heapshared.cpp 240
tX: (21cd965) allocated  5b753340 15552
tX: (21cd965) allocated  5b753170 320
tX: (21cd965) allocated  5b767050 4016
tX: (21cd965) KERNEL32: releaseShared 5b757000 69632

comment:17 Changed 13 years ago by dmik

Great, thanks!

comment:18 Changed 13 years ago by dmik

ODINCRTD.DLL at 0001:0000b28e tells us that it crashes inside strcpy(), most likely due to NULL arguments. Which in turn may mean (taking into account that you have OOo and Seamonkey loaded that use a lot of shared memory) that it fails to allocate a shared memory block but doesn't check for NULL and just goes on until the crash.
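The module line of a popuplog.os2 entry gives an object number and an offset within that object, so the flat address at which the object was loaded is simply EIP minus that offset. A minimal sketch of the arithmetic, using values from the entries above; object_base() is an illustrative name, not an OS/2 or Odin API:

```c
#include <stdint.h>

/* Flat load address of the object containing the fault: the trap record's
 * EIP minus the object-relative offset printed on the "MODULE 000n:offset"
 * line. Illustrative helper only. */
uint32_t object_base(uint32_t eip, uint32_t offset)
{
    return eip - offset;
}
```

For the two debug-build crashes, EIP 1050b28e and EIP 0e65b28e both carry the offset 0000b28e, giving load addresses 10500000 and 0e650000. The load address differs between runs, but the offset is the same, which is what allows the fault to be matched to one specific function.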

As I couldn't reproduce the problem here locally so far, I will ask you to take the process dump of the crashing ODININST.EXE application using PROCDUMP.EXE to get the call stack and other useful information. Just do the following before starting odininst.exe:

pdumpusr SUMM,MVDM,SEM,SYSFS,SYSVM
procdump on /L:D:\

where D:\ is a directory where to save dump files (named PDUMP.nnn).

Now make odininst.exe crash, this will create one or more PDUMP files. Then please ZIP them up and upload to my FTP. Try it with the release version of Odin (or with the debug one, whichever it is easier to crash for you).

Another question: is it only odininst.exe that crashes, or does any other application using this Odin version crash too?

comment:19 Changed 13 years ago by herwigb

Any Odin app will crash in this situation; I ran several tests to make sure. I created the pdump files. I don't have your FTP at hand, so I put them here: http://msplins06.bon.at/%7Eadmin139/files/pdump.zip

comment:20 Changed 13 years ago by dmik

Thanks for the dump. I found where it crashes: it's inittermKernel32(). At the very beginning OSLibGetDllName() is called to get the full path to kernel32.dll from its module handle. However, in your case this call fails and returns NULL instead of the full path which is then passed to strcpy() which crashes.
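A minimal sketch of the kind of check that is missing here, in plain C. The helper copy_module_path() is hypothetical and is not the actual Odin code; it only shows how a NULL result from a path lookup can be turned into an error return instead of being handed to strcpy():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Odin: copies `src` into `dst` only when
 * `src` is non-NULL and fits, and reports failure otherwise so the caller
 * can abort initialization instead of crashing inside strcpy(). */
int copy_module_path(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && cap > 0);   /* caller bug, not a runtime condition */

    if (src == NULL) {                /* the lookup failed: nothing to copy */
        dst[0] = '\0';
        return -1;
    }

    size_t len = strlen(src);
    if (len >= cap) {                 /* would overflow the destination */
        dst[0] = '\0';
        return -1;
    }

    memcpy(dst, src, len + 1);        /* len + 1 copies the terminating NUL */
    return 0;
}
```

The caller would treat a non-zero return as a failed initialization and stop there rather than continue with an empty path.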

I have no idea why OSLibGetDllName() (actually, just DosQueryModuleName()) fails. I added some more debugging so we should see the error code. Please download the new debug build of Odin from here and try again. It should not crash in that strcpy() any longer but still, try to enable process dumping (it may crash somewhere else). I will need both PDUMP.xxx (if any) and odin32_x.log.

PS. As far as I see, you use an old non-SMP OS/2 kernel 14.104a, is there a reason? Did you try the 14.105 kernel, UNI or SMP? (These kernels come with the latest eCS 2.x releases).

comment:21 Changed 13 years ago by herwigb

Will try with the test build ASAP. Regarding the kernel: that's a very old installation on an old machine, that's why. I can try any kernel you want me to.

comment:22 Changed 13 years ago by dmik

If possible, yes, it would be interesting to know if the problem occurs on 14.105 SMP or UNI as well.

comment:23 Changed 13 years ago by herwigb

Redid the tests with Odin from here and reuploaded the pdumps - same link as last time, file is updated however. Will test with UNI and SMP kernels tomorrow.

comment:24 Changed 13 years ago by dmik

Hmm, where did you upload your pdumps? Certainly not to ftp://ftp.dmik.org/incoming, there is nothing there. Please upload again.

comment:25 Changed 13 years ago by Silvan Scherrer

comment:26 Changed 13 years ago by dmik

Thanks, but still, that's not enough. I need odin32*.log as well but there are only dumps. Please supply logs.

Re the new dumps, it crashes in some other place, this time in _HMHandleGetFree(void), this line:

    if (INVALID_HANDLE_VALUE == TabWin32Handles[ulLoop].hmHandleData.hHMHandle)

because TabWin32Handles is NULL. TabWin32Handles is initialized by HMInitialize() which is called from inittermKernel32(). The poor code doesn't analyze the error from HMInitialize() and seems to go on with initialization even when the latter returns a failure.
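The failure mode can be sketched in a few lines of C. All names below (Slot, g_table, table_init, table_get_free, SLOT_FREE) are stand-ins invented for illustration and do not reproduce the real handle manager; the point is only that the initializer reports allocation failure and the lookup refuses to index a table that was never set up:

```c
#include <stddef.h>
#include <stdlib.h>

#define SLOT_FREE (-1L)          /* stand-in for INVALID_HANDLE_VALUE */

typedef struct { long handle; } Slot;

static Slot  *g_table = NULL;    /* stand-in for TabWin32Handles */
static size_t g_count = 0;

/* Allocates the table. Returns 0 on success, -1 on failure; the caller is
 * expected to check this and abort initialization on -1. */
int table_init(size_t count)
{
    if (count == 0)
        return -1;

    Slot *t = malloc(count * sizeof *t);
    if (t == NULL)
        return -1;               /* low memory: report it, do not go on */

    for (size_t i = 0; i < count; i++)
        t[i].handle = SLOT_FREE;

    free(g_table);               /* drop any earlier table */
    g_table = t;
    g_count = count;
    return 0;
}

/* Returns the index of the first free slot, or -1 if there is none or the
 * table was never initialized (the case that crashed above). */
long table_get_free(void)
{
    if (g_table == NULL)
        return -1;

    for (size_t i = 0; i < g_count; i++)
        if (g_table[i].handle == SLOT_FREE)
            return (long)i;

    return -1;
}
```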

The strange thing is that the HMInitialize() call comes after the OSLibGetDllName() call where it failed before. This may mean that failures are quite random and given that there are lots of places in Odin where errors are not checked, it may fail everywhere...

comment:27 Changed 13 years ago by dmik

Actually, I simulated a failure in OSLibGetDllName() to cause inittermKernel32() to return a failure early and I got the very same crash as the last one from Herwig. It looks like a failed KERNEL32.DLL initialization is simply not expected and not supported anywhere in Odin. Really sad.

Herwig, I still need the odin32.log file to see why DosQueryModuleName() fails.

comment:28 Changed 13 years ago by dmik

I cleaned up the initialization code so that it now exits gracefully instead of crashing if it detects a runtime problem like above.

Note that OS/2 itself doesn't provide any mechanism to report failures during DLL initialization other than returning a non-zero exit code (which of course will usually go unnoticed by the end user).

In order to solve this problem, in r21651 I added some code to show a message box if KERNEL32.DLL fails to initialize itself. The message box suggests closing other applications and trying again (as such a runtime failure is most likely due to low memory conditions). Since the initialization sequence starts with KERNEL32 anyway (no matter which imported Odin DLL gets loaded by the OS/2 kernel first), this should be enough to catch most fatal runtime errors like the ones above.
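The general shape of such a fail-fast initialization can be sketched as follows. This is not the code from r21651; run_init_steps(), the InitStep and ReportFn types, and the stub steps are all hypothetical, and the report callback merely stands in for whatever shows the message box:

```c
#include <stddef.h>

typedef int  (*InitStep)(void);          /* returns 0 on success */
typedef void (*ReportFn)(const char *);  /* e.g. would show a message box */

/* Runs the steps in order and stops at the first failure. Returns 0 if all
 * steps succeeded, otherwise the 1-based number of the step that failed, so
 * the caller can return a non-zero code to the loader. */
int run_init_steps(const InitStep *steps, size_t n, ReportFn report)
{
    for (size_t i = 0; i < n; i++) {
        if (steps[i]() != 0) {
            if (report != NULL)
                report("Initialization failed. "
                       "Close other applications and try again.");
            return (int)i + 1;
        }
    }
    return 0;
}

/* Stub steps and a counting reporter, for illustration only. */
static int g_reports = 0;
void count_report(const char *msg) { (void)msg; g_reports++; }
int  report_count(void) { return g_reports; }
int  step_ok(void)   { return 0; }
int  step_fail(void) { return -1; }
```

Stopping at the first failed step matters because later steps may depend on state that the failed one was supposed to set up.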

The new Odin build is here ftp://ftp.dmik.org/tmp/j/ (o is release, od is debug). Please try.

comment:29 Changed 13 years ago by dmik

Thanks to a series of tests performed by Herwig, it is now clear that it does indeed fail in various places due to low memory conditions. Last time it even failed in CopyBitmap() when malloc() returned NULL. Why exactly all these allocations fail needs to be investigated one day (it is not necessarily the shared memory issue), but that is a bigger task and not for now. I created #35 for that.

For now an error message asking the user to close the applications and retry is enough.

I uploaded new test odin builds to the same location and asked Herwig to test both (especially, the release build which should work in normal conditions and show a message box in low mem conditions too).

comment:30 Changed 13 years ago by dmik

Resolution: fixed
Status: changed from new to closed

Herwig reports that all is fine.
