Opened 14 years ago

Closed 14 years ago

#23 closed defect (fixed)

odininst crashes reproducibly in kernel32.dll

Reported by: herwigb
Owned by:
Priority: Feedback Pending
Milestone: odinized java
Component: odin
Version:
Severity:
Keywords:
Cc:

Description

Release build 31.12.2010

01-04-2011  16:50:50  SYS3175  PID 0194  TID 0001  Slot 00e5
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
133cf0c3
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000000  EDX=ffffffff
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:133cf0c3  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a150  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a16c  FLG=00012246

KERNEL32.DLL 0001:0001f0c3

------------------------------------------------------------

01-04-2011  16:50:52  SYS3175  PID 0194  TID 0001  Slot 00e5
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
133d309b
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000030  EBX=00000001  ECX=1500644c  EDX=15006454
ESI=00000030  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:133d309b  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a134  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a184  FLG=00012206

KERNEL32.DLL 0001:0002309b


Change History (30)

comment:1 by dmik, 14 years ago

Does the previous build (2010-10-01) crash there too?

comment:2 by herwigb, 14 years ago

No, the previous build installs just fine.

comment:3 by dmik, 14 years ago

Are you sure? According to your logs, the previous one fails in a place that was not changed. And please give the exact information on what system it is.

comment:4 by dmik, 14 years ago

The previous one = the latest one.

comment:5 by dmik, 14 years ago

And, also, please try to install the debug WPI from 2010-12-30 (ignoring the crash), then start odininst.exe manually, collect odin*.log files it generates and attach them here.

comment:6 by Silvan Scherrer, 14 years ago

Milestone: odinized java

comment:7 by Silvan Scherrer, 14 years ago

Priority: critical → Feedback Pending

comment:8 by dmik, 14 years ago

Herwig, please try the same steps with the new test debug version you can get here ftp://ftp.dmik.org/tmp/herwig/od20110321.wpi.

comment:9 by herwigb, 14 years ago

tX: (1cdd95c) GDI32 exit
tX: (1cdd95c) kernel32 exit 4
tX: (1cdd95e) DebugInfo is not in our list: 0!!!
tX: (1cdd95e) DebugInfo is not in our list: 0!!!
tX: (1cdd95e) _smalloc 14 returned 5b4532c0
tX: (1cdd95e) InitializeDebugInfo DebugInfo: 5B4532C0
tX: (1cde3ba) KERNEL32: DestroySharedHeap 1
tX: (1cde3ba) allocated  5b4532e8 24
tX: (1cde3ba) allocated  5b4532c0 20 at D:\Coding\odin32\src\kernel32\critsection.cpp 95
tX: (1cde3ba) allocated  5b453118 24
tX: (1cde3ba) allocated  5b457020 65568 at D:\Coding\odin32\src\kernel32\heapshared.cpp 240
tX: (1cde3ba) allocated  5b453340 15552
tX: (1cde3ba) allocated  5b453170 320
tX: (1cde3ba) allocated  5b467050 4016
tX: (1cde3ba) KERNEL32: releaseShared 5b457000 69632

No other logs...

comment:10 by dmik, 14 years ago

Thanks, but that's not likely; it should create more logs. Search your whole drive for odin32*.log: odininst.exe changes the working directory, AFAIR, and will put the logs there. Also, make sure no DLLs from the release version of Odin are picked up, and please provide a fresh popuplog.os2 entry for the crash (of the debug version's executable).

comment:11 by herwigb, 14 years ago

It does not create more logs as soon as it crashes.

However I found a way to 100% reproducibly create the crash on a freshly booted machine here: As soon as I have Seamonkey 1.1.11 and OOo 3.2 running at the same time, OdinInst.EXE crashes.

Closing one of them is enough to make OdinInst.EXE run normally and produce a bigger log.

I also tried using Firefox 3.6 instead of Seamonkey, but that is not sufficient to make OdinInst.EXE crash...

comment:12 by dmik, 14 years ago

Can you supply a new POPUPLOG.OS2?

comment:13 by herwigb, 14 years ago

03-21-2011  17:14:30  SYS3175  PID 007c  TID 0001  Slot 00ae
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
1050b28e
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=ffffffff  EDX=10fc2830
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:1050b28e  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a28c  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a294  FLG=00012286

ODINCRTD.DLL 0001:0000b28e

------------------------------------------------------------

03-21-2011  17:14:33  SYS3175  PID 007c  TID 0001  Slot 00ae
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0ffebe17
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000030  EDX=10fe7cb0
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0ffebe17  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a298  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a2ac  FLG=00012282

KERNEL32.DLL 0001:0002be17

comment:14 by herwigb, 14 years ago

05-31-2011  17:16:31  SYS3175  PID 00dd  TID 0001  Slot 010c
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0ea1f3c9
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000000  EDX=ffffffff
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0ea1f3c9  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a150  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a16c  FLG=00012246

KERNEL32.DLL 0001:0001f3c9

------------------------------------------------------------

05-31-2011  17:16:34  SYS3175  PID 00dd  TID 0001  Slot 010c
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0ea2339b
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000030  EBX=00000001  ECX=0eb95840  EDX=0eb95848
ESI=00000030  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0ea2339b  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a12c  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a17c  FLG=00012206

KERNEL32.DLL 0001:0002339b

comment:15 by herwigb, 14 years ago

That was running odininst.exe (release 20110512).

comment:16 by herwigb, 14 years ago

Running odininst.exe (debug 20110512) with SET WIN32LOG_ENABLED=1

popuplog.os2 entries:

05-31-2011  17:21:58  SYS3175  PID 00eb  TID 0001  Slot 011a
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0e65b28e
P1=00000001  P2=00000000  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=ffffffff  EDX=0e8f2030
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0e65b28e  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a28c  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a294  FLG=00012286

ODINCRTD.DLL 0001:0000b28e

------------------------------------------------------------

05-31-2011  17:22:00  SYS3175  PID 00eb  TID 0001  Slot 011a
E:\ODIN\SYSTEM32\ODININST.EXE
c0000005
0e73c173
P1=00000001  P2=00000034  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=00000000  ECX=00000030  EDX=0e9170a4
ESI=00000000  EDI=00000000  
DS=0053  DSACC=f0f3  DSLIM=ffffffff  
ES=0053  ESACC=f0f3  ESLIM=ffffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:0e73c173  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0008a298  SSACC=f0f3  SSLIM=ffffffff
EBP=0008a2ac  FLG=00012282

KERNEL32.DLL 0001:0002c173

odin32_0.log:

tX: (21ccf22) GDI32 exit
tX: (21ccf22) kernel32 exit 4
tX: (21ccf26) DebugInfo is not in our list: 0!!!
tX: (21ccf26) DebugInfo is not in our list: 0!!!
tX: (21ccf26) _smalloc 14 returned 5b7532c0
tX: (21ccf26) InitializeDebugInfo DebugInfo: 5B7532C0
tX: (21cd965) KERNEL32: DestroySharedHeap 1
tX: (21cd965) allocated  5b7532e8 24
tX: (21cd965) allocated  5b7532c0 20 at D:\Coding\odin32\src\kernel32\critsection.cpp 95
tX: (21cd965) allocated  5b753118 24
tX: (21cd965) allocated  5b757020 65568 at D:\Coding\odin32\src\kernel32\heapshared.cpp 240
tX: (21cd965) allocated  5b753340 15552
tX: (21cd965) allocated  5b753170 320
tX: (21cd965) allocated  5b767050 4016
tX: (21cd965) KERNEL32: releaseShared 5b757000 69632

comment:17 by dmik, 14 years ago

Great, thanks!

comment:18 by dmik, 14 years ago

ODINCRTD.DLL at 0001:0000b28e tells us that it crashes inside strcpy(), most likely due to NULL arguments. This in turn may mean (given that you have OOo and Seamonkey loaded, which use a lot of shared memory) that it fails to allocate a shared memory block but doesn't check for NULL and just goes on until the crash.
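For readers following the popuplog.os2 entries: the trailing "MODULE 0001:offset" line gives an offset relative to the module's first object, so subtracting it from the flat EIP yields where that object was loaded. The sketch below only illustrates that arithmetic; the helper name `object_base` is hypothetical, and how the offset was then matched to strcpy() (presumably via a map or symbol file) is not stated in this ticket.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: load address of the faulting object, computed as
   the flat EIP minus the object-relative offset from the
   "MODULE.DLL 0001:offset" line of a popuplog.os2 entry. */
static uint32_t object_base(uint32_t eip, uint32_t offset)
{
    assert(eip >= offset);  /* the offset cannot exceed the flat address */
    return eip - offset;
}
```

With the values from this ticket, `object_base(0x1050b28e, 0x0000b28e)` gives `0x10500000` for the 03-21-2011 run and `object_base(0x0e65b28e, 0x0000b28e)` gives `0x0e650000` for the 05-31-2011 run: the load address differs between runs, but the offset inside ODINCRTD.DLL is the same, which is what makes it usable for identifying the faulting function.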

As I couldn't reproduce the problem here locally so far, I will ask you to take the process dump of the crashing ODININST.EXE application using PROCDUMP.EXE to get the call stack and other useful information. Just do the following before starting odininst.exe:

pdumpusr SUMM,MVDM,SEM,SYSFS,SYSVM
procdump on /L:D:\

where D:\ is the directory in which to save the dump files (named PDUMP.nnn).

Now make odininst.exe crash, this will create one or more PDUMP files. Then please ZIP them up and upload to my FTP. Try it with the release version of Odin (or with the debug one, whichever it is easier to crash for you).

Another question: is it only odininst.exe that crashes, or will any other application using this Odin version crash too?

comment:19 by herwigb, 14 years ago

Any Odin app will crash in this situation; I ran several tests to make sure. I created the pdump files. I don't have your FTP at hand, so I put them here: http://msplins06.bon.at/%7Eadmin139/files/pdump.zip

comment:20 by dmik, 14 years ago

Thanks for the dump. I found where it crashes: it's inittermKernel32(). At the very beginning OSLibGetDllName() is called to get the full path to kernel32.dll from its module handle. However, in your case this call fails and returns NULL instead of the full path which is then passed to strcpy() which crashes.
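The failure mode described above (a lookup returns NULL and the result goes straight into strcpy()) can be avoided with a guarded copy. The following is a minimal sketch, not the actual Odin fix: `copy_module_path` is a hypothetical helper, and the buffer handling is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Odin code: copy a module path that may be NULL
   (as when the name lookup fails) into a fixed-size buffer.
   Returns 0 on success, -1 if the source is NULL or does not fit. */
static int copy_module_path(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && dst_size > 0);

    if (src == NULL) {          /* lookup failed: report it, don't crash */
        dst[0] = '\0';
        return -1;
    }

    size_t len = strlen(src);
    if (len >= dst_size) {      /* no room for the string plus terminator */
        dst[0] = '\0';
        return -1;
    }

    memcpy(dst, src, len + 1);  /* copies the terminating NUL as well */
    return 0;
}
```

A caller would check the return value and abort initialization on -1 instead of continuing with an empty path.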

I have no idea why OSLibGetDllName() (actually, just DosQueryModuleName()) fails. I added some more debugging so we should see the error code. Please download the new debug build of Odin from here and try again. It should not crash in that strcpy() any longer but still, try to enable process dumping (it may crash somewhere else). I will need both PDUMP.xxx (if any) and odin32_x.log.

PS. As far as I see, you use an old non-SMP OS/2 kernel 14.104a, is there a reason? Did you try the 14.105 kernel, UNI or SMP? (These kernels come with the latest eCS 2.x releases).

comment:21 by herwigb, 14 years ago

Will try with the test build ASAP. Regarding the kernel: that's a very old installation on an old machine, that's why. I can try any kernel you want me to.

comment:22 by dmik, 14 years ago

If possible, yes, it would be interesting to know if the problem occurs on 14.105 SMP or UNI as well.

comment:23 by herwigb, 14 years ago

Redid the tests with Odin from here and reuploaded the pdumps - same link as last time, file is updated however. Will test with UNI and SMP kernels tomorrow.

comment:24 by dmik, 14 years ago

Hmm, where did you upload your pdumps? Certainly not to ftp://ftp.dmik.org/incoming, there is nothing there. Please upload again.

comment:26 by dmik, 14 years ago

Thanks, but still, that's not enough. I need odin32*.log as well but there are only dumps. Please supply logs.

Re the new dumps, it crashes in some other place, this time in _HMHandleGetFree(void), this line:

    if (INVALID_HANDLE_VALUE == TabWin32Handles[ulLoop].hmHandleData.hHMHandle)

because TabWin32Handles is NULL. TabWin32Handles is initialized by HMInitialize() which is called from inittermKernel32(). The poor code doesn't analyze the error from HMInitialize() and seems to go on with initialization even when the latter returns a failure.

The strange thing is that the HMInitialize() call comes after the OSLibGetDllName() call where it failed before. This may mean that failures are quite random and given that there are lots of places in Odin where errors are not checked, it may fail everywhere...
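The second crash follows the same pattern: a table that was never allocated is indexed as if it were. A minimal sketch of the defensive version is shown below. All names here (`handle_table`, `handle_table_init`, `handle_get_free`, `MAX_HANDLES`, `FREE_SLOT`) are hypothetical stand-ins for illustration and do not reproduce the real HMInitialize() or _HMHandleGetFree() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_HANDLES 8        /* illustrative size only */
#define FREE_SLOT   (-1L)    /* stands in for INVALID_HANDLE_VALUE */

static long *handle_table = NULL;   /* stands in for TabWin32Handles */

/* Allocate the table. Returns 0 on success, -1 if allocation fails,
   so the caller can stop initialization instead of going on. */
static int handle_table_init(void)
{
    assert(handle_table == NULL);   /* initialization must run only once */

    handle_table = malloc(MAX_HANDLES * sizeof *handle_table);
    if (handle_table == NULL)
        return -1;

    for (size_t i = 0; i < MAX_HANDLES; i++)
        handle_table[i] = FREE_SLOT;
    return 0;
}

/* Return the index of the first free slot, or -1 if the table was never
   initialized or has no free slot left. */
static int handle_get_free(void)
{
    if (handle_table == NULL)       /* the check the crashing code lacked */
        return -1;

    for (int i = 0; i < MAX_HANDLES; i++)
        if (handle_table[i] == FREE_SLOT)
            return i;
    return -1;
}
```

The guard in `handle_get_free` only limits the damage; the more important part is that the initializer's return value is actually examined by its caller.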

comment:27 by dmik, 14 years ago

Actually, I simulated a failure in OSLibGetDllName() to make inittermKernel32() return a failure early, and I got the very same crash as the last one from Herwig. It looks like the scenario of a failed KERNEL32.DLL initialization is completely unexpected and unsupported in Odin. Really sad.

Herwig, I still need the odin32.log file to see why DosQueryModuleName() fails.

comment:28 by dmik, 14 years ago

I cleaned up the initialization code so that it now exits gracefully instead of crashing if it detects a runtime problem like above.

Note that OS/2 itself doesn't support any mechanism to report failures during DLL initialization other than returning a non-zero exit code (which of course will usually go unnoticed by the end user).

In order to solve this problem, in r21651 I added some code to show a message box if KERNEL32.DLL fails to initialize itself. The message box suggests closing other applications and trying again (as such a runtime failure is most likely due to low memory conditions). Since the initialization sequence starts with KERNEL32 anyway (no matter which imported Odin DLL gets loaded by the OS/2 kernel first), this should be enough to catch most fatal runtime errors like the ones above.
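The general shape of such an initialization sequence (run the steps in order, stop at the first failure, tell the user, return non-zero) can be sketched as follows. This is an illustration under stated assumptions, not the code from r21651: every name below is hypothetical, and the `report` callback merely stands in for whatever displays the message box.

```c
#include <assert.h>
#include <stddef.h>

typedef int  (*init_step_fn)(void);           /* returns 0 on success */
typedef void (*report_fn)(const char *step);  /* e.g. shows a message box */

struct init_step {
    const char  *name;   /* used only for the error report */
    init_step_fn run;
};

/* Run the steps in order. On the first failure, report it and return a
   non-zero value (the 1-based index of the failed step); later steps are
   not executed. Returns 0 if every step succeeded. */
static int run_init_steps(const struct init_step *steps, size_t count,
                          report_fn report)
{
    assert(steps != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (steps[i].run() != 0) {
            if (report != NULL)
                report(steps[i].name);
            return (int)(i + 1);
        }
    }
    return 0;
}

/* Demo steps and reporter, purely illustrative. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return 1; }

static const char *last_reported = NULL;
static void remember_failure(const char *step) { last_reported = step; }
```

For example, a three-entry table whose second step is `step_fail` makes `run_init_steps` return 2, passes that step's name to the reporter, and never runs the third step.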

The new Odin build is here ftp://ftp.dmik.org/tmp/j/ (o is release, od is debug). Please try.

comment:29 by dmik, 14 years ago

Thanks to a series of tests performed by Herwig, it is now clear that it does indeed fail in various places due to low memory conditions. Last time it even failed in CopyBitmap() when doing malloc() (which returned NULL). Why exactly all these cases fail to allocate memory needs to be investigated one day (this is not necessarily the shared memory issue), but that's a bigger task and not for now. I created #35 for that.

For now an error message asking the user to close the applications and retry is enough.

I uploaded new test odin builds to the same location and asked Herwig to test both (especially, the release build which should work in normal conditions and show a message box in low mem conditions too).

comment:30 by dmik, 14 years ago

Resolution: fixed
Status: new → closed

Herwig reports that all is fine.
