Opened 7 years ago

Closed 7 years ago

#161 closed defect (fixed)

Jeti/2 crashes on startup

Reported by: yoda
Owned by:
Priority: blocker
Milestone: GA4
Component: general
Version: 1.6.0 Build 25 GA3
Severity: medium
Keywords:
Cc:

Description (last modified by diver)

Jeti/2 still crashes on startup. Problem started with the GCC builds of Odin.
I have now tested every F***** setting of Jeti/2 to see if any of those can be the difference between our tests.

Well, at the bottom (of course), the setting 'XMPP' seems to be the one causing the problem. When I turn it off, Jeti/2 can start again.

This setting has always been on here and gives no problems with
Java 1.41, Java 1.42 or Java 1.5. It also worked with the VAC builds of Odin.

Attachments (2)

Crash192.zip (1.5 KB) - added by yoda 7 years ago.
Crash207.zip (83.5 KB) - added by yoda 7 years ago.

Change History (46)

Changed 7 years ago by yoda

comment:1 Changed 7 years ago by diver

  • Description modified (diff)
  • Severity changed from high to medium

comment:2 Changed 7 years ago by diver

  • Milestone changed from Enhanced to GA3

comment:3 follow-up: ↓ 5 Changed 7 years ago by dmik

I've checked it. Jeti/2 still fully works here with the latest b25 build. However, your latest log shows that for you it crashes in IPHLPAPI.DLL (GetAdaptersInfo), which suggests that you have some unusual network adapter configuration. I will check the code.

comment:4 follow-up: ↓ 6 Changed 7 years ago by dmik

Checked the code. Visually I don't see any problems. Can you quickly try the debug version? ftp://ftp.netlabs.org/pub/odin/test/od.zip. I need the odin*.log files of your failed attempts (and crash logs if any).

comment:5 in reply to: ↑ 3 Changed 7 years ago by yoda

Replying to dmik:

> I've checked it. Jeti/2 still fully works here with the latest b25 build.

Did you enable 'XMPP' as described above? It only crashes here when that is enabled.

> However, your latest log shows that for you it crashes in IPHLPAPI.DLL (GetAdaptersInfo), which suggests that you have some unusual network adapter configuration. I will check the code.

It crashes both on my laptops and my desktop PC. None of these uses the same drivers.
OpenFire also crashes in IPHLPAPI.DLL, again with a completely different NIC setup.

comment:6 in reply to: ↑ 4 Changed 7 years ago by yoda

Replying to dmik:

> Checked the code. Visually I don't see any problems. Can you quickly try the debug version? ftp://ftp.netlabs.org/pub/odin/test/od.zip. I need the odin*.log files of your failed attempts (and crash logs if any).

All logs, including the pdump, are available here:

ftp://ftp.warpspeed.dk/pdumps/LapCrash185.zip

comment:7 Changed 7 years ago by dmik

Thanks, investigating.

comment:8 Changed 7 years ago by dmik

I can confirm some strange behavior. It doesn't crash here but it gives 100% CPU load if I enable the XMPP plugin. This only happens with the release version of JDK. The debug version works well.

For some reason, the Win32 exception handler is not called in the release build (in the debug build it is). As a result, the unsatisfied exception gets raised again and again.

comment:9 Changed 7 years ago by yoda

Tried the debug version of Odin with OpenFire too:

ftp://ftp.warpspeed.dk/pdumps/Crash491.zip

comment:10 Changed 7 years ago by dmik

I found the reason why the exception handler did not work -- it's a small but nasty regression of http://svn.netlabs.org/odin32/changeset/21999. We didn't pick up the CONTEXT record changes made by the __except filter (if any). I fixed it now and it goes further but then it crashes at some debug assertion. I can't catch it though yet since the debugger is very unstable...
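
To illustrate why dropping the filter's CONTEXT changes leads to the "raised again and again" behavior from comment:8, here is a minimal sketch in portable C. It only simulates the dispatch logic: SimContext, skip_fault_filter, dispatch, count_faults and the 2-byte skip are all hypothetical stand-ins, not the Odin or Win32 code.

```c
/* Hypothetical stand-ins; these are NOT the real Win32 or Odin types. */
typedef struct { unsigned long eip; } SimContext;

enum { SIM_CONTINUE_SEARCH = 0, SIM_CONTINUE_EXECUTION = -1 };

typedef int (*SimFilter)(SimContext *ctx);

/* A filter that "repairs" the fault by moving past the faulting
 * instruction (the 2-byte length is an arbitrary assumption). */
static int skip_fault_filter(SimContext *ctx)
{
    ctx->eip += 2;
    return SIM_CONTINUE_EXECUTION;
}

/* Simulated dispatcher. The filter works on a copy of the CPU state.
 * If honor_changes is 0, the copy is thrown away, which models the
 * regression: execution resumes at the same faulting address. */
static void dispatch(SimContext *cpu, SimFilter filter, int honor_changes)
{
    SimContext copy = *cpu;
    int disposition = filter(&copy);

    if (disposition == SIM_CONTINUE_EXECUTION && honor_changes)
        *cpu = copy;   /* pick up the filter's CONTEXT changes */
}

/* Count how often the same fault is raised, giving up after limit. */
static int count_faults(int honor_changes, int limit)
{
    const unsigned long faulting_eip = 0x1000;
    SimContext cpu = { 0x1000 };
    int faults = 0;

    while (cpu.eip == faulting_eip && faults < limit) {
        faults++;
        dispatch(&cpu, skip_fault_filter, honor_changes);
    }
    return faults;
}
```

In this model, `count_faults(1, 1000)` returns 1 because the resumed state has moved past the fault, while `count_faults(0, 1000)` hits the limit of 1000, which would correspond to the 100% CPU load seen with the release build.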

comment:11 Changed 7 years ago by dmik

The Odin regression is fixed in r22004 there.

The debug assertion is fixed in r391.

Now I get a valid crash report from JVM that shows that JVM tries to execute code on the stack (!) and then unsurprisingly crashes due to the access violation exception. This is weird. Looks like some memory gets corrupted.

comment:12 Changed 7 years ago by dmik

Btw, can we get the source code of Jeti/2 and especially of the XMPP plugin? Who has the author's contact details?

comment:13 Changed 7 years ago by yoda

Jeti can be found at http://jeti.sf.net

comment:14 Changed 7 years ago by dmik

Just tested it once more with hotspot rebuilt with ODIN_FORCE_WIN32_TIB and it works... really strange.

comment:15 Changed 7 years ago by dmik

Yoda, please test Jeti with the debug hotspot build available here: ftp://ftp.netlabs.org/pub/odin/test/od.zip

comment:16 Changed 7 years ago by yoda

I downloaded that 7 days ago - and reported back.
It doesn't seem to be updated, so the reports are still the same ???

comment:17 Changed 7 years ago by dmik

Mmm, the correct link is ftp://ftp.netlabs.org/pub/odin/test/hd.zip. My fault. Please try ASAP.

comment:18 Changed 7 years ago by yoda

Added that new DLL
Same crash as before in IPHLPAPI

ftp://ftp.warpspeed.dk/pdumps/LapCrash188.zip

comment:19 Changed 7 years ago by yoda

Sorry, I installed the DLL in the wrong place before.
After correcting it, I get a different crash:

ftp://ftp.warpspeed.dk/pdumps/LapCrash192.zip

Maybe this DLL doesn't work correctly with my GA2 version?

comment:20 Changed 7 years ago by yoda

Tested the same with OpenFire - it looks like the same crash as Jeti/2:

ftp://ftp.warpspeed.dk/pdumps/Crash493.zip

comment:21 Changed 7 years ago by dmik

Okay, my latest guess is that the new crash happens because we:

  1. Don't maintain the win32 exception chain in non-ODIN_FORCE_WIN32_TIB mode
  2. Don't make sure that FS contains the right value around the place where hotspot detects the Thread pointer location (which it detects based on the contents of FS:[0] i.e. the pointer to the exception chain, see source:/trunk/openjdk/hotspot/src/os_cpu/windows_x86/vm/os_windows_x86.cpp).

1 is easy, 2 requires some thinking. The same Thread pointer access logic is also used in the code that generates x86 assembly on the fly, and so far I can't find MacroAssembler primitives that could load a desired value into FS...
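
For readers unfamiliar with the "exception chain" from point 1: on x86 Win32 it is a linked list of registration records whose head is kept at FS:[0]. The following is a minimal portable simulation of maintaining and walking such a chain. All names (SimRecord, chain_head, chain_push, chain_pop, chain_dispatch and the two handlers) are hypothetical, a plain global stands in for FS:[0], and NULL is used as the end marker purely for simplicity. It is not the Odin or hotspot code.

```c
#include <stddef.h>

/* Simulated registration record: a link to the next (outer) record
 * plus a handler. A stand-in only, not the real Win32 structure. */
typedef struct SimRecord {
    struct SimRecord *next;
    int (*handler)(int code);   /* returns 1 if it handles the code */
} SimRecord;

/* Stand-in for the chain head that x86 Win32 keeps at FS:[0]. */
static SimRecord *chain_head = NULL;

/* Entering a guarded region: link a new record in front. */
static void chain_push(SimRecord *rec, int (*handler)(int code))
{
    rec->handler = handler;
    rec->next = chain_head;
    chain_head = rec;
}

/* Leaving a guarded region: unlink the innermost record. */
static void chain_pop(void)
{
    if (chain_head != NULL)
        chain_head = chain_head->next;
}

/* Walk from the innermost record outwards. Returns the depth of the
 * first handler that accepts the code, or -1 if nobody handles it. */
static int chain_dispatch(int code)
{
    int depth = 0;
    for (SimRecord *r = chain_head; r != NULL; r = r->next, depth++) {
        if (r->handler(code))
            return depth;
    }
    return -1;
}

/* Two example handlers for experimenting with the chain. */
static int handles_only_42(int code) { return code == 42; }
static int handles_nothing(int code) { (void)code; return 0; }
```

If the chain is not kept up to date (a push or pop is skipped), the head no longer describes the handlers that are actually active, and any code that inspects it sees stale or missing records.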

comment:23 Changed 7 years ago by yoda

Testing on my Desktop PC.

Jeti never initializes - something seems to be looping forever.
I stopped it with ctrl-C

Crash207.zip attached to this ticket

Changed 7 years ago by yoda

comment:24 Changed 7 years ago by yoda

Tested with OpenFire too. It gives the same endless stack dumps in the Odin log.

comment:25 Changed 7 years ago by dmik

Hmm, looks like you also need the new JAVA.EXE version to properly test it. All the tests above are not valid w/o it. I've added JAVA.EXE as ftp://ftp.netlabs.org/pub/odin/test/jd2.zip. Please try again with it and JVM.DLL from hd2.zip.

Please also try another test with JVM.DLL from ftp://ftp.netlabs.org/pub/odin/test/hd3.zip and JAVA.EXE from ftp://ftp.netlabs.org/pub/odin/test/jd3.zip.

comment:26 Changed 7 years ago by yoda

Added J2 (hd2 I already had).
Crash is now back to the usual one in IPHLPAPI.
Crash209.zip attached.

comment:27 Changed 7 years ago by yoda

OK, too big to attach - it is here instead:
ftp://ftp.warpspeed.dk/pdumps/Crash209.zip

comment:28 Changed 7 years ago by yoda

jd3 + hd3 added (Notice: hd3 is older than hd2 !).
Same crash in IPHLPAPI - another in POPUPLOG.OS2
ftp://ftp.warpspeed.dk/pdumps/Crash210.zip

If you need PDUMPS, I can enable them.

comment:29 Changed 7 years ago by dmik

Ok, thx, I will analyze them and give you the new versions to test.

It looks like in your case it simply doesn't run to a point where it starts executing the stack here (hd2) since it crashes in IPHLPAPI.DLL. Doesn't explain anything to me so far...

comment:30 Changed 7 years ago by yoda

Tested j2 on OpenFire.
Crashes seem different - 498 gave guard page errors again - 499 ran a bit further.
ftp://ftp.warpspeed.dk/pdumps/Crash498.zip
ftp://ftp.warpspeed.dk/pdumps/Crash499.zip

These include PDUMPS

comment:31 Changed 7 years ago by yoda

Crash500.zip is OpenFire with j3 and hd3. Guard page warning again.
ftp://ftp.warpspeed.dk/pdumps/Crash500.zip

comment:32 Changed 7 years ago by dmik

Found the bug. Should be fixed in Odin in r22006:

iphlpapi: Fix possible crash in GetAdaptersInfo() and friends.

This could happen under some circumstances due to invalid casting
from int to char. Regression of the switch to GCC (which performs
conversions differently compared to VAC).

The code was wrong and relied on non-standard VAC behavior.

Here's the fixed (debug) version of IPHLPAPI.DLL for you to test: ftp://ftp.netlabs.org/pub/odin//test/iphlpapid.zip. Check and report back please. Test JD2/HD2 first.
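
The ticket does not show the offending iphlpapi code, so the following is only a sketch of the general class of bug, assuming the value involved was a byte (for example from a hardware address) that passed through plain char. In C, whether plain char is signed or unsigned is implementation-defined, so code that relies on one compiler's choice can misbehave with another. The function names are hypothetical.

```c
/* Fragile: the byte goes through plain char. Where plain char is
 * signed, a value such as 200 (0xC8) does not fit; the converted
 * result is implementation-defined (typically -56) and it is
 * sign-extended when promoted back to int. Where plain char is
 * unsigned, the same code returns 200. */
static int widen_via_plain_char(int byte_value)
{
    char c = (char)byte_value;
    return c;
}

/* Portable: unsigned char always holds 0..255, so the value
 * survives the round trip on every conforming compiler. */
static int widen_via_unsigned_char(int byte_value)
{
    unsigned char c = (unsigned char)byte_value;
    return c;
}
```

A negative result from the first function becomes dangerous as soon as it is used as an array index or a length, which is one way such a conversion could end in a crash rather than just wrong output.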

comment:33 Changed 7 years ago by dmik

Note that JD3/HD3 is the build with the old SEH scheme (ODIN_FORCE_WIN32_TIB) and hence the old guard page warnings (and possible crashes) are back -- switching to the new scheme was done in particular to fix these guard page warnings and related problems.

comment:34 Changed 7 years ago by yoda

Well, it doesn't seem to crash on startup now with JD2/HD2. It used to crash even
if the server was unavailable (which it still is).

OpenFire still doesn't work with any of these builds - crashes at startup:
ftp://ftp.warpspeed.dk/pdumps/Crash503.zip

comment:35 Changed 7 years ago by dmik

Do you mean that enabling the xmpp plugin in Jeti also works fine for you?

Regarding OpenFire, the question is in #133.

comment:36 Changed 7 years ago by yoda

Well, I found an old account on another server to test.
I don't see any crashes in IPHLPAPI any more - but it seems
to crash in several different ways now.

ftp://ftp.warpspeed.dk/pdumps/LapCrash204.zip
ftp://ftp.warpspeed.dk/pdumps/LapCrash205.zip
ftp://ftp.warpspeed.dk/pdumps/LapCrash206.zip
ftp://ftp.warpspeed.dk/pdumps/LapCrash207.zip
ftp://ftp.warpspeed.dk/pdumps/LapCrash208.zip

comment:37 Changed 7 years ago by dmik

And how does it behave with JD3/HD3?

And when do the above crashes happen? Enabling XMPP plugin again?

comment:38 Changed 7 years ago by yoda

It crashes when all is up, just before it is ready to use.
It doesn't matter if XMPP plugin is enabled or disabled.

Now, trying JD3/HD3, the crash is gone and Jeti/2 seems to work!

comment:39 Changed 7 years ago by dmik

  • Milestone changed from GA3 to GA4
  • Version changed from 1.6.0-b24 GA2 to 1.6.0 Build 25 GA3

These problems need to be analyzed further. Moving it to GA4.

comment:40 Changed 7 years ago by dmik

  • Priority changed from major to blocker

We must solve the SEH problem ASAP. Too many apps are crashing with the new scheme (see #171, #173, #174).

comment:41 Changed 7 years ago by dmik

Seems that there is one more victim of the new SEH scheme - AOO. Java works in AOO (Apache OpenOffice) if I use GA2 (old SEH scheme) and hangs (with a crash in ODINCRT.DLL) if I use GA3. Here's the POPUPLOG:

07-27-2012  14:41:30  SYS3170  PID 00e9  TID 0001  Slot 00c1
D:\APPS\OPENOFFICE.ORG.3\PROGRAM\SOFFICE.BIN
c0010002
118d00e2
P1=00000000  P2=XXXXXXXX  P3=XXXXXXXX  P4=XXXXXXXX  
EAX=00000000  EBX=168d6514  ECX=00000000  EDX=de4a9dc7
ESI=168d6514  EDI=168d6440  
DS=0053  DSACC=d0f3  DSLIM=5fffffff  
ES=0053  ESACC=d0f3  ESLIM=5fffffff  
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:1c3d07c9  CSACC=d0df  CSLIM=5fffffff
SS:ESP=0053:0012fe80  SSACC=d0f3  SSLIM=5fffffff
EBP=0012fe98  FLG=00010213

ODINCRT.DLL 0002:000000e2

Created #179 for it.

comment:42 Changed 7 years ago by dmik

The problem has been fixed in Odin, see r22009 there.

A really tiny bug, regression of switching to the new SEH scheme.

Jeti/2 doesn't crash here anymore when I turn on and off the XMPP plugin.

The test version of JDK.DLL is here: ftp://ftp.netlabs.org/pub/odin/test/j4.zip. Please test and report.

comment:43 Changed 7 years ago by yoda

Tested j4 - and it does indeed seem like Jeti/2 is finally stable now :-)

comment:44 Changed 7 years ago by dmik

  • Resolution set to fixed
  • Status changed from new to closed

Great. Closing.
