Opened 11 years ago

Last modified 11 years ago

#106 new defect

Crash in FILT.DLL when using Flash in Firefox 17

Reported by: dmik
Owned by:
Priority: major
Milestone: general enhancement
Component: odin
Version: 0.8.9
Severity: medium
Keywords:
Cc: steve53@…

Description

This happens only when starting Firefox 17 for the first time after a reboot and attempting to use Flash. Flash pulls the Odin runtime first into the parent process (firefox.exe) and then into the plugin process (plugin-container.exe). Odin initialization goes well for firefox.exe and fails for plugin-container.exe. As a result, NSPR reports an error loading the Flash DLL (error 87). For some reason the Firefox authors don't actually expect this to happen, so the Flash content just fails to proceed and no other container process is started.

From what I see in the Odin logs, the WINMM.DLL init routine fails because of a crash in FILT.DLL (this is an MMPM library). The nature of the crash is not yet known.

It's strange that everything works well when starting Firefox for the second time (and all subsequent times).

See https://github.com/bitwiseworks/mozilla-os2/issues/20#issuecomment-32423627 for some more info.

Change History (14)

comment:1 by dmik, 11 years ago

Got some more details. The crash in FILT.DLL happens when the WINMM.DLL init routine attempts to load MDM.DLL in order to initialize MMPM. So far, it looks like a timing issue to me. For some reason, when MDM.DLL is loaded by the child of the first process just a few seconds later, it crashes. A subsequent tight startup works just fine, probably because some dangerous initialization has already completed during the first load, or something like that.

I will try to catch the crash when loading MDM.DLL and retry the load a few more times. Let's see if it helps.

comment:2 by dmik, 11 years ago

Summary: Crash in FILT.DLL when starting Firefox 17 → Crash in FILT.DLL when using Flash in Firefox 17

comment:3 by dmik, 11 years ago

It's not a good idea to play with exception handlers from within the DLL init/term routine. It is too complex and error-prone. Instead, we will fix that on the Firefox side. See https://github.com/bitwiseworks/mozilla-os2/issues/43.

While this may really be an issue with the combination of MMPM extensions used in eCS 2.2 Beta II (not proven yet), the best fix would be to move away from MMPM to the modern uniaud-based audio library completely. This will be done within #85.

We will leave this open until then, also in case we unexpectedly resolve this issue on the eCS side.

comment:4 by dmik, 11 years ago

Herwig reports that he gets exactly the same problem, and his system is eCS 2.1 (with updated ACPI). So the FILT.DLL problem may actually be very old.

comment:5 by Steven Levine, 11 years ago

Cc: steve53@… added

I wonder if this is similar to

http://bugs.ecomstation.nl/view.php?id=2874

and if a similar fix is a reasonable workaround. What triggered the exception was that snd.dll loads and unloads several times during boot-up and neglects to clean up semaphore handles stored in shared memory. What mmfix does is load snd.dll and sleep for 60 seconds. By then snd.dll has been loaded by some other process and will stay in memory until the system is rebooted. Perhaps preloading filt.dll, or one of the DLLs that cause it to be loaded, will work around the exception.
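
The preload-and-hold idea described above could be sketched roughly as follows. This is only an illustration of the pattern, not the actual mmfix code: the loader and sleeper are injected as callbacks so the sketch stays portable, and on OS/2 they would presumably wrap DosLoadModule and DosSleep. All names here (preload_and_hold, module_loader, hold_sleeper, the demo_* stand-ins) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types. On OS/2 these would presumably wrap
 * DosLoadModule and DosSleep; they are injected to keep the sketch portable. */
typedef int (*module_loader)(const char *module_name); /* 0 on success */
typedef void (*hold_sleeper)(unsigned long msec);

/* Load every module in the list, then keep the process (and therefore the
 * loaded modules) alive for hold_msec so that another process has time to
 * load them too. Returns the number of modules that loaded successfully. */
static int preload_and_hold(const char *const *modules, size_t count,
                            unsigned long hold_msec,
                            module_loader load, hold_sleeper do_sleep)
{
    int loaded = 0;
    assert(load != NULL && do_sleep != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (load(modules[i]) == 0)
            ++loaded;
    }
    if (loaded > 0)
        do_sleep(hold_msec); /* nothing to hold if every load failed */
    return loaded;
}

/* Stand-ins for demonstration only: the loader accepts any non-empty name,
 * the sleeper just records how long it was asked to wait. */
static unsigned long demo_slept_msec = 0;

static int demo_load(const char *module_name)
{
    return (module_name != NULL && module_name[0] != '\0') ? 0 : 87;
}

static void demo_sleep(unsigned long msec)
{
    demo_slept_msec += msec;
}
```

Calling preload_and_hold with a list such as { "FILT" }, a hold time of 60000 ms and the two stand-ins mirrors the "load, then sleep for 60 seconds" behaviour described for mmfix. Whether holding FILT.DLL this way actually avoids the crash is untested.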

comment:6 by dmik, 11 years ago

Yes, why not give it a try. I was a bit surprised by my workaround but not too much — I've already seen something like that somewhere in the OS/2 code (don't remember what it was though).

comment:7 by dmik, 11 years ago

As I pointed out in https://github.com/bitwiseworks/mozilla-os2/issues/43#issuecomment-33723670, we have another crash that seems to be somehow related to the crash in FILT.DLL.

I guess I know what is going on here. NPFLOS2.DLL uses the following (default) DLL module flags: INITGLOBAL, DATA SINGLE SHARED. As a result, the DLL init routine is called only once and there is a single copy of the data segment shared among all processes.

In turn, the wrapper adds its own level of indirection to the plugin DLL entry points: the exported ones (including NP_GetEntryPoints) do not do any work on their own, they just forward calls to members of the global NPODINWRAPPER structure (the gPlugin variable). This structure is initialized with the real worker functions in the DLL_InitTerm() function when the DLL is loaded for the first time.

The following is my wild guess. When the first attempt to load NPFLOS2.DLL fails with error 87 due to a crash in FILT.DLL, Firefox (after the fix, see the issue) will try to immediately load it for the second time. However, this second time doesn't cause DLL_InitTerm() in NPFLOS2.DLL to be executed (for whatever reason — I don't know). And since the crash in FILT.DLL happens *before* the first initialization attempt is complete, NPODINWRAPPER is left uninitialized. As a final result, when the loading process attempts to call NP_GetEntryPoints, that function just blindly redirects it to NULL.

Note that I have proof that DLL_InitTerm() is not called the second time: the NPFLOS2 log file remains as it was after the first (crashed) initialization attempt. Otherwise it would have been overwritten by the new text (opening the same file with fopen within the same process is valid and works).
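
To illustrate the failure mode guessed at above, here is a minimal, portable sketch of an exported entry point that forwards through a global table of function pointers. It is not the real wrapper source: wrapper_table stands in for NPODINWRAPPER, wrapper_init for the work done in DLL_InitTerm(), and the NULL check shows one possible way to fail cleanly instead of jumping to NULL. All identifiers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the NPODINWRAPPER structure. */
typedef int (*get_entry_points_fn)(void *funcs);

typedef struct {
    get_entry_points_fn get_entry_points; /* stays NULL until init has run */
} wrapper_table;

/* File-scope object: zero-initialized, so every member starts out as NULL. */
static wrapper_table g_plugin;

#define WRAPPER_ERR_NOT_INITIALIZED (-1)

/* The "real worker" that the table is supposed to point to. */
static int real_get_entry_points(void *funcs)
{
    (void)funcs;
    return 0;
}

/* What the init routine is expected to do on a successful first load. */
static void wrapper_init(void)
{
    g_plugin.get_entry_points = real_get_entry_points;
}

/* Exported entry point: does no work itself, only forwards. Without the
 * NULL check, a skipped or interrupted init would make this call NULL. */
static int exported_get_entry_points(void *funcs)
{
    if (g_plugin.get_entry_points == NULL)
        return WRAPPER_ERR_NOT_INITIALIZED;
    return g_plugin.get_entry_points(funcs);
}
```

If the init step never completes, exported_get_entry_points returns WRAPPER_ERR_NOT_INITIALIZED here, whereas an unguarded forwarder would crash. A guard like this only turns the crash into an error; the init routine still has to run for the plugin to work.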

Last edited 11 years ago by dmik

comment:8 by dmik, 11 years ago

BTW, I don't know exactly why the additional level of indirection is added to exports. Judging from the comments, it was done because we're dealing with a generic Win32 plugin DLL wrapper intended to be used with many plugins "on the fly" (remember, the Java plugin was also from Windows in innoTek times). This is not relevant today (as we only have Flash), but regardless of that, we need the init routine to be executed completely to make sure the plugin works, of course. So we need a fix for this situation anyway.

Last edited 11 years ago by dmik

comment:9 by dmik, 11 years ago

Hmmm, sorry, I looked in the wrong place. The DLL is actually INITINSTANCE with DATA MULTIPLE NONSHARED, so these flags are not the cause. But still, the init routine is *not* called the second time. I will try to add a DosSleep between attempts to reload the DLL in Firefox to see if it helps.
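
The experiment mentioned above (pausing between attempts to reload the DLL) amounts to a retry loop with a delay. The sketch below shows only the shape of that loop, not the actual Firefox change: the loader and the sleeper are hypothetical callbacks (on OS/2 the sleeper would presumably call DosSleep), and flaky_load is a stand-in that fails a configurable number of times with error 87 before succeeding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types, injected so the sketch stays portable. */
typedef int (*retry_loader)(const char *module_name, void *ctx); /* 0 = ok */
typedef void (*delay_sleeper)(unsigned long msec);

/* Try to load a module up to max_attempts times, sleeping delay_msec
 * between attempts. Returns 0 on success or the last error code. */
static int load_with_retry(const char *module_name, int max_attempts,
                           unsigned long delay_msec,
                           retry_loader load, void *ctx,
                           delay_sleeper do_sleep)
{
    int rc = -1;
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        rc = load(module_name, ctx);
        if (rc == 0)
            break;
        if (attempt < max_attempts)
            do_sleep(delay_msec); /* no point sleeping after the last try */
    }
    return rc;
}

/* Stand-in loader: fails with error 87 while *ctx is positive. */
static int flaky_load(const char *module_name, void *ctx)
{
    int *failures_left = ctx;
    (void)module_name;
    if (*failures_left > 0) {
        --*failures_left;
        return 87;
    }
    return 0;
}

/* Stand-in sleeper: records the requested delay instead of waiting. */
static unsigned long total_delay_msec = 0;

static void record_sleep(unsigned long msec)
{
    total_delay_msec += msec;
}
```

With two attempts and a loader that fails once, the second attempt succeeds after a single delay. Whether a delay changes anything for NPFLOS2.DLL is exactly what the experiment was meant to find out.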

Last edited 11 years ago by dmik

comment:10 by dmik, 11 years ago

BTW, I just found in the Odin logs that reloading NPFLOS2.DLL twice from Firefox (see comment:3) doesn't fully solve the problem. During the second attempt it doesn't crash, but loading MDM.DLL fails with error 295 instead (this error indicates a failure in the init routine). This means that there is no audio in Flash in FF17 at all now. We have to solve it somehow.

Another thing is that MDM.DLL is already in memory when Firefox loads: it is loaded by PM at startup, so it's always there. However, FILT.DLL has the INITINSTANCE flag set, so its init routine is called once per process. And somehow calling it twice within an interval of a few seconds breaks it...

comment:11 by dmik, 11 years ago

Another detail: even if you disable loading of MDM in WINMM.DLL, it still crashes in FILT.DLL, only later, when DSOUND.DLL is initialized (it does the same thing as WINMM.DLL). This really annoys me.

Last edited 11 years ago by dmik

comment:12 by dmik, 11 years ago

This comment in the Firefox ticket is relevant: https://github.com/bitwiseworks/mozilla-os2/issues/43#issuecomment-34776135.

BTW 2. Comment:7 above (error 295 when loading MDM.DLL) applies not only to the second attempt of the first run of Firefox, but also to all subsequent runs of Firefox. After the initial crash, loading MDM.DLL keeps returning error 295 until you reboot the system. And not only Firefox 17 is affected, but also Firefox 10 (which doesn't suffer from that problem per se since there's always only one process).

I guess that any MMPM application trying to use MMPM after this crash will fail. (I just don't have any to check — the standard MMPM applications seem to use PM classes for audio playback and therefore continue to work).

So this becomes a MAJOR problem then.

comment:13 by dmik, 11 years ago

I have written a simple test case that loads MDM and then starts a child that loads MDM. It works. Then I changed MDM to NPFLOS2.DLL to drag Odin in, and it still works well. So the crash only happens when NPFLOS2 is loaded from the Firefox process. Maybe that's because Firefox is a PM process with an event queue and such, or maybe because Firefox has allocated a lot of memory by the time NPFLOS2 gets dragged in; who knows. So the attempt to create a simple test case for debugging this specific crash has failed.

Version 0, edited 11 years ago by dmik

comment:14 by dmik, 11 years ago

I have no ideas other than a) debugging into the MMPM DLLs and b) dropping MMPM completely in favor of libkai (#85). Both are lengthy tasks and we don't want to hold up the release of the new Firefox for long, so we will postpone this and use a workaround that disables IPC for Flash for now (see https://github.com/bitwiseworks/mozilla-os2/issues/43#issuecomment-33723670).
