Opened 14 years ago

Closed 13 years ago

#19 closed task (fixed)

Move to GCC

Reported by: dmik Owned by:
Priority: major Milestone: compiler switch
Component: odin Version:
Severity: highest Keywords:
Cc: psmedley

Description

It makes sense to build the whole of Odin with the latest GCC 4.x. Apart from the overall better quality of the compiler compared to the ancient VAC3 and the fact that GCC's kLIBC is already widely used in the system (including Odin itself), this has another important advantage. The VAC3 runtime (which the Odin CRT DLL currently consists of) has a limitation: a DLL that uses C++ classes and wants destructors of static/global objects to be properly called at program termination must use DosExitList(). This is very inconvenient and kind of dangerous -- a failure in a DosExitList routine may turn the process into a zombie.

This is especially a problem when the Odin CRT DLL is mixed with some other C runtime DLL like kLIBC. kLIBC does all static destruction at DLL termination time, but this happens after the exit list routines are processed. This can easily create a situation where a C++ kLIBC class makes a call to Odin (e.g. some Win32 API) that ends up in a C++ Odin class, but since the Odin CRT is already uninitialized, the application will most likely crash.

BTW, such DosExitList() usage in Odin and in the VAC runtime may also be a reason for the infamous hang in DosExitList at program termination on SMP machines, or be somehow related to that hang. I don't have any proof so far, just a feeling.

Change History (59)

comment:1 by dmik, 14 years ago

Type: defect → task

It makes sense to do this task together with #5.

comment:2 by abwillis, 14 years ago

Because pe.exe and pec.exe do not work with VAC365 (ticket #22), and they didn't work on a VAC308 build last night either, I decided to try with GCC. Committed as r21559.

comment:3 by abwillis, 14 years ago

peldr, pe.exe and pec.exe built with GCC work. Committed as r21560.

comment:4 by dmik, 14 years ago

A few remarks about your check-ins:

r21559:

  1. There is obviously a typo at the beginning of #36 in odin32.dbg.emx.mk.
  2. I think we should drop old EMX support completely so let's better delete all these explicit EMX library lines instead of commenting them out.
  3. For LD2TARGETFLAGS (in rel.mk and dbg.mk), instead of deleting explicit linker flags, you may use -Zlinker /PM:PM or -Zlinker /PM:VIO for selecting the executable type (emxomfld will convert it appropriately for ILINK and WLINK) and -Zstack to set the stack size.

r21560:

  1. Using predefined paths is bad-bad-bad. You should either add the GCC and OS2TK45 include and library directories to your system include/lib paths, or create your own environment script that does that, or enhance configure.cmd so that it does that through the existing makefile.inc (this would only be valid for OS2TK45, as GCC is supposed to be set up already when you run make; BTW, the win32k version of configure.cmd already adds the Toolkit to its own .inc file).

Please correct your check-ins according to the above.

JFYI. Also, when moving to GCC we will actually move from nmake to kBuild. This will dramatically simplify the build process and its configuration and make all these *.mk files obsolete.

comment:5 by abwillis, 14 years ago

I had not meant to commit hardcoded paths. I had started working on the -Zlinker options, but I have since found that the resulting executables were not good for several apps such as Lotus Notes, so I have backed out r21560 and r21559.

comment:6 by abwillis, 14 years ago

Backouts were r21561 and r21562.

comment:8 by Silvan Scherrer, 14 years ago

Milestone: compiler switch

comment:9 by Silvan Scherrer, 13 years ago

Severity: highest

comment:10 by dmik, 13 years ago

JFYI, the current status: unicode/guidlib/seh static libraries are done. Now working on kernel32 (the biggest), ~50% done.

comment:11 by dmik, 13 years ago

Regarding _DLL_InitTerm() and exit lists (I've just come to that point). AFAIU, the whole matter of using exit lists in Odin is because the OS/2 loader has a substantial bug: the order in which DLLs are loaded (and _DLL_InitTerm(0) used to initialize DLL data is called) doesn't necessarily match the order in which they are unloaded (and _DLL_InitTerm(1) is called to perform the cleanup). As a result, if dll-B uses something from dll-A in its cleanup procedure (_DLL_InitTerm(1)) and dll-A gets unloaded before dll-B, a crash will occur.

The solution (advertised even in VAC++ Programming Guide!) is to use exit lists. Exit lists are guaranteed to be processed before any DLL gets unloaded at process termination which solves the cleanup order issue (given that the proper exit list order codes that ensure the specific order are used).

This doesn't only relate to LIBC (or ODINCRT) but also to some other key DLLs others depend upon (like KERNEL32). For this reason, we cannot get rid of exit list usage completely.

My idea though is to optimize the related parts of the code by introducing a common wrapper which will ensure all the machinery and also use the exception handler within the exit list handler to prevent any exceptions when the exit list routines are run (which is known to cause unkillable process hangs).
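
The ordering part of such a wrapper can be sketched in portable C. This is only a simulation, not the OS/2 mechanism: on OS/2 the registration would go through DosExitList(), and the exception handler mentioned above is left out. All demo_* names are hypothetical, and the sketch assumes that lower order codes run first.

```c
#include <stddef.h>

/* Portable simulation of exit-list ordering; every demo_* name is made up. */
#define DEMO_MAX_HANDLERS 16

typedef void (*demo_exit_fn)(void);

struct demo_exit_entry {
    unsigned order;     /* assumed: lower order codes run first */
    demo_exit_fn fn;
};

static struct demo_exit_entry demo_list[DEMO_MAX_HANDLERS];
static size_t demo_count = 0;

/* Trace buffer so the execution order can be inspected afterwards. */
static char demo_trace[DEMO_MAX_HANDLERS + 1];
static size_t demo_trace_len = 0;

static void demo_note(char c)
{
    if (demo_trace_len < DEMO_MAX_HANDLERS)
        demo_trace[demo_trace_len++] = c;
}

/* Stand-ins for per-DLL cleanup routines. */
void demo_cleanup_kernel32(void) { demo_note('K'); }
void demo_cleanup_user32(void)   { demo_note('U'); }

const char *demo_exit_trace(void) { return demo_trace; }

/* Register a cleanup routine. The list is kept sorted by order code, so the
   order of registration (i.e. DLL load order) does not matter. */
int demo_exit_register(unsigned order, demo_exit_fn fn)
{
    size_t i;

    if (fn == NULL || demo_count == DEMO_MAX_HANDLERS)
        return -1;

    /* Shift entries with a higher order code up by one slot. */
    i = demo_count;
    while (i > 0 && demo_list[i - 1].order > order) {
        demo_list[i] = demo_list[i - 1];
        i--;
    }
    demo_list[i].order = order;
    demo_list[i].fn = fn;
    demo_count++;
    return 0;
}

/* Run all routines in order-code order, then clear the list. */
void demo_exit_run(void)
{
    size_t i;

    for (i = 0; i < demo_count; i++)
        demo_list[i].fn();
    demo_count = 0;
}
```

Registering demo_cleanup_kernel32 with code 0x20 and then demo_cleanup_user32 with code 0x10 makes demo_exit_run() call the USER32 routine first, so a DLL that depends on another one gets cleaned up before its dependency regardless of load order.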


comment:12 by dmik, 13 years ago

Correction: exceptions in the exit list handlers don't cause unkillable process hangs. Hangs in the exception handlers (e.g. an endless DosWaitEventSem()) triggered from within the exit list handler or _DLL_InitTerm(1) cause the process to hang forever.

There is no universal way to protect from that since the application may always install a new exception handler on top of the existing exception handler chain and hang from there at exit time. However, with kLIBC used as the CRT library (i.e. after the full switch to GCC), we have full control over all exception handlers in Odin so we may fix this if we find any problems there.

ATM (after we fixed a lot of possible hangs in Odin exception handlers), the only known case causing the unkillable hang is a recursive crash in printf() in VAC CRT in the debug version of Odin. This will go away with GCC.

comment:13 by dmik, 13 years ago

Another interesting observation shown in my tests is that if a DLL is manually loaded by the application with DosLoadModule(), _DLL_InitTerm(0) is called before DosLoadModule() returns. However, calling DosFreeModule() on that module does *NOT* cause _DLL_InitTerm(1) to be called -- it is only called at process termination... This looks odd and means that DosFreeModule() is kind of a no-op. Another OS/2 specialty.

comment:14 by dmik, 13 years ago

In r21733:21734, I separated the repeated DLL init/term code and placed it to a new static library, initdll.lib. This should greatly reduce duplication of the init/term code and simplify the DLL initialization routines.

comment:15 by dmik, 13 years ago

I have to do a lot of manual extern "C" prototype conversions unfortunately because __stdcall doesn't imply it in GCC (as opposed to VAC). This is a bit slow.


comment:16 by dmik, 13 years ago

This __stdcall behavior looks like another GCC/2 bug to me... As far as I can judge from Google, GCC for Windows implies extern "C" for __stdcall declarations too and therefore doesn't mangle function names in a C++ way.
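
For illustration, the manual conversion amounts to wrapping each prototype in an explicit linkage guard. The fragment below is a hypothetical sketch: DEMO_WINAPI and DemoGetVersion are made-up names, not Odin's actual macros or API, and the stdcall attribute is only applied when building with GCC for 32-bit x86.

```c
/* Hypothetical header fragment; names are illustrative only. */
#if defined(__GNUC__) && defined(__i386__)
#define DEMO_WINAPI __attribute__((stdcall))
#else
#define DEMO_WINAPI /* calling convention not applicable on this target */
#endif

#ifdef __cplusplus
extern "C" {    /* force C linkage so a C++ compiler does not mangle the name */
#endif

unsigned long DEMO_WINAPI DemoGetVersion(void);

#ifdef __cplusplus
}
#endif

/* Definition, normally located in a separate source file. */
unsigned long DEMO_WINAPI DemoGetVersion(void)
{
    return 0x00080001UL;    /* arbitrary value for the sketch */
}
```

The guard is a no-op when the header is compiled as C, so the same declaration can be shared between C and C++ sources.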

comment:17 by dmik, 13 years ago

Linking DLLs requires special internal versions of the .def files of these DLLs, which export symbols using their internal C names, not the names actually exported from the DLLs. To generate these files, a tool called impdef is used in Odin. We will have to build it too (together with the common library it drags in).

comment:18 by dmik, 13 years ago

Ported common.lib and impdef.exe.

comment:19 by dmik, 13 years ago

NTDLL.DLL is now built. I also added a bunch of various fixes in the common parts.

Turned out that KERNEL32.DLL needs some bits from ODINCRT.DLL so it is next.


comment:20 by dmik, 13 years ago

Built WIN32K.LIB and ODINCRT.DLL.

In the VAC mode, besides providing all VAC C runtime as custom DLL ODINCRT, memory functions like malloc()/free() were overridden to save/restore the FS register. I need to check if we need to do the same for GCC.
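
The save/restore idea can be sketched as follows. This is a portable simulation under loud assumptions: the FS selector is modelled as a plain variable, the allocator that changes it is made up for the demonstration, and all demo_* names are hypothetical. Real code would have to read and write the CPU register itself.

```c
#include <stdlib.h>

/* Stand-in for the FS selector (assumption: a plain variable, not the
   real register). */
static unsigned short demo_fs = 0x150b;

unsigned short demo_get_fs(void)        { return demo_fs; }
void demo_set_fs(unsigned short sel)    { demo_fs = sel; }

/* Made-up runtime allocator that changes FS as a side effect. */
static void *demo_runtime_malloc(size_t n)
{
    demo_fs = 0;
    return malloc(n);
}

/* Override: save FS, call the runtime allocator, restore FS. */
void *demo_malloc(size_t n)
{
    unsigned short saved = demo_get_fs();
    void *p = demo_runtime_malloc(n);

    demo_set_fs(saved);
    return p;
}
```

After demo_malloc() returns, demo_get_fs() yields the same value as before the call, which is the property the overridden memory functions are meant to preserve.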

comment:21 by dmik, 13 years ago

Finally, KERNEL32.DLL is built. Everybody may now try to compile everything himself, to check the environment requirements and so on. Note that parallel building is not possible yet (since I couldn't find a way to make sure the tools we need for the build process, like winerc.exe and impdef.exe, are built before other stages are started in parallel; this looks like a kBuild bug to me). So you will have to force a single threaded build with -j 1 for now.

Next steps:

  1. Check if malloc()/free() need to be overridden and do so if needed (see above).
  2. Check if the shared data segment in KERNEL32.DLL is still shared under GCC.
  3. USER32.DLL.
  4. Other DLLs.

comment:22 by dmik, 13 years ago

Regarding extern "C". I added it to the code where necessary to work around the GCC bug with stdcall mangling for the time being (building GCC here turned out to be quite complex, I'm stuck now with some duplicate symbol errors when building libstdc++ so this is TBD later).


comment:23 by dmik, 13 years ago

We need also to hack WGSS50.DLL since it refers to some VAC functions (memory allocation again) previously provided by ODINCRT.DLL.

comment:24 by dmik, 13 years ago

Done with malloc()/free(): any DLL which links to odincrt.lib will use them instead of the LIBC ones.

BTW, I'm seriously thinking about going the innoTek way right from the beginning and having a single DLL that includes all Win32-like DLLs (i.e. KERNEL32, USER32, NTDLL, GDI32 etc) as well as ODINCRT and all other necessary libraries. That is, something similar to what is named INNORT.DLL. There is an obvious benefit of going this way -- fewer files, less trouble, easier to distribute and link to and so on. Another benefit is that we will protect against intermixing the old VAC DLLs and the new GCC ones, which may happen on a user's machine if he doesn't install Odin carefully enough and may lead to some unexpected conflicts at runtime.

The current Odin codebase contains some basics of this all-in-one DLL, it is located in the /src/custombuild directory. A good place to start.

comment:25 by dmik, 13 years ago

Shared data for KERNEL32 and other DLLs is done (using the same scheme of shared data segments in a DLL as in the original VAC build).

WGSS50 hacking requires a bit more time than expected (it uses a lot of undocumented APIs). Doing that now.

comment:26 by dmik, 13 years ago

Hacked WGSS50 to use GCC LIBC instead of VAC runtime and committed the new version of the DLL. USER32 is next.

comment:27 by psmedley, 13 years ago

Does odin need _Optlink support at all? If so that will be a problem with GCC 4.x

comment:28 by psmedley, 13 years ago

Cc: psmedley added

comment:29 by dmik, 13 years ago

No, _Optlink is not needed in GCC mode. Why is it a problem? Is _Optlink completely broken now?

comment:30 by psmedley, 13 years ago

The patches for _Optlink need major revisions for GCC 4.x. Thus far I haven't successfully made it work, and given the lack of need for it I haven't worried about spending too much time on it :)

comment:31 by dmik, 13 years ago

USER32.DLL is done.

I'm really tired of the manual battle with mangling of __syscall/__stdcall functions so I will try to hack/fix GCC one more time now.

comment:32 by dmik, 13 years ago

I solved the mangling and the extern "C" problems for __syscall/__stdcall (see http://mantis.smedley.info/view.php?id=474 and http://mantis.smedley.info/view.php?id=494), so Odin should compile more smoothly now.

The only remaining one is handling varargs (a rare case but still needed), should be relatively easy.


comment:33 by dmik, 13 years ago

The fixed GCC is much-much better for porting IBM sources, no more hassle with extern "C" at all.

GDI32.DLL is done.

comment:34 by dmik, 13 years ago

IMM32.DLL is done.

comment:35 by dmik, 13 years ago

Built the debug version of the ported libraries. It is necessary to make the first simple application work -- it still crashes, so a deeper investigation is needed.

I also uploaded the fixed GCC binaries to ftp://ftp.netlabs.org/pub/odin/tools/gcc446_hotfix_48e94fcc.zip for anyone who wants to try to build Odin himself. A short readme is inside. Later we will make an RPM of GCC 4.4.6 with these fixes included.

comment:36 by dmik, 13 years ago

Okay, the crashes are due to the _Optlink calling convention being completely broken in GCC 4.x.x. I will try to look at it...

comment:37 by dmik, 13 years ago

I fixed _Optlink in GCC, details are here: http://mantis.smedley.info/view.php?id=494. The new binaries are here: ftp://ftp.netlabs.org/pub/odin/tools/gcc446_hotfix_89e310f7.zip.

Now the test case goes much further. Investigating some problems with loading of odin.ini.

comment:38 by dmik, 13 years ago

I fixed all the remaining issues and the simplest test case, testapp/console/file, runs now. So, congrats on the first GCC-based Odin application.

Other test cases need more DLLs so I'm switching back to DLL porting.

comment:39 by dmik, 13 years ago

Got a couple more test applications running including a simple GUI application (had to fix a number of things for it to work).

comment:40 by dmik, 13 years ago

Ported SEH testcases and fixed SEH handling under GCC.

Further test cases require COMCTL32 which is my next target.

comment:41 by dmik, 13 years ago

COMCTL32 is done, next are WINMM and MSVFW32 (required by it).

comment:42 by dmik, 13 years ago

Ported WINMM, MCICDA, MCIWAVE, LZ32, VERSION and MSVFW32 + some more testcases.

comment:43 by dmik, 13 years ago

Ported ADVAPI32, RPCRT4, OLEAUT32, OLE32, REGSVR32, SHLWAPI.

comment:44 by dmik, 13 years ago

Ported SHELL32 (another big beast) and the systray testcase.

comment:45 by dmik, 13 years ago

Ported WSOCK32 and IPHLPAPI as well as the IPHLPAPI testcases. Ported the threads testcase (which was the last one).

There are now a small number of simple DLLs remaining to port in order to complete the process.

comment:46 by dmik, 13 years ago

Ported AVICAP32 and AVIFIL32.

Got a big problem with CAPI2032. It seems that EMXOMFLD is bogus and doesn't allow the same symbol name to appear in both the IMPORTS section and the EXPORTS section. But in the case of CAPI2032, we must both import symbols like CAPI_BLABLA from CAPI20.DLL (OS/2 DLL) and export them from CAPI2032.DLL (Win32 DLL). I don't see any simple solution ATM...

comment:47 by dmik, 13 years ago

Ported WINSPOOL and COMDLG32. CAPI2032 is deferred so far.

comment:48 by dmik, 13 years ago

Ported CRYPT32, CTL3D32, DCIMAN32, DDRAW, DINPUT, DPLAY and DPLAYX. Of the remaining ones, only WS2_32 is necessary for Java so far. I will port it and then switch to testing Java.

The rest will be done later.

comment:49 by abwillis, 13 years ago

Just as an FYI, in the Odin32xp there was work done on widl which is used in Wine so may be of help when getting ready to do any syncs with Wine.

comment:50 by dmik, 13 years ago

Ported DSOUND, WS2_32, ICMP, MPR. That's enough for Java. It starts but crashes ATM. Debugging.

comment:51 by dmik, 13 years ago

The crash was due to the compiler bug I fixed in https://github.com/psmedley/gcc/commit/22a7d6473b975049ebcb67f237ffad94f016ae78. After doing a full rebuild the crash is gone.

I tested our latest release of Java with the new Odin and it works -- I tried the quite heavy SmartSVN GUI application. Got only one regression so far: a crash at application termination. Investigating.

comment:52 by dmik, 13 years ago

The crash happens when translating OS/2 char messages to Windows char messages. The reason is that VAC and GCC perform conversion between integers of different width and signedness differently. Given this program:

    #include <stdio.h>

    int main(void)
    {
        unsigned int ui;
        unsigned char uc = 0xa2;
        char c = 0xa2;

        ui = uc;
        printf("ui %x uc %x\n", ui, uc);

        ui = c;
        printf("ui %x c %x\n", ui, c);

        return 0;
    }

VAC will output:

    ui a2 uc a2
    ui a2 c a2

GCC will output:

    ui a2 uc a2
    ui ffffffa2 c ffffffa2

This means that when converting from char to unsigned int, GCC first converts char to int (which gives us ffffffa2), which is then converted to unsigned int w/o changes. VAC first converts char to unsigned char (still a2), which is then converted to unsigned int w/o changes.

Odin obviously relies on the latter and does it in many-many places I guess. Therefore we need some generic solution -- it is unrealistic to find and correct all these cases manually. Maybe GCC has some compatibility option...
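
The two conversion paths can be isolated in a pair of small helpers. This is a minimal sketch with hypothetical demo_* names. It uses signed char explicitly so that the result does not depend on the compiler's default for plain char. The VAC output above is consistent with plain char being unsigned there. GCC does have a -funsigned-char option that makes plain char unsigned, but whether it is suitable as a generic solution for Odin is not established here.

```c
/* Hypothetical helpers; signed char makes the behavior independent of
   whether plain char is signed on the target compiler. */

/* Sign-extending path: the value is widened as a signed quantity first,
   so a negative char becomes a large unsigned value. */
unsigned int demo_widen_signed(signed char c)
{
    return (unsigned int)c;
}

/* Zero-extending path: go through unsigned char before widening, so only
   the low 8 bits survive. */
unsigned int demo_widen_zero_extend(signed char c)
{
    return (unsigned int)(unsigned char)c;
}
```

For the byte 0xa2 (which is -94 as a signed char), the first helper yields ffffffa2 with a 32-bit unsigned int and the second yields a2. The explicit cast through unsigned char is the kind of local fix that works with either compiler.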

BTW, it seems that this was one of the reasons why the EMX implementation of the OS/2 toolkit headers provided char-wide definitions (e.g. BYTE, PSZ) as both signed and unsigned giving the ability to switch between them with the OS2EMX_PLAIN_CHAR define (which is not set by default meaning that BYTE is unsigned which is not what BYTE is in the toolkit but it solves the above mentioned conversion problem).

comment:53 by dmik, 13 years ago

At least, I found GCC options that will warn us about cases like that: -Wsign-conversion will give the warning "conversion to 'unsigned int' from 'char' may change the sign of the result" on the assignment ui = c, while -Wconversion will warn us with "conversion to 'char' alters 'int' constant value" about assigning 0xa2 to char c.

This way, we will spot all the relevant places in Odin sources and do the proper explicit conversion there. I will do it right now. It may require some time since there may be many places like that.

comment:54 by dmik, 13 years ago

It turned out that there are too many dangerous conversions between signed and unsigned integers in the Odin code, so resolving them all would take ages. For that reason, I abandoned this approach and simply fixed the particular problem (see r21885). We will fix other places as we find them.

In r21886 I fixed a typo that caused the recognition of modifier keys to be broken.

Now, SmartSVN works well under the new Odin. Made both commits from it -)

So the further plans:

  • Test more Java apps.
  • Port remaining DLLs (just a few).
  • Release the new Odin RPM in the experimental repository (aka beta).

comment:55 by dmik, 13 years ago

The rest of the DLLs are ported (too many to list here) and I also found a workaround for CAPI2032.DLL w/o patching EMXOMFLD (created a ticket for that though: http://svn.netlabs.org/libc/ticket/254).

Now I will clean up the branch, merge it to the trunk and then release a new RPM.

comment:56 by dmik, 13 years ago

Unfortunately, I can't merge the branch to the trunk. Crap. Some stupid SVN errors like "directory is locked" (in a cleanly checked out copy!). Will try to merge on Linux, this may be related to file name case changes I did on the branch (used Linux as well for that).

comment:57 by dmik, 13 years ago

Same problem:

    svn: Working copy 'src/ntdll' locked
    svn: Error reading spooled REPORT request response
    svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for details)

Piece of crap.

Somehow SVN is unable to deal with directory renames and merges... I guess I will have to rename NTDLL to ntdll and DplayX to dplayx on the trunk first in order to merge.


comment:58 by dmik, 13 years ago

OK, Linux != Mac OS. The native FS (HFS) in Mac OS is also case insensitive, just like HPFS. Tried to merge from the real Linux and it worked. I'm now building the new trunk.

Note that the above means that svn update will most likely fail if you run it in your old (pre-merged) trunk on an OS/2 machine. The best way to get the current trunk is to perform a clean checkout.

comment:59 by dmik, 13 years ago

Resolution: fixed
Status: new → closed

Released 0.8.1 in beta status (which also means that there is no announcement on the main page and that RPMs are only available in the experimental repository, netlabs-exp).

A few notes about the build:

  1. The win32k driver is not yet built and therefore not yet packaged. Not a big loss since it is not "officially" supported by us. We can build it later but we will have to depend on MSVC60 for that.
  2. The current kBuild is broken and doesn't build odininst.exe when run from the root of the source tree. In order to build it, you have to run kmk from the tools subdirectory. To be investigated.