Opened 14 years ago
Closed 14 years ago
#12 closed defect (fixed)
Use a single encoding in UI API
Reported by: | dmik | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | |
Component: | odin | Version: | |
Severity: | Keywords: | ||
Cc: |
Description
It is known that there are two system encodings in Windows: the ANSI encoding (used throughout the GUI APIs) and the OEM encoding (used for compatibility with DOS applications, in particular in command-line sessions for file and console output). For God knows what reasons, in some locales these two encodings differ. For example, in .ru, we have cp1251 (windows-1251) for ANSI and IBM-866 for OEM (the latter is the same as in OS/2 and MS/PC DOS).
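To make the difference concrete, here is a minimal sketch that decodes a single byte to a Unicode code point under each of the two code pages. It is only an illustration: the function names are made up for this example, only ASCII and the Russian letters are handled, and every other byte is reported as 0 (unknown).

```c
/* Illustrative sketch only: these helpers are hypothetical and cover
 * just ASCII plus the Russian alphabet. Any other byte yields 0. */

/* Decode one windows-1251 (ANSI in .ru) byte to a Unicode code point. */
static unsigned int cp1251_to_unicode(unsigned char c)
{
    if (c < 0x80) return c;                    /* ASCII is shared */
    if (c >= 0xC0) return 0x0410u + (c - 0xC0); /* U+0410..U+044F, uppercase then lowercase */
    if (c == 0xA8) return 0x0401u;             /* uppercase IO */
    if (c == 0xB8) return 0x0451u;             /* lowercase io */
    return 0;                                  /* not handled here */
}

/* Decode one IBM-866 (OEM in .ru) byte to a Unicode code point. */
static unsigned int cp866_to_unicode(unsigned char c)
{
    if (c < 0x80) return c;                                   /* ASCII is shared */
    if (c >= 0x80 && c <= 0xAF) return 0x0410u + (c - 0x80);  /* U+0410..U+043F */
    if (c >= 0xE0 && c <= 0xEF) return 0x0440u + (c - 0xE0);  /* U+0440..U+044F */
    if (c == 0xF0) return 0x0401u;                            /* uppercase IO */
    if (c == 0xF1) return 0x0451u;                            /* lowercase io */
    return 0;                                                 /* not handled here */
}
```

The byte 0xCF is the Cyrillic capital letter PE (U+041F) in windows-1251, but in IBM-866 the same letter is stored as 0x8F and 0xCF is not a letter at all. This is why a string handed to the wrong API shows up as garbage.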
I found out that at least one UI function, MessageBox(), expects two different encodings at the same time for two of its arguments: the window title must be in IBM866 while the message text itself must be in windows-1251. I guess this happens because the message text is drawn by Odin on its own (and hence the ANSI encoding is expected) while the title bar is drawn by the system (using the OS/2 encoding, which is the same as OEM). This is completely wrong and needs to be fixed. According to the Win32 docs, ANSI should be supplied in both cases.
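One way to get consistent behavior is to accept ANSI everywhere in the Win32-facing API and convert to the OS/2 code page only right before a string is handed to the system. The sketch below shows such a conversion for the .ru case. It is not the actual Odin change: the helper names are hypothetical, and only ASCII and the Russian letters are mapped (anything else becomes '?').

```c
/* Illustrative sketch only: hypothetical helpers, .ru locale assumed,
 * only ASCII and the Russian alphabet are mapped. */

/* Map one windows-1251 byte to the IBM-866 byte for the same character. */
static unsigned char ansi_to_oem_ru(unsigned char c)
{
    if (c < 0x80) return c;                                         /* ASCII is shared */
    if (c >= 0xC0 && c <= 0xEF) return (unsigned char)(c - 0x40);   /* -> 0x80..0xAF */
    if (c >= 0xF0) return (unsigned char)(c - 0x10);                /* -> 0xE0..0xEF */
    if (c == 0xA8) return 0xF0;                                     /* uppercase IO */
    if (c == 0xB8) return 0xF1;                                     /* lowercase io */
    return '?';                                                     /* not handled here */
}

/* Convert a NUL-terminated ANSI string to OEM in place.
 * Both code pages are single-byte, so the length does not change. */
static void ansi_to_oem_str(char *s)
{
    for (; *s != '\0'; ++s)
        *s = (char)ansi_to_oem_ru((unsigned char)*s);
}
```

With a helper like this, a caller would pass both the title and the text in ANSI, and only the copy of the title that goes to the system-drawn title bar would be converted.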
Attachments (1)
Change History (7)
comment:1 by , 14 years ago
by , 14 years ago
Attachment: | encoding_hell.png added |
---|
comment:2 by , 14 years ago
In the screenshot, you can see that things go crazy now. The window text is fine (proper ANSI), the window title is wrong (misinterpreted as OEM) and the window list entry is correct again (though it's the same string as the title). I bet they use a direct WinSwitchEntry API call to set the window list entry and apply a different code page conversion policy there.
comment:4 by , 14 years ago
In r21462, I also fixed CreateProcess() that had the same issue. Note that for consistency, all OSLib*() helpers in Odin expect strings to be in ANSI as well (they will internally convert them to OEM).
comment:5 by , 14 years ago
Hmm, actually it is not correct to interpret strings as ANSI in the involved OSLib* calls because all other OSLib* calls seem to interpret them as OEM (e.g. pass them directly to the OS/2 APIs) and changing it would be a whole lot of work. Canceled these changes (only in the part related to OSLib*, the rest of the changes is still valid) and replaced with a different solution.
comment:6 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
In r21464, I solved issue 3 by providing special versions of argc and argv, called __argcA and __argvA, which are exported from KERNEL32.DLL (and available at the source level if you include windows.h). These variables represent the command line arguments in the ANSI encoding (as opposed to argc and argv which are always in OEM from the Windows point of view).
Note that these variables are valid only after WinMain() is entered and must not be used if WinMain() is not used. The common technique to get the Windows behavior in Win32 applications that don't parse the lpCommandLine parameter of WinMain() but rather use the standard argc and argv variables is as follows:
    #include <windows.h>

    #ifndef _MSC_VER

    #include <odinlx.h>

    int _main(int argc, char **argv);

    int WIN32API WinMain(HANDLE hInstance, HANDLE hPrevInstance,
                         LPSTR lpCmdLine, int nCmdShow)
    {
        return _main(__argcA, __argvA);
    }

    int main(int argc, char **argv)
    {
        EnableSEH();
        RegisterLxExe(WinMain, NULL);
    }

    #else
    #define _main main
    #endif

    int _main(int argc, char **argv)
    {
        // do something with argv which is already in ANSI here
    }
This basically corresponds to the RegisterLxExe() usage pattern anyway; the only difference is using these __argcA and __argvA in WinMain() instead of real argc and argv.
Note that __argcA is in fact the same value as argc (the number of arguments doesn't change after the codepage conversion), it is there just for clarity.
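For illustration, an ANSI copy of the argument vector could be built roughly as follows. This is a sketch under stated assumptions, not the code from r21464: the function names are hypothetical, the .ru pair IBM-866/windows-1251 is hard-coded, and only ASCII and the Russian letters are mapped. It also shows why the argument count stays the same: each argument is converted byte for byte.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative sketch only: hypothetical helpers, .ru locale assumed. */

/* Map one IBM-866 byte to the windows-1251 byte for the same character. */
static unsigned char oem_to_ansi_ru(unsigned char c)
{
    if (c < 0x80) return c;                                         /* ASCII is shared */
    if (c <= 0xAF) return (unsigned char)(c + 0x40);                /* 0x80..0xAF -> 0xC0..0xEF */
    if (c >= 0xE0 && c <= 0xEF) return (unsigned char)(c + 0x10);   /* -> 0xF0..0xFF */
    if (c == 0xF0) return 0xA8;                                     /* uppercase IO */
    if (c == 0xF1) return 0xB8;                                     /* lowercase io */
    return '?';                                                     /* not handled here */
}

/* Build a NULL-terminated ANSI copy of an OEM argv.
 * Returns NULL on allocation failure; the caller owns the result. */
static char **make_ansi_argv(int argc, char **argv)
{
    char **out = malloc(((size_t)argc + 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]);
        out[i] = malloc(len + 1);
        if (out[i] == NULL) {
            while (i > 0)           /* release what was allocated so far */
                free(out[--i]);
            free(out);
            return NULL;
        }
        for (size_t j = 0; j < len; j++)
            out[i][j] = (char)oem_to_ansi_ru((unsigned char)argv[i][j]);
        out[i][len] = '\0';
    }
    out[argc] = NULL;
    return out;
}
```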
This pattern may also be found in the testapp/encodings test case. Now this test case behaves the same on a native Windows machine and on OS/2 under Odin, so I consider this ticket done.
A clarification is needed regarding the ANSI/OEM stuff. There are in fact several issues:
The ideal solution that would resolve 3. (the ANSI/OEM mismatch) would be to make the ANSI encoding equal to the OS/2 system encoding in Odin (e.g. IBM866 for .ru). However, this would break binary compatibility with proprietary Windows applications (such as Flash and Opera) which expect ANSI to be what it is in Windows terms (e.g. windows-1251 for .ru).
I will research more.