Opened 18 years ago

Last modified 18 years ago

#28 assigned defect

DBCS don't work properly

Reported by: dmik Owned by: dmik
Priority: normal Milestone:
Component: kernel Version: 3.3.1-rc06
Severity: normal Keywords:
Cc: DBCS Japanese

Description

DBCS characters can neither be entered nor displayed correctly. For example, set codepage 932 as the system default code page (provided that you have Japanese fonts and other resources installed), open the Qt font dialog, select Times New Roman MT 30, and select, say, the Katakana or Han script (which produce the byte sequences 41 61 42 62 83 4F 83 5F 83 6F 83 80 and 41 61 42 62 90 9F E4 70 8C 65 E8 D9 in the Sample field, respectively). The Japanese characters are then drawn on top of each other. There are also reports that it is impossible to enter characters using an IME (Input Method Editor).
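To make the byte sequences above concrete, here is a minimal sketch that walks a code page 932 byte run and counts characters. The lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are an assumption based on the commonly documented layout of this code page, and the names `isCp932LeadByte`, `countCp932Chars`, `kKatakanaSample` and `kHanSample` are hypothetical; the actual port would rely on the system conversion tables instead.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: lead-byte ranges commonly documented for code page 932.
// This is an illustration only, not the table used by the Qt port.
inline bool isCp932LeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Counts characters in a CP932 byte run, treating a lead byte plus the
// byte that follows it as one double-byte character.
inline std::size_t countCp932Chars(const unsigned char* bytes, std::size_t len) {
    assert(bytes != nullptr || len == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++count) {
        // A lead byte at the very end has no trail byte; count it as one unit.
        i += (isCp932LeadByte(bytes[i]) && i + 1 < len) ? 2 : 1;
    }
    return count;
}

// The two byte sequences quoted in the description.
static const unsigned char kKatakanaSample[] = {
    0x41, 0x61, 0x42, 0x62, 0x83, 0x4F, 0x83, 0x5F, 0x83, 0x6F, 0x83, 0x80};
static const unsigned char kHanSample[] = {
    0x41, 0x61, 0x42, 0x62, 0x90, 0x9F, 0xE4, 0x70, 0x8C, 0x65, 0xE8, 0xD9};
```

Under these assumptions each 12-byte sample holds four single-byte letters followed by four double-byte characters, so any code that measures or positions text per byte rather than per character will disagree with the font engine about where glyphs go.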

Attachments (1)

qt3pko.zip (96.1 KB) - added by komh 18 years ago.


Change History (9)

by komh, 18 years ago

Attachment: qt3pko.zip added

comment:1 by komh, 18 years ago

Patches for DBCS input/output. Sometimes the input box is slightly misplaced.

comment:2 by dmik, 18 years ago

Status: new → assigned

Dear Myung-Hun, thank you for the patches. I will check and commit them tomorrow: your changes in qapplication_pm.cpp (KC_CHAR and zero checks) may cause regressions for SBCS codepages in processing of Ctrl/Alt-letter combinations.

Regarding the input box position. Qt doesn't use PM focus (it handles all focus traversal logic on its own) and doesn't update the PM cursor position, so in order to get the correct "caret position" one needs to detect the widget class and use an appropriate way to query the coordinates of whatever the "caret position" for this widget class is. Also, I think we should return QCP_NOCONVERT in response to WM_QUERYCONVERTPOS for those widgets that don't have any "caret position" at all (for example, QFrame, QLabel, QPixmap, etc).
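The idea in the paragraph above (answer with a caret rectangle when the widget has one, and with QCP_NOCONVERT otherwise) could be sketched roughly as follows. This is not the code from the port: `CaretRect`, `answerQueryConvertPos` and the two constants are hypothetical stand-ins, and the numeric values given to them are assumptions, so that the sketch compiles without the OS/2 toolkit headers.

```cpp
#include <cassert>

// Stand-ins for the PM return codes; the numeric values are assumptions.
const unsigned long kQcpNoConvert = 0; // plays the role of QCP_NOCONVERT
const unsigned long kQcpConvert   = 1; // plays the role of QCP_CONVERT

// Hypothetical rectangle type, loosely modelled on a PM rectangle.
struct CaretRect { long xLeft, yBottom, xRight, yTop; };

// caret == nullptr models a widget without any caret (QFrame, QLabel, ...).
inline unsigned long answerQueryConvertPos(const CaretRect* caret, CaretRect* out) {
    assert(out != nullptr);
    if (caret == nullptr)
        return kQcpNoConvert;   // no caret: tell the IME there is no position
    *out = *caret;              // report where the conversion window should go
    return kQcpConvert;
}
```

How the caret rectangle is obtained for each widget class is exactly the open question in this comment, so the sketch deliberately takes it as an input.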

Btw, if you know how to check DBCS input/output w/o installing a DBCS version of OS/2, please share this knowledge to let me test things locally. If I simply set, say, codepage 932 on my SBCS eCS, nothing works as it should of course.

comment:3 by komh, 18 years ago

In case of qapplication_pm.cpp, I think the following code is better.


if( ch ) {
    if( chm.chr & 0xFF00 ) // DBCS chars ?
        text = QString::fromLocal8Bit(( char * )&chm.chr, 2 );
    else
        text = QString::fromLocal8Bit(( char * )&ch, 1 );
}
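The snippet above passes the address of chm.chr directly, which works because on a little-endian machine the low byte of the 16-bit value comes first in memory and holds the first (lead) byte of the character. A minimal sketch that extracts the bytes explicitly, without depending on memory layout, might look like this. The function name `charBytesFromChr` is hypothetical, and it assumes that `ch` in the snippet above is the low byte of chm.chr.

```cpp
#include <string>

// Returns the local 8-bit bytes carried by a key event.
// chr is a stand-in for the 16-bit chm.chr field of the key message.
inline std::string charBytesFromChr(unsigned short chr) {
    const char low  = static_cast<char>(chr & 0xFF);        // first (lead) byte
    const char high = static_cast<char>((chr >> 8) & 0xFF); // second (trail) byte
    if (low == 0)
        return std::string();          // no character in this event
    if (high != 0)
        return std::string{low, high}; // DBCS: lead byte, then trail byte
    return std::string(1, low);        // SBCS: a single byte
}
```

The resulting byte string is what would then be handed to QString::fromLocal8Bit() together with its length.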


Regarding the input box position, is there a general way in Qt to query the coordinates of the caret position? And is it possible to determine whether a widget has a caret or not? I don't know the overall structure of Qt; I've looked into only a minimal portion of it, just enough to support DBCS. So if you can give me advice on this, I am willing to do it.

In fact, at first I tried to use 'microFocusHint()', but it didn't work as I expected. Maybe that was my mistake. Anyway, the method for determining the input box position needs to be enhanced, or a fully IME-aware routine needs to be implemented. Nevertheless, it is better than nothing.

Enabling a DBCS environment on eCS is easier than you think. Just add a DBCS codepage as the secondary codepage, and before launching the app, change its codepage to the secondary codepage. As far as I know, the WPS of eCS can set the starting codepage per program. That's all.

Of course, you need DBCS fonts such as Times New Roman MT 30.

And if you want to input DBCS characters directly without using the clipboard, you need an IME such as KIME (for Korean).

For reference, I list codepages for DBCS countries.

932/943 Japanese (S-JIS?)
949 Korean (EUC-KR)
950 Taiwan (Big-5)
1381/1386 Chinese (GBK?)

comment:4 by dmik, 18 years ago

Myung-Hun, I committed your patches in changeset:169:

  1. DBCS output patches have been slightly optimized, please check that they still work.
  2. For the input code, I recalled what your corrections would break. The problem is that OS/2 doesn't set the KC_CHAR flag for key events with Ctrl and Alt modifiers, while for Alt+letter combinations the chr portion of the key press message still contains a valid (SBCS) char. This "side-effect" is used in Qt/2 to make detection of Alt+letter shortcuts work for non-ASCII letters. Otherwise, if we obey KC_CHAR and don't report a letter to Qt when it is pressed with Alt, it won't be able to find a widget that defines this letter as an underlined shortcut.

     I tried to work around this problem similarly to your second suggestion, please check that it doesn't break something on the DBCS side. Alt+<DBCS_symbol> combinations should also work, provided that OS/2 behaves the same way for DBCS as for SBCS and reports a valid letter code in chr even if KC_CHAR is not set.

     Also note that I removed the checks for zero in store_key_rec() and friends because they break Qt functionality. Instead, I completely ignore events with code=0 and scancode=0 in translateKeyEvent(). However, I'm not sure whether this solves the DBCS problem you were trying to solve when you added these checks. See comments in the code for details.

  3. About the IME input box position. Actually, detecting the widget type in WM_QUERYCONVERTPOS is not a proper way to go, because a) it is hackish and b) it will only work for known widgets explicitly processed there. The proper way is to use the special QIMEvent and handle it appropriately in the code of the individual widgets themselves. Just search for QIMEvent through the code to get the idea. If you like, you can try to implement this approach for Qt/2. In the meantime, you can change the WM_QUERYCONVERTPOS handler to report the bottom right corner of the widget as the suggested position -- at least, it should work better than the current implementation.

Unfortunately, your suggestion about testing DBCS input/output on SBCS systems doesn't work because the necessary DBCS system resources are missing there (except, perhaps, for the new UniConv API), so OS/2 will simply complain about invalid code pages at boot time if you try to set the primary or secondary code page to one of the above numbers.

P.S. Your patches are very welcome, but please make sure that your editor doesn't replace TAB characters with series of space characters: I want to keep common sources as close to the official Qt repo as possible (including their "weird" tabulation and indentation) to simplify future code base synchronization. However, for newly added and OS/2 specific code you may (and should :) use "normal" tabulation (but still please follow Qt indentation rules, for consistency).

comment:5 by komh, 18 years ago

Hi/2.

For 1 and 2, unfortunately, neither input nor output works. I'll look into this.

For 3, I agree with you that QIMEvent is the best solution. But the current implementation satisfies me thanks to your WinCreateCursor() code in QWidget::setMicroFocusHint(), although it is somewhat lacking.

For the DBCS environment, I'm sorry, I forgot to mention that you should modify the COUNTRY setting in CONFIG.SYS as well.

081 for Japan, 082 for Korea, 086 for China, 088 for Taiwan.

But as far as I know, these country codes accept only codepage 437 or 850.

Finally, I will follow the Qt indentation rules as you said.

KO Myung-Hun

comment:6 by komh, 18 years ago

In qfontengine_pm.cpp, you should change

bool isDBCSLeadByte = uconvFirstByteTable[ s[ i ] ] == 2;

to

bool isDBCSLeadByte = uconvFirstByteTable[ s[ i ] & 0xFF ] == 2;
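The difference between the two lines matters when plain char is signed: a byte such as 0x83 then becomes a negative value and indexes memory before the start of the table. The following sketch illustrates the fix with an explicit cast to unsigned char. `FirstByteTable` and `isLeadByte` are hypothetical names, and the table contents (2 for a lead byte as in the lines above, 1 for everything else, with only 0x81-0x9F marked as lead bytes) are assumptions made for illustration, not the real uconvFirstByteTable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 256-entry table in the style of uconvFirstByteTable.
// Assumption: 2 marks a DBCS lead byte, 1 marks any other byte.
struct FirstByteTable {
    unsigned char v[256];
    FirstByteTable() {
        for (int i = 0; i < 256; ++i)
            v[i] = (i >= 0x81 && i <= 0x9F) ? 2 : 1;
    }
};

inline bool isLeadByte(const FirstByteTable& t, const char* s, std::size_t i) {
    assert(s != nullptr);
    // Without the cast, s[i] is negative for bytes >= 0x80 wherever plain
    // char is signed, so the lookup would read outside the table.
    return t.v[static_cast<unsigned char>(s[i])] == 2;
}
```

Masking with 0xFF and casting to unsigned char produce the same index here; the cast simply states the intent more directly.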

And in qapplication_pm.cpp, you should revert to my patches. The IME sends DBCS chars with a scancode of 0, so if you ignore the case with code = 0 and scancode = 0, DBCS chars cannot be delivered to Qt. The reason I added the checks for scancode and code being 0 in find_key() and store_key() is to prevent DBCS chars with a 0 scancode from being recognized as an 'Already Pressed Key'. I don't know whether Qt disallows unmatched Down/Up keyboard messages from the user, but I don't think this breaks Qt functionality. The OS/2 IME really does that: while composing DBCS chars, the Down event is intercepted by the IME, so the PM app cannot receive it. But the app does receive the Up event because the IME doesn't intercept it.

And I think we should change

if ( !(chm.fs & KC_KEYUP) && (chm.chr & 0xFF) && (chm.chr & 0xFF00) &&
     qt_os2UconvFirstByteTable()[ (chm.chr & 0xFF) ] == 2 ) {

to

if((( chm.fs & ( KC_VIRTUALKEY | KC_CHAR )) == KC_CHAR ) && ( chm.chr & 0xFF00 )) {

When the IME sends DBCS chars via WM_CHAR, KC_CHAR is set but KC_VIRTUALKEY is not. This way we avoid treating keys that have both KC_VIRTUALKEY and KC_CHAR set, such as ESC, SPACE, ENTER, TAB, etc., as DBCS chars. In addition, Alt/Ctrl combination chars don't have KC_CHAR set, so they are excluded as well. Consequently, we don't need such a complicated comparison to determine whether chm.chr is a DBCS char or not.
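The proposed condition can be tried out in isolation with a small sketch like the one below. The constants are stand-ins whose values are assumed to mirror the KC_* flags in the OS/2 toolkit headers, and `looksLikeDbcsChar` is a hypothetical helper, not a function from the port.

```cpp
// Stand-in flag values; assumed to mirror the KC_* flags of OS/2 PM.
const unsigned short kKcChar       = 0x0001; // plays the role of KC_CHAR
const unsigned short kKcVirtualKey = 0x0002; // plays the role of KC_VIRTUALKEY
const unsigned short kKcAlt        = 0x0020; // plays the role of KC_ALT

// fs and chr stand in for the chm.fs and chm.chr fields of the key message.
inline bool looksLikeDbcsChar(unsigned short fs, unsigned short chr) {
    // KC_CHAR must be set and KC_VIRTUALKEY must be clear, which excludes
    // ESC, ENTER, TAB and similar keys as well as Alt/Ctrl combinations.
    const bool charOnly = (fs & (kKcVirtualKey | kKcChar)) == kKcChar;
    // A non-zero high byte means chr carries a second (trail) byte.
    return charOnly && (chr & 0xFF00) != 0;
}
```

For example, a message with only the KC_CHAR stand-in set and chr = 0x4F83 is accepted, while one that also carries the KC_VIRTUALKEY stand-in, or one that carries only the KC_ALT stand-in, is rejected regardless of the value in chr.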

comment:7 by dmik, 18 years ago

Thanks for pointing to this signed/unsigned bug in qfontengine_pm.cpp (though I prefer to cast to uchar instead of ANDing 0xFF for better self-descriptiveness).

Regarding zero code/scancode checks, you finally convinced me, so I restored your variant. Your correction to the condition for DBCS chars seems to work well with SBCS, so I've accepted it as well.

See changeset:170 for details.

P.S. Changing COUNTRY helped indeed. VIO sessions lacking proper DBCS fonts seem to be the only problem in this setup.

comment:8 by komh, 18 years ago

Ok, it works well.

And you can use KShell to overcome the DBCS problem of VIO sessions on non-DBCS OS/2.

KO Myung-Hun
