#227 closed defect (invalid)
wcrtomb() broken
Reported by: | Silvan Scherrer | Owned by: | bird |
---|---|---|---|
Priority: | normal | Milestone: | libc-0.6.5 |
Component: | libc | Version: | 0.6 |
Severity: | normal | Keywords: | |
Cc: |
Change History (9)
comment:1 Changed 14 years ago by
comment:2 Changed 14 years ago by
Owner: | changed from bird to Silvan Scherrer |
---|
comment:3 Changed 14 years ago by
Owner: | changed from Silvan Scherrer to bird |
---|
I hate Trac :) The code snippet is now better formatted.
    #include <stdlib.h>
    #include <stdio.h>

    int main()
    {
        char buf[256] = {0};
        int rc = wctomb(buf, 0xE4);
        printf("rc %d char %02x (%c)\n", rc, (unsigned char)buf[0], buf[0]);
        return 0;
    }
comment:4 Changed 13 years ago by
I'm a bit surprised you get anything, since you're not calling setlocale(LC_ALL, "")... Could you do that and also supply the hex value of the LATIN SMALL LETTER A WITH DIAERESIS character you're expecting?
comment:5 Changed 13 years ago by
Component: | baselayout → libc |
---|
comment:6 Changed 13 years ago by
OK, I did the test. Even with the setlocale(LC_ALL, "") call I get the same output. The hex value of LATIN SMALL LETTER A WITH DIAERESIS (ä) that I expect is 84.
comment:7 Changed 12 years ago by
Resolution: | → invalid |
---|---|
Status: | new → closed |
What happens is that when setting LC_ALL, LC_CTYPE or LANG to 'de_CH' you get an 'ISO8859-1' codepage, and in that code page LATIN SMALL LETTER A WITH DIAERESIS has the same value as the Unicode code point (0xE4). See http://en.wikipedia.org/wiki/ISO/IEC_8859-1 for details.
The codepage you're expecting to use is probably 850 (http://en.wikipedia.org/wiki/Code_page_850). To get output in that codepage you need to set LC_ALL, LC_CTYPE or LANG to the value 'de_CH.IBM850'.
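A minimal sketch of that suggestion, selecting the locale in code rather than through the environment. It assumes the C library accepts the locale name 'de_CH.IBM850' given above; other platforms spell locale names differently. The helper name `encode_in_locale` is hypothetical and not part of libc.

```c
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical helper: select `locale_name` for LC_CTYPE, then encode `wc`
 * into `out` (which must hold at least MB_LEN_MAX bytes) with wctomb().
 * Returns the number of bytes written, or -1 if the locale is unavailable
 * or the character cannot be represented in it.
 */
int encode_in_locale(const char *locale_name, wchar_t wc, char *out)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;              /* locale not known to this C library */

    wctomb(NULL, 0);            /* reset any shift state */
    return wctomb(out, wc);     /* -1 if wc has no encoding in this locale */
}
```

If the locale is available, `encode_in_locale("de_CH.IBM850", 0xE4, buf)` should store the code page 850 byte 0x84 in `buf[0]`, whereas an ISO8859-1 locale should store 0xE4.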
I've added a testcase in r3756 which shows very clearly what setlocale() does and that the mbtowc <-> wctomb round-trip works correctly. There is no bug here, I think.
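The round-trip idea can be sketched as follows. This is not the testcase from r3756, only an illustration of the check it describes; the function name `roundtrip_ok` is made up for this example, and it assumes the caller has already selected a locale with setlocale().

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Returns 1 if the non-null wide character `wc` survives wctomb()
 * followed by mbtowc() in the current LC_CTYPE locale, 0 otherwise.
 */
int roundtrip_ok(wchar_t wc)
{
    char buf[MB_LEN_MAX];
    wchar_t back = 0;

    wctomb(NULL, 0);                        /* reset shift state */
    int len = wctomb(buf, wc);
    if (len <= 0)
        return 0;                           /* not representable here */

    mbtowc(NULL, NULL, 0);                  /* reset shift state */
    if (mbtowc(&back, buf, (size_t)len) != len)
        return 0;                           /* decoded a different length */

    return back == wc;
}
```

A character that exists in the selected code page should pass; one that does not should make wctomb() fail and the function return 0.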
comment:8 Changed 12 years ago by
Milestone: | → libc-0.6.5 |
---|---|
Version: | → 0.6 |
comment:9 Changed 12 years ago by
r3788 changes the setlocale behavior when no codeset is specified: it now defaults to the process code page.
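One way to see which codeset setlocale() picked is to ask nl_langinfo() after adopting the environment's locale. This is a sketch that assumes the POSIX <langinfo.h> interface is available in the C library being used; the helper name `environment_codeset` is hypothetical.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: adopt the environment's locale for LC_CTYPE and
 * return the name of the codeset the C library selected, or NULL if the
 * environment names a locale that is not available.
 */
const char *environment_codeset(void)
{
    if (setlocale(LC_CTYPE, "") == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Printing the returned string with LANG set to de_CH before and after such a change would show whether the codeset follows the process code page.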
I get the following output: rc 1 char e4 (õ)
LANG is de_CH
and I expected to get an ä.