Opened 9 months ago

Closed 9 months ago

#328 closed defect (fixed)

Python does not recognize OS/2 Chinese codepage (1386)

Reported by: ataylor
Owned by:
Priority: major
Milestone:
Component: yum
Version:
Severity: low
Keywords:
Cc:

Description

YUM unfortunately fails at startup when run on the Simplified Chinese version of OS/2 using the standard codepage (1386):

Traceback (most recent call last):
  File "/@unixroot/usr/lib/python2.7/site.py", line 550, in <module>
    main()
  File "/@unixroot/usr/lib/python2.7/site.py", line 540, in main
    setencoding()
  File "/@unixroot/usr/lib/python2.7/site.py", line 494, in setencoding
    sys.setdefaultencoding(encoding) # Needs Python Unicode build !
LookupError: unknown encoding: CP1386

I wonder if this is because (being ported software) it expects the GBK encoding to be codepage 936 instead? (936 is the GBK codepage under Windows. OS/2 uses codepage number 1386 instead.)

Perhaps an alias or mapping needs to be added somewhere?

(In spite of the superficial similarity to ticket #88, I suspect this is not actually the same error condition. However, similarly to #88, yum can be forced to run by changing the process codepage to 850 and setting LANG=en_US.)
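To see which codepage names a given Python build resolves, something like the following minimal sketch may help. The helper name `codec_name` is made up for illustration; `codecs.lookup` is the standard call that raises the `LookupError` seen in the traceback.

```python
import codecs

def codec_name(name):
    """Return the canonical codec name Python resolves `name` to, or None."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Same exception type as in the traceback above.
        return None

# Names that Python is already known to accept for GBK.
for candidate in ("gbk", "cp936"):
    print("%s -> %s" % (candidate, codec_name(candidate)))
```

Passing "cp1386" to the same helper is expected to return None on a build that lacks the alias.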

Change History (7)

comment:1 Changed 9 months ago by lewisr

Generally, these are added to encodings/aliases.py. Looking there, I find no mention of CP1386. Have a look at the file and see where it might be appropriate to add it.

(FWIW, this is not just our port; I am likewise not finding it in Python 2.7 on openSUSE Leap 42.1 x64.)

Perhaps adding:

    'cp1386'             : 'gbk',

to the gbk codec section, where 936, cp936, and ms936 currently reside, might do the trick?
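The effect of such an entry can be tried out without editing the file, by adding the same key to the in-memory alias table. This is only a sketch for experimenting in a single process, not the proposed fix itself; the sample string is an arbitrary choice.

```python
import codecs
import encodings.aliases

# Same key and value as the proposed aliases.py line, applied at run time.
# setdefault leaves an existing entry alone if the build already has one.
encodings.aliases.aliases.setdefault("cp1386", "gbk")

sample = u"\u4e2d\u6587"  # two common Simplified Chinese characters
encoded = sample.encode("cp1386")

print(codecs.lookup("cp1386").name)
print(encoded == sample.encode("gbk"))
```

The alias has to be in place before the first lookup of the name in the process, because the encodings package caches failed lookups.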

comment:2 Changed 9 months ago by lewisr

A less ham-fisted approach might be, for the systems affected by this (currently, Simplified Chinese), to add a CUBE (or sed) line to STARTUP.CMD (or even CONFIG.SYS) that checks for the presence of this alias in the file and inserts it if it is missing. This would ensure that even if Python is updated, the alias is added again at the next boot.

Alternatively, we could install a custom aliases.py on these systems, but I don't have a good way of making that a requirement on a system based upon its current CP setting.
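The check-and-insert step could look roughly like this. It is written in Python rather than CUBE or sed, and it works on the file's text so it can be tried safely; the function name `ensure_alias` and the choice of the `ms936` entry as the insertion point are assumptions for illustration.

```python
import re

def ensure_alias(text, alias="cp1386", codec="gbk", anchor="ms936"):
    """Return `text` with an alias entry added after the anchor entry.

    The text is returned unchanged if the alias is already present.
    """
    present = re.search(r"^\s*'%s'\s*:" % re.escape(alias), text, re.M)
    if present:
        return text

    anchor_re = re.compile(
        r"\s*'%s'\s*:\s*'%s'" % (re.escape(anchor), re.escape(codec)))
    out = []
    inserted = False
    for line in text.split("\n"):
        out.append(line)
        if not inserted and anchor_re.match(line):
            indent = line[:len(line) - len(line.lstrip())]
            eol = "\r" if line.endswith("\r") else ""  # keep CRLF files intact
            key = "'%s'" % alias
            out.append("%s%-21s: '%s',%s" % (indent, key, codec, eol))
            inserted = True
    if not inserted:
        raise ValueError("anchor entry %r not found" % anchor)
    return "\n".join(out)
```

A boot-time script would read aliases.py, pass its contents through this function, and write the file back only if the result differs. Running it a second time leaves the text unchanged.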

comment:3 Changed 9 months ago by ataylor

Indeed, adding these lines to aliases.py fixed the problem on my local test system:

    '1381'          : 'gb2312',
    'cp1381'        : 'gb2312',

    '1386'          : 'gbk',
    'cp1386'        : 'gbk',
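For systems where aliases.py cannot be patched, the same four mappings could in principle be supplied through a codec search function instead. This is an alternative sketch, not what was done for this ticket; the names `OS2_ALIASES` and `os2_codepage_search` are made up here, while `codecs.register` and `codecs.lookup` are standard.

```python
import codecs

# Mappings copied from the lines above; both targets ship with Python.
OS2_ALIASES = {
    "1381": "gb2312",
    "cp1381": "gb2312",
    "1386": "gbk",
    "cp1386": "gbk",
}

def os2_codepage_search(name):
    """Codec search function: resolve OS/2 codepage names, else None."""
    target = OS2_ALIASES.get(name.lower())
    return codecs.lookup(target) if target else None

codecs.register(os2_codepage_search)

text = u"\u4e2d\u6587"  # encodable in both GB2312 and GBK
for alias, target in sorted(OS2_ALIASES.items()):
    print("%s matches %s: %s"
          % (alias, target, text.encode(alias) == text.encode(target)))
```

Unlike the aliases.py change, this only takes effect in a process that runs the registration, so it would not by itself help with the failure in site.py at startup.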

Nice catch. Now if we can get these added to the default distribution... <g>

comment:4 Changed 9 months ago by lewisr

For the purists who want to follow Fedora's model, I would say that the only reason we require this on OS/2 is that we do not use Microsoft's CP for Simplified Chinese. Thus, other systems are not likely to need this alias.

That's probably the best reason I have for making this part of the "standard" OS/2 port.

comment:5 Changed 9 months ago by lewisr

If making changes, we should ensure that Traditional Chinese is also properly aliased.

comment:6 Changed 9 months ago by ataylor

Traditional Chinese OS/2 uses codepage 950, same as other platforms, and that is already defined in aliases.py.
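This can be confirmed against the alias table of the Python build in question, for example:

```python
import codecs
from encodings.aliases import aliases

# The bare number is an alias; cp950 itself is a codec module.
print(aliases.get("950"))
print(codecs.lookup("950").name)
print(codecs.lookup("cp950").name)
```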

comment:7 Changed 9 months ago by diver

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in r1565.
The RPM has been uploaded to the exp repo.
