Opened 8 years ago

Closed 8 years ago

#148 closed defect (fixed)

python: os.popen doesn't accept backslashes

Reported by: dmik Owned by:
Priority: major Milestone:
Component: python Version:
Severity: medium Keywords:
Cc:

Description

I discovered a strange problem with os.popen regarding paths containing backslashes:

import os

gcc_exe = "C:\\usr\\bin\\gcc.exe"

print "["+repr(gcc_exe)+"]"

# [1] This works
exists = os.access(gcc_exe, os.X_OK)
print "exists", exists

if exists:

    # [2] This works too, if uncommented
    #os.execl(gcc_exe, "gcc", "-dumpversion")

    # [3] This doesn't work :(((
    out = os.popen(gcc_exe + ' -dumpversion', 'r')
    try:
        out_string = out.read()
        print out_string
    finally:
        out.close()

This gives the following output here:

D:>python popen.py
['C:\\usr\\bin\\gcc.exe']
exists True
C:usrbingcc.exe: not found

Cases [1] and [2] both work, but case [3] does not. It seems that some extra string evaluation happens before the string is passed down to the popen implementation (as if str(gcc_exe) were called), which effectively kills the backslashes.

Change History (3)

comment:1 Changed 8 years ago by dmik

This problem is the reason why distutils/emxccompiler.py fails like this:

running build_ext
C:USRBIN/gcc.exe: not found

when building 3rd party python modules with their setup.py build if the PATH environment variable contains backslashes.

A quick fix is to convert backslashes to forward slashes before calling python, but we need to fix it properly instead.

comment:2 Changed 8 years ago by dmik

It turns out that the problem is not in python. In the C implementation of popen all backslashes are preserved; I checked it with printf. It's actually LIBC. Well, not LIBC itself but the way popen and system work. They pass the command to the shell (using the -c option), unmodified. And when the shell is Unix-like (e.g. bash or ash), it of course interprets all backslashes as escape chars and eventually removes them before interpreting the command. Hence the result.
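The effect can be reproduced without a real shell. The sketch below uses the standard shlex module in its default POSIX mode as a stand-in for the tokenizing step of a Unix-like shell; it only approximates what bash or ash do, and the names shell_words and command are made up for illustration.

```python
import shlex

# The command string exactly as popen would hand it to "sh -c".
command = "C:\\usr\\bin\\gcc.exe -dumpversion"

# shlex in POSIX mode treats an unquoted backslash as an escape
# character: it is dropped and the next character is kept literally.
shell_words = shlex.split(command)

print(shell_words[0])
```

The first word comes out with every backslash removed, which matches the "not found" message in the description.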

This is actually not easy to fix because the command is just an arbitrary shell script and the function has no knowledge of which parts of it are paths (to e.g. replace backslashes with slashes there). And blindly replacing all backslashes in the command or escaping them again will surely break many existing scripts where backslashes are used as escaping chars rather than path separators.
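A small illustration of why a blind rewrite is unsafe, again using shlex in POSIX mode as an approximation of a Unix-like shell. The command here is a hypothetical script line where the backslash escapes a space rather than separating path components.

```python
import shlex

# A backslash used as an escape character, not as a path separator.
command = r"echo hello\ world"

# What the script author intended: one argument containing a space.
intended = shlex.split(command)

# Blindly turning every backslash into a forward slash.
replaced = shlex.split(command.replace("\\", "/"))

# Blindly escaping every backslash again.
doubled = shlex.split(command.replace("\\", "\\\\"))

print(intended)
print(replaced)
print(doubled)
```

Both rewrites split the single argument in two, so neither is safe to apply to an arbitrary command.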

As this problem originally comes from the PATH setting (where backslashes are used for native reasons), I thought that we could fix it globally, at least in python, by changing backslashes to slashes in all PATH-like environment variables when initializing os.environ. But this solution also looks bug-prone (these variables may be passed back to the process environment and some native program may not expect it at all).
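For reference, the rejected idea would look roughly like the sketch below. The function name normalize_path_vars and the list of variable names are assumptions for illustration; the sketch works on a copy of the mapping precisely because modifying the real environment is the risky part.

```python
def normalize_path_vars(env, names=("PATH",), altsep="\\", sep="/"):
    """Return a copy of env with altsep replaced by sep in the named variables.

    Hypothetical helper: which variables count as "PATH-like" is a guess.
    """
    fixed = dict(env)
    for name in names:
        if name in fixed:
            fixed[name] = fixed[name].replace(altsep, sep)
    return fixed


# Example with a made-up environment.
sample_env = {"PATH": "C:\\usr\\bin;D:\\tools", "HOME": "C:\\home"}
fixed_env = normalize_path_vars(sample_env)
print(fixed_env["PATH"])
```

Variables not listed in names are left alone, and the original mapping is not changed.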

I got a better idea. In a well-written python script (this includes emxccompiler.py as well) paths are manipulated with os.path.split and os.path.join. In particular, spawn.py (used by emxccompiler.py) uses os.path.join to join the executable name with PATH entries when it searches for an executable. So if I fix os.path.join by making it replace all os.path.altsep (backslashes on EMX) occurrences with os.path.sep (forward slashes), this will clear up this particular problem and many similar ones.
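A minimal sketch of the idea, not the actual change. It wraps posixpath.join as a portable stand-in for the EMX path module and takes the separators as explicit parameters so it behaves the same on any platform; the name join_normalized is hypothetical.

```python
import posixpath


def join_normalized(*parts, sep="/", altsep="\\"):
    """Join path components, then replace every altsep with sep.

    Assumption: posixpath.join stands in for the platform's os.path.join.
    """
    joined = posixpath.join(*parts)
    return joined.replace(altsep, sep)


# A PATH entry with native backslashes plus an executable name,
# the way spawn.py combines them when searching for a program.
full_path = join_normalized("C:\\usr\\bin", "gcc.exe")
print(full_path)
```

With the result containing only forward slashes, a Unix-like shell has no backslashes left to strip when the path ends up in a popen command.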

comment:3 Changed 8 years ago by dmik

Resolution: fixed
Status: new → closed

Did it in r607 and tested with emxccompiler.py. Works. Closing.
