Opened 7 years ago

Closed 7 years ago

#267 closed defect (fixed)

python: Piping via subprocess.Popen is broken

Reported by: dmik
Owned by:
Priority: major
Milestone:
Component: python
Version:
Severity: medium
Keywords:
Cc:

Description

While using subprocess.PIPE works per se (i.e. you may read from this pipe later on in python code), chaining several processes together doesn't.

Consider this snippet:

import os, subprocess

os.putenv ('LANG', 'en_US')

cmd1 = ['cat', 't.py']
cmd2 = ['grep', 'cmd']

proc1 = subprocess.Popen (cmd1, stdout = subprocess.PIPE)
proc2 = subprocess.Popen (cmd2, stdin = proc1.stdout)

rc = proc2.wait ()
print '*** proc2 rc', rc

rc = proc1.wait ()
print '*** proc1 rc', rc

On OS/2 (python 2.7.6-18.oc00) it gives the following output:

grep: (standard input): Operation not supported on socket
*** proc2 rc 2
*** proc1 rc 0

Expected result (tested on macOS):

cmd1 = ['cat', 't.py']
cmd2 = ['grep', 'cmd']
proc1 = subprocess.Popen (cmd1, stdout = subprocess.PIPE)
proc2 = subprocess.Popen (cmd2, stdin = proc1.stdout)
*** proc2 rc 0
*** proc1 rc 0

I.e., on OS/2 the second process (grep) fails to read from its stdin, which is a pipe connected to the stdout of the first process (cat).

Note that if you replace cat t.py | grep cmd with e.g. rpm2cpio MYRPM | cpio -idm, the error will be as follows:

2 blocks
cpio: error closing archive: Bad file number

meaning that cpio worked per se (files unpacked correctly) but failed to close the input stream.

Also note that both pipelines work fine as shell commands when run under dash or ash.

It looks like we're using pipes (or socket pairs?) incorrectly in Python.
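
To check this without eyeballing terminal output, the snippet above can be turned into a self-checking sketch. The child commands here are small Python one-liners standing in for cat and grep (an assumption, made so the example does not depend on external tools), and run_pipeline is a hypothetical helper name, not part of subprocess.

```python
import subprocess
import sys

# Stand-ins for `cat t.py` and `grep cmd` (assumption: any producer/filter
# pair exercises the same piping code path as the original commands).
PRODUCER = [sys.executable, '-c',
            "import sys; sys.stdout.write('cmd1 = 1\\nother = 2\\n')"]
FILTER = [sys.executable, '-c',
          "import sys; sys.stdout.writelines(l for l in sys.stdin if 'cmd' in l)"]


def run_pipeline(cmd1, cmd2):
    """Run cmd1 | cmd2 and return (rc1, rc2, captured stdout of cmd2)."""
    proc1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    proc2 = subprocess.Popen(cmd2, stdin=proc1.stdout, stdout=subprocess.PIPE)
    # The parent drops its copy of the read end so that only proc2 holds it.
    proc1.stdout.close()
    out, _ = proc2.communicate()
    rc1 = proc1.wait()
    return rc1, proc2.returncode, out


if __name__ == '__main__':
    rc1, rc2, out = run_pipeline(PRODUCER, FILTER)
    print('*** proc1 rc %d, proc2 rc %d' % (rc1, rc2))
```

On a platform where piping works, both return codes should be 0 and out should hold only the line containing "cmd". The failure described in this ticket would presumably show up as a non-zero second return code.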

Change History (3)

comment:1 by dmik, 7 years ago

I have a feeling that this is because Python uses socketpair() instead of pipe() on OS/2, and TCP/IP socket pairs are known for returning EBADF when the other end is closed (IIRC). Perhaps dash and ash use pipe() for piping, and since pipe() creates a native OS/2 pipe, everything works there. If that's the case, a fix would be to simply change socketpair() back to pipe(), but only for stdio redirection in a pipeline (we can't use pipe() where select() is used, as select() can't wait on OS/2 pipes). select() is rarely (never?) used for stdin/stdout/stderr, so pipe() should be safe there. This needs checking.
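
One way to check the socketpair() guess from Python itself is to ask fstat() what kind of object backs a descriptor. This is a minimal sketch, assuming that fstat() reports pipes and sockets distinctly on the platform in question (true on typical POSIX systems, not verified here for OS/2). fd_kind is a hypothetical helper name.

```python
import os
import socket
import stat
import subprocess
import sys


def fd_kind(fd):
    """Classify a file descriptor as 'pipe', 'socket' or 'other'."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return 'pipe'
    if stat.S_ISSOCK(mode):
        return 'socket'
    return 'other'


if __name__ == '__main__':
    # Inspect what subprocess.PIPE actually hands to the parent process.
    proc = subprocess.Popen([sys.executable, '-c', 'pass'],
                            stdout=subprocess.PIPE)
    print('subprocess.PIPE is backed by a %s' % fd_kind(proc.stdout.fileno()))
    proc.communicate()
```

If the guess is right, one would expect this to report a socket on the affected OS/2 build and a pipe elsewhere.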

comment:2 by dmik, 7 years ago

BTW, I did some checks. First, dash/ash do indeed use pipe(), while Python uses socketpair() on OS/2 (in Modules/posixmodule.c). Changing socketpair() back to pipe() makes the above test case work as it should. However, this breaks the communicate() functionality of subprocess, which uses select()... Quite a puzzle to solve.

I should mention though that communicate() doesn't work well even with select(): you will get nothing if the started program ends too fast (besides the original problem of this ticket with piping).
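
The "ends too fast" case can be sketched as a small check: a child that writes one word and exits immediately, read back through communicate(). Both stdout and stderr are piped on the assumption that requesting more than one pipe is what makes communicate() multiplex the descriptors rather than do a plain read. capture_fast_child is a hypothetical helper name.

```python
import subprocess
import sys


def capture_fast_child():
    """Start a child that exits right away and collect its output."""
    child = [sys.executable, '-c', "import sys; sys.stdout.write('done')"]
    proc = subprocess.Popen(child,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == '__main__':
    rc, out, err = capture_fast_child()
    print('rc=%d out=%r' % (rc, out))
```

On a platform where communicate() behaves, out should contain the child's output even though the child has already exited by the time it is read. The problem described above would presumably show up as an empty out.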

comment:3 by dmik, 7 years ago

Resolution: fixed
Status: new → closed

The puzzle is solved in #275: r1254 switches to pipe() instead of socketpair(), which makes multi-stage piping as well as communicate() work really well.
