Opened 6 years ago

Closed 5 years ago

#284 closed defect (worksforme)

Inconsistent realpath() and stat() behavior

Reported by: dmik Owned by: bird
Priority: normal Milestone: libc-0.6.6
Component: libc Version: 0.6.5
Severity: normal Keywords:
Cc:

Description

The behavior of realpath() and stat() for paths containing path rewrite components is inconsistent.

In particular, realpath() behaves as if such components were symlinks. However, stat() doesn't recognize them as symlinks and will not report the S_IFLNK mode bits for them.

For example, calling realpath("/@unixroot/usr") will give you C:/usr (if UNIXROOT is set to C:). Given this, one might expect "/@unixroot" to be a symlink, but stat("/@unixroot") shows that this is a false assumption.

This is not only a theoretical question. There is at least one program that is broken because of that: Python (at least the version hosted here and distributed in our RPM that supports symlinks and therefore makes a difference between abspath and realpath).

Its os.path.realpath function is implemented not as a call to the libc realpath() function (as one might expect) but as a loop that walks over the individual path components and calls readlink on those for which islink returns true. However, islink (ultimately backed by stat()) returns false for paths like /@unixroot. First of all, it returns false because stat() itself fails (see #283), but even if that failure were fixed and stat() started to work, the S_IFLNK bits would not be set, so islink would still return false.
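To make the failure mode concrete, here is a minimal sketch of such a component-wise loop. It is not Python's actual os.path.realpath code: the walk_realpath name, the injected islink/readlink callables and the links table are all hypothetical, and chained links are deliberately not followed.

```python
import posixpath


def walk_realpath(path, islink, readlink):
    """Resolve an absolute path one component at a time (simplified sketch)."""
    resolved = "/"
    for part in path.strip("/").split("/"):
        candidate = posixpath.join(resolved, part)
        if islink(candidate):
            # Substitute the link target; join() also copes with absolute targets.
            target = readlink(candidate)
            candidate = posixpath.normpath(posixpath.join(resolved, target))
        resolved = candidate
    return resolved


# Hypothetical link table standing in for the file system.
links = {"/opt/current": "/opt/v2"}

# A component that islink() reports as a link gets resolved.
print(walk_realpath("/opt/current/bin", links.__contains__, links.__getitem__))
# -> /opt/v2/bin

# A rewrite prefix is never reported as a link, so the path comes back unchanged.
print(walk_realpath("/@unixroot/usr", links.__contains__, links.__getitem__))
# -> /@unixroot/usr
```

Because the loop only ever substitutes components that islink() accepts, a prefix like /@unixroot passes through untouched no matter what libc's realpath() would do with it.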

This behavior of islink completely breaks os.path.realpath for paths containing /@unixroot and similar prefixes. In such cases realpath just returns the same value as os.path.abspath.

I discovered this problem while trying to fix Python's virtualenv script (used to create directory-local Python environments on the fly). Virtualenv does some path rebasing which fails on OS/2 for Python installed from RPM because it can't detect that C:/usr and /@unixroot/usr are the same thing. Virtualenv originally compares paths after calling os.path.abspath. Replacing that with os.path.realpath would fix the problem, but we can't do that since realpath itself doesn't work.
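The comparison problem can be illustrated with a small sketch. This is not virtualenv's code: same_location and fake_realpath are hypothetical names, posixpath.normpath stands in for a purely lexical normalization such as abspath, and fake_realpath merely imitates a realpath that applies the /@unixroot rewrite with UNIXROOT assumed to be C:.

```python
import posixpath


def same_location(a, b, normalize):
    """Compare two paths after passing both through the same normalizer."""
    return normalize(a) == normalize(b)


def fake_realpath(path, unixroot="C:"):
    """Hypothetical stand-in for a realpath that honors the /@unixroot rewrite."""
    prefix = "/@unixroot"
    if path == prefix or path.startswith(prefix + "/"):
        path = unixroot + path[len(prefix):]
    return posixpath.normpath(path)


# Lexical normalization cannot tell that both names refer to the same directory.
print(same_location("C:/usr", "/@unixroot/usr", posixpath.normpath))  # -> False

# A resolver that applies the rewrite makes the two paths compare equal.
print(same_location("C:/usr", "/@unixroot/usr", fake_realpath))  # -> True
```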

Change History (5)

comment:1 Changed 6 years ago by dmik

  • Component changed from baselayout to libc
  • Priority changed from highest to high
  • Version set to 0.6.5

comment:2 Changed 6 years ago by bird

  • Milestone set to libc-0.6.6
  • Priority changed from high to normal
  • Severity changed from blocker to normal

comment:3 Changed 6 years ago by bird

  • Owner set to bird
  • Status changed from new to accepted

comment:4 Changed 5 years ago by dmik

BTW, readlink() behaves just like stat() and returns EINVAL if you give it something that starts with /@unixroot. Also wrong (given that realpath() works on such a path).

comment:5 Changed 5 years ago by bird

  • Resolution set to worksforme
  • Status changed from accepted to closed

The readlink() problem you mention in comment:4 is related to using UNIXROOT=X: instead of UNIXROOT=X:/, see #283.

I think you misunderstand the way the rewrites work in kLibC 0.6.x (this will probably remain in 0.7.x, but may be changed some day). The /@unixroot (only with chroot or UNIXROOT), /@tmpdir, /@system_drive, /@system_root and /@executable_path prefixes are not represented as symbolic links anywhere - there is currently no virtual file system that they are attached to. The path rewriter component from 0.5 that they are implemented with works a bit like /usr/bin/sed: the presence of a valid UNIXROOT makes us add an s,^/@unixroot,$UNIXROOT, rule to the rewriter. The rules are applied during path resolution, where the chroot '/' and symbolic links are handled as well, but it's a completely separate step.
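As a rough illustration of the sed analogy (not the kLibC implementation, which is written in C), the sketch below builds a list of substitution rules and applies them to a path. The make_rewriter name and its unixroot parameter are hypothetical, and the pattern is the plain ^/@unixroot from the rule above, with no attempt at slash or component-boundary handling.

```python
import re


def make_rewriter(unixroot=None):
    """Build a path rewriter from sed-like rules (illustrative sketch only)."""
    rules = []
    if unixroot:
        # Equivalent in spirit to: s,^/@unixroot,$UNIXROOT,
        rules.append((re.compile(r"^/@unixroot"), unixroot))

    def rewrite(path):
        for pattern, replacement in rules:
            # A function replacement keeps backslashes in the value literal.
            path = pattern.sub(lambda _m: replacement, path, count=1)
        return path

    return rewrite


rewrite = make_rewriter("C:")
print(rewrite("/@unixroot/usr"))  # -> C:/usr
print(rewrite("/usr/lib"))        # -> /usr/lib

# Without a valid UNIXROOT no rule is installed and the path is left alone.
print(make_rewriter()("/@unixroot/usr"))  # -> /@unixroot/usr
```

Nothing in this model looks like a link: the prefix is substituted textually before any file system object is examined, which is why no stat-style call ever sees a symlink there.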

For various reasons, realpath() will resolve through the path rewriter and translate /@unixroot to $UNIXROOT even though lstat("/@unixroot") will not indicate that it is a link (as the link is already resolved by the time the actual stat code fills in the stat struct). One of the reasons for resolving the rewritten paths into real paths is that you can pass the result to other programs and get the same interpretation when that program is running. For interacting with OS/2 programs, there is also the _realrealpath() function that goes one step further ('/' in a chroot'ed process).

I'm afraid this is much too big a change for 0.6.x, so you'll just have to deal with it for the present.
