Opened 10 years ago

Closed 10 years ago

#26 closed defect (fixed)

ash: Import sources

Reported by: dmik Owned by:
Priority: major Milestone:
Component: shell Version:
Severity: Keywords:
Cc:

Description

We need to have a copy of ASH sources in ports so that we can fix OS/2 specific bugs in it.

Attachments (2)

shell-perf.cmd (246 bytes ) - added by dmik 10 years ago.
shell-perf.sh (560 bytes ) - added by dmik 10 years ago.


Change History (24)

comment:1 by dmik, 10 years ago

Type: task → defect

This ticket needs to be done to resolve #27.

comment:2 by dmik, 10 years ago

Note that the current ASH RPM uses a binary (which was AFAIR built by Knut back in 2006).

comment:3 by dmik, 10 years ago

#42 also depends on that.

comment:4 by Silvan Scherrer, 10 years ago

Component: shell

comment:5 by dmik, 10 years ago

There are actually quite a few ash variants; the full story is available in this nice article: http://www.in-ulm.de/~mascheck/various/ash/. After studying it a bit, I decided to start the repository from the latest NetBSD variant (which looks like a source for many other variants), taken from here: http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/sh/. This is already done.

Note that since the original ash has no concept of a version number, I think it is best to use the YYYY.MM.DD format as the version number (both in SVN and in a future RPM).

Note that there is already an existing OS/2 port of NetBSD ash maintained by Knut Osmundsen within his LIBC tree, located here: http://trac.netlabs.org/libc/browser/trunk/ash. However it's quite outdated (based on the NetBSD sources from year 2005 if the commit message is correct). We will definitely take some of his patches.

We may also wish to take some patches from other ash variants if we think they make sense for OS/2, primarily in terms of speed.

The next task is to make the current trunk build on OS/2. Shouldn't be too difficult.

Note that once this ticket is done I will also fix a bunch of annoying ash problems for which dedicated tickets already exist: #27, #42, #58, #60.

comment:6 by dmik, 10 years ago

In fact, the original NetBSD version (which I already imported) depends on the NetBSD build environment (the main Makefile includes some bsd.*.mk files). Knut wrote his own Makefile.kmk file for that, but I think I will go a slightly different way. There is a modern (and still maintained) Debian Linux port of ash called dash (http://git.kernel.org/cgit/utils/dash/dash.git) which is also based on NetBSD. It comes with a normal autotools build setup, which we already support relatively well, so we don't have to maintain makefiles manually as we would if we stuck with Makefile.kmk.

For this reason, I will now try to build dash, and if it appears to work I will import a snapshot of that git repository to vendor and re-create trunk based on it. Then I will apply Knut's OS/2-related patches (there are actually just a few of them).

comment:7 by dmik, 10 years ago

It seems to configure and start building but needs some OS/2-related patching to go on. Instead of importing dash into the ash repo, I will create a separate repo called dash for it. We will then decide how we sort out RPM packages.

comment:8 by dmik, 10 years ago

With r1102, dash is imported and built. Now I need to test it and see whether some of Knut's fixes are necessary (what I have already fixed was also fixed by Knut, but I did it a bit better).

comment:9 by dmik, 10 years ago

While digging into dash, I found an interesting thing related to fork(). It appears that the NetBSD sources (imported as ash) as well as Knut's sources have support for vfork(); dash, however, lacks this support. This is strange since, according to the dash history, it has never been there: the first import of ash was already missing it, and this first import (and a number of further recent ones) happened when vfork() was already in NetBSD. This means that dash is actually derived from some NetBSD deviation that either had vfork() ripped out or predates vfork() support in NetBSD ash itself (which was added in rev 1.62 of eval.c on Fri Sep 27 20:24:36 2002 UTC).

It's good to know that vfork() is meant as a very light version of fork(): it doesn't create a full clone of the parent process, it only shares some data segment with it. According to some calculations I found on the net, this makes vfork() 4-5 times faster than fork(), but at the cost of significant limitations: basically, a vfork()ed child is only allowed to do an exec() or an _exit() call; everything else is not guaranteed to work. I.e. vfork() + exec() is actually very similar to our spawn() (or to POSIX posix_spawn(), which modern LIBC supports).
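The difference between the fork() + exec() pattern and a single spawn-style call described above can be sketched as follows. This is only a minimal illustration in Python (used here as a thin wrapper over the POSIX calls, not the language ash is written in). The function names run_with_fork and run_with_spawn are hypothetical, vfork() itself is not exposed by Python, and nothing here reflects the OS/2 spawn() API or the actual dash code.

```python
import os


def _exit_code(status):
    """Translate a waitpid() status into an exit code (-1 if killed by a signal)."""
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def run_with_fork(argv):
    """Classic fork() + exec(): the child starts as a copy of the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: either replace the process image or leave immediately.
        # These are the only two things a vfork()ed child may portably do.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)


def run_with_spawn(argv):
    """posix_spawnp(): create the child and load the program in one call."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)
```

Both functions return the child's exit status, e.g. run_with_spawn(["sh", "-c", "exit 3"]). Whatever a shell does between fork() and exec() (redirections, variable changes) would have to be expressed differently in the spawn variant, which is presumably where most of the conversion work would lie.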

However, this doesn't help us much in the case of ash, since ash relies on unspecified behavior related to the data segments shared between parent and child, which means the child still uses some of the parent's variables (a violation of POSIX, strictly speaking; perhaps that is why vfork() was ripped out of dash at some stage). If kLIBC provided vfork() with the same behavior, it could give us a good speed-up as well, I suppose. But vfork() is still marked by Knut as @todo (and there is no ticket about that, so I doubt he plans to do it in the near future).

Anyway, once done with dash, I will have a quick look at another ash alternative called yash to see if it's possible to get rid of fork there. Getting rid of fork on OS/2 would give us an even better speed-up, I suppose, because forking seems to be more expensive on OS/2, where it is emulated at the userspace level (instead of the native kernel-level support on *nix).

by dmik, 10 years ago

Attachment: shell-perf.cmd added

by dmik, 10 years ago

Attachment: shell-perf.sh added

comment:10 by dmik, 10 years ago

Did some tests. My machine is Intel Core2 Quad Q6600 @ 2.40GHz (PassMark index 2991) and I get the following results with the attached shell-perf scripts when running 1000 cycles (shell-perf.cmd <shell> 1000):

  • dash: 39-40 sec
  • RPM ash (Knut's): 39-40 sec

Which means we don't get any significant performance gain. But that's expected due to the use of fork(), which is the major show stopper here.

BTW, for comparison, on my Mac machine which is Intel Core i7-3770 @ 3.40GHz (index 9373) I get the following results:

  • dash: 5-6 sec
  • bash: 13-14 sec

It is expected to be faster since the CPU is 3 times faster. And bash is known to be much slower than simple ash-like shells. However, performance on OS/2 is still disproportionately bad because of fork() emulation. We should still consider rewriting it using spawn(); it will require some work, but not much, actually.
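The claim that the OS/2 result is worse than the CPU difference alone would explain can be checked with a little arithmetic on the figures quoted above. The sketch below uses the midpoints of the reported ranges and assumes the PassMark index is a fair proxy for how fast each CPU runs this workload, which is a rough assumption at best; the function name is hypothetical.

```python
# Figures quoted in this comment (midpoints of the reported ranges).
OS2_DASH_SEC = (39 + 40) / 2   # Core2 Quad Q6600, PassMark index 2991
MAC_DASH_SEC = (5 + 6) / 2     # Core i7-3770, PassMark index 9373
OS2_PASSMARK = 2991
MAC_PASSMARK = 9373


def slowdown_beyond_cpu(slow_sec, fast_sec, slow_index, fast_index):
    """Return how much slower a run is than the CPU index ratio predicts."""
    time_ratio = slow_sec / fast_sec
    cpu_ratio = fast_index / slow_index
    return time_ratio / cpu_ratio


cpu_ratio = MAC_PASSMARK / OS2_PASSMARK
time_ratio = OS2_DASH_SEC / MAC_DASH_SEC
excess = slowdown_beyond_cpu(OS2_DASH_SEC, MAC_DASH_SEC, OS2_PASSMARK, MAC_PASSMARK)

print(f"CPU index ratio:    {cpu_ratio:.1f}x")
print(f"dash time ratio:    {time_ratio:.1f}x")
print(f"unexplained factor: {excess:.1f}x")
```

With these inputs dash on OS/2 takes roughly 7 times longer on a CPU whose index is roughly 3 times lower, leaving a factor of a bit more than 2 that the hardware does not account for.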

Here are linux-world comparisons of bash and dash, just for reference: https://gist.github.com/mlafeldt/1663556 and http://www.pixelbeat.org/programming/shell_script_mistakes.html.


comment:11 by dmik, 10 years ago

For further reference: my support for executable extensions differs from Knut's in that I built it into the normal padvance() function (used everywhere) instead of doing custom extension processing in each and every place. This concentrates all the logic in a single place, which simplifies maintenance etc.
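The idea of handling extensions inside the one routine that walks PATH, rather than at every call site, can be sketched like this. It is a hypothetical Python illustration, not the actual padvance() code: the names candidate_paths and find_command and the extension list are assumptions, and the extensions the port really tries are not stated in this ticket.

```python
import os

# Hypothetical extension list; the empty string tries the name as given.
EXE_EXTENSIONS = ("", ".exe")


def candidate_paths(name, search_path, extensions=EXE_EXTENSIONS):
    """Yield every PATH candidate for name, expanded with each extension."""
    for directory in search_path.split(os.pathsep):
        base = os.path.join(directory or ".", name)
        for ext in extensions:
            yield base + ext


def find_command(name, search_path, extensions=EXE_EXTENSIONS):
    """Return the first candidate that exists as a file, or None."""
    for path in candidate_paths(name, search_path, extensions):
        if os.path.isfile(path):
            return path
    return None
```

Because every lookup goes through candidate_paths(), callers such as find_command() need no extension logic of their own, which is the benefit described above.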

It's also worth noting that, along with vfork(), the dash variant lacks custom hash magic (#!) processing (which is present in the original BSD ash), relying on the fact that the exec() LIBC call should do that. And this is good for us too because our kLIBC exec() does it pretty nicely (especially with the recent LIBC 0.6.6 fixes). This, in particular, solved #42 with no additional effort.

comment:12 by Yuri Dario, 10 years ago

Here (Core2 Duo P8600 2.4GHz) I get around 27 seconds for ash.exe (original 2006 build), while RPM sh takes 2.5 seconds more (same sources but p4 optimized...).

comment:13 by Yuri Dario, 10 years ago

Just to compare, using the same PC under Ubuntu 12.04: sh 6.5 sec, bash 16 sec.

comment:14 by dmik, 10 years ago

Yuri, thanks for the tests. This also proves that fork itself, as well as its non-nativeness on OS/2, is a show stopper.

BTW, in my RPM build of dash I will disable job control completely. Our TTY emulation is too dumb to allow for it (if I get it right), so it makes no sense to print the infamous "can't access tty; job control turned off" message each time the shell is started in interactive mode.

comment:15 by dmik, 10 years ago

Regarding the RPM package naming scheme. Since we are going to use the dash shell as our main POSIX shell that replaces the shell currently provided by ash, we need to place dash into a separate package. We also need to have the ability to have both ash and dash installed (to work around possible incompatibility issues in scripts) and, consequently, a nice way to switch the default system shell between ash and dash on the fly.

This is achieved as follows: there will be the dash package that provides dash.exe and the ash package that provides ash.exe. Each will have a separate sub-package, dash-sh and ash-sh respectively, and both sub-packages will provide sh.exe as a copy/symlink of the respective shell. Given that, the user will be able to install both dash and ash but only one of dash-sh or ash-sh.

Regarding dash-sh and ash-sh. Given that you can't install both packages at the same time (since they provide the same executable) and that many other packages already depend on sh.exe, you won't be able to e.g. first remove ash-sh and then install dash-sh w/o using ugly --force switches that break installation consistency. Happily, there is a nice way to swap conflicting packages that provide the same binary, using the yum shell (found this with Yuri's hint):

# yum shell
> remove ash-sh
> install dash-sh
> run

This is not very convenient for the end user but acceptable for developers. I really wonder why yum doesn't already provide an option that would automate these commands, but we surely need to provide one later.

I actually already built dash and dash-sh and tested this approach: it worked smoothly. I also need to modify the ash package and create the ash-sh package. The question here, however, is how to make yum update work correctly in such a case. I have yet to find that out.

However, using dash as the system shell brought the first problem: running basically any configure script results in:

D:/Coding/ports/dash/trunk/configure: 2048: D:/Coding/ports/dash/trunk/configure: 5: Operation not permitted

This has something to do with file descriptor redirection. I need to debug that...

comment:16 by dmik, 10 years ago

It turned out to be a double problem, actually. First, I found a new bug in kLIBC, see http://trac.netlabs.org/libc/ticket/344. I worked around it in r1125.

Next, it was dash's own change that replaced access() with some custom code that analyzes mode bits to implement the file tests of the test built-in. This, however, doesn't work well in kLIBC because not all files have proper permission bits (e.g. here /@unixroot/usr/bin/gcc.exe has mode 0600, which means it is not executable from the Unix point of view, and this made configure fail to find a working compiler). Knut's ash works because it uses access(), which reports that any file is executable if asked so (with X_OK). In r1126 I changed it to use access() on OS/2.
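The two ways of answering "is this file executable?" contrasted above can be put side by side. This is a hedged Python sketch of the general technique only: the function names are hypothetical, dash's real mode-bit code is not reproduced, and on a typical Unix system the two checks usually agree, whereas under kLIBC (per the description above) access() with X_OK succeeds even when no execute bit is set.

```python
import os
import stat


def executable_by_mode(path):
    """Decide from the permission bits alone, as a mode-bit check would."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def executable_by_access(path):
    """Ask the C library's access(), which may apply platform-specific rules."""
    return os.access(path, os.X_OK)
```

A file with permission bits rw------- fails executable_by_mode() everywhere, which matches the gcc.exe case that made configure give up.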

Now dash seems to work well as the sh replacement. I tried running autotools with it, then configure, then make (all on the dash project itself). All worked.

comment:17 by dmik, 10 years ago

The .spec file is done but I will actually build the full set of RPMs after I get in touch with Yuri. There are some pending questions that this .spec depends on (see http://trac.netlabs.org/rpm/ticket/134#comment:5) and I want to discuss them to avoid distributing something incorrect.

We must also find a way to trigger installation of a new package when the user just does yum update. We need to force installation of ash-sh for all users that have ash installed (along with updating ash itself, of course).

comment:18 by Yuri Dario, 10 years ago

Installation of ash-sh should be automatic since it provides /@unixroot/bin/sh and ash.rpm doesn't. You need to make sure ash doesn't provide it.

comment:19 by dmik, 10 years ago

So you want to say that yum update (given as is, w/o any arguments) will automatically install a new package if I do so? I can't try it now since I can't build ash here (see below).

Also, I'm moving the discussion from http://trac.netlabs.org/rpm/ticket/134#comment:5 here as it's a more appropriate place:

  1. I now recall that we already discussed it and it turned out that our rpm doesn't process the platform/<name>/macros files at all and the per-platform optimization options are taken from rpmrc instead. Do you recall why that is so?
  2. I didn't get it. Can you explain to me exactly why /@unixroot/bin/sh ends up among the dependencies of many packages? This is not a real file (so it never actually gets installed) and other packages never refer to it explicitly. Why /@unixroot/bin/sh and not, say, /@unixroot/FOO_BAR? How to change that?
  3. Could you please search for yacc on your machine to see where it comes from?

comment:20 by Yuri Dario, 10 years ago

  1. Since it was not a known bug, it has never been tested or debugged.
  2. Somewhere in the rpm source code or rpm scripts.
  3. It is a 2006 file, maybe from some original libc binary build.

comment:21 by dmik, 10 years ago

  1. Ok, created http://trac.netlabs.org/rpm/ticket/135 for it.
  2. Could you please find it? I can't. The current value /@unixroot/bin/sh is obviously wrong; we need to think about how to change that. I created http://trac.netlabs.org/rpm/ticket/137 for that.
  3. Can you please give me a link or drop me a zip?

comment:22 by dmik, 10 years ago

Resolution: fixed
Status: new → closed

Finally, everything is done. I have just released both the brand new DASH packages (dash and dash-sh) and the updated Ash packages (ash and ash-sh). It turns out that yum update is smart enough to detect that /@unixroot/usr/bin/sh.exe is now provided by ash-sh: when it offers to update ash, it installs both automatically.

As I wrote earlier in comment:15, it's a piece of cake to switch from ash-sh to dash-sh now using Yum Shell (and vice versa, of course). I even described it in detail on the RpmHowToEndUsers page so that everybody can repeat it.

Now we can have both Ash and DASH installed at the same time and still switch the system shell implementation (i.e. which one of them provides sh.exe) on the fly using the above method.

This task should be considered more than done now. Closing.
