Opened 10 years ago
Closed 10 years ago
#26 closed defect (fixed)
ash: Import sources
Reported by: | dmik | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | shell | Version: | |
Severity: | Keywords: | ||
Cc: |
Description
We need a copy of the ASH sources in ports so that we can fix OS/2-specific bugs in them.
Attachments (2)
Change History (24)
comment:1 by , 10 years ago
Type: | task → defect |
---|
comment:2 by , 10 years ago
Note that the current ASH RPM uses a binary (which was AFAIR built by Knut back in 2006).
comment:4 by , 10 years ago
Component: | → shell |
---|
comment:5 by , 10 years ago
There are actually quite a few ash variants; the full story is told in this nice article: http://www.in-ulm.de/~mascheck/various/ash/. After studying it a bit I decided to start the repository from the latest NetBSD variant (which looks like the source of many other variants), taken from here: http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/sh/. This is already done.
Note that since the original ash doesn't have a concept of a version number, I think it is best to use the YYYY.MM.DD format as the version number (both in SVN and in a future RPM).
Note that there is already an existing OS/2 port of NetBSD ash maintained by Knut Osmundsen within his LIBC tree, located here: http://trac.netlabs.org/libc/browser/trunk/ash. However, it's quite outdated (based on the NetBSD sources from 2005, if the commit message is correct). We will definitely take some of his patches.
We may also wish to take some patches from other ash variants if we consider that they make sense for OS/2, primarily in terms of speed.
The next task is to make the current trunk build on OS/2. Shouldn't be too difficult.
Note that once this ticket is done I will also fix a bunch of annoying ash problems for which dedicated tickets already exist: #27, #42, #58, #60.
comment:6 by , 10 years ago
In fact, the original NetBSD version (which I already imported) depends on the NetBSD build environment (the main Makefile includes some bsd.*.mk files). Knut wrote his own Makefile.kmk file for that, but I think I will go a slightly different way. There is a modern (and still maintained) Debian Linux port of ash called dash (http://git.kernel.org/cgit/utils/dash/dash.git) which is also based on NetBSD. It contains the normal autotools build stuff, which we already support relatively well, so we don't have to maintain makefiles manually as we would if we stuck with Makefile.kmk.
For this reason, I will now try to build dash, and if it appears to work I will import a snapshot of that git repository into vendor and re-create trunk based on that. Then I will apply Knut's OS/2-related patches (there are actually just a few of them).
comment:7 by , 10 years ago
It seems to configure and start building but needs some OS/2-related patching to go on. Instead of importing dash into the ash repo, I will create a separate repo called dash for it. We will then decide how we sort out RPM packages.
comment:8 by , 10 years ago
With r1102, dash is imported and built. Now I need to test it and see whether any of Knut's fixes are necessary (what I already fixed was also fixed by Knut, but I did it a bit better).
comment:9 by , 10 years ago
While digging into dash, I found an interesting thing related to fork(). It appears that the NetBSD sources (imported as ash) as well as Knut's sources have support for vfork(), but dash lacks it. This is strange since, according to the dash history, it has never been there: the first import of ash was already missing it, and this first import (and a number of later ones) happened when vfork() was already in NetBSD. This means that dash actually descends from some NetBSD deviation that either had vfork() ripped out or predates vfork() support in NetBSD ash itself (which arrived in rev 1.62 of eval.c on Fri Sep 27 20:24:36 2002 UTC).
It's good to know that vfork() is meant as a very light version of fork(): it doesn't create a full clone of the parent process; the child shares the parent's memory instead. According to some calculations I found on the net, this makes vfork() 4-5 times faster than fork(), but at the cost of significant limitations: basically, a vfork()ed child is only allowed to call exec() or _exit(); everything else is not guaranteed to work. I.e. vfork() + exec() is actually very similar to our spawn() (or to POSIX posix_spawn(), which modern LIBCs support).
However, this doesn't help us much in the case of ash, since ash relies on unspecified behavior of the data shared between parent and child: the child still uses some of the parent's variables. Strictly speaking, this is a violation of POSIX, and perhaps that is why vfork() was ripped out of dash at some stage. If kLIBC provided vfork() with the same behavior, it could give us a good speedup as well, I suppose. But vfork() is still marked by Knut as @todo (and there is no ticket about it, so I doubt he plans to do it in the near future).
Anyway, once done with dash, I will have a quick look at another ash alternative called yash to see if it's possible to get rid of fork there. Getting rid of fork on OS/2 would give us an even better speedup, I suppose, because forking seems to be more expensive on OS/2, where it is emulated at the userspace level (instead of being supported natively by the kernel as on *nix).
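To make the fork() + exec() versus spawn comparison above a bit more concrete, here is a minimal sketch in Python for a POSIX host. It is not dash or kLIBC code: the helper names run_fork_exec and run_posix_spawn are made up for illustration, and it only shows the two ways of starting a child, not how either behaves on OS/2.

```python
import os
import sys


def run_fork_exec(argv):
    """Start argv the classic way: clone the process, then replace the clone."""
    pid = os.fork()
    if pid == 0:
        # Child: do nothing but exec or _exit, as a vfork()ed child would.
        try:
            os.execv(argv[0], argv)
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def run_posix_spawn(argv):
    """Start argv in one call; no parent code ever runs in a cloned child."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    # Hypothetical workload: a child that exits with a known status.
    cmd = [sys.executable, "-c", "raise SystemExit(3)"]
    print(run_fork_exec(cmd), run_posix_spawn(cmd))
```

Both helpers end up with the same child and the same exit status. The difference is that the spawn variant never needs a copy of the parent, which is the part that is expensive to emulate.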
by , 10 years ago
Attachment: | shell-perf.cmd added |
---|
by , 10 years ago
Attachment: | shell-perf.sh added |
---|
comment:10 by , 10 years ago
Did some tests. My machine is an Intel Core2 Quad Q6600 @ 2.40GHz (PassMark index 2991) and I get the following results with the attached shell-perf scripts when running 1000 cycles (shell-perf.cmd <shell> 1000):
- dash: 39-40 sec
- RPM ash (Knut's): 39-40 sec
Which means we don't get any significant performance gain. But that's expected due to the usage of fork(), which is the major show stopper here.
BTW, for comparison, on my Mac machine, which is an Intel Core i7-3770 @ 3.40GHz (index 9373), I get the following results:
- dash: 5-6 sec
- bash: 13-14 sec
- yash: 11-12 sec
It is expected to be faster since the CPU is 3 times faster. And bash is known to be much slower than simple ash-like shells. However, performance on OS/2 still sucks disproportionately because of fork() emulation. We should still consider rewriting it using spawn(): it will require some work but not much, actually.
Here are linux-world comparisons of bash and dash, just for reference: https://gist.github.com/mlafeldt/1663556 and http://www.pixelbeat.org/programming/shell_script_mistakes.html.
BTW, I added yash to the Mac comparison table. It shows results close to bash, so I think I will not spend time on the OS/2 port of it right now.
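For anyone who wants to repeat this kind of measurement, here is a rough Python sketch of a timing harness. It is not the attached shell-perf script (its contents are not reproduced in this ticket). The helper name time_shell and the workload, a loop that opens an empty subshell on every cycle (which typically forces the shell to fork), are assumptions made for illustration.

```python
import subprocess
import time


def time_shell(shell, cycles):
    """Return the seconds the given shell needs for `cycles` subshell forks."""
    # Hypothetical workload: (:) runs a no-op in a subshell on each cycle.
    script = f'i=0; while [ "$i" -lt {cycles} ]; do (:); i=$((i+1)); done'
    start = time.perf_counter()
    subprocess.run([shell, "-c", script], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    for shell in ("sh",):  # add "dash", "bash", "yash" if they are installed
        print(shell, round(time_shell(shell, 1000), 2), "sec")
```

Since the loop body does nothing else, the elapsed time is dominated by process creation, which is the cost discussed above.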
comment:11 by , 10 years ago
For further reference. My support for executable extensions differs from Knut's in that I built it into the normal padvance() function (used everywhere) instead of doing custom extension processing in each and every place. This concentrates all the logic in a single place, which simplifies maintenance.
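The idea of keeping extension handling inside the single path-search routine can be sketched roughly like this in Python. This is an illustration only, not the padvance() code from dash: the function name find_command and the extension list are hypothetical, since the ticket does not say which extensions are tried or in which order.

```python
import os
import tempfile

# Hypothetical extension list; "" means "try the name as given" first.
EXE_EXTENSIONS = ("", ".exe", ".cmd")


def find_command(name, path_dirs, extensions=EXE_EXTENSIONS):
    """Return the first existing candidate for name, or None if there is none."""
    for directory in path_dirs:
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "gcc.exe"), "w").close()
        print(find_command("gcc", [d]))
```

Every caller that walks the search path gets extension support for free, because the only place that knows about extensions is the search routine itself.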
It's also worth noting that, along with vfork(), the dash variant lacks custom hash magic (#!) processing (which is present in the original BSD ash), relying on the fact that the exec() LIBC call should do that. And this is good for us too because our kLIBC exec() does it pretty nicely (especially with the recent LIBC 0.6.6 fixes). And this, in particular, solved #42 with no additional effort.
comment:12 by , 10 years ago
Here (Core2 Duo P8600 2.4GHz) I get around 27 seconds for ash.exe (original 2006 build), while RPM sh takes 2.5 seconds more (same sources but P4-optimized...).
comment:13 by , 10 years ago
Just to compare, on the same PC under Ubuntu 12.04: sh 6.5 sec, bash 16 sec.
comment:14 by , 10 years ago
Yuri, thanks for the tests. This also proves that fork itself, as well as its non-nativeness on OS/2, is a show stopper.
BTW, in my RPM build of dash I will disable job control completely. Our TTY emulation is too dumb to allow for it (if I get it right), so it makes no sense to print the infamous "can't access tty; job control turned off" message each time the shell is started in interactive mode.
comment:15 by , 10 years ago
Regarding the RPM package naming scheme. Since we are going to use the dash shell as our main POSIX shell, replacing the shell currently provided by ash, we need to place dash into a separate package. We also need the ability to have both ash and dash installed (to work around possible incompatibility issues in scripts) and, consequently, a nice way to switch the default system shell between ash and dash on the fly.
This is achieved as follows: there will be the dash package that provides dash.exe and the ash package that provides ash.exe. Then each will have a separate sub-package, dash-sh and ash-sh, both of which provide sh.exe as a copy/symlink of the respective shell. Given that, the user will be able to install both dash and ash but only one of dash-sh or ash-sh.
Regarding dash-sh and ash-sh. Given that you can't install both packages at the same time (since they provide the same executable) and that many other packages already depend on sh.exe, you won't be able to e.g. first remove ash-sh and then install dash-sh w/o using ugly --force switches that break installation consistency. Happily, there is a nice way to swap conflicting packages that provide the same binary using the yum shell (found this with Yuri's hint):
# yum shell
> remove ash-sh
> install dash-sh
> run
This is not very convenient for the end user but acceptable for developers. I really wonder why yum doesn't already provide an option that would automate these commands, but we surely need to provide one later.
I actually already built dash and dash-sh and tested this approach: it worked smoothly. I also need to modify the ash package and create the ash-sh package. The question here, however, is how to make yum update work correctly in such a case. I have yet to find that out.
However, using dash as the system shell brought the first problem: running basically any configure script results in:
D:/Coding/ports/dash/trunk/configure: 2048: D:/Coding/ports/dash/trunk/configure: 5: Operation not permitted
This has something to do with file descriptor redirection. I need to debug that...
comment:16 by , 10 years ago
It turned out to be a double problem, actually. First, I found a new bug in kLIBC, see http://trac.netlabs.org/libc/ticket/344. I worked around it in r1125.
Next, it was dash's own change that replaced access with some custom code that analyzes mode bits to implement the file tests of the test built-in. This, however, doesn't work well in kLIBC because not all files have proper permission bits (e.g. here /@unixroot/usr/bin/gcc.exe has 0x600 mode, which means it is not executable from the Unix point of view, and this made configure fail to find a working compiler). Knut's ash works because it uses access(), which reports any file as executable when asked (with X_OK). In r1126 I changed dash to use access on OS/2.
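The difference between the two kinds of check can be sketched as follows in Python on a POSIX host. This is not the dash code and not the r1126 change: the helper names are made up, and the sketch only contrasts interpreting the mode bits yourself with asking the C library via access().

```python
import os
import stat
import sys
import tempfile


def executable_by_mode_bits(path):
    """Unix-style check: true only if some execute permission bit is set."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def executable_by_access(path):
    """Let the C library / OS decide instead of interpreting the bits."""
    return os.access(path, os.X_OK)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    os.chmod(path, 0o600)  # a file without any execute bit
    print("mode bits:", executable_by_mode_bits(path))
    print("access():", executable_by_access(path))
    print("interpreter via access():", executable_by_access(sys.executable))
    os.remove(path)
```

On a platform where the permission bits don't reflect whether a file can actually be run, only the second helper has a chance of giving a useful answer, because it defers to whatever the platform's access() implements.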
Now dash seems to work well as the sh replacement. I tried running autotools with it, then configure, then make (all on the dash project itself). All worked.
comment:17 by , 10 years ago
The .spec file is done, but I will build the full set of RPMs only after I get in touch with Yuri. There are some pending questions that this .spec depends on (see http://trac.netlabs.org/rpm/ticket/134#comment:5) and I want to discuss them to avoid distributing not-so-correct things.
We must also find a way to cause a new package installation if the user just does yum update. We need to force installation of ash-sh for all users that have ash installed (along with updating ash itself, of course).
comment:18 by , 10 years ago
installation of ash-sh should be automatic since it provides /@unixroot/bin/sh and ash.rpm doesn't. You need to make sure ash doesn't provide it.
comment:19 by , 10 years ago
So you want to say that yum update (given as is, w/o any arguments) will automatically install a new package if I do so? I can't try it now since I can't build ash here (see below).
Also, I'm moving the discussion from http://trac.netlabs.org/rpm/ticket/134#comment:5 here as it's a more appropriate place:
- I now recall that we already discussed it and it turned out that our rpm doesn't process the platform/<name>/macros files at all and the per-platform optimization options are taken from rpmrc instead. Do you recall why that is so?
- I didn't get it. Can you explain to me exactly why /@unixroot/bin/sh ends up among the dependencies of many packages? This is not a real file (so it never actually gets installed) and other packages never refer to it explicitly. Why /@unixroot/bin/sh and not, say, /@unixroot/FOO_BAR? How to change that?
- Could you please search for yacc on your machine to see where it comes from?
comment:20 by , 10 years ago
1 since it was not a known bug, it has been never tested nor debugged
2 somewhere in rpm source code or rpm scripts
3 it is a 2006 file, maybe some original libc binary build
comment:21 by , 10 years ago
- Ok, created http://trac.netlabs.org/rpm/ticket/135 for it.
- Could you please find it? I can't. The current value /@unixroot/bin/sh is obviously wrong; we need to think about how to change that. I created http://trac.netlabs.org/rpm/ticket/137 for that.
- Can you please give me a link or drop me a zip?
comment:22 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Finally, everything is done. I have just released both the brand new DASH packages (dash and dash-sh) and the updated Ash packages (ash and ash-sh). It turns out that yum update is smart enough to detect that /@unixroot/usr/bin/sh.exe is now provided by ash-sh when it offers to update ash, and it installs both automatically.
As I wrote earlier in comment:15, it's a piece of cake to switch from ash-sh to dash-sh now using Yum Shell (and vice versa, of course). I even described it in detail on the RpmHowToEndUsers page so that everybody can repeat it.
Now we may have both Ash and DASH installed at the same time and yet switch the system shell implementation (i.e. which one of them provides sh.exe) on the fly using the above method.
This task should be considered more than done now. Closing.
This ticket needs to be done to resolve #27.