Opened 6 years ago

Closed 3 years ago

#11 closed defect (fixed)

Air-Boot v.1.1.0 hangs without displaying a menu

Reported by: thomabrown
Owned by: Ben Rietbroek
Priority: major
Milestone:
Component: Boot Manager
Version: 1.0
Keywords: Air-Boot hang
Cc: steve53@…

Description

Trying to install eCS 2.2 Beta II to a new drive. Used DFSee to set geometry, create new mbr, and a JFS partition. At the end of phase 1, the reboot hung with the 4 lines of Air-boot v.1.1.0 text displayed, but no menu.

There is only one drive in the system at this point, a SATA 500 GB Seagate ST500DM002.

Am going to try earlier versions of Air-Boot, v.1.0.7, and V.1.0.8.rc3.

Attachments (9)

dfswork201404162208.log (14.9 KB) - added by thomabrown 6 years ago.
DFSee log file from drive geo configuration
dfswork201404172201.log (19.3 KB) - added by thomabrown 6 years ago.
Current view, working PATA drive & non-booting SATA drive
CONFIG.SYS (6.3 KB) - added by thomabrown 6 years ago.
CONFIG.SYS from partially installed ecs22 Beta II
img_0041.jpg (42.5 KB) - added by thomabrown 6 years ago.
Volumes
img_0039.jpg (73.6 KB) - added by thomabrown 6 years ago.
dbb #3
img_0040.jpg (40.0 KB) - added by thomabrown 6 years ago.
dbb #3 after pressing Enter
dfswork201404211058.log (13.7 KB) - added by thomabrown 6 years ago.
DFSee log of current layout
img_0042.jpg (115.4 KB) - added by thomabrown 6 years ago.
dbb-4, Scroll Lock before menu
img_0044.jpg (71.2 KB) - added by thomabrown 6 years ago.
dbb-4, menu, ECS22B2 blinking

Change History (37)

comment:1 Changed 6 years ago by Ben Rietbroek

comment:2 in reply to:  description Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

Trying to install eCS 2.2 Beta II to a new drive. Used DFSee to set geometry, create new mbr, and a JFS partition. At the end of phase 1, the reboot hung with the 4 lines of Air-boot v.1.1.0 text displayed, but no menu.

There is only one drive in the system at this point, a SATA 500 GB Seagate ST500DM002.

Am going to try earlier versions of Air-Boot, v.1.0.7, and V.1.0.8.rc3.


Hi,

Which (10.x/11.x) version of DFSee did you use ?
Using DOS or OS/2 version ?
What commands did you use exactly to set the geometry ?
The partition you created, what is its starting sector number ?
Were there any USB devices inserted at the time of the reboot after phase 1 ?
Is your BIOS set to legacy or AHCI mode ?
If you used DFSee OS/2 version, which driver did the system use ?
DANI or AHCI ?

Thanks,

Ben.

comment:3 Changed 6 years ago by thomabrown

This may not relate to AIR-BOOT v.1.1.0.
I have tried V.1.0.8-RC3 and v.1.0.7 with the same results.

DFSee 11.7
OS/2 version
See attached DFSee log file: dfswork201404162208.log

BASEDEV=OS2AHCI.ADD
BASEDEV=DANIS506.ADD /!BIOS

There may have been a USB stick, FAT32 which contains, among other things, my eCS 2.0 key.

I will have to reboot to check the BIOS setting. There may not be one.
The config.sys statements are as above.

Changed 6 years ago by thomabrown

Attachment: dfswork201404162208.log added

DFSee log file from drive geo configuration

Changed 6 years ago by thomabrown

Attachment: dfswork201404172201.log added

Current view, working PATA drive & non-booting SATA drive

comment:4 Changed 6 years ago by thomabrown

I am assuming the system used the AHCI driver because that was first in config.sys.

comment:5 Changed 6 years ago by thomabrown

There does not seem to be a BIOS selection for AHCI vs compatibility mode, so must assume that the mobo does not know about AHCI & therefore uses compatibility mode. Correct?

In checking the above, I realized that I was booting from the SATA drive using AIR-BOOT v.1.0.8. It boots from the PATA drive just fine, but hangs before displaying the menu when trying to boot the installed-to-phase1 volume on the SATA drive. This is the exact same behavior as when booted directly from the SATA drive with AIR-BOOT v.1.1.0.

There must be something wrong with the D: partition on the SATA drive...

Any ideas as to what or how to test?

Thanks!

Changed 6 years ago by thomabrown

Attachment: CONFIG.SYS added

CONFIG.SYS from partially installed ecs22 Beta II

comment:6 in reply to:  3 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

This may not relate to AIR-BOOT v.1.1.0.
I have tried V.1.0.8-RC3 and v.1.0.7 with the same results.


That does not imply AiR-BOOT is without bugs :-)
If there is something wrong with the partition, and AiR-BOOT is
offered a faulty value to digest, it should protect itself from that.
Let's find out where AiR-BOOT stumbles...

comment:7 in reply to:  5 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

There does not seem to be a BIOS selection for AHCI vs compatibility mode, so must assume that the mobo does not know about AHCI & therefore uses compatibility mode. Correct?


Yes, one would think so...

In checking the above, I realized that I was booting from the SATA drive using AIR-BOOT v.1.0.8. It boots from the PATA drive just fine, but hangs before displaying the menu when trying to boot the installed-to-phase1 volume on the SATA drive. This is the exact same behavior as when booted directly from the SATA drive with AIR-BOOT v.1.1.0.


Firstly, when AiR-BOOT is started after Phase-1, it does not display the menu.
This is done so that the complete system is installed without AiR-BOOT interrupting
the half-way-through installation because it displays its menu, possibly with time-out
disabled.

Secondly, I am now confused about which 4 lines you mean.
Do you mean the AiR-BOOT signature displayed right after it starts ?
(The one that is also seen in the POST-BIOS screen if you press TAB while the menu is displayed)
Or do you mean the 4 lines in the blue colored box that is briefly shown when you start a system ?

In the first case the hang is probably when partitions are scanned.
That's very early in the AiR-BOOT code.

In the second case it happens just before starting the selected system.
This could be in AiR-BOOT or in the PBR-code of the started system, but if it is
the PBR-code then AiR-BOOT should at least have shown the line with the name
of the system being started and the process dots...

There must be something wrong with the D: partition on the SATA drive...


We'll see...

Any ideas as to what or how to test?


Yes.
First I'm going to see if I can replicate this situation.
In the meantime you can let me know if you are able to connect
the serial port of the machine to another one that runs a terminal program
so we can monitor AiR-BOOT debug output. (You need a serial cross-cable for that)
If not, and I cannot replicate the issue, I can send you a debug build that outputs
a few lines to the post-BIOS screen, so we can see how far it gets.

Laters,

Ben.

comment:8 Changed 6 years ago by thomabrown

The four lines:

AIR-BOOT v1.1.0 - (c) 1998-2013 Martin Kiewitz, Dedicated to Gerd Kiewitz

Build Date: 2013/04/05 10:01:00 [JWasm]

Boot Disk is Huge : No
eCS Install Phase 1 : WIN7

comment:9 in reply to:  8 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

The four lines:

AIR-BOOT v1.1.0 - (c) 1998-2013 Martin Kiewitz, Dedicated to Gerd Kiewitz

Build Date: 2013/04/05 10:01:00 [JWasm]

Boot Disk is Huge : No
eCS Install Phase 1 : WIN7


Ahh, now look:

AiR-BOOT wants to start a partition called WIN7 to continue Phase 2.
The LVM volume name of your fresh system should be displayed here: ECS22B2
But I cannot find any partition in your logfiles with the WIN7 label.
Maybe you renamed Win7Ult ?

+--<MBR disk 4></dev/hdd >-------+--------+----------<[ D1 ] >--------+
|06 |C:|Prim 07 Inst-FSys| 1|JFS |IBM 4.50|Win7Ult |Win7Ult Win7Ult | 100006.2|
|07 |Fd|Log  07 Inst-FSys| 5|JFS |IBM 4.50|ECS22B2 |eCS22B2 eCS22B2 |  10001.4|
|08 |Ge|Log  07 Inst-FSys| 6|JFS |IBM 4.50|SAVE    |SAVE    SAVE    | 366929.9|
|13 | |Partial Cylinder |-- ---- --- - - - -| | 2.5|

Anyway, what happens when you hard reset the system when AiR-BOOT hangs ?
Does it again enter phase 1 and hang ?

Some BIOS stuff happens at this stage, so instead of trying to replicate
your situation, I will create a debug-build that outputs its progress on the screen
so we can exactly locate where the shit hits the fan.

Please send an e-mail to: rousseau.ecsdev-at-gmail.com with "Ticket 11" in the header
so that I can send you this debug-build.

Thanks.

comment:10 Changed 6 years ago by Steven Levine

Cc: steve53@… added

comment:11 Changed 6 years ago by thomabrown

The WIN7 partition was intended for Windows 7 Ultimate, but I tried to install eCS 2.2 B 2 there since it is a primary. I thought it might have something to do with the problem. Same problem.
Both the reset button and a power cycle give the same result, the four lines from AIR-BOOT, and that's it.

comment:12 in reply to:  11 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

The WIN7 partition was intended for Windows 7 Ultimate, but I tried to install eCS 2.2 B 2 there
since it is a primary. I thought it might have something to do with the problem.

Ah, ok.
But that leaves the question why WIN7 is in the AiR-BOOT configuration
as the system to start. Installing eCS should have run SETBOOT.EXE (AB version)
during phase 1 to poke ECS22B2 in the AB configuration on disk.

But as far as I can tell now, the hang occurs earlier, even before using the
volume name to continue with Phase 2. So we might have to also look at the
above issue later.

Same problem.
Both the reset button and a power cycle give the same result, the four lines from AIR-BOOT, and
that's it.

Ok.
I just sent you the debug-builds.
Let's see what they turn up...

comment:13 Changed 6 years ago by erdmann

Also, make sure that you DO NOT have a USB stick plugged in on system start/reboot. That will confuse AIRBOOT, as either the BIOS or AIRBOOT (I am not sure which) will try to boot from the USB stick if your Mobo supports booting from one (whether it does is sometimes not all that obvious).

comment:14 in reply to:  13 Changed 6 years ago by Ben Rietbroek

Replying to erdmann:

Also, make sure that you DO NOT have a USB stick plugged in on system start/reboot. That will confuse AIRBOOT, as either the BIOS or AIRBOOT (I am not sure which) will try to boot from the USB stick if your Mobo supports booting from one (whether it does is sometimes not all that obvious).

The presence of USB devices during boot was among the first questions I asked.
The issue here is that AB on a PATA disk boots fine on the system, but on a SATA disk
on the same system it hangs either when doing a short delay based on the timertick
value in the BDA or when checking for a keypress using INT16h.
All disk accesses pass ok.

comment:15 Changed 6 years ago by thomabrown

First, holding down the Alt key when starting dbb #1 or #2 has no apparent effect.

The Win7 partition (primary) initially contained Win7 Ultimate, but I formatted it as JFS and tried installing eCS 2.2 B 2 on it just to see whether installing on a primary would change things. It did not.

I have deleted, recreated, and re-formatted it as JFS, hopefully without reference to Win7. It is just a placeholder, and is not being used in any other way.

After many restarts due to late hours, brain checks, a glass or two of red, etc. I have installed dbb #3. I am attaching three screen shots.

  1. Installation Volume Manager showing allocated volumes. (img_0041.jpg)
  2. The AB screen after a re-boot. No. 4 is blinking. (img_0039.jpg)

Pressed Enter.

  3. Hang after pressing Enter. (img_0040.jpg) I forgot to try pressing any keys at this point. Will do so and put another addendum to this ticket...

comment:16 Changed 6 years ago by thomabrown

Keyboard has no effect after #3.

00:05 on the clock. G'nite. ZZZZzzzzzz

comment:17 in reply to:  15 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

First, holding down the Alt key when starting dbb #1 or #2 has no apparent effect.

The Win7 partition (primary) initially contained Win7 Ultimate, but I formatted it as JFS and tried installing eCS 2.2 B 2 on it just to see whether installing on a primary would change things. It did not.

I have deleted, recreated, and re-formatted it as JFS, hopefully without reference to Win7. It is just a placeholder, and is not being used in any other way.


This kinda breaks our debugging context...

After many restarts due to late hours, brain checks, a glass or two of red, etc. I have installed dbb #3. I am attaching three screen shots.


I wish I could find those attachments :-)

  1. Installation Volume Manager showing allocated volumes. (img_0041.jpg)
  2. The AB screen after a re-boot. No. 4 is blinking. (img_0039.jpg)


Cannot see the screenshots, but if you mean checkpoint 4, are you saying AB
now does not get beyond checkpoint 4, while earlier the debug-builds went
all the way up to checkpoint 15 ?
Please install dbb-2 again to see if it still goes up to checkpoint 15.
This is important because you recreated and reformatted the partition.

Pressed Enter.

  3. Hang after pressing Enter. (img_0040.jpg) I forgot to try pressing any keys at this point. Will do so and put another addendum to this ticket...


Please attach the screenshots.

comment:18 in reply to:  16 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

Keyboard has no effect after #3.

00:05 on the clock. G'nite. ZZZZzzzzzz


He he :-)
When you're up and running again, please:

  • Produce a DFSee logfile with the current disk layout.
  • Also include the LVM information please.
  • Install dbb-1 and note the last checkpoint displayed and the volume it tries to start.
  • Do the same with dbb-2.
  • And the same with dbb-3.

Thanks !

comment:19 Changed 6 years ago by thomabrown

Sorry I forgot the screenshots... Will send them now. More later.

Changed 6 years ago by thomabrown

Attachment: img_0041.jpg added

Volumes

Changed 6 years ago by thomabrown

Attachment: img_0039.jpg added

dbb #3

Changed 6 years ago by thomabrown

Attachment: img_0040.jpg added

dbb #3 after pressing Enter

Changed 6 years ago by thomabrown

Attachment: dfswork201404211058.log added

DFSee log of current layout

comment:20 Changed 6 years ago by thomabrown

Re: comments by erdmann, The system won't boot from any USB device.

DFSee log with LVM attached. Disk 1 is what you want to look at.

dbb testing:

dbb #1 hangs after checkpoint 15. Keyboard has no effect. C-A-D does not work. Hard reset.

dbb #2 May have screwed this up, will re-try

dbb #3 No checkpoints displayed, just the menu.
Per the screenshot, No 04, HD 01, ECS22B2 is flashing.
Enter, tries to boot that drive >> 3rd screenshot (img_0040.jpg).
Hung at this point. KB has no effect.

comment:21 Changed 6 years ago by thomabrown

dbb #2 behaves the same as dbb #1. checkpoint 15, KB has no effect, Alt key has no effect.

I will try hooking up a serial cable to my Thinkpad *IF* I can find it in this mess.
I have a commitment this afternoon and early evening, but will try to put the pieces together and let you know.

comment:22 in reply to:  20 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

Re: comments by erdmann, The system won't boot from any USB device.

DFSee log with LVM attached. Disk 1 is what you want to look at.

dbb testing:

dbb #1 hangs after checkpoint 15. Keyboard has no effect. C-A-D does not work. Hard reset.

dbb #2 May have screwed this up, will re-try

dbb #3 No checkpoints displayed, just the menu.


The checkpoints are displayed on the post-BIOS screen.
Press TAB to switch between the post-BIOS screen and the Main Menu.

Per the screenshot, No 04, HD 01, ECS22B2 is flashing.
Enter, tries to boot that drive >> 3rd screenshot (img_0040.jpg).
Hung at this point. KB has no effect.


Ok, now look at the screenshot where the partition is started...
It shows only one dot of the progress dot-bar. That's because the timertick
routine is called here also to delay between printing the dots.

So it seems the timertick values in the BIOS Data Area (BDA) don't get updated.
While interrupts are enabled before checkpoint 1, it could be some BIOS service
is faulty and returns with interrupts disabled, preventing the values in the
BDA from getting incremented by the timertick interrupt. Since the BIOS found a
SATA-disk, it now most probably uses different code to access the device.

So I will send you a debug-build 4 that explicitly enables interrupts in the
timertick routine, allowing the timertick interrupt to update the values in the BDA.
Besides that, it will also display the status of the interrupt flag after BIOS disk
accesses, so we can track which one returns with interrupts disabled if that is indeed
the issue.
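
To illustrate the reasoning above, here is a minimal Python sketch of a tick-based delay. It is not AiR-BOOT's actual assembly code. It assumes a toy `Machine` class (a hypothetical name) whose counter stands in for the timertick value in the BDA and advances only while interrupts are enabled. The `max_polls` limit exists only so the example terminates, and `force_sti` models the explicit enabling of interrupts described for debug-build 4.

```python
class Machine:
    """Toy model: a tick counter that only advances while interrupts are enabled."""

    def __init__(self):
        self.interrupts_enabled = True
        self.bda_ticks = 0  # stands in for the timertick count kept in the BDA

    def poll(self):
        """One polling step; the timer interrupt only fires if interrupts are on."""
        if self.interrupts_enabled:
            self.bda_ticks += 1
        return self.bda_ticks


def delay(machine, ticks, force_sti=False, max_polls=1000):
    """Wait until `ticks` timer ticks have passed; return False if that never happens."""
    if force_sti:
        # Assumption: models enabling interrupts inside the timertick routine.
        machine.interrupts_enabled = True
    target = machine.bda_ticks + ticks
    for _ in range(max_polls):
        if machine.poll() >= target:
            return True
    # A real boot loader has no poll limit here and would simply spin forever.
    return False


m = Machine()
m.interrupts_enabled = False  # a faulty BIOS service returned with interrupts off
hung = not delay(m, 3)
recovered = delay(m, 3, force_sti=True)
print(hung, recovered)
```

With interrupts disabled the first call gives up, which corresponds to the hang after checkpoint 15; once interrupts are forced on, the same delay completes.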

A debug-build 5 will have a changed timertick routine that does not rely on the BDA,
but just does some delay using a loop. This debug-build should be able to start ECS22B2.
Don't press ESC to enable the timed boot because that will probably fail now.
As a precaution, interrupts are enabled again before starting the partition in case any eCS
bootloader code also relies on the BDA. But that does not prevent the faulty BIOS service from
disabling them again if the eCS loader uses the services.
So the system still might not boot, but then it would stall at some stage in the eCS bootloader.

The blinking is probably a video artifact.
What happens if you press up-key and then down again?
If it keeps blinking, the video adapter interprets a color-bit differently,
but we'll look into that later.

comment:23 in reply to:  21 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

dbb #2 behaves the same as dbb #1. checkpoint 15, KB has no effect, Alt key has no effect.

I will try hooking up a serial cable to my Thinkpad *IF* I can find it in this mess.
I have a commitment this afternoon and early evening, but will try to put the pieces together and let you know.


For that you'll need a debug-build with serial logging enabled.
Let's first see what dbb-4 and dbb-5 turn up, and if we can't pinpoint the problem yet,
we'll use a debug-build with serial logging to dump more internal state.

But my guess is that dbb-5 will be able to at least load and start the ECS22B2 partition loader.

Changed 6 years ago by thomabrown

Attachment: img_0042.jpg added

dbb-4, Scroll Lock before menu

Changed 6 years ago by thomabrown

Attachment: img_0044.jpg added

dbb-4, menu, ECS22B2 blinking

comment:24 Changed 6 years ago by thomabrown

dbb-4 works! It shows the checkpoints (pressed Scroll lock, img_0042.jpg), Enter shows the menu with ECS22B2 blinking (img_0044.jpg). Enter here started Phase 3, ran to completion.

I did not run dbb-5. Will do so if you wish.

I can also test the original version of Air-Boot from the DVD against the completed eCS 2.2 Beta II installation if that would help.

Bottom line: I now have a working eCS 2.2 Beta II installation, and can begin to test my normal customization, and try installing Win 7 Ultimate on the primary.

Please let me know whether you would like me to test further.

Thanks, Ben, Lars, and Steven!

comment:25 in reply to:  24 Changed 6 years ago by Ben Rietbroek

Replying to thomabrown:

dbb-4 works! It shows the checkpoints (pressed Scroll lock, img_0042.jpg)

Checkpoints are displayed on the post-BIOS screen. You can use the TAB key to switch
between the Main Menu and the post-BIOS screen. No need to use Scroll Lock.

Enter shows the menu with ECS22B2 blinking (img_0044.jpg). Enter here started Phase 3, ran to completion.


Well, there you have it !
Looking at img_0042.jpg shows that interrupts are disabled after reading the LVM-info record.
This function uses Int13X Function 42h and on your BIOS it returns with interrupts disabled.
When interrupts are disabled, the timer-tick values in the BDA are not updated and the AiR-BOOT
delay function would wait forever.
I earlier wondered why the keyboard did not get blocked, but as it turns out,
BIOS Int16h (keyboard stuff) enables interrupts. (At least on my BIOS)
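
The diagnosis above can be modelled with a small Python sketch. It is only an illustration under stated assumptions, not the fix that went into AiR-BOOT: `Cpu`, `call_bios` and the two fake services are hypothetical names, and which service is well behaved is assumed for the example. The wrapper records the interrupt flag after each call, much like the status dbb-4 prints, and then restores the caller's state.

```python
class Cpu:
    """Toy CPU that only tracks the interrupt flag (IF)."""

    def __init__(self):
        self.interrupt_flag = True


def read_sector_ok(cpu):
    """Hypothetical well-behaved BIOS service: leaves IF untouched."""
    return "data"


def read_sector_quirky(cpu):
    """Hypothetical stand-in for the faulty extended read: returns with IF clear."""
    cpu.interrupt_flag = False
    return "data"


def call_bios(cpu, name, service, log):
    """Call a service, log the IF state it returned with, then restore the saved state."""
    saved = cpu.interrupt_flag          # comparable to PUSHF before the call
    result = service(cpu)
    log.append((name, cpu.interrupt_flag))
    cpu.interrupt_flag = saved          # comparable to POPF after the call
    return result


cpu = Cpu()
log = []
call_bios(cpu, "read MBR", read_sector_ok, log)
call_bios(cpu, "read LVM info", read_sector_quirky, log)

# Services that came back with interrupts disabled.
offenders = [name for name, flag in log if not flag]
print(offenders, cpu.interrupt_flag)
```

Saving and restoring the flag around every BIOS call means a single quirky service cannot leave the timer interrupt switched off for the code that follows.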

I did not run dbb-5. Will do so if you wish.


Not needed; it will also work.

I can also test the original version of Air-Boot from the DVD against the completed eCS 2.2 Beta II installation if that would help.


That one will hang again because it is a BIOS issue and not a disk-layout issue.
So you'll have to use dbb-4 for now.
Don't enable timed-boot or press ESC to toggle it, as that will probably cause a hang.

Before long there will be a re-release of v1.1.0, v1.1.0a, which will incorporate a few other fixes and also include the fixes for this issue.
Before committing the sources for the re-release I will send the build to you so you can check whether it still works on your system, if you please.

Bottom line: I now have a working eCS 2.2 Beta II installation, and can begin to test my normal customization, and try installing Win 7 Ultimate on the primary.


Bottom line:
You have a quirky BIOS on that machine. (Any brand or make available? / BIOS name, date?)
If the blinking issue does not occur when only a PATA-disk is attached, some strange
things happen when the BIOS finds a SATA-disk. The bug in the disk related services I
can understand, but interpreting video-adapter color-bits differently, not really.
(Unless they remap a complete BIOS-segment that includes buggy CRT initialization)

Please let me know whether you would like me to test further.

The v1.1.0a re-release when I send it to you, please.

Thanks, Ben, Lars, and Steven!


Thank -you- for taking the time to help resolve this issue.
As a result, AiR-BOOT has become more resilient to quirky BIOS implementations.

comment:26 Changed 6 years ago by Ben Rietbroek

Owner: set to Ben Rietbroek
Status: new → accepted

comment:27 Changed 6 years ago by thomabrown

This is a Tyan S2895 Thunder K8WE board with 2 x Opteron 275 CPUs.
It *IS* a quirky motherboard/Bios!
The Bios is PhoenixBIOS dated 11/18/2008, version 1.06.2895.
That is, unfortunately, the latest version.

I have had other problems, such as the PCI routing table being reported as faulty by pci.exe. There have been other things that I can't remember. The board was intended to be a server board, but I didn't know about such things when I bought it.

Will test 1.1.0a when you send it.

comment:28 Changed 3 years ago by Ben Rietbroek

Resolution: fixed
Status: accepted → closed

* This issue is assumed to be fixed *

There is a new release of AiR-BOOT which should fix this issue.
It can be downloaded here: https://github.com/rousseaux/netlabs.air-boot/releases

If this specific issue is *not* fixed, then please create a *new* ticket
here, http://trac.netlabs.org/air-boot/newticket, describe the problem you
are experiencing and insert a *reference* to this issue if you think it is
related.

Happy booting,

Ben.
