Opened 9 years ago

Closed 9 years ago

Last modified 5 years ago

#398 closed defect (fixed)

Boot with ACPI is unstable

Reported by: LarsErdmann
Owned by: pasha
Priority: major
Milestone:
Component: ACPI PSD
Version: 3.16
Keywords:
Cc:

Description

Configuration:
eCS 1.2 R German with all fixes applied
ACPI 3.14
APM 1.28
Kernel: @#IBM:14.104a#@_W4 IBM OS/2 Kernel
Single CPU Mainboard

I have been trying various switches to ACPI, alone and in combination: /FS /VBE /EIS /OS="Microsoft Windows" /!NOD /PIC

but the boot is unreliable: sometimes it hangs when loading the .SNP files, sometimes when loading the BASEDEVs (.SYS, .ADD, .FLT, .DMD).

Notes:
a.) I have been using /OS="Microsoft Windows" because I ran iasl.exe and generated the .dsl file as described in the documentation (find it attached). Since it contained a conditional on OS type "Microsoft Windows", I thought the switch would make a difference, but it did not.
b.) /!NOD does not seem to do anything. Even after adding /!NOD, drivers ACPICA$ and OEMHLP$ reside in the same loaded module (Theseus tells me that), and that module is ACPI.PSD. By adding /!NOD I expected that OEMHLP$ would be taken from the kernel module (where it resides by default), but it is not.
c.) I am also running WinXP on this machine (via OS/2 bootmanager). Rebooting from WinXP into eCS seems to be problematic: it seems to trigger the boot hang, or at least make it more likely. I guess WinXP leaves ACPI in some state that makes ACPI.PSD unhappy.

Find attached the config.sys file, the .dsl file, the ACPI and APM log files.

Attachments (15)

CONFIG.SYS (9.7 KB) - added by LarsErdmann 9 years ago.
Config.sys file
dsdt_AWRDACPI.dsl (139.4 KB) - added by LarsErdmann 9 years ago.
DSDT File (.dsl file)
acpica.log (63.9 KB) - added by LarsErdmann 9 years ago.
ACPI log file
acpidaemon.log (8.7 KB) - added by LarsErdmann 9 years ago.
ACPI daemon log file
acpi-test.cmd (8.2 KB) - added by LarsErdmann 9 years ago.
Updated acpi-test.cmd that correctly displays kernel revision
result.txt (958 bytes) - added by LarsErdmann 9 years ago.
Result of running acpi-test.cmd
acpica.2.log (210.2 KB) - added by erdmann 9 years ago.
ACPICA log file on running ACPI316-05_29.zip
rmviewoutput.txt (1.3 KB) - added by Erdmann 9 years ago.
Output of rmview /IRQ
pcioutput.txt (4.0 KB) - added by Erdmann 9 years ago.
Output of PCI.EXE -H -S
acpiirqoutput.txt (8.0 KB) - added by Erdmann 9 years ago.
Output of acpiirq.exe 0
acpica.3.log (10.6 KB) - added by Erdmann 9 years ago.
ACPI log file when LINK LNKH 0,LINK LNKE 5 is specified in acpi.cfg
SNOOP.LST (409 bytes) - added by Erdmann 9 years ago.
snooper used in conjunction with acpi316full.zip
acpi.cfg (2.3 KB) - added by Erdmann 9 years ago.
acpi config used in conjunction with acpi316full.zip
acpica.4.log (169.8 KB) - added by Erdmann 9 years ago.
acpi log file on use of acpi316full.zip
dsdt_AWRDACPI.2.dsl (139.9 KB) - added by Erdmann 9 years ago.
Modified DSDT table to assign Name _BBN to "PCI0" device


Change History (41)

Changed 9 years ago by LarsErdmann

Config.sys file

Changed 9 years ago by LarsErdmann

DSDT File (.dsl file)

Changed 9 years ago by LarsErdmann

ACPI log file

Changed 9 years ago by LarsErdmann

ACPI daemon log file

comment:1 Changed 9 years ago by pasha

  • Owner changed from eco to pasha

Try using psd=acpi.psd /LS:5. Also, for acpi.psd you need to remove all entries from snoop.lst.

comment:2 follow-up: Changed 9 years ago by eco

e-co:

Let's start from the beginning,

1) What is the model of the PC?

2) Does the script report any problems? http://ecomstation.ru/projects/acpitools/download/acpi-test.cmd

comment:3 in reply to: ↑ 2 Changed 9 years ago by LarsErdmann

Replying to eco:

> e-co:
>
> Let's start from the beginning,
>
> 1) What is the model of the PC?
>
> 2) Does the script report any problems? http://ecomstation.ru/projects/acpitools/download/acpi-test.cmd

1.) Model of PC: it's an OEM PC from "Medion", but the motherboard is an MSI MS-6701 with one single-core Intel Pentium CPU.

2.) Script output: see result file. I have these annotations:
a.) I did not want to change snoop.lst because I have an application that parses the "detected" tree to find a parallel port. That won't work if parallel.snp is not there. If you say it's necessary ...
b.) I have not replaced RESOURCE.SYS, as the new RESOURCE.SYS does not properly display all info when I look in Hardware Manager. Additionally, I don't use high IRQs.
c.) I didn't know I had to use either the UNI or the SMP kernel. In general, it DID work with the W4 kernel.

Changed 9 years ago by LarsErdmann

Updated acpi-test.cmd that correctly displays kernel revision

Changed 9 years ago by LarsErdmann

Result of running acpi-test.cmd

comment:4 Changed 9 years ago by eco

e-co:

What happens if you use ACPI.PSD without switches and with only 1 additional snooper?

Do you use bootable JFS? Do you have hard hangs, after which booting eCS with ACPI fails?

Do you use Windows on the same PC? Do you reboot from Windows to eComStation?

comment:5 Changed 9 years ago by LarsErdmann

1.) I will have to give this a try for some time. I will report the results.
2.) No, I don't use bootable JFS on this box.
3.) Yes, I have hard hangs (with ACPI but not without). They happen here and there when the eCS boot logo comes up. What I can say is that it happens during loading of the basedev drivers, often when IBM1FLPY.ADD is loaded (diskette light is lit due to IBM1FLPY.ADD firing up the diskette drive motor, then nothing else happens). But it also occasionally happens on loading OS2LVM.DMD.
4.) Yes, I use WinXP and eCS on the same PC. I use OS/2 bootmanager to switch between them.
5.) I reboot from WinXP to eCS and vice versa.

A few observations:
a.) When I have a hang on booting eCS, I have two choices. I can power down with the front button (press 4 secs); then I need to boot into WinXP, and once I have shut down from WinXP, I can boot into eCS (most of the time). Or I have to power off the system with the main power switch on the back of the case; once done, I can boot into eCS (most of the time).
b.) Having no snooper or a couple of them in snoop.lst does not seem to make much of a difference. The only thing I seem to see is that if mouse.snp is in, it tends to hang more often. I think ibmkbd.snp, ibm1flpy.snp, serial.snp and parallel.snp don't do much anyway: they add hard-coded resources to the detected tree (ibmkbd.snp, ibm1flpy.snp) or maybe scan the BIOS for COM and parallel ports and add those to the detected tree (serial.snp, parallel.snp). pcibus.snp will most likely scan the PCI config space; I have no clue if that is a problem when ACPI is in use.
c.) I have a self-written application that uses "DosShutdown(0)", and I also tried "WinShutdownSystem" instead. Both seem to have their problems bringing the system into the "while(1)" idle loop that allows you to power off the system (with WinShutdownSystem causing more problems). That application does not use the ACPI API to completely power down the system. After that application is run, rebooting the system often results in a hang.

Additional note: I have WinXP on NTFS as drive C:. It's not hidden in eCS, but I don't have the NTFS.IFS driver operational. Could that be a problem?

Let me know if you need more detailed tech info on the chipset etc.

comment:6 Changed 9 years ago by pasha

Try changing the W4 kernel to UNI.

comment:7 Changed 9 years ago by LarsErdmann

I have now tried different things:
1.) ACPI.PSD without cmdline parameter and only ibmkbd.snp in snoop.lst
2.) ACPI.PSD without cmdline parameter and multiple .snp in snoop.lst (the reduced set, excluding mouse.snp)
3.) ACPI.PSD with /OS="Microsoft Windows" and only ibmkbd.snp in snoop.lst
4.) ACPI.PSD with /OS="Microsoft Windows" and multiple .snp in snoop.lst (the reduced set, excluding mouse.snp)

3.) is the most promising but still it hangs here and there. What ALWAYS HELPS in this case is to power off the system with the front button (I have a 4 sec power off delay on that button), boot (via eCS bootmanager) into WinXP and from WinXP reboot (via eCS bootmanager) into eCS. In that case, booting into eCS will go without problems.

What I also observed is that after hitting either Alt-F4 or Alt-F5 (in other words: when each DD load has to be acknowledged by the user), my USB mouse does not work properly: the cursor hops around and is basically uncontrollable. However, that might be due to deficiencies in the mouse driver (MOUSE.SYS), the USB mouse class driver (USBMOUSE.SYS) or the USB stack (USBD.SYS, USBOHCD.SYS).

comment:8 Changed 9 years ago by pasha

Please download the experimental ACPI build from the Mensys site:

  • Experimental build for you:

ACPI316-05_29.ZIP

Use psd=acpi.psd

Please attach the acpica$ log.

Then try the SMP kernel.

comment:9 Changed 9 years ago by erdmann

1.) I downloaded the version stated above and have had it running with the Warp4 kernel for about a week now. It did not solve the problem; I have the same hangs on bootup.
2.) Observation: when hitting Alt-Fx (x=1,2,3,4,5) on bootup, the system hangs more often. I suspect that either the keyboard input or the screen output of the recovery screen during bootup triggers these hangs (in combination with ACPI, no hangs otherwise). Info: like many of us, I use SNAP as graphics driver.
3.) Find attached the log file.
4.) I will now go to the SMP kernel. Is this a problem, since I don't have an SMP system? How about the UNI kernel?

Changed 9 years ago by erdmann

ACPICA log file on running ACPI316-05_29.zip

comment:10 Changed 9 years ago by Erdmann

I now replaced the Warp4 kernel with the UNI kernel and with ACPI.PSD parameters unchanged (that is: no parameters). There is no change in behaviour. The system hangs, even more frequently than with ACPI V3.14 (no matter if Warp4 or UNI kernel). I will now try a few ACPI.PSD parameters and see if they improve the situation.

comment:11 Changed 9 years ago by pasha

Try adding this to \os2\boot\acpi.cfg:

LINK LNKH 11

You can also try 12, 5 or 3.

comment:12 Changed 9 years ago by Erdmann

I tried that, but I still have hangs. Can you please explain why I would only try to modify LNKH and not the other LNKx? As far as I understand, I have 8 PCI links (LNKA to LNKH) on my PC that I can hook to the various ISA IRQ lines (3, 5, 11, 12, whatever), and this is controlled/modified via the instruction you state above. But where is the mapping from #INTA to #INTD onto the links LNKA to LNKH defined? As far as I understand, the problem arises if too many interrupt pins, for example the #INTA of the various PCI slots, are hooked to the same link, say LNKB ...

How can I use APIC, and shall/can I do so with a W4 or UNI kernel?

comment:13 Changed 9 years ago by Erdmann

I now dumped IRQ info that I am given by pci.exe -H -S. Find attached. Then I dumped IRQ info given by rmview.exe /IRQ, find also attached. I can see a mismatch (pci.exe claims that IDE is hooked to IRQ 11 while rmview.exe /IRQ claims it is hooked to IRQ 14 and IRQ 15). What is that supposed to mean ?

Changed 9 years ago by Erdmann

Output of rmview /IRQ

comment:14 Changed 9 years ago by Erdmann

Forgot to say, I now have the following setup:
PSD=ACPI.PSD (no additional parameters)

SNOOP.LST contains these snoopers (I don't think these cause the problem):
RESRV.SNP
IBMKBD.SNP
IBM1FLPY.SNP
PARALLEL.SNP
MOUSE.SNP
SERIAL.SNP
PCIBUS.SNP

acpi.cfg now specifies this (because IRQ 11 was "overloaded" by 4 devices with LINK LNKH 11):
LINK LNKH 5
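For anyone who wants to check for an "overloaded" IRQ without counting by hand, here is a minimal sketch that groups devices by IRQ from a saved pci.exe -H -S dump. It assumes the two line shapes quoted in comment:18 of this ticket ("Bus 0 (PCI), Device Number 2, Device Function 7" and "System IRQ 5, INT# C"); the function names and the sharing limit are my own choices for illustration, not anything that pci.exe or ACPI.PSD provides.

```python
import re
from collections import defaultdict

# Line shapes taken from the pci.exe excerpts quoted in this ticket.
DEVICE_RE = re.compile(
    r"Bus (\d+) \(PCI\), Device Number (\d+), Device Function (\d+)"
)
IRQ_RE = re.compile(r"System IRQ (\d+), INT# ([A-D])")


def devices_by_irq(pci_text):
    """Return {irq: [(bus, device, function, pin), ...]} for a pci.exe dump."""
    table = defaultdict(list)
    current = None  # the device header seen most recently
    for line in pci_text.splitlines():
        dev = DEVICE_RE.search(line)
        if dev:
            current = tuple(int(n) for n in dev.groups())
            continue
        irq = IRQ_RE.search(line)
        if irq and current is not None:
            table[int(irq.group(1))].append(current + (irq.group(2),))
            current = None  # one IRQ line per device entry
    return dict(table)


def overloaded(table, limit=3):
    """IRQs used by more than `limit` devices (the limit is arbitrary)."""
    return sorted(irq for irq, devs in table.items() if len(devs) > limit)
```

Feeding it the contents of a saved dump such as pcioutput.txt and calling overloaded() on the result would list the IRQ numbers worth moving with a LINK statement, under the assumption that the whole file follows the quoted format.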

comment:15 Changed 9 years ago by pasha

PCIBUS.SNP must be removed from snoop.lst. Also try adding LINK LNKE 5 to acpi.cfg.

comment:16 Changed 9 years ago by Erdmann

In the meantime (I hadn't yet read your post) I tried something different: 1.) leave snoop.lst as is (all .SNP as stated above)
2.) added LINK LNKH 0, therefore disabling LINKH
3.) Note that I have a SIS961 southbridge that contains a USB 2.0 host controller that is known to not work properly, and this is also true under WinXP. As far as I could deduce from "acpiirq.exe 0", the USB 2.0 host controller (and only that one) was tied to LNKH. Funnily enough, an IRQ is still allocated to that host controller and USBEHCD.SYS loads properly on this machine (but does not work, as it never did).
As far as I can tell from the last couple of reboots, my system no longer hangs, but I need more time to say for sure, as it certainly is a timing issue.
Since WinXP never hangs, my suspicion is that ACPI.PSD might fail to properly manipulate/update the IRQ routing table, or fail to disable IRQs while doing so, so that afterwards no IRQs get to the CPU at all. The effect is that absolutely nothing works after these hangs: no keyboard, no mouse, no disk activity etc.
By the way, have you read through
http://tech.groups.yahoo.com/group/os2ddprog/message/3832
http://tech.groups.yahoo.com/group/os2ddprog/message/3833
http://tech.groups.yahoo.com/group/os2ddprog/message/3834
http://tech.groups.yahoo.com/group/os2ddprog/message/3835
etc.? I have the feeling that it also matters in this case if you write to memory-mapped registers of a PCI device.

If I still get hangs, I will try as you suggest and keep you informed.

comment:17 Changed 9 years ago by pasha

  1. PCIBUS.IRQ can conflict with ACPI.
  2. Look in acpi.log at:

15:12:22 0:2.0 1039:8 'S962'

and then at:

15:12:55 Uniaud32: PSD find success. PSD Ftable:F97202BC

15:12:55 FindPCIDevice

15:12:55 Need: B:0x0 D:0x2 F:0x0

You can see that your sound device has no routing and no IRQ. This needs to be fixed. The fix begins with removing PCIBUS.SNP from your snoop.lst; please give me acpi.log after doing so.

  3. LINK LNKE 0 doesn't disable this link.

comment:18 Changed 9 years ago by Erdmann

To start with, I triggered an ESCD update via the BIOS.

  1. I do have sound. I have no problem with the sound device: it is device number 0x0002, function 0x0007, INTC, uses LNKC and is therefore routed to IRQ 5:

pci.exe -S -H:
...
Bus 0 (PCI), Device Number 2, Device Function 7
Vendor 1039h Silicon Integrated Systems (SiS)
Device 7012h SiS7012 Audio Codec
Subsystem ID 70101462h Realtek AC'97 Audio
Subsystem Vendor 1462h Micro-Star International Co Ltd (MSI)
System IRQ 5, INT# C
...
acpiirq.exe 0
...
Device 0x0002 Function ALL INTC Route to Device [LNKC] used IRQ5 triggerred by Level, polarity Low, Sharable
...

  2. I did not disable LNKE. I did disable LNKH, which should disable the USB 2.0 host controller:

pci.exe -S -H:
...
Bus 0 (PCI), Device Number 3, Device Function 3
Vendor 1039h Silicon Integrated Systems (SiS)
Device 7002h SiS7002 USB 2.0 Enhanced Controller
Subsystem ID 70101462h Unknown
Subsystem Vendor 1462h Micro-Star International Co Ltd (MSI)
System IRQ 9, INT# D
...
acpiirq.exe 0
...
Device 0x0003 Function ALL INTD Route to Device [LNKH] used IRQ9 triggerred by Level, polarity Low, Sharable
...
where the USB 2.0 host controller is the only device with device Number 0x0003 and PCI interrupt line INTD

As I said, currently, I have no more hangs on boot.
I updated the pci.exe output file for your reference.
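
The pin-to-link-to-IRQ mapping asked about in comment:12 can be read off the acpiirq.exe output quoted above. A minimal sketch of turning those lines into a table, assuming every route line has the same shape as the two quoted in this comment (the rest of the acpiirq.exe output format is not shown in this ticket, and the function names are mine):

```python
import re

# Matches lines in the shape quoted above, for example:
#   Device 0x0003 Function ALL INTD Route to Device [LNKH] used IRQ9 ...
ROUTE_RE = re.compile(
    r"Device (0x[0-9A-Fa-f]+) Function (\S+) (INT[A-D]) "
    r"Route to Device \[(\w+)\] used IRQ(\d+)"
)


def parse_routes(acpiirq_text):
    """Return a list of (device, pin, link, irq) tuples."""
    routes = []
    for line in acpiirq_text.splitlines():
        m = ROUTE_RE.search(line)
        if m:
            device, _function, pin, link, irq = m.groups()
            routes.append((int(device, 16), pin, link, int(irq)))
    return routes


def users_of_link(routes, link):
    """All (device, pin) pairs routed through one link such as 'LNKH'."""
    return [(dev, pin) for dev, pin, lnk, _irq in routes if lnk == link]
```

With the two lines above as input, users_of_link(routes, "LNKH") yields only device 0x0003 on INTD, which matches the statement that the USB 2.0 host controller is the only user of that link.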

Changed 9 years ago by Erdmann

Output of PCI.EXE -H -S

Changed 9 years ago by Erdmann

Output of acpiirq.exe 0

comment:19 Changed 9 years ago by Erdmann

I followed your instructions, the setup is now:
snoop.lst:
RESRV.SNP
IBMKBD.SNP
IBM1FLPY.SNP
PARALLEL.SNP
MOUSE.SNP
SERIAL.SNP

acpi.cfg:
LINK LNKH 0
LINK LNKE 5

Used kernel:
Signature: @#IBM:14.104a#@_UNI IBM OS/2 Kernel
Vendor: IBM
Revision: 14.104
File Version: 14.104
Description: _UNI IBM OS/2 Kernel

I still have occasional hangs (but they occur less often), especially when I boot from eCS back into eCS (via setboot.exe). Find attached acpica.log.

Changed 9 years ago by Erdmann

ACPI log file when LINK LNKH 0,LINK LNKE 5 is specified in acpi.cfg

comment:20 Changed 9 years ago by pasha

From your pci.exe output:

Bus 0 (PCI), Device Number 2, Device Function 0

Vendor 1039h Silicon Integrated Systems (SiS)

Device 0008h SiS PCI to ISA Bridge (LPC Bridge)

From your acpica.3.log

8:30:29 Uniaud32: PSD find success. PSD Ftable:F97202BC

8:30:29 FindPCIDevice

8:30:29 Need: B:0x0 D:0x2 F:0x0

So here we see a problem with Uniaud device detection. When you have "occasional hangs", are they at the boot stage? Also, try booting with /!NOD.

comment:21 Changed 9 years ago by Erdmann

1.) yes, the "occasional hangs" only happen at boot stage, never during normal operation
2.) I now have the following setup:
@#IBM:14.104a#@_UNI IBM OS/2 Kernel
acpi.psd and apm.add from this zip package: acpi316full.zip:
@#netlabs dot org:3.16#@##1## 24 Jun 2009 06:24:11 pasha::::0::@@ ACPI core PSD Driver. (c) netlabs.org 2005-2009
@#eCoSoft:1.30#@##1## 24 Jun 2009 06:24:11 pasha::::0::@@ ACPI APM. (c) Pavel Shtemenko 2006-2009
I am now running ACPI.PSD with /!NOD switch
Find attached snoop.lst,acpi.cfg and acpica.log
3.) /!NOD seems to cure the problem resulting in message "Need: B:0x0 D:0x2 F:0x0". It has now disappeared.
4.) I had also removed OSLOGO, but it had no effect on the likelihood of a system hang. Therefore I put it back in.
5.) This version of ACPI seems to be pretty reliable. I have not had any hangs lately. But it surely is some timing issue, where the result of a hang is that IRQs remain globally disabled, leading to a complete system hang that needs a power off/on cycle to restart.
6.) I hear some strange "blips" from my computer, apparently when the system is in sleep mode. It sounds like they come from the built-in card reader that is attached to the system via USB. What could that mean?

Changed 9 years ago by Erdmann

snooper used in conjunction with acpi316full.zip

Changed 9 years ago by Erdmann

acpi config used in conjunction with acpi316full.zip

Changed 9 years ago by Erdmann

acpi log file on use of acpi316full.zip

comment:22 Changed 9 years ago by Erdmann

Too bad, it still hangs here and there ....
This time I hit Alt-F2 in order to display loading of device drivers. But the hangs also happen if I don't do that. I have the feeling that if I hold the key combination for too long (so that a lot of keystrokes are queued up in the keyboard buffer), this also makes the hang more likely.

comment:23 Changed 9 years ago by Erdmann

Just a thought:
Do you have a PCI card that is capable of generating an NMI ? Something like:
http://www.connecttech.com/sub/Products/PCI_DumpSwitch.asp
or
http://www.summitsoftconsulting.com/DumpSwitchCard.htm

Then, you could either break into the kernel debugger or use DevHelp_RegisterKrnlExit to register an NMI handler. Either way, you would be able to check the "Interrupt Enable" bit in EFLAGS and/or check the Interrupt Controller for masked interrupts. As I said, when I have hangs on bootup, it looks like all interrupts are disabled.
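
To make the two checks concrete: on x86 the Interrupt Enable flag is bit 9 of EFLAGS, and in PIC mode each 8259A has an interrupt mask register (normally read from port 0x21 for the master and 0xA1 for the slave) in which a set bit masks the corresponding IRQ. A minimal sketch of decoding values noted down in the kernel debugger; the helper names are hypothetical and the numbers in the usage note are made up, not taken from this machine:

```python
EFLAGS_IF = 1 << 9  # Interrupt Enable flag, bit 9 of EFLAGS on x86


def interrupts_enabled(eflags):
    """True if the IF bit is set in the given EFLAGS value."""
    return bool(eflags & EFLAGS_IF)


def masked_irqs(master_imr, slave_imr):
    """Decode the two 8259A interrupt mask registers.

    A set bit means the IRQ is masked. The master covers IRQ 0-7,
    the slave covers IRQ 8-15.
    """
    masked = [irq for irq in range(8) if master_imr & (1 << irq)]
    masked += [irq + 8 for irq in range(8) if slave_imr & (1 << irq)]
    return masked
```

For example, interrupts_enabled(0x00000002) is False, and masked_irqs(0xFF, 0xFF) lists all 16 IRQs, which is the kind of state that would match a hang where nothing reaches the CPU. This only applies when the system runs in PIC mode; with the APIC in use the mask state lives elsewhere.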

comment:24 Changed 9 years ago by Erdmann

  • Version changed from 3.14 to 3.16

Some news: I had a look into the log acpica.log, where I found that ACPI.PSD complains that it found a bridge "PCI0" that does not have a "_BBN" Name. What I then did was run iasl.exe -d to dump the DSDT ACPI table; I modified it by adding a _BBN Name specifying that PCI0 is on bus 0 (which it seemingly is). I then made a couple of other fixes to remove the warnings and errors that iasl.exe showed on compile. I then ran "iasl.exe dsdt_AWRDACPI.DSL", which created the file DSDT.AML that I now load from acpi.cfg.
Currently I no longer have hangs.
My current config:
PSD=ACPI.PSD /OS="Microsoft Windows"
acpi.cfg:
FILE DSDT.AML
LINK LNKE 5 (for whatever reason no IRQ line is assigned to LNKE 5 per default)
PCIasACPI (not really sure if this is really necessary)

By the way: apart from logging bugs here, is there a place where I could discuss my ACPI issues with the eCS developers ? I much better understand ACPI now (I read the ACPI specs) and I think I could contribute some observations.
I am attaching my modified dsdt_AWRDACPI.dsl file. You can compare the changes against the original file I had already attached in the past.
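
For others who want to check their own disassembled DSDT for the same omission, here is a rough sketch that looks for a _BBN declaration inside the Device (PCI0) block of a .dsl file. It is plain text scanning with naive brace matching, not an ASL parser: braces inside strings or comments are not handled, and a _BBN in a nested child device would also count. The function names are mine, and "PCI0" is simply the device name from this ticket.

```python
import re


def device_block(dsl_text, name):
    """Return the braced body of `Device (name)`, or None if not found."""
    m = re.search(r"Device\s*\(\s*" + re.escape(name) + r"\s*\)", dsl_text)
    if not m:
        return None
    start = dsl_text.find("{", m.end())
    if start == -1:
        return None
    depth = 0
    for pos in range(start, len(dsl_text)):
        ch = dsl_text[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return dsl_text[start:pos + 1]
    return None  # unbalanced braces


def declares_bbn(dsl_text, name="PCI0"):
    """True if the device block mentions _BBN anywhere inside it."""
    block = device_block(dsl_text, name)
    return block is not None and "_BBN" in block
```

If declares_bbn() returns False for the original file, the kind of change described above would presumably be a line such as Name (_BBN, 0x00) inside that device block; the attached dsdt_AWRDACPI.2.dsl is the authoritative version of what was actually changed.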

Changed 9 years ago by Erdmann

Modified DSDT table to assign Name _BBN to "PCI0" device

comment:25 Changed 9 years ago by erdmann

  • Milestone changed from Release version 3.17 to eCS 2.x
  • Resolution set to fixed
  • Status changed from new to closed

This ticket can be closed as the issues with hang on bootup as far as ACPI.PSD is concerned are fixed with the current trunk build (heading towards V 3.18). Additional problems with hang on boot can be attributed to problems with the USB drivers when the system is rebooted from WinXP to OS/2. It seems that on that occasion USB is left in a state where the OS/2 USB drivers will cycle forever in a wait loop.

comment:26 Changed 5 years ago by dazarewicz

  • Milestone eCS 2.x deleted

