Opened 10 years ago

Closed 10 years ago

#41 closed comment (NoChangeNeeded)

System hangs during boot with add-in PCIe SATA card

Reported by: thomabrown
Owned by: David Azarewicz
Priority: Feedback Pending
Milestone:
Component: driver
Version: 1.32
Keywords:
Cc: steve53@…

Description

I'm out of SATA ports on my mobo (Tyan S2895 K8WE) so I bought a PCIe 4-port SATA card with a Marvell chipset. PCI shows:

Bus 129 (PCI Express), Device Number 0, Device Function 0

Vendor 1B4Bh Marvell Technology Group Ltd.
Device 9230h 88SE9230 PCIe SATA 6Gb/s Controller
Command 0107h (I/O Access, Memory Access, BusMaster, System Errors)
Status 0010h (Has Capabilities List, Fast Timing)
Revision 10h, Header Type 00h, Bus Latency Timer 00h
Self test 00h (Self test not supported)
Cache line size 64 Bytes (16 DWords)
PCI Class Storage, type Serial ATA (AHCI 1.0)
Subsystem ID 92301B4Bh Unknown (Generic ID)
Subsystem Vendor 1B4Bh Marvell Technology Group Ltd.
Address 0 is an I/O Port : 5030h..5037h
Address 1 is an I/O Port : 5024h..5027h
Address 2 is an I/O Port : 5028h..502Fh
Address 3 is an I/O Port : 5020h..5023h
Address 4 is an I/O Port : 5000h..501Fh
Address 5 is a Memory Address (0-4GiB) : D0500000h..D05007FFh
System IRQ 5, INT# A
Expansion ROM of 64 KiB decoded by this card, currently disabled
New Capabilities List Present:

Power Management Capability, Version 1.2

Does not support low power State D1 or D2
Supports PME# signalling from mode(s) D3hot
PME# signalling is currently disabled
3.3v AUX Current required : 0 mA (Self powered)
Current Power State : D0 (Device operational, no power saving)

Message Signalled Interrupt Capability

MSI is disabled
MSI function can generate 32-bit addresses

PCI Express Capability, Version 2

Device/Port Type :

Legacy PCI Express Endpoint Device

Device Capabilities :

Unsupported Request Severity is Fatal

Device Status :

Correctable Error Detected
Unsupported Request Detected

Link Capabilities :

Maximum Link speed : Unknown (02h)!!
Maximum Link Width : x2
Link Port Number : 0

Link Control :

Asynchronous Clocking in Use

Link Status :

Current Link speed : 2.5Gb/s
Current Link Width : x2

Unknown Capability (Code 12h)!!
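For anyone reading the listing above, here is a minimal sketch of how the Command and Status words and the link speed code can be decoded. It assumes the standard PCI command/status register bit layout and the PCIe Link Capabilities speed encoding (where 02h corresponds to 5.0 GT/s, which the listing tool apparently does not recognise). The table and function names are made up for illustration and are not part of any tool mentioned in this ticket.

```python
# Hypothetical helper tables; bit positions follow the standard PCI
# command/status register layout (an assumption of this sketch).
COMMAND_BITS = {
    0: "I/O Access",
    1: "Memory Access",
    2: "BusMaster",
    3: "Special Cycles",
    4: "Memory Write and Invalidate",
    5: "VGA Palette Snoop",
    6: "Parity Error Response",
    8: "System Errors (SERR#)",
    9: "Fast Back-to-Back",
    10: "Interrupt Disable",
}

STATUS_BITS = {
    3: "Interrupt Status",
    4: "Has Capabilities List",
    5: "Supports 66MHz",
    7: "Supports Back-To-Back Trans.",
}

# Status bits 9-10 hold the DEVSEL timing field.
DEVSEL_TIMING = {0: "Fast Timing", 1: "Medium Timing", 2: "Slow Timing"}

# PCIe Link Capabilities "Max Link Speed" field encodings.
LINK_SPEEDS = {1: "2.5 GT/s", 2: "5.0 GT/s", 3: "8.0 GT/s"}


def decode_bits(value, names):
    """Return the names of all bits set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


def decode_status(value):
    """Decode the flag bits plus the two-bit DEVSEL timing field."""
    flags = decode_bits(value, STATUS_BITS)
    devsel = (value >> 9) & 0x3
    flags.append(DEVSEL_TIMING.get(devsel, "Reserved Timing"))
    return flags


def decode_link_speed(code):
    """Map a Max Link Speed code to a readable speed."""
    return LINK_SPEEDS.get(code, f"Unknown ({code:02X}h)")


if __name__ == "__main__":
    print(decode_bits(0x0107, COMMAND_BITS))  # Command word of the Marvell card
    print(decode_status(0x0010))              # Status word of the Marvell card
    print(decode_link_speed(0x02))            # Maximum Link speed code 02h
```

Under these assumptions the card advertises a 5.0 GT/s maximum link speed but is currently running at 2.5 GT/s, which is unrelated to the hang.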

System boots OK with no drives connected. Adding a drive causes a hang during boot. I have recorded the debug log on another machine per the troubleshooting doc.

[d:\]bldlevel os2krnl
Build Level Display Facility Version 6.12.675 Sep 25 2001
(C) Copyright IBM Corporation 1993-2001
Signature: @#IBM:14.106#@_SMP IBM OS/2 Kernel
Vendor: IBM
Revision: 14.106
File Version: 14.106
Description: _SMP IBM OS/2 Kernel

[d:\]bldlevel \os2\boot\OS2AHCI.ADD
Build Level Display Facility Version 6.12.675 Sep 25 2001
(C) Copyright IBM Corporation 1993-2001
Signature: @#D Azarewicz:1.32#@##1## 10 Nov 2013 07:58:57 DAZAR1 ::::::@@AHCI Driver (c) 2013 D Azarewicz

Vendor: D Azarewicz
Revision: 1.32
Date/Time: 10 Nov 2013 07:58:57
Build Machine: DAZAR1
File Version: 1.32
Description: AHCI Driver (c) 2013 D Azarewicz

[d:\]bldlevel \os2\boot\acpi.psd
Build Level Display Facility Version 6.12.675 Sep 25 2001
(C) Copyright IBM Corporation 1993-2001
Signature: @#D Azarewicz:3.22.03#@##1## 28 Nov 2013 11:00:28 DAZAR1 ::::03::@@ACPI based PSD for eCS (c) 2013 D Azarewicz
Vendor: D Azarewicz
Revision: 3.22.03
Date/Time: 28 Nov 2013 11:00:28
Build Machine: DAZAR1
File Version: 3.22.3
Description: ACPI based PSD for eCS (c) 2013 D Azarewicz
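As a rough illustration of how the Vendor, Revision and Description fields relate to the Signature lines in the bldlevel output above, here is a minimal sketch. It assumes only what is visible in this output: the signature starts with `@#vendor:revision#@`, and in the longer form the description follows the last `@@`. The function name is hypothetical, and the date, build machine and colon-separated fields in between are deliberately not interpreted.

```python
import re

# Matches the leading "@#<vendor>:<revision>#@" part of a signature and
# captures whatever follows it.
SIGNATURE_RE = re.compile(r"@#(?P<vendor>[^:]+):(?P<revision>[^#]+)#@(?P<rest>.*)")


def parse_signature(signature):
    """Extract vendor, revision and description from a bldlevel signature.

    Returns None when the text does not contain a recognisable signature.
    """
    match = SIGNATURE_RE.search(signature)
    if match is None:
        return None
    rest = match.group("rest")
    # In the extended form the description follows the last "@@";
    # in the short form everything after "#@" is the description.
    description = rest.rsplit("@@", 1)[-1] if "@@" in rest else rest
    return {
        "vendor": match.group("vendor"),
        "revision": match.group("revision"),
        "description": description.strip(),
    }


if __name__ == "__main__":
    print(parse_signature("@#IBM:14.106#@_SMP IBM OS/2 Kernel"))
    print(parse_signature(
        "@#D Azarewicz:1.32#@##1## 10 Nov 2013 07:58:57 DAZAR1 "
        "::::::@@AHCI Driver (c) 2013 D Azarewicz"
    ))
```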

I have attached the log from the other machine.
Is there anything else you need?

Thanks!

Attachments (4)

AHCI_debug_4.log (44.3 KB) - added by thomabrown 10 years ago.
ECS2-2B2-20140715-ahci-1.32.log (47.0 KB) - added by thomabrown 10 years ago.
Testlog ahci, system NOT hung since drive not attached
ECS2-2B2-20140715-acpi-3.22.03.zip (51.8 KB) - added by thomabrown 10 years ago.
ACPI-3.22.3-VW-trap008_IMG_0310.jpg (92.1 KB) - added by thomabrown 10 years ago.


Change History (15)

Changed 10 years ago by thomabrown

Attachment: AHCI_debug_4.log added

comment:1 Changed 10 years ago by thomabrown

FWIW...
I forgot to mention that the mobo has two apparently non-AHCI controllers with a total of 4 SATA ports.
PCI shows:

Bus 0 (PCI Express), Device Number 7, Device Function 0

Vendor 10DEh NVIDIA Corporation
Device 0054h CK804 Serial ATA Controller
Command 0007h (I/O Access, Memory Access, BusMaster)
Status 00B0h (Has Capabilities List, Supports 66MHz, Supports Back-To-Back Trans., Fast Timing)

Revision F3h, Header Type 00h, Bus Latency Timer 00h
Minimum Bus Grant 03h, Maximum Bus Latency 01h
Self test 00h (Self test not supported)
PCI Class Storage, type IDE (ATA)
PCI EIDE Controller Features :

BusMaster EIDE is supported
Primary Channel is in native mode at Addresses 0 & 1
Secondary Channel is in native mode at Addresses 2 & 3

Subsystem ID 289510F1h Tomcat K8E (S2865) (Guess Only!)
Subsystem Vendor 10F1h Tyan Computer
Address 0 is an I/O Port : 1C40h..1C47h
Address 1 is an I/O Port : 1C34h
Address 2 is an I/O Port : 1C38h..1C3Fh
Address 3 is an I/O Port : 1C30h
Address 4 is an I/O Port : 1C10h..1C17h
Address 5 is a Memory Address (0-4GiB) : B0003000h..B0003FFFh
System IRQ 20, INT# A
New Capabilities List Present:

Power Management Capability, Version 1.1

Does not support low power State D1 or D2
Does not support PME# signalling
Current Power State : D0 (Device operational, no power saving)

Bus 0 (PCI Express), Device Number 8, Device Function 0
Vendor 10DEh NVIDIA Corporation
Device 0055h CK804 Serial ATA Controller
Command 0007h (I/O Access, Memory Access, BusMaster)
Status 00B0h (Has Capabilities List, Supports 66MHz, Supports Back-To-Back Trans., Fast Timing)

Revision F3h, Header Type 00h, Bus Latency Timer 00h
Minimum Bus Grant 03h, Maximum Bus Latency 01h
Self test 00h (Self test not supported)
PCI Class Storage, type IDE (ATA)
PCI EIDE Controller Features :

BusMaster EIDE is supported
Primary Channel is in native mode at Addresses 0 & 1
Secondary Channel is in native mode at Addresses 2 & 3

Subsystem ID 289510F1h Tomcat K8E (S2865) (Guess Only!)
Subsystem Vendor 10F1h Tyan Computer
Address 0 is an I/O Port : 1C58h..1C5Fh
Address 1 is an I/O Port : 1C4Ch
Address 2 is an I/O Port : 1C50h..1C57h
Address 3 is an I/O Port : 1C48h
Address 4 is an I/O Port : 1C20h..1C27h
Address 5 is a Memory Address (0-4GiB) : B0004000h..B0004FFFh
System IRQ 21, INT# A
New Capabilities List Present:

Power Management Capability, Version 1.1

Does not support low power State D1 or D2
Does not support PME# signalling
Current Power State : D0 (Device operational, no power saving)

Just so there is no confusion with the log file. :-)>

comment:2 Changed 10 years ago by David Azarewicz

Owner: set to David Azarewicz
Status: new → accepted

There is insufficient information. Please attach the output of 'testlog ahci' even if you do it with no drives attached.

comment:3 Changed 10 years ago by David Azarewicz

Priority: major → Feedback Pending

comment:4 Changed 10 years ago by David Azarewicz

In addition to the output from 'testlog ahci' already requested, please also attach the output from 'testlog acpi'. Both of these should be done with the card installed, even if it has no drives attached.

Also, does your BIOS have a 'Plug n Play OS' setting? If so, what is it set to?

Changed 10 years ago by thomabrown

Testlog ahci, system NOT hung since drive not attached

comment:5 Changed 10 years ago by thomabrown

BIOS does NOT have Plug-n-Play setting.

comment:6 Changed 10 years ago by David Azarewicz

Thanks for the log from 'testlog ahci'. The system has a very strange bus structure and very strange interrupt layout. I will be able to tell more of what is going on when you provide the output from 'testlog acpi'.

Changed 10 years ago by thomabrown

comment:7 Changed 10 years ago by David Azarewicz

Your problem has nothing to do with the AHCI driver.

Your problem is that a significant portion of your motherboard is not supported in APIC mode (mode 2). Any device that is assigned an interrupt higher than 31 will not work. It appears that you have 4 PCI slots and they are assigned interrupts 48, 49, 50 and 51, so none of them will work. There are several other on-board devices that also won't work. Your only choice is to run that motherboard with the /VW switch so that the APIC hardware is not used.

The PSD doesn't reassign the interrupt for any devices that it can't support. For example, your Marvell card is at 129:0:0 and was originally assigned PIC interrupt 5. In APIC mode that card gets remapped to interrupt 48, which is not allowed. As a result, interrupts from the card go nowhere, which is why the system hangs.

The PSD issues a warning to the screen that some devices could not be assigned an interrupt. This is a fatal error for those devices. Those devices cannot be used in APIC mode.
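To make the rule above concrete, here is a minimal sketch that sorts devices into those that can and cannot be used in APIC mode. The limit of 31 and the interrupt numbers come from this ticket (48 for the Marvell card after remapping, 20 and 21 for the two CK804 controllers); the function names and the dictionary layout are invented for illustration and do not reflect how the PSD is actually implemented.

```python
# Highest interrupt number that works in APIC mode, per the comment above.
MAX_SUPPORTED_IRQ = 31


def usable_in_apic_mode(irq):
    """Return True if the interrupt number is within the supported range."""
    return 0 <= irq <= MAX_SUPPORTED_IRQ


def split_by_support(assignments):
    """Split {device: irq} into (usable, unusable) dictionaries."""
    usable, unusable = {}, {}
    for device, irq in assignments.items():
        target = usable if usable_in_apic_mode(irq) else unusable
        target[device] = irq
    return usable, unusable


if __name__ == "__main__":
    # Device labels are illustrative; the numbers are taken from this ticket.
    assignments = {
        "Marvell 88SE9230 (129:0:0)": 48,
        "CK804 SATA (0:7:0)": 20,
        "CK804 SATA (0:8:0)": 21,
    }
    usable, unusable = split_by_support(assignments)
    print("usable:", usable)
    print("unusable:", unusable)
```

With these inputs only the Marvell card lands in the unusable group, matching the behaviour described above.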

comment:8 Changed 10 years ago by thomabrown

/VW causes a TRAP 0008, see screen shot attached.
I am coming to the conclusion that I need a new motherboard. This one is a server board, and perhaps that is why its BIOS is a bit strange. It's wonky in other ways, too.
Once upon a time, I think Paul Ratcliffe had a program that would change the interrupt for a device at boot time. Senior memory can't come up with the name of it just now.
Might that help, or does it even work with the current ACPI?

Thanks!

comment:9 Changed 10 years ago by Steven Levine

Cc: steve53@… added

comment:10 in reply to:  8 Changed 10 years ago by David Azarewicz

Replying to thomabrown:

> /VW causes a TRAP 0008, see screen shot attached.

Yes, that board has an NVIDIA chipset, and those are known to have problems like that. Motherboards with an NVIDIA chipset are not recommended.

> I am coming to the conclusion that I need a new motherboard. This one is a server board, and perhaps that is why its BIOS is a bit strange. It's wonky in other ways, too.

Probably a good idea. It is extremely unlikely that the OS/2 kernel will ever get updated to handle hardware like that.

> Once upon a time, I think Paul Ratcliffe had a program that would change the interrupt for a device at boot time. Senior memory can't come up with the name of it just now.

Interrupts can only be changed within the limits of the hardware. In this case you have a choice between 48, 49, 50 and 51, none of which will work. It doesn't matter what software you use; software can't rearrange the copper on the PC board. :-)

Changed 10 years ago by thomabrown

comment:11 Changed 10 years ago by David Azarewicz

Resolution: NoChangeNeeded
Status: accepted → closed

Your other option would be to run that system with only one CPU by using one of the following lines in CONFIG.SYS:

PSD=ACPI.PSD /VW /MAXCPU=1
PSD=ACPI.PSD /PIC
