Opened 12 years ago

Closed 11 years ago

#32 closed defect (fixed)

R8169 driver -- computer locks up with on-board LAN

Reported by: Cyberpastor
Owned by:
Priority: Feedback Pending
Component: r8169
Version: 0.1.0
Keywords:
Cc:

Description

The R8169 driver (ver. 0.1.0) works with an add-in card using the 8169 chip, but when I try to use the motherboard's RTL8111E I can ping the router but the machine locks solid as soon as I start Thunderbird -- even with a new profile. Even Ctrl-Alt-Del does not work, and I have to press Reset or power off.

The motherboard is an Asus M5A88-V EVO.

I have had the same result with and without the /VW switch on the ACPI.PSD line (ACPI 3.20.02).

Attachments (3)

R8169_On-Board_NIC.zip (17.5 KB) - added by Cyberpastor 12 years ago.
r8169.zip (3.9 KB) - added by Cyberpastor 12 years ago.
R8169_On-Board_NIC_New.zip (23.8 KB) - added by Cyberpastor 12 years ago.

Change History (18)

comment:1 Changed 12 years ago by David Azarewicz

Priority: major → Feedback Pending

Please install the trace version, reboot, collect a trace, and attach it to this ticket. I'll have a look at how the driver has configured itself.

Make sure you have read the "Support and Submitting Tickets" section of the Wiki.

Changed 12 years ago by Cyberpastor

Attachment: R8169_On-Board_NIC.zip added

Changed 12 years ago by Cyberpastor

Attachment: r8169.zip added

comment:2 Changed 12 years ago by Cyberpastor

I have attached the ZIPped trace file and output of the PCI utility.

The machine froze again almost immediately when I started Firefox, just as it had when I started Thunderbird -- but now I am thinking that in all cases it might not have been until I moved the pointing device: the USB EHCI controllers share IRQ9 with the on-board LAN, whereas the add-in NIC uses IRQ3 (with /VW switch on ACPI.PSD line; without it my LSILogic SCSI card doesn't work).

comment:3 Changed 12 years ago by David Azarewicz

From the trace, your hardware is NIC 10EC:8168 and MAC 21.

Also from the trace, it appears that your hardware is not working correctly. There is not enough information in the trace to determine the exact failure. I added some more tracing information to the driver so please download this: ftp://dazar.dyndns.biz/r8169-0.1.0.wpi and install the trace version and attach a new trace to this ticket.

Changed 12 years ago by Cyberpastor

Attachment: R8169_On-Board_NIC_New.zip added

comment:4 Changed 12 years ago by Cyberpastor

I am attaching the ZIPped trace using the new driver. I recalled reading a report that this particular on-board LAN device does not always initialize properly and that someone had suggested enabling the boot ROM feature as well and entering the configuration screen at each boot. I did that this time, but I notice now that the output of the PCI utility contains the lines

"Device Status :
Correctable Error Detected
Unsupported Request Detected"

I also noticed that on one occasion, after pinging the router a few times, the pointing device froze after just a few movements. Neither Thunderbird nor Firefox had been started on this occasion.
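The two lines quoted from the PCI utility match flag names in the PCI Express Device Status register. The following is a minimal sketch of how such a register value decodes, assuming the standard bit layout; the raw value 0x0009 is a hypothetical example, since the ticket does not give the register contents.

```python
# Flag names for the low bits of the PCI Express Device Status register.
DEVICE_STATUS_BITS = {
    0: "Correctable Error Detected",
    1: "Non-Fatal Error Detected",
    2: "Fatal Error Detected",
    3: "Unsupported Request Detected",
    4: "AUX Power Detected",
    5: "Transactions Pending",
}


def decode_device_status(value):
    """Return the names of the flags set in a 16-bit Device Status value."""
    return [name for bit, name in sorted(DEVICE_STATUS_BITS.items())
            if value & (1 << bit)]


# Hypothetical raw value with bits 0 and 3 set, matching the two quoted lines.
print(decode_device_status(0x0009))
```

The error flags in this register are sticky: once set, they stay set until software clears them, so they may only record that an event happened at some point since the last reset.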

comment:5 Changed 12 years ago by David Azarewicz

Your hardware is definitely malfunctioning. The driver appears to be functioning correctly.

Your particular hardware requires firmware that gets loaded by the driver. The firmware is not part of the driver, but is in a separate firmware file. Your hardware needs the firmware in file rtl8168e-3.fw.

I looked around on the net and found an updated firmware file from Realtek for your hardware. Please download ftp://dazar.dyndns.biz/r8169-0.1.0.wpi again. Install the normal retail version, reboot and see if your hardware works better. I didn't change anything in the driver, I just put new firmware files into the WPI package. After installing, you should have a new rtl8168e-3.fw file in \IBMCOM\rtl_nic that is 3648 bytes in size.
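The size check described above could be scripted along these lines. This is only a sketch: it assumes a Python interpreter is available on the system, the helper name `firmware_looks_updated` is made up for illustration, and the path is written exactly as given above, with no drive letter.

```python
import os

EXPECTED_SIZE = 3648  # bytes, the size stated for the new rtl8168e-3.fw


def firmware_looks_updated(path, expected_size=EXPECTED_SIZE):
    """Return True if the file exists and has the expected size."""
    try:
        return os.path.getsize(path) == expected_size
    except OSError:
        # Missing or unreadable file: treat as "not updated".
        return False


# Path as named in the comment above; resolved relative to the current drive.
print(firmware_looks_updated(r"\IBMCOM\rtl_nic\rtl8168e-3.fw"))
```

A matching size is only a coarse indication that the new file was installed; it does not verify the file's contents.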

comment:6 Changed 12 years ago by Cyberpastor

With the new driver/firmware combination I still have the same report in the output of the PCI utility:

"Device Status :
Correctable Error Detected
Unsupported Request Detected"

The LAN hardware now has IRQ11 rather than IRQ12, but so do the EHCI controllers, and the machine still freezes when the trackball is moved after connecting to the Internet.

Without the /VW switch on the ACPI.PSD line, the LAN hardware and the EHCI controllers all share IRQ17 and the machine still freezes after a few trackball operations.

comment:7 in reply to:  6 Changed 12 years ago by David Azarewicz

Replying to Cyberpastor:

> With the new driver/firmware combination I still have the same report in the output of the PCI utility:
>
> "Device Status :
> Correctable Error Detected
> Unsupported Request Detected"

This is probably not relevant.

> The LAN hardware now has IRQ11 rather than IRQ12, but so do the EHCI controllers, and the machine still freezes when the trackball is moved after connecting to the Internet.

What did you do to change the interrupt assignments? According to both of your traces your LAN hardware is using interrupt 9. What makes you think it is not still 9?

> Without the /VW switch on the ACPI.PSD line, the LAN hardware and the EHCI controllers all share IRQ17 and the machine still freezes after a few trackball operations.

I'm afraid that it looks like either your hardware is defective, or your specific hardware configuration is not supported, or you have something else broken or misconfigured in your system that is causing your problems. From all the information you have provided so far it appears that the driver is working correctly and the hardware is not. If you want to install the trace version of the newest driver I sent you and attach a new trace, I'll look to see if there is any change with the new firmware.

comment:8 Changed 12 years ago by Cyberpastor

"What did you do to change the interrupt assignments? According to both of your traces your LAN hardware is using interrupt 9. What makes you think it is not still 9?"

The IRQs that I am quoting are what the PCI utility reports. I've done nothing to change them.

Maybe I'll try the latest Trace version, but the add-in NIC (Netgear GA311) seems to be doing OK with the same driver.

comment:9 Changed 12 years ago by David Azarewicz

The errors reported in the trace show that the hardware is not receiving some data packets properly which causes an error to be reported instead of the data being received. These receive errors are properly handled by the driver. The result is that some received packets are lost. This type of hardware error will not cause a system hang. It is possible that your LAN hardware is failing in a different way which is causing your system hang, but there is no evidence of that in the trace. The driver is working correctly for the entire duration of the trace and in fact, I see many packets transmitted and received successfully.
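The behaviour described here, where a receive error costs a packet but does not stall anything, can be modelled roughly as follows. This is a simplified sketch and not the driver's code: the descriptor layout (`RX_ERROR`, `LEN_MASK`) and the names `RxStats` and `process_rx` are assumptions made for illustration.

```python
from dataclasses import dataclass

# Illustrative status-word layout (assumed, not taken from the driver or trace).
RX_ERROR = 1 << 21   # hardware flagged a receive error for this frame
LEN_MASK = 0x3FFF    # low bits carry the frame length


@dataclass
class RxStats:
    delivered: int = 0
    dropped: int = 0


def process_rx(status_words, stats):
    """Return the lengths of good frames; count and skip bad ones."""
    good = []
    for status in status_words:
        length = status & LEN_MASK
        if status & RX_ERROR or length == 0:
            stats.dropped += 1   # the packet is lost, processing continues
            continue
        stats.delivered += 1
        good.append(length)
    return good


stats = RxStats()
print(process_rx([64, RX_ERROR | 128, 1500, 0], stats), stats)
```

In this model a bad descriptor is counted and skipped, and the loop moves on to the next one, which is why lost packets on their own would not be expected to hang the machine.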

comment:10 Changed 12 years ago by Cyberpastor

You don't consider PCI.EXE's report that the on-board LAN hardware is using IRQ12 -- the same as the USB controllers -- to be reliable? But if they *are* using the same IRQ, might that not explain the lock-up when I move the trackball after there has been LAN activity?

comment:11 in reply to:  10 Changed 12 years ago by David Azarewicz

Replying to Cyberpastor:

> You don't consider PCI.EXE's report that the on-board LAN hardware is using IRQ12 -- the same as the USB controllers -- to be reliable?

Well, the trace data comes directly from the driver and I believe what the driver itself tells me it is using over *anything* else. Even though the report that PCI.EXE gives you is 'informational only', on an OS/2 system it should match what the driver is actually using. In your case, the trace and the PCI output you sent me both say the LAN is on interrupt 9.

> But if they *are* using the same IRQ, might that not explain the lock-up when I move the trackball after there has been LAN activity?

No, it would not. Sharing an interrupt would not cause the problem you are having. Besides, both the MultiMac drivers and the USB drivers are known to share properly. Also, I see in the trace that the interrupt is shared and I see that it is operating properly.
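As a rough illustration of why sharing an interrupt line is normally harmless, each handler on the line checks whether its own device raised the interrupt and otherwise passes. The sketch below is a generic model, not the OS/2 kernel's actual interface; the `Device` class and `dispatch` function are hypothetical.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.pending = False   # stands in for the device's interrupt status register
        self.handled = 0

    def isr(self):
        """Claim the interrupt only if this device raised it."""
        if not self.pending:
            return False       # not ours; another handler on the line may claim it
        self.pending = False   # acknowledge the device
        self.handled += 1
        return True


def dispatch(handlers):
    """Call every handler on the shared line; return the names that claimed it."""
    return [dev.name for dev in handlers if dev.isr()]


nic, ehci = Device("nic"), Device("ehci")
ehci.pending = True            # only the USB controller raised the interrupt
print(dispatch([nic, ehci]))
```

Sharing only goes wrong in this model if a handler claims an interrupt that is not its own, or never acknowledges its device.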

The only problem I see in the trace is that the hardware reports that a packet was received, but there is no data. In fact, the status bits look totally wrong as if the hardware is broken or memory got corrupted somehow. The status bits should never be that way. That indicates that the hardware is broken or did something very wrong. It could also mean that even though the hardware reports itself as 10EC:8168, it might not really be compatible with the driver. If the hardware wrote something to memory where it is not supposed to, that would cause your problem.

comment:12 Changed 12 years ago by Cyberpastor

Although I seem not to have saved copies of the PCI.EXE output in which the on-board LAN hardware was indicated as having IRQ11 or IRQ12, I am sure that I did not imagine it. Could the IRQ assigned depend on what other devices are activated or deactivated in the BIOS?

The following may not be useful information, but I find it interesting: I installed 64-bit Linux Mint 13 in a separate partition on the same machine, and the on-board LAN seems to work fine, being assigned IRQ43. The add-in NIC works fine also, and whichever one I connect to the router gets an IP address, whereas the only way to get eCS to use the on-board networking is to remove the add-in card. The EHCI controllers get IRQ17 and IRQ18, and I browsed around the Internet for quite a while with no lockups. If the on-board LAN hardware misbehaved at all, I was not aware of it.

(You will recall that the only way I can get my SCSI devices to work in eCS is to use the /VW switch on the ACPI.PSD line so that the controller does not get assigned a high IRQ. They worked fine in Linux Mint with the controller having a high IRQ -- but I no longer recall which.)

comment:13 Changed 12 years ago by David Azarewicz

Priority: Feedback Pending → major

From what I can see, it looks like you have proved that this driver is not compatible with your hardware. Perhaps in the future the driver may get updated and I can resync and it may start working.

comment:14 Changed 11 years ago by David Azarewicz

Priority: major → Feedback Pending

Please try the 0.3.1 version that was just released.

comment:15 Changed 11 years ago by David Azarewicz

Resolution: fixed
Status: new → closed

No response from reporter. This is known to be fixed in version 0.3.1. Closing ticket.
