
Are these signs of IRQ conflicts?



xprezons
3rd September 2010, 08:14 PM
When I moved to Fedora 13 from Fedora 11, I thought the annoying application crashes that I had been having were gone for good. But they have come back to haunt me: random crashes in Firefox, Thunderbird, Abrt, metacity and what have you, all with SIGSEGV errors.

I'm beginning to believe that these are not down to Fedora or even the RAM (I have run Memtest several times). I'm not sure if what I am seeing are signs of IRQ conflicts of some sort. Can someone please have a look and give me some guidance on what to do? Should I reinstall with acpi=off? noapic? Or something else?

Also, I don't have a clue why the system complains about ECC being disabled. The RAM is non-ECC.

I have posted some sections of the logs here with links to the full logs instead of making this a massive post.


ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11) *0, disabled.
...
pnp 00:02: disabling [mem 0x00000000-0x00000fff window] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
pnp 00:02: disabling [mem 0x00000000-0x00000fff window disabled] because it overlaps 0000:02:00.0 BAR 6 [mem 0x00000000-0x0000ffff pref]
...
system 00:0c: [mem 0xafee0000-0xafefffff] could not be reserved
...
ata1: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f100 irq 22
ata2: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f180 irq 22
ata3: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f200 irq 22
ata4: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f280 irq 22
...
hub 1-0:1.0: 6 ports detected
alloc irq_desc for 19 on node 0
alloc kstat_irqs on node 0
ehci_hcd 0000:00:13.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
...



nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
sysctl net.netfilter.nf_conntrack_acct=1 to enable it.




EDAC MC: Ver: 2.1.0 Aug 27 2010
EDAC amd64_edac: Ver: 3.3.0 Aug 27 2010
EDAC amd64: This node reports that Memory ECC is currently disabled, set F3x44[22] (0000:00:18.3).
EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
amd64_edac: probe of 0000:00:18.2 failed with error -22


Complete system spec, BIOS details etc. (https://docs1.google.com/document/edit?id=1XCrfkMdD5Tx5GGeETHaJ6FSRn7Ldb6LPdSbHkzavt o0&hl=en_GB&authkey=CIeZ3usO#)

Full dmesg log on a fresh install (https://docs0.google.com/document/edit?id=1ChAgLfWANQMohTko3EEiW589HQj_AP6KRKg1ay8ei-U&authkey=CLGK3vEF&hl=en_GB#)

No issues experienced with system installation.
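
One way to check whether devices really share an interrupt is to group the dmesg lines by IRQ number. The following is only a minimal sketch, assuming Python is available; the function name `group_by_irq` and the `SAMPLE` text are illustrative (the sample lines are copied from the excerpt above), and the regex only recognises lines that end in "irq N" or "IRQ N".

```python
import re
from collections import defaultdict

# Matches "irq 22" / "IRQ 19" at the end of a line, case-insensitively.
# Lines such as "alloc irq_desc for 19" or "(IRQs 3 4 5 ...)" do not match.
IRQ_AT_END = re.compile(r"\birq (\d+)\s*$", re.IGNORECASE)


def group_by_irq(dmesg_text):
    """Map each IRQ number to the device labels that reported it."""
    groups = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = IRQ_AT_END.search(line)
        if match is None:
            continue
        # Use the text before the first ": " as the device label.
        label = line.split(": ", 1)[0].strip()
        groups[int(match.group(1))].append(label)
    return dict(groups)


# Sample lines taken from the dmesg excerpt in this post.
SAMPLE = """\
ata1: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f100 irq 22
ata2: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f180 irq 22
alloc irq_desc for 19 on node 0
ehci_hcd 0000:00:13.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
"""

if __name__ == "__main__":
    for irq, devices in sorted(group_by_irq(SAMPLE).items()):
        print(irq, devices)
```

Several ata ports on the same IRQ are not a conflict in themselves: in the excerpt they all report the same abar address, so they are ports of a single controller and share that controller's interrupt line.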

forkbomb
3rd September 2010, 08:50 PM
IRQ conflicts? In 2010, with a motherboard new enough to incorporate DDR3? :eek:

More speculative: I doubt it's IRQ conflicts now that Plug and Play is widespread. I'd be more inclined to blame a malfunctioning PNP device, a device that conforms poorly to PNP standards, a HAL bug of some sort, or just something about Fedora itself (I've seen nasty stability issues with F12 onward on one of my machines). I'd put IRQ conflicts pretty low on the list of likelihood. Then again, I don't actually know that much about how PNP works in Linux, so I'm guessing. :p

Less speculative: The ECC errors aren't alarming. The stock Fedora kernel probably has CONFIG_EDAC=m and CONFIG_EDAC_MM_EDAC=m set, which your hardware doesn't support. In other words, your kernel build supports it through modules, but the kernel sees you don't have hardware support for ECC, so it shuts down the EDAC modules because they can't be used on your hardware anyway.
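
The guess about the kernel build can be checked against the installed kernel's config file. This is a rough sketch, assuming the config is stored at /boot/config-<kernel release> (the usual Fedora location) and that Python is available. The helper name `config_value` is made up for illustration, and CONFIG_EDAC_AMD64 is an extra option not mentioned above, assumed here to be the one behind the amd64_edac module.

```python
import platform
from pathlib import Path


def config_value(config_text, option):
    """Return the value set for `option` ('y', 'm', ...) or None if it is not set."""
    prefix = option + "="
    for line in config_text.splitlines():
        # Unset options appear as "# CONFIG_FOO is not set" and are skipped here.
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


if __name__ == "__main__":
    # Assumed location of the running kernel's build configuration.
    path = Path("/boot") / ("config-" + platform.release())
    if path.exists():
        text = path.read_text()
        for option in ("CONFIG_EDAC", "CONFIG_EDAC_MM_EDAC", "CONFIG_EDAC_AMD64"):
            print(option, "=", config_value(text, option))
    else:
        print("no kernel config found at", path)
```

A value of `m` would mean the driver is built as a module, which fits the log: the module is tried at boot, finds ECC disabled, and declines to load.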

xprezons
12th October 2010, 11:15 AM
Thanks, forkbomb. I agree with your comments. This doesn't seem to be related to ACPI (like you said, it is a very new motherboard). Your input was useful for understanding the log: Fedora was checking for ECC support and disabling EDAC since the RAM is non-ECC.

I have been trying various options to fix the issue. One thing that made a big difference was a BIOS update (update F8 for the GA-MA785GT-UD3H motherboard). Unfortunately, motherboard manufacturers publish just a one-line release note, which does not help you judge whether an update will make any difference. I only figured out that this one was worth trying after consulting some gurus on the motherboard forum.

After applying this update, the crashes became far less frequent. The BIOS patch updated the AGESA code on my motherboard. With my limited knowledge of how the memory controller (which, on the newer AMD CPUs, sits within the CPU) interacts with the RAM, I am assuming that this fixed the problems with the way Fedora was accessing the RAM.
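
To confirm from within Linux which BIOS version is actually running after a flash, the DMI fields that the kernel exposes under /sys/class/dmi/id can be read. This is a minimal sketch, assuming that directory is present on the system; the function name `read_dmi_fields` is illustrative, and fields that cannot be read are reported as None.

```python
from pathlib import Path

# Directory where the kernel exposes DMI/SMBIOS identification strings.
DMI_DIR = Path("/sys/class/dmi/id")


def read_dmi_fields(dmi_dir=DMI_DIR,
                    fields=("board_name", "bios_version", "bios_date")):
    """Return a dict of DMI field name -> value (None if unreadable)."""
    info = {}
    for name in fields:
        try:
            info[name] = (Path(dmi_dir) / name).read_text().strip()
        except OSError:
            # Missing file or insufficient permissions.
            info[name] = None
    return info


if __name__ == "__main__":
    for name, value in read_dmi_fields().items():
        print(name + ":", value)
```

Comparing `bios_version` before and after the update shows whether the flash took effect.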

However, there is an occasional problem that I am still noticing. I feel this is a different issue that possibly deserves a thread of its own. ==> http://forums.fedoraforum.org/showthread.php?t=252713