We have a problem with one of our FC6 boxes.
The server is mainly a data server running several NFS and Samba shares.
Today the server stopped responding to anything but pings, so we had to power-cycle it manually.
I checked the logs and found this:
Apr 25 16:53:08 storage1 kernel: BUG: unable to handle kernel paging request at virtual address 00200200
Apr 25 16:53:08 storage1 kernel: printing eip:
Apr 25 16:53:08 storage1 kernel: c04f3b31
Apr 25 16:53:08 storage1 kernel: *pde = 69fd8067
Apr 25 16:53:08 storage1 kernel: Oops: 0000 [#1]
Apr 25 16:53:08 storage1 kernel: SMP
Apr 25 16:53:08 storage1 kernel: last sysfs file: /bus/pci/drivers/megaraid_sas/release_date
Apr 25 16:53:08 storage1 kernel: Modules linked in: nfsd exportfs lockd nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_multipath dm_mod video sbs i2c_ec i2c_core button battery asus_acpi ac parport_pc lp parport ata_piix libata ide_cd bnx2 cdrom serio_raw sg pcspkr megaraid_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Apr 25 16:53:08 storage1 kernel: CPU: 3
Apr 25 16:53:08 storage1 kernel: EIP: 0060:[<c04f3b31>] Not tainted VLI
Apr 25 16:53:08 storage1 kernel: EFLAGS: 00010292 (2.6.19-1.2895.fc6 #1)
Apr 25 16:53:08 storage1 kernel: EIP is at list_del+0x9/0x6c
Apr 25 16:53:08 storage1 kernel: eax: 00200200 ebx: e7a09798 ecx: f7fff080 edx: 00000003
Apr 25 16:53:08 storage1 kernel: esi: e7a09748 edi: c215e540 ebp: 00000282 esp: f7feaf38
Apr 25 16:53:08 storage1 kernel: ds: 007b es: 007b ss: 0068
Apr 25 16:53:08 storage1 kernel: Process events/3 (pid: 17, ti=f7fea000 task=f7e680f0 task.ti=f7fea000)
Apr 25 16:53:08 storage1 kernel: Stack: cb2af5a4 00000000 00000282 e7a09740 c04c4a94 e7a09740 e7a09748 c215e540
Apr 25 16:53:08 storage1 kernel: c04c46db c06a3e40 c06a3e44 c04368c7 00000282 c215e540 c215e560 c04c4624
Apr 25 16:53:08 storage1 kernel: 00000000 c215e560 c215e540 c215e558 00000000 c0437284 00000001 00000000
Apr 25 16:53:08 storage1 kernel: Call Trace:
Apr 25 16:53:08 storage1 kernel: [<c04c4a94>] keyring_destroy+0x28/0x65
Apr 25 16:53:08 storage1 kernel: [<c04c46db>] key_cleanup+0xb7/0xd0
Apr 25 16:53:08 storage1 kernel: [<c04368c7>] run_workqueue+0x97/0xdd
Apr 25 16:53:08 storage1 kernel: [<c0437284>] worker_thread+0xd9/0x10d
Apr 25 16:53:08 storage1 kernel: [<c0439810>] kthread+0xc0/0xec
Apr 25 16:53:08 storage1 kernel: [<c0404c03>] kernel_thread_helper+0x7/0x10
Apr 25 16:53:08 storage1 kernel: =======================
Apr 25 16:53:08 storage1 kernel: Code: 8d 46 04 e8 86 00 00 00 8d 4b 0c 8b 51 04 8d 46 0c 83 c4 18 5b 5e 5f e9 72 00 00 00 89 c3 eb e8 90 90 53 89 c3 83 ec 0c 8b 40 04 <8b> 00 39 d8 74 1c 89 5c 24 04 89 44 24 08 c7 04 24 bc 7e 65 c0
Apr 25 16:53:08 storage1 kernel: EIP: [<c04f3b31>] list_del+0x9/0x6c SS:ESP 0068:f7feaf38
Apr 25 16:53:18 storage1 kernel: <3>BUG: soft lockup detected on CPU#3!
Apr 25 16:53:18 storage1 kernel: [<c0405018>] dump_trace+0x69/0x1b6
Apr 25 16:53:18 storage1 kernel: [<c040517d>] show_trace_log_lvl+0x18/0x2c
Apr 25 16:53:18 storage1 kernel: [<c0405778>] show_trace+0xf/0x11
Apr 25 16:53:18 storage1 kernel: [<c0405875>] dump_stack+0x15/0x17
Apr 25 16:53:18 storage1 kernel: [<c04522c5>] softlockup_tick+0xad/0xc4
Apr 25 16:53:18 storage1 kernel: [<c0430d8f>] update_process_times+0x39/0x5c
Apr 25 16:53:18 storage1 kernel: [<c0419f5a>] smp_apic_timer_interrupt+0x95/0xb3
Apr 25 16:53:18 storage1 kernel: [<c0404a57>] apic_timer_interrupt+0x1f/0x24
Apr 25 16:53:18 storage1 kernel: [<c062585b>] __write_lock_failed+0xf/0x1c
Apr 25 16:53:18 storage1 kernel: DWARF2 unwinder stuck at __write_lock_failed+0xf/0x1c
Apr 25 16:53:18 storage1 kernel:
Apr 25 16:53:18 storage1 kernel: Leftover inexact backtrace:
Apr 25 16:53:18 storage1 kernel:
Apr 25 16:53:18 storage1 kernel: [<c04f3a29>] _raw_write_lock+0x5d/0x74
Apr 25 16:53:18 storage1 kernel: [<c04c4fc7>] keyring_publish_name+0x2c/0x6d
Apr 25 16:53:18 storage1 kernel: [<c04c5016>] keyring_instantiate+0xe/0x13
Apr 25 16:53:18 storage1 kernel: [<c04c3f48>] __key_instantiate_and_link+0x2f/0xa8
Apr 25 16:53:18 storage1 kernel: [<c04c51d7>] keyring_alloc+0x53/0x6a
Apr 25 16:53:18 storage1 kernel: [<c04c6854>] alloc_uid_keyring+0x4c/0xb2
Apr 25 16:53:18 storage1 kernel: [<c0431245>] alloc_uid+0x95/0x13c
Apr 25 16:53:18 storage1 kernel: [<c043416d>] set_user+0xb/0x8e
Apr 25 16:53:18 storage1 kernel: [<c0435c1f>] sys_setresuid+0x111/0x1dd
Apr 25 16:53:18 storage1 kernel: [<c040404b>] syscall_call+0x7/0xb
Apr 25 16:53:18 storage1 kernel: =======================
This looks quite serious, and I can't figure out what caused it. Does anyone have any ideas?
If you need more info on the machine, please let me know.
Thanks in advance.