18th April 2007, 12:33 PM
Registered User
Join Date: Nov 2004
Location: Kentucky
Posts: 131
sockets are hanging
i have an interesting problem where after several minutes, the sockets on my FC6 x64 server start to hang in various states.
free indicates i have plenty of free memory
top indicates i'm only getting 20% load spikes, which is normal
netstat confirms active sockets are just sitting in a state of "SYN_RECV" or "ESTABLISHED"
if i try to telnet or ssh into this server, i see the banner and upon entering my password the session hangs!
ssh hangs after this:
Code:
debug2: we sent a password packet, wait for reply
debug1: Authentication succeeded (password).
debug1: channel 0: new [client-session]
debug2: channel 0: send open
debug1: Entering interactive session.
and eventually fails with a "connection reset by peer" error.
selinux is set to permissive
i have plenty of hard drive space
my interfaces eth0 and eth1 are not dropping packets at all, and have a clean bill of health overall.
telnetting to my postfix daemon gets me locked up in a session that doesn't respond (can't even quit!)
even my dovecot daemon hangs up.
this problem occurs on multiple switches (i've tried several to resolve it) and has plagued me for 4 weeks!
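For anyone trying to reproduce the netstat check above, here is a minimal sketch (not something run on the server in this thread) that tallies TCP sockets by state from `netstat -ant`-style text. The function name `count_tcp_states` and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP sockets per state from `netstat -ant`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: proto recv-q send-q local foreign state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample, NOT output from the server discussed here.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:25           198.51.100.7:41234      SYN_RECV
tcp        0      0 192.0.2.10:25           198.51.100.8:41999      SYN_RECV
tcp        0     48 192.0.2.10:22           198.51.100.9:50022      ESTABLISHED
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(f"{state:12} {n}")
```

A pile of SYN_RECV entries that never drains would match the symptom described; on a live box you would feed in the real netstat output instead of SAMPLE.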
__________________
The sage does not hoard. The more he helps others, the more he benefits himself, The more he gives to others, the more he gets himself. The Way of Heaven does one good but never does one harm. The Way of the sage is to act but not to compete.
--Lao Tzu

18th April 2007, 04:18 PM
Registered User
Join Date: Jun 2005
Location: Leeds
Posts: 1,264
Does dmesg give you any indication?
Also have a look in /var/log/messages and /var/log/secure for other scraps of info.
Ibbo
__________________
A Hangover Lasts A Day, But Our Drunken Memories Last A Lifetime
--
Linux user #349545
(GNU/Linux)iD8DBQBAzWjX+MZAIjBWXGURAmflAKCntuBbuKCWenpm XoA7LNydllVQOwCfdjyzXscddzQvlhBedAcD7qfKmHo==zx0H

18th April 2007, 04:59 PM
kernel is:
2.6.20-1.2933.fc6 #1 SMP
nope, nothing out of the ordinary except this:
Quote:
Apr 16 17:04:28 mail kernel: Calibrating delay using timer specific routine.. 6532.40 BogoMIPS (lpj=13064806)
Apr 17 13:16:25 mail kernel: Nvidia board detected. Ignoring ACPI timer override.
Apr 17 13:16:25 mail kernel
Apr 17 13:16:27 mail kernel: Calibrating delay using timer specific routine.. 5232.43 BogoMIPS (lpj=2616218)
Apr 17 13:16:28 mail kernel: Using local APIC timer interrupts.
Apr 17 13:16:28 mail kernel: Detected 12.557 MHz APIC timer.
Apr 17 13:16:28 mail kernel: Calibrating delay using timer specific routine.. 5216.59 BogoMIPS (lpj=2608296)
Apr 17 13:16:30 mail kernel: Disabling vsyscall due to use of PM timer
Apr 17 13:16:30 mail kernel: time.c: Using 3.579545 MHz WALL PM GTOD PM timer.

24th April 2007, 03:49 PM
this condition is persisting!
crap crap crap! i should have been a plumber!
i can't figure out the solution, but it started in fedora 5, and an upgrade to 6 (not a reinstall) has not helped any.
more tips and hints for anyone interested:
the issue affects local user terminal logins too, causing them to time out entirely. pam.d??
Postfix complains of dropped connections midway through EHLO and CONNECT, indicating that yes, people can reach my server, but my server may or may not respond to anything they say (or not in a timely manner).
switch is OK
interfaces are OK
cables are OK
network restart, postfix restart, saslauthd restart, and various other daemon restarts do not fix this issue
after a reboot the condition still recurs at random.
telinit to various runlevels won't help.
a kernel change doesn't help
no errors or warnings are logged by any process.
could this be a corrupt device driver? we had a power outage recently.
i'm beginning to doubt my Linux zen....

24th April 2007, 04:30 PM
It's definitely sounding like a hardware issue.
APIC makes me think this, as I have had lots of APIC issues in the past that also killed my sockets. Do you have another NIC lying around you can stick in and play with?
Or is this coming from your router?
Ibbo

24th April 2007, 05:09 PM
hmmm...shouldn't apic throw warnings?
i have two nics, the infamous integrated GBnic's found on most HP ProLiant servers. both eth0 and eth1 exhibit this (and you're correct, apic would be a precursor for such behavior)
pci=noacpi has been specified as a boot parameter in the current kernel...here's to crossing ma' fingers!
Last edited by nimbius; 24th April 2007 at 05:39 PM.

24th April 2007, 06:09 PM
no good. acpi does not appear to be the issue

25th April 2007, 10:52 AM
"the infamous integrated GBnic's"
Could it be these cards that are the problem? Do you have a loose card you can stick in and test?
Ibbo

25th April 2007, 05:37 PM
and i thought this as well, however mii-tool, ethtool, and ifconfig all confirm the gbnics are functioning 100% normally with no magical cisco flow control or anything at the switch.
I tried changing a setting in my resolv.conf. one of my interns specified a domain name (domain=). this might not be a good idea as the server operates in two domains (our intranet and our extranet). I also removed the primary nameserver from resolv.conf, which was set to our internal resolver for our office machines, and replaced it with 0.0.0.0 (good old onboard caching nameserver). the server, as one can imagine a mail server would, does a staggering number of resolutions.
as of this time...the system is functioning normally again.
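Since the resolv.conf edit is only described in words, here is a rough sketch of how one might list what a resolv.conf actually contains before and after such a change. The real file was never posted, so the sample contents, the example domain names, and the helper name `parse_resolv_conf` are all assumptions. Note that the file format uses `domain name` with a space rather than `domain=`.

```python
def parse_resolv_conf(text: str) -> dict:
    """Collect nameserver, domain and search entries from resolv.conf text."""
    config = {"nameserver": [], "domain": None, "search": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and comment lines (starting with '#' or ';') are skipped.
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif keyword == "domain" and values:
            config["domain"] = values[0]
        elif keyword == "search":
            config["search"] = values
    return config


# Hypothetical file contents; the real resolv.conf from this thread is not shown.
SAMPLE = """\
# local caching nameserver first, as described in the post
nameserver 0.0.0.0
nameserver 192.0.2.53
search intranet.example extranet.example
"""

if __name__ == "__main__":
    parsed = parse_resolv_conf(SAMPLE)
    print("nameservers:", parsed["nameserver"])
    print("domain:", parsed["domain"])
    print("search:", parsed["search"])
```

Running it against /etc/resolv.conf on the affected box (read the file and pass its text in) would show at a glance which resolver is tried first and whether a domain or search list is set.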

26th April 2007, 12:47 PM
"as of this time...the system is functioning normally again."
Ah, the good old "solved itself". My favourite bit of sys admin.
Ibbo

29th April 2007, 05:58 PM
crap. never mind. thought it was resolution but that would have been too simple. resolution is perfect.
the condition occurs again if i try to transfer a large file through scp to the system, or if the system comes under heavy load (many connections). connections time out, and network services hang too. if however i'm logged in locally on a terminal, that terminal is OK. wtf?

30th April 2007, 06:49 AM
both onboard nic ports are Broadcom BCM5721. downing eth1 (intranet) restores connectivity. i'm thinking this is an issue with Broadcom nics and the latest kernel?
nics have been bonded for further testing...just one network now.

19th May 2007, 12:37 PM
ultimate issue for reference: with its two nics on separate subnets, this fedora core 6 box was not stable. the system stopped responding to all network requests, either immediately or after a delay, and local tty logins would also hang.
system has been reverted to primary subnet, both nics bonded.
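For reference, a small sketch of how to check whether two interface addresses sit on the same subnet, which is the layout difference described above. The thread never gives the real addresses, so the ones below come from documentation ranges, and the helper name `same_subnet` is made up.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str) -> bool:
    """True if two 'address/prefix' strings belong to the same network."""
    net_a = ipaddress.ip_interface(addr_a).network
    net_b = ipaddress.ip_interface(addr_b).network
    return net_a == net_b


# Hypothetical addresses from documentation ranges, not the real ones.
INTERFACES = {
    "eth0": "192.0.2.10/24",     # stands in for the extranet side
    "eth1": "198.51.100.10/24",  # stands in for the intranet side
}

if __name__ == "__main__":
    if same_subnet(INTERFACES["eth0"], INTERFACES["eth1"]):
        print("eth0 and eth1 share a subnet")
    else:
        print("eth0 and eth1 are on separate subnets")
```

This only flags the layout; it says nothing about why that layout misbehaved on this particular box.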