..reading.. |
A server was all bottled up and causing my my cell phone to page me like crazy. In fact, the message buffer got all full -- not that I needed more pages to know my dinner date at LeCheval (yummy!) had ended.
A monitoring script (cron'd) was restarting the spun Apache instance every 5 minutes.
I use the no more children >= MaxChildren as an error state on this machine. The /server-status showed about 400 apaches in the ..reading.. state. Never seen that before.
Server Version: Apache
Server Built: May 3 2005 02:54:24
Current Time: Wednesday, 18-Jan-2006 19:14:25 PST
Restart Time: Wednesday, 18-Jan-2006 19:12:26 PST
Parent Server Generation: 0
Server uptime: 1 minute 59 seconds
Total accesses: 3677 - Total Traffic: 35.6 MB
CPU Usage: u30.5781 s8.79688 cu9.40625 cs0 - 41% CPU load
30.9 requests/sec - 306.5 kB/second - 9.9 kB/request
418 requests currently being processed, 17 idle workers
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRWRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRWRRRRRRRRRRRRRRRRRRRRRWR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRWRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR_RRRRRRRRRRRR
RWWRWRWRRRRRRRRRRWRWRR_R_WWRWRRWWWRWWRWRRWWRWWWRWWWWWWRWRRRRRRRR
RRWWWWRRRWRWWWWRRWWWWRWWWWRWRWWRWWWWRRWR_WWRWWWWWRRWW_WWW_RWW__R
RW__WRRRWRRW_CWCWC_RWCRW_WR_R_WWCWWWWRRWW__WWRWWWRW.............
................................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
Srv PID Acc M Slot Client VHost Request
0-0 99372 0/5/5 R 0.00 ? ? ..reading..
1-0 99373 0/33/33 R 0.19 ? ? ..reading..
2-0 99374 0/31/31 R 0.03 ? ? ..reading..
3-0 99375 0/8/8 R 0.01 ? ? ..reading..
4-0 99376 0/4/4 R 0.00 ? ? ..reading..
5-0 99377 0/10/10 R 0.02 ? ? ..reading..
6-0 99378 0/5/5 R 0.00 ? ? ..reading..
7-0 99380 0/13/13 R 0.00 ? ? ..reading..
8-0 99381 0/10/10 R 0.13 ? ? ..reading..
9-0 99382 0/8/8 R 0.00 ? ? ..reading..
10-0 99383 0/25/25 R 0.11 ? ? ..reading..
11-0 99384 0/22/22 R 0.16 ? ? ..reading..
ect...
|
Mike mentioned that all the requests were coming from the same IP. As a matter
of course I had logged all the open sockets with 'sockstat > /tmp/sockstat-all-locked-up' in a tmp file. I looked through that log, and indeed all the conx were coming from 203.115.205.70 -- Kuala Lumpur, Malaysia! Hmmm... fishy. I blocked that IP in the firewall:
ipfw add 103 deny log ip from 203.115.205.70 to any
|
What was this reading state anyways? I grepped for reading in the apache source tree...
#find . -depth 3 -exec grep -H -B 2 -A 3 "\.\.reading\.\." {} \;
if (ws_record->status == SERVER_BUSY_READ)
ap_rprintf(r,
" | ? | ? | ..reading.. |
\n\n");
else
ap_rprintf(r,
"%s | %s | %s | \n\n",
I searched the net, and found an email from one year ago on apache-httpd-dev with the title " Meaning of ..reading.. in server-status and SERVER_BUSY_READ". That looks perfect! The author was Massimiliano Panichi, but he had the same question I did! No one answered! Poop on apache-httpd-dev list.
I searched the web and found Admission Control and Scheduling for High-Performance WWW Servers published in 1997! It actually had exactly what I needed: an explaination of SERVER_BUSY_READ.
I was able to telnet to a web server on port 80, leave the telnet session hung, and then see the ..reading.. state in the /server-status. (I then typed GET /CHECK THIS OUT!!!! <ENTER>, reload on the /server-status immediately, and saw that the same child was indeed changed from ..readind.. to Get /CHECK THIS OUT!!!)
This eliminated the possiblilty that the server had gotten a request and was acting on it - phew!. Perhaps some sort of Header DoS was in effect.
The trap. Just to check, I set up a trap for that IP. On the server, I have a netcat instance waiting for an incoming connection. nc -l -p 666 localhost > /tmp and in the firewall, I am forwarding requests to nc from the offending IP: ipfw add 101 fwd localhost,666 tcp from 203.115.205.70 to any . Nothing in the trap yet, the deny rule I set an hour ago seems to have run the bunny off.
|
Time for bed! Tomorrow, I'll reread some tuning advice and test dropping the apache Timeout value from 300 to 10. (If you are concerned about such a low value, you can try 30 seconds). What is Timeout? Read this ApacheWeek Article.
Timeout 300
|