CLEANACCESS Archives

September 2005

CLEANACCESS@LISTSERV.MIAMIOH.EDU

Subject:
From:
Ryan Dorman <[log in to unmask]>
Reply To:
Perfigo SecureSmart and CleanMachines Discussion List <[log in to unmask]>
Date:
Thu, 1 Sep 2005 17:29:05 -0400
Content-Type:
text/plain
Parts/Attachments:
text/plain (173 lines)
The kclick process is the proprietary routing daemon that Perfigo/CCA uses.  It captures all available resources on the CAS.  When we first got Clean Machines, the load averages on the machines made me nauseous :) My general rule for other *NIX boxen is load = # of CPUs (in a perfect world)....
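
A minimal sketch of the "load = # of CPUs" rule of thumb above, in Python. The function names and the 1.0 threshold are illustrative assumptions, not anything shipped with CCA, and as this thread points out, a CAS running kclick will look "overloaded" by this measure even when it is healthy.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def looks_overloaded(load_avg, cpu_count, threshold=1.0):
    """Rule of thumb only: flag a box whose per-CPU load exceeds threshold.

    threshold=1.0 encodes "load = # of CPUs"; it is an assumed default.
    """
    return load_per_cpu(load_avg, cpu_count) > threshold


def local_status():
    """Apply the rule to the machine this runs on (Unix-like systems only)."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1
    return looks_overloaded(one_min, cpus)
```

For example, looks_overloaded(2.55, 1) flags a single-CPU box at the 1-minute load quoted later in this thread, which is exactly why load alone is a poor health signal on a CAS.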

Are you noticing a significant performance hit for subnets protected by CCA?

Ryan Dorman, CCNP
Network Communications Specialist
Communications and Network Services
Millersville University



-----Original Message-----
From: Perfigo SecureSmart and CleanMachines Discussion List on behalf of Michael Grinnell
Sent: Thu 9/1/2005 4:49 PM
To: [log in to unmask]
Subject: Re: [PERFIGO] Https 404's and High Load Average
 
We have the same model, but with 2GBs of RAM running in virtual  
gateway mode.  Our CASs seem to be running ok.  Your load avg does  
seem a bit high, but I've noticed on the CAS that it always seems to  
run at 100% busy.  Here are our numbers:

   4:36pm  up 12 days,  5:08,  1 user,  load average: 2.55, 2.30, 2.22
93 processes: 90 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  1.1% user, 98.8% system,  0.0% nice,  0.0% idle
Mem:  2059480K av,  728240K used, 1331240K free,      60K shrd,   50744K buff
Swap: 1048552K av,       0K used, 1048552K free                  404620K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1167 root      14   0     0    0     0 RW   92.7  0.0 17243m kclick
30325 root      14   0  1012 1012   752 R     3.3  0.0   0:01 top
 1475 root       9   0  193M 193M 14900 S     0.9  9.6  94:55 java
22982 root       9   0  193M 193M 14900 S     0.1  9.6   0:00 java
    1 root       8   0   520  520   452 S     0.0  0.0   0:04 init

And from the event log heartbeat:
Sep  1 16:12:22 <snip> Perfigo: CleanAccessServer:192.168.10.10 System Stats:    Load factor  0 (max since reboot: 477)    Mem (bytes) Total: 2108907520 Used: 743907328 Free: 1365000192 Shared: 61440 Buffers: 51953664 Cached: 413302784    CPU User: 1% Nice: 0% System: 99% Idle: 0%
Sep  1 16:12:22 <snip> Perfigo: CleanAccessServer:192.168.10.20 System Stats:    Load factor  1 (max since reboot: 325)    Mem (bytes) Total: 2108907520 Used: 568860672 Free: 1540046848 Shared: 61440 Buffers: 51179520 Cached: 294789120    CPU User: 0% Nice: 0% System: 100% Idle: 0%
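
If you want to track these heartbeat numbers over time, here is a rough Python sketch that pulls the fields out of a "System Stats" line. The field layout is inferred only from the two log lines above, and the names parse_heartbeat and SAMPLE are made up for illustration; other CCA versions may format the line differently.

```python
import re

# Patterns are guesses based on the two heartbeat lines quoted in this message.
LOAD_RE = re.compile(r"Load factor\s+(\d+)\s+\(max since reboot:\s*(\d+)\)")
FIELD_RE = re.compile(
    r"\b(Total|Used|Free|Shared|Buffers|Cached|User|Nice|System|Idle):\s*(\d+)%?"
)

# First heartbeat line from the message, with the mail wrapping left in.
SAMPLE = (
    "Sep  1 16:12:22 <snip> Perfigo: CleanAccessServer:192.168.10.10  \n"
    "System Stats:    Load factor  0 (max since reboot: 477)    Mem  \n"
    "(bytes) Total: 2108907520 Used: 743907328 Free: 1365000192 Shared:  \n"
    "61440 Buffers: 51953664 Cached: 413302784    CPU User: 1% Nice: 0%  \n"
    "System: 99% Idle: 0%"
)


def parse_heartbeat(line):
    """Return a dict of integer stats parsed from one heartbeat entry."""
    # Collapse runs of whitespace so mail-wrapped entries parse like one line.
    text = " ".join(line.split())
    stats = {}
    load = LOAD_RE.search(text)
    if load:
        stats["load_factor"] = int(load.group(1))
        stats["load_max"] = int(load.group(2))
    # Memory fields are in bytes, CPU fields in percent.
    for name, value in FIELD_RE.findall(text):
        stats[name.lower()] = int(value)
    return stats


if __name__ == "__main__":
    parsed = parse_heartbeat(SAMPLE)
    print(parsed["system"], parsed["used"] / parsed["total"])
```

The point of parsing rather than eyeballing is that "System: 99%" shows up on every entry, so memory use and the "max since reboot" load factor are the fields more likely to change between a healthy and an unhealthy CAS.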

I would look at other factors than just load.  You might take a look  
at the apache logs (/perfigo/access/apache/logs/) and the tomcat logs  
(/perfigo/access/tomcat/logs) for more clues.

Also, Gigabit Ethernet over copper (1000BASE-T) relies on  
auto-negotiation, so hard-setting speed and duplex is not supported,  
IIRC.  You are supposed to use auto.  Here's our modules.conf:
alias eth0 bcm5700
alias eth1 bcm5700
alias parport_lowlevel parport_pc
<snip>

Hope that helps,

Michael Grinnell
Network Security Administrator
The American University
e-mail: [log in to unmask]

On Sep 1, 2005, at 4:18 PM, Brad Kramer wrote:

> Sorry, should have mentioned that I have Broadcoms; in the modules.conf,
> we hard-set the speed/duplex to 1000 and full....
>
> I honestly think this has something to do with load....
>
> Here's the first 20 or so lines of top
>
> 4:15pm  up 20 days, 23:31,  1 user,  load average: 3.40, 3.49, 3.47
> 99 processes: 96 sleeping, 3 running, 0 zombie, 0 stopped
> CPU states:  2.6% user, 97.3% system,  0.0% nice,  0.0% idle
> Mem:  3597396K av,  728352K used, 2869044K free,      56K shrd,   55248K buff
> Swap: 1048552K av,       0K used, 1048552K free                  485632K cached
>
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
>  1183 root      14   0     0    0     0 RW   90.1  0.0 29273m kclick
> 18026 root      14   0  1016 1016   752 R     3.0  0.0   0:03 top
>  1479 root       9   0  106M 106M 14892 S     1.9  3.0  65:35 java
>    16 root       9   0     0    0     0 SW    0.7  0.0   0:51 kjournald
>   566 root       9   0   616  616   508 S     0.3  0.0   0:58 syslogd
>     1 root       8   0   520  520   452 S     0.0  0.0   0:08 init
>     2 root       8   0     0    0     0 SW    0.0  0.0   0:00 keventd
>     3 root      19  19     0    0     0 RWN   0.0  0.0   0:00 ksoftirqd_CPU0
>     4 root       9   0     0    0     0 SW    0.0  0.0   0:00 kswapd
>     5 root       9   0     0    0     0 SW    0.0  0.0   0:00 kreclaimd
>     6 root       9   0     0    0     0 SW    0.0  0.0   0:00 bdflush
>     7 root       9   0     0    0     0 SW    0.0  0.0   0:00 kupdated
>     8 root      -1 -20     0    0     0 SW<   0.0  0.0   0:00 mdrecoveryd
>   131 root       9   0     0    0     0 SW    0.0  0.0   0:00 kjournald
>   571 root       9   0  1108 1108   444 S     0.0  0.0   0:00 klogd
>   713 root       9   0  1252 1252  1112 S     0.0  0.0   0:00 sshd
>   765 root       9   0  1272 1272   904 S     0.0  0.0   0:00 nessusd
>   783 root       8   0   676  676   568 S     0.0  0.0   0:00 crond
>   819 daemon     9   0   572  572   488 S     0.0  0.0   0:00 atd
>  1060 root       9   0   936  936   860 S     0.0  0.0   0:00 _plutorun
>  1061 root       9   0   400  400   340 S     0.0  0.0   0:00 logger
>  1065 root       9   0   936  936   860 S     0.0  0.0   0:00 _plutorun
>
> A little farther down we have about 20 instances of java...
>
> Let me know what you think...
> Thanks
> -Brad Kramer
>
>
> On 9/1/05 4:02 PM, "Simon Bell" <[log in to unmask]> wrote:
>
>
>> check your speed and duplex? ifconfig on the CAS/CAM
>>
>>
>>>>> [log in to unmask] 9/1/2005 10:17 AM >>>
>>>>>
>> Here at Ashland we also had a "successful" rollout of CCA this fall...
>> We are currently running 3.5.4, and we have experienced a small hiccup.
>>
>> Most users are getting the initial redirect page just fine; however, the
>> secure web server times out on many requests, and when it does respond it
>> is extremely slow....
>>
>> Currently we have roughly 1000 users behind an inline CAS.
>> The CAS is running on a Compaq DL360 3.4 GHz with 4 GB RAM....
>> I think it is due to high load on the server, because the load average
>> sits around 2.5 all day (through slow times and busy times), while our
>> RAM is virtually untouched...
>> I have contacts at other universities that are running 5x more users
>> behind a single CAS with no problems...... Any suggestions?
>>
>> Thanks!
>>
>> -------------
>> Bradley W. Kramer
>> Network/Telecom Intern.
>> Ashland University
>> (419) 289-5630
>> [log in to unmask]
>>
>
> -------------
> Bradley W. Kramer
> Network/Telecom Intern.
> Ashland University
> (419) 289-5630
> [log in to unmask]
>
