RESCOMP Archives

July 2009

RESCOMP@LISTSERV.MIAMIOH.EDU

Subject:
From:
"Robin, Robin" <[log in to unmask]>
Reply To:
Research Computing Support <[log in to unmask]>, Robin, Robin
Date:
Fri, 17 Jul 2009 18:34:49 -0400
Hi,

These are the nodes that have inexplicably high load today.
I have disabled them in Torque.

I logged in to them and couldn't see anything running that could account
for the load.

The file servers are fairly busy, but we don't quite see those.

We should run some test jobs against them before re-enabling them.

Not sure if it's related to raodm's jobs (getting access to his file
spaces, etc.).

Since Torque schedules machines starting from c4-X, when something bad is
happening on the c4-X nodes, jobs still tend to get scheduled there.

I did not see any apparent hardware issues.

Those compute nodes have over 580 days of uptime.

We should add monitoring to include load averages of the machines.
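
As a starting point, a load check could look something like the sketch
below. It assumes a Linux node where /proc/loadavg is readable; the
threshold (1.5 times the core count) and the function names are made up
for illustration, not taken from any existing tool of ours.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1min, 5min, 15min) floats."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def is_overloaded(load1, ncpus, factor=1.5):
    """Return True when the 1-minute load exceeds factor * ncpus.

    The factor of 1.5 is an arbitrary illustrative threshold.
    """
    return load1 > factor * ncpus


def check_local_node(factor=1.5):
    """Read the local load average and compare it to the core count."""
    with open("/proc/loadavg") as f:
        load1, _load5, _load15 = parse_loadavg(f.read())
    ncpus = os.cpu_count() or 1
    return load1, ncpus, is_overloaded(load1, ncpus, factor)


if __name__ == "__main__":
    load1, ncpus, bad = check_local_node()
    status = "HIGH" if bad else "ok"
    print(f"load1={load1:.2f} cpus={ncpus} status={status}")
```

Run from cron on each node (or over ssh from the head node), this would
at least flag a node before jobs get scheduled onto it.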

Robin
