RESCOMP Archives

December 2008

RESCOMP@LISTSERV.MIAMIOH.EDU

Subject:
From: "Woods, David M. Dr." <[log in to unmask]>
Reply To: Research Computing Support <[log in to unmask]>, Woods, David M. Dr.
Date: Tue, 2 Dec 2008 12:01:51 -0500
Theresa,

  I'll take a look at this and see if I can find any more clues about what is happening.  I'm wondering if this might be a filesystem or network issue.

  If I need more info I'll e-mail you.

Dave


-----Original Message-----
From: Research Computing Support [mailto:[log in to unmask]] On Behalf Of Ramelot, Theresa A. Dr.
Sent: Tuesday, December 02, 2008 11:40 AM
To: [log in to unmask]
Subject: Xplor-nih parallel problem

Hi,

I have a problem with xplor-nih that seems to be related to the parallel
processors.  This problem has occurred for both Mike Kennedy and myself.
It's kind of hard to explain via e-mail, so we may have to set up a meeting
if it's confusing.

Mike Kennedy has an example in:
/shared/mkennedy/kennedy_king_file0/MaR214a/structurecalcs/xplor_40

We start the parallel job with:
sh
source /software/autostructure/AutoStructure-2.2.1/bin/autostructure
sh ./run-xplor-sh

The file run-xplor-sh says to use 25 processors (np), calculate 5
structures on each (nstr), and keep the 10 lowest-energy structures (nb) at
the end.

This creates the files:  sa_for_AS_#.inp
There are 25 of them, one for each processor, and the first one creates
files MaR214a_sa_001.pdb - MaR214a_sa_005.pdb, the second one, 6-10, etc.
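
The numbering described above can be written down as a small sketch. It assumes the structure numbers are contiguous and zero-padded to three digits, as in MaR214a_sa_001.pdb; the function names are made up for illustration and are not part of xplor-nih or AutoStructure.

```python
def structure_range(job_index, nstr=5):
    """Return (first, last) structure numbers for a 1-based input file index.

    Assumes each input file produces nstr consecutive structures,
    e.g. file 1 -> 1..5, file 2 -> 6..10.
    """
    first = (job_index - 1) * nstr + 1
    last = job_index * nstr
    return first, last


def expected_pdb_names(job_index, nstr=5, prefix="MaR214a_sa_"):
    """List the .pdb names that sa_for_AS_<job_index>.inp is expected to write."""
    first, last = structure_range(job_index, nstr)
    return [f"{prefix}{n:03d}.pdb" for n in range(first, last + 1)]


if __name__ == "__main__":
    for job in (1, 2, 17):
        print(job, expected_pdb_names(job))
```

With nstr=5, input file 17 maps to structures 81 through 85.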

What is happening is that some of the jobs on some of the processors are
failing.  You can see that MaR214a_sa_080-100.pdb are missing.  Of these,
81-85.pdb should have been created by sa_for_AS_17.inp.
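
To find which input files need to be rerun, something like the following might help. It is only a sketch: it assumes 25 input files with 5 structures each, three-digit zero-padded numbering, and the MaR214a_sa_ prefix; the function name is hypothetical.

```python
import os


def jobs_with_missing_output(existing, num_jobs=25, nstr=5, prefix="MaR214a_sa_"):
    """Map each input file name to the .pdb files it should have written but did not.

    existing: iterable of file names present in the run directory.
    num_jobs: number of input files (np in run-xplor-sh).
    nstr:     structures calculated per input file.
    """
    present = set(existing)
    missing = {}
    for job in range(1, num_jobs + 1):
        first = (job - 1) * nstr + 1
        names = [f"{prefix}{n:03d}.pdb" for n in range(first, first + nstr)]
        absent = [name for name in names if name not in present]
        if absent:
            missing[f"sa_for_AS_{job}.inp"] = absent
    return missing


if __name__ == "__main__":
    # Run from inside the structure calculation directory.
    for inp, names in jobs_with_missing_output(os.listdir(".")).items():
        print(inp, "is missing", ", ".join(names))
```

Only input files with at least one missing structure are reported, so an empty result means every expected .pdb file is present.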

If I run the same file again it runs fine:
module load xplor-nih
xplor < sa_for_AS_17.inp > sa_for_AS_17.out2 &

There is a little bit of information in the failed output file:
sa_for_AS_17.out.

It seems to have trouble finding the files:
/software/xplor-nih/xplor-nih-2.20/databases/c13/shifts/rcoil_c13.tbl
or
expected_edited_c13.tbl

But, it fails only some of the time on some of the processors.  Sometimes it
works fine.  I had a week about a month ago where it happened every time,
but this week, I haven't had the problem at all.  Only Mike.
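
Since the failed output points at database files that could not be found, one thing that could be tried is a quick readability check run on the node where a job failed. This is a rough sketch, not a diagnosis: the function name is hypothetical, and it only tests whether a file can be opened at the moment it is called, so an intermittent problem may not show up.

```python
import os


def unreadable_files(paths):
    """Return the paths that cannot be opened for reading right now."""
    bad = []
    for path in paths:
        try:
            with open(path, "rb") as handle:
                handle.read(1)  # force an actual read, not just an open
        except OSError:
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Path taken from the failed output file mentioned in this message.
    to_check = [
        "/software/xplor-nih/xplor-nih-2.20/databases/c13/shifts/rcoil_c13.tbl",
    ]
    for path in unreadable_files(to_check):
        print("cannot read:", path)
```

Running it from each compute node, or at the start of each job, would show whether the failures follow particular nodes.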

Thanks,

Theresa

 ________________________________________
 Theresa Ramelot, PhD
 105 Hughes Laboratory
 Department of Chemistry and Biochemistry, Miami University

 FedEx address:
 701 E. High Street
 Department of Chemistry and Biochemistry, Miami University
 Oxford, OH   45056

 lab phone:   (513) 529-0283
 cell phone:  (513) 593-2402
 fax:  (513) 529-5715
 __________________________________________
