Thank you, Brian, for your reply.
I am using MPICH2, which needs an mpd.hosts file in the home directory.
I was not intentionally asking for the machine list.
I just ssh'ed to hpc01, hpc02, and hpc03 (i.e., opened three ssh windows), went into gdb mode, and then observed the rank number of the three processors in the three windows respectively. That is how I got the observation described in my last email.
I wonder if this indicates that there is a potential error in my program.
By the way, I have another problem, as follows. Please give me some advice if you can. Thanks in advance.
--------------
Our project is to develop a parallel program that searches a large document collection for documents (*.txt files) matching the query requirements. We have now finished the search engine program using C++ and MPI on Linux. We compiled and tested the program in command-line mode.
To complete the project, we need to create an interface for the search engine program on a Linux box so that the system can accept query words and display query results in a user-friendly fashion (a window interface in which users can simply click a button to submit their queries and see the results displayed in a table; a document shown in a row of the table can be opened when the user clicks on that row).
We do not have any experience with this, and we do not know which programming language is best suited to our case. Can anyone please share their experience with us or provide some helpful advice?
--------------
Yong Chen from China
> CC: cy163_at_[hidden]
> From: brbarret_at_[hidden]
> Subject: Re: LAM: same machine different rank number in different runs
> Date: Thu, 7 Jun 2007 22:49:23 -0600
> To: lam_at_[hidden]
>
> On Jun 7, 2007, at 10:44 PM, chenyong wrote:
>
> > Is it normal a machine in a cluster is assigned different rank
> > number in different runs.
> > In my case, there are three machines (hpc01, hpc02, hpc03) in the
> > cluster. the content of the file mpd.hosts is as follows
> >
> > hpc01
> > hpc02
> > hpc03
> >
> > I found that in some runs, hpc01 has rank number '0' hpc02 has rank
> > number '1' hpc03 has rank number '2';
> > the order of rank numbers just follows the order of machine names
> > listed in the file.
> > However, in some other runs, hpc01 has rank number '0', hpc02 has
> > rank number '2' , hpc03 has rank number '1'.
> > the order does not follow the file name order.
> > Is this nornal or not.
>
> Are you using LAM/MPI or MPICH2? The mpd.boot suggests MPICH2, in
> which case you would be best off asking the MPICH lists. This
> behavior would be highly unusual for LAM/MPI. If you are using
> LAM/MPI, are you always running lamboot from the same node? Are you
> running multiple jobs at the same time?
>
> Thanks,
>
> Brian
>
> --
> Brian Barrett
> LAM/MPI Developer
> Make today a LAM/MPI day!