LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-09-03 04:46:29


On Sep 2, 2004, at 8:00 PM, William Boatin wrote:

> before i ask my real question, can someone please tell me how to
> reply to any responses to this thread, without creating a new thread?
> in the past i have had to reply to responses to my question by
> starting a new thread instead of continuing the old one.

By "new thread," do you mean a new thread on the mailing list archives?
  If so, it's probably a function of your mailer. Most mail clients
automatically put in the Right Headers (In-Reply-To / References) when
you "Reply" to a message in a thread, and our web archiver script
(MHonArc) usually handles this properly.
<more-detail-than-most-people-probably-want>

I haven't looked at your messages closely, but I'm guessing that your
mail client isn't doing this, and therefore MHonArc is probably
treating your messages as new threads.

MHonArc also has a mode where it can additionally thread messages by
stringing together identical (or nearly identical) subject lines. But
we had to disable that feature because we've been running this list for
many years, and over the years, people tend to use the same subject
lines (e.g., "lamboot problem", "mpirun help", etc.). So messages sent
in 2001 would show up in threads in 1999, and so on.

</more-detail-than-most-people-probably-want>
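
If you want to check what your mailer is doing, here is a minimal sketch, assuming you can save one of your replies as a plain-text file with Unix line endings. It only looks for the two standard threading headers (In-Reply-To and References) in the header block; the function name has_thread_headers and the file name message.eml are made up for illustration.

```shell
# has_thread_headers FILE
# Succeeds if the header block of FILE (everything up to the first blank
# line) contains an In-Reply-To: or References: header.
has_thread_headers() {
    # sed prints lines up to and including the first empty line, then quits,
    # so a matching string in the message body is not counted.
    sed '/^$/q' "$1" | grep -qiE '^(In-Reply-To|References):'
}

# Example use (message.eml is a hypothetical saved reply):
# if has_thread_headers message.eml; then
#     echo "threading headers present"
# else
#     echo "no threading headers; an archiver will likely start a new thread"
# fi
```

If the headers are missing from a message you sent with "Reply", the mail client is the most likely culprit.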

> now my question is about xwindows and x11 forwarding
> i read the faq entry at this link
> http://www.lam-mpi.org/faq/category6.php3#question5
> i did what was suggested and i didnt get the desired output
> after searching through the archives i modified my script to look like
> this:
>
> #!/bin/bash -f
>
> echo "Running xterm on `hostname`"
> /opt/kde3/bin/konsole -T `hostname` -e $*
> exit 0
>
> i had to explicitly give the path to the konsole terminal program; i
> need to add that to the path i guess.

This is probably a function of how your "dot" files (i.e., your
.tcshrc / .login / .profile / .bashrc... whichever one is appropriate
for you) and/or the system-level equivalents of these files are
set up on the nodes in question. Some systems treat interactive
logins differently than non-interactive logins, and give interactive
logins a much longer path (i.e., perhaps you / your sysadmin / your
distro didn't anticipate logging in non-interactively and running a
konsole).
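
One way to make the wrapper script tolerate a short non-interactive path is to look for the terminal on the PATH first and fall back to a known location. This is only a sketch: find_terminal is a made-up helper name, and /opt/kde3/bin/konsole is simply the location quoted above, which may differ on other nodes.

```shell
# find_terminal NAME FALLBACK
# Prints the path of NAME if it can be found on PATH; otherwise prints
# FALLBACK if that file is executable. Returns non-zero if neither works.
find_terminal() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    elif [ -x "$2" ]; then
        printf '%s\n' "$2"
    else
        return 1
    fi
}

# Intended use inside xterm.bash (commented out here so that sourcing this
# snippet does not open a window):
# term=$(find_terminal konsole /opt/kde3/bin/konsole) || exit 1
# exec "$term" -T "$(hostname)" -e "$@"
```

To see what path a non-interactive login actually gets, running something like ssh tom 'echo $PATH' (or the rsh equivalent) from simba and comparing it with the output of echo $PATH in a normal login on tom should show the difference.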

> i stuck the '-T `hostname`' so that things were less confusing (puts
> hostname in title bar of terminal window).
> however, i was expecting to see a window for each rank i launch on the
> local machine. my hostfile looks like this:
>
> simba cpu=3
> tom cpu=2
>
> and i execute my program (called m) like this, from the local machine
> (simba):
>
> mpirun c0 c1 c3 -sf -x DISPLAY xterm.bash m
>
> i expected to get three windows, one for my master rank and two for my
> slave ranks, with c3 being a

Sidenote: it's "processes", not "ranks". A "rank" is just an integer
specifying a process' identity within a communicator. People usually
refer to the process' rank in COMM_WORLD as the "rank", but in reality,
every MPI process has at least two ranks: their COMM_WORLD rank, and
their COMM_SELF rank (which is always, by definition, 0). If your
process creates any more communicators, then it also has more ranks.
So "process" is more specific than "rank".

(a smart person corrected me on exactly this same issue years ago
:-)

> window to what's happening on tom, the remote machine.
>
> per my script this is the output i get:
> Running xterm on neuro-simba
> Running xterm on neuro-simba
> Running xterm on neuro-tom

Looks right so far.

> however i was getting only two windows popping up; a slave and a
> master. couldn't figure out what was going on, till i took a look at
> the remote machine screen and lo and behold the terminal windows were
> popping up on the remote machine (tom)!

This is because you're exporting your DISPLAY variable.
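
To make that concrete: a DISPLAY value with an empty host part, such as ":0.0", means "the first display of whichever machine reads it." Exported unchanged to tom, it therefore points at tom's own screen. The sketch below classifies a DISPLAY value along those lines; display_kind is a made-up helper, and the category names are only labels chosen for this example.

```shell
# display_kind VALUE
# Prints a rough classification of an X11 DISPLAY value:
#   unset          - empty value
#   local          - no host part (":0", ":0.0") or "unix:0"; every machine
#                    that receives this value resolves it to its own screen
#   host-qualified - has a host part, e.g. "simba:0.0" or "localhost:10.0"
display_kind() {
    case "$1" in
        "")        echo "unset" ;;
        :*|unix:*) echo "local" ;;
        *)         echo "host-qualified" ;;
    esac
}

# Example: inspect the value in the shell where mpirun will be run.
# display_kind "$DISPLAY"
```

If this reports "local" in the shell where you run mpirun, then "-x DISPLAY" hands every node a value that each node interprets as its own screen.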

In an ssh world, you likely don't need to export DISPLAY because ssh
will set it to a relevant value in the remote shell. Most important:
although it will be a different value than what you have in the shell
that you ran mpirun in, the output will still appear on your screen.

For example:

shell$ ee
# launches the "ee" X program on your local display
shell$ ssh other-machine.example.com ee
# if ssh is set up appropriately (and many systems do this by default),
# ee will be running on other-machine but will appear on your local
# display

rsh doesn't do this for you, which is why you typically either need to
-x the DISPLAY (and ensure that it's set to a value that is relevant on
all nodes) or set up your "dot" files to do the Right Thing (which is
usually far more trouble than it's worth).

So if you're using ssh, the solution here is likely to just drop the
"-x DISPLAY" from your mpirun command line.

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/