Hi --
The answers to your questions are inlined below!
# --- Is there a distributed lock within LAM daemons at publish time ?
- No. There is no distributed lock. The LAM daemon (lamd) on node 0 acts
as a central repository/server for all the published names. During a
Publish call, an out-of-band message is sent to this lamd to enter the name
in its repository. Requests are served one at a time, and a second request
with an already-published name results in an error.
# --- Is the publish/unpublish/publish sequence for the same service gives
# always the same port ?
- If you publish and then unpublish, and some other process in the same
LAM universe publishes the same name before you can publish again, then you
will not necessarily get the same port.
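
As a rough illustration of the two answers above, here is a toy model of a
central name registry in C. It is only a sketch of the behavior described
(one table, duplicate names rejected, a name free again after unpublish); it
is not LAM's actual code, and all names in it (registry_t, reg_publish, the
"port-A"/"port-B" strings) are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Toy model only: NOT LAM's implementation. All identifiers are hypothetical. */
#define MAX_NAMES 16
#define NAME_LEN  64

typedef struct {
    char service[NAME_LEN];  /* published service name */
    char port[NAME_LEN];     /* port string associated with it */
    int  used;               /* 1 if this slot holds a live entry */
} entry_t;

typedef struct {
    entry_t entries[MAX_NAMES];
} registry_t;

enum { REG_OK = 0, REG_ERR_EXISTS = -1, REG_ERR_FULL = -2, REG_ERR_MISSING = -3 };

/* Return the live entry for `service`, or NULL if it is not published. */
static entry_t *reg_find(registry_t *r, const char *service)
{
    for (int i = 0; i < MAX_NAMES; i++) {
        if (r->entries[i].used && strcmp(r->entries[i].service, service) == 0)
            return &r->entries[i];
    }
    return NULL;
}

/* Enter a name; a second publish of a live name is an error. */
int reg_publish(registry_t *r, const char *service, const char *port)
{
    if (reg_find(r, service) != NULL)
        return REG_ERR_EXISTS;
    for (int i = 0; i < MAX_NAMES; i++) {
        entry_t *e = &r->entries[i];
        if (!e->used) {
            snprintf(e->service, NAME_LEN, "%s", service);
            snprintf(e->port, NAME_LEN, "%s", port);
            e->used = 1;
            return REG_OK;
        }
    }
    return REG_ERR_FULL;
}

/* Remove a name so that any process may publish it again. */
int reg_unpublish(registry_t *r, const char *service)
{
    entry_t *e = reg_find(r, service);
    if (e == NULL)
        return REG_ERR_MISSING;
    e->used = 0;
    return REG_OK;
}

/* Return the port currently bound to `service`, or NULL if none. */
const char *reg_lookup(registry_t *r, const char *service)
{
    entry_t *e = reg_find(r, service);
    return e != NULL ? e->port : NULL;
}
```

In this model, if process A publishes "svc" with "port-A" and then
unpublishes it, and process B publishes "svc" with "port-B" before A tries
again, a lookup of "svc" returns "port-B" and A's second publish fails with
REG_ERR_EXISTS.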
# --- How to get the MPI_Comm_connect primitive failed after a timeout if
# it doesn't connect ?
- In LAM, MPI_Comm_connect blocks indefinitely if no corresponding
MPI_Comm_accept has been posted. So effectively, there is no timeout
mechanism (in other words, the timeout is infinite).
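
One possible workaround, sketched below, is to poll for the service name
with a deadline and only call MPI_Comm_connect once the name is visible.
This does not make MPI_Comm_connect itself time out: if the server has
published its name but never calls MPI_Comm_accept, the connect will still
block. The lookup callback is a hypothetical hook; in a real program it
could wrap the standard MPI_Lookup_name call (with the error handler set to
MPI_ERRORS_RETURN so that a missing name does not abort the job). The
function names and the demo callback are assumptions made for this example.

```c
#include <time.h>
#include <unistd.h>

/* Hypothetical callback: returns 1 if `service` is currently published,
 * 0 otherwise. A real version might wrap MPI_Lookup_name. */
typedef int (*lookup_fn)(const char *service, void *ctx);

enum { WAIT_FOUND = 0, WAIT_TIMEOUT = 1 };

/* Poll `lookup` until it succeeds or `timeout_sec` seconds have elapsed.
 * Always polls at least once. Resolution is one second (time()). */
int wait_for_service(lookup_fn lookup, void *ctx, const char *service,
                     double timeout_sec, unsigned poll_interval_sec)
{
    time_t start = time(NULL);

    for (;;) {
        if (lookup(service, ctx))
            return WAIT_FOUND;
        if (difftime(time(NULL), start) >= timeout_sec)
            return WAIT_TIMEOUT;
        sleep(poll_interval_sec);  /* pause between polls */
    }
}

/* Demo stand-in for a real lookup: reports "not published" until the
 * counter pointed to by ctx reaches 0, then reports "published". */
int lookup_after_n_polls(const char *service, void *ctx)
{
    int *remaining = ctx;

    (void)service;  /* unused in the demo */
    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

A client would call wait_for_service first and go on to MPI_Comm_connect
only on WAIT_FOUND, reporting an error to the user on WAIT_TIMEOUT.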
# --- Is there a way to ask for a list of already published services ?
- The MPI standard does not provide any such call.
# --- How many processes can be simultaneously running in LAM universe ?
- The lamds have a fixed process table size -- somewhere around 70 or so.
It can be increased by tweaking the code. That's really the only limit --
scheduling is really outside the scope of LAM; we let the OS handle that.
# --- What is the maximum number of Connect requests that can be waiting
# for an accept ?
- Connect/accept in LAM is based on UDP sends/receives through the
lamds. That means that, theoretically, you can have as many requests as
you want (since no socket semantics are in use). However, there is no
guarantee about the ordering of the connect requests if multiple clients
are trying to connect to the server.
Hope this helps!
-Vishal