LAM/MPI logo

LAM/MPI General User's Mailing List Archives

  |   Home   |   Download   |   Documentation   |   FAQ   |   all just in this list

From: Lei_at_[hidden]
Date: 2004-07-13 20:25:07


Jeff and Neil:

Sorry for returning to this issue
late. Please see blow.

Thanks a lot,

-Lei

On 6/28/04 4:51 AM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:

> On Mon, 28 Jun 2004, Lei_at_ICS wrote:
>
>> I was going to use tag until I realize that the tag space (number of
>> usable values) is too small for my application.
>>
>> A message is identified uniquely by an integer pair (i, j), both i and j
>> go from 0 to N-1. So the tag I use is: tag=i*N+j. This maps the two
>> dimensional (i, j) space uniquely to the one dimensional tag space.
>> Unfortunately, this solution limits the N value to sqrt(max_tag). Even
>> if max_tag=MAX_INT, that's still not enough for me (N~=50k).
>
> Not sure I follow there -- MAX_INT is almost 4.3B. With your mapping
> scheme, 50k*50K+50K = ~2.5B. This is still far less than MAX_INT.
>

If a tag has to be non-negative, a signed integer
can only go as large as about 2B. Even if we use
unsigned integers for tags, N will be about 70k,
which is not big enough for me.

>> One of the following could solve my problem:
>> 1) A user adjustable max_tag, so I can set it to be MAX_INT^2;
>
> The tag field specified by the MPI standard is an int, so no MPI
> implementation can give you more than sizeof(int) bytes/bits for a tag.
>
>> 2) A more efficient 2D -> 1D tag map so I won't waste the 1D tag space
>> too much.
>
> This depends on your application. Typical solutions include a more
> coarse-grained tag mapping and then put enough information in your message
> to specify the rest of the exact mapping. E.g., use "i" as the tag, and
> then embed "j" in the message somewhere. Since you're receiving on
> ANYTAG, then you clearly don't need your message receiving logic to care
> about the (i,j) values -- you only need those *after* you have received
> the message. So if you can receive a message and then deduce (i,j) from
> it, I'm guessing that should be sufficient for you.

To deduce (i, j) from a message after receiving it
and then decide what to do with the message will
work, and it is another version of my implementation.
But that requires using application level buffers
to cache the messages that are received but are
pre-mature to be processed.

In this version of implementation, I wanted to
use tags to help me process the messages from
the MPI system buffer in a particular order
I want.

For example, if I wanted to process the messages:
1. (1, 4)
2. (2, 4)
3. (3, 4)
4. (4, 4)
in that particular order,

but messages arrive to me in this order:
1. (1, 5)
2. (3, 4)
3. (4, 7)
4. (1, 4)
5. (4, 4)
6. (2, 4)

(Obviously the extra two messages, 1. and 3.,
will be taken care of as well, but in another loop.)

If we only tag the messages with the value of
i or j, we might grab the wrong message from
the MPI system buffer.

The idea of using t = i*N + j as tags enables
us to uniquely identify a message (i, j). But
the limitation is that now N cannot be too large.

In my application, N is number of matrix rows -
so 70k is not very large. i and j are column
indices of two columns, one of which is sent to
the other for processing. The granularity level
is reasonably coarse and we have seen very nice
speedups in another implementation that uses
the same granularity.

Any scheme that helps us tag the messages
so a (i, j) pair can be uniquely identified, but does
not "waste" the tag space the way our scheme
does will help us.

But if we look at two single digit integers
m and n each going from 0 to 9, all possible
combinations of (m, n) would correspond to
these two-digit integers m*10+n ranging from
0 to 99. So if we have a tag space of 0-99,
we can only tag the (m, n) pairs uniquely
with m and n ranging from 0 to 9. It sounds
like if a more efficient ways of tagging exists,
there are ways to improve the current decimal or
binary systems.

> Why do you need to identify the message so precisely? Why not just use "j"
> for the tag (if "j" is the "inner loop"). You can pass the values "i" AND
> "j" as values in the data, rather than the TAG.
>
> The TAG is just to ensure that you can identify the message, without having
> to read the data first. However if you absolutely have to identify the
> message unequivocably, you would still have the "i" and "j" values passed as
> part of the data.
>
> Regards
> Neil

As I described above, tagging the messages
with just "i" or just "j" will not help me
in identifying the messages, therefore an
application level buffering is not avoidable.
In another version of my implementation, I
make "i" and "j" part of the messages
(of course, adding 8 bytes to a mega-byte
message incurs negligible overhead) and
buffer those messages that are pre-mature
for processing. This works out fine even
without any message tags ("partial" tags
with only "j" might be somewhat helpful but it
complicates the code logic).