Since you've only got effectively 31 bits (you were correct -- I
incorrectly said 4B as MAX_INT, neglecting the fact that you can't use
negative tags), you really can only use sqrt(2^31) max values for M and
N (assuming your matrix is square).
Another possibility is to setup your own internal envelope buffering --
so always send the (M, N) tuple in a small envelope on a specific tag,
and then send the buffer itself on another (unique) tag. This would
involve both ordering of envelopes on the receive side and negotiating
tags between sender and receiver, but it would allow you to avoid
expensive memory copies and still preserve your uniqueness
requirements. For example:
- sender needs to send (1, 4) (2, 4) (3, 4) (4, 4), but they are
generated in the order that you indicated below ((3, 4) (1, 4) (4, 4)
(2, 4)).
- for each tuple (in the order that it is generated), the sender does
the following:
- allocates an envelope
- fills in the tuple value (e.g., 3, 4)
- allocates a unique tag (e.g., tag++)
- fills the tag value in the envelope
- sends the envelope to the receiver
- the receiver receives the envelopes
- the receiver re-orders the envelopes and processes them in the order
desired (i.e., it may wait until it receives the envelope for (1,4)
before doing anything
- when it processes an envelope, it does a receive on the tag in the
envelope to receive the data into the target buffer
- the receiver then sends back a short envelope releasing the tag
In this way, the sender and receiver simply associate specific tags
with specific messages; the receiver "returns" the tags when it has
completed the receive so that the sender can re-use it in a future
envelope. Think of it as implementing your own protocol over MPI.
This requires a higher total number of messages and some additional
work on both sides, but may be worth exploring.
On Jul 13, 2004, at 9:25 PM, Lei_at_ICS wrote:
> Jeff and Neil:
>
> Sorry for returning to this issue
> late. Please see blow.
>
> Thanks a lot,
>
> -Lei
>
> On 6/28/04 4:51 AM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:
>
>> On Mon, 28 Jun 2004, Lei_at_ICS wrote:
>>
>>> I was going to use tag until I realize that the tag space (number of
>>> usable values) is too small for my application.
>>>
>>> A message is identified uniquely by an integer pair (i, j), both i
>>> and j
>>> go from 0 to N-1. So the tag I use is: tag=i*N+j. This maps the two
>>> dimensional (i, j) space uniquely to the one dimensional tag space.
>>> Unfortunately, this solution limits the N value to sqrt(max_tag).
>>> Even
>>> if max_tag=MAX_INT, that's still not enough for me (N~=50k).
>>
>> Not sure I follow there -- MAX_INT is almost 4.3B. With your mapping
>> scheme, 50k*50K+50K = ~2.5B. This is still far less than MAX_INT.
>>
>
> If a tag has to be non-negative, a signed integer
> can only go as large as about 2B. Even if we use
> unsigned integers for tags, N will be about 70k,
> which is not big enough for me.
>
>>> One of the following could solve my problem:
>>> 1) A user adjustable max_tag, so I can set it to be MAX_INT^2;
>>
>> The tag field specified by the MPI standard is an int, so no MPI
>> implementation can give you more than sizeof(int) bytes/bits for a
>> tag.
>>
>>> 2) A more efficient 2D -> 1D tag map so I won't waste the 1D tag
>>> space
>>> too much.
>>
>> This depends on your application. Typical solutions include a more
>> coarse-grained tag mapping and then put enough information in your
>> message
>> to specify the rest of the exact mapping. E.g., use "i" as the tag,
>> and
>> then embed "j" in the message somewhere. Since you're receiving on
>> ANYTAG, then you clearly don't need your message receiving logic to
>> care
>> about the (i,j) values -- you only need those *after* you have
>> received
>> the message. So if you can receive a message and then deduce (i,j)
>> from
>> it, I'm guessing that should be sufficient for you.
>
> To deduce (i, j) from a message after receiving it
> and then decide what to do with the message will
> work, and it is another version of my implementation.
> But that requires using application level buffers
> to cache the messages that are received but are
> pre-mature to be processed.
>
> In this version of implementation, I wanted to
> use tags to help me process the messages from
> the MPI system buffer in a particular order
> I want.
>
> For example, if I wanted to process the messages:
> 1. (1, 4)
> 2. (2, 4)
> 3. (3, 4)
> 4. (4, 4)
> in that particular order,
>
> but messages arrive to me in this order:
> 1. (1, 5)
> 2. (3, 4)
> 3. (4, 7)
> 4. (1, 4)
> 5. (4, 4)
> 6. (2, 4)
>
> (Obviously the extra two messages, 1. and 3.,
> will be taken care of as well, but in another loop.)
>
> If we only tag the messages with the value of
> i or j, we might grab the wrong message from
> the MPI system buffer.
>
> The idea of using t = i*N + j as tags enables
> us to uniquely identify a message (i, j). But
> the limitation is that now N cannot be too large.
>
> In my application, N is number of matrix rows -
> so 70k is not very large. i and j are column
> indices of two columns, one of which is sent to
> the other for processing. The granularity level
> is reasonably coarse and we have seen very nice
> speedups in another implementation that uses
> the same granularity.
>
> Any scheme that helps us tag the messages
> so a (i, j) pair can be uniquely identified, but does
> not "waste" the tag space the way our scheme
> does will help us.
>
> But if we look at two single digit integers
> m and n each going from 0 to 9, all possible
> combinations of (m, n) would correspond to
> these two-digit integers m*10+n ranging from
> 0 to 99. So if we have a tag space of 0-99,
> we can only tag the (m, n) pairs uniquely
> with m and n ranging from 0 to 9. It sounds
> like if a more efficient ways of tagging exists,
> there are ways to improve the current decimal or
> binary systems.
>
>
>> Why do you need to identify the message so precisely? Why not just
>> use "j"
>> for the tag (if "j" is the "inner loop"). You can pass the values "i"
>> AND
>> "j" as values in the data, rather than the TAG.
>>
>> The TAG is just to ensure that you can identify the message, without
>> having
>> to read the data first. However if you absolutely have to identify the
>> message unequivocably, you would still have the "i" and "j" values
>> passed as
>> part of the data.
>>
>> Regards
>> Neil
>
> As I described above, tagging the messages
> with just "i" or just "j" will not help me
> in identifying the messages, therefore an
> application level buffering is not avoidable.
> In another version of my implementation, I
> make "i" and "j" part of the messages
> (of course, adding 8 bytes to a mega-byte
> message incurs negligible overhead) and
> buffer those messages that are pre-mature
> for processing. This works out fine even
> without any message tags ("partial" tags
> with only "j" might be somewhat helpful but it
> complicates the code logic).
>
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/
|