Hi --
# Since I did not specify prefix option BLCR was installed in default
# /usr/local folder. Then I installed lam with commands
# configure --with-blcr=/usr/local --with-rpi-crtcp
The configure switch to make crtcp the default rpi is --with-rpi=crtcp (NOTE:
that is an "equals" sign after "rpi", not a "dash").
For blcr to work, either the "gm" or the "crtcp" rpi must be set as the
default. In your case, --with-rpi-crtcp was not recognized, and hence the blcr
module was not configured (as you saw in the error message you got while using
-ssi cr blcr). You can check the output in config.log to confirm this.
I think this led to all the other problems you ran into after that.
If your problem persists, send across the config.log, which can help
pinpoint the problem.
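As a rough sketch of that config.log check: autoconf records the configure
command line near the top of config.log, so a small helper can tell which
spelling of the switch was actually used. The function name check_rpi_flag is
made up for illustration and is not part of LAM or BLCR; the commands in the
comments are the ones from your mail with only the rpi switch corrected.

```shell
# Hypothetical helper (not shipped with LAM or BLCR): inspect config.log and
# report which spelling of the rpi switch was passed to configure.
check_rpi_flag() {
    log="${1:-config.log}"
    if grep -q -e '--with-rpi=crtcp' "$log"; then
        echo "ok: --with-rpi=crtcp was passed to configure"
    elif grep -q -e '--with-rpi-crtcp' "$log"; then
        echo "bad: --with-rpi-crtcp (dash) was passed; use --with-rpi=crtcp"
    else
        echo "unknown: no --with-rpi switch found in $log"
    fi
}

# Intended use, run from the LAM build directory:
#   check_rpi_flag config.log
#
# If it reports "bad", re-run configure with the corrected switch, rebuild,
# and retry the original mpirun line:
#   ./configure --with-blcr=/usr/local --with-rpi=crtcp
#   mpirun -ssi rpi crtcp -ssi cr blcr -np 4 ./ring
```

This only confirms what was typed on the configure command line; whether the
blcr module was actually built still has to be read from the rest of
config.log.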
Hope this helps...
-Vishal
#
# Now when I do check point of ordinary processes using blcr it works fine. I
# started lamboot and then I invoked a parallel process with command
# mpirun -ssi rpi crtcp -ssi cr blcr -np 4 ./ring
# This produces error stating blcr module in CR kind was not found. This
# typically means you have misspelled the module name.
#
# So I ran the program with command
# mpirun -ssi rpi crtcp -np 4 ./ring
# and it works fine. Now I checkpoint with the command
# cr_checkpoint 23245 where 23245 is PID of mpirun. One file named
# context.23245 is created and no other files are created (Should other files
# be created). This file is created on node where I run command cr_checkpoint.
# (Note I don't have NFS on my test cluster)
#
# When I try to restart the original program from context file with command
# cr_restart 23245 I get the error
# mpirun (rpwait) : bad file descriptor. (Note: The original process has
# already completed execution)
#
# Please let me know if these errors are due to some lapse in installation or
# if I am missing some options.
#
# Thanks in Advance,
# Pirabhu
#
# _______________________________________________
# This list is archived at http://www.lam-mpi.org/MailArchives/lam/
#