
LAM/MPI Development Mailing List Archives


From: Fahim Faisal (faisal_lead_at_[hidden])
Date: 2006-01-10 08:04:52


  These days most users run P-IV or at least P-III machines, but just a few years ago we were using P-I and P-II machines. What should we do with these old machines now, throw them away? Of course there are some uses, and some parts are reusable. But there is a much bigger opportunity, especially for organizations: gathering the old machines on a network as a cluster can yield a large amount of computing performance. We also see that in most organizations the machines are underutilized. Most desktop PCs are busy less than 5% of the time, and even while users are at their machines they usually do not load them heavily. This is one more reason to put our PCs to proper use. With a cluster we can get high performance that bears comparison with the supercomputers in use nowadays. In a cluster environment we generally use a parallel platform called Message Passing Interface (MPI). There are other platforms, but MPI has become the de facto standard. We will concentrate on a preferable implementation of MPI and on providing a complete cluster software suite.
   
  My objective is to provide a fully object-oriented implementation of MPI, proudly named Java Powered Object Oriented Message Passing Interface (JPOOMPI), and then gradually to move into cluster computing and provide smart cluster command-and-control software. Java is my chosen language because it is firmly established as a fully object-oriented language. I have used Java NIO as the backbone of the implementation.
   
  In the past few years there have been a number of projects on message passing systems in Java, and interest is growing day by day. The efforts made so far can be divided into three categories according to the technology behind them: RMI, JNI and sockets.
   
  RMI is an interesting technology from a developer's point of view because it can save a lot of time and effort. However, RMI sends all data as objects. This is an overhead, because an object carries a lot of information besides the original data. We also need at least one RMI registry to locate distributed objects. Beyond these factors, a major drawback of RMI for high performance lies in its back-end mechanism: RMI itself uses sockets as its underlying communication medium, so optimizing communication performance is out of the developer's hands.
   
  The Java Native Interface (JNI) lets a Java application call native C routines. The implementations built with JNI so far simply call the C bindings of the MPI routines from a Java application. This is clearly not the best approach. It raises a number of security issues and breaks the Java programming model. Besides, JNI causes an extra copy of the data between the Java code and the native MPI code, which also puts performance in question.
   
  The Java socket is the only low-level communication mechanism that Java itself defines. From my programming experience I know that a low-level interface can give the best performance if it is handled properly. The introduction of the Java NIO package also makes sockets a much stronger choice as a communication layer.
   
  There is no reason to choose web services for an MPI implementation, but they could be an interesting consideration and might be helpful in the cluster implementation.
   
  VIA is a viable choice for an MPI implementation and is really efficient from the performance point of view. To tell the truth, my first idea was to use VIA as the backbone of my MPI implementation. But two things made me decide against it. First, the cost: VIA-enabled devices are still very expensive. Second, devices that support VIA are still not widely available and are used only in limited areas.
   
  I also briefly tried datagrams instead of TCP/IP sockets, but what stopped me from going further was the unreliable transport.
   
  In my project the user is given no MPI classes, only the interfaces Comm, Request, Buffer and Code (the exception is Status, which is currently a class). All of the user's classes must be packaged in a jar file, and the main class must implement the interface Code, which means it must implement the method execCode declared in Code. There are two classes, Scheduler and Node. Scheduler acts as a socket client. It sends the URL of the jar file to all donor machines, and it is each donor machine's responsibility to download the jar file and execute the user code. Node is a concurrent socket server. A Node resides on every donor machine and starts automatically when the machine is started.
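
  To give a rough idea of what a user's main class could look like, here is a minimal sketch. The exact signatures are not spelled out in this message, so the parameter of execCode and the rank() and size() accessors on Comm are assumptions made only for illustration, and HelloCode is a made-up user class.

```java
// Hypothetical stand-ins for the JPOOMPI interfaces.
// The real signatures may differ; these are assumed for illustration only.
interface Comm {
    int rank();   // assumed accessor: rank of this process
    int size();   // assumed accessor: number of processes
}

interface Code {
    void execCode(Comm comm);   // assumed signature of the entry point
}

// A user's main class. It would be packaged in the jar file
// whose URL the Scheduler sends to the donor machines.
class HelloCode implements Code {
    String lastMessage;   // kept only so the result can be inspected

    @Override
    public void execCode(Comm comm) {
        lastMessage = "Hello from rank " + comm.rank() + " of " + comm.size();
        System.out.println(lastMessage);
    }
}
```

  The point of the design is that the user writes against interfaces only, so the classes behind Comm can change without touching user code.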
   
  Scheduler reads the IP addresses of all machines from a file named machines.txt. All Nodes bind to the same port, but if we want to run multiple processes on a single machine, i.e. the Scheduler machine, the Nodes there must run on different ports. Connections are established with the Java NIO asynchronous methods. Along with the name of the jar, Scheduler sends every donor machine the machine addresses and a rank, which is the rank assigned to that particular machine. A specific algorithm is used for sending these requirements to the Nodes.
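
  The format of machines.txt is not described here, so the following is only a sketch under assumptions: one IPv4 address or host name per line, an optional ":port" suffix for Nodes that share a machine, blank lines and "#" comments ignored, and ranks handed out in file order. NodeAddress, MachineList and the default port 5000 are hypothetical names and values, not part of JPOOMPI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one entry of machines.txt.
final class NodeAddress {
    final String host;
    final int port;
    final int rank;

    NodeAddress(String host, int port, int rank) {
        this.host = host;
        this.port = port;
        this.rank = rank;
    }
}

final class MachineList {
    static final int DEFAULT_PORT = 5000;   // assumed common Node port

    // Turns the lines of the file into addresses with ranks in file order.
    static List<NodeAddress> parse(List<String> lines) {
        List<NodeAddress> nodes = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;   // skip blank lines and comments
            }
            String host = line;
            int port = DEFAULT_PORT;
            int colon = line.lastIndexOf(':');
            if (colon > 0) {   // explicit port, e.g. for several Nodes on one machine
                host = line.substring(0, colon);
                port = Integer.parseInt(line.substring(colon + 1).trim());
            }
            nodes.add(new NodeAddress(host, port, nodes.size()));
        }
        return nodes;
    }
}
```

  With a real file this could be called as MachineList.parse(Files.readAllLines(Paths.get("machines.txt"))), using java.nio.file.Files and Paths.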
   
  I have a Communicator class that implements the Comm interface and handles almost everything related to MPI. The Comm interface declares all the major MPI functions such as send, isend, recv and irecv, and Communicator defines them. It also has some private functions, an inner class and some utility functions. After sending all the requirements to the child Nodes, Scheduler sets its own rank through Communicator. Then two threads, a sender thread and a receiver thread, are started through Communicator, and finally the user code is executed.
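
  The sender/receiver thread arrangement can be pictured with the small in-process sketch below. It is not the JPOOMPI Communicator: LoopbackCommunicator is a hypothetical class, the "wire" queue stands in for the socket connection, and the real isend and recv certainly take more parameters (destination, tag and so on).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical loopback model: the sender thread drains the outbox onto the
// "wire", the receiver thread moves messages from the "wire" into the inbox.
final class LoopbackCommunicator {
    private final BlockingQueue<byte[]> outbox = new LinkedBlockingQueue<>();
    private final BlockingQueue<byte[]> wire = new LinkedBlockingQueue<>();
    private final BlockingQueue<byte[]> inbox = new LinkedBlockingQueue<>();

    LoopbackCommunicator() {
        startPump("sender", outbox, wire);
        startPump("receiver", wire, inbox);
    }

    // Starts a daemon thread that forwards messages from one queue to another.
    private static void startPump(String name, BlockingQueue<byte[]> from,
                                  BlockingQueue<byte[]> to) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    to.put(from.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // stop on interrupt
            }
        }, name);
        t.setDaemon(true);
        t.start();
    }

    // Non-blocking send: queue a copy of the data and return immediately.
    void isend(byte[] data) {
        outbox.add(data.clone());
    }

    // Blocking receive: wait until the receiver thread delivers a message.
    byte[] recv() {
        try {
            return inbox.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a message", e);
        }
    }
}
```

  Keeping the socket work on dedicated threads is what lets a call like isend return before the data has actually left the machine.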
   
  My tests and benchmark results show that the very popular MPICH2 clearly outperforms my implementation. But there is still room to improve the performance in the near future. We also have to accept that it is not possible to beat the C implementation, although we can get very close. In exchange for losing a little performance, we gain the great advantage of sending and receiving data quite easily. Suppose one process needs to send an object as data to another process. Although MPICH2 nowadays has a C++ binding, sending an object as data still looks like a nightmare: the user has to define the total memory requirements of the object, which is difficult on both the sending and the receiving side. The implementers of MPICH also acknowledge this. In my implementation it is much simpler. Because Java is used for the implementation, we of course get the advantages that Java provides.
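
  To show why objects are easier on the Java side, here is a sketch that uses standard Java serialization to turn an object into bytes and back. Whether JPOOMPI encodes objects this way is not stated in this message, so treat it as one plausible route; Particle and ObjectCodec are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical user data class; it only has to be marked Serializable.
final class Particle implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final double[] position;

    Particle(String name, double[] position) {
        this.name = name;
        this.position = position;
    }
}

final class ObjectCodec {
    // Sender side: object to bytes, no manual memory layout needed.
    static byte[] toBytes(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side: bytes back to an object of the original class.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

  The byte array is what would travel over the socket. The convenience has a price in extra bytes and CPU time, which is part of the performance gap mentioned above.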
   
  Check the site: http://sourceforge.net/projects/jpoompi
   

                