WEEKLY STATUS REPORT
Camilo A. Silva
June 23 – June 29
ACTIVITIES:
Action. That’s what this whole week has been about. I had time to read some papers and further my studies of MPI, MPICH, autonomic computing, and, to a lesser extent, the use of the Rocks GCB cluster. On Monday, June 23, I had my group meeting with my bioinformatics team members, and I shared with them my latest presentation on MPI: MPICH implementation and derived data types.
During that meeting, we decided to start designing the communication model for the MPI program that will parallelize project18. I volunteered to design the model, and I presented it to my group members at the next meeting, on Wednesday, June 25. At that meeting, we agreed that the first parallel implementation of project18 will simply start the processing of different discriminating probes genomes on different nodes. In other words, project18 would be sent to each node of the GCB cluster, and after it has finished computing, the result files would be saved and accessed in the /share/../bioinformatics/results/ folder.
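The distribution idea above can be sketched in a few lines. This is only an illustration in Python, not the project18 code or the planned MPI program. The report does not say how genomes are matched to nodes, so round-robin assignment is an assumption, and the function name assign_genomes and the genome file names are hypothetical.

```python
def assign_genomes(genomes, num_nodes):
    """Return {node_rank: [genomes]} using round-robin assignment.

    Round-robin is an assumption; the report does not specify
    how genomes are matched to nodes.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    plan = {rank: [] for rank in range(num_nodes)}
    for index, genome in enumerate(genomes):
        # Genome i goes to node i mod num_nodes.
        plan[index % num_nodes].append(genome)
    return plan


if __name__ == "__main__":
    # Hypothetical input names, for illustration only.
    genomes = ["genome_A.fasta", "genome_B.fasta", "genome_C.fasta",
               "genome_D.fasta", "genome_E.fasta"]
    for rank, work in assign_genomes(genomes, 3).items():
        print(f"node {rank}: {work}")
```

Each node would then run project18 on its own list and write its result files independently, so no communication between nodes is needed while the genomes are being processed.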
To handle files, MPI provides a library of I/O functions known as MPI-IO. Therefore, my task from that Wednesday meeting until today was to learn MPI-IO and run a simple test on the GCB cluster in which a file is created and opened collectively so that all nodes can write to it. Additionally, I wanted to create a second I/O test in which each node opens a file, reads its contents, and appends information to it.
I have been able to report my progress to Dr. Duran here at the University of Guadalajara. Every Tuesday and Thursday from 4:00 p.m. to 5:00 p.m., Sean, Allison and I have a meeting with Dr. Duran. I have talked with him about the self-healing properties of autonomic computing for my project, as well as the MPI-IO functions that I needed to learn. I told him that the self-healing implementation of my project was not as concrete as I was expecting, since I have not yet had a chance to write the parallel program, and I was not fully aware of what faults to expect besides the famous ones of “a node going down or connection losses.”
ACCOMPLISHMENTS:
The main accomplishment for this week is that the design of the program's MPI communication structure was completed. The PowerPoint presentation can be found here: http://latinamericangrid.org/elgg/camilo.silva/files/23
Another accomplishment is that I learned the basics of MPI-IO after a lot of reading. Here are some of the materials that guided me the most:
- http://www-unix.mcs.anl.gov/mpi/tutorial/advmpi/sc2005-advmpi.pdf
- http://beige.ucs.indiana.edu/I590/node88.html
- http://www.mhpcc.edu/training/workshop2/mpi_io/MAIN.html
ISSUES/PROBLEMS:
The biggest issue so far, which I am still trying to solve, is a technical one. Throughout this weekend, I was working on some test programs for the MPI-IO functions in order to practice and learn how they perform on the cluster. The test program that I created needed interaction with the user: it asked the user to input some information, such as the name of the file to be created or opened. Unfortunately, at run time the program would ignore the user's input and just carry on until the end of the program. It was kind of funny to see a program act this way! So, the only thing left to do was to Google. I looked for similar examples that ask the user for input, and I compiled them as well, hoping to solve the problem. But, guess what? The same error kept happening over and over again.
It was not until today, Monday, June 30, 2008, that I decided to ask friends of mine who are more knowledgeable and experienced with MPI-IO to give me a hand. Thus, I am still waiting to solve this little issue of user interaction during the execution of a parallel program.
PLANS:
The major goal for this week is to write the parallel code for project18 and hopefully have a test run over this weekend.
SUMMARIES/CRITIQUES OF PAPERS:
FIRST READINGS: MPI-IO
This is a collection of documents that I found online, from credible sources, that cover the basics of MPI-IO. The important thing that I learned about MPI-IO is that it allows the programmer to design a file I/O system in which all the nodes can access a particular file collectively. That means a file can be opened once and every node can write to that same file, with each node writing at an offset so that its data does not overlap what the other nodes have written. There are different types of functions depending on the objective and purpose of the parallel program that will be run. Some of the functions are categorized as blocking or non-blocking. Other functions allow a file to be written contiguously or non-contiguously.
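The offset idea can be illustrated without MPI. The following is a serial sketch in Python that emulates the nodes (ranks) with a loop. It is not an MPI-IO program; in a real one, each rank would pass a similar offset to a call such as MPI_File_write_at. It assumes fixed-size records, so that each rank's offset is simply rank times the record size. The record size, the record text, and all function names are hypothetical.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes (hypothetical)


def rank_offset(rank, record_size=RECORD_SIZE):
    """Byte offset at which a given rank writes its record."""
    return rank * record_size


def make_record(rank, record_size=RECORD_SIZE):
    """Build a fixed-width, newline-terminated record for one rank."""
    text = f"rank {rank} done"
    return text.ljust(record_size - 1).encode("ascii") + b"\n"


def write_all(path, num_ranks):
    """Emulate num_ranks processes writing into one shared file."""
    with open(path, "wb") as shared:
        # Write in reverse order to show that the final layout depends
        # only on the offsets, not on which rank writes first.
        for rank in reversed(range(num_ranks)):
            shared.seek(rank_offset(rank))
            shared.write(make_record(rank))


def demo(num_ranks=4):
    """Write the shared file in a temporary folder and return its lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.out")
        write_all(path, num_ranks)
        with open(path, "rb") as shared:
            data = shared.read().decode("ascii")
    return [line.rstrip() for line in data.splitlines()]


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Because every rank owns a disjoint byte range, the records end up in rank order no matter which rank writes first. Variable-length output would need a different scheme, for example exchanging lengths first or using a shared file pointer.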
I found that the different references were helpful in different ways. For example, the first reference, from Argonne National Laboratory near Chicago, covers not only the basics of MPI-IO but also other topics such as sparse matrix I/O, passive target RMA, and improving performance. In the Indiana.edu material, I found the program examples and their detailed explanations very interesting. In the mhpcc.edu document, I liked very much how each function was described and how all of its parameters were presented and explained.
The information in these documents was really important because it taught me the basics of MPI-IO. Almost everything that I learned will be used in project18. Thus, these documents will be a great reference for the work I am currently doing.
Keywords: bioinformatics, Mexico, progress, report