Parallel I/O

Currently MPI does not support parallel I/O, so we need to think about how to do this. There are in fact several ways of performing I/O in a parallel program -- we give some examples here of how you could go about this.
Let us assume we have an SPMD program (Single Program Multiple Data) running on a parallel machine. We will consider two problems: first, reading data in from a file (foo.in), and second, writing data out to a file (foo.out). In both cases the data is a vector of 1000 double-precision numbers residing on each processor.
To make the example more concrete, we give excerpts of C code. Let pid be the processor id and nprocs be the number of processors, so:
    int pid, nprocs;

    MPI_Comm_rank (MPI_COMM_WORLD, &pid);
    MPI_Comm_size (MPI_COMM_WORLD, &nprocs);
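
The fragments below are only excerpts; the surrounding skeleton (headers, MPI_Init and MPI_Finalize) is assumed rather than shown. A minimal sketch of that skeleton would look something like this:
    #include <stdio.h>
    #include <mpi.h>

    int main (int argc, char *argv[])
    {
      int pid, nprocs;

      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &pid);
      MPI_Comm_size (MPI_COMM_WORLD, &nprocs);

      /* ... one of the I/O fragments below, with its local declarations, goes here ... */

      MPI_Finalize ();
      return 0;
    }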

Input

Input is easy: each processor opens the file and maintains its own file pointer.

Input I: All processors read the same data

    double x[1000];
    FILE  *fp;

    fp = fopen ("foo.in", "r");
    fread  (x, sizeof(double), 1000, fp);
    fclose (fp);
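
If you would rather have only one processor touch the file, a common alternative (not part of the original example) is to let processor 0 read the data and broadcast it to everyone. A minimal sketch:
    double x[1000];
    FILE  *fp;

    if (pid == 0) {               /* only processor 0 reads foo.in */
      fp = fopen ("foo.in", "r");
      fread  (x, sizeof(double), 1000, fp);
      fclose (fp);
    }
    /* every processor, including 0, ends up with a copy of the data */
    MPI_Bcast (x, 1000, MPI_DOUBLE, 0, MPI_COMM_WORLD);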

Input II: Each processor reads its data

Note here that processor pid reads past pid blocks of 1000 doubles before it gets to its own data, so the last processor reads all the other processors' data first and consequently takes the longest. (A seek-based version is sketched after the code.)
    double x[1000];
    FILE  *fp;
    int    p;

    fp = fopen ("foo.in", "r");
    for (p = 0; p < pid; p++)
      fread (x, sizeof(double), 1000, fp);  /* skip the earlier processors' data */
    fread (x, sizeof(double), 1000, fp);    /* now read this processor's own data */
    fclose (fp);
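
As the comment in the original fragment suggests, it is better to seek past the earlier processors' data than to read it. A sketch using fseek (the stdio counterpart of lseek), assuming the same layout of foo.in:
    double x[1000];
    FILE  *fp;

    fp = fopen ("foo.in", "r");
    /* jump directly over the pid blocks belonging to lower-ranked processors */
    fseek (fp, (long) pid * 1000 * sizeof(double), SEEK_SET);
    fread (x, sizeof(double), 1000, fp);   /* read this processor's data */
    fclose (fp);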

Output

Output I: Via processor 0

All data gets written to file foo.out via processor 0. Note the sends and receives.
    double     x[1000];
    FILE      *fp;
    int        p;
    MPI_Status status;

    if (pid == 0) {
      fp = fopen ("foo.out", "w");
      fwrite (x, sizeof(double), 1000, fp);    /* data from proc 0 */
      for (p = 1; p < nprocs; p++) {
        MPI_Recv (x, 1000, MPI_DOUBLE, p, p, MPI_COMM_WORLD, &status);
        fwrite (x, sizeof(double), 1000, fp);  /* data from proc p */
      }
      fclose (fp);
    } else
      MPI_Send (x, 1000, MPI_DOUBLE, 0, pid, MPI_COMM_WORLD); /* send to 0 */
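
An alternative, not shown in the original, is to gather all the data onto processor 0 first and then issue a single write, at the cost of an nprocs x 1000 double buffer on processor 0. A sketch (malloc and free need <stdlib.h>):
    double  x[1000];
    double *xall = NULL;
    FILE   *fp;

    if (pid == 0)
      xall = malloc (nprocs * 1000 * sizeof(double));

    /* collect each processor's 1000 doubles onto processor 0, in rank order */
    MPI_Gather (x, 1000, MPI_DOUBLE, xall, 1000, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (pid == 0) {
      fp = fopen ("foo.out", "w");
      fwrite (xall, sizeof(double), nprocs * 1000, fp);
      fclose (fp);
      free (xall);
    }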

Output II: Sequentially

Each processor in turn opens, appends to, and closes foo.out. This is slightly faster than the previous method, but still sequential. Note the use of MPI_Barrier in the for loop, which forces the processors to take their turns in rank order.
    double x[1000];
    FILE  *fp;
    int    p;

    for (p = 0; p < nprocs; p++) {
      if (pid == p) {
        fp = fopen ("foo.out", "a");           /* append; assumes foo.out starts out empty */
        fwrite (x, sizeof(double), 1000, fp);  /* data from proc p */
        fclose (fp);
      }
      MPI_Barrier (MPI_COMM_WORLD);  /* processor p+1 waits until p has finished */
    }

Output III: On /tmp

This writes in parallel and is the fastest of the three, but here foo.out is a different file for each processor, since /tmp lives on each processor's own disk. You would still need to decide how to merge the /tmp/foo.out files on the processors' disks after the run (one possible naming scheme is sketched after the code).
    double x[1000];
    FILE  *fp;

    fp = fopen ("/tmp/foo.out", "w");     /* each foo.out is a different, local file */
    fwrite (x, sizeof(double), 1000, fp);
    fclose (fp);
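
One way to keep the pieces distinct and easy to collect afterwards (purely a sketch, with a made-up naming scheme) is to put the processor id in the filename:
    double x[1000];
    FILE  *fp;
    char   fname[64];

    sprintf (fname, "/tmp/foo.out.%d", pid);   /* e.g. /tmp/foo.out.3 on processor 3 */
    fp = fopen (fname, "w");
    fwrite (x, sizeof(double), 1000, fp);
    fclose (fp);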