
Error writing to files PBS MPI

I'm having serious trouble writing data to files using MPI on a cluster with PBS. Here is a simple program that reproduces the problem.

#include <mpi.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>


#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <cstdlib>

int main(int argc, char* argv[]){
int rank;
int size;

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);


// Define hostname
char hostname[128];
gethostname(hostname, 128);

// check and create dump directory
  struct stat buf;
  int rc;
  const char *dir="Res"; // string literals are const in C++

  rc = stat( dir, &buf );
  if( rc ) // no dir, create
  { if( rank == 0 )
    {
      rc = mkdir( dir, 0771);
      if( rc )
      {std::ostringstream oss;
       oss << "Can't create dump directory \""
          << dir
          << "\"";
       std::cerr << oss.str() << std::endl; // report the failure
      }
    }
    else {
       sleep (2);
    }
  }
  else if( !S_ISDIR( buf.st_mode ) )
  {std::ostringstream oss;
   oss << "Path \""
       << dir
       << "\" is not directory for dump";
   std::cerr << oss.str() << std::endl; // report the failure
  }


   MPI_Barrier(MPI_COMM_WORLD);
// Every process defines name of file for output (res_0, res_1, res_2.....)
std::ostringstream filename;
filename << dir << "/res_"<< rank;

// Open file 
std::ofstream file(filename.str().c_str());

// Output to file . Output seems like "I am 0 from 24. hostname"
file  << "I am " << rank << " from " << size << ".   " << hostname  << std::endl;

file.close();

MPI_Finalize();

return 0;
}

I compile it with openmpi_intel-1.4.2, using the command

mpicxx -Wall test.cc -o test

Then I submit the program to the queue with this script:

#!/bin/bash

#PBS -N test
#PBS -l select=8:ncpus=6:mpiprocs=6
#PBS -l walltime=00:01:30
#PBS -m n
#PBS -e stderr.txt
#PBS -o stdout.txt

cd $PBS_O_WORKDIR
echo "I run on node: `uname -n`"
echo "My working directory is: $PBS_O_WORKDIR"
echo "Assigned to me nodes are:"
cat $PBS_NODEFILE

mpirun -hostfile $PBS_NODEFILE ./test 

I expected this result:

1. New directory "Res" to be created

2. 8*6 different files (res_0, res_1, res_2, ...) to be written to the Res dir

But only the res_* files from the first node are written (res_0 through res_5); the rest are missing.

What is the problem?

Thank you!

OK, let's assume you run on a file system that is coherently mounted across all your compute nodes. This is the case, right? Then the main issue I see with your code snippet is that all processes stat the directory at the same time, and only afterwards is it created if it doesn't exist (by rank 0, while the other ranks just sleep for two seconds). I'm not sure what truly happens, but I'm sure this isn't the smartest idea ever.

Since in essence what you want is a serial sanity check of the directory and/or its creation if needed, why not just let the MPI process of rank 0 do it?

That would give you something like this:

if ( rank == 0 ) { // Only master manages the directory creation
    int rc = stat( dir, &buf );
    ... // sanity check goes here and directory creation as well
    // calling MPI_Abort() in case of failure seems also a good idea
}
// all other processes wait here
MPI_Barrier( MPI_COMM_WORLD );
// now we know the directory exists and is accessible
// let's do our stuff
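
To fill in the elided sanity check, here is a minimal sketch of what rank 0 could run. The helper name ensure_dump_dir and its bool return convention are my own assumptions for illustration, not something from the original program; it only uses the POSIX stat() and mkdir() calls that already appear in the question. The MPI call site is shown as a comment so that the snippet compiles without mpi.h.

```cpp
#include <sys/types.h>
#include <sys/stat.h>
#include <cerrno>

// Hypothetical helper (name and signature are assumptions for this sketch).
// Returns true if `dir` exists as a directory when the call returns,
// creating it first if it is missing. Returns false on any other outcome.
bool ensure_dump_dir(const char* dir) {
    struct stat buf;

    // Path already exists: it is usable only if it is a directory.
    if (stat(dir, &buf) == 0) {
        return S_ISDIR(buf.st_mode);
    }

    // stat() failed for a reason other than "no such entry": give up.
    if (errno != ENOENT) {
        return false;
    }

    // Path is missing: try to create it with the mode used in the question.
    if (mkdir(dir, 0771) == 0) {
        return true;
    }

    // mkdir() can fail with EEXIST if something created the path in between;
    // accept that case only if the path turns out to be a directory.
    if (errno == EEXIST && stat(dir, &buf) == 0) {
        return S_ISDIR(buf.st_mode);
    }
    return false;
}

// Intended call site inside main(), after MPI_Init (assumes `rank` and `dir`
// are defined as in the question):
//
//   if (rank == 0 && !ensure_dump_dir(dir)) {
//       MPI_Abort(MPI_COMM_WORLD, 1);   // nobody can write, stop the job
//   }
//   MPI_Barrier(MPI_COMM_WORLD);        // all other ranks wait here
```

With this arrangement no rank other than 0 touches the directory before the barrier, so the sleep(2) in the original code is no longer needed.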

Could this work for you?
