IsADirectoryError: [Errno 21] Is a directory: '/' error while using multiprocessing

Say I have a function that reads several CSV files from a list into data frames, like this:

import os
import pandas as pd

listdF = [os.path.join(os.sep,path,x) for x in os.listdir(path) if x.endswith('.csv')]
def corre_arrys(listdF):
    data = []
    for files in listdF:
        df = pd.read_csv(files,sep='\t',header=0,engine='python')
        #do something
    return(df)

When I run the above function as it is, there is no error and it prints the output I need. However, when I try to run it with multiprocessing as follows,

from multiprocessing import Pool
NUM_PROCS = 8    
pool = Pool(processes=NUM_PROCS)
allDfs = pool.map(corre_arrys,listdF)

it throws the following error:

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/alva/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/alva/anaconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "<ipython-input-42-e4b97b52ffff>", line 4, in corre_arrys
    df = pd.read_csv(files,sep='\t',header=0,engine='python')
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
    self._make_engine(self.engine)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1126, in _make_engine
    self._engine = klass(self.f, **self.options)
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 2269, in __init__
    memory_map=self.memory_map,
  File "/home/alva/anaconda3/lib/python3.7/site-packages/pandas/io/common.py", line 431, in get_handle
    f = open(path_or_buf, mode, errors="replace", newline="")
IsADirectoryError: [Errno 21] Is a directory: '/'
"""

The above exception was the direct cause of the following exception:

IsADirectoryError                         Traceback (most recent call last)
<ipython-input-46-4971753cdf30> in <module>
      4 NUM_PROCS = 8
      5 pool = Pool(processes=NUM_PROCS)
----> 6 allDfs = pool.map(corre_arrys,listdF)

~/anaconda3/lib/python3.7/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

~/anaconda3/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658 
    659     def _set(self, i, obj):

IsADirectoryError: [Errno 21] Is a directory: '/'

listdF looks like the following; each entry is a full path to a file:

['/path/scripts/pc_2_lc_1_T.csv',
 '/path/scripts/pc_2_lc_2_T.csv',
 '/path/scripts/pc_1_lc_1_T.csv',
 '/path/scripts/pc_1_lc_2_T.csv']

I can't figure out where exactly the problem is.

Any help is greatly appreciated. Thanks!!

From your stack trace, it looks like a directory is creeping into your listdF, and pandas.read_csv() fails when it tries to load it. Try explicitly filtering out directories, keeping the full path for each file: listdF = [os.path.join(path, x) for x in os.listdir(path) if os.path.isfile(os.path.join(path, x)) and x.endswith('.csv')]
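Another reading of the traceback is worth checking, since the listdF shown in the question contains only files. pool.map calls the function once per element of the iterable, so under multiprocessing corre_arrys receives a single path string rather than the whole list. The loop inside it then walks that string character by character, and the first character of '/path/scripts/...' is '/', which would match the "Is a directory: '/'" message. The sketch below illustrates this and one possible fix: a worker that takes exactly one path. The names chars_seen_by_loop, load_one and load_all are hypothetical helpers introduced here for illustration, and the temporary files only stand in for the real CSVs.

```python
import os
import tempfile
from multiprocessing import Pool

import pandas as pd


def chars_seen_by_loop(arg):
    """Mimic the question's `for files in listdF:` loop on whatever argument arrives."""
    return [item for item in arg]


def load_one(csv_path):
    """Hypothetical worker: takes ONE path, which is what pool.map passes per call."""
    return pd.read_csv(csv_path, sep='\t', header=0, engine='python')


def load_all(csv_paths, num_procs=2):
    """Hypothetical helper: read every path in parallel, one data frame per path."""
    with Pool(processes=num_procs) as pool:
        return pool.map(load_one, csv_paths)


if __name__ == '__main__':
    # When the argument is a single path string, the loop sees characters.
    print(chars_seen_by_loop('/path/scripts/pc_2_lc_1_T.csv')[:3])  # ['/', 'p', 'a']

    # Stand-in data: two small tab-separated files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name in ('first.csv', 'second.csv'):
            full_path = os.path.join(tmp, name)
            with open(full_path, 'w') as handle:
                handle.write('a\tb\n1\t2\n')
            paths.append(full_path)

        frames = load_all(paths)
        print([frame.shape for frame in frames])
```

With this shape, the original call becomes pool.map(load_one, listdF). If the per-file work really needs the whole list at once, passing the list as a single argument (for example with pool.apply) would be a different design, and it would not split the files across processes.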

