Python: read files from a directory and concatenate them

I have a folder et with .csv files, and I am trying to read them and then concatenate them into one file. I tried:

import os

path = 'et/'
for filename in os.listdir(path):
    et = open(filename)
    print et

but I get an error

Traceback (most recent call last):
  File "C:/Users/����� �����������/Desktop/projects/PMI/join et.py", line 5, in <module>
    et = open(filename)
IOError: [Errno 2] No such file or directory: '0et.csv'

I can't understand why I get this error, because when I print filename I get

0et.csv
1et.csv
2et.csv
3et.csv
4et.csv
5et.csv
6et.csv
7et.csv
8et.csv

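You probably want to use et = open(path+filename) instead of just et = open(filename). os.listdir returns only the file names, without the directory, so open(filename) looks for '0et.csv' in the current working directory rather than in et/.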

Edit: as suggested by @thiruvenkadam, best practice is to use et = open(os.path.join(path, filename)).
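
For the original goal of ending up with one file, a minimal standard-library sketch could look like the one below. The helper name concatenate_csv and the assumption that every file starts with the same one-line header are illustrative and not part of the question. It is written for Python 3, and the output file should live outside the input folder so it is not picked up on a later run.

```python
import os


def concatenate_csv(path, out_name):
    """Write every .csv file found in `path` into `out_name` (hypothetical helper).

    Assumes each input file starts with the same one-line header;
    the header is copied from the first file only.
    """
    names = sorted(f for f in os.listdir(path) if f.endswith(".csv"))
    with open(out_name, "w", newline="") as out:
        for index, filename in enumerate(names):
            # listdir returns bare names, so join them with the directory
            with open(os.path.join(path, filename), newline="") as src:
                header = src.readline()
                if index == 0 and header:
                    out.write(header if header.endswith("\n") else header + "\n")
                for line in src:
                    # make sure a missing final newline does not glue two rows together
                    out.write(line if line.endswith("\n") else line + "\n")
    return names
```

Calling concatenate_csv('et', 'all_et.csv') from the directory that contains et would then produce all_et.csv next to the folder.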

Using glob.glob is a better option, together with os.path.join to build the full path:

from glob import glob
from os.path import join, abspath
from os import getcwd

import pandas as pd

dir_path = "et"
# Absolute pattern that matches every .csv file inside the "et" directory
full_path = join(abspath(getcwd()), dir_path, "*.csv")

frames = []
for file_name in sorted(glob(full_path)):
    # Assumes every csv file has a header row (the read_csv default).
    # If the files have no header, pass header=None instead.
    frames.append(pd.read_csv(file_name))

# concat takes a list of dataframes; it raises ValueError if the list is empty
data_frame = pd.concat(frames, ignore_index=True)
  1. abspath makes sure that the full directory path from the root (on Windows, from the drive letter) is used.
  2. Adding *.csv in the join makes sure that only the csv files in the directory are matched.
  3. glob(full_path) returns a list of the csv files in the given directory, each with its absolute path.
  4. Always make sure that you either close the file explicitly or use the with statement to close it automatically; it is clean practice. Any C developer can vouch that closing the file descriptor is best. Since the values go into a dataframe, I left out the with statement and used read_csv from pandas, which opens and closes the file itself when given a path.
  5. pandas.read_csv makes life a lot easier when the csv contents are going into dataframes. With read_csv and pandas concat, the combined data can be written out as a single csv file without repeating the header rows of the other files. The code uses concat rather than DataFrame.append, because append was deprecated in pandas 1.4 and removed in pandas 2.0.
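
Points 1 to 3 can be checked without pandas. The sketch below builds a throwaway directory and compares what os.listdir and glob return. The helper name list_csv_paths and the file names are made up for the demonstration, and glob.escape is added only as a precaution in case the directory name contains glob special characters.

```python
import os
import tempfile
from glob import glob, escape


def list_csv_paths(dir_path):
    """Return the absolute paths of the .csv files in dir_path, sorted."""
    # escape() protects characters such as [ or * in the directory name itself
    pattern = os.path.join(escape(os.path.abspath(dir_path)), "*.csv")
    return sorted(glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # create three empty files, only two of which are csv files
    for name in ("0et.csv", "1et.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    bare_names = sorted(os.listdir(tmp))  # names only, no directory part
    full_paths = list_csv_paths(tmp)      # directory included, .csv files only

    print(bare_names)
    print([os.path.basename(p) for p in full_paths])
    print(all(os.path.isabs(p) for p in full_paths))
```

Because every path returned by glob already includes the directory, it can be passed straight to open or pd.read_csv regardless of the current working directory.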

Maybe it's an encoding problem.

You can try adding the following line at the top of your script:

# -*- coding: utf-8 -*-
