简体   繁体   中英

How to combine multiple csv into one file in serial manner using python?

I am trying to merge multiple CSV file into one CSV file.

The CSV files are like

 Energy_and_Power_Day1.csv,     
 Energy_and_Power_Day2.csv, 
 Energy_and_Power_Day3.csv,      
  ....................., 
 Energy_and_Power_Day31.csv

I have used a small python script to concatenate the multiple CSV file.The script is doing it's job but it is not concatenate the files in serial manner. It should take Energy_and_Power_Day1.csv then Energy_and_Power_Day2.csv then Energy_and_Power_Day3.csv like this way.. but instead of this it takes randomly not in serially. This is my code

import pandas as pd
import csv
import glob
import os

os.chdir("/home/mayukh/Downloads/Northam_bill_data")
results = pd.DataFrame([])
filelist = glob.glob("Energy_and_Power_Day*.csv")
#dfList=[]
for filename in filelist:
  print(filename)  
  namedf = pd.read_csv(filename, skiprows=0, index_col=0)
  results = results.append(namedf)

results.to_csv('Combinefile.csv')

The script is giving this output from print(filename) and combine these csv files in this manner.

Energy_and_Power_Day1.csv
Energy_and_Power_Day16.csv
Energy_and_Power_Day23.csv
Energy_and_Power_Day22.csv
Energy_and_Power_Day11.csv
Energy_and_Power_Day21.csv
Energy_and_Power_Day31.csv
Energy_and_Power_Day17.csv
Energy_and_Power_Day25.csv
Energy_and_Power_Day28.csv
Energy_and_Power_Day9.csv
Energy_and_Power_Day19.csv
Energy_and_Power_Day7.csv
Energy_and_Power_Day15.csv
Energy_and_Power_Day20.csv
Energy_and_Power_Day24.csv
Energy_and_Power_Day4.csv
Energy_and_Power_Day6.csv
Energy_and_Power_Day14.csv
Energy_and_Power_Day13.csv
Energy_and_Power_Day27.csv
Energy_and_Power_Day3.csv
Energy_and_Power_Day18.csv
Energy_and_Power_Day8.csv
Energy_and_Power_Day30.csv
Energy_and_Power_Day12.csv
Energy_and_Power_Day29.csv
Energy_and_Power_Day10.csv
Energy_and_Power_Day5.csv
Energy_and_Power_Day2.csv
Energy_and_Power_Day26.csv

My question is how or which way I can combine those CSV files serially?

It's not "random" (it would depend on how these files are organised by the underlying file system – @tripleee ).

You can sort the filenames before you open the files. Use list.sort with a key parameter. Following this, you can use a list comprehension, and pass a list of dataframes to pd.concat . It should be more efficient than DataFrame.append .

import re

filelist = glob.glob("Energy_and_Power_Day*.csv")
filelist.sort(key=lambda x: int(re.search('\d+', x).group()))

df = pd.concat([
        pd.read_csv(f, skiprows=0, index_col=0) for f in filelist
     ],
     axis=0
)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM