
Import multiple CSV files into HDF5 using Python

I am trying to import multiple CSV files from a specific directory into a dataset in an HDF5 file using this code:

import numpy as np
import h5py
import pandas as pd
import glob
yourpath = '/root/Desktop/mal/ex1'
all_files = glob.glob(yourpath + "/*.csv")
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

hf = h5py.File('data.h5', 'w')
hf.create_dataset('dataset_1', data=frame)
hf.close()

But I get an error:

line 15, in <module>
    frame = pd.concat(li, axis=0, ignore_index=True)
File "/usr/local/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 281, in concat
    sort=sort,
File "/usr/local/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 329, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
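
pandas raises "No objects to concatenate" when pd.concat receives an empty list, which in the code above means glob.glob matched no files (for example because the directory is wrong or the files use a different extension; on Linux the pattern is case-sensitive, so "*.csv" does not match "DATA.CSV"). A minimal sketch of a check that fails earlier with a clearer message follows; the helper name find_csv_files is hypothetical, not part of any library:

```python
import glob
import os


def find_csv_files(directory):
    """Return the sorted CSV paths in `directory`, or raise if there are none.

    Hypothetical helper for illustration: it only wraps glob.glob with
    explicit checks so an empty match is reported before pd.concat runs.
    """
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"Not a directory: {directory}")
    paths = sorted(glob.glob(os.path.join(directory, "*.csv")))
    if not paths:
        raise FileNotFoundError(f"No *.csv files found in {directory}")
    return paths
```

Calling find_csv_files(yourpath) in place of the bare glob.glob call would show whether the path or the pattern is the problem.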

Try concatenating the CSV files this way:

import glob
import os

import pandas as pd

PATH = r"/...."  # your path
extension = 'csv'
os.chdir(PATH)
csv_list = glob.glob('*.{}'.format(extension))
print(csv_list)  # check that this list is not empty

# creates new df
df = pd.DataFrame()

for csv in csv_list:
    temp = pd.read_csv(csv)
    df = pd.concat([df, temp], ignore_index=True)

df.drop_duplicates(keep='first', inplace=True)

# .... here comes the rest of your code
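
For the remaining step, writing the combined data to HDF5, here is one possible sketch. It assumes that every CSV file has the same columns and that all columns are numeric, since h5py stores NumPy arrays rather than DataFrames. The function name csv_dir_to_hdf5 and its parameters are made up for this example; the dataset name 'dataset_1' is taken from the question.

```python
import glob
import os

import h5py
import pandas as pd


def csv_dir_to_hdf5(csv_dir, h5_path, dataset_name="dataset_1"):
    """Concatenate every CSV in csv_dir and store the values as one HDF5 dataset.

    Assumption: all files share the same columns and every column is numeric.
    Returns the shape of the array that was written.
    """
    csv_paths = sorted(glob.glob(os.path.join(csv_dir, "*.csv")))
    if not csv_paths:
        # Avoids the less obvious "No objects to concatenate" from pd.concat.
        raise FileNotFoundError(f"No *.csv files found in {csv_dir}")

    # Read all files first, then concatenate once.
    frames = [pd.read_csv(path) for path in csv_paths]
    frame = pd.concat(frames, ignore_index=True)

    # Convert explicitly to a float array; this raises if a column is not numeric.
    values = frame.to_numpy(dtype="float64")

    # The context manager closes the file even if writing fails.
    with h5py.File(h5_path, "w") as hf:
        hf.create_dataset(dataset_name, data=values)
    return values.shape
```

Column names are not stored in this sketch. If the data has mixed types, pandas' own DataFrame.to_hdf (which needs the optional PyTables package) may be a better fit than a plain h5py dataset.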
