
Create an Excel file with Python from multiple CSV files, summarized by month for every CSV file

I am in need of help. My job is the following:

  • copy a value from position 2,2 (row 2, column 2) of each CSV file
  • compute the sum of all values in column 17 of every CSV file
  • put all of this into a single Excel file.

But each CSV file contains data from several months. These should be summarized, ordered by month. Dates are given in the following form: 04/11/2022 11:54:43. The Excel file should look like this:

ID    Value of C.    Month
12    30             Jan
12    12             Feb
15     3             Jan

My code so far:

import os, glob
import pandas as pd

path = os.getcwd()
csv_files = glob.glob(os.path.join(path, "*.csv"))

# collect the results in a single dataframe
df4 = pd.DataFrame(columns=['SIM-Karte', 'Datenverbrauch'])

# loop over the list of csv files
for f in csv_files:

    # read the csv file (column 1 = ID, column 16 = data usage)
    df = pd.read_csv(f, sep=';', skiprows=1, usecols=[1, 16], header=None)
    # ID from the first data row
    ID = df.iloc[0][1]
    # sum of column 16
    dat_Verbr = df[16].sum()
    df4.loc[len(df4.index)] = [ID, dat_Verbr]

# Specify the name of the excel file
file_name = 'Auswertung.xlsx'
  
# save the Excel sheet
df4.to_excel(file_name, index=False)
print('Records successfully exported into Excel file')

The code only creates one sum for each CSV file. What should I do to:

  • sort the data in each CSV file by month
  • sum all the data for each single month, and how do I bring this into a single Excel file? I'm a beginner with Python, so please help.

Not professional, but it should work. The main idea is to group the data frame from each .csv file by ID and sum column 16:

import os, glob
import pandas as pd

path = os.getcwd()
csv_files = glob.glob(os.path.join(path, "*.csv"))

frames = []
# loop over the list of csv files
for f in csv_files:
    # read the csv file (column 1 = ID, column 16 = data usage)
    df = pd.read_csv(f, sep=';', skiprows=1, usecols=[1, 16], header=None)
    # group by ID and sum the data usage;
    # reset_index moves the ID from the index back into a column
    frames.append(df.groupby(1)[16].sum().reset_index())

# combine the per-file results into one data frame
df4 = pd.concat(frames, ignore_index=True)

# column names for the final data frame
df4.columns = ['SIM-Karte', 'Datenverbrauch']

# Specify the name of the excel file
file_name = 'Auswertung.xlsx'

# save the Excel sheet
df4.to_excel(file_name, index=False)
print('Records successfully exported into Excel file')
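
The code above sums per ID but does not yet split the totals by month, which the question also asks for. A minimal sketch of one way to do that follows. It rests on assumptions not stated in the question: the position of the timestamp column (DATE_COL is a hypothetical placeholder), and that dates are day-first, so 04/11/2022 means 4 November 2022. The helper names summarize_by_month and build_report are made up for illustration.

```python
import pandas as pd

# Column positions. ID_COL and USAGE_COL come from the question's usecols;
# DATE_COL is an ASSUMPTION, since the question does not say where the date is.
ID_COL = 1
USAGE_COL = 16
DATE_COL = 0  # hypothetical position of the timestamp column


def summarize_by_month(df):
    """Return one row per (ID, month) with the summed data usage."""
    out = pd.DataFrame({
        "SIM-Karte": df[ID_COL],
        "Datenverbrauch": df[USAGE_COL],
        # ASSUMPTION: day-first dates, e.g. 04/11/2022 11:54:43 = 4 Nov 2022
        "Datum": pd.to_datetime(df[DATE_COL], format="%d/%m/%Y %H:%M:%S"),
    })
    # A monthly period keeps the months in calendar order when grouping
    out["Monat"] = out["Datum"].dt.to_period("M")
    result = out.groupby(["SIM-Karte", "Monat"], as_index=False)["Datenverbrauch"].sum()
    # Replace the period with a short month label such as Jan or Feb
    result["Monat"] = result["Monat"].dt.strftime("%b")
    return result


def build_report(csv_files, file_name="Auswertung.xlsx"):
    """Summarize every CSV file by month and write all rows to one Excel file."""
    frames = []
    for f in csv_files:
        df = pd.read_csv(f, sep=";", skiprows=1, header=None,
                         usecols=[DATE_COL, ID_COL, USAGE_COL])
        frames.append(summarize_by_month(df))
    report = pd.concat(frames, ignore_index=True)
    report.to_excel(file_name, index=False)
    return report
```

With the csv_files list from above, calling build_report(csv_files) would write the combined table to Auswertung.xlsx. If the files span more than one year, a label format such as "%b %Y" in place of "%b" would keep January 2022 and January 2023 distinguishable.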

