
Calculate columns from multiple CSV files and save results into a new file

I'm new to Python and am trying to do the following with Python/pandas.

I have four CSV files that look like this (the only difference is the date value in the first column):

First_week.csv:

     date       id   name   total_unitCount   total_orderCount   total_invoiceCount
  2020-02-12     1   Guitar        300              600                   500
  2020-02-12     2   Drums         500              600                   500
  2020-02-12     3   Piano         700              1000                  400

Second_week.csv:

     date       id   name   total_unitCount   total_orderCount   total_invoiceCount
  2020-02-05     1   Guitar        300              800                   500
  2020-02-05     2   Drums         500              300                   500
  2020-02-05     3   Piano         700              350                   400

I need to calculate the percentage difference between two numbers across the CSV files, week over week (first_week.total_orderCount vs. second_week.total_orderCount, second vs. third, third vs. fourth):

Example calculation: Difference = ((total_orderCount[where date is 2020-02-12] - total_orderCount[where date is 2020-02-05]) / total_orderCount[where date is 2020-02-05]) * 100%
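As a quick check of the formula, here is a minimal sketch that applies it to the total_orderCount values from the two sample files. The function name `pct_difference` and its parameters `current` and `previous` are illustrative names, not part of the question.

```python
def pct_difference(current, previous):
    """Percentage change from `previous` to `current` (hypothetical helper)."""
    return (current - previous) / previous * 100

# total_orderCount for 2020-02-12 (current) vs. 2020-02-05 (previous)
guitar = pct_difference(600, 800)   # -25.0
drums = pct_difference(600, 300)    # 100.0

print(guitar, drums)
```

Dropping the final `* 100` gives the same change as a fraction (-0.25 for Guitar, 1.0 for Drums).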

Then save the result for each week to a new CSV file (here I've only provided the results for week1vsweek2):

    id   name   %difference_week1vsweek2  %difference_week2vsweek3  %difference_week3vsweek4
    1   Guitar             -0.25                          
    2   Drums                1                       
    3   Piano               0.65                      

Could someone help me or give me some step-by-step instructions? Thanks in advance!

Here is pseudocode for doing column calculations across multiple CSV files and saving the result to a new file, using pandas in Python:

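import pandas as pd

# Read the two input files
df1 = pd.read_csv('First.csv')
df2 = pd.read_csv('Second.csv')

# Build the output from a column calculation (rows are matched by index)
output_df = pd.DataFrame()
output_df['result'] = df1['col2'] - df2['col2']  # some column calculation

# Write the result to a new file
output_df.to_csv('output.csv', index=False)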

Here is the actual code for the example given in the question:

# Import libraries
import pandas as pd

# Read files
df1 = pd.read_csv('First_week.csv')
df2 = pd.read_csv('Second_week.csv')

# Start the result from the id and name columns, then add the percentage difference
# (assumes both files list the same ids in the same row order)
df3 = df1[['id', 'name']].copy()
df3['%difference_week1vsweek2'] = (df1['total_orderCount'] - df2['total_orderCount']) / df2['total_orderCount'] * 100
print(df3)

# Save the results to a new file, without the index column
df3.to_csv("output.csv", index=False)
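The code above pairs rows by their position in each file. If the files might list the ids in a different order, or if all four weeks need to be compared, one possible approach is to match rows on id and loop over consecutive weeks. The following is only a sketch: the helper `week_over_week` and its parameters are hypothetical names, and the sample frames stand in for the CSV files (the question does not give the third and fourth file names).

```python
import pandas as pd

def week_over_week(frames, value_col="total_orderCount"):
    """Percentage difference between consecutive weekly frames, matched on id.

    frames: list of DataFrames ordered week1, week2, ... (hypothetical helper).
    """
    result = frames[0][["id", "name"]].copy()
    for i in range(len(frames) - 1):
        col = f"%difference_week{i + 1}vsweek{i + 2}"
        # Pair the two weeks on id so row order in the files does not matter
        pair = frames[i][["id", value_col]].merge(
            frames[i + 1][["id", value_col]], on="id", suffixes=("_a", "_b")
        )
        pair[col] = (pair[f"{value_col}_a"] - pair[f"{value_col}_b"]) / pair[f"{value_col}_b"] * 100
        result = result.merge(pair[["id", col]], on="id", how="left")
    return result

# Sample rows from the question; week 2 is deliberately shuffled
week1 = pd.DataFrame({"id": [1, 2, 3], "name": ["Guitar", "Drums", "Piano"],
                      "total_orderCount": [600, 600, 1000]})
week2 = pd.DataFrame({"id": [3, 1, 2], "name": ["Piano", "Guitar", "Drums"],
                      "total_orderCount": [350, 800, 300]})

summary = week_over_week([week1, week2])
print(summary)
# With files on disk, build the list with pd.read_csv for each weekly file,
# then write the result with summary.to_csv("output.csv", index=False)
```

Passing four frames would add the week2vsweek3 and week3vsweek4 columns in the same way. Note that for Piano this formula gives about 185.7 (1.857 as a fraction), not the 0.65 shown in the question's expected output, so it is worth confirming which denominator is intended.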

Hope this helps.

