Combining data from multiple Excel files in Python using the pandas package
I am trying to combine the data from 2 Excel files with each other, but it just isn't working. My code:
import pandas as pd
import numpy as np
import xlsxwriter
import warnings
open_tradein_xlsx = "Z_results.xlsx"
open_keepa_xlsx = "keepa_data.xlsx"
with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    keepa_data = pd.read_excel(open_keepa_xlsx, usecols=['Used: Lowest'])
    tradein_data = pd.read_excel(open_tradein_xlsx, index_col=0,)
dataframe = pd.DataFrame =(tradein_data,keepa_data)
data = pd.concat(dataframe, ignore_index=True)
print(data)
#if dataframe['Used: Lowest'] < dataframe['Rebuy'] or tradein_data['Momox']:
#print(x)
And the output:
             ISBN       Rebuy       Momox  Used: Lowest
0      Unnamed: 0  Unnamed: 1  Unnamed: 1           NaN
1             NaN         NaN         NaN           NaN
2   9783630876672       12.19         2.6           NaN
3   9783423282789       11.48         2.8           NaN
4   9783833879500       16.92       10.15           NaN
5   9783898798822        7.07        2.28           NaN
6   9783453281417       13.06        7.41           NaN
7             NaN         NaN         NaN          13.5
8             NaN         NaN         NaN          14.0
9             NaN         NaN         NaN          19.9
10            NaN         NaN         NaN           2.0
11            NaN         NaN         NaN          16.4
Process finished with exit code 0
I think you can see what I am trying to do: the "Used: Lowest" data should be in rows 2-6.
I have already tried data = pd.concat(dataframe, ignore_index=True, axis=1)
but then I get the following error: pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
The code that creates "Z_results.xlsx":
import pandas as pd
import numpy as np
import xlsxwriter
from pathlib import Path
open_momox_xlsx = ("momox_ergebnisse.xlsx")
momox_data = pd.read_excel(open_momox_xlsx,usecols='B')
open_rebuy_xlsx = ("rebuy_ergebnisse.xlsx")
rebuy_data = pd.read_excel(open_rebuy_xlsx,usecols='B')
open_isbn_xlsx = ("momox_ergebnisse.xlsx")
isbn_data = pd.read_excel(open_rebuy_xlsx,usecols='A')
dataframe = pd.DataFrame =({'ISBN': isbn_data, 'Rebuy': rebuy_data, 'Momox': momox_data})
data = pd.concat(dataframe,axis=1)
data[['Rebuy','Momox']] = data[['Rebuy','Momox']].replace({"///": np.nan, ",": "."}, regex=True).astype(float)
data = data.loc[data[['Rebuy','Momox']].ge(1.).all(axis="columns")]
isbn_output = data['ISBN']
datatoexcel = pd.ExcelWriter("Z_results.xlsx", engine='xlsxwriter')
data.to_excel(datatoexcel)
datatoexcel.save()
np.savetxt("ISBN_output.txt",isbn_output,fmt = "%s")
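I thought xlsx would be the best storage format, but now it feels a bit complicated.
You don't have to create a new DataFrame, because you concatenate the data in the next line of code anyway. The code below should work: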
import pandas as pd
import numpy as np
import xlsxwriter
import warnings
open_tradein_xlsx = "Z_results.xlsx"
open_keepa_xlsx = "keepa_data.xlsx"
with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    keepa_data = pd.read_excel(open_keepa_xlsx, usecols=['Used: Lowest'])
    tradein_data = pd.read_excel(open_tradein_xlsx, index_col=0,)
data = pd.concat([tradein_data, keepa_data], axis=1, ignore_index=True)
print(data)
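Two things are worth keeping in mind with pd.concat(..., axis=1). It aligns rows by index label, so it can still raise InvalidIndexError when an index contains duplicate labels, and ignore_index=True on that axis replaces the column names with 0, 1, 2, ... The following is only a minimal sketch of one way around both, assuming the two sheets list the same books in the same order with no stray header rows. The helper name combine_by_position and the small in-memory frames are hypothetical stand-ins for the Excel files, and the final filter is a guess at what the commented-out comparison in the question was meant to do.

```python
import pandas as pd


def combine_by_position(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Put the columns of `right` next to `left`, matching rows by position."""
    # Drop both indexes so row 0 lines up with row 0, row 1 with row 1, ...
    left = left.reset_index(drop=True)
    right = right.reset_index(drop=True)
    # ignore_index keeps its default (False) so the column names survive
    return pd.concat([left, right], axis=1)


# Hypothetical sample data standing in for Z_results.xlsx and keepa_data.xlsx
tradein_data = pd.DataFrame(
    {
        "ISBN": [9783630876672, 9783423282789, 9783833879500],
        "Rebuy": [12.19, 11.48, 16.92],
        "Momox": [2.6, 2.8, 10.15],
    },
    index=[2, 2, 5],  # duplicate labels, the kind of index that breaks alignment
)
keepa_data = pd.DataFrame({"Used: Lowest": [13.5, 9.0, 19.9]})

data = combine_by_position(tradein_data, keepa_data)
print(data)

# Assumed intent of the commented-out check: keep books whose lowest used
# price is below at least one of the two trade-in prices
cheaper = data[
    (data["Used: Lowest"] < data["Rebuy"]) | (data["Used: Lowest"] < data["Momox"])
]
print(cheaper)
```

If the rows are not in the same order in both files, positional matching will silently pair the wrong prices; in that case it would be safer to read a shared key such as the ISBN from both files and join on it instead.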