

Break up a data-set into separate excel files based on a certain row value in a given column in Pandas?

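I have a fairly large dataset that I would like to split into separate Excel files based on the names in column A (the "Agent" column in the example below). Ex1 below is a rough example of what the dataset looks like.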


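In this example, I would like one file each for John Doe, Jane Doe, and Steve Smith, containing the information that follows their name (Business Name, Business ID, etc.).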


Agent        Business Name    Business ID    Revenue

John Doe     Bobs Ice Cream   12234          $400
John Doe     Car Repair       445848         $2331
John Doe     Corner Store     243123         $213
John Doe     Cool Taco Stand  2141244        $8912
Jane Doe     Fresh Ice Cream  9271499        $2143
Jane Doe     Breezy Air       0123801        $3412
Steve Smith  Big Golf Range   12938192       $9912
Steve Smith  Iron Gyms        1231233        $4133
Steve Smith  Tims Tires       82489233       $781

I believe python / pandas would be an efficient tool for this, but I am still fairly new to pandas, so I am having trouble getting started.


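# split the DataFrame into one sub-frame per agent
dfs = [d for _, d in df.groupby('Agent')]

# use a separate loop variable so the original df is not overwritten
for agent_df in dfs:
    print(agent_df, '\n')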


      Agent    Business Name  Business ID Revenue
4  Jane Doe  Fresh Ice Cream      9271499   $2143
5  Jane Doe       Breezy Air       123801   $3412 

      Agent    Business Name  Business ID Revenue
0  John Doe   Bobs Ice Cream        12234    $400
1  John Doe       Car Repair       445848   $2331
2  John Doe     Corner Store       243123    $213
3  John Doe  Cool Taco Stand      2141244   $8912 

         Agent   Business Name  Business ID Revenue
6  Steve Smith  Big Golf Range     12938192   $9912
7  Steve Smith       Iron Gyms      1231233   $4133
8  Steve Smith      Tims Tires     82489233    $781 
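Note that the leading zero of Business ID 0123801 is gone in the output above (it prints as 123801), which is what happens when the column is parsed as integers. A minimal sketch of one way to keep it, assuming the data can be read as CSV text (the question only shows a whitespace-aligned table, so the `raw` string and the `groups` dict here are illustrative):

```python
import io

import pandas as pd

# Ex1 from the question, retyped as CSV text (assumed format)
raw = """Agent,Business Name,Business ID,Revenue
John Doe,Bobs Ice Cream,12234,$400
John Doe,Car Repair,445848,$2331
John Doe,Corner Store,243123,$213
John Doe,Cool Taco Stand,2141244,$8912
Jane Doe,Fresh Ice Cream,9271499,$2143
Jane Doe,Breezy Air,0123801,$3412
Steve Smith,Big Golf Range,12938192,$9912
Steve Smith,Iron Gyms,1231233,$4133
Steve Smith,Tims Tires,82489233,$781
"""

# read Business ID as text so that "0123801" keeps its leading zero
df = pd.read_csv(io.StringIO(raw), dtype={"Business ID": str})

# one sub-frame per agent, keyed by agent name (groupby sorts the keys)
groups = {name: group for name, group in df.groupby("Agent")}

print(groups["Jane Doe"])
```

If the IDs are only ever used as numbers, the default integer parsing is fine and this step is not needed.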


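s = df.groupby('Agent')

for name, group in s:
    # the loop body is missing in the source; writing each group to its own
    # workbook is an assumed completion based on the question
    group.to_excel(f"{name}.xlsx", index=False)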


import pandas as pd
# filter the rows for each distinct agent and write them to a CSV file named after the agent
for unique_val in df['Agent'].unique():
    df[df['Agent'] == unique_val].to_csv(f"{unique_val}.csv")


import pandas as pd
# same approach, but writing an Excel workbook per agent instead of a CSV file
for unique_val in df['Agent'].unique():
    df[df['Agent'] == unique_val].to_excel(f"{unique_val}.xlsx")

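Grouping is what you are looking for here. You can iterate over the groups, which gives you the grouping attribute and the data associated with that group. In your case, that is the agent name and the associated business columns.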


import pandas as pd
# make up some data
ex1 = pd.DataFrame([['A',1],['A',2],['B',3],['B',4]], columns = ['letter','number'])

# iterate over the grouped data and export the data frames to excel workbooks
for group_name,data in ex1.groupby('letter'):
    # you probably have more complicated naming logic
    # use index = False if you have not set an index on the dataframe to avoid an extra column of indices
    data.to_excel(group_name + '.xlsx', index = False)
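The comment about "more complicated naming logic" matters in practice, because group values can contain characters that are awkward or invalid in file names. Here is a rough sketch of one way to handle that. `safe_filename` and `split_by_column` are hypothetical helper names, not part of pandas, and the `fmt` switch is only there so the same loop can write CSV when no Excel engine (such as openpyxl) is installed.

```python
import re
from pathlib import Path

import pandas as pd


def safe_filename(name):
    """Turn a group label into a conservative file name stem (hypothetical helper)."""
    # collapse every run of characters other than word characters, dots and dashes into "_"
    cleaned = re.sub(r"[^\w.-]+", "_", str(name)).strip("._")
    return cleaned or "unnamed"


def split_by_column(df, column, out_dir, fmt="xlsx"):
    """Write one file per distinct value in `column` and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for group_name, data in df.groupby(column):
        path = out / f"{safe_filename(group_name)}.{fmt}"
        if fmt == "xlsx":
            # to_excel needs an Excel writer engine to be installed
            data.to_excel(path, index=False)
        else:
            data.to_csv(path, index=False)
        written.append(path)
    return written
```

For the data in the question this would be called as something like `split_by_column(df, 'Agent', 'agents')`. Two different labels that clean to the same stem would overwrite each other, so a real version may need to check for collisions.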

