
Split pandas dataframe into multiple dataframes by looking for NaN

I'm trying to split a dataframe read from Excel into multiple dataframes.

The dataframe looks like this:

  Name Value Unit
0   AA    10   mm
1  BDC    20   mm
2  NaN   NaN  NaN
3  AFD    60   mm
4  AKW    18   cm
5  TDF   0,5   mm
6  NaN   NaN  NaN
7   AA    10   mm
8   FB    65    l

I already have the whole dataframe stored correctly in Python, but I have no clue how to split it into multiple dataframes wherever there is a NaN row. I tried iterating with .iterrows(), but it only gives me the rows up to the first NaN row. Also, what is the best practice for merging the rows into a new dataframe? Any help is appreciated.

import pandas as pd

content = pd.read_excel(filepath, sheet_name='Parameter')
right_tables = content[['Name', 'Value', 'Unit']]

# head() returns only the first five rows by default
for i, row in right_tables.head().iterrows():
    print(row)

Output in console:

Name      AA
Value     10
Unit      mm
Name: 0, dtype: object
Name     BDC
Value     20
Unit      mm
Name: 1, dtype: object
Name     NaN
Value    NaN
Unit     NaN
Name: 2, dtype: object

And the result I need should look like this:

  Name Value Unit
0   AA    10   mm
1  BDC    20   mm

  Name Value Unit
0  AFD    60   mm
1  AKW    18   cm
2  TDF   0,5   mm

  Name Value Unit
0   AA    10   mm
1   FB    65    l

Remove the missing rows with DataFrame.dropna and group by a helper Series built with Series.isna and Series.cumsum. The cumulative sum increases by one at every NaN row, so each block of rows between two NaN rows gets its own group key:

# Use a separate loop variable so the original df is not overwritten
for g, part in df.dropna().groupby(df['Name'].isna().cumsum()):
    print(part.reset_index(drop=True))
  Name Value Unit
0   AA    10   mm
1  BDC    20   mm
  Name Value Unit
0  AFD    60   mm
1  AKW    18   cm
2  TDF   0,5   mm
  Name Value Unit
0   AA    10   mm
1   FB    65    l
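
To make the grouping key visible, here is a minimal self-contained sketch. It assumes the sample data from the question, rebuilt by hand with every Value kept as a string because of the "0,5" entry; the names block_id and parts are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: Value is stored as strings).
df = pd.DataFrame({
    "Name": ["AA", "BDC", np.nan, "AFD", "AKW", "TDF", np.nan, "AA", "FB"],
    "Value": ["10", "20", np.nan, "60", "18", "0,5", np.nan, "10", "65"],
    "Unit": ["mm", "mm", np.nan, "mm", "cm", "mm", np.nan, "mm", "l"],
})

# isna() is True on the separator rows; the running sum goes up by one at
# each separator, so all rows between two separators share one block number.
block_id = df["Name"].isna().cumsum()
print(block_id.tolist())  # [0, 0, 1, 1, 1, 1, 2, 2, 2]

# dropna() removes the separator rows; groupby aligns block_id on the index.
parts = [part.reset_index(drop=True) for _, part in df.dropna().groupby(block_id)]
print(len(parts))  # 3
```

The separator rows carry the block number of the block that follows them, but they never show up in the result because dropna() removes them before grouping.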

If you need a list of DataFrames:

dfs = [part.reset_index(drop=True) for g, part in df.dropna().groupby(df['Name'].isna().cumsum())]
print(dfs)
[  Name Value Unit
0   AA    10   mm
1  BDC    20   mm,   Name Value Unit
0  AFD    60   mm
1  AKW    18   cm
2  TDF   0,5   mm,   Name Value Unit
0   AA    10   mm
1   FB    65    l]
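
One caveat: dropna() with default arguments also drops rows where only some columns are missing. If that matters, a possible variation is to treat a row as a separator only when every column is NaN. The sketch below is an assumption about what such a helper could look like, not part of the original answer; split_on_blank_rows and the sample frame are hypothetical.

```python
from typing import List

import numpy as np
import pandas as pd


def split_on_blank_rows(frame: pd.DataFrame) -> List[pd.DataFrame]:
    """Split frame into blocks separated by rows where every column is NaN."""
    is_blank = frame.isna().all(axis=1)   # True only for fully empty rows
    block_id = is_blank.cumsum()          # running block number per row
    kept = frame[~is_blank]               # keep partially filled rows
    return [part.reset_index(drop=True)
            for _, part in kept.groupby(block_id[~is_blank])]


# Hypothetical sample: the BDC row has no unit but is still real data.
sample = pd.DataFrame({
    "Name": ["AA", "BDC", np.nan, "AFD"],
    "Value": [10.0, 20.0, np.nan, 60.0],
    "Unit": ["mm", np.nan, np.nan, "mm"],
})

blocks = split_on_blank_rows(sample)
print(len(blocks))  # 2
```

With this variant the BDC row stays in the first block even though its unit is missing, whereas dropna() with default arguments would have discarded it.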

