Split pandas dataframe into multiple dataframes by looking for NaN
I'm trying to split a dataframe read from Excel into multiple dataframes.
The dataframe looks like this:

| | Name | Value | Unit |
|---|---|---|---|
| 0 | AA | 10 | mm |
| 1 | BDC | 20 | mm |
| 2 | NaN | NaN | NaN |
| 3 | AFD | 60 | mm |
| 4 | AKW | 18 | cm |
| 5 | TDF | 0,5 | mm |
| 6 | NaN | NaN | NaN |
| 7 | AA | 10 | mm |
| 8 | FB | 65 | l |

I already have the whole dataframe loaded correctly in Python, but I have no clue how to split it into multiple dataframes wherever there is a NaN row. I tried to iterate over .iterrows(), but it only gives me the rows up to the first NaN row. And what is the best practice for merging the rows into a new dataframe? Any help is appreciated.
import pandas as pd

# filepath is the path to the Excel workbook (defined elsewhere)
content = pd.read_excel(filepath, sheet_name='Parameter')
right_tables = content[['Name', 'Value', 'Unit']]
for i, row in right_tables.head().iterrows():
    print(row)
Output in console:
Name AA
Value 10
Unit mm
Name: 0, dtype: object
Name BDC
Value 20
Unit mm
Name: 1, dtype: object
Name NaN
Value NaN
Unit NaN
Name: 2, dtype: object
And the result I need should look like this:

| | Name | Value | Unit |
|---|---|---|---|
| 0 | AA | 10 | mm |
| 1 | BDC | 20 | mm |

| | Name | Value | Unit |
|---|---|---|---|
| 0 | AFD | 60 | mm |
| 1 | AKW | 18 | cm |
| 2 | TDF | 0,5 | mm |

| | Name | Value | Unit |
|---|---|---|---|
| 0 | AA | 10 | mm |
| 1 | FB | 65 | l |

Remove the missing rows with DataFrame.dropna and group by a Series created with Series.isna and Series.cumsum:
# Use a separate loop variable so the original df is not overwritten
for g, group in df.dropna().groupby(df['Name'].isna().cumsum()):
    print(group.reset_index(drop=True))
Name Value Unit
0 AA 10 mm
1 BDC 20 mm
Name Value Unit
0 AFD 60 mm
1 AKW 18 cm
2 TDF 0,5 mm
Name Value Unit
0 AA 10 mm
1 FB 65 l
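
To see why this grouping works, it can help to print the key that `groupby` receives. The sketch below rebuilds the question's table by hand (the `Value` column is kept as strings because of the `0,5` entry, which is an assumption about how Excel delivered it) and shows that the cumulative sum of the NaN mask increases by one at every separator row, so each block gets its own number.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the NaN rows act as separators.
df = pd.DataFrame({
    "Name":  ["AA", "BDC", np.nan, "AFD", "AKW", "TDF", np.nan, "AA", "FB"],
    "Value": ["10", "20", np.nan, "60", "18", "0,5", np.nan, "10", "65"],
    "Unit":  ["mm", "mm", np.nan, "mm", "cm", "mm", np.nan, "mm", "l"],
})

# True on separator rows; cumsum turns that into a running block counter.
key = df["Name"].isna().cumsum()
print(key.tolist())
# [0, 0, 1, 1, 1, 1, 2, 2, 2]

# The separator rows carry the key of the block that follows them,
# but dropna() removes them before grouping, so they never show up.
sizes = df.dropna().groupby(key).size().tolist()
print(sizes)
# [2, 3, 2]
```

The key is longer than the frame returned by `dropna()`; pandas matches the two by index label, which is why the original, unfiltered `df['Name']` can be used to build it.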
If you need a list of DataFrames:
dfs = [group.reset_index(drop=True) for g, group in df.dropna().groupby(df['Name'].isna().cumsum())]
print(dfs)
[ Name Value Unit
0 AA 10 mm
1 BDC 20 mm, Name Value Unit
0 AFD 60 mm
1 AKW 18 cm
2 TDF 0,5 mm, Name Value Unit
0 AA 10 mm
1 FB 65 l]
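
One caveat with the approach above: `dropna()` with its default settings drops any row that has at least one missing value, and the key only looks at the `Name` column. If the sheet may contain rows that are only partly filled in, a variant that treats a row as a separator only when every column is NaN may be safer. The following is a minimal sketch under that assumption; `split_on_nan_rows` is a hypothetical helper name, and the last row of the sample data (`XY`) is invented here to illustrate a partly filled row.

```python
import numpy as np
import pandas as pd


def split_on_nan_rows(frame):
    """Split frame into a list of DataFrames at rows where every column is NaN."""
    is_separator = frame.isna().all(axis=1)   # True only for fully empty rows
    key = is_separator.cumsum()               # running block counter
    kept = frame[~is_separator]               # drop the separator rows only
    return [part.reset_index(drop=True)
            for _, part in kept.groupby(key[~is_separator])]


# Data from the question plus one hypothetical row with a missing Unit.
sample = pd.DataFrame({
    "Name":  ["AA", "BDC", np.nan, "AFD", "AKW", "TDF", np.nan, "AA", "FB", "XY"],
    "Value": ["10", "20", np.nan, "60", "18", "0,5", np.nan, "10", "65", "7"],
    "Unit":  ["mm", "mm", np.nan, "mm", "cm", "mm", np.nan, "mm", "l", np.nan],
})

parts = split_on_nan_rows(sample)
print([len(part) for part in parts])
# [2, 3, 3]
```

With the `dropna()` version the `XY` row would be discarded and the last block would have two rows instead of three; which behaviour is wanted depends on the data.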