[英]Fill in empty space by merging columns using Pandas Python
I am a Python noob.我是 Python 菜鸟。 I have an unstructured text file that I'm trying to capture to a dataframe and export to excel. I need to merge 38 to 36, 45 to 43, and 79 to 78 filling in the empty space with the data on the merging column.
我有一个非结构化文本文件,我试图将其捕获到 dataframe 并导出到 excel。我需要合并 38 到 36、45 到 43 和 79 到 78,用合并列上的数据填充空白区域。
Dummy Dataset虚拟数据集
0 ![]() |
5 ![]() |
36 ![]() |
38 ![]() |
43 ![]() |
45 ![]() |
78 ![]() |
79 ![]() |
|
---|---|---|---|---|---|---|---|---|
1 ![]() |
A![]() |
01JUN2022 ![]() |
1.2 ![]() |
B![]() |
1.2 ![]() |
|||
2 ![]() |
C ![]() |
01JUN2022 ![]() |
1.4 ![]() |
D![]() |
1.4 ![]() |
|||
3 ![]() |
E![]() |
01JUN2022 ![]() |
1.5 ![]() |
F ![]() |
1.6 ![]() |
|||
4 ![]() |
G ![]() |
01JUN2022 ![]() |
1.7 ![]() |
H ![]() |
1.7 ![]() |
|||
5 ![]() |
I![]() |
01JUN2022 ![]() |
1.4 ![]() |
J![]() |
1.8 ![]() |
|||
6 ![]() |
K![]() |
01JUN2022 ![]() |
1.7 ![]() |
L![]() |
1.3 ![]() |
|||
1 ![]() |
A![]() |
01JUN2022 ![]() |
1.2 ![]() |
B![]() |
1.2 ![]() |
|||
2 ![]() |
C ![]() |
01JUN2022 ![]() |
1.4 ![]() |
D![]() |
1.4 ![]() |
|||
3 ![]() |
E![]() |
01JUN2022 ![]() |
1.5 ![]() |
F ![]() |
1.6 ![]() |
|||
4 ![]() |
G ![]() |
01JUN2022 ![]() |
1.7 ![]() |
H ![]() |
1.7 ![]() |
|||
5 ![]() |
I![]() |
01JUN2022 ![]() |
1.4 ![]() |
J![]() |
1.8 ![]() |
|||
6 ![]() |
K![]() |
01JUN2022 ![]() |
1.7 ![]() |
L![]() |
1.3 ![]() |
Required output必填 output
0 ![]() |
5 ![]() |
36 ![]() |
43 ![]() |
79 ![]() |
|
---|---|---|---|---|---|
1 ![]() |
A![]() |
01JUN2022 ![]() |
1.2 ![]() |
B![]() |
1.2 ![]() |
2 ![]() |
C ![]() |
01JUN2022 ![]() |
1.4 ![]() |
D![]() |
1.4 ![]() |
3 ![]() |
E![]() |
01JUN2022 ![]() |
1.5 ![]() |
F ![]() |
1.6 ![]() |
4 ![]() |
G ![]() |
01JUN2022 ![]() |
1.7 ![]() |
H ![]() |
1.7 ![]() |
5 ![]() |
I![]() |
01JUN2022 ![]() |
1.4 ![]() |
J![]() |
1.8 ![]() |
6 ![]() |
K![]() |
01JUN2022 ![]() |
1.7 ![]() |
L![]() |
1.3 ![]() |
1 ![]() |
A![]() |
01JUN2022 ![]() |
1.2 ![]() |
B![]() |
1.2 ![]() |
2 ![]() |
C ![]() |
01JUN2022 ![]() |
1.4 ![]() |
D![]() |
1.4 ![]() |
3 ![]() |
E![]() |
01JUN2022 ![]() |
1.5 ![]() |
F ![]() |
1.6 ![]() |
4 ![]() |
G ![]() |
01JUN2022 ![]() |
1.7 ![]() |
H ![]() |
1.7 ![]() |
5 ![]() |
I![]() |
01JUN2022 ![]() |
1.4 ![]() |
J![]() |
1.8 ![]() |
6 ![]() |
K![]() |
01JUN2022 ![]() |
1.7 ![]() |
L![]() |
1.3 ![]() |
Would start by converting ''
to NaN
as follows首先将
''
转换为NaN
,如下所示
df = df.replace(r'^\s*$', np.nan, regex=True)
Then one can use pandas.Series.combine_first
然后可以使用
pandas.Series.combine_first
df['36'] = df['36'].combine_first(df['38'])
df['43'] = df['43'].combine_first(df['45'])
df['79'] = df['79'].combine_first(df['78'])
[Out]:
id 0 5 36 38 43 45 78 79
0 1 A 01JUN2022 1.2 1.2 B B NaN 1.2
1 2 C 01JUN2022 1.4 1.4 D D NaN 1.4
2 3 E 01JUN2022 1.5 NaN F NaN 1.6 1.6
3 4 G 01JUN2022 1.7 NaN H NaN 1.7 1.7
4 5 I 01JUN2022 1.4 NaN J NaN 1.8 1.8
5 6 K 01JUN2022 1.7 NaN L NaN 1.3 1.3
6 1 A 01JUN2022 1.2 1.2 B B NaN 1.2
7 2 C 01JUN2022 1.4 1.4 D NaN 1.4 1.4
8 3 E 01JUN2022 1.5 1.5 F NaN 1.6 1.6
9 4 G 01JUN2022 1.7 NaN H NaN 1.7 1.7
10 5 I 01JUN2022 1.4 NaN J J NaN 1.8
11 6 K 01JUN2022 1.7 NaN L NaN NaN 1.3
Finally, one can drop the columns that one doesn't want or select the one's to display as follows最后,可以删除不需要的列或 select 显示如下
df = df[['0', '5', '36', '43', '79']]
[Out]:
0 5 36 43 79
0 A 01JUN2022 1.2 B 1.2
1 C 01JUN2022 1.4 D 1.4
2 E 01JUN2022 1.5 F 1.6
3 G 01JUN2022 1.7 H 1.7
4 I 01JUN2022 1.4 J 1.8
5 K 01JUN2022 1.7 L 1.3
6 A 01JUN2022 1.2 B 1.2
7 C 01JUN2022 1.4 D 1.4
8 E 01JUN2022 1.5 F 1.6
9 G 01JUN2022 1.7 H 1.7
10 I 01JUN2022 1.4 J 1.8
11 K 01JUN2022 1.7 L 1.3
and this gives the desired output.这给出了所需的 output。
Notes:笔记:
There are additional ways to remove NaN
and combine the columns.还有其他方法可以删除
NaN
。 For that, see: How to remove nan value while combining two column in Panda Data frame?为此,请参阅: How to remove nan value while combining two column in Panda Data frame?
There are also additional ways to delete columns in Python. If one wants to learn more about it, see: Delete a column from a Pandas DataFrame删除Python中的列也有其他方法,如果想了解更多,请参见:删除Pandas中的列 DataFrame
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.