Conditionally merge cells' contents in a column
Looking for a pandas-idiomatic way to turn the following df:
  name desc
0    A    a
1  NaN   aa
2  NaN  aaa
3    B    b
4  NaN   bb
into:
  name desc
0    A    a
          aa
          aaa
3    B    b
          bb
# strings in desc are concatenated with a newline character
I am thinking of going with either itertuples or backfill + groupby, but both approaches require some juggling.
Here is the starting point:
import pandas as pd
import numpy as np
nan = np.nan
df = pd.DataFrame(
{'name': ['A', nan, nan, 'B', nan],
'desc': ['a', 'aa', 'aaa', 'b', 'bb']}
)
You can call ffill directly and then agg, without using apply and a lambda:
In [719]: df.ffill().groupby('name').agg('\n'.join).reset_index()
Out[719]:
  name        desc
0    A  a\naa\naaa
1    B       b\nbb
Or:
In [729]: df.ffill().groupby('name', as_index=False).agg({'desc': '\n'.join})
Out[729]:
  name        desc
0    A  a\naa\naaa
1    B       b\nbb
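One thing to watch with grouping on the filled name: groupby sorts the keys by default and puts every row with the same name into one group, even if those rows are not adjacent. If that matters, a possible variant is to group on a block id instead. This is only a sketch: the sample data is changed on purpose so that 'B' appears twice, and the names block and out are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's data: 'B' appears in two separate blocks.
df = pd.DataFrame(
    {'name': ['B', np.nan, 'A', np.nan, 'B'],
     'desc': ['b', 'bb', 'a', 'aa', 'b2']}
)

# Every non-null name starts a new block; cumsum turns that into a block id.
block = df['name'].notna().cumsum().rename('block')

out = (
    df.assign(name=df['name'].ffill())      # fill names downwards
      .groupby(block, sort=False)           # group by block, keep original order
      .agg({'name': 'first', 'desc': '\n'.join})
      .reset_index(drop=True)
)
print(out)
```

With this data the result keeps three rows in the original order (B, A, B), whereas grouping on name alone would merge the two B blocks and list A first.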
I think you want a combination of fillna(method='ffill') and groupby.
How does this look?
import pandas as pd
import numpy as np
nan = np.nan
df = pd.DataFrame(
{'name': ['A', nan, nan, 'B', nan],
'desc': ['a', 'aa', 'aaa', 'b', 'bb']}
)
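# ffill() is equivalent to fillna(method='ffill'), which newer pandas versions deprecate
df['name'] = df['name'].ffill()
# join each group's desc values with a newline
df = df.groupby('name')['desc'].apply(lambda d: '\n'.join(d)).reset_index()
print(df)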
prints:
  name        desc
0    A  a\naa\naaa
1    B       b\nbb
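Note that the \n in the printed frame is just how pandas displays the cell; the strings contain real newline characters. A quick way to check, reusing the question's data (the names out and cell are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'name': ['A', np.nan, np.nan, 'B', np.nan],
     'desc': ['a', 'aa', 'aaa', 'b', 'bb']}
)
out = df.ffill().groupby('name', as_index=False).agg({'desc': '\n'.join})

cell = out.loc[0, 'desc']
print(repr(cell))          # escaped form, as shown in the frame
print(cell)                # printed as separate lines
print(cell.splitlines())   # back to the individual strings
```

So a single cell prints over several lines, and splitlines() recovers the original pieces if you need them again.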