Concatenate multiple columns of a dataframe with a separating character for non-null values
I have a data frame like this:
df:
C1 C2 C3
1 4 6
2 NaN 9
3 5 NaN
NaN 7 3
I want to concatenate the 3 columns into a single column, with a comma as the separator. But I want the comma (",") only between non-null values.
I tried this, but it doesn't skip the null values:
df['New_Col'] = df[['C1','C2','C3']].agg(','.join, axis=1)
This gives me the output:
New_Col
1,4,6
2,,9
3,5,
,7,3
This is my ideal output:
New_Col
1,4,6
2,9
3,5
7,3
Can anyone help me with this?
In your case, use stack:
df['new'] = df.stack().astype(int).astype(str).groupby(level=0).agg(','.join)
Out[254]:
0 1,4,6
1 2,9
2 3,5
3 7,3
dtype: object
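To make the stack approach above easier to reproduce, here is a minimal sketch. It assumes the columns are numeric and hold real NaN values; the sample frame is rebuilt from the question. The explicit dropna() is an addition here, since whether stack() drops NaN on its own depends on the pandas version.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, assuming numeric columns with real NaN.
df = pd.DataFrame({
    "C1": [1, 2, 3, np.nan],
    "C2": [4, np.nan, 5, 7],
    "C3": [6, 9, np.nan, 3],
})

cols = ["C1", "C2", "C3"]

df["New_Col"] = (
    df[cols]
    .stack()             # long format: one entry per (row, column) pair
    .dropna()            # drop NaN explicitly, independent of pandas version
    .astype(int)         # 4.0 -> 4, so the text has no trailing ".0"
    .astype(str)
    .groupby(level=0)    # regroup by the original row index
    .agg(",".join)
)

print(df["New_Col"].tolist())
```

A row in which every value is NaN has nothing left after dropna(), so it would end up as NaN in New_Col rather than as an empty string.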
Judging by your (wrong) output, you have a dataframe of strings, and the NaN values are actually empty strings (otherwise it would throw TypeError: expected str instance, float found, because NaN is a float).
Since you're dealing with strings, which pandas is not optimized for, a vanilla Python list comprehension is probably the most efficient choice here.
df['NewCol'] = [','.join([e for e in x if e]) for x in df.values]
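If the frame might contain real NaN values as well as empty strings, the same list-comprehension idea can be made to skip both. The following is only a sketch under that assumption; join_present is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd

# String columns with a mix of empty strings and real NaN (assumed data).
df = pd.DataFrame({
    "C1": ["1", "2", "3", np.nan],
    "C2": ["4", "", "5", "7"],
    "C3": ["6", "9", np.nan, "3"],
})

def join_present(row, sep=","):
    """Join the values of one row, skipping NaN and empty strings."""
    return sep.join(str(v) for v in row if pd.notna(v) and v != "")

cols = ["C1", "C2", "C3"]
df["NewCol"] = [join_present(row) for row in df[cols].to_numpy()]

print(df["NewCol"].tolist())
```

Selecting the source columns explicitly keeps the result stable if the line is run again after NewCol has been added.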
You can use filter to get rid of NaNs:
# pd.notna also catches NaN values that are not the np.nan object itself
df['New_Col'] = df.apply(lambda row: ','.join(filter(pd.notna, row)), axis=1)