How do I merge multiple columns with similar names in a Pandas DataFrame without losing data?
I am working with some messy data and I'm trying to figure out how to merge multiple columns holding similar information into one column. For example, I have a dataframe that looks like this, and I want to know how to condense the three temperature columns into one:
Country         State      Temp    Temperature   Degrees
United States   Kentucky   $76     76            N/A
United States   Arizona    92\\n   N/A           N/A
United States   Michigan   45      45@           60
You can try this, then drop the unwanted columns:
df['combined'] = df.apply(lambda x: [x['Temp'],
                                     x['Temperature'],
                                     x['Degrees']], axis=1)
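To see the whole round trip, including the drop step that is only mentioned above, here is a minimal sketch. The sample frame is retyped from the question, so the exact raw strings (for instance the stray newline after 92) and the use of NaN for the N/A cells are assumptions, and `source_cols` is just a name picked for this example.

```python
import numpy as np
import pandas as pd

# Sample frame retyped from the question; N/A cells are assumed to be NaN
df = pd.DataFrame({
    "Country": ["United States"] * 3,
    "State": ["Kentucky", "Arizona", "Michigan"],
    "Temp": ["$76", "92\n", "45"],  # the stray newline is an assumption
    "Temperature": ["76", np.nan, "45@"],
    "Degrees": [np.nan, np.nan, "60"],
})

source_cols = ["Temp", "Temperature", "Degrees"]  # hypothetical helper name

# Collect the three values of each row into a single list
df["combined"] = df.apply(lambda x: [x[col] for col in source_cols], axis=1)

# Drop the original columns now that their values live in 'combined'
df = df.drop(columns=source_cols)
```

Nothing is thrown away this way: every raw value, including the NaN placeholders, is still present in the list for its row.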
You can also do something like this if you want them separated with a slash:
df.apply(lambda x: x.Temp + ' / ' + x.Temperature + ' / ' + x.Degrees, axis=1)
# or simply
df['combined'] = df.Temp + ' / ' + df.Temperature + ' / ' + df.Degrees
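If some of the columns hold numbers or missing values, the plain `+` concatenation may raise or propagate NaN. A possible workaround, sketched here on sample data retyped from the question (the raw strings and the NaN cells are assumptions), is to fill the gaps and cast everything to `str` before joining. The `'NaN'` placeholder and the name `cols` are choices made for this example only.

```python
import numpy as np
import pandas as pd

# Sample frame retyped from the question; N/A cells are assumed to be NaN
df = pd.DataFrame({
    "Temp": ["$76", "92\n", "45"],  # the stray newline is an assumption
    "Temperature": ["76", np.nan, "45@"],
    "Degrees": [np.nan, np.nan, "60"],
})

cols = ["Temp", "Temperature", "Degrees"]  # hypothetical helper name

# Replace missing cells with a placeholder, cast to str, then join each row with ' / '
df["combined"] = df[cols].fillna("NaN").astype(str).agg(" / ".join, axis=1)
```

On this sample, the first row becomes `$76 / 76 / NaN` and the last row becomes `45 / 45@ / 60`.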
I tested this on some of my own data that contains NaN values and it worked, so it may be worth a try:
import numpy as np

def nan_to_str(value):
    # np.isnan raises TypeError on strings, so text values fall through to the except branch
    try:
        return 'NaN' if np.isnan(value) else str(value)
    except TypeError:
        return str(value)

def combine_with_nan(x):
    # Join the three readings, writing 'NaN' wherever a value is missing
    return ' / '.join(nan_to_str(x[col]) for col in ('Temp', 'Temperature', 'Degrees'))

df.apply(combine_with_nan, axis=1)
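If the goal is one clean numeric column rather than a joined string, a different approach (not the one shown above) is to strip the junk characters, parse the numbers, and take the first non-missing value in each row. This is only a sketch: the sample data is retyped from the question, the regex assumes the readings are plain decimal numbers, the `.str` accessor assumes each column holds at least some strings, and the names `cleaned`, `temperature_clean` and `conflict` are made up for this example.

```python
import numpy as np
import pandas as pd

# Sample frame retyped from the question; N/A cells are assumed to be NaN
df = pd.DataFrame({
    "State": ["Kentucky", "Arizona", "Michigan"],
    "Temp": ["$76", "92\n", "45"],  # the stray newline is an assumption
    "Temperature": ["76", np.nan, "45@"],
    "Degrees": [np.nan, np.nan, "60"],
})

cols = ["Temp", "Temperature", "Degrees"]

# Remove everything except digits, dots and minus signs, then parse as numbers
cleaned = df[cols].apply(
    lambda s: pd.to_numeric(
        s.str.replace(r"[^0-9.\-]", "", regex=True), errors="coerce"
    )
)

# First non-missing value per row, scanning the columns left to right
df["temperature_clean"] = cleaned.bfill(axis=1).iloc[:, 0]

# Flag rows whose source columns disagree so no reading is silently lost
df["conflict"] = cleaned.nunique(axis=1) > 1
```

With this sample only the Michigan row is flagged, because its columns hold both 45 and 60. Keeping the original columns around until the flagged rows are reviewed avoids losing data.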