Need to split a DataFrame column on a comma delimiter
I have a DataFrame with a column like this:
comments
misha,park@gmail.com,233432
ammesh,,3545657
",,,"
neta,ne34@gmail.com,,
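I want to split on the comma. When two commas appear in a row, the corresponding column should be filled with NA. When three commas appear, all three columns should be filled with NA (as in the third row).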
Expected output:
comments                      name    mail             phone
misha,park@gmail.com,233432   misha   park@gmail.com   233432
ammesh,,3545657               ammesh  NA               3545657
",,,"                         NA      NA               NA
neta,ne34@gmail.com,,         neta    ne34@gmail.com   NA
Code used:
b = a.join(a['comments'].str.split(',', expand=True).add_prefix('comments')).fillna(np.nan)
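For what it's worth, looking at what the split alone returns may help explain the result. The sketch below rebuilds the sample frame by hand; it assumes the quoted third row holds the bare string `,,,` (the quotes being CSV quoting), which the post does not state. The name `parts` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand (assumption: the third row is the bare string ',,,').
a = pd.DataFrame({"comments": [
    "misha,park@gmail.com,233432",
    "ammesh,,3545657",
    ",,,",
    "neta,ne34@gmail.com,,",
]})

parts = a["comments"].str.split(",", expand=True)

# Rows with a trailing comma produce a fourth piece, so there are 4 columns.
print(parts.shape)

# An empty field comes back as an empty string, not as a missing value ...
print(repr(parts.loc[1, 1]))

# ... and fillna only touches missing values, so the empty string survives.
print(repr(parts.fillna(np.nan).loc[1, 1]))
```

So the empty fields would still need to be converted to NaN explicitly, and the extra fourth column dropped.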
If you can't find anything more Pythonic, the code below should work. I tried to cover every scenario in which ',,' appears:
import numpy as np

cols = ['name', 'mail', 'phone']
# Create the target columns; rows that match no branch below keep ''.
for c in cols:
    a[c] = ''
# Column positions, so each row can be written with a single .iloc call
# instead of chained assignment such as a['name'].iloc[i] = ...
pos = [a.columns.get_loc(c) for c in cols]

for i in range(len(a)):
    text = a['comments'].iloc[i]
    s = text.split(',')
    if ',,' not in text:
        # No empty field: take the first three values as they are.
        a.iloc[i, pos] = [s[0], s[1], s[2]]
    elif ',,,' in text:
        # Three consecutive commas: every field is missing.
        a.iloc[i, pos] = [np.nan, np.nan, np.nan]
    elif len(s) == 5:
        # e.g. ',,mail,,': only the mail is present.
        a.iloc[i, pos] = [np.nan, s[2], np.nan]
    elif len(s) == 4:
        if s[0] == '':
            # e.g. ',,mail,phone': the name is missing.
            a.iloc[i, pos] = [np.nan, s[2], s[3]]
        elif s[-1] == '':
            # e.g. 'name,mail,,': the phone is missing.
            a.iloc[i, pos] = [s[0], s[1], np.nan]
    elif len(s) == 3:
        # e.g. 'name,,phone': the mail is missing.
        a.iloc[i, pos] = [s[0], np.nan, s[2]]
print(a)
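A more vectorized variant is also possible. This is only a sketch, not the method from the post: it assumes every row carries its fields in the order name, mail, phone, that anything after the third field can be discarded, and that the quoted row holds the bare string `,,,`. The names `parts` and `b` are illustrative.

```python
import pandas as pd

# Sample data rebuilt by hand (assumption: the third row is the bare string ',,,').
a = pd.DataFrame({"comments": [
    "misha,park@gmail.com,233432",
    "ammesh,,3545657",
    ",,,",
    "neta,ne34@gmail.com,,",
]})

# Split every row at once; expand=True returns one column per piece.
parts = a["comments"].str.split(",", expand=True)

# Keep exactly three columns (drops the piece created by a trailing comma,
# and pads with NaN if fewer than three pieces were ever produced).
parts = parts.reindex(columns=range(3))

# Turn empty strings into missing values.
parts = parts.mask(parts == "")

parts.columns = ["name", "mail", "phone"]
b = a.join(parts)
print(b)
```

Unlike the loop, this does not special-case rows such as ',,mail,phone'; it simply maps the first three pieces to the three columns by position.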