Need to split a data frame column using comma separation

Question

I have a data frame column like this:

comments
misha,park@gmail.com,233432
ammesh,,3545657
",,,"
neta,ne34@gmail.com,,

I want to split on commas. When two commas occur consecutively, the corresponding column should be filled with NA. When three commas occur, all three columns should be filled with NA (as in the third row).

EXPECTED OUTPUT:
comments                        name      mail              phone
misha,park@gmail.com,233432     misha     park@gmail.com    233432
ammesh,,3545657                 ammesh    NA                3545657
",,,"                           NA        NA                NA
neta,ne34@gmail.com,,           neta      ne34@gmail.com    NA

CODE USED:

b = a.join(a['comments'].str.split(',', expand=True).add_prefix('comments')).fillna(np.nan)

Answer 1

In case you do not find something more Pythonic, the following code should work properly. I tried to cover all the ways ',,' can appear:

import numpy as np

# Create the target columns up front (object dtype, so strings and NaN can coexist).
a['name'] = ''
a['mail'] = ''
a['phone'] = ''
cols = ['name', 'mail', 'phone']

for i in range(len(a)):
    row = a.index[i]   # row label, so .loc writes into `a` itself (no chained assignment)
    text = a.comments.iloc[i]
    s = text.split(',')
    if ',,' not in text:
        # No empty fields: name,mail,phone
        a.loc[row, cols] = [s[0], s[1], s[2]]
    elif ',,,' in text:
        # Three commas in a row: all three fields are missing
        a.loc[row, cols] = np.nan
    elif len(s) == 5:
        # ,,mail,, -> only the mail is present
        a.loc[row, cols] = [np.nan, s[2], np.nan]
    elif len(s) == 4:
        if s[0] == '':
            # ,,mail,phone -> name is missing
            a.loc[row, cols] = [np.nan, s[2], s[3]]
        elif s[-1] == '':
            # name,mail,, -> phone is missing
            a.loc[row, cols] = [s[0], s[1], np.nan]
    elif len(s) == 3:
        # name,,phone -> mail is missing
        a.loc[row, cols] = [s[0], np.nan, s[2]]
print(a)
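
If the loop above feels heavy, a shorter vectorized variant might also do the job. This is only a sketch under a few assumptions: the sample frame is rebuilt from the question's data, the third row is taken to be the literal string ,,, (any surrounding quotes are stripped), and every row is assumed to follow the name,mail,phone order, so only the first three fields are kept and empty ones are turned into NaN.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed contents of the `comments` column).
a = pd.DataFrame({
    "comments": [
        "misha,park@gmail.com,233432",
        "ammesh,,3545657",
        ",,,",
        "neta,ne34@gmail.com,,",
    ]
})

parts = (
    a["comments"]
    .str.strip('"')               # drop stray surrounding quotes, if any
    .str.split(",", expand=True)  # one column per comma-separated field
    .iloc[:, :3]                  # keep only the first three fields
    .replace("", np.nan)          # empty fields become NaN
)
parts.columns = ["name", "mail", "phone"]

b = a.join(parts)
print(b)
```

Unlike the loop, this relies on field position alone, so a row such as ,,mail,phone would be read as name and mail missing rather than shifted; whether that is acceptable depends on the real data.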
