Replace part of a string in a pandas column with `and` conditions

Question

I have a pandas DataFrame that looks like this:

    Size    Measure     Location    Messages
    Small     1         Washington  TXT0123 TXT0875 TXT874 TXT0867 TXT0875 TXT0874
    Medium    2         California  TXT020 TXT017 TXT120 TXT012
    Large     3         Texas       TXT0123 TXT0123 TXT0123 TXT0123 TXT0217 TXT0206
    Small     4         California  TXT020 TXT0217 TXT006
    Tiny      5         Nevada      TXT0206 TXT0217 TXT0206

I am trying to remove the 0 from each individual word in the Messages column, but only if the word's length equals 7 and its fourth character is 0.

I tried a for loop, but it removes all the 0s in a matching word, not just the fourth character:

for line in df.Messages:
    for message in line.split():
        if len(message) == 7 and message[3] == '0':
            print(message.replace('0', ''))

I also tried .map, which raised an error:

df.Messages = df.Messages.map(lambda x: x.replace('0', '') for message in line.split() for line in df.Messages if (len(message) == 7 and message[3] == '0'))

TypeError: 'generator' object is not callable

Is there a way to do this with .map that includes the if and and conditionals?

Answer 1

Given that you want to do this for each word, first split the column with str.split, call apply, and then re-join with str.join:

def f(l):
    return [w.replace('0', '') if len(w) == 7 and w[3] == '0' else w for w in l]

df.Messages.str.split().apply(f).str.join(' ')

0    TXT123 TXT875 TXT874 TXT867 TXT875 TXT874
1                  TXT020 TXT017 TXT120 TXT012
2     TXT123 TXT123 TXT123 TXT123 TXT217 TXT26
3                         TXT020 TXT217 TXT006
4                           TXT26 TXT217 TXT26
Name: Messages, dtype: object

If you want to remove only the first 0 in each matching word (and not all of them), use w.replace('0', '', 1) in function f instead.
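To make the single-replacement variant concrete, here is a minimal self-contained sketch that applies it through Series.map, which is also the form the question asked about. The helper name strip_fourth_zero and the two-row sample Series are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Two rows taken from the question's data, enough to show the behaviour.
messages = pd.Series([
    "TXT0123 TXT0123 TXT0123 TXT0123 TXT0217 TXT0206",
    "TXT0206 TXT0217 TXT0206",
])


def strip_fourth_zero(line):
    """Hypothetical helper: in each 7-character word whose 4th character
    is '0', remove only the first '0'; leave every other word unchanged."""
    return " ".join(
        w.replace("0", "", 1) if len(w) == 7 and w[3] == "0" else w
        for w in line.split()
    )


result = messages.map(strip_fourth_zero)
print(result.tolist())
```

Note that replace('0', '', 1) removes the first 0 in the word. That is the fourth character only when the three-character prefix contains no 0, which holds for the TXT prefix in the sample data.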

Answer 2

(df.Messages.str.split().apply(pd.Series).fillna('')
    # applymap is deprecated since pandas 2.1 in favor of DataFrame.map
    .applymap(lambda x: x[:3] + x[4:] if len(x) == 7 and x[3] == '0' else x)
    .apply(' '.join, axis=1))

Out[597]:

0    TXT123 TXT875 TXT874 TXT867 TXT875 TXT874
1                TXT020 TXT017 TXT120 TXT012  
2    TXT123 TXT123 TXT123 TXT123 TXT217 TXT206
3                       TXT020 TXT217 TXT006   
4                       TXT206 TXT217 TXT206   
dtype: object

Answer 3

IIUC:

In [17]: df['Messages'] = df['Messages'].str.replace(r'(\D+)0(\d{3})', r'\1\2', regex=True)

In [18]: df
Out[18]:
     Size  Measure    Location                                   Messages
0   Small        1  Washington  TXT123 TXT875 TXT874 TXT867 TXT875 TXT874
1  Medium        2  California                TXT020 TXT017 TXT120 TXT012
2   Large        3       Texas  TXT123 TXT123 TXT123 TXT123 TXT217 TXT206
3   Small        4  California                       TXT020 TXT217 TXT006
4    Tiny        5      Nevada                       TXT206 TXT217 TXT206
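The pattern above keys on "non-digits, a 0, then three digits" rather than on the word length, so it would also shorten a longer token such as TXT01234. If the length-7 rule has to hold exactly, a stricter pattern is one option. The following is a sketch under that assumption; the name strict_pattern and the sample Series (including the made-up token TXT01234) are illustrative only.

```python
import pandas as pd

# Sample data; "TXT01234" is a made-up 8-character token used to show
# that longer words are left alone.
messages = pd.Series([
    "TXT0123 TXT020 TXT01234",
    "TXT0206 TXT0217 TXT006",
])

# A word is a run of non-space characters. The lookarounds pin the match to
# a whole word of exactly 7 characters: 3 of any kind, a literal 0, 3 more.
strict_pattern = r"(?<!\S)(\S{3})0(\S{3})(?!\S)"

cleaned = messages.str.replace(strict_pattern, r"\1\2", regex=True)
print(cleaned.tolist())
```

Passing regex=True explicitly matters on pandas 2.0 and later, where Series.str.replace treats the pattern as a literal string by default.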

Answer 1: score 4, accepted, 2017-11-14 19:30:49

Answer 2: score 3, 2017-11-14 19:43:39

Answer 3: score 3, 2017-11-14 19:48:26
