
Replacing part of a string in pandas column with `and` condition

I have a pandas dataframe that looks like this:

    Size    Measure     Location    Messages
    Small     1         Washington  TXT0123 TXT0875 TXT874 TXT0867 TXT0875 TXT0874
    Medium    2         California  TXT020 TXT017 TXT120 TXT012
    Large     3         Texas       TXT0123 TXT0123 TXT0123 TXT0123 TXT0217 TXT0206
    Small     4         California  TXT020 TXT0217 TXT006
    Tiny      5         Nevada      TXT0206 TXT0217 TXT0206

I am trying to remove the 0 from each individual word in the Messages column if the word's length equals 7 and its fourth character is 0.

I've tried a for loop, but it removes all the 0s:

for line in df.Messages:
    for message in line.split():
        if len(message) == 7 and message[3] == '0':
            print(message.replace('0', ''))

I also tried .map, which raised an error:

df.Messages = df.Messages.map(lambda x: x.replace('0', '') for message in line.split() for line in df.Messages if (len(message) == 7 and message[3] == '0'))

TypeError: 'generator' object is not callable

Is there a way to do this with .map that includes the if and and conditions?

Since you want to do this for each word, first split the column with str.split, call apply, and then re-join with str.join:

def f(l):
    return [w.replace('0', '') if len(w) == 7 and w[3] == '0' else w for w in l]

df.Messages.str.split().apply(f).str.join(' ')

0    TXT123 TXT875 TXT874 TXT867 TXT875 TXT874
1                  TXT020 TXT017 TXT120 TXT012
2     TXT123 TXT123 TXT123 TXT123 TXT217 TXT26
3                         TXT020 TXT217 TXT006
4                           TXT26 TXT217 TXT26
Name: Messages, dtype: object

If you want to remove only the first 0 in each matching word (and not all of them), use w.replace('0', '', 1) in function f instead.
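Since the question asks about .map specifically, here is a minimal sketch of the same idea written with Series.map. It slices out the fourth character instead of calling replace, so only that one 0 is dropped even if the word has another 0 earlier on. The helper name drop_fourth_zero is made up for illustration, and the frame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question (only the Messages column matters here).
df = pd.DataFrame({
    "Messages": [
        "TXT0123 TXT0875 TXT874 TXT0867 TXT0875 TXT0874",
        "TXT020 TXT017 TXT120 TXT012",
        "TXT0123 TXT0123 TXT0123 TXT0123 TXT0217 TXT0206",
        "TXT020 TXT0217 TXT006",
        "TXT0206 TXT0217 TXT0206",
    ]
})


def drop_fourth_zero(line):
    """Hypothetical helper: process one cell of the Messages column."""
    # Drop the character at index 3 only when the word has exactly
    # 7 characters and that character is '0'; keep other words as-is.
    return " ".join(
        w[:3] + w[4:] if len(w) == 7 and w[3] == "0" else w
        for w in line.split()
    )


# Series.map calls the function once per cell, so the if/and logic
# lives inside the function rather than inside the map call.
result = df["Messages"].map(drop_fourth_zero)
print(result.tolist())
```

The error in the question comes from passing a generator expression to .map, which expects a callable (or a dict or Series); wrapping the per-word logic in a function as above avoids that.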

(
    df.Messages.str.split().apply(pd.Series).fillna('')
    # drop only the fourth character; on pandas >= 2.1 DataFrame.map replaces the deprecated applymap
    .applymap(lambda x: x[:3] + x[4:] if len(x) == 7 and x[3] == '0' else x)
    .apply(' '.join, axis=1)
)

Out[597]:

0    TXT123 TXT875 TXT874 TXT867 TXT875 TXT874
1                TXT020 TXT017 TXT120 TXT012  
2    TXT123 TXT123 TXT123 TXT123 TXT217 TXT206
3                      TXT020 TXT217 TXT006   
4                      TXT206 TXT217 TXT206   
dtype: object

IIUC:

In [17]: df['Messages'] = df['Messages'].str.replace(r'(\D+)0(\d{3})', r'\1\2', regex=True)

In [18]: df
Out[18]:
     Size  Measure    Location                                   Messages
0   Small        1  Washington  TXT123 TXT875 TXT874 TXT867 TXT875 TXT874
1  Medium        2  California                TXT020 TXT017 TXT120 TXT012
2   Large        3       Texas  TXT123 TXT123 TXT123 TXT123 TXT217 TXT206
3   Small        4  California                       TXT020 TXT217 TXT006
4    Tiny        5      Nevada                       TXT206 TXT217 TXT206
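The pattern above matches any run of non-digits followed by 0 and three digits, which works for the sample data but does not check that the word is exactly 7 characters long. A stricter variant, sketched here as an assumption about what the question intends, anchors each word with whitespace lookarounds and requires three characters on either side of the 0. The names messages, pattern, and cleaned are illustrative only, and the extra ABCD0123 word is added to show the difference.

```python
import pandas as pd

# Messages from the question, plus one 8-character word (ABCD0123)
# that should be left alone under the "length equals 7" rule.
messages = pd.Series([
    "TXT0123 TXT0875 TXT874 TXT0867 TXT0875 TXT0874",
    "TXT020 TXT017 TXT120 TXT012",
    "TXT0206 TXT0217 ABCD0123",
])

# (?<!\S) and (?!\S) require the match to start and end at a word edge
# (start/end of string or whitespace). Between them: 3 characters,
# a literal 0 in fourth position, then 3 more characters = 7 in total.
pattern = r"(?<!\S)(\S{3})0(\S{3})(?!\S)"

# regex=True is passed explicitly because newer pandas versions treat
# the pattern as a literal string by default.
cleaned = messages.str.replace(pattern, r"\1\2", regex=True)
print(cleaned.tolist())
```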
