
Remove string from one column if present in string of another column pandas

I feel like I'm close, but I'm looking for something like this, where the new column contains the company name without the city:

                           company  postal_code  name state         city  \
2000-01-01          abc gresham co        97080  john    mi      gresham   
2000-01-01             startup llc        97080  jeff    hi     portland   
2001-01-01  beaverton business biz        99999   sam    ca    beaverton   
2002-01-01                 andy co        92222  joey    or  los angeles   

                 new_col  
2000-01-01        abc co  
2000-01-01   startup llc  
2001-01-01  business biz  
2002-01-01       andy co  

This is what I have so far, but it throws a TypeError: unhashable type: 'Series'

for idx in df1.index:
    if df1["city"].loc[idx] in df1['company'].loc[idx]:
        print("figure out how to print to new column the company name without the city included")
    else:
        print(df1['company'].loc[idx])

Thanks!
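
A likely cause of the TypeError, assuming the frame matches the printed sample: the index label 2000-01-01 appears twice, so `.loc[idx]` returns a Series rather than a single string for that label, and `Series in Series` then tries to hash the left operand. The sketch below rebuilds a small frame with that duplicated label (the construction of `df1` and the use of string labels are assumptions, not taken from the question) and iterates over values instead of labels.

```python
import pandas as pd

# Hypothetical reconstruction of the sample data; note the duplicated index label.
df1 = pd.DataFrame(
    {
        "company": ["abc gresham co", "startup llc", "beaverton business biz", "andy co"],
        "city": ["gresham", "portland", "beaverton", "los angeles"],
    },
    index=["2000-01-01", "2000-01-01", "2001-01-01", "2002-01-01"],
)

# With a duplicated label, .loc returns a Series, not a scalar string.
looked_up = df1["city"].loc["2000-01-01"]
is_series = isinstance(looked_up, pd.Series)

# Iterating over the values pairs each company with its own city,
# regardless of what the index labels look like.
new_col = []
for company, city in zip(df1["company"], df1["city"]):
    # Drop the city if present, then collapse the doubled space left behind.
    new_col.append(" ".join(company.replace(city, "").split()))

df1["new_col"] = new_col
```

Note that `str.replace` matches substrings, so a very short city name could also be removed from the middle of a longer word.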

Here is one solution:

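df = (
    df.reset_index()  # move the date index into an 'index' column so rows get unique labels
    .assign(new_col=df.reset_index()
        # split each company name into a list of words
        .pipe(lambda x: x.assign(x=x['company'].str.split(' ')))
        # one row per word; the row label is repeated for each word
        .explode('x')
        # keep only the words that differ from that row's city
        .loc[lambda x: x['x'] != x['city'], 'x']
        # regroup the surviving words by row and join them back into a string
        .groupby(level=0)
        .agg(list)
        .str.join(' ')
    )
    .set_index('index')  # restore the original index
)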

Output:

>>> df
                           company  postal_code  name state         city       new_col
index                                                                                 
2000-01-01          abc gresham co        97080  john    mi      gresham        abc co
2000-01-01             startup llc        97080  jeff    hi     portland   startup llc
2001-01-01  beaverton business biz        99999   sam    ca    beaverton  business biz
2002-01-01                 andy co        92222  joey    or  los angeles       andy co
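
The split/explode approach compares one word at a time, so a multi-word city such as "los angeles" would never equal a single word of the company name. A minimal alternative sketch, not part of the original answer, removes the city as a whole phrase with a regular expression per row. The helper name `strip_city` and the third sample row are hypothetical, added only to exercise the multi-word case.

```python
import re
import pandas as pd

def strip_city(company: str, city: str) -> str:
    """Remove whole-word occurrences of city from company and tidy the whitespace."""
    # re.escape guards against regex metacharacters in the city name;
    # the lookarounds keep a city like "or" from matching inside "corp".
    pattern = r"(?<!\w)" + re.escape(city) + r"(?!\w)"
    return " ".join(re.sub(pattern, " ", company).split())

# Hypothetical sample; the last row is invented to show a multi-word city.
df = pd.DataFrame(
    {
        "company": ["abc gresham co", "startup llc", "los angeles andy co"],
        "city": ["gresham", "portland", "los angeles"],
    }
)

# Pair each company with its own city by position, so index labels do not matter.
df["new_col"] = [strip_city(co, ci) for co, ci in zip(df["company"], df["city"])]
```

Because the rows are paired by position, this also works when the index contains duplicate labels, at the cost of a Python-level loop instead of vectorized string methods.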

One-liner:

df = df.reset_index().assign(new_col=df.reset_index().pipe(lambda x: x.assign(x=x['company'].str.split(' '))).explode('x').loc[lambda x: x['x'] != x['city'], 'x'].groupby(level=0).agg(list).str.join(' ')).set_index('index')

