How to match first string with column and print Match?
My DataFrame is:
import pandas as pd

data = {
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'year': [2000, 2001, 2003, 2004],
    'desc': [' auckland has some good reviews', 'Octagone', 'we shall update you', 'we have varities of shimla mirch'],
}
df = pd.DataFrame(data)
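I tried this code:
df['CompanyMatch'] = df['company_name'] == df['desc']
I want to print "Match" if the first word of the company_name column appears in the desc column. I am confused about where to place the index [0] so that the result looks like this: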
> company_name desc CompanyMatch
> auckland suppliers auckland has some good reviews Match
> Octagone Octagone Match
> SodaBottel we shall update you NA
> Shimla Mirch we have varities of shimla mirch Match
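You can use numpy.where together with apply, using the in operator to check whether the value of one column occurs in the other; axis=1 makes apply process the data row by row: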
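import numpy as np

# True where the whole company_name occurs (case-insensitively) inside desc
m = df.apply(lambda x: x['company_name'].lower() in x['desc'].lower(), axis=1)
# 'Match' for True rows; the other rows print as nan
df['CompanyMatch'] = np.where(m, 'Match', np.nan)
print(df)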
company_name desc year CompanyMatch
0 auckland suppliers auckland has some good reviews 2000 nan
1 Octagone Octagone 2001 Match
2 SodaBottel we shall update you 2003 nan
3 Shimla Mirch we have varities of shimla mirch 2004 Match
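The non-matching rows above print as lowercase nan, which is what you would see if np.where had mixed 'Match' with np.nan into an array of strings, so the cell holds the text 'nan' rather than a missing value. If you want isna() or dropna() to recognise those rows, one option (a sketch, not part of the original answer) is to map the boolean mask instead:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'desc': [' auckland has some good reviews', 'Octagone',
             'we shall update you', 'we have varities of shimla mirch'],
})

# Same row-wise check as above: whole company_name inside desc, ignoring case
m = df.apply(lambda x: x['company_name'].lower() in x['desc'].lower(), axis=1)

# Map the booleans directly so that non-matches stay real missing values
df['CompanyMatch'] = m.map({True: 'Match', False: np.nan})

# isna() now flags exactly the rows that did not match
print(df['CompanyMatch'].isna().tolist())
```
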
EDIT:
To compare only the first word:
# split()[0] takes the first whitespace-separated word of company_name
m = df.apply(lambda x: x['company_name'].split()[0].lower() in x['desc'].lower(), axis=1)
df['CompanyMatch'] = np.where(m, 'Match', np.nan)
print(df)
company_name desc year CompanyMatch
0 auckland suppliers auckland has some good reviews 2000 Match
1 Octagone Octagone 2001 Match
2 SodaBottel we shall update you 2003 nan
3 Shimla Mirch we have varities of shimla mirch 2004 Match
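Note that in on two strings is a substring test, so a first word such as "Soda" would also match a description containing "sodabottel". If you need the first word to appear as a whole word, a possible variant is to compare against the words of desc. The helper name first_word_matches and the extra "Soda Co" row are hypothetical, added only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'desc': [' auckland has some good reviews', 'Octagone',
             'we shall update you', 'we have varities of shimla mirch'],
})

def first_word_matches(row):
    """Return True if the first word of company_name is one of the words in desc."""
    words = row['company_name'].split()
    if not words:  # guard against an empty company_name
        return False
    # Splitting desc on whitespace turns the check into a whole-word comparison
    return words[0].lower() in row['desc'].lower().split()

m = df.apply(first_word_matches, axis=1)
print(m.tolist())

# Hypothetical row: the substring test would match here, the whole-word test does not
print(first_word_matches({'company_name': 'Soda Co', 'desc': 'sodabottel ships fast'}))
```

This splits on whitespace only, so a word followed by punctuation (for example "shimla,") would not match; strip punctuation first if your descriptions contain it.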