How do I check whether data in one column exists in another column in Pandas?

Question

I have a DataFrame with two columns, Location and Job Title. I need to find which rows of Job Title contain one of the names from Location.

        Location    Job Title
0   New York New York   Regional Manager Las Vegas and San Diego
1   New York City   Full Stack Engineer
2   San Francisco Bay Area  Director of Guitar Studies
3   Greater Los Angeles New England Institute of Technology
4   Greater Chicago New England Institute of Technology
... ... ...
984710  NaN Catering Sales Manager
984711  NaN Director, Research & Development and
984712  NaN HR Manager
984713  NaN Director of Development
984714  NaN Development Officer

Location has 625 rows, and Job Title has close to a million rows.

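I tried df['exist1'] = df['Location'].isin(df['Job Title']) and then filtered on the True values, but it shows every value below row 625 as True. The Location column has no values below row 625.

Where am I going wrong? Any help would be greatly appreciated.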

Answer 1

Does this answer your question?

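# Rows with a missing Location or Job Title evaluate to False instead of raising TypeError
df['exist1'] = df.apply(lambda x: isinstance(x['Location'], str) and isinstance(x['Job Title'], str) and x['Location'] in x['Job Title'], axis=1)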

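This is a row-wise substring check: each row's Location is looked up in the Job Title of the same row. If you want to check every job title against every location instead, let us know and I'll be happy to edit the answer accordingly.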
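For the other reading, checking every job title against every location, a minimal sketch might look like the following. The sample data and the exist_any column name are made up for illustration. It assumes the location names should match literally, so each one is passed through re.escape before being joined into a single pattern.

```python
import re
import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
df = pd.DataFrame({
    "Location": ["New York", "Las Vegas", None, None],
    "Job Title": [
        "Full Stack Engineer",
        "Regional Manager Las Vegas and San Diego",
        "New York Sales Lead",
        "HR Manager",
    ],
})

# Unique, non-missing location names, escaped so regex metacharacters match literally.
locations = df["Location"].dropna().unique()
pattern = "|".join(re.escape(loc) for loc in locations)

# True where the job title contains any location name; missing titles count as False.
df["exist_any"] = df["Job Title"].str.contains(pattern, na=False)
```

Here the second and third rows come out True, because their job titles mention Las Vegas and New York, even though the location on the same row is different or missing. If Location had no non-missing values the pattern would be empty and match everything, so that case would need a separate guard.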

Answer 2

You can use str.contains:

df['exist1'] = df['Location'].str.contains('|'.join(df['Job Title'].dropna().tolist()))

If you want to match row by row:

df1 = df.dropna(subset=['Location', 'Job Title']).copy()
df1['exist1'] = [x in y for x, y in zip(df1['Location'], df1['Job Title'])]
df['exist1'] = df1['exist1']
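With the row-wise approach above, rows that were dropped for having a missing value end up as NaN in exist1 after the final assignment. If a plain boolean column is preferred, one possible variation is sketched below; the sample data is made up, and treating skipped rows as False is an assumption about what is wanted.

```python
import pandas as pd

# Hypothetical sample data; the last row has no Location.
df = pd.DataFrame({
    "Location": ["New York", "Chicago", None],
    "Job Title": ["New York Regional Manager", "Full Stack Engineer", "HR Manager"],
})

# Compare only rows where both columns are present.
both = df.dropna(subset=["Location", "Job Title"])
matches = pd.Series(
    [loc in title for loc, title in zip(both["Location"], both["Job Title"])],
    index=both.index,
    dtype=bool,
)

# Align back to the full frame by index; rows that were skipped become False.
df["exist1"] = matches.reindex(df.index, fill_value=False)
```

Filtering with df[df["exist1"]] then works directly, since the column holds only True and False.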

Answer 1
0 2020-06-12 02:09:47

Answer 2
0 2020-06-12 02:10:39
