
pandas: first two elements in a string column matches dictionary key

I have a dataframe as follows:

import pandas as pd

df = pd.DataFrame({'data1':['the weather is nice today','This is interesting','the weather is good'],
             'data2':['It is raining','The plant is green','the weather is sunny']})

I have a dictionary as follows:

my_dict = {'the weather':'today','the plant':'tree'}

I want to replace the first two words in the data2 column if they are found among the dictionary keys. I tried the following:

for old, new in my_dict.items():
    if pd.Series([' '.join(map(str, l)) for l in df['data2'].str.lower().str.split().map(lambda x: x[0:2])]).str.contains('|'.join(old.capitalize())).any():
        df['data2'] = df['data2'].str.replace(old, new.capitalize(), regex=False)
    else:
        print('does not exist')

But when I print(df), nothing has been replaced.

Expected output:

                       data1           data2
0  the weather is nice today   It is raining
1        This is interesting   Tree is green
2        the weather is good  Today is sunny

If I understand correctly, this is one way to do it (there may be more efficient ways):

df.data2 = df.data2.str.lower()
for k in my_dict:
  df.data2 = df.data2.str[:len(k)].replace(k, my_dict[k]) + df.data2.str[len(k):]

df.data2 = df.data2.str.capitalize()

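Lower-casing and capitalizing are not mentioned in your question, but they are part of your code, so I included them (otherwise it would fail, because the capitalization in your sample code does not match).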
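Note that the approach above lower-cases the whole string before capitalizing it, so any capital letters later in a sentence are lost. As a minimal sketch of an alternative (not the answerer's method), the first two words can be compared case-insensitively while the rest of the string is left untouched. The helper name `replace_first_two_words` is hypothetical, and the sketch assumes the dictionary keys are lower-case and consist of exactly two words separated by a single space:

```python
import pandas as pd

df = pd.DataFrame({
    'data1': ['the weather is nice today', 'This is interesting', 'the weather is good'],
    'data2': ['It is raining', 'The plant is green', 'the weather is sunny'],
})
my_dict = {'the weather': 'today', 'the plant': 'tree'}


def replace_first_two_words(text, mapping):
    """Hypothetical helper: swap the first two words if they form a key in mapping."""
    # Split off at most two leading words; the remainder stays as one piece
    parts = text.split(maxsplit=2)
    if len(parts) < 2:
        return text
    # Compare case-insensitively (assumes the keys in mapping are lower-case)
    key = ' '.join(parts[:2]).lower()
    if key not in mapping:
        return text
    # Capitalize only the replacement; the rest of the sentence keeps its case
    return ' '.join([mapping[key].capitalize()] + parts[2:])


data2 = df['data2'].map(lambda s: replace_first_two_words(s, my_dict))
print(data2.tolist())
```

Because each row needs only one dictionary lookup, no loop over the dictionary is required.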

  1. Use Python's map function to go through the arrays.
  2. The dataframe contains 'The plant', and we were trying to compare it with 'the plant' without converting it to lower case.
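for old, new in my_dict.items():
    # Lower-cased first two words of every row in data2
    first_two = pd.Series([' '.join(map(str, l)) for l in df['data2'].str.lower().str.split().map(lambda x: x[0:2])])
    # Match the key as a literal string ('|'.join(old) would split it into single characters)
    if first_two.str.contains(old, regex=False).any():
        # Lower-case each value before replacing so that 'The plant' matches 'the plant'
        df['data2'] = list(map(lambda x: x.lower().replace(old, new.capitalize()), df['data2']))
    else:
        print('does not exist')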

You can try using pandas.Series.str.replace:

for key, val in my_dict.items():
    df['data2'] = df['data2'].str.replace(f'^{key}', val, case=False, regex=True)
print(df)

                       data1           data2
0  the weather is nice today   It is raining
1        This is interesting   tree is green
2        the weather is good  today is sunny
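The output above leaves the replacements in lower case, whereas the expected output in the question capitalizes them. A possible variation, sketched here under a few assumptions rather than taken from the answer, compiles all keys into one anchored pattern and passes a callable as the replacement. The function name `swap_prefix` is hypothetical, `re.escape` is added in case a key ever contains regex metacharacters, and the dictionary keys are assumed to be lower-case:

```python
import re

import pandas as pd

df = pd.DataFrame({
    'data1': ['the weather is nice today', 'This is interesting', 'the weather is good'],
    'data2': ['It is raining', 'The plant is green', 'the weather is sunny'],
})
my_dict = {'the weather': 'today', 'the plant': 'tree'}

# One alternation of all keys, anchored at the start of the string.
# The trailing \b stops a key from matching only part of a longer word.
pattern = re.compile(
    '^(?:' + '|'.join(re.escape(k) for k in my_dict) + r')\b',
    flags=re.IGNORECASE,
)


def swap_prefix(match):
    # Look up the matched prefix in lower case and capitalize its replacement
    return my_dict[match.group(0).lower()].capitalize()


result = df['data2'].str.replace(pattern, swap_prefix, regex=True)
print(result.tolist())
```

With a compiled pattern, the case-insensitivity is set through `flags=re.IGNORECASE` at compile time instead of the `case=False` argument of `str.replace`.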

