
pandas: first two elements in a string column matches dictionary key

I have a dataframe as follows:

import pandas as pd

df = pd.DataFrame({'data1':['the weather is nice today','This is interesting','the weather is good'],
             'data2':['It is raining','The plant is green','the weather is sunny']})

And I have a dictionary as follows:

my_dict = {'the weather':'today','the plant':'tree'}

I would like to replace the first two words in the data2 column if they are found among the dictionary keys. I have done the following:

for old, new in dic.items():    
    if pd.Series([' '.join(map(str, l)) for l in df['data2'].str.lower().str.split().map(lambda x: x[0:2])]).str.contains('|'.join(old.capitalize()).any():
       df['data2'] = df['data2'].str.replace(old, new.capitalize(), regex=False)
    else:
       print('does not exist')

But when I print(df), nothing has been replaced.

The expected output:

                       data1           data2
0  the weather is nice today   It is raining
1        This is interesting   Tree is green
2        the weather is good  Today is sunny

If I understand correctly, this is one way to do it (there may be more efficient ways):

df.data2 = df.data2.str.lower()
for k in my_dict:
  df.data2 = df.data2.str[:len(k)].replace(k, my_dict[k]) + df.data2.str[len(k):]

df.data2 = df.data2.str.capitalize()

Lowercasing and capitalization weren't in your question but were part of your code, so I put them in (otherwise the replacement would fail, because the capitalization in your sample data doesn't match the dictionary keys).
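Note that lowercasing the whole column also changes rows that don't match any key. If that matters, a variant along the following lines compares against a lowercased copy and only touches the matching rows. This is a sketch, not a tested general solution: the names `lowered`, `starts_with_key` and `swapped` are mine, and it assumes the dictionary keys are already lowercase. It also matches on the raw prefix, so a row starting with "the weatherman" would count as a match.

```python
import pandas as pd

df = pd.DataFrame({
    'data1': ['the weather is nice today', 'This is interesting', 'the weather is good'],
    'data2': ['It is raining', 'The plant is green', 'the weather is sunny'],
})
my_dict = {'the weather': 'today', 'the plant': 'tree'}

original = df['data2']
lowered = original.str.lower()  # used only for the comparison
result = original.copy()

for key, value in my_dict.items():
    # True where the row starts with the key, ignoring case
    starts_with_key = lowered.str.startswith(key)
    # candidate text for every row: capitalized value + everything after the key
    swapped = value.capitalize() + original.str[len(key):]
    # keep the current text where the row does not match, take the swap where it does
    result = result.where(~starts_with_key, swapped)

df['data2'] = result
print(df['data2'].tolist())
```

With the sample data this should print ['It is raining', 'Tree is green', 'Today is sunny'], leaving the first row exactly as it was.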

  1. Use Python's map function to go through the values.
  2. The dataframe contains The plant, and your code compares it with the plant without converting it to lower case first.
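for old, new in my_dict.items():
    # first two words of each row, lowercased so they can be compared with the lowercase key
    first_two = df['data2'].str.lower().str.split().str[:2].str.join(' ')
    if (first_two == old).any():
        # swap the key only in rows that start with it (ignoring case); leave other rows as they are
        df['data2'] = list(map(lambda x: new.capitalize() + x[len(old):] if x.lower().startswith(old) else x, df['data2']))
    else:
        print('does not exist')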
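One more detail about the condition in the question: `'|'.join(...)` applied to a single string joins its characters, not whole keys, so the resulting pattern matches almost any text. The small check below illustrates this with plain `re`; the variable names `char_pattern` and `key_pattern` are only for illustration.

```python
import re

my_dict = {'the weather': 'today', 'the plant': 'tree'}
old = 'the plant'

# joining a single string puts '|' between its characters
char_pattern = '|'.join(old)
print(char_pattern)  # t|h|e| |p|l|a|n|t

# this pattern matches any text containing one of those characters
print(re.search(char_pattern, 'it is raining') is not None)  # True

# joining the list of keys gives an alternation of whole keys instead
key_pattern = '|'.join(re.escape(k) for k in my_dict)
print(re.search(key_pattern, 'it is raining') is not None)  # False
```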

You can try with pandas.Series.str.replace:

for key, val in my_dict.items():
    df['data2'] = df['data2'].str.replace(f'^{key}', val, case=False, regex=True)
print(df)

                       data1           data2
0  the weather is nice today   It is raining
1        This is interesting   tree is green
2        the weather is good  today is sunny
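The output above is lowercase, while the expected output in the question is capitalized. If you want the capitalization, and want keys with regex metacharacters to be safe, something like the following might work. It is a sketch under a few assumptions: the keys are matched case-insensitively, a word boundary is required after the key, and the helper names `lookup`, `pattern` and `swap` are mine. It builds one anchored pattern from all keys and passes a callable to `str.replace`.

```python
import re
import pandas as pd

data2 = pd.Series(['It is raining', 'The plant is green', 'the weather is sunny'])
my_dict = {'the weather': 'today', 'the plant': 'tree'}

# lowercase lookup so the matched text can be mapped back to its value
lookup = {key.lower(): value for key, value in my_dict.items()}

# longer keys first, so a longer key wins over a shorter key that is its prefix
keys = sorted(lookup, key=len, reverse=True)
pattern = re.compile(
    r'^(' + '|'.join(re.escape(k) for k in keys) + r')\b',
    flags=re.IGNORECASE,
)

def swap(match):
    # look up the matched key case-insensitively and capitalize its value
    return lookup[match.group(1).lower()].capitalize()

replaced = data2.str.replace(pattern, swap, regex=True)
print(replaced.tolist())
```

With the sample data this should give ['It is raining', 'Tree is green', 'Today is sunny']. Because the pattern is compiled with its own flags, `case` and `flags` are not passed to `str.replace`.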


