Search a string for a list of words and return the matching word

Question

Edit: @rong @shaik moeed Here is code that generates part of the data frame and reproduces the problem I am facing:

import pandas as pd

temp = [[1, 'blblblblblb. The quaity of research was good. blblblblb'],
        [2, 'blblblblblb. The quaity of research was average. blblblblb'],
        [3, 'blblblblblb. The quaity of research was poor. blblblblb'],
        [4, 'blblblblblb. The quaity of research was good. blblblblb']
        ]
Data = pd.DataFrame(temp,columns=['ID','Report'])
Data['Sentence']=Data['Report'].str.extract(r"([^.]*?The quaity of research was [^.]*\.)")

Quality_dic=dict([(1, 'excellent'), (2, 'good'),  (3, 'average') , (4, 'poor'), (5, 'unassessable')])



Data['Quality']=[k for k,v in Quality_dic.items() if v in  Data['Sentence'].str.split()]

Unfortunately, the suggested solutions still don't work.

Any thoughts on how to solve this? Thank you, everyone, for your time and input.

Answer 1

I have created a df from your data and implemented what you asked for.

In Quality_dic, you have the same key for Good and Unassessable, so the Good entry will be overwritten by Unassessable.
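To illustrate the point about duplicate keys, here is a small sketch. The names `pairs` and `quality` are only for illustration, and the pairs mirror a dictionary in which key 2 is used twice. When dict() sees the same key more than once, the last value wins:

```python
# Hypothetical pairs for illustration: key 2 appears twice.
pairs = [(1, 'Excellent'), (2, 'Good'), (3, 'Average'), (4, 'Poor'), (2, 'Unassessable')]
quality = dict(pairs)

# Only four keys survive; (2, 'Unassessable') replaced (2, 'Good').
print(quality)
print('Good' in quality.values())
```

Giving Unassessable its own key (for example 5) keeps both entries.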

Try this:

>>> import pandas as pd

>>> temp = [[1, 'blblblblblb. The quaity of research was good. blblblblb'],
...         [2, 'blblblblblb. The quaity of research was average. blblblblb'],
...         [3, 'blblblblblb. The quaity of research was poor. blblblblb'],
...         [4, 'blblblblblb. The quaity of research was good. blblblblb']
...         ]

>>> Data = pd.DataFrame(temp,columns=['ID','Report'])

>>> Data['Sentence']=Data['Report'].str.extract(r"([^.]*?The quaity of research was [^.]*\.)")

>>> # every rating needs its own key
>>> Quality_dic = dict([(1, 'excellent'), (2, 'good'), (3, 'average'), (4, 'poor'), (5, 'unassessable')])

>>> index_col = []

>>> for index, row in Data.iterrows():
...     # key of the first quality word found in the sentence; [0] raises IndexError if no word matches
...     index_col.append([k for k,v in Quality_dic.items() if v.lower() in  row['Sentence'].replace('.','').split()][0])
...
>>> Data["index_col"]=index_col

Output:

>>> Data

   ID    ...    index_col
0   1    ...            2
1   2    ...            3
2   3    ...            4
3   4    ...            2

[4 rows x 4 columns]

Note:

The ... means that some columns are hidden because there is not enough space to display them.
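As an alternative to the row-by-row loop, the same lookup can be sketched with vectorized pandas string methods. This is only one possible approach, not necessarily what the asker needs: it assumes the phrase "The quaity of research was" (spelled as in the sample data) always precedes the rating word, and the names `pattern`, `word_to_key` and `Quality_word` are introduced here for illustration.

```python
import re
import pandas as pd

temp = [[1, 'blblblblblb. The quaity of research was good. blblblblb'],
        [2, 'blblblblblb. The quaity of research was average. blblblblb'],
        [3, 'blblblblblb. The quaity of research was poor. blblblblb'],
        [4, 'blblblblblb. The quaity of research was good. blblblblb']]
Data = pd.DataFrame(temp, columns=['ID', 'Report'])

Quality_dic = {1: 'excellent', 2: 'good', 3: 'average', 4: 'poor', 5: 'unassessable'}

# One pattern that captures whichever rating word follows the fixed phrase.
# "quaity" is kept misspelled on purpose because the sample data spells it that way.
alternatives = '|'.join(re.escape(word) for word in Quality_dic.values())
pattern = r'The quaity of research was (' + alternatives + r')\b'

# expand=False returns a Series; rows without a match become NaN.
Data['Quality_word'] = (
    Data['Report'].str.extract(pattern, flags=re.IGNORECASE, expand=False).str.lower()
)

# Invert the dictionary so each word maps back to its numeric key.
word_to_key = {word: key for key, word in Quality_dic.items()}
Data['Quality'] = Data['Quality_word'].map(word_to_key)

print(Data[['ID', 'Quality_word', 'Quality']])
```

Rows where no rating word is found end up as NaN instead of raising an error, which may or may not be the behaviour you want.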

Answer 2

# every rating needs its own key, otherwise the later entry overwrites the earlier one
quality_dic = dict([(1, 'Excellent'), (2, 'Good'), (3, 'Average'), (4, 'Poor'), (5, 'Unassessable')])

sentence = 'The quality of the research was Poor' # note that 'Poor' here is capitalized

for rating in quality_dic:
    if quality_dic[rating] in sentence:
        print(quality_dic[rating]) # df['Quality'] = quality_dic[rating]

# or if you want a one-liner that collects the matching words:
matches = [word for word in quality_dic.values() if word in sentence]
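Note that the `in` check above is a case-sensitive substring test, so 'Poor' would also match inside 'Poorly' and would miss 'poor'. A minimal sketch of a whole-word, case-insensitive variant follows; the helper name `find_quality_words` is hypothetical and not part of any library.

```python
import re

def find_quality_words(sentence, words):
    """Return the entries of `words` that occur in `sentence` as whole words, ignoring case."""
    found = []
    for word in words:
        # \b marks a word boundary, so 'poor' does not match inside 'poorly'.
        if re.search(r'\b' + re.escape(word) + r'\b', sentence, flags=re.IGNORECASE):
            found.append(word)
    return found

words = ['Excellent', 'Good', 'Average', 'Poor', 'Unassessable']

print(find_quality_words('The quality of the research was poor', words))
print(find_quality_words('The study was poorly designed', words))
```

The first call returns the matching word and the second returns an empty list, because 'poorly' is not a whole-word match for 'Poor'.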

Answer 1: 0, accepted, 2019-05-22 10:52:22

Answer 2: 0, 2019-05-22 10:56:03
