Check a list and split a sentence in Python

Question

I have a list as follows.

mylist = ['test copy', 'test project', 'test', 'project']

I want to check whether my sentence contains any of the elements of mylist, split the sentence at the first match, and keep the part before it.

For example:

mystring1 = 'it was a nice test project and I enjoyed it a lot'

output should be: it was a nice

mystring2 = 'the example test was difficult'

output should be: the example

My current code is as follows.

for sentence in L:
    if mylist in sentence:
        splits = sentence.split(mylist)
        sentence= splits[0]

However, I get the error TypeError: 'in <string>' requires string as left operand, not list. Is there a way to fix this?

Answer 1

You need another for loop to iterate over every string in mylist.

mylist = ['test copy', 'test project', 'test', 'project']
mystring1 = 'it was a nice test project and I enjoyed it a lot'
mystring2 = 'the example test was difficult'

L = [mystring1, mystring2]
for sentence in L:
    for word in mylist:
        if word in sentence:
            # keep only the text before the first occurrence of word
            splits = sentence.split(word)
            sentence = splits[0]
            print(sentence)
# it was a nice 
# the example
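
The loop above keeps testing the remaining words after a match, so a sentence such as 'a project test' is printed more than once (first 'a project ', then 'a '). A minimal sketch of one way to split once at the earliest match instead, using str.find; the helper name first_part and the choice to return the sentence unchanged when nothing matches are assumptions made for illustration:

```python
def first_part(sentence, words):
    """Return the text before the earliest occurrence of any word, stripped."""
    # str.find returns -1 when the word is absent, so those are filtered out.
    positions = [pos for pos in (sentence.find(w) for w in words) if pos != -1]
    if not positions:
        # Assumption: leave the sentence unchanged when nothing matches.
        return sentence
    # Cut at the smallest index, i.e. the match that starts first.
    return sentence[:min(positions)].strip()


mylist = ['test copy', 'test project', 'test', 'project']
print(first_part('it was a nice test project and I enjoyed it a lot', mylist))
print(first_part('the example test was difficult', mylist))
```

Because the cut position is the minimum index over all words, the order of mylist does not affect the result.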

Answer 2

Probably the most effective way to do this is to first build a regex that tests all the strings at once:

import re

# mylist and L are the list and sentences defined above
# one alternation pattern; re.escape guards against regex metacharacters
split_regex = re.compile('|'.join(re.escape(s) for s in mylist))

for sentence in L:
    # split at most once and keep the part before the first match
    first_part = split_regex.split(sentence, 1)[0]

This yields:

>>> split_regex.split(mystring1, 1)[0]
'it was a nice '
>>> mystring2 = 'the example test was difficult'
>>> split_regex.split(mystring2, 1)[0]
'the example '

If the number of possible strings is large, a regex can typically outperform searching for each string individually.

You probably also want to .strip() the result (remove whitespace at the start and end of the string):

import re

split_regex = re.compile('|'.join(re.escape(s) for s in mylist))

for sentence in L:
    first_part = split_regex.split(sentence, 1)[0].strip()
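
One detail to keep in mind: when none of the strings occurs in the sentence, split returns a one-element list, so [0] gives back the whole sentence. If you need to tell "no match" apart from a real split, a sketch along these lines with search may help; the function name split_before_first_match and the None return value are assumptions for illustration, not part of the answer above:

```python
import re


def split_before_first_match(sentence, words):
    """Return the stripped text before the leftmost match, or None if no word occurs."""
    # Escape each word so characters like '.' or '+' are matched literally.
    pattern = re.compile('|'.join(re.escape(w) for w in words))
    match = pattern.search(sentence)
    if match is None:
        return None  # assumption: signal "no match" explicitly
    return sentence[:match.start()].strip()


mylist = ['test copy', 'test project', 'test', 'project']
print(split_before_first_match('it was a nice test project and I enjoyed it a lot', mylist))
print(split_before_first_match('no keyword in here', mylist))
```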

Answer 3

mylist = ['test copy', 'test project', 'test', 'project']
L = ['it was a nice test project and I enjoyed it a lot','a test copy']
for sentence in L:
    for x in mylist:
        if x in sentence:
            # keep only the text before the first occurrence of x
            splits = sentence.split(x)
            sentence = splits[0]
            print(sentence)

The error says you are trying to check whether a list is in the sentence, which the in operator does not support for strings. You must iterate over the elements of the list instead.
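
To illustrate the point, here is a small sketch that reproduces the TypeError and then performs the membership test element by element with any(). The variable names contains_any and sentence are chosen here for illustration only:

```python
mylist = ['test copy', 'test project', 'test', 'project']
sentence = 'the example test was difficult'

# A list on the left of "in" with a string on the right raises TypeError.
try:
    mylist in sentence
except TypeError as exc:
    print('TypeError:', exc)

# Testing each element separately works; any() stops at the first hit.
contains_any = any(word in sentence for word in mylist)
print(contains_any)
```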

3 answers

Answer 1
3 accepted 2018-01-02 10:47:13

Answer 2
2 2018-01-02 10:47:35

Answer 3
0 2018-01-02 11:57:09
