Search a Python list for matches to a custom list of stem words of varying length
I'm trying to search word-tokenized abstracts for custom stem words using Python. The following code is almost what I want. That is, does any value in stem_words appear one or more times in word_tokenized_abstract?
if(any(word in stem_words for word in word_tokenized_abstract)):
    do stuff
where...
I based the above on the question "One-liner to check if at least one item in list exists in another list?"
My issue is that my stem_words are of different lengths. I've tried the following code (a modification of the above), which did not work for me. I've tried a few other modifications, but they either don't work or cause a crash.
if(any(word in stem_words for word[0:len(word)] in word_tokenized_abstract)):
    do stuff
That is, do any of the values in word_tokenized_abstract begin with any of the values in stem_words?
If it helps, my stem_words = ['pancrea', 'muscul', 'derma', 'ovar']
Thanks! I apologize if this question has been answered previously, but I couldn't find it.
So you want to check whether any string in the first list (stem_words) occurs in any of the strings in the second list (word_tokenized_abstract).
I'd try this:
any(y.startswith(x) for y in word_tokenized_abstract for x in stem_words)
Explanation: for each stem x in stem_words, check whether any string in word_tokenized_abstract starts with x.
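To make the prefix check concrete, here is a minimal sketch. The stems are the ones from the question; the token list is made-up sample data, not the asker's real abstract. It also shows an equivalent form that relies on str.startswith accepting a tuple of prefixes, which avoids the inner loop over stems.

```python
# Stems taken from the question.
stem_words = ['pancrea', 'muscul', 'derma', 'ovar']

# Hypothetical tokenized abstract, for illustration only.
word_tokenized_abstract = ['the', 'pancreatic', 'duct', 'was', 'examined']

# Generator form: True if any token starts with any stem.
has_stem = any(y.startswith(x) for y in word_tokenized_abstract for x in stem_words)

# Equivalent form: str.startswith accepts a tuple of prefixes.
stems = tuple(stem_words)
has_stem_tuple = any(word.startswith(stems) for word in word_tokenized_abstract)

# Collect the matching tokens instead of only testing for a match.
matches = [word for word in word_tokenized_abstract if word.startswith(stems)]

print(has_stem, has_stem_tuple, matches)
```

Note that startswith is case-sensitive, so a token such as 'Pancreatic' would only match if the tokens are lowercased first.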
If you just want the stem to be a substring of the word, then use:
any(x in y for y in word_tokenized_abstract for x in stem_words)
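The difference between the two checks can be seen in a small sketch. The tokens below are hypothetical and chosen to show that the substring test matches stems in the middle of a word, which may or may not be what you want.

```python
# Stems taken from the question.
stem_words = ['pancrea', 'muscul', 'derma', 'ovar']

# Hypothetical tokens: neither starts with a stem, but both contain one.
tokens = ['hypodermal', 'covariance']

# Prefix test: no token begins with a stem.
prefix_match = any(y.startswith(x) for y in tokens for x in stem_words)

# Substring test: 'derma' is inside 'hypodermal', 'ovar' is inside 'covariance'.
substring_match = any(x in y for y in tokens for x in stem_words)

# List which stem matched which token.
substring_hits = [(x, y) for y in tokens for x in stem_words if x in y]

print(prefix_match, substring_match, substring_hits)
```

Here the hit on 'covariance' is probably unwanted for an anatomical stem like 'ovar', so the prefix version is the safer choice when stems are meant to match only the beginning of a word.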