Extract specific conditions in Python
I wrote Python code like this, but it does not work well (no result is returned).
I want to extract only the words tagged "動詞" (verb), "名詞" (noun), or "形容詞" (adjective).
Do you have any ideas?
Thank you so much.
m = MeCab.Tagger("-Ochasen")
for result in results:
    # word = m.parse(result['text'])
    word = [line.split()[0] for line in m.parse(result['text']).splitlines() if "名詞" in line.split()[-1]
            for line in m.parse(result['text']).splitlines() if "動詞" in line.split()[-1]
            for line in m.parse(result['text']).splitlines() if "形容詞" in line.split()[-1]]
    result['mecab'] = word
I am mostly guessing at what you are trying to do. I assume you have a results list and want to pick out specific parts of speech from each result element in it. In that case you need something like:
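import MeCab

m = MeCab.Tagger("-Ochasen")
for result in results:
    result_text = result["text"]
    # Parse the text; MeCab returns one output line per token.
    result_text = m.parse(result_text)
    text_lines = result_text.splitlines()
    word = None
    for line in text_lines:
        # Substring checks: note that "動詞" also matches "助動詞" (auxiliary verb).
        if "名詞" in line:
            word = "名詞"
        elif "動詞" in line:
            word = "動詞"
        elif "形容詞" in line:
            word = "形容詞"
    # This stores the label of the last matching line, not the word itself.
    if word is not None:
        result['mecab'] = word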
Or something along these lines.
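If the goal is to collect the words themselves rather than a part-of-speech label, a sketch along the following lines may help. The comprehension in the question chains three `for ... if` clauses, which Python treats as nested loops rather than as an "or" of the three conditions, so a single loop with one membership test is used here instead. This assumes the ChaSen-style output is tab-separated with the part of speech in the fourth column, written as a hyphen-joined hierarchy such as `名詞-一般`. The names `extract_words`, `WANTED_POS`, and `SAMPLE` are made up for illustration, and `SAMPLE` is a hand-written string in the assumed layout, not captured tagger output.

```python
# Top-level part-of-speech tags to keep: noun, verb, adjective.
WANTED_POS = {"名詞", "動詞", "形容詞"}


def extract_words(chasen_output, wanted=WANTED_POS):
    """Return the surface forms whose top-level POS is in `wanted`.

    Assumes tab-separated ChaSen-style lines:
    surface, reading, base form, POS hierarchy, conjugation type, conjugation form.
    """
    words = []
    for line in chasen_output.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            # Skips the trailing "EOS" marker and any blank lines.
            continue
        # Compare the top-level tag exactly, so "助動詞" does not count as "動詞".
        top_pos = fields[3].split("-")[0]
        if top_pos in wanted:
            words.append(fields[0])
    return words


# Hand-written sample in the assumed layout (illustrative only).
SAMPLE = (
    "赤い\tアカイ\t赤い\t形容詞-自立\t形容詞・アウオ段\t基本形\n"
    "本\tホン\t本\t名詞-一般\t\t\n"
    "を\tヲ\tを\t助詞-格助詞-一般\t\t\n"
    "借り\tカリ\t借りる\t動詞-自立\t一段\t連用形\n"
    "た\tタ\tた\t助動詞\t特殊・タ\t基本形\n"
    "EOS\n"
)

print(extract_words(SAMPLE))  # ['赤い', '本', '借り']
```

Inside the loop over `results`, this would be used as `result['mecab'] = extract_words(m.parse(result['text']))`, provided the tagger's output really follows the layout assumed above.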
This is easier if you work with data that has already been parsed into fields instead of splitting raw output lines yourself. You should use fugashi, which is also a MeCab wrapper.
import fugashi
tagger = fugashi.Tagger()
nodes = tagger.parseToNodeList("図書館から赤い本を借りた")
goodpos = ['名詞', '動詞', '形容詞']
nodes = [nn.surface for nn in nodes if nn.feature.pos1 in goodpos]
# => ['図書', '赤い', '本', '借り']
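To apply the same filter to every element of the `results` list from the question, one possible sketch is shown here. The helper names `make_fugashi_tokenizer` and `add_mecab_words` are hypothetical, and the code assumes each result is a dict with a `"text"` key and that fugashi is installed with a UniDic dictionary (which is what provides the `pos1` field). The tokenizer is passed in as a function so the filtering step can be tried without a dictionary installed.

```python
# Top-level part-of-speech tags to keep: noun, verb, adjective.
GOOD_POS = {"名詞", "動詞", "形容詞"}


def make_fugashi_tokenizer():
    """Build a function mapping text to (surface, pos1) pairs via fugashi.

    fugashi is imported lazily so the rest of this sketch works without it.
    """
    import fugashi  # assumption: fugashi and a UniDic dictionary are installed

    tagger = fugashi.Tagger()  # created once and reused for every text

    def tokenize(text):
        return [(node.surface, node.feature.pos1)
                for node in tagger.parseToNodeList(text)]

    return tokenize


def add_mecab_words(results, tokenize, good_pos=GOOD_POS):
    """Store the filtered surface forms under result['mecab'] for each result."""
    for result in results:
        result["mecab"] = [surface
                           for surface, pos in tokenize(result["text"])
                           if pos in good_pos]
    return results
```

With fugashi available, the call would be `add_mecab_words(results, make_fugashi_tokenizer())`. Any function that returns `(surface, pos)` pairs can stand in for the tokenizer.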