
Extract specific conditions in python

I wrote Python code like this, but it does not work well (no result is returned).

I want to extract just "動詞", "名詞", and "形容詞" (verbs, nouns, and adjectives).

Do you have any ideas?

Thank you so much.

m = MeCab.Tagger("-Ochasen")
for result in results:
 #     word = m.parse(result['text'])

    word = [line.split()[0] for line in m.parse(result['text']).splitlines() if "名詞" in line.split()[-1] 
                                for line in m.parse(result['text']).splitlines() if "動詞" in line.split()[-1] 
                                     for line in m.parse(result['text']).splitlines() if "形容詞" in line.split()[-1]]
    result['mecab'] = word

I am mostly guessing at what you are trying to do. I assume you have a results list and that you are trying to extract a specific set of characters from each element of that list. In that case you could do something like this:

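import MeCab

m = MeCab.Tagger("-Ochasen")
for result in results:
    result_text = result["text"]
    # Parse the text; the output has one token per line
    result_text = m.parse(result_text)
    text_lines = result_text.splitlines()
    word = None
    for line in text_lines:
        # Note: word keeps the label from the last line that matched
        if "名詞" in line:
            word = "名詞"
        elif "動詞" in line:
            word = "動詞"
        elif "形容詞" in line:
            word = "形容詞"
    # Only store a value if at least one line matched
    if word is not None:
        result['mecab'] = word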

Or something along these lines.
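Note that the loop above stores a part-of-speech label rather than the words themselves, and that a substring test such as "動詞" in line also matches 助動詞 (auxiliary verbs). Here is a minimal sketch of one way to collect the words instead. It assumes the usual ChaSen-style layout of tab-separated columns with the part of speech in the fourth column (surface, reading, base form, part of speech, ...). The helper name extract_words is made up for illustration, and the sample string is hand-written to imitate that layout; it is not real tagger output.

```python
TARGET_POS = ("名詞", "動詞", "形容詞")


def extract_words(chasen_output, target_pos=TARGET_POS):
    """Return surface forms whose top-level part of speech is in target_pos."""
    words = []
    for line in chasen_output.splitlines():
        columns = line.split("\t")
        if len(columns) < 4:
            # Skips the trailing "EOS" marker and blank lines
            continue
        # The POS column looks like "名詞-一般"; compare only the top-level part
        top_level_pos = columns[3].split("-")[0]
        if top_level_pos in target_pos:
            words.append(columns[0])
    return words


# Hand-written sample imitating the assumed layout (not real MeCab output)
sample = "\n".join([
    "赤い\tアカイ\t赤い\t形容詞-自立\t形容詞・アウオ段\t基本形",
    "本\tホン\t本\t名詞-一般\t\t",
    "を\tヲ\tを\t助詞-格助詞-一般\t\t",
    "借り\tカリ\t借りる\t動詞-自立\t一段\t連用形",
    "た\tタ\tた\t助動詞\t特殊・タ\t基本形",
    "EOS",
])

print(extract_words(sample))  # ['赤い', '本', '借り']
```

With the setup from the question, this would be used as result['mecab'] = extract_words(m.parse(result['text'])). Comparing the whole top-level category, rather than searching for a substring, is what keeps 助動詞 out of the result.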

This is easier if you work with data that has already been parsed into fields. You should use fugashi, which is also a MeCab wrapper.

import fugashi
tagger = fugashi.Tagger()
nodes = tagger.parseToNodeList("図書館から赤い本を借りた")
goodpos = ['名詞', '動詞', '形容詞']
nodes = [nn.surface for nn in nodes if nn.feature.pos1 in goodpos]
# => ['図書', '赤い', '本', '借り']
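To connect this back to the results list in the question, the same filter can be applied to each element. This is only a sketch: add_pos_words is a hypothetical helper name, and it assumes each element is a dict with a 'text' key and that the tagger can be called on a string to get nodes with surface and feature.pos1 attributes (as a fugashi.Tagger does with a UniDic dictionary).

```python
GOOD_POS = ("名詞", "動詞", "形容詞")


def add_pos_words(results, tagger, good_pos=GOOD_POS):
    """Store the filtered surface forms under result['mecab'] for each result."""
    for result in results:
        result["mecab"] = [
            node.surface
            for node in tagger(result["text"])  # tagger(text) yields nodes
            if node.feature.pos1 in good_pos
        ]
    return results
```

Usage would then look like add_pos_words(results, fugashi.Tagger()). Passing the tagger in as an argument means it is created once and reused for every element.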
