
How to convert a Text object from the parsetree output of the Pattern module in Python?

I have a list of words like this:

['Urgente', 'Recibimos', 'Info']

I used the parsetree function, parsetree(x, lemmata=True), to convert the words, and the output for each word is this:

[[Sentence('urgente/JJ/B-ADJP/O/urgente')],
[Sentence('recibimos/NN/B-NP/O/recibimos')],
[Sentence('info/NN/B-NP/O/info')]]

Each component of the list has the type pattern.text.tree.Text.

I need only the group of words inside the parentheses, but I don't know how to get it. I need this output:

[urgente/JJ/B-ADJP/O/urgente,
recibimos/NN/B-NP/O/recibimos,
info/NN/B-NP/O/info]

I used str to convert each component of the list to a string, but this changes the whole output.

From their documentation, there doesn't seem to be a direct method or property to get what you want.

But I found that a Sentence object is printed as Sentence('urgente/JJ/B-ADJP/O/urgente') when using repr. So I looked at the source code of the __repr__ implementation to see how that string is formed:

def __repr__(self):
    return "Sentence(%s)" % repr(" ".join(["/".join(word.tags) for word in self.words]))

It seems that the string "in parentheses" is a combination of words and tags. You can then reuse that code, knowing that if you already have pattern.text.tree.Text objects, "a Text is a list of Sentence objects. Each Sentence is a list of Word objects." (from the Parse trees documentation).
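To make the nesting and the join concrete without installing Pattern, here is a minimal sketch. The Word and Sentence classes below are hypothetical stand-ins (not Pattern's real classes) that mimic only the two attributes the __repr__ code relies on, word.tags and sentence.words. The tag values are borrowed from the question's output, and the two-word sentence is made up purely to show how words are separated by spaces.

```python
# Hypothetical stand-ins, NOT Pattern's implementation: they expose only
# the attributes used by the __repr__ code (Word.tags and Sentence.words).
class Word:
    def __init__(self, *tags):
        self.tags = list(tags)


class Sentence:
    def __init__(self, words):
        self.words = words


def format_sentence(sentence):
    """Join each word's tags with '/', then join the words with spaces."""
    return " ".join("/".join(word.tags) for word in sentence.words)


# A "Text" is modelled here as a plain list of Sentence objects.
text = [
    Sentence([Word("urgente", "JJ", "B-ADJP", "O", "urgente")]),
    Sentence([Word("recibimos", "NN", "B-NP", "O", "recibimos"),
              Word("info", "NN", "B-NP", "O", "info")]),
]

formatted = [format_sentence(sentence) for sentence in text]
print(formatted)
# ['urgente/JJ/B-ADJP/O/urgente', 'recibimos/NN/B-NP/O/recibimos info/NN/B-NP/O/info']
```

With real Pattern objects the same two-level join should apply, since only .words and .tags are touched.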

So here's my hacky solution:

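# Assumption: the question does not show its import; pattern.en is used here.
# Swap in the language module you actually use (e.g. pattern.es).
from pattern.en import parsetree

# Parse each word on its own; every call returns a Text object.
parsed = list()
for data in ['Urgente', 'Recibimos', 'Info']:
    parsed.append(parsetree(data, lemmata=True))

# A Text is a list of Sentence objects; rebuild the string that
# Sentence.__repr__ shows inside the parentheses.
output = list()
for text in parsed:
    for sentence in text:
        formatted = " ".join(["/".join(word.tags) for word in sentence.words])
        output.append(str(formatted))

print(output)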

Printing output gives:

['Urgente/NNP/B-NP/O/urgente', 'Recibimos/NNP/B-NP/O/recibimos', 'Info/NNP/B-NP/O/info']

Note that this solution results in a list of str objects (losing all the properties and methods of the original parsetree output).
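If you would rather not touch word.tags at all, another equally hacky option is to work on the repr string itself. The sketch below assumes the repr has exactly the Sentence('...') shape quoted above. The helper name sentence_repr_to_str is made up for illustration and is not part of Pattern. It strips the wrapper and lets ast.literal_eval turn the quoted part back into a plain string.

```python
import ast


def sentence_repr_to_str(sentence_repr):
    """Return the quoted string inside a "Sentence('...')" repr.

    Assumes the repr format quoted in this answer; raises ValueError otherwise.
    """
    prefix, suffix = "Sentence(", ")"
    if not (sentence_repr.startswith(prefix) and sentence_repr.endswith(suffix)):
        raise ValueError("unexpected repr format: %r" % sentence_repr)
    # The remaining text is a Python string literal, so it can be parsed safely.
    return ast.literal_eval(sentence_repr[len(prefix):-len(suffix)])


inner = sentence_repr_to_str("Sentence('urgente/JJ/B-ADJP/O/urgente')")
print(inner)
# urgente/JJ/B-ADJP/O/urgente
```

With real objects you would call sentence_repr_to_str(repr(sentence)) for each sentence. This depends on the repr format staying the same, so the join over word.tags is the more robust of the two.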


