How to make a bag of words from a text file in Python using the split method

Question

I am trying to learn TF-IDF, but I couldn't build a bag of words from a file.

Code:

docA = open("/home/user/Desktop/da/doca","r")
print(docA.read())
bowA = docA.split(" ")

Error:

AttributeError                            
Traceback (most recent call last)
<ipython-input-32-06e07f9dd975> in <module>
----> 1 bowA = docA.split(" ")

AttributeError: '_io.TextIOWrapper' object has no attribute 'split'

Can anyone help me solve this?

Answer 1

I assume you meant this:

docA = open("/home/user/Desktop/da/doca","r")
# print(docA.read())
bowA = docA.read().split(" ") # or just split() will do
docA.close()

open() returns a file object, not a string, which is why split is not available on docA itself. When you call read(), the whole file is read and the read position is left at the end of the file, so calling read() again returns an empty string. If you also want to print the content, assign it to a variable first, then print it and reuse it as you wish:

docA = open("/home/user/Desktop/da/doca","r")
data = docA.read()
print(data)
bowA = data.split()
docA.close()

Or simply:

with open("/home/user/Desktop/da/doca","r") as docA:
    data = docA.read()
print(data)
bowA = data.split()
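
The read-position behavior described above can be checked with a small experiment. This is only a minimal sketch: instead of the path from the question it writes a throwaway temporary file, and the helper name read_twice and the sample sentence are made up for illustration.

```python
import os
import tempfile

def read_twice(path):
    """Read an open file twice, then rewind and read it a third time."""
    with open(path, "r") as f:
        first = f.read()    # reads everything; the position is now at the end
        second = f.read()   # nothing left to read, so this is an empty string
        f.seek(0)           # move the position back to the start of the file
        third = f.read()    # the full content again
    return first, second, third

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the cat sat on the mat")
    sample_path = tmp.name

first, second, third = read_twice(sample_path)
os.remove(sample_path)  # clean up the temporary file

print(repr(first))
print(repr(second))
print(repr(third))
```

Storing the result of the first read() in a variable, as in the answer, avoids needing seek(0) at all.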

Answer 2

You want to use the returned string instead of the file handle:

docA = open("/home/user/Desktop/da/doca","r")
document_string = docA.read()
docA.close()  # release the file handle once the text is in memory
bowA = document_string.split()

You can just call split() with no arguments; by default it splits on any whitespace.
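
To illustrate the difference between split(" ") and split(), here is a small sketch on an in-memory string. The sample text and the helper name bag_of_words are assumptions for illustration, and counting the words with collections.Counter is one possible next step toward TF-IDF, not something the answers above prescribe.

```python
from collections import Counter

def bag_of_words(text):
    """Split text on any run of whitespace and count each word."""
    return Counter(text.split())

# Sample text with a double space and a newline on purpose.
sample = "the cat  sat\non the mat"

by_space = sample.split(" ")   # splits on single spaces only
by_default = sample.split()    # splits on spaces, tabs and newlines alike

print(by_space)     # contains an empty string and the unsplit 'sat\non'
print(by_default)   # six clean words
print(bag_of_words(sample))
```

With split(" "), repeated spaces produce empty strings and newlines stay inside words, so split() is usually the safer choice for text read from a file.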


Answer 1: score 1, accepted, 2020-02-27 10:25:11

Answer 2: score 0, 2020-02-27 10:16:15
