How to find phrases in a text file
My text file is this:
123 Numbers 4.5
456 Words 6.7
789 Sentences 8.9
And my code is this:
s = open('test.txt', 'r')
file = s.read()
numbers, words, decimals = [], [], []
I've gotten this far, and I'm trying to work out how to create a list for all the numbers, words and decimals in the file. I've heard you can use the split method, so I tried this:
with open('test.txt', 'r') as f:
    for line in f:
        numbers, words, decimals = f.split(","), f.split(","), f.split(",")
I did this assuming it would split every time it encountered a space, but that didn't happen; I just got the error:
AttributeError: '_io.TextIOWrapper' object has no attribute 'split'
Any help would be appreciated. If any elaboration is necessary on what I want to do, please tell me; I'm aware this may have been worded poorly.
First of all, the text file you've posted does not have commas separating the columns, so splitting the string at commas won't work. If you can trust that every line of the file will be identical in structure, then you can simply change your code to be:
numbers, words, decimals = [], [], []
with open('test.txt', 'r') as f:
    for line in f:
        number, word, decimal = line.split()
        numbers.append(number)
        words.append(word)
        decimals.append(decimal)
# Or, more compactly (note: this yields tuples rather than lists):
with open('test.txt', 'r') as f:
    numbers, words, decimals = zip(*(line.split() for line in f))
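If you cannot trust every line to have exactly three fields, the unpacking above raises a ValueError on the first odd line. A minimal sketch of a more tolerant loop follows; the helper name parse_columns and the in-memory sample are my own assumptions for illustration, not part of the question, and skipping bad lines is only one possible policy.

```python
import io

def parse_columns(lines):
    """Collect the three columns, skipping lines without exactly three fields."""
    numbers, words, decimals = [], [], []
    for line in lines:
        fields = line.split()  # split on any run of whitespace
        if len(fields) != 3:
            continue  # blank or malformed line: skip it (assumed policy)
        number, word, decimal = fields
        numbers.append(number)
        words.append(word)
        decimals.append(decimal)
    return numbers, words, decimals

# Hypothetical sample standing in for test.txt, with one blank and one bad line.
sample = io.StringIO(
    "123 Numbers 4.5\n"
    "\n"
    "456 Words 6.7\n"
    "this line has too many fields\n"
    "789 Sentences 8.9\n"
)
numbers, words, decimals = parse_columns(sample)
print(numbers, words, decimals)
```

To read the real file, pass the open file object instead: with open('test.txt') as f: parse_columns(f).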
You want to split each line into fields:
numbers, words, decimals = [], [], []
with open('test.txt', 'r') as f:
    for line in f:
        # split on whitespace, as indicated by your example file, which does not use commas
        number, word, decimal = line.split()
        numbers.append(int(number))
        words.append(word)
        decimals.append(float(decimal))
If you really intend to use real decimals, then you should use decimal.Decimal instead of float.
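As a rough illustration of the difference, here is a small sketch using the decimal column from the sample file. The variable names are my own, and the values are hard-coded here rather than read from test.txt.

```python
from decimal import Decimal

# Strings as they would come out of line.split() for the third column.
raw = ["4.5", "6.7", "8.9"]

# Building Decimal from the string keeps the value exactly as written.
decimals = [Decimal(text) for text in raw]
decimal_total = sum(decimals)

# A float stores the nearest binary fraction, so converting the float 6.7
# to Decimal exposes a value that is not exactly 6.7.
exact_from_text = Decimal("6.7")
exact_from_float = Decimal(6.7)

print(decimal_total)
print(exact_from_text == exact_from_float)
```

Whether this matters depends on what you do with the numbers; for display or rough arithmetic float is usually fine, while exact sums (money, for example) are where Decimal helps.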
Unless you are constrained in some way, I'd recommend using a library designed for working with tabular data, e.g. pandas, where all this would be just:
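import pandas as pd
# header=None because the sample file has no header row; sep=r'\s+' splits on
# runs of whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_table('test.txt', sep=r'\s+', header=None)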
It should be line.split and not f.split, since you're splitting the line and not the file. Also, you're separating your file on commas, but the example file is separated by spaces? If it is separated by spaces you need to use line.split(" ") (or line.split() with no argument, which splits on any whitespace and also drops the trailing newline). Also, when using with open() as f you don't need to open your file beforehand or close it afterwards, as it sorts that out for you. Also, you were saving the entire split line to each variable and overwriting them each time. Overall code:
numbers, words, decimals = [], [], []
with open('test.txt', 'r') as f:
    for line in f:
        fields = line.split()  # split once; no argument also drops the trailing newline
        numbers.append(fields[0])
        words.append(fields[1])
        decimals.append(fields[2])
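The choice between split(" ") and split() matters more than it looks when reading lines from a file, because each line still ends with a newline. A small sketch of the difference, using one line from the sample file as a literal string:

```python
line = "123 Numbers 4.5\n"  # what iterating over the file yields for the first line

by_space = line.split(" ")   # splits on single spaces only; the newline stays on the last field
by_whitespace = line.split() # splits on any run of whitespace and drops the newline

# With two spaces between columns, split(" ") also produces an empty field.
double_spaced = "123  Numbers 4.5\n".split(" ")

print(by_space)
print(by_whitespace)
print(double_spaced)
```

So if you do want split(" "), strip the line first, e.g. line.strip().split(" "); otherwise plain split() is the simpler option for this file.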
a, b, c = [], [], []
with open('new.txt', 'r') as f:
    for line in f:
        m = line.split()  # fields of the current line
        a.append(m[0])
        b.append(m[1])
        c.append(m[2])
print(a, b, c)
Check if this is what you wanted to achieve.