How can I select specific lines of a text file in Python?
I have two problems with a text file:
a) First: I have a text file (it's a log) with many lines: 2221. I just want to print lines 2211 to 2220. How can I do this?
I have this code:
line_number=2011
with open('file.log') as f:
    i = 2011
    for line in f:
        if i == line_number:
            break
        i += 1
        print (line)
but it prints the whole file.
b) Second: Lines 2211 to 2220 look like this:
Dominio1.BL00010001.pdb 24.69530
Dominio1.BL00020001.pdb 14.33748
Dominio1.BL00030001.pdb 30.53454
Dominio1.BL00040001.pdb 23.82516
Dominio1.BL00050001.pdb 27.48684
Dominio1.BL00060001.pdb 18.17364
Dominio1.BL00070001.pdb 30.98407
Dominio1.BL00080001.pdb 17.19927
Dominio1.BL00090001.pdb 19.02460
Dominio1.BL00100001.pdb 22.57086
I want to write code that finds the line with the smallest number and reads the name of the .pdb on it (just the 24 characters of the line that has the smallest number). I need to identify which .pdb has the smallest number and use its name as a string in another script, like this:
model= '%s'%R
where '%s'%R is the name of the .pdb that I need.
How can I do it?
Your code merely breaks when you reach the line of interest, but the print has no condition attached to it, so it prints every line it encounters. Change your code to something like:
start = 2211  # first line to print, counting from 1 as in the question
end = 2220    # last line to print (inclusive)
with open('file.log') as f:
    # Number the lines from 1 so line_number matches the line numbers in the question
    for line_number, line in enumerate(f, start=1):
        if line_number > end:
            break
        if line_number >= start:
            print(line, end='')
You can also turn the file handle into a list and slice it:
with open('file.log') as f:
    # Index 2210 is line 2211; the end index 2220 is excluded, so the slice stops at line 2220
    print("".join(list(f)[2210:2220]))
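Note that list(f) reads the entire file into memory before slicing. A minimal sketch of a lazy alternative, using itertools.islice from the standard library, is shown here. The function name read_line_range and the temporary demo file are made up for illustration, and the line numbers are assumed to be 1-based as in the question.

```python
import os
import tempfile
from itertools import islice


def read_line_range(path, first, last):
    """Return lines first..last (1-based, inclusive) of the file at path."""
    with open(path) as f:
        # islice takes 0-based, end-exclusive positions, like a list slice,
        # and stops reading once it has passed line `last`
        return list(islice(f, first - 1, last))


# Demo on a throwaway 2221-line file (hypothetical stand-in for file.log)
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.writelines(f"line {n}\n" for n in range(1, 2222))
try:
    for line in read_line_range(tmp.name, 2211, 2220):
        print(line, end="")
finally:
    os.remove(tmp.name)
```

For a log of a couple of thousand lines the difference is negligible, but for very large files this avoids building a list of every line.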
A:
with open('file.log') as f:
    # Index 2210 is line 2211; the end index 2220 is excluded, so the slice stops at line 2220
    print(f.read().split('\n')[2210:2220])
First, create a list of all the lines in the text file (lines are separated by a newline character, "\n"), then slice the list. It's as easy as that.
Edit: Alternatively, you could use the built-in method readlines(), if you don't mind the '\n' at the end of each line:
with open('file.log') as f:
    print(f.readlines()[2210:2220])
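To make the difference between the two approaches concrete, here is a small sketch on an in-memory string instead of a real file. The sample text is made up, and str.splitlines() is included as a third option that the answer above does not mention.

```python
import io

sample = "first\nsecond\nthird\n"  # made-up text that ends with a newline

by_split = sample.split("\n")                    # newline removed, extra '' at the end
by_readlines = io.StringIO(sample).readlines()   # each line keeps its '\n'
by_splitlines = sample.splitlines()              # newline removed, no extra ''

print(by_split)       # ['first', 'second', 'third', '']
print(by_readlines)   # ['first\n', 'second\n', 'third\n']
print(by_splitlines)  # ['first', 'second', 'third']
```

The trailing empty string from split('\n') only matters if the slice reaches the end of the file.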
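B:
def s(item):
    # Sort key: the number in the second column, compared as a float rather than as text
    return float(item.split()[1])

with open('file.log') as f:
    lines = f.read().split('\n')[2210:2220]  # lines 2211 to 2220, counting from 1
# The first element after sorting is the line with the smallest number
print(sorted(lines, key=s)[0])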
This should work:
with open('file.log') as f:
    rd = f.readlines()
print(rd[2210:2220])
readlines() returns a list, so just slice the list with indices. Indices start from 0 and the end index doesn't count, so lines 2211 to 2220 (counting from 1) are rd[2210:2220].
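The indexing rule is easy to get wrong by one, so here is a short check on made-up data. It assumes the line numbers in the question count from 1, and the list below is only a stand-in for what f.readlines() would return.

```python
# Stand-in for f.readlines(): 2221 made-up lines, numbered from 1
lines = [f"line {n}\n" for n in range(1, 2222)]

first, last = 2211, 2220          # 1-based and inclusive, as in the question
selected = lines[first - 1:last]  # 0-based start, end index excluded

print(len(selected))         # 10
print(selected[0], end="")   # line 2211
print(selected[-1], end="")  # line 2220
```

In general, 1-based lines first through last correspond to the slice [first - 1:last].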
The problem with your snippet is that it prints every line it reads until it gets to the desired line, and then it breaks out of the loop! Try this instead:
first_line = 2211
last_line = 2220
with open('file.log') as opened_file:
    # Number the lines from 1 to match the line numbers in the question
    for i, line in enumerate(opened_file, start=1):
        # Only print the line if it falls inside the desired range
        if first_line <= i <= last_line:
            print(line, end='')
However, there are better approaches to this problem, especially if you're dealing with large files. Take a look at this question.
If you want to get the name, I suggest splitting the line. You could write:
columns = line.split()
print('File name is', columns[0])
So you get a list like this for each line: ['Dominio1.BL00010001.pdb', '24.69530'].
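Putting the pieces together for part b), a minimal sketch could look like the following. The function name smallest_pdb is hypothetical, and the ten sample lines are copied from the question in place of reading them from file.log.

```python
def smallest_pdb(lines):
    """Return the .pdb name from the line whose second column is smallest."""
    # Split each non-empty line into [name, number]
    rows = [line.split() for line in lines if line.strip()]
    # Compare the numbers as floats, not as strings
    best = min(rows, key=lambda cols: float(cols[1]))
    return best[0]


# Lines 2211 to 2220 as shown in the question
sample_lines = [
    "Dominio1.BL00010001.pdb 24.69530",
    "Dominio1.BL00020001.pdb 14.33748",
    "Dominio1.BL00030001.pdb 30.53454",
    "Dominio1.BL00040001.pdb 23.82516",
    "Dominio1.BL00050001.pdb 27.48684",
    "Dominio1.BL00060001.pdb 18.17364",
    "Dominio1.BL00070001.pdb 30.98407",
    "Dominio1.BL00080001.pdb 17.19927",
    "Dominio1.BL00090001.pdb 19.02460",
    "Dominio1.BL00100001.pdb 22.57086",
]

R = smallest_pdb(sample_lines)
model = '%s' % R
print(model)  # Dominio1.BL00020001.pdb
```

With a real log, sample_lines would be replaced by the slice of lines read from the file. Splitting on whitespace avoids depending on the name being exactly 24 characters wide.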