How do I read a file line by line and print only the lines that contain a specific string in Python?
I have a text file containing these lines:
wbwubddwo 7::a number1 234 **
/// 45daa;: number2 12
time 3:44
For example, if the program finds the string number1, it should print 234.
I started with the simple script below, but it did not print what I wanted.
with open("test.txt", "rb") as f:
    lines = f.read()
    word = ["number1", "number2", "time"]
    if any(item in lines for item in word):
        val1 = lines.split("number1 ", 1)[1]
        print val1
This returns the following result:
234 **
/// 45daa;: number2 12
time 3:44
Then I tried changing f.read() to f.readlines(), but this time it did not print anything.
Does anyone know another way to do this? Eventually I want to get the value from each line, for example 234, 12 and 3:44, and store it in a database.
Thank you for your help. I really appreciate it.
Explanations are given below:
with open("test.txt", "r") as f:
    lines = f.readlines()
stripped_lines = [line.strip() for line in lines]
words = ["number1", "number2", "time"]
for a_line in stripped_lines:
    for word in words:
        if word in a_line:
            number = a_line.split()[1]
            print(number)
1) First of all, 'rb' gives a bytes object, i.e. something like b'number1 234' would be returned. Use 'r' to get a string object.
2) The lines you read will look something like this, stored in a list:
['number1 234\r\n', 'number2 12\r\n', '\r\n', 'time 3:44']
Notice the \r\n (or just \n, depending on the platform and Python version); it marks the end of each line. To remove it, use strip().
3) Take each line from stripped_lines and each word from words, and check whether that word is present in the line using in.
4) a_line would be number1 234, but we only want the number part. The output of split() would be ['number1','234'], and split()[1] means the element at index 1 (the 2nd element).
5) You can also check whether the string consists only of digits using your_string.isdigit().
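The five points above can be tried out without a file, as a minimal sketch. The sample bytes and variable names below are made up for illustration. It also shows why testing `in` against the list returned by readlines() finds nothing: on a list, `in` compares whole elements rather than searching for a substring.

```python
# Illustrative sample only: two lines in the simplified "keyword value" layout.
raw = b"number1 234\r\nnumber2 12\r\n"

# 1) Binary mode yields bytes; decoding (or opening with "r") yields str.
text = raw.decode("ascii")

# 2) Splitting while keeping line endings mimics what readlines() returns.
lines = text.splitlines(True)
stripped_lines = [line.strip() for line in lines]

# 3) `in` on a list compares whole elements; `in` on a string finds substrings.
found_in_list = "number1" in lines
found_in_line = "number1" in stripped_lines[0]

# 4) split() breaks on whitespace; index 1 is the second token.
parts = stripped_lines[0].split()
value = parts[1]

# 5) isdigit() is True only when every character is a digit.
print(found_in_list, found_in_line, parts, value.isdigit(), "3:44".isdigit())
```

Note that isdigit() rejects 3:44 because of the colon, which matters once time values are treated as values too.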
UPDATE: Since you updated your question and input file, this works:
import time

def isTimeFormat(input):
    try:
        time.strptime(input, '%H:%M')
        return True
    except ValueError:
        return False

with open("test.txt", "r") as f:
    lines = f.readlines()
stripped_lines = [line.strip() for line in lines]
words = ["number1", "number2", "time"]
for a_line in stripped_lines:
    for word in words:
        if word in a_line:
            number = a_line.split()[-1] if (a_line.split()[-1].isdigit() or isTimeFormat(a_line.split()[-1])) else a_line.split()[-2]
            print(number)
Why this isTimeFormat() function?
def isTimeFormat(input):
    try:
        time.strptime(input, '%H:%M')
        return True
    except ValueError:
        return False
To check whether 3:44 or 4:55 is in time format, since you are considering them as values too.
Final output:
234
12
3:44
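The one-line conditional in the loop packs three checks together. As a rough sketch, the same rule can be pulled into a helper and run on the sample lines from the question. The names `is_time_format` and `pick_value` are illustrative and not part of the script above, and the sample lines are hard-coded instead of read from test.txt.

```python
import time

def is_time_format(token):
    """Return True if token parses as hours:minutes, e.g. '3:44'."""
    try:
        time.strptime(token, "%H:%M")
        return True
    except ValueError:
        return False

def pick_value(line):
    """Return the last token if it looks like a value, else the one before it."""
    tokens = line.split()
    last = tokens[-1]
    if last.isdigit() or is_time_format(last):
        return last
    # Fall back to the second-to-last token (e.g. when the line ends in '**').
    return tokens[-2]

sample_lines = [
    "wbwubddwo 7::a number1 234 **",
    "/// 45daa;: number2 12",
    "time 3:44",
]
values = [pick_value(line) for line in sample_lines]
print(values)
```

The fallback assumes at most one trailing non-value token, which holds for the sample file but may not hold for other inputs.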
After some trial and error, I found a solution like the one below. It is based on the answer provided by @s_vishnu.
with open("test.txt", "r") as f:
    lines = f.readlines()
    stripped_lines = [line.strip() for line in lines]
    for item in stripped_lines:
        # Take the first space-delimited token that follows each keyword.
        if "number1" in item:
            getval = item.split("number1 ")[1].split(" ")[0]
            print(getval)
        if "number2" in item:
            getval2 = item.split("number2 ")[1].split(" ")[0]
            print(getval2)
        if "time" in item:
            getval3 = item.split("time ")[1].split(" ")[0]
            print(getval3)
Output:
234
12
3:44
This way, I can also do other things, for example saving each value to a database.
I am open to any suggestions to further improve my answer.
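For the database step, a minimal sketch might look like the following. It assumes SQLite through Python's built-in sqlite3 module, because the question does not say which database is used. The table name `readings`, its columns, and the hard-coded `extracted` list are all hypothetical.

```python
import sqlite3

# Values as extracted from the sample file; hard-coded so the sketch runs alone.
extracted = [("number1", "234"), ("number2", "12"), ("time", "3:44")]

# Assumption: an in-memory SQLite database with a made-up table layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (name TEXT, val TEXT)")

# Parameterized inserts avoid building SQL by string concatenation.
conn.executemany("INSERT INTO readings (name, val) VALUES (?, ?)", extracted)
conn.commit()

rows = conn.execute("SELECT name, val FROM readings ORDER BY rowid").fetchall()
print(rows)
conn.close()
```

Replacing ":memory:" with a file path would persist the rows between runs.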
You're overthinking this. Assuming you don't have those two asterisks at the end of the first line and you want to print the lines containing certain values, you can just read the file line by line, check whether any of the chosen values match, and print the last value (the text between the last space and the end of the line). There is no need to parse or split the whole line at all:
search_values = ["number1", "number2", "time"]  # values to search for
with open("test.txt", "r") as f:  # open your file
    for line in f:  # read it line by line
        if any(value in line for value in search_values):  # check for search_values in line
            print(line[line.rfind(" ") + 1:].rstrip())  # print the last value after the last space
Which will give you:
234
12
3:44
If you do have asterisks, you have to define your file format more precisely, as splitting won't necessarily yield the desired value.
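One possible way to pin the format down, offered only as a sketch, is to assume that the wanted value is the first whitespace-delimited token right after the keyword. Under that assumption a regular expression handles the trailing asterisks. The `extract` helper and the hard-coded sample lines are illustrative.

```python
import re

search_values = ["number1", "number2", "time"]

# Assumption: the wanted value is the first whitespace-delimited token
# that follows one of the keywords.
pattern = re.compile(
    r"\b(" + "|".join(map(re.escape, search_values)) + r")\s+(\S+)"
)

def extract(line):
    """Return (keyword, value) for the first match in line, or None."""
    match = pattern.search(line)
    return match.groups() if match else None

sample_lines = [
    "wbwubddwo 7::a number1 234 **",
    "/// 45daa;: number2 12",
    "time 3:44",
    "nothing to see here",
]
results = [extract(line) for line in sample_lines]
print(results)
```

Returning the keyword together with the value makes it easier to tell which reading each value belongs to.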