How to extract a particular set of values from a file in Python?
I'm stuck on the logic here. I have to extract some values from a text file that looks like this:
AAA
+-------------+------------------+
| ID | count |
+-------------+------------------+
| 3 | 1445 |
| 4 | 105 |
| 9 | 160 |
| 10 | 30 |
+-------------+------------------+
BBB
+-------------+------------------+
| ID | count |
+-------------+------------------+
| 3 | 1445 |
| 4 | 105 |
| 9 | 160 |
| 10 | 30 |
+-------------+------------------+
CCC
+-------------+------------------+
| ID | count |
+-------------+------------------+
| 3 | 1445 |
| 4 | 105 |
| 9 | 160 |
| 10 | 30 |
+-------------+------------------+
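I can't work out how to extract only the values under BBB and append them to a list, something like this:
import sys
f = open(sys.argv[1], "r")
text = f.readlines()
B_Values = []
for i in text:
    if i.startswith("BBB"):  # (example)
        B_Values.append("only values of BBB")
    if i.startswith("CCC"):
        break
print(B_Values)
The result should be: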
['| 3 | 1445 |','| 4 | 105 |','| 9 | 160 |','| 10 | 30 |']
import sys
d = {}
with open(sys.argv[1]) as f:
    for line in f:
        if line[0].isalpha():  # is the first character in the line a letter?
            curr = d.setdefault(line.strip(), [])
        elif any(ch.isdigit() for ch in line):  # is there any digit in the line?
            curr.append(line.strip())
For this file, d is now:
{'AAA': ['| 3 | 1445 |',
         '| 4 | 105 |',
         '| 9 | 160 |',
         '| 10 | 30 |'],
 'BBB': ['| 3 | 1445 |',
         '| 4 | 105 |',
         '| 9 | 160 |',
         '| 10 | 30 |'],
 'CCC': ['| 3 | 1445 |',
         '| 4 | 105 |',
         '| 9 | 160 |',
         '| 10 | 30 |']}
Your B_Values is d['BBB'].
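If the numbers are needed rather than the raw row strings, each row can be split on the | character. This is only a minimal sketch, assuming the two-column layout shown in the question. The helper name parse_row is made up for illustration and is not part of the answer above.

```python
def parse_row(row):
    """Turn a row such as '| 3 | 1445 |' into a tuple of ints."""
    # Splitting on '|' leaves empty strings at both ends; drop them.
    cells = [cell.strip() for cell in row.split("|")]
    return tuple(int(cell) for cell in cells if cell)


# Rows in the same form as the values stored under d['BBB'].
rows = ['| 3 | 1445 |', '| 4 | 105 |', '| 9 | 160 |', '| 10 | 30 |']
pairs = [parse_row(r) for r in rows]
print(pairs)  # [(3, 1445), (4, 105), (9, 160), (10, 30)]
```

int() raises ValueError on a non-numeric cell, so the header row would have to be filtered out before calling this helper.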
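You can use a bstarted state flag to track when the B group starts. After scanning the B group, delete the three header lines and the one footer line.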
# text is the list of lines returned by f.readlines() in the question
B_Values = []
bstarted = False
for i in text:
    if i.startswith("BBB"):
        bstarted = True
    elif i.startswith("CCC"):
        bstarted = False
        break
    elif bstarted:
        B_Values.append(i.strip())  # strip the trailing newline
del B_Values[:3]  # get rid of the header
del B_Values[-1]  # get rid of the footer
print(B_Values)
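The same flag idea can be wrapped in a function that takes the section name as a parameter, so the next title need not be hard-coded. The sketch below is one possible variation, not the answerer's code. It assumes section titles start with a letter and data rows are the only table lines containing digits. The name extract_section and the shortened sample table are invented for illustration.

```python
def extract_section(lines, name):
    """Return the data rows of the section titled `name`."""
    values = []
    started = False
    for line in lines:
        line = line.strip()
        if line == name:
            started = True
        elif started and line[:1].isalpha():
            break  # the next section title has been reached
        elif started and any(ch.isdigit() for ch in line):
            values.append(line)  # borders and the header have no digits
    return values


sample = """\
AAA
+----+-------+
| ID | count |
+----+-------+
| 1 | 10 |
+----+-------+
BBB
+----+-------+
| ID | count |
+----+-------+
| 3 | 1445 |
| 4 | 105 |
+----+-------+
CCC
+----+-------+
| ID | count |
+----+-------+
| 9 | 160 |
+----+-------+
"""

print(extract_section(sample.splitlines(), "BBB"))
# ['| 3 | 1445 |', '| 4 | 105 |']
```

Selecting rows by content instead of deleting by position avoids relying on the header always being exactly three lines long.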
You should avoid iterating over lines that have already been read. Call readline whenever you want the next line and check what it is:
import sys
f = open(sys.argv[1], "r")
B_Values = []
i = f.readline()
while i != "":
    if i.startswith("BBB"):  # (example)
        for temp in range(3):
            f.readline()  # skip the 3 lines of table header
        i = f.readline()
        # Until we reach the table footer (or the end of the file)
        while i != "" and not i.startswith("+-"):
            B_Values.append(i.strip())
            i = f.readline()
        break
    i = f.readline()
# Although not strictly necessary, it is better to close the file, too.
f.close()
print(B_Values)
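Edit: @eumiro's method is more flexible than mine, because it reads all the values from all the sections. An isalpha test could be added to my example to read all the values as well, but his method is still easier to read.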
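As a rough illustration of the isalpha variation mentioned in the edit, the readline loop could collect every section instead of stopping after BBB. This is a sketch under assumptions rather than the answerer's code. It assumes every table has exactly three header lines and a footer starting with "+-". The function name read_all_sections and the in-memory sample are hypothetical.

```python
import io


def read_all_sections(f):
    """Read every titled table from an open text file into a dict."""
    sections = {}
    line = f.readline()
    while line != "":
        if line[0].isalpha():  # a section title such as "BBB"
            rows = sections.setdefault(line.strip(), [])
            for _ in range(3):
                f.readline()  # skip the 3 lines of table header
            line = f.readline()
            # Collect rows until the footer or the end of the file.
            while line != "" and not line.startswith("+-"):
                rows.append(line.strip())
                line = f.readline()
        line = f.readline()
    return sections


# io.StringIO stands in for a real file so the sketch is self-contained.
sample = io.StringIO(
    "AAA\n+----+\n| ID | count |\n+----+\n| 3 | 1445 |\n+----+\n"
    "BBB\n+----+\n| ID | count |\n+----+\n| 4 | 105 |\n| 9 | 160 |\n+----+\n"
)
result = read_all_sections(sample)
print(result["BBB"])  # ['| 4 | 105 |', '| 9 | 160 |']
```

The nested loops make it noticeably longer than the dictionary-based answer, which matches the readability point above.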