
Python help separating lists in a text file

I have a text file with hundreds of lists stored in it. How can I separate the lists, store them as Python lists, and then find the smallest second value among them? I am open to any new way of tackling this problem. Here are the first few 'lists':

var line1=[["Apr 02 2014 01: +0",0.6,"295"],["Apr 03 2014 01: +0",0.641,"245"],["Apr 04 2014 01: +0",0.625,"246"],["Apr 05 2014 01: +0",0.665,"267"],["Apr 06 2014 01: +0",0.632,"226"],["Apr 07 2014 01: +0",0.672,"170"],["Apr 08 2014 01: +0",0.655,"147"],["Apr 09 2014 01: +0",0.654,"121"],["Apr 10 2014 01: +0",0.62,"136"],["Apr 11 2014 01: +0",0.629,"176"],["Apr 12 2014 01: +0",0.68,"190"],["Apr 13 2014 01: +0",0.677,"176"],["Apr 14 2014 01: +0",0.73,"153"],["Apr 15 2014 01: +0",0.587,"148"],["Apr 16 2014 01: +0",0.591,"134"],["Apr 17 2014 01: +0",0.612,"148"],["Apr 18 2014 01: +0",0.593,"142"],["Apr 19 2014 01: +0",0.612,"153"],["Apr 20 2014 01: +0",0.654,"203"],["Apr 21 2014 01: +0",0.713,"156"],["Apr 22 2014 01: +0",0.711,"153"],["Apr 23 2014 01: +0",0.625,"128"],["Apr 24 2014 01: +0",0.629,"122"],["Apr 25 2014 01: +0",0.603,"139"],["Apr 26 2014 01: +0",0.6,"169"],["Apr 27 2014 01: +0",0.589,"177"],["Apr 28 2014 01: +0",0.585,"132"],["Apr 29 2014 01: +0",0.612,"120"],["Apr 30 2014 01: +0",0.626,"116"],["May 01 2014 01: +0",0.57,"142"]

And here is my attempt to separate them:

with open('test.txt','r') as csvfile:
  writer=csv.reader(csvfile,delimeter=' , ',quotechar=csv.QUOTE_MINIMAL)
    for row in writer:
      print ','.join(row)

OK, assuming this is representative input as far as the format is concerned:

var line1=[["Apr 02 2014 01: +0",0.6,"295"],["Apr 03 2014 01: +0",0.641,"245"]];

Then you could do the following:

import json

with open('test.txt', 'r') as datafile:
    data = datafile.read()

json_str = data.split('=', 1)[1].rstrip(';\n\r ')
my_data = json.loads(json_str)

for row in my_data:
    print row

print "Minimum by second value"
print min(my_data, key=lambda x: x[1])

Which will print:

[u'Apr 02 2014 01: +0', 0.6, u'295']
[u'Apr 03 2014 01: +0', 0.641, u'245']
Minimum by second value
[u'Apr 02 2014 01: +0', 0.6, u'295']
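
Since the real file holds hundreds of such lists, the same idea can be applied once per line. The following is a minimal sketch for Python 3, assuming each list sits on its own line in the form `var NAME=[[...]];`. The helper name `parse_named_lists` and the one-declaration-per-line layout are assumptions, not something the question confirms.

```python
import json


def parse_named_lists(text):
    """Map each variable name to its parsed list of rows.

    Assumes one declaration per line: var NAME=[[...]];
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("var ") or "=" not in line:
            continue  # skip blank or unrelated lines
        name, json_str = line[len("var "):].split("=", 1)
        # Drop the trailing ";" because json.loads would reject it
        result[name.strip()] = json.loads(json_str.strip().rstrip(";"))
    return result


# Sample text standing in for the file contents (assumed layout)
sample = (
    'var line1=[["Apr 02 2014 01: +0",0.6,"295"],["Apr 03 2014 01: +0",0.641,"245"]];\n'
    'var line2=[["Apr 28 2014 01: +0",0.585,"132"],["May 01 2014 01: +0",0.57,"142"]];\n'
)

lists = parse_named_lists(sample)
# Row with the smallest second value in each list
minima = {name: min(rows, key=lambda row: row[1]) for name, rows in lists.items()}
# Row with the smallest second value across every list
overall = min(minima.values(), key=lambda row: row[1])
print(overall)
```

Keeping the per-list minima in a dictionary makes it possible to answer both "smallest in each list" and "smallest overall" from one pass over the text.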

You can use re.finditer() to get the expected lists from an iterator, which saves memory because the matches are produced one at a time instead of being collected into a list.

Then you can use ast.literal_eval to convert each matched string to a list object:

import ast
import re

with open(filename) as f:
    # re.finditer() needs a string, so read the file contents first
    all_list = re.finditer(r'\[([^[\]]*?)\]', f.read())

Demo:

print [ast.literal_eval(i.group()) for i in all_list]

[['Apr 02 2014 01: +0', 0.6, '295'], ['Apr 03 2014 01: +0', 0.641, '245'], ['Apr 04 2014 01: +0', 0.625, '246'], ['Apr 05 2014 01: +0', 0.665, '267'], ['Apr 06 2014 01: +0', 0.632, '226'], ['Apr 07 2014 01: +0', 0.672, '170'], ['Apr 08 2014 01: +0', 0.655, '147'], ['Apr 09 2014 01: +0', 0.654, '121'], ['Apr 10 2014 01: +0', 0.62, '136'], ['Apr 11 2014 01: +0', 0.629, '176'], ['Apr 12 2014 01: +0', 0.68, '190'], ['Apr 13 2014 01: +0', 0.677, '176'], ['Apr 14 2014 01: +0', 0.73, '153'], ['Apr 15 2014 01: +0', 0.587, '148'], ['Apr 16 2014 01: +0', 0.591, '134'], ['Apr 17 2014 01: +0', 0.612, '148'], ['Apr 18 2014 01: +0', 0.593, '142'], ['Apr 19 2014 01: +0', 0.612, '153'], ['Apr 20 2014 01: +0', 0.654, '203'], ['Apr 21 2014 01: +0', 0.713, '156'], ['Apr 22 2014 01: +0', 0.711, '153'], ['Apr 23 2014 01: +0', 0.625, '128'], ['Apr 24 2014 01: +0', 0.629, '122'], ['Apr 25 2014 01: +0', 0.603, '139'], ['Apr 26 2014 01: +0', 0.6, '169'], ['Apr 27 2014 01: +0', 0.589, '177'], ['Apr 28 2014 01: +0', 0.585, '132'], ['Apr 29 2014 01: +0', 0.612, '120'], ['Apr 30 2014 01: +0', 0.626, '116'], ['May 01 2014 01: +0', 0.57, '142']]

Note that you can process each list as it is produced during iteration; you are not forced to build all the lists at once.

for i in all_list:
    row = ast.literal_eval(i.group())
    # do stuff with row
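
The lazy approach above combines naturally with min(), which accepts any iterable. Here is a rough sketch for Python 3, assuming each row is flat (no brackets nested inside a row or inside its strings). `iter_rows` is a hypothetical helper name, and the sample string stands in for the file contents.

```python
import ast
import re

# Matches one innermost [...] group, i.e. a single row
ROW_PATTERN = re.compile(r'\[[^\[\]]*\]')


def iter_rows(text):
    """Yield each row as a Python list, one at a time."""
    for match in ROW_PATTERN.finditer(text):
        yield ast.literal_eval(match.group())


# Sample text standing in for the file contents (assumed layout)
sample = (
    'var line1=[["Apr 02 2014 01: +0",0.6,"295"],'
    '["Apr 03 2014 01: +0",0.641,"245"],'
    '["May 01 2014 01: +0",0.57,"142"]];'
)

# min() consumes the generator row by row, so no list of all rows is built
smallest = min(iter_rows(sample), key=lambda row: row[1])
print(smallest)
```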

Something along these lines should work as long as the data isn't too long (so that it fits in memory):

import json

filename = "D:\\tmp\\test.txt"


final_lists = []

with open(filename, "r") as fl:
    #Read everything
    content = fl.read()

#Now split on the first "=" to keep only the lists,
#dropping a trailing ";" if there is one (json.loads would reject it)
valid_part = content.split("=", 1)[1].strip().rstrip(";")
#Load valid part into json
cur_lists = json.loads(valid_part)

#Now find min value
_min = min(cur_lists, key=lambda x:x[1])
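
If the file is too large to read in one go, one option is to handle it line by line and keep only the best row seen so far. This is a sketch for Python 3 under the assumption that every list occupies exactly one line of the form `var NAME=[[...]];`. `min_second_value` is a hypothetical helper; it accepts any iterable of lines, so an open file object can be passed directly.

```python
import json


def min_second_value(lines):
    """Return the row with the smallest second value, or None if there are no rows."""
    best = None
    for line in lines:
        if "=" not in line:
            continue  # skip lines that hold no list
        # Keep the part after the first "=" and drop a trailing ";"
        json_str = line.split("=", 1)[1].strip().rstrip(";")
        for row in json.loads(json_str):
            if best is None or row[1] < best[1]:
                best = row
    return best


# Sample lines standing in for the file (assumed layout)
sample_lines = [
    'var line1=[["Apr 02 2014 01: +0",0.6,"295"],["Apr 03 2014 01: +0",0.641,"245"]];\n',
    '\n',
    'var line2=[["Apr 30 2014 01: +0",0.626,"116"],["May 01 2014 01: +0",0.57,"142"]];\n',
]
print(min_second_value(sample_lines))
```

With a real file, the call would look like `with open(filename) as fl: best = min_second_value(fl)`, so only one line is held in memory at a time.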
